Course Unit Code | 460-4117/01 |
---|
Number of ECTS Credits Allocated | 4 ECTS credits |
---|
Type of Course Unit * | Choice-compulsory type B |
---|
Level of Course Unit * | Second Cycle |
---|
Year of Study * | Second Year |
---|
Semester when the Course Unit is delivered | Winter Semester |
---|
Mode of Delivery | Face-to-face |
---|
Language of Instruction | Czech |
---|
Prerequisites and Co-Requisites | Course succeeds to compulsory courses of previous semester |
---|
Name of Lecturer(s) | Personal ID | Name |
---|
| KRO080 | prof. Ing. Pavel Krömer, Ph.D. |
Summary |
---|
This course provides the students with working knowledge in the area of parallel systems, algorithms and programming.
It concentrates on practical development of parallel codes so that the students can make effective use of the modern parallel hardware, from supercomputers with distributed memory through multicore processors to floating point accelerators and general purpose graphic processing units, to solve computationally extensive problems from different application areas.
An emphasis is put on the introduction to standard parallel paradigms, interfaces, languages, and libraries, as well as on the reflection of the actual development in the field by providing overview of the latest parallel platforms and environments. Parallel programming of distributed memory systems (i.e. the message passing model), shared memory systems (symmetric multiprocessors), and floating-point accelerators will be presented. However, cloud platforms, map-reduce model, and parallel Matlab will be discussed as well.
The tutorials will be devoted to practical design and implementation of parallel algorithms with the help of MPI, OpenMP, UPC, CUDA-C, or parallel Matlab.
|
Learning Outcomes of the Course Unit |
---|
An overview of design, implementation, and evaluation of parallel algorithms. Practical use of programming techniques for selected parallel architectures. Working knowledge of parallel systems and programming. In particular, design of parallel algorithms and parallelization of sequential ones. Implementation of parallel algorithms based on message passing. Analysis and evaluation of parallel algorithms. Optimization and improvement of efficiency. |
Course Contents |
---|
Lectures:
=========
1. Introduction to parallel programming. Processes and threads. Processes and threads from the operating system perspective.
2. Sequential vs. parallel programming. Parallel programming caveats. Deadlock (definition, properties, conditions, detection, elimination).
3. Parallel vs. distributed applications. Classification of parallel systems. Shared memory systems and distributed memory systems. Flynn's taxonomy.
4. Shared memory systems programming. Programming with threads. The pthreads library, Threads in Java and C#. Synchronization and exclusion, deadlock.
5. The OpenMP interface. OpenMP support in modern compilers. OpenMP directives and functions. Reduction in OpenMP.
6. Fork-join model, Cilk and Cilk++ languages. Parallel Matlab. Parallel programming in Python, the NumPy library.
7. Distributed memory systems programming. Message passing. Posix queues, sockets. The MPI interface, fundamental MPI functions. MPI libraries.
8. Partitioned Global Address Space (PGAS) model. The Unified Parallel C (UPC) language.
9. Batch (non-interactive) task execution. Schedulers PBSPro and Torque.
10. Grid and cloud programming. Web services and distributed applications using web services. Map-reduce paradigm and Hadoop framework.
11. Introduction to accelerator programming. GPGPU architecture (program organization, memory organization). Data parallelism. CUDA platform and CUDA-C language.
12. Other interfaces for accelerator programming: OpenCL, OpenACC, OpenMP 4.
13. Intel MIC architecture. Programming the MIC and Intel Xeon Phi. Emerging platforms.
14. Hybrid applications. MPI-OpenMP programming, hybrid CPU-GPU architectures.
Tutorials:
==========
The topics and tasks dealt with at the tutorials will correspond to the topics presented at the lectures.
1. Overview of environments for parallel programming. SIMD instructions.
2. Implementation of a simple multithreaded program.
3. Implementation of a simple multiprocess program. Inter process communication with the help of sockets and queues.
4. Debugging a profiling of parallel programs.
5. Implementation of a simple OpenMP program. Parallelization of a sequential program by OpenMP.
6. Implementation of a simple program in Cilk++. A demonstration of parallel programming in Matlab.
7. Implementation of a simple MPI program.
8. Implementation of a simple program in UPC.
9. The use of PBS, PBSPro, and Torque schedulers; qsub command and QSUB directives.
10. Implementation of a simple map-reduce application under the Hadoop framework.
11. Implementation of a simple data-parallel program in CUDA-C.
12. A demonstration of OpenCL, OpenACC, and OpenMP 4.
13. Intel MIC programming. Implementation of a simple Intel MIC program.
14. Implementation of a simple hybrid MPI - OpenMP program. |
Recommended or Required Reading |
---|
Required Reading: |
---|
1. PA I lecture notes
2. Introduction to Parallel Computing: From Algorithms to Programming on State-of-the-Art Platforms. Roman Trobec, Boštjan Slivnik, Patricio Bulić, Borut Robič. Springer Nature Switzerland, 2018
3. Parallel Programming for Multicore and Cluster Systems. Thomas Rauber, Gudula Rünger. Springer-Verlag Berlin Heidelberg, 2013
4. Designing and Building Parallel Programs: Concepts and Tools for Parallel Software Engineering. Ian Foster, Addison Wesley, 1995 |
1. Sylaby k předmětu Paralelní algoritmy I.
2. Introduction to Parallel Computing: From Algorithms to Programming on State-of-the-Art Platforms. Roman Trobec, Boštjan Slivnik, Patricio Bulić, Borut Robič. Springer Nature Switzerland, 2018
3. Parallel Programming for Multicore and Cluster Systems. Thomas Rauber, Gudula Rünger. Springer-Verlag Berlin Heidelberg, 2013
4. Designing and Building Parallel Programs: Concepts and Tools for Parallel Software Engineering. Ian Foster, Addison Wesley, 1995 |
Recommended Reading: |
---|
1. The OpenMP Common Core: Making OpenMP Simple Again, Timothy G. Mattson, Yun He, Alice E. Koniges. MIT Press, 2019
2. Using OpenMP-The Next Step: Affinity, Accelerators, Tasking, and SIMD. Ruud van der Pas, Eric Stotzer, and Christian Terboven. MIT Press, 2017
3. Distributed Systems (3rd ed.), Andrew S. Tanenbaum, Maarten van Steen, 2017
4. Distributed Computing Principles, Algorithms, and Systems, Ajay D. Kshemkalyani, Mukesh Singhal, Cambridge, 2008
5. Patterns for parallel programming. Timothy Mattson, Beverly Sanders, Berna Massingill, Addison-Wesley, 2004
6. Introduction to Parallel Computing (2nd Edition). Ananth Grama, George Karypis, Vipin Kumar, Anshul Gupta, Addison-Wesley, 2003 |
1. The OpenMP Common Core: Making OpenMP Simple Again, Timothy G. Mattson, Yun He, Alice E. Koniges. MIT Press, 2019
2. Using OpenMP-The Next Step: Affinity, Accelerators, Tasking, and SIMD. Ruud van der Pas, Eric Stotzer, and Christian Terboven. MIT Press, 2017
3. Distributed Systems (3rd ed.), Andrew S. Tanenbaum, Maarten van Steen, 2017
4. Distributed Computing Principles, Algorithms, and Systems, Ajay D. Kshemkalyani, Mukesh Singhal, Cambridge, 2008
5. Patterns for parallel programming. Timothy Mattson, Beverly Sanders, Berna Massingill, Addison-Wesley, 2004
6. Introduction to Parallel Computing (2nd Edition). Ananth Grama, George Karypis, Vipin Kumar, Anshul Gupta, Addison-Wesley, 2003 |
Planned learning activities and teaching methods |
---|
Lectures, Tutorials |
Assesment methods and criteria |
---|
Task Title | Task Type | Maximum Number of Points (Act. for Subtasks) | Minimum Number of Points for Task Passing |
---|
Graded credit | Graded credit | 100 | 51 |