Course Unit Code | 460-4118/01 |
---|
Number of ECTS Credits Allocated | 4 ECTS credits |
---|
Type of Course Unit * | Choice-compulsory type B |
---|
Level of Course Unit * | Second Cycle |
---|
Year of Study * | Second Year |
---|
Semester when the Course Unit is delivered | Summer Semester |
---|
Mode of Delivery | Face-to-face |
---|
Language of Instruction | Czech |
---|
Prerequisites and Co-Requisites | The course follows on from the compulsory courses of the previous semester |
---|
Name of Lecturer(s) | Personal ID | Name |
---|
| GAJ03 | doc. Ing. Petr Gajdoš, Ph.D. |
Summary |
---|
This course follows on from Parallel Algorithms I; the knowledge acquired there is a prerequisite for understanding the new topics. Selected lecture topics provide the basis for the practical exercises. The nVidia CUDA architecture is presented in more detail, together with related tools for parallel programming on GPUs. Mastering parallel programming techniques and applying them to practical tasks is the main requirement for passing the final exam. |
Learning Outcomes of the Course Unit |
---|
The main goal is to extend students' knowledge of programming parallel applications. The lessons build on the existing course Parallel Algorithms I. All topics focus on the use of graphics processing units (GPUs). Students will become familiar with existing GPU architectures and frameworks for parallel programming. The CUDA architecture will be explained in more detail, given that an nVidia Research Center has been established at VŠB-TU Ostrava. Students will gain the knowledge needed to solve practical tasks using GPUs, which they can apply in their diploma theses or in grant projects running at VŠB-TU Ostrava.
Knowledge and skills:
- understanding of the basic concepts of GPU architecture
- knowledge of the software architecture of a parallel program and of problem decomposition into grids, blocks, and threads
- knowledge of a selected framework for parallel programming on GPUs
- understanding of how to convert an algorithm from serial to parallel form
- distribution of tasks across several GPUs and clusters
- ability to solve practical tasks in the area of data processing |
Course Contents |
---|
The lectures are designed to provide the basis for the practical exercises in the computer labs.
Outline of lectures:
1. Introduction to parallel programming on GPU, a brief history, CUDA
2. CUDA architecture and its integration into a standard C++ project
3. Threads and kernel functions
4. CUDA memories, patterns and usage
5. Memory bank conflicts
6. Program execution control, distribution of an algorithm
7. Algorithm performance with respect to its parallelization on GPU
9. Optimization at the data level, efficient data structures
10. Program optimization for maximum GPU performance
11. The cuBLAS support library
12. Case study
Outline of exercises (held in the computer labs):
1. The first application in CUDA
2. Data transfers to/from GPU
3. Thread hierarchy, basic thread life cycle, limits, calling kernel functions, parameters and restrictions
4. CUDA memories, patterns and usage
5. Memory bank conflicts, access optimization, suitable data structures
6. Streams, launching kernel functions in parallel, synchronization at several levels
7. Case study, experiments with several variants of the same program
8. Vectors and matrices, the case study, large data processing, parallel reduction
9. Introduction to several support libraries for linear algebra
10. The case study, image manipulation, double buffering, optimization at the level of blocks, registers, etc.
11. Case study, interesting research topics, outline of possible solutions, experiments
12. Program tuning, debugging with nVidia nSight
|
Recommended or Required Reading |
---|
Required Reading: |
---|
[1] Bjarne Stroustrup. The C++ Programming Language. Addison-Wesley Professional, 4th edition, May 2013.
[2] Graham Sellers, Richard S. Wright, and Nicholas Haemel. OpenGL SuperBible: Comprehensive Tutorial and Reference. Addison-Wesley Professional, 6th edition, July 2013.
[3] John Cheng, Max Grossman, and Ty McKercher. Professional CUDA C Programming. Wrox, 1st edition, September 2014.
[4] Tolga Soyata. GPU Parallel Program Development Using CUDA. CRC Press, 2018. |
Recommended Reading: |
---|
[1] Bjarne Stroustrup. The C++ Programming Language. Addison-Wesley Professional, 4th edition, May 2013.
[2] John Cheng, Max Grossman, and Ty McKercher. Professional CUDA C Programming. Wrox, 1st edition, September 2014.
[3] Brian Tuomanen. Hands-On GPU Programming with Python and CUDA: Explore high-performance parallel computing with CUDA. Packt Publishing Ltd, 2018.
[4] Volodymyr Kindratenko, editor. Numerical Computations with GPUs. Springer, 2014 edition, July 2014.
[5] Bhaumik Vaidya. Hands-On GPU-Accelerated Computer Vision with OpenCV and CUDA: Effective techniques for processing complex image data in real time using GPUs. Packt Publishing Ltd, 2018.
[6] Jung W. Suh and Youngmin Kim. Accelerating MATLAB with GPU Computing: A Primer with Examples. Morgan Kaufmann, 1st edition, December 2013.
|
Planned learning activities and teaching methods |
---|
Lectures, Individual consultations, Tutorials |
Assessment methods and criteria |
---|
Task Title | Task Type | Maximum Number of Points (Act. for Subtasks) | Minimum Number of Points for Task Passing |
---|
Graded credit | Graded credit | 100 | 51 |