Faculty of Electrical Engineering and Computer Science

Parallel Algorithms II

* Exchange students do not have to consider this information when selecting suitable courses for an exchange stay.

Name of Lecturer(s)	Personal ID	Name
	GAJ03	doc. Ing. Petr Gajdoš, Ph.D.
Course Unit Code	460-4118/01
Number of ECTS Credits Allocated	4 ECTS credits
Type of Course Unit *	Choice-compulsory type B
Level of Course Unit *	Second Cycle
Year of Study *	Second Year
Semester when the Course Unit is delivered	Summer Semester
Mode of Delivery	Face-to-face
Language of Instruction	Czech
Prerequisites and Co-Requisites	The course follows on from the compulsory courses of the previous semester
Summary
This course follows on from Parallel Algorithms I, and the knowledge acquired there is a prerequisite for understanding the new topics. Selected lecture topics provide the basis for the practical exercises. The nVidia CUDA architecture will be presented in more detail, together with related tools for parallel programming on the GPU. Mastery of parallel programming techniques, combined with solving practical tasks, is the main prerequisite for passing the final exam.
Learning Outcomes of the Course Unit
The main goal is to extend students' knowledge of programming parallel applications. The lessons build on the existing course Parallel Algorithms I. All topics focus on the use of graphics processing units (GPUs). Students will become familiar with existing GPU architectures and frameworks for parallel programming. The CUDA architecture will be explained in more detail, given that an nVidia Research Center has been established at VŠB-TU Ostrava. Students will gain the knowledge needed to solve practical tasks using GPUs. They can apply it in their diploma theses or in grant projects running at VŠB-TU Ostrava.
Knowledge and skills:
- orientation in the basic concepts of graphics processor architecture
- knowledge of the software architecture of a parallel program, and of problem decomposition into grids, blocks and threads
- knowledge of a selected framework for parallel programming on the GPU
- understanding of how to convert an algorithm from serial to parallel form
- task distribution over several GPUs and clusters
- ability to solve practical tasks in the area of data processing
Course Contents
The lectures are designed to provide the basis for the practical exercises in the computer labs.
The outline of lectures:
1. Introduction to parallel programming on the GPU, a brief history, CUDA
2. CUDA architecture and its integration into a standard C++ project
3. Threads and kernel functions
4. CUDA memories, patterns and usage
5. Memory bank conflicts
6. Program execution control, distribution of an algorithm
7. Algorithm performance with respect to its parallelization on the GPU
9. Optimization at the data level, efficient data structures
10. Optimization of programs for maximum GPU performance
11. The CUBLAS support library
12. Case study
The outline of exercises (held in the computer labs):
1. The first application in CUDA
2. Data transfers to and from the GPU
3. Thread hierarchy, basic thread life cycle, limits, calling kernel functions, parameters and restrictions
4. CUDA memories, patterns and usage
5. Memory bank conflicts, access optimization, suitable data structures
6. Streams, concurrent kernel launches, synchronization at several levels
7. Case study: experiments with several variants of the same program
8. Vectors and matrices; case study: large data processing, parallel reduction
9. Introduction to several support libraries for linear algebra
10. Case study: image manipulation, double buffering, optimization at the level of blocks, registers, etc.
11. Case study: interesting research topics, outline of possible solutions, experiments
12. Program tuning, debugging with nVidia nSight
Recommended or Required Reading
Required Reading:
[1] Bjarne Stroustrup. The C++ Programming Language. Addison-Wesley Professional, 4th edition, May 2013.
[2] Graham Sellers, Richard S. Wright, and Nicholas Haemel. OpenGL SuperBible: Comprehensive Tutorial and Reference. Addison-Wesley Professional, 6th edition, July 2013.
[3] John Cheng, Max Grossman, and Ty McKercher. Professional CUDA C Programming. Wrox, 1st edition, September 2014.
[4] Tolga Soyata. GPU Parallel Program Development Using CUDA. CRC Press, 2018.
Recommended Reading:
[1] Bjarne Stroustrup. The C++ Programming Language. Addison-Wesley Professional, 4th edition, May 2013.
[2] John Cheng, Max Grossman, and Ty McKercher. Professional CUDA C Programming. Wrox, 1st edition, September 2014.
[3] Brian Tuomanen. Hands-On GPU Programming with Python and CUDA: Explore high-performance parallel computing with CUDA. Packt Publishing Ltd, 2018.
[4] Volodymyr Kindratenko, editor. Numerical Computations with GPUs. Springer, 2014 edition, July 2014.
[5] Bhaumik Vaidya. Hands-On GPU-Accelerated Computer Vision with OpenCV and CUDA: Effective techniques for processing complex image data in real time using GPUs. Packt Publishing Ltd, 2018.
[6] Jung W. Suh and Youngmin Kim. Accelerating MATLAB with GPU Computing: A Primer with Examples. Morgan Kaufmann, 1st edition, December 2013.
Planned learning activities and teaching methods
Lectures, Individual consultations, Tutorials
Assessment methods and criteria
Task Title	Task Type		Maximum Number of Points (Act. for Subtasks)	Minimum Number of Points for Task Passing
Graded credit	Graded credit		100	51