- Use MPI to create a parallel Odd-Even sort algorithm.
- Conduct scalability tests and speedup experiments.
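As a reference for the algorithm itself, here is a minimal single-process sketch of odd-even transposition sort in Python. It is not the MPI implementation from this assignment: the function name `odd_even_sort` is hypothetical, and the compare-exchange between neighbouring elements stands in for the exchange that would normally happen between neighbouring MPI ranks.

```python
def odd_even_sort(values):
    """Sort a list with odd-even transposition sort (sequential simulation)."""
    a = list(values)  # work on a copy so the input is left untouched
    n = len(a)
    # n phases are sufficient to sort n elements.
    for phase in range(n):
        # Even phases compare pairs (0,1), (2,3), ...
        # Odd phases compare pairs (1,2), (3,4), ...
        start = phase % 2
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a


if __name__ == "__main__":
    print(odd_even_sort([5, 3, 1, 4, 2]))
```

All compare-exchange steps within one phase touch disjoint pairs, which is what makes the phases easy to run in parallel.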
Pthread version : Single-node thread programming using Pthreads.
Hybrid version : Multi-node programming. Use MPI and OpenMP to solve the Mandelbrot set problem.
Optimization :
- MPI work pool
- thread task queue
- vectorization
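The task-queue idea can be illustrated with a small Python sketch, assuming one task per image row. This is not the MPI/OpenMP code from the assignment. The names `escape_count`, `render_row`, and `render`, and the viewport (real axis from -2 to 1, imaginary axis from -1 to 1) are all assumptions made for illustration. Python threads will not give a real speedup here because of the interpreter lock; the point is only the dynamic distribution of rows to idle workers.

```python
from concurrent.futures import ThreadPoolExecutor


def escape_count(c, max_iter=100):
    """Return the number of iterations before |z| exceeds 2, capped at max_iter."""
    z = 0j
    for i in range(max_iter):
        if abs(z) > 2.0:
            return i
        z = z * z + c
    return max_iter


def render_row(y, width, height, max_iter):
    """Compute one image row; assumes width >= 2 and height >= 2."""
    imag = -1.0 + 2.0 * y / (height - 1)
    return [
        escape_count(complex(-2.0 + 3.0 * x / (width - 1), imag), max_iter)
        for x in range(width)
    ]


def render(width=32, height=16, max_iter=100, workers=4):
    """Render the image, handing rows to worker threads from a shared queue."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Rows are queued as tasks; each idle worker picks up the next one,
        # so expensive rows do not stall the cheap ones.
        return list(
            pool.map(lambda y: render_row(y, width, height, max_iter), range(height))
        )


if __name__ == "__main__":
    image = render(width=16, height=8, max_iter=50)
    print(len(image), len(image[0]))
```

Rows near the set take far more iterations than rows far from it, which is why a dynamic queue tends to balance load better than a fixed split of rows per worker.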
CPU version : Use thread programming to parallelize the Blocked Floyd-Warshall algorithm.
Single GPU version : Implement a CUDA version of the Blocked Floyd-Warshall algorithm on a single GPU.
Multi GPU version : Implement a CUDA version of the Blocked Floyd-Warshall algorithm on multiple GPUs.
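For readers unfamiliar with the blocked formulation, the following is a minimal sequential Python sketch of the usual three-phase scheme, assuming a dense distance matrix stored as a list of lists. It is not the threaded or CUDA code from this assignment; the function names `relax_block` and `blocked_floyd_warshall` and the block-size parameter `b` are hypothetical.

```python
INF = float("inf")


def relax_block(dist, bi, bj, bk, b, n):
    """Relax block (bi, bj) using the pivot vertices that belong to block bk."""
    for k in range(bk * b, min((bk + 1) * b, n)):
        for i in range(bi * b, min((bi + 1) * b, n)):
            for j in range(bj * b, min((bj + 1) * b, n)):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]


def blocked_floyd_warshall(graph, b=2):
    """All-pairs shortest paths; graph[i][j] is the edge weight or INF."""
    n = len(graph)
    dist = [row[:] for row in graph]  # copy so the input is not modified
    rounds = (n + b - 1) // b  # number of blocks per dimension
    for r in range(rounds):
        # Phase 1: the pivot block depends only on itself.
        relax_block(dist, r, r, r, b, n)
        # Phase 2: blocks in the pivot row and pivot column.
        for t in range(rounds):
            if t != r:
                relax_block(dist, r, t, r, b, n)
                relax_block(dist, t, r, r, b, n)
        # Phase 3: all remaining blocks, which are independent of each other.
        for bi in range(rounds):
            for bj in range(rounds):
                if bi != r and bj != r:
                    relax_block(dist, bi, bj, r, b, n)
    return dist


if __name__ == "__main__":
    g = [[0, 3, INF, 7], [8, 0, 2, INF], [5, INF, 0, 1], [2, INF, INF, 0]]
    for row in blocked_floyd_warshall(g, b=2):
        print(row)
```

Within phase 2 and within phase 3 the blocks do not depend on one another, so those loops are the natural candidates for distribution across threads, CUDA thread blocks, or GPUs.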
- Trace the UCX project code.
- Modify the code to meet the spec.
Use OpenMP and CUDA to parallelize the K-Means clustering algorithm, resulting in a 7.82x speedup, with experiments and a complete report. See the presentation slides (PPT) in that folder for more details.
Optimization :
- Sequential version 91.8s
- OpenMP version 47.2s
- CUDA version 11.7s
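The structure being parallelized can be sketched in a few lines of sequential Python, assuming the standard Lloyd iteration with squared Euclidean distance. This is an illustration only, not the project's OpenMP or CUDA code; the function names `assign`, `update`, and `kmeans` are hypothetical.

```python
def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    labels = []
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        labels.append(dists.index(min(dists)))
    return labels


def update(points, labels, centroids):
    """Move each centroid to the mean of its members; keep it if it has none."""
    updated = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            updated.append(old)
            continue
        dim = len(old)
        updated.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
        )
    return updated


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and update until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = update(points, labels, centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    print(kmeans(pts, [(0.0, 0.0), (10.0, 10.0)]))
```

The assignment step treats every point independently, which makes it the usual target for a parallel loop or a one-thread-per-point GPU kernel; the update step is a reduction over the members of each cluster.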
A+ 1/80