A parallel program is a computer program written to run on a cluster of computers instead of a single computer. A cluster is a group of everyday computers networked together so that they can communicate, which lets programs with a lot of computation run faster. A cluster built this way from ordinary machines is called a Beowulf cluster, and it is a cheaper alternative to a supercomputer. It is, however, less convenient, because you have to explicitly program what the different computers do and how they interact. This website will guide you in running some sample parallel programs on a Beowulf cluster.
To write a parallel program in C++, first include the header "mpi.h" in your code, then write the rest of the program. When you compile the program with the command mpic++ or mpiCC and run it with the command mpirun, a copy of the executable runs simultaneously on every computer in the cluster.
To make the computers work together, you use functions from the library "mpi.h" that do things such as:
-Identify each computer with a number, called the computer's "rank" (the function stores that number in one of its arguments)
-Send or receive data from one specified rank to another specified rank
-Set a barrier that each computer stops at until all the computers have reached it; then they all continue on, synchronized
To learn more about MPI check out these links:
a nice introductory MPI tutorial PDF
a comprehensive list of MPI functions
an extraordinarily detailed documentation for MPI
Some random notes from my experience parallel programming with
-Sending information from computer to computer takes the most time, so keep it to an absolute minimum! Because of this, it is sometimes more time efficient to perform the same calculation on every computer, even though that might seem wasteful, just so you don't have to send the results out.
-Every data object you declare exists once on each computer, so it behaves like an array of that object with one element per computer.
-When writing code, it is difficult to get into the mindset that the code is being run multiple times, once on every computer.
-It helps to keep strict track of each function and whether it needs to be called by all machines at once or if it can be called on its own.