OPENMP
The 1d_pimc program can run in multi-threaded mode using OpenMP. Parallelization is applied to the update algorithm, which is the core of the MC simulation: most of the running time is spent in the MC steps, which move from one configuration to the next along the Markov chain.
Multi-threaded mode is activated by passing the -fopenmp flag to the compiler. The provided makefile has this option enabled, so remove it if you want a serial build. When the flag is present, the compiler defines the _OPENMP macro, which the code uses to select the multi-threaded code paths. In this program, all OpenMP pragmas and related modifications appear only in the pimcclass.* files. Once multi-threaded mode is active, the program needs to know how many threads to use. This is set in the pimcclass.h header:
#ifdef _OPENMP
// when compiled with -fopenmp, _OPENMP is defined
#include <omp.h>
#define omp_num_threads 2 // max used number_of_threads
#else
#define omp_get_thread_num() 0
#define omp_get_max_threads() 1
#endif
By default, the number of threads is set to 2 when multi-threaded mode is selected. To change it, edit the value and recompile the code (this is inconvenient; I plan to upgrade the code soon so that this parameter is handled differently, through an environment variable in the makefile).
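As a rough illustration of why the serial fallback macros are useful, the sketch below compiles both with and without -fopenmp and calls omp_get_thread_num() and omp_get_max_threads() in either case. The helper iterations_per_thread is hypothetical and not part of 1d_pimc. In an OpenMP build, the value returned by omp_get_max_threads() can be chosen at run time through the standard OMP_NUM_THREADS environment variable; whether 1d_pimc honours it depends on how omp_num_threads is used in pimcclass.*, which is not shown here.

```cpp
#include <cassert>
#include <vector>

#ifdef _OPENMP
#include <omp.h>  // real OpenMP runtime when compiled with -fopenmp
#else
// Serial fallbacks, following the same idea as in pimcclass.h
#define omp_get_thread_num() 0
#define omp_get_max_threads() 1
#endif

// Hypothetical helper: counts how many loop iterations each thread handled.
std::vector<int> iterations_per_thread(int n_iterations) {
    const int n_threads = omp_get_max_threads();
    std::vector<int> counts(n_threads, 0);

    #pragma omp parallel for
    for (int i = 0; i < n_iterations; ++i) {
        const int tid = omp_get_thread_num();
        assert(tid < n_threads);
        counts[tid] += 1;  // each thread writes only to its own slot
    }
    return counts;
}
```

In a serial build the pragma is ignored and the returned vector has a single entry holding all iterations; in an OpenMP build it has one entry per available thread.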
Now a few words about the implementation of the multi-threaded update. A Metropolis step reads the value of the configuration at a given point (selected by a particle and a time-slice) and attempts a random move, which is accepted or rejected with a defined probability. A naive (and wrong) approach would be to assign consecutive points to different threads and update them in parallel. The problem is that evaluating the local action for the Metropolis test requires not only the value at that point but also other values of the configuration. For example, with a right derivative we need the position of the particle in the next time-slice and, even worse, if an interaction term is present we also need the positions of the other particles in the same time-slice. In addition, a single random number generator cannot be shared among the threads, because its state would be accessed concurrently without any control, with the risk of different threads drawing the same number. To handle these problems, the code adopts the following strategy:
1. Define as many random number generators as there are threads, seeding each one differently.
2. Split the configuration into even and odd time-slices.
3. Taking only the even (or only the odd) time-slices, assign them to different threads.
4. Perform the update in parallel across the threads.
5. Switch to the remaining time-slices and repeat from step 2.
This scheme should be thread-safe. Each thread has its own random number generator, so the generator state is accessed only by that thread. Accessing the next or the previous time-slice (for example, when using a right or left derivative) is safe because even and odd time-slices are updated in separate passes. The positions of the other particles are not modified by any other thread, because each thread handles an entire time-slice (this is needed if the interaction potential is long-ranged).
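The even/odd strategy described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in pimcclass.*: it uses a single particle, a harmonic potential, a periodic path with an even number of time-slices, and std::mt19937 generators seeded as base_seed + thread index. The names local_action, make_rngs and sweep are hypothetical. With several particles, the loop body would update every particle of time-slice t before moving on.

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

#ifdef _OPENMP
#include <omp.h>
#else
#define omp_get_thread_num() 0
#define omp_get_max_threads() 1
#endif

// Illustrative local action (assumption: one particle, harmonic potential,
// periodic path). It collects every term that depends on slice t when its
// value is xt: the two kinetic links to t-1 and t+1 plus the potential.
double local_action(const std::vector<double>& x, int t, double xt, double dt) {
    const int n = static_cast<int>(x.size());
    const double prev = x[(t - 1 + n) % n];
    const double next = x[(t + 1) % n];
    const double kinetic =
        ((next - xt) * (next - xt) + (xt - prev) * (xt - prev)) / (2.0 * dt);
    return kinetic + dt * 0.5 * xt * xt;
}

// One generator per thread, each seeded differently (hypothetical seeding rule).
std::vector<std::mt19937> make_rngs(unsigned base_seed) {
    std::vector<std::mt19937> rngs;
    for (int i = 0; i < omp_get_max_threads(); ++i) {
        rngs.emplace_back(base_seed + static_cast<unsigned>(i));
    }
    return rngs;
}

// One full sweep: all even slices in parallel, then all odd slices.
// Returns the number of accepted moves.
int sweep(std::vector<double>& x, std::vector<std::mt19937>& rngs,
          double dt, double delta) {
    const int n = static_cast<int>(x.size());
    // With periodic boundaries an odd n would make slices n-1 and 0
    // neighbours of the same parity, so this sketch requires an even n.
    assert(n % 2 == 0);
    assert(static_cast<int>(rngs.size()) >= omp_get_max_threads());

    int accepted = 0;
    for (int parity = 0; parity < 2; ++parity) {
        #pragma omp parallel for reduction(+ : accepted)
        for (int t = parity; t < n; t += 2) {
            std::mt19937& rng = rngs[omp_get_thread_num()];  // thread-private state
            std::uniform_real_distribution<double> shift(-delta, delta);
            std::uniform_real_distribution<double> unit(0.0, 1.0);

            const double trial = x[t] + shift(rng);
            // Neighbours t-1 and t+1 have the other parity, so no thread
            // writes to them during this pass.
            const double dS =
                local_action(x, t, trial, dt) - local_action(x, t, x[t], dt);
            if (dS <= 0.0 || unit(rng) < std::exp(-dS)) {
                x[t] = trial;
                ++accepted;
            }
        }
    }
    return accepted;
}
```

A caller would build the generators once with make_rngs and then call sweep repeatedly on the same path. Because the pragma is ignored without -fopenmp, the same code also runs as a serial even/odd sweep.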