ROOT and OpenMP

Parallel analysis of data in the ROOT environment can be accomplished using OpenMP. In particular, the Minuit2 library was written with OpenMP support. When working with this library, you must set the USE_PARALLEL_MINUIT2 and USE_OPENMP environment variables.

OpenMP is also built into the TMVA package, a comprehensive toolkit for multivariate data analysis in ROOT. In the classes that implement the genetic algorithm, GeneticAlgorithm and GeneticPopulation, large loops are parallelized with OpenMP. The FitUtil class of the mathematical library implements the FitUtilParallel() method.

Finally, ROOT allows the execution of user macros written using OpenMP library procedures, but currently only in batch mode.

Let's look at an example of how to start filling histograms using OpenMP.

The first step is to edit the .bashrc file. In this file you need to add the path where ROOT is installed on your machine. For example, for the HybriLIT cluster:

# .bashrc
....................
....................
....................

# User specific aliases and functions

export ROOTSYS=/cvmfs/hybrilit.jinr.ru/sw/root
export PATH=$PATH:$ROOTSYS/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ROOTSYS/lib

Next, in order to access OpenMP in the user macro you need to add

		#include <omp.h> 
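A quick way to check that OpenMP is really enabled at compile time is a small helper like the one below. It is only a sketch: the function name availableThreads is hypothetical, and the _OPENMP guard is there so that the same file still builds when the -fopenmp flag is omitted.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: returns the number of threads OpenMP may use,
// or 1 when the code was compiled without OpenMP support.
int availableThreads() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

Printing the returned value at the start of a program shows whether the thread count requested through OMP_NUM_THREADS was picked up.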
		

Below is the code of a macro that draws random numbers, performs calculations with them, and fills histograms with the results:

#include "TROOT.h"
#include "TH1F.h"
#include "TRandom.h"
#include "TCanvas.h"
#include "TFrame.h"
#include "TDatime.h"
#include "TF1.h"
#include "TFile.h"
#include "TStopwatch.h"
#include <cmath>
#include <iostream>
#include <omp.h>

using namespace std;

int main(){
   gROOT->Time();
   TStopwatch timer;
   const Int_t nHist=48;
   const Long64_t n=10000000;
   TH1F *fH1F[nHist];
   TFile *f=new TFile("ompYes.root","recreate");
   cout<<"max threads="<<omp_get_max_threads()<<endl;
   timer.Start();
   // Create the histograms serially, before the parallel region starts.
   for(Int_t i=0;i<nHist;i++){
      fH1F[i] = new TH1F(Form("hpx%d",i),"The px distribution",200,0,200);
   }
   // Each iteration fills its own histogram, so no two threads write to the same object.
   #pragma omp parallel for shared(fH1F)
   for(Int_t i=0;i<nHist;i++){
      // One generator per iteration: a single shared TRandom would be a data race.
      TRandom rng(i+1);
      for(Long64_t j=0;j<n;j++){
         Double_t x=rng.Gaus(0.,i+1.);
         // y is declared inside the loop, so each thread has its own copy.
         Double_t y=sqrt(x*x*x*x+x*x+1)-cos(x+5)*sin(x-5);
         fH1F[i]->Fill(y);
      }
   }
   f->Write();
   timer.Stop();
   cout<<"time: ";
   timer.Print("m");
   f->Close();
   return 0;
}
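The macro above fills each histogram from exactly one iteration of the outer loop, which is what makes the loop safe to parallelize. The same pattern can be sketched without ROOT. In the example below, plain vectors of bin counts stand in for the histograms and std::mt19937 stands in for the ROOT generator; the function name fillHistograms and these substitutions are assumptions made for illustration, not part of the original macro.

```cpp
#include <cmath>
#include <random>
#include <vector>

// Plain-C++ stand-in for the macro: each outer iteration owns one
// "histogram" (a vector of bin counts) and one random generator.
std::vector<std::vector<long>> fillHistograms(int nHist, long nSamples) {
    const int nBins = 200;  // same binning as the macro: 200 bins on [0, 200)
    std::vector<std::vector<long>> hist(nHist, std::vector<long>(nBins, 0));

    #pragma omp parallel for
    for (int i = 0; i < nHist; ++i) {
        std::mt19937 rng(i + 1);  // generator private to this iteration
        std::normal_distribution<double> gaus(0.0, i + 1.0);
        for (long j = 0; j < nSamples; ++j) {
            double x = gaus(rng);
            // y is declared here, so every thread has its own copy
            double y = std::sqrt(x*x*x*x + x*x + 1) - std::cos(x + 5) * std::sin(x - 5);
            if (y >= 0.0 && y < nBins) {
                ++hist[i][static_cast<int>(y)];  // only iteration i touches hist[i]
            }
        }
    }
    return hist;
}
```

Because every iteration seeds its own generator from the loop index, the result does not depend on how many threads run the loop or on the order in which iterations are scheduled.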

To compile a macro, you must run the command

g++ -fopenmp NameMacros.C `root-config --cflags --libs`

Upon successful compilation, an executable binary file is generated; by default it is named a.out. Details on how to run OpenMP applications on the HybriLIT cluster can be viewed here. The recommended form of the script file, for example with 12 threads, is:

#!/bin/sh
#SBATCH -p cpu
#SBATCH -c 12
#SBATCH -t 60
export OMP_NUM_THREADS=12
export OMP_PLACES=cores
./a.out

To start the application, use the following command:

$ sbatch omp_script

It should be noted that OpenMP makes it easy to parallelize the arithmetic operations performed in macros. Methods specific to the ROOT package, however, often cannot be parallelized with this technology. For example, the TTree and TFile classes are not thread-safe, because they manipulate global data and not all of this data is fully protected. Objects of these classes should therefore not be shared between threads without locking. However, you can create multiple TFile objects (and therefore multiple TTree objects) that read the same physical file.
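When several threads do need to contribute to one object, a common workaround is to give each thread a private copy and merge the copies under a lock. The sketch below illustrates the idea with a plain vector of bin counts standing in for a histogram. It does not use ROOT, and the function name fillShared and the trivial per-event work are assumptions made for the example.

```cpp
#include <vector>

// Fills one shared "histogram" from many threads. Each thread counts
// into a private copy; the copies are merged inside a critical section,
// so the shared vector is never written by two threads at once.
std::vector<long> fillShared(long nSamples, int nBins) {
    std::vector<long> shared(nBins, 0);

    #pragma omp parallel
    {
        std::vector<long> local(nBins, 0);  // private to this thread

        #pragma omp for
        for (long j = 0; j < nSamples; ++j) {
            ++local[j % nBins];             // stand-in for real per-event work
        }

        #pragma omp critical
        for (int b = 0; b < nBins; ++b) {
            shared[b] += local[b];          // merge under the lock
        }
    }
    return shared;
}
```

The lock is taken once per thread rather than once per event, so the cost of synchronization stays small compared with the work done in the loop.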

Keep in mind that creating threads also takes time. The time gained by running the program on several threads may turn out to be less than the time spent creating those threads. The use of OpenMP is justified only if each thread performs a significant number of arithmetic operations. Otherwise, it is preferable to use PROOF to parallelize the program. An example (download file) that compares the efficiency of parallelization with OpenMP and with PROOF is available at this link.
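One way to avoid paying the thread start-up cost on small workloads is the OpenMP if clause, which runs a loop in parallel only when a condition holds. The sketch below assumes an arbitrary threshold of 10000 iterations; the function name sumSquares and the threshold value are illustrative and should be tuned by measurement on the target machine.

```cpp
// Sums i*i for i in [0, n). The loop is parallelized only when n exceeds
// the threshold; for small n it runs on a single thread and avoids
// starting extra threads.
long long sumSquares(long long n) {
    const long long threshold = 10000;  // assumed value, tune by measuring
    long long sum = 0;

    #pragma omp parallel for reduction(+:sum) if(n > threshold)
    for (long long i = 0; i < n; ++i) {
        sum += i * i;
    }
    return sum;
}
```

The reduction clause gives each thread its own partial sum and combines them at the end, so the result is the same in the serial and the parallel case.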