Data shuffling from Dr Ian Korir on 2010-10-28 (RootTalk)

From: Dr Ian Korir <IKorir_at_nnr.co.za>
Date: Thu, 28 Oct 2010 15:13:09 +0200

Dear ROOTers,

This is a very basic question. I have a 1D array of about 20M integers (which actually represent cells of larger array data structures) and want to increase the entropy, or randomness, of the data. In the ideal situation this may mean shuffling at least 20M!/2 times, even though basic maths says only 6 or 7 will do. This could probably take a century or so on a standard workstation.
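
For a sense of scale, Stirling's approximation gives a rough size for the number of orderings of N = 2×10^7 cells. This is only a back-of-envelope estimate, with the lower-order terms dropped:

```latex
\ln N! \approx N \ln N - N
       = 2\times10^{7}\,(16.81 - 1) \approx 3.16\times10^{8},
\qquad
N! \approx 10^{\,3.16\times10^{8}/\ln 10} \approx 10^{\,1.37\times10^{8}}
```

A count of that size cannot be enumerated on any machine, so in practice a shuffle can only aim to draw one ordering at random.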

Now here comes my question: what is the most efficient way of doing the shuffling in a reasonable time?

My present algorithm goes like this for 1000 shuffles of the cell locations:

//--------------------------------------------------------------------

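TRandom3 gRan;
gRan.SetSeed(0);   // seed 0: let ROOT choose the seed itself

const Int_t UWL = 20000000;   // number of cells
// Note: each 20M-entry Int_t array takes about 80 MB
Int_t dd1[UWL];
Int_t dd2[UWL];
Int_t i;
Int_t is;
Int_t LOOP = 500;   // each pass shuffles twice (dd1 -> dd2 -> dd1)

for (i=0;i<UWL;i++) dd1[i]=i+1;   // cell IDs 1..UWL; -1 marks a cell already taken

printf("\n===================== Shuffling for %i Event =====================", LOOP);

while (LOOP-->0){
   // Draw random slots of dd1 until every cell has been moved into dd2
   for (i=0;i<UWL;){
      is=gRan.Integer(UWL);
      if (dd1[is]>0) {dd2[i]=dd1[is]; dd1[is]=-1;i++;}
   }
   // Same again in the opposite direction, dd2 -> dd1
   for (i=0;i<UWL;){
      is=gRan.Integer(UWL);
      if (dd2[is]>0) {dd1[i]=dd2[is]; dd2[is]=-1;i++;}
   }
   printf("\nShuffling @ event %i ", LOOP);
}//while LOOP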

//--------------------------------------------------------------------

Is there a simpler and faster approach in ROOT to perform this kind of entropy increase?

On another note, which is faster for data synthesis: storing the data in, say, a TH1F, or in ordinary arrays, given the same data size?

Regards,

Ian

Received on Thu Oct 28 2010 - 15:12:36 CEST
