In principle, errors are trapped by ROOT. However, correct continuation depends a lot on where the error happened. If it was somewhere in the interpreter, things are currently not correctly reset to allow a restart. We expect to work on this problem with Masa during the ROOT workshop at Fermilab. If the error happens somewhere else, a lot depends on how well the code can handle re-entrancy, etc. The system should support some global reset so that one can continue with the next event in a clean state.

Cheers, Fons.

Valeriy Onuchin wrote:
>
> Hi Rooters!
> We are using ROOT for online monitoring
> http://emcal06.rhic.bnl.gov/~onuchin/Sproot/html/USER_Index.html
>
> One of our main problems is providing
> fault-tolerant processing =
> providing recovery from system/ROOT/process failure.
>
> Does anybody have solutions or experience in dealing with this?
>
> With best regards, Valery
>
> P.S.
> I suppose similar problems must exist in offline processing too,
> e.g. AtlasFast and Star have a chain of makers;
> what do you do when one of the makers crashes your ROOT session?
>
> ... and a suggestion:
> we are using TMapFiles for local storage of processed data
> http://emcal06.rhic.bnl.gov/~onuchin/Sproot/html/DbManager.html
> After the introduction of TMapRec it became possible to loop over
> the objects in a TMapFile,
>
> but could you (Fons) change TMapFile::AcquireSemaphore()
> and TMapFile::ReleaseSemaphore() from protected to public?

--
Org: CERN, European Laboratory for Particle Physics.
Mail: 1211 Geneve 23, Switzerland
E-Mail: Fons.Rademakers@cern.ch
Phone: +41 22 7679248
WWW: http://root.cern.ch/~rdm/
Fax: +41 22 7677910
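To illustrate the "global reset, then continue with the next event" idea for a chain of makers, here is a minimal sketch in plain C++. It is not ROOT code: `Maker`, `RunChain`, and the `process`/`reset` hooks are hypothetical names invented for this example, and it assumes that a failing maker reports the problem by throwing a C++ exception.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for one "maker" in a processing chain (not a ROOT class).
struct Maker {
    std::string name;
    std::function<void(int)> process;  // handles one event; may throw on failure
    std::function<void()> reset;       // puts the maker back into a clean state
};

// Runs every maker on events 0 .. nEvents-1. If any maker throws, all makers
// are reset and processing continues with the next event.
// Returns the number of events that were skipped because of a failure.
int RunChain(std::vector<Maker>& chain, int nEvents) {
    assert(nEvents >= 0);
    int skipped = 0;
    for (int event = 0; event < nEvents; ++event) {
        try {
            for (auto& maker : chain) {
                maker.process(event);
            }
        } catch (const std::exception& e) {
            std::cerr << "event " << event << " failed: " << e.what() << '\n';
            // The "global reset": every maker is reset, not only the one that failed.
            for (auto& maker : chain) {
                maker.reset();
            }
            ++skipped;
        }
    }
    return skipped;
}
```

This only covers failures that surface as exceptions. A crash such as a segmentation violation takes down the whole process, so surviving that would need something stronger, for example running the chain in a separate process that a supervisor can restart.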
This archive was generated by hypermail 2b29 : Tue Jan 04 2000 - 00:43:30 MET