Log of /trunk/proof/proofplayer/inc/TPacketizerAdaptive.h
Revision 38810
Modified Tue Apr 12 16:22:59 2011 UTC (3 years, 9 months ago) by ganis
File length: 6457 byte(s)
Diff to previous 35196
Patch to correctly honour selector abort status settings in PROOF. Currently only
the TSelector::kAbortProcess was handled by stopping processing. In particular
TSelector::kAbortFile was ignored; this recently created some problems in ALICE
with corrupted files, with repeated attempts to read events eventually leading to
bad_alloc exceptions.
This patch also fixes other related issues, in particular with the reporting of the
non-processed {files, events} in the final 'MissingFiles' list. This list should
now account much more precisely for the number of events which could not be processed.
It also fixes a problem with the final update of the progress information that
occasionally affected cases with skipped events.
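The difference between the two abort states described in this message can be illustrated with a small standalone sketch. It does not use ROOT: the `Abort` enum only mirrors the names of the TSelector states, and `FileToProcess`, `MissingFile` and `ProcessAll` are hypothetical names introduced for illustration. The assumption is that an abort-file status skips the rest of the current file, an abort-process status stops everything, and every unprocessed entry is recorded.

```cpp
#include <string>
#include <vector>

// Stand-in for the TSelector abort states (names mirrored, not the ROOT enum).
enum class Abort { kContinue, kAbortProcess, kAbortFile };

struct FileToProcess {   // hypothetical description of an input file
   std::string name;
   long        entries;
};

struct MissingFile {     // hypothetical record for a 'MissingFiles'-like list
   std::string name;
   long        unprocessed;
};

// Process the files in order. 'processEntry(file, entry)' returns the abort
// status after trying to process one entry. The entry that triggers an abort
// is counted as not processed.
template <typename F>
std::vector<MissingFile> ProcessAll(const std::vector<FileToProcess> &files, F processEntry)
{
   std::vector<MissingFile> missing;
   bool stop = false;   // set once an abort-process status is seen
   for (const auto &f : files) {
      long done = 0;
      if (!stop) {
         for (; done < f.entries; ++done) {
            Abort st = processEntry(f, done);
            if (st == Abort::kAbortFile) break;                       // skip rest of this file
            if (st == Abort::kAbortProcess) { stop = true; break; }   // stop everything
         }
      }
      // Record whatever could not be processed in this file.
      if (done < f.entries) missing.push_back({f.name, f.entries - done});
   }
   return missing;
}
```

With an abort-file status only the remainder of the affected file ends up in the list; with an abort-process status all later files are listed as well.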
Revision 35196
Modified Wed Sep 8 11:44:34 2010 UTC (4 years, 4 months ago) by ganis
File length: 6354 byte(s)
Diff to previous 34748
Make the recently introduced list of files to be processed owned by TPacketizerAdaptive, instead
of a static in TPacketizerAdaptive::TFileStat. This fixes a possible problem when running multiple
queries in the same session.
Revision 34748
Modified Mon Aug 9 10:18:05 2010 UTC (4 years, 5 months ago) by ganis
File length: 6248 byte(s)
Diff to previous 34637
Add the possibility to save the performance information shown by the dialog into a small
ntuple included in the output list. The ntuple contains 5 floats (processing time, number
of active workers, event rate, MBytes read, number of effective sessions on the cluster)
and it is filled each time the number of active workers changes or at a maximum of 100 regular
intervals at least 5 secs apart; in this way the ntuple has at most O(100 + number
of workers) entries.
To enable the saving of the ntuple execute the following:
    proof->SetParameter("PROOF_SaveProgressPerf", "yes");
before running the query. The ntuple is called 'PROOF_ProgressPerfNtuple'.
This patch also adds to the output list the parameters used by the active packetizer. Some
parameters of general interest (currently MinPacketTime and MaxPacketTime) have been moved
to TVirtualPacketizer and are always added to the list. Each packetizer is then responsible
for adding its relevant specific parameters to the dedicated list. The dedicated list is hosted
in TVirtualPacketizer and is transferred to the output list by TProofPlayer when finalising
the output list at the end of the query.
Revision 34637
Modified Wed Jul 28 14:40:47 2010 UTC (4 years, 5 months ago) by ganis
File length: 6343 byte(s)
Diff to previous 34527
From Maciek Nabożny and me:
Add the possibility to single-out disk partitions in the packetizer; this works by
adding the beginning of a path in the name defining a new TFileNode (e.g. 'host://disk1'
instead of 'host' only as it was so far).
The feature was requested both by ATLAS and ALICE; it is optional and can be triggered
by defining the rootrc variable
Packetizer.Partitions /disk1,/disk2,/disk3
(The administrator of a PROOF cluster can add this via 'xpd.putrc'; the user can test this
via the parameter 'PROOF_PacketizerPartitions'; see runProof.C).
In the extreme case where all files on the same disk are grouped together in the dataset
definition, this addition can save up to 20% of the processing time on a 4-core machine
with 2 disks. A systematic study of the impact of this development is ongoing.
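The naming scheme described above ('host://disk1' instead of 'host') can be sketched in standalone C++. This is an illustration only, not the packetizer code: `SplitPartitions` and `NodeName` are hypothetical helpers, and the matching rule (the file path must start with the partition as a whole path component) is an assumption.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split a comma-separated list such as "/disk1,/disk2,/disk3".
std::vector<std::string> SplitPartitions(const std::string &list)
{
   std::vector<std::string> out;
   std::stringstream ss(list);
   std::string item;
   while (std::getline(ss, item, ','))
      if (!item.empty()) out.push_back(item);
   return out;
}

// Name of the file node for a file at 'path' on 'host'. If the path lies on
// one of the partitions, the node is 'host:/' + partition (e.g. 'host://disk1');
// otherwise it is just 'host'. The matching rule is an assumption.
std::string NodeName(const std::string &host, const std::string &path,
                     const std::vector<std::string> &partitions)
{
   for (const auto &p : partitions) {
      bool startsWith = path.compare(0, p.size(), p) == 0;
      // Require a component boundary so that "/disk1" does not match "/disk10".
      if (startsWith && (path.size() == p.size() || path[p.size()] == '/'))
         return host + ":/" + p;
   }
   return host;
}
```

Files on different partitions of the same host then map to distinct node names, which is what lets them be scheduled separately.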
Revision 34527
Modified Wed Jul 21 10:14:35 2010 UTC (4 years, 6 months ago) by ganis
File length: 6249 byte(s)
Diff to previous 33781
From Maciek Nabożny and me.
- Optimize the packetizer behaviour when the number of files left to be processed
is smaller than the number of workers and at least one file has a number of events
significantly larger than the average.
- Better apply the upper/lower limits on the expected packet processing time
- Fix an issue with validating the exact number of needed files when the information
about the entries is already available.
Revision 33781
Modified Tue Jun 8 14:13:39 2010 UTC (4 years, 7 months ago) by ganis
File length: 6054 byte(s)
Diff to previous 32204
- Optimize the validation step in the case not all the entries are required.
The validation step is stopped as soon as the requested number of events is reached.
If the parameter "PROOF_ValidateByFile" is set to 1, the number of files is exactly what is
needed; otherwise the number of files may exceed the number of files needed by #workers-1
(this is the default because additional, serial, checks are needed to ensure that only
the files really required are open).
This feature was requested in the context of ALICE reconstruction.
This new feature is used as an example in the "eventproc" tutorial in runProof.C .
- The patch also fixes a subtle bug affecting the (possibly rare) case when not all entries
are required and # entries does not correspond to a complete subset of files (e.g.
# entries = 1001000 with files of 100000 entries each). The effect was incomplete
processing (skipped events, magenta bar) or a session freeze.
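The arithmetic behind the example in this message (1001000 entries requested, files of 100000 entries each) can be worked through with a short standalone sketch. `Needed` and `FilesNeeded` are hypothetical names, and the sketch only counts files in order; it says nothing about how the packetizer actually performs the validation.

```cpp
#include <vector>

struct Needed {
   int  files;          // number of files that must be opened
   long fromLastFile;   // entries taken from the last of those files
};

// Walk the files in order until the requested number of entries is covered.
Needed FilesNeeded(const std::vector<long> &entriesPerFile, long requested)
{
   Needed n{0, 0};
   long left = requested;
   for (long e : entriesPerFile) {
      if (left <= 0) break;                     // request already satisfied
      ++n.files;
      n.fromLastFile = (e < left) ? e : left;   // a partial file is possible
      left -= n.fromLastFile;
   }
   return n;
}
```

For the numbers quoted in the message, ten files are used completely and only 1000 entries are taken from an eleventh file, which is the kind of partial last file the fix is concerned with.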
Revision 32204
Modified Wed Feb 3 19:17:40 2010 UTC (4 years, 11 months ago) by ganis
File length: 6008 byte(s)
Diff to previous 30859
Add support for processing many datasets in one go in TProof::Process(const char *dataset, ...).
Two options are provided:
- 'grand dataset': the datasets are added up and considered as a single dataset;
syntax: "dataset1|dataset2|..."
- 'keep separated': the datasets are processed one after the other; the user is
notified in the selector of the change of dataset so she/he
has the opportunity to separate the results. A new packetizer,
TPacketizerMulti, has been developed for this case: it basically
contains a list of standard packetizers (one for each dataset) and
loops over them.
Syntax: "dataset1,dataset2,..." or "dataset1 dataset2 ..."
In both cases, entry-list can be applied using the syntax "dataset<<entrylist", e.g.
"dataset1<<el1|dataset2<<el2|".
See http://root.cern.ch/drupal/content/working-data-sets#currentelem for more details.
A test for the new functionality has been added to test/stressProof.cxx .
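The dataset-string syntax described in this message can be illustrated with a small standalone parser. This is a sketch and not the parsing code used by TProof::Process: `DatasetRef`, `DatasetRequest` and `ParseDatasets` are hypothetical names, and the rule that any '|' selects the 'grand dataset' mode is an assumption.

```cpp
#include <string>
#include <vector>

struct DatasetRef {
   std::string name;
   std::string entryList;   // empty if no "<<entrylist" was given
};

struct DatasetRequest {
   bool grand = false;                 // true for "ds1|ds2|..." (single grand dataset)
   std::vector<DatasetRef> datasets;   // the individual datasets, in order
};

DatasetRequest ParseDatasets(const std::string &spec)
{
   DatasetRequest req;
   req.grand = spec.find('|') != std::string::npos;
   // '|' separates datasets in grand mode; ',' or ' ' in keep-separated mode.
   const std::string seps = req.grand ? "|" : ", ";
   std::size_t pos = 0;
   while (pos <= spec.size()) {
      std::size_t end = spec.find_first_of(seps, pos);
      if (end == std::string::npos) end = spec.size();
      std::string tok = spec.substr(pos, end - pos);
      if (!tok.empty()) {               // ignore empty tokens, e.g. a trailing '|'
         std::size_t el = tok.find("<<");
         if (el == std::string::npos)
            req.datasets.push_back({tok, ""});
         else
            req.datasets.push_back({tok.substr(0, el), tok.substr(el + 2)});
      }
      pos = end + 1;
   }
   return req;
}
```

A string such as "dataset1<<el1|dataset2<<el2|" yields two datasets with their entry lists in grand mode, while "dataset1,dataset2" yields two datasets to be processed one after the other.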
Revision 30859
Modified Sat Oct 24 14:53:07 2009 UTC (5 years, 3 months ago) by ganis
File length: 6089 byte(s)
Diff to previous 25896
Patch for improved performance monitoring. The 'Rate Plot' button in the progress
dialog is renamed 'Performance plot' and shows up to 4 plots with the event/sec,
the average read chunk size, the number of active workers and the number of active
PROOF sessions on the cluster, all as a function of processing time.
The read chunk size plot makes it possible to monitor the usage of the cache.
The instantaneous processing rate (event/sec) is now better estimated: a few issues
with the normalizing times have been solved, removing the artificial structures that
were observed.
The possibility to set a max packet time length is introduced (default 30 s); this
can be changed with the parameter PROOF_MaxPacketTime.
The size of the cache is also taken into account to optimize the use of the cache.
The parameter PROOF_UseParallelUnzip has been introduced to toggle the use of the
parallel unzip (default off for now).
A page describing the new performance plots is under preparation at
http://root.cern.ch/drupal/content/progress-dialog
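How a maximum packet time could bound the size of a packet can be sketched as follows. This is an assumption about the general idea, not the formula used by the packetizer: `PacketEntries` is a hypothetical function that clamps a target processing time into a [minimum, maximum] window and converts it to a number of entries using the measured rate.

```cpp
#include <algorithm>

// Number of entries for the next packet, given the measured processing rate
// (entries/s) and a target processing time (s). The target is clamped into
// [minTime, maxTime] before being converted; at least one entry is returned.
// All names and the rule itself are illustrative assumptions.
long PacketEntries(double rate, double targetTime, double minTime, double maxTime)
{
   double t = std::min(std::max(targetTime, minTime), maxTime);
   long n = static_cast<long>(rate * t);
   return std::max(1L, n);
}
```

With a maximum of 30 s, as in the default quoted above, a worker processing 1000 entries/s would never be handed more than 30000 entries in one packet under this rule.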
Revision 25896
Modified Mon Oct 20 17:01:31 2008 UTC (6 years, 3 months ago) by ganis
File length: 5896 byte(s)
Diff to previous 25827
From Jan:
- Using consistently the recently introduced TProofProgressStatus in the kPROOF_GETPACKET
messages sent to TPacketizerUnit, TPacketizerAdaptive and TPacketizer; the message contains
the status of progress since the start of processing on a given node.
- Introduce TVirtualPacketizer::TVirtualSlaveStat as a base class of all the TSlaveStat
packetizer specific auxiliary classes.
- Full implementation of GetProgressStatus() and AddProcessed(TProofProgressStatus *st) members
for TPacketizerUnit::TSlaveStat.
This patch should fix some consistency problems experienced after the patch introducing
TProofProgressStatus .
Revision 25827
Modified Wed Oct 15 14:02:59 2008 UTC (6 years, 3 months ago) by ganis
File length: 5847 byte(s)
Diff to previous 25273
From Jan:
- Added the possibility to handle removed workers and partly processed packets. When a worker is stopped
while processing a packet, it finishes the current event and the rest of the packet is reassigned to another
worker. This is done via two interfaces:
- TVirtualPacketizer::AddProcessed(TSlave *sl, TProofProgressStatus *st, TList **)
- TVirtualPacketizer::ReassignPacket.
- New class TProofProgressStatus used to keep the query progress status in all the TProofPlayer objects and in
TPacketizerAdaptive::TSlaveStat. This class is also used to structure the relevant information sent in
kPROOF_GETPACKET and kPROOF_STOPPROCESS messages.
- The class TPacketizerProgressive is removed completely.
- The PROOF protocol version is increased to 19: this is to handle the changes in the kPROOF_STOPPROCESS and
kPROOF_GETPACKET messages in Master - worker communication.
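The handling of a partly processed packet described in this message amounts to splitting the packet at the point where the worker stopped. A minimal standalone sketch, assuming a packet is just a first entry and a count (`Packet` and `Remainder` are hypothetical names, unrelated to the signatures of the two interfaces listed above):

```cpp
struct Packet {
   long first;   // first entry of the packet
   long num;     // number of entries in the packet
};

// Part of packet 'p' that still has to be reassigned after a worker reported
// 'processed' entries done. A result with num == 0 means nothing is left.
Packet Remainder(const Packet &p, long processed)
{
   if (processed < 0) processed = 0;           // guard against bad reports
   if (processed > p.num) processed = p.num;
   return Packet{p.first + processed, p.num - processed};
}
```

For a packet starting at entry 1000 with 500 entries, of which 120 were processed, the remainder starts at entry 1120 and holds 380 entries; that remainder is what would go to another worker.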
Revision 25273
Modified Wed Aug 27 08:56:06 2008 UTC (6 years, 4 months ago) by rdm
File length: 5748 byte(s)
Diff to previous 23986
From Jan:
- A new optional version of PROOF with dynamic worker startup.
It can be enabled by the admin or a user with 'Proof.DynamicStartup'.
A session starts only on the master. When a query processing starts
at the master TXProofServ::GetWorkers() is called. It receives a
list of machines from the scheduler and the workers on the
machines are started. The environment is copied from the master
to the workers. It includes: the include and dynamic library paths,
the set of enabled packages as well as the macros loaded by the user.
- A new method TProof::AddWorkers(TList *workers) was added. It adds
the workers just before the query.
- A packet resubmitting mechanism. When a worker dies all the packets
that it processed are resubmitted.
- In TPacketizerAdaptive: fixing initialization of fgMaxSlaveCnt. By
default it was initialized twice.
Revision 23986
Modified Fri May 23 09:10:18 2008 UTC (6 years, 8 months ago) by ganis
File length: 5515 byte(s)
Diff to previous 23244
Import fixes / new functionality from branches/dev/proof:
- proof/proofplayer/src/TPacketizerAdaptive.cxx
Implement the classic strategy of the TPacketizer in TPacketizerAdaptive.
The strategy can be changed from adaptive (default) to TPacketizer with:
"PROOF_PacketizerStrategy" parameter to PROOF
- proof/proofplayer/src/TProofPlayer.cxx
Fixed error messages for 'MissingFiles' and 'FailedPackets' lists.
Improve fault detection by creating a list of failed packets upon a mismatch
in the expected and actual number of processed events; the list is added to
the output list.
- proof/proofplayer/src/TVirtualPacketizer.cxx
Make sure that something has been processed before setting kIsDone.
Improve fault detection by creating a list of failed packets upon a mismatch
in the expected and actual number of processed events; the list is added to
the output list.
- proof/proofplayer/inc/TVirtualPacketizer.h
Improve fault detection by creating a list of failed packets upon a mismatch
in the expected and actual number of processed events; the list is added to
the output list.
- proof/proofplayer/inc/TPacketizerAdaptive.h
Implement the classic strategy of the TPacketizer in TPacketizerAdaptive
- proof/proof/src/TProofServ.cxx
Fixes:
+ option string: "stageOnly" --> "stagedOnly".
+ add parenthesis to avoid a warning after the previous patch.
+ remove the objects added to the missingFiles in TDSet::Add from the 'dataset'
before deleting it
+ fixed an error in HandleCheckFile ('kPROOF_WorkDir' instead of 'kPROOF_PackDir').
+ make fCacheDir and fPackageDir controllable via directive
+ in TProofServ::ErrorHandler: do not create the related additional buffer if
not logging to syslog
Added functionality:
- add possibility to flag an "Info" message as service message using
the prefix "|Svc" in the location field; e.g.
Info("SetupCommon|Svc", "Test of SvcMsg");
will produce something like
09:28:24 6892 Mst-0 | SvcMsg in <TXProofServ::SetupCommon>: Test of SvcMsg
This is needed to be able in the future to filter-out messages needed
by some services (e.g. the forthcoming memory checker) which should not be
displayed by default.
- proof/proofd/src/XrdProofSched.cxx
Improve the calculation of the number of workers to assign by using fMinForQuery
as a minimum.
Fix the length of method separators.
Fix signed/unsigned warning.
Revision 23244
Modified Wed Apr 16 08:05:01 2008 UTC (6 years, 9 months ago) by ganis
File length: 5501 byte(s)
Diff to previous 22635
TPacketizer:
- Fix a problem in TPacketizer when using the cached info (it was already
fixed in TPacketizerAdaptive)
- Make sure that both Long_t and Int_t are supported for maximum number of
workers per filenode
- Add some conditional debug statements
TPacketizerAdaptive:
- Make sure that both Long_t and Int_t are supported for maximum number of
workers per filenode
- Add some conditional debug statements
Revision 20862
Modified Sun Nov 18 14:51:42 2007 UTC (7 years, 2 months ago) by ganis
Original Path: trunk/proofplayer/inc/TPacketizerAdaptive.h
File length: 5400 byte(s)
Diff to previous 20682
Synchronize with branches/dev/proof r20835
Summary:
TProofServ:
- read session.rootrc with level kEnvChange to be able to change existing
settings
- Avoid deleting a query result twice in some special cases
- Fix a problem with the initialization of fKeptQueries
rootrc.in:
- restore default settings for the asynchronous reading
TFileCacheRead:
- add missing protection in ReadBuffer
TEventIter:
- Enable the usage of TTreeCache
XrdProofdProtocol:
- Additional check on the ownership of the unix socket
- Improve notification during Reset
- Reduce default timeout on admin requests and make it configurable
XrdProofConn:
- Use the configurable maxtry everywhere where relevant
TPacketizerAdaptive:
- Store info on all the processed packets in per-worker lists
TProofMgr, TXProofMgr, TXSocket, XProofProtocol, XrdProofdProtocol:
- Add possibility for the admin to broadcast a message to the connected users
getProof:
- Add missing protection
Revision 20682
Modified Tue Nov 6 15:51:59 2007 UTC (7 years, 2 months ago) by ganis
Original Path: trunk/proofplayer/inc/TPacketizerAdaptive.h
File length: 5459 byte(s)
Diff to previous 20086
Import branches/dev/proof r20654
Summary:
+ Improvements
- Add support for SSH SOCKS4 tunnelling; the local port for the tunnel can be specified
in the master URL, e.g. TProof::Open("master/?tunnel:8000")
- Add the possibility to plot the estimated instantaneous rate
- Add "PROOF_ForceLocal" parameter to the TPacketizerAdaptive; if set to 1, all the data
are processed locally.
- Add support for remote grep functionality while retrieving logs (needed by the forthcoming
memory monitor)
+ Bug fixes
- Several small fixes to revive the multi-master mode.
- XrdProofdProtocol:
- add missing lock to the client instance in SendMsg to avoid screwing up requests from
workers on the same machine
- lock the mutex of the requester when setting priorities
- add notification during Reset
- fix problem with the detection of the 'allow' directive
- fix problem with the parsing of the return value from XrdProofServProxy::TerminateProofServ()
- always use the effective user to retrieve info from another server
(XrdProofdManager::GetProofConn is now used)
- fix possible dead-locks from debug notifications done after hard-killing a session
- re-enable the garbage collector thread of the connection manager in XrdProofConn to
fix a problem with closing physical connections;
- fix a problem with CleanupProofServ in the case of a non-privileged daemon running
in multi-user mode
- introduce a timeout when waiting for the startup of a 'proofserv'.
- XrdProofConn: init mutex in the ctor; lock in SendRecv
- XrdProofSched: add support for using the priorities defined in the group manager to define
the number of workers for sessions
- TProof:
- Broadcast priorities to unique nodes only
- timeout after 5 mins the initial Collect to avoid clients getting stuck at this stage
- add support for generic timeout in Collect (disabled by default)
- fix a problem with SendFile.
- TXProofServ: add a call to TProof::InterruptCurrentMonitor() in Terminate() to stop
infinite loops in Collect
- TXSocket:
- Implement a flag to interrupt a TXSocket while waiting for messages
- Split the session creation timeout in 4 attempts: the total timeout is the same but
it may circumvent occasional forking problems.
- XROOTD:
- fix a potential (possibly academic) memory leak in the client
- fix an access permission problem with Kerberos ticket forwarding
- fix bug preventing 'locate' from working properly
- re-enable optimized 'locate'
Revision 20086
Added Mon Sep 24 18:19:34 2007 UTC (7 years, 4 months ago) by ganis
Original Path: trunk/proofplayer/inc/TPacketizerAdaptive.h
File length: 5299 byte(s)
From Jan:
- Add a heuristic mechanism to avoid undervaluing the processing rate and creating too short packets.
- Add parameters: PROOF_MinPacketTime and PROOF_PacketAsAFraction to allow tuning performance.
- Change name TAdaptivePacketizer to TPacketizerAdaptive in order to follow the naming
convention for packetizers
- Slightly tuning a few parameters.