Log of /trunk/proof/proofplayer/src/TPacketizerAdaptive.cxx

Revision 44065
Modified Wed May 2 12:41:50 2012 UTC (2 years, 8 months ago) by ganis
File length: 81978 byte(s)
Diff to previous 44023
Fix bunch of Coverity reports

Revision 44023
Modified Mon Apr 30 09:04:05 2012 UTC (2 years, 8 months ago) by ganis
File length: 81908 byte(s)
Diff to previous 44021
Fix warning introduced by previous fix

Revision 44021
Modified Mon Apr 30 08:49:11 2012 UTC (2 years, 8 months ago) by ganis
File length: 81906 byte(s)
Diff to previous 42377
Fix for Coverity reports

Revision 42377
Modified Fri Dec 2 15:38:41 2011 UTC (3 years, 1 month ago) by ganis
File length: 81906 byte(s)
Diff to previous 41785
Improve debug notification

Revision 41785
Modified Fri Nov 4 17:01:33 2011 UTC (3 years, 2 months ago) by ganis
File length: 81164 byte(s)
Diff to previous 41635
   Fix issue in TPacketizerAdaptive and TPacketizer preventing proper selection of
   event sub-ranges, i.e. when processing num != -1 entries starting at first > 0.
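
The sub-range case mentioned above (num != -1 entries starting at first > 0) can be illustrated with a minimal sketch. It is not the TPacketizerAdaptive code: the Slice struct and the selectSubRange function are hypothetical names, and the files are assumed to be processed in order with known entry counts.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical per-file slice: file index, local first entry, number of entries.
struct Slice { int file; long long first; long long num; };

// Map a global request (first, num) onto consecutive files.
// num == -1 means "everything from 'first' to the end".
std::vector<Slice> selectSubRange(const std::vector<long long>& entriesPerFile,
                                  long long first, long long num) {
    std::vector<Slice> out;
    long long offset = 0;        // global index of entry 0 of the current file
    long long remaining = num;   // entries still to assign (-1: unlimited)
    for (int i = 0; i < static_cast<int>(entriesPerFile.size()); ++i) {
        long long begin = std::max(first, offset);     // global start in this file
        long long end = offset + entriesPerFile[i];    // global end (exclusive)
        if (begin < end && remaining != 0) {
            long long take = end - begin;
            if (remaining > 0) take = std::min(take, remaining);
            out.push_back({i, begin - offset, take});
            if (remaining > 0) remaining -= take;
        }
        offset = end;
    }
    return out;
}
```

With three files of 100 entries each, first = 150 and num = 100, the sketch selects the last 50 entries of the second file and the first 50 entries of the third.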

Revision 41635
Modified Fri Oct 28 15:02:31 2011 UTC (3 years, 2 months ago) by ganis
File length: 80893 byte(s)
Diff to previous 41619
Another bunch of Coverity-related fixes

Revision 41619
Modified Fri Oct 28 10:07:39 2011 UTC (3 years, 2 months ago) by ganis
File length: 80825 byte(s)
Diff to previous 39835
Fix a first bunch of Coverity issues

Revision 39835
Modified Mon Jun 20 15:37:30 2011 UTC (3 years, 7 months ago) by ganis
File length: 80534 byte(s)
Diff to previous 38810
  - In TPacketizer and TPacketizerAdaptive, fix an issue with counter updating when a number
    of events to be processed is specified (it was working up to a certain number of files and
    then it was getting screwed up).
  - In TPacketizerAdaptive, fix an issue with the option 'ForceLocal' on 'file:///' URLs.
  - In TProofPlayer, optimize two conditional scopes.
  - In TProofPlayerLite, make sure that the Progress timer is stopped when issuing STOP.

Revision 38810
Modified Tue Apr 12 16:22:59 2011 UTC (3 years, 9 months ago) by ganis
File length: 80009 byte(s)
Diff to previous 38616
  Patch to correctly honour selector abort status settings in PROOF. Currently only
  the TSelector::kAbortProcess was handled by stopping processing. In particular
  TSelector::kAbortFile was ignored; this recently created some problems in ALICE
  with corrupted files, with repeated attempts to read events eventually leading to
  bad_alloc exceptions.

  This patch also fixes other related issues, in particular with the reporting of the
  non-processed {files, events} in the final 'MissingFiles' list. This list should
  now account much more precisely for the number of events which could not be processed.

  It also fixes a problem with the final update of the progress information affecting
  occasionally cases with skipped events.

Revision 38616
Modified Thu Mar 24 17:50:46 2011 UTC (3 years, 10 months ago) by ganis
File length: 77081 byte(s)
Diff to previous 37980
  Fix a bug checking the first event, probably introduced by 'fix' #37980 .
  Should fix the issue reported in Savannah #78921 .

Revision 37980
Modified Fri Feb 4 12:37:57 2011 UTC (3 years, 11 months ago) by ganis
File length: 76908 byte(s)
Diff to previous 37396
 Fix for the issue reported in Alice Savannah #75820 .

Revision 37396
Modified Wed Dec 8 13:12:00 2010 UTC (4 years, 1 month ago) by rdm
File length: 76782 byte(s)
Diff to previous 36491
From Gerri:
- Fix a problem with the registration of missing files in the 'MissingFiles'
  list. The files that could not be opened during processing were not properly
  registered in the list (only those found missing during validation or
  giving problems during reading were correctly added).
- Add method TProof::ShowMissingFiles() to facilitate the display of the
  list of missing files.
- Add method TProof::GetMissingFiles() to get a TFileCollection (dataset)
  with the missing files for further processing.

Revision 36491
Modified Wed Nov 3 11:57:18 2010 UTC (4 years, 2 months ago) by ganis
File length: 76065 byte(s)
Diff to previous 36086
  In TPacketizerAdaptive::Reset, fix an issue preventing correct worker-to-filenode matching.
  The problem was introduced with the patch making the packetizer disk partition-aware
  and prevented the 'ForceLocal' option from working properly.
  Also, make sure that the FQDN is used consistently, and that 'localhost' or '127.0.0.1' are
  correctly matched to the local machine host name.

Revision 36086
Modified Tue Oct 5 16:15:41 2010 UTC (4 years, 3 months ago) by ganis
File length: 74815 byte(s)
Diff to previous 35312
Fix a bunch of issues found by Coverity

Revision 35312
Modified Wed Sep 15 20:56:04 2010 UTC (4 years, 4 months ago) by ganis
File length: 74784 byte(s)
Diff to previous 35235
   In TPacketizerAdaptive::GetNextUnAlloc, reset 'node' to NULL if no more
   files on it. Fixes an issue introduced by a recent optimization preventing
   processing of the full dataset in some cases with a small number of workers.

Revision 35235
Modified Sun Sep 12 07:45:13 2010 UTC (4 years, 4 months ago) by ganis
File length: 74674 byte(s)
Diff to previous 35196
   - Add files to the list of files to process only when finally validated.
     Should solve issue reported in ALICE Savannah #72162
     (https://savannah.cern.ch/bugs/?72162)

Revision 35196
Modified Wed Sep 8 11:44:34 2010 UTC (4 years, 4 months ago) by ganis
File length: 74526 byte(s)
Diff to previous 34748
  Make the recently introduced list of files to be processed owned by TPacketizerAdaptive, instead
  of a static in TPacketizerAdaptive::TFileStat. This fixes a possible problem when running multiple
  queries in the same session.

Revision 34748
Modified Mon Aug 9 10:18:05 2010 UTC (4 years, 5 months ago) by ganis
File length: 74470 byte(s)
Diff to previous 34743
   Add the possibility to save the performance information shown by the dialog into a small
   ntuple included in the output list. The ntuple contains 5 floats (processing time, number
   of active workers, event rate, MBytes read, number of effective sessions on the cluster)
   and it is filled each time the number of active workers changes or at max 100 regular
   intervals at least 5 secs apart; in this way the ntuple has at most O(100 entries + number
   of workers).
   To enable the saving of the ntuple execute the following:
               proof->SetParameter("PROOF_SaveProgressPerf", "yes");
   before running the query. The ntuple is called 'PROOF_ProgressPerfNtuple'.

   This patch also adds to the output list the parameters used by the active packetizer. Some
   parameters of general interest (currently MinPacketTime and MaxPacketTime) have been moved
   to TVirtualPacketizer and are always added to the list. Each packetizer is then responsible
   for adding its relevant specific parameters to the dedicated list. The dedicated list is hosted
   in TVirtualPacketizer and is transferred to the output list by TProofPlayer when finalising
   the output list at the end of the query.

Revision 34743
Modified Fri Aug 6 15:24:42 2010 UTC (4 years, 5 months ago) by ganis
File length: 74356 byte(s)
Diff to previous 34637
Avoid resolving the workers' FQDN when running in PROOF-Lite: it may create unnecessary delays

Revision 34637
Modified Wed Jul 28 14:40:47 2010 UTC (4 years, 5 months ago) by ganis
File length: 74172 byte(s)
Diff to previous 34557
   From Maciek Nabożny and me:

   Add the possibility to single-out disk partitions in the packetizer; this works by
   adding the beginning of a path in the name defining a new TFileNode (e.g. 'host://disk1'
   instead of 'host' only as it was so far).
   The feature was requested both by ATLAS and ALICE; it is optional and can be triggered
   by defining the rootrc variable

           Packetizer.Partitions  /disk1,/disk2,/disk3

   (The administrator of a PROOF cluster can add this via 'xpd.putrc'; the user can test this
    via the parameter 'PROOF_PacketizerPartitions'; see runProof.C).
   In the extreme case where all files on the same disk are grouped together in the dataset
   definition, this addition saves up to 20% of processing time on a 4-core machine
   with 2 disks. A systematic study of the impact of this development is ongoing.
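
The node naming described in this entry ('host://disk1' instead of 'host') can be sketched as follows. This is an illustration only: fileNodeName is a hypothetical helper, not a ROOT function, and plain prefix matching against the partition list is an assumption (it would, for instance, also match '/disk10' against '/disk1').

```cpp
#include <string>
#include <vector>

// Build the name of the file node for a file at 'path' on 'host'.
// If 'path' starts with one of the configured partitions, the partition is
// appended as in the 'host://disk1' example; otherwise the host alone is used.
std::string fileNodeName(const std::string& host, const std::string& path,
                         const std::vector<std::string>& partitions) {
    for (const std::string& p : partitions) {
        if (!p.empty() && path.compare(0, p.size(), p) == 0) {
            // Drop the leading '/' of the partition: "/disk1" -> "disk1".
            std::string tag = (p[0] == '/') ? p.substr(1) : p;
            return host + "://" + tag;
        }
    }
    return host;  // no partition matched: one node per host, as before
}
```

With the partitions /disk1,/disk2,/disk3 from the rootrc example, a file under /disk1 on 'host' maps to the node 'host://disk1', while a file under /tmp maps to 'host'.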

Revision 34557
Modified Thu Jul 22 15:12:01 2010 UTC (4 years, 6 months ago) by rdm
File length: 69835 byte(s)
Diff to previous 34533
fix cases in ROOT code where we would truncate the TTime (to avoid the
new error messages in TTime operator long on 32-bit platforms).

Revision 34533
Modified Wed Jul 21 13:12:12 2010 UTC (4 years, 6 months ago) by ganis
File length: 69846 byte(s)
Diff to previous 34532
   TPacketizerAdaptive
    - Better to use TDSetElement::GetNum() instead of TDSetElement::GetEntries() for the
      recent packetizer optimization (includes cases where fractions of files are processed)
   runProof
    - Add option 'uneven' to tutorial 'eventproc' to simulate some unevenness in the entries
      per file
    - Solve a few issues with formats

Revision 34532
Modified Wed Jul 21 12:52:32 2010 UTC (4 years, 6 months ago) by ganis
File length: 69865 byte(s)
Diff to previous 34527
Where relevant, use the fast version of TDSetElement::GetEntries

Revision 34527
Modified Wed Jul 21 10:14:35 2010 UTC (4 years, 6 months ago) by ganis
File length: 69845 byte(s)
Diff to previous 34416
   From Maciek Nabożny and me.

   - Optimize the packetizer behaviour when the number of files left to be processed
     is smaller than the number of workers and at least one file has a number of events
     significantly larger than the average.
   - Better apply the upper/lower limits on the expected packet processing time
   - Fix an issue with validating the exact number of needed files when the information
     about the entries is already available.

Revision 34416
Modified Wed Jul 14 15:39:20 2010 UTC (4 years, 6 months ago) by ganis
File length: 65945 byte(s)
Diff to previous 34254
   Fix problem with packet re-assignment in case of a worker death. Some packets
   were processed twice or more times.
   A new method MergeElement(TDSetElement *elem) has been added to TDSetElement in
   order to simplify merging of contiguous or overlapping packets and avoid artificial
   fragmentation of the re-assigned parts.
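
The idea of merging contiguous or overlapping packets can be sketched with plain entry ranges. This is not the TDSetElement::MergeElement implementation: Range and mergeRange are hypothetical names, and a packet is assumed to cover the entries [first, first + num).

```cpp
#include <algorithm>

// Hypothetical stand-in for a packet: entries [first, first + num).
struct Range { long long first; long long num; };

// Merge 'b' into 'a' if the two ranges are contiguous or overlapping.
// Returns true and updates 'a' on success; leaves 'a' untouched otherwise.
bool mergeRange(Range& a, const Range& b) {
    long long aEnd = a.first + a.num;
    long long bEnd = b.first + b.num;
    if (b.first > aEnd || a.first > bEnd) return false;  // a gap separates them
    long long lo = std::min(a.first, b.first);
    long long hi = std::max(aEnd, bEnd);
    a.first = lo;
    a.num = hi - lo;
    return true;
}
```

Merging re-assigned pieces this way yields one larger packet instead of several small adjacent ones, which is the fragmentation the entry refers to.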

Revision 34254
Modified Wed Jun 30 16:29:36 2010 UTC (4 years, 6 months ago) by ganis
File length: 63759 byte(s)
Diff to previous 33781
Fix warnings in notification statements

Revision 33781
Modified Tue Jun 8 14:13:39 2010 UTC (4 years, 7 months ago) by ganis
File length: 63747 byte(s)
Diff to previous 32204
   - Optimize the validation step in the case not all the entries are required.
     The validation step is stopped as soon as the requested number of events is reached.
     If the parameter "PROOF_ValidateByFile" is set to 1, the number of files is exactly what is
     needed; otherwise the number of files may exceed the number of files needed by #workers-1
     (this is the default because additional, serial, checks are needed to ensure that only
      the files really required are open).
     This feature was requested in the context of ALICE reconstruction.
     This new feature is used as an example in the "eventproc" tutorial in runProof.C .
   - The patch also fixes a subtle bug affecting the (possibly rare) case when not all entries
     are required and # entries does not correspond to a complete subset of files (e.g.
     # entries = 1001000 with files of 100000 entries each). The effect was incomplete
     processing (skipped events, magenta bar) or a session freeze.
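
A minimal sketch of the early-stop idea: count how many files, taken in order, are needed to reach the requested number of entries. The function name filesNeeded is hypothetical, and the sketch models only the serial, by-file case; it says nothing about how the packetizer validates files in parallel.

```cpp
#include <cstddef>
#include <vector>

// Number of files (taken in order) needed to cover 'requested' entries.
// Returns entriesPerFile.size() if the files do not contain enough entries.
std::size_t filesNeeded(const std::vector<long long>& entriesPerFile,
                        long long requested) {
    long long seen = 0;   // entries covered by the files counted so far
    std::size_t n = 0;    // number of files counted so far
    while (n < entriesPerFile.size() && seen < requested) {
        seen += entriesPerFile[n];
        ++n;
    }
    return n;
}
```

For the example in this entry (1001000 entries requested, files of 100000 entries each), the sketch counts 11 files, the last of which is only partly used.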

Revision 32204
Modified Wed Feb 3 19:17:40 2010 UTC (4 years, 11 months ago) by ganis
File length: 62259 byte(s)
Diff to previous 31707
   Add support for processing many datasets in one go in TProof::Process(const char *dataset, ...).
   Two options are provided:
   - 'grand dataset':  the datasets are added up and considered as a single dataset;
                       syntax: "dataset1|dataset2|..."
   - 'keep separated': the datasets are processed one after the other; the user is
                       notified in the selector of the change of dataset so she/he
                       has the opportunity to separate the results. A new packetizer,
                       TPacketizerMulti, has been developed for this case: it basically
                       contains a list of standard packetizers (one for each dataset) and
                       loops over them.
                       Syntax: "dataset1,dataset2,..." or "dataset1 dataset2 ..."
   In both cases, entry-list can be applied using the syntax "dataset<<entrylist", e.g.
   "dataset1<<el1|dataset2<<el2|".
   See http://root.cern.ch/drupal/content/working-data-sets#currentelem for more details.

   A test for the new functionality has been added to test/stressProof.cxx .
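
The dataset string syntax described in this entry can be illustrated with a small parser. This is a sketch under assumptions, not the TProof::Process parsing code: DatasetSpec and parseDatasets are hypothetical names, and the caller chooses the separator ('|' for a grand dataset, ',' or ' ' for separate datasets).

```cpp
#include <string>
#include <vector>

// Hypothetical result of parsing one "dataset<<entrylist" token.
struct DatasetSpec { std::string name; std::string entryList; };

// Split 'spec' on 'sep' and separate an optional "<<entrylist" suffix.
// Empty tokens (e.g. from a trailing separator) are skipped.
std::vector<DatasetSpec> parseDatasets(const std::string& spec, char sep) {
    std::vector<DatasetSpec> out;
    std::size_t start = 0;
    while (start <= spec.size()) {
        std::size_t end = spec.find(sep, start);
        if (end == std::string::npos) end = spec.size();
        std::string token = spec.substr(start, end - start);
        if (!token.empty()) {
            std::size_t pos = token.find("<<");
            if (pos == std::string::npos) out.push_back({token, ""});
            else out.push_back({token.substr(0, pos), token.substr(pos + 2)});
        }
        start = end + 1;
    }
    return out;
}
```

Applied to "dataset1<<el1|dataset2<<el2|" with separator '|', the sketch returns two entries, (dataset1, el1) and (dataset2, el2).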

Revision 31707
Modified Wed Dec 9 09:16:57 2009 UTC (5 years, 1 month ago) by ganis
File length: 61954 byte(s)
Diff to previous 31702
   Fix issues found by Coverity:
     - #439, #438, #437: missing check on possibly NULL pointer

Revision 31702
Modified Wed Dec 9 07:50:11 2009 UTC (5 years, 1 month ago) by ganis
File length: 61779 byte(s)
Diff to previous 31296
   Fix issues found by Coverity:
     - #8634: use of 'slave' after delete
     - #8564: uninitialized member in TPacketizerAdaptive::TFileNode

Revision 31296
Modified Wed Nov 18 20:35:00 2009 UTC (5 years, 2 months ago) by rdm
File length: 61689 byte(s)
Diff to previous 30953
small code cosmetics.

Revision 30953
Modified Tue Nov 3 08:42:49 2009 UTC (5 years, 2 months ago) by ganis
File length: 61737 byte(s)
Diff to previous 30899
  If enabled, send monitoring information from the master at each GetNextPacket
  (at each call of TPerfStat::PacketEvent) to allow external real-time progress
  monitoring.

Revision 30899
Modified Wed Oct 28 12:22:17 2009 UTC (5 years, 2 months ago) by ganis
File length: 61627 byte(s)
Diff to previous 30863
Improve data node / worker matching by always using the host FQDN

Revision 30863
Modified Sun Oct 25 17:00:50 2009 UTC (5 years, 2 months ago) by ganis
File length: 61570 byte(s)
Diff to previous 30862
Fix a couple of issues found by the nightlies

Revision 30862
Modified Sun Oct 25 08:26:46 2009 UTC (5 years, 3 months ago) by ganis
File length: 61566 byte(s)
Diff to previous 30859
  - Further improvement in the estimation of the current rate
  - Fix a problem preventing the chunk size from being displayed in some cases
  - Adjust the scale for displaying the read bytes (use GB or TB when relevant)
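
Adjusting the display scale for the read bytes could look like the sketch below. The function name formatBytes, the 1024-based steps and the two-decimal format are all assumptions made for illustration; the entry does not say how the progress dialog formats the value.

```cpp
#include <cstdio>
#include <string>

// Format a byte count with the largest unit (up to TB) that keeps the
// displayed value at or above 1, using 1024-based steps.
std::string formatBytes(double bytes) {
    const char* units[] = {"B", "kB", "MB", "GB", "TB"};
    int i = 0;
    while (bytes >= 1024.0 && i < 4) {
        bytes /= 1024.0;
        ++i;
    }
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%.2f %s", bytes, units[i]);
    return buf;
}
```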

Revision 30859
Modified Sat Oct 24 14:53:07 2009 UTC (5 years, 3 months ago) by ganis
File length: 61587 byte(s)
Diff to previous 30574
  Patch for improved performance monitoring. The 'Rate Plot' button in the progress
  dialog is renamed 'Performance plot' and shows up to 4 plots with the event/sec,
  the average read chunk size, the number of active workers and the number of active
  PROOF sessions on the cluster, all as a function of processing time.

  The read chunk size plot allows monitoring the usage of the cache.

  The instantaneous processing rate (event/sec) is now better estimated: a few issues
  with the normalizing times have been solved, removing the artificial structures that
  were observed.

  The possibility to set a max packet time length is introduced (default 30 s); this
  can be changed with the parameter PROOF_MaxPacketTime.
  The size of the cache is also taken into account to optimize the use of the cache.

  The parameter PROOF_UseParallelUnzip has been introduced to toggle the use of the 
  parallel unzip (default off for now).

  A page describing the new performance plots is under preparation at 
           http://root.cern.ch/drupal/content/progress-dialog
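
The effect of a maximum packet time (PROOF_MaxPacketTime in this entry) can be sketched as a clamp on the packet size. This is not the packetizer's formula: clampPacketSize is a hypothetical helper, the lower bound is passed in by the caller because the entry gives no default for it, and a single known rate in events per second is assumed.

```cpp
#include <algorithm>

// Clamp a proposed packet size so that its expected processing time,
// at 'rate' events per second, stays within [minTime, maxTime] seconds.
long long clampPacketSize(long long proposed, double rate,
                          double minTime, double maxTime) {
    if (rate <= 0.0) return proposed;  // no rate estimate yet: leave unchanged
    long long lo = std::max(1LL, static_cast<long long>(rate * minTime));
    long long hi = std::max(lo, static_cast<long long>(rate * maxTime));
    return std::min(std::max(proposed, lo), hi);
}
```

At 1000 events per second and a 30 s maximum, a proposed packet of 100000 events is cut down to 30000 events.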

Revision 30574
Modified Tue Oct 6 10:18:06 2009 UTC (5 years, 3 months ago) by ganis
File length: 58395 byte(s)
Diff to previous 30066
   By default do not set any limit on the number of workers accessing a given file server;
   following recent reports, this seems a better default with current hardware.
   The value can be changed/set via the PROOF_MaxSlavesPerNode parameter, as before.

Revision 30066
Modified Tue Sep 8 14:45:24 2009 UTC (5 years, 4 months ago) by ganis
File length: 58588 byte(s)
Diff to previous 28052
Fix notification message

Revision 28052
Modified Thu Apr 2 08:42:14 2009 UTC (5 years, 9 months ago) by ganis
File length: 58573 byte(s)
Diff to previous 27772
Fix a problem with element validation when using entry lists

Revision 27772
Modified Mon Mar 16 09:16:48 2009 UTC (5 years, 10 months ago) by ganis
File length: 58481 byte(s)
Diff to previous 26875
From Jan: fix a problem in checking the number of packet events

Revision 26875
Modified Fri Dec 12 14:21:34 2008 UTC (6 years, 1 month ago) by ganis
File length: 58487 byte(s)
Diff to previous 26381

   Patch to:
   - Add the possibility to set upper limits on the virtual memory (changes in TProofPlayer,
     TProofServ, TXProofServ and pmain (separate patch)).
     If enabled, the session first gets a warning when it reaches 80% of the limit, and then
     processing is stopped when it exceeds 95% of the limit, sending back the results.
     Also, the memory footprint is notified when the session is terminated.
   - Make sure that the active values in XrdProofWorker are always correct; this was not the
     case for dynamic startup as the notification at the end of the query was not done.
     This information is crucial for the scheduler.
     The way the information is stored in XrdProofdProofServ had to be modified
     and a new internal message type (kReleaseWorker) added.
     (Changes in several proofd classes, TProof, TXSocket and TXProofServ)
   - Fix problem with setting a static upper limit of the sessions that can be started,
     and enable this functionality for the dynamic mode (changes in XrdProofSched)
   - Remove a deleted worker from all the lists in TProof::MarkBad to avoid later attempts of use.
   - Better control the use of the internal pipe for socket readiness notification in TXSocket.
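
The two virtual memory thresholds from the first item of this entry (warning at 80% of the limit, stop above 95%) can be sketched as a small decision function. MemAction and checkMemory are hypothetical names, the values are assumed to be in kB, and treating a non-positive limit as "no limit" is an assumption.

```cpp
// Hypothetical action for a session, given its virtual memory usage.
enum class MemAction { None, Warn, Stop };

// Decide what to do for a session using 'usedKB' of virtual memory under
// a limit of 'limitKB'. A non-positive limit disables the check.
MemAction checkMemory(long long usedKB, long long limitKB) {
    if (limitKB <= 0) return MemAction::None;
    // Integer arithmetic avoids rounding issues at the exact thresholds.
    if (usedKB * 100 > limitKB * 95) return MemAction::Stop;   // exceeds 95%
    if (usedKB * 100 >= limitKB * 80) return MemAction::Warn;  // reached 80%
    return MemAction::None;
}
```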

Revision 26381
Modified Sat Nov 22 17:15:24 2008 UTC (6 years, 2 months ago) by ganis
File length: 58510 byte(s)
Diff to previous 25896
  - Fixes for the asynchronous mode:
    - Fully localize the general handling of the input messages into two new methods
      TProof::HandleInputMessage(TMessage *) and TProofServ::HandleSocketInput(TMessage *);
      these methods are callable from any place that needs to intercept some messages and
      do something specific on them (examples are: TProofServ::GetNextPacket, TXProofServ::Get,
      TPacketizerAdaptive::ValidateFiles, ...); this allows to remove several duplications
      and to make sure that no message is lost or wrongly dispatched.
    - Simplify the recursive infrastructure for TProofServ::HandleSocketInput; in particular,
      TProofServ::HandleSocketInputDuringProcess is removed as it is a special case of the
      HandleSocketInput, saving duplications.
    - Always use a kPROOF_CHECKFILE message in replies to check file operations; failures were
      signaled with kPROOF_FATAL which may have some undesired side-effects, depending on the
      timing
    - Add support for one level of recursivity in TProof::Collect .
    - Fix a problem with TProof::Finalize when called with default arguments (on the last query)
    - In TProof::SendFile, send to unique workers only if the "cache" option is specified
    - Remove the call to Finalize for DrawSelect queries, as it is done via the feedback
      mechanisms  

  - Additions/fixes in test/stressProof.cxx:
    - New test for the asynchronous mode;
    - Fine-tuning of the progress display in batch mode;

Revision 25896
Modified Mon Oct 20 17:01:31 2008 UTC (6 years, 3 months ago) by ganis
File length: 59648 byte(s)
Diff to previous 25859
   From Jan:
   - Using consistently the recently introduced TProofProgressStatus in the kPROOF_GETPACKET
     messages sent to TPacketizerUnit, TPacketizerAdaptive and TPacketizer; the message contains
     the status of progress since the start of processing on a given node.
   - Introduce TVirtualPacketizer::TVirtualSlaveStat as a base class of all the TSlaveStat
     packetizer specific auxiliary classes.
   - Full implementation of GetProgressStatus() and AddProcessed(TProofProgressStatus *st) members
     for TPacketizerUnit::TSlaveStat.

     This patch should fix some consistency problems experienced after the patch introducing
     TProofProgressStatus .

Revision 25859
Modified Fri Oct 17 16:38:52 2008 UTC (6 years, 3 months ago) by ganis
File length: 60963 byte(s)
Diff to previous 25827
   - Make all packetizers understand the GETPACKET messages containing the new TProofProgressStatus
     structure
   - Add a few missing protections

Revision 25827
Modified Wed Oct 15 14:02:59 2008 UTC (6 years, 3 months ago) by ganis
File length: 60397 byte(s)
Diff to previous 25273
   From Jan:

   - Added the possibility to handle removed workers and partly processed packets. When a worker is stopped
     while processing a packet, it finishes the current event and the rest of the packet is reassigned to another
     worker. This is done via two interfaces:
       - TVirtualPacketizer::AddProcessed(TSlave *sl, TProofProgressStatus *st, TList **)
       - TVirtualPacketizer::ReassignPacket.
   - New class TProofProgressStatus used to keep the query progress status in all the TProofPlayer objects and in
     TPacketizerAdaptive::TSlaveStat. This class is also used to structure the relevant information sent in
     kPROOF_GETPACKET and kPROOF_STOPPROCESS messages.
   - The class TPacketizerProgressive is removed completely.
   - The PROOF protocol version is increased to 19: this is to handle the changes in the kPROOF_STOPPROCESS and
     kPROOF_GETPACKET messages in Master - worker communication.

Revision 25273
Modified Wed Aug 27 08:56:06 2008 UTC (6 years, 4 months ago) by rdm
File length: 55361 byte(s)
Diff to previous 24719
From Jan:
- A new optional version of PROOF with dynamic worker startup.
  It can be enabled by the admin or a user with 'Proof.DynamicStartup'.
  A session starts only on the master. When a query processing starts
  at the master TXProofServ::GetWorkers() is called. It receives a 
  list of machines from the scheduler and the workers on the
  machines are started. The environment is copied from the master
  to the workers. It includes: the include and dynamic library paths,
  the set of enabled packages as well as the macros loaded by the user.

- A new method TProof::AddWorkers(TList *workers) was added. It adds
  the workers just before the query.

- A packet resubmitting mechanism. When a worker dies all the packets
  that it processed are resubmitted.

- In TPacketizerAdaptive: fixing initialization of fgMaxSlaveCnt. By
  default it was initialized twice.

Revision 24719
Modified Wed Jul 9 07:07:25 2008 UTC (6 years, 6 months ago) by ganis
File length: 51543 byte(s)
Diff to previous 24568

    Patch refactorizing the XrdProofd plugin.
    The class XrdProofdProtocol is now in charge only of the operations strictly related
    to the XProofd protocol. All auxiliary services have been moved to dedicated service
    classes controlled by XrdProofdManager. In particular:
    - XrdProofdClientMgr handles the clients (represented by instances of XrdProofdClient)
      including login, authentication and access control
    - XrdProofdProofServMgr handles the PROOF sessions (represented by instances of
      XrdProofdProofServ) including creation, attachment, detachment, destruction and
      environment setting
    - XrdProofdNetMgr handles connections between instances of XProofd running on different
      nodes
    - XrdProofdPriorityMgr handles the session priorities

    A special effort has been made to get rid of all possible internal dead-lock situations.
    Internal actions on clients and sessions are now all asynchronous, governed by internal
    pipes.

    The new plugin also offers new functionality, among which:
    - a XProofd admin area, located under <xrd.admin>/.xproof.<port>, keeps information about
      active and terminated sessions, and the minimal state for active clients. This is used
      to regularly check the client and session activity, to clean up orphaned sessions and
      to shutdown inactive client connections.
      In particular this allows a more solid implementation of Reset, which now exists in two
      flavours: 'soft', TProof::Reset(<masterurl>), (default) asks the sessions to terminate gently;
      'hard', TProof::Reset(<masterurl>,kTRUE) schedules all sessions for forced termination.
    - support for automatic attempts for reconnection in the case the daemon restarts.
      This allows to reconfigure the plugin by xrootd restart w/o affecting the running
      sessions.
    - support the definition of workers via config file directive, getting de facto rid of
      the proof.conf file.
    - domain+level control of printout message; the format has been improved: in particular
      all information messages contain now the tag 'xpd-I' and all error messages the tag
      'xpd-E', so that they can easily be grepped out from the log file.

    The Wiki 'XrdProofdDirectives Directives' page has been updated with the new directives.

Revision 24568
Modified Thu Jun 26 11:57:42 2008 UTC (6 years, 6 months ago) by ganis
File length: 50956 byte(s)
Diff to previous 23986
   - Fix a race condition possibly affecting the handling of workers death
   - Improve diagnostic from MarkBad: clients will now receive something like this

root [1] Worker 'localhost-0.1' has been removed from the active list

 +++ Message from top master at aleph025.cern.ch:1093 : marking localhost:1093 (0.1) as bad
 +++ Reason: problems receiving a message in TProof::CollectInputFrom(...)

 +++ Most likely your code crashed on worker 0.1 at localhost:1093.
 +++ Please check the session logs for error messages either using
 +++ the 'Show logs' button or executing
 +++
 +++ root [] TProof::Mgr("aleph025.cern.ch:1093")->GetSessionLogs()->Display("0.1",0)

Revision 23986
Modified Fri May 23 09:10:18 2008 UTC (6 years, 8 months ago) by ganis
File length: 50792 byte(s)
Diff to previous 23632
   Import fixes / new functionality from branches/dev/proof:

   - proof/proofplayer/src/TPacketizerAdaptive.cxx
      Implement the classic strategy of the TPacketizer in TPacketizerAdaptive.
      The strategy can be changed from adaptive (default) to TPacketizer with:
      "PROOF_PacketizerStrategy" parameter to PROOF

   - proof/proofplayer/src/TProofPlayer.cxx
      Fixed error messages for 'MissingFiles' and 'FailedPackets' lists.
      Improve fault detection by creating a list of failed packets upon a mismatch
      in the expected and actual number of processed events; the list is added to
      the output list.

   - proof/proofplayer/src/TVirtualPacketizer.cxx
      Make sure that something has been processed before setting kIsDone.
      Improve fault detection by creating a list of failed packets upon a mismatch
      in the expected and actual number of processed events; the list is added to
      the output list.

   - proof/proofplayer/inc/TVirtualPacketizer.h
      Improve fault detection by creating a list of failed packets upon a mismatch
      in the expected and actual number of processed events; the list is added to
      the output list.

   - proof/proofplayer/inc/TPacketizerAdaptive.h
      Implement the classic strategy of the TPacketizer in TPacketizerAdaptive

   - proof/proof/src/TProofServ.cxx
      Fixes:
       + option string: "stageOnly" --> "stagedOnly".
       + add parenthesis to avoid a warning after the previous patch.
       + remove the objects added to the missingFiles in TDSet::Add from the 'dataset'
         before deleting it
       + fixed an error in HandleCheckFile ('kPROOF_WorkDir' instead of 'kPROOF_PackDir').
       + make fCacheDir and fPackageDir controllable via directive
       + in TProofServ::ErrorHandler: do not create the related additional buffer if
         not logging to syslog
      Added functionality:
       - add possibility to flag an "Info" message as service message using
         the prefix "|Svc" in the location field; e.g.

            Info("SetupCommon|Svc", "Test of SvcMsg");

         will produce something like

            09:28:24  6892 Mst-0 | SvcMsg in <TXProofServ::SetupCommon>: Test of SvcMsg

         This is needed to be able in the future to filter-out messages needed
         by some services (e.g. the forthcoming memory checker) which should not be
         displayed by default.

   - proof/proofd/src/XrdProofSched.cxx
      Improve the calculation of the number of workers to assign by using fMinForQuery
      as a minimum.
      Fix the length of method separators.
      Fix signed/unsigned warning.

Revision 23632 - (view) (download) (as text) (annotate) - [select for diffs]
Modified Thu May 1 20:45:12 2008 UTC (6 years, 8 months ago) by ganis
File length: 52332 byte(s)
Diff to previous 23631
Fix gcc 4.3 warnings (mostly shadowed variables)

Revision 23631 - (view) (download) (as text) (annotate) - [select for diffs]
Modified Thu May 1 10:50:11 2008 UTC (6 years, 8 months ago) by rdm
File length: 52356 byte(s)
Diff to previous 23247
Fix coding conventions.

Revision 23247 - (view) (download) (as text) (annotate) - [select for diffs]
Modified Wed Apr 16 09:04:56 2008 UTC (6 years, 9 months ago) by ganis
File length: 52357 byte(s)
Diff to previous 23244
From Jan: drop invalid elements from the list of elements to be processed

Revision 23244 - (view) (download) (as text) (annotate) - [select for diffs]
Modified Wed Apr 16 08:05:01 2008 UTC (6 years, 9 months ago) by ganis
File length: 52198 byte(s)
Diff to previous 23074
  TPacketizer:
  - Fix a problem in TPacketizer when using the cached info (it was already
    fixed in TPacketizerAdaptive)
  - Make sure that both Long_t and Int_t are supported for maximum number of
    workers per filenode
  - Add some conditional debug statements

  TPacketizerAdaptive:
  - Make sure that both Long_t and Int_t are supported for maximum number of
    workers per filenode
  - Add some conditional debug statements

Revision 23074 - (view) (download) (as text) (annotate) - [select for diffs]
Modified Wed Apr 9 08:41:38 2008 UTC (6 years, 9 months ago) by ganis
File length: 51489 byte(s)
Diff to previous 22635
  - Implement optimized file validation where all information already available
    is used
  - In TPacketizerAdaptive:
    Add a heuristic mechanism to avoid undervaluing the processing rate and too short packets;
    Slightly tuning a few parameters

Revision 22635 - (view) (download) (as text) (annotate) - [select for diffs]
Modified Thu Mar 13 10:50:20 2008 UTC (6 years, 10 months ago) by rdm
File length: 49003 byte(s)
Diff to previous 22075
move all PROOF related libraries under the new proof directory.

Revision 22075 - (view) (download) (as text) (annotate) - [select for diffs]
Modified Fri Feb 8 16:56:59 2008 UTC (6 years, 11 months ago) by ganis
Original Path: trunk/proofplayer/src/TPacketizerAdaptive.cxx
File length: 49003 byte(s)
Diff to previous 21250
   From Jan:
   - change the type of "PROOF_MaxSlavesPerNode", "PROOF_ForceLocal" and "PROOF_PacketAsAFraction" parameters
     from Long_t to Int_t;
   - make the max workers per node configurable in .rootrc (Packetizer.MaxWorkersPerNode: <desired number>)

Revision 21250 - (view) (download) (as text) (annotate) - [select for diffs]
Modified Thu Dec 6 23:55:35 2007 UTC (7 years, 1 month ago) by ganis
Original Path: trunk/proofplayer/src/TPacketizerAdaptive.cxx
File length: 48644 byte(s)
Diff to previous 20882
  - Fix problem crashing the application at reconfig (bug #28778)
  - Fix a bug preventing standard Scalla/Xrootd if-else-fi config constructs
    from working
  - Fix a bug in XrdProofdManager::GetProofConn in checking existing valid
    connections
  - Fix a bug in TProofServ::ApplyMaxQueries preventing the correct removal
    of empty query directories
  - Fix coding convention violations in TPacketizerAdaptive (from Jan)

Revision 20882 - (view) (download) (as text) (annotate) - [select for diffs]
Modified Mon Nov 19 11:31:26 2007 UTC (7 years, 2 months ago) by rdm
Original Path: trunk/proofplayer/src/TPacketizerAdaptive.cxx
File length: 48544 byte(s)
Diff to previous 20862
Set property svn:eol-style LF on all source and Makefiles. This should avoid
problems with Win32 line endings ending up in the repository. All MS tools
support LF eols fine.

Revision 20862 - (view) (download) (as text) (annotate) - [select for diffs]
Modified Sun Nov 18 14:51:42 2007 UTC (7 years, 2 months ago) by ganis
Original Path: trunk/proofplayer/src/TPacketizerAdaptive.cxx
File length: 48544 byte(s)
Diff to previous 20682
   Synchronize with branches/dev/proof r20835

   Summary:

   TProofServ:
     - read session.rootrc with level kEnvChange to be able to change existing
       settings
     - Avoid deleting a query result twice in some special cases
     - Fix a problem with the initialization of fKeptQueries
   rootrc.in:
     - restore default settings for the asynchronous reading
   TFileCacheRead:
     - add missing protection in ReadBuffer
   TEventIter:
     - Enable the usage of TTreeCache
   XrdProofdProtocol:
     - Additional check on the ownership of the unix socket
     - Improve notification during Reset
     - Reduce default timeout on admin requests and make it configurable
   XrdProofConn:
     - Use the configurable maxtry everywhere where relevant
   TPacketizerAdaptive:
     - Store info on all the processed packets in per-worker lists

   TProofMgr, TXProofMgr, TXSocket, XProofProtocol, XrdProofdProtocol:
     - Add possibility for the admin to broadcast a message to the connected users

   getProof:
     - Add missing protection

Revision 20682 - (view) (download) (as text) (annotate) - [select for diffs]
Modified Tue Nov 6 15:51:59 2007 UTC (7 years, 2 months ago) by ganis
Original Path: trunk/proofplayer/src/TPacketizerAdaptive.cxx
File length: 47880 byte(s)
Diff to previous 20307
  Import branches/dev/proof r20654

  Summary:

  + Improvements
    - Add support for SSH SOCKS4 tunnelling; the local port for the tunnel can be specified
      in the master URL, e.g. TProof::Open("master/?tunnel:8000")
    - Add the possibility to plot the estimated instantaneous rate
    - Add "PROOF_ForceLocal" parameter to the TPacketizerAdaptive; if set to 1, all the data
      are processed locally.
    - Add support for remote grep functionality while retrieving logs (needed by the forthcoming
      memory monitor)
  + Bug fixes
    - Several small fixes to revive the multi-master mode.
    - XrdProofdProtocol:
      - add missing lock to the client instance in SendMsg to avoid screwing up requests from
        workers on the same machine
      - lock the mutex of the requester when setting priorities
      - add notification during Reset
      - fix problem with the detection of the 'allow' directive
      - fix problem with the parsing of the return value from XrdProofServProxy::TerminateProofServ()
      - always use the effective user to retrieve info from another server
        (XrdProofdManager::GetProofConn is now used)
      - fix possible dead-locks from debug notifications done after hard-killing a session
      - re-enable the garbage collector thread of the connection manager in XrdProofConn to
        fix a problem with closing physical connections;
      - fix a problem with CleanupProofServ in the case of a non-privileged daemon running
        in multi-user mode
      - introduce a timeout when waiting for the startup of a 'proofserv'.
    - XrdProofConn: init mutex in the ctor; lock in SendRecv
    - XrdProofSched: add support for using the priorities defined in the group manager to define
      the number of workers for sessions
    - TProof:
      - Broadcast priorities to unique nodes only
      - time out the initial Collect after 5 mins to avoid clients getting stuck at this stage
      - add support for generic timeout in Collect (disabled by default)
      - fix a problem with SendFile.
    - TXProofServ: add a call to TProof::InterruptCurrentMonitor() in Terminate() to stop
      infinite loops in Collect
    - TXSocket:
      - Implement a flag to interrupt a TXSocket while waiting for messages
      - Split the session creation timeout in 4 attempts: the total timeout is the same but
        it may circumvent occasional forking problems.
    - XROOTD:
      - fix a potential (possibly academic) memory leak in the client
      - fix an access permission problem with Kerberos ticket forwarding
      - fix bug preventing 'locate' from working properly
      - re-enable optimized 'locate'

Revision 20307 - (view) (download) (as text) (annotate) - [select for diffs]
Modified Thu Oct 11 10:58:50 2007 UTC (7 years, 3 months ago) by ganis
Original Path: trunk/proofplayer/src/TPacketizerAdaptive.cxx
File length: 44811 byte(s)
Diff to previous 20086
  Import branches/dev/proof r20306

  Summary (see branch logs for more details):

  - Set of changes related to CPU quota control including
    - broadcast of centrally determined priorities
    - mechanism to renice processes in a quantitative way
  - New packetizer for non-tree based analysis and related API in TProof (from L. Tran-Thanh)
  - Support for merging output objects saved in files on the workers (from L. Tran-Thanh and me)
  - Improve version binary compatibility checks using also the SVN revision number
   (when available) to define the running version.
  - Extend the version binary compatibility checks also to the cached selector binaries.
  - Extend TDSet::Lookup so that in case of missing files, it can remove them from the
    dataset (option removeMissing must be set).
  - Move the data set lookup to the TProofPlayerRemote::Process.
  - Handle properly the case of incomplete datasets: if the file is not found in the lookup
    don't try to validate it; add it, instead, to a 'missingFiles' list returned in the
    output list (fixing bug #28800 in Savannah).

Revision 20086 - (view) (download) (as text) (annotate) - [select for diffs]
Added Mon Sep 24 18:19:34 2007 UTC (7 years, 4 months ago) by ganis
Original Path: trunk/proofplayer/src/TPacketizerAdaptive.cxx
File length: 44768 byte(s)
  From Jan:
   - Add a heuristic mechanism to avoid undervaluing the processing rate and too short packets.
   - Add parameters: PROOF_MinPacketTime and PROOF_PacketAsAFraction to allow tuning performance.
   - Change name TAdaptivePacketizer to TPacketizerAdaptive in order to follow the naming
     convention for packetizers
   - Slightly tuning a few parameters.
