hadd.cxx
1/**
2 \file hadd.cxx
3 \brief This program will merge compatible ROOT objects, such as histograms, Trees and RNTuples,
4 from a list of root files and write them to a target root file.
5 In order for a ROOT object to be mergeable, it must implement the Merge() function.
6 Non-mergeable objects will have all instances copied as-is into the target file.
7 The target file must not be identical to one of the source files.
8
9 Syntax:
10 ```{.cpp}
11 hadd [flags] targetfile source1 source2 ... [flags]
12 ```
13
14 Flags can be passed before or after the positional arguments.
15 The first positional (non-flag) argument will be interpreted as the targetfile.
16 After that, the first sequence of positional arguments will be interpreted as the input files.
17 If two sequences of positional arguments are separated by flags, hadd will emit an error and abort.
18
19 By default, any argument starting with `-` is interpreted as a flag. If you want to pass filenames
20 starting with `-` you need to pass them after `--`:
21 ```{.cpp}
22 hadd [flags] -- -file1 -file2 ...
23 ```
24 Note that in this case you need to pass ALL positional arguments after `--`.
25
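The positional-argument rules above can be sketched as follows. This is only an illustration, not hadd's parser: `Positionals` and `SplitPositionals` are hypothetical names, and the sketch assumes that no flag takes a separate value (in real use, the `16` of `-j 16` would otherwise look like a positional argument). It returns nothing when a second sequence of positional arguments appears, or when `--` is given after positional arguments were already seen.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
struct Positionals {
   std::string target;
   std::vector<std::string> inputs;
};

// Hypothetical helper: splits the arguments (without the program name) into target and inputs.
std::optional<Positionals> SplitPositionals(const std::vector<std::string> &args)
{
   std::vector<std::string> positionals;
   bool afterDashDash = false;  // `--` was seen: everything that follows is positional
   bool sequenceClosed = false; // a flag was seen after the first positional argument
   for (const auto &arg : args) {
      if (arg.empty())
         continue;
      const bool looksLikeFlag = !afterDashDash && arg.size() > 1 && arg[0] == '-';
      if (looksLikeFlag) {
         if (arg == "--") {
            if (!positionals.empty())
               return std::nullopt; // `--` must come before all positional arguments
            afterDashDash = true;
         } else if (!positionals.empty()) {
            sequenceClosed = true;
         }
         continue;
      }
      if (sequenceClosed)
         return std::nullopt; // a second sequence of positional arguments
      positionals.push_back(arg);
   }
   if (positionals.size() < 2)
      return std::nullopt; // need a target and at least one input
   return Positionals{positionals.front(),
                      std::vector<std::string>(positionals.begin() + 1, positionals.end())};
}
```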
26 If a flag requires an argument, the argument can be specified in any of these ways:
27
28 # All equally valid:
29 -j 16
30 -j16
31 -j=16
32
33 The first syntax is the preferred one since it's backward-compatible with previous versions of hadd.
34 The -f flag is an exception to this rule: it only supports the `-f[0-9]` syntax.
35
36 Note that merging multiple flags is NOT supported: `-jfa` will be interpreted as -j=fa, which is invalid!
37
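As a rough illustration of the three accepted spellings, the sketch below extracts the value of a flag from an argument. `FlagValue` is a hypothetical helper written for this note, not a function of hadd, and it assumes the leading `-` has already been stripped from the argument. It also shows why `-jfa` ends up as the value `fa` for `-j`: everything after the flag name is taken as the value.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: returns the value given to `flag` in `arg` (both without the
// leading '-'). `next` is the following command-line argument, if any; it is used
// only for the separated form ("-j 16").
std::optional<std::string> FlagValue(std::string_view arg, std::string_view flag,
                                     std::optional<std::string_view> next)
{
   assert(!flag.empty());
   if (arg.substr(0, flag.size()) != flag)
      return std::nullopt; // a different flag
   std::string_view rest = arg.substr(flag.size());
   if (rest.empty()) {
      // "-j 16": the value is the next argument
      if (!next)
         return std::nullopt;
      return std::string(*next);
   }
   if (rest.front() == '=')
      rest.remove_prefix(1); // "-j=16": ignore one '='
   return std::string(rest); // "-j16"
}
```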
38 The flags are as follows:
39
40 \param -a Append to the output
41 \param -cachesize <SIZE> Resize the prefetching cache used to speed up I/O operations (use 0 to disable).
42 \param -d <DIR> Carry out the partial multiprocess execution in the specified directory
43 \param -dbg Enable verbosity. If -j was specified, do not delete partial files
44 stored inside working directory.
45 \param -experimental-io-features <FEATURES> Enables the corresponding experimental feature for output trees.
46 \see ROOT::Experimental::EIOFeatures
47 \param -f Force overwriting of output file.
48 \param -f[0-9] Set target compression level. 0 = uncompressed, 9 = highly compressed. Default is 101
49 (kDefaultZLIB). You can also specify the full compression algorithm, e.g. -f505.
50 \param -fk Sets the target file to contain the baskets with the same compression as the input files
51 (unless -O is specified). Compresses the meta data using the compression level specified
52 in the first input or the compression setting after fk (for example 505 when using -fk505)
53 \param -ff The compression level used is the one specified in the first input
54 \param -j [N_JOBS] Parallelise the execution in `N_JOBS` processes. If the number of processes is not specified,
55 or is 0, use the system maximum.
56 \param -k Skip corrupt or non-existent files, do not exit
57 \param -L <FILE> Read the list of objects from FILE and either only merge or skip those objects depending on
58 the value of "-Ltype". FILE must contain one object name per line, which cannot contain
59 whitespaces or '/'. You can also pass TDirectory names, which apply to the entire directory
60 content. Lines beginning with '#' are ignored. If this flag is passed, "-Ltype" MUST be
61 passed as well.
62 \param -Ltype <SkipListed|OnlyListed> Sets the type of operation performed on the objects listed in FILE given with the
63 "-L" flag. "SkipListed" will skip all the listed objects; "OnlyListed" will only merge those
64 objects. If this flag is passed, "-L" must be passed as well.
65 \param -n <N_FILES> Open at most `N_FILES` files at once (use 0 to request the system maximum, which is also
66 the default)
67 \param -O Re-optimize basket size when merging TTree
68 \param -T Do not merge Trees
69 \param -v [LEVEL] Explicitly set the verbosity level: 0 requests no output, 99 is the default
70 \return hadd returns a status code: 0 if OK, 1 otherwise
71
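The full compression setting accepted by `-f` packs two numbers into one integer: the hundreds digit selects the algorithm and the units digit the level, so `505` means algorithm 5 at level 5. The sketch below decodes a setting with the same acceptance rule as `ValidCompressionSettings` further down in this file (0, the deprecated aliases 1-9, or 100-509 with a zero tens digit). `CompressionSetting` and `DecodeCompression` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <optional>

// Hypothetical type for this sketch; hadd itself passes the plain integer around.
struct CompressionSetting {
   int algorithm; // hundreds digit
   int level;     // units digit, 0-9
};

// Hypothetical helper: splits a setting into algorithm and level, or returns
// nullopt if the setting would be rejected.
std::optional<CompressionSetting> DecodeCompression(int setting)
{
   if (setting == 0)
      return CompressionSetting{0, 0}; // uncompressed
   if (setting >= 1 && setting <= 9)
      setting += 100; // deprecated alias: 1-9 are read as 101-109
   const bool inRange = setting >= 100 && setting <= 509;
   const bool tensDigitIsZero = (setting / 10) % 10 == 0;
   if (!inRange || !tensDigitIsZero)
      return std::nullopt;
   return CompressionSetting{setting / 100, setting % 100};
}
```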
72 For example assume 3 files f1, f2, f3 containing histograms hn and Trees Tn
73 - f1 with h1 h2 h3 T1
74 - f2 with h1 h4 T1 T2
75 - f3 with h5
76 the result of
77 ```
78 hadd -f x.root f1.root f2.root f3.root
79 ```
80 will be a file x.root with h1 h2 h3 h4 h5 T1 T2
81 where
82 - h1 will be the sum of the 2 histograms in f1 and f2
83 - T1 will be the merge of the Trees in f1 and f2
84
85 The files may contain sub-directories.
86
87 If the source files contain histograms and Trees, one can skip
88 the Trees with
89 ```
90 hadd -T targetfile source1 source2 ...
91 ```
92
93 Wildcarding and indirect files are also supported
94 ```
95 hadd result.root myfil*.root
96 ```
97 will merge all files in myfil*.root
98 ```
99 hadd result.root file1.root @list.txt file2.root myfil*.root
100 ```
101 will merge file1.root, file2.root, all files in myfil*.root
102 and all files in the indirect text file list.txt ("@" as the first
103 character of the file indicates an indirect file. An indirect file
104 is a text file containing a list of other files, including other
105 indirect files, one line per file).
106
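A minimal sketch of how an indirect file could be recognised and read is given below. `IndirectFileName` and `ReadIndirectList` are hypothetical helpers, not hadd functions. The sketch reads a single level only (it does not follow indirect files listed inside an indirect file) and does not check that the listed files exist. It takes any `std::istream`, so a `std::ifstream` can be passed for a real list and a `std::istringstream` for a quick experiment.

```cpp
#include <cassert>
#include <istream>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Returns the path of the indirect file if `arg` starts with '@', nullopt otherwise.
std::optional<std::string> IndirectFileName(const std::string &arg)
{
   if (arg.size() > 1 && arg[0] == '@')
      return arg.substr(1);
   return std::nullopt;
}

// Reads one file name per line from `in`, skipping empty lines.
std::vector<std::string> ReadIndirectList(std::istream &in)
{
   std::vector<std::string> files;
   std::string line;
   while (std::getline(in, line)) {
      if (!line.empty())
         files.push_back(line);
   }
   return files;
}
```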
107 If the source and target compression levels are identical (default),
108 the program uses the TChain::Merge function with option "fast", i.e.
109 the merge will be done without unzipping or unstreaming the baskets
110 (i.e. direct copy of the raw byte on disk). The "fast" mode is typically
111 5 times faster than the mode unzipping and unstreaming the baskets.
112
113 If the option -cachesize is used, hadd will resize (or disable if 0) the
114 prefetching cache used to speed up I/O operations.
115
116 For options that take a size as argument, a decimal number of bytes is expected.
117 If the number ends with a `k`, `m`, `g`, etc., the number is multiplied
118 by 1000 (1k), 1000000 (1M), 1000000000 (1G), etc.
119 If this prefix is followed by `i`, the number is multiplied by the traditional
120 1024 (1KiB), 1048576 (1MiB), 1073741824 (1GiB), etc.
121 The prefix can be optionally followed by B whose casing is ignored,
122 e.g. 1k, 1K, 1Kb and 1KB are the same.
123
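The size syntax described above can be sketched as a small parser. `ParseSize` is a hypothetical function written for this note, not the routine hadd uses. It assumes whole numbers only, treats `t` as the next prefix after `g`, accepts the binary marker only as a lowercase `i`, and does not guard against overflow.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical helper: converts strings such as "1k", "2Mi" or "1GiB" to a byte count.
std::optional<std::uint64_t> ParseSize(std::string_view text)
{
   std::size_t i = 0;
   std::uint64_t value = 0;
   while (i < text.size() && std::isdigit(static_cast<unsigned char>(text[i]))) {
      value = value * 10 + static_cast<std::uint64_t>(text[i] - '0'); // no overflow check in this sketch
      ++i;
   }
   if (i == 0)
      return std::nullopt; // must start with a digit

   int power = 0; // exponent of the multiplier: k -> 1, m -> 2, g -> 3, t -> 4
   if (i < text.size()) {
      switch (std::tolower(static_cast<unsigned char>(text[i]))) {
      case 'k': power = 1; break;
      case 'm': power = 2; break;
      case 'g': power = 3; break;
      case 't': power = 4; break;
      default: break;
      }
      if (power > 0)
         ++i;
   }

   std::uint64_t base = 1000;
   if (power > 0 && i < text.size() && text[i] == 'i') {
      base = 1024; // binary multiples: Ki, Mi, Gi, ...
      ++i;
   }
   if (i < text.size() && std::tolower(static_cast<unsigned char>(text[i])) == 'b')
      ++i; // optional trailing B, case-insensitive
   if (i != text.size())
      return std::nullopt; // unexpected trailing characters

   for (int p = 0; p < power; ++p)
      value *= base;
   return value;
}
```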
124 \note By default histograms are added. However hadd does not support the case where
125 histograms have their bit TH1::kIsAverage set.
126
127 \authors Rene Brun, Dirk Geppert, Sven A. Schmidt, Toby Burnett
128*/
129#include "Compression.h"
130#include "TClass.h"
131#include "TFile.h"
132#include "TFileMerger.h"
133#include "THashList.h"
134#include "TKey.h"
135#include "TSystem.h"
136#include "TUUID.h"
137
138#include <ROOT/RConfig.hxx>
139#include <ROOT/StringConv.hxx>
140#include <ROOT/TIOFeatures.hxx>
141
142#include "haddCommandLineOptionsHelp.h"
143
144#include <climits>
145#include <cstdlib>
146#include <filesystem>
147#include <fstream>
148#include <iostream>
149#include <optional>
150#include <sstream>
151#include <string>
152
153#ifndef R__WIN32
155#endif
156
157////////////////////////////////////////////////////////////////////////////////
158
159inline std::ostream &Err()
160{
161 std::cerr << "Error in <hadd>: ";
162 return std::cerr;
163}
164
165inline std::ostream &Warn()
166{
167 std::cerr << "Warning in <hadd>: ";
168 return std::cerr;
169}
170
171inline std::ostream &Info()
172{
173 std::cerr << "Info in <hadd>: ";
174 return std::cerr;
175}
176
177using IntFlag_t = uint32_t;
178
179struct HAddArgs {
182 bool fForce;
185 bool fDebug;
188
189 std::optional<std::string> fWorkingDir;
190 std::optional<IntFlag_t> fNProcesses;
191 std::optional<std::string> fObjectFilterFile;
192 std::optional<Int_t> fObjectFilterType;
193 std::optional<TString> fCacheSize;
194 std::optional<ROOT::TIOFeatures> fFeatures;
195 std::optional<IntFlag_t> fMaxOpenedFiles;
196 std::optional<IntFlag_t> fVerbosity;
197 std::optional<IntFlag_t> fCompressionSettings;
198
201 // This is set to true if and only if the user passed `--`. In this special
202 // case, we must not stop parsing positional arguments even if we find one
203 // that starts with a `-`.
205};
206
208
209static EFlagResult FlagToggle(const char *arg, const char *flagStr, bool &flagOut)
210{
211 const auto argLen = strlen(arg);
212 const auto flagLen = strlen(flagStr);
213 if (argLen == flagLen && strncmp(arg, flagStr, flagLen) == 0) {
214 if (flagOut)
215 Warn() << "duplicate flag: " << flagStr << "\n";
216 flagOut = true;
218 }
220}
221
222// NOTE: not using std::stoi or similar because they have bad error checking.
223// std::stoi will happily parse "120notvalid" as 120.
224static std::optional<IntFlag_t> StrToUInt(const char *str)
225{
226 if (!str)
227 return {};
228
229 uint32_t res = 0;
230 do {
231 if (!isdigit(*str))
232 return {};
233 if (res > (UINT32_MAX - static_cast<uint32_t>(*str - '0')) / 10) // res * 10 + digit would overflow: error
234 return {};
235 res *= 10;
236 res += *str - '0';
237 } while (*++str);
238
239 return res;
240}
241
242template <typename T>
247
248template <typename T>
249static FlagConvResult<T> ConvertArg(const char *);
250
251template <>
253{
254 return {arg, EFlagResult::kParsed};
255}
256
257template <>
259{
260 // Don't even try to parse arg if it doesn't look like a number.
261 if (!isdigit(*arg))
262 return {0, EFlagResult::kIgnored};
263
264 auto intOpt = StrToUInt(arg);
265 if (intOpt)
266 return {*intOpt, EFlagResult::kParsed};
267
268 Err() << "error parsing integer argument '" << arg << "'\n";
269 return {0, EFlagResult::kErr};
270}
271
272template <>
274{
276 std::stringstream ss;
277 ss.str(arg);
278 std::string item;
279 while (std::getline(ss, item, ',')) {
280 if (!features.Set(item))
281 Warn() << "ignoring unknown feature request: " << item << "\n";
282 }
284}
285
287{
288 TString cacheSize;
289 int size;
292 Err() << "could not parse the cache size passed after -cachesize: '" << arg << "'\n";
293 return {"", EFlagResult::kErr};
295 double m;
296 const char *munit = nullptr;
298 Warn() << "the cache size passed after -cachesize is too large: " << arg << " is greater than " << m << munit
299 << ". We will use the maximum value.\n";
300 return {std::to_string(m) + munit, EFlagResult::kParsed};
301 } else {
302 cacheSize = "cachesize=";
303 cacheSize.Append(arg);
304 }
305 return {cacheSize, EFlagResult::kParsed};
306}
307
309{
310 if (strcmp(arg, "SkipListed") == 0)
312 if (strcmp(arg, "OnlyListed") == 0)
314
315 Err() << "invalid argument for -Ltype: '" << arg << "'. Can only be 'SkipListed' or 'OnlyListed' (case matters).\n";
316 return {{}, EFlagResult::kErr};
317}
318
319// Parses a flag that is followed by an argument of type T.
320// If `defaultVal` is provided, the following argument is optional and will be set to `defaultVal` if missing.
321// `conv` is used to convert the argument from string to its type T.
322template <typename T>
323static EFlagResult
324FlagArg(int argc, char **argv, int &argIdxInOut, const char *flagStr, std::optional<T> &flagOut,
325 std::optional<T> defaultVal = std::nullopt, FlagConvResult<T> (*conv)(const char *) = ConvertArg<T>)
326{
327 int argIdx = argIdxInOut;
328 const char *arg = argv[argIdx] + 1;
329 int argLen = strlen(arg);
330 int flagLen = strlen(flagStr);
331 const char *nxtArg = nullptr;
332
333 if (strncmp(arg, flagStr, flagLen) != 0)
335
336 bool argIsSeparate = false;
337 if (argLen > flagLen) {
338 // interpret anything after the flag as the argument.
339 nxtArg = arg + flagLen;
340 // Ignore one '=', if present
341 if (nxtArg[0] == '=')
342 ++nxtArg;
343 } else if (argLen == flagLen) {
344 argIsSeparate = true;
345 if (argIdx + 1 < argc) {
346 ++argIdxInOut;
348 } else {
349 Err() << "expected argument after '-" << flagStr << "' flag.\n";
350 return EFlagResult::kErr;
351 }
352 } else {
354 }
355
356 auto converted = conv(nxtArg);
357 if (converted.fResult == EFlagResult::kParsed) {
358 flagOut = converted.fValue;
359 } else if (converted.fResult == EFlagResult::kIgnored) {
360 if (defaultVal && argIsSeparate) {
362 // If we had tried parsing the next argument, step back one arg idx.
364 } else {
365 Err() << "the argument after '-" << flagStr << "' flag was not of the expected type.\n";
366 return EFlagResult::kErr;
367 }
368 } else {
369 return EFlagResult::kErr;
370 }
371
373}
374
376{
377 // Must be a number between 0 and 509 (with a 0 in the middle)
378 if (compSettings == 0)
379 return true;
380 // We also accept [1-9] as aliases of [101-109], but it's discouraged.
381 if (compSettings >= 1 && compSettings <= 9) {
382 Warn() << "interpreting " << compSettings << " as " << 100 + compSettings
383 << "."
384 " This behavior is deprecated, please use the full compression settings.\n";
385 return true;
386 }
387 return (compSettings >= 100 && compSettings <= 509) && ((compSettings / 10) % 10 == 0);
388}
389
390// The -f flag has a somewhat complicated logic.
391// We have 4 cases:
392// 1. -f
393// 2. -ff
394// 3. -fk
395// 4. -f[0-509]
396//
397// and a combination thereof (e.g. -fk101, -ff202, -ffk, -fk209)
398// -ff and -f[0-509] are incompatible.
399//
400// ALL these flags imply '-f' ("force overwrite"), but only if they parse successfully.
401// This means that if we see a -f[something] and that "something" doesn't parse to a valid
402// number between 0 and 509, or f or k, we consider the flag invalid and skip it without
403// setting any state.
404//
405// Note that we don't allow `-f [0-9]` because that would be a backwards-incompatible
406// change with the previous arg parsing semantic, changing the meaning of a cmdline like:
407//
408// $ hadd -f 200 f.root g.root # <- '200' is the output file, not an argument to -f!
409static EFlagResult FlagF(const char *arg, HAddArgs &args)
410{
411 if (arg[0] != 'f')
413
414 args.fForce = true;
415 const char *cur = arg + 1;
416 while (*cur) {
417 switch (cur[0]) {
418 case 'f':
420 Warn() << "duplicate flag: -ff\n";
421 if (args.fCompressionSettings) {
422 Err() << "cannot specify both -ff and -f[0-9]. Either use the first input compression or "
423 "specify it.\n";
424 return EFlagResult::kErr;
425 } else
426 args.fUseFirstInputCompression = true;
427 break;
428 case 'k':
429 if (args.fKeepCompressionAsIs)
430 Warn() << "duplicate flag: -fk\n";
431 args.fKeepCompressionAsIs = true;
432 break;
433 default:
434 if (isdigit(cur[0])) {
435 if (args.fUseFirstInputCompression) {
436 Err() << "cannot specify both -ff and -f[0-9]. Either use the first input compression or "
437 "specify it.\n";
438 return EFlagResult::kErr;
439 } else if (!args.fCompressionSettings) {
440 if (auto compLv = StrToUInt(cur)) {
443 // we can't see any other argument after the number, so we return here to avoid
444 // incorrectly parsing the rest of the characters in `arg`.
446 } else {
447 Err() << *compLv << " is not a supported compression setting.\n";
448 return EFlagResult::kErr;
449 }
450 } else {
451 Err() << "failed to parse compression settings '" << cur << "' as an integer.\n";
452 return EFlagResult::kErr;
453 }
454 } else {
455 Err() << "cannot specify -f[0-9] multiple times!\n";
456 return EFlagResult::kErr;
457 }
458 } else {
459 Err() << "invalid flag: " << arg << "\n";
460 return EFlagResult::kErr;
461 }
462 }
463 ++cur;
464 }
465
467}
468
469// Returns nullopt if any of the flags failed to parse.
470// If an unknown flag is encountered, it will print a warning and go on.
471static std::optional<HAddArgs> ParseArgs(int argc, char **argv)
472{
473 HAddArgs args{};
474
475 enum {
481
482 for (int argIdx = 1; argIdx < argc; ++argIdx) {
483 const char *argRaw = argv[argIdx];
484 if (!*argRaw)
485 continue;
486
487 if (!args.fNoFlagsAfterPositionalArguments && argRaw[0] == '-' && argRaw[1] != '\0') {
488 if (argRaw[1] == '-' && argRaw[2] == '\0') {
489 // special case `--`: force parsing to consider all future args as positional arguments.
491 Err()
492 << "found `--`, but we've already parsed (or are still parsing) a sequence of positional arguments!"
493 " This is not supported: you must have exactly one sequence of positional arguments, so if you"
494 " need to use `--` make sure to pass *all* positional arguments after it.";
495 return {};
496 }
497 args.fNoFlagsAfterPositionalArguments = true;
498 continue;
499 }
500
501 // parse flag
503
504 const char *arg = argRaw + 1;
505 bool validFlag = false;
506
507#define PARSE_FLAG(func, ...) \
508 do { \
509 if (!validFlag) { \
510 const auto res = func(__VA_ARGS__); \
511 if (res == EFlagResult::kErr) \
512 return {}; \
513 validFlag = res == EFlagResult::kParsed; \
514 } \
515 } while (0)
516
517 // NOTE: if two flags have the same prefix (e.g. -Ltype and -L) always put the longest one first!
518 PARSE_FLAG(FlagToggle, arg, "T", args.fNoTrees);
519 PARSE_FLAG(FlagToggle, arg, "a", args.fAppend);
520 PARSE_FLAG(FlagToggle, arg, "k", args.fSkipErrors);
521 PARSE_FLAG(FlagToggle, arg, "O", args.fReoptimize);
522 PARSE_FLAG(FlagToggle, arg, "dbg", args.fDebug);
523 PARSE_FLAG(FlagArg, argc, argv, argIdx, "d", args.fWorkingDir);
524 PARSE_FLAG(FlagArg, argc, argv, argIdx, "j", args.fNProcesses, {0});
525 PARSE_FLAG(FlagArg, argc, argv, argIdx, "Ltype", args.fObjectFilterType, {}, ConvertFilterType);
526 PARSE_FLAG(FlagArg, argc, argv, argIdx, "L", args.fObjectFilterFile);
527 PARSE_FLAG(FlagArg, argc, argv, argIdx, "cachesize", args.fCacheSize, {}, ConvertCacheSize);
528 PARSE_FLAG(FlagArg, argc, argv, argIdx, "experimental-io-features", args.fFeatures);
529 PARSE_FLAG(FlagArg, argc, argv, argIdx, "n", args.fMaxOpenedFiles);
530 PARSE_FLAG(FlagArg, argc, argv, argIdx, "v", args.fVerbosity, {99});
531 PARSE_FLAG(FlagF, arg, args);
532
533#undef PARSE_FLAG
534
535 if (!validFlag)
536 Warn() << "unknown flag: " << argRaw << "\n";
537
538 } else if (!args.fOutputArgIdx) {
539 // First positional argument is the output
540 args.fOutputArgIdx = argIdx;
543 } else {
544 // We should be in the same positional argument group as the output, error otherwise
546 if (!args.fFirstInputIdx) {
547 args.fFirstInputIdx = argIdx;
548 }
549 } else {
550 Err() << "seen a positional argument '" << argRaw
551 << "' after some flags."
552 " Positional arguments were already parsed at this point (from '"
553 << argv[args.fOutputArgIdx]
554 << "' onwards), and you can only have one sequence of them, so you cannot pass more."
555 " Please group your positional arguments all together so that hadd works as you expect.\n"
556 "Cmdline: ";
557 for (int i = 0; i < argc; ++i)
558 std::cerr << argv[i] << " ";
559 std::cerr << "\n";
560
561 return {};
562 }
563 }
564 }
565
566 return args;
567}
568
569// Returns the flags to add to the file merger's flags, or -1 in case of errors.
570static Int_t ParseFilterFile(const std::optional<std::string> &filterFileName,
571 std::optional<Int_t> objectFilterType, TFileMerger &fileMerger)
572{
573 if (filterFileName) {
574 std::ifstream filterFile(*filterFileName);
575 if (!filterFile) {
576 Err() << "error opening filter file '" << *filterFileName << "'\n";
577 return -1;
578 }
580 std::string line;
581 std::string objPath;
582 int nObjects = 0;
583 while (std::getline(filterFile, line)) {
584 std::istringstream ss(line);
585 // only read exactly 1 token per line (strips any whitespaces and such)
586 objPath.clear();
587 ss >> objPath;
588 if (!objPath.empty() && objPath[0] != '#') {
589 filteredObjects.Append(objPath + ' ');
590 ++nObjects;
591 }
592 }
593
594 if (nObjects) {
595 Info() << "added " << nObjects << " object(s) from filter file '" << *filterFileName << "'\n";
596 fileMerger.AddObjectNames(filteredObjects);
597 } else {
598 Warn() << "no objects were added from filter file '" << *filterFileName << "'\n";
599 }
600
601 assert(objectFilterType.has_value());
602 const auto filterFlag = *objectFilterType;
604 return filterFlag;
605 }
606 return 0;
607}
608
609int main(int argc, char **argv)
610{
611 if (argc < 3 || "-h" == std::string(argv[1]) || "--help" == std::string(argv[1])) {
613 return (argc == 2 && ("-h" == std::string(argv[1]) || "--help" == std::string(argv[1]))) ? 0 : 1;
614 }
615
616 const auto argsOpt = ParseArgs(argc, argv);
617 if (!argsOpt)
618 return 1;
619 const HAddArgs &args = *argsOpt;
620
622 Int_t maxopenedfiles = args.fMaxOpenedFiles.value_or(0);
623 Int_t verbosity = args.fVerbosity.value_or(99);
624 Int_t newcomp = args.fCompressionSettings.value_or(-1);
625 TString cacheSize = args.fCacheSize.value_or("");
626
627 // For the -j flag (nProcesses), we check if the flag is present and, if so, if it has a
628 // valid value (i.e. any value > 0).
629 // If the flag is present at all, we do multiprocessing. If the value of nProcesses is invalid,
630 // we default to the number of cpus on the machine.
631 Bool_t multiproc = args.fNProcesses.has_value();
632 int nProcesses;
633 if (args.fNProcesses && *args.fNProcesses > 0) {
634 nProcesses = *args.fNProcesses;
635 } else {
636 SysInfo_t s;
637 gSystem->GetSysInfo(&s);
638 nProcesses = s.fCpus;
639 }
640 if (multiproc)
641 Info() << "parallelizing with " << nProcesses << " processes.\n";
642
643 // If the user specified a workingDir, use that. Otherwise, default to the system temp dir.
644 std::string workingDir;
645 if (!args.fWorkingDir) {
647 } else if (args.fWorkingDir && gSystem->AccessPathName(args.fWorkingDir->c_str())) {
648 Err() << "could not access the directory specified: " << *args.fWorkingDir << ".\n";
649 return 1;
650 } else {
651 workingDir = *args.fWorkingDir;
652 }
653
654 // Verify that -L and -Ltype are either both present or both absent.
655 if (args.fObjectFilterFile.has_value() != args.fObjectFilterType.has_value()) {
656 Err() << "-L must always be passed along with -Ltype.\n";
657 return 1;
658 }
659
660 const char *targetname = 0;
661 if (!args.fOutputArgIdx) {
662 Err() << "missing output file.\n";
663 return 1;
664 }
665 if (!args.fFirstInputIdx) {
666 Err() << "missing input file.\n";
667 return 1;
668 }
670
671 if (verbosity > 1)
672 Info() << "target file: " << targetname << "\n";
673
674 if (args.fCacheSize)
675 Info() << "Using " << cacheSize << "\n";
676
677 ////////////////////////////// end flags processing /////////////////////////////////
678
679 gSystem->Load("libTreePlayer");
680
682 fileMerger.SetMsgPrefix("hadd");
683 fileMerger.SetPrintLevel(verbosity - 1);
684 if (maxopenedfiles > 0) {
685 fileMerger.SetMaxOpenedFiles(maxopenedfiles);
686 }
687 // The following section will collect all input filenames into a vector,
688 // including those listed within an indirect file.
689 // If any file can not be accessed, it will error out, unless args.fSkipErrors is true
690 std::vector<std::string> allSubfiles;
691 for (int a = args.fFirstInputIdx; a < argc; ++a) {
692 if (!args.fNoFlagsAfterPositionalArguments && argv[a] && argv[a][0] == '-') {
693 break;
694 }
695 if (argv[a] && argv[a][0] == '@') {
696 std::ifstream indirect_file(argv[a] + 1);
697 if (!indirect_file.is_open()) {
698 Err() << "could not open indirect file " << (argv[a] + 1) << std::endl;
699 if (!args.fSkipErrors)
700 return 1;
701 } else {
702 std::string line;
703 while (indirect_file) {
704 if (std::getline(indirect_file, line) && line.length()) {
705 if (gSystem->AccessPathName(line.c_str(), kReadPermission) == kTRUE) {
706 Err() << "could not validate the file name \"" << line << "\" within indirect file "
707 << (argv[a] + 1) << std::endl;
708 if (!args.fSkipErrors)
709 return 1;
710 } else if (std::filesystem::exists(targetname) && std::filesystem::equivalent(line, targetname)) {
711 Err() << "file " << line << " cannot be both the target and an input!\n";
712 if (!args.fSkipErrors)
713 return 1;
714 } else {
715 allSubfiles.emplace_back(line);
716 }
717 }
718 }
719 }
720 } else {
721 const std::string line = argv[a];
722 if (gSystem->AccessPathName(line.c_str(), kReadPermission) == kTRUE) {
723 Err() << "could not validate argument \"" << line << "\" as input file " << std::endl;
724 if (!args.fSkipErrors)
725 return 1;
726 } else if (std::filesystem::exists(targetname) && std::filesystem::equivalent(line, targetname)) {
727 Err() << "file " << line << " cannot be both the target and an input!\n";
728 if (!args.fSkipErrors)
729 return 1;
730 } else {
731 allSubfiles.emplace_back(line);
732 }
733 }
734 }
735 if (allSubfiles.empty()) {
736 Err() << "could not find any valid input file " << std::endl;
737 return 1;
738 }
739 // The next snippet determines the output compression if unset
740 if (newcomp == -1) {
742 // grab from the first file.
743 TFile *firstInput = TFile::Open(allSubfiles.front().c_str());
744 if (firstInput && !firstInput->IsZombie())
745 newcomp = firstInput->GetCompressionSettings();
746 else
748 delete firstInput;
749 fileMerger.SetMergeOptions(TString("FirstSrcCompression"));
750 } else {
752 fileMerger.SetMergeOptions(TString("DefaultCompression"));
753 }
754 }
755 if (verbosity > 1) {
756 if (args.fKeepCompressionAsIs && !args.fReoptimize)
757 Info() << "compression setting for meta data: " << newcomp << '\n';
758 else
759 Info() << "compression setting for all output: " << newcomp << '\n';
760 }
761 if (args.fAppend) {
762 if (!fileMerger.OutputFile(targetname, "UPDATE", newcomp)) {
763 Err() << "error opening target file for update :" << targetname << ".\n";
764 return 2;
765 }
766 } else if (!fileMerger.OutputFile(targetname, args.fForce, newcomp)) {
767 Err() << "error opening target file (does " << targetname << " exist?).\n";
768 if (!args.fForce)
769 Info() << "pass \"-f\" argument to force re-creation of output file.\n";
770 return 1;
771 }
772
773 auto step = (allSubfiles.size() + nProcesses - 1) / nProcesses;
774 if (multiproc && step < 3) {
775 // At least 3 files per process
776 step = 3;
777 nProcesses = (allSubfiles.size() + step - 1) / step;
778 Info() << "each process should handle at least 3 files for efficiency."
779 " Setting the number of processes to: "
780 << nProcesses << std::endl;
781 }
782 if (nProcesses == 1)
784
785 std::vector<std::string> partialFiles;
786
787#ifndef R__WIN32
788 // this is commented out only to try to prevent false positive detection
789 // from several anti-virus engines on Windows, and multiproc is not
790 // supported on Windows anyway
791 if (multiproc) {
792 auto uuid = TUUID();
793 auto partialTail = uuid.AsString();
794 for (auto i = 0; (i * step) < allSubfiles.size(); i++) {
795 std::stringstream buffer;
796 buffer << workingDir << "/partial" << i << "_" << partialTail << ".root";
797 partialFiles.emplace_back(buffer.str());
798 }
799 }
800#endif
801
802 auto mergeFiles = [&](TFileMerger &merger) {
803 if (args.fReoptimize) {
804 merger.SetFastMethod(kFALSE);
805 } else {
806 if (!args.fKeepCompressionAsIs && merger.HasCompressionChange()) {
807 // Don't warn if the user has requested any re-optimization.
808 Warn() << "Sources and Target have different compression settings\n"
809 "hadd merging will be slower\n";
810 }
811 }
812 merger.SetNotrees(args.fNoTrees);
813 merger.SetMergeOptions(TString(merger.GetMergeOptions()) + " " + cacheSize);
814 merger.SetIOFeatures(features);
817 if (extraFlags < 0)
818 return false;
820 if (args.fAppend)
822 else
824 Bool_t status = merger.PartialMerge(fileMergerFlags);
825 return status;
826 };
827
828 auto sequentialMerge = [&](TFileMerger &merger, int start, int nFiles) {
829 for (auto i = start; i < (start + nFiles) && i < static_cast<int>(allSubfiles.size()); i++) {
830 if (!merger.AddFile(allSubfiles[i].c_str())) {
831 if (args.fSkipErrors) {
832 Warn() << "skipping file with error: " << allSubfiles[i] << std::endl;
833 } else {
834 Err() << "exiting due to error in " << allSubfiles[i] << std::endl;
835 return kFALSE;
836 }
837 }
838 }
839 return mergeFiles(merger);
840 };
841
842 auto parallelMerge = [&](int start) {
844 mergerP.SetMsgPrefix("hadd");
845 mergerP.SetPrintLevel(verbosity - 1);
846 if (maxopenedfiles > 0) {
847 mergerP.SetMaxOpenedFiles(maxopenedfiles / nProcesses);
848 }
849 if (!mergerP.OutputFile(partialFiles[start / step].c_str(), newcomp)) {
850 Err() << "error opening target partial file\n";
851 exit(1);
852 }
853 return sequentialMerge(mergerP, start, step);
854 };
855
856 auto reductionFunc = [&]() {
857 for (const auto &pf : partialFiles) {
858 fileMerger.AddFile(pf.c_str());
859 }
860 return mergeFiles(fileMerger);
861 };
862
863 Bool_t status;
864
865#ifndef R__WIN32
866 if (multiproc) {
868 auto res = p.Map(parallelMerge, ROOT::TSeqI(0, allSubfiles.size(), step));
869 status = std::accumulate(res.begin(), res.end(), 0U) == partialFiles.size();
870 if (status) {
871 status = reductionFunc();
872 } else {
873 Err() << "failed at the parallel stage\n";
874 }
875 if (!args.fDebug) {
876 for (const auto &pf : partialFiles) {
877 gSystem->Unlink(pf.c_str());
878 }
879 }
880 } else {
881 status = sequentialMerge(fileMerger, 0, allSubfiles.size());
882 }
883#else
884 status = sequentialMerge(fileMerger, 0, allSubfiles.size());
885#endif
886
887 if (status) {
888 if (verbosity == 1) {
889 Info() << "merged " << allSubfiles.size() << " (" << fileMerger.GetMergeList()->GetEntries()
890 << ") input (partial) files into " << targetname << ".\n";
891 }
892 return 0;
893 } else {
894 if (verbosity == 1) {
895 Err() << "failure during the merge of " << allSubfiles.size() << " ("
896 << fileMerger.GetMergeList()->GetEntries() << ") input (partial) files into " << targetname << ".\n";
897 }
898 return 1;
899 }
900}