hadd.cxx
1/**
2 \file hadd.cxx
3 \brief This program will merge compatible ROOT objects, such as histograms, Trees and RNTuples,
4 from a list of root files and write them to a target root file.
5 In order for a ROOT object to be mergeable, it must implement the Merge() function.
6 Non-mergeable objects will have all instances copied as-is into the target file.
7 The target file must not be identical to one of the source files.
8
9 Syntax:
10 ```{.cpp}
11 hadd [flags] targetfile source1 source2 ... [flags]
12 ```
13
14 Flags can be passed before or after the positional arguments.
15 The first positional (non-flag) argument will be interpreted as the targetfile.
16 After that, the first sequence of positional arguments will be interpreted as the input files.
17 If two sequences of positional arguments are separated by flags, hadd will emit an error and abort.
18
19 By default, any argument starting with `-` is interpreted as a flag. If you want to pass filenames
20 starting with `-` you need to pass them after `--`:
21 ```{.cpp}
22 hadd [flags] -- -file1 -file2 ...
23 ```
24 Note that in this case you need to pass ALL positional arguments after `--`.
25
26 If a flag requires an argument, the argument can be specified in any of these ways:
27
28 # All equally valid:
29 -j 16
30 -j16
31 -j=16
32
33 The first syntax is the preferred one since it's backward-compatible with previous versions of hadd.
34 The -f flag is an exception to this rule: it only supports the `-f[0-9]` syntax.
35
36 Note that merging multiple flags is NOT supported: `-jfa` will be interpreted as -j=fa, which is invalid!
37
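As an illustration of the three accepted spellings, here is a minimal sketch of how the value attached to a flag could be extracted. It is not the actual hadd parser: `SplitFlagValue` is a hypothetical helper, and it assumes the leading `-` has already been stripped. Because it matches the flag name as a prefix, `-jfa` yields the value `fa`, which is why merged flags are rejected.

```cpp
#include <cassert>
#include <cstring>
#include <optional>
#include <string>

// Hypothetical helper, for illustration only.
// `arg` is the token without its leading '-', `flag` is the flag name,
// `next` is the following command-line token (or nullptr if there is none).
std::optional<std::string> SplitFlagValue(const char *arg, const char *flag, const char *next)
{
   const std::size_t flagLen = std::strlen(flag);
   if (std::strncmp(arg, flag, flagLen) != 0)
      return std::nullopt; // not this flag

   const char *rest = arg + flagLen;
   if (*rest == '\0') {
      // "-j 16": the value is the next token, if present
      if (next)
         return std::string(next);
      return std::nullopt;
   }
   if (*rest == '=')
      ++rest; // "-j=16": skip exactly one '='
   return std::string(rest); // "-j16" or "-j=16"
}
```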
38 The flags are as follows:
39
40 \param -a Append to the output
41 \param -cachesize <SIZE> Resize the prefetching cache used to speed up I/O operations (use 0 to disable).
42 \param -d <DIR> Carry out the partial multiprocess execution in the specified directory
43 \param -dbg Enable verbosity. If -j was specified, do not delete partial files
44 stored inside working directory.
45 \param -experimental-io-features <FEATURES> Enables the corresponding experimental feature for output trees.
46 \see ROOT::Experimental::EIOFeatures
47 \param -f Force overwriting of output file.
48 \param -f[0-9] Set target compression algorithm `i` and level `j` passing the number `i*100 + j`, e.g. `-f505`.
49 The last digit (`j`) can be set from 0 = uncompressed to 9 = highly compressed.
50 The first digit (`i`) is 1 for ZLIB, 2 for LZMA, 4 for LZ4 and 5 for ZSTD.
51 Recommended numbers are 101 (ZLIB), 207 (LZMA), 404 (LZ4), 505 (ZSTD).
52 The default value for this flag is 101 (kDefaultZLIB).
53 See ROOT::RCompressionSetting and TFile::TFile documentation for more details.
54 \param -fk Sets the target file to contain the baskets with the same compression as the input files
55 (unless -O is specified). Compresses the meta data using the compression level specified
56 in the first input or the compression setting after fk (for example 505 when using -fk505)
57 \param -ff The compression level used is the one specified in the first input
58 \param -j [N_JOBS] Parallelise the execution in `N_JOBS` processes. If the number of processes is not specified,
59 or is 0, use the system maximum.
60 \param -k Skip corrupt or non-existent files, do not exit
61 \param -L <FILE> Read the list of objects from FILE and either only merge or skip those objects depending on
62 the value of "-Ltype". FILE must contain one object name per line, which cannot contain
63 whitespace or '/'. You can also pass TDirectory names, which apply to the entire directory
64 content. Lines beginning with '#' are ignored. If this flag is passed, "-Ltype" MUST be
65 passed as well.
66 \param -Ltype <SkipListed|OnlyListed> Sets the type of operation performed on the objects listed in FILE given with the
67 "-L" flag. "SkipListed" will skip all the listed objects; "OnlyListed" will only merge those
68 objects. If this flag is passed, "-L" must be passed as well.
69 \param -n <N_FILES> Open at most `N_FILES` files at once (use 0 to request the system maximum, which is also
70 the default). This number includes both the input files being read and the output file.
71 Thus, if set to 1, it will be automatically raised to the minimum of 2. If set to too large a value,
72 it will be clipped to the system maximum.
73 \param -O Re-optimize basket size when merging TTree
74 \param -T Do not merge Trees
75 \param -v [LEVEL] Explicitly set the verbosity level: 0 requests no output; 99 is the default
76 \return hadd returns a status code: 0 if OK, 1 otherwise
77
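The `i*100 + j` encoding used by `-f[0-9]` can be made concrete with a small sketch. The names `CompressionSetting`, `DecodeCompression` and `AlgorithmName` are hypothetical and exist only for this illustration; the algorithm table lists only the values named in the flag description above, and anything else is reported as "unknown" here.

```cpp
#include <cassert>
#include <string>

// Illustrative only: the two parts of a compression setting i*100 + j.
struct CompressionSetting {
   int fAlgorithm; // i: 1 = ZLIB, 2 = LZMA, 4 = LZ4, 5 = ZSTD
   int fLevel;     // j: 0 (uncompressed) to 9 (highly compressed)
};

CompressionSetting DecodeCompression(int setting)
{
   return {setting / 100, setting % 100};
}

// Maps the algorithm digit to the names given in the -f[0-9] description.
std::string AlgorithmName(int algorithm)
{
   switch (algorithm) {
   case 1: return "ZLIB";
   case 2: return "LZMA";
   case 4: return "LZ4";
   case 5: return "ZSTD";
   default: return "unknown"; // not listed in the description above
   }
}
```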
78 For example assume 3 files f1, f2, f3 containing histograms hn and Trees Tn
79 - f1 with h1 h2 h3 T1
80 - f2 with h1 h4 T1 T2
81 - f3 with h5
82 the result of
83 ```
84 hadd -f x.root f1.root f2.root f3.root
85 ```
86 will be a file x.root with h1 h2 h3 h4 h5 T1 T2
87 where
88 - h1 will be the sum of the 2 histograms in f1 and f2
89 - T1 will be the merge of the Trees in f1 and f2
90
91 The files may contain sub-directories.
92
93 If the source files contain histograms and Trees, one can skip
94 the Trees with
95 ```
96 hadd -T targetfile source1 source2 ...
97 ```
98
99 Wildcarding and indirect files are also supported
100 ```
101 hadd result.root myfil*.root
102 ```
103 will merge all files in myfil*.root
104 ```
105 hadd result.root file1.root @list.txt file2.root myfil*.root
106 ```
107 will merge file1.root, file2.root, all files in myfil*.root
108 and all files in the indirect text file list.txt ("@" as the first
109 character of the file indicates an indirect file. An indirect file
110 is a text file containing a list of other files, including other
111 indirect files, one line per file).
112
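How an indirect file contributes to the input list can be sketched as follows. This is a simplified illustration, not the code in `main`: `AppendIndirect` is a hypothetical helper that reads from any `std::istream`, keeps every non-empty line as a file name, and leaves out both the access and target-file checks and any handling of nested indirect files.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Illustration only: append every non-empty line of an indirect list to `files`.
// Returns the number of names that were added.
std::size_t AppendIndirect(std::istream &list, std::vector<std::string> &files)
{
   std::size_t added = 0;
   std::string line;
   while (std::getline(list, line)) {
      if (line.empty())
         continue; // blank lines are skipped
      files.emplace_back(line);
      ++added;
   }
   return added;
}
```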
113 If the source and target compression levels are identical (the default),
114 the program uses the TChain::Merge function with option "fast", i.e.
115 the merge is done without unzipping or unstreaming the baskets
116 (a direct copy of the raw bytes on disk). The "fast" mode is typically
117 5 times faster than the mode that unzips and unstreams the baskets.
118
119 If the option -cachesize is used, hadd will resize (or disable if 0) the
120 prefetching cache used to speed up I/O operations.
121
122 For options that take a size as argument, a decimal number of bytes is expected.
123 If the number ends with a `k`, `m`, `g`, etc., the number is multiplied
124 by 1000 (1K), 1000000 (1M), 1000000000 (1G), etc.
125 If this prefix is followed by `i`, the number is multiplied by the traditional
126 1024 (1Ki), 1048576 (1Mi), 1073741824 (1Gi), etc.
127 The prefix can be optionally followed by `B`, whose casing is ignored,
128 e.g. 1k, 1K, 1Kb and 1KB are the same.
129
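The size rules above can be sketched with a small parser. This is an illustration under stated assumptions, not the ROOT implementation: `ParseSize` is a hypothetical helper, it accepts only whole numbers, it covers only the `k`, `m` and `g` units, and it does not guard against overflow.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical helper, for illustration only.
std::optional<std::uint64_t> ParseSize(const std::string &text)
{
   std::size_t pos = 0;
   std::uint64_t value = 0;
   while (pos < text.size() && std::isdigit(static_cast<unsigned char>(text[pos]))) {
      value = value * 10 + (text[pos] - '0');
      ++pos;
   }
   if (pos == 0)
      return std::nullopt; // no leading number

   // Unit letter (case-insensitive): k, m, g select the exponent 1, 2, 3.
   const std::string units = "kmg";
   std::uint64_t exponent = 0;
   if (pos < text.size()) {
      const auto unit = units.find(static_cast<char>(std::tolower(static_cast<unsigned char>(text[pos]))));
      if (unit != std::string::npos) {
         exponent = unit + 1;
         ++pos;
      }
   }
   // An 'i' after the unit switches from powers of 1000 to powers of 1024.
   std::uint64_t base = 1000;
   if (exponent > 0 && pos < text.size() && std::tolower(static_cast<unsigned char>(text[pos])) == 'i') {
      base = 1024;
      ++pos;
   }
   // Optional trailing 'B' or 'b'.
   if (pos < text.size() && std::tolower(static_cast<unsigned char>(text[pos])) == 'b')
      ++pos;
   if (pos != text.size())
      return std::nullopt; // unexpected trailing characters

   for (std::uint64_t i = 0; i < exponent; ++i)
      value *= base;
   return value;
}
```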
130 \note By default histograms are added. However hadd does not support the case where
131 histograms have their bit TH1::kIsAverage set.
132
133 \authors Rene Brun, Dirk Geppert, Sven A. Schmidt, Toby Burnett
134*/
135#include "Compression.h"
136#include "TClass.h"
137#include "TFile.h"
138#include "TFileMerger.h"
139#include "THashList.h"
140#include "TKey.h"
141#include "TSystem.h"
142#include "TUUID.h"
143
144#include <ROOT/RConfig.hxx>
145#include <ROOT/StringConv.hxx>
146#include <ROOT/TIOFeatures.hxx>
147
148#include "haddCommandLineOptionsHelp.h"
149
150#include <climits>
151#include <cstdlib>
152#include <filesystem>
153#include <fstream>
154#include <iostream>
155#include <optional>
156#include <sstream>
157#include <string>
158
159#ifndef R__WIN32
161#endif
162
163////////////////////////////////////////////////////////////////////////////////
164
165inline std::ostream &Err()
166{
167 std::cerr << "Error in <hadd>: ";
168 return std::cerr;
169}
170
171inline std::ostream &Warn()
172{
173 std::cerr << "Warning in <hadd>: ";
174 return std::cerr;
175}
176
177inline std::ostream &Info()
178{
179 std::cerr << "Info in <hadd>: ";
180 return std::cerr;
181}
182
183using IntFlag_t = uint32_t;
184
185struct HAddArgs {
188 bool fForce;
191 bool fDebug;
194
195 std::optional<std::string> fWorkingDir;
196 std::optional<IntFlag_t> fNProcesses;
197 std::optional<std::string> fObjectFilterFile;
198 std::optional<Int_t> fObjectFilterType;
199 std::optional<TString> fCacheSize;
200 std::optional<ROOT::TIOFeatures> fFeatures;
201 std::optional<IntFlag_t> fMaxOpenedFiles;
202 std::optional<IntFlag_t> fVerbosity;
203 std::optional<IntFlag_t> fCompressionSettings;
204
207 // This is set to true if and only if the user passed `--`. In this special
208 // case, we must not stop parsing positional arguments even if we find one
209 // that starts with a `-`.
211};
212
214
215static EFlagResult FlagToggle(const char *arg, const char *flagStr, bool &flagOut)
216{
217 const auto argLen = strlen(arg);
218 const auto flagLen = strlen(flagStr);
219 if (argLen == flagLen && strncmp(arg, flagStr, flagLen) == 0) {
220 if (flagOut)
221 Warn() << "duplicate flag: " << flagStr << "\n";
222 flagOut = true;
224 }
226}
227
228// NOTE: not using std::stoi or similar because they have bad error checking.
229// std::stoi will happily parse "120notvalid" as 120.
230static std::optional<IntFlag_t> StrToUInt(const char *str)
231{
232 if (!str)
233 return {};
234
235 uint32_t res = 0;
236 do {
237 if (!isdigit(*str))
238 return {};
239 if (res > (UINT32_MAX - (*str - '0')) / 10) // overflow is an error
240 return {};
241 res *= 10;
242 res += *str - '0';
243 } while (*++str);
244
245 return res;
246}
247
248template <typename T>
253
254template <typename T>
255static FlagConvResult<T> ConvertArg(const char *);
256
257template <>
259{
260 return {arg, EFlagResult::kParsed};
261}
262
263template <>
265{
266 // Don't even try to parse arg if it doesn't look like a number.
267 if (!isdigit(*arg))
268 return {0, EFlagResult::kIgnored};
269
270 auto intOpt = StrToUInt(arg);
271 if (intOpt)
272 return {*intOpt, EFlagResult::kParsed};
273
274 Err() << "error parsing integer argument '" << arg << "'\n";
275 return {0, EFlagResult::kErr};
276}
277
278template <>
280{
282 std::stringstream ss;
283 ss.str(arg);
284 std::string item;
285 while (std::getline(ss, item, ',')) {
286 if (!features.Set(item))
287 Warn() << "ignoring unknown feature request: " << item << "\n";
288 }
290}
291
293{
294 TString cacheSize;
295 int size;
298 Err() << "could not parse the cache size passed after -cachesize: '" << arg << "'\n";
299 return {"", EFlagResult::kErr};
301 double m;
302 const char *munit = nullptr;
304 Warn() << "the cache size passed after -cachesize is too large: " << arg << " is greater than " << m << munit
305 << ". We will use the maximum value.\n";
306 return {std::to_string(m) + munit, EFlagResult::kParsed};
307 } else {
308 cacheSize = "cachesize=";
309 cacheSize.Append(arg);
310 }
311 return {cacheSize, EFlagResult::kParsed};
312}
313
315{
316 if (strcmp(arg, "SkipListed") == 0)
318 if (strcmp(arg, "OnlyListed") == 0)
320
321 Err() << "invalid argument for -Ltype: '" << arg << "'. Can only be 'SkipListed' or 'OnlyListed' (case matters).\n";
322 return {{}, EFlagResult::kErr};
323}
324
325// Parses a flag that is followed by an argument of type T.
326// If `defaultVal` is provided, the following argument is optional and will be set to `defaultVal` if missing.
327// `conv` is used to convert the argument from string to its type T.
328template <typename T>
329static EFlagResult
330FlagArg(int argc, char **argv, int &argIdxInOut, const char *flagStr, std::optional<T> &flagOut,
331 std::optional<T> defaultVal = std::nullopt, FlagConvResult<T> (*conv)(const char *) = ConvertArg<T>)
332{
333 int argIdx = argIdxInOut;
334 const char *arg = argv[argIdx] + 1;
335 int argLen = strlen(arg);
336 int flagLen = strlen(flagStr);
337 const char *nxtArg = nullptr;
338
339 if (strncmp(arg, flagStr, flagLen) != 0)
341
342 bool argIsSeparate = false;
343 if (argLen > flagLen) {
344 // interpret anything after the flag as the argument.
345 nxtArg = arg + flagLen;
346 // Ignore one '=', if present
347 if (nxtArg[0] == '=')
348 ++nxtArg;
349 } else if (argLen == flagLen) {
350 argIsSeparate = true;
351 if (argIdx + 1 < argc) {
352 ++argIdxInOut;
354 } else {
355 Err() << "expected argument after '-" << flagStr << "' flag.\n";
356 return EFlagResult::kErr;
357 }
358 } else {
360 }
361
362 auto converted = conv(nxtArg);
363 if (converted.fResult == EFlagResult::kParsed) {
364 flagOut = converted.fValue;
365 } else if (converted.fResult == EFlagResult::kIgnored) {
366 if (defaultVal && argIsSeparate) {
368 // If we had tried parsing the next argument, step back one arg idx.
370 } else {
371 Err() << "the argument after '-" << flagStr << "' flag was not of the expected type.\n";
372 return EFlagResult::kErr;
373 }
374 } else {
375 return EFlagResult::kErr;
376 }
377
379}
380
382{
383 // Must be a number between 0 and 509 (with a 0 in the middle)
384 if (compSettings == 0)
385 return true;
386 // We also accept [1-9] as aliases of [101-109], but it's discouraged.
387 if (compSettings >= 1 && compSettings <= 9) {
388 Warn() << "interpreting " << compSettings << " as " << 100 + compSettings
389 << "."
390 " This behavior is deprecated, please use the full compression settings.\n";
391 return true;
392 }
393 return (compSettings >= 100 && compSettings <= 509) && ((compSettings / 10) % 10 == 0);
394}
395
396// The -f flag has a somewhat complicated logic.
397// We have 4 cases:
398// 1. -f
399// 2. -ff
400// 3. -fk
401// 4. -f[0-509]
402//
403// and a combination thereof (e.g. -fk101, -ff202, -ffk, -fk209)
404// -ff and -f[0-509] are incompatible.
405//
406// ALL these flags imply '-f' ("force overwrite"), but only if they parse successfully.
407// This means that if we see a -f[something] and that "something" doesn't parse to a valid
408// number between 0 and 509, or f or k, we consider the flag invalid and skip it without
409// setting any state.
410//
411// Note that we don't allow `-f [0-9]` because that would be a backwards-incompatible
412// change with the previous arg parsing semantic, changing the meaning of a cmdline like:
413//
414// $ hadd -f 200 f.root g.root # <- '200' is the output file, not an argument to -f!
415static EFlagResult FlagF(const char *arg, HAddArgs &args)
416{
417 if (arg[0] != 'f')
419
420 args.fForce = true;
421 const char *cur = arg + 1;
422 while (*cur) {
423 switch (cur[0]) {
424 case 'f':
426 Warn() << "duplicate flag: -ff\n";
427 if (args.fCompressionSettings) {
428 Err()
429 << "cannot specify both -ff and -f[0-9]. Either use the first input compression or specify it.\n";
430 return EFlagResult::kErr;
431 } else
432 args.fUseFirstInputCompression = true;
433 break;
434 case 'k':
435 if (args.fKeepCompressionAsIs)
436 Warn() << "duplicate flag: -fk\n";
437 args.fKeepCompressionAsIs = true;
438 break;
439 default:
440 if (isdigit(cur[0])) {
441 if (args.fUseFirstInputCompression) {
442 Err() << "cannot specify both -ff and -f[0-9]. Either use the first input compression or "
443 "specify it.\n";
444 return EFlagResult::kErr;
445 } else if (!args.fCompressionSettings) {
446 if (auto compLv = StrToUInt(cur)) {
449 // we can't see any other argument after the number, so we return here to avoid
450 // incorrectly parsing the rest of the characters in `arg`.
452 } else {
453 Err() << *compLv << " is not a supported compression setting.\n";
454 return EFlagResult::kErr;
455 }
456 } else {
457 Err() << "failed to parse compression settings '" << cur << "' as an integer.\n";
458 return EFlagResult::kErr;
459 }
460 } else {
461 Err() << "cannot specify -f[0-9] multiple times!\n";
462 return EFlagResult::kErr;
463 }
464 } else {
465 Err() << "invalid flag: " << arg << "\n";
466 return EFlagResult::kErr;
467 }
468 }
469 ++cur;
470 }
471
473}
474
475// Returns nullopt if any of the flags failed to parse.
476// If an unknown flag is encountered, it will print a warning and go on.
477static std::optional<HAddArgs> ParseArgs(int argc, char **argv)
478{
479 HAddArgs args{};
480
481 enum {
487
488 for (int argIdx = 1; argIdx < argc; ++argIdx) {
489 const char *argRaw = argv[argIdx];
490 if (!*argRaw)
491 continue;
492
493 if (!args.fNoFlagsAfterPositionalArguments && argRaw[0] == '-' && argRaw[1] != '\0') {
494 if (argRaw[1] == '-' && argRaw[2] == '\0') {
495 // special case `--`: force parsing to consider all future args as positional arguments.
497 Err()
498 << "found `--`, but we've already parsed (or are still parsing) a sequence of positional arguments!"
499 " This is not supported: you must have exactly one sequence of positional arguments, so if you"
500 " need to use `--` make sure to pass *all* positional arguments after it.";
501 return {};
502 }
503 args.fNoFlagsAfterPositionalArguments = true;
504 continue;
505 }
506
507 // parse flag
509
510 const char *arg = argRaw + 1;
511 bool validFlag = false;
512
513#define PARSE_FLAG(func, ...) \
514 do { \
515 if (!validFlag) { \
516 const auto res = func(__VA_ARGS__); \
517 if (res == EFlagResult::kErr) \
518 return {}; \
519 validFlag = res == EFlagResult::kParsed; \
520 } \
521 } while (0)
522
523 // NOTE: if two flags have the same prefix (e.g. -Ltype and -L) always put the longest one first!
524 PARSE_FLAG(FlagToggle, arg, "T", args.fNoTrees);
525 PARSE_FLAG(FlagToggle, arg, "a", args.fAppend);
526 PARSE_FLAG(FlagToggle, arg, "k", args.fSkipErrors);
527 PARSE_FLAG(FlagToggle, arg, "O", args.fReoptimize);
528 PARSE_FLAG(FlagToggle, arg, "dbg", args.fDebug);
529 PARSE_FLAG(FlagArg, argc, argv, argIdx, "d", args.fWorkingDir);
530 PARSE_FLAG(FlagArg, argc, argv, argIdx, "j", args.fNProcesses, {0});
531 PARSE_FLAG(FlagArg, argc, argv, argIdx, "Ltype", args.fObjectFilterType, {}, ConvertFilterType);
532 PARSE_FLAG(FlagArg, argc, argv, argIdx, "L", args.fObjectFilterFile);
533 PARSE_FLAG(FlagArg, argc, argv, argIdx, "cachesize", args.fCacheSize, {}, ConvertCacheSize);
534 PARSE_FLAG(FlagArg, argc, argv, argIdx, "experimental-io-features", args.fFeatures);
535 PARSE_FLAG(FlagArg, argc, argv, argIdx, "n", args.fMaxOpenedFiles);
536 PARSE_FLAG(FlagArg, argc, argv, argIdx, "v", args.fVerbosity, {99});
537 PARSE_FLAG(FlagF, arg, args);
538
539#undef PARSE_FLAG
540
541 if (!validFlag)
542 Warn() << "unknown flag: " << argRaw << "\n";
543
544 } else if (!args.fOutputArgIdx) {
545 // First positional argument is the output
546 args.fOutputArgIdx = argIdx;
549 } else {
550 // We should be in the same positional argument group as the output, error otherwise
552 if (!args.fFirstInputIdx) {
553 args.fFirstInputIdx = argIdx;
554 }
555 } else {
556 Err() << "seen a positional argument '" << argRaw
557 << "' after some flags."
558 " Positional arguments were already parsed at this point (from '"
559 << argv[args.fOutputArgIdx]
560 << "' onwards), and you can only have one sequence of them, so you cannot pass more."
561 " Please group your positional arguments all together so that hadd works as you expect.\n"
562 "Cmdline: ";
563 for (int i = 0; i < argc; ++i)
564 std::cerr << argv[i] << " ";
565 std::cerr << "\n";
566
567 return {};
568 }
569 }
570 }
571
572 return args;
573}
574
575// Returns the flags to add to the file merger's flags, or -1 in case of errors.
576static Int_t ParseFilterFile(const std::optional<std::string> &filterFileName,
577 std::optional<Int_t> objectFilterType, TFileMerger &fileMerger)
578{
579 if (filterFileName) {
580 std::ifstream filterFile(*filterFileName);
581 if (!filterFile) {
582 Err() << "error opening filter file '" << *filterFileName << "'\n";
583 return -1;
584 }
586 std::string line;
587 std::string objPath;
588 int nObjects = 0;
589 while (std::getline(filterFile, line)) {
590 std::istringstream ss(line);
591 // only read exactly 1 token per line (strips any whitespaces and such)
592 objPath.clear();
593 ss >> objPath;
594 if (!objPath.empty() && objPath[0] != '#') {
595 filteredObjects.Append(objPath + ' ');
596 ++nObjects;
597 }
598 }
599
600 if (nObjects) {
601 Info() << "added " << nObjects << " object(s) from filter file '" << *filterFileName << "'\n";
602 fileMerger.AddObjectNames(filteredObjects);
603 } else {
604 Warn() << "no objects were added from filter file '" << *filterFileName << "'\n";
605 }
606
607 assert(objectFilterType.has_value());
608 const auto filterFlag = *objectFilterType;
610 return filterFlag;
611 }
612 return 0;
613}
614
615int main(int argc, char **argv)
616{
617 if (argc < 3 || "-h" == std::string(argv[1]) || "--help" == std::string(argv[1])) {
619 return (argc == 2 && ("-h" == std::string(argv[1]) || "--help" == std::string(argv[1]))) ? 0 : 1;
620 }
621
622 const auto argsOpt = ParseArgs(argc, argv);
623 if (!argsOpt)
624 return 1;
625 const HAddArgs &args = *argsOpt;
626
628 Int_t maxopenedfiles = args.fMaxOpenedFiles.value_or(0);
629 Int_t verbosity = args.fVerbosity.value_or(99);
630 Int_t newcomp = args.fCompressionSettings.value_or(-1);
631 TString cacheSize = args.fCacheSize.value_or("");
632
633 // For the -j flag (nProcesses), we check if the flag is present and, if so, if it has a
634 // valid value (i.e. any value > 0).
635 // If the flag is present at all, we do multiprocessing. If the value of nProcesses is invalid,
636 // we default to the number of cpus on the machine.
637 Bool_t multiproc = args.fNProcesses.has_value();
638 int nProcesses;
639 if (args.fNProcesses && *args.fNProcesses > 0) {
640 nProcesses = *args.fNProcesses;
641 } else {
642 SysInfo_t s;
643 gSystem->GetSysInfo(&s);
644 nProcesses = s.fCpus;
645 }
646 if (multiproc)
647 Info() << "parallelizing with " << nProcesses << " processes.\n";
648
649 // If the user specified a workingDir, use that. Otherwise, default to the system temp dir.
650 std::string workingDir;
651 if (!args.fWorkingDir) {
653 } else if (args.fWorkingDir && gSystem->AccessPathName(args.fWorkingDir->c_str())) {
654 Err() << "could not access the directory specified: " << *args.fWorkingDir << ".\n";
655 return 1;
656 } else {
657 workingDir = *args.fWorkingDir;
658 }
659
660 // Verify that -L and -Ltype are either both present or both absent.
661 if (args.fObjectFilterFile.has_value() != args.fObjectFilterType.has_value()) {
662 Err() << "-L must always be passed along with -Ltype.\n";
663 return 1;
664 }
665
666 const char *targetname = 0;
667 if (!args.fOutputArgIdx) {
668 Err() << "missing output file.\n";
669 return 1;
670 }
671 if (!args.fFirstInputIdx) {
672 Err() << "missing input file.\n";
673 return 1;
674 }
676
677 if (verbosity > 1)
678 Info() << "target file: " << targetname << "\n";
679
680 if (args.fCacheSize)
681 Info() << "Using " << cacheSize << "\n";
682
683 ////////////////////////////// end flags processing /////////////////////////////////
684
685 gSystem->Load("libTreePlayer");
686
688 fileMerger.SetMsgPrefix("hadd");
689 fileMerger.SetPrintLevel(verbosity - 1);
690 if (maxopenedfiles > 0) {
691 fileMerger.SetMaxOpenedFiles(maxopenedfiles);
692 }
693 // The following section will collect all input filenames into a vector,
694 // including those listed within an indirect file.
695 // If any file cannot be accessed, it will error out, unless args.fSkipErrors is true
696 std::vector<std::string> allSubfiles;
697 for (int a = args.fFirstInputIdx; a < argc; ++a) {
698 if (!args.fNoFlagsAfterPositionalArguments && argv[a] && argv[a][0] == '-') {
699 break;
700 }
701 if (argv[a] && argv[a][0] == '@') {
702 std::ifstream indirect_file(argv[a] + 1);
703 if (!indirect_file.is_open()) {
704 Err() << "could not open indirect file " << (argv[a] + 1) << std::endl;
705 if (!args.fSkipErrors)
706 return 1;
707 } else {
708 std::string line;
709 while (indirect_file) {
710 if (std::getline(indirect_file, line) && line.length()) {
711 if (gSystem->AccessPathName(line.c_str(), kReadPermission) == kTRUE) {
712 Err() << "could not validate the file name \"" << line << "\" within indirect file "
713 << (argv[a] + 1) << std::endl;
714 if (!args.fSkipErrors)
715 return 1;
716 } else if (std::filesystem::exists(targetname) && std::filesystem::equivalent(line, targetname)) {
717 Err() << "file " << line << " cannot be both the target and an input!\n";
718 if (!args.fSkipErrors)
719 return 1;
720 } else {
721 allSubfiles.emplace_back(line);
722 }
723 }
724 }
725 }
726 } else {
727 const std::string line = argv[a];
728 if (gSystem->AccessPathName(line.c_str(), kReadPermission) == kTRUE) {
729 Err() << "could not validate argument \"" << line << "\" as input file " << std::endl;
730 if (!args.fSkipErrors)
731 return 1;
732 } else if (std::filesystem::exists(targetname) && std::filesystem::equivalent(line, targetname)) {
733 Err() << "file " << line << " cannot be both the target and an input!\n";
734 if (!args.fSkipErrors)
735 return 1;
736 } else {
737 allSubfiles.emplace_back(line);
738 }
739 }
740 }
741 if (allSubfiles.empty()) {
742 Err() << "could not find any valid input file " << std::endl;
743 return 1;
744 }
745 // The next snippet determines the output compression if unset
746 if (newcomp == -1) {
748 // grab from the first file.
749 TFile *firstInput = TFile::Open(allSubfiles.front().c_str());
750 if (firstInput && !firstInput->IsZombie())
751 newcomp = firstInput->GetCompressionSettings();
752 else
754 delete firstInput;
755 fileMerger.SetMergeOptions(TString("FirstSrcCompression"));
756 } else {
758 fileMerger.SetMergeOptions(TString("DefaultCompression"));
759 }
760 }
761 if (verbosity > 1) {
762 if (args.fKeepCompressionAsIs && !args.fReoptimize)
763 Info() << "compression setting for meta data: " << newcomp << '\n';
764 else
765 Info() << "compression setting for all output: " << newcomp << '\n';
766 }
767 if (args.fAppend) {
768 if (!fileMerger.OutputFile(targetname, "UPDATE", newcomp)) {
769 Err() << "error opening target file for update: " << targetname << ".\n";
770 return 2;
771 }
772 } else if (!fileMerger.OutputFile(targetname, args.fForce, newcomp)) {
773 Err() << "error opening target file (does " << targetname << " exist?).\n";
774 if (!args.fForce)
775 Info() << "pass \"-f\" argument to force re-creation of output file.\n";
776 return 1;
777 }
778
779 auto step = (allSubfiles.size() + nProcesses - 1) / nProcesses;
780 if (multiproc && step < 3) {
781 // At least 3 files per process
782 step = 3;
783 nProcesses = (allSubfiles.size() + step - 1) / step;
784 Info() << "each process should handle at least 3 files for efficiency."
785 " Setting the number of processes to: "
786 << nProcesses << std::endl;
787 }
788 if (nProcesses == 1)
790
791 std::vector<std::string> partialFiles;
792
793#ifndef R__WIN32
794 // This block is compiled out on Windows only to try to prevent false-positive detections
795 // by several anti-virus engines; multiproc is not
796 // supported on Windows anyway.
797 if (multiproc) {
798 auto uuid = TUUID();
799 auto partialTail = uuid.AsString();
800 for (auto i = 0; (i * step) < allSubfiles.size(); i++) {
801 std::stringstream buffer;
802 buffer << workingDir << "/partial" << i << "_" << partialTail << ".root";
803 partialFiles.emplace_back(buffer.str());
804 }
805 }
806#endif
807
808 auto mergeFiles = [&](TFileMerger &merger) {
809 if (args.fReoptimize) {
810 merger.SetFastMethod(kFALSE);
811 } else {
812 if (!args.fKeepCompressionAsIs && merger.HasCompressionChange()) {
813 // Don't warn if the user has requested any re-optimization.
814 Warn() << "Sources and Target have different compression settings\n"
815 "hadd merging will be slower\n";
816 }
817 }
818 merger.SetNotrees(args.fNoTrees);
819 merger.SetMergeOptions(TString(merger.GetMergeOptions()) + " " + cacheSize);
820 merger.SetIOFeatures(features);
823 if (extraFlags < 0)
824 return false;
826 if (args.fAppend)
828 else
830 Bool_t status = merger.PartialMerge(fileMergerFlags);
831 return status;
832 };
833
834 auto sequentialMerge = [&](TFileMerger &merger, int start, int nFiles) {
835 for (auto i = start; i < (start + nFiles) && i < static_cast<int>(allSubfiles.size()); i++) {
836 if (!merger.AddFile(allSubfiles[i].c_str())) {
837 if (args.fSkipErrors) {
838 Warn() << "skipping file with error: " << allSubfiles[i] << std::endl;
839 } else {
840 Err() << "exiting due to error in " << allSubfiles[i] << std::endl;
841 return kFALSE;
842 }
843 }
844 }
845 return mergeFiles(merger);
846 };
847
848 auto parallelMerge = [&](int start) {
850 mergerP.SetMsgPrefix("hadd");
851 mergerP.SetPrintLevel(verbosity - 1);
852 if (maxopenedfiles > 0) {
853 mergerP.SetMaxOpenedFiles(maxopenedfiles / nProcesses);
854 }
855 if (!mergerP.OutputFile(partialFiles[start / step].c_str(), newcomp)) {
856 Err() << "error opening target partial file\n";
857 exit(1);
858 }
859 return sequentialMerge(mergerP, start, step);
860 };
861
862 auto reductionFunc = [&]() {
863 for (const auto &pf : partialFiles) {
864 fileMerger.AddFile(pf.c_str());
865 }
866 return mergeFiles(fileMerger);
867 };
868
869 Bool_t status;
870
871#ifndef R__WIN32
872 if (multiproc) {
874 auto res = p.Map(parallelMerge, ROOT::TSeqI(0, allSubfiles.size(), step));
875 status = std::accumulate(res.begin(), res.end(), 0U) == partialFiles.size();
876 if (status) {
877 status = reductionFunc();
878 } else {
879 Err() << "failed at the parallel stage\n";
880 }
881 if (!args.fDebug) {
882 for (const auto &pf : partialFiles) {
883 gSystem->Unlink(pf.c_str());
884 }
885 }
886 } else {
887 status = sequentialMerge(fileMerger, 0, allSubfiles.size());
888 }
889#else
890 status = sequentialMerge(fileMerger, 0, allSubfiles.size());
891#endif
892
893 if (status) {
894 if (verbosity == 1) {
895 Info() << "merged " << allSubfiles.size() << " (" << fileMerger.GetMergeList()->GetEntries()
896 << ") input (partial) files into " << targetname << ".\n";
897 }
898 return 0;
899 } else {
900 if (verbosity == 1) {
901 Err() << "failure during the merge of " << allSubfiles.size() << " ("
902 << fileMerger.GetMergeList()->GetEntries() << ") input (partial) files into " << targetname << ".\n";
903 }
904 return 1;
905 }
906}
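The way `main` distributes the input files across processes when `-j` is given can be illustrated with a short sketch. `Partition` and `SplitWork` are hypothetical names introduced for this example; the arithmetic mirrors the `step` computation above (ceiling division, then at least 3 files per process), and it assumes `nProcesses` is greater than zero.

```cpp
#include <cassert>
#include <cstddef>

struct Partition {
   std::size_t fStep;       // number of files handled by each process
   std::size_t fNProcesses; // number of processes actually used
};

// Ceiling division of the files over the processes; if that gives fewer than
// 3 files per process, use 3 files per process and reduce the process count.
Partition SplitWork(std::size_t nFiles, std::size_t nProcesses)
{
   std::size_t step = (nFiles + nProcesses - 1) / nProcesses;
   if (step < 3) {
      step = 3;
      nProcesses = (nFiles + step - 1) / step;
   }
   return {step, nProcesses};
}
```

For instance, 10 files with 8 requested processes would give 2 files per process, so the sketch falls back to 3 files per process and 4 processes.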