hadd.cxx
1/**
2 \file hadd.cxx
3 \brief This program will merge compatible ROOT objects, such as histograms, Trees and RNTuples,
4 from a list of root files and write them to a target root file.
5 In order for a ROOT object to be mergeable, it must implement the Merge() function.
6 Non-mergeable objects will have all instances copied as-is into the target file.
7 The target file must not be identical to one of the source files.
8
9 Syntax:
10 ```{.cpp}
11 hadd [flags] targetfile source1 source2 ... [flags]
12 ```
13
14 Flags can be passed before or after the positional arguments.
15 The first positional (non-flag) argument will be interpreted as the targetfile.
16 After that, the first sequence of positional arguments will be interpreted as the input files.
17 If two sequences of positional arguments are separated by flags, hadd will emit an error and abort.
18
19 By default, any argument starting with `-` is interpreted as a flag. If you want to pass filenames
20 starting with `-` you need to pass them after `--`:
21 ```{.cpp}
22 hadd [flags] -- -file1 -file2 ...
23 ```
24 Note that in this case you need to pass ALL positional arguments after `--`.
25
26 If a flag requires an argument, the argument can be specified in any of these ways:
27
28 # All equally valid:
29 -j 16
30 -j16
31 -j=16
32
33 The first syntax is the preferred one since it's backward-compatible with previous versions of hadd.
34 The -f flag is an exception to this rule: it only supports the `-f[0-9]` syntax.
35
36 Note that merging multiple flags is NOT supported: `-jfa` will be interpreted as -j=fa, which is invalid!
37
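The three accepted spellings can be illustrated with a small standalone sketch. `SplitFlagValue` is a hypothetical helper written only for this example and is not part of hadd. It assumes the rule described above: anything following the flag name is the value (with one optional `=` skipped), otherwise the next argument is used.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper (not hadd's actual parser): extracts the value of `flag`
// (given without the leading '-') from args[idx], accepting "-j 16", "-j16" and "-j=16".
// Returns nullopt if args[idx] is not this flag or if no value is available.
std::optional<std::string> SplitFlagValue(const std::vector<std::string> &args, std::size_t idx,
                                          const std::string &flag)
{
   if (idx >= args.size())
      return std::nullopt;
   const std::string &arg = args[idx];
   // The argument must start with '-' followed by the flag name.
   if (arg.size() < flag.size() + 1 || arg[0] != '-' || arg.compare(1, flag.size(), flag) != 0)
      return std::nullopt;

   std::string rest = arg.substr(1 + flag.size());
   if (rest.empty()) {
      // "-j 16": the value is the next argument, if there is one.
      if (idx + 1 < args.size())
         return args[idx + 1];
      return std::nullopt;
   }
   // "-j=16": skip exactly one '='. "-j16": the remainder is the value as-is.
   if (rest[0] == '=')
      rest.erase(0, 1);
   return rest;
}
```

With this sketch, `-jfa` yields the value `fa` for the flag `j`, which matches the note above that merged flags are not split into separate flags.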
38 The flags are as follows:
39
40 \param -a Append to the output
41 \param -cachesize <SIZE> Resize the prefetching cache used to speed up I/O operations (use 0 to disable).
42 \param -d <DIR> Carry out the partial multiprocess execution in the specified directory
43 \param -dbg Enable verbosity. If -j was specified, do not delete partial files
44 stored inside the working directory.
45 \param -experimental-io-features <FEATURES> Enables the corresponding experimental feature for output trees.
46 \see ROOT::Experimental::EIOFeatures
47 \param -f Force overwriting of output file.
48 \param -f[0-9] Set target compression level. 0 = uncompressed, 9 = highly compressed. Default is 101
49 (kDefaultZLIB). You can also specify the full compression algorithm, e.g. -f505.
50 \param -fk Sets the target file to contain the baskets with the same compression as the input files
51 (unless -O is specified). Compresses the meta data using the compression level specified
52 in the first input or the compression setting after fk (for example 505 when using -fk505)
53 \param -ff The compression level used is the one specified in the first input
54 \param -j [N_JOBS] Parallelise the execution in `N_JOBS` processes. If the number of processes is not specified,
55 or is 0, use the system maximum.
56 \param -k Skip corrupt or non-existent files, do not exit
57 \param -n <N_FILES> Open at most `N_FILES` files at once (use 0 to request the system maximum, which is also
58 the default)
59 \param -O Re-optimize basket size when merging TTree
60 \param -T Do not merge Trees
61 \param -v [LEVEL] Explicitly set the verbosity level: 0 requests no output, 99 is the default
62 \return hadd returns a status code: 0 if OK, 1 otherwise
63
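As a rough illustration of the values accepted by `-f[0-9]` and its longer forms such as `-f505`, the sketch below decodes a setting into an algorithm and a level. It assumes the setting is encoded as `algorithm * 100 + level`, which is consistent with 101 (kDefaultZLIB) and 505 above. `CompressionSetting` and `DecodeCompression` are hypothetical names introduced for this example.

```cpp
#include <optional>

// Hypothetical decoded form of a compression setting (assumed layout: algorithm * 100 + level).
struct CompressionSetting {
   int algorithm; // hundreds digit of the setting
   int level;     // units digit: 0 (uncompressed) to 9 (highly compressed)
};

// Hypothetical helper: returns the decoded setting, or nullopt if the value is not
// 0, a single digit 1-9 (treated as 101-109), or a number in 100-509 with a 0 tens digit.
std::optional<CompressionSetting> DecodeCompression(int setting)
{
   if (setting == 0)
      return CompressionSetting{0, 0};
   if (setting >= 1 && setting <= 9)
      return CompressionSetting{1, setting}; // alias of 101..109
   if (setting < 100 || setting > 509 || (setting / 10) % 10 != 0)
      return std::nullopt;
   return CompressionSetting{setting / 100, setting % 100};
}
```

Under these assumptions, `-f505` selects algorithm 5 at level 5, while a value such as 515 or 610 is rejected.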
64 For example assume 3 files f1, f2, f3 containing histograms hn and Trees Tn
65 - f1 with h1 h2 h3 T1
66 - f2 with h1 h4 T1 T2
67 - f3 with h5
68 the result of
69 ```
70 hadd -f x.root f1.root f2.root f3.root
71 ```
72 will be a file x.root with h1 h2 h3 h4 h5 T1 T2
73 where
74 - h1 will be the sum of the 2 histograms in f1 and f2
75 - T1 will be the merge of the Trees in f1 and f2
76
77 The files may contain sub-directories.
78
79 If the source files contain histograms and Trees, one can skip
80 the Trees with
81 ```
82 hadd -T targetfile source1 source2 ...
83 ```
84
85 Wildcarding and indirect files are also supported
86 ```
87 hadd result.root myfil*.root
88 ```
89 will merge all files in myfil*.root
90 ```
91 hadd result.root file1.root @list.txt file2.root myfil*.root
92 ```
93 will merge file1.root, file2.root, all files in myfil*.root
94 and all files in the indirect text file list.txt ("@" as the first
95 character of the file indicates an indirect file. An indirect file
96 is a text file containing a list of other files, including other
97 indirect files, one line per file).
98
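The indirect-file format described above (one file name per line, `@` as the first character of the argument) can be sketched as follows. `IsIndirect` and `ReadIndirectList` are hypothetical helpers for illustration only. They handle a single level of indirection and perform none of the access checks that hadd applies to each entry.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Returns true if `arg` names an indirect file, i.e. its first character is '@'.
bool IsIndirect(const std::string &arg)
{
   return !arg.empty() && arg[0] == '@';
}

// Hypothetical helper: collects one file name per non-empty line of an indirect file.
// The stream would typically be a std::ifstream opened on the name that follows '@'.
std::vector<std::string> ReadIndirectList(std::istream &in)
{
   std::vector<std::string> files;
   std::string line;
   while (std::getline(in, line)) {
      if (!line.empty())
         files.push_back(line);
   }
   return files;
}
```

Taking a `std::istream` rather than a file name keeps the sketch easy to exercise with a `std::istringstream`.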
99 If the source and target compression levels are identical (default),
100 the program uses the TChain::Merge function with option "fast", i.e.
101 the merge will be done without unzipping or unstreaming the baskets
102 (i.e. direct copy of the raw bytes on disk). The "fast" mode is typically
103 5 times faster than the mode unzipping and unstreaming the baskets.
104
105 If the option -cachesize is used, hadd will resize (or disable if 0) the
106 prefetching cache used to speed up I/O operations.
107
108 For options that take a size as argument, a decimal number of bytes is expected.
109 If the number ends with a `k`, `m`, `g`, etc., the number is multiplied
110 by 1000 (1kB), 1000000 (1MB), 1000000000 (1GB), etc.
111 If this prefix is followed by `i`, the number is multiplied by the traditional
112 1024 (1KiB), 1048576 (1MiB), 1073741824 (1GiB), etc.
113 The prefix can be optionally followed by B whose casing is ignored,
114 e.g. 1k, 1K, 1Kb and 1KB are the same.
115
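The size rules above can be made concrete with a minimal sketch. `ParseSize` is a hypothetical function written for this example; it is not the routine hadd uses. It assumes integer input, supports only the `k`, `m` and `g` prefixes, and does no overflow checking.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical parser for sizes such as "16", "1k", "1KiB" or "32MB".
// Returns the size in bytes, or nullopt if the text is malformed.
std::optional<std::uint64_t> ParseSize(const std::string &text)
{
   std::size_t pos = 0;
   std::uint64_t value = 0;
   while (pos < text.size() && std::isdigit(static_cast<unsigned char>(text[pos]))) {
      value = value * 10 + static_cast<std::uint64_t>(text[pos] - '0');
      ++pos;
   }
   if (pos == 0)
      return std::nullopt; // no leading digits
   if (pos == text.size())
      return value; // plain byte count, no prefix

   // Prefix: k, m or g, casing ignored.
   int exponent = 0;
   switch (std::tolower(static_cast<unsigned char>(text[pos]))) {
   case 'k': exponent = 1; break;
   case 'm': exponent = 2; break;
   case 'g': exponent = 3; break;
   default: return std::nullopt;
   }
   ++pos;

   // An 'i' after the prefix selects powers of 1024 instead of 1000.
   std::uint64_t base = 1000;
   if (pos < text.size() && text[pos] == 'i') {
      base = 1024;
      ++pos;
   }
   // Optional trailing 'B', casing ignored.
   if (pos < text.size() && std::tolower(static_cast<unsigned char>(text[pos])) == 'b')
      ++pos;
   if (pos != text.size())
      return std::nullopt; // unexpected trailing characters

   for (int i = 0; i < exponent; ++i)
      value *= base;
   return value;
}
```

In this sketch `1k`, `1K`, `1Kb` and `1KB` all give 1000 bytes, while `1Ki` and `1KiB` give 1024.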
116 \note By default histograms are added. However hadd does not support the case where
117 histograms have their bit TH1::kIsAverage set.
118
119 \authors Rene Brun, Dirk Geppert, Sven A. Schmidt, Toby Burnett
120*/
121#include "Compression.h"
122#include "TClass.h"
123#include "TFile.h"
124#include "TFileMerger.h"
125#include "THashList.h"
126#include "TKey.h"
127#include "TSystem.h"
128#include "TUUID.h"
129
130#include <ROOT/RConfig.hxx>
131#include <ROOT/StringConv.hxx>
132#include <ROOT/TIOFeatures.hxx>
133
134#include "haddCommandLineOptionsHelp.h"
135
136#include <climits>
137#include <cstdlib>
138#include <filesystem>
139#include <fstream>
140#include <iostream>
141#include <optional>
142#include <sstream>
143#include <string>
144
145#ifndef R__WIN32
147#endif
148
149////////////////////////////////////////////////////////////////////////////////
150
151inline std::ostream &Err()
152{
153 std::cerr << "Error in <hadd>: ";
154 return std::cerr;
155}
156
157inline std::ostream &Warn()
158{
159 std::cerr << "Warning in <hadd>: ";
160 return std::cerr;
161}
162
163inline std::ostream &Info()
164{
165 std::cerr << "Info in <hadd>: ";
166 return std::cerr;
167}
168
169using IntFlag_t = uint32_t;
170
171struct HAddArgs {
174 bool fForce;
177 bool fDebug;
180
181 std::optional<std::string> fWorkingDir;
182 std::optional<IntFlag_t> fNProcesses;
183 std::optional<TString> fCacheSize;
184 std::optional<ROOT::TIOFeatures> fFeatures;
185 std::optional<IntFlag_t> fMaxOpenedFiles;
186 std::optional<IntFlag_t> fVerbosity;
187 std::optional<IntFlag_t> fCompressionSettings;
188
191 // This is set to true if and only if the user passed `--`. In this special
192 // case, we must not stop parsing positional arguments even if we find one
193 // that starts with a `-`.
195};
196
198
199static EFlagResult FlagToggle(const char *arg, const char *flagStr, bool &flagOut)
200{
201 const auto argLen = strlen(arg);
202 const auto flagLen = strlen(flagStr);
203 if (argLen == flagLen && strncmp(arg, flagStr, flagLen) == 0) {
204 if (flagOut)
205 Warn() << "duplicate flag: " << flagStr << "\n";
206 flagOut = true;
208 }
210}
211
212// NOTE: not using std::stoi or similar because they have bad error checking.
213// std::stoi will happily parse "120notvalid" as 120.
214static std::optional<IntFlag_t> StrToUInt(const char *str)
215{
216 if (!str)
217 return {};
218
219 uint32_t res = 0;
220 do {
221 if (!isdigit(*str))
222 return {};
223 if (res > (UINT32_MAX - (*str - '0')) / 10) // overflow is an error
224 return {};
225 res *= 10;
226 res += *str - '0';
227 } while (*++str);
228
229 return res;
230}
231
232template <typename T>
237
238template <typename T>
239static FlagConvResult<T> ConvertArg(const char *);
240
241template <>
243{
244 return {arg, EFlagResult::kParsed};
245}
246
247template <>
249{
250 // Don't even try to parse arg if it doesn't look like a number.
251 if (!isdigit(*arg))
252 return {0, EFlagResult::kIgnored};
253
254 auto intOpt = StrToUInt(arg);
255 if (intOpt)
256 return {*intOpt, EFlagResult::kParsed};
257
258 Err() << "error parsing integer argument '" << arg << "'\n";
259 return {0, EFlagResult::kErr};
260}
261
262template <>
264{
266 std::stringstream ss;
267 ss.str(arg);
268 std::string item;
269 while (std::getline(ss, item, ',')) {
270 if (!features.Set(item))
271 Warn() << "ignoring unknown feature request: " << item << "\n";
272 }
274}
275
277{
278 TString cacheSize;
279 int size;
282 Err() << "could not parse the cache size passed after -cachesize: '" << arg << "'\n";
283 return {"", EFlagResult::kErr};
285 double m;
286 const char *munit = nullptr;
288 Warn() << "the cache size passed after -cachesize is too large: " << arg << " is greater than " << m << munit
289 << ". We will use the maximum value.\n";
290 return {std::to_string(m) + munit, EFlagResult::kParsed};
291 } else {
292 cacheSize = "cachesize=";
293 cacheSize.Append(arg);
294 }
295 return {cacheSize, EFlagResult::kParsed};
296}
297
298// Parses a flag that is followed by an argument of type T.
299// If `defaultVal` is provided, the following argument is optional and will be set to `defaultVal` if missing.
300// `conv` is used to convert the argument from string to its type T.
301template <typename T>
302static EFlagResult
303FlagArg(int argc, char **argv, int &argIdxInOut, const char *flagStr, std::optional<T> &flagOut,
304 std::optional<T> defaultVal = std::nullopt, FlagConvResult<T> (*conv)(const char *) = ConvertArg<T>)
305{
306 int argIdx = argIdxInOut;
307 const char *arg = argv[argIdx] + 1;
308 int argLen = strlen(arg);
309 int flagLen = strlen(flagStr);
310 const char *nxtArg = nullptr;
311
312 if (strncmp(arg, flagStr, flagLen) != 0)
314
315 bool argIsSeparate = false;
316 if (argLen > flagLen) {
317 // interpret anything after the flag as the argument.
318 nxtArg = arg + flagLen;
319 // Ignore one '=', if present
320 if (nxtArg[0] == '=')
321 ++nxtArg;
322 } else if (argLen == flagLen) {
323 argIsSeparate = true;
324 if (argIdx + 1 < argc) {
325 ++argIdxInOut;
327 } else {
328 Err() << "expected argument after '-" << flagStr << "' flag.\n";
329 return EFlagResult::kErr;
330 }
331 } else {
333 }
334
335 auto converted = conv(nxtArg);
336 if (converted.fResult == EFlagResult::kParsed) {
337 flagOut = converted.fValue;
338 } else if (converted.fResult == EFlagResult::kIgnored) {
339 if (defaultVal && argIsSeparate) {
341 // If we had tried parsing the next argument, step back one arg idx.
343 } else {
344 Err() << "the argument after '-" << flagStr << "' flag was not of the expected type.\n";
345 return EFlagResult::kErr;
346 }
347 } else {
348 return EFlagResult::kErr;
349 }
350
352}
353
355{
356 // Must be a number between 0 and 509 (with a 0 in the middle)
357 if (compSettings == 0)
358 return true;
359 // We also accept [1-9] as aliases of [101-109], but it's discouraged.
360 if (compSettings >= 1 && compSettings <= 9) {
361 Warn() << "interpreting " << compSettings << " as " << 100 + compSettings
362 << "."
363 " This behavior is deprecated, please use the full compression settings.\n";
364 return true;
365 }
366 return (compSettings >= 100 && compSettings <= 509) && ((compSettings / 10) % 10 == 0);
367}
368
369// The -f flag has a somewhat complicated logic.
370// We have 4 cases:
371// 1. -f
372// 2. -ff
373// 3. -fk
374// 4. -f[0-509]
375//
376// and a combination thereof (e.g. -fk101, -ff202, -ffk, -fk209)
377// -ff and -f[0-509] are incompatible.
378//
379// ALL these flags imply '-f' ("force overwrite"), but only if they parse successfully.
380// This means that if we see a -f[something] and that "something" doesn't parse to a valid
381// number between 0 and 509, or f or k, we consider the flag invalid and skip it without
382// setting any state.
383//
384// Note that we don't allow `-f [0-9]` because that would be a backwards-incompatible
385// change with the previous arg parsing semantic, changing the meaning of a cmdline like:
386//
387// $ hadd -f 200 f.root g.root # <- '200' is the output file, not an argument to -f!
388static EFlagResult FlagF(const char *arg, HAddArgs &args)
389{
390 if (arg[0] != 'f')
392
393 args.fForce = true;
394 const char *cur = arg + 1;
395 while (*cur) {
396 switch (cur[0]) {
397 case 'f':
399 Warn() << "duplicate flag: -ff\n";
400 if (args.fCompressionSettings) {
401 std::cerr
402 << "[err] Cannot specify both -ff and -f[0-9]. Either use the first input compression or specify it.\n";
403 return EFlagResult::kErr;
404 } else
405 args.fUseFirstInputCompression = true;
406 break;
407 case 'k':
408 if (args.fKeepCompressionAsIs)
409 Warn() << "duplicate flag: -fk\n";
410 args.fKeepCompressionAsIs = true;
411 break;
412 default:
413 if (isdigit(cur[0])) {
414 if (args.fUseFirstInputCompression) {
415 Err() << "cannot specify both -ff and -f[0-9]. Either use the first input compression or "
416 "specify it.\n";
417 return EFlagResult::kErr;
418 } else if (!args.fCompressionSettings) {
419 if (auto compLv = StrToUInt(cur)) {
422 // we can't see any other argument after the number, so we return here to avoid
423 // incorrectly parsing the rest of the characters in `arg`.
425 } else {
426 Err() << *compLv << " is not a supported compression setting.\n";
427 return EFlagResult::kErr;
428 }
429 } else {
430 Err() << "failed to parse compression settings '" << cur << "' as an integer.\n";
431 return EFlagResult::kErr;
432 }
433 } else {
434 Err() << "cannot specify -f[0-9] multiple times!\n";
435 return EFlagResult::kErr;
436 }
437 } else {
438 Err() << "invalid flag: " << arg << "\n";
439 return EFlagResult::kErr;
440 }
441 }
442 ++cur;
443 }
444
446}
447
448// Returns nullopt if any of the flags failed to parse.
449// If an unknown flag is encountered, it will print a warning and go on.
450static std::optional<HAddArgs> ParseArgs(int argc, char **argv)
451{
452 HAddArgs args{};
453
454 enum {
460
461 for (int argIdx = 1; argIdx < argc; ++argIdx) {
462 const char *argRaw = argv[argIdx];
463 if (!*argRaw)
464 continue;
465
466 if (!args.fNoFlagsAfterPositionalArguments && argRaw[0] == '-' && argRaw[1] != '\0') {
467 if (argRaw[1] == '-' && argRaw[2] == '\0') {
468 // special case `--`: force parsing to consider all future args as positional arguments.
470 Err()
471 << "found `--`, but we've already parsed (or are still parsing) a sequence of positional arguments!"
472 " This is not supported: you must have exactly one sequence of positional arguments, so if you"
473 " need to use `--` make sure to pass *all* positional arguments after it.";
474 return {};
475 }
476 args.fNoFlagsAfterPositionalArguments = true;
477 continue;
478 }
479
480 // parse flag
482
483 const char *arg = argRaw + 1;
484 bool validFlag = false;
485
486#define PARSE_FLAG(func, ...) \
487 do { \
488 if (!validFlag) { \
489 const auto res = func(__VA_ARGS__); \
490 if (res == EFlagResult::kErr) \
491 return {}; \
492 validFlag = res == EFlagResult::kParsed; \
493 } \
494 } while (0)
495
496 PARSE_FLAG(FlagToggle, arg, "T", args.fNoTrees);
497 PARSE_FLAG(FlagToggle, arg, "a", args.fAppend);
498 PARSE_FLAG(FlagToggle, arg, "k", args.fSkipErrors);
499 PARSE_FLAG(FlagToggle, arg, "O", args.fReoptimize);
500 PARSE_FLAG(FlagToggle, arg, "dbg", args.fDebug);
501 PARSE_FLAG(FlagArg, argc, argv, argIdx, "d", args.fWorkingDir);
502 PARSE_FLAG(FlagArg, argc, argv, argIdx, "j", args.fNProcesses, {0});
503 PARSE_FLAG(FlagArg, argc, argv, argIdx, "cachesize", args.fCacheSize, {}, ConvertCacheSize);
504 PARSE_FLAG(FlagArg, argc, argv, argIdx, "experimental-io-features", args.fFeatures);
505 PARSE_FLAG(FlagArg, argc, argv, argIdx, "n", args.fMaxOpenedFiles);
506 PARSE_FLAG(FlagArg, argc, argv, argIdx, "v", args.fVerbosity, {99});
507 PARSE_FLAG(FlagF, arg, args);
508
509#undef PARSE_FLAG
510
511 if (!validFlag)
512 Warn() << "unknown flag: " << argRaw << "\n";
513
514 } else if (!args.fOutputArgIdx) {
515 // First positional argument is the output
516 args.fOutputArgIdx = argIdx;
519 } else {
520 // We should be in the same positional argument group as the output, error otherwise
522 if (!args.fFirstInputIdx) {
523 args.fFirstInputIdx = argIdx;
524 }
525 } else {
526 Err() << "seen a positional argument '" << argRaw
527 << "' after some flags."
528 " Positional arguments were already parsed at this point (from '"
529 << argv[args.fOutputArgIdx]
530 << "' onwards), and you can only have one sequence of them, so you cannot pass more."
531 " Please group your positional arguments all together so that hadd works as you expect.\n"
532 "Cmdline: ";
533 for (int i = 0; i < argc; ++i)
534 std::cerr << argv[i] << " ";
535 std::cerr << "\n";
536
537 return {};
538 }
539 }
540 }
541
542 return args;
543}
544
545int main(int argc, char **argv)
546{
547 if (argc < 3 || "-h" == std::string(argv[1]) || "--help" == std::string(argv[1])) {
549 return (argc == 2 && ("-h" == std::string(argv[1]) || "--help" == std::string(argv[1]))) ? 0 : 1;
550 }
551
552 const auto argsOpt = ParseArgs(argc, argv);
553 if (!argsOpt)
554 return 1;
555 const HAddArgs &args = *argsOpt;
556
558 Int_t maxopenedfiles = args.fMaxOpenedFiles.value_or(0);
559 Int_t verbosity = args.fVerbosity.value_or(99);
560 Int_t newcomp = args.fCompressionSettings.value_or(-1);
561 TString cacheSize = args.fCacheSize.value_or("");
562
563 // For the -j flag (nProcesses), we check if the flag is present and, if so, if it has a
564 // valid value (i.e. any value > 0).
565 // If the flag is present at all, we do multiprocessing. If the value of nProcesses is invalid,
566 // we default to the number of cpus on the machine.
567 Bool_t multiproc = args.fNProcesses.has_value();
568 int nProcesses;
569 if (args.fNProcesses && *args.fNProcesses > 0) {
570 nProcesses = *args.fNProcesses;
571 } else {
572 SysInfo_t s;
573 gSystem->GetSysInfo(&s);
574 nProcesses = s.fCpus;
575 }
576 if (multiproc)
577 Info() << "parallelizing with " << nProcesses << " processes.\n";
578
579 // If the user specified a workingDir, use that. Otherwise, default to the system temp dir.
580 std::string workingDir;
581 if (!args.fWorkingDir) {
583 } else if (args.fWorkingDir && gSystem->AccessPathName(args.fWorkingDir->c_str())) {
584 Err() << "could not access the directory specified: " << *args.fWorkingDir << ".\n";
585 return 1;
586 } else {
587 workingDir = *args.fWorkingDir;
588 }
589
590 gSystem->Load("libTreePlayer");
591
592 const char *targetname = 0;
593 if (!args.fOutputArgIdx) {
594 Err() << "missing output file.\n";
595 return 1;
596 }
597 if (!args.fFirstInputIdx) {
598 Err() << "missing input file.\n";
599 return 1;
600 }
602
603 if (verbosity > 1)
604 Info() << "target file: " << targetname << "\n";
605
606 if (args.fCacheSize)
607 Info() << "Using " << cacheSize << "\n";
608
610 fileMerger.SetMsgPrefix("hadd");
611 fileMerger.SetPrintLevel(verbosity - 1);
612 if (maxopenedfiles > 0) {
613 fileMerger.SetMaxOpenedFiles(maxopenedfiles);
614 }
615 // The following section will collect all input filenames into a vector,
616 // including those listed within an indirect file.
617 // If any file can not be accessed, it will error out, unless args.fSkipErrors is true
618 std::vector<std::string> allSubfiles;
619 for (int a = args.fFirstInputIdx; a < argc; ++a) {
620 if (!args.fNoFlagsAfterPositionalArguments && argv[a] && argv[a][0] == '-') {
621 break;
622 }
623 if (argv[a] && argv[a][0] == '@') {
624 std::ifstream indirect_file(argv[a] + 1);
625 if (!indirect_file.is_open()) {
626 Err() << "could not open indirect file " << (argv[a] + 1) << std::endl;
627 if (!args.fSkipErrors)
628 return 1;
629 } else {
630 std::string line;
631 while (indirect_file) {
632 if (std::getline(indirect_file, line) && line.length()) {
633 if (gSystem->AccessPathName(line.c_str(), kReadPermission) == kTRUE) {
634 Err() << "could not validate the file name \"" << line << "\" within indirect file "
635 << (argv[a] + 1) << std::endl;
636 if (!args.fSkipErrors)
637 return 1;
638 } else if (std::filesystem::exists(targetname) && std::filesystem::equivalent(line, targetname)) {
639 Err() << "file " << line << " cannot be both the target and an input!\n";
640 if (!args.fSkipErrors)
641 return 1;
642 } else {
643 allSubfiles.emplace_back(line);
644 }
645 }
646 }
647 }
648 } else {
649 const std::string line = argv[a];
650 if (gSystem->AccessPathName(line.c_str(), kReadPermission) == kTRUE) {
651 Err() << "could not validate argument \"" << line << "\" as input file " << std::endl;
652 if (!args.fSkipErrors)
653 return 1;
654 } else if (std::filesystem::exists(targetname) && std::filesystem::equivalent(line, targetname)) {
655 Err() << "file " << line << " cannot be both the target and an input!\n";
656 if (!args.fSkipErrors)
657 return 1;
658 } else {
659 allSubfiles.emplace_back(line);
660 }
661 }
662 }
663 if (allSubfiles.empty()) {
664 Err() << "could not find any valid input file " << std::endl;
665 return 1;
666 }
667 // The next snippet determines the output compression if unset
668 if (newcomp == -1) {
670 // grab from the first file.
671 TFile *firstInput = TFile::Open(allSubfiles.front().c_str());
672 if (firstInput && !firstInput->IsZombie())
673 newcomp = firstInput->GetCompressionSettings();
674 else
676 delete firstInput;
677 fileMerger.SetMergeOptions(TString("FirstSrcCompression"));
678 } else {
680 fileMerger.SetMergeOptions(TString("DefaultCompression"));
681 }
682 }
683 if (verbosity > 1) {
684 if (args.fKeepCompressionAsIs && !args.fReoptimize)
685 Info() << "compression setting for meta data: " << newcomp << '\n';
686 else
687 Info() << "compression setting for all output: " << newcomp << '\n';
688 }
689 if (args.fAppend) {
690 if (!fileMerger.OutputFile(targetname, "UPDATE", newcomp)) {
691 Err() << "error opening target file for update :" << targetname << ".\n";
692 return 2;
693 }
694 } else if (!fileMerger.OutputFile(targetname, args.fForce, newcomp)) {
695 Err() << "error opening target file (does " << targetname << " exist?).\n";
696 if (!args.fForce)
697 Info() << "pass \"-f\" argument to force re-creation of output file.\n";
698 return 1;
699 }
700
701 auto step = (allSubfiles.size() + nProcesses - 1) / nProcesses;
702 if (multiproc && step < 3) {
703 // At least 3 files per process
704 step = 3;
705 nProcesses = (allSubfiles.size() + step - 1) / step;
706 Info() << "each process should handle at least 3 files for efficiency."
707 " Setting the number of processes to: "
708 << nProcesses << std::endl;
709 }
710 if (nProcesses == 1)
712
713 std::vector<std::string> partialFiles;
714
715#ifndef R__WIN32
716 // this is commented out only to try to prevent false positive detection
717 // from several anti-virus engines on Windows, and multiproc is not
718 // supported on Windows anyway
719 if (multiproc) {
720 auto uuid = TUUID();
721 auto partialTail = uuid.AsString();
722 for (auto i = 0; (i * step) < allSubfiles.size(); i++) {
723 std::stringstream buffer;
724 buffer << workingDir << "/partial" << i << "_" << partialTail << ".root";
725 partialFiles.emplace_back(buffer.str());
726 }
727 }
728#endif
729
730 auto mergeFiles = [&](TFileMerger &merger) {
731 if (args.fReoptimize) {
732 merger.SetFastMethod(kFALSE);
733 } else {
734 if (!args.fKeepCompressionAsIs && merger.HasCompressionChange()) {
735 // Don't warn if the user has requested any re-optimization.
736 Warn() << "Sources and Target have different compression settings\n"
737 "hadd merging will be slower\n";
738 }
739 }
740 merger.SetNotrees(args.fNoTrees);
741 merger.SetMergeOptions(TString(merger.GetMergeOptions()) + " " + cacheSize);
742 merger.SetIOFeatures(features);
743 Bool_t status;
744 if (args.fAppend)
745 status = merger.PartialMerge(TFileMerger::kIncremental | TFileMerger::kAll);
746 else
747 status = merger.Merge();
748 return status;
749 };
750
751 auto sequentialMerge = [&](TFileMerger &merger, int start, int nFiles) {
752 for (auto i = start; i < (start + nFiles) && i < static_cast<int>(allSubfiles.size()); i++) {
753 if (!merger.AddFile(allSubfiles[i].c_str())) {
754 if (args.fSkipErrors) {
755 Warn() << "skipping file with error: " << allSubfiles[i] << std::endl;
756 } else {
757 Err() << "exiting due to error in " << allSubfiles[i] << std::endl;
758 return kFALSE;
759 }
760 }
761 }
762 return mergeFiles(merger);
763 };
764
765 auto parallelMerge = [&](int start) {
767 mergerP.SetMsgPrefix("hadd");
768 mergerP.SetPrintLevel(verbosity - 1);
769 if (maxopenedfiles > 0) {
770 mergerP.SetMaxOpenedFiles(maxopenedfiles / nProcesses);
771 }
772 if (!mergerP.OutputFile(partialFiles[start / step].c_str(), newcomp)) {
773 Err() << "error opening target partial file\n";
774 exit(1);
775 }
776 return sequentialMerge(mergerP, start, step);
777 };
778
779 auto reductionFunc = [&]() {
780 for (const auto &pf : partialFiles) {
781 fileMerger.AddFile(pf.c_str());
782 }
783 return mergeFiles(fileMerger);
784 };
785
786 Bool_t status;
787
788#ifndef R__WIN32
789 if (multiproc) {
791 auto res = p.Map(parallelMerge, ROOT::TSeqI(0, allSubfiles.size(), step));
792 status = std::accumulate(res.begin(), res.end(), 0U) == partialFiles.size();
793 if (status) {
794 status = reductionFunc();
795 } else {
796 Err() << "failed at the parallel stage\n";
797 }
798 if (!args.fDebug) {
799 for (const auto &pf : partialFiles) {
800 gSystem->Unlink(pf.c_str());
801 }
802 }
803 } else {
804 status = sequentialMerge(fileMerger, 0, allSubfiles.size());
805 }
806#else
807 status = sequentialMerge(fileMerger, 0, allSubfiles.size());
808#endif
809
810 if (status) {
811 if (verbosity == 1) {
812 Info() << "merged " << allSubfiles.size() << " (" << fileMerger.GetMergeList()->GetEntries()
813 << ") input (partial) files into " << targetname << ".\n";
814 }
815 return 0;
816 } else {
817 if (verbosity == 1) {
818 Err() << "failure during the merge of " << allSubfiles.size() << " ("
819 << fileMerger.GetMergeList()->GetEntries() << ") input (partial) files into " << targetname << ".\n";
820 }
821 return 1;
822 }
823}
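The way `main` splits the input files among processes when `-j` is used can be isolated in a short sketch. `Partition` and `PlanPartition` are hypothetical names introduced here; the arithmetic follows the `step` computation in the listing (ceiling division, then a minimum of 3 files per process), and `nProcesses` is assumed to be greater than 0.

```cpp
#include <cstddef>

// Hypothetical result type: how many files each process handles, and how many processes run.
struct Partition {
   std::size_t step;       // number of input files per process
   std::size_t nProcesses; // number of processes actually used
};

// Sketch of the partitioning used for multiprocess merging.
// Precondition: nProcesses > 0.
Partition PlanPartition(std::size_t nFiles, std::size_t nProcesses)
{
   // Ceiling division: spread the files as evenly as possible.
   std::size_t step = (nFiles + nProcesses - 1) / nProcesses;
   if (step < 3) {
      // Each process should handle at least 3 files, so reduce the process count.
      step = 3;
      nProcesses = (nFiles + step - 1) / step;
   }
   return {step, nProcesses};
}
```

For example, 10 files with 8 requested processes would be handled by 4 processes of up to 3 files each, while 100 files with 4 processes keep 4 processes of 25 files each.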