Re: hadd: "too many open files"

From: Noel Dawe <Noel.Dawe_at_cern.ch>
Date: Fri, 12 Aug 2011 09:16:51 +0200


Thanks!

On Thu, Aug 11, 2011 at 9:40 PM, Philippe Canal <pcanal_at_fnal.gov> wrote:

> Hi Noel,
>
> TFileMerger and hadd now keep at most 'ulimit -n' files open at the same
> time (minus some wiggle room for files opened by the system or CINT),
> and this maximum can be customized at run-time (hadd -n max,
> TFileMerger::SetMaxOpenedFiles). This is available in revision 40569 and up.
>
>
> Cheers,
> Philippe.
>
> On 8/8/11 2:11 AM, Noel Dawe wrote:
>
> Hi Philippe,
>
> I see. Why not perform the merge in batches of at most "ulimit -n" files
> then? Or add an option -n allowing the user to specify the maximum number
> of files to consider at once. Although this would cost some performance
> when more than "ulimit -n" files are being merged, at least hadd would
> not hit the system limits and fail. I think a slight performance hit is
> definitely worth it if hadd actually runs to completion; in fact, I think
> any necessary performance hit is worth it. Otherwise users typically write
> some kind of wrapper script which calls hadd on a subset of the files until
> all files are merged (essentially doing what is suggested above).
>
> Noel
>
> On Mon, Aug 8, 2011 at 1:51 AM, Philippe Canal <pcanal_at_fnal.gov> wrote:
>
>> Hi Noel,
>>
>> The current scheme comes from two observations. The first is that opening
>> a file is comparatively slow, especially if the file is not local.
>> The second is that it is more efficient, time-wise, to take one object,
>> merge into it the equivalent objects from all the remaining files, and
>> only then move on to the next object/directory. This is particularly
>> helpful with deep directory hierarchies, as it reduces the number of
>> traversals that are needed.
>>
>> Cheers,
>> Philippe.
>>
>>
>> On 8/6/11 5:19 AM, Noel Dawe wrote:
>>
>>> I don't know why hadd needs to open all the files at the same time but
>>> probably a better way to write this tool would be to never open more than
>>> two files at once: copy the first file to the destination and keep it open,
>>> then pop off the next file, open it, merge it into the first, close it, then
>>> pop off the next file and open it, etc...
>>>
>>> Noel
>>>
>>
>
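
The wrapper-script approach mentioned in the thread (merge in batches so that no
single step exceeds the open-file limit) can be sketched roughly as follows. This
is only an illustration, not how hadd or TFileMerger implement it internally.
The function names (max_open_files, plan_merge, run_plan), the reserve of 16
descriptors, the fallback of 1024, and the ".partN_M" intermediate file names
are all assumptions made for this example. The only hadd usage assumed is the
basic "hadd target source1 source2 ..." form.

```python
import subprocess


def max_open_files(reserve=16, fallback=1024):
    """Soft 'ulimit -n' minus some wiggle room (Unix only).

    'reserve' and 'fallback' are illustrative values, not ROOT's.
    """
    import resource  # imported here because the module exists only on Unix

    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        soft = fallback
    return max(3, soft - reserve)


def plan_merge(inputs, target, max_open):
    """Return a list of (output, [inputs]) merge steps.

    Each step keeps at most 'max_open' files open: one output plus up to
    max_open - 1 inputs. Intermediate outputs feed later passes until a
    single final step writes 'target'.
    """
    if not inputs:
        raise ValueError("no input files given")
    if max_open < 3:
        raise ValueError("need at least 3 open files (2 inputs + 1 output)")

    per_step = max_open - 1
    steps = []
    pending = list(inputs)
    level = 0
    while len(pending) > per_step:
        merged = []
        for start in range(0, len(pending), per_step):
            chunk = pending[start:start + per_step]
            if len(chunk) == 1:
                # A lone leftover file needs no merge; carry it forward.
                merged.append(chunk[0])
                continue
            # Hypothetical naming scheme for intermediate files.
            tmp = f"{target}.part{level}_{start // per_step}"
            steps.append((tmp, chunk))
            merged.append(tmp)
        pending = merged
        level += 1
    steps.append((target, pending))
    return steps


def run_plan(steps):
    """Execute each step with hadd (requires hadd on PATH)."""
    for output, sources in steps:
        subprocess.run(["hadd", output, *sources], check=True)


if __name__ == "__main__":
    files = [f"f{i}.root" for i in range(10)]
    for output, sources in plan_merge(files, "out.root", max_open=4):
        print(output, "<-", " ".join(sources))
```

With revision 40569 and up this kind of wrapper should no longer be needed,
since hadd itself respects the limit and accepts "-n max"; the sketch is mainly
useful for older versions or for seeing the cost of the extra passes.
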
Received on Fri Aug 12 2011 - 09:17:21 CEST