Re: RE: remotely root file view

From: Rene Brun <Rene.Brun_at_cern.ch>
Date: Fri, 05 Oct 2007 21:58:12 +0200


Hi Christian,

Christian Holm Christensen wrote:
> Hi Rene,
>
> On Fri, 2007-10-05 at 15:19 +0200, Rene Brun wrote:
>
>> Christian,
>>
>> Of course we know about Ajax and Carrot. Both systems have very serious
>> shortcomings
>> -either having to run ROOT (or user application) on the web server
>>
>
> Why is that a problem? If you're concerned about security, you can
> simply have your web-server run in a chroot jail - and voila, your data
> is safe. Or use https, or ... there are tons of ways to run secure
> web-applications.
>

Just name one single site that would agree to run a ROOT application under a web server.
I am not talking about private web servers under user control, but about public services.
>
>> -or/and have a very slow rendering because only very low level
>> graphics objects (jpg, gif, svg, etc) files are transported across the net
>>
>
> SVG or similar is not low level. Compare shipping an SVG image of a
> histogram of 100 bins of 1,000,000 events each with 100 particles of a
> complex calculation (say m^2=(e^2-p^2)), to shipping the full four
> vectors of the same 100,000,000 particles. I think there's far less data
> in the SVG image than in the "raw" data. Of course, the remote server
> should ship enough data for the client to quickly do the most common
> operations (excluding things like fitting, but including things like
> rotating a 3D object).
>

I was proposing to transfer the histogram object (TH1) instead of its graphics representation.
Taking your example, there is a factor of 27 between the TH1 and its SVG view!
>
>> -or/and use only a very small subset of the client capabilities.
>>
>
> That's why I suggested using something like Ajaxterm. With that, you
> have the full power of a command line UI. Of course, what this would
> mean, is that ROOT made a graphics device for web-rendering - that is,
> an Ajax (for example) based implementation of TVirtualX (or something
> like that). Now, running ROOT through the web-browser, is almost like
> doing an SSH to a remote machine, but instead of doing X rendering, you
> do web-browser rendering.
>

This is the good old X-terminal model. Do you really believe that this model is still valid
in the era of multi-core laptops? I think we can do just a bit better :-)
> Apple used to have this idea, that the display was simply yet another
> Postscript device - that is, there was little difference between the
> printer and what you saw on the screen (wonder why Mac was preferred by
> graphics professionals :-) - in this day and age of web-based
> applications, perhaps there should be no difference between a local
> monitor and a web-page from the point of view of rendering?
>

As I said, this was a good model (but that was 20 years ago).
>
>
>> The point is that processing ROOT objects (visualization of histograms,
>> Trees, etc., or queries to Trees) requires a substantial fraction of the
>> ROOT core classes on the client side.
>>
>
> Why? What does the client need to know, other than - here's a prompt,
> type in what you want to do! It's up to the server side to know what
> to do (like any other CGI based interface).
>

Because I want to exploit the power of my local client station as much as possible:
visualization, GL rendering, the UI, and all other operations that make sense on a laptop.
Forget the X-term model. A central server executing user queries is not going to scale beyond toy applications.
>
>> And the problem is that installing the ROOT libraries is not just a few
>> seconds exercise.
>>
>
> That depends on whether you're running Debian or anything else. With
> Debian, you simple do
>
> apt-get install root-system
>
> and you're ready to go :-) Upgrades? Just do "apt-get upgrade".
>

Yes and no. Your model is valid if the Debian package contains what I want, and this will be
less and less the case. Maybe the corresponding binary does not exist (or, most probably, is
already obsolete). One should be able to access only the strict minimum necessary for a
particular job. If another task requires different libraries from the package, you download
them (and of course cache them) when you need the functionality. We believe that it is much
better to package a small core that fulfills the needs of, say, 70% of users. People needing
extra exotic packages will download them on demand. If the binary is not available, the system
must be able to compile on the fly (now I am describing BOOT :-))

>
>> Because we are very well aware of these requirements, we launched
>> several months ago the BOOT project (see my talk at CHEP07 about BOOT).
>> The idea is that from any web browser, you can easily install the
>> minimum set of libraries to run the tasks described above (say less than
>> one minute install time).
>>
>
> Correct me if I'm wrong, but you install binary (architecture-dependent)
> software on the client. Suppose you don't have the privileges to do
> so? Suppose your architecture is not supported?
>

That is exactly my point. You should be able to compile with your local options (-g, -Ox,
-msse, etc.) even if a binary package is available for your system.
>
>> Using the new TBrowser you can browse any web file, including ROOT
>> files, or alternatively via the Remote login facility start a remote
>> session where you execute your normal scripts.
>>
>
> That is, SSH, right? SSH requires X display, access rights, and so on.
> Perhaps you want to set up an environment for students that should be
> able to play with data from their hostile home machines - you don't want
> to give them accounts or things like that. Nor can you be sure they
> have X or the like - perhaps they are running things from the library
> where they have limited privileges. The point is, that the client
> should be thought of as a hostile, limited, dumb machine with a simple
> web-client.
>

NO!!!! You can start the remote application via ssh, but you do not send graphics objects
back; you send objects like histograms or canvases (collections of original objects).
Forget the X11 model in this context (far too slow on high-latency WANs).
>> In both modes graphics
>> is native graphics and only high level objects transported through the
>> network with the existing ROOT I/O machinery. This is the solution
>> adopted by other major tools like the videoplayers, Adobe, etc.
>>
>
> Erhm, no. What clever people do, is that they run rendering on the
> client and number crunching on the server (since the data is there).
> That means, that you ship only what needs to be rendered to screen to
> the client, and everything else is done near the data.
>

That is correct. Of course, in our model the server could be a PROOF master on a remote cluster.
> What Adobe does (and I'm assuming you're talking about PDFs over the web)
> is make sure the graphical objects are device independent (DVI
> anyone?) - usually by doing vector graphics rather than bitmaps. Then
> it's up to the client to render it the best way possible on the client
> side. Video players are usually dumber than that. They get a stream
> of frames that they need to render the best way possible.
>
> Another way to do this, is to partially implement the
> rendering/processing on client and server. That's what Java web
> applications do. You run a small program on your client machine to do
> rendering and other such client stuff, while the server takes care of
> the bulk of the processing. X itself is another example of this. In a
> sense, one should turn the web-browser into an X display (I have no
> idea whether anyone has done this) - which of course should be done
> in a platform/web-browser independent way i.e., using Java.
>

Yes, that is the conventional Java model over the web and also the reason why most of these applications are very slow.
>
>> We investigated the possibility to have a plug-in to the local client
>> browser, but we were totally disappointed by the instability in this
>> area and the frequent changes in protocols. We opted instead for
>> a separate window running native ROOT/BOOT.
>>
>
> So, if you have a window in which you render stuff, you have to use
> either the native API, or some Web-based API, like Java or, God
> forbid, .NET. Essentially, it's the web-browser as an X display
> (though perhaps not as general).
>
> However, the processing should take place near the data. So if the data
> is on the server, the processing should be done there. If the data is
> at the client, then the processing should happen there (in which case,
> you probably don't need the server in the first place :-).
>

Yes, of course. I am not proposing to do all the computations on the client side. If you take PROOF as a model, all the I/O and CPU intensive applications run on the PROOF servers. When a PROOF query terminates, the results of all workers are automatically merged. Some results are kept for further processing on the PROOF cluster (could be Trees). Other objects like histograms can easily go back to the client where they can be viewed in many different formats and pictures generated locally.
> In any case, the point might be, that installing ROOT on the client is
> not an option. That's the central thing to keep in mind, and that's
> what I think most people in this thread were concerned about.
>
>
>> One of the requirements for BOOT was to achieve a very small size for
>> the executable.
>>
>
> With most modern day computers, the size of an executable does not
> matter (unless it's Gigabytes big). The point is, of course, that it's
> the data that will take up most of the memory rather than the executable
> code (if that's not the case, then you're shooting sparrows with
> cannons).
>

We are going around in circles. My model is distributed, not the old mainframe/X-term model.
The size of the executable module REALLY matters. The startup time of an application is proportional to its virtual memory size. I know that computer scientists do not care too much, but I can tell you that this makes the difference in the end.
>
>> With version 5.17 we are now around 15 MBytes of virtual memory and 7
>> MBytes of real memory.
>>
>
> Ehrm, I only care about the "virtual memory" size. The "real" memory
> size is dependent on the OS, RAM, environment, and so on. The virtual
> size tells you how much memory the process could possibly take up.
>
>
>> The development of BOOT will still take a few more months. Once BOOT
>> is available, it will include an automatic update facility like the
>> ones found in most operating systems today, so that moving to a new
>> version (or going back to an old version) should be extremely easy.
>>
>
> Argh! Don't try to replace software updates/installation and package
> mangement with custom tools - it will _never_ meet the same standards as
> the OS tools. I understand the temptation to make a custom format for
> this, since it gives a (false) sense of power and control (for getting
> the job done), but most of the time it results in bloat, unstable
> systems, and does far worse than any OS tool would do - witness the
> various Grid "solutions" and their failures. The problem is, that
> implementers of custom distributions often fail to recognise the
> complexity of a full operating system and the application environment.
> So please - leave distribution to the distributions and focus on what
> you do best - great analysis tools.
>

It is not our intention to reinvent the wheel. On the contrary, we intend to use the good work
and tools already available in the OSs and open-source systems.
> I say this with the utmost respect for your work - and I sincerely hope
> that you will think twice about this. Distributing software is not an
> easy task, and I think your time is better spent on other tasks. If you
> really want to be distributors, perhaps the first step would be to make
> RPMs (make redhat; rpmbuild -ta root.spec root-vXX.YY.ZZ-source.tar.gz)
> and distribute those - that would cover ~80% of the users (and
> up/downgrades _are_ possible).
>

See my previous remarks above. The idea is to move from the old era of distributing software
to something more dynamic and scalable, where you access only the software that you need
(from source or binaries).

Rene

Received on Fri Oct 05 2007 - 21:58:35 CEST

This archive was generated by hypermail 2.2.0 : Fri Oct 05 2007 - 23:50:01 CEST