RE: Trouble running TMultiLayerPerceptron with ROOT 5.08/00

From: Nick West <n.west1_at_physics.ox.ac.uk>
Date: Tue, 14 Mar 2006 16:03:43 -0000


Hi Christophe,

Thanks for your quick and detailed reply.

> I looked at your example, and was not able to reproduce the problem
> systematically with the CVS HEAD version of ROOT. (I got it only a few
> times, probably depending on the random seed; changing the number of
> epochs changed the behavior.)

That's strange; I checked again with last night's ROOT, and nothing has changed:-

  Test set distribution.  Mean : 0.825711 RMS 0.030122
  Test set distribution.  Mean : 0.743775 RMS 0.0454574
  Test set distribution.  Mean : 0.600384 RMS 0.0348653
  Test set distribution.  Mean : 0.580275 RMS 0.0476791
  Test set distribution.  Mean : 0.872623 RMS 0.0363429
  Test set distribution.  Mean : 0.626193 RMS 0.039023
  Test set distribution.  Mean : 0.796209 RMS 0.0609749

still 100% of runs collapse to a narrow RMS, i.e. a total failure. Also, just a few days ago Andrea reported:-

  "I've tried your test case and can reproduce the problem. In fact,   looking at the "mlp->DrawResult(0,"test")" plots show that often the   networks degenerate and give the same result almost independently   from the inputs (roughly the everage of "type")."

which is exactly what we see.

> Looking at your test macro, I also noticed the following:
> - you have twice as many type==1 as type==0 events. Using a weight in
> that case can help produce more symmetric distributions.

I would be the first to admit that I don't know much about neural nets or their sensitivity to training sets, but I am slightly surprised that a 2:1 ratio could have a serious impact on finding a minimum. I have tried the constructor you use:-

  TMultiLayerPerceptron("@trkPHperPlane, @eventPlanes, @shwPHperStrip:5:type!",

                        "1+(type==0)", 
                         inTree, 
                         "Entry$%5",
                         "!(Entry$%5)");

which takes a weight expression ("1+(type==0)"), but it appears to make no difference to my script:-

  Test set distribution.  Mean : 0.913281 RMS 0.0342676
  Test set distribution.  Mean : 0.350724 RMS 0.0899779
  Test set distribution.  Mean : 0.614752 RMS 0.0535525
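
As an aside, if I read the class correctly the same weight can also be set after construction; a one-line sketch:-

   // Assumed equivalent: apply the weight expression after construction.
   mlp->SetEventWeight("1+(type==0)");   // weight 2 for type==0, 1 for type==1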

> - you might also benefit from the developments by Andrea, by putting a
> "!" in the last layer.

I tried that too, and again:-

  Test set distribution.  Mean : 0.87205  RMS 0.00425949
  Test set distribution.  Mean : 0.675302 RMS 0.0372925
  Test set distribution.  Mean : 0.614752 RMS 0.0535525

> I attached a modified version of your test, together with the
> resulting distribution.

In your script you have switched back to the default learning method (by commenting out this line):-

   // mlp->SetLearningMethod(TMultiLayerPerceptron::kStochastic);

This does help a lot:-

  Test set distribution.  Mean : 0.575912 RMS 0.242467
  Test set distribution.  Mean : 0.575818 RMS 0.243014
  Test set distribution.  Mean : 0.554317 RMS 0.179214
  Test set distribution.  Mean : 0.537773 RMS 0.0772103
  Test set distribution.  Mean : 0.570603 RMS 0.241381
  Test set distribution.  Mean : 0.576507 RMS 0.240721
  Test set distribution.  Mean : 0.576922 RMS 0.243028
  Test set distribution.  Mean : 0.575999 RMS 0.243222
  Test set distribution.  Mean : 0.576616 RMS 0.242029
  Test set distribution.  Mean : 0.535745 RMS 0.0795188
  Test set distribution.  Mean : 0.576142 RMS 0.243109
  Test set distribution.  Mean : 0.535266 RMS 0.0879367
  Test set distribution.  Mean : 0.536488 RMS 0.0834719
  Test set distribution.  Mean : 0.535279 RMS 0.079165
  Test set distribution.  Mean : 0.580323 RMS 0.14462

and your example plot, with Mean = 0.5657 and RMS = 0.2438, is typical of a success. However, there are still failures: 5 of the 15 runs (1/3) in the above case.
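
With that line commented out, I presume the fit falls back on the class default (kBFGS, if I read TMultiLayerPerceptron.h correctly); selecting it explicitly would be:-

   // Presumed default method: select BFGS explicitly rather than
   // relying on the commented-out kStochastic line.
   mlp->SetLearningMethod(TMultiLayerPerceptron::kBFGS);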

We understand that we should not be selecting the Stochastic method, but the history behind this is that our fitting appeared fine under 4.04/02, and it is our strong impression that since switching to the current version it is far less reliable. Of course, it makes it very hard for you to help us if all we can do is hand you our fitting program and say vaguely that "it is not as good as it was". So we set up a simplified model and then tried to find a way of making the difference between ROOT 4 and ROOT 5 as dramatic as possible, hoping that this might make it possible for you to diagnose the problem.

The other change you make:-

   from: mlp->Train(50,"text,update=10");
   to:   mlp->Train(200,"text,graph,update=10");

does not help us; once it has found a bad minimum it stays there forever.
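
For what it is worth, one programmatic defence against this is to retrain from fresh random weights until the test-set response actually spreads out. A rough sketch, not our production code: the RMS threshold of 0.15 and the limit of 5 attempts are arbitrary choices, and it assumes the same Entry$%5 split as above:-

   // Retrain from scratch until the test-set output has a broad RMS,
   // i.e. the network has not collapsed onto a single value.
   for (Int_t attempt = 0; attempt < 5; ++attempt) {
      mlp->Randomize();                    // fresh random starting weights
      mlp->Train(200, "text,update=50");
      Double_t sum = 0, sum2 = 0;
      Int_t n = 0;
      for (Long64_t i = 0; i < inTree->GetEntries(); ++i) {
         if (i % 5) continue;              // test set: !(Entry$%5)
         Double_t r = mlp->Result(i);      // network output for event i
         sum += r; sum2 += r*r; ++n;
      }
      Double_t mean = sum/n;
      Double_t rms  = TMath::Sqrt(sum2/n - mean*mean);
      printf("Attempt %d: mean %g, RMS %g\n", attempt, mean, rms);
      if (rms > 0.15) break;               // looks like a genuine minimum
   }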

Cheers,

Nick.
