Boost Python and PyROOT, SetBranchAddress

Hi everybody,

I am having some trouble with the interoperability between a library wrapped with Boost Python and PyROOT. The library has several classes which inherit from TObject and which are wrapped using Boost Python. Objects of these classes should be read/writable from/into TTrees in python. This poses problems, as trying to use TTree::Branch or TTree::SetBranchAddress fails with:

TBranch* TTree::Branch(const char* name, void** obj, Int_t bufsize = 32000, Int_t splitlevel = 99) => could not convert argument 2

This is quite expected, as PyROOT does not know that the (boost) python object it gets is inheriting from TObject “behind the scenes”. Looking for a workaround, I tried adding a converter to my boost python class like so:

PyObject* fitResult_getRootFitResult(rpwa::fitResult& self)
{
	return TPython::ObjectProxy_FromVoidPtr(&self, "fitResult", false);
}

Then I tried

tree.SetBranchAddress("fitResult", fitResult.getRootFitResult())

which changes the error to “could not convert argument 1 (attempt to bind ROOT object w/o class)”. The error message when trying to create a new branch stays the same as above, even when using the converted object.

Is there some TPython magic I can invoke that will make this work transparently? While I cannot change the code of the wrapped objects, I can add any conversion code that might be required while wrapping. Also, I have a traditional python module layer around the shared library created by Boost Python, so I can also add python code which is executed when the module is imported. Ideally, I would like everything to work out of the box, i.e. without the user having to use some special workarounds on her/his side.

Thanks in advance for your help!

Regards,
e_dude

Hi,

the code is saying that “fitResult” is not a known class. To first order, it seems that the actual name should have been “rpwa::fitResult” rather than “fitResult”.

Second, make sure that there is a dictionary for “rpwa::fitResult”. This is both needed for PyROOT as well as for use of that class with a TTree.

(As an alternative, you can pass a PyCObject from boost.python through a void* bound with PyROOT. But even then you’d still need a dictionary for use with the TTree.)

Cheers,
Wim

Hi,

thanks for your answer!

“fitResult” is the branch name, not the class name, so the namespace qualifier should not be there. Or did I misunderstand something?

Also, the dictionary is produced; these classes are used regularly with C++ code and trees.

How would the PyCObject solution look in code? Is that not what I attempted with my helper function “getRootFitResult”?

Regards,
e_dude

[quote=“explorer_dude”]“fitResult” is the branch name, not the class name, so the namespace qualifier should not be there. Or did I misunderstand something?[/quote]

In the second call it is, but I’m talking about this code snippet:

PyObject* fitResult_getRootFitResult(rpwa::fitResult& self)
{
	return TPython::ObjectProxy_FromVoidPtr(&self, "fitResult", false);
}

where it is the class name. The namespace is used in the type of the argument, but not when asking to bind the object to that class.

[quote=“explorer_dude”]How would the PyCObject solution look in code? Is that not what I attempted with my helper function “getRootFitResult”?[/quote]

A PyCObject is a builtin Python type (http://docs.python.org/2/c-api/cobject.html); what the helper function returns is a PyROOT bound instance. The CINT dictionary code of SetBranchAddress() takes a void* that is actually expected to be a void**. Thus, if you package the “&self” into a PyCObject, it should pass properly.

Cheers,
Wim

Hi again,

I am sorry to keep bothering you with this, but I cannot seem to get it to work as it should. First, you were completely right with the missing namespace, so I changed the function to:

	PyObject* fitResult_getAsRootObject(rpwa::fitResult& self)
	{
		return TPython::ObjectProxy_FromVoidPtr(self, self.ClassName(), false);
	}

However, when using this, I get a segmentation violation. Here’s my code in python:

	print("start of the mess")
	fitResultFile = pyRootPwa.ROOT.TFile.Open(args.fitResult, "READ")
	fitResultTree = fitResultFile.Get(config.fitResultTreeName)
	print("constructing fit result")
	currentFitResult = pyRootPwa.core.fitResult()
	print("constructed fit result")
	fitResultTree.SetBranchAddress(config.fitResultBranchName, currentFitResult.getAsRootObject())
	print("set branch address")
	print("getting entry returns '" + str(fitResultTree.GetEntry(0)) + "'")
	print("got 0. entry")
	print("accessing fit result (got '" + str(currentFitResult.nmbWaves()) + "' should be '5')")
	print(currentFitResult.nmbEvents())
	print("done")
	massRange = generatorManager.getGenerator().getTPrimeAndMassPicker().massRange()
	print("did other boost::python stuff")
	print("getting 0. entry again")
	fitResultTree.GetEntry(0)
	print(currentFitResult.massBinCenter())
	print("done")

And the resulting output:

start of the mess
constructing fit result
+++ rpwa::fitResult::fitResult(): debug: fit result constructed! 0x3380800
constructed fit result
+++ rpwa::fitResult::fitResult(): debug: fit result copied!
+++ rpwa::fitResult::fitResult(): debug: fit result constructed! 0x34e2170
+++ rpwa::fitResult::~fitResult(): debug: fit result destroyed! 0x34e2170
set branch address
+++ rpwa::fitResult::fitResult(): debug: fit result constructed! 0x35880d0
getting entry returns '2089'
got 0. entry
accessing fit result (got '0' should be '5')
0
done
did other boost::python stuff
getting 0. entry again

 *** Break *** segmentation violation

I attached the rather extensive backtrace. It is important to note that the first GetEntry(0) already fails: the fitResult does not return the correct values afterwards. I have found some kind of workaround for this by changing the wrapper function to

	PyObject* fitResult_getAsRootObject(rpwa::fitResult& self)
	{
		rpwa::fitResult* newCxxObj = new rpwa::fitResult(self);
		return TPython::ObjectProxy_FromVoidPtr(newCxxObj, newCxxObj->ClassName(), false);
	}

Obviously, this has the problem that I am not getting the same fitResult back which I put in, and fitResult.getAsRootObject() and fitResult will be different objects. At least if I change

	currentFitResult = pyRootPwa.core.fitResult()
	print("constructed fit result")
	fitResultTree.SetBranchAddress(config.fitResultBranchName, currentFitResult.getAsRootObject())

to

	currentFitResult = pyRootPwa.core.fitResult().getAsRootObject()
	print("constructed fit result")
	fitResultTree.SetBranchAddress(config.fitResultBranchName, currentFitResult)

it seems to work in this context. However, as the so constructed currentFitResult is different from the (carefully) boost python wrapped one, this is not really a nice solution.

Do you have any idea why the conversion without copying does not work? Some problem with memory management in boost python or ROOT? Again, I am sorry to keep bothering you with these problems on the fringe of PyROOT and thank you very much for your help!

Regards,
e_dude
backtrace.txt (32.5 KB)

Hi,

no, I don’t even see where the copying would come from: it is in the printout of the first example, but does not seem to be anywhere in the code. Also, if that’s the address (i.e. this) that you’re printing, then you seem to have three of them?

When you specify ‘false’ as the third argument to ObjectProxy_FromVoidPtr, then that makes sure that PyROOT will not delete the object. Even if there were to be a bug there, PyROOT certainly does not make any copies.

Now, in the working example, if you really do:

currentFitResult = pyRootPwa.core.fitResult().getAsRootObject()

then for sure that will fail w/o an intermediate copy on the C++ side. After all, you construct a python object (that owns the C++ one) first, and that gets deleted immediately after the getAsRootObject() call returns.
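The lifetime issue described here can be pictured without ROOT. The following is a minimal sketch in plain Python, assuming CPython's reference counting. `Owner` and `View` are hypothetical stand-ins for the boost.python wrapper (which owns the C++ object) and for the non-owning proxy returned by getAsRootObject(); the sketch only shows when the owner is destroyed, not how either library is implemented.

```python
class View:
    """Hypothetical non-owning handle, standing in for the returned proxy."""


class Owner:
    """Hypothetical owning wrapper, standing in for the boost.python object."""

    def __init__(self, log):
        self._log = log

    def __del__(self):
        # Runs when the last reference to the owner disappears.
        self._log.append("owner destroyed")

    def get_view(self):
        # The view holds no reference back to the owner.
        return View()


def temporary_pattern(log):
    # The Owner is a temporary: nothing refers to it once get_view() returns.
    view = Owner(log).get_view()
    log.append("view in use")
    return view


def named_pattern(log):
    # Binding the Owner to a name keeps it alive while the view is used.
    owner = Owner(log)
    view = owner.get_view()
    log.append("view in use")
    return owner, view
```

Under CPython, temporary_pattern records the destruction before the view is used, whereas named_pattern records nothing until the caller drops the returned owner.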

One way to debug this further is to look at the result of repr(currentFitResult). The address printed after ‘at’ is the address of the C++ object (not the address of the bound python object, as is more common). You can compare that with the ‘this’ printout and see what you have.
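A small helper can make that comparison less error-prone. This is only a sketch under assumptions: the function names are hypothetical, the helper relies on nothing more than the "at 0x..." form mentioned above, and the debug line format is the one from the printout earlier in this thread.

```python
import re

_AT_ADDRESS = re.compile(r"\bat (0x[0-9a-fA-F]+)")
_ANY_ADDRESS = re.compile(r"0x[0-9a-fA-F]+")


def address_from_repr(text):
    """Return the address that follows 'at' in a repr string, or None."""
    match = _AT_ADDRESS.search(text)
    return int(match.group(1), 16) if match else None


def address_from_debug_line(line):
    """Return the last hexadecimal address found in a debug line, or None."""
    matches = _ANY_ADDRESS.findall(line)
    return int(matches[-1], 16) if matches else None


def same_object(repr_text, debug_line):
    """True if the repr and the debug line mention the same address."""
    address = address_from_repr(repr_text)
    return address is not None and address == address_from_debug_line(debug_line)
```

It could be called as same_object(repr(currentFitResult), line), where line is one of the "fit result constructed!" messages.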

Cheers,
Wim

Hi,

having some time, I tried once again to tackle this project, but I seem to be running into the same problems as before. At least now I found one version which works:


import ROOT
import pyRootPwa

def pyPrint(text):
	print("[PYTHON]: " + str(text))

if __name__ == "__main__":

	fitResultFileName = "fitResult.root"
	fitResultFile = ROOT.TFile.Open(fitResultFileName, "READ")
	pyPrint("getting fit result tree")
	fitResultTree = fitResultFile.Get("pwa")
	pyPrint("got fit result tree '" + repr(fitResultTree))
	pyPrint("before instantiating fit result")
	fitResult = pyRootPwa.core.fitResult()
	pyPrint("instantiated fit result")

	pyPrint("before getting root fit result")
	rootFitResult = fitResult.getAsRootObject()
	pyPrint("got root fit result with address '" + repr(rootFitResult))
	pyPrint("before setting branch address")
	fitResultTree.SetBranchAddress('fitResult_v2', rootFitResult)
	pyPrint("set branch address")

	pyPrint("start reading TTree")
	for i in range(fitResultTree.GetEntries() // 8):
		pyPrint("before getting entry " + str(i))
		fitResultTree.GetEntry(i)
		pyPrint("got entry " + str(i))
		pyPrint("before printing number of events")
		pyPrint(fitResult.nmbEvents())
		pyPrint("printed number of events")

with the C++ function

	PyObject* fitResult_getAsRootObject(rpwa::fitResult& self)
	{
		return TPython::ObjectProxy_FromVoidPtr(&self, self.ClassName(), false);
	}

The output of this being:

So this is working as expected, at least, which is a start. Note that it seems to be normal behavior for SetBranchAddress to create and destroy an object of the type it is given. Now, the problem is, if I change the way the branch address is set to

fitResultTree.SetBranchAddress('fitResult_v2', fitResult.getAsRootObject())

or do

	pyPrint("before getting root fit result")
	rootFitResult = fitResult.getAsRootObject()
	pyPrint("got root fit result with address '" + repr(rootFitResult))
	fitResultTree.SetBranchAddress('fitResult_v2', rootFitResult)

	del rootFitResult

I get a segmentation violation:

I do not quite understand this. It seems the deletion of the ObjectProxy returned by fitResult.getAsRootObject() somehow breaks the reading of the tree, even though the C++ fitResult is not deleted (no printout from the destructor). On the other hand, a new C++ fitResult seems to be created for some reason. Do you have any insight into why this might be happening? In my understanding, this should work, should it not?

I explored another possibility to solve this problem. I defined another C++ function for the fitResult:

	void fitResult_setBranchAddress(rpwa::fitResult* self, PyObject* pyTree, std::string name)
	{
		TObject* object = (TObject*)TPython::ObjectProxy_AsVoidPtr(pyTree);
		TTree* tree = dynamic_cast<TTree*>(object);
		if(not tree) {
			PyErr_SetString(PyExc_TypeError, "Got invalid input for tree when executing rpwa::amplitudeTreeLeaf::branch()");
			bp::throw_error_already_set();
		}
		std::cout << "setting ttree '" << tree << "' pointer to pointer on '" << self << "'." << std::endl;
		tree->SetBranchAddress(name.c_str(), &self);
	}

which I hoped could simplify things because then I can just do:

	pyPrint("before setting fitResult::setBranchAddress")
	fitResult.setBranchAddress(fitResultTree, 'fitResult_v2')
	pyPrint("set branch address")

But this too leads to a segmentation violation:

Here too, I cannot understand what goes wrong. Everything seems to go fine when setting the branch address: the correct tree and fitResult are used and nothing is deleted. Still, reading from the tree fails.

I am really sorry for the long post, but at some point, I have to solve this problem. The point is that I am writing this to be used by other people and having exactly one “magic” which works while all else fails with a segmentation violation will be a problem. Next to the golden solution (i.e. my fitResult can be used natively with TTree::SetBranchAddress), having a fitResult.setBranchAddress(tree, name) would be really nice.

Thanks again for your help! If you need any more information, please let me know. If you would like to look deeper into this, I could try to produce some minimal breaking example.

Best regards,
e_dude

Hi,

haven’t read it all in detail, but from the looks of it you take a pointer to a stack variable (the function argument given to fitResult_setBranchAddress) to get the pointer-to-pointer, so that gets overwritten once that portion of the stack is reused.

Cheers,
Wim

Dear Wim,

thank you for your answer!

Unfortunately, I do not think this is happening. If it were, one should see the message from the fitResult’s destructor somewhere, but it does not appear.

To make this easier, I invested a few hours and produced a breaking example (attached). As this requires boost::python objects, it is a full CMake project which one should be able to compile if python, (compiled) boost::python and ROOT are available. After compilation, $PYTHONPATH has to point to ./build/pyLib to find the python module. There are three scripts which demonstrate the working variant and the two breaking variants. The object being used is defined in ./pyInterface/fitResult.h, the boost::python wrapping is done in ./pyInterface/rootPwaPy.cc. Please let me know if anything is unclear with this example.

I hope with this we can find out what’s happening.

Best regards,
e_dude

edit: Note that on my machine, the breaking examples do not produce a segmentation violation. Instead, the values read from the tree are just empty (i.e. nmbEvents() returns 0).
boostPythonPyRootBreakingExample.tar.gz (20.8 KB)

Not sure where you expect problems from a destructor, but the code is quite clear:

void fitResult_setBranchAddress(rpwa::fitResult* self, PyObject* pyTree, std::string name)
{
	// .. snip ...
	tree->SetBranchAddress(name.c_str(), &self);
}

The pointer variable ‘self’ only has a life time for the duration of the call since it lives on the stack (obviously, the object pointed to by self has a longer life time).

I’ll see when I’ll have time to run the code; but this week I’m up against a conference deadline and preparing for two students who start next week.

Cheers,
Wim

Hi,

ah, now I understand what you mean. After experimenting for a little while, I found that this was indeed the problem. I will have to think about how I can keep the pointer alive, but it seems possible to fix this.
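One way to picture "keeping the pointer alive" without ROOT is the following minimal ctypes sketch. A hypothetical `FakeTree` stores only the raw address of a pointer slot (as SetBranchAddress does with its void** argument, according to the discussion above), and a hypothetical `BranchBinding` owns that slot for as long as the tree is read. `FitResult`, `FakeTree` and `BranchBinding` are illustrative names only; in the real wrapper the slot would have to live on the C++ side, for instance in some long-lived holder object.

```python
import ctypes


class FitResult(ctypes.Structure):
    """Hypothetical stand-in for the C++ object filled by the tree."""
    _fields_ = [("nmb_events", ctypes.c_int)]


FitResultPtr = ctypes.POINTER(FitResult)


class FakeTree:
    """Hypothetical tree that remembers the address of a pointer slot."""

    def __init__(self, values):
        self._values = list(values)
        self._slot_address = None

    def set_branch_address(self, slot_address):
        # Only the raw address is stored; the tree does not own the slot.
        self._slot_address = slot_address

    def get_entry(self, index):
        # Follow slot -> pointer -> object on every read.
        slot = ctypes.cast(self._slot_address, ctypes.POINTER(FitResultPtr))
        slot.contents.contents.nmb_events = self._values[index]


class BranchBinding:
    """Owns the pointer slot so that its address stays valid."""

    def __init__(self, tree, fit_result):
        self.fit_result = fit_result
        self._slot = ctypes.pointer(fit_result)  # the pointer the tree reads
        tree.set_branch_address(ctypes.addressof(self._slot))
```

The binding has to stay referenced for as long as get_entry is called. Dropping it would leave the tree with a stale address, which is the analogue of passing &self of a function argument.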

Thank you very much!

Regards,
e_dude