Python3Intro

This page is an introduction (mainly) for the work of Alban Daumer on python bindings in yade.

Why python

Yade is written in c++; python is an interpreted language. C++ code executes very fast but has to be compiled and is rather verbose. Python code, on the other hand, is slower, but:

  • is interpreted at runtime without compilation (flexibility), and
  • is much easier to write than c++ (productivity).

The computational part of Yade will remain in c++, but things that do not execute very often and need to be flexible are easier to write in python. This includes:

  •  scene generation (simulation setup);
  • non-heavy data processing, such as occasional saving of some variables for plotting;
  • "simulation flow" in the sense of: stopping/modifying the simulation if we meet some criteria;
  • command-line interface for the simulation and modifying it by hand during execution.
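As a rough illustration of the "simulation flow" role, the sketch below drives a stand-in for the compiled engine loop from python, occasionally saves a value for plotting, and stops once a criterion is met. All names here (run_until, make_toy_step, the damping factor) are hypothetical and are not part of yade.

```python
def run_until(step, criterion, max_iter=10000, save_every=100):
    """Call step() until criterion(state) holds or max_iter is reached.

    step and criterion are hypothetical stand-ins: in a real simulation,
    step would trigger one iteration of the compiled engine loop.
    """
    history = []      # occasionally saved values, e.g. for plotting
    iteration = -1    # stays -1 if max_iter is 0
    state = None
    for iteration in range(max_iter):
        state = step()
        if iteration % save_every == 0:
            history.append((iteration, state))
        if criterion(state):
            break
    return iteration, state, history


def make_toy_step(energy=1.0, damping=0.9):
    """Toy 'engine': every call multiplies a fake kinetic energy by damping."""
    box = {"energy": energy}

    def step():
        box["energy"] *= damping
        return box["energy"]

    return step


# Stop as soon as the fake energy drops below a threshold.
last_iter, final_energy, history = run_until(
    make_toy_step(), criterion=lambda e: e < 1e-3, save_every=10
)
```

The stopping rule is an ordinary python callable, so it can be changed between runs without recompiling anything.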

To make python intuitive, we create "bindings", an interface that wraps around yade internals and mimics them; in this way, a user who knows yade internals can easily use the python bindings (and vice versa).

Current state of python bindings (as of 24.6.2008), i.e. python II.

(This is the second incarnation of python bindings. The first one was very rudimentary and is not even of historical interest now.)

Yade code is designed in such a way that there are a few "root classes" (let's call them like that), and all classes inherit, directly or indirectly, from one of them (Engine, Body, Interaction, PhysicalParameters, GeometricalModel, ...). The class hierarchy can be browsed in the doxygen documentation. All (most, actually) root classes derive from Serializable.

Serialization

For the purposes of loading/saving simulations, yade has its own serialization library (converting structured data into a linear stream), written by Janek. This library serializes "fundamental variables", which are numbers, vectors, quaternions, strings, bools (and others?), containers thereof (std::vector, std::list, ...) and pointers to other yade classes.

Classes deriving from Serializable register their attributes (data members) with the REGISTER_ATTRIBUTE macro, which creates a bidirectional mapping between the attribute name (as a string) and the attribute address. Registered attributes are processed one after another during serialization, which results in a stream representation of those variables. The only currently supported stream format is XML (XMLFormatManager); there used to be a BINFormatManager, but it was removed due to mysterious clashes with python.

For example, if a class Sphere had an attribute called center of type Vector3r and value Vector3r(1.,2.,3.), the XML-serialized form of the Sphere object would read

< _className_="Sphere" center="{1 2 3}" />.

Then, on deserialization from XML, the class factory (the ability to create a class instance based on its name given as a string; again, the class has to be registered with REGISTER_SERIALIZABLE etc.) would create an instance of the Sphere class and the center attribute would be parsed, so that finally Sphere::center would be set to Vector3r(1,2,3), as it was before.
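The round trip described above can be imitated in a few lines of python. This is only a toy analogue, not yade's serialization library: register_serializable, registered_attributes, encode, decode and the element name "object" are all assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

CLASS_FACTORY = {}  # class name (string) -> class


def register_serializable(cls):
    """Toy analogue of REGISTER_SERIALIZABLE: make cls constructible by name."""
    CLASS_FACTORY[cls.__name__] = cls
    return cls


def encode(value):
    """Sequences become '{x y z}'; anything else its str() form (toy format)."""
    if isinstance(value, (tuple, list)):
        return "{" + " ".join(format(v, "g") for v in value) + "}"
    return str(value)


def decode(text):
    """Inverse of encode for vectors; other values are left as strings."""
    if text.startswith("{") and text.endswith("}"):
        return tuple(float(v) for v in text[1:-1].split())
    return text


class Serializable:
    # Toy analogue of REGISTER_ATTRIBUTE: names of attributes to (de)serialize.
    registered_attributes = ()

    def to_xml(self):
        # The element name "object" is an assumption of this sketch.
        elem = ET.Element("object", {"_className_": type(self).__name__})
        for name in self.registered_attributes:
            elem.set(name, encode(getattr(self, name)))
        return ET.tostring(elem, encoding="unicode")


def from_xml(xml_text):
    elem = ET.fromstring(xml_text)
    obj = CLASS_FACTORY[elem.get("_className_")]()  # instance created by name
    for name in obj.registered_attributes:
        setattr(obj, name, decode(elem.get(name)))
    return obj


@register_serializable
class Sphere(Serializable):
    registered_attributes = ("center",)

    def __init__(self):
        self.center = (0.0, 0.0, 0.0)


s = Sphere()
s.center = (1.0, 2.0, 3.0)
xml_text = s.to_xml()
restored = from_xml(xml_text)
```

Here restored is a fresh Sphere created through the name lookup, with center parsed back from the string form.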

Now about python

The current python glue relies heavily on the serialization interface; there are wrappers for all root classes (see, for instance, yadeControl.cpp:191 for pyBody), and pyBody has shared_ptr<Body> proxee, which is a (shared) pointer to the actual c++ object it wraps around.

The pyBody class further defines wrappedPyGet and wrappedPySet (exposed as __getitem__ and __setitem__ in python, which are called for the indexing operator []), which get/set an attribute based on its name. For example:

<source lang="python">
b=Body() # create an instance of Body, which constructs a new pyBody object and its proxee, Body
b['isDynamic']
# 1. requests an attribute called 'isDynamic' from the wrapper
# 2. b['isDynamic'] translates to b.__getitem__('isDynamic')
# 3. b.__getitem__('isDynamic') in python in turn calls pyBody::wrappedPyGet("isDynamic") in the c++ code
# 4. pyBody::wrappedPyGet("isDynamic") asks the serializer for the value of the attribute named "isDynamic"
# 5. [...]
# 6. the serializer returns the value of the attribute "isDynamic" in serialized form, i.e. as the string "1" (=true)
# 7. pyBody::wrappedPyGet gets the type of the value (no need to go into details here), attempts conversion to the right type (bool, in this case) and returns it wrapped in boost::python::object to __getitem__
# 8. python gives you the answer, which is True (python bool type)
</source>

If an attribute is assigned, as in

<source lang="python">
b['isDynamic']=True
</source>

the python value True is converted to the string "1", which is passed to the deserializer, which in turn changes the value of the wrapped instance.
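The same get/set path can be mimicked in pure python. The sketch below is not the code from yadeControl.cpp: PyBody, serialize and deserialize are hypothetical stand-ins, and the Body class here merely plays the role of the wrapped c++ object.

```python
class Body:
    """Stand-in for the wrapped c++ object (the 'proxee')."""

    def __init__(self):
        self.isDynamic = True
        self.id = 0


def serialize(value):
    """Value -> string; bools become "1"/"0" as in the text above."""
    if isinstance(value, bool):
        return "1" if value else "0"
    return str(value)


def deserialize(text, target_type):
    """String -> value of the requested type."""
    if target_type is bool:
        return text == "1"
    return target_type(text)


class PyBody:
    """Wrapper exposing attributes of its proxee through the [] operator."""

    def __init__(self):
        self.proxee = Body()

    def __getitem__(self, name):
        current = getattr(self.proxee, name)     # AttributeError if unknown
        text = serialize(current)                # value goes through a string...
        return deserialize(text, type(current))  # ...and is converted back

    def __setitem__(self, name, value):
        target_type = type(getattr(self.proxee, name))
        setattr(self.proxee, name, deserialize(serialize(value), target_type))


b = PyBody()
before = b['isDynamic']   # read through the string round trip
b['isDynamic'] = False    # write through the string round trip
after = b['isDynamic']
```

Every read and write passes through a string, which is what keeps the glue thin and also what makes it slow.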

(Dis)advantages of the current approach

Disadvantages:

  • Converting data to string and back is quite inefficient; the cost becomes noticeable when setting parameters for thousands of elements.
  • Containers are not handled as they should be (for instance, vector<vector<int> > doesn't work; even vector<Vector3r> doesn't work)
  • The glue does not expose methods
  • Every root class has to be wrapped by hand (there are many macros currently)
  • The syntax instance['attributeName'] is ugly
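To get a feeling for the first point, the string round trip can be timed against a direct assignment. This is a pure-python stand-in with hypothetical names (Body, set_direct, set_via_string); it says nothing precise about the cost inside yade and only indicates the kind of comparison one could make.

```python
import timeit


class Body:
    """Stand-in for a wrapped object with one vector attribute."""

    def __init__(self):
        self.center = (0.0, 0.0, 0.0)


def set_direct(body, value):
    body.center = value


def set_via_string(body, value):
    text = "{%r %r %r}" % value  # value -> string
    body.center = tuple(float(v) for v in text[1:-1].split())  # string -> value


bodies = [Body() for _ in range(1000)]


def apply_to_all(setter):
    for body in bodies:
        setter(body, (1.0, 2.0, 3.0))


direct_time = timeit.timeit(lambda: apply_to_all(set_direct), number=20)
string_time = timeit.timeit(lambda: apply_to_all(set_via_string), number=20)
print("direct: %.4f s, via string: %.4f s" % (direct_time, string_time))
```

The absolute numbers depend on the machine; only the ratio between the two is of interest.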

Advantages:

  • The glue is quite thin, since we leverage the existing serialization interface

Python III.

Requirements

  •  values are not converted to and from strings; there is direct and automatic conversion from c++ type to python type
    • the conversion is flexible, i.e. it can convert Wm3::Vector3r into Scientific.Geometry.Vector and vice versa
    • it can handle (nested) containers gracefully
  • as automatic as possible
  • uses boost::python (as we do now, anyway)
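From the user's side, these requirements amount to something like the following pure-python mock-up. It is a sketch of the intended behaviour, not of how boost::python or any generator would implement it; Wrapper, Vector3, to_python and from_python are hypothetical, and a plain tuple stands in for a python-side vector type.

```python
class Vector3:
    """Stand-in for a c++ vector type such as Vector3r."""

    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z


def to_python(value):
    """'c++ side' value -> python value, recursing into (nested) lists."""
    if isinstance(value, Vector3):
        return (value.x, value.y, value.z)
    if isinstance(value, list):
        return [to_python(v) for v in value]
    return value


def from_python(value):
    """Inverse of to_python: 3-tuples become Vector3, lists go element-wise."""
    if isinstance(value, tuple) and len(value) == 3:
        return Vector3(*value)
    if isinstance(value, list):
        return [from_python(v) for v in value]
    return value


class Wrapper:
    """Exposes attributes of the proxee as wrapper.name, with no strings involved."""

    def __init__(self, proxee):
        object.__setattr__(self, "_proxee", proxee)  # bypass our __setattr__

    def __getattr__(self, name):
        return to_python(getattr(self._proxee, name))

    def __setattr__(self, name, value):
        if not hasattr(self._proxee, name):
            raise AttributeError(name)
        setattr(self._proxee, name, from_python(value))


class Sphere:
    """Stand-in for a wrapped yade class."""

    def __init__(self):
        self.isDynamic = True
        self.center = Vector3(0.0, 0.0, 0.0)
        self.path = [Vector3(0.0, 0.0, 0.0)]


s = Wrapper(Sphere())
s.isDynamic = False                          # attribute syntax, not s['isDynamic']
s.center = (1.0, 2.0, 3.0)                   # converted to Vector3 on the way in
s.path = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]  # container converted element-wise
```

The conversion functions are the only place that needs to know about both type systems, which is where a flexible mapping between vector types would plug in.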

Alternatives

Py++

  • Py++ is an automatic, flexible and powerful wrapper generator
  • uses gccxml to get representation of all classes and attributes, generates c++ code that creates the glue
  • converts basic types (as boost::python does), and handles containers very well (indexing suite v. 2)
  • is actively being used, with very good references

Xrtti

  • Xrtti extends the runtime-type-identification mechanism of standard c++, again by using gccxml and generating a c++-compilable class description
  • does not integrate with python directly (hence, no containers, no conversions and so on)
  • seems to be a one-man project

Others?

sip and swig need hand-written interface descriptions, which is what we want to avoid.

Resolution

Therefore it seems that py++ is the best option, unless proven otherwise.

Tentative workplan

I am trying to suggest the order in which you will need some competences that I consider important for the success of the project. If you have any of them already, very good. From Jean-Francois' references, I assume you have good knowledge of c++ (including templates, inheritance, type-casting, polymorphism and virtual functions) and of Linux (which, in this case, means: compiling and installing programs, and a rough idea of the toolchain sources-compiler-linker).

Please ask Remi (or whoever is responsible for that) for a rather powerful computer with recent software, like ubuntu hardy or debian lenny (or unstable): you will need python 2.5, scons 0.98, boost 1.34 (better: 1.35), g++ 4.2 (better: 4.3 or a snapshot). If you set up distcc with your mates, all the better for you (beware, compiler versions must match on all machines!).

I would like you to use launchpad.net for tracking bugs and tasks you want to do, and wikia for some notes you take along the way. I have had good experience with setting up a blog page on the wiki and writing every day about what you did, what worked, what did not and so on. That can be useful for anyone interested in your work. Call all pages you create on wikia Python3Something.

  1. Create an account at [launchpad], once done, I will create your own branch of yade where you can experiment without breaking anything.
    • Become familiar with the revision control system bzr (bzr checkout, commit, push, pull, diff; easy).
  2. Subscribe to yade-dev@lists.berlios.de
  3. Read a decent python tutorial, like [this one]; do some experiments with python. You should know how classes and instances work, what modules are and a bit about namespaces.
  4. Read a bit about scons, to know roughly how the yade build process works (scons is written in python, too).
  5. Have an idea about how yade works (structure of simulation, the engine loop); there are docs on this on the wiki (How_it_works, [[1]] section 2.2.4, other stuff)
  6. Have a look at the current python binding: PythonPrimer, SimpleSceneTutorial (compares c++ and python versions of the same thing)
  7. Get to know basics of boost::python [boost::python], experiment with that for a while
  8. Download and install latest py++, run that on some rudimentary example
  9. Run py++ on yade itself, until the wrapper is successfully generated; the new wrapper should co-exist with the old one (which is in the module yade.wrapper)
  10. Think about ease of use of the interface, create some name-transforming rules for py++, limit the wrapper only to data members that should be exposed, ...
  11. Create scons target that will re-build the wrapper.
  12. Create sample simulation with the new wrapper.

Obviously, the last 4 steps are the core of your work and will likely take most of the time. I assume you will inform me about your progress every few days so that we can discuss things and steer the project to a successful end.

Many thanks for your interest and engagement.