Colliders performance

Revision as of 20:18, 19 September 2011 by Bchareyre

Results

The graph below shows:

  • "init": time for the first step
  • "step": average time over the next 100 steps (normalized per step)
  • The first-step time of PersistentSAPCollider had to be plotted on a log scale.
  • Machine: Intel i7 2.7GHz, DDR3 RAM

Colliders-perf.svg

  • SpatialQuickSortCollider scales with N^2 and is significantly slower, especially for big packings; the initial step is not significantly longer than a regular step.
  • PersistentSAPCollider scales slightly worse than N×log N. The first step is significantly slower than the following ones.
  • InsertionSortCollider scales the same as PersistentSAPCollider, but in absolute numbers is about 50% faster in regular steps and over 10x (!!) faster on the initial step.
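The difference between an all-pairs test and a sort-based collider can be illustrated with a small, self-contained sketch. This is not Yade's code: it is a 1D toy version of the general sweep-and-prune idea, and all function names (`brute_force_pairs`, `insertion_sort`, `sweep_pairs`, and so on) are hypothetical. It assumes that bodies move only slightly between steps, so the list of bounds stays almost sorted and re-sorting it with insertion sort needs few shifts. That is one plausible reading of why a sort-based collider can be cheap on regular steps.

```python
import random


def brute_force_pairs(intervals):
    """Test every pair of 1D intervals: N*(N-1)/2 overlap checks."""
    pairs = set()
    for i in range(len(intervals)):
        for j in range(i + 1, len(intervals)):
            (lo_i, hi_i), (lo_j, hi_j) = intervals[i], intervals[j]
            if lo_i <= hi_j and lo_j <= hi_i:
                pairs.add((i, j))
    return pairs


def make_bounds(intervals):
    """Build the list of (coordinate, is_upper, body id) bounds."""
    bounds = []
    for i, (lo, hi) in enumerate(intervals):
        bounds.append((lo, 0, i))
        bounds.append((hi, 1, i))
    return bounds


def refresh_bounds(bounds, intervals):
    """Update coordinates in place, keeping the ordering of the previous step."""
    for k, (_, is_upper, i) in enumerate(bounds):
        bounds[k] = (intervals[i][is_upper], is_upper, i)


def insertion_sort(bounds):
    """Sort in place and return the number of element shifts performed."""
    shifts = 0
    for k in range(1, len(bounds)):
        item = bounds[k]
        m = k - 1
        while m >= 0 and bounds[m] > item:
            bounds[m + 1] = bounds[m]
            m -= 1
            shifts += 1
        bounds[m + 1] = item
    return shifts


def sweep_pairs(bounds):
    """Walk the sorted bounds and pair each opening interval with the open ones."""
    active, pairs = set(), set()
    for _, is_upper, i in bounds:
        if is_upper:
            active.discard(i)
        else:
            for j in active:
                pairs.add((min(i, j), max(i, j)))
            active.add(i)
    return pairs


def to_intervals(centers, radius):
    return [(c - radius, c + radius) for c in centers]


random.seed(0)
radius = 0.2
centers = [random.uniform(0, 100) for _ in range(200)]

# First step: the bounds start in arbitrary order, so sorting is expensive.
intervals = to_intervals(centers, radius)
bounds = make_bounds(intervals)
first_shifts = insertion_sort(bounds)
assert sweep_pairs(bounds) == brute_force_pairs(intervals)

# Next step: small displacements leave the list almost sorted.
centers = [c + random.uniform(-0.01, 0.01) for c in centers]
intervals = to_intervals(centers, radius)
refresh_bounds(bounds, intervals)
next_shifts = insertion_sort(bounds)
assert sweep_pairs(bounds) == brute_force_pairs(intervals)

print("shifts on first sort:", first_shifts)
print("shifts on next sort: ", next_shifts)
```

The shift counts are only a proxy for cost in this toy setting; they say nothing about the absolute timings measured above.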

Running

$ cd examples/collider-perf
$ export OMP_NUM_THREADS=1 # to make sure, for openMP-enabled builds
$ yade-trunk-opt-multi perf.table perf.py
$ python mkGraph.py *.log

Other machines

  • Machine: AMD Athlon(tm) XP 2100+, 1.7GHz

Colliders-gl.svg

  • Machine: Intel(R) Xeon(R) CPU E5410 @ 2.33GHz

Xeon233 collider-perf.png

TODO (post your graphs here, with machine description)


Improved InsertionSort

Results obtained with a modified version of the insertion sort collider are given below (5000 iterations in the initial phase of a triaxial confinement, single thread, Intel(R) Xeon(R) CPU W3530 @ 2.80GHz). The code is a candidate for release (https://code.launchpad.net/~bruno-chareyre/yade/collide).

Unlike the results above, the times here are given per simulation step, not per execution of collider::action(). The speedup of the collider approaches 10x for 96k spheres, while the speedup in total time per step is slightly above 2x.
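How a large collider speedup translates into a smaller speedup of the whole step can be sketched with the usual Amdahl-style relation. The snippet below is only an illustration: the function names are hypothetical, and the 56% collider share is an assumed input chosen so that a 10x collider speedup gives roughly 2x overall. It is not a measured value from these runs.

```python
def total_speedup(collider_fraction, collider_speedup):
    """Overall step speedup when only the collider share of the time gets faster."""
    remaining = (1.0 - collider_fraction) + collider_fraction / collider_speedup
    return 1.0 / remaining


def fraction_after(collider_fraction, collider_speedup):
    """Collider share of the step time once the speedup is applied."""
    new_collider = collider_fraction / collider_speedup
    return new_collider / ((1.0 - collider_fraction) + new_collider)


# Assumed (hypothetical) share of the step time spent in the collider before the change.
f_before = 0.56
speedup = 10.0

print("total speedup per step:", round(total_speedup(f_before, speedup), 2))
print("collider share afterwards:", round(fraction_after(f_before, speedup), 2))
```

Under these assumptions the rest of the step (mostly the interaction loop) dominates after the change, which is why further collider gains would matter less for the total time.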

The improvement is more noticeable in multithreaded runs, since the modified version keeps the ratio of collider cost to interaction-loop cost small. With 96k particles and the larger Verlet distance, the collider accounts for 16% of the total CPU time (initialization excluded).


ColliderTimes.png

ColliderTimesOptimal.png