DEM formulation¶
In this chapter, we mathematically describe general features of explicit DEM simulations, with some reference to Yade implementation of these algorithms. They are given roughly in the order as they appear in simulation; first, two particles might establish a new interaction, which consists in
detecting collision between particles;
creating new interaction and determining its properties (such as stiffness); they are either precomputed or derived from properties of both particles;
Then, for already existing interactions, the following is performed:
strain evaluation;
stress computation based on strains;
force application to particles in interaction.
This simplified description serves only to give meaning to the ordering of sections within this chapter. A more detailed description of this simulation loop is given later.
In this chapter we refer to kinematic variables of the contacts as ``strains``, although at this scale it is also common to speak of ``displacements``. Which semantic is more appropriate depends on the conceptual model one is starting from, and therefore it cannot be decided independently of specific problems. The reader familiar with displacements can mentaly replace normal strain and shear strain by normal displacement and shear displacement, respectively, without altering the meaning of what follows.
Collision detection¶
Generalities¶
Exact computation of collision configuration between two particles can be relatively expensive (for instance between Sphere and Facet). Taking a general pair of bodies
fast collision detection using approximate predicate
and ; they are pre-constructed in such a way as to abstract away individual features of and and satisfy the condition(1)¶(likewise for
). The approximate predicate is called ``bounding volume’’ (Bound in Yade) since it bounds any particle’s volume from outside (by virtue of the implication). It follows that and, by applying modus tollens,(2)¶which is a candidate exclusion rule in the proper sense.
By filtering away impossible collisions in (2), a more expensive, exact collision detection algorithms can be run on possible interactions, filtering out remaining spurious couples
. These algorithms operate on and and have to be able to handle all possible combinations of shape types.
It is only the first step we are concerned with here.
Algorithms¶
Collision evaluation algorithms have been the subject of extensive research in fields such as robotics, computer graphics and simulations. They can be roughly divided in two groups:
- Hierarchical algorithms
which recursively subdivide space and restrict the number of approximate checks in the first pass, knowing that lower-level bounding volumes can intersect only if they are part of the same higher-level bounding volume. Hierarchy elements are bounding volumes of different kinds: octrees [Jung1997], bounding spheres [Hubbard1996], k-DOP’s [Klosowski1998].
- Flat algorithms
work directly with bounding volumes without grouping them in hierarchies first; let us only mention two kinds commonly used in particle simulations:
- Sweep and prune
algorithm operates on axis-aligned bounding boxes, which overlap if and only if they overlap along all axes. These algorithms have roughly
complexity, where is number of particles as long as they exploit temporal coherence of the simulation.- Grid algorithms
represent continuous
space by a finite set of regularly spaced points, leading to very fast neighbor search; they can reach the complexity [Munjiza1998] and recent research suggests ways to overcome one of the major drawbacks of this method, which is the necessity to adjust grid cell size to the largest particle in the simulation ([Munjiza2006], the ``multistep’’ extension).
- Temporal coherence
expresses the fact that motion of particles in simulation is not arbitrary but governed by physical laws. This knowledge can be exploited to optimize performance.
Numerical stability of integrating motion equations dictates an upper limit on
On a finer level, it is common to enlarge
Sweep and prune¶
Let us describe in detail the sweep and prune algorithm used for collision detection in Yade (class InsertionSortCollider). Axis-aligned bounding boxes (Aabb) are used as
Presence of overlap of two Aabb’s can be determined from conjunction of separate overlaps of intervals along each axis (fig-sweep-and-prune):
where

Sweep and prune algorithm (shown in 2D), where Aabb of each sphere is represented by minimum and maximum value along each axis. Spatial overlap of Aabb’s is present if they overlap along all axes. In this case,
The collider keeps 3 separate lists (arrays)
where id
referring to particle it belongs to, and a flag whether it is lower or upper bound.
In the initial step, all lists are sorted (using quicksort, average
At each successive step, lists are already pre-sorted. Inversions occur where a particle’s coordinate has just crossed another particle’s coordinate; this number is limited by numerical stability of simulation and its physical meaning (giving spatio-temporal coherence to the algorithm). The insertion sort algorithm swaps neighboring elements if they are inverted, and has complexity between
overlap along the current axis, if an upper bound inverts (swaps) with a lower bound (i.e. that the upper bound with a higher coordinate was out of order in coming before the lower bound with a lower coordinate). Overlap along the other 2 axes is checked and if there is overlap along all axes, a new potential interaction is created.
End of overlap along the current axis, if lower bound inverts (swaps) with an upper bound. If there is only potential interaction between the two particles in question, it is deleted.
Nothing if both bounds are upper or both lower.
Aperiodic insertion sort¶
Let us show the sort algorithm on a sample sequence of numbers:

Elements are traversed from left to right; each of them keeps inverting (swapping) with neighbors to the left, moving left itself, until any of the following conditions is satisfied:
( |
the sorting order with the left neighbor is correct, or |
( |
the element is at the beginning of the sequence. |
We start at the leftmost element (the current element is marked

It obviously immediately satisfies (

Condition (

The last element

All elements having been traversed, the sequence is now sorted.
It is obvious that if the initial sequence were sorted, elements only would have to be traversed without any inversion to handle (that happens in
For each inversion during the sort in simulation, the function that investigates change in Aabb overlap is invoked, creating or deleting interactions.
The periodic variant of the sort algorithm is described in Periodic insertion sort algorithm, along with other periodic-boundary related topics.
Optimization with Verlet distances¶
As noted above, [Verlet1967] explored the possibility of running the collision detection only sparsely by enlarging predicates
In Yade, this is achieved by enlarging Aabb of particles by fixed relative length (or Verlet’s distance) in all dimensions
triggering the collider re-run as soon as one particle gives:
The number of particles and the number of available threads is also to be considered for choosing an appropriate Verlet’s distance. A larger distance will result in less time spent in the collider (which runs single-threaded) and more time in computing interactions (multi-threaded). Typically, large
Creating interaction between particles¶
Collision detection described above is only approximate. Exact collision detection depends on the geometry of individual particles and is handled separately. In Yade terminology, the Collider creates only potential interactions; potential interactions are evaluated exactly using specialized algorithms for collision of two spheres or other combinations. Exact collision detection must be run at every timestep since it is at every step that particles can change their mutual position (the collider is only run sometimes if the Verlet distance optimization is in use). Some exact collision detection algorithms are described in Kinematic variables; in Yade, they are implemented in classes deriving from IGeomFunctor (prefixed with Ig2
).
Besides detection of geometrical overlap (which corresponds to IGeom in Yade), there are also non-geometrical properties of the interaction to be determined (IPhys). In Yade, they are computed for every new interaction by calling a functor deriving from IPhysFunctor (prefixed with Ip2
) which accepts the given combination of Material types of both particles.
Stiffnesses¶
Basic DEM interaction defines two stiffnesses: normal stiffness
Naturally, such analysis is highly simplifying and does not account for particle radius distribution, packing configuration and other possible parameters such as the interaction radius introduced later.
Normal stiffness¶
The algorithm commonly used in Yade computes normal interaction stiffness as stiffness of two springs in serial configuration with lengths equal to the sphere radii (fig-spheres-contact-stiffness).

Series of 2 springs representing normal stiffness of contact between 2 spheres.¶
Let us define distance
The most used class computing interaction properties Ip2_FrictMat_FrictMat_FrictPhys uses
Some formulations define an equivalent cross-section
For reasons given above, no pretense about equality of particle-level
Other parameters¶
Non-elastic parameters differ for various material models. Usually, though, they are averaged from the particles’ material properties, if it makes sense. For instance, Ip2_CpmMat_CpmMat_CpmPhys averages most quantities, while Ip2_FrictMat_FrictMat_FrictPhys computes internal friction angle as
Kinematic variables¶
In the general case, mutual configuration of two particles has 6 degrees of freedom (DoFs) just like a beam in 3D space: both particles have 6 DoFs each, but the interaction itself is free to move and rotate in space (with both spheres) having 6 DoFs itself; then

Degrees of freedom of configuration of two spheres. Normal motion appears if there is a difference of linear velocity along the interaction axis (
We will only describe normal and shear components of the relative movement in the following, leaving torsion and bending aside. The reason is that most constitutive laws for contacts do not use the latter two.
Normal deformation¶
Constants¶
Let us consider two spheres with initial centers
These quantities are constant throughout the life of the interaction and are computed only once when the interaction is established. The distance

Geometry of the initial contact of 2 spheres; this case pictures spheres which already overlap when the contact is created (which can be the case at the beginning of a simulation) for the sake of generality. The initial contact point
Distances
The value of
Contact cross-section¶
Some constitutive laws are formulated with strains and stresses (Law2_ScGeom_CpmPhys_Cpm, the concrete model described later, for instance); in that case, equivalent cross-section of the contact must be introduced for the sake of dimensionality. The exact definition is rather arbitrary; the CPM model (Ip2_CpmMat_CpmMat_CpmPhys) uses the relation
which will be used to convert stresses to forces, if the constitutive law used is formulated in terms of stresses and strains. Note that other values than
Variables¶
The following state variables are updated as spheres undergo motion during the simulation (as
and
The contact point
Normal displacement and strain can be defined as
Since
For massively compressive simulations, it might be beneficial to use the logarithmic strain, such that the strain tends to
Such definition, however, has the disadvantage of effectively increasing rigidity (up to infinity) of contacts, requiring
Shear deformation¶
In order to keep
Geometrical meaning of shear strain is shown in fig-shear-2d.

Evolution of shear displacement
The classical incremental algorithm is widely used in DEM codes and is described frequently ([Luding2008], [Alonso2004]). Yade implements this algorithm in the ScGeom class. At each step, shear displacement
Contact moves dues to changes of the spheres’ positions
and , which updates current and as per (8) and (7). is perpendicular to the contact plane at the previous step and must be updated so that ; this is done by perpendicular projection to the plane first (which might decrease ) and adding what corresponds to spatial rotation of the interaction instead:Mutual movement of spheres, using only its part perpendicular to
; denotes mutual velocity of spheres at the contact point:
Finally, we compute
Contact model (example)¶
The kinematic variables of an interaction are used to determine the forces acting on both spheres via a constitutive law. In DEM generally, some constitutive laws are expressed using strains and stresses while others prefer displacement/force formulation. The law described here falls in the latter category.
The constitutive law presented here is the most common in DEM, originally proposed by Cundall. While the kinematic variables are described in the previous section regardless of the contact model, the force evaluation depends on the nature of the material being modeled. The constitutive law presented here is the simplest non-cohesive elastic-frictional contact model, which Yade implements in Law2_ScGeom_FrictPhys_CundallStrack (all constitutive laws derive from base class LawFunctor).
When new contact is established (discussed in Engines) it has its properties (IPhys) computed from Materials associated with both particles. In the simple case of frictional material FrictMat, Ip2_FrictMat_FrictMat_FrictPhys creates a new FrictPhys instance, which defines normal stiffness
At each step, given normal and shear displacements
where
Summary force
Motion integration¶
Each particle accumulates generalized forces (forces and torques) from the contacts in which it participates. These generalized forces are then used to integrate motion equations for each particle separately; therefore, we omit
The customary leapfrog scheme (also known as the Verlet scheme) is used, with some adjustments for rotation of non-spherical particles, as explained below. The “leapfrog” name comes from the fact that even derivatives of position/orientation are known at on-step points, whereas odd derivatives are known at mid-step points. Let us recall that we use
Described integration algorithms are implemented in the NewtonIntegrator class in Yade.
Position¶
Integrating motion consists in using current acceleration
Using the 2nd order finite difference with step
from which we express
Typically,
i.e. the mean velocity during the previous step, which is known. Plugging this approximate into the
which is
The algorithm can then be written down by first computing current mean velocity
Positions are known at times
Orientation¶
YADE has three different algorithms for integrating the rotational motion of non-spherical particles and one for spherical particles.
Orientation (spherical)¶
Updating particle orientation
We use the same approximation scheme, obtaining an equation analogous to (9)
The quaternion
Finally, we compute the next orientation
Orientation (aspherical)¶
Integrating the rotation of aspherical particles is considerably more complicated than their position, as their local reference frame is not inertial. Rotation of rigid body in the local frame, where inertia matrix
Due to the presence of both
The default algorithm and the most accurate one was proposed by [delValle2023]. The algorithm uses a leapfrog formulation that conserves the norm of the quaternion. [Omelyan1998], a more general version of [Omelyan1999] algorithm, is also implemented. Previously, YADE used the algorithm described by [Allen1989] (pg. 84–89) and designed by [Fincham1992] for molecular dynamics problems; it consists of extending the leapfrog algorithm by mid-step/on-step estimators of quantities known at on-step/mid-step points in the basic formulation. Although it has received criticism and more precise algorithms were known ([Omelyan1999], [Neto2006], [Johnson2008]), this algorithm is implemented in Yade for its relative simplicity.
Each body has its local coordinate system based on the principal axes of inertia for that body. We use
For a given particle, we know
(constant) inertia matrix; diagonal, since in local, principal coordinates, external torque, current orientation (and its equivalent rotation matrix ), mid-step angular velocity, mid-step angular momentum; this is an auxiliary variable needed in Fincham’s algorithm. It will be zero in the initial step.
SPIRAL Algorithm ([delValle2023])¶
Our goal is to compute new values of the latter three, that is,
where
where the quantity inside the parenthesis is a quaternion represented by its scalar part and its imaginary (vectorial) part. The algorithm offers a third-order approximation for both the quaternion and angular velocity calculations. As this formulation conserves the norm of the quaternion, it does not need to be normalized every time step. It is normalized every NewtonIntegrator.normalizeEvery steps. To finish, we compute the angular velocity and momentum in the global reference frame:
Omelyan Algorithm¶
[Omelyan1999] algorithm is also a leapfrog formulation. However, note that in a leapfrog formulation, we require the mid-step velocity and the current derivative of the velocity. But, in the case of Euler’s equation, the current angular acceleration depends on the current angular velocity, which is unknown. Then, Omelyan proposes to interpolate the current angular velocity product as
Then, we can compute the orientation of the particle with
The norm-conserving derivative of a quaternion can be calculated as
where
In the same way as the last algorithm, it is a third-order approximation, and the formulation is orthonormal, meaning that the norm of the quaternion is conserved. However, this formulation is numerically not as stable as the previous one.
Fincham Algorithm¶
Unlike the other two algorithms, [Fincham1992] does not conserve the norm of the quaternion. Then, NewtonIntegrator.normalizeEvery has no effect over this algorithm. This algorithm is second-order. The algorithm goes as follows: first, we estimate the current angular momentum and compute the current local angular velocity:
Then, we evaluate
Clumps (rigid aggregates)¶
DEM simulations frequently make use of rigid aggregates of particles to model complex shapes [Price2007] called clumps, typically composed of many spheres. Dynamic properties of clumps are computed from the properties of its members:
For non-overlapping clump members the clump’s mass
is summed over members, the inertia tensor is computed using the parallel axes theorem: , where is the mass of clump member , is the distance from center of clump member to clump’s centroid and is the inertia tensor of the clump member .For overlapping clump members the clump’s mass
is summed over cells using a regular grid spacing inside axis-aligned bounding box (Aabb) of the clump, the inertia tensor is computed using the parallel axes theorem: , where is the mass of cell , is the distance from cell center to clump’s centroid and is the inertia tensor of the cell .
Local axes are oriented such that they are principal and inertia tensor is diagonal and clump’s orientation is changed to compensate rotation of the local system, as to not change the clump members’ positions in global space. Initial positions and orientations of all clump members in local coordinate system are stored.
In Yade (class Clump), clump members behave as stand-alone particles during simulation for purposes of collision detection and contact resolution, except that they have no contacts created among themselves within one clump. It is at the stage of motion integration that they are treated specially. Instead of integrating each of them separately, forces/torques on those particles
Motion of the clump is then integrated, using aspherical rotation integration. Afterwards, clump members are displaced in global space, to keep their initial positions and orientations in the clump’s local coordinate system. In such a way, relative positions of clump members are always the same, resulting in the behavior of a rigid aggregate.
Numerical damping¶
In simulations of quasi-static phenomena, it it desirable to dissipate kinetic energy of particles. Since most constitutive laws (including Law_ScGeom_FrictPhys_Basic shown above, Contact model (example)) do not include velocity-based damping (such as one in [Addetta2001]), it is possible to use artificial numerical damping. The formulation is described in [Pfc3dManual30], although our version is slightly adapted. The basic idea is to decrease forces which increase the particle velocities and vice versa by
where
it acts on forces (accelerations), not constraining uniform motion;
it is independent of eigenfrequencies of particles, they will be all damped equally;
it needs only the dimensionless parameter
which does not have to be scaled.
In Yade, we use the adapted form
where we replaced the previous mid-step velocity
In Yade, damping (11) is implemented in the NewtonIntegrator engine; the damping coefficient
Stability considerations¶
Critical timestep¶
In order to ensure stability for the explicit integration sceheme, an upper limit is imposed on
where
Single mass-spring system¶
Single 1D mass-spring system with mass
where
does not depend on initial conditions. Since there is one single mass,
for a single oscillator.
General mass-spring system¶
In a general mass-spring system, the highest frequency occurs if two connected masses
The overall critical timestep is then
This equation can be used for all 6 degrees of freedom (DOF) in translation and rotation, by considering generalized mass and stiffness matrices
DEM simulations¶
In DEM simulations, per-particle stiffness
with
Note that for computation efficiency reasons, eigenvalues of the stiffness matrices are not computed. They are only approximated assuming than DOF’s are uncoupled, and using the diagonal terms of
There is one important condition that
Estimation of by wave propagation speed¶
Estimating timestep in absence of interactions is based on the connection between interaction stiffnesses and the particle’s properties. Note that in this section, symbols
In Yade, particles have associated Material which defines density
number of interactions per particle
; for a “reasonable” radius distribution, however, there is a geometrically imposed upper limit (12 for a packing of spheres with equal radii, for instance);the exact relationship the between particles’ rigidities
, , supposing only that is somehow proportional to them.
By defining
For our purposes, we define
This algorithm is implemented in the utils.PWaveTimeStep function.
Let us compare this result to (14); this necessitates making several simplifying hypotheses:
all particles are spherical and have the same radius
;the sphere’s material has the same
and ;the average number of contacts per sphere is
;the contacts have sufficiently uniform spatial distribution around each particle;
the
ratio is constant for all interactions;contact stiffness
is computed from using a formula of the form(17)¶where
is some constant depending on the algorithm in usefootnote{For example, in the concrete particle model (Ip2_CpmMat_CpmMat_CpmPhys), while in the classical DEM model (Ip2_FrictMat_FrictMat_FrictPhys) as implemented in Yade.} and is half-distance between spheres in contact, equal to for the case of interaction radius . If (and by consequence), all interactions will have the same stiffness . In other cases, we will consider as the average stiffness computed from average (see below).
As all particles have the same parameters, we drop the
We try to express the average per-particle stiffness from (16). It is a sum over all interactions where
Moreover, since all directions are equal, we can write the per-body stiffness as
and can put constant terms (everything) in front of the summation.
we substitute
The ratio of timestep
Actual values of this ratio depend on characteristics of packing
- Concrete particle model
computes contact stiffness from the equivalent area
first (6), is the initial contact length, which will be, for interaction radius (5) , in average larger than . For ,we can roughly estimate , gettingwhere
by comparison with (17).Interaction radius
leads to average interactions per sphere for dense packing of spheres with the same radius . is calibrated to match the desired macroscopic Poisson’s ratio .Finally, we obtain the ratio
showing significant overestimation by the p-wave algorithm.
- Non-cohesive dry friction model
is the basic model proposed by Cundall explained in Contact model (example). Supposing almost-constant sphere radius
and rather dense packing, each sphere will have interactions on average (that corresponds to maximally dense packing of spheres with a constant radius). If we use the Ip2_FrictMat_FrictMat_FrictPhys class, we have , as ; we again use (for lack of a more significant value). In this case, we obtain the resultwhich again overestimates the numerical critical timestep.
To conclude, p-wave timestep gives estimate proportional to the real
Non-elastic constraints¶
Let us note at this place that not only
Periodic boundary conditions¶
While most DEM simulations happen in
The initial
After the definition of the initial cell’s geometry,
Deformations handling¶
The deformation of the cell over time is defined via a tensor representing the gradient of an homogeneous velocity field
The velocity gradient is integrated automatically over time, and the cumulated transformation is reflected in the transformation matrix
Along with the automatic integration of cell transformation, there is an option to homothetically displace all particles so that
Collision detection in periodic cell¶
In usual implementations, particle positions are forced to be inside the cell by wrapping their positions if they get over the boundary (so that they appear on the other side). As we wanted to avoid abrupt changes of position (it would make particle’s velocity inconsistent with step displacement change), a different method was chosen.
Approximate collision detection¶
Pass 1 collision detection (based on sweep and prune algorithm, sect. Sweep and prune) operates on axis-aligned bounding boxes (Aabb) of particles. During the collision detection phase, bounds of all Aabb’s are wrapped inside the cell in the first step. At subsequent runs, every bound remembers by how many cells it was initially shifted from coordinate given by the Aabb and uses this offset repeatedly as it is being updated from Aabb during particle’s motion. Bounds are sorted using the periodic insertion sort algorithm (sect. Periodic insertion sort algorithm), which tracks periodic cell boundary
Upon inversion of two Aabb’s, their collision along all three axes is checked, wrapping real coordinates inside the cell for that purpose.
This algorithm detects collisions as if all particles were inside the cell but without the need of constructing “ghost particles” (to represent periodic image of a particle which enters the cell from the other side) or changing the particle’s positions.
It is required by the implementation (and partly by the algorithm itself) that particles do not span more than half of the current cell size along any axis; the reason is that otherwise two (or more) contacts between both particles could appear, on each side. Since Yade identifies contacts by Body.id of both bodies, they would not be distinguishable.
In presence of shear, the sweep-and-prune collider could not sort bounds independently along three axes: collision along

Constructing axis-aligned bounding box (Aabb) of a sphere in simulation space coordinates (without periodic cell – left) and transformed cell coordinates (right), where collision detection axes
The restriction of a single particle not spanning more than half of the transformed axis becomes stringent as Aabb is enlarged due to shear. Considering Aabb of a sphere with radius Collider::invalidatePersistentData
). Cell flipping is implemented in the Cell.flipCell function. Automatic flip can be enabled using Cell.flipFlippable.

Flipping cell (Cell.flipCell) to avoid infinite stretch of the bounding boxes’ spans with growing
This algorithm is implemented in InsertionSortCollider and is used whenever simulation is periodic (Omega.isPeriodic); individual BoundFunctor’s are responsible for computing sheared Aabb’s; currently it is implemented for spheres and facets (in Bo1_Sphere_Aabb and Bo1_Facet_Aabb respectively).
Exact collision detection¶
When the collider detects approximate contact (on the Aabb level) and the contact does not yet exist, it creates potential contact, which is subsequently checked by exact collision algorithms (depending on the combination of Shapes). Since particles can interact over many periodic cells (recall we never change their positions in simulation space), the collider embeds the relative cell coordinate of particles in the interaction itself (Interaction.cellDist) as an integer vector
By storing the integral offset
Periodic insertion sort algorithm¶
The extension of sweep and prune algorithm (described in Sweep and prune) to periodic boundary conditions is non-trivial. Its cornerstone is a periodic variant of the insertion sort algorithm, which involves keeping track of the “period” of each boundary; e.g. taking period
This algorithm was also extended to handle non-orthogonal periodic Cell boundaries by working in transformed rather than Cartesian coordinates; this modifies computation of Aabb from Cartesian coordinates in which bodies are positioned (treated in detail in Approximate collision detection).
The sort algorithm is tracking Aabb extrema along all axes. At the collider’s initialization, each value is assigned an integral period, i.e. its distance from the cell’s interior expressed in the cell’s dimension along its respective axis, and is wrapped to a value inside the cell. We put the period number in subscript.
Let us give an example of coordinate sequence along

with cell
Sorting starts from the first element in the cell, i.e. right of
( |
stop inverting if neighbors are ordered; |
( |
current element left of |
( |
current element right of |
( |
inversion across |
( |
if after ( |
In the first step, (

We move to next element

The next element is

We move (wrapping around) to

and so is the last element

Computational aspects¶
Cost¶
The DEM computation using an explicit integration scheme demands a relatively high number of steps during simulation, compared to implicit scehemes. The total computation time
linearly, the number of steps
, where is timestep safety factor; can be estimated by p-wave velocity using and (sect. Estimation of \Dtcr by wave propagation speed) as . Thereforethe number of particles
; for fixed value of simulated domain volume and particle radiuswhere
is packing porosity, roughly for dense irregular packings of spheres of similar radius.The dependency is not strictly linear (which would be the best case), as some algorithms do not scale linearly; a case in point is the sweep and prune collision detection algorithm introduced in sect. Sweep and prune, with scaling roughly
.The number of interactions scales with
, as long as packing characteristics are the same.the number of computational cores
; in the ideal case, the dependency would be inverse-linear were all algorithms parallelized (in Yade, collision detection is not).
Let us suppose linear scaling. Additionally, let us suppose that the material to be simulated (
This (rather trivial) result is essential to realize DEM scaling; if we want to have finer results, refining the “mesh” by halving
For very crude estimates, one can use a known simulation to obtain a machine “constant”
with the meaning of time per particle and per timestep (in the order of
Result indeterminism¶
It is naturally expected that running the same simulation several times will give exactly the same results: although the computation is done with finite precision, round-off errors would be deterministically the same at every run. While this is true for single-threaded computation where exact order of all operations is given by the simulation itself, it is not true anymore in multi-threaded computation which is described in detail in later sections.
The straight-forward manner of parallel processing in explicit DEM is given by the possibility of treating interactions in arbitrary order. Strain and stress is evaluated for each interaction independently, but forces from interactions have to be summed up. If summation order is also arbitrary (in Yade, forces are accumulated for each thread in the order interactions are processed, then summed together), then the results can be slightly different. For instance
(1/10.)+(1/13.)+(1/17.)=0.23574660633484162
(1/17.)+(1/13.)+(1/10.)=0.23574660633484165
As forces generated by interactions are assigned to bodies in quasi-random order, summary force