Overview

This free online book marks our commitment to make the theory and algorithms of physics-based simulations accessible to everyone.

Contributing

If you are interested in contributing to editing and improving this book, please do it through a Github pull request on the mdbook-src repository (not the HTML repository), or directly contact Minchen Li and Chenfanfu Jiang.

Depending on the nature of your contribution, you'll be listed as book co-authors or community contributors in future builds of the book.

Version 1.0.2 (Released 2025/7):

Co-authors

Minchen Li, Carnegie Mellon University
Chenfanfu Jiang, University of California, Los Angeles
Zhaofeng Luo, Carnegie Mellon University
Wenxin Du, University of California, Los Angeles
Chang Yu, University of California, Los Angeles
Žiga Kovačič, Cornell University
Tianyi Xie, University of California, Los Angeles

Community Contributors (Github)

Minchen Li (@liminchen), Chenfanfu Jiang (@cffjiang), Zhaoming Xie (@zxie-bdai), Yuqi Meng (@ARessegetesStery), Zhaofeng Luo (@Roushelfy), Xiang Chen (@xchen-cs), Gábor Szabó (@szabgab), Zhenyi Wang (@TempContainer), Wenxin Du (dwxrycb123), Chang Yu (@g1n0st), Tianyi Xie (@XPandora), Žiga Kovačič (@zzigak), Steven Xu (@RWBY-Aloupeep)

BibTeX

@book{li2025physics,
  title   = {Physics-Based Simulation},
  author  = {Minchen Li and Chenfanfu Jiang and Zhaofeng Luo and Wenxin Du and Chang Yu and {\v{Z}}iga Kova{\v{c}}i{\v{c}} and Tianyi Xie},
  year    = {2025},
  month   = jul,
  version = {1.0.2},
  url     = {https://phys-sim-book.github.io/}
}

Discrete Space and Time

In this lecture, we explore the simulation of deformable solids with the aim of developing a discrete, computationally solvable problem. The primary goal is to introduce the abstract algebraic concepts inherent in this problem. We approach elasticity simulation using a top-down architectural view, placing mathematical modeling at the forefront.

The study of classical elastic solids physics largely revolves around Partial Differential Equations (PDEs). In continuum mechanics and finite element analysis literature, the norm is to first derive the continuous form of these PDEs, elaborating on each term's origin, before adapting them to discrete programming languages. Often, this adaptation appears in later sections, creating a sense of anticipation for the reader.

This book, however, takes a different route. It weaves continuum mechanics and PDEs into the discussion as needed, evenly distributing these topics to avoid overwhelming the reader. This method links theory to practice incrementally, enhancing understanding.

We introduce the main problem formulation early, offering an overview of its numerical solutions. This gives readers an initial comprehensive view, sparking curiosity and motivating deeper exploration in later chapters. This strategy makes the learning process smoother and more intuitive, helping readers effortlessly connect complex concepts and quickly grasp the subject's core.

Our aim is to provide a well-rounded, thorough, and engaging exploration of deformable solids simulation, valuable for both students and seasoned researchers in the field.

Representations of a Solid Geometry

In everyday life, solid objects are perceived as continuous. Yet, in the digital world of computers, where we use discrete numbers for representation, a range of interesting methods arises.

One method is parametrization. Consider a 3D sphere, which can be described as ${x \in R^{3} ∣ ∥ x - c ∥ \leq r, c \in R^{3}, r > 0}$ , centered at point $ \mathbf{c} $ with radius $ r $. This approach extends beyond spheres to include shapes like half-spaces, boxes, ellipsoids, tori, and others, characterized by their interior using functions such as signed distances. However, parametrization faces challenges when handling complex geometries that are frequently encountered in real-world scenarios. An emerging exception to this limitation is the use of advanced neural representations employing neural networks. These newer methods show promise in effectively representing more intricate geometrical forms.

An alternative is representing with sampling. This involves choosing points on and inside the object. But points alone aren't enough; we typically need to establish connectivity between them to define the object’s boundaries for applications like rendering and 3D printing. Monitoring how a cluster of points shifts over time also helps in measuring deformation.

In continuum mechanics, an object is seen as having a continuous density field. Digitally, this continuity must be represented discretely, usually through defining the connectivity of the solid's geometry.

Remark 1.1.1 (Other Solid Representations). There are other methods for representing solid geometries, such as voxel-based approaches. These methods divide the space into a 3D grid of small boxes, or voxels, with each voxel representing a segment of the object, similar to pixels in a 2D image. Voxel-based methods are advantageous for several reasons. Firstly, they can act as a discrete level set representation, capable of modeling complex geometries and tracking their evolution over time. Each voxel contains information about its position relative to the object's surface, offering an efficient discrete approximation of the continuous level set function. This is beneficial for algorithms involved in surface evolution, shape optimization, and collision detection. Secondly, voxel-based approaches are conducive to Constructive Solid Geometry (CSG) operations. This technique in solid modeling uses Boolean operators to combine simpler shapes into complex 3D models. The voxelized framework allows for straightforward and efficient execution of operations like union, intersection, and difference on the voxel grid. This enables the easy creation and modification of intricate shapes.

Example 1.1.1 (Mesh). The method of creating a mesh by directly connecting points with edges or triangles is a popular technique in computational geometry. This concept is illustrated in the accompanying figure, where the left and middle images show two different meshes. Notably, even though these meshes utilize the same sampled points or nodes, they have distinct connectivities, resulting in different shapes. The rightmost mesh in the figure demonstrates a transformation from one shape to another. This mesh represents a deformation of the middle mesh, achieved by vertically compressing its upper half.
Figure 1.1.1. Mesh

Example 1.1.2 (Particle and Grid). By implementing a uniform grid structure in our spatial representation, we record the extent of solid matter at each node location. This allows us to use our sampled points to calculate the density of the solid at each grid node. This method is beneficial for quantifying the solid's distribution within the grid and for establishing a network of connectivity among the original sampled points. Refer to the accompanying figure for a visual demonstration of this concept. In the figure, the sampled points are depicted as green dots. The grid nodes, where we record solid densities, are shown as black circles. These nodes are connected through the grid, illustrated with blue lines.
Figure 1.1.2. Particle and grid

In the field of modern solid simulation, the described methods of defining connectivity are crucial. The first method, establishing connections through a mesh of edges or triangles, is foundational to Finite Element Method (FEM) simulators. The second approach, which involves using a uniform grid to compute solid density and establish connectivity, is integral to Material Point Method (MPM) simulators [Jiang et al. 2016]. This book largely concentrates on the former method, delving into the intricacies of FEM. The mesh-based structure of FEM is particularly effective in handling complex domains by breaking them down into simpler elements. This makes FEM an essential tool in the study and simulation of deformable solids, and understanding its nuances is vital for those engaged in this area of study.

At first glance, the use of two representations of solid geometry in the MPM might appear redundant. Yet, this dual approach gives MPM a significant edge, especially in simulating dynamic events like solid fractures. In such cases, FEM would necessitate meticulous modification of the edges and elements that define the original connectivity to accurately depict the damage. In contrast, MPM efficiently handles these scenarios. The uniform grid naturally accommodates the separation of body parts in a fracture, as the lack of material at fracture nodes leads to an automatic disconnection of adjacent grid nodes. This attribute allows MPM to excel in managing changes in solid topology.

However, when it comes to simulation accuracy control, the Finite Element Method (FEM) excels. FEM operates directly on the mesh, obviating the need for constant information transfer, thus ensuring greater precision. This level of accuracy makes FEM an invaluable resource in the precise simulation of deformable solids, which is the primary emphasis of this book.

The technique of consolidating coordinates of each sampled point into an extended vector, denoted as $ x\in\mathbb{R}^{dn} $ (refer to the figure below), provides an effective means to describe a specific geometric configuration, given a constant connectivity. In this representation, $d$ indicates the dimension of space (1, 2, or 3), and $n$ represents the total number of points. Similarly, attributes like velocity, acceleration, and forces at each sample point can be amalgamated into corresponding extended vectors, symbolized as $v$, $a$, and $f$ respectively. This organized approach to data presentation not only aids in comprehensively understanding the various parameters and their interrelations but also streamlines the mathematical formulation of the simulation process.

**Figure 1.1.3.** Stacked position vector

Newton's Second Law

Having defined a method for representing a solid geometry at a single instance in time, we now face the challenge of predicting the solid's motion and deformation over time. This prediction is a key component for accurate simulation.

Newton's second law, expressed as $\mathbf{f} = m \mathbf{a}$, indicates that forces $\mathbf{f}$ are the primary reasons for changes in velocity, as indicated by acceleration $\mathbf{a}$. It's important to understand that when a solid's displacement fields extend beyond simple translational or rotational movements, or a linear combination thereof, it indicates deformation. By applying Newton's second law to each sample point, we can effectively predict the movement and deformation of solids. This concept is concisely represented in vector form:

$\frac{d x}{d t} M \frac{d v}{d t} = v, = f . (1.2.1)$

In this representation, $M\in\mathbb{R}^{dn\times dn}$ is the mass matrix, and $x$, $v$, and $f$ are the column vectors stacking position, velocity, and force, respectively. This approach lays the groundwork for our simulations of deformable solids, integrating principles of motion in both discrete space and continuous time.

Remark 1.2.1 (Stacked Variables). Though the mass matrix $M$ isn't necessarily a diagonal matrix in theory, it's often simplified to one in practical applications. This results in a lumped mass matrix, representing a system of discrete point masses and offering an efficient way to handle complex systems. Consider a two-point system in two dimensions to illustrate this. The lumped mass matrix for such a system is represented as: \[ M = \begin{pmatrix} m_1 & & & \\ & m_1 & & \\ & & m_2 & \\ & & & m_2 \end{pmatrix}, \] Here, we assume vectors like ${v}$ (as well as ${x}$ and ${f}$) are stacked in a specific order: \[ v = (v_{11}, v_{12}, v_{21}, v_{22})^T, \] where $v_{i\alpha}$ denotes the $\alpha$th component of the velocity $\mathbf{v}_i$ for the $i$th point. Such an organized structure simplifies calculations significantly and enhances the understanding of the system's dynamics.

Time Integration

Newton's second law lays the foundation for a series of Ordinary Differential Equations (ODEs) expressed in their continuous forms. This is analogous to how we previously used sampled points in space to discretely represent continuous geometries. Now, we take a similar approach but in the realm of time. By sampling points in time, we can effectively represent time derivatives, such as $\frac{\mathbf{d} x}{\mathbf{d} t}$ and $\frac{\mathbf{d} v}{\mathbf{d} t}$.

Definition 1.3.1 (Time Integration). When discretizing time into fixed small intervals, we denote the time at the $n$-th step as $t^n$, commonly referred to as a timestep. The length of this interval, or timestep size, is given by $\Delta t = t^{n+1} - t^n$. The timestep count, $n$, is typically a whole number starting from zero, making $t^0=0 s$ the starting point of a simulation.

The concept of timesteps leads to the introduction of symbols $x^n$, $v^n$, and $f^n$ to represent the positions, velocities, and forces of nodes at the $n$-th timestep, respectively. The term timestepping, or time integration, refers to the process of calculating $x^{n+1}, v^{n+1}$ from $x^n, v^n$ at each incremental timestep $n=0,1,2,\dots$. For a visual demonstration, consider an Armadillo slingshot animation. Each frame in this animation is computed progressively from left to right, as illustrated in the figure below. In this context, timestepping mirrors a cinematic progression, revealing the evolving dynamics of a system in a step-by-step manner.

**Figure 1.3.1.** Armadillo slingshot frame by frame

In the context of this book and the simulation scenarios we examine, a crucial assumption must be emphasized: we always possess exact knowledge of the initial values $x^0$ and $v^0$ at the start of our simulation. Furthermore, for each timestep, we either have a method to calculate $f^n$ based on a physical model, or we have its precise value readily available, as with a constant force such as gravity. This assumption is fundamental to our approach, ensuring that simulations are grounded in known initial conditions and forces, thereby allowing for more accurate and reliable outcomes.

Explicit Time Integration

Explicit time integration schemes provide a direct method to calculate $x^{n+1},v^{n+1}$ by substituting known values into simple formulas, which is why these are called explicit. This section focuses on two basic explicit schemes: Forward Euler and Symplectic Euler methods.

Forward Euler

To convert our continuous-time system to a discrete form, we employ the forward difference approximation. In this approximation, the derivative $(\frac{\mathbf{d} x}{\mathbf{d} t})^n$ is estimated as $\frac{x^{n+1} - x^n}{\Delta t}$, and likewise, $(\frac{\mathbf{d} v}{\mathbf{d} t})^n$ as $\frac{v^{n+1} - v^n}{\Delta t}$. The superscript $n$ represents the state variables at the $n$th timestep. Consequently, the discrete version of our system is expressed as: $\frac{x ^{n + 1} - x ^{n}}{Δ t} M \frac{v ^{n + 1} - v ^{n}}{Δ t} = v^{n}, = f^{n} . (1.4.1)$ Assuming a constant mass over time, these equations provide a clear mechanism to update our state variables. Knowing the current values $x^n$, $v^n$, and $f^n$ at timestep $n$, we can directly determine their values at the next timestep, $n+1$.

Method 1.4.1 (Forward Euler Time Integration for Newton's Second Law). In the Forward Euler method, the state variables $x^{n+1}$ and $v^{n+1}$ at the next time step $n+1$ are calculated based on the current values $x^n$ and $v^n$. The update rules are given by: $x^{n + 1} v^{n + 1} = x^{n} + Δ t v^{n}, = v^{n} + Δ t M^{- 1} f^{n} . (1.4.2)$ Here, $\Delta t$ represents the time step size, $M$ is the mass matrix, and $f^n$ is the force at the current time step $n$.

The forward Euler method is considered unconditionally unstable, implying that irrespective of the chosen small time step $\Delta t$, the numerical solution will eventually grow significantly (explode) for equations with nonconstant $f$, while the exact solution remains unaffected (refer to Figure 1.4.1, left).

Symplectic Euler

If we put superscript $n+1$ on $v$ in the position derivative discretization while keeping the velocity derivative the same, we get a new update rule:

Method 1.4.2 (Symplectic Euler Time Integration for Newton's Second Law). Given the current state variables, the mass matrix, and the time step size from $t^n$ to $t^{n+1}$, $x^{n + 1} v^{n + 1} = x^{n} + Δ t v^{n + 1} = v^{n} + Δ t M^{- 1} f^{n}, (1.4.3)$ where $n=0,1,2,\dots$.

With a minor alteration, the integration becomes conditionally stable. This implies that if $\Delta t$ remains within a problem-specific limit, we can effectively confine the numerical error of the solution. Moreover, the Symplectic Euler method exhibits an appealing trait of system energy preservation, as demonstrated in the middle of the figure below.

**Figure 1.4.1 (Stability of Time Integrators).** The provided illustration showcases a particle executing constant circular motion, simulated using the forward Euler, Symplectic Euler, and implicit Euler methods, respectively from left to right. The varying colors within the illustration represent the progression of time. Notably, each method exhibits distinct characteristics in the simulation: the forward Euler simulation eventually undergoes an unstable escalation, the Symplectic Euler closely adheres to the theoretical trajectory, and the implicit Euler, while maintaining stability, gradually brings the motion to a halt.

Implicit Time Integration

In contrast to explicit time integration, implicit time integration requires solving a system of equations to determine the values of $x^{n+1}$ and $v^{n+1}$. A notable benefit of this approach is its potential for greatly improved stability. The simplest form of implicit integration, the backward Euler method, is introduced as follows.

Method 1.5.1 (Backward Euler Time Integration Application to Newton's Second Law). Given the current state variables, the mass matrix, and the time interval from $t^n$ to $t^{n+1}$, the update rules are as follows: $x^{n + 1} v^{n + 1} = x^{n} + Δ t v^{n + 1}, = v^{n} + Δ t M^{- 1} f^{n + 1}, (1.5.1)$ where $n$ ranges from $0,1,2,\dots$.

In many scenarios discussed in this book, the forces are derived from position vectors $x$. Thus, we can represent $f^{n+1} = f(x^{n+1})$. It's crucial to recognize that the update for $x^{n+1}$ depends on knowing $v^{n+1}$, yet the calculation of $v^{n+1}$ is contingent on $x^{n+1}$. This interdependence creates a cyclical dependency, necessitating the resolution of a system of equations to accurately find $x^{n+1}$ and $v^{n+1}$. By formulating $v^{n+1} = (x^{n+1} - x^n) / \Delta t$, Equation (1.5.1) can be rephrased as: $M (x^{n + 1} - (x^{n} + Δ t v^{n})) - Δ t^{2} f (x^{n + 1}) = 0. (1.5.2)$ Given that forces $f$ often exhibit nonlinearity with respect to positions $x$, Equation (1.5.2) generally becomes nonlinear, requiring the use of nonlinear root finding techniques like Newton's method for solution.

Method 1.5.2 (Newton's Method Applied to Backward Euler Time Integration). As described in the algorithm below, Newton's method is an iterative technique starting from an initial estimate $x^i$ of the solution. At the current iteration $x^i$, it linearly approximates $f(x^{n+1}) \approx f(x^i) + (x^{n+1}-x^i) \nabla f(x^i)$, then resolves a linear system and updates the iteration. This process is repeated until a satisfactory degree of convergence is reached.
Algorithm 1.5.1 (Newton's Method for Backward Euler Time Integration).

While the backward Euler method ensures unconditional stability even for large values of $\Delta t$, it's crucial to recognize that increasing $\Delta t$ may lead to poorer system conditioning. This complication can make solving the linear system more challenging. Additionally, it's important to remember that force linearization is an approximation. If the initial estimate for the solution is far from the actual solution, the standard iteration of Newton's method might not converge, and it could even diverge.

In later discussions, we will introduce a modified version of Newton's method. This adaptation is designed to guarantee convergence for specific types of problems, regardless of the initial estimate or the size of $\Delta t$.

Summary

Simulating solids involves predicting changes in their position and form over time. To achieve this on computers, both geometry and time must be represented discretely.

Geometries are typically represented using sample points interconnected in specific ways:

Finite Element Methods (FEM) connect sample points through unstructured meshes.
Material Point Methods (MPM) utilize uniform Cartesian grids to link sample points. FEM excels in delivering high-precision results, while MPM is advantageous for handling topological changes. This book primarily focuses on FEM.

Time is discretized into distinct moments, with finite difference methods applied to calculate temporal derivatives of physical quantities, in line with Newton's second law.

The Forward Euler method is generally avoided due to its unconditional instability. Conversely, the Symplectic Euler method is explicit and conditionally stable, often preferred for well-conditioned problems. For stiff problems, the Backward Euler method, unconditionally stable but requiring the resolution of nonlinear equation systems, is commonly used despite its computational intensity and potential for numerical instability.

In the next lecture, we will explore the optimization perspective of implicit time integration, offering robustness in solving these problems.

Optimization Framework

Optimization Time Integrator

With the backward Euler method, each timestep necessitates solving a nonlinear system of equations, as outlined in Equation (1.5.2). Effectively, this equates to addressing an optimization problem stated as: $x^{n + 1} where E (x) = arg x min E (x) = \frac{1}{2} ∥ x - \tilde{x}^{n} ∥_{M}^{2} + Δ t^{2} P (x) . (2.1.1)$ Here, $\tilde{x}^n = x^n + \Delta t v^n$, $\frac{1}{2} \|x - \tilde{x}^n\|^2_M = \frac{1}{2} (x - \tilde{x}^n)^T M (x - \tilde{x}^n)$ represents the inertia term, $P(x)$ stands for the potential energy for forces $f(x)$ with $\frac{\partial P}{\partial x}(x) = -f(x)$, and $E(x)$ is known as the Incremental Potential. At the local minimum of $E(x)$, $\frac{\partial E}{\partial x}(x^{n+1}) = 0$, corresponding to Equation (1.5.2).

Viewing time integration as an optimization problem enables us to utilize well-established optimization methods to robustly acquire the solutions. It also allows for a consistent framework for modeling more complex physical phenomena.

Definition 2.1.1 (Conservative Forces). Forces $f(x)$ for which a potential energy $P(x)$ exists such that $\frac{\partial P(x)}{\partial x} = -f(x)$, are termed conservative forces. Both common elasticity forces and body forces such as gravity are examples of conservative forces. They can be easily integrated into the optimization framework by adding the potential energy term into the Incremental Potential.

Remark 2.1.1 (The gravitational force). The gravitational force acting on an object of mass $m$ (represented by the force $F = -mg\mathbf{z}$) at a height $h$ above the Earth's surface, where $g$ is the acceleration due to gravity and $\mathbf{z}$ is the upward-pointing unit vector, corresponds to the gravitational potential energy $U = mgh$. Here, $U$ is the work done against gravity to move the object from a reference point (at $h = 0$) to height $h$. The force is the negative gradient of the energy with respect to the position (written mathematically as $F = -\nabla U$), which confirms the principle of conservation of energy. Taking the derivative of $U$ with respect to $h$, we obtain $\nabla U = mg\mathbf{z}$, and thus $F = -\nabla U = -mg\mathbf{z}$, which matches our starting expression for the force.

Remark 2.1.2 (Elasticity). Elasticity is the capacity of a solid object to maintain its resting shape in response to external forces. Under the influence of elasticity, the sample points on the same solid will be bound together during the simulation. A more rigid solid will have a stiffer elasticity energy, providing a larger elasticity force for the same degree of deformation, thereby aiding in the restoration of the resting shape. The Armadillo slingshot example (Figure 1.3.1) demonstrates typical elasticity effects. Elasticity is a common property across all solids, regardless of their geometric form, and whether they are intuitively rigid or non-rigid, e.g., metal, wood, soft tissue, rubber, cloth, hair, sand, etc.

Dirichlet Boundary Conditions

Potential energies aren't the only means of modeling physical phenomena; constraints are equally vital. Let's start by considering the simplest form, linear equality constraints. The constrained optimization problem is defined as follows: $x min E (x) s . t . A x = b, (2.2.1)$ Here, $A\in \mathbb{R}^{m\times dn}$ and $b\in \mathbb{R}^{m}$ represent $m$ linear equality constraints.

During simulations, it's often necessary to control the position of certain points on a solid at each timestep. This can involve fixing a set of nodes to model immovable objects like the ground or obstacles, or guiding the motion of solids by moving specific nodes along predetermined paths. For example, in the slingshot scenario (Figure 1.3.1), the Armadillo's feet and ears are stationary. This type of control is known as Dirichlet boundary conditions (BC). These conditions can be expressed as linear equality constraints within the optimization time integrator framework.

To put it into perspective, the matrix $A$ in Equation (2.2.1) would typically be an $m\times dn$ matrix (with $m\leq dn$ and $m \mod d = 0$), which selects the coordinates of the BC points. Correspondingly, $b$ would be an $m \times 1$ vector defining the prescribed locations. By solving the optimization problem, the chosen points are fixed at these specified locations, which can vary from one timestep to the next.

At the local minimum of the problem in Equation (2.2.1), the KKT condition $\nabla E (x) - A^{T} λ = 0 A x = b (2.2.2)$ is met, where $\lambda \in \mathbb{R}^{m}$ represents the Lagrange multiplier vector, comprising all the Lagrange multipliers.

Remark 2.2.1 (Solving KKT Systems). Solving nonlinear optimization problems with equality constraints is feasible by directly addressing the nonlinear KKT (Karush-Kuhn-Tucker) system, as seen in Equation (2.2.2). Methods like Newton's method are commonly employed for this purpose. However, this approach can be computationally intensive. For boundary conditions, the unique structure of the matrix $A$ can be leveraged, allowing us to resolve the constrained problem in an unconstrained manner. Techniques for this approach will be demonstrated in later lectures.

Contact

To accurately simulate solids, it's essential to ensure that they don't interpenetrate, as shown in the figure below (left side). One effective approach is to enforce the CFL (Courant-Friedrichs-Lewy condition) upper limit on timestep sizes, particularly in methods like MPM. In Finite Element Methods (FEM), this requires precise modeling of contact forces. However, accurately modeling contact poses a challenge. Contact is inherently a non-smooth process, happening abruptly as solids make contact. There isn't a potential energy formulation that can accurately depict this phenomenon.

In practical applications, determining if two objects have collided typically involves visually and mentally assessing their proximity. When the distance between them isn't zero, it indicates that space remains and no collision has occurred. This concept is crucial in modeling interactions between objects in a computational context.

To avoid collision or penetration, we can ensure that the distance between the surfaces of the moving objects never reduces to zero. This approach is particularly useful in time integration problems within computational simulations. We model this scenario using inequality constraints, which, when combined with boundary conditions, formulate our time integration problem as follows: $x min E (x) s . t . A x = b and \forall k, c_{k} (x) \geq ϵ . (2.3.1)$ Here, $c_k$ measures the distance between specific pairs of regions on the surface of the solids, and $\epsilon \rightarrow 0$ is a tiny positive value to ensure $c_k(x)$ remains strictly positive.

At the local minimum of the problem in Equation (2.3.1), we adhere to the Karush-Kuhn-Tucker (KKT) condition, as follows: $\nabla E (x) - A^{T} λ - k \sum γ_{k} \nabla c_{k} (x) = 0, A x = b, \forall k, c_{k} (x) - ϵ \geq 0, γ_{k} \geq 0, γ_{k} (c_{k} (x) - ϵ) = 0. (2.3.2)$ In this condition, $\gamma_k$ is the Lagrange multiplier for the constraint $c_k(x) \geq \epsilon$. To break it down, $\nabla c_k(x)$ points in the direction of the contact force for contacting pair $k$. The combination of this direction with the magnitude represented by $\gamma_k$ gives us the actual contact force at that point.

Remark 2.3.1 (The Complementarity Slackness Condition). The complementarity slackness condition $\gamma_k (c_k(x) - \epsilon) = 0$ plays a critical role in ensuring that contact forces are present ($\gamma_k \neq 0$) exclusively when the solids are in touch ($c_k(x) = \epsilon$). On the contrary, when the solids are not touching ($c_k(x) > \epsilon$), there should be no contact forces ($\gamma_k = 0$).

Definition 2.3.1 (Active Set). In optimization problems with inequality constraints defined as \[ \forall k, \ c_k(x) \geq 0, \] the active set is defined as \[ \{ l \ | \ c_l(x^*) = 0 \}. \] Here, $x^*$ is a local optimal solution of the problem.

Remark 2.3.2 (Combinatorial Difficulty). The complementarity slackness condition reveals that only constraints within the active set will exhibit non-zero Lagrange multiplier $\gamma_k$ at the solution. This suggests that, unlike equality constraints, inequality constraints not only require solving for the value of the Lagrange multipliers but also demand the identification of which $\gamma_k$ should be set to $0$. This presents a combinatorial difficulty.

A wide array of techniques are available for addressing optimization problems with inequality constraints. Each method introduces a distinct approach, effectively targeting various facets of the problem.

Primal-Dual Methods: This class of methods tackles both the primal problem (the original optimization problem) and its dual problem simultaneously. The dual problem often provides valuable insights into the primal problem's solution, making this approach attractive. These methods are iterative, refining an initial solution by leveraging the relationship between the primal and dual problems. However, designing and implementing primal-dual algorithms can be intricate, requiring a careful balance between the two problem types. While effective, these methods may not be efficient or straightforward for complex, high-dimensional problems.
Projected Steepest Descent Methods: A modification of the classic steepest descent method, these methods address constraints. At each iteration, the algorithm moves in the steepest descent direction, then projects back onto the feasible set if it deviates due to constraints. This method's simplicity and straightforwardness make it popular, but it may struggle with ill-conditioned problems where convergence is slow, or with constraints that are challenging to project onto.
Interior-Point Methods: Also known as barrier methods, these techniques introduce a barrier function that penalizes infeasible solutions, thereby steering the solution towards the feasible region's interior. This approach effectively transforms a constrained problem into an unconstrained one, solvable using conventional techniques. However, the barrier function's choice significantly impacts the method's performance. While efficient for certain problem types, these methods may falter with problems where the feasible region is difficult to define or lacks a simple interior.

While each of these methodologies has its own strengths and weaknesses, our primary focus will be on a robust and accurate contact modeling method, known as Incremental Potential Contact (IPC). IPC distinguishes itself by approximating the contact process with a smooth potential energy. This transformation effectively turns the problem into an unconstrained one, facilitating the application of various efficient and robust optimization techniques. A key feature of IPC is its capability to control the approximation error relative to the non-smooth formulation within a predetermined bound. This characteristic adds a layer of robustness and reliability to the method, making it an especially promising approach for the problem at hand.

Friction

Friction is a crucial element in physical interactions involving movement, often significantly influencing simulation outcomes. Thus, its precise modeling is vital for realistic and reliable simulations. See Figure 2.3.1 on the right for a demonstration of a scenario that requires a precise representation of friction.

One of the most widely adopted models for friction is the Coulomb Friction model. This model hinges on the Maximal Dissipation Principal (MDP), effectively capturing the nonsmooth transition between static and dynamic frictions. Static friction is the force preventing an object from initiating movement, whereas dynamic friction, or kinetic friction, opposes the motion of a moving object. The Coulomb Friction model accurately depicts the critical transition between these two friction types.

In the standard Material Point Method (MPM), friction is inherently modeled by the grid. However, this method has its drawbacks, notably an uncontrollable and unrealistically large friction coefficient.

For the Finite Element Method (FEM), friction can be more realistically and controllably represented through an approximated potential energy in the Incremental Potential Contact (IPC) model. This fits well within our optimization time integration framework. By using potential energy to approximate friction, we not only maintain the robustness of the simulation but also gain control over the accuracy of the friction model.

In subsequent lectures, we will delve into the specific techniques and methodologies employed in the IPC model to represent friction forces and their role in enhancing the accuracy and realism of simulations.

Summary

The objective of our discussions so far has been to devise a reliable solution for the unconditional stable implicit time integration problem. We aimed to address the issue of non-convergent solutions arising from truncation errors. We tackled this by reformulating the time integration problem as a minimization problem. This formulation not only allowed us to apply well-established optimization techniques, but it also facilitated a consistent modeling framework for different physical phenomena.

Here is a quick summary of the techniques used for modeling various phenomena within this framework:

For conservative forces like gravity and elasticity, we used potential energies. These were integrated into the objective function to create an accurate representation of the forces involved.
Boundary conditions, which specify the constraints on the system, were modeled using simple linear equality constraints. This helped us restrict the system to feasible states while performing the simulation.
To prevent interpenetration between solid objects during the simulation, we used inequality constraints to model contact and friction. These constraints ensured that objects maintained their physical integrity and behaved as expected when they came in contact with each other.

An important aspect to note here is that, we can utilize the unique structure of the boundary conditions to enforce the equality constraints in an unconstrained way. This will lead to a significant reduction in computational complexity.

Moreover, we introduced the concept of the Incremental Potential Contact (IPC) method. The IPC method models contact and friction as smooth potential energies with a controllable level of accuracy. This ensures a robust and accurate simulation of solid objects, free from interpenetration.

Moving forward, in the next lecture, we will delve into the projected Newton method for solving unconstrained optimization problems. This method offers the advantage of global convergence, meaning that the method is guaranteed to converge regardless of the initial configuration, provided it is feasible. This feature is highly desirable for complex simulations and it helps make the method more robust and reliable.

Projected Newton

Convergence Issue of Newton's Method

In addressing the minimization problem presented by implicit Euler time integration (referenced in Equation (2.1.1)), employing Newton's method (outlined in Algorithm 1.5.1) is a viable strategy for the resultant system of nonlinear equations. This involves setting the gradient of the Incremental Potential Energy to zero:

$\nabla E (x) = 0.$

However, the application of this method to cases such as nonlinear elasticity, particularly in the Neo-Hookean model, does not always guarantee convergence. The presence of truncation errors, especially in scenarios involving large time steps or significant deformations, can adversely affect the convergence process.

Example 3.1.1 (Illustration of Newton's Convergence Issue). To elucidate the issue of Newton's method non-convergence, let's consider a one-dimensional minimization problem characterized by the objective function: $f (x) = ln (e^{- x} + e^{x}) .$ We can evaluate the function at $x = 2$ and approximate it using a quadratic energy $g (x)$ , which is defined as: $g (x) = f (2) + f^{'} (2) (x - 2) + \frac{1}{2} f^{''} (2) (x - 2)^{2} .$ The joint plot of $f (x)$ and $g (x)$ (Figure 3.1.1) distinctly exhibits that the next iteration would exceed the actual target, landing at a point ( $x = - 11.645$ ) further from the actual solution at $x = 0$ . The subsequent iterations amplify this deviation, leading to a trajectory that diverges. It's worth noting that this demonstration involves a convex function $f (x) = ln (e^{- x} + e^{x})$ . The problem can become even more complex when Newton's method is applied to non-convex elasticity energies.

Figure 3.1.1. An iteration of Newton's method for $min_{x} E (x) = ln (e^{- x} + e^{x})$ at $x = 2$ .

Remark 3.1.1 (Convexity of Energies). Convex functions are characterized by symmetric and positive-definite (SPD) second-order derivatives throughout their domain. Conversely, the energy in most models of nonlinear elasticity used in computer graphics is rotation invariant. This implies that the energy value remains unchanged regardless of the rotational orientation of objects or elements. Such rotation invariance leads to non-convexity, making the optimization process more complex.

Definition 3.1.1 (Symmetric Positive-Definiteness). A square matrix $A \in R^{n \times n}$ is symmetric positive-definite if

$A = A^{T}$ , and

$v^{T} A v > 0$ for all $v \in R^{n}, v \neq = 0$ .

Unlike directly solving nonlinear equations, a minimization problem provides an energy measure that enables the assurance of global convergence using a technique called line search.

Line Search

In iterative minimization methods, line search is a technique used to select a fraction of the step in each iteration, ensuring the objective energy decreases at the new point.

Specifically, for Newton's method, line 4 in Algorithm 1.5.1 is modified from $x^i \leftarrow x$ to $x^i \leftarrow x^i + \alpha (x - x^i)$, where $\alpha \in (0,1]$ is the step size, essential for the reduction of energy. This leads to two critical questions: Does such an $\alpha$ always exist? And how is $\alpha$ calculated?

Remark 3.2.1 (Existence of $\alpha$). For a smooth objective energy $E(x)$ at $x^i$ where $\nabla E(x^i) \neq 0$, if a search direction $p=x-x^i$ is descent, namely $p^T \nabla E(x^i) < 0$, then there exists $\alpha > 0$ such that $E(x^i + \alpha p) < E(x^i)$.

Method 3.2.1 (Backtracking Line Search). Given a descent direction, we can find a reasonably large $\alpha$ by simply halving it starting from $1$ until the energy at the new location is smaller than the current (see Algorithm 3.2.1).
Algorithm 3.2.1 (The Backtracking Line Search Algorithm).

Remark 3.2.2 (Other Line Search Methods). There are other line search methods that attempt to apply polynomial interpolations to find an $\alpha$ such that the energy at the new location is closer to a local minimum on the line segment $x^i + s p$, ($s\in(0,1]$). However, these methods generally incur higher computational costs and may not necessarily enhance the overall wall-clock timing of the optimization.

Now, with line search, if Newton's method consistently generates a descent search direction, then the method is guaranteed to converge for any initial configuration on any smooth energy with a lower bound. We know that in iteration $i$, $p = -(\nabla^2 E(x^i))^{-1} \nabla E(x^i)$, so $p^T \nabla E(x^i)$ equals $-\nabla E(x^i)^T (\nabla^2 E(x^i))^{-T} \nabla E(x^i)$. For convex energies, $\nabla^2 E(x^i)$ is always Symmetric Positive Definite (SPD), and so is $(\nabla^2 E(x^i))^{-T}$, making $p$ always a descent direction. However, for non-convex energies, this assurance does not always hold. One approach to address this issue is to approximate the energies locally using convex energy proxies.

Gradient-based Optimization

The search direction of the standard Newton's method is calculated by minimizing the local quadratic approximation of the objective energy: $p = arg Δ x min (E (x^{i}) + Δ x^{T} \nabla E (x^{i}) + \frac{1}{2} Δ x^{T} P Δ x) (3.3.1)$ where $P = \nabla^2 E(x^i)$. In general gradient-based optimization methods, $p$ can be calculated by Equation (3.3.1) with any proxy matrix $P$. Setting $P = I$ results in $p = -\nabla E(x^i)$, as used in the standard gradient descent method. This approach converges more slowly than Newton's method, as the energy approximation is of a lower order. The closer the proxy matrix $P$ is to the Hessian matrix $\nabla^2 E(x^i)$, the faster the convergence. Hence, using an SPD approximation of the Hessian matrix as the proxy ensures that the search direction is always descent, while maintaining a convergence rate close to quadratic. This is akin to approximating non-convex energies locally using a convex energy proxy.

A straightforward method to obtain such an SPD approximation involves first projecting $\nabla^2 E(x^i)$ onto its closest semi-definite matrix by solving $P min ∥ P - \nabla^{2} E (x^{i}) ∥_{F} s . t . v^{T} P v \geq 0 \forall v \neq = 0,$ and then introducing perturbations to ensure that $P$ is invertible. The solution in this case is $P = Q \hat{\Lambda} Q^{-1}$, where $P = Q \Lambda Q^{-1}$ is the eigendecomposition, and $\hat{Λ}_{ij} = Λ_{ij}$ if $\Lambda_{ij} > 0$, otherwise $\hat{\Lambda}_{ij} = 0$. Intuitively, $P$ is obtained by zeroing out all the negative eigenvalues of $\nabla^2 E(x^i)$.

Definition 3.3.1 (Eigendecomposition). The eigendecomposition of a square matrix $A \in \mathbb{R}^{n \times n}$ is $A = Q Λ Q^{- 1}$ where $Q = [q_1, q_2, ..., q_n]$ is composed of the eigenvectors $q_i$ of $A$, $∥ q_{i} ∥ = 1$ ; $\Lambda = [\lambda_1, \lambda_2, ..., \lambda_n]$, with $\lambda_1 \geq \lambda_2 \geq ..., \lambda_n$ being the eigenvalues of $A$; and $Aq_i = \lambda_i q_i$.

However, in simulation, $\nabla^2 E(x^i)$ is usually a large sparse matrix, and performing eigendecomposition on it would be prohibitively expensive. Fortunately, we will discover later in this book that the Incremental Potential in solids simulation can be expressed as a separable sum of energies defined on local stencils, such as a triangle in the 2D Finite Element Method (FEM) mesh: $E (x) = j \sum E_{j} (x_{j 1}, x_{j 2}, ...),$ where $\mathbf{x}_{jk}$ are the nodes associated with the energy $E_j$. Consequently, we can conveniently obtain a reasonably good SPD approximation by zeroing out the negative eigenvalues of each $\nabla^2 E_i$ defined on a small number of nodes and aggregating them.

Example 3.3.1 (Local Projection Method). To simulate elasticity in 2D on a triangle mesh with 10,201 nodes and 20,000 triangles, the Hessian matrix $\nabla^2 E(x)$ is $20,402 \times 20,402$. For the local projection method described above, it requires 20,000 eigendecompositions on $6 \times 6$ matrices. Considering the computational complexity of eigendecomposition on an $n \times n$ matrix is worse than $O(n^2)$, this rough estimation already suggests a more than $500\times$ speedup for this medium-sized problem when employing the local projection methods.

In addition, since the mass matrix in $\nabla^2 E(x^i)$ is Symmetric Positive Definite (SPD) and the sum of SPD matrices remains SPD, there is no need for perturbations when projecting other matrices. We now summarize the globally convergent projected Newton method for backward Euler time integration in Algorithm 3.3.1.

Algorithm 3.3.1 (Projected Newton Method for Backward Euler Time Integration).

Remark 3.3.1 (Stopping Criteria). From Equation (3.3.1), we understand that $∥ p ∥$ can be interpreted as a quadratic approximation of the distance from the current estimate $x^i$ to the optimal solution. Hence, we utilize $∥ p ∥_{\infty} /Δ t$ as a more intuitive measure for the stopping criteria. This approach transforms it into a velocity unit and takes the maximum magnitude across every node.

Summary

After examining the convergence issues of traditional Newton's method, even on smooth convex energies, we introduced a backtracking line search scheme for minimizing the Incremental Potential of Implicit Euler time integration, ensuring global convergence.
To guarantee the discovery of a positive step size, the Incremental Potential Hessian is projected onto a nearby Symmetric Positive Definite (SPD) matrix. This SPD projection is efficiently achieved by eliminating the negative eigenvalues of the Hessian matrices for each non-convex energy stencil, involving only a few nodes.
A convergence criterion that provides a more intuitive and consistent method for setting tolerance is also introduced, utilizing the Newton search direction.

In the next lecture, we will conclude with a clear demonstration of all the covered topics through a simple 2D case study.

Case Study: 2D Mass Spring*

Up to now, we have completed a high-level introduction to the optimization-based solids simulation framework. In this lecture, we elaborate on how to implement a simple 2D elastodynamics simulator with Python3 (CPU) and MUDA (GPU).

Sections in this book with Python CPU and MUDA GPU implementations will be marked with a * right after the title. All the Python and MUDA implementations can be found at https://github.com/phys-sim-book/solid-sim-tutorial and https://github.com/phys-sim-book/solid-sim-tutorial-gpu, respectively. The excutable project for this section is in the /1_mass_spring folder of these repositories.

Spatial and Temporal Discretizations

In representing solid geometries, we employ a mesh structure. We can further simplify the representation by connecting nodes on the mesh with edges. To facilitate this process, especially for geometries like squares, we can script a mesh generator. This generator allows for specifying both the side length of the square and the desired resolution of the mesh.

Implementation 4.1.1 (Square Mesh Generation, square_mesh.py).

import numpy as np
import os

def generate(side_length, n_seg):
    # sample nodes uniformly on a square
    x = np.array([[0.0, 0.0]] * ((n_seg + 1) ** 2))
    step = side_length / n_seg
    for i in range(0, n_seg + 1):
        for j in range(0, n_seg + 1):
            x[i * (n_seg + 1) + j] = [-side_length / 2 + i * step, -side_length / 2 + j * step]
    
    # connect the nodes with edges
    e = []
    # horizontal edges
    for i in range(0, n_seg):
        for j in range(0, n_seg + 1):
            e.append([i * (n_seg + 1) + j, (i + 1) * (n_seg + 1) + j])
    # vertical edges
    for i in range(0, n_seg + 1):
        for j in range(0, n_seg):
            e.append([i * (n_seg + 1) + j, i * (n_seg + 1) + j + 1])
    # diagonals
    for i in range(0, n_seg):
        for j in range(0, n_seg):
            e.append([i * (n_seg + 1) + j, (i + 1) * (n_seg + 1) + j + 1])
            e.append([(i + 1) * (n_seg + 1) + j, i * (n_seg + 1) + j + 1])

    return [x, e]

In the code, n_seg represents the number of edges along each side of the square. The nodes are uniformly distributed across the square and interconnected through horizontal, vertical, and diagonal edges. For instance, calling generate(1.0, 4) constructs a mesh as depicted in Figure 4.1.1. This implementation utilizes the array data structures from the Numpy library, which provides convenient methods for handling the vector algebra required in subsequent steps.

**Figure 4.1.1.** A $4 \times 4$ square mesh generated by calling `generate(1.0, 4)` defined in Square Mesh Generation script above.

For temporal discretization, our approach is the implicit Euler method. The Incremental Potential, which needs to be minimized in time step $n$, is represented as follows: $E (x) = \frac{1}{2} ∥ x - (x^{n} + h v^{n}) ∥_{M}^{2} + h^{2} P (x) . (4.1.1)$ Next, our focus shifts to implementing the calculations for the energy value, gradient, and Hessian for both the inertia term and the potential energy $P(x)$.

Inertia Term

For the inertia term, with $\tilde{x}^n = x^n + h v^n$, we have \[ E_I(x) = \frac{1}{2}\|x - \tilde{x}^n \|_M^2, \quad \nabla E_I(x) = M(x - \tilde{x}^n), \quad \text{and} \quad \nabla^2 E_I(x) = M, \] which is straightforward to implement:

Implementation 4.2.1 (InertiaEnergy.py).

import numpy as np

def val(x, x_tilde, m):
    sum = 0.0
    for i in range(0, len(x)):
        diff = x[i] - x_tilde[i]
        sum += 0.5 * m[i] * diff.dot(diff)
    return sum

def grad(x, x_tilde, m):
    g = np.array([[0.0, 0.0]] * len(x))
    for i in range(0, len(x)):
        g[i] = m[i] * (x[i] - x_tilde[i])
    return g

def hess(x, x_tilde, m):
    IJV = [[0] * (len(x) * 2), [0] * (len(x) * 2), np.array([0.0] * (len(x) * 2))]
    for i in range(0, len(x)):
        for d in range(0, 2):
            IJV[0][i * 2 + d] = i * 2 + d
            IJV[1][i * 2 + d] = i * 2 + d
            IJV[2][i * 2 + d] = m[i]
    return IJV

The functions val(), grad(), and hess() are designed to compute different components of the inertia term. Specifically:

val(): Computes the value of the inertia term.
grad(): Calculates the gradient of the inertia term.
hess(): Determines the Hessian of the inertia term.

Regarding the Hessian matrix, a memory-efficient approach is employed. Rather than allocating a large two-dimensional array to store all entries of the Hessian matrix, only the nonzero entries are kept. This is achieved using the IJV structure, which consists of three lists:

Row Index: Identifies the row position of each nonzero entry.
Column Index: Indicates the column position of each nonzero entry.
Value: The actual nonzero value at the specified row and column.

This method significantly reduces memory usage and computational costs associated with downstream processing.

Mass-Spring Potential Energy

In this case study, we focus exclusively on incorporating the mass-spring elasticity potential into our system. The concept of mass-spring elasticity is akin to treating each edge of the mesh as if it were a spring. This approach is inspired by Hooke's Law, allowing us to formulate the potential energy on edge $e$ as follows:

$P_{e} (x) = l^{2} \frac{1}{2} k (\frac{∥ x _{1} - x _{2} ∥ ^{2}}{l ^{2}} - 1)^{2}, (4.3.1)$

Here, $x_{1}$ and $x_{2}$ represent the current positions of the two endpoints of the edge. The variable $l$ denotes the original length of the edge, and $k$ is a parameter controlling the spring's stiffness. Notably, when the distance between the two endpoints $∥ x_{1} - x_{2} ∥$ equals the original length $l$ , the potential energy $P_{e} (x)$ attains its global minimum value of $0$ , indicating no force is exerted.

An important aspect of this formulation is the inclusion of $l^{2}$ at the beginning. This is analogous to integrating the spring energy across the solid and choosing edges as quadrature points. This integration helps maintain a consistent relationship between the stiffness behavior and the parameter $k$ , regardless of mesh resolution variations.

Another deviation from standard spring energy formulations is our avoidance of the square root operation. We directly use $∥ x_{1} - x_{2} ∥^{2}$ , making our model polynomial in nature. This simplification yields more streamlined expressions for the gradient and Hessian:

$\frac{\partial P _{e}}{\partial x _{1}} (x) = - \frac{\partial P _{e}}{\partial x _{2}} (x) = 2 k (\frac{∥ x _{1} - x _{2} ∥ ^{2}}{l ^{2}} - 1) (x_{1} - x_{2}),$

$\frac{\partial ^{2} P _{e}}{\partial x _{1}^{2}} (x) = \frac{\partial ^{2} P _{e}}{\partial x _{2}^{2}} (x) = - \frac{\partial ^{2} P _{e}}{\partial x _{1} x _{2}} (x) = - \frac{\partial ^{2} P _{e}}{\partial x _{2} x _{1}} (x) = \frac{4 k}{l ^{2}} (x_{1} - x_{2}) (x_{1} - x_{2})^{T} + 2 k (\frac{∥ x _{1} - x _{2} ∥ ^{2}}{l ^{2}} - 1) I = \frac{2 k}{l ^{2}} (2 (x_{1} - x_{2}) (x_{1} - x_{2})^{T} + (∥ x_{1} - x_{2} ∥^{2} - l^{2}) I) .$

The total potential energy of the system, denoted as $P (x)$ , can be derived by summing the potential energy across all edges. This is calculated using Equation (4.3.1). Thus, the total potential energy is expressed as: $P (x) = e \sum P_{e} (x)$ where the summation is taken over all edges in the mesh.

Implementation 4.3.1 (MassSpringEnergy.py).

import numpy as np
import utils

def val(x, e, l2, k):
    sum = 0.0
    for i in range(0, len(e)):
        diff = x[e[i][0]] - x[e[i][1]]
        sum += l2[i] * 0.5 * k[i] * (diff.dot(diff) / l2[i] - 1) ** 2
    return sum

def grad(x, e, l2, k):
    g = np.array([[0.0, 0.0]] * len(x))
    for i in range(0, len(e)):
        diff = x[e[i][0]] - x[e[i][1]]
        g_diff = 2 * k[i] * (diff.dot(diff) / l2[i] - 1) * diff
        g[e[i][0]] += g_diff
        g[e[i][1]] -= g_diff
    return g

def hess(x, e, l2, k):
    IJV = [[0] * (len(e) * 16), [0] * (len(e) * 16), np.array([0.0] * (len(e) * 16))]
    for i in range(0, len(e)):
        diff = x[e[i][0]] - x[e[i][1]]
        H_diff = 2 * k[i] / l2[i] * (2 * np.outer(diff, diff) + (diff.dot(diff) - l2[i]) * np.identity(2))
        H_local = utils.make_PSD(np.block([[H_diff, -H_diff], [-H_diff, H_diff]]))
        # add to global matrix
        for nI in range(0, 2):
            for nJ in range(0, 2):
                indStart = i * 16 + (nI * 2 + nJ) * 4
                for r in range(0, 2):
                    for c in range(0, 2):
                        IJV[0][indStart + r * 2 + c] = e[i][nI] * 2 + r
                        IJV[1][indStart + r * 2 + c] = e[i][nJ] * 2 + c
                        IJV[2][indStart + r * 2 + c] = H_local[nI * 2 + r, nJ * 2 + c]
    return IJV

In dealing with the Hessian matrix of the mass-spring energy, a key consideration is its non-symmetric positive definite (SPD) nature. To address this, a specific modification is employed: we neutralize the negative eigenvalues of the local Hessian corresponding to each edge. This is done prior to incorporating these local Hessians into the global matrix. The process involves setting negative eigenvalues to zero, thus ensuring that the resulting global Hessian matrix adheres more closely to the desired SPD properties. This modification is crucial for Newton's method.

Implementation 4.3.2 (Positive Semi-Definite Projection).

import numpy as np
import numpy.linalg as LA

def make_PSD(hess):
    [lam, V] = LA.eigh(hess)    # Eigen decomposition on symmetric matrix
    # set all negative Eigenvalues to 0
    for i in range(0, len(lam)):
        lam[i] = max(0, lam[i])
    return np.matmul(np.matmul(V, np.diag(lam)), np.transpose(V))

Optimization Time Integrator

Having established the capability to evaluate the Incremental Potential for arbitrary configurations, we now turn our attention to the implementation of the optimization time integrator. This integrator is crucial for minimizing the Incremental Potential, which in turn updates the nodal positions and velocities. This implementation follows the approach outlined in Algorithm 3.3.1:

Implementation 4.4.1 (time_integrator.py).

import copy
from cmath import inf

import numpy as np
import numpy.linalg as LA
import scipy.sparse as sparse
from scipy.sparse.linalg import spsolve

import InertiaEnergy
import MassSpringEnergy

def step_forward(x, e, v, m, l2, k, h, tol):
    x_tilde = x + v * h     # implicit Euler predictive position
    x_n = copy.deepcopy(x)

    # Newton loop
    iter = 0
    E_last = IP_val(x, e, x_tilde, m, l2, k, h)
    p = search_dir(x, e, x_tilde, m, l2, k, h)
    while LA.norm(p, inf) / h > tol:
        print('Iteration', iter, ':')
        print('residual =', LA.norm(p, inf) / h)

        # line search
        alpha = 1
        while IP_val(x + alpha * p, e, x_tilde, m, l2, k, h) > E_last:
            alpha /= 2
        print('step size =', alpha)

        x += alpha * p
        E_last = IP_val(x, e, x_tilde, m, l2, k, h)
        p = search_dir(x, e, x_tilde, m, l2, k, h)
        iter += 1

    v = (x - x_n) / h   # implicit Euler velocity update
    return [x, v]

def IP_val(x, e, x_tilde, m, l2, k, h):
    return InertiaEnergy.val(x, x_tilde, m) + h * h * MassSpringEnergy.val(x, e, l2, k)     # implicit Euler

def IP_grad(x, e, x_tilde, m, l2, k, h):
    return InertiaEnergy.grad(x, x_tilde, m) + h * h * MassSpringEnergy.grad(x, e, l2, k)   # implicit Euler

def IP_hess(x, e, x_tilde, m, l2, k, h):
    IJV_In = InertiaEnergy.hess(x, x_tilde, m)
    IJV_MS = MassSpringEnergy.hess(x, e, l2, k)
    IJV_MS[2] *= h * h    # implicit Euler
    IJV = np.append(IJV_In, IJV_MS, axis=1)
    H = sparse.coo_matrix((IJV[2], (IJV[0], IJV[1])), shape=(len(x) * 2, len(x) * 2)).tocsr()
    return H

def search_dir(x, e, x_tilde, m, l2, k, h):
    projected_hess = IP_hess(x, e, x_tilde, m, l2, k, h)
    reshaped_grad = IP_grad(x, e, x_tilde, m, l2, k, h).reshape(len(x) * 2, 1)
    return spsolve(projected_hess, -reshaped_grad).reshape(len(x), 2)

Here step_forward() is essentially a direct translation of the projected Newton method with line search (Algorithm 3.3.1), and we implemented the Incremental Potential value (IP_val()), gradient (IP_grad()), and Hessian (IP_hess()) evaluations as separate functions for clarity.

For the computation of search directions, we utilize the linear solver from the Scipy library, which is adept at handling sparse matrices. Notably, this solver accepts matrices in the Compressed Sparse Row (CSR) format. The choice of this format and solver is driven by their efficiency in processing and memory usage, which is particularly advantageous when dealing with large-scale problems with large sparse matricies often encountered in computational simulations.

Simulator with Visualization

Having gathered all necessary elements for our 2D mass-spring simulator, the next step is to implement the simulator. This implementation will operate in a step-by-step manner and include visualization capabilities to enhance understanding and engagement.

Implementation 4.5.1 (simulator.py).

# Mass-Spring Solids Simulation

import numpy as np  # numpy for linear algebra
import pygame       # pygame for visualization
pygame.init()

import square_mesh   # square mesh
import time_integrator

# simulation setup
side_len = 1
rho = 1000  # density of square
k = 1e5     # spring stiffness
initial_stretch = 1.4
n_seg = 4   # num of segments per side of the square
h = 0.004   # time step size in s

# initialize simulation
[x, e] = square_mesh.generate(side_len, n_seg)  # node positions and edge node indices
v = np.array([[0.0, 0.0]] * len(x))             # velocity
m = [rho * side_len * side_len / ((n_seg + 1) * (n_seg + 1))] * len(x)  # calculate node mass evenly
# rest length squared
l2 = []
for i in range(0, len(e)):
    diff = x[e[i][0]] - x[e[i][1]]
    l2.append(diff.dot(diff))
k = [k] * len(e)    # spring stiffness
# apply initial stretch horizontally
for i in range(0, len(x)):
    x[i][0] *= initial_stretch

# simulation with visualization
resolution = np.array([900, 900])
offset = resolution / 2
scale = 200
def screen_projection(x):
    return [offset[0] + scale * x[0], resolution[1] - (offset[1] + scale * x[1])]

time_step = 0
square_mesh.write_to_file(time_step, x, n_seg)
screen = pygame.display.set_mode(resolution)
running = True
while running:
    # run until the user asks to quit
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
    
    print('### Time step', time_step, '###')

    # fill the background and draw the square
    screen.fill((255, 255, 255))
    for eI in e:
        pygame.draw.aaline(screen, (0, 0, 255), screen_projection(x[eI[0]]), screen_projection(x[eI[1]]))
    for xI in x:
        pygame.draw.circle(screen, (0, 0, 255), screen_projection(xI), 0.1 * side_len / n_seg * scale)

    pygame.display.flip()   # flip the display

    # step forward simulation and wait for screen refresh
    [x, v] = time_integrator.step_forward(x, e, v, m, l2, k, h, 1e-2)
    time_step += 1
    pygame.time.wait(int(h * 1000))
    square_mesh.write_to_file(time_step, x, n_seg)

pygame.quit()

For 2D visualization in our simulator, we utilize the Pygame library. The simulation is initiated with a scene featuring a single square, which is initially elongated horizontally. During the simulation, the square begins to revert to its original horizontal dimensions. Subsequently, due to inertia, it will start to stretch vertically, oscillating back and forth until it eventually stabilizes at its rest shape, as illustrated in (Figure 4.5.1).

**Figure 4.5.1.** From left to right: initial, intermediate, and final static frame of the initially stretched square simulation.

In addition to storing node positions x and edges e, our simulation also requires allocating memory for several other key variables:

Node Velocities (v): To track the movement of each node over time.
Masses (m): Node masses are calculated by uniformly distributing the total mass of the square across each node. This is a preliminary approach; more detailed methods for calculating nodal mass in Finite Element Method (FEM) or Material Point Method (MPM) will be explored in future chapters.
Squared Rest Length of Edges (l2): Important for calculating the potential energy in the mass-spring system.
Spring Stiffnesses (k): A crucial parameter influencing the dynamics of the springs.

For visualization purposes beyond our simulator, we enable the export of the mesh data into .obj files. This is achieved by calling the write_to_file() function at the start and at each frame of the simulation. This feature facilitates the use of alternative visualization software to analyze and present the simulation results.

Implementation 4.5.2 (Output Square Mesh, square_mesh.py).

def write_to_file(frameNum, x, n_seg):
    # Check if 'output' directory exists; if not, create it
    if not os.path.exists('output'):
        os.makedirs('output')

    # create obj file
    filename = f"output/{frameNum}.obj"
    with open(filename, 'w') as f:
        # write vertex coordinates
        for row in x:
            f.write(f"v {float(row[0]):.6f} {float(row[1]):.6f} 0.0\n") 
        # write vertex indices for each triangle
        for i in range(0, n_seg):
            for j in range(0, n_seg):
                #NOTE: each cell is exported as 2 triangles for rendering
                f.write(f"f {i * (n_seg+1) + j + 1} {(i+1) * (n_seg+1) + j + 1} {(i+1) * (n_seg+1) + j+1 + 1}\n")
                f.write(f"f {i * (n_seg+1) + j + 1} {(i+1) * (n_seg+1) + j+1 + 1} {i * (n_seg+1) + j+1 + 1}\n")

With all components properly set up, the next phase involves initiating the simulation loop. This loop advances the time integration and visualizes the results at each time step. To execute the simulation program, the following command is used in the terminal:

python3 simulator.py

Remark 4.5.1 (Practical Considerations). With our simulator implementation in place, it provides us with the flexibility to experiment with various configurations of the optimization time integration scheme. Such testing is invaluable for gaining deeper insights into the roles and impacts of each essential component.

Consider an example: if we opt not to project the mass-spring Hessian to a Symmetric Positive Definite (SPD) form, peculiar behaviors may emerge under certain conditions. For instance, running the simulation with a frame-rate time step size of h=0.02 and an initial_stretch of 0.5 could lead to line search failures. This, in turn, results in very small step sizes, hampering the optimization process and preventing significant progress.

While line search might seem superfluous in this simplistic 2D example, its necessity becomes apparent in more complex 3D elastodynamics simulations, especially those involving large deformations. Here, line search is crucial to ensure the convergence of the simulation.

Another point of interest is the stopping criteria applied in traditional solids simulators. Many such simulators forego a dynamic stopping criterion and instead terminate the optimization process after a predetermined number of iterations. This approach, while straightforward, can lead to numerical instabilities or 'explosions' in more challenging scenarios. This underscores the importance of carefully considering the integration scheme and its parameters to ensure stable and accurate simulations.

GPU-Accelerated Simulation

*Author of this section: Zhaofeng Luo, Carnegie Mellon University

We now rewrite the 2D mass-spring simulator to leverage GPU acceleration. Instead of directly writing CUDA, we resort to MUDA, a lightweight library that provides a simple interface for GPU-accelerated computations.

The architecture of the GPU-accelerated simulator is similar to the Python version. All function and variable names are consistent with the Numpy version. However, the implementation details are different due to the GPU architecture and programming model. Before delving into the details, let's first get a feeling of the speedup that GPU could bring us from the following gif (Figure 4.6.1).

**Figure 4.6.1.** An illustration of simulation speed of the Numpy CPU (left) and the MUDA GPU (right) versions.

Key Considerations for GPU Programming

To maximize resource utilization on the GPU, there are two important aspects to consider:

Minimizing Data Transfer. In most modern architectures, CPU and GPU have separate memory spaces. Transferring data between these spaces can be expensive. Therefore, it is essential to minimize data transfers between CPU and GPU.
Exploiting Parallelism. GPUs excel at parallel computations. However, care must be taken to avoid read-write conflicts that can arise when multiple threads attempt to access the same memory locations simultaneously.

Minimizing Data Transfer

To reduce data transfer between the CPU and GPU, we store the main energy values and their derivatives on the GPU. Computations are then performed directly on the GPU, and only the necessary position information is transferred back to the CPU for control and rendering. A more efficient implementation could render directly on the GPU, eliminating even this data transfer, but for simplicity and readability, we have not implemented that here.

To make the code more readable, the variables begin with device_ are stored in the GPU memory, and the variables begin with host_ are stored in the CPU memory.

Implementation 4.6.1 (Data structure, MassSpringEnergy.cu).

template <typename T, int dim>
struct MassSpringEnergy<T, dim>::Impl
{
	DeviceBuffer<T> device_x;
	DeviceBuffer<T> device_l2, device_k;
	DeviceBuffer<int> device_e;
	int N;
	DeviceBuffer<T> device_grad;
	DeviceTripletMatrix<T, 1> device_hess;
};

As shown in the code above, the energy values and their derivatives, as well as all the necessary parameters are stored in a DeviceBuffer object, which is a wrapper of the CUDA device memory implemented by the MUDA library. This allows us to perform computations directly on the GPU without the need for data transfer between the CPU and GPU.

Newton's Method

The iterations of Newton's method is a serial process and cannot be parallelized. Therefore, we implement this part on the CPU:

Implementation 4.6.2 (Newton's method, simulator.cu).

template <typename T, int dim>
void MassSpringSimulator<T, dim>::Impl::step_forward()
{
    update_x_tilde(add_vector<T>(device_x, device_v, 1, h));
    DeviceBuffer<T> device_x_n = device_x; // Copy current positions to device_x_n
    int iter = 0;
    T E_last = IP_val();
    DeviceBuffer<T> device_p = search_direction();
    T residual = max_vector(device_p) / h;
    while (residual > tol)
    {
        std::cout << "Iteration " << iter << " residual " << residual << "E_last" << E_last << "\n";
        // Line search
        T alpha = 1;
        DeviceBuffer<T> device_x0 = device_x;
        update_x(add_vector<T>(device_x0, device_p, 1.0, alpha));
        while (IP_val() > E_last)
        {
            alpha /= 2;
            update_x(add_vector<T>(device_x0, device_p, 1.0, alpha));
        }
        std::cout << "step size = " << alpha << "\n";
        E_last = IP_val();
        device_p = search_direction();
        residual = max_vector(device_p) / h;
        iter += 1;
    }
    update_v(add_vector<T>(device_x, device_x_n, 1 / h, -1 / h));
}

In this function, step_forward, the projected Newton method with line search is implemented, performing necessary computations on the GPU while controlling the process on the CPU. Any variable begin with device_ here is a DeviceBuffer object on the GPU. To print the values in DeviceBuffer for debugging purposes, the common practice is to transfer the data back to the CPU, or call the display_vec function (which calls printf in parallel on the GPU) implemented in uti.cu.

The update_x function updates the positions of the nodes to all Energy classes and transfers the updated positions back to the CPU for rendering:

Implementation 4.6.3 (Update positions, simulator.cu).

template <typename T, int dim>
void MassSpringSimulator<T, dim>::Impl::update_x(const DeviceBuffer<T> &new_x)
{
    inertialenergy.update_x(new_x);
    massspringenergy.update_x(new_x);
    device_x = new_x;
}

As the Energy classes has already updated its positions, the IP_val function no loner needs to pass any parameters, avoiding unnecessary data transfer. In fact, it only calls the val function of all energy classes and then sum the results together:

Implementation 4.6.4 (Computing IP, simulator.cu).

template <typename T, int dim>
T MassSpringSimulator<T, dim>::Impl::IP_val()
{

    return inertialenergy.val() + massspringenergy.val() * h * h;
}

Similarly for the IP_grad and IP_hess functions:

Implementation 4.6.5 (Computing IP gradient and Hessian, simulator.cu).

template <typename T, int dim>
DeviceBuffer<T> MassSpringSimulator<T, dim>::Impl::IP_grad()
{
    return add_vector<T>(inertialenergy.grad(), massspringenergy.grad(), 1.0, h * h);
}

template <typename T, int dim>
DeviceTripletMatrix<T, 1> MassSpringSimulator<T, dim>::Impl::IP_hess()
{
    DeviceTripletMatrix<T, 1> inertial_hess = inertialenergy.hess();
    DeviceTripletMatrix<T, 1> massspring_hess = massspringenergy.hess();
    DeviceTripletMatrix<T, 1> hess = add_triplet<T>(inertial_hess, massspring_hess, 1.0, h * h);
    return hess;
}

Notice that they utilize the parallel operations (add_vector and add_triplet, which are implemented in uti.cu) on the GPU to perform the summation for gradients and Hessians.

Parallel Computations

In our implementation, parallel computation is primarily employed in the computation of energy and its derivatives, as well as vector addition and subtraction. Let's take the MassSpringEnergy computation as an example.

Energy Computation

Implementation 4.6.6 (Computing energy, MassSpringEnergy.cu).

template <typename T, int dim>
T MassSpringEnergy<T, dim>::val()
{
	auto &device_x = pimpl_->device_x;
	auto &device_e = pimpl_->device_e;
	auto &device_l2 = pimpl_->device_l2;
	auto &device_k = pimpl_->device_k;
	int N = device_e.size() / 2;
	DeviceBuffer<T> device_val(N);
	ParallelFor(256).apply(N, [device_val = device_val.viewer(), device_x = device_x.cviewer(), device_e = device_e.cviewer(), device_l2 = device_l2.cviewer(), device_k = device_k.cviewer()] __device__(int i) mutable
						   {
		int idx1= device_e(2 * i); // First node index
		int idx2 = device_e(2 * i + 1); // Second node index
		T diff = 0;
		for (int d = 0; d < dim;d++){
			T diffi = device_x(dim * idx1 + d) - device_x(dim * idx2 + d);
			diff += diffi * diffi;
		}
		device_val(i) = 0.5 * device_l2(i) * device_k(i) * (diff / device_l2(i) - 1) * (diff / device_l2(i) - 1); })
		.wait();

	return devicesum(device_val);
} // Calculate the energy

The ParallelFor function distributes the computation across multiple GPU threads. The captured variables in the lambda function allow access to the necessary data structures within each thread.

Gradient Computation

Implementation 4.6.7 (Computing gradients, MassSpringEnergy.cu).

template <typename T, int dim>
const DeviceBuffer<T> &MassSpringEnergy<T, dim>::grad()
{
	auto &device_x = pimpl_->device_x;
	auto &device_e = pimpl_->device_e;
	auto &device_l2 = pimpl_->device_l2;
	auto &device_k = pimpl_->device_k;
	auto N = pimpl_->device_e.size() / 2;
	auto &device_grad = pimpl_->device_grad;
	device_grad.fill(0);
	ParallelFor(256).apply(N, [device_x = device_x.cviewer(), device_e = device_e.cviewer(), device_l2 = device_l2.cviewer(), device_k = device_k.cviewer(), device_grad = device_grad.viewer()] __device__(int i) mutable
						   {
		int idx1= device_e(2 * i); // First node index
		int idx2 = device_e(2 * i + 1); // Second node index
		T diff = 0;
		T diffi[dim];
		for (int d = 0; d < dim;d++){
			diffi[d] = device_x(dim * idx1 + d) - device_x(dim * idx2 + d);
			diff += diffi[d] * diffi[d];
		}
		T factor = 2 * device_k(i) * (diff / device_l2(i) -1);
		for(int d=0;d<dim;d++){
		   atomicAdd(&device_grad(dim * idx1 + d), factor * diffi[d]);
		   atomicAdd(&device_grad(dim * idx2 + d), -factor * diffi[d]);	  
		} })
		.wait();
	// display_vec(device_grad);
	return device_grad;
}

The atomicAdd function is crucial in the gradient computation to ensure safe concurrent updates to shared data (different edges can update the gradient of the same node), thus preventing race conditions.

Hessian Computation

We utilized the Sparse Matrix data structure to store the Hessian matrix. The computation is parallelized across multiple threads, with each thread updating a specific element of the Hessian matrix. The actual size of the Sparse Matrix is calculated at the start of the simulation, allocating just enough memory for non-zero entries. The main consideration here is to calculate the correct indices for each element during simulation:

Implementation 4.6.8 (Computing Hessians, MassSpringEnergy.cu).

template <typename T, int dim>
const DeviceTripletMatrix<T, 1> &MassSpringEnergy<T, dim>::hess()
{
	auto &device_x = pimpl_->device_x;
	auto &device_e = pimpl_->device_e;
	auto &device_l2 = pimpl_->device_l2;
	auto &device_k = pimpl_->device_k;
	auto N = device_e.size() / 2;
	auto &device_hess = pimpl_->device_hess;
	auto device_hess_row_idx = device_hess.row_indices();
	auto device_hess_col_idx = device_hess.col_indices();
	auto device_hess_val = device_hess.values();
	device_hess_val.fill(0);
	ParallelFor(256).apply(N, [device_x = device_x.cviewer(), device_e = device_e.cviewer(), device_l2 = device_l2.cviewer(), device_k = device_k.cviewer(), device_hess_val = device_hess_val.viewer(), device_hess_row_idx = device_hess_row_idx.viewer(), device_hess_col_idx = device_hess_col_idx.viewer(), N] __device__(int i) mutable
						   {
		int idx[2] = {device_e(2 * i), device_e(2 * i + 1)}; // First node index
		T diff = 0;
		T diffi[dim];
		for (int d = 0; d < dim; d++)
		{
			diffi[d] = device_x(dim * idx[0] + d) - device_x(dim * idx[1] + d);
			diff += diffi[d] * diffi[d];
		}
		Eigen::Matrix<T, dim, 1> diff_vec(diffi);
		Eigen::Matrix<T, dim, dim> diff_outer = diff_vec * diff_vec.transpose();
		T scalar = 2 * device_k(i) / device_l2(i);
		Eigen::Matrix<T, dim, dim> H_diff = scalar * (2 * diff_outer + (diff_vec.dot(diff_vec) - device_l2(i)) * Eigen::Matrix<T, dim, dim>::Identity());
		Eigen::Matrix<T, dim * 2, dim * 2> H_block, H_local;
		H_block << H_diff, -H_diff,
			-H_diff, H_diff;
		make_PSD(H_block, H_local);
		// add to global matrix
		for (int ni = 0; ni < 2; ni++)
			for (int nj = 0; nj < 2; nj++)
			{
				int indStart = i * 4*dim*dim + (ni * 2 + nj) * dim*dim;
				for (int d1 = 0; d1 < dim; d1++)
					for (int d2 = 0; d2 < dim; d2++){
						device_hess_row_idx(indStart + d1 * dim + d2)= idx[ni] * dim + d1;
						device_hess_col_idx(indStart + d1 * dim + d2)= idx[nj] * dim + d2;
						device_hess_val(indStart + d1 * dim + d2) = H_local(ni * dim + d1, nj * dim + d2);
					}
			} })
		.wait();
	return device_hess;
} // Calculate the Hessian of the energy

Summary

We have successfully demonstrated the implementation of a basic 2D mass-spring simulator encompassing several critical components:

Mesh Generation: This involves the creation of nodes and connecting elements. In practical scenarios, simulators often import meshes from pre-existing files.
Incremental Potential Energy Evaluation: Comprises the computation of the potential energy value, its gradient, and the Symmetric Positive Definite (SPD)-projected Hessian.
Optimization Time Integrator: This includes linear solves for determining search directions, line search techniques to ensure global convergence, and rules for updating nodal positions and velocities.
Simulator Structure: Encompasses scene setup, variable initialization, and the execution of the simulation loop. (Note: Visualization aspects can be decoupled from the simulator itself.)

In the forthcoming chapter, we will delve into boundary treatments, including prescribed motion and frictional contact, which are implemented through equality or inequality constraints in the optimization framework. This discussion will be enriched with practical case studies, illustrating the application of each boundary treatment in computational simulations.

Dirichlet Boundary Conditions*

Boundary treatments, including boundary conditions and frictional contacts, play a crucial role in solid simulations. They not only enhance the expressiveness of scene setup but also capture intricate dynamics within the simulation. This lecture introduces Dirichlet boundary conditions, a pivotal concept for prescribing the motion of specific nodes in solid structures. Understanding these conditions is essential for accurately modeling and manipulating the behavior of solids in various simulation scenarios.

Equality Constraint Formulation

Dirichlet boundary conditions (BC), when integrated into the optimization time integrator, are represented as linear equality constraints: $A x = b, (5.1.1)$ In this equation, the matrix $A$ is a $m \times dn$ matrix, where $m \leq dn$. This matrix functions to select the degrees of freedom (DOFs) at the nodes that are subject to the boundary conditions. The vector $b$ is a $m \times 1$ vector, which specifies the precise spatial values that are prescribed by these conditions.

Example 5.1.1 (Sticky Dirichlet Boundary Condition). For a 2D system containing two nodes $(x_{11}, x_{12})$ and $(x_{21}, x_{22})$, to fix the second node at position $(1, 2)$, the boundary condition (Equation (5.1.1)) can be expressed as $[00001001] x_{11} x_{12} x_{21} x_{22} = [12] .$

The two most common types of Dirichlet boundary conditions are sticky and slip:

Sticky Boundary Conditions: These conditions effectively fix the position of certain nodes within a time step. They are characterized by a block-wise constraint Jacobian matrix $A$. In this matrix, each set of $d$ rows includes exactly one $d \times d$ identity matrix. The rest of the matrix consists of zero matrices. This configuration is illustrated in Example 5.1.1. The implementation of sticky boundary conditions ensures that the specified nodes remain stationary, adhering to the prescribed positions during the simulation.

Slip Boundary Conditions: These conditions are designed to constrain each boundary condition (BC) node within a specific linear subspace, such as a plane or a line, which may not necessarily be axis-aligned. As an example, consider planar slip boundary conditions. Here, for each BC node, there is a corresponding row in the matrix $A$ that contains the normal vector of the plane. This vector occupies the columns corresponding to the BC node, as detailed in Example 5.1.2. Such conditions allow the nodes to move, but only within the defined linear subspace, thus adding a layer of complexity and realism to the simulation.

Example 5.1.2 (Slip Dirichlet Boundary Condition). For the same two-node system in Example 5.1.1, to constrain the first node in the line with equation $2x + 3y = 4$, the constraint (Equation (5.1.1)) can be expressed as $[2300] x_{11} x_{12} x_{21} x_{22} = 4.$

At the start of each time step, if we are given that all boundary conditions are satisfied, then the goal during optimization is simply to maintain the positions of the boundary condition nodes. This is represented as: $A Δ x = 0. (5.1.2)$ Here, $\Delta x$ is the search direction in each optimization iteration. Maintaining this condition ensures that any updated nodal position $x + \alpha \Delta x$, with $\alpha$ being the step size from line search, still satisfies the boundary conditions: $A (x + α Δ x) = b .$ This guarantees the adherence to boundary conditions throughout the optimization process.

To enforce the linear equality constraints (Equation (5.1.2)) for sticky DBC in a time step, we address this in each Newton iteration while solving for the search direction $ \Delta x $. This process involves forming the Lagrangian with a quadratic approximation to the Incremental Potential:

$L (Δ x, λ) = \frac{1}{2} Δ x^{T} H Δ x + g^{T} Δ x + λ^{T} A Δ x,$

Here, $ \lambda $ is the $ m\times 1 $ Lagrange multiplier vector. The gradient and Hessian of the Incremental Potential are denoted by $ g $ and $ H $, respectively.

The solution is approached through a max-min optimization problem:

$λ max Δ x min L (Δ x, λ),$

which leads to the formulation of a Karush-Kuhn-Tucker (KKT) system:

$[H A A^{T}] [Δ x λ] = [- g 0] . (5.1.3)$

Solving this KKT system is essential to determine the search direction. Note that this system is not Symmetric Positive Definite (SPD) and its size increases with the number of BC nodes.

DOF Elimination Method

Considering the simplest sticky Dirichlet boundary condition as an example, its constraint Jacobian $ A $ acts as a selection matrix. Consequently, $ AA^T $ forms a $ m \times m $ identity matrix, and $ A^T A $ becomes a $ dn \times dn $ diagonal matrix. In this matrix, the entries corresponding to the BC nodes are one, and all other entries are zero.

When we left-multiply $ A $ to the first block row of Equation (5.1.3), the resulting equation is:

$[A H A A A^{T}] [Δ x λ] = [- A g 0] .$

This manipulation allows us to directly solve for $ \lambda $ as:

$λ = - A H Δ x - A g . (5.2.1)$

By substituting Equation (5.2.1) back into the first block row of Equation (5.1.3), we derive the following equation:

$(I - A^{T} A) H Δ x = (I - A^{T} A) (- g) . (5.2.2)$

Here, left-multiplying by $(I - A^T A)$ effectively zeroes out the rows corresponding to the BC nodes. Hence, Equation (5.2.2) represents an under-constrained system. However, the second block row of Equation (5.1.3) actually provides us with the values of $\Delta x$ at the BC nodes (so they are not really unknowns). By considering this information, we can rewrite Equation (5.2.2) into a Symmetric Positive Definite (SPD) system:

$H_{U B} Δ x_{B} + H_{UU} Δ x_{U} = - g_{U},$

where the matrices and vectors are partitioned as follows:

$H = [H_{BB} H_{U B} H_{B U} H_{UU}], Δ x = [Δ x_{B} Δ x_{U}], g = [g_{B} g_{U}],$

and the subscript $B$ denotes the BC nodes. Knowing that $\Delta x_B = 0$, the system simplifies to:

$H_{UU} Δ x_{U} = - g_{U}, (5.2.3)$

which represents a SPD system that excludes the BC nodes.

A More Practical Approach

The method outlined above serves primarily for mathematical explanation. In practical applications, constructing Equation (5.2.3) is often avoided. This is because it entails reordering degrees of freedom (DOFs) and separating the BC nodes from unconstrained nodes, a process that can be both tedious and inefficient, particularly when the set of Dirichlet nodes varies over time.

To circumvent the need to reorder DOFs, a direct modification of the original linear system can be made to align it with Equation (5.2.3). This adjustment involves setting all entries in the rows corresponding to BC nodes in $ H $ and $ g $ to $ 0 $. Additionally, for the columns associated with BC nodes in $ H $, all off-diagonal entries are set to $ 0 $ while diagonal entries are assigned $ 1 $ or another positive real number to ensure the system remains well-conditioned. After solving this modified system, the resulting values of $ \Delta x_U $ are immediately aquired, and all $ \Delta x_B $ values are guaranteed to be $ 0 $.

Example 5.2.1 (DOF Elimination). For the problem defined in Example 5.1.1 where the second node $(x_{21}, x_{22})$ is fixed at $(1,2)$ in a 2D two-node system, assuming in a certain iteration of a time step $H = 4 - 1 - 1 - 1 - 1 4 - 1 - 1 - 1 - 1 4 - 1 - 1 - 1 - 1 4, and g = 1234,$ we solve the system $4 - 1 00 - 1 400 00100001 Δ x_{11} Δ x_{12} Δ x_{21} Δ x_{22} = - 1 - 2 00 . (5.2.4)$ for search direction $\Delta x$ so that $\Delta x_{21} = \Delta x_{22} = 0$ and after line search we for sure know that $(x_{21}, x_{22}) = (1, 2)$ still holds since $(x_{21} + α Δ x_{21}, x_{22} + α Δ x_{22}) = (x_{21}, x_{22})$ . Here (5.2.4) is essentially $[H_{UU} I] [Δ x_{U} Δ x_{B}] = [- g_{U} 0]$

Remark 5.2.1 (Limitations of DOF Elimination). The DOF elimination method described is effective when sticky BC nodes are established at the beginning of the time step. However, if this is not the case, and the constraint function in Equation (5.1.3) has a non-zero right-hand side (rhs), the DOF elimination method becomes inapplicable. The issue here is not the inability to solve for $ \Delta x $ under constraints with a non-zero rhs. Rather, the concern is that the resulting $ \Delta x $ might not lead to a descent direction in the Incremental Potential. This can result in exceedingly small step sizes after a line search, potentially stalling the optimization process.

Intuitively, if the direction of $ \Delta x_B $ is towards the prescribed BC coordinates, it could inadvertently increase the Incremental Potential, which is not adjusted to consider the BCs. Conversely, if $ \Delta x_B $ is simply $ 0 $ when the BCs are already satisfied, it effectively minimizes the Incremental Potential using a subset of variables, which remains a valid approach.

One might then ask why not adjust the DOFs to meet the BCs before starting the optimization. However, this strategy could lead to infeasible configurations, such as those involving intersections. A viable alternative is to initially apply stiff spring forces to gradually 'drag' the BC nodes to their constrained positions during optimization. After this, switching to the DOF elimination method can enhance convergence. This technique is further discussed in the section Moving Boundary Conditions*.

Case Study: Hanging Square*

We use a simple case study to end this lecture. Based on the mass-spring system developed in a previous section, we implement gravitational energy and sticky Dirichlet boundary conditions to simulate a hanging square. The excutable Python project for this section can be found at https://github.com/phys-sim-book/solid-sim-tutorial under the 2_dirichlet folder. MUDA GPU implementations can be found at https://github.com/phys-sim-book/solid-sim-tutorial-gpu under the simulators/2_dirichlet folder.

Gravitational energy has $P (x) = - x^{T} M g, \nabla P (x) = - M g, and \nabla^{2} P (x) = 0,$ which can be trivially implemented:

Implementation 5.3.1 (GravityEnergy.py).

import numpy as np

gravity = [0.0, -9.81]

def val(x, m):
    sum = 0.0
    for i in range(0, len(x)):
        sum += -m[i] * x[i].dot(gravity)
    return sum

def grad(x, m):
    g = np.array([gravity] * len(x))
    for i in range(0, len(x)):
        g[i] *= -m[i]
    return g

# Hessian is 0

Then we just need to make sure the gravitational energy is added into the Incremental Potential (IP):

Implementation 5.3.2 (Adding gravity to IP, time_integrator.py).

def IP_val(x, e, x_tilde, m, l2, k, h):
    return InertiaEnergy.val(x, x_tilde, m) + h * h * (MassSpringEnergy.val(x, e, l2, k) + GravityEnergy.val(x, m))     # implicit Euler

def IP_grad(x, e, x_tilde, m, l2, k, h):
    return InertiaEnergy.grad(x, x_tilde, m) + h * h * (MassSpringEnergy.grad(x, e, l2, k) + GravityEnergy.grad(x, m))   # implicit Euler

For the sticky Dirichlet boundary condition, we modify the system accordingly when computing search direction:

Implementation 5.3.3 (DOF elimination, time_integrator.py).

def search_dir(x, e, x_tilde, m, l2, k, is_DBC, h):
    projected_hess = IP_hess(x, e, x_tilde, m, l2, k, h)
    reshaped_grad = IP_grad(x, e, x_tilde, m, l2, k, h).reshape(len(x) * 2, 1)
    # eliminate DOF by modifying gradient and Hessian for DBC:
    for i, j in zip(*projected_hess.nonzero()):
        if is_DBC[int(i / 2)] | is_DBC[int(j / 2)]: 
            projected_hess[i, j] = (i == j)
    for i in range(0, len(x)):
        if is_DBC[i]:
            reshaped_grad[i * 2] = reshaped_grad[i * 2 + 1] = 0.0
    return spsolve(projected_hess, -reshaped_grad).reshape(len(x), 2)

Here is_DBC is an array marking whether a node is Dirichlet or not as we store the Dirichlet node indices in DBC:

Implementation 5.3.4 (DBC definition, simulator.py).

DBC = [n_seg, (n_seg + 1) * (n_seg + 1) - 1]  # fix the left and right top nodes

# ...

# identify whether a node is Dirichlet
is_DBC = [False] * len(x)
for i in DBC:
    is_DBC[i] = True

Finally, after making sure is_DBC is passed to the time integrator, we can simulate an energetic hanging square (no initial stretching) with a smaller spring stiffness k=1e3 at framerate time step size h=0.02 (Figure 5.3.1).

**Figure 5.3.1.** From left to right: initial, intermediate, and final static frame of the hanging square simulation.

Summary

In this section, we explored Dirichlet boundary conditions (DBC), integral to optimization time integrators, and presented them as straightforward linear equality constraints. There are two types of DBCs: sticky and slip. Sticky DBCs immobilize certain nodes, fixing their positions, whereas slip DBCs restrict the movement of nodes to within a plane or a line.

We focused on cases where sticky DBCs are already met at the start of a time step. In such scenarios, the DOF elimination method proves efficient. This technique modifies the gradient and Hessian of the Incremental Potential, ensuring that the resulting search direction remains within the feasible space.

In the following lecture, we will delve into the handling of slip DBCs and demonstrate methods for their efficient incorporation into optimization problems.

Slip Dirichlet Boundary Conditions

Although they might be satisfied at the start of a time step, general slip Dirichlet boundary conditions (DBC) present unique challenges. Unlike the sticky DBCs, they cannot be directly addressed using the DOF elimination method, primarily because their constraint Jacobian does not consist of identity matrix blocks. To navigate this complexity, we can adopt a change-of-basis strategy.

Before delving into the more general scenarios, it's insightful to first examine a particular type of slip DBC: those that are axis-aligned. Understanding this specific case will lay the groundwork for tackling the broader range of slip DBCs.

Axis-Aligned Slip DBC

Axis-Aligned slip Dirichlet boundary conditions (DBC) uniquely restrict the movement of certain nodes to linear subspaces that are aligned with the axes. For instance, these constraints could limit motion to lines parallel to the x-axis or planes parallel to the yz-plane. An advantageous aspect of Axis-Aligned slip DBC is that their constraint Jacobians bear resemblance to those of sticky DBCs. Consequently, they can be efficiently managed using the same DOF elimination method.

Example 6.1.1 (Axis-Aligned Slip DBC). Consider the previously mentioned two-node system in a 2D space, as referenced in the slip DBC example (Example 5.1.2). To apply a slip DBC that constrains the first node, represented by coordinates $(x_{11}, x_{12})$, to move only along the line $y = 3$, we express this constraint as a linear equality: $[0100] x_{11} x_{12} x_{21} x_{22} = 3.$ Then similar to sticky DBC, in a time step where this slip DBC is already satisfied, assume we have $H = 4 - 1 - 1 - 1 - 1 4 - 1 - 1 - 1 - 1 4 - 1 - 1 - 1 - 1 4, and g = 1234,$ we can solve the system $40 - 1 - 1 0100 - 1 04 - 1 - 1 0 - 1 4 Δ x_{11} Δ x_{12} Δ x_{21} Δ x_{22} = - 1 0 - 3 - 4$ for search direction so that $\Delta x_{12} = 0$ and the first node will stay on the $y=3$ line for arbitrary step size since its $y$ coordinate will not vary.

Change of Variables

Challenges with General Slip DBCs and the DOF Elimination Method

When dealing with general linear equality constraints, such as slip DBCs that aren't axis-aligned, the direct Degree of Freedom (DOF) elimination method faces certain limitations. This becomes evident particularly when $ AA^T $ is not an $ m \times m $ identity matrix. According to the Karush-Kuhn-Tucker (KKT) system (Equation (5.1.3)), the Lagrange multiplier vector $ \lambda $ can be solved as follows:

$λ = - (A A^{T})^{- 1} A H Δ x - (A A^{T})^{- 1} A g . (6.2.1)$

When we substitute Equation (6.2.1) back into the KKT system, it results in:

$(I - A^{T} (A A^{T})^{- 1} A) H Δ x = (I - A^{T} (A A^{T})^{- 1} A) (- g), (6.2.2)$

This leads to an under-constrained system. The key challenge here is that $ I - A^T (AA^T)^{-1} A $ does not possess a special structure that can be conveniently exploited to derive an equivalent, non-singular system while still satisfying the constraints. This makes the direct application of the DOF elimination method impractical for general slip DBCs.

Simplifying Constraints Using Singular Value Decomposition

Our approach involves transforming the degrees of freedom (DOF) into a new set of variables, making the constraints as straightforward as those in sticky DBC. To achieve this, we employ singular value decomposition (SVD) on the constraint Jacobian matrix $ A $. The SVD of $ A $ is expressed as: $A = U S V^{T} .$ Here, $ U $ is a $ m \times m $ orthogonal matrix, $ V $ is a $ dn \times dn $ orthogonal matrix, and $ S $ is a $ m \times dn $ diagonal matrix.

By defining $ y = V^T \Delta x $, we can reframe the Karush-Kuhn-Tucker (KKT) system (Equation (5.1.3)) into a new format:

$[V^{T} H V S S^{T} 0] [y λ^{'}] = [- V^{T} g 0] . (6.2.3)$

In this transformed system, $ \lambda' = U^T \lambda $. Notably, the presence of the diagonal matrix $ S $ in the off-diagonal blocks allows the direct application of the DOF elimination method. Once we solve for $ y $, the original variable $ \Delta x $ is easily recovered through the matrix-vector product $ \Delta x = V y $.

Remark 6.2.1 (Limitations of Using SVD for DOF Elimination). While we utilized singular value decomposition (SVD) to illustrate the concept, it's important to recognize the limitations of applying SVD in practice, especially on large matrices. There are two primary concerns:

Intractability with Large Matrices: Performing SVD on matrices of substantial size can be computationally challenging and often impractical.

Impact on Computational Efficiency: The Incremental Potential Hessian $ H $ typically exhibits sparsity, making it efficient to factorize in linear solves during simulations. However, if the resulting $ V $ from the SVD is dense, then $ V^T H V $ will also be dense. This not only slows down the computation but also significantly increases the cost of linear solves.

It's crucial to note that the new basis set (the column vectors of $ V $) needs to be linearly independent but does not necessarily have to be orthonormal. This insight opens up the possibility of identifying a sparse basis set. Such a set can maintain computational efficiency when dealing with general linear equality constraints. For a practical example of this approach, see [Chen et al. 2022].

General Slip DBC

Fortunately, for constraints like slip DBCs that are decoupled per node, SVD simply results in block-diagonal $U$ and $V$ which could be constructed procedurally in an efficient way. 3D planar slip DBC at node $i$ can be expressed as $n_{i}^{T} (x_{i} - x_{i}^{'}) = 0,$ where $n_{i}$ is the normal of the plane that node $i$ is slipping, and $x_{i}^{'}$ is an arbitrary point on that plane. As discussed near Equation (5.1.2), if at the beginning of the time step node $i$ is already on the plane, the constraint simplifies to $n_{i}^{T} Δ x_{i} = 0.$ Then performing SVD on the row vector $n_{i}^{T}$ , we obtain $n_{i}^{T} = U_{i} S_{i} V_{i}^{T} = 1 [100] n_{i}^{T} m_{i}^{T} l_{i}^{T}, (6.3.1)$ where unit vectors $n_{i}$ , $m_{i}$ , and $l_{i}$ together form an orthonormal basis in 3D.

Then it becomes clear that globally, $U$ is simply a $m \times m$ identity matrix, $S$ is a $m \times d n$ matrix where every row contains exactly one unit-valued entry in the column corresponding to the first DOF of the slip BC node, and $V$ is a $d n \times d n$ block-diagonal matrix with the $d \times d$ orthonormal blocks only on those corresponding to BC nodes, and $d \times d$ identity matrix elsewhere.

To compute $m_{i}$ and $l_{i}$ from $n_{i}$ , we first note that there are an infinite number of possible solutions. Therefore, we can simply first construct $m_{i} = n_{i} \times [100]^{T}$ , or $m_{i} = n_{i} \times [010]^{T}$ if $n_{i}$ is almost colinear with $[100]^{T}$ , and then construct $l_{i} = n_{i} \times m_{i}$ . To obtain $V^{T} (- g)$ , one only needs to left-multiply each $V_{i}^{T} = n_{i}^{T} m_{i}^{T} l_{i}^{T}$ to $- g_{i}$ . As for $V^{T} H V$ , first left-multiply each $V_{i}^{T}$ to every block on the $i$ -th block row of $H$ to obtain $V^{T} H$ . Then for the $i$ -th block column of $V^{T} H$ , right-multiply $V_{i} = [n_{i} m_{i} l_{i}]$ to every block. Finally, after solving for $y$ by applying the DOF elimination method on the modified system (Equation (6.2.3)), that is, solving $(I - S^{T} S) V^{T} H V y = (I - S^{T} S) (- V^{T} g), (6.3.2)$ $Δ x$ can be obtained by $Δ x = V y$ with similar block(node)-wise operations.

Example 6.3.1 (General Slip DBC). For the same two-node system in 2D as mentioned in the slip DBC example (Example 5.1.2), to constrain the first node $(x_{11}, x_{12})$ inside the $3 x + 4 y = 2$ line, the slip DBC can be expressed as $[3400] x_{11} x_{12} x_{21} x_{22} = 2$ and we can build $U = 1, S = [1000], V^{T} = 0.6 - 0.8 0.8 0.6 11$ for changing the basis. Then in a time step where this slip DBC is already satisfied, assume we have $H = 4 - 1 - 1 - 1 - 1 4 - 1 - 1 - 1 - 1 4 - 1 - 1 - 1 - 1 4, and g = 1234,$ we can compute $V^{T} H V = 3.04 0.28 - 1.4 - 1.4 0.28 4.96 0.2 0.2 - 1.4 0.2 4 - 1 - 1.4 0.2 - 1 4, and V^{T} g = 2.2 0.4 34,$ and solve the system $1000 0 4.96 0.2 0.2 0 0.2 4 - 1 0 0.2 - 1 4 y_{11} y_{12} y_{21} y_{22} = 0 - 0.4 - 3 - 4$ for $y$ . Then the search direction can be obtained by $Δ x = V y$ so that $3Δ x_{11} + 4Δ x_{12} = 0$ and so the first node will stay on the $3 x + 4 y = 2$ line for arbitrary step size.

Summary

This section has demonstrated that, with a change in the basis of variables, general slip Dirichlet boundary conditions (DBC) can be effectively managed using the Degree of Freedom (DOF) elimination method, much like axis-aligned slip DBCs.

While singular value decomposition (SVD) can be used to find the basis for general linear equality constraints, this approach may not be feasible for large or complex constraints. Nonetheless, it's possible to develop procedural routines for computing the basis, specifically tailored to node-wise slip DBC constraints.

Currently, our focus has been on maintaining DBCs that are already satisfied within the simulation framework. Moving forward, the discussion will shift towards exploring frictional contact between points and analytic surfaces. Additionally, we will revisit scenarios where DBCs are not satisfied at the start of a time step, delving into more complex cases.

Distance Barrier for Nonpenetration

Contact modeling is a crucial aspect of ensuring that solids do not intersect with obstacles or themselves. This topic was briefly touched upon in a previous section. In this lecture, we delve deeper into the specifics of non-interpenetration within the framework of the Incremental Potential Contact (IPC) method. Our focus will be on a simplified yet significant scenario: contact between solids and obstacles that have closed boundaries. This specific focus allows us to thoroughly explore the mechanics and principles of the IPC method in a controlled setting.

Signed Distances

The Incremental Potential Contact (IPC) method is designed to ensure non-interpenetration in solids of any codimension by maintaining the unsigned distances between solid boundaries above zero throughout their movement. This approach is robust as it applies universally, irrespective of the solid's specific characteristics.

However, when signed distances are accessible, the application of IPC becomes not only straightforward but also more streamlined. Signed distances extend the concept of unsigned distances to encompass solid geometries with closed boundaries. With IPC enforcing non-interpenetration, the possibility of negative distances inside a solid is eliminated. Therefore, in scenarios where signed distances remain non-negative (including the state of being exactly zero), it's an indication of successful non-interpenetration.

Definition 7.1.1 (Codimension). If $W$ is a linear subspace of a finite-dimensional vector space $V$, then the codimension of $W$ in $V$ is the difference between their dimensions: $codim (W) = dim (V) - dim (W) .$ For example, in 3D, a surface has codimension $1$, and a line has codimension $2$. In computer graphics, when simulating cloth and hair, codimension 1 and 2 geometry representations are often applied respectively for efficiency. However, their signed distances are not well-defined. This also explains why unsigned distances are more general for modeling solid contact.

In a previous section, we explored various methods for representing solid geometries. One notable approach is the analytical representation. For instance, a 3D ball centered at $ \mathbf{c} $ with radius $ r $ can be analytically described by the parameterization:

${x \in R^{3} ∣ ∥ x - c ∥ \leq r, c \in R^{3}, r > 0} .$

This principle of defining solid geometries extends beyond simple spheres. Many other shapes, such as half-spaces, boxes, ellipsoids, and tori, can be similarly parameterized. The key to these parameterizations lies in defining the "interior" of these objects, which can often be achieved through functions like signed distances. These functions provide a versatile tool for describing a wide range of simple and complex shapes in a concise and mathematical manner.

Example 7.1.1 (Ball Signed Distance Function). The signed distance function $d(\mathbf{x})$ and its derivatives of a ball centered at $\mathbf{c}$ with radius $r$ can be defined as $d (x) = ∥ x - c ∥ - r, \nabla d (x) = \frac{x - c}{∥ x - c ∥}, and \nabla^{2} d (x) = \frac{∥ x - c ∥ ^{2} I - ( x - c ) ( x - c ) ^{T}}{∥ x - c ∥ ^{3}} .$

Example 7.1.2 (Half-Space Signed Distance Function). The signed distance function $d(\mathbf{x})$ and its derivatives of a half-space with normal $\mathbf{n}$ and $d(\mathbf{o}) = 0$ can be defined as $d (x) = n^{T} (x - o), \nabla d (x) = n, and \nabla^{2} d (x) = 0 . (7.1.1)$

Representing more intricate geometries, like those commonly encountered in real-life scenarios, can be a challenging task due to their complexity. An effective alternative to intricate parameterizations is the use of a uniform Euclidean grid. This grid serves as a storage mechanism for the signed distances of a solid object, with these distances precomputed at each grid node. When the distance at any arbitrary point within the solid is required, interpolation can be applied to the grid data.

Example 7.1.3 (Grid Signed Distance Field). For a signed distance field stored on a uniform Euclidean grid with spacing $\Delta x$, to query the distance at an arbitrary location $\mathbf{x} = (x,y)$ where $x = x_i + \alpha \Delta x$ and $y = y_i + \beta \Delta x$ ($\mathbf{x}_{i,j} = (x_i, y_j)$ are the location of grid nodes, $0 \leq \alpha,\beta \leq 1$), with bilinear interpolation (Figure 7.1.1 right), $d (x) = (1 - β) ((1 - α) d (x_{i, i}) + α d (x_{i + 1, i})) + β ((1 - α) d (x_{i, i + 1}) + α d (x_{i + 1, i + 1})) .$ From Figure 7.1.1 we also see that to approximate a solid boundary smoothly in this setting, a higher-order interpolation scheme such as quadratic b-spline interpolation is needed.

Figure 7.1.1. The signed distance between the grid nodes and the sphere boundary is precomputed and stored (left). With bilinear interpolation, part of the sphere boundary is approximated as the blue polyline (right).

Distance Barrier

Constrained Optimization

In scenarios like a solid interacting with a planar ground, where the signed distance function $ d(\mathbf{x}) $ is smooth outside the obstacle, we can approach the modeling of contact by incorporating non-interpenetration constraints. These constraints are formulated using $ d(\mathbf{x}) $, while we also aim to minimize the Incremental Potential of the system.

Assuming that the solids are densely sampled with nodes $\mathbf{x}$, we apply these constraints at the level of nodal Degrees of Freedom (DOFs) in relation to the obstacles:

$x min E (x) s.t. d_{ij} \geq 0 \forall node i and obstacle j . (7.2.1)$

In this equation, $ d_{ij} $ represents the signed distance between node $ i $ and obstacle $ j $. By ensuring that $ d_{ij} $ is non-negative, we effectively prevent the solids from intersecting with the obstacles¹.

Logarithm Barrier Potential in Contact Modeling

To address the inequality constraints in our contact modeling, we introduce a barrier potential $ P_b(\mathbf{x}) $. This potential transforms the constrained problem, as described in Equation (7.2.1), into an "unconstrained" optimization problem:

$x min E (x) + h^{2} P_{b} (x) . (7.2.2)$

The barrier potential is defined as follows:

$P_{b} (x) = i, j \sum A_{i} \hat{d} b (d_{ij} (x)) and b (d_{ij} (x)) = {\frac{κ}{2} (\frac{d _{ij}}{d ^} - 1) ln \frac{d _{ij}}{d ^} 0 d_{ij} < \hat{d} d_{ij} \geq \hat{d} . (7.2.3)$

In this formulation, $ b() $ represents the barrier energy density function. As the distance approaches zero, this function tends to infinity, thereby providing a strong repulsion force to prevent interpenetration (refer to Figure 7.2.1). The distance threshold $ \hat{d} $ above which no contact force is exerted, the contact stiffness $ \kappa $ which controls the rate of change of the contact forces with respect to distance, and $ A_i $, the contact area of node $ i $, are key parameters in this setup. By integrating the energy density over the solid boundary, the barrier formulation effectively models a potential energy field that is of thickness $ \hat{d} $.

**Figure 7.2.1.** The barrier energy density function plotted with different $\hat{d}$ . Decreasing $\hat{d}$ asymptotically matches the discontinuous definition of the contact condition.

Remark 7.2.1 (Contact Layer Interpretation). Imagine the barrier potential $ P_b(\mathbf{x}) $ as representing the elasticity of an ultra-thin layer of virtual material that exists just outside the boundaries of the solids. This virtual layer has an effective thickness of $ \hat{d} $, which correlates with the distance threshold in the barrier function.

Consequently, the integration or summation used in computing $ P_b(\mathbf{x}) $ is weighted by the volume element $ w_i = A_i \hat{d} $, where $ A_i $ represents the contact area of each node. As solids approach and begin to compress this virtual elastic layer, contact forces arise. These forces, akin to a unique type of elasticity force, serve to prevent interpenetration by providing a repulsion effect whenever the solids come too close to each other. This model allows us to simulate the physical response of contact without actual penetration of the solids.

Applying chain rules with distance being the intermediate variables, we can derive the gradient and Hessian of $P_b(\mathbf{x})$ as $\nabla P_{b} (x) = i, j \sum w_{i} \frac{\partial b}{\partial d} (d_{ij} (x)) \nabla d_{ij} (x) (7.2.4)$ and $\nabla^{2} P_{b} (x) = i, j \sum w_{i} (\frac{\partial ^{2} b}{\partial d ^{2}} (d_{ij} (x)) \nabla d_{ij} (x) \nabla d_{ij} (x)^{T} + \frac{\partial b}{\partial d} (d_{ij} (x)) \nabla^{2} d_{ij} (x)) . (7.2.5)$

As we are using signed distances here, the inequality constraints can be defined without introducing an $\epsilon$ as in Equation (2.3.1) with unsigned distances.

Solution Accuracy

So why can we solve Equation (7.2.2) to approximate the solution of the original problem in Equation (7.2.1)? Similar to Dirichlet Boundary Conditions, at the solution $x^*$ of Equation (7.2.1), the following KKT conditions all hold: $\nabla E (x^{*}) - i, j \sum γ_{ij} \nabla d_{ij} (x^{*}) = 0, \forall i, j, d_{ij} (x^{*}) \geq 0, γ_{ij} \geq 0, γ_{ij} d_{ij} (x^{*}) = 0. (7.3.1)$ While at the local optimum $x'$ of Equation (7.2.2), we have $\nabla E (x^{'}) + h^{2} \nabla P_{b} (x^{'}) = 0, (7.3.2)$ which is equivalently $\nabla E (x^{'}) + h^{2} i, j \sum A_{i} \hat{d} \nabla b (d_{ij} (x^{'})) = 0$ and $\nabla E (x^{'}) + h^{2} i, j \sum A_{i} \hat{d} \frac{\partial b}{\partial d} (d_{ij} (x^{'})) \nabla d_{ij} (x^{'}) = 0$ if we plug in the expression of $\nabla b(d_{ij})$. Let $\gamma_{ij}' = -h^2 A_i \hat{d} \frac{\partial b}{\partial d}(d_{ij}(x'))$, we can further rewrite Equation (7.3.2) as $\nabla E (x^{'}) - i, j \sum γ_{ij}^{'} \nabla d_{ij} (x^{'}) = 0,$ which is essentially the stationarity condition (first line in Equation (7.3.1)) if we take $\gamma_{ij}'$ as the dual variable. Now since the barrier function provides arbitrarily large repulsion to avoid interpenetration, we know that $\forall i,j$, $d_{ij}(x') \geq 0$. In addition, $\gamma_{ij}' \geq 0$ also holds for all $i,j$ because $\frac{\partial b}{\partial d} \leq 0$ by construction. This means that at $x'$, we have momentum balance, no interpenetrations, and contact forces only push but not pull.

In our simulation, the only Karush-Kuhn-Tucker (KKT) condition not strictly satisfied at $ x' $ is the complementarity slackness condition. This arises due to the way our barrier approximation functions. Specifically, we have a situation where $ \gamma_{ij} > 0 \Longleftrightarrow 0 < d_{ij} < \hat{d} $, representing the activation of contact forces based on the distance between solids and obstacles.

As the threshold $ \hat{d} $ decreases, contact forces become active only when the solids are in closer proximity (as illustrated in Figure 7.2.1). This adjustment leads to a reduction in the complementarity slackness error, which can be controlled to a certain extent. However, it's important to note that this control comes at a cost: computational efficiency may be reduced. This is because sharper objective functions, resulting from smaller $ \hat{d} $ values, tend to require more Newton iterations to resolve. Therefore, there is a trade-off between the accuracy of the simulation (in terms of adhering to the KKT condition) and the computational resources required.

Summary

In simulating contact between solids and obstacles, we primarily focus on enforcing non-negativity on the signed distances between solid degrees of freedom (DOFs) and obstacles, in conjunction with minimizing the Incremental Potential.

Transformation to an Unconstrained Problem: The inherent inequality-constrained minimization issue for each time step is transformed into an unconstrained problem. This is achieved through the introduction of a barrier potential. This potential rises to infinity as distances approach zero, effectively generating large repulsion forces that prevent interpenetration.
Outcomes at Local Minimum: At the local minimum of this barrier-augmented Incremental Potential, we attain a balance of momentum, ensure non-interpenetration, and generate contact forces that only push but do not pull. The only exception in the Karush-Kuhn-Tucker (KKT) conditions is the complementarity slackness, which is not strictly satisfied. The accuracy in satisfying this condition can be controlled by adjusting the distance threshold $\hat{d}$ , albeit at the expense of computational efficiency.
Limitations and Next Steps: While the distance barrier method effectively addresses many contact scenarios, it cannot alone prevent artificial tunneling in dynamic simulations. To overcome this limitation, our next lecture will introduce the filtered line search scheme, an advanced technique designed to provide more guarantees to our simulations.

Remark 7.4.1 (Tunneling). Artificial tunneling in the context of simulations, particularly in computational physics and computer graphics, refers to a phenomenon where fast-moving objects pass through other objects or barriers without physically interacting with them, as if there were a tunnel through the barrier. This typically happens in scenarios involving discrete time steps, such as in computer simulations of physical systems.

In a real-world scenario, when two objects collide, there should be a physical interaction like a bounce, a stop, or a deformation. However, in a simulation with discrete time steps, if an object is moving very fast or the time steps are too large, the object's position might be calculated as being on one side of a barrier in one step and then on the other side in the next, without ever detecting a collision. This "skipping" of the collision step leads to what appears as tunneling through the object.

Filter Line Search*

The Incremental Potential Contact (IPC) method effectively maintains non-interpenetration constraints within solid simulations. This method models a constitutive relationship that directly correlates contact forces with their respective distances, thus converting the constrained problem into an unconstrained one. By using appropriately small time steps, the IPC allows for robust and accurate solid simulations free from obstacle interpenetration within an optimization-based time integration framework.

However, challenges arise when using larger time steps, which can introduce multiple local minima in the Incremental Potential. This condition can lead to tunneling issues, where solids might unexpectedly pass through obstacles due to overly large search directions. To mitigate this risk, we introduce a filter line search strategy supplemented by continuous collision detection (CCD). This approach is designed to prevent tunneling by continuously adjusting the trajectory of solids in response to potential collisions.

To illustrate these concepts, we will examine a case study where an elastic square falls onto the ground. This example will demonstrate the effectiveness of the IPC method along with the filter line search and CCD in managing the dynamics of solid bodies and ensuring accurate, interpenetration-free simulations.

The Tunneling Issue

Example 8.1.1 (Tunneling). Let's consider a simple illustrative example. Without external forces like gravity, for a particle (no elasticity) at $\mathbf{x}_0 = (0, 0)$ with mass $m$ and initial velocity $\mathbf{v}_0 = (1, 0)$ hitting a fixed square obstacle centered at $(0.005, 0) $, the Incremental Potential minimization problem for the first time step is $x min (\frac{m}{2} ∥ x - (x_{0} + h v_{0}) ∥^{2} + h^{2} P_{b} (x)) . (8.1.1)$ Since $\hat{d}$ is usually set small enough such as $10^{-4}m$ in this case, the barrier potential $P_b(\mathbf{x})$ is not yet active at $\mathbf{x}_0$ as the particle is not touching the obstacle. This makes the problem in Equation (8.1.1) quadratic, and our projected Newton (PN) method (Algorithm 3.3.1) will produce a search direction $p = h v_{0}$ at the first iteration, which directly leads to the global minimum of the Incremental Potential at $\mathbf{x}_0 + h\mathbf{v}_0$ after line search. Taking $h=0.01s$ (Figure 8.1.1), the particle will tunnel through the obstacle. However, scenarios where particles pass through obstacles due to large time steps are clearly unrealistic, as the expected physical behavior is for the particle to collide with the obstacle and either stop or bounce back.

From Example 8.1.1, we understand that simply ensuring the signed distances to be non-negative at the final solution is inadequate, especially in scenarios involving large time step sizes, high-speed impacts, or thin obstacles. These conditions can lead to inaccuracies and unrealistic outcomes in simulations.

The Incremental Potential Contact (IPC) method addresses this issue by ensuring that distances remain non-zero across the entire motion trajectory of solids. This approach is crucial for maintaining the physical accuracy and realism of the simulation.

But what exactly do we mean by "motion trajectory" in the context of discrete time integration? We will explain this next.

Penetration-free Trajectory

The most straightforward way of defining the motion trajectory between $x^n$ and $x^{n+1}$ at time $t^n$ and $t^{n+1}$ respectively would be the high-dimensional line segment connecting these two configurations. However, although enforcing non-negative signed distances on this trajectory could avoid the tunneling issue in Example 8.1.1, this strategy could potentially result in unrealistic behaviors as it alters the local optimum of the minimization problem (Equation (7.2.1)) in a nonphysical way (Figure 8.2.1).

**Figure 8.2.1.** For the setup in the tunneling example, enforcing non-negative signed distance along the motion trajectory approximated by the line segment between $x^{n}$ and $x^{n + 1}$ results in a nonphysical simulation result.

A more rigorous definition of the motion trajectory between $x^n$ and $x^{n+1}$ could be ${arg x min (\frac{1}{2} ∥ x - (x^{n} + h v^{n}) ∥_{M}^{2} + h^{2} \sum P (x)) h \in [0, t^{n + 1} - t^{n}]} .$ However, evaluating the configurations on this trajectory requires solving extra optimization problems, which could significantly complicate the time integration.

Instead, IPC takes the optimization path as an approximation to the motion trajectory. Specifically, for the time step solving from $x^n$ to $x^{n+1}$, if the optimization took $l$ iterations, and each iteration we get iterate $x^i$ after line search, the optimization path is simply the high-dimensional polyline ${(1 - β) x^{i} + β x^{i + 1} ∣ β \in [0, 1], i = 0, 1, 2, ..., l} .$ Now the time integration problem in time step $n$ becomes finding such optimization path $x^0, x^1, ..., x^l$ where $x^l$ locally minimizes the Incremental Potential (Equation (7.2.2)) subject to $d_{jk} ((1 - β) x^{i} + β x^{i + 1}) > 0 \forall node j, obstacle k, β \in [0, 1], and i = 0, 1, 2, ..., l .$ This enables enforcing the non-negative distance constraints per optimization iteration on the line segment between $x^i$ and $x^{i+1}$, which will not alter the local optimum of the time integration problem, and can be handled efficiently.

Recall from Algorithm 3.2.1 that the line search scheme updates the iterate as $x^{i+1} \leftarrow x^i + \alpha p$, which means $x^{i+1} - x^{i} = \alpha p$. Therefore, given an interpenetration-free $x^i$, to ensure all the configurations on the line segment between $x^i$ and $x^{i+1}$ are interpenetration-free, we just need to find such $\alpha$ that makes sure $d_{jk} (x^{i} + βp) \geq 0 \forall node j, obstacle k, and β \in [0, α] .$ Based on the intuition that a sufficiently small $\alpha$ could definitely make this happen, we can simply calculate an upper bound of such $\alpha$ in every iteration, and make sure the backtracking line search results in a step size smaller than this upper bound. This upper bound can be conveniently calculated with continuous collision detection (CCD).

Definition 8.2.1 (Continuous Collision Detection (CCD)). For a distance function $d_{jk}(x + \alpha p)$ defined with the initial interpenetration-free configuration of the solids and obstacles $x$, their intended displacement $p$, and the step size $\alpha$, CCD calculates the step size $\alpha^C_{jk}$ given $x$ and $p$ such that $d_{jk} (x + α p) > 0 \forall α \in [0, α_{jk}^{C}) . (8.2.1)$ Note that the problem definition implicitly requires $d_{jk}(x) > 0$. Under this setting, if we denote $d^a_{jk}(\alpha) = d_{jk}(x + \alpha p)$, $\alpha^C_{jk}$ is simply the smallest positive real root of $d^a_{jk}(\alpha)$ (see Figure 8.2.2 for an example), or $\alpha^C_{jk} = \infty$ if $d^a_{jk}(\alpha)$ does not have any positive real roots. There are many methods to obtain the exact or a conservative estimate of $\alpha^C_{jk}$, we will see a specific example in the case study of this lecture. After computing $\alpha^C_{jk}$ for all nodes $j$ and obstacle $k$, a step size upper bound $\alpha^C$ for the line search could then be obtained as $α^{C} = min (1, j, k min α_{jk}^{C})$

Figure 8.2.2. An illustration of CCD with a solid particle at $(0, 0)$ hitting a fixed vertical plane at $x = 0.3$ . With the intended displacement $p = (0.5, 0)$ , we obtain $α^{C} = 0.6$ .

Now, we can introduce our filter line search method (Algorithm 8.2.1), specifically designed to enforce non-interpenetration constraints throughout the entire approximated motion trajectory. This strategic enforcement is key in preventing tunneling issues that commonly occur in simulations with insufficient constraint handling.

This new scheme differs from the traditional backtracking line search method in a critical aspect: it initializes the step size. Instead of starting with a step size of $1$, the filter line search method begins with $\alpha^C$. This modification is subtle yet significant.

Algorithm 8.2.1 (Filter Backtracking Line Search).

Remark 8.2.1 (Algorithm Dependency Issue). Using the optimization path to approximate the motion trajectory is still not perfect as it is algorithm dependent. Other than the projected Newton (PN) method, there could be an algorithm that walks around an obstacle and ended up with a configuration on the other side, still providing a tunneling solution (Figure 8.2.3). Even with projected Newton, although in practice it always generates straightforward and physically plausible trajectories, there is no theoretical guarantee that it will never encounter tunneling issues. An intuition is that the search direction in every PN iteration always significantly decreases the Incremental Potential (IP), and so it is unlikely to walk around any contacts which often results in iterations that do not sufficiently decrease the IP. In fact, this kind of issue also happens in elastodynamics simulation without contact. Elasticity energy itself is also nonconvex, which can result in multiple local optima for the IP. The key to obtaining physical behaviors is to locally minimize IP, in other words, finding the nearby local minimum as the solution.

Figure 8.2.3. For the setup in the tunneling example, even with the filter line search scheme, if an optimization method other than projected Newton is applied, it could still lead to the tunneling issue.

Case Study: Square Drop

To conclude, let's consider a case study where we simulate a square dropped onto a fixed planar ground. Building on our previous mass-spring model for an elastic square, we augment a barrier potential into its Incremental Potential and apply the filter line search scheme to manage the contact between the square's degrees of freedom (DOFs) and the ground.

The excutable Python project for this section can be found at https://github.com/phys-sim-book/solid-sim-tutorial under the 3_contact folder. MUDA GPU implementations can be found at https://github.com/phys-sim-book/solid-sim-tutorial-gpu under the simulators/3_contact folder.

If we further limit the planar ground to be horizontal, e.g. at $y=y_0$, its signed distance function can be made even simpler than Equation (7.1.1): $d (x) = x_{y} - y_{0}, \nabla d (x) = [01], \nabla^{2} d (x) = 0 . (8.3.1)$ Combining it with Equation (7.2.4) and Equation (7.2.5), we can conveniently implement the gradient and Hessian computation for the barrier potential of this horizontal ground:

Implementation 8.3.1 (Barrier energy value, gradient, and Hessian, BarrierEnergy.py).

import math
import numpy as np

dhat = 0.01
kappa = 1e5

def val(x, y_ground, contact_area):
    sum = 0.0
    for i in range(0, len(x)):
        d = x[i][1] - y_ground
        if d < dhat:
            s = d / dhat
            sum += contact_area[i] * dhat * kappa / 2 * (s - 1) * math.log(s)
    return sum

def grad(x, y_ground, contact_area):
    g = np.array([[0.0, 0.0]] * len(x))
    for i in range(0, len(x)):
        d = x[i][1] - y_ground
        if d < dhat:
            s = d / dhat
            g[i][1] = contact_area[i] * dhat * (kappa / 2 * (math.log(s) / dhat + (s - 1) / d))
    return g

def hess(x, y_ground, contact_area):
    IJV = [[0] * len(x), [0] * len(x), np.array([0.0] * len(x))]
    for i in range(0, len(x)):
        IJV[0][i] = i * 2 + 1
        IJV[1][i] = i * 2 + 1
        d = x[i][1] - y_ground
        if d < dhat:
            IJV[2][i] = contact_area[i] * dhat * kappa / (2 * d * d * dhat) * (d + dhat)
        else:
            IJV[2][i] = 0.0
    return IJV

For the filter line search, with the position in the last iteration $\mathbf{x}$ and a search direction $\mathbf{p}$ of a specific node, the signed distance function is simply \[ d(\mathbf{x} + \alpha \mathbf{p}) = \mathbf{x}_y + \alpha \mathbf{p}_y - y_0, \] where $\alpha$ is the step size, and there is only one positive real root $\alpha = (y_0 - \mathbf{x}_y) / \mathbf{p}_y$ when $\mathbf{p}_y < 0$ since $\mathbf{x}_y > y_0$ (no interpenetration up to current iteration). Taking the minimum of the positive real root per node then gives us the step size upper bound $\alpha_C$ defined in Equation (8.2.1):

Implementation 8.3.2 (Ground CCD, BarrierEnergy.py).

def init_step_size(x, y_ground, p):
    alpha = 1
    for i in range(0, len(x)):
        if p[i][1] < 0:
            alpha = min(alpha, 0.9 * (y_ground - x[i][1]) / p[i][1])
    return alpha

Here we scale the upper bound by $0.9\times$ so that exact touching configurations with $d=0$ and $b = \infty$ (floating-point number overflow) can be avoided.

Then once we make sure the step size upper bound is used to initialize the line search

Implementation 8.3.3 (Filter line search, time_integrator.py).

        # filter line search
        alpha = BarrierEnergy.init_step_size(x, y_ground, p)  # avoid interpenetration and tunneling
        while IP_val(x + alpha * p, e, x_tilde, m, l2, k, y_ground, contact_area, h) > E_last:
            alpha /= 2

and that the contact area weights for all nodes are calculated

Implementation 8.3.4 (Contact area, simulator.py).

contact_area = [side_len / n_seg] * len(x)     # perimeter split to each node

and passed to our simulator, we can simulate the square drop with mass-spring stiffness k=2e4 and time step size h=0.01 as shown in Figure 8.3.1.

**Figure 8.3.1.** A mass-spring elastic square is dropped onto the ground with $0$ initial velocity under gravity. Here we show the frames when the square is: just dropped, first touching the ground, compressed to the maximum in this simulation, and becoming static.

Remark 8.3.1 (Contact Layer Integration). Since in practice, contact forces are only exerted on the boundary of the solids, the barrier potential should be integrated only on the boundary as well. This also explains why in our case study the contact area weight per node is simply calculated as the diameter of the square evenly distributed onto each boundary node. However, as mass-spring elasticity cannot guarantee that all interior nodes will stay inside the boundary of the solid, we simply apply the barrier potential to all nodal DOFs of the square.

Summary

To mitigate tunneling issues in solid simulation with large time steps, it is crucial to enforce non-negativity constraints of signed distances between solids and obstacles throughout the entire motion trajectory, not just at the final solution.

While directly using the optimization path to approximate the motion trajectory isn't perfect theoretically, it supports the design of a filter line search scheme. This scheme utilizes continuous collision detection (CCD) and the projected Newton method, effectively preventing tunneling in practical scenarios.

The projected Newton method, a gradient-based approach for minimizing the Incremental Potential, requires that the potential energy has a continuous gradient. Consequently, the distance functions employed in our barrier potential need to be at least $C^{1}$ continuous. For grid-based signed distance fields (Example 7.1.3), mere bilinear interpolation is considered insufficient.

Additionally, handling self-contact on the piece-wise linear boundary of a mesh necessitates further approximations to smooth the distance function. Detailed exploration of self-contact will be addressed in future sections. Before that, we will first transition to discussing solids-obstacle friction in our next lecture.

Frictional Contact

In the macroscopic view, contact forces comprise not only the normal forces that prevent interpenetrations but also tangential friction forces that dampen shearing motions at the interfaces. Most surfaces, when observed microscopically, are not perfectly smooth but are formed of jagged edges. Friction essentially arises from forces preventing non-interpenetration between these jagged edges. In this lecture, we introduce the Coulomb friction model, incorporating approximations that make it compatible with optimization time integrators.

Smooth Dynamic-Static Transition

To model frictional contact, local frictional forces $F_{k}$ can be added for every active contact point pair $k$ . For each such pair $k$ , at the current state ${x, v}$ , a consistently oriented sliding basis $T_{k} (x) \in R^{d m \times d}$ can be constructed, where $m$ is the total number of simulated nodes and $d$ is the dimension of space, such that $v_{k} = T_{k} (x)^{T} v \in R^{d}$ provides the local relative sliding velocity that is orthogonal to the distance gradient in the normal direction $n_{k} (x)$ .

Example 9.1.1 (Particle Sliding on Sphere). For a particle with velocity $v_{p} \in R^{3}$ moving on the surface of a sphere with velocity $v_{s} \in R^{3}$ (no rotation), the relative sliding velocity $v_{k}$ here can be calculated as $v_{k} = (v_{p} - v_{s}) - n_{k} \cdot (v_{p} - v_{s}) n_{k} = (I_{3} - n_{k} n_{k}^{T}) (v_{p} - v_{s}) .$ If we stack the velocity of the particle and the sphere for this system to obtain $v = [v_{p}^{T}, v_{s}^{T}]^{T} \in R^{6}$ , we now know that $T_{k}$ is simply $T_{k} (x) = [I_{3} - n_{k} (x) n_{k} (x)^{T} n_{k} (x) n_{k} (x)^{T} - I_{3}] \in R^{6 \times 3} .$ For more general cases like mesh-mesh contact, the form of $T_{k}$ only varies in how the relative velocity at the contact point pair $k$ is related to the velocity at the simulated nodes.

Maximizing dissipation rate subject to the Coulomb constraint defines friction forces variationally $F_{k} (x) = T_{k} (x) arg β \in R^{d} min β^{T} v_{k} s.t. ∥ β ∥ \leq μ λ_{k} and β \cdot n_{k} = 0, (9.1.1)$ where $λ_{k} = - w_{k} \frac{\partial b}{\partial d _{k}}$ is the contact force magnitude and $μ$ is the local friction coefficient. This is equivalent to $F_{k} (x) = - μ λ_{k} T_{k} (x) f (∥ v_{k} ∥) s (v_{k}), (9.1.2)$ with $s (v_{k}) = \frac{v _{k}}{∥ v _{k} ∥}$ when $∥ v_{k} ∥ > 0$ , while $s (v_{k})$ takes any unit vector orthogonal to $n_{k}$ when $∥ v_{k} ∥ = 0$ . In addition, the friction scaling function, $f$ , is also nonsmooth with respect to $v_{k}$ since $f (∥ v_{k} ∥) = 1$ when $∥ v_{k} ∥ > 0$ , and $f (∥ v_{k} ∥) \in [0, 1]$ when $∥ v_{k} ∥ = 0$ . These non-smoothness would severely slow down and even break convergence of gradient-based optimization.

**Figure 9.1.1.** An illustration of $T_{k}$ , $v_{k}$ , $n_{k}$ , and $F_{k}$ when a point slides on a sphere.

Remark 9.1.1 (Contact Force Magnitude). $λ_{k} = - w_{k} \frac{\partial b}{\partial d _{k}}$ is the contact force magnitude because at node $k$ , the contact force is $- w_{k} \nabla_{x_{k}} b (d_{k} (x)) = - w_{k} \frac{\partial b}{\partial d _{k}} \nabla_{x_{k}} d_{k} (x)$ . Therefore, $λ_{k} = ∥ - w_{k} \frac{\partial b}{\partial d _{k}} \nabla_{x_{k}} d_{k} (x) ∥ = - w_{k} \frac{\partial b}{\partial d _{k}}$ since $\frac{\partial b}{\partial d _{k}} < 0$ and $∥ \nabla_{x_{k}} d_{k} (x) ∥ = 1$ .

To enable efficient and stable optimization, the friction-velocity relation in the transition to static friction can be mollified by replacing $f$ with a smoothly approximated function. Following IPC, we use $f_{1} (y) = {- \frac{y ^{2}}{ϵ _{v}^{2}} + \frac{2 y}{ϵ _{v}}, 1, y \in [0, ϵ_{v}) y \geq ϵ_{v}, (9.1.3)$ where $f_{1}^{'} (ϵ_{v}) = 0$ and a velocity magnitude bound $ϵ_{v}$ (in units of $m / s$ ) below which sliding velocities $v_{k}$ are treated as static is defined for bounded approximation error (Figure 9.1.2).

**Figure 9.1.2.** A 1D illustration of the smoothed relation between friction force and sliding velocity. Decreasing $ϵ_{v}$ asymptotically matches the discontinuous Coulomb friction model.

Semi-Implicit Discretization

However, challenges still remain on incorporating friction into the optimization time integration. A major problem is that friction is not a conservative force and there is no well-defined potential such that taking the opposite of its gradient produces the frictional force. In other words, implicit friction force is not integrable. Without a potential energy, backtracking line search could not be performed, and thus guarantees on the stability and convergence of the optimization will be broken.

In fact, whether a force has well-defined potential energy really depends on the temporal discretization. For example, with explicit time integration, any force $f$ is constant within a time step and it has a potential energy $- f^{T} x$ . Taking this inspiration, we could make friction force integrable with a smarter temporal discretization. Making friction force constant within a time step would certainly restrict the size of the time step to obtain high quality results. Therefore, we discretize part of the friction force explicitly and formulate an integrable semi-implicit friction force.

Following IPC, we fix the normal force magnitude $λ$ (the ones only used in calculating friction) and the tangent operator $T$ during the nonlinear optimization to the value in the last time step $n$ : $λ^{n} = λ (x^{n})$ , and $T^{n} = T (x^{n})$ , which then makes the friction force integrable with a potential energy $P_{f} (x) = k \sum μ λ_{k}^{n} f_{0} (∥ \overset{ˉ}{v}_{k} \hat{h} ∥), (9.2.1)$ where $\overset{ˉ}{v}_{k} = (T_{k}^{n})^{T} v$ , $\hat{h} I = (\partial v / \partial x)^{- 1}$ , and $f_{0} (y) = {- \frac{y ^{3}}{3 ϵ _{v}^{2} h ^ ^{2}} + \frac{y ^{2}}{ϵ _{v} h ^} + \frac{ϵ _{v} h ^}{3}, y, y \in [0, ϵ_{v} \hat{h}) y \geq ϵ_{v} \hat{h}, (9.2.2)$ so that $f_{0}^{'} (y) = f_{1} (y / \hat{h})$ . Here $\hat{h}$ is a constant multiple of the time step size $h$ for most linear (multi-)step time integration methods including implicit Euler and higher-order backward difference formulas, etc. Then, taking the gradient of Equation (9.2.1) w.r.t. $x$ we obtain $- \nabla P_{f} (x) = - k \sum μ λ_{k}^{n} T_{k}^{n} f_{1} (∥ \overset{ˉ}{v}_{k} ∥) s (\overset{ˉ}{v}_{k}), (9.2.3)$ which is a semi-implicit discretization of our mollified friction force with explicit terms $λ_{k}^{n}$ and $T_{k}^{n}$ . The Hessian of $P_{f}$ can be calculated as $= \nabla^{2} P_{f} (x) k \sum μ λ_{k}^{n} T_{k}^{n} (\frac{f _{1}^{'} ( ∥ v ˉ _{k} ∥ ) ∥ v ˉ _{k} ∥ - f _{1} ( ∥ v ˉ _{k} ∥ )}{∥ v ˉ _{k} ∥ ^{3}} \overset{ˉ}{v}_{k} \overset{ˉ}{v}_{k}^{T} + \frac{f _{1} ( ∥ v ˉ _{k} ∥ )}{∥ v ˉ _{k} ∥} I_{3}) T_{k}^{n}^{T} \frac{\partial v}{\partial x} . (9.2.4)$

Remark 9.2.1. In the friction gradient and Hessian expression (Equation (9.2.3) and Equation (9.2.4)), there are $∥ v_{k} ∥$ in the denominators, which could be $0$ when there is no relative sliding motion at a contact point. To avoid division by $0$ during the computation, for friction gradient, we can derive $\frac{f _{1} ( ∥ v ˉ _{k} ∥ )}{∥ v ˉ _{k} ∥} = {- \frac{∥ v ˉ _{k} ∥}{ϵ _{v}^{2}} + \frac{2}{ϵ _{v}}, 1/∥ \overset{ˉ}{v}_{k} ∥, ∥ \overset{ˉ}{v}_{k} ∥ \in [0, ϵ_{v}) ∥ \overset{ˉ}{v}_{k} ∥ \geq ϵ_{v}, (9.2.5)$ which is well-defined everywhere, and so we obtain $- \nabla P_{f, k} (x) = - μ λ_{k}^{n} T_{k}^{n} \frac{f _{1} ( ∥ v ˉ _{k} ∥ )}{∥ v ˉ _{k} ∥} \overset{ˉ}{v}_{k} = 0 when ∥ \overset{ˉ}{v}_{k} ∥ = 0.$ For friction Hessian, we can derive $\frac{f _{1}^{'} ( ∥ v ˉ _{k} ∥ ) ∥ v ˉ _{k} ∥ - f _{1} ( ∥ v ˉ _{k} ∥ )}{∥ v ˉ _{k} ∥ ^{2}} = {- 1/ ϵ_{v}^{2}, - 1/∥ \overset{ˉ}{v}_{k} ∥^{2}, ∥ \overset{ˉ}{v}_{k} ∥ \in [0, ϵ_{v}) ∥ \overset{ˉ}{v}_{k} ∥ \geq ϵ_{v}, (9.2.6)$ which is also well-defined everywhere, and since $\overset{ˉ}{v}_{k} \overset{ˉ}{v}_{k}^{T} /∥ \overset{ˉ}{v}_{k} ∥ = 0$ when $∥ \overset{ˉ}{v}_{k} ∥ = 0$ , we know that $\nabla^{2} P_{f, k} (x) = μ λ_{k}^{n} T_{k}^{n} (\frac{f _{1} ( ∥ v ˉ _{k} ∥ )}{∥ v ˉ _{k} ∥} I_{3}) T_{k}^{n}^{T} \frac{\partial v}{\partial x} when ∥ \overset{ˉ}{v}_{k} ∥ = 0.$

Remark 9.2.2. The friction formulation in this lecture is introduced slightly differently from the original IPC [Li et al. 2020] in 2 places:

We directly use the relative sliding velocity $v_{k}$ rather than the relative sliding displacement $u_{k} = \hat{h} v_{k}$ in IPC as the input to the mollifier $f_{1} ()$ , and so our $f_{1} ()$ differs from that in the IPC on $\hat{h}$ in the denominators. When time integration rules other than implicit Euler is applied (so $x^{n + 1} - x^{n} \neq = \hat{h} v^{n + 1}$ ), calling $u_{k}$ the relative sliding displacement is inappropriate and may cause confusions.

We did not introduce a tangent basis to express relative sliding velocity in the tangent space, because this is not necessary in computing the friction energy, gradient, and Hessian.

Fixed-Point Iteration

To obtain the solution with fully implicit friction, we can iteratively alternate between the nonlinear optimization with fixed $λ$ , and $T$ given as $x min : E (x, {λ, T}) = \frac{1}{2} ∥ x - \tilde{x}^{n} ∥_{M}^{2} + Δ t^{2} (P_{e} (x) + P_{b} (x) + P_{f} (x, {λ, T})) s.t. A x = b, (9.3.1)$ and friction update until convergence (Algorithm 9.3.1).

If we denote \begin{equation} \begin{aligned} & f_m({ \lambda, T }) = \text{arg}\min_x E(x, { \lambda, T}) \ & f_u(x) = \text{FrictionUpdate}(x), \end{aligned} \end{equation} then Algorithm 9.3.1 is essentially a fixed-point iteration that finds the fixed-point of function \begin{equation} (f_m \cdot f_u) (x) \equiv f_m( f_u (x)). \end{equation}

Definition 9.3.1. $x$ is a fixed point of function $f ()$ if and only if \begin{equation} x = f(x). \end{equation} The fixed-point iterations find the fixed-point of a function $f ()$ starting from $x^{0}$ by iteratively updating the estimate \begin{equation} x^{i+1} \leftarrow f(x^i) \end{equation} until convergence.

Since the convergence of fixed-point iterations could only be achieved given an initial guess sufficiently close to the final solution, the convergence of Algorithm 9.3.1 analogously requires small time step sizes. However, note that each minimization with fixed ${λ, T}$ (Algorithm 9.3.1 line 4) is still guaranteed to converge with arbitrarily large time step sizes.

Remark 9.3.1. In practice, semi-implicit friction with frame-rate time step sizes can already produce results with high visual quality. For higher accuracy, running 2 to 3 fixed-point iterations for friction is generally sufficient.

Summary

We introduced the Coulomb friction model, which non-smoothly penalizes shearing motion at contact points through static and dynamic friction forces in the tangent space.

To integrate friction into the optimization time integrator, we first smoothly approximate the dynamic-static transition. This allows friction forces to be uniquely determined using only the nodal velocity degrees of freedom.

We then apply a semi-implicit discretization that fixes the normal force magnitude $λ$ and the tangent operator $T$ at the previous time step, enhancing the integrability of friction.

To achieve a solution with fully-implicit friction, fixed-point iterations are performed. These iterations alternate between semi-implicit time integration and updates for $λ$ and $T$ .

In the next lecture, we will explore a case study involving a square on a slope with varying friction coefficients.

Case Study: Square On Slope*

In this section, based on our learnings from Frictional Contact, we implement frictional contact for a slope within the optimization time integration framework. We start by extending the contact model used for horizontal grounds in the Square Drop case study to accommodate slopes with arbitrary orientations and locations.

Following this extension, we implement friction for the slope, tested by simulating an elastic square dropped onto it. Depending on the friction coefficient $μ$ , the square either stops at various points on the slope or continues to slide.

The excutable Python project for this section can be found at https://github.com/phys-sim-book/solid-sim-tutorial under the 4_friction folder. MUDA GPU implementations can be found at https://github.com/phys-sim-book/solid-sim-tutorial-gpu under the simulators/4_friction folder.

From Ground to Slope

The implementation in the Square Drop case study for horizontal grounds results in a simplified distance and distance gradient (Equation (8.3.1)) compared to that of a general half-space (Equation (7.1.1)): $d (x) = n^{T} (x - o), \nabla d (x) = n, and \nabla^{2} d (x) = 0 . (10.1.1)$ This is all we need for implementing the slope. Defining a normal direction and a point lying on the slope

Implementation 10.1.1 (Slope setup, simulator.py).

ground_n = np.array([0.1, 1.0])     # normal of the slope
ground_n /= np.linalg.norm(ground_n)    # normalize ground normal vector just in case
ground_o = np.array([0.0, -1.0])    # a point on the slope

and passing them to the time integrator and barrier energy, we can modify the barrier energy value, gradient, and Hessian computation for the slope as

Implementation 10.1.2 (Slope contact barrier, BarrierEnergy.py).

import math
import numpy as np

dhat = 0.01
kappa = 1e5

def val(x, n, o, contact_area):
    sum = 0.0
    for i in range(0, len(x)):
        d = n.dot(x[i] - o)
        if d < dhat:
            s = d / dhat
            sum += contact_area[i] * dhat * kappa / 2 * (s - 1) * math.log(s)
    return sum

def grad(x, n, o, contact_area):
    g = np.array([[0.0, 0.0]] * len(x))
    for i in range(0, len(x)):
        d = n.dot(x[i] - o)
        if d < dhat:
            s = d / dhat
            g[i] = contact_area[i] * dhat * (kappa / 2 * (math.log(s) / dhat + (s - 1) / d)) * n
    return g

def hess(x, n, o, contact_area):
    IJV = [[0] * 0, [0] * 0, np.array([0.0] * 0)]
    for i in range(0, len(x)):
        d = n.dot(x[i] - o)
        if d < dhat:
            local_hess = contact_area[i] * dhat * kappa / (2 * d * d * dhat) * (d + dhat) * np.outer(n, n)
            for c in range(0, 2):
                for r in range(0, 2):
                    IJV[0].append(i * 2 + r)
                    IJV[1].append(i * 2 + c)
                    IJV[2] = np.append(IJV[2], local_hess[r, c])
    return IJV

Then for the continuous collision detection, we similarly modify the implementation to compute the large feasible initial step size for line search using $n$ and $o$ :

Implementation 10.1.3 (Slope CCD, BarrierEnergy.py).

def init_step_size(x, n, o, p):
    alpha = 1
    for i in range(0, len(x)):
        p_n = p[i].dot(n)
        if p_n < 0:
            alpha = min(alpha, 0.9 * n.dot(x[i] - o) / -p_n)
    return alpha

Here the search direction of each node is projected onto the normal direction $n$ to divide the current distance when computing the smallest step size that first brings the distance to $0$ .

Finally, drawing the slope as a line from $o - 3 \overset{n}{^}$ to $o + 3 \overset{n}{^}$ where $\overset{n}{^} = [n_{y}, - n_{x}]$ pointing to the inclined direction,

Implementation 10.1.4 (Slope visualization, simulator.py).

    pygame.draw.aaline(screen, (0, 0, 255), screen_projection([ground_o[0] - 3.0 * ground_n[1], ground_o[1] + 3.0 * ground_n[0]]), 
        screen_projection([ground_o[0] + 3.0 * ground_n[1], ground_o[1] - 3.0 * ground_n[0]]))   # slope

we can now simulate an elastic square dropped on a slope without friction (Figure 10.1.1).

**Figure 10.1.1.** An elastic square dropped onto a frictionless slope, bouncing as it slides down.

Slope Friction

Now to implement friction for the slope, we start by implementing the functions that calculate $f_{0} (∥ \overset{v}{ˉ}_{k} ∥ \hat{h})$ , $f_{1} (∥ \overset{v}{ˉ}_{k} ∥) /∥ \overset{v}{ˉ}_{k} ∥$ , and $(f_{1}^{'} (∥ \overset{ˉ}{v}_{k} ∥) ∥ \overset{ˉ}{v}_{k} ∥ - f_{1} (∥ \overset{ˉ}{v}_{k} ∥)) /∥ \overset{ˉ}{v}_{k} ∥^{2}$ according to Equation (9.2.2), Equation (9.2.5), and Equation (9.2.6) respectively.

Implementation 10.2.1 (Friction helper functions, FrictionEnergy.py).

import numpy as np
import utils

epsv = 1e-3

def f0(vbarnorm, epsv, hhat):
    if vbarnorm >= epsv:
        return vbarnorm * hhat
    else:
        vbarnormhhat = vbarnorm * hhat
        epsvhhat = epsv * hhat
        return vbarnormhhat * vbarnormhhat * (-vbarnormhhat / 3.0 + epsvhhat) / (epsvhhat * epsvhhat) + epsvhhat / 3.0

def f1_div_vbarnorm(vbarnorm, epsv):
    if vbarnorm >= epsv:
        return 1.0 / vbarnorm
    else:
        return (-vbarnorm + 2.0 * epsv) / (epsv * epsv)

def f_hess_term(vbarnorm, epsv):
    if vbarnorm >= epsv:
        return -1.0 / (vbarnorm * vbarnorm)
    else:
        return -1.0 / (epsv * epsv)

With these terms available, we can then implement the semi-implicit friction energy value, gradient, and Hessian computations according to Equation (9.2.1), Equation (9.2.3), and Equation (9.2.4) respectively.

Implementation 10.2.2 (Friction value, gradient, and Hessian, FrictionEnergy.py).

def val(v, mu_lambda, hhat, n):
    sum = 0.0
    T = np.identity(2) - np.outer(n, n) # tangent of slope is constant
    for i in range(0, len(v)):
        if mu_lambda[i] > 0:
            vbar = np.transpose(T).dot(v[i])
            sum += mu_lambda[i] * f0(np.linalg.norm(vbar), epsv, hhat)
    return sum

def grad(v, mu_lambda, hhat, n):
    g = np.array([[0.0, 0.0]] * len(v))
    T = np.identity(2) - np.outer(n, n) # tangent of slope is constant
    for i in range(0, len(v)):
        if mu_lambda[i] > 0:
            vbar = np.transpose(T).dot(v[i])
            g[i] = mu_lambda[i] * f1_div_vbarnorm(np.linalg.norm(vbar), epsv) * T.dot(vbar)
    return g

def hess(v, mu_lambda, hhat, n):
    IJV = [[0] * 0, [0] * 0, np.array([0.0] * 0)]
    T = np.identity(2) - np.outer(n, n) # tangent of slope is constant
    for i in range(0, len(v)):
        if mu_lambda[i] > 0:
            vbar = np.transpose(T).dot(v[i])
            vbarnorm = np.linalg.norm(vbar)
            inner_term = f1_div_vbarnorm(vbarnorm, epsv) * np.identity(2)
            if vbarnorm != 0:
                inner_term += f_hess_term(vbarnorm, epsv) / vbarnorm * np.outer(vbar, vbar)
            local_hess = mu_lambda[i] * T.dot(utils.make_PSD(inner_term)).dot(np.transpose(T)) / hhat
            for c in range(0, 2):
                for r in range(0, 2):
                    IJV[0].append(i * 2 + r)
                    IJV[1].append(i * 2 + c)
                    IJV[2] = np.append(IJV[2], local_hess[r, c])
    return IJV

Note that in Numpy, matrix-matrix and matrix-vector products are realized by the dot() function. For implicit Euler, $v = (x - x^{n}) / h$ and so $\hat{h} = h$ . Here mu_lambda stores $μ λ_{k}^{n}$ for each node, where the normal force magnitude $λ_{k}^{n}$ is calculated using $x^{n}$ at the beginning of each time step.

Implementation 10.2.3 (Use mu and lambda, time_integrator.py).

def step_forward(x, e, v, m, l2, k, n, o, contact_area, mu, is_DBC, h, tol):
    x_tilde = x + v * h     # implicit Euler predictive position
    x_n = copy.deepcopy(x)
    mu_lambda = BarrierEnergy.compute_mu_lambda(x, n, o, contact_area, mu)  # compute mu * lambda for each node using x^n

    # Newton loop

Implementation 10.2.4 (Compute mu and lambda, BarrierEnergy.py).

def compute_mu_lambda(x, n, o, contact_area, mu):
    mu_lambda = np.array([0.0] * len(x))
    for i in range(0, len(x)):
        d = n.dot(x[i] - o)
        if d < dhat:
            s = d / dhat
            mu_lambda[i] = mu * -contact_area[i] * dhat * (kappa / 2 * (math.log(s) / dhat + (s - 1) / d))
    return mu_lambda

Since the slope is static, and the normal direction is the same everywhere, $T$ is constant and so can be discretized accurately.

Finally, we set friction coefficient $μ$ and pass it to the time integrator where we add friction energy to model semi-implicit friction on the slope.

mu = 0.11        # friction coefficient of the slope

Now we are ready to test the simulation with different friction coefficients. Since our slope has an inclined angle $θ$ with $tan (θ) = 0.1$ , we test $μ = 0.1$ , $0.11$ , and $0.2$ (Figure 10.2.1). Here we see that when $μ = 0.1$ , the critical value that provides dynamic friction forces in the same magnitude with that of the gravity component on the slope, the square keeps sliding after gaining the initial momentum (Figure 10.2.1 top). When we set $μ = 0.11$ , right above the critical value, the square slides a while and then stopped, showing that static friction is properly resolved (Figure 10.2.1 middle). With $μ = 0.2$ , the square stops even earlier (Figure 10.2.1 bottom).

**Figure 10.2.1.** With friction coefficient $μ = 0.1$ (top), $0.11$ (middle), and $0.2$ (bottom), we simulate an elastic square dropped onto a slope. Except the top one that the square keeps sliding, the lower two with larger $μ$ both end up with a static equilibrium.

Summary

In this case study, we implemented semi-implicit friction between simulated objects and a slope, accommodating arbitrary orientations and positions. Within the optimization time integration framework of IPC, friction is also modeled using potential energy. The key difference is that the normal force magnitude and tangent operator are precomputed at the start of each time step for semi-implicit discretization.

In the next lecture, we will introduce moving boundary conditions. This will involve obstacles or boundary nodes moving in a prescribed manner, actively injecting dynamics into the scene.

Moving Boundary Conditions*

Kinematic Collision Objects (CO) and Moving Dirichlet Boundary Conditions (BC) are crucial in many simulation scenarios. A CO can be considered as a collection of BC nodes.

At the start of a time step, it is ideal if the BC nodes can be moved directly to their prescribed locations without causing any interpenetrations. This allows the simulation to proceed smoothly using the Degree of Freedom (DOF) elimination method, which ensures the constraints remain feasible.

However, with large time steps, high velocities, or significant deformations, directly prescribing BC nodes often leads to interpenetration or "tunneling" artifacts, where objects pass through each other unrealistically.

To address these challenges, the penalty method is applied. This method progressively adjusts the simulation towards a feasible set where both CO and BC constraints are satisfied, and interpenetrations are avoided.

A case study demonstrating these principles will be shown through the simulation of a compressed square.

Penalty Method

At the beginning of each time step towards time $n + 1$ , we evaluate nodal position $\hat{x}_{k}^{n + 1}$ for each BC node $k$ based on their prescribed motions. During each Newton iteration $i$ , for the iterate $x^{i}$ , we define a velocity residual to assess how close each BC node is to meeting its target: $r_{BC, k}^{i} = \frac{1}{h} ∥ x_{k}^{i} - \hat{x}_{k}^{n + 1} ∥.$ When $r_{BC, k}^{i}$ falls below a specific tolerance $ϵ$ for any BC node $k$ , we can fix the node at its current location $x_{k}^{i} \approx \hat{x}_{k}^{n + 1}$ and apply the DOF elimination method in the subsequent iterations. This is particularly straightforward in scenes with only static BCs, where the DOF elimination method is directly applied.

For other BC nodes $k$ that are far from their target locations, we introduce new penalty terms to the Incremental Potential for each of these nodes: $\frac{κ _{M}}{2} m_{k} ∥ x_{k} - \hat{x}_{k}^{n + 1} ∥^{2} . (11.1.1)$ Here, $m_{k}$ represents the nodal mass, allowing for intuitive setting of the penalty stiffness $κ_{M}$ , as the Hessian of the penalty term with respect to BC nodes is simply $κ_{M}$ times that of the inertia term.

Remark 11.1.1. For collision obstacles (CO), precisely calculating node masses is challenging due to unknown factors like density. A practical approach is to assume a density similar to that of the simulated solids in the scene. This assumption makes the diagonal entries on the Hessian of the penalty terms roughly $κ_{M}$ times that of the inertia term.

For codimensional COs such as shells, rods, and particles, the key is to consider a reasonably large thickness when calculating their volumes. This helps in ensuring that their physical properties align more closely with those of the main simulation bodies.

Setting the penalty stiffness $κ_{M}$ appropriately can be challenging. If $κ_{M}$ is set too low, it may not effectively move the BC node towards its target. Conversely, a too high $κ_{M}$ can lead to numerical issues. Thus, we initially set $κ_{M}$ to a reasonably large value and adaptively increase it as necessary.

During the Newton solve, if there are BC nodes $k$ where $r_{BC, k}^{i} \geq ϵ$ at the point of Newton convergence, we double the penalty stiffness $κ_{M}$ to $2 \times$ its current value and continue the Newton solve. This process is repeated until all BCs are satisfactorily met at convergence.

Remark 11.1.2. In practice, with double precision floating-point numbers, initializing $κ_{M}$ below $1 0^{6}$ is typically sufficient, given that the Hessian of the stiff penalty terms is purely diagonal. However, if certain BCs remain unsatisfied even when $κ_{M}$ is increased to above $1 0^{10}$ , the optimization process may stall due to severe numerical errors. This stalling occurs because extremely stiff penalty terms are in conflict with the contact barriers. However, such a scenario would likely only occur under a rare CO/BC setting in a manner far more extreme than what is tested in Figure 2.3.1.

Case Study: Compressing Square

We simulate compressing an elastic square using a ceiling. The excutable Python project for this section can be found at https://github.com/phys-sim-book/solid-sim-tutorial under the 5_mov_dirichlet folder. MUDA GPU implementations can be found at https://github.com/phys-sim-book/solid-sim-tutorial-gpu under the simulators/5_mov_dirichlet folder.

The ceiling in our simulation is modeled as a half-space with a downward normal vector $n = (0, - 1)$ . The distance from the ceiling to other simulated Degrees of Freedom (DOFs) can be calculated using Equation (7.1.1). To effectively apply the penalty method, it's necessary that the ceiling's height also serves as a DOF.

Following the approach used in the Square on Slope project, we choose the origin $o$ on the ceiling as the DOF and incorporate it into the variable $x$ :

Implementation 11.2.1 (Ceiling DOF setup, simulator.py).

[x, e] = square_mesh.generate(side_len, n_seg)      # node positions and edge node indices
x = np.append(x, [[0.0, side_len * 0.6]], axis=0)   # ceil origin (with normal [0.0, -1.0])

The ceiling is initially positioned directly above the elastic square, as shown in the left image of Figure 11.2.1. By doing so, we ensure that the nodal mass of this newly added DOF is consistent with the other simulated nodes on the square, as per our implementation.

With this additional DOF, we can straightforwardly model the contact between the ceiling and the square. This is done by enhancing the existing functions that compute the barrier energy value, gradient, Hessian, and the initial step size:

Implementation 11.2.2 (Barrier energy value, BarrierEnergy.py).

    n = np.array([0.0, -1.0])
    for i in range(0, len(x) - 1):
        d = n.dot(x[i] - x[-1])
        if d < dhat:
            s = d / dhat
            sum += contact_area[i] * dhat * kappa / 2 * (s - 1) * math.log(s)

Implementation 11.2.3 (Barrier energy gradient, BarrierEnergy.py).

    n = np.array([0.0, -1.0])
    for i in range(0, len(x) - 1):
        d = n.dot(x[i] - x[-1])
        if d < dhat:
            s = d / dhat
            local_grad = contact_area[i] * dhat * (kappa / 2 * (math.log(s) / dhat + (s - 1) / d)) * n
            g[i] += local_grad
            g[-1] -= local_grad

Implementation 11.2.4 (Barrier energy Hessian, BarrierEnergy.py).

    n = np.array([0.0, -1.0])
    for i in range(0, len(x) - 1):
        d = n.dot(x[i] - x[-1])
        if d < dhat:
            local_hess = contact_area[i] * dhat * kappa / (2 * d * d * dhat) * (d + dhat) * np.outer(n, n)
            index = [i, len(x) - 1]
            for nI in range(0, 2):
                for nJ in range(0, 2):
                    for c in range(0, 2):
                        for r in range(0, 2):
                            IJV[0].append(index[nI] * 2 + r)
                            IJV[1].append(index[nJ] * 2 + c)
                            IJV[2] = np.append(IJV[2], ((-1) ** (nI != nJ)) * local_hess[r, c])

Implementation 11.2.5 (Initial step size calculation, BarrierEnergy.py).

    n = np.array([0.0, -1.0])
    for i in range(0, len(x) - 1):
        p_n = (p[i] - p[-1]).dot(n)
        if p_n < 0:
            alpha = min(alpha, 0.9 * n.dot(x[i] - x[-1]) / -p_n)

Here for the distance between the ceiling $o$ and a node $x$ , we have the stacked quantities locally: $d (x, o) = n^{T} (x - o), \nabla d (x, o) = [n - n], \nabla^{2} d (x, o) = 0 .$

Now we apply the moving BC on the ceiling to compress the elastic square. We set the ceiling's DOF, identified by the node index (n_seg+1)*(n_seg+1), as the sole Dirichlet Boundary Condition (DBC) in this scene. We assign it a downward velocity of $(0, - 0.5)$ . The movement is stopped when the ceiling reaches a height of $- 0.6$ :

Implementation 11.2.6 (DBC setup, simulator.py).

DBC = [(n_seg + 1) * (n_seg + 1)]       # dirichlet node index
DBC_v = [np.array([0.0, -0.5])]         # dirichlet node velocity
DBC_limit = [np.array([0.0, -0.6])]     # dirichlet node limit position

Then we implement the penalty term according to Equation (11.1.1), which is essentially a quadratic spring energy for controlling the motion of the ceiling:

Implementation 11.2.7 (Spring energy computation, SpringEnergy.py).

import numpy as np

def val(x, m, DBC, DBC_target, k):
    sum = 0.0
    for i in range(0, len(DBC)):
        diff = x[DBC[i]] - DBC_target[i]
        sum += 0.5 * k * m[DBC[i]] * diff.dot(diff)
    return sum

def grad(x, m, DBC, DBC_target, k):
    g = np.array([[0.0, 0.0]] * len(x))
    for i in range(0, len(DBC)):
        g[DBC[i]] = k * m[DBC[i]] * (x[DBC[i]] - DBC_target[i])
    return g

def hess(x, m, DBC, DBC_target, k):
    IJV = [[0] * 0, [0] * 0, np.array([0.0] * 0)]
    for i in range(0, len(DBC)):
        for d in range(0, 2):
            IJV[0].append(DBC[i] * 2 + d)
            IJV[1].append(DBC[i] * 2 + d)
            IJV[2] = np.append(IJV[2], k * m[DBC[i]])
    return IJV

Next, we focus on optimizing with the spring energies while properly handling the convergence check and penalty stiffness adjustments. At the start of each time step, the target position for each DBC node is computed, and the penalty stiffness, $k_{M}$ , is initialized to $10$ . If certain nodes reach their preset limit, we then set the target as their current position:

Implementation 11.2.8 (DBC initialization, time_integrator.py).

    DBC_target = [] # target position of each DBC in the current time step
    for i in range(0, len(DBC)):
        if (DBC_limit[i] - x_n[DBC[i]]).dot(DBC_v[i]) > 0:
            DBC_target.append(x_n[DBC[i]] + h * DBC_v[i])
        else:
            DBC_target.append(x_n[DBC[i]])

Entering the Newton loop, in each iteration, just before computing the search direction, we assess how many DBC nodes are close enough to their target positions. We store these results in the variable DBC_satisfied:

Implementation 11.2.9 (DBC satisfaction check, time_integrator.py).

    # check whether each DBC is satisfied
    DBC_satisfied = [False] * len(x)
    for i in range(0, len(DBC)):
        if LA.norm(x[DBC[i]] - DBC_target[i]) / h < tol:
            DBC_satisfied[DBC[i]] = True

Then we only eliminate the DOFs of those DBC nodes that already satisfy the boundary condition:

Implementation 11.2.10 (DOF elimination, time_integrator.py).

    # eliminate DOF if it's a satisfied DBC by modifying gradient and Hessian for DBC:
    for i, j in zip(*projected_hess.nonzero()):
        if (is_DBC[int(i / 2)] & DBC_satisfied[int(i / 2)]) | (is_DBC[int(j / 2)] & DBC_satisfied[int(j / 2)]): 
            projected_hess[i, j] = (i == j)
    for i in range(0, len(x)):
        if is_DBC[i] & DBC_satisfied[i]:
            reshaped_grad[i * 2] = reshaped_grad[i * 2 + 1] = 0.0
    return [spsolve(projected_hess, -reshaped_grad).reshape(len(x), 2), DBC_satisfied]

The BC satisfaction information stored in DBC_satisfied is also used to check convergence and update $k_{M}$ when needed:

Implementation 11.2.11 (Convergence criteria, time_integrator.py).

    [p, DBC_satisfied] = search_dir(x, e, x_tilde, m, l2, k, n, o, contact_area, (x - x_n) / h, mu_lambda, is_DBC, DBC, DBC_target, DBC_stiff[0], tol, h)
    while (LA.norm(p, inf) / h > tol) | (sum(DBC_satisfied) != len(DBC)):   # also check whether all DBCs are satisfied
        print('Iteration', iter, ':')
        print('residual =', LA.norm(p, inf) / h)

        if (LA.norm(p, inf) / h <= tol) & (sum(DBC_satisfied) != len(DBC)):
            # increase DBC stiffness and recompute energy value record
            DBC_stiff[0] *= 2
            E_last = IP_val(x, e, x_tilde, m, l2, k, n, o, contact_area, (x - x_n) / h, mu_lambda, DBC, DBC_target, DBC_stiff[0], h)

Now, we proceed to run the simulation, which involves severely compressing the dropped elastic square as depicted in (Figure 11.2.1). From the final static frame, we observe that the elastic springs on the edges are inverted due to extreme compression. This artifact is typical in mass-spring models of elasticity. In future chapters, we will explore how applying finite-element discretization to barrier-type elasticity models, such as the Neo-Hookean model, can prevent such issues. That approach is akin to the enforcement of non-interpenetrations in our current simulations.

**Figure 11.2.1.** A square is dropped onto the ground and compressed by a ceiling until inverted.

Summary

We introduced the penalty method for handling moving boundary conditions while preventing interpenetrations. The key strategies involved are:

Augmenting the Incremental Potential with additional spring energies on the DBC nodes;
Adaptively increasing the penalty stiffness as required;
Eliminating DOFs for those BC nodes that are sufficiently close to their targets; and
Ensuring all BCs are satisfied at the point of convergence.

To address the inversion artifact observed in our case study of compressing mass-spring elastic squares, the application of barrier-type elasticity energies is essential. Our penalty method for moving BCs plays a crucial role when these energies are applied, as directly prescribing BC nodes can still lead to inversion. In the next chapter, we will explore hyperelasticity models, which are preferred over mass-spring systems in practical applications.

Kinematics Theory

In previous case studies, we've relied on the mass-spring model to simulate the elastic behaviors of solids. This model approximates 2D and 3D elasticity by connecting multiple springs in various directions, each responding only to stretch and compression. However, this simple approximation often fails to capture the complexities of real-world phenomena. Starting with this lecture, we will delve into the mathematical description of deformation and introduce a more rigorous approach to modeling elasticity for continuum bodies.

When discussing continuum bodies or continuum mechanics, we operate under the continuum assumption. This perspective treats materials—whether solid, liquid, or gas—as continuous entities, avoiding the need to account for microscopic interactions between molecules and atoms. This assumption is not only practical in engineering and graphics applications but is also prevalent in everyday scenarios.

In graphics simulations, the continuum assumption applies to a wide range of materials, including deformable objects (both elastic and plastic), muscle, flesh, cloth, hair, liquids, smoke, gas, and granular materials like sand, snow, mud, and soil. In continuum mechanics, properties such as density, velocity, and force are defined as continuous functions of position. We have explored their discrete counterparts in the Discrete Space and Time section.

Equations of motion, based on Newton's 2nd law, are solved within the spatial domain and evolved over time to simulate the dynamic behaviors of these materials.

Continuum Motion

Kinematics is the study of motion within continuum materials, focusing primarily on the changes in shape or deformation that occur, whether locally or globally, across different coordinate systems. The aim is to describe motion both qualitatively and quantitatively, which is crucial for deriving the governing equations of dynamics and mechanical responses. Notably, kinematics can often be described without the need to introduce concepts like force, stress, or even mass.

In continuum mechanics, deformation is typically represented through three key components:

Material (or undeformed) space $X$ : This represents the initial position of any point in the material.
World (or deformed) space $x$ : This indicates the current position of any point.
Deformation map $ϕ (X, t)$ : This function maps points from the material space to the world space, showing how the position of material points changes over time.

At the initial time $t = 0$ , the material space $X$ and the world space $x$ coincide, meaning every point starts at its undeformed position.

Definition 12.1.1 (Deformation/Flow Map). The motion of material in continuum mechanics is determined by a mapping $ϕ (\cdot, t) : Ω^{0} \to Ω^{t}$ , where $Ω^{0}, Ω^{t} \subset R^{d}$ and $d = 2$ or $3$ represents the dimension of the simulated problem (or domain). This mapping, often referred to as the flow map or the deformation map, is crucial in understanding how material points move over time.

Material Points $X$ : Points in the set $Ω^{0}$ are known as material points and are designated as $X$ .

Current Locations $x$ : Points in $Ω^{t}$ represent the location of material points at time $t$ , and are referred to as $x$ . The deformation map $ϕ$ describes the trajectory of each material point $X \in Ω^{0}$ throughout time, expressed as: $x = x (X, t) = ϕ (X, t) .$

Example 12.1.1. If our object is moving with a constant speed $v$ along direction $n$ , then we have $x = X + t v n . (12.1.1)$ If an object undergoes some rigid motion after time $t$ (compared to time $0$ ), we will have $x = RX + b, (12.1.2)$ where $R$ is a rotation matrix, and $b$ is some translation. $R$ and $b$ will likely be functions of time $t$ and the initial position $X$ , depending on the actual motion.

The mapping $ϕ$ can be used to quantify relevant continuum-based physics. For example, the velocity of a given material point $X$ at time $t$ is $V (X, t) = \frac{\partial ϕ}{\partial t} (X, t), (12.1.3)$ and the acceleration is $A (X, t) = \frac{\partial ^{2} ϕ}{\partial t ^{2}} (X, t) = \frac{\partial V}{\partial t} (X, t) . (12.1.4)$ That is, $V (\cdot, t) : Ω^{0} \to R^{d}$ and $A (\cdot, t) : Ω^{0} \to R^{d}$ .

Remark 12.1.1. In the above, the velocity $V$ and acceleration $A$ are defined from the Lagrangian perspective. This means that both velocity and acceleration are functions of the material configuration $X$ and time $t$ , focusing on specific particles within the material. Physically, this implies that these measurements pertain to particles that have their own mass and have occupied some volume from the beginning of the simulation. The Lagrangian view is particularly valuable for tracking individual particle dynamics over time, offering detailed insights into how particles move, accelerate, and interact within the material under various conditions.

Deformation

We have $X$ and $x$ as material coordinates and world coordinates, respectively, each associated with domains $Ω_{0}$ and $Ω_{t}$ . For any point $X$ within $Ω_{0}$ , the mapping function $ϕ$ transports it to $Ω_{t}$ at a specific time $t$ , represented as $x = ϕ (X, t)$ .

Definition 12.2.1 (Deformation Gradient). The Jacobian of the deformation map $ϕ$ is referred to as the deformation gradient and is crucial in describing the physics of elasticity. It is commonly denoted as $F$ and defined by the relation: $F (X, t) = \frac{\partial ϕ}{\partial X} (X, t) = \frac{\partial x}{\partial X} (X, t) .$ Discretely, this Jacobian often takes the form of a small $2 \times 2$ or $3 \times 3$ matrix. For materials like cloth or thin shells in 3D, $F$ might be a $3 \times 2$ matrix, reflecting the 2D nature of the material space. Thus, $F (\cdot, t) : Ω^{0} \to R^{d \times d}$ maps every material point $X$ to a $R^{d \times d}$ matrix that describes the deformation Jacobian at time $t$ . Using index notation, it can be expressed as: $F_{ij} = \frac{\partial ϕ _{i}}{\partial X _{j}} = \frac{\partial x _{i}}{\partial X _{j}}, i, j = 1, \dots, d .$

We can compute the deformation gradient for the deformation map specified in Equation (12.1.1), where the result is the identity matrix. Similarly, for the deformation map in Equation (12.1.2), the deformation gradient $F$ equals $R$ . In both cases, the object does not undergo real deformation; these are merely examples of rigid transformations. Such deformation gradients should not lead to any internal forces within the material unless artistic effects are intentionally being pursued (such as in a cartoon).

**Figure 12.2.1 (Deformation gradient).**

Example 12.2.1. Intuitively, the deformation gradient $F$ indicates the extent of local deformation within a material. Consider two nearby points, $x_{1}^{0}$ and $x_{2}^{0}$ , embedded in the material at the start of the simulation (as illustrated in Figure 12.2.1). If $x_{1}$ and $x_{2}$ represent these points in the current configuration, the relationship between these points can be expressed as: $(x_{2} - x_{1}) = F (x_{2}^{0} - x_{1}^{0}) .$ This equation shows how the deformation gradient transforms the initial distance between the points into their current separation, thus quantifying the local deformation.

The determinant of the deformation gradient $F$ , commonly denoted by $J$ , is crucial because it characterizes the infinitesimal volume change during deformation. This is expressed as $J = det (F)$ . The value of $J$ represents the ratio of the infinitesimal volume of the material in the deformed configuration $Ω^{t}$ to its original volume in $Ω^{0}$ . For instance, in rigid motions, which include rotations and translations, $F$ is a rotation matrix and therefore $J = 1$ . Notably, the identity matrix, being a rotation matrix, also results in $J = 1$ .

If $J > 1$ , it indicates a volume increase, whereas $J < 1$ indicates a decrease. A situation where $J = 0$ suggests that the volume has effectively become zero, a scenario that is impossible in the real world but can occur numerically. In 3D, this indicates that the material is compressed to such an extent that it might collapse into a plane, line, or even a point without volume. Conversely, $J < 0$ indicates material inversion. For example, in 2D, if $J < 0$ for a triangle, it implies that one vertex has passed through the opposing edge, effectively 'inverting' the triangle and making its area negative. As seen in the Moving Boundary Conditions section, severe compression of an elastic square can lead to inversions. In such cases, $J$ serves as a direct measure of this artifact and is utilized in many elasticity models to ensure simulations are free from inversions.

Summary

Defining the flow map which transforms continuum bodies from the material space (initial configuration) to the world space (current configuration), we introduced a mathematical description of the change in shapes -- the deformation gradient $F \in R^{d \times d}$ ( $d = 2$ or $3$ ), which is the Jacobian of the flow map with respect to $X$ .

When $F$ at a certain point on the continuum body is a rotation matrix, it indicates there is no deformation and, consequently, no local elasticity forces should be present. In the next lecture, we will explore how to define more realistic elastic potential energies using the deformation gradient.

Strain Energy

With the deformation gradient $F$ serving as a rigorous mathematical measure of local deformation, we can define the elastic potential energy based on $F$ to more accurately capture the elastic behaviors of solids. $F$ is measured at every local point within the solid domain. We would measure the elastic potential locally for each point and then integrate these measurements across the entire domain. This approach mirrors the process used in the 2D Mass Spring case study, where the energy of each spring, weighted by an estimated volume, was summed up in a discrete setting. Here, $F$ is also known as strain, and the elastic potential $P_{e}$ , referred to as strain energy, is derived from integrating strain energy density functions $Ψ (F) : R^{d \times d} \to R$ at each material point within the solid domain: $P_{e} = \int_{Ω_{0}} Ψ (F) d X .$ In this lecture, we will explore various design choices of $Ψ (F)$ and examine some of their properties.

Rigid Null Space and Rotation Invariance

As mentioned in the previous lecture, for a solid undergoing only translational and/or rotational motions, no elastic potential energy is stored, and thus no elasticity force is exerted. This implies that any strain energy density functions $Ψ (F)$ have a rigid null space, meaning that $Ψ (F)$ should remain $0$ if the input deformation gradient is any rotation matrix $R$ : $Ψ (F) = 0 \forall F = R .$ A square matrix $F$ is a rotation matrix if and only if: $F^{T} = F^{- 1} and J \equiv det (F) = 1.$ From this definition, a straightforward formulation for $Ψ (F)$ emerges, penalizing any deviation of $F$ from being a rotation matrix with quadratic terms: $Ψ (F) = \frac{μ}{4} ∥ F^{T} F - I ∥_{F}^{2} + \frac{λ}{2} (J - 1)^{2} . (13.1.1)$ Here, $μ$ and $λ$ are the stiffness parameters, with the first term derived from right-multiplying $F$ to both sides of $F^{T} = F^{- 1}$ . This intuitive formulation closely aligns with how many standard strain energy density functions are constructed.

Definition 13.1.1 (Neo-Hookean Elasticity). The Neo-Hookean elasticity model is characterized by the following energy density function: $Ψ_{NH} (F) = \frac{μ}{2} (tr (F^{T} F) - d) - μ ln (J) + \frac{λ}{2} ln^{2} (J) . (13.1.2)$ Taking the derivative of $Ψ_{NH} (F)$ with respect to $F$ , we obtain: $\frac{\partial Ψ}{\partial F} (F) = μ (F - F^{- T}) + λ ln (J) F^{- T} .$ From this gradient, it is evident that the $μ$ -term achieves a local minimum when $F - F^{- T} = 0$ (i.e., $F^{T} = F^{- 1}$ ), and for the $λ$ -term, the local minimum occurs at $J = 1$ .

Definition 13.1.2 (Lame Parameters). In standard strain energy density functions, the stiffness parameters $μ$ and $λ$ are known as Lame parameters. These parameters are directly related to the Young's modulus $E$ , which measures resistance to stretching, and the Poisson's ratio $ν$ , which measures the incompressibility of the solid: $μ = \frac{E}{2 ( 1 + ν )}, λ = \frac{E ν}{( 1 + ν ) ( 1 - 2 ν )} .$

Definition 13.1.3 (Rotation Invariance). The energy density function for any nonlinear elastic model is rotation invariant. Mathematically, this is expressed as: $Ψ (F) = Ψ (RF) \forall F \in R^{d \times d} and d \times d rotation matrix R . (13.1.3)$ Intuitively, this means that any rotations applied after deformation should not alter the value of the strain energy density function.

However, the simplest strain energy density function, linear elasticity, does not include rigid modes in its null space nor does it satisfy Equation (13.1.3). This is because linear elasticity is specifically designed for infinitesimal strains, where no significant rotations are involved.

Definition 13.1.4 (Linear Elasticity). Linear elasticity has the energy density function $Ψ_{lin} (F) = μ ∥ ϵ ∥_{F}^{2} + \frac{λ}{2} tr^{2} (ϵ) . (13.1.4)$ Here $ϵ = \frac{1}{2} (F + F^{T}) - I$ is the small strain tensor, and we see that $Ψ_{lin} (F)$ is a quadratic function of $F$ .

Notably, the linear elasticity model with the corresponding Lame parameters is calibrated to real-world experiments under conditions of small deformations. In such circumstances, all standard strain energy density functions must align with linear elasticity. The consistency between these models and linear elasticity will be concisely demonstrated after we introduce the polar singular value decomposition of $F$ in the next section.

Rotation invariance (Equation (13.1.3)) should not be confused with the isotropic property of certain elastic models.

Definition 13.1.5 (Isotropic Elasticity). The energy density function of isotropic elastic models satisfies $Ψ (F) = Ψ (FR) \forall F \in R^{d \times d} and d \times d rotation matrix R . (13.1.5)$ This implies that the same amount of stretch in any direction results in the same energy change. Consequently, there are no special directions in which the material is harder or easier to deform than others.

Neo-Hookean (Equation (13.1.2)) and our intuitive model (Equation (13.1.1)) are both examples of isotropic models. However, linear elasticity (Equation (13.1.4)) does not meet this condition (Equation (13.1.5)), as it is not designed to handle rotational motions effectively.

For anisotropic elastic models, the resistance to stretch varies depending on the direction. Materials such as cloth, bones, muscles and wood are examples of anisotropic materials, exhibiting different mechanical properties in different directions.

Polar Singular Value Decomposition

When discussing general slip boundary conditions, we introduced the usage of singular value decomposition (SVD). Here, we apply a variant known as Polar SVD (Algorithm 13.2.1) to decompose $F$ : $F = UΣ V^{T},$ where $U$ and $V$ are both $d \times d$ rotation matrices, and $Σ$ is a $d \times d$ diagonal matrix. Unlike standard SVD, which ensures $Σ_{ii}$ remains non-negative possibly at the expense of having $det (U) = - 1$ or $det (V) = - 1$ , Polar SVD maintains $det (U) = 1$ and $det (V) = 1$ , allowing $Σ_{ii}$ to be negative if necessary.

Polar SVD is named for its relation to Polar decomposition, where $F$ is expressed as $RS$ . This decomposition can be reconstructed via $R = U V^{T}$ and $S = VΣ V^{T}$ , with $R$ representing the closest rotation to $F$ and $S$ being symmetric.

Algorithm 13.2.1 (Polar SVD from Standard SVD).

The Polar SVD of $F$ offers a more intuitive way to understand deformation. If we denote $σ_{i} = Σ_{ii}$ , referred to as the principal stretches, we can conceptualize $F$ as comprising a sequence of transformations. Initially, there is a rotation by $V^{T}$ , followed by scaling the dimensions by $σ_{i}$ along each axis, and concluding with another rotation by $U$ . This decomposition is applicable for all possible $F$ .

Polar SVD also allows for the more convenient expression of isotropic strain energy density functions using $σ_{i}$ exclusively. For instance, our intuitive formulation in Equation (13.1.1) can be reframed as:

$Ψ (F) = \hat{Ψ} (Σ) = \frac{μ}{4} i = 1 \sum d (σ_{i}^{2} - 1)^{2} + \frac{λ}{2} (i = 1 \prod d σ_{i} - 1)^{2},$

where $J = \prod_{i = 1}^{d} σ_{i} = σ_{1} \cdot σ_{2} \cdot \dots \cdot σ_{d}$ . Moreover, the Neo-Hookean strain energy density function (Equation (13.1.2)) can be rewritten as:

$Ψ_{NH} (F) = \hat{Ψ}_{NH} (Σ) = \frac{μ}{2} (i = 1 \sum d σ_{i}^{2} - d) - μ ln (J) + \frac{λ}{2} ln^{2} (J) .$

These two models are both consistent with linear elasticity under small deformation.

Definition 13.2.1 (Consistency to Linear Elasticity). To verify the consistency to linear elasticity of a strain energy density function $Ψ (F)$ , we just need to check whether the following relations all hold: $\hat{Ψ} (I) = 0, \frac{\partial Ψ ^}{\partial σ _{i}} (I) = 0, and \frac{\partial ^{2} Ψ ^}{\partial σ _{i} \partial σ _{j}} (I) = 2 μ δ_{ij} + λ .$ Here $1 \leq i, j \leq d$ , and $δ_{ij} = 1$ if $i = j$ , otherwise it is $0$ .

Simplified Models and Invertibility

Definition 13.3.1 (Corotated Linear Elasticity). To make linear elasticity rotation-aware while maintaining its simplicity, we can introduce a base rotation $R^{n}$ and construct an energy density function $Ψ_{LC} (F) = Ψ_{lin} ((R^{n})^{T} F),$ penalizing any deviation between $F$ and this fixed $R^{n}$ . This is called corotated linear elasticity.

$Ψ_{LC} (F)$ remains a quadratic energy with respect to $F$ and is very useful for dynamic simulations. At the beginning of the optimization for each time step $n + 1$ , we compute $R^{n}$ as the closest rotation to $F^{n}$ : $R^{n} = arg R min ∥ F^{n} - R ∥_{F}^{2} s.t. R^{T} = R^{- 1} and det (R) = 1. (13.3.1)$ As mentioned earlier, the solution is given by the Polar decomposition on $F^{n}$ , and with Polar SVD $F^{n} = U^{n} Σ^{n} (V^{n})^{T}$ , we have $R^{n} = U^{n} (V^{n})^{T}$ . However, corotated linear elasticity is still not rotation invariant, as $R^{n}$ does not change with $F$ during the optimization. Thus, it is not suitable for large deformations.

For rotation invariant elastic models, practitioners in computer graphics have been simplifying them for visual computing purposes. For example, only keeping a $μ$ -term while ignoring the $λ$ -term in the energy density function for more efficient computations: $Ψ_{R} (F) = \frac{μ}{4} ∥ F^{T} F - I ∥_{F}^{2}, or Ψ_{ARAP} (F) = μ i \sum d (σ_{i} - 1)^{2}, etc . (13.3.2)$ Here $Ψ_{ARAP} (F)$ is called the As-Rigid-As-Possible (ARAP) energy, which is widely used in shape modeling, cloth simulation, and surface parameterization, etc. $Ψ_{R} (F)$ , while being a higher-order polynomial of $F$ compared to ARAP, can be computed without performing the expensive SVDs on $F$ .

For all the strain energy density functions we have looked at in this lecture, except Neo-Hookean, all others are defined on the whole domain $R^{d \times d}$ . Neo-Hookean energy density function is defined on ${F ∣ F \in R^{d \times d}, det (F) > 0}$ . Just like the barrier energy to prevent interpenetrations in IPC, $Ψ_{NH} (F)$ is also a barrier energy, which goes to infinity as $det (F)$ approaches $0$ , providing arbitrarily large elastic forces to prevent inversion ( $det (F) \leq 0$ ).

Strain energy density functions allowing $det (F) \leq 0$ are also called invertible elasticity models. They are easy to deal with (no need for line search filtering), but do not guarantee non-inversion. Designing an invertible elastic energy that provides reasonably large resistance to inversion has drawn a lot of attention in computer graphics research [Stomakhin et al. 2012] [Smith et al. 2018].

Summary

The elastic potential energy $P_{e}$ is an integration of the strain energy density function $Ψ (F)$ at every local point in the solid domain. From the rigid null space, we derived an intuitive formulation of the strain energy density function, similar in structure to standard models like Neo-Hookean. Nonlinear elastic models are also rotation invariant, meaning any rotations applied after the deformation $F$ do not change $Ψ$ .

Linear elasticity features a quadratic energy density function and is specifically designed for infinitesimal strains $ϵ$ , lacking rigid modes in its null space. Yet, with the corresponding Lame Parameters $μ$ and $λ$ , it can accurately capture behaviors of small deformations observed in the real world. Standard elasticity models are required to be consistent with linear elasticity under small deformations.

This lecture focused on isotropic elasticity, where no special directions exist that make the material harder or easier to deform. Performing Polar SVD on $F = U Σ V$ allows us to rewrite $Ψ (F)$ of isotropic models using only principal stretches $σ_{i} = Σ_{ii}$ .

Using the closest rotation $R^{n} = U^{n} (V^{n})^{T}$ to $F^{n}$ in the last time step, we constructed a corotated linear elasticity to make linear elasticity rotation-aware while maintaining its simplicity. Simplifying further by retaining only the $μ$ -term enhances efficiency for visual computing.

Similar to how non-interpenetrations are enforced in IPC, the energy density function of Neo-Hookean acts as a barrier function, ensuring non-inversion ( $det (F) > 0$ ). All other elasticity models introduced in this lecture are invertible, and they do not guarantee non-inversion.

In the next lecture, we will explore the derivatives of $Ψ (F)$ with respect to $F$ .

Stress and Its Derivatives

Having introduced standard strain energies, we now proceed to their differentiation with respect to the world space coordinates, $x$ , to simulate realistic elastic behaviors. However, it's important to first establish the explicit relationship between these coordinates $x$ and the deformation gradient $F$ . This relationship heavily depends on specific discretization choices.

Before we explore discretization in depth, we should understand how to compute the derivatives of the strain energy function, $Ψ$ , with respect to $F$ . These derivatives are fundamentally linked to the concept of stress, a critical element in understanding material behavior under deformation.

Stress

Stress is a tensor field, akin to the deformation gradient $F$ , and is defined over the entire domain of solid materials. It quantifies the internal pressures and tensions experienced by a material object. The link between stress and strain (or $F$ ) is established through what is known as a constitutive relationship. This relationship outlines how materials respond to various deformations.

A common example of a constitutive relationship is Hooke's law in one dimension, which applies to many conventional materials under elastic conditions. In the context of hyperelastic materials, the relationship is specifically defined by the strain energy function, $Ψ (F)$ .

Definition 14.1.1 (Hyperelastic Materials). Hyperelastic materials are those elastic solids whose first Piola-Kirchhoff stress $P$ can be derived from a strain energy density function $Ψ (F)$ via $P = \frac{\partial Ψ}{\partial F} . (14.1.1)$ With index notation, this means $P_{ij} = \frac{\partial Ψ}{\partial F _{ij}} .$ $P$ is discretely a small matrix with the same dimensions as $F$ .

In the study of material behavior under stress, various definitions are utilized, with Cauchy stress being particularly prevalent in engineering contexts. Cauchy stress, denoted as $σ (\cdot, t) : Ω^{t} \to R^{d \times d}$ , can be mathematically linked to the first Piola-Kirchhoff stress tensor $P$ through the relationship: $σ = \frac{1}{J} P F^{T} = \frac{1}{det ( F )} \frac{\partial Ψ}{\partial F} F^{T} .$

Calculating $P$ from the strain energy function $Ψ (F)$ is relatively straightforward for energy models that do not require singular value decomposition (SVD), such as the Neo-Hookean model. However, general isotropic elasticity models, like ARAP (As-Rigid-As-Possible), often rely on the computation of principal stretches or the closest rotation matrix, necessitating SVD. This computation becomes particularly complex and resource-intensive when determining $\frac{\partial P}{\partial F}$ , which is crucial for implicit time integrations.

We present an efficient method that leverages the sparsity structure, as introduced by [Stomakhin et al. 2012], to compute the first Piola-Kirchhoff stress tensor $P$ and its derivative $\frac{\partial P}{\partial F}$ (whether as a tensor or the differential $δ P$ ) for general isotropic elastic materials. This approach utilizes symbolic software packages, and we will specifically discuss the implementation in Mathematica. Implementations in Maple or other software are similarly straightforward, following the same conceptual framework. For a deeper exploration of derivative computations commonly employed in computer graphics, refer to the work of [Schroeder 2022].

It is important to note that the computational strategy discussed can also be applied to other derivatives in diagonal space, similar to $\frac{\partial P}{\partial F}$ . For instance, in certain models, the Kirchhoff stress $τ$ is preferred over the first Piola-Kirchhoff stress $P$ . The Kirchhoff stress is expressed as: $τ = U \overset{τ}{^} U^{T},$ where $\overset{τ}{^}$ is a diagonal stress measure, with each entry being a function of the singular values $Σ$ . The methodology for computing $\frac{\partial τ}{\partial F}$ mirrors that of $P$ .

Computing $P$

Let's begin with the computation of $P$ . For isotropic materials, the first Piola-Kirchhoff stress tensor can be calculated as follows: $P where F = U \hat{P} V^{T} = U Σ V^{T}, Ψ (F) = \hat{Ψ} (Σ), and \hat{P}_{ij} = \frac{\partial Ψ ^}{\partial σ _{i}} δ_{ij} . (14.2.1)$ This formulation leverages the property that $P$ shares the same SVD space as $F$ , which simplifies the derivation and computation process.

Example 14.2.1. For the Neo-Hookean model (Equation (13.1.2)), we have: $\hat{Ψ}_{NH} (Σ) = \frac{μ}{2} (i \sum d σ_{i}^{2} - d) - μ ln (J) + \frac{λ}{2} ln^{2} (J) .$ Thus, we can first perform SVD on $F = U Σ V$ and derive: $\hat{P}_{ii} = μ (σ_{i} - \frac{1}{σ _{i}}) + λ ln (J) \frac{1}{σ _{i}}$ to compute $\frac{\partial Ψ}{\partial F} = P = U \hat{P} V^{T}$ without symbolically deriving the derivative of $Ψ$ w.r.t. $F$ .

Here we provide the proof that $P$ commutes with rotations in diagonal space (see Equation (14.2.1)). To demonstrate that $P (RF) = RP (F)$ for any rotation matrix $R$ , consider a generic (potentially anisotropic) material model. The key idea is that a rotation applied after deformation does not alter the material's stored energy, thus we have the identity $Ψ (F) = Ψ (RF)$ . Differentiating both sides of this equation with respect to the deformation gradient $F$ yields:

$δ Ψ P (F) : δ F P (F) : δ F P (F) RP (F) = \frac{\partial Ψ}{\partial F} (F) : δ F = \frac{\partial Ψ}{\partial F} (RF) : δ (RF), = P (RF) : (R δ F), = (R^{T} P (RF)) : δ F, = R^{T} P (RF), = P (RF) .$

Furthermore, for an isotropic material where $Ψ (FR) = Ψ (F)$ , a similar argument shows that $P (FR) = P (F) R$ . Combining these relationships for $P$ under rotation, we establish that: $P (F) = P (U Σ V^{T}) = UP (Σ) V^{T} = U \hat{P} V^{T} .$ This formulation confirms the rotational invariance of $P$ in diagonal space.

Additional Proof for $P (Σ) = \hat{P} = \frac{\partial Ψ ^}{\partial Σ}$

In the above, the last equality comes from the fact that $P (F = Σ) = \frac{\partial Ψ ^}{\partial Σ} .$ Here we show why this is true.

(1) First, we claim that $P (Σ)$ is diagonal. This can be seen by realizing that for isotropic elasticity, $P (F) = k \sum \frac{\partial Ψ}{\partial I _{k}} (F) \frac{\partial I _{k}}{\partial F} (F),$ where $I_{k}$ is the isotropic invariants. Following [Sifakis & Barbic 2012] (page 23), we can observe that $\frac{\partial I _{k}}{\partial F} (F)$ when the argument $F$ is diagonal, must be diagonal. Therefore, $P (F)$ is diagonal when $F$ is diagonal.

(2) Next, we claim that $diag (\frac{\partial Σ}{\partial F _{ij}}) = diag (U^{T} \frac{\partial F}{\partial F _{ij}} V) .$ This is proven in [Xu et al. 2015] (Equation 7).

(3) Based on (2), we know that for any $ij$ , after substituting $F = Σ$ , we have $diag (\frac{\partial Σ}{\partial F _{ij}} (Σ)) = diag (I^{T} \frac{\partial F}{\partial F _{ij}} (Σ) I),$ using this we can write out the cases for $ij = 11, ij = 22, ij = 33$ . For example, for $ij = 11$ , we have $\frac{\partial Σ}{\partial F _{11}} (Σ) = 1 * * * 0 * * * 0$

(4) Finally, let's derive $P (Σ)$ . Since we know it is diagonal from (1), we just need to derive its diagonal entry. Let's use $11$ entry as an example: $P_{ab} (Σ) P_{11} (Σ) P_{11} (Σ) = \frac{\partial Ψ ^}{\partial F _{ab}} (Σ) = \frac{\partial Ψ ^}{\partial Σ} (Σ) : \frac{\partial Σ}{\partial F _{ab}} (Σ) = \frac{\partial Ψ ^}{\partial Σ} (Σ) : \frac{\partial Σ}{\partial F _{11}} (Σ) = \frac{\partial Ψ ^}{\partial σ _{1}} \frac{\partial Ψ ^}{\partial σ _{1}} \frac{\partial Ψ ^}{\partial σ _{1}} : 1 * * * 0 * * * 0 = \frac{\partial Ψ ^}{\partial σ _{1}}$ Now we are done with the final proof.

Computing $\partial P / \partial F$ or $δ P$

To compute the derivative of $P$ with respect to $F$ , we leverage the rotational invariance property discussed earlier for $P$ . Consider two arbitrary rotation matrices $R$ and $Q$ . From the rotational properties of $P$ , we have:

$P (F) = P (R R^{T} FQ Q^{T}) = RP (R^{T} FQ) Q^{T} .$

Define $K = R^{T} FQ$ , then:

$P (F) = RP (K) Q^{T} .$

Taking the differential of $P$ , while treating $R$ and $Q$ as constants, gives:

$δ P = R [\frac{\partial P}{\partial F} (K) : δ (K)] Q^{T} = R [\frac{\partial P}{\partial F} (K) : (R^{T} δ FQ)] Q^{T} .$

By setting $R = U$ and $Q = V$ , where $K = Σ$ , the differential expression simplifies to:

$δ P = U [\frac{\partial P}{\partial F} (Σ) : (U^{T} δ FV)] V^{T} .$

The tensorial derivative $\partial P / \partial F$ is then expressed in index notation as:

$(δ P)_{ij} = U_{ik} (\frac{\partial P}{\partial F} (Σ))_{k l mn} U_{r m} δ F_{rs} V_{s n} V_{j l}, and (δ P)_{ij} = (\frac{\partial P}{\partial F} (F))_{ij rs} δ F_{rs} .$

These expressions must hold for any $δ F$ , leading to the relationship:

$(\frac{\partial P}{\partial F} (F))_{ij rs} = (\frac{\partial P}{\partial F} (Σ))_{k l mn} U_{ik} U_{r m} V_{s n} V_{j l} .$

So the remaining task is computing $\frac{\partial P}{\partial F} (Σ)$ . We show how to do it in 3D.

First, let's introduce Rodrigues' rotation formula, which provides a method for expressing any rotation matrix in terms of a unit vector $k$ and a rotation angle $θ$ . The formula is given by: $R = I + sin (θ) K + (1 - cos (θ)) K^{2}, (14.3.1)$ where $K$ is the skew-symmetric cross-product matrix associated with $k$ . This formula shows that any rotation matrix is characterized by just three degrees of freedom, denoted as $r_{1}, r_{2}, r_{3}$ . These components are used to define the rotation vector $r$ , from which $k$ and $θ$ are derived as follows:

$k = \frac{r}{∣ r ∣}, θ = ∣ r ∣.$

Using this parameterization, rotation matrices $U$ and $V$ can each be described by three parameters.

Now we have the following code for defining $F$ in terms of $s 1$ , $s 2$ , $s 3$ , $u 1$ , $u 2$ , $u 3$ , $v 1$ , $v 2$ , $v 3$ , where $U$ and $V$ are defined by $u_{i}$ and $v_{i}$ with Rodrigues' rotation formula, $s_{i}$ are the singular values from $Σ$ .

id=IdentityMatrix[3];
var={s1,s2,s3,u1,u2,u3,v1,v2,v3};
Sigma=DiagonalMatrix[{s1,s2,s3}];
cp[k1_,k2_,k3_]={{0,-k3,k2},{k3,0,-k1},{-k2,k1,0}};
vV={v1,v2,v3};
vU={u1,u2,u3};
nv=Sqrt[Dot[vV,vV]];
nu=Sqrt[Dot[vU,vU]];
UU=cp[u1,u2,u3]/nu;
VV=cp[v1,v2,v3]/nv;
U=id+Sin[nu]*UU+(1-Cos[nu])*UU.UU;
V=id+Sin[nv]*VV+(1-Cos[nv])*VV.VV;
F=U.Sigma.Transpose[V];

where cp is a function for generating the cross-product matrix (corresponding to computing $K$ in Equation (14.3.1)).

From now on, we write the $3 \times 3 \times 3 \times 3$ tensor $\frac{\partial P}{\partial F} (Σ)$ and any other such tensors to $9 \times 9$ matrices. That means each $3 \times 3$ matrix is now a size- $9$ vector. It is easy to see the old $\frac{\partial P _{ij}}{\partial F _{k l}}$ is now $\frac{\partial P _{3 (i - 1) + j}}{\partial F _{3 (k - 1) + l}}$ . We further call vector $S = {s 1, s 2, s 3, u 1, u 2, u 3, v 1, v 2, v 3}$ being the parametrization of $F$ . Then we can apply the chain rule $\frac{\partial P}{\partial F} (Σ) = \frac{\partial P}{\partial S} (Σ) \frac{\partial S}{\partial F} (Σ)$

Here are the Mathematica code for computing them. Note that we achieve $F = Σ$ by taking the limit ${u 1, u 2, u 3, v 1, v 2, v 3} = + ϵ$ , which correspond to nearly zero rotations.

dFdS=D[Flatten[F],{var}];
dFdS0=dFdS/.{u1->e,u2->e,u3->e,v1->e,v2->e,v3->e};
dFdS1=Limit[dFdS0,e->0,Direction->-1];
dSdF0=Inverse[dFdS1];
Phat=DiagonalMatrix[{t1[s1,s2,s3],t2[s1,s2,s3],t3[s1,s2,s3]}];
P=U.Phat.Transpose[V];
dPdS=D[Flatten[P],{var}];
dPdS0=dPdS/.{u1->e,u2->e,u3->e,v1->e,v2->e,v3->e};
dPdS1=Limit[dPdS0,e->0,Direction->-1];
dPdF=Simplify[dPdS1.dSdF0];

Note 'Direction->-1' in Mathematica means taking the limit from large values to the small limit value. The Mathematica computation result will be given in terms of the singular values and $\hat{P}$ . One can then take the formula for implementing them in the code. [Stomakhin et al. 2012] gives the result where $\frac{\partial P}{\partial F} (Σ)$ (size $9 \times 9$ matrix) is permuted to be a block diagonal matrix with diagonal blocks $A^{3 \times 3}, B_{12}^{2 \times 2}, B_{13}^{2 \times 2}, B_{23}^{2 \times 2}$ , where $A = \hat{Ψ}_{, σ_{1} σ_{1}} \hat{Ψ}_{, σ_{2} σ_{1}} \hat{Ψ}_{, σ_{3} σ_{1}} \hat{Ψ}_{, σ_{1} σ_{2}} \hat{Ψ}_{, σ_{2} σ_{2}} \hat{Ψ}_{, σ_{3} σ_{2}} \hat{Ψ}_{, σ_{1} σ_{3}} \hat{Ψ}_{, σ_{2} σ_{3}} \hat{Ψ}_{, σ_{3} σ_{3}}$ and $B_{ij} = \frac{1}{σ _{i}^{2} - σ _{j}^{2}} (σ_{i} \hat{Ψ}_{, σ_{i}} - σ_{j} \hat{Ψ}_{, σ_{j}} σ_{j} \hat{Ψ}_{, σ_{i}} - σ_{i} \hat{Ψ}_{, σ_{j}} σ_{j} \hat{Ψ}_{, σ_{i}} - σ_{i} \hat{Ψ}_{, σ_{j}} σ_{i} \hat{Ψ}_{, σ_{i}} - σ_{j} \hat{Ψ}_{, σ_{j}}) .$ Denominator clamping is needed for terms in $B$ that may introduce division-by-zero (after fully simplifying them). Here we denote $\frac{\partial Ψ ^}{\partial σ _{i}}$ and $\frac{\partial ^{2} Ψ ^}{\partial σ _{i} \partial σ _{j}}$ as $\hat{Ψ}_{, σ_{i}}$ and $\hat{Ψ}_{, σ_{i} σ_{j}}$ respectively. The division by $σ_{i}^{2} - σ_{j}^{2}$ is problematic when two singular values are nearly equal or when two singular values nearly sum to zero. The latter is possible with a convention for permitting negative singular values (as in invertible elasticity [Irving et al. 2004] [Stomakhin et al. 2012]).

Expanding $B_{ij}$ in terms of partial fractions yields the useful decomposition $B_{ij} = \frac{1}{2} \frac{Ψ ^ _{, σ_{i}} - Ψ ^ _{, σ_{j}}}{σ _{i} - σ _{j}} (1111) + \frac{1}{2} \frac{Ψ ^ _{, σ_{i}} + Ψ ^ _{, σ_{j}}}{σ _{i} + σ _{j}} (1 - 1 - 1 1) .$ Note that if $\hat{Ψ}$ is invariant under permutation of the singular values, then $\hat{Ψ}_{, σ_{i}} \to \hat{Ψ}_{, σ_{j}}$ as $σ_{i} \to σ_{j}$ . Thus, the first term can normally be computed robustly for an isotropic model if implemented carefully. The other fraction can be computed robustly if $\hat{Ψ}_{, σ_{i}} + \hat{Ψ}_{, σ_{j}} \to 0$ as $σ_{i} + σ_{j} \to 0$ . But this usually does not hold as it means the constitutive model will have difficulty recovering from degenerate or inverted configurations. Thus, this term will be unbounded under some circumstances. We address this by clamping the magnitude of the denominator to not be smaller than $1 0^{- 6}$ before division to bound the derivatives.

For 2D, a rotation matrix is now simply paremetrized with a single $θ$ where the reconstruction is

$R = (cos θ sin θ - sin θ cos θ) .$ The 2D version of the whole Mathematica code is

id=IdentityMatrix[2];
var={s1,s2,u1,v1};
S=DiagonalMatrix[{s1,s2}];
U={{Cos[u1],-Sin[u1]

},{Sin[u1],Cos[u1]}};
V={{Cos[v1],-Sin[v1]},{Sin[v1],Cos[v1]}};
F=U.S.Transpose[V];
dFdS=D[Flatten[F],{var}];
dFdS0=dFdS/.{u1->e,v1->e};
dFdS1=Limit[dFdS0,e->0,Direction->-1};
dSdF0=Inverse[dFdS1];
Phat=DiagonalMatrix[{t1[s1,s2],t2[s1,s2]}];
P=U.Phat.Transpose[V];
dPdS=D[Flatten[P],{var}];
dPdS0=dPdS/.{u1->e,v1->e};
dPdS1=Limit[dPdS0,e->0,Direction->-1];
dPdF=Simplify[dPdS1.dSdF0];

where $A$ is now also $2 \times 2$ and there is only one $B$ .

Summary

Stress is a tensor field that quantifies the pressure or tension exerted on a material object. In the context of hyperelastic materials, the first Piola-Kirchhoff stress tensor $P$ plays a crucial role. It is defined as the derivative of the strain energy density function $Ψ$ , with respect to the deformation gradient $F$ , establishing a constitutive relationship between stress and strain.

In practical computations, particularly for the implicit integration of solid dynamics, it is essential to compute $P$ and its derivative $\frac{\partial P}{\partial F}$ efficiently. By leveraging the sparsity structure in diagonal space, these computations become more feasible. Here, differentiations are primarily required for $Ψ$ with respect to the principal stretches $σ_{i}$ , which simplifies the calculation process.

In the upcoming lecture, we will apply these principles to an inversion-free elasticity model, which will be demonstrated through the compressing square simulation. This application will use the concepts discussed in this chapter to address complex real-world problems in solid mechanics.

Case Study: Inversion-free Elasticity*

At the end of this chapter, we implement the Neo-Hookean model introduced in the previous lectures to simulate inversion-free elastic solids. The excutable Python project for this section can be found at https://github.com/phys-sim-book/solid-sim-tutorial under the 6_inv_free folder. MUDA GPU implementations can be found at https://github.com/phys-sim-book/solid-sim-tutorial-gpu under the simulators/6_inv_free folder. Instead of discretizing elasticity onto the springs as in the mass-spring model, we discretize the Neo-Hookean model onto triangle elements, apply chain rules to compute elastic forces according to the relation between deformation gradient $F$ and world-space nodal position $x$ , and then develop a root-finding based approach to filter the initial step size of line search for guaranteed non-inversion.

Linear Triangle Elements

In previous discussions, we learned to calculate $Ψ$ and its derivatives with respect to $F$ . For simulation, however, we require $\frac{\partial Ψ}{\partial x}$ and $\frac{\partial ^{2} Ψ}{\partial x ^{2}}$ . This necessitates a clear understanding of $F (x)$ , as it allows us to employ the chain rule to derive these derivatives with respect to $x$ effectively.

In 2D simulations, we often divide the solid domain into non-degenerate triangular elements. Assume the mapping $x = ϕ (X)$ is linear within each triangle, thus keeping the deformation gradient $F$ constant. Referencing Example 12.2.1, for a triangle defined by vertices $X_{1} X_{2} X_{3}$ , we have the equations: $x_{2} - x_{1} = F (X_{2} - X_{1}) and x_{3} - x_{1} = F (X_{3} - X_{1}),$ where $x_{i}$ denotes the world-space coordinates of the triangle vertices. This relationship leads to the expression for $F$ : $F = [x_{2} - x_{1}, x_{3} - x_{1}] [X_{2} - X_{1}, X_{3} - X_{1}]^{- 1} . (15.1.1)$ Equation (15.1.1) shows that $F$ , derived here, maps any segment within the triangle to its world-space counterpart through linear combinations of the triangle edges $X_{2} - X_{1}$ and $X_{3} - X_{1}$ . A more general and rigorous derivation of this formula will be presented in subsequent chapters.

Once $F (x)$ is established, we can calculate its derivative with respect to $x$ for each triangle as follows: $\frac{\partial [ F _{11} , F _{21} , F _{12} , F _{22} ] ^{T}}{\partial [ x _{1}^{T} , x _{2}^{T} , x _{3}^{T} ] ^{T}} = - B_{11} - B_{21} 0 - B_{12} - B_{22} 0 0 - B_{11} - B_{21} 0 - B_{12} - B_{22} B_{11} 0 B_{12} 0 0 B_{11} 0 B_{12} B_{21} 0 B_{22} 0 0 B_{21} 0 B_{22},$ where $B = [X_{2} - X_{1}, X_{3} - X_{1}]^{- 1}$ represents the inverse of the matrix formed by subtracting the first vertex from the second and third vertices. This matrix $B$ can be precomputed at initialization along with other properties such as the volume and Lame parameters of each triangle:

Implementation 15.1.1 (Precomputation of element information, simulator.py).

# rest shape basis, volume, and lame parameters
vol = [0.0] * len(e)
IB = [np.array([[0.0, 0.0]] * 2)] * len(e)
for i in range(0, len(e)):
    TB = [x[e[i][1]] - x[e[i][0]], x[e[i][2]] - x[e[i][0]]]
    vol[i] = np.linalg.det(np.transpose(TB)) / 2
    IB[i] = np.linalg.inv(np.transpose(TB))
mu_lame = [0.5 * E / (1 + nu)] * len(e)
lam = [E * nu / ((1 + nu) * (1 - 2 * nu))] * len(e)

The Young's modulus and Poisson's ratio:

E = 1e5         # Young's modulus
nu = 0.4        # Poisson's ratio

Here, e no longer stores all edge elements as in mass-spring models but represents all triangle elements, which can be generated by modifying the meshing code as follows:

Implementation 15.1.2 (Assembling per-triangle vertex indices, square_mesh.py).

    # connect the nodes with triangle elements
    e = []
    for i in range(0, n_seg):
        for j in range(0, n_seg):
            # triangulate each cell following a symmetric pattern:
            if (i % 2)^(j % 2):
                e.append([i * (n_seg + 1) + j, (i + 1) * (n_seg + 1) + j, i * (n_seg + 1) + j + 1])
                e.append([(i + 1) * (n_seg + 1) + j, (i + 1) * (n_seg + 1) + j + 1, i * (n_seg + 1) + j + 1])
            else:
                e.append([i * (n_seg + 1) + j, (i + 1) * (n_seg + 1) + j, (i + 1) * (n_seg + 1) + j + 1])
                e.append([i * (n_seg + 1) + j, (i + 1) * (n_seg + 1) + j + 1, i * (n_seg + 1) + j + 1])

Triangles are arranged in a symmetric pattern and can be rendered by drawing the three edges:

Implementation 15.1.3 (Draw triangles, simulator.py).

        pygame.draw.aaline(screen, (0, 0, 255), screen_projection(x[eI[0]]), screen_projection(x[eI[1]]))
        pygame.draw.aaline(screen, (0, 0, 255), screen_projection(x[eI[1]]), screen_projection(x[eI[2]]))
        pygame.draw.aaline(screen, (0, 0, 255), screen_projection(x[eI[2]]), screen_projection(x[eI[0]]))

Computing Energy, Gradient, and Hessian

We first follow sections Strain Energy and Stress and Its Derivatives to implement computing $Ψ (F)$ , $P = \frac{\partial Ψ}{\partial F}$ , and SPD-projected $\frac{\partial P}{\partial F}$ :

Implementation 15.2.1 (Energy derivatives w.r.t. $F$ , NeoHookeanEnergy.py).

import utils
import numpy as np
import math

def polar_svd(F):
    [U, s, VT] = np.linalg.svd(F)
    if np.linalg.det(U) < 0:
        U[:, 1] = -U[:, 1]
        s[1] = -s[1]
    if np.linalg.det(VT) < 0:
        VT[1, :] = -VT[1, :]
        s[1] = -s[1]
    return [U, s, VT]

def dPsi_div_dsigma(s, mu, lam):
    ln_sigma_prod = math.log(s[0] * s[1])
    inv0 = 1.0 / s[0]
    dPsi_dsigma_0 = mu * (s[0] - inv0) + lam * inv0 * ln_sigma_prod
    inv1 = 1.0 / s[1]
    dPsi_dsigma_1 = mu * (s[1] - inv1) + lam * inv1 * ln_sigma_prod
    return [dPsi_dsigma_0, dPsi_dsigma_1]

def d2Psi_div_dsigma2(s, mu, lam):
    ln_sigma_prod = math.log(s[0] * s[1])
    inv2_0 = 1 / (s[0] * s[0])
    d2Psi_dsigma2_00 = mu * (1 + inv2_0) - lam * inv2_0 * (ln_sigma_prod - 1)
    inv2_1 = 1 / (s[1] * s[1])
    d2Psi_dsigma2_11 = mu * (1 + inv2_1) - lam * inv2_1 * (ln_sigma_prod - 1)
    d2Psi_dsigma2_01 = lam / (s[0] * s[1])
    return [[d2Psi_dsigma2_00, d2Psi_dsigma2_01], [d2Psi_dsigma2_01, d2Psi_dsigma2_11]]

def B_left_coef(s, mu, lam):
    sigma_prod = s[0] * s[1]
    return (mu + (mu - lam * math.log(sigma_prod)) / sigma_prod) / 2

def Psi(F, mu, lam):
    J = np.linalg.det(F)
    lnJ = math.log(J)
    return mu / 2 * (np.trace(np.transpose(F).dot(F)) - 2) - mu * lnJ + lam / 2 * lnJ * lnJ

def dPsi_div_dF(F, mu, lam):
    FinvT = np.transpose(np.linalg.inv(F))
    return mu * (F - FinvT) + lam * math.log(np.linalg.det(F)) * FinvT

def d2Psi_div_dF2(F, mu, lam):
    [U, sigma, VT] = polar_svd(F)

    Psi_sigma_sigma = utils.make_PSD(d2Psi_div_dsigma2(sigma, mu, lam))

    B_left = B_left_coef(sigma, mu, lam)
    Psi_sigma = dPsi_div_dsigma(sigma, mu, lam)
    B_right = (Psi_sigma[0] + Psi_sigma[1]) / (2 * max(sigma[0] + sigma[1], 1e-6))
    B = utils.make_PSD([[B_left + B_right, B_left - B_right], [B_left - B_right, B_left + B_right]])

    M = np.array([[0, 0, 0, 0]] * 4)
    M[0, 0] = Psi_sigma_sigma[0, 0]
    M[0, 3] = Psi_sigma_sigma[0, 1]
    M[1, 1] = B[0, 0]
    M[1, 2] = B[0, 1]
    M[2, 1] = B[1, 0]
    M[2, 2] = B[1, 1]
    M[3, 0] = Psi_sigma_sigma[1, 0]
    M[3, 3] = Psi_sigma_sigma[1, 1]

    dP_div_dF = np.array([[0, 0, 0, 0]] * 4)
    for j in range(0, 2):
        for i in range(0, 2):
            ij = j * 2 + i
            for s in range(0, 2):
                for r in range(0, 2):
                    rs = s * 2 + r
                    dP_div_dF[ij, rs] = M[0, 0] * U[i, 0] * VT[0, j] * U[r, 0] * VT[0, s] \
                        + M[0, 3] * U[i, 0] * VT[0, j] * U[r, 1] * VT[1, s] \
                        + M[1, 1] * U[i, 1] * VT[0, j] * U[r, 1] * VT[0, s] \
                        + M[1, 2] * U[i, 1] * VT[0, j] * U[r, 0] * VT[1, s] \
                        + M[2, 1] * U[i, 0] * VT[1, j] * U[r, 1] * VT[0, s] \
                        + M[2, 2] * U[i, 0] * VT[1, j] * U[r, 0] * VT[1, s] \
                        + M[3, 0] * U[i, 1] * VT[1, j] * U[r, 0] * VT[0, s] \
                        + M[3, 3] * U[i, 1] * VT[1, j] * U[r, 1] * VT[1, s]
    return dP_div_dF

Next, we implement computing $F (x)$ , and the tensor products with $\frac{\partial F}{\partial x}$ for chain rule based computation of elasticity energy gradient and Hessian:

Implementation 15.2.2 (Energy derivatives w.r.t. $x$ , NeoHookeanEnergy.py).

def deformation_grad(x, elemVInd, IB):
    F = [x[elemVInd[1]] - x[elemVInd[0]], x[elemVInd[2]] - x[elemVInd[0]]]
    return np.transpose(F).dot(IB)

def dPsi_div_dx(P, IB):  # applying chain-rule, dPsi_div_dx = dPsi_div_dF * dF_div_dx
    dPsi_dx_2 = P[0, 0] * IB[0, 0] + P[0, 1] * IB[0, 1]
    dPsi_dx_3 = P[1, 0] * IB[0, 0] + P[1, 1] * IB[0, 1]
    dPsi_dx_4 = P[0, 0] * IB[1, 0] + P[0, 1] * IB[1, 1]
    dPsi_dx_5 = P[1, 0] * IB[1, 0] + P[1, 1] * IB[1, 1]
    return [np.array([-dPsi_dx_2 - dPsi_dx_4, -dPsi_dx_3 - dPsi_dx_5]), np.array([dPsi_dx_2, dPsi_dx_3]), np.array([dPsi_dx_4, dPsi_dx_5])]

def d2Psi_div_dx2(dP_div_dF, IB):  # applying chain-rule, d2Psi_div_dx2 = dF_div_dx^T * d2Psi_div_dF2 * dF_div_dx (note that d2F_div_dx2 = 0)
    intermediate = np.array([[0.0, 0.0, 0.0, 0.0]] * 6)
    for colI in range(0, 4):
        _000 = dP_div_dF[0, colI] * IB[0, 0]
        _010 = dP_div_dF[0, colI] * IB[1, 0]
        _101 = dP_div_dF[2, colI] * IB[0, 1]
        _111 = dP_div_dF[2, colI] * IB[1, 1]
        _200 = dP_div_dF[1, colI] * IB[0, 0]
        _210 = dP_div_dF[1, colI] * IB[1, 0]
        _301 = dP_div_dF[3, colI] * IB[0, 1]
        _311 = dP_div_dF[3, colI] * IB[1, 1]
        intermediate[2, colI] = _000 + _101
        intermediate[3, colI] = _200 + _301
        intermediate[4, colI] = _010 + _111
        intermediate[5, colI] = _210 + _311
        intermediate[0, colI] = -intermediate[2, colI] - intermediate[4, colI]
        intermediate[1, colI] = -intermediate[3, colI] - intermediate[5, colI]
    result = np.array([[0.0, 0.0, 0.0, 0.0, 0.0, 0.0]] * 6)
    for colI in range(0, 6):
        _000 = intermediate[colI, 0] * IB[0, 0]
        _010 = intermediate[colI, 0] * IB[1, 0]
        _101 = intermediate[colI, 2] * IB[0, 1]
        _111 = intermediate[colI, 2] * IB[1, 1]
        _200 = intermediate[colI, 1] * IB[0, 0]
        _210 = intermediate[colI, 1] * IB[1, 0]
        _301 = intermediate[colI, 3] * IB[0, 1]
        _311 = intermediate[colI, 3] * IB[1, 1]
        result[2, colI] = _000 + _101
        result[3, colI] = _200 + _301
        result[4, colI] = _010 + _111
        result[5, colI] = _210 + _311
        result[0, colI] = -_000 - _101 - _010 - _111
        result[1, colI] = -_200 - _301 - _210 - _311
    return result

Finally, Neo-Hookean energy value, gradient, and Hessian on the entire mesh can be computed as follows:

Implementation 15.2.3 (Energy value, Gradient, and Hessian, NeoHookeanEnergy.py).

def val(x, e, vol, IB, mu, lam):
    sum = 0.0
    for i in range(0, len(e)):
        F = deformation_grad(x, e[i], IB[i])
        sum += vol[i] * Psi(F, mu[i], lam[i])
    return sum

def grad(x, e, vol, IB, mu, lam):
    g = np.array([[0.0, 0.0]] * len(x))
    for i in range(0, len(e)):
        F = deformation_grad(x, e[i], IB[i])
        P = vol[i] * dPsi_div_dF(F, mu[i], lam[i])
        g_local = dPsi_div_dx(P, IB[i])
        for j in range(0, 3):
            g[e[i][j]] += g_local[j]
    return g

def hess(x, e, vol, IB, mu, lam):
    IJV = [[0] * (len(e) * 36), [0] * (len(e) * 36), np.array([0.0] * (len(e) * 36))]
    for i in range(0, len(e)):
        F = deformation_grad(x, e[i], IB[i])
        dP_div_dF = vol[i] * d2Psi_div_dF2(F, mu[i], lam[i])
        local_hess = d2Psi_div_dx2(dP_div_dF, IB[i])
        for xI in range(0, 3):
            for xJ in range(0, 3):
                for dI in range(0, 2):
                    for dJ in range(0, 2):
                        ind = i * 36 + (xI * 3 + xJ) * 4 + dI * 2 + dJ
                        IJV[0][ind] = e[i][xI] * 2 + dI
                        IJV[1][ind] = e[i][xJ] * 2 + dJ
                        IJV[2][ind] = local_hess[xI * 2 + dI, xJ * 2 + dJ]
    return IJV

Filter Line Search for Non-Inversion

To guarantee non-inversion just like non-interpenetration (see Filter Line Search) during the simulation, we can similarly filter the line search initial step size with a critical step size $α^{I}$ that first brings the volume of any triangles to $0$ . This can be obtained by solving a 1D equation per triangle: $V (x_{i} + α^{I} p_{i}) = 0, (15.3.1)$ and taking the minimum of the solved step sizes. Here $p_{i}$ is the search direction of node $i$ , and in 2D, Equation (15.3.1) is equivalent to: $det ([x_{21}^{α}, x_{31}^{α}]) \equiv x_{21, 1}^{α} x_{31, 2}^{α} - x_{21, 2}^{α} x_{31, 1}^{α} = 0 (15.3.2)$ with $x_{ij}^{α} = x_{ij} + α^{I} p_{ij}$ and $x_{ij} = x_{i} - x_{j}$ , $p_{ij} = p_{i} - p_{j}$ . Expanding Equation (15.3.2) we obtain: $(x_{21, 1} + α^{I} p_{21, 1}) (x_{31, 2} + α^{I} p_{31, 2}) - (x_{21, 2} + α^{I} p_{21, 2}) (x_{31, 1} + α^{I} p_{31, 1}) = 0,$ which can be reorganized as a quadratic equation of $α^{I}$ : $det ([p_{21}, p_{31}]) (α^{I})^{2} + (det ([x_{21}, p_{31}]) + det ([p_{21}, x_{31}])) α^{I} + det ([x_{21}, x_{31}]) = 0.$ Here, note that $det ([p_{21}, p_{31}])$ can be very tiny when the nodes do not move much or when their movement barely changes to triangle area in the current timestep, thus the equation can be degenerated into a linear one. To robustly detect this degenerate case, we cannot directly check whether $det ([p_{21}, p_{31}])$ is $0$ due to numerical errors. In fact, checking whether $det ([p_{21}, p_{31}])$ is below an epsilon is still tricky, because the scale of $det ([p_{21}, p_{31}])$ heavily depends on the scene dimension and nodal velocity during the simulation. Therefore, we use $det ([x_{21}, x_{31}])$ as a scaling and obtain a scaled but equivalent equation: $\frac{det ([ p _{21} , p _{31} ])}{det ([ x _{21} , x _{31} ])} (α^{I})^{2} + \frac{det ([ x _{21} , p _{31} ]) + det ([ p _{21} , x _{31} ])}{det ([ x _{21} , x _{31} ])} α^{I} + 1 = 0, (15.3.3)$ where magnitude checks can be safely performed on any coefficients with unitless thresholds.

In practice, we also need to allow some slackness so that the step size to be taken will not lead to an exactly $0$ volume. Thus, we solve $α^{I}$ such that it first decreases the volume of any triangles by $90%$ , which can be realized by modifying the constant term coefficient in Equation (15.3.3) from $1$ to $0.9$ :

Implementation 15.3.1 (Filter line search, NeoHookeanEnergy.py).

def init_step_size(x, e, p):
    alpha = 1
    for i in range(0, len(e)):
        x21 = x[e[i][1]] - x[e[i][0]]
        x31 = x[e[i][2]] - x[e[i][0]]
        p21 = p[e[i][1]] - p[e[i][0]]
        p31 = p[e[i][2]] - p[e[i][0]]
        detT = np.linalg.det(np.transpose([x21, x31]))
        a = np.linalg.det(np.transpose([p21, p31])) / detT
        b = (np.linalg.det(np.transpose([x21, p31])) + np.linalg.det(np.transpose([p21, x31]))) / detT
        c = 0.9  # solve for alpha that first brings the new volume to 0.1x the old volume for slackness
        critical_alpha = utils.smallest_positive_real_root_quad(a, b, c)
        if critical_alpha > 0:
            alpha = min(alpha, critical_alpha)
    return alpha

Here, if the equation does not have a positive real root, that means for this specific triangle, the step size can be taken arbitrarily large and it will not trigger inversion.

The quadratic equation can be solved as

Implementation 15.3.2 (Solve quadratic equation, utils.py).

def smallest_positive_real_root_quad(a, b, c, tol = 1e-6):
    # return negative value if no positive real root is found
    t = 0
    if abs(a) <= tol:
        if abs(b) <= tol: # f(x) = c > 0 for all x
            t = -1
        else:
            t = -c / b
    else:
        desc = b * b - 4 * a * c
        if desc > 0:
            t = (-b - math.sqrt(desc)) / (2 * a) # if a > 0, this is either the smaller positive root, or both roots are negative; 
            # if a < 0, there are 1 negative and 1 positive real roots, and we just need the positive one.
            if t < 0:
                t = (-b + math.sqrt(desc)) / (2 * a)
        else: # desv<0 ==> imag, f(x) > 0 for all x > 0
            t = -1
    return t

With scaled coefficients, we simply use a unitless threshold, e.g. 1e-6, to check for degeneracies. If no positive real roots are found, the function simply returns -1.

Now as we filter the initial step size in addition to non-interpenetration:

Implementation 15.3.3 (Apply filter, time_integrator.py).

        alpha = min(BarrierEnergy.init_step_size(x, n, o, p), NeoHookeanEnergy.init_step_size(x, e, p))  # avoid interpenetration, tunneling, and inversion

and make sure all added data structures and modified functions are reflected in the time integrator, we can finally simulate the compressing square example from Moving Boundary Condition with guaranteed non-inversion (see Figure 15.3.1).

**Figure 15.3.1.** A square is dropped onto the ground and compressed severely by a ceiling while maintaining inversion-free throughout the simulation. The ground has friction coefficient $0.11$ so that the bottom of the square slides less than the top, where the ceiling has no friction.

Summary

We have successfully implemented an inversion-free 2D elasticity simulation by discretizing the Neo-Hookean model using linear triangle elements.

By maintaining a linearly varying displacement field within each triangle, we can directly calculate a constant deformation gradient $F$ for each triangle using both the material and world space coordinates of the vertices. This foundational setup facilitates the computation of the Neo-Hookean energy, as well as its gradient and Hessian with respect to $x$ , by applying the chain rule. These calculations are essential for the optimization-based time integration discussed in previous lectures.

To ensure the simulation remains free of both interpenetration and inversion, we adopt a similar strategy as previously described: the initial step size in the line search is determined by solving a quadratic equation for each triangle. This equation calculates a critical step size that reduces the triangle's volume by 90%. The smallest of these critical step sizes across all triangles is then used to initialize the line search, ensuring robustness against both non-interpenetration and non-inversion.

In the upcoming chapter, we will delve into the derivation of the governing equations for hyperelastic solids, providing a detailed explanation of each step to further solidify understanding.

Strong and Weak Forms

The update rules (refer to Equation (1.5.1)) and the corresponding optimization problems (refer to Equation (2.1.1)) utilized in solids simulation are derived by discretizing the conservation laws—our governing equations—from their continuous forms. This chapter will explore the derivation of both the strong and weak forms of these conservation laws. We will then discuss the methods for their temporal and spatial discretizations, which are essential for formulating the discrete problems we aim to solve.

The fundamental governing equations central to our study are the conservation of mass and the conservation of momentum (Newton's Second Law). We will outline these equations below and provide detailed derivations later in this lecture.

Definition 16.1 (Strong Form). Letting $V (X, t) = \frac{\partial ϕ ( X , t )}{\partial t} = \frac{\partial x ( X , t )}{\partial t}$ be the velocity defined over $X$ , the equations are [Gonzalez & Stuart 2008]: $R (X, t) J (X, t) = R (X, 0) R (X, 0) \frac{\partial V}{\partial t} (X, t) = \nabla^{X} \cdot P (X, t) + R (X, 0) g Conservation of mass, Conservation of momentum,$ where $X \in Ω_{0}$ and $t \geq 0$ . Here $R$ is the mass density, $J (X, t) = det F (X, t)$ , $P$ is the first Piola-Kirchoff stress, and $g$ is the constant gravitational acceleration. Note that $J (X, 0) = 1$ , and the mass conservation can also be written as $\frac{\partial}{\partial t} (R (X, t) J (X, t)) = 0.$

These equations are initially presented in their strong form. In this lecture, we will also derive the equivalent weak form of the force balance equation (conservation of momentum). The weak form reformulates the conservation law using integral expressions, which are crucial for the subsequent derivation of the temporal and spatial discretizations of the equations using the Finite Element Method.

Conservation of Mass

We can think of the mass density $R (X, t)$ to be naturally defined over $Ω^{0}$ as $R (X, t) = ϵ \to + 0 lim \frac{mass ( B _{ϵ}^{t} )}{volume ( B _{ϵ}^{t} )} = ϵ \to + 0 lim \frac{mass ( B _{ϵ}^{t} )}{\int _{B_{ϵ}^{t}} d x} (16.1.1)$ where $B_{ϵ}^{t}$ is the world space counterpart of $B_{ϵ}^{0}$ (the ball of radius $ϵ$ surrounding an arbitrary $X \in Ω^{0}$ ). This is arguably a natural definition since $mass (B_{ϵ}^{t})$ should be a measurable quantity. Conservation of mass can be expressed as $mass (B_{ϵ}^{t}) = mass (B_{ϵ}^{0}), \forall B_{ϵ}^{0} \subset Ω^{0} and t \geq 0. (16.1.2)$

Now, with a change of variables, we have $\int_{B_{ϵ}^{t}} d x = \int_{B_{ϵ}^{0}} J (X, t) d X$ , so Equation (16.1.1) becomes $R (X, t) = ϵ \to + 0 lim \frac{mass ( B _{ϵ}^{t} )}{\int _{B_{ϵ}^{0}} J ( X , t ) d X}, (16.1.3)$ and so $R (X, 0) = ϵ \to + 0 lim \frac{mass ( B _{ϵ}^{0} )}{\int _{B_{ϵ}^{0}} d X} (16.1.4)$ since $J (X, 0) = 1$ . Then combining Equations (16.1.2), (16.1.3), and (16.1.4), we can express the conservation of mass as $\int_{B_{ϵ}^{0}} R (X, t) J (X, t) d X = \int_{B_{ϵ}^{0}} R (X, 0) d X, \forall B_{ϵ}^{0} \subset Ω^{0} and t \geq 0.$ This just says that the mass in $B_{ϵ}^{t}$ (as expressed via an integral of the mass density) should not change with time. This set is associated with a subset of the material at time $t$ and as it evolves in the flow, the material will take up more or less space, but there will always be the same amount (mass) of material in the set. Since $B_{ϵ}^{0}$ is arbitrary, it must be true that $R (X, t) J (X, t) = R (X, 0), \forall X \in Ω^{0} and t \geq 0.$

Remark 16.1.1 (Lagrangian and Eulerian Views). In simulation methods that discretize and track materials directly based on $Ω^{0}$ , conservation of mass is inherently satisfied. For instance, in our Finite Element Method (FEM) simulator, $Ω^{0}$ is segmented into triangles, with the mass of each triangle remaining constant regardless of deformation throughout the simulation. This approach is known as the Lagrangian method. In contrast, Eulerian methods discretize and evolve physical quantities based on $Ω^{t}$ and often need to specially deal with mass conservation.

Conservation of Momentum

In the continuous setting, forces are categorized into body forces (also known as external forces, such as gravity) and surface forces (or internal forces, typically stress-based, like those arising from elasticity). We define stress-based forces through a traction field, whose existence is assumed. The traction, or force per unit area, is represented by the field $T (\cdot, N, t) : Ω^{0} \to R^{d}$ and is defined by the equation: $force_{S} (B_{ϵ}^{0}) = \int_{\partial B_{ϵ}^{0}} T (X, N (X)) d s (X),$ where $N$ represents the outward-pointing normal direction in the material space. Here, $force_{S} (B_{ϵ}^{0})$ denotes the net force exerted from the material outside $\partial B_{ϵ}^{0}$ on the material inside $B_{ϵ}^{0}$ through their interface. The function $T (X, N, t)$ quantifies the force per unit area ( $d = 3$ ) or length ( $d = 2$ ) that material on the $N^{+}$ side exerts at point $X$ on material on the $N^{-}$ side.

It can be shown that this implies the existence of a stress field (first Piola-Kirchoff stress) $P (\cdot, t) : Ω^{0} \to R^{d \times d}$ with: $T (X, N, t) = P (X, t) N .$

Then, by applying Newton's second law on $B_{ϵ}^{0}$ , we can express the conservation of momentum as: $= \int_{B_{ϵ}^{0}} R (X, 0) \frac{\partial V}{\partial t} (X, t) d X \int_{\partial B_{ϵ}^{0}} P (X, t) N (X) d s (X) + \int_{B_{ϵ}^{0}} R (X, 0) A^{ext} (X, t) d X, (16.2.1)$ for all $B_{ϵ}^{0} \subset Ω^{0}$ and $t \geq 0$ .

Applying the divergence theorem, we can transform the boundary integral in Equation (16.2.1) into a volume integral and obtain: $= \int_{B_{ϵ}^{0}} R (X, 0) \frac{\partial V}{\partial t} (X, t) d X \int_{B_{ϵ}^{0}} \nabla^{X} \cdot P (X, t) d X + \int_{B_{ϵ}^{0}} R (X, 0) A^{ext} (X, t) d X, (16.2.2)$ for all $B_{ϵ}^{0} \subset Ω^{0}$ and $t \geq 0$ .

Definition 16.2.1 (Divergence Theorem for Vectors). For a vector-valued function $f (x) : Ω \to R^{d}$ defined on a closed domain $Ω$ , let $n (x)$ be the outward-pointing normal on the boundary of this domain, the following equality holds: $\int_{\partial Ω} f \cdot n d s (x) = \int_{Ω} \nabla \cdot f d x .$ This theorem allows us to conveniently transform between boundary and volume integrals.

Here the divergence operator $\nabla \cdot$ acts on every row vector of $P$ independently and results in a column vector: $(\nabla^{X} \cdot P)_{i} = \sum_{j} P_{ij, j}$ . Since Equation (16.2.2) also holds for arbitrary $B_{ϵ}^{0}$ , we arrive at the strong form of the force balance equation by removing the integration: $R (X, 0) \frac{\partial V}{\partial t} (X, t) = \nabla^{X} \cdot P (X, t) + R (X, 0) A^{ext} (X, t), \forall X \in Ω^{0} and t \geq 0. (16.2.3)$

Remark 16.2.1 (Momentum Conservation in Solid Simulation). Conservation of momentum is the primary governing equation we use to simulate solids. As discussed previously, both the acceleration, denoted by $\frac{\partial V}{\partial t} (X, t)$ , and the internal force, expressed as $\nabla^{X} \cdot P (X, t)$ , can be described using world space coordinates $x$ . With all other relevant quantities established, we incrementally solve for $x$ to get dynamic motions step by step.

Weak Form

First, since the external force term $R (X, 0) A^{ext} (X, t)$ resembles a lot to the time derivative of the momentum on the left-hand side, we will ignore it during the derivation for simplicity. Then, for an arbitrary test function $Q (\cdot, t) : Ω^{0} \to R^{d}$ , let's compute the dot product to both sides of Equation (16.2.3) and integrate over $Ω^{0}$ to generate the weak form: $= \int_{Ω^{0}} R (X, 0) Q (X, t) \cdot A (X, t) d X \int_{Ω^{0}} Q (X, t) \cdot (\nabla^{X} \cdot P (X, t)) d X, \forall Q (\cdot, t) : Ω^{0} \to R^{d} and t \geq 0. (16.3.1)$ Here we denote $A (X, t) = \frac{\partial V}{\partial t} (X, t)$ . Without going into details on finite element analysis, we claim that the weak form is sufficiently equivalent to the strong form since Equation (16.3.1) is required to hold for arbitrary $Q (\cdot, t)$ , and solving the weak form provides us a solution that is a "good enough" soution to the original problem.

With index notation where $A_{i}$ means the $i$ -th component of vector-valued function $A : Ω^{0} \to R^{d}$ , and $A_{i, j}$ means $\frac{\partial A _{i}}{\partial X _{j}}$ , we can rewrite Equation (16.3.1) as $\int_{Ω^{0}} R (X, 0) i \sum Q_{i} (X, t) A_{i} (X, t) d X = \int_{Ω^{0}} i \sum Q_{i} (X, t) j \sum P_{ij, j} (X, t) d X . (16.3.2)$ If we further omit the summation symbol and let the repetitive subscripts represent summation (this is known as Einstein notation), we obtain $\int_{Ω^{0}} R (X, 0) Q_{i} (X, t) A_{i} (X, t) d X = \int_{Ω^{0}} Q_{i} (X, t) P_{ij, j} (X, t) d X . (16.3.3)$ Now applying Integration By Parts on the right-hand side, we can rewrite Equation (16.3.3) as $\int_{Ω^{0}} R (X, 0) Q_{i} (X, t) A_{i} (X, t) d X = \int_{Ω^{0}} (\nabla \cdot (Q_{i} (X, t) P_{i} (X, t)) - \nabla Q_{i} (X, t) \cdot P_{i} (X, t)) d X = \int_{Ω^{0}} ((Q_{i} (X, t) P_{ij} (X, t))_{, j} - Q_{i, j} (X, t) P_{ij} (X, t)) d X . (16.3.4)$

Definition 16.3.1 (Integration By Parts). For a scalar-valued function $u (x)$ and a vector-valued function (vector field) $V (x)$ , the product rule for divergence states that: $\nabla \cdot (u (x) V (x)) = u (x) \nabla \cdot V (x) + \nabla u (x) \cdot V (x) .$ Integrating both sides on domain $Ω$ then gives: $\int_{Ω} \nabla \cdot (u (x) V (x)) d x = \int_{Ω} u (x) \nabla \cdot V (x) d x + \int_{Ω} \nabla u (x) \cdot V (x) d x .$

Then if we further apply the divergence theorem on the first part of the right-hand side of Equation (16.3.4), we obtain $\int_{Ω^{0}} R (X, 0) Q_{i} (X, t) A_{i} (X, t) d X = \int_{\partial Ω^{0}} Q_{i} (X, t) P_{ij} (X, t) N_{j} (X) d s (X) - \int_{Ω^{0}} Q_{i, j} (X, t) P_{ij} (X, t) d X . (16.3.5)$ The quantity $P_{ij} N_{j}$ would be specified as a boundary condition. If we let $T (X, t)$ be the boundary force per unit reference area (traction) with $T_{i} = P_{ij} N_{j}$ , then we can say that the conservation of momentum implies that $\forall Q (\cdot, t) : Ω^{0} \to R^{d}$ $\int_{Ω^{0}} R (X, 0) Q_{i} (X, t) A_{i} (X, t) d X = \int_{\partial Ω^{0}} Q_{i} (X, t) T_{i} (X, t) d s (X) - \int_{Ω^{0}} Q_{i, j} (X, t) P_{ij} (X, t) d X . (16.3.6)$ This is momentum conservation's weak form written in $Ω^{0}$ .

Remark 16.3.1 (Why Weak Form). In finite element method (FEM) for solids, conservation of momentum is formulated in the weak form rather than directly discretizing the strong form due to specific advantages. The strong form requires the displacement field and its derivatives to be continuously differentiable across the entire domain, which is difficult to achieve in practical scenarios involving complex geometries or material discontinuities. On the other hand, the weak form only requires the displacement field itself to be continuous, relaxing the need for continuous derivatives. This makes the weak form more adaptable to irregular mesh geometries and better suited for incorporating boundary conditions and handling interface problems. The weak form's integration-based approach reduces the sensitivity to local irregularities, making it more stable and robust for numerical computation in solid mechanics. Thus, while the strong form provides a direct representation of physical laws, its direct discretization is less practical for the computational demands and complexities typical in FEM analyses.

Summary

In this lecture, we derived the strong forms of the governing equations—conservation of mass and conservation of momentum—focusing on an infinitesimal region within the simulation domain. The conservation of momentum equation was transformed from surface to volume integrals using the divergence theorem.

For Lagrangian simulation methods, such as FEM solid simulation, which discretize and monitor physical quantities based on the material space $Ω^{0}$ , the conservation of mass is inherently maintained. We then progressed to deriving the weak form of conservation of momentum. This involved integrating the dot product between the momentum terms and an arbitrary test function. The weak form is effectively equivalent to the strong form because the integral equation must satisfy any arbitrary test function. Techniques such as integration by parts and the application of the divergence theorem were essential in this derivation.

In our next lecture, we will discretize the weak form both temporally and spatially, further refining our approach to solve the discrete problems examined in our case studies.

Discretization of Weak Forms

In this lecture, we will discretize the weak form of the momentum conservation equation (temporarily ignoring body forces) in both space and time to reach the discrete form—a system of equations introduced in the first lecture.

We will begin by focusing on a specific point in time, $t = t^{n}$ . From the weak form of the momentum conservation equation (Equation (16.3.6)), we have: $\int_{Ω^{0}} R^{0} (X) Q_{i}^{n} (X) A_{i}^{n} (X) d X = \int_{\partial Ω^{0}} Q_{i}^{n} (X) T_{i}^{n} (X) d s (X) - \int_{Ω^{0}} Q_{i, j}^{n} (X) P_{ij}^{n} (X) d X, (17.1)$ for arbitrary $Q^{n} (X)$ , where the superscript $n$ denotes quantities measured at $t = t^{n}$ . Here:

$R$ and $T$ are specified by the simulation setup,
$P$ can be calculated from the degrees of freedom $x$ via a constitutive law,
$A = \frac{\partial ^{2} x}{\partial t ^{2}}$ is the second-order time derivative of $x$ , and
$Q$ is an arbitrary vector field.

Discrete Space

To enable numerical evaluation of the integrals in the weak form, the first step is to discretize the smooth vector fields $x$ and $Q$ . This allows them to be represented by a finite set of samples, along with appropriate interpolation functions.

Example 17.1.1 (1D Function Interpolation). In 1D, to approximate a function $f (x)$ using three sample points $x_{1} = 1$ , $x_{2} = 2$ , $x_{3} = 3$ (Figure 17.1.1), we can use interpolation functions $N_{i} (x) = 1 - ∣ x - x_{i} ∣$ and form $f (x) \approx \sum_{i} f (x_{i}) N_{i} (x)$ .

Figure 17.1.1. With interpolation functions $N_{1} (x)$ , $N_{2} (x)$ , $N_{3} (x)$ and sample points $x_{1} = 1$ , $x_{2} = 2$ , $x_{3} = 3$ , a function $f (x)$ can be approximated as $\sum_{i} f (x_{i}) N_{i} (x)$ .

Given a set of sample points indexed by $a$ or $b$ in the simulation domain, we can approximate the test function $Q$ and the DOF $x$ as: $Q_{i} (X, t^{n}) x_{i} (X, t^{n}) \approx a \sum Q_{a ∣ i} (t^{n}) N_{a} (X) = a \sum Q_{a ∣ i}^{n} N_{a} (X), \approx b \sum x_{b ∣ i} (t^{n}) N_{b} (X) = b \sum x_{b ∣ i}^{n} N_{b} (X),$ where $Q_{a ∣ i}^{n} = Q_{a ∣ i} (t^{n})$ refers to the $i$ -th dimension of $Q$ evaluated at sample point $a$ at time $t^{n}$ , and $N_{a} (X) : Ω^{0} \to R$ is the interpolation function at sample point $a$ . In this way, we similarly have: $A_{i} (X, t^{n}) \approx b \sum A_{b ∣ i} (t^{n}) N_{b} (X) = b \sum A_{b ∣ i}^{n} N_{b} (X) . (17.1.1)$ Plugging these discretizations into the weak form (Equation (17.1)) and expressing summations with the index notation, we obtain: $\int_{Ω^{0}} R (X, 0) Q_{a ∣ i}^{n} N_{a} (X) A_{b ∣ i}^{n} N_{b} (X) d X = \int_{\partial Ω^{0}} Q_{a ∣ i}^{n} N_{a} (X) T_{i} (X, t^{n}) d s (X) - \int_{Ω^{0}} Q_{a ∣ i}^{n} N_{a, j} (X) P_{ij} (X, t^{n}) d X .$ On the left-hand side, we see that the sample values $Q_{a ∣ i}^{n}$ and $A_{b ∣ i}^{n}$ are in fact independent of $X$ , so we can move them out of the integral and obtain: $M_{ab} Q_{a ∣ i}^{n} A_{b ∣ i}^{n} = \int_{\partial Ω^{0}} Q_{a ∣ i}^{n} N_{a} (X) T_{i} (X, t^{n}) d s (X) - \int_{Ω^{0}} Q_{a ∣ i}^{n} N_{a, j} (X) P_{ij} (X, t^{n}) d X$ where $M_{ab} = \int_{Ω^{0}} R (X, 0) N_{a} (X) N_{b} (X) d X (17.1.2)$ is the mass matrix.

Remark 17.1.1 (Mass Matrix Properties). The mass matrix $M$ (Equation (17.1.2)) is symmetric and positive semi-definite because it can be expressed as: $\int_{Ω^{0}} B B^{T} d X,$ where $B_{i} = R (X, 0) N_{i} (X)$ . Thus, for any vector $z$ , $z^{T} M z = \int_{Ω^{0}} (z^{T} B)^{2} d X \geq 0.$ In practice, this mass matrix may be singular. To address this, we typically use a "mass lumping" strategy to approximate the mass matrix with a diagonal and positive definite form. This is achieved by summing each row and defining: $M_{ab}^{lump} = δ_{ab} c \sum M_{a c} .$

After spatial discretization, the solution of the weak form (Equation (17.1)) is confined to $d$ $n$ -dimensional function spaces, where $n$ represents the number of sample points, assuming all interpolation functions are mutually orthogonal. This means that there could be continuous solutions to the weak form outside of our solution space. In such cases, we can only provide an approximate solution based on the chosen sample points and interpolation functions.

Definition 17.1.1 (Orthogonal Functions). Similar to the orthogonality of two vectors $a$ and $b$ , defined as $a^{T} b = 0$ , the orthogonality of two functions $f (x)$ and $g (x)$ is defined as: $\int f (x) g (x) d x = 0.$ Just as a basis of vectors can span a finite-dimensional space, orthogonal functions can form an infinite basis for a function space. Conceptually, the integral above is analogous to a vector dot product.

That being said, to generate equations solvable for the unknowns, the arbitrary test function $Q$ does not need to cover all possibilities to produce an infinite number of equations. Instead, we only need to produce a finite set of equations that spans the entire solution space. Therefore, for $\overset{a}{^}$ traversing all sample points, and $\hat{i} = 1, 2, \dots, d$ , we can assign the test function: $Q_{a ∣ i}^{n} = {1, 0, a = \overset{a}{^} and i = \hat{i} otherwise$ to obtain $n d$ equations: $M_{\overset{a}{^} b} A_{b ∣ \hat{i}}^{n} = \int_{\partial Ω^{0}} N_{\overset{a}{^}} (X) T_{\hat{i}} (X, t^{n}) d s (X) - \int_{Ω^{0}} N_{\overset{a}{^}, j} (X) P_{\hat{i} j} (X, t^{n}) d X, (17.1.3)$ resulting in $n d$ unknowns and $n d$ equations, bringing us closer to the discrete form.

The two integrals on the right side of Equation (17.1.3) can be evaluated analytically or using quadrature rules, depending on the specific choice of interpolation functions. We will discuss these in detail in future lectures.

Discrete Time

Discretization in time links $A$ to our degrees of freedom (DOF) $x$ . In the continuous setting, $A (X, t) = \frac{\partial ^{2} x}{\partial t ^{2}} (X, t)$ . Now, let us divide time into small intervals, $t^{0}, t^{2}, \dots, t^{n}, \dots$ , as discussed in the first chapter. Using the finite difference formula, we can conveniently approximate $A$ in terms of $x$ .

For example, with backward Euler: $A^{n} (X) V^{n} (X) = \frac{V ^{n} ( X ) - V ^{n - 1} ( X )}{t ^{n} - t ^{n - 1}}, = \frac{x ^{n} ( X ) - x ^{n - 1} ( X )}{t ^{n} - t ^{n - 1}},$ which gives us: $A^{n} (X) = \frac{x ^{n} ( X ) - ( x ^{n - 1} ( X ) + Δ t V ^{n - 1} ( X ))}{Δ t ^{2}},$ where $Δ t = t^{n} - t^{n - 1}$ . Applying this relation at the sample points into Equation (17.1.3), we obtain: $M_{\overset{a}{^} b} \frac{x _{b ∣ \hat{i}}^{n} - ( x _{b ∣ \hat{i}}^{n - 1} + Δ t V _{b ∣ \hat{i}}^{n - 1} )}{Δ t ^{2}} = \int_{\partial Ω^{0}} N_{\overset{a}{^}} (X) T_{\hat{i}} (X, t^{n}) d s (X) - \int_{Ω^{0}} N_{\overset{a}{^}, j} (X) P_{\hat{i} j} (X, t^{n}) d X . (17.2.1)$

Then, by applying mass lumping and zero traction boundary conditions, i.e., $T (X, t) = 0$ , we finally see that Equation (17.2.1) is the $(\overset{a}{^} d + \hat{i})$ -th row of the discrete form of backward Euler time integration in the first lecture: $M (x^{n + 1} - (x^{n} + Δ t v^{n})) - Δ t^{2} f (x^{n + 1}) = 0,$ where the elasticity force $f (x)$ is obtained by evaluating: $- \int_{Ω^{0}} N_{\overset{a}{^}, j} (X) P_{\hat{i} j} (X, t) d X,$ which will be discussed in the next chapter.

Summary

In this lecture, we discretized the weak form of momentum conservation in both space and time, arriving at the system of equations for backward Euler time integration introduced in the first lecture.

Spatial Discretization:
For spatial discretization, a finite number of points are sampled within the domain, and their displacements are used as the degrees of freedom (DOF) of the simulation. With the interpolation function associated with each DOF, the displacement at any point in the domain can be approximated, limiting the solution of the weak form to $d$ $n$ -dimensional function spaces formed by mutually orthogonal interpolation functions, where $n$ represents the number of sample points. In this way, the test function $Q$ can be conveniently assigned to generate $n d$ equations for solving the $n d$ unknowns.

Temporal Discretization:
The discretization of time connects the acceleration $A$ to the DOF $x$ via specific time integration rules. By applying mass lumping and assuming zero traction boundary conditions, we can ultimately derive the discrete form. The integration of interpolation functions will be covered in the next chapter.

In the next lecture, we will discuss boundary conditions and frictional contact in the continuous setting.

Boundary Conditions and Frictional Contact

Until now, we've omitted the Dirichlet boundary conditions and frictional contact in both the strong and weak forms of the governing equations to keep the derivations concise and straightforward. However, as we learned in the Boundary Treatments chapter, this boundary information is crucial for accurately simulating a wide range of solid dynamics.

Incorporating Boundary Conditions

In the weak form we derived (see Equation (16.3.6)), there is a boundary term $\int_{\partial Ω^{0}} Q_{i} (X, t) T_{i} (X, t) d s (X)$ that describes the force acting on the boundary of the solid from the outside.

If there are no Dirichlet boundary conditions, the entire boundary is handled using Neumann Boundary Conditions, where the boundary force is specified as part of the problem setup. Recall that we discussed the Dirichlet Boundary Condition, where the displacements of the boundary are directly prescribed. In practice, external forces act on the Dirichlet boundaries to ensure their displacements precisely match the prescribed values, and these forces are calculated directly from those displacements.

In a solid simulation problem, boundaries can be either a Dirichlet boundary or a Neumann boundary, which can be described by a more general problem formulation in strong form: $R (X, 0) \frac{\partial V}{\partial t} (X, t) = \nabla^{X} \cdot P (X, t) + R (X, 0) A^{ext} (X, t), \forall X \in Ω^{0} and t \geq 0; x = x_{D} (X, t), \forall X \in Γ_{D} and t \geq 0; P (X, t) N (X) = T_{N} (X, t), \forall X \in Γ_{N} and t \geq 0. (18.1.1)$

Here $Γ_{N}$ and $Γ_{D}$ are the Neumann and Dirichlet boundaries respectively, $Γ_{N} \cup Γ_{D} = \partial Ω_{0}$ , $Γ_{N} \cap Γ_{D} = \emptyset$ , and $x_{D}$ and $T_{N}$ are given. After we derive the weak form of the momentum conservation (see Equation (18.1.1), first line), the boundary term $\int_{\partial Ω^{0}} Q_{i} (X, t) T_{i} (X, t) d s (X)$ can be separately considered for Dirichlet and Neumann boundaries: $\int_{\partial Ω^{0}} Q_{i} (X, t) T_{i} (X, t) d s (X) = \int_{Γ_{D}} Q_{i} (X, t) T_{D ∣ i} (X, t) d s (X) + \int_{Γ_{N}} Q_{i} (X, t) T_{N ∣ i} (X, t) d s (X) .$

For Neumann boundaries, since the traction $T_{N} (X, t)$ is provided, the above integral can be directly evaluated after discretization. However, for Dirichlet boundaries, $T_{D} (X, t)$ remains unknown until we solve the problem. Therefore, a straightforward approach is to introduce the traction at Dirichlet boundaries as unknowns and solve the system that includes both the discretized weak form equations and the Dirichlet boundary conditions.

Remark 18.1.1 (Optimization Form). In the optimization form, the potential energy does not include any Dirichlet boundaries, effectively ignoring the boundary integral in the weak form. This is valid because the Dirichlet boundary conditions will be enforced by the linear equality constraints, and the corresponding discretized weak form equation will be overwritten.

Normal Contact for Non-penetration

To prevent self-interpenetration during simulation, it's essential to enforce a condition ensuring that the deformation map $ϕ (\cdot, t) : Ω^{0} \to Ω^{t}$ is bijective for any $t \geq 0$ . This bijection is maintained by boundary forces acting on pairs of contacting surface regions, referred to as $Γ_{C}$ . We can think of these forces as another set of Neumann boundary conditions that exert extra forces on $Γ_{C}$ only when necessary to prevent interpenetration. Thus, we can extend the boundary integral term in the weak form as follows:

$\int_{\partial Ω^{0}} Q_{i} (X, t) T_{i} (X, t) d s (X) = \int_{Γ_{D}} Q_{i} (X, t) T_{D ∣ i} (X, t) d s (X) + \int_{Γ_{N}} Q_{i} (X, t) T_{N ∣ i} (X, t) d s (X) + \int_{Γ_{C}} Q_{i} (X, t) T_{C ∣ i} (X, t) d s (X), (18.2.1)$

where $T_{N} (X, t)$ is the original Neumann boundary force specified in the problem setup, and $T_{C} (X, t)$ is the normal contact force arising from the bijectivity constraint.

Similar to Dirichlet boundary conditions, $T_{C} (X, t)$ can only be determined once we solve the problem. However, enforcing non-interpenetration is more complex than prescribing displacements. Fortunately, we can use the approximate constitutive model of $T_{C} (X, t)$ in IPC to represent the contact force as a function of $x$ , ensuring non-interpenetration by simply including this additional conservative force.

Remark 18.2.1 (Overlapping Boundaries). Note that here $Γ_{C}$ can overlap with both $Γ_{D}$ and $Γ_{N}$ . For a free (Neumann) boundary contacting a Dirichlet boundary, $T_{C} (X, t)$ on the Dirichlet part will also be ignored when enforcing the Dirichlet boundary conditions. However, if two Dirichlet boundaries interpenetrate each other, the problem will have no solution with the bijectivity constraint.

Barrier Potential

As discussed in Distance Barrier for Nonpenetration, the principle of IPC for solid-to-obstacle contact is to use a barrier function to ensure that the signed distance between any nodal degrees of freedom (DOFs) and obstacles remains positive throughout the simulation. To handle self-contact, potentially for codimensional objects, this idea is extended to ensure that the unsigned distance between any boundary points and the boundary remains nonzero throughout the simulation.

Let's consider two colliding regions, $Γ_{1} \subset \partial Ω^{0}$ and $Γ_{2} \subset \partial Ω^{0}$ , on the boundary. For any point $X_{1} \in Γ_{1}$ , we must ensure that the closest distance between $X_{1}$ and any point on $Γ_{2}$ remains nonzero. This can be achieved by using a barrier function to enforce this minimum distance, where the negative gradient of the function provides the contact force. This can be written as

$T_{C} (X_{1}, t) = - \frac{\partial b ( min _{X_{2} \in Γ_{2}} ∥ x ( X _{1} , t ) - x ( X _{2} , t ) ∥ , d ^ )}{\partial x ( X _{1} , t )}, (18.3.1)$

where $b$ is the barrier function:

$b (d, \hat{d}) = {\frac{κ}{2} \hat{d} (\frac{d}{d ^} - 1) ln \frac{d}{d ^} 0 d < \hat{d} d \geq \hat{d}, (18.3.2)$

serving as the contact potential energy density. Here, the barrier function approaches infinity as the distance approaches zero, providing an arbitrarily large repulsive force to prevent interpenetration. When the distance is above the threshold $\hat{d}$ and no contact is occurring, no contact forces are exerted. By using the barrier function, the non-smooth contact constraints are approximated by a constitutive model in which the force is conservative, enabling consistent resolution through an optimization-based time integrator.

Remark 18.3.1 (Barrier Density). Compared to Equation (7.2.3), the barrier energy density function here is additionally multiplied by $\hat{d}$ to maintain consistent units after surface integration. Recall from Remark 7.2.1 that the barrier potential can be thought of as an extra thin layer of ( $\hat{d}$ -thick) virtual material right outside the boundary of the solids, and $κ$ is analogous to Young's modulus.

Remark 18.3.2 (Min Operator). When multiple points have the same minimal distance to $X_{1}$ , the distance barrier of $X_{1}$ to all these points should be summed up. The min operator is non-smooth, which can still complicate optimization-based time integration. In the next chapter, we will demonstrate how this is approximated as described in Distance Barrier for Nonpenetration.

The case of two colliding regions results in a boundary integral:

$\int_{Γ_{1}} Q_{i} (X_{1}, t) T_{C_{1} ∣ i} (X_{1}, t) d s (X) + \int_{Γ_{2}} Q_{i} (X_{2}, t) T_{C_{2} ∣ i} (X_{2}, t) d s (X),$

where $T_{C_{1}}$ is defined in Equation (18.3.1), and:

$T_{C_{2}} (X_{2}, t) = - \frac{\partial b ( min _{X_{1} \in Γ_{1}} ∥ x ( X _{2} , t ) - x ( X _{1} , t ) ∥ , d ^ )}{\partial x ( X _{2} , t )}, (18.3.3)$

However, we have ignored the self-contact of $Γ_{1}$ and $Γ_{2}$ in this example. Thus, generalizing to arbitrary self-contact for the whole domain, we can keep the single boundary integral term $\int_{Γ_{C}} Q_{i} (X, t) T_{C ∣ i} (X, t) d s (X)$ for contact as in Equation (18.2.1) and define the traction more generally as:

$T_{C} (X, t) = - \frac{\partial b ( min _{X_{2} \in Γ_{C} - N (X)} ∥ x ( X , t ) - x ( X _{2} , t ) ∥ , d ^ )}{\partial x ( X , t )}, (18.3.4)$

where $N (X) = {X_{N} \in R^{d} ∣ ∥ X_{N} - X ∥ < r}$ is an infinitesimal circle around $X$ with the radius $r$ sufficiently small to avoid unnecessary contact forces between a point and its geodesic neighbors.

Remark 18.3.3 (Barrier Force Limits). In Equation (18.3.4), self-contact is ignored for points inside $N (X)$ . This is the trade-off for smoothly approximating contact forces, which are discontinuous in a macroscopic view. Similarly, $\hat{d}$ introduces another source of error. However, when $\hat{d} \to 0$ and $r \to 0$ , our model converges to the discontinuous definition. Note that we also need $\hat{d} / r \to 0$ , or there could still be some distance $d$ between $r$ and $\hat{d}$ that causes the barrier to diverge in the limit.

Finally, we can define the contact potential over the whole boundary $Γ_{C}$ as:

$\int_{Γ_{C}} \frac{1}{2} b (X_{2} \in Γ_{C} - N (X) min ∥ x (X, t) - x (X_{2}, t) ∥, \hat{d}) d s (X) . (18.3.5)$

Here, the coefficient $\frac{1}{2}$ is used because the barrier energy density $b$ of each pair of contacting points will be counted twice in the integral due to symmetry. When computing barrier forces, both occurrences need to be differentiated. Therefore, using the coefficient $\frac{1}{2}$ allows us to match the force definition in Equation (18.3.4). We'll elaborate on this in the next chapter under the discrete weak form.

The contact potential is not required in the weak form but will be useful for optimization-based time integration.

Friction Force

Analogous to Frictional Contact, maximizing the dissipation rate subject to the Coulomb constraint defines friction forces per unit area variationally:

$T_{F} (X, t) = β \in R^{d} arg min β^{T} V_{F} (X, t) s.t. ∥ β ∥ \leq μ ∥ T_{C} (X, t) ∥ and β \cdot N (X, t) = 0. (18.4.1)$

Here, $V_{F} (X, t) = V (X, t) - V (X_{2}, t)$ is the relative sliding velocity between $X$ and the closest point $X_{2} = arg min_{X_{2} \in Γ_{C} - N (X)} ∥ X - X_{2} ∥$ , $μ$ is the coefficient of friction, $T_{C}$ is the normal contact force per unit area, and $N$ is the normal direction.

This is equivalent to:

$T_{F} (X, t) = - μ ∥ T_{C} (X, t) ∥ f (∥ V_{F} (X, t) ∥) s (V_{F} (X, t)), (18.4.2)$

with $s (V_{F}) = \frac{V _{F}}{∥ V _{F} ∥}$ when $∥ V_{F} ∥ > 0$ , while $s (V_{F})$ takes any unit vector orthogonal to $N (X, t)$ when $∥ V_{F} ∥ = 0$ .

In addition, the friction scaling function $f$ is also nonsmooth with respect to $V_{F}$ , since $f (∥ V_{F} ∥) = 1$ when $∥ V_{F} ∥ > 0$ , and $f (∥ V_{F} ∥) \in [0, 1]$ when $∥ V_{F} ∥ = 0$ . These nonsmooth properties can severely hinder or even break the convergence of gradient-based optimization. The mollification of the friction-velocity relationship here follows the same approach as in Frictional Contact.

Summary

We have discussed Neumann and Dirichlet boundary conditions as well as frictional contact in the continuous setting to complete a rigorous problem formulation. Combining everything in strong form, for all $t \geq 0$ :

$R (X, 0) \frac{\partial V}{\partial t} (X, t) = \nabla^{X} \cdot P (X, t) + R (X, 0) A^{ext} (X, t), x = x_{D} (X, t), P (X, t) N (X) = T_{N} (X, t) + T_{C} (X, t) + T_{F} (X, t), ϕ (X, t) : Ω^{0} \to Ω^{t} is bijective, T_{F} (X, t) = β \in R^{d} arg min β^{T} V_{F} (X, t) s.t. ∥ β ∥ \leq μ ∥ T_{C} (X, t) ∥ and β \cdot N (X, t) = 0, \forall X \in Ω^{0}; \forall X \in Γ_{D}; \forall X \in Γ_{N}; \forall X \in Ω^{0}; \forall X \in Γ_{C} . (18.5.1)$

After deriving the weak form of the momentum equation, the boundary integral term can be separated as follows:

Here, only the Neumann force $T_{N} (X, t)$ is given, while all other boundary forces can be determined after solving the coupled system. Fortunately, Dirichlet boundary conditions can be enforced straightforwardly in the optimization framework as linear equality constraints. Frictional contact forces $T_{C} (X, t)$ and $T_{F} (X, t)$ can both be smoothly approximated as conservative forces with controllable error.

In the next chapter, we will discuss discretizing the weak form using the finite element method (FEM), connecting the derivations in this chapter to the discrete simulation methods.

Linear Finite Elements

From the governing equations in the continuous setting, we derived the discretized weak form system ( $n d$ equations) using the backward Euler time integration rule:

$M_{\overset{a}{^} b} \frac{x _{b ∣ \hat{i}}^{n} - ( x _{b ∣ \hat{i}}^{n - 1} + h V _{b ∣ \hat{i}}^{n - 1} )}{Δ t ^{2}} = \int_{\partial Ω^{0}} N_{\overset{a}{^}} (X) T_{\hat{i}} (X, t^{n}) d s (X) - \int_{Ω^{0}} N_{\overset{a}{^}, j} (X) P_{\hat{i} j} (X, t^{n}) d X . (19.1)$ In this chapter, we'll start by discussing the shape function $N_{\overset{a}{^}}$ in the context of linear finite elements. This exploration will help us understand the underlying implementation detailed in Inversion-Free Elasticity.

We'll focus specifically on simplex finite elements. In 2D, the 2-simplex is a triangle, and we've consistently used triangle meshes throughout this book to discretize the solid domain into a disjoint set of triangular elements.

Definition 19.1 (Simplex).
An n-simplex is a geometric object with $n + 1$ vertices that exists in an $n$ -dimensional space. It cannot fit in any space of smaller dimension.

Piecewise Linear Displacement Field

For a triangle element with vertices $X_{1}$ , $X_{2}$ , and $X_{3}$ in the solid domain, we can approximate the world space coordinates of an arbitrary point $X$ in this element using spatial discretization (see Equation (17.1.1)):

$\hat{x} (X) = x (X_{1}) N_{1} (X) + x (X_{2}) N_{2} (X) + x (X_{3}) N_{3} (X), (19.1.1)$

This equation represents a 2D interpolation, extending Experiment Example 17.1.1. Here, we assume that the world space coordinates of any arbitrary point in an element can be interpolated solely from the coordinates of the element's vertices.

Linear finite elements use linear shape functions $N_{i}$ in Equation (19.1.1), resulting in a piecewise linear (per triangle) displacement field $u = \hat{x} (X) - X$ over the entire domain. Before providing the precise expression of $N$ in terms of $X$ , we'll introduce another parameter space to simplify the derivation.

Let $β, γ \in [0, 1]$ and $β + γ \leq 1$ , we can use them to express the material space coordinates of an arbitrary point $X$ in the element $X_{1} X_{2} X_{3}$ as:

$X (β, γ) = X_{1} + β (X_{2} - X_{1}) + γ (X_{3} - X_{1}) = (1 - β - γ) X_{1} + β X_{2} + γ X_{3} .$

Here, $X$ is a linear function of $(β, γ)$ . With linear shape functions, the approximation $\hat{x}$ is a linear function of $X$ .

Recall that for interpolation, we have to satisfy the conditions $\hat{x} (X_{i}) = x (X_{i})$ . Putting these all together, we can obtain a unique solution:

$x (β, γ) \approx \hat{x} (β, γ) = x_{1} + β (x_{2} - x_{1}) + γ (x_{3} - x_{1}) = (1 - β - γ) x_{1} + β x_{2} + γ x_{3},$

where we denote $x (X_{i})$ as $x_{i}$ . This indicates that:

$N_{1} (β, γ) = 1 - β - γ, N_{2} (β, γ) = β, N_{3} (β, γ) = γ .$

Interestingly, with the expression of $X (β, γ)$ , $x (β, γ)$ , and $N (β, γ)$ , we do not necessarily need the precise expression of $\hat{x} (X)$ and $N (X)$ for the following derivations to compute each term in Equation (17.2.1).

Remark 19.1.1 (Partition of Unity). The shape functions of FEM satisfy the partition of unity everywhere within each element: $N_{1} (β, γ) + N_{2} (β, γ) + N_{3} (β, γ) = 1 \forall β, γ \in [0, 1] and β + γ \leq 1.$ One advantage of FEM is that it provides accurate boundary resolution compared to grid or particle-based representations. The boundary nodes of the FEM mesh can be exactly located on the boundary of the continuous domain. The elements are generated inside the domain, connecting the boundary nodes to form the discrete boundary, which converges to the boundary of the continuous domain as resolution increases.
Although particle-based methods can also sample particles on the domain boundary, their spherical shape functions extend beyond the domain, breaking the partition of unity. This creates a "soft" outbound layer of material that makes boundary force computations inaccurate. In contrast, FEM shape functions are nonzero only within each element, where the partition of unity is satisfied everywhere.

Mass Matrix and Lumping

Recall from Discretization of Weak Forms that:

$M_{ab} = \int_{Ω^{0}} R (X, 0) N_{a} (X) N_{b} (X) d X . (19.2.1)$

With the solid domain discretized into triangles $T$ , we have:

$M_{ab} = e \in T \sum \int_{Ω_{e}^{0}} R (X, 0) N_{a} (X) N_{b} (X) d X, (19.2.2)$

where $Ω_{e}^{0}$ represents the material space of triangle $e$ . Note that for linear triangle elements, since $N_{i}$ is nonzero only on the incident triangles of node $i$ , here we only need to consider triangles with both $a$ and $b$ being their vertices.

Let us change the integration variable from $X$ to $(β, γ)$ , which gives:

$= \int_{Ω_{e}^{0}} R (X, 0) N_{a} (X) N_{b} (X) d X \int_{0}^{1} \int_{0}^{1 - β} R (β, γ, 0) N_{a} (β, γ) N_{b} (β, γ) det (\frac{\partial X}{\partial ( β , γ )}) d γ d β . (19.2.3)$

For simplicity, let us denote the vertices of this triangle $e$ as $X_{1}$ , $X_{2}$ , and $X_{3}$ , and then we have:

$det (\frac{\partial X}{\partial ( β , γ )}) = ∣ det ([X_{2} - X_{1}, X_{3} - X_{1}]) ∣ = 2 A_{e},$

where $A_{e}$ is the area of triangle $e$ . Here, $N_{a}$ and $N_{b}$ take $1 - β - γ$ , $β$ , or $γ$ depending on the vertex indices $a$ and $b$ . For example, if $a$ and $b$ correspond to the 2nd and 3rd vertices of triangle $e$ , then $N_{a} = β$ and $N_{b} = γ$ . Assuming uniform density, we have:

$\int_{Ω_{e}^{0}} R (X, 0) N_{a} (X) N_{b} (X) d X = = = = 2 R A_{e} \int_{0}^{1} \int_{0}^{1 - β} β γ d γ d β 2 R A_{e} \int_{0}^{1} \frac{1}{2} β γ^{2}_{γ = 0}^{γ = 1 - β} d β R A_{e} \int_{0}^{1} β (1 - β)^{2} d β R A_{e} (\frac{β ^{2}}{2} - \frac{2 β ^{3}}{3} + \frac{β ^{4}}{4})_{β = 0}^{β = 1} = \frac{1}{12} R A_{e} . (19.2.4)$

With mass lumping, $M_{ab}^{lump} = δ_{ab} \sum_{c} M_{a c}$ , which means:

$M_{aa}^{lump} = e \in T \sum b \in V \sum \int_{Ω_{e}^{0}} R (X, 0) N_{a} (X) N_{b} (X) d X, (19.2.5)$

where $V$ contains all the nodes of the mesh, and all off-diagonal entries of $M^{lump}$ are $0$ . Similarly, due to the locality of $N$ , for each triangle element, $b$ only needs to traverse all three triangle vertices:

$M_{aa}^{lump} = e \in T (a) \sum 2 R A_{e} (\int_{0}^{1} \int_{0}^{1 - β} β (1 - β - γ) d γ d β + \int_{0}^{1} \int_{0}^{1 - β} β^{2} d γ d β + \int_{0}^{1} \int_{0}^{1 - β} β γ d γ d β) = e \in T (a) \sum 2 R A_{e} \int_{0}^{1} β d γ d β = e \in T (a) \sum 2 R A_{e} \int_{0}^{1} β γ ∣_{γ = 0}^{γ = 1 - β} d β = e \in T (a) \sum 2 R A_{e} \int_{0}^{1} β (1 - β) d β = e \in T (a) \sum 2 R A_{e} (\frac{β ^{2}}{2} - \frac{β ^{3}}{3})_{β = 0}^{β = 1} = e \in T (a) \sum \frac{1}{3} R A_{e}, (19.2.6)$

where $T (a)$ denotes the set of triangles incident to node $a$ . This result also explains why in Inversion-Free Elasticity when computing the mass for all the nodes, we traverse all triangles, calculate the mass of the triangle $R A_{e}$ and evenly distribute it to the three vertices. With the mass matrix computed, the momentum change and external body force terms including their energy forms are all easy to deal with.

Elasticity Term

For the elasticity term $\int_{Ω^{0}} N_{\overset{a}{^}, j} (X) P_{\hat{i} j} (X, t^{n}) d X$ in the discrete weak form system in Equation (19.1), we can write it as the summation of integrals on each triangle $e$ in vector form:

$\int_{Ω^{0}} N_{\overset{a}{^}, j} (X) P_{\hat{i} j} (X, t^{n}) d X = \int_{Ω^{0}} (P (X, t^{n}) \nabla^{X} N_{\overset{a}{^}} (X))_{\hat{i}} d X = e \in T \sum \int_{Ω_{e}^{0}} (P (X, t^{n}) \nabla^{X} N_{\overset{a}{^}} (X))_{\hat{i}} d X . (19.3.1)$

Analogously, this summation also only needs to involve the incident triangles of node $\overset{a}{^}$ .

Recall from Strain Energy, to compute the first Piola-Kirchoff stress $P (X, t^{n})$ , we only need the deformation gradient $F (X, t^{n})$ . From Section Kinematics, we know that $F = \frac{\partial x}{\partial X}$ . Applying the chain rule with the parameter space variables $(β, γ)$ as intermediates, we have:

$F = \frac{\partial x}{\partial ( β , γ )} (\frac{\partial X}{\partial ( β , γ )})^{- 1} \approx \frac{\partial x ^}{\partial ( β , γ )} (\frac{\partial X}{\partial ( β , γ )})^{- 1} = [x_{2} - x_{1}, x_{3} - x_{1}] [X_{2} - X_{1}, X_{3} - X_{1}]^{- 1}, (19.3.2)$

which is exactly the same as Equation (15.1.1) from our earlier implementation (Section Inversion-Free Elasticity). Here, we also see that with linear finite elements, the deformation gradient field is piecewise constant in $Ω^{0}$ , so is $P$ .

Then for $\nabla^{X} N_{\overset{a}{^}} (X)$ , depending on the index of $\overset{a}{^}$ in triangle $e$ , we can derive it again using parameter space variables as:

$\nabla^{X} N_{1} (X) \nabla^{X} N_{2} (X) = ([1, 0] [X_{2} - X_{1}, X_{3} - X_{1}]^{- 1})^{T} \nabla^{X} N_{3} (X) = ([0, 1] [X_{2} - X_{1}, X_{3} - X_{1}]^{- 1})^{T} . = \frac{\partial ( 1 - β - γ )}{\partial X} = (\frac{\partial ( 1 - β - γ )}{\partial ( β , γ )} (\frac{\partial X}{\partial ( β , γ )})^{- 1})^{T} = ([- 1, - 1] [X_{2} - X_{1}, X_{3} - X_{1}]^{- 1})^{T} = \frac{\partial β}{\partial X} = (\frac{\partial β}{\partial ( β , γ )} (\frac{\partial X}{\partial ( β , γ )})^{- 1})^{T} = \frac{\partial γ}{\partial X} = (\frac{\partial γ}{\partial ( β , γ )} (\frac{\partial X}{\partial ( β , γ )})^{- 1})^{T}$

This also allows us to see that $P (X, t^{n}) \nabla^{X} N_{\overset{a}{^}} (X)$ is constant within any triangle $e$ and it is equivalent to $\frac{\partial Ψ _{e}}{\partial x _{\overset{a}{^}}}$ since:

$\frac{\partial Ψ _{e}}{\partial x _{\overset{a}{^}}} = \frac{\partial Ψ _{e}}{\partial F} \frac{\partial F}{\partial x _{\overset{a}{^}}} = P \nabla^{X} N_{\overset{a}{^}} .$

Substituting $\frac{\partial Ψ _{e}}{\partial x _{\overset{a}{^}}}$ into Equation (19.3.1) we obtain:

$\int_{Ω^{0}} N_{\overset{a}{^}, j} (X) P_{\hat{i} j} (X, t^{n}) d X = e \in T \sum \int_{Ω_{e}^{0}} (P (X, t^{n}) \nabla^{X} N_{\overset{a}{^}} (X))_{\hat{i}} d X = e \in T \sum \int_{Ω_{e}^{0}} (\frac{\partial Ψ _{e}}{\partial x _{\overset{a}{^}}})_{\hat{i}} d X = e \in T \sum A_{e} (\frac{\partial Ψ _{e}}{\partial x _{\overset{a}{^}}})_{\hat{i}},$

which is exactly how nodal elasticity force is computed in Section Inversion-Free Elasticity. This also indicates that the total elasticity potential can be calculated as $\sum_{e \in T} A_{e} Ψ_{e}$ , which is $\int_{Ω^{0}} Ψ (X) d X$ before spatial discretization.

Remark 19.3.1. [Linear FEM] Linear FEM refers to $x$ being a piecewise linear function of $X$ , but the elasticity model can still be nonlinear, i.e. $P$ can be a nonlinear function of $F$ .

Summary

Based on the temporally and spatially discretized weak form, we've explored methods to compute the mass matrix, deformation gradient, and elasticity force under the linear finite element setting, all of which align with our implementation in Section Inversion-Free Elasticity.

With linear finite elements, the world space coordinates $x$ are approximated as a piecewise linear function of $X$ . This approximation, $\hat{x} (X)$ , is a linear function inside each triangle and is $C^{0}$ -continuous at the edges. By using two parameters, $β$ and $γ$ , to represent points on each triangle, we can identify the linear shape functions that interpolate the displacements at the triangle vertices and derive the deformation gradient $F$ . The mass matrix entries and elasticity terms can then be computed via integration with respect to $β$ and $γ$ .

Piecewise Linear Boundaries

In this lecture, we will continue our discussion on linear finite elements by focusing on boundary conditions and frictional self-contact on piecewise linear boundaries. Specifically, we will examine the computation of the boundary integral term:

$\int_{\partial Ω^{0}} N_{\overset{a}{^}} (X) T_{\hat{i}} (X, t^{n}) d s (X) (20.1)$

We will cover this in the context of Dirichlet and Neumann boundaries, as well as normal and frictional self-contact forces.

Boundary Conditions

Dirichlet

Due to the accurate boundary resolution of the Finite Element Method (FEM), enforcing Dirichlet boundary conditions is straightforward. We only need to constrain the world-space coordinates of the boundary nodes to the prescribed values:

$\hat{x} (X_{i}) = x_{D} (X_{i}) \forall X_{i} \in Γ_{D} .$

Once these constraints are properly enforced, the Dirichlet boundary integral term can be ignored.

This same mechanism can also be used to prescribe the displacement of any interior nodes. Although this does not directly correspond to any physical effects, it can simplify the simulation setup.

Neumann

For Neumann boundary conditions, we can evaluate the boundary integral term using the parameter space variables $β$ and $γ$ . With triangle mesh discretization, we have:

$\int_{Γ_{N}} N_{\overset{a}{^}} (X) T_{\hat{i}} (X, t^{n}) d s (X) = e \in T \sum \int_{\partial Ω_{e}^{0} \cap Γ_{N}} N_{\overset{a}{^}} (X) T_{\hat{i}} (X, t^{n}) d s (X),$

where $\partial Ω_{e}^{0} \cap Γ_{N}$ is the edge of triangle $e$ that is on the Neumann boundary.

For any boundary node $\overset{a}{^}$ in 2D, there will be at most two incident triangles to consider in the integration for linear shape functions. Let's examine the case with two incident triangles. Consider one of the integrals. Without loss of generality, assume $N_{\overset{a}{^}} = β$ (where $X_{\overset{a}{^}}$ corresponds to $X_{2}$ in triangle $e$ ), and that $X_{3}$ is the other node of $e$ on the boundary edge. Then, switching the integration variables to $β$ gives us:

$\int_{\partial Ω_{e}^{0} \cap Γ_{N}} N_{\overset{a}{^}} (X) T_{\hat{i}} (X, t^{n}) d s (X) = \int_{0}^{1} β T_{\hat{i}} (β X_{2} + (1 - β) X_{3}, t^{n}) \frac{\partial s}{\partial β} d β .$

Here, $\frac{\partial s}{\partial β}$ is simply the edge length $∥ X_{2} - X_{3} ∥$ . If $T$ is constant over the boundary at $t^{n}$ , we can compute:

$T_{\hat{i}}^{n} \int_{0}^{1} β \frac{\partial s}{\partial β} d β = \frac{1}{2} ∥ X_{2} - X_{3} ∥ T_{\hat{i}}^{n} .$

Therefore, to add a constant Neumann force to the discrete system, we first calculate the length weight of each boundary node by distributing the length of the boundary edges evenly to their vertices, and then multiply by the traction $T_{\hat{i}}^{n}$ . If $T$ is not constant over the boundary, more complex boundary integral calculations are needed. For a boundary node with only one incident triangle, its length weight comes from its two incident edges within the same triangle.

Remark 20.1.1 (Neumann Boundary Conditions). Here, we observe that the specified traction in standard Neumann boundary conditions is independent of $x$ , which simplifies the derivation of the potential energy, even in the continuous setting for varying Neumann forces over the domain: $\int_{Γ_{N}} x (X) \cdot T (X, t^{n}) d s (X) . (20.1.1)$ To verify this, we can replace $x (X)$ with $\hat{x} (X) = N_{\overset{a}{^}} (X) x_{\overset{a}{^}} + \dots$ for spatial discretization. Taking the derivative with respect to $x_{\overset{a}{^}}$ gives us the force integral term in the discrete weak form: $\frac{\partial \int _{Γ_{N}} x ^ ( X ) \cdot T ( X , t ^{n} ) d s ( X )}{\partial x _{\overset{a}{^}}} = \int_{Γ_{N}} N_{\overset{a}{^}} (X) T (X, t^{n}) d s (X) .$

Solid-Obstacle Contact

Recall that we used a conservative force model to approximate the contact traction $T_{C}$ , allowing it to be directly evaluated given the current configuration of the solids. This results in a contact potential:

$P_{C} = \int_{Γ_{C}} \frac{1}{2} b (X_{2} \in Γ_{C} - N (X) min ∥ x (X, t) - x (X_{2}, t) ∥, \hat{d}) d s (X),$

where $b ()$ is the barrier energy density function, and $N (X)$ is an infinitesimal region around $X$ where contact is ignored for theoretical soundness.

For normal contact between simulated solids and collision obstacles (ignoring self-contact for now), $P_{C}$ can be written in a much simpler form $P_{C} = \int_{Γ_{S}} \frac{1}{2} b (X_{2} \in Γ_{O} min ∥ x (X, t) - x (X_{2}, t) ∥, \hat{d}) d s (X) + \int_{Γ_{O}} \frac{1}{2} b (X_{2} \in Γ_{S} min ∥ x (X, t) - x (X_{2}, t) ∥, \hat{d}) d s (X) = \int_{Γ_{S}} b (X_{2} \in Γ_{O} min ∥ x (X, t) - x (X_{2}, t) ∥, \hat{d}) d s (X) = \int_{Γ_{S}} b (d^{PO} (x (X, t), O), \hat{d}) d s (X) .$ Here $Γ_{S}$ and $Γ_{O}$ are the boundaries of the simulated solids and obstacles respectively, $d^{PO} (x (X, t), O) = min_{X_{2} \in Γ_{O}} ∥ x (X, t) - x (X_{2}, t) ∥$ is the point-obstacle distance, and the simplification from two terms to one single term is due to symmetry in the continuous setting. With triangle discretization, $\int_{Γ_{S}} b (d^{PO} (x (X, t), O), \hat{d}) d s (X) \approx e \in T \sum \int_{\partial Ω_{e}^{0} \cap Γ_{S}} b (d^{PO} (x (X, t), O), \hat{d}) d s (X) . (20.2.1)$ Similar to the derivation for Neumann boundaries, for any boundary node $\overset{a}{^}$ , with 2 incident triangles, let us look at one of the integral. Without loss of generality, we can assume $N_{\overset{a}{^}} = β$ ( $X_{\overset{a}{^}}$ corresponds to $X_{2}$ in triangle $e$ ), and that $X_{3}$ is the other node of $e$ on the boundary edge. Then, switching the integration variables to $β$ gives us $= \int_{\partial Ω_{e}^{0} \cap Γ_{S}} b (d^{PO} (x (X, t), O), \hat{d}) d s (X) \int_{0}^{1} b (d^{PO} (x (β X_{2} + (1 - β) X_{3}, t), O), \hat{d}) \frac{\partial s}{\partial β} d β . (20.2.2)$ Since $b ()$ and $d^{PO} ()$ are both highly nonlinear functions, we could not obtain a closed-form expression for Equation (20.2.2). If we take the two end points $X_{2}$ and $X_{3}$ as quadrature points both with weights $\frac{1}{2}$ , we can approximate the integral as $\approx \int_{0}^{1} b (d^{PO} (x (β X_{2} + (1 - β) X_{3}, t), O), \hat{d}) \frac{\partial s}{\partial β} d β \frac{1}{2} b (d^{PO} (x (X_{2}, t), O), \hat{d}) \frac{\partial s}{\partial β} + \frac{1}{2} b (d^{PO} (x (X_{3}, t), O), \hat{d}) \frac{\partial s}{\partial β} . (20.2.3)$ Then, the whole boundary integral can be approximated as $\int_{Γ_{S}} b (d^{PO} (x (X, t), O), \hat{d}) d s (X) \approx \overset{a}{^} \sum \frac{∥ X _{\overset{a}{^}} - X _{\overset{a}{^} - 1} ∥ + ∥ X _{\overset{a}{^}} - X _{\overset{a}{^} + 1} ∥}{2} b (d^{PO} (x_{\overset{a}{^}}, O), \hat{d}),$ assuming that $X_{\overset{a}{^} - 1}$ and $X_{\overset{a}{^} + 1}$ are the two neighbors of $X_{\overset{a}{^}}$ on the boundary. This is now exactly what has been implemented in Filter Line Search.

Remark 20.2.1 (Quadrature Choice for Line Segment). Selecting the two end points ( $β = 0, 1$ ) as quadrature points for a line segment integral (Equation (20.2.3)) is not a common design choice. Typically, Gaussian quadrature would use $β = \frac{3 \pm 3}{6}$ . The advantage of choosing $β = 0, 1$ is that it results in fewer quadrature points globally, thus reducing computational costs, as neighboring edges share end points.

To see how $P_{C}$ connects to the boundary integral (Equation (20.1)) in the discrete weak form, let us take the derivative of the discretized contact potential (Equation (20.2.1)) with respect to $x_{\overset{a}{^}}$ : $= = - \frac{\partial ( \sum _{e \in T} \int _{\partial Ω_{e}^{0} \cap Γ_{S}} b ( d ^{PO} ( x ( X , t ) , O ) , d ^ ) d s ( X ))}{\partial x _{\overset{a}{^}}} e \in T \sum \int_{\partial Ω_{e}^{0} \cap Γ_{S}} - \frac{\partial b ( d ^{PO} ( x ( X , t ) , O ) , d ^ )}{\partial x} \frac{\partial x}{\partial x _{\overset{a}{^}}} d s (X) e \in T \sum \int_{\partial Ω_{e}^{0} \cap Γ_{S}} - \frac{\partial b ( d ^{PO} ( x ( X , t ) , O ) , d ^ )}{\partial x} N_{\overset{a}{^}} (X) d s (X) .$ Then we also verified that $T_{C} (X, t) = - \frac{\partial b ( d ^{PO} ( x ( X , t ) , O ) , d ^ )}{\partial x}$ here.

Self-Contact

With triangle discretization, the boundary of the domain is approximated as a polyline formed by a set of edges. Let us denote this set of boundary edges as $E$ , and the barrier potential becomes:

$\approx = \int_{Γ_{C}} \frac{1}{2} b (X_{2} \in Γ_{C} - N (X) min ∥ x (X, t) - x (X_{2}, t) ∥, \hat{d}) d s (X) \int_{Γ_{C}} \frac{1}{2} b (e \in E - I (X) min X_{2} \in e min ∥ x (X, t) - x (X_{2}, t) ∥, \hat{d}) d s (X) \int_{Γ_{C}} \frac{1}{2} b (e \in E - I (X) min d^{PE} (x (X, t), e), \hat{d}) d s (X) .$

Here, $I (X)$ is the set of edges that contain $X$ . Completely ignoring these edges is a specific choice of $N (X)$ under the current discretization. The term $min_{X_{2} \in e} ∥ x (X, t) - x (X_{2}, t) ∥$ is simply the point-edge distance $d^{PE} (x (X, t), e)$ , which can be calculated as either a point-point distance or a point-line distance depending on the relative positions of the point and the edge.

As we know, the barrier energy density function $b$ is already a smooth approximation to the discontinuous normal contact forces that prevent interpenetration between two colliding points. However, when considering self-contact between discrete surfaces (piecewise linear here), the non-smooth $min$ operator on point-edge distances is inevitable. This non-smoothness can still pose challenges for optimization time integrators.

To obtain a smooth barrier potential even in the case of piecewise linear boundaries, we first transform the $min$ operator to a $max$ operator, as the energy density function $b$ is a non-ascending function everywhere in the domain. This gives us:

$\int_{Γ_{C}} \frac{1}{2} e \in E - I (X) max b (d^{PE} (x (X, t), e), \hat{d}) d s (X) .$

Next, we need to smoothly approximate the $max$ operator. A straightforward choice is to use the smooth max function, such as the $p$ -norm function:

$max (a_{1}, a_{2}, ..., a_{n}) \approx (a_{1}^{p} + a_{2}^{p} + \dots + a_{n}^{p})^{\frac{1}{p}},$

with $p > 0$ sufficiently large. However, the exponent $\frac{1}{p}$ will couple multiple inputs together, increasing the stencil size and making the Hessian less sparse, which will make the simulation more computationally expensive.

Fortunately, due to the local support of $b$ , where the contact force only exists for distances smaller than $\hat{d}$ , using $p = 1$ is sufficient. With a relatively small $\hat{d}$ , there will only be some redundant contact forces at the interface of boundary elements (Figure 20.3.1).

**Figure 20.3.1.** In this simple two-edge illustration, the yellow and green regions are only counted once by the summation, but the blue region and the yellow-green overlap are counted twice. If we subtract once the blue region, then for the right-top boundary (convex), it becomes perfect, but for the left-bottom boundary (concave), we can still see some overlap that are counted twice.

Since the overlapping supports of $b$ from multiple boundary elements can be clearly identified, it is also possible to subtract the redundant barrier potentials in those regions, as discussed in detail in [Li et al. 2023]. For this book, let us keep it simple by using $p = 1$ with the $p$ -norm formulation, which is just summation:

$\int_{Γ_{C}} \frac{1}{2} e \in E - I (X) max b (d^{PE} (x (X, t), e), \hat{d}) d s (X) \approx \int_{Γ_{C}} \frac{1}{2} e \in E - I (X) \sum b (d^{PE} (x (X, t), e), \hat{d}) d s (X) .$

Approximating the integral under triangle discretization and picking the end points of each boundary edge as the quadrature points, we obtain the fully discrete form:

$\approx \int_{Γ_{C}} \frac{1}{2} e \in E - I (X) \sum b (d^{PE} (x (X, t), e), \hat{d}) d s (X) \overset{a}{^} \sum \frac{∥ X _{\overset{a}{^}} - X _{\overset{a}{^} - 1} ∥ + ∥ X _{\overset{a}{^}} - X _{\overset{a}{^} + 1} ∥}{4} e \in E - I (X_{\overset{a}{^}}) \sum b (d^{PE} (x_{\overset{a}{^}}, e), \hat{d}) . (20.3.1)$

Similar to the solid-obstacle contact cases, $T_{C}$ can be derived by taking the derivative of the whole contact potential with respect to the nodal degrees of freedom (DOFs).

Summary

We have connected the discrete weak form (Equation (19.1)) to the implementations in Filter Line Search for boundary conditions and contact. Additionally, we have derived self-contact between discrete surfaces in 2D, which will be implemented in the next lecture.

The derivations follow a consistent methodology: first, rewrite the global integral as a summation of local element-wise integrals, and then approximate or analytically evaluate the local integrals using certain quadrature rules.

We didn't explicitly discuss friction in this lecture because its force definition in the continuous setting was covered in Boundary Conditions and Frictional Contact. Its integral approximation can be performed similarly to normal contact forces (see Case Study: 2D Frictional Self-Contact for details).

During the derivation, we also observed that the route we have taken from the strong form to the optimization time integration implementation, namely:

$strong form \to weak form \to discrete weak form \to finite element approximation \to optimization time integration$

is not unique. We can directly write the continuous form of the potential energies and then perform spatial discretization and approximation to obtain the nodal forces. Readers interested in this approach can refer to Lagrangian Mechanics or Hamiltonian Mechanics.

Case Study: 2D Self-Contact*

We have finished connecting linear finite elements to the weak form derivation for elastodynamics and frictional contact. Now, it's time to see how these concepts are implemented in code. In this lecture, we will implement 2D frictionless self-contact based on our Python development of the inversion-free elasticity simulation from Case Study: Inversion-free Elasticity.

The executable Python project for this section can be found at https://github.com/phys-sim-book/solid-sim-tutorial under the 7_self_contact folder. MUDA GPU implementations can be found at https://github.com/phys-sim-book/solid-sim-tutorial-gpu under the simulators/7_self_contact folder. We will implement frictional self-contact in the next lecture.

Scene Setup and Boundary Element Collection

To begin with, we set up a new scene with two squares falling onto the ground, compressed by the ceiling so that self-contact will occur between these squares.

Implementation 21.1.1 (Simulation setup, simulator.py).

# simulation setup
side_len = 0.45
rho = 1000      # density of square
E = 1e5         # Young's modulus
nu = 0.4        # Poisson's ratio
n_seg = 2       # num of segments per side of the square
h = 0.01        # time step size in s
DBC = [(n_seg + 1) * (n_seg + 1) * 2]   # dirichlet node index
DBC_v = [np.array([0.0, -0.5])]         # dirichlet node velocity
DBC_limit = [np.array([0.0, -0.7])]     # dirichlet node limit position
ground_n = np.array([0.0, 1.0])         # normal of the slope
ground_n /= np.linalg.norm(ground_n)    # normalize ground normal vector just in case
ground_o = np.array([0.0, -1.0])        # a point on the slope  
mu = 0.4        # friction coefficient of the slope

# initialize simulation
[x, e] = square_mesh.generate(side_len, n_seg)       # node positions and triangle node indices of the top square
e = np.append(e, np.array(e) + [len(x)] * 3, axis=0) # add triangle node indices of the bottom square
x = np.append(x, x + [side_len * 0.1, -side_len * 1.1], axis=0) # add node positions of the bottom square

In line 17, we adapt the DOF index of the ceiling from $(n se g + 1) * (n se g +$ 1) to $(n se g + 1) * (n se g +$ 1)∗2, as we now have two squares. Line 26 generates the first square on the top, while lines 27 and 28 generate the second square on the bottom by creating copies and offsets.

The initial frame, as shown in Figure 21.1.1, is now established. However, without handling self-contact, these two squares cannot interact with each other yet.

**Figure 21.1.1.** The new scene setup with 2 squares to fall.

To handle contact, we first need to collect all boundary elements. In 2D, this involves identifying the nodes and edges on the boundary where contact forces will be applied to all close but non-incident point-edge pairs. The following function finds all boundary nodes and edges given a triangle mesh:

Implementation 21.1.2 (Collect boundary elements, square_mesh.py).

def find_boundary(e):
    # index all half-edges for fast query
    edge_set = set()
    for i in range(0, len(e)):
        for j in range(0, 3):
            edge_set.add((e[i][j], e[i][(j + 1) % 3]))

    # find boundary points and edges
    bp_set = set()
    be = []
    for eI in edge_set:
        if (eI[1], eI[0]) not in edge_set:
            # if the inverse edge of a half-edge does not exist,
            # then it is a boundary edge
            be.append([eI[0], eI[1]])
            bp_set.add(eI[0])
            bp_set.add(eI[1])
    return [list(bp_set), be]

This function is called in simulator.py, and the boundary elements are then passed to the time integrator for energy, gradient, and Hessian evaluations, as well as line search filtering.

Point-Edge Distance

Next, we calculate the point-edge distance and its derivatives. These will be used to solve for the contact forces. For a node $p$ and an edge $e_{0} e_{1}$ , their squared distance is defined as

$d_{sq}^{PE} (p, e_{0}, e_{1}) = λ min ∥ p - ((1 - λ) e_{0} + λ e_{1}) ∥^{2} s.t. λ \in [0, 1],$

which is the closest squared distance between $p$ and any point on $e_{0} e_{1}$ .

Remark 21.2.1 (Distance Calculation Optimization). Here, we use the squared unsigned distances for evaluating the contact energies. This approach avoids taking square roots, which can complicate the expression of the derivatives and increase numerical rounding errors during computation. Additionally, unsigned distances can be directly extended for codimensional pairs, such as point-point pairs, which are useful when simulating particle contacts in 2D. They also do not suffer from locking issues, as signed distances do, when there are large displacements.

Fortunately, $d_{sq}^{PE} (p, e_{0}, e_{1})$ is a piece-wise smooth function w.r.t. the DOFs: $d_{sq}^{PE} (p, e_{0}, e_{1}) = ⎩ ⎨ ⎧ ∥ p - e_{0} ∥^{2} if (e_{1} - e_{0}) \cdot (p - e_{0}) < 0, ∥ p - e_{1} ∥^{2} if (e_{1} - e_{0}) \cdot (p - e_{0}) > ∥ e_{1} - e_{0} ∥^{2}, \frac{1}{∥ e _{1} - e _{0} ∥ ^{2}} (det ([p - e_{0}, e_{1} - e_{0}]))^{2} otherwise, (21.2.1)$ where the smooth expression can be determined by checking whether the node is inside the orthogonal span of the edge. Given these smooth expressions, we can differentiate each of them and obtain the derivatives of the point-edge distance function. The implementations are as follows:

Implementation 21.2.1 (Point-Edge distance calculation (Hessian omitted), PointEdgeDistance.py).

import numpy as np

import distance.PointPointDistance as PP
import distance.PointLineDistance as PL

def val(p, e0, e1):
    e = e1 - e0
    ratio = np.dot(e, p - e0) / np.dot(e, e)
    if ratio < 0:    # point(p)-point(e0) expression
        return PP.val(p, e0)
    elif ratio > 1:  # point(p)-point(e1) expression
        return PP.val(p, e1)
    else:            # point(p)-line(e0e1) expression
        return PL.val(p, e0, e1)

def grad(p, e0, e1):
    e = e1 - e0
    ratio = np.dot(e, p - e0) / np.dot(e, e)
    if ratio < 0:    # point(p)-point(e0) expression
        g_PP = PP.grad(p, e0)
        return np.reshape([g_PP[0:2], g_PP[2:4], np.array([0.0, 0.0])], (1, 6))[0]
    elif ratio > 1:  # point(p)-point(e1) expression
        g_PP = PP.grad(p, e1)
        return np.reshape([g_PP[0:2], np.array([0.0, 0.0]), g_PP[2:4]], (1, 6))[0]
    else:            # point(p)-line(e0e1) expression
        return PL.grad(p, e0, e1)

It can be verified that the point-edge distance function is $C^{1}$ -continuous everywhere, including at the interfaces between different segments. For the point-point case, we have:

Implementation 21.2.2 (Point-Point distance calculation, PointPointDistance.py).

import numpy as np

def val(p0, p1):
    e = p0 - p1
    return np.dot(e, e)

def grad(p0, p1):
    e = p0 - p1
    return np.reshape([2 * e, -2 * e], (1, 4))[0]

def hess(p0, p1):
    H = np.array([[0.0] * 4] * 4)
    H[0, 0] = H[1, 1] = H[2, 2] = H[3, 3] = 2
    H[0, 2] = H[1, 3] = H[2, 0] = H[3, 1] = -2
    return H

For the point-line case, the distance evaluations can be implemented as follows, and the derivatives can be obtained using symbolic differentiation tools.

Implementation 21.2.3 (Point-Line distance calculation (Hessian omitted), PointLineDistance.py).

import numpy as np

def val(p, e0, e1):
    e = e1 - e0
    numerator = e[1] * p[0] - e[0] * p[1] + e1[0] * e0[1] - e1[1] * e0[0]
    return numerator * numerator / np.dot(e, e)

def grad(p, e0, e1):
    g = np.array([0.0] * 6)
    t13 = -e1[0] + e0[0]
    t14 = -e1[1] + e0[1]
    t23 = 1.0 / (t13 * t13 + t14 * t14)
    t25 = ((e0[0] * e1[1] + -(e0[1] * e1[0])) + t14 * p[0]) + -(t13 * p[1])
    t24 = t23 * t23
    t26 = t25 * t25
    t27 = (e0[0] * 2.0 + -(e1[0] * 2.0)) * t24 * t26
    t26 *= (e0[1] * 2.0 + -(e1[1] * 2.0)) * t24
    g[0] = t14 * t23 * t25 * 2.0
    g[1] = t13 * t23 * t25 * -2.0
    t24 = t23 * t25
    g[2] = -t27 - t24 * (-e1[1] + p[1]) * 2.0
    g[3] = -t26 + t24 * (-e1[0] + p[0]) * 2.0
    g[4] = t27 + t24 * (p[1] - e0[1]) * 2.0
    g[5] = t26 - t24 * (p[0] - e0[0]) * 2.0
    return g

Barrier Energy and Its Derivatives

With the point-edge distance functions implemented, we can traverse all point-edge pairs to assemble the total barrier energy and its derivatives. These will be used to solve for the search direction in the time-stepping optimization.

Since squared distances are used, here we rescale the barrier function to $b (d^{2}, \hat{d}^{2}) = {\frac{κ}{8} \hat{d} (\frac{d ^{2}}{d ^ ^{2}} - 1) ln \frac{d ^{2}}{d ^ ^{2}} 0 d < \hat{d} d \geq \hat{d}, (21.3.1)$ so that $\frac{\partial ^{2} b}{\partial ( \frac{d}{d ^} ) ^{2}} (\hat{d}^{2}, \hat{d}^{2}) = κ \hat{d}$ still holds. Analogous to elasticity, $s = d / \hat{d}$ can be viewed as a strain measure, then the 2nd-order derivative of the energy density (per area) function $b$ w.r.t. $s$ at $s = 1$ would correspond to Young's modulus times thickness $\hat{d}$ , which makes $κ$ physically meaningful and convenient to set.

Based on Equation (20.3.1), we can derive the gradient and Hessian of the barrier potential as $\nabla P_{b} (x) \nabla^{2} P_{b} (x) = \frac{1}{2} \overset{a}{^} \sum A_{\overset{a}{^}} e \in E - I (X_{\overset{a}{^}}) \sum \frac{\partial b}{\partial d} (d (x_{\overset{a}{^}}, e), \hat{d}^{2}) \nabla_{x} d (x_{\overset{a}{^}}, e) and = \frac{1}{2} \overset{a}{^} \sum A_{\overset{a}{^}} e \in E - I (X_{\overset{a}{^}}) \sum (\frac{\partial ^{2} b}{\partial d ^{2}} (d (x_{\overset{a}{^}}, e), \hat{d}^{2}) \nabla_{x} d (x_{\overset{a}{^}}, e) \nabla_{x} d (x_{\overset{a}{^}}, e)^{T} + \frac{\partial b}{\partial d} (d (x_{\overset{a}{^}}, e), \hat{d}^{2}) \nabla_{x}^{2} d (x_{\overset{a}{^}}, e)),$ where $A_{\overset{a}{^}} = \frac{∥ X _{\overset{a}{^}} - X _{\overset{a}{^} - 1} ∥ + ∥ X _{\overset{a}{^}} - X _{\overset{a}{^} + 1} ∥}{2}$ and we omitted the superscripts and subscripts for the squared point-edge distance functions ( $d (x_{\overset{a}{^}}, e)$ denotes $d_{sq}^{PE} (x_{\overset{a}{^}}, e)$ here).

The energy, gradient, and Hessian of the barrier contact potential are implemented as follows:

Implementation 21.3.1 (Barrier energy computation, BarrierEnergy.py).

    # self-contact
    dhat_sqr = dhat * dhat
    for xI in bp:
        for eI in be:
            if xI != eI[0] and xI != eI[1]: # do not consider a point and its incident edge
                d_sqr = PE.val(x[xI], x[eI[0]], x[eI[1]])
                if d_sqr < dhat_sqr:
                    s = d_sqr / dhat_sqr
                    # since d_sqr is used, need to divide by 8 not 2 here for consistency to linear elasticity:
                    sum += 0.5 * contact_area[xI] * dhat * kappa / 8 * (s - 1) * math.log(s)

Implementation 21.3.2 (Barrier energy gradient computation, BarrierEnergy.py).

    # self-contact
    dhat_sqr = dhat * dhat
    for xI in bp:
        for eI in be:
            if xI != eI[0] and xI != eI[1]: # do not consider a point and its incident edge
                d_sqr = PE.val(x[xI], x[eI[0]], x[eI[1]])
                if d_sqr < dhat_sqr:
                    s = d_sqr / dhat_sqr
                    # since d_sqr is used, need to divide by 8 not 2 here for consistency to linear elasticity:
                    local_grad = 0.5 * contact_area[xI] * dhat * (kappa / 8 * (math.log(s) / dhat_sqr + (s - 1) / d_sqr)) * PE.grad(x[xI], x[eI[0]], x[eI[1]])
                    g[xI] += local_grad[0:2]
                    g[eI[0]] += local_grad[2:4]
                    g[eI[1]] += local_grad[4:6]

Implementation 21.3.3 (Barrier energy Hessian computation, BarrierEnergy.py).

    # self-contact
    dhat_sqr = dhat * dhat
    for xI in bp:
        for eI in be:
            if xI != eI[0] and xI != eI[1]: # do not consider a point and its incident edge
                d_sqr = PE.val(x[xI], x[eI[0]], x[eI[1]])
                if d_sqr < dhat_sqr:
                    d_sqr_grad = PE.grad(x[xI], x[eI[0]], x[eI[1]])
                    s = d_sqr / dhat_sqr
                    # since d_sqr is used, need to divide by 8 not 2 here for consistency to linear elasticity:
                    local_hess = 0.5 * contact_area[xI] * dhat * utils.make_PSD(kappa / (8 * d_sqr * d_sqr * dhat_sqr) * (d_sqr + dhat_sqr) * np.outer(d_sqr_grad, d_sqr_grad) \
                        + (kappa / 8 * (math.log(s) / dhat_sqr + (s - 1) / d_sqr)) * PE.hess(x[xI], x[eI[0]], x[eI[1]]))
                    index = [xI, eI[0], eI[1]]
                    for nI in range(0, 3):
                        for nJ in range(0, 3):
                            for c in range(0, 2):
                                for r in range(0, 2):
                                    IJV[0].append(index[nI] * 2 + r)
                                    IJV[1].append(index[nJ] * 2 + c)
                                    IJV[2] = np.append(IJV[2], local_hess[nI * 2 + r, nJ * 2 + c])

Continuous Collision Detection

Now, we have all the ingredients to solve for the search direction in a simulation with self-contact. After obtaining the search direction, we perform line search filtering for the point-edge pairs.

Implementation 21.4.1 (Line search filtering, BarrierEnergy.py).

    # self-contact
    for xI in bp:
        for eI in be:
            if xI != eI[0] and xI != eI[1]: # do not consider a point and its incident edge
                if CCD.bbox_overlap(x[xI], x[eI[0]], x[eI[1]], p[xI], p[eI[0]], p[eI[1]], alpha):
                    toc = CCD.narrow_phase_CCD(x[xI], x[eI[0]], x[eI[1]], p[xI], p[eI[0]], p[eI[1]], alpha)
                    if alpha > toc:
                        alpha = toc

Here, we perform an overlap check on the bounding boxes of the spans of the point and edge first to narrow down the number of point-edge pairs for which we need to compute the time of impact:

Implementation 21.4.2 (Bounding box overlap check, CCD.py).

from copy import deepcopy
import numpy as np
import math

import distance.PointEdgeDistance as PE

# check whether the bounding box of the trajectory of the point and the edge overlap
def bbox_overlap(p, e0, e1, dp, de0, de1, toc_upperbound):
    max_p = np.maximum(p, p + toc_upperbound * dp) # point trajectory bbox top-right
    min_p = np.minimum(p, p + toc_upperbound * dp) # point trajectory bbox bottom-left
    max_e = np.maximum(np.maximum(e0, e0 + toc_upperbound * de0), np.maximum(e1, e1 + toc_upperbound * de1)) # edge trajectory bbox top-right
    min_e = np.minimum(np.minimum(e0, e0 + toc_upperbound * de0), np.minimum(e1, e1 + toc_upperbound * de1)) # edge trajectory bbox bottom-left
    if np.any(np.greater(min_p, max_e)) or np.any(np.greater(min_e, max_p)):
        return False
    else:
        return True

To calculate a sufficiently large conservative estimation of the time of impact (TOI), we cannot directly calculate the TOI and take a proportion of it as we did for point-ground contact in Filter Line Search. Directly calculating the TOI for contact primitive pairs requires solving quadratic or cubic root-finding problems in 2D and 3D, which are prone to numerical errors, especially when distances are tiny and configurations are numerically degenerate (e.g., nearly parallel edge-edge pairs in 3D).

Thus, we implement the additive CCD method (ACCD) [Li et al. 2021], which iteratively moves the contact pairs along the search direction until the minimum separation distance is reached, to robustly estimate the TOI.

Taking a point-edge pair as an example, the key insight of ACCD is that, given the current positions $p$ , $e_{0}$ , $e_{1}$ and search directions $d_{p}$ , $d_{e 0}$ , $d_{e 1}$ , its TOI can be calculated as

$α_{TOI} = \frac{∥ p - (( 1 - λ ) e _{0} + λ e _{1} ) ∥}{∥ d _{p} - (( 1 - λ ) d _{e 0} + λ d _{e 1} ) ∥}$

assuming $(1 - λ) e_{0} + λ e_{1}$ is the point on the edge that $p$ will first collide with. The issue is that we do not know $λ$ a priori. However, we can derive a lower bound for $α_{TOI}$ as

$α_{TOI} \geq \frac{min _{λ \in [0, 1]} ∥ p - (( 1 - λ ) e _{0} + λ e _{1} ) ∥}{∥ d _{p} ∥ + ∥ ( 1 - λ ) d _{e 0} + λ d _{e 1} ∥} \geq \frac{d ^{PE} ( p , e _{0} , e _{1} )}{∥ d _{p} ∥ + max ( ∥ d _{e 0} ∥ , ∥ d _{e 1} ∥ )} = α_{l}$

By taking a step with this lower bound $α_{l}$ , we are guaranteed to have no interpenetration for this pair. However, although straightforward to compute, $α_{l}$ can be much smaller than $α_{TOI}$ . Therefore, we iteratively calculate $α_{l}$ and advance a copy of the participating nodes by this amount, accumulating all $α_{l}$ to monotonically improve the estimate of $α_{TOI}$ until the point-edge pair reaches a distance smaller than the minimum separation, e.g., $0.1 \times$ the original distance. The implementation is as follows, where we first remove the shared components of the search directions so that they have smaller magnitudes to achieve earlier termination of the algorithm.

Implementation 21.4.3 (ACCD method implementation, CCD.py).

# compute the first "time" of contact, or toc,
# return the computed toc only if it is smaller than the previously computed toc_upperbound
def narrow_phase_CCD(_p, _e0, _e1, _dp, _de0, _de1, toc_upperbound):
    p = deepcopy(_p)
    e0 = deepcopy(_e0)
    e1 = deepcopy(_e1)
    dp = deepcopy(_dp)
    de0 = deepcopy(_de0)
    de1 = deepcopy(_de1)

    # use relative displacement for faster convergence
    mov = (dp + de0 + de1) / 3 
    de0 -= mov
    de1 -= mov
    dp -= mov
    maxDispMag = np.linalg.norm(dp) + math.sqrt(max(np.dot(de0, de0), np.dot(de1, de1)))
    if maxDispMag == 0:
        return toc_upperbound

    eta = 0.1 # calculate the toc that first brings the distance to 0.1x the current distance
    dist2_cur = PE.val(p, e0, e1)
    dist_cur = math.sqrt(dist2_cur)
    gap = eta * dist_cur
    # iteratively move the point and edge towards each other and
    # grow the toc estimate without numerical errors
    toc = 0
    while True:
        tocLowerBound = (1 - eta) * dist_cur / maxDispMag

        p += tocLowerBound * dp
        e0 += tocLowerBound * de0
        e1 += tocLowerBound * de1
        dist2_cur = PE.val(p, e0, e1)
        dist_cur = math.sqrt(dist2_cur)
        if toc != 0 and dist_cur < gap:
            break

        toc += tocLowerBound
        if toc > toc_upperbound:
            return toc_upperbound

    return toc

The final simulation results are demonstrated in Figure 21.4.1.

**Figure 21.4.1.** Two squares dropped onto the ground and compressed by a ceiling. The ground has friction coefficient $0.4$ but there is no friction between the squares so that the top square slides down to the ground without significantly changing the position of the bottom one.

Summary

We have implemented frictionless self-contact with guaranteed non-intersection for 2D FEM simulations by discretizing barrier energies onto the non-incident point-edge pairs on the boundary.

To compute the barrier energies, we used squared point-edge distances to avoid potential numerical issues. The point-edge distance is a piecewise smooth function with closed-form expressions depending on the relative positions of the point and the edge, and the overall function is $C^{1}$ -continuous everywhere. The derivatives of the function can be conveniently obtained by applying symbolic differentiation to each expression.

For line search filtering, instead of directly computing the time of impact (TOI) which is prone to numerical issues, we implemented the additive CCD method (ACCD) to obtain a sufficiently large and conservative estimate of TOI. ACCD is an iterative method that accumulates lower bounds of TOI while progressively advancing the nodes along the search direction. Before running ACCD, we perform overlap checks on the bounding boxes of the point's and edge's spans to quickly filter out non-colliding pairs.

In later lectures, we will see that for large-scale scenes in 3D, efficient spatial indexing strategies such as spatial hashing and bounding box hierarchies (BVH) will be needed to significantly reduce the expensive spatial search costs.

In the next lecture, we will implement frictional self-contact based on what we have just developed.

2D Frictional Self-Contact*

In this lecture, we implement 2D friction based on our 2D self-contact implementation in Case Study: 2D Self-Contact. The executable Python project for this section can be found at https://github.com/phys-sim-book/solid-sim-tutorial.

For simplicity, we will focus on implementing a semi-implicit version of friction. This means the normal force magnitude $λ$ and the tangent operator $T$ will be discretized to the last time step, and we solve the optimization once per time step without further fixed-point iterations that converge to solutions with fully-implicit friction (Frictional Contact) under the 8_self_friction folder. MUDA GPU implementations can be found at https://github.com/phys-sim-book/solid-sim-tutorial-gpu under the simulators/8_self_friction folder. Combined with the smoothly approximated static-dynamic friction transition in IPC, implementing friction into an optimization time integration framework is as straightforward as adding an extra potential energy.

Discretization and Approximation

From Equation (18.4.2), the friction force per unit area is defined as

$T_{F} (X, t) = - μ ∥ T_{C} (X, t) ∥ f (∥ V_{F} (X, t) ∥) s (V_{F} (X, t)),$

where $μ$ is the friction coefficient, $T_{C}$ is the normal contact force, and $V_{F}$ is the relative sliding velocity. Here $s (V_{F}) = \frac{V _{F}}{∥ V _{F} ∥}$ when $∥ V_{F} ∥ > 0$ , while $s (V_{F})$ takes any unit vector orthogonal to the normal $N (X, t)$ when $∥ V_{F} ∥ = 0$ . Additionally, the friction scaling function $f$ is also nonsmooth with respect to $V_{F}$ , as $f (∥ V_{F} ∥) = 1$ when $∥ V_{F} ∥ > 0$ , and $f (∥ V_{F} ∥) \in [0, 1]$ when $∥ V_{F} ∥ = 0$ .

It is important to note that without temporal discretization, there is no potential energy for friction. However, similar to Frictional Contact, once we discretize the normal force magnitude and the tangent operator to the last time step and smoothly approximate the friction scaling function $f$ , the friction force at the $(n + 1)$ -th time step becomes integrable with respect to $x$ , and we obtain

$T_{F}^{n + 1} (X) \approx - \frac{\partial D ^{n + 1} ( X )}{\partial x ^{n + 1} ( X )} = - \frac{\partial ( μ ∥ T _{C}^{n} ( X ) ∥ f _{0} ( ∥ V ˉ _{F}^{n + 1} ( X ) h ^ ∥ ) )}{\partial x ^{n + 1} ( X )} .$

Here, $\overset{ˉ}{V}_{F}^{n + 1} (X) = (I - N^{n} (X) N^{n} (X)^{T}) (V^{n + 1} (X) - V^{n + 1} (X_{2}))$ is the approximate relative sliding velocity, where $N^{n}$ and $X_{2}$ are the normal direction and the point in contact with $X$ in the last time step, $\hat{h} I = (\partial v / \partial x)^{- 1}$ , and

$f_{0} (y) = {- \frac{y ^{3}}{3 ϵ _{v}^{2} h ^ ^{2}} + \frac{y ^{2}}{ϵ _{v} h ^} + \frac{ϵ _{v} h ^}{3}, y, y \in [0, ϵ_{v} \hat{h}); y \geq ϵ_{v} \hat{h} .$

Therefore, considering self-contact, the approximate friction potential over the entire boundary can be written as

$\int_{Γ_{C}} \frac{1}{2} μ ∥ T_{C}^{n} (X) ∥ f_{0} (∥ \overset{ˉ}{V}_{F}^{n + 1} (X) \hat{h} ∥) d s (X),$

where the $\frac{1}{2}$ scaling comes from double counting the friction between each pair of contact points in the integral (similar to the normal contact forces in Boundary Conditions and Frictional Contact).

After discretizing the boundary curves as polylines and approximating the max operator in the normal contact force component using summations (Piecewise Linear Boundaries), we similarly obtain the spatially discretized friction potential:

$\int_{Γ_{C}} e \in E - I (X) \sum \frac{1}{2} μ (- \frac{\partial b ( d ^{PE} ( x ^{n} ( X ) , e ) , d ^ )}{\partial d}) f_{0} (∥ \overset{ˉ}{V}_{F}^{n + 1} (X, e) \hat{h} ∥) d s (X) .$

Here, $d^{PE} (x^{n} (X), e)$ is the point-edge distance between $x^{n} (X)$ and edge $e$ in the last time step, and $\overset{ˉ}{V}_{F}^{n + 1} (X, e)$ is the approximate relative sliding velocity of the point-edge pair with contact normal and the closest point discretized to the last time step (see next section for details).

If we choose boundary nodes as quadrature points to approximate the integral, we finally obtain our discrete friction potential:

$P_{f} (x) = \overset{a}{^} \sum A_{\overset{a}{^}} e \in E - I (X_{\overset{a}{^}}) \sum \frac{1}{2} μ (- \frac{\partial b ( d ^{PE} ( x _{\overset{a}{^}}^{n} , e ) , d ^ )}{\partial d}) f_{0} (∥ \overset{ˉ}{V}_{F}^{n + 1} (X_{\overset{a}{^}}, e) \hat{h} ∥) = k \in {(\overset{a}{^}, e)} \sum μ λ_{k}^{n} f_{0} (∥ \overset{ˉ}{v}_{k} \hat{h} ∥)$

where $A_{\overset{a}{^}} = \frac{∥ X _{\overset{a}{^}} - X _{\overset{a}{^} - 1} ∥ + ∥ X _{\overset{a}{^}} - X _{\overset{a}{^} + 1} ∥}{2}$ is the integration weight. Denoting $\overset{ˉ}{v}_{k} = \overset{ˉ}{V}_{F}^{n + 1} (X_{\overset{a}{^}}, e)$ and $λ_{k}^{n} = \frac{1}{2} A_{\overset{a}{^}} (- \frac{\partial b ( d ^{PE} ( x _{\overset{a}{^}}^{n} , e ) , d ^ )}{\partial d})$ , the expression of $P_{f}$ agrees with the discrete form of Equation (9.2.1) we directly derived, except that here $k$ traverses all non-incident point-edge pairs on the boundary.

Based on this discrete form of the smoothed semi-implicit friction potential, we now need to determine how to calculate $λ$ and $\overset{ˉ}{v}$ for point-edge pairs, implement the computation of the value, gradient, and Hessian of $P_{f} (x)$ , and then incorporate them into the optimization.

Precomputing Normal and Tangent Information

To make the temporally discretized friction force integrable, we must explicitly discretize certain normal and tangent information. This information only needs to be calculated once at the beginning of each time step, as it will remain constant during each optimization.

First, we need to calculate $λ^{n}$ for each point-edge pair using $x^{n}$ . Recall that we used squared distances as input for the barrier functions, so $λ^{n}$ should be calculated using the chain rule as follows:

$λ_{\overset{a}{^}, e}^{n} = \frac{1}{2} A_{\overset{a}{^}} (- \frac{\partial b ( d _{sq}^{PE} ( x _{\overset{a}{^}}^{n} , e ) , d ^ ^{2} )}{\partial d ^{PE}}) = \frac{1}{2} A_{\overset{a}{^}} (- \frac{\partial b ( d _{sq}^{PE} ( x _{\overset{a}{^}}^{n} , e ) , d ^ ^{2} )}{\partial d _{sq}^{PE}} \frac{\partial d _{sq}^{PE}}{\partial d ^{PE}}) = \frac{1}{2} A_{\overset{a}{^}} (- \frac{\partial b ( d _{sq}^{PE} ( x _{\overset{a}{^}}^{n} , e ) , d ^ ^{2} )}{\partial d _{sq}^{PE}}) 2 d^{PE} .$

According to the scaled barrier function taking squared distance as input (Equation (21.3.1)), we can derive

$\frac{\partial b ( d _{sq} , d ^ ^{2} )}{\partial d _{sq}} = {\frac{κ}{8} \hat{d} (\frac{1}{d ^ ^{2}} ln \frac{d _{sq}}{d ^ ^{2}} + \frac{1}{d _{sq}} (\frac{d _{sq}}{d ^ ^{2}} - 1)) 0 if d < \hat{d}; if d \geq \hat{d} .$

Remark 22.2.1. The set of point-edge pairs for friction in our semi-implicit friction setting is fixed in each time step and is different from the set of normal contact pairs. The set for friction only contains those pairs with $d^{PE} (x_{\overset{a}{^}}^{n}, e) < \hat{d}$ , and this does not change with the optimization variable $x^{n + 1}$ in the current time step.

Now for the tangent information, the key is to keep the normal and the barycentric coordinate of the closest point on the edge constant. For the $k$ -th point-edge pair, if we denote the node indices for the point and edge as $p$ , $e_{0}$ , and $e_{1}$ , then we can write the relative sliding velocity as

$v_{k} = (I - n n^{T}) (v_{p} - ((1 - r) v_{e_{0}} + r v_{e_{1}})),$

where $r = arg min_{c} ∥ x_{p} - ((1 - c) x_{e_{0}} + c x_{e_{1}}) ∥$ is the barycentric coordinate and $n = (x_{p} - ((1 - r) x_{e_{0}} + r x_{e_{1}})) /∥ x_{p} - ((1 - r) x_{e_{0}} + r x_{e_{1}}) ∥$ is the normal of the edge. Here we see that $r$ and $n$ are both dependent on $x$ , so directly integrating $v_{k}$ is nontrivial. By calculating $n$ and $r$ using $x^{n}$ , we obtain the semi-implicit relative sliding velocity

$\overset{ˉ}{v}_{k} = (I - n^{n} (n^{n})^{T}) (v_{p} - ((1 - r^{n}) v_{e_{0}} + r^{n} v_{e_{1}})),$

and now only the velocities are dependent on $x^{n + 1}$ , which makes both integration and differentiation straightforward. If we denote $\hat{v}_{k} = v_{p} - ((1 - r^{n}) v_{e_{0}} + r^{n} v_{e_{1}})$ , we obtain

$\frac{\partial v ˉ _{k}}{\partial v ^ _{k}} = (I - n^{n} (n^{n})^{T}) and \frac{\partial v ^ _{k}}{\partial [ x _{p}^{T} , x _{e_{0}}^{T} , x _{e_{1}}^{T} ] ^{T}} = \frac{1}{h ^} [I (r^{n} - 1) I - r^{n} I] .$

Code

Next, let's look at the code. Implementation 22.2.1 calculates the barycentric coordinate of the closest point and the normal given point-edge nodal positions. The idea is to orthogonally project $x_{p}$ onto the edge.

Implementation 22.2.1 (Calculating contact point and normal, PointEdgeDistance.py).

# compute normal and the parameterization of the closest point on the edge
def tangent(p, e0, e1):
    e = e1 - e0
    ratio = np.dot(e, p - e0) / np.dot(e, e)
    if ratio < 0:    # point(p)-point(e0) expression
        n = p - e0
    elif ratio > 1:  # point(p)-point(e1) expression
        n = p - e1
    else:            # point(p)-line(e0e1) expression
        n = p - ((1 - ratio) * e0 + ratio * e1)
    return [n / np.linalg.norm(n), ratio]

Then, Implementation 22.2.2 traverses all non-incident point-edge pairs with a distance smaller than $\hat{d}$ , calculates $λ$ , and calls the above function to calculate $n$ and $r$ .

As in Frictional Contact, these lines of code are executed at the beginning of each time step in time_integrator.py, and the information for each friction pair is stored and passed to the energy, gradient, and Hessian computation functions that we will discuss next.

Implementation 22.2.2 (Semi-implicit friction precomputation, BarrierEnergy.py).

    # self-contact
    mu_lambda_self = []
    dhat_sqr = dhat * dhat
    for xI in bp:
        for eI in be:
            if xI != eI[0] and xI != eI[1]: # do not consider a point and its incident edge
                d_sqr = PE.val(x[xI], x[eI[0]], x[eI[1]])
                if d_sqr < dhat_sqr:
                    s = d_sqr / dhat_sqr
                    # since d_sqr is used, need to divide by 8 not 2 here for consistency to linear elasticity
                    # also, lambda = -\partial b / \partial d = -(\partial b / \partial d^2) * (\partial d^2 / \partial d)
                    mu_lam = mu * -0.5 * contact_area[xI] * dhat * (kappa / 8 * (math.log(s) / dhat_sqr + (s - 1) / d_sqr)) * 2 * math.sqrt(d_sqr)
                    [n, r] = PE.tangent(x[xI], x[eI[0]], x[eI[1]]) # normal and closest point parameterization on the edge
                    mu_lambda_self.append([xI, eI[0], eI[1], mu_lam, n, r])

Friction Energy and Its Derivatives

With $λ$ , $r$ , and $n$ precomputed for each friction point-edge pair, we can now conveniently compute the energy (Implementation 22.3.1), gradient (Implementation 22.3.2), and Hessian (Implementation 22.3.3) of the friction potential and add them into the optimization.

Implementation 22.3.1 (Friction energy value, FrictionEnergy.py).

    # self-contact:
    for i in range(0, len(mu_lambda_self)):
        [xI, eI0, eI1, mu_lam, n, r] = mu_lambda_self[i]
        T = np.identity(2) - np.outer(n, n)
        rel_v = v[xI] - ((1 - r) * v[eI0] + r * v[eI1])
        vbar = np.transpose(T).dot(rel_v)
        sum += mu_lam * f0(np.linalg.norm(vbar), epsv, hhat)

When computing the gradient and Hessian, we used the relative velocity $\hat{v}_{k}$ as an intermediate variable to make the implementation more organized. This approach is given by: $\nabla P_{f} (x) = k \sum (\frac{\partial v ^ _{k}}{\partial x})^{T} \frac{\partial D _{k} ( x )}{\partial v ^ _{k}}, \nabla^{2} P_{f} (x) = k \sum (\frac{\partial v ^ _{k}}{\partial x})^{T} \frac{\partial ^{2} D _{k} ( x )}{\partial v ^ _{k}^{2}} \frac{\partial v ^ _{k}}{\partial x},$ where the derivatives of $D_{k}$ with respect to $\hat{v}_{k}$ have exactly the same forms as in Frictional Contact.

Implementation 22.3.2 (Friction energy gradient, FrictionEnergy.py).

    # self-contact:
    for i in range(0, len(mu_lambda_self)):
        [xI, eI0, eI1, mu_lam, n, r] = mu_lambda_self[i]
        T = np.identity(2) - np.outer(n, n)
        rel_v = v[xI] - ((1 - r) * v[eI0] + r * v[eI1])
        vbar = np.transpose(T).dot(rel_v)
        g_rel_v = mu_lam * f1_div_vbarnorm(np.linalg.norm(vbar), epsv) * T.dot(vbar)
        g[xI] += g_rel_v
        g[eI0] += g_rel_v * -(1 - r)
        g[eI1] += g_rel_v * -r

Implementation 22.3.3 (Friction energy Hessian, FrictionEnergy.py).

    # self-contact:
    for i in range(0, len(mu_lambda_self)):
        [xI, eI0, eI1, mu_lam, n, r] = mu_lambda_self[i]
        T = np.identity(2) - np.outer(n, n)
        rel_v = v[xI] - ((1 - r) * v[eI0] + r * v[eI1])
        vbar = np.transpose(T).dot(rel_v)
        vbarnorm = np.linalg.norm(vbar)
        inner_term = f1_div_vbarnorm(vbarnorm, epsv) * np.identity(2)
        if vbarnorm != 0:
            inner_term += f_hess_term(vbarnorm, epsv) / vbarnorm * np.outer(vbar, vbar)
        hess_rel_v = mu_lam * T.dot(utils.make_PSD(inner_term)).dot(np.transpose(T)) / hhat
        index = [xI, eI0, eI1]
        d_rel_v_dv = [1, -(1 - r), -r]
        for nI in range(0, 3):
            for nJ in range(0, 3):
                for c in range(0, 2):
                    for r in range(0, 2):
                        IJV[0].append(index[nI] * 2 + r)
                        IJV[1].append(index[nJ] * 2 + c)
                        IJV[2] = np.append(IJV[2], d_rel_v_dv[nI] * d_rel_v_dv[nJ] * hess_rel_v[r, c])

After these implementations, we can finally run our compressing squares example with frictional self-contact (see: Figure 22.3.1). From the figure, we observe that once the two squares touch, the large friction between them and the ground restricts any sliding. This causes the squares to rotate counter-clockwise as they are compressed by the ceiling.

**Figure 22.3.1.** Two squares dropped onto the ground and compressed by a ceiling. The friction coefficient is $0.4$ between any contacting surfaces, which restricts any sliding here in this scene and results in counter-clockwise rotations of the two squares under compression. As their interface becomes nearly vertical, the squares are finally detached.

Summary

We implemented semi-implicit friction in 2D based on squared unsigned distances of point-edge pairs and incorporated it into the time-stepping optimization.

We began by making the friction force integrable in the continuous setting through semi-implicit temporal discretization and a smooth approximation of the dynamic-static friction transition. The spatial discretization of the approximate friction potential follows a similar approach to the barrier potential.

Next, we examined the computation of the normal force magnitude $λ$ , normal direction $n$ , and barycentric coordinate $r$ of the closest point for point-edge pairs. These values are calculated at the beginning of each time step and remain constant during the optimization. It is important to note that the set of point-edge pairs for friction is also constant per optimization and differs from the set used for the barrier.

Finally, we implemented the computation of the discrete friction potential and its derivatives. We used relative velocities $\hat{v}_{k}$ as intermediate variables and applied the chain rule to organize the calculations.

Up to now, we have covered both the theoretical and practical aspects of a 2D solid simulator with inversion-free elastodynamics and interpenetration-free frictional self-contact. Next, we will explore the additional steps needed to extend these concepts to 3D!

3D Elastodynamics

To extend our 2D solid simulator (2D Frictional Self-Contact) to 3D, we can use $3$ -simplex tetrahedral elements to discretize the 3D solid domains. In this approach, the surface of the solid is represented as a triangle mesh, which is a common method in computer graphics for representing 3D geometries. Additionally, we need to sample vertices in the interior of the solid to form the tetrahedral elements required for discretizing the inertia and elasticity energies.

Kinematics

Similar to 2D triangle elements, we use $β, γ, τ \in [0, 1]$ with $β + γ + τ \leq 1$ to express the material space coordinates of an arbitrary point $X$ in the tetrahedral element defined by vertices $X_{1}, X_{2}, X_{3},$ and $X_{4}$ as follows: $X (β, γ, τ) = X_{1} + β (X_{2} - X_{1}) + γ (X_{3} - X_{1}) + τ (X_{4} - X_{1}) = (1 - β - γ - τ) X_{1} + β X_{2} + γ X_{3} + τ X_{4} .$ Here, $X$ is a linear function of $(β, γ, τ)$ . Using linear shape functions, the approximate world-space coordinate $\hat{x}$ is also a linear function of $(β, γ, τ)$ : $x (β, γ, τ) \approx \hat{x} (β, γ, τ) = (1 - β - γ - τ) x_{1} + β x_{2} + γ x_{3} + τ x_{4},$ where $x (X_{i})$ is denoted as $x_{i}$ . This implies that the shape functions are: $N_{1} (β, γ, τ) N_{2} (β, γ, τ) N_{3} (β, γ, τ) N_{4} (β, γ, τ) = 1 - β - γ - τ, = β, = γ, = τ .$

Mass Matrix

Recall that the mass matrix can be calculated as $M_{ab} = e \in T \sum \int_{Ω_{e}^{0}} R (X, 0) N_{a} (X) N_{b} (X) d X,$ where $Ω_{e}^{0}$ represents the material space of tetrahedron $e$ . Changing the integration variable from $X$ to $(β, γ, τ)$ results in $= \int_{Ω_{e}^{0}} R (X, 0) N_{a} (X) N_{b} (X) d X \int_{0}^{1} \int_{0}^{1 - τ} \int_{0}^{1 - β - τ} R (β, γ, τ, 0) N_{a} (β, γ, τ) N_{b} (β, γ, τ) det (\frac{\partial X}{\partial ( β , γ , τ )}) d γ d β d τ .$

For element $e$ with vertices $X_{1}$ , $X_{2}$ , $X_{3}$ , and $X_{4}$ , $det (\frac{\partial X}{\partial ( β , γ , τ )}) = ∣ det ([X_{2} - X_{1}, X_{3} - X_{1}, X_{4} - X_{1}]) ∣ = 6 V_{e},$ where $V_{e}$ is the volume of tetrahedron $e$ .

Here, we will omit the detailed derivations of each entry in the consistent mass matrix. Assuming uniform density $R$ , for the lumped mass matrix, $M_{aa}^{lump} = e \in T (a) \sum \frac{1}{4} R V_{e} and M_{ab}^{lump} = 0 (a \neq = b),$ where $T (a)$ denotes the set of tetrahedra incident to node $a$ . In other words, the mass of each tetrahedron is evenly distributed among its 4 nodes, which is intuitively analogous to the 2D case.

Elasticity

For elasticity, similar to the 2D case, the deformation gradient $F$ is also constant within each tetrahedron, and we can compute it as $F = \approx = \frac{\partial x}{\partial ( β , γ , τ )} (\frac{\partial X}{\partial ( β , γ , τ )})^{- 1} \frac{\partial x ^}{\partial ( β , γ , τ )} (\frac{\partial X}{\partial ( β , γ , τ )})^{- 1} [x_{2} - x_{1}, x_{3} - x_{1}, x_{4} - x_{1}] [X_{2} - X_{1}, X_{3} - X_{1}, X_{4} - X_{1}]^{- 1} .$ For force and Hessian computation, the required $\partial F / \partial x$ can be computed using $\nabla^{X} N_{1} (X) = \frac{\partial ( 1 - β - γ - τ )}{\partial X} = (\frac{\partial ( 1 - β - γ - τ )}{\partial ( β , γ , τ )} (\frac{\partial X}{\partial ( β , γ , τ )})^{- 1})^{T} = ([- 1, - 1, - 1] [X_{2} - X_{1}, X_{3} - X_{1}, X_{4} - X_{1}]^{- 1})^{T}$ and similarly $\nabla^{X} N_{2} (X) \nabla^{X} N_{3} (X) \nabla^{X} N_{3} (X) = \frac{\partial β}{\partial X} = ([1, 0, 0] [X_{2} - X_{1}, X_{3} - X_{1}, X_{4} - X_{1}]^{- 1})^{T}, = \frac{\partial γ}{\partial X} = ([0, 1, 0] [X_{2} - X_{1}, X_{3} - X_{1}, X_{4} - X_{1}]^{- 1})^{T}, = \frac{\partial τ}{\partial X} = ([0, 0, 1] [X_{2} - X_{1}, X_{3} - X_{1}, X_{4} - X_{1}]^{- 1})^{T} .$ With $F$ , the computation of strain energy $Ψ$ , stress $P$ and stress derivative $\partial P / \partial F$ can all be found in Strain Energy and Stress and Its Derivatives, and the computation of forces and Hessian matrices follow the same spirit as in 2D.

To guarantee non-inversion of the tetrahedral elements during the simulation, the critical step size $α^{I}$ that first brings the volume of any tetrahedra to $0$ can be obtained by solving a 1D equation per tetrahedron $V (x_{i} + α^{I} p_{i}) = 0,$ and then take the minimum of the solved step sizes. Here $p_{i}$ is the search direction of node $i$ , and in 3D, this is equivalent to $det ([x_{21}^{α}, x_{31}^{α}, x_{41}^{α}]) \equiv (x_{21}^{α} \times x_{31}^{α}) \cdot x_{41}^{α} = 0 (23.3.1)$ with $x_{ij}^{α} = x_{ij} + α^{I} p_{ij}$ and $x_{ij} = x_{i} - x_{j}$ , $p_{ij} = p_{i} - p_{j}$ . Expanding Equation (23.3.1), we obtain the following cubic equation for $α^{I}$ :

$((p_{21} \times p_{31}) \cdot p_{41}) α^{I}^{3} + ((x_{21} \times p_{31} + p_{21} \times x_{31}) \cdot p_{41} + (p_{21} \times p_{31}) \cdot x_{41}) α^{I}^{2} + ((x_{21} \times p_{31} + p_{21} \times x_{31}) \cdot x_{41} + (x_{21} \times x_{31}) \cdot p_{41}) α^{I} + (x_{21} \times x_{31}) \cdot x_{41} = 0,$

This cubic equation can sometimes degenerate into a quadratic or linear equation, particularly when node movements do not substantially alter the tetrahedron's volume. To address potential numerical instability, we scale the equation terms based on the constant term coefficient:

$\frac{( p _{21} \times p _{31} ) \cdot p _{41}}{( x _{21} \times x _{31} ) \cdot x _{41}} α^{I}^{3} + \frac{( x _{21} \times p _{31} + p _{21} \times x _{31} ) \cdot p _{41} + ( p _{21} \times p _{31} ) \cdot x _{41}}{( x _{21} \times x _{31} ) \cdot x _{41}} α^{I}^{2} + \frac{( x _{21} \times p _{31} + p _{21} \times x _{31} ) \cdot x _{41} + ( x _{21} \times x _{31} ) \cdot p _{41}}{( x _{21} \times x _{31} ) \cdot x _{41}} α^{I} + 1 = 0, (23.3.2)$

ensuring that magnitude checks can be safely performed with standard thresholds (e.g., $1 0^{- 6}$ ).

Practically, we also ensure some safety margin by solving for $α^{I}$ that reduces the volume of any tetrahedron by 80%, modifying the constant term coefficient in Equation (23.3.2) from $1$ to $0.8$ . If no positive real roots are found, the step size can be considered safe, and inversion will not occur. Here is the C++ code snippet for solving this scaled cubic equation:

Implementation 23.3.1 (Cubic Equation Solver).

double getSmallestPositiveRealRoot_cubic(double a, double b, double c, double d,
    double tol)
{
    // return negative value if no positive real root is found
    double t = -1;

    if (abs(a) <= tol)
        t = getSmallestPositiveRealRoot_quad(b, c, d, tol); // covered in the 2D case
    else {
        complex<double> i(0, 1);
        complex<double> delta0(b * b - 3 * a * c, 0);
        complex<double> delta1(2 * b * b * b - 9 * a * b * c + 27 * a * a * d, 0);
        complex<double> C = pow((delta1 + sqrt(delta1 * delta1 - 4.0 * delta0 * delta0 * delta0)) / 2.0, 1.0 / 3.0);
        if (std::abs(C) == 0.0) // a corner case
            C = pow((delta1 - sqrt(delta1 * delta1 - 4.0 * delta0 * delta0 * delta0)) / 2.0, 1.0 / 3.0);

        complex<double> u2 = (-1.0 + sqrt(3.0) * i) / 2.0;
        complex<double> u3 = (-1.0 - sqrt(3.0) * i) / 2.0;

        complex<double> t1 = (b + C + delta0 / C) / (-3.0 * a);
        complex<double> t2 = (b + u2 * C + delta0 / (u2 * C)) / (-3.0 * a);
        complex<double> t3 = (b + u3 * C + delta0 / (u3 * C)) / (-3.0 * a);

        if ((abs(imag(t1)) < tol) && (real(t1) > 0))
            t = real(t1);
        if ((abs(imag(t2)) < tol) && (real(t2) > 0) && ((real(t2) < t) || (t < 0)))
            t = real(t2);
        if ((abs(imag(t3)) < tol) && (real(t3) > 0) && ((real(t3) < t) || (t < 0)))
            t = real(t3);
    }
    return t;
}

Summary

In this section, we delve into the process of extending our solid simulator to accommodate 3D elastodynamic simulation.

This enhancement involves discretizing the solid domain using 3-simplex tetrahedral elements. Consequently, the kinematics, mass matrix, and elasticity energies adopt the same approach as in 2D, but now incorporate an additional dimension for the per-element parameter space, integration, and deformation gradient.

To maintain inversion-free elements, line search filtering operates similarly, though it now entails solving cubic equations for each element.

In the following section, we will explore the extension of the frictional contact component to 3D scenarios.

3D Frictional Self-Contact

In 3D, the contact between the solid domain boundaries represented as triangle meshes can be reduced to point-triangle and edge-edge contacts. Intuitively, the point-edge contact pairs in 2D extend directly to 3D as point-triangle pairs. However, even if we prevent all point-triangle interpenetrations in 3D, the triangle meshes can still penetrate each other. This necessitates accounting for edge-edge pairs, especially when the resolution of the mesh is not very high.

Barrier and Distances

With triangle mesh discretization, the barrier potential in the continuous settings (Equation (18.3.5)) can be approximated as $\approx = = \int_{Γ_{C}} \frac{1}{2} b (X_{2} \in Γ_{C} - N (X) min ∥ x (X, t) - x (X_{2}, t) ∥, \hat{d}) d s (X) \int_{Γ_{C}} \frac{1}{2} b (e \in T - I (X) min X_{2} \in e min ∥ x (X, t) - x (X_{2}, t) ∥, \hat{d}) d s (X) \int_{Γ_{C}} \frac{1}{2} b (e \in T - I (X) min d^{PT} (x (X, t), e), \hat{d}) d s (X) \int_{Γ_{C}} \frac{1}{2} e \in T - I (X) max b (d^{PT} (x (X, t), e), \hat{d}) d s (X), (24.1.1)$ where $T$ is the set of all surface triangles, $I (X)$ is the set of all surface triangles that hold point $X$ , and $d^{PT}$ is the point-triangle distance. Further approximating the max operator with summations and use mesh surface nodes $\overset{a}{^}$ as quadrature points, we have $\approx \approx \int_{Γ_{C}} \frac{1}{2} e \in T - I (X) max b (d^{PT} (x (X, t), e), \hat{d}) d s (X) \int_{Γ_{C}} \frac{1}{2} e \in T - I (X) \sum b (d^{PT} (x (X, t), e), \hat{d}) d s (X) \overset{a}{^} \sum \frac{1}{2} w_{\overset{a}{^}} e \in T - I (X_{\overset{a}{^}}) \sum b (d^{PT} (x_{\overset{a}{^}}, e), \hat{d}), (24.1.2)$ where $w_{\overset{a}{^}} = \hat{d} \frac{1}{3} \sum_{j \in I (X_{\overset{a}{^}})} A_{j}$ is the integration weight and $A_{j}$ is the area of node $\overset{a}{^}$ 's incident surface triangle $j$ .

Now, getting back to the second line of Equation (24.1.1), if we only use points on the edges to approximate the minimum distance, we obtain $\approx = \int_{Γ_{C}} \frac{1}{2} b (X_{2} \in Γ_{C} - N (X) min ∥ x (X, t) - x (X_{2}, t) ∥, \hat{d}) d s (X) \int_{Γ_{C}} \frac{1}{2} b (e \in E - I (X) min X_{2} \in e min ∥ x (X, t) - x (X_{2}, t) ∥, \hat{d}) d s (X) \int_{Γ_{C}} \frac{1}{2} e \in E - I (X) max b (X_{2} \in e min ∥ x (X, t) - x (X_{2}, t) ∥, \hat{d}) d s (X) .$ Then if we choose a special quadrature point $X_{e_{1}}$ per surface edge $e_{1}$ and approximate the max operators with summations, we get $\approx \int_{Γ_{C}} \frac{1}{2} e \in E - I (X) max b (X_{2} \in e min ∥ x (X, t) - x (X_{2}, t) ∥, \hat{d}) d s (X) e_{1} \in E \sum \frac{1}{2} w_{e_{1}} e \in E - I (X_{e_{1}}) \sum b (X_{2} \in e min ∥ x (X_{e_{1}}, t) - x (X_{2}, t) ∥, \hat{d}),$ where $w_{e_{1}} = \hat{d} \frac{1}{3} \sum_{j \in I (e_{1})} A_{j}$ is the integration weight and $A_{j}$ is the area of $e_{1}$ 's incident surface triangle $j$ . Next, if we always select $X_{e_{1}}$ to be the closest point to $X_{2}$ on $e_{1}$ , we will get $\approx = e_{1} \in E \sum \frac{1}{2} w_{e_{1}} e \in E - I (X_{e_{1}}) \sum b (X_{2} \in e min ∥ x (X_{e_{1}}, t) - x (X_{2}, t) ∥, \hat{d}) e_{1} \in E \sum \frac{1}{2} w_{e_{1}} e \in E - N (e_{1}) \sum b (X_{2} \in e, X \in e_{1} min ∥ x (X, t) - x (X_{2}, t) ∥, \hat{d}) e_{1} \in E \sum \frac{1}{2} w_{e_{1}} e \in E - N (e_{1}) \sum b (d^{EE} (e, e_{1}), \hat{d}), (24.1.3)$ where $N (e_{1})$ is the set of all the surface edge neighbors of $e_{1}$ plus itself. For the summation over all surface edges in Equation (24.1.3), if we only account for $(e_{1}, e)$ with $e_{1} < e$ or the other way around, then the coefficient $1/2$ can be omitted.

Now we have two kinds of discretizations for the 3D barrier potential energy. To use them together in practice, we can take advantage of a linear combination of them, and the coefficient could usually be set to $1/2$ .

For point-triangle and edge-edge distances, they are also both small optimization problems with analytical solutions, which can be represented as piecewise smooth functions like the 2D Point-Edge distance in Equation (21.2.1). For example, in the point-triangle case, the expression can be determined by checking which region the projection of the point onto the triangle plane is located.

Definition 24.1.1 (3D Point-Triangle Distance). The distance between point $x_{p}$ and triangle with vertices $x_{t 1}$ , $x_{t 2}$ , and $x_{t 3}$ can be defined as $arg β_{1}, β_{2} min ∥ x_{p} - (x_{t 1} + β_{1} (x_{t 2} - x_{t 1}) + β_{2} (x_{t 3} - x_{t 1})) ∥ s.t. β_{1}, β_{2} \geq 0, β_{1} + β_{2} \leq 1$

Definition 24.1.2 (3D Edge-Edge Distance). The distance between edge with end nodes $x_{e 1 a}$ and $x_{e 1 b}$ and edge with end nodes $x_{e 2 a}$ and $x_{e 2 b}$ can be defined as $arg β_{1}, β_{2} min ∥ x_{e 1 a} + β_{1} (x_{e 1 b} - x_{e 1 a}) - (x_{e 2 a} + β_{2} (x_{e 2 b} - x_{e 2 a})) ∥ s.t. β_{1}, β_{2} \in [0, 1]$

Remark 24.1.1 (Smoothness of 3D Distance Functions). Note that the point-triangle distance is at least $C^{1}$ continuous everywhere. This means that even when the projected point is located on the borders of the piecewise function, the distance gradient still exists and is continuous. However, for edge-edge distance, when the edges are parallel, the distance function is only $C^{0}$ continuous, as the gradient of the expressions in adjacent regions do not agree. To address this issue, IPC [Li et al. 2020] proposed multiplying a mollifier to the edge-edge barrier energy density function to make the potential $C^{1}$ continuous everywhere. This mollifier smoothly decreases to zero when the edges are parallel. This ensures that gradient-based optimization methods can still be applied efficiently to solve the problem.

Collision Detection

Collision detection in 3D can be significantly more computationally intensive than in 2D due to the larger number of surface primitives involved. Thankfully, spatial data structures like spatial hashing and bounding volume hierarchies (BVH) help efficiently reduce the number of candidate primitive pairs, making continuous collision detection (CCD) more manageable.

Spatial Hashing

The core idea of spatial hashing is to partition the space into a uniform grid and assign each grid cell an array to store the indices of primitives whose bounding boxes intersect with that cell. To find the nearby primitives of a given primitive $A$ (e.g., a point), we identify the grid cells intersecting with $A$ 's bounding box and retrieve the primitive indices stored in these cells. This approach ensures that only nearby primitives are checked for collisions using CCD, eliminating the need for a nested loop to examine all primitive pairs.

Bounding Volume Hierarchies (BVH)

BVH is another effective method for broad-phase collision detection. It organizes primitives into a hierarchy of bounding volumes, allowing for efficient pruning of the search space when detecting potential collisions.

ACCD Method

The ACCD (Adaptive Continuous Collision Detection) method, as discussed in Continuous Collision Detection, is applicable in 3D. In this context, the distance calculations need to be adapted for point-triangle and edge-edge pairs.

Broad Phase Collision Detection

For computing the barrier potential energy, gradient, and Hessian, it is faster and essential to first gather a set of nearby candidate primitive pairs. Then, we compute their distances to determine if they are active (i.e., within a distance $\hat{d}$ ). This filtering process is part of the broad-phase collision detection and can be efficiently implemented using spatial hashing or BVH.

By employing these spatial data structures, we significantly reduce the computational load, focusing our detailed collision checks on a manageable subset of nearby primitives.

Friction

3D friction is quite similar to its 2D counterpart, with the primary difference being the types of contact pairs involved. In 3D, these contact pairs are point-triangle and edge-edge pairs. Consequently, the barycentric coordinates of the closest points are now two-dimensional, represented by the optimal values of $β_{1}$ and $β_{2}$ in the definitions provided in Definition 24.1.1 and Definition 24.1.2.

In practice, this means that while the principles of friction remain the same, the specific calculations adjust to account for the geometry of the contact pairs in 3D space.

Summary

In this section, we discussed the main technical details of implementing a 3D contact handling method based on Implicit Contact Prediction (IPC).

In 3D, both distance and friction basis computations become more complex. These computations rely on point-triangle and edge-edge primitive pairs, similar to the point-edge pairs used in 2D.

For edge-edge distances, which are only $C^{0}$ -continuous, an additional mollification function that smoothly decreases to zero is necessary. This function is multiplied with the barrier energy density function to achieve $C^{1}$ -continuity, enabling the use of efficient gradient-based optimization methods.

Due to the significantly larger number of primitive pairs in 3D, spatial data structures like spatial hashing or bounding volume hierarchies (BVH) are often used in the broad phase to filter candidates before computing distances or performing CCD.

Rigid Body Simulation*

*Author of this lecture: Wenxin Du, University of California, Los Angeles

To extend the IPC method to rigid body contact, we can employ subspace methods to effectively reduce the system's degrees of freedom (DOFs). Since a rigid body serves as an idealized model for extremely stiff real-world objects with negligible deformation, it does not require a large number of DOFs. This observation motivates the Affine Body Dynamics (ABD) [Lan et al. 2022] method. Depending on the choice of subspace, this approach can also be adapted into various algorithms suitable for different categories of soft bodies.

Rigid Body Dynamics

Before jumping into ABD, let's review the traditional approach to simulate rigid body dynamics.

Let’s begin by reviewing Newtonian mechanics — specifically, Newton’s second law for a particle at position $x (t)$ with mass $m$ :

$\frac{d x}{d t} m \frac{d v}{d t} = v, = f .$

Now, compute $\frac{d ( m x \times v )}{d t}$ :

$\frac{d ( m x \times v )}{d t} = m \frac{d x}{d t} \times v + m x \times \frac{d v}{d t} = m v \times v + x \times f = 0 + x \times f = x \times f .$

Here, $m x \times v$ is called the angular momentum, and $x \times f$ is the torque.

For a rigid body $B$ , we can integrate over its volume:

$\frac{d}{d t} (\int_{B} ρ x \times v d x) = \int_{B} ρ x \times f (x) d x,$ where $ρ = ρ (x)$ is the mass density of $B$ at position $x$ .

Now let $c$ denote the center of mass (COM) of $B$ . Then:

$\frac{d}{d t} (\int_{B} ρ c \times v d x) = \int_{B} ρ c \times \frac{d v}{d t} d x = \int_{B} ρ c \times f (x) d x .$

Combining both:

$\frac{d}{d t} (\int_{B} ρ (x - c) \times v d x) = \int_{B} ρ (x - c) \times f (x) d x .$

Since $\int_{B} ρ (x - c) \times v_{c} d x = (\int_{B} ρ (x - c) d x) \times v_{c} = 0 \times v_{c} = 0$ , we can further simplify:

$\frac{d}{d t} (\int_{B} ρ (x - c) \times (v - v_{c}) d x) = \int_{B} ρ (x - c) \times f (x) d x,$ where the right-hand side is the torque $τ$ about the center of mass.

Denoting the right-hand side as the torque $τ_{c}$ , compute the left-hand side:

$\frac{d}{d t} (\int_{B} ρ (x - c) \times (v - v_{C}) d x) = \frac{d}{d t} (\int_{B} ρ (x - c) \times (ω \times (x - c)) d x) = \frac{d}{d t} (\int_{B} ρ (⟨ x - c, x - c ⟩ ω - ⟨ x - c, ω ⟩ (x - c)) d x) = I_{B}^{c} ω .$

Here, the inertia tensor $I_{B}^{c}$ is defined as:

$I_{B}^{c} := \int_{B} ρ (∥ x - c ∥_{2}^{2} I - (x - c) (x - c)^{T}) d x .$

Since $x$ depends on $t$ , so does $I_{B}^{c}$ . Letting $R (t)$ be the rotation matrix (the rotation of $B$ from time $0$ to $t$ ). It's easy to verify that:

$I_{B}^{c} (t) = R (t) I_{B}^{c} (0) R (t)^{T} .$

Differentiating yields:

$τ_{c} = ω \times (I^{c} ω) + I^{c} \frac{d ω}{d t} .$

Thus, the rigid body dynamics law becomes:

$\frac{d}{d t} (m v_{c}) ω \times (I^{c} ω) + I^{c} \frac{d ω}{d t} = f, = τ_{c} . (25.1.1)$

These equations resemble Newton's second law. The first describes how forces affect the linear momentum of the rigid body, while the second describes how torques influence its angular momentum.

With a simple explicit integrator, we can simulate rigid body motion via:

$v_{n + 1} x_{n + 1} ω_{n + 1} \hat{q}_{n + 1} q_{n + 1} = v_{n} + Δ t \frac{f _{n}}{m}, = x_{n} + Δ t v_{n}, = ω_{n} + Δ t (I^{c})^{- 1} (τ_{c} - ω \times (I^{c} ω)), = q_{n} + \frac{Δ t}{2} ω q_{n}, = \frac{q ^ _{n + 1}}{∣ q ^ _{n + 1} ∣} . (25.1.2)$

Here, $q$ is the unit quaternion representing the body’s rotation. Its update is based on a first-order approximation:

$q - q = e^{\frac{Δ t}{2} ω} q - q = (cos \frac{∥ ω ∥Δ t}{2} - 1 + \frac{ω}{∥ ω ∥} sin \frac{∥ ω ∥Δ t}{2}) q = (Θ (Δ t^{2}) + \frac{Δ t}{2} ω) q,$ which leads to: $Δ t \to 0 lim \frac{q - q}{Δ t} = \frac{1}{2} w q,$

where $w = (ω_{x}, ω_{y}, ω_{z}, 0)$ ; $q$ is the composition of the rotation represented by the rotation vector $Δ t ω$ and the rotation represented by $q$ .

Remark 25.1.1 (Rotation Vector to Quaternion). A rotation of angle $θ$ around the axis defined by the unit vector $u = (u_{x}, u_{y}, u_{z}) = u_{x} i + u_{y} j + u_{z} k$ can be represented by the quaternion $q = e^{\frac{θ}{2} (u_{x} i + u_{y} j + u_{z} k)} = cos \frac{θ}{2} + (u_{x} i + u_{y} j + u_{z} k) sin \frac{θ}{2} = cos \frac{θ}{2} + u sin \frac{θ}{2}$

Subspace Simulation

In this subsection, we derive a constrained version of the rigid body dynamics equations using a subspace optimization approach. This formulation connects rigid body dynamics to the Incremental Potential Contact (IPC) method, as will be further elaborated in subsequent sections.

Recall that in IPC, the simulation problem is posed as the following optimization: $x^{n + 1} where E (x) = x arg min E (x) = \frac{1}{2} ∥ x - \tilde{x}^{n} ∥ + Δ t^{2} P (x) .$

Here, $E (x)$ is the incremental potential, $x$ denotes the particle position to be solved.

Let $x$ be a particle position on a rigid body, parameterized by the body's rotation matrix $Q$ and translation $q$ . The rigid body kinematics impose the constraint $x = Q \overset{ˉ}{x} + q$ , where $\overset{ˉ}{x}$ is the particle's position in the body-local frame. In the stacked form, this becomes:

$x = \overset{ˉ}{x} Q + \overset{ˉ}{S} q,$

where $\overset{ˉ}{x}$ and $\overset{ˉ}{S}$ are constant matrices independent of $Q$ and $q$ as they only rely on the local particle positions in the body frame.

Substituting into the IPC formulation, we obtain the constrained optimization problem:

$Q^{n + 1}, q^{n + 1} where E (Q, q) = Q, q arg min E (Q, q) s.t. f (Q) = 0, = \frac{1}{2} \overset{ˉ}{x} Q + \overset{ˉ}{S} q - \tilde{x}^{n}^{2} + Δ t^{2} P (\overset{ˉ}{x} Q + \overset{ˉ}{S} q),$

where the constraint $f (Q) = 0$ enforces rigidity of the rotation matrices $Q_{i}$ for each rigid body $i$ i.e. $Q_{i}^{T} Q_{i} - I_{3} = 0$ .

Applying the method of Lagrange multipliers, we introduce the Lagrangian:

$G = E (Q, q) + λ^{⊤} f (Q),$

and derive the stationarity conditions: $\nabla_{q} G \nabla_{Q} G \nabla_{λ} G = 0, = 0, = 0.$

This leads to the following constrained rigid body dynamics equations:

$\overset{ˉ}{x}^{⊤} M (\overset{ˉ}{x} Q + \overset{ˉ}{S} q) + Δ t^{2} \overset{ˉ}{x}^{⊤} \nabla P (\overset{ˉ}{x} Q + \overset{ˉ}{S} q) + (\nabla f (Q))^{⊤} λ \overset{ˉ}{S}^{⊤} M (\overset{ˉ}{x} Q + \overset{ˉ}{S} q) + Δ t^{2} \overset{ˉ}{S}^{⊤} \nabla P (\overset{ˉ}{x} Q + \overset{ˉ}{S} q) f (Q) = 0, = 0, = 0 . (25.2.1)$

These equations define the motion of rigid bodies under subspace-constrained dynamics, while satisfying both energy consistency and rigidity constraints.

Affine Body Dynamics

Previously, we reviewed rigid body dynamics and implemented a simple solver using explicit integration.

When dealing with many rigid bodies — especially with complex geometry and dense contact — it’s natural to attempt IPC for contact resolution. However, representing a rigid body by its position $x$ and rotation $q$ creates a challenge: step size filtering in Newton’s method becomes nonlinear in $α$ , due to the nonlinearity of quaternion update.

Let the state of a rigid body be denoted by $y = (x, q)$ , where $x$ is the center of mass and $q$ is a unit quaternion representing orientation. During a Newton iteration, let the update direction be $p = (p_{x}, p_{q})$ with step size $α$ , so that the updated state is given by $\tilde{y} = y + α p$ . The position of a vertex initially located at $x_{v}^{(0)}$ is then transformed to:

$x_{v} = q (x_{v}^{(0)} - x^{(0)}) q^{- 1} + x$

Expanding this shows $\tilde{x}_{v}$ is nonlinear in $α$ :

$x_{v} = (q + α p_{q}) (x_{v}^{(0)} - x^{(0)}) (q + α p_{q})^{- 1} + x + α p_{x}$ and since $q$ depends nonlinearly on $α$ , so does $\tilde{x}_{v}$ . Hence, linear CCD may fail to guarantee intersection-free motion.

We could model rigid bodies as very stiff soft bodies (e.g., mass-spring or hyperelastic), but this is inefficient and negates the DOF-reduction benefit.

Affine Body Dynamics (ABD) [Lan et al. 2022] addresses this: instead of strict rigidity, we allow tiny affine deformations $\overset{ˉ}{x} \mapsto x = A x + b$ for each vertex $x$ of the body where $\overset{ˉ}{x}$ denotes the rest position. The body state is $y = (b^{T}, a_{1}^{T}, a_{2}^{T}, a_{3}^{T})$ where $A = (a_{1}, a_{2}, a_{3})^{T}$ , so the vertex position $x$ becomes:

$x = J y$

with $J = (I_{3 \times 3}, I_{3 \times 3} \otimes \overset{ˉ}{x}^{T})$ where $\otimes$ denotes the Kronecker product.

This linearity makes $x$ linearly dependent on the optimization step size $α$ , enabling robust linear CCD.

Let $E (x)$ be the augmented IPC energy. Original IPC solves: $\nabla_{x} E (x) = 0$

ABD solves in a reduced subspace:

$\nabla_{y} E (J y) = 0$

Using the chain rule:

$\nabla_{y} E (J y) D_{y}^{2} E (J y) = J^{T} \nabla_{x} E (x) = J^{T} D_{x}^{2} E (x) J (25.3.1)$ , we can solve Newton Optimization direction by solving the linear system $J^{T} D_{x}^{2} E (x) J = - J^{T} \nabla_{x} E (x)$ .

With these, we can implement ABD using implicit Euler integration for a simple 2D case.

Before implementation, consider this geometric view: in 2D, an affine transformation is determined by 3 points. Choosing a triangle per rigid body, we can interpolate all vertices via barycentric coordinates. The triangle drives the simulation; the original mesh is used for collisions. This yields a reduced DOF system where the triangle controls motion—essentially what ABD does.

Case Study: ABD Square Drop*

Let’s now implement Affine Body Dynamics (ABD) in 2D for a neo-Hookean version of the square drop simulation. This requires only minor modifications to the standard IPC simulation code. The full implementation is available in the 9_reduced_DOF folder of our solid simulation tutorial.

We begin by introducing a function to compute the reduced basis. In ABD, we consider only affine deformations. Therefore, we use method=1 (polynomial basis) and order=1 (linear basis) to extract the linear basis:

Implementation 25.4.1 (Compute reduced basis, utils.py).

def compute_reduced_basis(x, e, vol, IB, mu_lame, lam, method, order):
    if method == 0: # full basis, no reduction
        basis = np.zeros((len(x) * 2, len(x) * 2))
        for i in range(len(x) * 2):
            basis[i][i] = 1
        return basis
    elif method == 1: # polynomial basis
        if order == 1: # linear basis, or affine basis
            basis = np.zeros((len(x) * 2, 6)) # 1, x, y for both x- and y-displacements
            for i in range(len(x)):
                for d in range(2):
                    basis[i * 2 + d][d * 3] = 1
                    basis[i * 2 + d][d * 3 + 1] = x[i][0]
                    basis[i * 2 + d][d * 3 + 2] = x[i][1]
        elif order == 2: # quadratic polynomial basis 
            basis = np.zeros((len(x) * 2, 12)) # 1, x, y, x^2, xy, y^2 for both x- and y-displacements
            for i in range(len(x)):
                for d in range(2):
                    basis[i * 2 + d][d * 6] = 1
                    basis[i * 2 + d][d * 6 + 1] = x[i][0]
                    basis[i * 2 + d][d * 6 + 2] = x[i][1]
                    basis[i * 2 + d][d * 6 + 3] = x[i][0] * x[i][0]
                    basis[i * 2 + d][d * 6 + 4] = x[i][0] * x[i][1]
                    basis[i * 2 + d][d * 6 + 5] = x[i][1] * x[i][1]
        elif order == 3: # cubic polynomial basis
            basis = np.zeros((len(x) * 2, 20)) # 1, x, y, x^2, xy, y^2, x^3, x^2y, xy^2, y^3 for both x- and y-displacements
            for i in range(len(x)):
                for d in range(2):
                    basis[i * 2 + d][d * 10] = 1
                    basis[i * 2 + d][d * 10 + 1] = x[i][0]
                    basis[i * 2 + d][d * 10 + 2] = x[i][1]
                    basis[i * 2 + d][d * 10 + 3] = x[i][0] * x[i][0]
                    basis[i * 2 + d][d * 10 + 4] = x[i][0] * x[i][1]
                    basis[i * 2 + d][d * 10 + 5] = x[i][1] * x[i][1]
                    basis[i * 2 + d][d * 10 + 6] = x[i][0] * x[i][0] * x[i][0]
                    basis[i * 2 + d][d * 10 + 7] = x[i][0] * x[i][0] * x[i][1]
                    basis[i * 2 + d][d * 10 + 8] = x[i][0] * x[i][1] * x[i][1]
                    basis[i * 2 + d][d * 10 + 9] = x[i][1] * x[i][1] * x[i][1]
        else:
            print("unsupported order of polynomial basis for reduced DOF")
            exit()
        return basis
    else: # modal-order reduction
        if order <= 0 or order >= len(x) * 2:
            print("invalid number of target basis for modal reduction")
            exit()
        IJV = NeoHookeanEnergy.hess(x, e, vol, IB, mu_lame, lam, project_PSD=False)
        H = sparse.coo_matrix((IJV[2], (IJV[0], IJV[1])), shape=(len(x) * 2, len(x) * 2)).tocsr()
        eigenvalues, eigenvectors = eigsh(H, k=order, which='SM') # get 'order' eigenvectors with smallest eigenvalues 
        return eigenvectors

Here, method=0 refers to full-space simulation and returns immediately without computing a basis.
method=1 computes polynomial bases, including linear, quadratic, and cubic functions. For each basis vector, the displacement components at each node are expressed as polynomial functions of the node’s material-space coordinates. method=2 computes bases via linear modal analysis, which solves the eigensystem of the elasticity Hessian and extracts displacement fields that correspond to the deformation modes that increases the least amount of energy. This will be discussed in more detail in the next lecture.

After computing the basis, we restrict the simulation to the corresponding subspace by projecting the Hessian matrix and the gradient vector. This projection follows the chain rule, as described in Equation (25.3.1). The relevant implementation is:

Implementation 25.4.2 (Compute reduced search direction, time_integrator.py).

def search_dir(x, e, x_tilde, m, vol, IB, mu_lame, lam, y_ground, contact_area, is_DBC, reduced_basis, h):
    projected_hess = IP_hess(x, e, x_tilde, m, vol, IB, mu_lame, lam, y_ground, contact_area, h)
    reshaped_grad = IP_grad(x, e, x_tilde, m, vol, IB, mu_lame, lam, y_ground, contact_area, h).reshape(len(x) * 2, 1)
    # eliminate DOF by modifying gradient and Hessian for DBC:
    for i, j in zip(*projected_hess.nonzero()):
        if is_DBC[int(i / 2)] | is_DBC[int(j / 2)]: 
            projected_hess[i, j] = (i == j)
    for i in range(0, len(x)):
        if is_DBC[i]:
            reshaped_grad[i * 2] = reshaped_grad[i * 2 + 1] = 0.0
    reduced_hess = reduced_basis.T.dot(projected_hess.dot(reduced_basis)) # applying chain rule
    reduced_grad = reduced_basis.T.dot(reshaped_grad) # applying chain rule
    return (reduced_basis.dot(spsolve(reduced_hess, -reduced_grad))).reshape(len(x), 2) # transform to full space after the solve

These changes enable us to run the ABD version of the square-drop simulation:

**Figure 25.4.1.** ABD simulation of a square dropped onto the ground.

In this example, we reduce the stiffness parameter to make the body softer, emphasizing the difference between ABD and standard IPC. The blue mesh is the original mesh (also used for collision), while the red triangle visualizes the reduced degrees of freedom.

Summary

This lecture explores how subspace methods can be leveraged to extend the IPC framework to rigid and near-rigid body simulations effectively. The key idea is to reduce the system's degrees of freedom (DOFs) by projecting the high-dimensional deformable body dynamics into a lower-dimensional subspace that still captures essential behavior.

Rigid body dynamics are introduced from Newtonian mechanics, resulting in the standard equations governing linear and angular momentum evolution. Integration schemes for solving these equations are reviewed, with quaternions used for rotation tracking.

Affine Body Dynamics (ABD) [Lan et al. 2022] is proposed as a compromise between rigid and deformable models, allowing for minimal affine deformation while maintaining computational efficiency. The ABD formulation benefits from linear dependence of vertex positions on the Projected Newton optimization step size, enabling effective use of linear CCD for collision resolution in IPC.

The ABD model allows simulation using significantly fewer DOFs by driving high-resolution collision geometry via low-resolution affine embeddings (e.g., triangle-driven motion in 2D). IPC gradients and Hessians are projected through the Jacobian to solve the reduced system in the subspace, leading to faster and more stable simulations for complex scenes with many rigid-like bodies and dense contacts.

Model Reductions*

Author of this lecture: Žiga Kovačič, Cornell University. The techniques presented in this lecture are mostly based on [Sifakis & Barbic 2012].

Physics-based simulations using the Finite Element Method offer remarkable realism, but this fidelity comes at a high computational cost. A typical high-resolution 3D object can easily have thousands or millions of degrees of freedom (DOFs), leading to large systems of equations that are expensive to solve at each time step. For applications requiring real-time interaction, such as video games, the cost of a full simulation is often prohibitive. This lecture introduces model reduction, a powerful family of techniques designed to drastically reduce this computational burden by approximating the system's behavior in a much lower-dimensional space.

The core idea is to move from a high-dimensional state vector $u \in R^{d n}$ (where $n$ is the number of vertices and $d$ are the spatial dimensions) to a much smaller vector of reduced coordinates $q \in R^{r}$ (where $r ≪ d n$ ). We achieve this by projecting the full system's dynamics onto a carefully chosen low-dimensional subspace that captures the most significant deformation behaviors of the object.

We will begin by exploring model reduction for linear systems through linear modal analysis, which provides an intuitive foundation based on an object's natural vibration modes. We will then extend these concepts to handle large, nonlinear deformations, addressing the challenges that arise. A crucial component of any model reduction scheme is the selection of a high-quality subspace basis; we will investigate two prominent approaches: data-driven methods using Principal Component Analysis (PCA) and physics-based methods using modal derivatives.

We begin our exploration of model reduction with the simplest case: linear elasticity (13.1.4). While limited to small deformations, this context provides a clear and intuitive introduction to the core concepts of subspace simulation.

The Physics of Vibration Modes

The object doesn't deform in a random way; instead, it vibrates in a superposition of specific, characteristic patterns called modes. Each of these fundamental patterns is called a mode shape, representing a specific way the object can deform. The overall motion is then described by how much each mode shape contributes to the total deformation at any given moment; this "amount" of contribution is the modal amplitude.

Each mode has an associated natural frequency. Low-frequency modes correspond to large, low-energy deformations (like the fundamental bending of the tuning fork), while high-frequency modes represent complex, high-energy vibrations that are often difficult to see and are quick to dissipate.

The key insight is that the vast majority of an object's visible dynamic behavior can be described by a small number of these low-frequency modes. By restricting our simulation to only these important modes, we can create a highly efficient approximation. It is crucial to remember that this approach is valid only for small deformations, where the material's response remains approximately linear. This means that the internal forces that resist deformation are directly proportional to the amount of displacement (a relationship often described by Hooke's Law). In our discrete setting, this is captured by a constant stiffness matrix $K$ , which is also the negative of the strain energy's Hessian matrix. For large deformations, the stiffness properties change with the deformation, and linear modal analysis is no longer applicable.

Computing the Modes

Finding these modes requires us to solve a generalized eigenvalue problem. The setup involves first defining the full system and then applying constraints. We will assume our system is 3-dimensional from now on.

We start by forming the full mass matrix $M \in R^{3 n \times 3 n}$ and stiffness matrix $K \in R^{3 n \times 3 n}$ .
To account for Dirichlet boundary conditions (fixed vertices), we form smaller, constrained matrices $\hat{M}$ and $\hat{K}$ by removing the rows and columns from $M$ and $K$ that correspond to the fixed degrees of freedom.

The eigenvalue problem is then solved on these constrained matrices: $\hat{K} \hat{ψ}_{i} = λ_{i} \hat{M} \hat{ψ}_{i} . (26.1.1)$

Example 26.1.1 (Interpreting the Generalized Eigenvalue Problem). The equation (26.1.1) is fundamental to understanding vibrations in mechanical systems. Let's break it down:

$K \in R^{3 n \times 3 n}$ is the full stiffness matrix. It relates displacement to internal elastic forces ( $f_{e l a s t i c} = Ku$ ). A high value in $K$ means the object strongly resists deformation.

$M \in R^{3 n \times 3 n}$ is the full mass matrix. It relates acceleration to inertial forces ( $f_{in er t ia l} = Ma$ ).

An eigenvector $ψ_{i} \in R^{3 n}$ is a mode shape. It's a vector that describes the spatial pattern of a particular vibration mode.

An eigenvalue $λ_{i} \in R$ is the squared natural frequency, for the corresponding mode $ψ_{i}$ .

The equation $K ψ_{i} = λ_{i} M ψ_{i}$ essentially states that a modal deformation $ψ_{i}$ is special: if the object is deformed into this shape, the elastic restoring force ( $K ψ_{i}$ ) is perfectly proportional to the inertial force ( $M ψ_{i}$ ) required to produce a sinusoidal vibration with that shape.

The modes are the shapes where the elastic and inertial forces align perfectly. Low eigenvalues correspond to low-energy modes, which are the most important for visual animation.

The solution from the eigensolver gives us eigenvalues $λ_{i}$ and eigenvectors $\hat{ψ}_{i}$ for the unconstrained DOFs. The eigenvalues are the squares of the natural frequencies of vibration, $λ_{i} = ω_{i}^{2}$ , and the eigenvectors represent the mode shapes. In order to check that the eigensolver was successful, it is common to visualize the individual modes by animating them as $ψ_{i} sin (ω_{i} t)$ , where $t$ is time.

The different modal vectors are typically assembled into a modal basis matrix:

$U = [ψ_{1}, \dots, ψ_{r}] \in R^{3 n \times r},$

where we select the $r$ modes corresponding to the smallest non-zero eigenvalues.

Subspace Simulation of Linear Dynamics

Now, let's see how this modal basis accelerates simulation. The full-space equation of motion for a linear elastic system with damping is:

$M \ddot{u} + D \dot{u} + Ku = f_{e x t}, (26.1.2)$

where $D$ is the damping matrix and $f_{e x t}$ is the vector of external forces. To reduce this system, we introduce the central approximation of subspace methods: the full-space displacement $u$ is approximated by a linear combination of our basis vectors:

$u (t) \approx Uq (t) .$

Here, $q (t) \in R^{r}$ is the vector of reduced coordinates or modal amplitudes, where ideally $r ≪ 3 n$ . Each component $q_{i} (t)$ represents "how much" of mode $ψ_{i}$ is present in the deformation at time $t$ .

To derive the equation of motion for $q$ , we substitute approximation for $u (t)$ into (26.1.2) and project the entire equation onto the subspace by pre-multiplying with $U^{T}$ : $U^{T} (M (U \ddot{q}) + D (U \dot{q}) + K (Uq)) = U^{T} f_{e x t}$ $(U^{T} MU) \ddot{q} + (U^{T} DU) \dot{q} + (U^{T} KU) q = U^{T} f_{e x t} .$ This is where the magic happens. Due to the properties of the generalized eigenvalue problem, the eigenvectors can be scaled to be mass-orthonormal, meaning $ψ_{i}^{T} M ψ_{j} = δ_{ij}$ (1 if $i = j$ , 0 otherwise).

Method 26.1.1 (Mass-Orthonormalization of Eigenvectors). To achieve the simplification $U^{T} MU = I$ , the modal basis must be made mass-orthonormal. This is accomplished in two steps. First, the generalized eigenvalue problem naturally produces eigenvectors that are M-orthogonal, meaning $ψ_{i}^{T} M ψ_{j} = 0$ for any two different modes $i$ and $j$ . This property ensures that all off-diagonal entries of the projected mass matrix are zero. Second, to make the diagonal entries equal to one, we explicitly normalize each raw eigenvector $ψ_{i}$ from the solver using its mass-weighted length: $ψ_{i} = \frac{ψ _{i}}{ψ _{i}^{T} M ψ _{i}} . (26.1.3)$ The resulting vectors $ψ_{i}$ , which satisfy $ψ_{i}^{T} M ψ_{i} = 1$ , are assembled into the basis $U$ . Together, M-orthogonality and M-normalization guarantee the projected mass matrix is the identity, which is what decouples the system of equations.

This simplifies the projected matrices dramatically:

$U^{T} MU = I$ ;
$U^{T} KU = Λ$ , where $Λ$ is a diagonal matrix of the eigenvalues, $diag (λ_{1}, \dots, λ_{r})$ .

If we assume Rayleigh damping, where $D = α M + β K$ , the projected damping matrix also becomes diagonal: $U^{T} DU = α I + β Λ .$

This means that the final reduced system becomes a set of $r$ completely independent, 1D ordinary differential equations: $\overset{q}{¨}_{i} + (α + β λ_{i}) \overset{q}{˙}_{i} + λ_{i} q_{i} = ψ_{i}^{T} f_{e x t} for i = 1, \dots, r . (26.1.4)$

Method 26.1.2 (Linear Modal Simulation). We can think of the overall process as being split into a one-time precomputation and an efficient runtime loop.

Precomputation:

Discretize the object to form the global mass matrix $M$ and stiffness matrix $K$ .

Solve the generalized eigenvalue problem $K ψ_{i} = λ_{i} M ψ_{i}$ for the $r$ smallest non-zero eigenvalues $λ_{i}$ and corresponding eigenvectors $ψ_{i}$ .

Assemble the mass-orthonormal basis matrix $U = [ψ_{1}, \dots, ψ_{r}]$ .

Runtime Loop (per time step):

Compute external forces $f_{e x t}$ in the full-space (e.g., from gravity, user interaction).

Project the forces onto the reduced basis: $f_{re d u ce d} = U^{T} f_{e x t}$ .

For each mode $i = 1, \dots, r$ , solve its simple 1D ODE (26.1.4) to update its amplitude $q_{i}$ . This can be done with a simple and stable implicit integration scheme.

Reconstruct the full-space deformation for rendering: $u = Uq$ .

The computational savings are immense. Instead of solving a large, coupled $3 n \times 3 n$ system, we solve $r$ tiny, independent 1D equations, where $r$ might be 20-50 while $3 n$ could be in the tens of thousands or even millions.

We implemented the linear modal analysis procedure for a 2D cantilever beam in solid-sim-tutorial/9_reduced_DOF/linear.py.

Visualizing the Mode Shapes

Following the precomputation steps outlined in the methodology, we used the Finite Element Method to assemble the global mass ( $M$ ) and stiffness ( $K$ ) matrices. After applying fixed boundary conditions to one end of the beam, we solved the generalized eigenvalue problem ( $Kx = λ Mx$ ) to find the lowest-frequency mode shapes ( $ψ_{i}$ ) and their corresponding frequencies ( $ω_{i} = λ_{i}$ ).

The resulting modes were visualized by animating the beam's deformation according to $ψ_{i} sin (ω_{i} t)$ . The animation below displays the first six modes for the following parameter values:

E = 5e7   # Young's modulus
nu = 0.3  # Poisson's ratio
rho = 500 # Density
# Height : length ratio = 1 : 15

**Figure 26.2.1.** Visualization of the First Six Computed Vibration Modes. The modes, ordered from lowest frequency (left) to highest (right), display increasingly complex deformation patterns, from simple bending to S-curves and compressional shapes.

Artifacts of Large Deformations

The primary limitation of linear modal analysis is that it produces visible artifacts when undergoing large deformations. To demonstrate this, we designed a second experiment focused exclusively on the first bending mode.

In this experiment, we animate the beam deforming from its rest state to a state of large deflection. Alongside the deforming beam, we plot two key metrics in real-time:

Centerline Arc Length: For a real, inextensible object, the length of its centerline should remain constant during pure bending.
Total Area (2D Volume): For a nearly incompressible material, the total area should also be preserved.

As the tip displacement increases, the linear model incorrectly predicts that (1) the beam's centerline stretches significantly, and (2) its total area increases significantly..

**Figure 26.2.2.** Animation Demonstrating Artifacts of the Linear Model. (Left) The beam deforms according to the first linear mode. (Middle) A plot traces the erroneous increase in centerline length as a function of tip displacement. (Right) A plot traces the erroneous decrease in total area.

These artifacts occur because the linear model cannot account for the geometric nonlinearities that arise from large rotations. This failure is the primary motivation for the methods described in the following section, which are designed specifically to overcome these limitations and accurately simulate large, physically plausible motions.

Model Reduction for Nonlinear Systems

Linear modal analysis is remarkably efficient, but its underlying assumption of small deformations is a major limitation. For simulating visually rich phenomena like buckling, large bending, or twisting, a linear model produces severe visual artifacts. As shown in the previous section, a simple linear model fails to capture the natural foreshortening that occurs when an object bends, resulting in an unrealistic extension of length and volume. To correctly capture these essential nonlinear effects, we must apply model reduction to the full nonlinear equations of motion.

Projecting the Nonlinear Equations

The equation of motion for a general hyperelastic object, including damping, is:

$M \ddot{u} + D \dot{u} + f_{in t} (u) = f_{e x t} . (26.3.1)$

Here, the linear elastic force $Ku$ is replaced by a general internal force function $f_{in t} (u)$ , which can be a complex nonlinear function of the displacements $u$ . This nonlinearity can arise from the material model (stress-strain relationship) or from the strain measure itself (geometric nonlinearity).

We follow the same projection-based procedure as in the linear case. We assume the existence of a suitable basis matrix $U \in R^{3 n \times r}$ (where $r ≪ 3 n$ )—more on the choice of basis in the next section-and approximate the full-space displacements $u$ using the reduced coordinates $q \in R^{r}$ :

$u (t) \approx Uq (t) .$

Substituting this into (26.3.1) and pre-multiplying by $U^{T}$ to project the dynamics onto the subspace yields the reduced nonlinear equation of motion:

$U^{T} MU \ddot{q} + U^{T} DU \dot{q} + U^{T} f_{in t} (Uq) = U^{T} f_{e x t} .$

Assuming we have constructed a mass-orthonormal basis such that $U^{T} MU = I$ , this simplifies to:

$\ddot{q} + \hat{D} \dot{q} + \hat{f}_{in t} (q) = \hat{f}_{e x t}, (26.3.2)$

where:

$\hat{D} = U^{T} DU$ is the $r \times r$ reduced damping matrix.
$\hat{f}_{in t} (q) = U^{T} f_{in t} (Uq)$ is the $r \times 1$ reduced internal force vector.
$\hat{f}_{e x t} = U^{T} f_{e x t}$ is the $r \times 1$ reduced external force vector.

Unlike the linear case, the reduced system in (26.3.2) is no longer a set of independent 1D oscillators. The reduced internal force $\hat{f}_{in t} (q)$ is a nonlinear function that couples all the components of $q$ . This introduces two critical questions:

How do we efficiently time-step this coupled, nonlinear system?
How do we choose an effective basis $U$ that can represent nonlinear deformations? (next section)

Timestepping and the Evaluation Bottleneck

To solve (26.3.2) we typically use an implicit time integration scheme, which is necessary to handle the high-frequency stiffness common in elastic systems. An implicit step requires solving a linear system involving the derivative of the forces. For our reduced system, this means we need the reduced tangent stiffness matrix, $\hat{K} (q)$ :

$\hat{K} (q) = D_{q} \hat{f}_{in t} (q) .$

Using the chain rule, we can relate this to the full-space tangent stiffness matrix $K (u) = D_{u} f_{in t} (u)$ :

$\hat{K} (q) = D_{q} (U^{T} f_{in t} (Uq)) = U^{T} D_{q} (f_{in t} (Uq)) = U^{T} (D_{u} (f_{in t} (u))_{u = Uq} D_{q} (Uq)),$

$\hat{K} (q) = U^{T} K (Uq) U .$

At each time step, an implicit integrator will solve a dense $r \times r$ linear system involving $\hat{K}$ . Since $r ≪ 3 n$ , this is a monumental improvement over solving the original sparse $3 n \times 3 n$ system.

However, a major computational challenge emerges: how do we compute $\hat{f}_{in t} (q)$ and $\hat{K} (q)$ efficiently? The naive approach is to:

Take the current reduced coordinates $q$ .
De-project to find the full-space deformation: $u = Uq$ .
Evaluate the full-space forces $f_{in t} (u)$ and stiffness $K (u)$ by looping over all elements in the high-resolution mesh.
Project back to the reduced space: $\hat{f}_{in t} = U^{T} f_{in t}$ and $\hat{K} = U^{T} KU$ .

This process is extremely slow because step 3 still requires a full pass over the high-resolution mesh, completely defeating the purpose of model reduction. Even worse, computing the matrix products $U^{T} KU$ is both computation and memory intensive.

Accelerating Force Evaluation

To make nonlinear model reduction practical, we need a way to evaluate the reduced forces and stiffness matrices without ever forming their full-space counterparts.

Polynomial Fitting (for specific materials)

For certain material models, a highly efficient analytical approach is possible. For instance, in a geometrically nonlinear but materially linear model (e.g., St. Venant-Kirchhoff), each component of the full-space internal force vector $f_{in t} (u)$ is a cubic polynomial in the components of $u$ .

Since $u = Uq$ , it follows that the reduced force $\hat{f}_{in t} (q) = U^{T} f_{in t} (Uq)$ is also a cubic polynomial, but in the reduced coordinates $q$ . We can precompute the coefficients of this $r$ -variate cubic polynomial. At runtime, evaluating $\hat{f}_{in t} (q)$ and its derivative $\hat{K} (q)$ simply involves evaluating these precomputed polynomials. The evaluation cost is $O (r^{4})$ for the forces and $O (r^{3})$ for implicit integration, which is manageable for small $r$ (e.g., $r < 30$ ) and completely independent of the mesh size $n$ . Unfortunately, this approach is limited to materials where forces are low-degree polynomials.

Cubature (for general materials)

A more general and powerful solution is cubature [An et al. 2008]. The core insight is to re-examine how the reduced force is calculated.

The reduced energy $\hat{E}$ is the full energy evaluated within the subspace: $\hat{E} (q) = E (Uq) = \int_{Ω_{0}} Ψ (F (Uq, X)) d V .$

The reduced internal force $\hat{f}_{in t}$ is the gradient of this reduced energy with respect to the reduced coordinates $q$ . By moving the derivative inside the integral, we find:

$\hat{f}_{in t} (q) = \nabla_{q} E = \int_{Ω_{0}} \nabla_{q} Ψ (F (Uq, X)) d V .$ This shows that the reduced force is the integral of the reduced energy density gradient over the object's volume. The projection $U^{T}$ is implicitly included within the derivative $\nabla_{q}$ via the chain rule.

Instead of computing this integral exactly by summing contributions from all finite elements, cubature approximates it with a weighted sum over a very small number, $T$ , of pre-selected sample points (or "quadrature points") $X_{j} \in Ω_{0}$ :

$\hat{f}_{in t} (q) \approx j = 1 \sum T w_{j} \nabla_{q} Ψ (F (Uq, X))_{X = X_{j}} . (26.3.3)$

The number of cubature points $T$ can be surprisingly small (often on the order of $r$ ) while still yielding a highly accurate approximation. The points $X_{j}$ and their non-negative weights $w_{j}$ are optimized in a precomputation step to best match the true reduced forces over a set of representative "training" poses.

At runtime, evaluating the reduced forces and stiffness only requires looping over these $T$ points. The cost becomes independent of the original mesh resolution $n$ , making fast simulation of complex, nonlinear materials feasible.

You can notice that it might be very important how we choose the basis $U$ s.t. we can most effecitvely represent the space of non-linear deformations. This will be discussed in the following section!

Choice of Basis for Nonlinear Deformations

The quality of the reduced model is determined by the choice of the time-invariant basis matrix $U \in R^{3 n \times r}$ , which we assumed existance of in the previous chapters. This basis, which is pre-processed to be mass-orthonormal ( $U^{T} MU = I$ ), must effectively span the space of expected nonlinear deformations.

Basis from Simulation Data (POD)

This data-driven approach, also known as Principal Orthogonal Directions (POD), constructs an optimal basis from pre-existing simulation data.

First, an offline, unreduced simulation is run to generate a set of $N$ deformation state vectors ${u_{1}, u_{2}, ..., u_{N}}$ . Afterwards, the snapshots are assembled into a matrix $A = [u_{1}, u_{2}, ..., u_{N}]$ . A basis is extracted by performing a Singular Value Decomposition (SVD), typically with respect to the mass-weighted inner product $⟨ x, y ⟩_{M} = x^{T} My$ (a technique known as mass-PCA). The columns of $U$ from the decomposition $A = U Σ V^{T}$ that correspond to the largest $r$ singular values form the basis $U$ !

While optimal for the training data, this method's primary drawback is the necessity of a slow, offline pre-simulation.

This approach constructs a basis automatically by analyzing the system's fundamental nonlinear response, without requiring a non-reduced pre-simulation like POD.

The key insight is that linear modes are an incomplete basis. When a nonlinear system is excited in the direction of a linear mode, other deformations naturally co-appear due to nonlinear coupling. Modal derivatives are precisely these coupled deformations.

Mathematically, we analyze the static deformation $u (p)$ resulting from a force applied along the linear modes $U_{l in}$ : $f_{in t} (u) = M U_{l in} Λp$ , where $Λ$ is the diagonal matrix of squared frequencies and $p$ is a parameter vector. The solution $u (p)$ can be expressed as second order Taylor series expansion around $p = 0$ :

$u (p) = i = 1 \sum k ψ_{i} p_{i} + \frac{1}{2} i = 1 \sum k j = 1 \sum k Φ_{ij} p_{i} p_{j} + O (∣∣ p ∣ ∣^{3}) .$

This expansion reveals the system's response structure. The first-order derivatives, $ψ_{i} = \frac{\partial u}{\partial p _{i}} ∣_{p = 0}$ , are precisely the familiar linear vibration modes, representing the initial linear response to the applied force. The crucial nonlinear couplings are captured by the second-order derivatives, $Φ_{ij} = \frac{\partial ^{2} u}{\partial p _{i} \partial p _{j}} ∣_{p = 0},$ which are known as the modal derivatives. Each $Φ_{ij}$ is a deformation vector that can be computed by solving a linear system involving the constant, rest-state stiffness matrix $K$ .

The full set of vectors ${ψ_{i}, Φ_{ij}}$ is typically too large. A compact, $r$ -dimensional basis $U$ is formed by collecting all linear modes and modal derivatives, scaling them to prioritize low-frequency contributions, and then using mass-PCA to extract the most significant combined shapes. This produces a general-purpose basis that captures essential nonlinear behavior without any prior knowledge of runtime forces.

For more information and the derivation please refer to the SIGGRAPH course notes [Sifakis & Barbic 2012].

Summary

In this lecture, we discussed modal reduction techniques that dramatically accelerate physics-based simulations by projecting high-dimensional systems onto carefully chosen low-dimensional subspaces.

Linear modal analysis uses an object's natural vibration modes as basis vectors, creating independent 1D oscillators that are extremely efficient to solve. However, this approach produces severe artifacts (length stretching, volume changes) for large deformations due to its linear assumptions. Nonlinear model reduction extends these concepts to handle large deformations by projecting the full nonlinear equations of motion. The key challenge is efficiently evaluating reduced forces without computing full-space quantities, solved through polynomial fitting for specific materials or cubature methods for general cases.

The quality of any modal reduction depends critically on basis selection. Two main approaches exist: data-driven methods using Principal Component Analysis on simulation snapshots, and physics-based modal derivatives that capture nonlinear coupling between linear modes without requiring expensive pre-simulations.

Spatial and Temporal Discretization

*Author of this lecture: Chang Yu, University of California, Los Angeles

In the first lecture, Discrete Space and Time, we introduced various representations for solid objects. Rather than starting from classical continuum equations, we directly formulated elasticity simulations as discrete algebraic problems, with previous discussions primarily focusing on mesh-based purely Lagrangian schemes.

In this lecture, we introduce the Material Point Method (MPM) - a hybrid Lagrangian-Eulerian scheme particularly suited for simulating large deformation and topology changes. Unlike purely Lagrangian methods such as FEM, which track material deformation through the elements of a carefully constructed mesh structure, MPM discretizes the material simply into a set of material particles. Each particle stores essential physical information such as mass, momentum, deformation gradient, and stress. MPM then utilizes a temporary, regular Cartesian background grid to solve the momentum equation, with information transfered between particles and grid nodes.

Specifically, in each time step, information from material particles is first transferred onto the Eulerian grid, where governing equations are solved efficiently. Updated grid solutions are then transferred back onto the particles, updating particle states such as position and deformation gradient. After this transfer, the grid is reset, leaving particles as the sole carriers of material history. This hybrid strategy naturally accommodates significant deformation and complex contact scenarios, effectively overcoming typical mesh-distortion challenges inherent to traditional mesh-based purely Lagrangian methods.

Material Particles

In the Material Point Method (MPM), we discretize a solid into a finite set of material points (also referred to as particles), which carry the complete state of the material throughout the simulation. These material points serve as Lagrangian carriers of physical quantities, including:

$x_{p}^{n}$ : the position of particle $p$ (at time $t^{n}$ )
$v_{p}^{n}$ : the velocity of particle $p$
$V_{p}^{n}$ : the volume of particle $p$
$m_{p}$ : the mass of particle $p$ (constant over time)
$F_{p}^{n}$ : the deformation gradient at particle $p$

We adopt the convention that all particle-related quantities are subscripted by $p$ .

Sampling Material Particles

To initialize an MPM simulation, the continuous solid domain is sampled into a finite number of particles. The quality of this sampling has a significant impact on numerical stability and accuracy.

Uniform Grid Sampling. This method places particles at regular intervals on a Cartesian grid. While simple to implement and computationally efficient, it introduces grid-aligned artifacts in simulations, which can reduce realism and robustness.
Random Sampling. Naively placing particles at random positions can lead to clustering and large local variations in density, which degrade both simulation accuracy and numerical stability.
Poisson-Disk Sampling. Instead, Poisson-disk Sampling is commonly used because it guarantees a minimum separation between particles, producing a more uniform distribution. This improves interpolation accuracy, minimizes numerical artifacts, and yields more stable contact behavior.

**Figure 27.1.1.** Comparison of random sampling and Poisson-disk sampling in a unit square with the same number of samples: Poisson-disk sampling produces more uniform spacing between points.

Estimating Particle Volume

Since we track the deformation gradient $F_{p}^{n}$ at each particle, we can compute volume change from its determinant: $V_{p}^{n} \approx J_{p}^{n} \cdot V_{p}^{0} = det (F_{p}^{n}) \cdot V_{p}^{0} .$ Here, $V_{p}^{0}$ is the rest volume of particle $p$ , typically computed as: $V_{p}^{0} = \frac{V _{grid_cell}}{N _{particles_per_cell}},$ and we compute $m_{p}$ as: $m_{p} = ρ_{0} \cdot V_{p}^{0},$ where $ρ_{0}$ is the mass density of the solid at rest.

Interpolating Functions

In each time step of the Material Point Method (MPM), particles transfer their mass and momentum to the background grid, and later retrieve updated velocities from the grid to perform advection. Both transfers rely on interpolating functions to determine how each particle interacts with nearby grid nodes.

We define the interpolating function associated with grid node $i$ as $N_{i} (x)$ . Here, $i$ is a multi-index - $(i, j)$ in 2D or $(i, j, k)$ in 3D - and $x$ is an arbitrary spatial position. When evaluating this function at a particle location $x_{p}$ , we use the shorthand:

$w_{i p} = N_{i} (x_{p}) .$

This weight determines how strongly particle $p$ influences grid node $i$ : the closer the particle is to the node, the larger $w_{i p}$ becomes.

Example 27.2.1 (Linear Interpolation). In 1D, the simplest choice of interpolating functions is the linear (tent) function: $N (x) = {1 - ∣ x ∣ 0 if 0 \leq ∣ x ∣ < 1, otherwise .$ To apply this in a grid-based setting, we scale the spatial coordinates relative to the grid spacing $h$ (i.e., the distance between adjacent grid nodes along one axis). The scaled version becomes: $N (\frac{x _{p} - x _{i}}{h}) .$ In higher dimensions (2D, 3D), we use the tensor product of 1D functions: $N_{i} (x_{p}) = N (\frac{x _{p} - x _{i}}{h}) \cdot N (\frac{y _{p} - y _{j}}{h}) \cdot N (\frac{z _{p} - z _{k}}{h}) .$

Linear interpolation is computationally efficient and easy to implement, and is widely used in FLIP [Brackbill et al. 1988] [Bridson 2015] for fluid simulation. However, it suffers from two serious issues when applied in MPM:

Its gradient $\nabla N (x)$ is discontinuous, leading to unstable and noisy force evaluations.
When particles are near cell boundaries, $w_{i p}$ becomes small while $\nabla w_{i p}$ remains large, causing numerical instabilities known as cell-crossing instability.
The function’s support is too narrow, making it prone to numerical fracture when neighboring particles do not share enough grid nodes.

Because of these issues, linear interpolation is typically avoided in MPM, especially in solid simulation.

Example 27.2.2 (Quadratic B-Spline Interpolation). A better choice is the quadratic B-spline interpolating functions, which is $C^{1}$ -continuous (with continuous gradients) and has wider support: $N (x) = ⎩ ⎨ ⎧ \frac{3}{4} - x^{2} \frac{1}{2} (\frac{3}{2} - ∣ x ∣)^{2} 0 if ∣ x ∣ < \frac{1}{2}, if \frac{1}{2} \leq ∣ x ∣ < \frac{3}{2}, otherwise .$

Quadratic B-splines have a support region of width $3 h$ and provide smooth, stable interactions between particles and the grid. They are computationally efficient and require less memory than higher-order B-splines.

Example 27.2.3 (Cubic B-Spline Interpolation). For even better smoothness and broader support, we can use the cubic B-spline interpolating functions: $N (x) = ⎩ ⎨ ⎧ \frac{1}{2} ∣ x ∣^{3} - ∣ x ∣^{2} + \frac{2}{3} \frac{1}{6} (2 - ∣ x ∣)^{3} 0 if 0 \leq ∣ x ∣ < 1, if 1 \leq ∣ x ∣ < 2, otherwise .$

The cubic B-spline has support over $[- 2 h, 2 h]$ , making it more robust to numerical fracture and instabilities. However, the wider support also means that each particle interacts with more grid nodes, which increases the computational overhead during transfers (e.g., mass, momentum, and force). This trade-off between smoothness and efficiency should be considered when choosing an interpolation kernel.

Gradient of the Interpolating Function

Gradients are needed during force computation. For high-dimensional interpolating functions defined using tensor products, we compute their gradient using the chain rule: $\nabla N_{i} (x_{p}) = \frac{1}{h} N^{'} (\frac{x _{p} - x _{i}}{h}) N (\frac{y _{p} - y _{j}}{h}) N (\frac{z _{p} - z _{k}}{h}) N (\frac{x _{p} - x _{i}}{h}) \frac{1}{h} N^{'} (\frac{y _{p} - y _{j}}{h}) N (\frac{z _{p} - z _{k}}{h}) N (\frac{x _{p} - x _{i}}{h}) N (\frac{y _{p} - y _{j}}{h}) \frac{1}{h} N^{'} (\frac{z _{p} - z _{k}}{h}) .$ Here, $N^{'} (x)$ is the derivative of the 1D interpolating function $N (x)$ .

Particle-Grid Transfers

In MPM, the material state is stored on Lagrangian particles, while the Governing Equations (e.g., conservation of momentum) are solved on a temporary Eulerian grid. Therefore, we must transfer information between particles and grid nodes at each time step.

This section discusses how mass, momentum, and internal forces (stress) are transferred from particles to grid (P2G), and how updated velocities are transferred back from grid to particles (G2P).

Particle-to-Grid (P2G) Transfers

Mass Transfer: Let $m_{p}$ be the mass of particle $p$ , and $w_{i p} = N_{i} (x_{p})$ be the interpolation weight between grid node $i$ and particle $p$ . We transfer mass to grid node $i$ by taking a weighted sum:

$m_{i} = p \sum w_{i p} m_{p} .$

Momentum Transfer: Similarly, particle velocity $v_{p}$ is used to compute nodal momentum:

$(m v)_{i} = p \sum w_{i p} m_{p} v_{p} .$

Then, the nodal velocity of each grid node is:

$v_{i} = \frac{( m v ) _{i}}{m _{i}} .$

Internal Force (Stress) Transfer: In MPM, forces are evaluated on grid nodes where Newton’s second law is applied. For hyperelastic materials, the internal elastic force can be derived either from the Weak Form of the momentum equation or from the gradient of the Strain Energy. The latter is often preferred when a well-defined energy density function is available.

Assuming a deformation-gradient-based hyperelastic energy density $Ψ_{p} (F_{p})$ at each particle $p$ , the total potential energy of the system is:

$E = p \sum V_{p}^{0} Ψ_{p} (F_{p}),$

where $V_{p}^{0}$ is the rest volume of particle $p$ . The elastic force acting on grid node $i$ is then computed as the negative gradient of the total energy with respect to the nodal position $x_{i}$ :

$f_{i} (x) = - \frac{\partial E}{\partial x _{i}} = - p \sum V_{p}^{0} \frac{\partial Ψ _{p}}{\partial F} (F_{p} (x)) F_{p}^{T} \nabla w_{i p} .$

In the case of explicit time integration such as Symplectic Euler, the force is evaluated at the current time step $n$ :

$f_{i}^{n} = - p \sum V_{p}^{0} \frac{\partial Ψ _{p}}{\partial F} (F_{p}^{n}) (F_{p}^{n})^{T} \nabla w_{i p}^{n} .$

This formulation expresses the force on grid node $i$ due to the elastic stress contributions from nearby particles. It depends entirely on known particle attributes and interpolation weights at the current time step.

Grid Update

After collecting information from the particles, time integration is then performed on the grid, where 1st-order schemes are usually used:

$\hat{v}_{i}^{n + 1} = v_{i}^{n} + \frac{Δ t}{m _{i}} (f_{i} + f_{i}^{ext}) .$ Here, we use $\hat{v}_{i}^{n + 1}$ to represent velocity after grid update, so that it differs from the grid velocity after P2G transfer in the next time step, which is written as $v_{i}^{n + 1}$ .

Boundary treatments are also enforced during this stage and will be discussed in later lectures.

Grid-to-Particle (G2P) Transfers

Once the grid velocity $\hat{v}_{i}^{n + 1}$ is updated, we need to transfer information back to the particles. The main quantity to update during the G2P transfer is the particle velocity $v_{p}^{n + 1}$ , which will be used in the advection step.

The simplest approach is to interpolate velocity directly from the grid:

$v_{p}^{n + 1} = i \sum w_{i p} \hat{v}_{i}^{n + 1} .$

This straightforward scheme is part of the Particle-In-Cell (PIC) [Harlow 1962] method, but it is highly dissipative. In practice, we often adopt improved transfer schemes to better preserve velocity modes and conserve angular momentum.

Different Transfer Schemes

Transfer schemes define how P2G and G2P collaborate, balancing numerical stability, energy conservation, and physical realism in different ways.

Example 27.3.1 (PIC, Particle-In-Cell). P2G: Particle velocity $v_{p}$ is transferred to the grid using: $(m v)_{i} = p \sum w_{i p} m_{p} v_{p} .$ G2P: Particle velocity is interpolated directly from the grid: $v_{p}^{n + 1} = i \sum w_{i p} \hat{v}_{i}^{n + 1} .$ PIC is very stable and conserves linear momentum, but suffers from high numerical dissipation, loss of fine-scale motion, and does not conserve angular momentum.

Example 27.3.2 (FLIP, Fluid-Implicit-Particle). P2G: Same as PIC. G2P: In FLIP, the particle velocity is updated by blending the newly interpolated velocity and the change in velocity between time steps: $v_{p}^{n + 1} = i \sum w_{i p} \hat{v}_{i}^{n + 1} + α (v_{p}^{n} - i \sum w_{i p} v_{i}^{n}),$ where $α \in [0, 1]$ is the FLIP-PIC blending factor (with $α = 1$ recovering full FLIP and $α = 0.99$ often used). Since FLIP only interpolate the velocity change, numerical dissipation is effectively mitigated. However, note that to ensure the embedding relation between the particle and grid, particle positions still need to be updated using PIC's velocity: $x_{p}^{n + 1} = x_{p}^{n} + Δ t i \sum w_{i p} \hat{v}_{i}^{n + 1} .$ FLIP can preserve system energy and fine-scale motion over long time scales, making it well-suited for simulating fluids with detailed local features. However, due to its lack of inherent smoothing, it can introduce noise, penetration, or even instability in solid simulations.

Example 27.3.3 (APIC, Affine Particle-in-Cell). [Jiang et al. 2015] introduces affine velocity fields to more accurately model local particle motion and preserve angular momentum. Each particle stores an affine velocity matrix $C_{p}$ in addition to its velocity $v_{p}$ . The particle velocity field is written as: $v_{p} (x) = v_{p} + C_{p} (x - x_{p}) .$ P2G: The momentum transfer includes affine motion: $(m v)_{i} = p \sum w_{i p} m_{p} (v_{p} + C_{p} (x_{i} - x_{p})) .$ This enables particles to transfer both linear and angular momentum to the grid. G2P: After the grid update, particle velocity and affine matrix are reconstructed from grid values: $v_{p}^{n + 1} = i \sum w_{i p} \hat{v}_{i}^{n + 1},$ $B_{p}^{n + 1} = D_{p}^{- 1} i \sum w_{i p} \hat{v}_{i}^{n + 1} (x_{i} - x_{p})^{T} .$ Here, $D_{p}$ is a second moment matrix defined as: $D_{p} = i \sum w_{i p} (x_{i} - x_{p}) (x_{i} - x_{p})^{T} .$ For common B-spline interpolation kernels (quadratic and cubic), $D_{p}$ has a constant value: Quadratic B-spline: $D_{p} = \frac{1}{4} Δ x^{2} I$ . Cubic B-spline: $D_{p} = \frac{1}{3} Δ x^{2} I$ . Thus, $D_{p}^{- 1}$ is simply a constant scaling factor and need not be recomputed per particle. APIC significantly reduces dissipation compared to PIC by capturing local rotational and shear effects through affine velocity fields, while also preserving angular momentum. Although it requires storing an additional local matrix $C_{p}$ per particle and computing particle-to-grid moment contributions, the cost is modest in practice.

Deformation Gradient and Particle State Update

In MPM, each particle carries a deformation gradient $F_{p}$ , which tracks how the local material volume attached to the particle is stretched, rotated, or sheared over time. It plays a central role in computing stress based on constitutive models.

At the beginning of each time step, the grid lies on a regular, undeformed lattice. Let $x_{i}^{n}$ be the position of grid node $i$ at time $t^{n}$ . The deformation gradient is updated by observing how the grid locally moves over the time step, assuming a given grid velocity field $\hat{v}_{i}^{n + 1}$ .

Deformation Gradient Update

The most common approach is to compute $F_{p}^{n + 1}$ using a 1st-order approximation:

$F_{p}^{n + 1} = (I + Δ t i \sum v_{i}^{n + 1} \nabla (w_{i p}^{n})^{T}) F_{p}^{n} .$

In MLS-MPM [Hu et al. 2018] and APIC [Jiang et al. 2015], the deformation gradient update can be rewritten more compactly using the affine velocity matrix $C_{p}$ that is already computed during P2G and G2P transfers.

Recall the local affine velocity field:

$v_{p} (x) = v_{p} + C_{p} (x - x_{p}) .$

From this, the grid motion induces a local velocity gradient $C_{p}$ , and the deformation gradient can be updated as:

$F_{p}^{n + 1} = (I + Δ t C_{p}) F_{p}^{n} . (3)$

This bypasses the need to explicitly evaluate $\nabla w_{i p}$ , and instead uses the already-aggregated affine behavior stored in $C_{p}$ .

Plasticity flow is also applied at this stage to project the updated deformation gradient back onto the admissible space defined by the material's yield criterion. This process, known as return mapping, ensures that the material obeys plastic limits, and will be discussed in detail in the next lecture.

Position Update

The particle position at time $t^{n + 1}$ is then advected as:

$x_{p}^{n + 1} = x_{p}^{n} + Δ t v_{p}^{n + 1} .$

Summary

A single time step of the Material Point Method (MPM) simulation proceeds as follows:

P2G (Particle to Grid): Transfer particle mass, momentum, and force (computed using stresses) to grid nodes using interpolation functions.
Grid Update: Update nodal velocities via time integration while enforcing boundary conditions.
G2P (Grid to Particle): Interpolate updated grid velocities back to particles; optionally reconstruct affine velocity fields (APIC/MLS-MPM).
Particle State Update: Update each particle’s deformation gradient based on local velocity gradients and advect particle positions using updated velocities.

This hybrid particle-grid framework enables convenient and effective handling of large deformations and topological changes.

In the following section, we will explore how MPM naturally supports the simulation of plastic deformation by incorporating a return mapping procedure based on the material’s yield criterion.

Beyond Elasticity: Plasticity and Viscosity

*Author of this lecture: Chang Yu, University of California, Los Angeles

In physics-based simulation, the continuum assumption enables us to model a wide range of materials with rich and diverse behaviors. While hyperelasticity provides a convenient and elegant model for materials that return to their rest shape after deformation, such as rubber or soft tissue, it represents only a small portion of the physical phenomena observed in the real world.

Many natural and everyday materials, including sand, snow, mud, metal, foam, and fracturing solids, exhibit irreversible deformation, dissipation, or flow-like behavior that cannot be captured by purely elastic models. These materials require constitutive models that incorporate plasticity (permanent deformation) and viscosity (rate-dependent stress response).

The core difference between elasticity and plasticity lies in how materials store and dissipate energy. Elastic deformation is reversible and energy-conservative; plastic deformation is irreversible, governed by yield criteria, and involves energy dissipation through internal reconfiguration of the material structure.

MPM is particularly well suited for simulating plastic and viscous materials, as it naturally supports large deformations, history-dependent state updates, and localized inelastic flow, without suffering from mesh entanglement, element inversion, or remeshing-related artifacts common in traditional mesh-based methods.

In the remainder of this lecture, we will introduce how plasticity and viscosity are incorporated into MPM simulations. We begin with the discretization of plastic flow and the representation of irreversible deformation. We then describe how to enforce material-specific yield conditions and perform return mapping to constrain deformation within admissible limits.

Discretization of Plastic Flow

Plasticity is introduced into MPM by multiplicatively decomposing the deformation gradient $F$ into elastic and plastic parts:

$F = F_{E} F_{P} .$

Here, $F_{P}$ represents the accumulated irreversible deformation, while $F_{E}$ captures the recoverable elastic deformation from the plastically deformed configuration.

This decomposition separates material behavior into two parts:

The plastic part $F_{P}$ stores permanent changes (e.g., bending a metal rod into a spring),
The elastic part $F_{E}$ stores current deformation relative to that shape (e.g., compressing the spring slightly).

**Figure 28.1.1.** Multiplicative decomposition of the deformation gradient.

Stress is computed solely from $F_{E}$ using the hyperelastic constitutive model. Plastic flow is triggered when stress exceeds a material-specific limit and updates $F_{P}$ to ensure the stress stays within the yield surface.

Definition 28.1.1 (Yield Surface). We define a Yield Condition $y (τ) \leq 0$ on the Kirchhoff stress $τ$ derived from $F_{E}$ . The boundary $y (τ) = 0$ is known as the Yield Surface. When elastic stress exceeds this surface, plastic flow is triggered to restore admissibility.

This framework cleanly separates recoverable and permanent deformation by computing stress from $F_{E}$ and evolving $F_{P}$ under plastic flow.

Yield Condition and Return Mapping

To enforce yield condition and flow rule in discrete time and space, we apply a procedure known as return mapping. This is a projection-based algorithm that modifies the elastic deformation gradient such that the resulting stress satisfies the yield condition.

Return Mapping

Let $F_{tr}$ denote the trial elastic deformation gradient, obtained after the particle state update but before plasticity is enforced. If the stress computed from $F_{tr}$ lies inside the yield surface, no correction is needed as the deformation is purely elastic.

If the stress lies outside the yield surface, we apply return mapping to correct the trial elastic state, projecting the stress induced by $F_{tr}$ back onto the yield surface, following the direction specified by the plastic flow rule. We denote the result of the return mapping as:

$F_{E}^{n + 1} = Z (F_{tr}),$

where $Z (\cdot)$ is the return mapping operator that corrects the elastic predictor based on the yield surface geometry and material model.

In practice, return mapping is implemented either analytically (for simple yield criteria such as von Mises or Drucker-Prager) or iteratively via numerical solvers when the yield surface is complex.

Example 28.2.1 (von Mises Yield Criterion, Log-Strain Formulation).

In log-strain-based plasticity, return mapping is performed in the diagonalized SVD space. Let $F^{n + 1} = U Σ V^{T}$ and let $σ_{p}^{n}$ be the stored plastic log-strain. The trial elastic stretch is: $Σ_{tr} = Σ \cdot (exp (σ_{p}^{n}))^{- 1}, ϵ_{tr} = lo g (Σ_{tr}) .$ Compute the deviatoric log strain: $ϵ_{dev} = ϵ_{tr} - \frac{1}{d} tr (ϵ_{tr}) I .$ If $∥ ϵ_{dev} ∥ > 2/3 ϵ_{Y}$ , perform radial return: $ϵ_{dev}^{n + 1} = (1 - \frac{∥ ϵ _{dev} ∥ - 2/3 ϵ _{Y}}{∥ ϵ _{dev} ∥}) ϵ_{dev} .$ The corrected elastic log strain is: $ϵ_{log}^{n + 1} = ϵ_{dev}^{n + 1} + \frac{1}{d} tr (ϵ_{tr}) I .$ Then update: $Σ_{E}^{n + 1} = exp (ϵ_{log}^{n + 1}), σ_{p}^{n + 1} = lo g (Σ) - lo g (Σ_{E}^{n + 1}),$ and reconstruct the elastic deformation: $F_{E}^{n + 1} = U Σ_{E}^{n + 1} V^{T} .$

Summary

This section introduced how MPM incorporates plastic material behavior through the multiplicative decomposition of the deformation gradient into elastic and plastic parts. We discussed how stress is computed from the elastic component, while plastic flow evolves to enforce material-specific yield conditions. The return mapping algorithm ensures that stresses remain admissible by projecting the trial state back onto the yield surface.

Boundary Treatments

*Author of this lecture: Chang Yu, University of California, Los Angeles

During the grid update step, we apply external forces and solve for nodal accelerations and velocities. To model object interactions with the physcal environment, boundary conditions can be enforced at this stage.

In this lecture, we describe how boundary treatments are implemented in MPM. Specifically:

Boundary Conditions on Grid explains how velocity constraints and external forces are imposed directly on grid nodes to handle Dirichlet and Neumann conditions.
Frictional Contact on Material Particles focuses on more accurate contact and friction treatments, which are handled at the particle, enabling flexible and more accurate interaction with complex boundaries.

These treatments are essential for producing realistic contact behavior—such as sticking, sliding, and separation—that adheres to the Coulomb friction law.

Boundary Conditions

In the Material Point Method, boundary conditions (BCs) are enforced on the background Eulerian grid, because the governing equations—discretized in their weak form—are formulated over grid nodes, which serve as the true degrees of freedom (DOFs) of the system. These nodes must satisfy Newton's Second Law. Thus, all external constraints, including static walls, moving boundaries, or contact forces, must be applied directly on the grid after P2G and before G2P.

As usual, we denote the nodal velocity at a grid node $x_{i}$ as $\hat{v}_{i}^{n + 1}$ . After external force integration, we apply different velocity projections on boundary nodes depending on the type of boundary condition.

Types of Boundary Conditions

Let $n$ be the surface normal at a boundary grid node $x_{i}$ :

Sticky: The velocity is fully suppressed: $\hat{v}_{i}^{n + 1} \leftarrow 0 .$
Slip: Only the normal component is removed; tangential motion is preserved: $\hat{v}_{i}^{n + 1} \leftarrow \hat{v}_{i}^{n + 1} - n (n^{⊤} \hat{v}_{i}^{n + 1}) .$
Separate (One-way wall): Normal inflow is blocked, but outward motion is allowed: $\hat{v}_{i}^{n + 1} \leftarrow \hat{v}_{i}^{n + 1} - n \cdot min (n^{⊤} \hat{v}_{i}^{n + 1}, 0) .$

These operations are local, applied independently to each boundary grid node.

External Forces

External forces such as gravity can be added after P2G:

$\hat{v}_{i}^{n + 1} \leftarrow \hat{v}_{i}^{n + 1} + Δ t \cdot g .$

Moving Colliders

For moving or deforming objects, Signed Distance Functions (SDFs) are used to detect whether a grid node $x_{i}$ lies inside the object. Once detected, the surface normal $n$ and the relative velocity $Δ v$ are computed, and the grid velocity is adjusted accordingly.

Coulomb Friction

To enforce Coulomb friction on embedded colliders (e.g., sphere or capsule), we apply a projected velocity correction in the direction of the surface normal:

Let $n$ be the contact normal and $Δ v$ be the opposing velocity (i.e., $Δ v = v_{SDF} - \hat{v}_{i}^{n + 1}$ ). Define the normal component:

$\overset{v}{˙}_{n} = n^{⊤} Δ v .$

Then the corrected grid velocity is:

$\hat{v}_{i}^{n + 1} \leftarrow \hat{v}_{i}^{n + 1} + Δ v \cdot μ + n \cdot \overset{v}{˙}_{n} \cdot (1 - μ),$

where $μ \in [0, 1]$ is the friction coefficient. This operation smoothly blends between full separation ( $μ = 0$ ) and sticking ( $μ = 1$ ).

Frictional Contact on Material Particles

Previously, we discussed applying boundary conditions directly on the Eulerian grid nodes. While this approach is natural due to the grid nodes being the true degrees of freedom (DOFs) where Newton's second law is enforced, it suffers from several inherent issues:

Normal Estimation Error: For coarse grids, estimating accurate surface normals becomes challenging, potentially resulting in incorrect contact responses.
Contact gap (dx gap): Grid-based collision handling responds even when grid nodes are within approximately one grid cell spacing from collision surfaces, leading to visible gaps and earlier collision responses.

Given that the Material Point Method (MPM) discretizes solids directly into a set of material particles, it is more intuitive—and potentially more accurate—to handle frictional contact directly at the particle level. However, it is not physically consistent to simply compute collision forces and friction at particles and then directly transfer them (via P2G) to the grid, because this naïve approach bypasses the correct enforcement of Newton’s second law on the grid DOFs, potentially resulting in unphysical responses, particularly failing to properly enforce static friction constraints.

Instead, a more robust strategy is to formulate the frictional contact problem at the grid level explicitly as an optimization problem, taking advantage of the particle-based definition of collision energies or penalty functions. This leads us to adopt a two-stage time-splitting numerical scheme.

Two-Stage Time-splitting Optimization Scheme

Stage 1: Free-motion velocity

In the first stage, we ignore contact constraints entirely and compute the intermediate "free-motion" nodal velocity $v_{i}^{*}$ on grid nodes, which would occur in the absence of any contact forces. These velocities can be conveniently obtained via a standard symplectic Euler time integration step.

Stage 2: Frictional Contact via Optimization

In the second stage, frictional contact is enforced by solving a constrained linearized momentum balance around the free-motion velocity $v^{*}$ :

$M (v_{i}^{n + 1} - v_{i}^{*}) = h J^{T} (x_{p}^{n}) γ (v_{i}^{n + 1}) .$

Here:

$M$ is the mass matrix.
$J$ is the contact Jacobian defined at the particle level, it maps grid nodal velocities to particle-level contact velocities. Specifically, each contacting particle $j$ contributes a local Jacobian matrix: $J_{j} = \frac{\partial v _{c, j}}{\partial v _{i}} \in R^{3 \times n_{v}} .$ Stacking all these individual Jacobians produces the global contact Jacobian $J \in R^{3 n_{c} \times n_{v}}$ , which collectively relates all particle contact velocities $v_{c}$ to grid velocities $v$ as: $v_{c} = J v_{i} .$ Intuitively, $J$ can be understood as a special Grid-to-Particle (G2P) transfer operator linearized at the current configuration, directly encoding how infinitesimal changes in grid nodal velocities affect the particle velocities (normal and tangential components) at contact points.
$γ (v_{i}^{n + 1})$ represents contact forces (normal and tangential) subject to friction cone constraints.

This linear system can be reformulated into an unconstrained optimization problem:

$v_{i} min ℓ (v_{i}) = \frac{1}{2} ∥ v_{i} - v_{i}^{*} ∥_{M}^{2} + ℓ_{c} (v_{c} (v_{i})) .$

Possible contact energy formulations for $ℓ_{c}$ include the Logarithm Barrier Potential and Semi-Implicit Friction.

Summary

In this lecture, we discussed various boundary treatment strategies essential for accurate and stable MPM simulations. Initially, we highlighted that boundary conditions (BCs) must be applied directly to the grid nodes due to their role as the true degrees of freedom satisfying Newton's second law. We presented several common grid-level boundary conditions, including sticky, slip, and separate (one-way) conditions, along with practical methods for incorporating frictional contact through Coulomb friction constraints using signed distance functions (SDFs).

We then addressed inherent limitations of purely grid-based collision handling, such as inaccurate normals and discretization gaps, motivating the use of particle-level frictional contact formulations. By formulating frictional contact explicitly as an optimization problem at the grid level, with particle-based contact energies, we provided a physically consistent and robust approach. Specifically, we introduced a two-stage scheme where free-motion grid velocities are computed first, followed by an optimization-based frictional contact correction, effectively enforcing static friction and maintaining physical fidelity in complex contact scenarios.

In the next two sections, we will apply all these theoretical components to implement a practical, full-featured MPM simulation. We begin with a quick-start example: Two Colliding Elastic Blocks in 2D, showcasing the simplest MPM elastic body simulation based on PIC. Then, with minimal additional effort, we incorporate APIC transfer scheme, sand plasticity and an SDF-based collider to create a more advanced, 2D Sand Simulation.

Case Study: 2D Sand with a Sphere Collider*

*Author of this lecture: Chang Yu, University of California, Los Angeles

So far, we have introduced all core components necessary to build a complete and physically plausible MPM simulation system—including material discretization, time integration, and data transfer schemes.

In this case study, we put these components together to simulate two colliding elastic blocks in 2D. We start by setting up the simulation with material properties and data structure definitions. Then, we implement the Particle-In-Cell (PIC) Transfer to handle momentum exchange between particles and the grid.

This example serves as a quick-start demonstration of the MPM pipeline using elastic materials, without plasticity or complex boundary conditions.

For implementation, we use NumPy and Taichi as our programming framework. Taichi provides efficient parallelism on both CPU and GPU, and more importantly, it supports sparse data structures, which are critical for high-performance MPM grid computations.

The executable Python project for this section can be found at https://github.com/phys-sim-book/solid-sim-tutorial under the 10_mpm_elasticity folder.

Simulation Setup

In this section, we define the physical and numerical setup required for implementing a minimal MPM simulation of Two Colliding Elastic Blocks in 2D. We walk through the definition of simulation properties, initialization of particle positions and velocities, and data structures used throughout the simulation.

Physical and Numerical Parameters

We begin by setting up the discretization of the simulation domain and the material parameters of the block:

Implementation 30.1.1 (Physical and Numerical Parameters, simulator.py).

# simulation setup
grid_size = 128 # background Eulerian grid's resolution, in 2D is [128, 128]
dx = 1.0 / grid_size # the domain size is [1m, 1m] in 2D, so dx for each cell is (1/128)m
dt = 2e-4 # time step size in second
ppc = 8 # average particles per cell

density = 1000 # mass density, unit: kg / m^3
E, nu = 1e4, 0.3 # block's Young's modulus and Poisson's ratio
mu, lam = E / (2 * (1 + nu)), E * nu / ((1 + nu) * (1 - 2 * nu)) # Lame parameters

These parameters define a uniform dense background grid, particle resolution, and time integration step size. The entire simulation domain spans from [0, 0] to [1, 1] meters, and we aim for around 8 particles per grid cell on average. The blocks are set to have mass density at $1000 k g / m^{3}$ , Young's modulus at $1 0^{4} P a$ , and Poisson's ratio at $0.3$ .

Initial Particle Sampling and Scene Setup

We sample particles from two rectangular regions using uniform grid sampling. These two boxes are placed symmetrically on the left and right sides of the domain and are initialized with opposite velocities to simulate a head-on collision.

Compared to Poisson disk sampling, uniform sampling is easier to implement for analytic shapes, such as boxes and spheres, due to its structured nature and simple parametrization. However, this regularity can lead to aliasing artifacts, such as visible patterns or striping in the simulation, which may introduce unnatural structured noise into the result.

Here we adopt uniform sampling for simplicity and clarity, keeping the focus on the MPM pipeline itself.

Implementation 30.1.2 (Initial Particle Sampling and Scene Setup, simulator.py).

# uniformly sampling material particles
def uniform_grid(x0, y0, x1, y1, dx):
    xx, yy = np.meshgrid(np.arange(x0, x1 + dx, dx), np.arange(y0, y1 + dx, dx))
    return np.column_stack((xx.ravel(), yy.ravel()))

box1_samples = uniform_grid(0.2, 0.4, 0.4, 0.6, dx / np.sqrt(ppc))
box1_velocities = np.tile(np.array([10.0, 0]), (len(box1_samples), 1))
box2_samples = uniform_grid(0.6, 0.4, 0.8, 0.6, dx / np.sqrt(ppc))
box2_velocities = np.tile(np.array([-10.0, 0]), (len(box1_samples), 1))
all_samples = np.concatenate([box1_samples, box2_samples], axis=0)
all_velocities = np.concatenate([box1_velocities, box2_velocities], axis=0)

Each block consists of uniformly distributed material points representing a homogeneous elastic body. The left block is given an initial velocity of ([+10, 0]) m/s, and the right block ([-10, 0]) m/s, setting up a symmetric, head-on collision scenario with zero net linear momentum. This configuration mimics a controlled impact experiment.

Particle and Grid Data Fields

We define data fields to represent the state of each material point (particle) and background grid node. For particles, this includes position, velocity, volume, mass, and deformation gradient, following Material Particles. For the grid, we define nodal mass and velocity fields using dense arrays, which are sufficient for small-scale simulations. These can be further optimized using sparse grid structures—a direction we leave as future work for interested readers.

Implementation 30.1.3 (Particle and Grid Data Fields, simulator.py).

# material particles data
N_particles = len(all_samples)
x = ti.Vector.field(2, float, N_particles) # the position of particles
x.from_numpy(all_samples)
v = ti.Vector.field(2, float, N_particles) # the velocity of particles
v.from_numpy(all_velocities)
vol = ti.field(float, N_particles)         # the volume of particle
vol.fill(0.2 * 0.4 / N_particles) # get the volume of each particle as V_rest / N_particles
m = ti.field(float, N_particles)           # the mass of particle
m.fill(vol[0] * density)
F = ti.Matrix.field(2, 2, float, N_particles)  # the deformation gradient of particles
F.from_numpy(np.tile(np.eye(2), (N_particles, 1, 1)))

# grid data
grid_m = ti.field(float, (grid_size, grid_size))
grid_v = ti.Vector.field(2, float, (grid_size, grid_size))

Particle-In-Cell Transfer

At the beginning of each simulation step, the grid must be cleared before accumulating new particle-to-grid transfers.

Implementation 30.2.1 (Reset Grid, simulator.py).

def reset_grid():
    # after each transfer, the grid is reset
    grid_m.fill(0)
    grid_v.fill(0)

We adopt the Saint Venant–Kirchhoff (StVK) constitutive model formulated in the logarithmic (Hencky) strain space using the SVD of the deformation gradient.

Implementation 30.2.2 (Stvk Hencky Elasticity, simulator.py).

@ti.func
def StVK_Hencky_PK1_2D(F):
    U, sig, V = ti.svd(F)
    inv_sig = sig.inverse()
    e = ti.Matrix([[ti.log(sig[0, 0]), 0], [0, ti.log(sig[1, 1])]])
    return U @ (2 * mu * inv_sig @ e + lam * e.trace() * inv_sig) @ V.transpose()

During the particle-to-grid (P2G) transfer, we use quadratic B-spline interpolation to distribute each particle’s mass, momentum, and internal force to its neighboring grid nodes. This process follows the PIC (Particle-In-Cell) formulation, where particle velocities are directly transferred to the grid without storing affine velocity fields.

Implementation 30.2.3 (PIC Particle-to-Grid (P2G) Transfers, simulator.py).

@ti.kernel
def particle_to_grid_transfer():
    for p in range(N_particles):
        base = (x[p] / dx - 0.5).cast(int)
        fx = x[p] / dx - base.cast(float)
        # quadratic B-spline interpolating functions (Section 26.2)
        w = [0.5 * (1.5 - fx) ** 2, 0.75 - (fx - 1) ** 2, 0.5 * (fx - 0.5) ** 2]
        # gradient of the interpolating function (Section 26.2)
        dw_dx = [fx - 1.5, 2 * (1.0 - fx), fx - 0.5]

        P = StVK_Hencky_PK1_2D(F[p])
        for i in ti.static(range(3)):
            for j in ti.static(range(3)):
                offset = ti.Vector([i, j])
                weight = w[i][0] * w[j][1]
                grad_weight = ti.Vector([(1. / dx) * dw_dx[i][0] * w[j][1], 
                                          w[i][0] * (1. / dx) * dw_dx[j][1]])

                grid_m[base + offset] += weight * m[p] # mass transfer
                grid_v[base + offset] += weight * m[p] * v[p] # momentum Transfer, PIC formulation
                # internal force (stress) transfer
                fi = -vol[p] * P @ F[p].transpose() @ grad_weight
                grid_v[base + offset] += dt * fi

Right after Particle-to-Grid (P2G) Transfer, we normalize grid momentum to obtain nodal velocities and enforces Dirichlet boundary conditions near the domain edges by zeroing out velocities.

Implementation 30.2.4 (Grid Update, simulator.py).

@ti.kernel
def update_grid():
    for i, j in grid_m:
        if grid_m[i, j] > 0:
            grid_v[i, j] = grid_v[i, j] / grid_m[i, j] # extract updated nodal velocity from transferred nodal momentum

            # Dirichlet BC near the bounding box
            if i <= 3 or i > grid_size - 3 or j <= 2 or j > grid_size - 3:
                grid_v[i, j] = 0

During the grid-to-particle (G2P) transfer, we gather the updated velocity from the background grid and compute the elastic deformation gradient update using the velocity gradient derived from the interpolation function.

Implementation 30.2.5 (PIC Grid-to-Particle (G2P) Transfers, simulator.py).

@ti.kernel
def grid_to_particle_transfer():
    for p in range(N_particles):
        base = (x[p] / dx - 0.5).cast(int)
        fx = x[p] / dx - base.cast(float)
        # quadratic B-spline interpolating functions (Section 26.2)
        w = [0.5 * (1.5 - fx) ** 2, 0.75 - (fx - 1) ** 2, 0.5 * (fx - 0.5) ** 2]
        # gradient of the interpolating function (Section 26.2)
        dw_dx = [fx - 1.5, 2 * (1.0 - fx), fx - 0.5]

        new_v = ti.Vector.zero(float, 2)
        v_grad_wT = ti.Matrix.zero(float, 2, 2)
        for i in ti.static(range(3)):
            for j in ti.static(range(3)):
                offset = ti.Vector([i, j])
                weight = w[i][0] * w[j][1]
                grad_weight = ti.Vector([(1. / dx) * dw_dx[i][0] * w[j][1], 
                                          w[i][0] * (1. / dx) * dw_dx[j][1]])

                new_v += weight * grid_v[base + offset]
                v_grad_wT += grid_v[base + offset].outer_product(grad_weight)

        v[p] = new_v
        F[p] = (ti.Matrix.identity(float, 2) + dt * v_grad_wT) @ F[p]

Finally, particle positions are updated through advection using symplectic Euler time integration.

Implementation 30.2.6 (Particle State Update, simulator.py).

@ti.kernel
def update_particle_state():
    for p in range(N_particles):
        x[p] += dt * v[p] # advection

A full MPM simulation step consists of the following stages:

Implementation 30.2.7 (A full time step of MPM, simulator.py).

def step():
    # a single time step of the Material Point Method (MPM) simulation
    reset_grid()
    particle_to_grid_transfer()
    update_grid()
    grid_to_particle_transfer()
    update_particle_state()

**Figure 30.2.1.** **Time sequence of two colliding elastic blocks in 2D**. The red and blue blocks approach each other with opposite velocities, undergo large elastic deformation upon impact, and rebound with shape recovery. The simulation demonstrates symmetric momentum exchange and elastic energy storage under the MPM framework.

Summary

We have successfully implemented a minimal yet complete 2D Material Point Method (MPM) simulation featuring two colliding elastic blocks. This setup showcases the core pipeline of MPM, including particle sampling, data structure initialization, PIC-based transfer schemes, and elastic deformation based on the StVK constitutive model in Hencky strain space.

Despite its simplicity, this example captures key aspects of MPM. It serves as a clean and extensible foundation for building more sophisticated MPM systems.

In the next lecture, we build upon this framework by incorporating the APIC transfer scheme, Drucker-Prager plasticity, and SDF-based boundary handling to simulate 2D sand interacting with a static sphere collider, enabling realistic modeling of granular materials with frictional contact.

Case Study: 2D Sand with a Sphere Collider*

*Author of this lecture: Chang Yu, University of California, Los Angeles

Building on the previous chapter—Two Colliding Elastic Blocks in 2D, where we implemented a minimal Material Point Method (MPM) simulation using the PIC transfer scheme—this case study demonstrates how, with minimal additional effort, we can extend the system to create a more advanced sand simulation.

Unlike the previous chapter, where particles are sampled on a regular grid, here we use Poisson-disk Sampling to initialize the material points. This helps reduce aliasing artifacts and structured noise, producing more physically realistic behavior in granular simulations.

In this case study, the sand is modeled using the Drucker-Prager Elastoplasticity [Klar et al. 2016] constitutive model, allowing us to capture non-recoverable deformation and internal friction—key features in granular material behavior.

We place a Static Sphere Collider inside the domain, which interacts with falling sand particles through frictional contact. The collider boundary is represented using a signed distance function (SDF) and enforces contact constraints and Coulomb friction.

We extend the original PIC scheme by incorporating the APIC Transfer Scheme to achieve improved accuracy and reduced numerical dissipation.

The executable Python project for this section can be found at https://github.com/phys-sim-book/solid-sim-tutorial under the 11_mpm_sand folder.

Drucker-Prager Elastoplasticity

The Drucker-Prager plasticity model is widely used for simulating granular materials such as sand and soil. It generalizes the von Mises model by incorporating a friction angle, which governs how much shear stress the material can sustain relative to the normal stress. Physically, this corresponds to Coulomb-like friction between particles: the material yields when the shear stress exceeds a friction-dependent bound based on pressure.

In stress space, the Drucker-Prager yield surface takes the shape of a cone, with three distinct cases to handle:

Case I (Elastic): The stress lies strictly inside the cone, and no plasticity occurs.
Case II (Expansion): The stress corresponds to a configuration where the material expands volumetrically (positive trace), and no resistance is applied—this maps to the cone tip.
Case III (Shearing): The stress lies outside the cone but with compressive pressure and must be projected back to the cone surface.

This model is best implemented in the log-strain (Hencky strain) space using the SVD of the deformation gradient, which we have already introduced in the previous section.

Example 31.1.1 (Drucker-Prager Yield Criterion, Log-Strain Formulation). We define the logarithmic strain as:

$ϵ = lo g (Σ) + \frac{1}{d} lo g (J_{vol}) I .$

The deviatoric part is:

$\hat{ϵ} = ϵ - \frac{1}{d} tr (ϵ) I .$

The plastic multiplier is computed as:

$Δ γ = ∥ \hat{ϵ} ∥ + \frac{( d λ + 2 μ )}{2 μ} \cdot tr (ϵ) \cdot α .$

Here, $α = \frac{2}{3} \cdot \frac{2 s i n ϕ}{3 - s i n ϕ}$ is the Drucker-Prager friction coefficient derived from the friction angle $ϕ$ .

Then we apply the return mapping:

If $tr (ϵ) > 0$ (Case II), we project to the cone tip: set $ϵ = 0$ .

If $Δ γ \leq 0$ , we are inside the cone (Case I): no change.

Otherwise (Case III), we project back to the cone surface:

$ϵ^{n + 1} = ϵ - \frac{Δ γ}{∥ ϵ ^ ∥} \hat{ϵ} .$

Finally, we compute the updated singular values:

$Σ_{E}^{n + 1} = exp (ϵ^{n + 1}),$

and reconstruct the elastic deformation:

$F_{E}^{n + 1} = U diag (Σ_{E}^{n + 1}) V^{T} .$

Example 31.1.2 (Drucker-Prager Plasticity with Volume Correction).

In granular materials like sand, volumetric expansion can result in non-physical volume gain if not properly handled. The standard Drucker-Prager projection maps stress to the tip of the cone under expansion (positive trace), which corresponds to a stress-free state. However, this may unrealistically "lock in" expanded configurations as new rest shapes.

This effect can cause persistent volume inflation when a particle experiences elastic expansion followed by plastic projection. Any future compression then incurs artificial elastic penalties, resulting in incorrect material response.

To correct this, we follow the volume correction treatment described by [Tampubolon et al. 2017] by introducing a per-particle scalar accumulator $v_{vol}$ that tracks log-volume changes induced by plastic projection:

$v_{vol}^{n + 1} = v_{vol}^{n} - lo g det F_{E}^{n + 1} + lo g det F_{trial} .$

This correction is naturally integrated into the log-strain formulation by adjusting the strain before return mapping:

$ϵ_{corrected} = ϵ_{trial} + \frac{v _{vol}}{d} I,$

where $d$ is the spatial dimension. This allows future compression to neutralize previous volume gain rather than being resisted elastically. In the code below, diff_log_J provides this volume correction term, computed as the accumulation of log-difference of determinants.

Implementation 31.1.1 (Drucker-Prager Elastoplasticity Return Mapping, simulator.py).

@ti.func
def Drucker_Prager_return_mapping(F, diff_log_J):
    dim = ti.static(F.n)
    sin_phi = ti.sin(friction_angle_in_degrees/ 180.0 * ti.math.pi)
    friction_alpha = ti.sqrt(2.0 / 3.0) * 2.0 * sin_phi / (3.0 - sin_phi)
    U, sig_diag, V = ti.svd(F)
    sig = ti.Vector([ti.max(sig_diag[i,i], 0.05) for i in ti.static(range(dim))])
    epsilon = ti.log(sig)
    epsilon += diff_log_J / dim # volume correction treatment
    trace_epsilon = epsilon.sum()
    shifted_trace = trace_epsilon
    if shifted_trace >= 0:
        for d in ti.static(range(dim)):
            epsilon[d] = 0.
    else:
        epsilon_hat = epsilon - (trace_epsilon / dim)
        epsilon_hat_norm = ti.sqrt(epsilon_hat.dot(epsilon_hat)+1e-8)
        delta_gamma = epsilon_hat_norm + (dim * lam + 2. * mu) / (2. * mu) * (shifted_trace) * friction_alpha
        epsilon -= (ti.max(delta_gamma, 0) / epsilon_hat_norm) * epsilon_hat
    sig_out = ti.exp(epsilon)
    for d in ti.static(range(dim)):
        sig_diag[d,d] = sig_out[d]
    return U @ sig_diag @ V.transpose()

The return mapping is enforced during particle state update:

Implementation 31.1.2 (Particle State Update, simulator.py).

@ti.kernel
def update_particle_state():
    for p in range(N_particles):
        # trial elastic deformation gradient
        F_tr = F[p]
        # apply return mapping to correct the trial elastic state, projecting the stress induced by F_tr
        # back onto the yield surface, following the direction specified by the plastic flow rule.
        new_F = Drucker_Prager_return_mapping(F_tr, diff_log_J[p])
        # track the volume change incurred by return mapping to correct volume, following https://dl.acm.org/doi/10.1145/3072959.3073651 sec 4.3.4
        diff_log_J[p] += -ti.log(new_F.determinant()) + ti.log(F_tr.determinant()) 
        F[p] = new_F
        # advection
        x[p] += dt * v[p]

SDF-Based Sphere Collider

In the Signed Distance section, we introduced analytical representations of solid geometries—where shapes like spheres, boxes, and half-spaces are defined using mathematical expressions on their coordinates. One powerful abstraction introduced there was the Signed Distance Function (SDF). This function evaluates, at any given point in space, the signed distance to the surface of a geometry: negative values indicate points inside the object, positive values are outside, and zero lies exactly on the surface.

This concept translates naturally into collision detection and boundary condition enforcement in simulation frameworks like MPM.

Representing a Collider with Analytic SDF

Consider a 2D sphere (circle) with center $c = (c_{x}, c_{y})$ and radius $r$ . Its SDF is defined as:

$ϕ (x) = ∥ x - c ∥ - r .$

If $ϕ (x) < 0$ , the point is inside the sphere.
If $ϕ (x) = 0$ , the point is on the sphere’s surface.
If $ϕ (x) > 0$ , the point is outside the sphere.

This definition allows us to apply contact boundary conditions uniformly across the simulation domain by evaluating the SDF at each grid node.

Implementation 31.2.1 (Sphere SDF Collider with Frictional Contact, simulator.py).

            # a sphere SDF as boundary condition
            sphere_center = ti.Vector([0.5, 0.5])
            sphere_radius = 0.05 + dx # add a dx-gap to avoid penetration
            if (x_i - sphere_center).norm() < sphere_radius:
                normal = (x_i - sphere_center).normalized()
                diff_vel = -grid_v[i, j]
                dotnv = normal.dot(diff_vel)
                dotnv_frac = dotnv * (1.0 - sdf_friction)
                grid_v[i, j] += diff_vel * sdf_friction + normal * dotnv_frac

Affine Particle-In-Cell Transfer

We defined the material point data following Simulation Setup, extending it with two additional terms: the affine velocity field $C_{p}$ used in the APIC transfer scheme and the scalar log_J_diff used for volume correction as described in Drucker-Prager Elastoplasticity.

During the particle-to-grid (P2G) transfer, we adopt the APIC formulation for momentum exchange. Instead of directly transferring particle velocity, we include the local affine velocity field $C_{p}$ . Specifically, momentum is transferred using the form:

Implementation 31.3.1 (Affine Transfer in APIC Particle-to-Grid (P2G), simulator.py).

                grid_v[base + offset] += weight * m[p] * (v[p] + C[p] @ dpos) # momentum Transfer, APIC formulation

Here, dpos is the offset from the particle to the grid node.

During the grid-to-particle (G2P) transfer, we gather both the updated velocity and the affine velocity matrix $C_{p}$ from the background grid. The affine matrix is computed from the weighted outer product of grid velocities and position offsets. This allows each particle to retain local velocity variation, significantly reducing numerical dissipation compared to PIC.

Implementation 31.3.2 (Affine Transfer in APIC Grid-to-Particle (G2P), simulator.py).

                new_C += weight * grid_v[base + offset].outer_product(dpos) / D

Here, $D$ is a constant factor when using the Quadratic B-spline interpolation function:

Implementation 31.3.3 (Constant D for Quadratic B-spline used for APIC, simulator.py).

D = (1./4.) * dx * dx # constant D for Quadratic B-spline used for APIC

**Figure 31.3.1.** Time sequence of a 2D sand block falling onto a static red sphere collider. The sand undergoes irreversible deformation and splashing upon impact, demonstrating granular flow and frictional boundary response.

Summary

We have extended our Elastic MPM Framework to simulate granular materials by incorporating Poisson-disk sampling, APIC transfers, Drucker-Prager elastoplasticity, and SDF-based frictional contact. The APIC scheme improves accuracy while maintaining stability by capturing local affine motion, and the Drucker-Prager model enables realistic plastic flow and pressure-dependent yielding. A static sphere collider is introduced using a signed distance function (SDF), allowing smooth and robust enforcement of contact and friction constraints. Together, these enhancements enable stable and physically plausible simulation of sand undergoing large deformation and splashing.

Position Based Dynamics Framework

*Author of this lecture: Žiga Kovačič, Cornell University

This section introduces Position Based Dynamics (PBD). In the field of physical simulation, several dominant paradigms exist, each with a different level of abstraction for describing motion.

The most classical approach is force-based, which directly models Newton's second law. In a force-based paradigm, internal and external forces are accumulated to determine particle accelerations, which are then numerically integrated to find new velocities and positions. While physically accurate, these methods can suffer from expensive computational costs due to the solution of large nonlinear systems.

Position-based methods bypass the velocity and acceleration layers altogether, working immediately on the positions of particles. The central idea is to define the system's behavior through a set of geometric constraints. The simulation loop first predicts a new position for each particle based on its current velocity. Then, an iterative solver adjusts these predicted positions directly to ensure that all constraints are satisfied. This process of constraint projection replaces the explicit integration of forces, leading to highly efficient and visually plausible simulations, which makes PBD an effective alternative to classical dynamics frameworks.

This section is inspired by incedible notes on PBD by [Bender et al. 2017] and [Macklin & Muller 2013].

Preliminaries

To fully appreciate the PBD approach, it is essential to first understand the classical formulation of Lagrangian dynamics it gets inspired. The state of a dynamic system composed of $N$ particles is described by their individual mass $m_{i}$ , position $p_{i} \in R^{3}$ , and velocity $v_{i} \in R^{3}$ . The evolution of this system is governed by a pair of first-order ordinary differential equations derived from Newton's second law: $\dot{v}_{i} = \frac{1}{m _{i}} f_{i} (32.1.1)$ $\dot{p}_{i} = v_{i} (32.1.2)$ where $f_{i}$ is the sum of all forces on particle $i$ .

Rigid bodies require additional attributes to describe their rotational state: an orientation quaternion $q_{i} \in H$ , an angular velocity $ω_{i} \in R^{3}$ , and an inertia tensor $I_{i} \in R^{3 \times 3}$ . The rotational motion is then described by the Newton-Euler equations: $\dot{ω}_{i} = I_{i}^{- 1} (τ_{i} - (ω_{i} \times (I_{i} ω_{i}))) (32.1.3)$ $\dot{q}_{i} = \frac{1}{2} \hat{ω}_{i} q_{i} (32.1.4)$ where $τ_{i}$ is the net torque and $\hat{ω}_{i}$ is the quaternion $[0, ω_{i}]$ .

To simulate this evolution, these continuous equations are discretized using a numerical integrator. The Symplectic Euler method updates the velocity first, then uses this new velocity to update the position, improving stability over standard explicit Euler. For a particle, with time step $Δ t$ , the update is: $v_{i} (t_{0} + Δ t) = v_{i} (t_{0}) + Δ t \frac{1}{m _{i}} f_{i} (t_{0}) (32.1.5)$ $p_{i} (t_{0} + Δ t) = x_{i} (t_{0}) + Δ t v_{i} (t_{0} + Δ t) (32.1.6)$ This procedure is applied analogously for rigid body states.

Remark 32.1.1 (Quaternion Normalization). Due to numerical integration error, the quaternion $q_{i}$ may drift from unit length. It is essential to re-normalize the quaternion after each integration step to maintain a valid rotational state.

Finally, interactions and physical limits are modeled using holonomic constraints, which depend only on positions and orientations, but not velocities! Constraints are kinematic restrictions in the form of equations and inequalities that constrain the relative motion of bodies. An equality (bilateral) constraint takes the form $C_{j} (p, q) = 0$ , while an inequality (unilateral) constraint is $C_{j} (p, q) \geq 0$ . In classical dynamics, these are satisfied by computing constraint forces and adding them to $f_{i}$ in Equation (32.1.1). It is this specific mechanism for handling constraints that PBD fundamentally changes.

Core Framework

The PBD framework simulates a physical object by discretizing it into a set of $N$ particles and defining its behavior through $M$ geometric constraints (relationships). The central idea is to treat the system as a set of particles and constraints. At each time step, particle positions are first predicted using an explicit integration scheme, after which an iterative solver adjusts these predicted positions to satisfy all geometric constraints. We can formally define such a system as follows:

Definition 32.2.1 (PBD System). A PBD system is a tuple consisting of a set of $N$ particles and a set of $M$ constraints.

Each particle $i \in {1, \dots, N}$ is characterized by its position $p_{i} \in R^{3}$ , velocity $v_{i} \in R^{3}$ , and mass $m_{i}$ . For convenience, we define the inverse mass $w_{i} = 1/ m_{i}$ , where $w_{i} = 0$ for infinitely massive (i.e., static or kinematically controlled) particles.

The dynamic behavior is governed by a set of constraints. Each constraint $j \in {1, \dots M}$ is defined by a scalar-valued function $C_{j} (p_{1}, \dots, p_{n_{j}})$ which operates on a subset of $n_{j}$ particles. The constraint is satisfied if $C_{j} = 0$ for an equality constraint, or $C_{j} \geq 0$ for an inequality constraint.

The simulation proceeds in discrete time steps of size $Δ t$ . The central loop of the PBD algorithm can be described as follows.

**Figure 32.2.1 (PBD Main Algorithm).** The core of the loop is a multi-phase process: velocity and position prediction, constraint construction, constraint solving, and state update.

Note that since the algorithm simulates a system which is 2nd order in $t$ , we need to specify both positions and velocities before the simulation loop starts. This loop structure ensures that velocities are implicitly updated based on the geometric corrections performed by the solver, producing the correct behavior for a 2nd-order dynamical system.

Constraint Formulation

The main loop, described in the previous chapter, relies on the solver to correct the predicted particle positions $p$ so that they satisfy all defined constraints. We will begin by framing the problem as a large, non-linear system of equations and inequalities. Then, we will derive the core mathematical machinery used to solve this system, which involves linearizing the constraint functions and applying a Gauss-Seidel-type iterative scheme. The derivation will show how to compute the position correction for a single constraint while accounting for particle masses.

Problem Formulation

The central task of the solver is to find a set of position corrections $Δ p_{i}$ that move the predicted positions $p_{i}$ to a new state $p_{i} + Δ p_{i}$ where all $M$ constraints are satisfied. If we concatenate all particle positions into a single state vector $p = [p_{1}^{T}, ..., p_{N}^{T}]^{T} \in R^{3 N}$ , the problem is to solve the following system of mixed equalities and inequalities: $C_{1} (p) C_{2} (p) ⋮ C_{M} (p) ≻ 0 ≻ 0 ≻ 0 (32.3.1)$

where each $C_{j} (p)$ may represent an equality ( $= 0$ ) or an inequality ( $\geq 0$ ). This system presents a significant challenge. The problem comprises of a set of $M$ equations for $3 N$ unknown position components. If $M \geq 3 N$ , then the system is over-determined, else if $M \leq 3 N$ , then the system is under-determined. Additionally, the system is typically non-linear, as constraint functions often involve distances or angles. Solving such a system globally and simultaneously is computationally intractable for real-time applications.

PBD circumvents this complexity by adopting a local and iterative approach. Instead of solving the entire system at once, it processes one constraint at a time in a sequential manner, similar to the Gauss-Seidel method for solving linear systems. Each constraint is solved independently, and the position updates are applied immediately, influencing the solution of subsequent constraints in the same iteration. We repeatedly iterate through all the constraints and project the particles to valid locations with respect to the given constraint alone.

Constraint Solver

The heart of the PBD algorithm is the constraint solver, which iteratively projects the predicted particle positions to satisfy the defined constraints. Since this projection must be done in a physically plausible manner, ideally conserving the system's total linear and angular momentum for internal constraints, we employ a nonlinear Gauss-Seidel Solver.

Constraint Projection: Momentum Conservation

To keep things as physically plausible as possible we make sure that for any internal constraint, the correction step should not introduce fictitious external forces, often referred to as "ghost forces." This is achieved by ensuring that the net change in linear and angular momentum is zero. Let $Δ p_{i}$ be the position correction for particle $i$ . Linear momentum is conserved if the center of mass remains unchanged by the correction:

$i \sum m_{i} Δ p_{i} = 0$

Angular momentum is conserved if the net torque produced by the corrections is zero with respect to a common center of rotation $c$ :

$i \sum (p_{i} - c) \times (m_{i} Δ p_{i}) = 0$

Remark 32.4.1 (Constraint Gradient). A key insight is that if the correction vector for the concatenated particle positions, $Δ p = [Δ p_{1}^{T}, \dots, Δ p_{n}^{T}]^{T}$ , is chosen to be parallel to the constraint gradient $\nabla_{p} C$ , that is: $Δ p^{⊤} \nabla_{p} C = 0$ then for many common internal constraints (which are independent of rigid body transformations), both momenta are automatically conserved when particle masses are equal.

As such the correction for each particle will be assumed to be parallel to the constraint gradient. This choice is supported by the fact that since the gradient points in the direction of the steepest ascent of the constraint function, and thus moving in the opposite direction is the most direct way to reduce the constraint error.

Position Correction (General Constraints)

Consider a single constraint $C (p_{1}, \dots, p_{n}) = 0$ involving $n$ particles. Given a set of predicted positions $p$ that violate this constraint (i.e., $C (p) \neq = 0$ ), we seek a correction $Δ p$ such that $C (p + Δ p) = 0$ . To make this tractable, we linearize the constraint function using a first-order Taylor expansion around the current positions $p$ : $C (p + Δ p) C (p + Δ p) = C (p) + \nabla_{p} C (p)^{⊤} Δ p + O (∥Δ p ∥^{2}) ≻ 0, \approx C (p) + \nabla_{p} C (p)^{⊤} Δ p ≻ 0. (32.4.1)$

As established above, we restrict the correction to be in the direction of the constraint gradient. This allows us to define the correction for a single particle $i$ in terms of a single unknown scalar $λ$ , which acts as a Lagrange multiplier: $Δ p_{i} = - λ w_{i} \nabla_{p_{i}} C (p) (32.4.2)$ where $w_{i} = 1/ m_{i}$ is the inverse mass, ensuring that lighter particles are displaced more. The negative sign is a convention to align $λ$ with the concept of a repulsive force for a positive constraint violation. In vector form for all involved particles, this is $Δ p = - λ M^{- 1} \nabla_{p} C$ , where $M$ is the diagonal mass matrix.

To find the value of $λ$ , we substitute the correction from Equation (32.4.2) back into the linearized constraint, Equation (32.4.1). For an equality constraint, this becomes: $C (p) + \nabla_{p} C (p)^{⊤} (- λ M^{- 1} \nabla_{p} C (p)) = 0$ Solving for the Lagrange multiplier $λ$ yields: $λ = \frac{C ( p )}{\sum _{j = 1}^{n} w _{j} ∥ \nabla _{p_{j}} C ( p ) ∥ ^{2}} (32.4.3)$ With $λ$ determined, the position correction $Δ p_{i}$ for each particle is fully defined. We can also write the full correction in a single expression by substituting Equation (32.4.3) into Equation (32.4.2): $Δ p_{i} = - \frac{C ( p _{1} , \dots , p _{n} )}{\sum _{j = 1}^{n} w _{j} ∥ \nabla _{p_{j}} C ∥ ^{2}} w_{i} \nabla_{p_{i}} C (p) (32.4.4)$ This equation is the cornerstone of the PBD solver. It provides a direct, computationally efficient method to calculate the position corrections required to satisfy a single linearized constraint while respecting particle masses. This projection is solved multiple times for each constraint within a single time step. It is important to note that for constraints that are linear along their gradient, such as a simple distance constraint, this linearization is exact, and the constraint can be satisfied in a single step.

Example 32.4.1 (Distance Constraint). Let us consider one of the most fundamental constraints: the distance constraint, which enforces a fixed separation $d$ between two particles, $p_{1}$ and $p_{2}$ . This is a common building block used to model stretching resistance in springs, the edges of a cloth mesh, or rigid links between objects.

The constraint function is defined as the difference between the current distance and the desired rest distance $d$ : $C (p_{1}, p_{2}) = ∥ p_{1} - p_{2} ∥ - d$ This function directly measures the error that needs to be corrected. While the general projection method from Equation (32.4.4) can be applied, for this specific and common case, the result simplifies to a very intuitive form. The final position corrections for each particle are given directly by: $Δ p_{1} Δ p_{2} = - \frac{w _{1}}{w _{1} + w _{2}} (∥ p_{1} - p_{2} ∥ - d) \frac{p _{1} - p _{2}}{∥ p _{1} - p _{2} ∥} = + \frac{w _{2}}{w _{1} + w _{2}} (∥ p_{1} - p_{2} ∥ - d) \frac{p _{1} - p _{2}}{∥ p _{1} - p _{2} ∥}$ The term $(∥ p_{1} - p_{2} ∥ - d)$ represents the total correction magnitude needed along the vector connecting the particles. This total correction is then distributed between the two particles based on their inverse mass ratio, $w_{i} / (w_{1} + w_{2})$ . The correction is applied along the unit vector $(p_{1} - p_{2}) /∥ p_{1} - p_{2} ∥$ , moving the particles towards each other if they are too far apart ( $C > 0$ ) and away from each other if they are too close ( $C < 0$ ). This formulation directly satisfies the conservation of linear momentum!

Hierarchical Solver

The standard Gauss-Seidel solver in PBD exhibits slow convergence for low-frequency, large-scale deformations because corrections propagate only locally. To accelerate this, Hierarchical Position Based Dynamics (HPBD) [Müller 2008] introduces a multi-resolution approach. The system is represented as a hierarchy of meshes, from coarse to fine. The solver operates first on the coarsest levels, allowing corrections for large-scale errors to propagate rapidly across the entire object. These corrections are then transferred and refined on successively finer levels in a single coarse-to-fine pass via interpolation (prolongation). This multigrid-inspired technique significantly improves convergence speed. For this to work correctly, constraints on the coarse levels must be unilateral (inequality constraints), resisting only stretching, to avoid artificially restricting large-scale bending and folding of the object.

Stiffness and Damping in Constraint Solving

Stiffness

Material stiffness is controlled by a parameter $k \in [0, 1]$ , which controls how strongly a constraint is enforced. The most straightforward way to incorporate stiffness is to scale the calculated correction by $k$ : $Δ p_{i} \leftarrow k \cdot Δ p_{i}$ . However, this makes the effective stiffness dependent on the number of solver iterations, $n_{s}$ . If a constraint is projected $n_{s}$ times, the remaining error after the projections would be proportional to $(1 - k)^{n_{s}}$ . A more robust formulation that decouples stiffness from the iteration count is: $k^{'} = 1 - (1 - k)^{1/ n_{s}} (32.5.1)$ By scaling the correction by $k^{'}$ instead of $k$ , the remaining error after $n_{s}$ iterations becomes $(1 - k^{'})^{n_{s}} = 1 - k$ , which is independent of $n_{s}$ . This allows artists to tune the material stiffness $k$ without worrying about how many solver iterations are being used.

Damping

Although the PBD method is generally stable, simulations can exhibit excessive oscillations or kinetic energy gain, particularly with stiff constraints. The quality of dynamic simulations can generally be improved by the incorporation of an appropriate damping scheme. It can improve the stability by reducing temporal jittering of the point positions of an object. It also allows for larger time steps which increases the performance of a dynamic simulation.

A naive damping of all velocities can undesirably affect the global motion of an object, slowing down its overall translation and rotation. A more sophisticated approach is to damp only the relative motions of particles while preserving the total linear and angular momentum of the system.

Method 32.5.1 (Momentum-Conserving Damping).

Compute Center of Mass and Velocity: $p_{c m} = \frac{\sum _{i} m _{i} p _{i}}{\sum _{i} m _{i}}$ , $v_{c m} = \frac{\sum _{i} m _{i} v _{i}}{\sum _{i} m _{i}}$

Compute Angular Momentum and Inertia Tensor: Let $r_{i} = x_{i} - p_{c m}$ .

$L = \sum_{i} r_{i} \times (m_{i} v_{i})$

$I = \sum_{i} m_{i} ((r_{i} \cdot r_{i}) Id - r_{i} \otimes r_{i})$

Compute Angular Velocity: $ω = I^{- 1} L$

Apply Damping: For each particle $i$ :

a. Calculate the velocity deviation from rigid body motion: $Δ v_{i} = (v_{c m} + ω \times r_{i}) - v_{i}$

b. Apply damping to the deviation: $v_{i} \leftarrow v_{i} + k_{d am p in g} Δ v_{i}$

This method effectively isolates the non-rigid components of motion and damps only them. In the extreme case where $k_{d am p in g} = 1$ , all relative motion is eliminated, and the object behaves as a perfect rigid body.

Extended Position Based Dynamics (XPBD)

While the stiffness formulation in Equation (32.5.1) decouples the effective stiffness from the solver iteration count, a fundamental limitation of standard PBD remains: the resulting material stiffness is still dependent on the simulation time step $Δ t$ . An extension known as Extended Position Based Dynamics (XPBD) addresses this issue, enabling true parameter-independent stiffness.

XPBD is derived from a compliant constraint formulation and introduces the concept of compliance as the inverse of stiffness.

Definition 32.5.1 (Compliance). Compliance, $α \geq 0$ , is the inverse of a material's stiffness $k$ , with $α = 1/ k$ . It describes a material's propensity to deform under load. In XPBD, a compliance of $α = 0$ corresponds to an infinitely stiff, or hard, constraint.

The core idea of XPBD is to treat the Lagrange multiplier $λ$ not as a temporary value recalculated in each iteration, but as a physical quantity representing the accumulated impulse that is incrementally updated. In each solver iteration, we calculate an impulse increment, $Δ λ$ , and add it to the total impulse $λ$ for that constraint. The formula for this increment modifies Equation (32.4.3) as follows: $Δ λ = \frac{- C ( p ) - α ^ λ}{\sum _{j} w _{j} ∣ \nabla _{p_{j}} C ( p ) ∣ ^{2} + α ^} (32.5.2)$ Here, $λ$ is the total accumulated Lagrange multiplier for the constraint from previous iterations within the current time step. The term $\overset{α}{^}$ is the time-step scaled compliance, defined as: $\overset{α}{^} = \frac{α}{( Δ t ) ^{2}}$ This scaling ensures that the compliance parameter $α$ has physically consistent units within the dynamical system. After computing $Δ λ$ , the solver updates both the particle positions and the accumulated Lagrange multiplier for that constraint: $Δ p_{i} = - Δ λ w_{i} \nabla_{p_{i}} C (p) and λ \leftarrow λ + Δ λ$

Remark 32.5.1 (Interpretation of XPBD). The $+ \overset{α}{^}$ term in the denominator of Equation (32.5.2) acts to limit the magnitude of the corrective impulse $Δ λ$ . As compliance $α$ increases (i.e., the material becomes softer), $\overset{α}{^}$ grows, thus reducing the impulse applied per iteration. In the case of zero compliance ( $α = 0$ ), the $\overset{α}{^}$ terms vanish, and the formula for $Δ λ$ reduces to the standard PBD formulation in Equation (32.4.3).

The primary benefit of XPBD is not faster convergence, but convergence to a physically consistent state that correctly reflects the user-defined compliance $α$ , independent of the time step or iteration count. If the solver is terminated early, the system still exhibits the desired softness to some degree rather than an artificial, uncontrolled compliance. This makes material behavior more predictable and robust.

Summary

The Position Based Dynamics framework provides a robust and efficient method for physics-based simulation. Its core is a simple time-stepping loop that predicts new particle positions and then iteratively corrects them using a Gauss-Seidel solver. The mathematical basis for this correction is a projection derived from a linearized constraint function, weighted by inverse particle masses to produce physically plausible motion. This approach neatly sidesteps the stability problems of traditional explicit integrators. By defining different constraint functions, the same general solver can be used to simulate a wide variety of physical phenomena, from deformable solids to cloth and fluids. The framework is further enhanced by practical features like tunable stiffness and momentum-conserving damping, making it exceptionally well-suited for real-time applications.

Constraint Types for Position Based Dynamics Framework

*Author of this lecture: Žiga Kovačič, Cornell University

The general PBD framework is highly versatile and can be adapted to simulate a wide range of physical systems by defining appropriate constraints. In the following section we will cover a few common constraints that can be used to simulate a variety of materials, such as cloth, fluids, etc.

Cloth: Stretching and Bending

In the context of Position-Based Dynamics (PBD), the complex mechanical behaviors of cloth, such as its resistance to stretching and bending, are modeled through a system of geometric constraints. Instead of accumulating forces, the PBD framework directly manipulates the positions of the mesh vertices to satisfy these constraints in an iterative manner. This section will detail the formulation of fundamental constraints for cloth simulation.

Stretching Resistance via Distance Constraints

The primary characteristic of most textiles is their high resistance to stretching. In PBD, this property is enforced by constraining the distance between connected particles to remain close to its initial, or rest, distance. This is one of the simplest yet most crucial constraints in the PBD ecosystem (we have already seen this constraint in Example 32.4.1)

Example 33.1.1 (Stretching Constraint). Consider two particles, $i = 1, 2$ , with positions $p_{1}$ and $p_{2}$ , masses $m_{1}$ and $m_{2}$ , and a rest distance $d$ between them. The stretching constraint function $C$ is defined as the difference between the current distance and the rest distance: $C (p_{1}, p_{2}) = ∥ p_{1} - p_{2} ∥ - d (33.1.1)$ The goal is to find corrections $Δ p_{1}$ and $Δ p_{2}$ such that $C (p_{1} + Δ p_{1}, p_{2} + Δ p_{2}) = 0$ . The gradients of the constraint function with respect to the particle positions are: $\nabla_{p_{1}} C = \frac{p _{1} - p _{2}}{∥ p _{1} - p _{2} ∥} = n and \nabla_{p_{2}} C = - \frac{p _{1} - p _{2}}{∥ p _{1} - p _{2} ∥} = - n$ where $n$ is the unit vector along the axis connecting the two particles. Following the general PBD projection formula (32.4.4), the scalar Lagrange multiplier $λ$ is computed as: $λ = \frac{C ( p _{1} , p _{2} )}{w _{1} ∥ \nabla _{p_{1}} C ∥ ^{2} + w _{2} ∥ \nabla _{p_{2}} C ∥ ^{2}} = \frac{∥ p _{1} - p _{2} ∥ - d}{w _{1} + w _{2}}$ where $w_{i} = 1/ m_{i}$ is the inverse mass of particle $i$ . The position corrections are then found by moving the particles along their respective gradient directions, scaled by their inverse mass and $λ$ : $Δ p_{1} = - \frac{w _{1}}{w _{1} + w _{2}} (∥ p_{1} - p_{2} ∥ - d) n Δ p_{2} = + \frac{w _{2}}{w _{1} + w _{2}} (∥ p_{1} - p_{2} ∥ - d) n$ These corrections, when applied, will move the particles to exactly satisfy the rest length. Note that the total correction is distributed between the particles based on their inverse mass, ensuring that lighter particles move more than heavier ones and that linear momentum is conserved ( $\sum m_{i} Δ p_{i} = 0$ ).

To achieve different levels of elasticity, a stiffness parameter $k \in [0, 1]$ can be introduced by scaling the corrections $Δ p_{i}$ by $k$ . This allows for materials with varying elasticity, from perfectly rigid ( $k = 1$ ) to completely unstiff ( $k = 0$ ).

Dihedral Angle Bending Constraints

While stretching constraints maintain the structural integrity of the cloth mesh, they do not prevent it from folding unnaturally. Bending resistance, which dictates how the cloth wrinkles and drapes, is modeled by constraining the angle between adjacent triangles.

The constraint is defined for a pair of triangles $(p_{1}, p_{3}, p_{2})$ and $(p_{1}, p_{2}, p_{4})$ sharing a common edge $(p_{1}, p_{2})$ . The bending resistance is a function of the dihedral angle $ϕ$ between the two triangles, which is the initial angle between their respective normal vectors $n_{1}$ and $n_{2}$ . The constraint aims to restore this angle to its rest value, $ϕ_{0}$ .

**[Bending Constraints]** The provided illustration shows the dihedral angle for a pair of triangles in $R^{3}$ .

The constraint function is formulated as: $C_{bend} (p_{1}, p_{2}, p_{3}, p_{4}) = arccos (n_{1} \cdot n_{2}) - ϕ_{0} (33.1.2)$ where the normals are computed as: $n_{1} n_{2} = (p_{2, 1} \times p_{3, 1}) /∥ p_{2, 1} \times p_{3, 1} ∥ = (p_{2, 1} \times p_{4, 1}) /∥ p_{2, 1} \times p_{4, 1} ∥.$ The gradients of this function with respect to the four vertex positions $(p_{1}, p_{2}, p_{3}, p_{4})$ are then computed, and the standard PBD projection mechanism is used to derive the position corrections. The stiffness of bending is determined using $k_{b e n d}$ parameter.

A significant advantage of this formulation is its independence from stretching. Because the angle is defined by normalized vectors, the constraint is invariant to the lengths of the triangle edges.

Isometric Bending

For surfaces that are nearly inextensible, the isometric bending model [Bergou et al. 2006] can be used. This model provides a robust formulation based on the local Hessian of the bending energy.

This model considers a stencil for each interior edge $e_{0}$ of the mesh, consisting of the four vertices of the two triangles adjacent to that edge, labeled $p_{0}, p_{1}, p_{2}, p_{3}$ . The local bending energy for this stencil is defined as a quadratic form: $E_{bend} (p_{s}) = \frac{1}{2} i, j \in {0, 1, 2, 3} \sum Q_{ij} (p_{i}^{⊤} p_{j}) (33.1.3)$ where $p_{s} = (p_{0}, p_{1}, p_{2}, p_{3})^{T}$ is the vector of stencil positions and $Q \in R^{4 \times 4}$ is a constant matrix representing the local Hessian of the bending energy. This matrix depends only on the rest geometry of the stencil and can be precomputed. Its entries are derived from the cotangents of the angles within the two triangles.

The bending constraint is defined directly from this energy: $C_{bend} (p_{s}) = E_{bend} (p_{s})$ . Since the energy is quadratic in the positions, its gradient is linear and straightforward to compute: $\frac{\partial C _{bend}}{\partial p _{i}} = j \in {0, 1, 2, 3} \sum Q_{ij} p_{j} (33.1.4)$ This model is particularly effective for garment simulation where fabric is expected to deform isometrically (i.e., without stretching).

Collision Constraints

In any physical simulation, preventing the interpenetration of objects is paramount for achieving plausible results. Position-Based Dynamics provides a unified approach to this challenge. Collisions are not treated as a separate post-processing step involving impulses or penalty forces; instead, they are formulated as unilateral inequality constraints, just like IPC [Li et al. 2020] and integrated directly into the core PBD solver loop.

Triangle Self-Collisions

For thin-shell objects like cloth or other deformable surfaces, a primary challenge is handling self-collision, where the object folds and interacts with itself. The most common scenario is a vertex penetrating a triangle from another part of the mesh. This interaction is modeled with a unilateral constraint involving all four participating particles.

Consider a vertex with position $q$ and a triangle with vertices $p_{1}, p_{2}, p_{3}$ . A collision constraint can be formulated by defining a minimum separation distance, or thickness $h$ , from the triangle plane. The unilateral constraint function is: $C (q, p_{1}, p_{2}, p_{3}) = (q - p_{1}) \cdot n - h \geq 0 (33.2.1)$ where $n = (p_{2, 1} \times p_{3, 1}) /∥ p_{2, 1} \times p_{3, 1} ∥$ is the unit normal of the triangle. In case of $C < 0$ , all four particles ( $q, p_{1}, p_{2}, p_{3}$ ) are involved, and their positions are corrected according to their respective inverse masses to resolve the penetration while conserving momentum. It is also essential to check the barycentric coordinates of the vertex's projection onto the triangle plane to ensure the contact point lies within the triangle's boundaries before applying the correction.

Particle-Environment Collisions

The simplest collision scenario involves a dynamic particle interacting with a static, immovable piece of geometry, such as a floor plane or a convex container. This geometry acts as a boundary for the simulation domain. For a particle at position $p$ interacting with a static plane, the non-penetration condition can be formulated as a constraint on the particle's signed distance from the plane.

Let the plane be defined such that $n^{T} p - d_{rest} = 0$ for any point $p$ on it, where $n$ is the unit normal and $d_{rest}$ is an offset. A non-penetration constraint for a particle $p$ is then written as an inequality: $C (p) = n^{T} p - d_{rest} \geq 0 (33.2.2)$ When a particle violates this constraint (i.e., $C (p) < 0$ ), the PBD solver projects its position back to the surface. Since only one particle is dynamic, its inverse mass is effectively infinite compared to the static geometry, so it receives the full position correction required to satisfy $C (p) = 0$ . The correction simply moves the particle along the plane normal $n$ to resolve the penetration. You can imagine how this can be simply extended to more complex shapes.

Particle-Particle Collisions

For simulating systems composed of discrete elements, such as granular materials, we have to consider direct particle-particle collision. This constraint prevents any two particles from overlapping.

For two particles at positions $p_{i}$ and $p_{j}$ with corresponding radii $r_{i}$ and $r_{j}$ , the non-penetration constraint is a simple inequality based on their center-to-center distance: $C (p_{i}, p_{j}) = ∥ p_{i} - p_{j} ∥ - (r_{i} + r_{j}) \geq 0 (33.2.3)$ Unlike the linear plane constraint, this function is non-linear. This can be solved similarly to the stretching constraints considered in the previous section. The solver calculates a correction that pushes the two particles apart along the vector connecting their centers, distributing the correction based on their inverse masses to conserve linear momentum.

Frictional Effects at the Position Level

Friction is a dissipative contact force that opposes relative tangential motion between surfaces. To be more robust we can incorporate friction directly into the position-level constraint solve. This method is applied after an interpenetration constraint between two particles, $i$ and $j$ , has been resolved.

Let $p_{i}$ and $p_{j}$ be the particle positions at the start of the time step. Let $p *_{i}$ and $p *_{j}$ be the current candidate positions, which have already been corrected to resolve penetration. The core idea is to compute a frictional position correction that opposes the tangential component of the particles' relative displacement during the timestep.

First, we determine the relative displacement vector over the timestep, $Δ p_{rel}$ : $Δ p_{rel} = (p *_{i} - p_{i}) - (p *_{j} - p_{j})$ Next, we find the tangential component of this displacement, $Δ p_{t}$ , by projecting it onto the contact plane defined by the contact normal $n$ . $Δ p_{t} = Δ p_{rel} - (Δ p_{rel} \cdot n) n (33.2.4)$ The friction model determines the magnitude of the correction based on a comparison between the tangential displacement $∥Δ p_{t} ∥$ and the static friction threshold, which is proportional to the penetration depth $d$ and the coefficient of static friction $μ_{s}$ .

A position correction vector, $Δ p_{friction}$ , is calculated to oppose $Δ p_{t}$ . This correction is then distributed between the two particles. The correction for particle $i$ is given by: $Δ p_{i} = - \frac{w _{i}}{w _{i} + w _{j}} \times {Δ p_{t} Δ p_{t} min (\frac{μ _{k} d}{∥Δ p _{t} ∥}, 1) if ∥Δ p_{t} ∥ \leq μ_{s} d otherwise (Static Friction) (Kinetic Friction) (33.2.5)$ where $w_{i} = 1/ m_{i}$ is the inverse mass and $μ_{k}$ is the coefficient of kinetic friction. The negative sign is crucial, as friction must oppose the tangential displacement. In the static case, the correction fully cancels out the relative tangential movement. In the kinetic case, the correction is limited by the kinetic friction force (Coulomb's law).

Volume Conservation Constraints

The simulation of incompressible or nearly incompressible materials is critical for many applications in computer graphics. In a force-based framework, incompressibility typically requires solving a complex Poisson equation for pressure. PBD offers a more direct and often simpler approach by enforcing volume or density conservation through geometric constraints. This section will explore several such constraints, beginning with those defined on discrete mesh elements like tetrahedra and extending to global volume constraints for closed surfaces.

Tetrahedral Volume and Triangle Area Constraints

For volumetric objects discretized into a tetrahedral mesh, incompressibility can be approximated by constraining the volume of each individual tetrahedron to remain at its rest volume.

Definition 33.3.1 (Tetrahedral Volume Constraint). Given a tetrahedron defined by four vertices with positions $p_{1}, p_{2}, p_{3}, p_{4}$ and a rest volume $V_{0}$ , the volume constraint function $C$ is defined as: $C (p_{1}, p_{2}, p_{3}, p_{4}) = \frac{1}{6} ((p_{2} - p_{1}) \times (p_{3} - p_{1})) \cdot (p_{4} - p_{1}) - V_{0} (33.3.1)$ The constraint is satisfied when $C = 0$ . The gradients of this function with respect to the four vertex positions are: $\nabla_{p_{1}} C \nabla_{p_{2}} C \nabla_{p_{3}} C \nabla_{p_{4}} C = \frac{1}{6} ((p_{3} - p_{2}) \times (p_{4} - p_{2})) = \frac{1}{6} ((p_{4} - p_{3}) \times (p_{1} - p_{3})) = \frac{1}{6} ((p_{1} - p_{4}) \times (p_{2} - p_{4})) = \frac{1}{6} ((p_{2} - p_{1}) \times (p_{3} - p_{1}))$ These gradients—vectors proportional to the areas of the opposing faces—are used to calculate position corrections for the four vertices. An analogous constraint can be defined for the area of a triangle in 2D or 3D simulations.

Global Volume for Closed Meshes

For objects represented by a closed, watertight triangle mesh (such as an inflatable character), it is often desirable to control the total enclosed volume. This can be achieved with a single global constraint that acts on all vertices of the mesh. The total volume $V$ of a closed mesh can be calculated using the Divergence Theorem, which discretizes to a sum over all triangles. The volume contributed by each triangle $i$ with vertices $(p_{t_{1}}^{i}, p_{t_{2}}^{i}, p_{t_{3}}^{i})$ is the signed volume of the tetrahedron formed by the triangle and the origin:

$V = i \sum N_triangles \frac{1}{6} (p_{t_{1}}^{i} \times p_{t_{2}}^{i}) \cdot p_{t_{3}}^{i}$

The global volume constraint is then formulated to maintain this volume at a target value, which is typically the rest volume $V_{0}$ scaled by a pressure factor $k_{pressure}$ : $C (p_{1}, ..., p_{N}) = (i \sum \frac{1}{6} (p_{t_{1}^{i}} \times p_{t_{2}^{i}}) \cdot p_{t_{3}^{i}}) - k_{pressure} V_{0} (33.3.2)$ The gradient of this constraint with respect to a vertex position $p_{j}$ is the sum of the gradients from all triangles adjacent to that vertex. This results in a correction that redistributes the volume error across the entire surface.

Position-Based Fluids: Density and Surface Constraints

The Position-Based Dynamics framework can be elegantly extended from solids to simulate incompressible fluids. The resulting method, known as Position-Based Fluids (PBF) [Macklin & Muller 2013], replaces the complex pressure solves of traditional Smoothed Particle Hydrodynamics (SPH) [Koschier et al. 2020] with a set of geometric constraints. This approach inherits the stability and efficiency of PBD, allowing for large time steps suitable for real-time applications, while effectively enforcing the constant-density condition that characterizes incompressible flow.

The Per-Particle Density Constraint

The fundamental principle of PBF is to ensure that the density around each fluid particle remains constant. The density at a given particle's location is estimated using a kernel-based summation over its neighbors.

The density $ρ_{i}$ for a particle $i$ at position $p_{i}$ is estimated as: $ρ_{i} = j \sum m_{j} W (p_{i} - p_{j}, h) (33.4.1)$ where the sum is over all neighboring particles $j$ , $m_{j}$ is the mass of particle $j$ , $h$ is the smoothing kernel radius, and $W$ is a radially symmetric smoothing kernel function. For simplicity, we can assume all particles have equal mass and absorb it into the density calculation.

The goal is to enforce that this estimated density $ρ_{i}$ matches a user-defined rest density $ρ_{0}$ . This is formulated as a per-particle constraint function $C_{i}$ .

Definition 33.4.1 (PBF Density Constraint). For each particle $i$ , the density constraint $C_{i}$ is defined as the scaled deviation from the rest density $ρ_{0}$ : $C_{i} (p_{1}, ..., p_{N}) = \frac{ρ _{i}}{ρ _{0}} - 1 (33.4.2)$ The constraint is satisfied when $C_{i} = 0$ .

To solve this system of constraints, PBF computes a position correction $Δ p_{i}$ for each particle. Following the PBD methodology, we first compute the gradient of $C_{i}$ with respect to the position of a particle $k$ : $\nabla_{p_{k}} C_{i} = \frac{1}{ρ _{0}} j \sum m_{j} \nabla_{p_{k}} W (p_{i} - p_{j}, h) = {\frac{1}{ρ _{0}} \sum_{j} m_{j} \nabla W (p_{i} - p_{j}, h) - \frac{m _{k}}{ρ _{0}} \nabla W (p_{i} - p_{k}, h) if k = i if k \neq = i (33.4.3)$ The standard PBD solver computes a single Lagrange multiplier $λ_{i}$ for each constraint $C_{i}$ using the formula $λ_{i} = - C_{i} / (\sum_{k} ∥ \nabla_{p_{k}} C_{i} ∥^{2})$ . The position correction for a particle $Δ p_{i}$ is then derived from the influence of its own constraint and the constraints of its neighbors. Due to the symmetry of the kernel gradient ( $\nabla_{p_{i}} W (p_{i} - p_{j}, h) = - \nabla_{p_{j}} W (p_{i} - p_{j}, h)$ ), this results in a simple and efficient final update rule for the position correction of particle $i$ : $Δ p_{i} = \frac{1}{ρ _{0}} j \sum (λ_{i} + λ_{j}) \nabla W (p_{i} - p_{j}, h) (33.4.4)$ Here, a different kernel, the Spiky kernel, is typically used for its non-vanishing gradient, which prevents particle clustering.

Remark 33.4.1 (Robustness and Constraint Softening). A practical issue arises when a particle has few neighbors, as the denominator $\sum_{k} ∥ \nabla_{p_{k}} C_{i} ∥^{2}$ can approach zero, leading to large, unstable position corrections. To prevent this, a small relaxation parameter $ε$ is added to the denominator, softening the constraint. This is known as Constraint Force Mixing (CFM) [Smith & others 2005]. Then: $λ_{i} = - \frac{C _{i} ( p _{1} , ... , p _{N} )}{\sum _{k} ∥ \nabla _{p_{k}} C _{i} ∥ ^{2} + ε} (33.4.5)$

Correcting for Tensile Instability

A well-known artifact in SPH-based fluid simulations is the "tensile instability," where particles near a free surface clump together due to an inability to satisfy the rest density. PBF offers a direct solution by incorporating an artificial pressure term.

Specifically, an artificial pressure term, $s_{corr}$ , is introduced to create a short-range repulsive force between nearby particles. This term is a function of $W$ : $s_{corr} = - k (\frac{W ( p _{i} - p _{j} , h )}{W ( Δ q , h )})^{n} (33.4.6)$ where $k$ and $n$ are small positive constants (e.g., $k = 0.1, n = 4$ ), and $Δ q$ is a vector representing a small distance within the kernel radius (e.g., $∥Δ q ∥ = 0.3 h$ ). This term is only applied to push particles apart and is incorporated directly into the position correction calculation from Equation (33.4.4): $Δ p_{i} = \frac{1}{ρ _{0}} j \sum (λ_{i} + λ_{j} + s_{corr}) \nabla W (p_{i} - p_{j}, h) (33.4.7)$ This prevents clumping, and implicitly creates a surface tension-like effect at the fluid's boundary.

Velocity-Level Corrections: Viscosity and Vorticity

While the primary constraints in PBF operate on positions, velocity-level updates are still necessary to model phenomena like viscosity and to reintroduce lost energy. These are applied as a post-process after the main PBD solver loop has updated positions and velocities.

XSPH Viscosity: To impart a more coherent, liquid-like motion and reduce particle intermixing, XSPH viscosity is applied. It adjusts each particle's velocity to be closer to the average velocity of its neighbors: $v_{i}^{new} = v_{i} + c j \sum (v_{j} - v_{i}) W (p_{i} - p_{j}, h) (33.4.8)$ where $c$ is a positive viscosity coefficient, typically a small value like $0.01$ .

Vorticity Confinement: The iterative nature of PBD can introduce numerical damping, causing the fluid to lose rotational energy and appear overly viscous. Vorticity confinement can be used to counteract this by calculating the vorticity (curl of the velocity field) $ω_{i} = \nabla \times v_{i}$ at each particle and applying a corrective force $f_{vorticity} = ε (N \times ω_{i})$ to reintroduce small-scale rotational details. This force is then integrated to update particle velocities.

Continuum Mechanics-Based Constraints

While simple geometric constraints such as distance and volume are effective for many materials, a more physically rigorous and expressive simulation can be achieved by deriving constraints directly from the principles of continuum mechanics. This approach connects Position-Based Dynamics to the rich theoretical foundation of established methods like the Finite Element Method (FEM), allowing for the simulation of complex material behaviors including anisotropy, elastoplasticity, and nonlinear elasticity. The core idea is to define constraints based on the deformation gradient, a tensor that measures the local stretching and shearing of the material.

This section will first review the necessary concepts of discretized deformation and strain. It will then detail two primary methods for formulating continuum-based constraints: a holistic approach based on strain energy and a more granular method that constrains individual components of the strain tensor directly.

The Strain Energy Constraint

The most direct way to incorporate a standard hyperelastic material model into PBD is to use its strain energy potential as a constraint function.

In a discrete setting, such as a tetrahedral mesh, the deformation gradient is assumed to be constant within each element. For a tetrahedron with material-space vertices $(P_{1}, P_{2}, P_{3}, P_{4})$ and world-space vertices $(p_{1}, p_{2}, p_{3}, p_{4})$ , we can define a material-space shape matrix $D_{m}$ and a world-space shape matrix $D_{s}$ from the edge vectors: $D_{m} D_{s} = [P_{1} - P_{4}, P_{2} - P_{4}, P_{3} - P_{4}] = [p_{1} - p_{4}, p_{2} - p_{4}, p_{3} - p_{4}]$ The deformation gradient for the tetrahedron is then given by $F = D_{s} D_{m}^{- 1}$ . Note that $D_{m}^{- 1}$ is constant and can be precomputed.

From $F$ , we can derive various strain measures. One of the most common is the Green-St. Venant strain tensor, $ε$ , which is independent of rigid-body rotations: $ε = \frac{1}{2} (F^{T} F - I) (33.5.1)$ The related right Cauchy-Green deformation tensor, $S = F^{T} F = 2 ε + I$ , will be particularly useful for formulating direct strain constraints. If there is no deformation, $F = I$ , $S = I$ , and $ε = 0$ .

By Hooke’s generalized law, we can relate stress to strain:

$S = C ε$ where $C$ is the elasticity tensor of the material. Recall that the energy of a deformed solid is defined by integrating the scalar strain energy density field over the whole body $Ω$ :

$E_{s} = \int_{Ω} Ψ_{s} d P$ where the strain energy density can be defined in terms of stress and strain:
$Ψ_{s} = \frac{1}{2} ε : S = \frac{1}{2} t r (ε^{⊤} S)$

So the total strain energy for a single tetrahedral element is its rest volume $V_{0}$ multiplied by the energy density. For an undeformed state, this energy is zero. This naturally leads to the following constraint formulation: $C (p_{1}, ..., p_{4}) = E_{s} = V_{0} Ψ_{s} (F (p_{1}, ..., p_{4})) = 0 (33.5.2)$

The PBD solver requires the gradient of the constraint with respect to the positions of the vertices. This gradient has a deep physical meaning, as it is directly related to the first Piola-Kirchhoff stress tensor, $P$ , which relates forces in the current configuration to areas in the reference configuration. The stress tensor is a function of the deformation gradient, $P (F)$ . Using the chain rule, the gradients of the energy for the first three vertices can be computed in a compact matrix form: $[\frac{\partial E _{s}}{\partial p _{1}}, \frac{\partial E _{s}}{\partial p _{2}}, \frac{\partial E _{s}}{\partial p _{3}}] = V_{0} P (F) D_{m}^{- T} (33.5.3)$ The gradient for the fourth vertex is determined by the condition of translational invariance (i.e., the net force on the element must be zero): $\frac{\partial E _{s}}{\partial p _{4}} = - i = 1 \sum 3 \frac{\partial E _{s}}{\partial p _{i}} (33.5.4)$ This formulation extends naturally to surface meshes composed of triangles. For a triangle with rest area $A_{0}$ , the constraint is $C (p) = A_{0} Ψ_{s} (F_{tri})$ . The gradients are computed analogously: $[\frac{\partial E _{s}}{\partial p _{1}}, \frac{\partial E _{s}}{\partial p _{2}}] = A_{0} P (F_{tri}) D_{m}^{- T} and \frac{\partial E _{s}}{\partial p _{3}} = - i = 1 \sum 2 \frac{\partial E _{s}}{\partial p _{i}} (33.5.5)$

Remark 33.5.1 (Handling Element Inversion). Standard constitutive models are often not defined for degenerate or inverted elements (where $det (F) \leq 0$ ), which can cause instabilities in a simulation. This is a critical practical issue. This problem can be robustly addressed by augmenting the constitutive model with techniques designed for inversion handling.

The elegance of this strain energy constraint lies in its generality. It is not limited to a single material type, like the simple St. Venant-Kirchhoff model. More complex and physically accurate models, such as the Neo-Hookean model for rubber-like materials, can be readily supported by simply substituting the appropriate energy density function $Ψ_{s}$ . This allows PBD to simulate complex physical effects like lateral contraction (Poisson's effect), anisotropy, and elastoplasticity with high fidelity and efficiency.

Direct Strain-Based Constraints

For applications requiring more granular or intuitive control over material properties, particularly for simulating anisotropy, it is advantageous to define separate constraints for each mode of deformation. This approach operates on the components of the right Cauchy-Green deformation tensor, $S = F^{T} F$ .

Stretch Constraints

The diagonal entries of $S$ , $S_{ii}$ , represent the squared stretch along the material's reference axes. In an undeformed state, $S_{ii} = 1$ . To ensure the constraint projection is linear and converges in a single step, the stretch constraint is formulated based on the stretch itself, rather than its square: $C_{stretch, i} (x) = S_{ii} - 1 = 0 (33.5.6)$ By defining one such constraint for each principal axis, one can assign independent stiffness parameters, which is the key to modeling anisotropic stretch resistance.

Shear Constraints

The off-diagonal entries of $S$ , $S_{ij} = f_{i} \cdot f_{j}$ (where $f_{k}$ are the column vectors of $F$ ), measure the shear between material axes. A naive constraint $S_{ij} = 0$ problematically couples shearing and stretching. To isolate the shearing deformation, the constraint must be normalized, thus depending only on the angle between the deformed axes: $C_{shear, ij} (x) = \frac{f _{i} \cdot f _{j}}{∣ f _{i} ∣∣ f _{j} ∣} = 0 (33.5.7)$ This formulation ensures that the shear constraints enforce orthogonality between the material axes without interfering with the stretch constraints that control their length. This decoupling is critical for intuitive and independent tuning of material behavior.

Summary

We have explored the versatility of the Position-Based Dynamics Framework through various constraint types that enable simulation of diverse physical systems. Distance constraints provide cloth stretching and bending resistance, while unilateral inequality constraints handle contact and friction at the position level. Volume conservation constraints maintain incompressibility through tetrahedral and global volume preservation, and density constraints enable position-based fluid simulation with SPH-based kernels. Finally, continuum mechanics-based constraints bridge PBD with hyperelastic material models through strain energy formulations and direct strain component control. Together, these constraint types demonstrate PBD's unified approach to simulating cloth, fluids, soft bodies, and contact phenomena within a single, efficient framework.

Linear System

*Author of this lecture: Tianyi Xie, University of California, Los Angeles

In the context of simulation with optimization, solving linear systems is a fundamental step for Newton-type methods. At each iteration (see Algorithm 3.3.1), we need to compute a search direction $p$ as:

$p = - P^{- 1} \nabla E (x),$

where $P$ is typically the Hessian (or a positive-definite proxy) of the incremental potential, and $\nabla E (x)$ is its gradient at the current iterate. In practice, $P$ is often a large, sparse, symmetric positive definite (SPD) matrix. This is equivalent to solving a linear system of the form:

$A x = b,$

where $A = P$ , $x = p$ , and $b = - \nabla E (x)$ .

Efficiently solving these linear systems is crucial for the performance of the optimization time integrator. Depending on the problem size and structure, both direct and iterative solvers are commonly used. The following sections introduce the basic structure of these linear systems and discuss common solution strategies.

Direct Solver

Direct solvers are a class of algorithms designed to compute the exact solution to a linear system in a finite number of steps, up to numerical precision. In the context of optimization and simulation, the coefficient matrix $A$ is often symmetric positive definite (SPD), making direct solvers both robust and efficient for moderate-sized problems.

Forward and Backward Substitution for Triangular Matrices

Solving $A x = b$ is particularly straightforward when $A$ is triangular. If $A$ is lower triangular, i.e. $a_{11} a_{21} ⋮ a_{n 1} 0 a_{22} ⋮ a_{n 2} \dots \dots ⋱ \dots 000 a_{nn} x_{1} x_{2} ⋮ x_{n} = b_{1} b_{2} ⋮ b_{n} . (34.1.1)$ the system can be solved by forward substitution. The $i$ -th variable is solved as: $x_{i} = \frac{1}{a _{ii}} (b_{i} - j = 1 \sum i - 1 a_{ij} x_{j}), i = 1, \dots, n . (34.1.2)$ This process proceeds from $i = 1$ to $n$ , using previously computed $x_{j}$ for $j < i$ .

If $A$ is upper triangular, the system is solved by backward substitution: $x_{i} = \frac{1}{a _{ii}} (b_{i} - j = i + 1 \sum n a_{ij} x_{j}), i = n, n - 1, \dots, 1. (34.1.3)$ This process proceeds from $i = n$ down to $1$ , using already computed $x_{j}$ for $j > i$ .

Cholesky Decomposition

For general SPD matrices, we can reduce the problem to triangular systems using the Cholesky decomposition. Given $A$ SPD, we can factor $A = L L^{T}$ , where $L$ is lower triangular.

Method 34.1.1 (Cholesky Decomposition). Given a symmetric positive definite matrix $A$ , the Cholesky decomposition computes a lower triangular matrix $L$ such that $A = L L^{T}$ . The algorithm is as follows:
Algorithm 34.1.1 (The Cholesky Decomposition Algorithm).

Once the decomposition is computed, solving $A x = b$ reduces to two triangular systems:

Forward substitution: Solve $L y = b$ for $y$ .
Backward substitution: Solve $L^{T} x = y$ for $x$ .

Cholesky decomposition takes advantage of the symmetry and positive definiteness of $A$ , reducing both computational cost and memory usage compared to general-purpose methods like LU decomposition. Direct solvers are widely used when the system size is not too large, or when high accuracy is required for ill-conditioned systems. For very large or highly sparse systems, iterative solvers may be preferred, as discussed in the next section.

Basic Iterative Methods

For large or highly sparse linear systems, direct solvers may become impractical due to memory or computational constraints. In such cases, iterative methods provide an alternative by generating a sequence of approximate solutions that (ideally) converge to the true solution. This section introduces two of the most fundamental iterative methods: Jacobi and Gauss-Seidel.

Jacobi Method

The Jacobi method is a simple, perfectly parallelizable approach for solving $A x = b$ . At each iteration, every component of $x$ is updated independently using only the values from the previous iteration. Let $A = D + (L + U)$ , where $D$ is the diagonal part of $A$ , $L$ is the strictly lower triangular part, and $U$ is the strictly upper triangular part. The update rule is: $x^{(k + 1)} = D^{- 1} (b - (L + U) x^{(k)}), (34.2.1)$ or, component-wise: $x_{i}^{(k + 1)} = \frac{1}{A _{ii}} b_{i} - j \neq = i \sum A_{ij} x_{j}^{(k)}, i = 1, \dots, n .$

Remark 34.2.1 (Parallelism). All components $x_{i}$ can be updated simultaneously in the Jacobi method, making it well-suited for parallel computation.

Gauss-Seidel Method

The Gauss-Seidel method improves upon Jacobi by using the most recently updated values as soon as they are available. This means each new $x_{i}^{(k + 1)}$ is immediately used in subsequent updates within the same iteration. The update rule is: $x^{(k + 1)} = (D + L)^{- 1} (b - U x^{(k)}), (34.2.2)$ or, component-wise: $x_{i}^{(k + 1)} = \frac{1}{A _{ii}} (b_{i} - j = 1 \sum i - 1 A_{ij} x_{j}^{(k + 1)} - j = i + 1 \sum n A_{ij} x_{j}^{(k)}), i = 1, \dots, n .$ A backward Gauss-Seidel iteration can also be defined, where the updates sweep from $i = n$ down to $1$ in each iteration: $x^{(k + 1)} = (D + U)^{- 1} (b - L x^{(k)}), (34.2.3)$ or, component-wise: $x_{i}^{(k + 1)} = \frac{1}{A _{ii}} (b_{i} - j = i + 1 \sum n A_{ij} x_{j}^{(k + 1)} - j = 1 \sum i - 1 A_{ij} x_{j}^{(k)}), i = n, n - 1, \dots, 1.$

Remark 34.2.2 (Sequential Update). The Gauss-Seidel method typically converges faster than Jacobi, as it incorporates the latest information within each iteration. However, the updates are inherently sequential.

Convergence Analysis

All three methods—Jacobi, forward Gauss-Seidel, and backward Gauss-Seidel can be written in the form: $x^{(k + 1)} = G x^{(k)} + c, (34.2.4)$ where $G$ is the iteration matrix. Specifically, $G_{J A} G_{FGS} G_{BGS} = I - D^{- 1} A, = I - (D + L)^{- 1} A, = I - (D + U)^{- 1} A,$ for the Jacobi, forward Gauss-Seidel, and backward Gauss-Seidel iterations, respectively. The convergence of these methods depends on the spectral radius $ρ (G)$ (the largest absolute value of the eigenvalues of $G$ ). The method converges if and only if $ρ (G) < 1$ .

A sufficient condition for convergence of both methods is that $A$ is strictly diagonally dominant (i.e., $∣ A_{ii} ∣ > \sum_{j \neq = i} ∣ A_{ij} ∣$ for all $i$ ). For the Gauss-Seidel method, convergence is also guaranteed if $A$ is symmetric positive definite (SPD).

Termination Criteria

In practice, iterative methods are terminated when the solution is deemed sufficiently accurate rather than running indefinitely. Common termination criteria include:

Residual-based: Stop when $∥ r^{(k)} ∥ = ∥ b - A x^{(k)} ∥ < ϵ$ or $∥ r^{(k)} ∥/∥ b ∥ < ϵ$ for a prescribed tolerance $ϵ$ .
Solution change: Stop when $∥ x^{(k + 1)} - x^{(k)} ∥ < ϵ$ or $∥ x^{(k + 1)} - x^{(k)} ∥/∥ x^{(k)} ∥ < ϵ$ indicating convergence.
Maximum iterations: Stop after a predetermined number of iterations to prevent infinite loops.

The relative residual-based criterion $∥ r^{(k)} ∥/∥ b ∥ < ϵ$ is most commonly used as it directly measures how well the current solution satisfies the original linear system, considering the scale of the problem.

Conjugate Gradient Method

The Conjugate Gradient (CG) method is a powerful iterative algorithm for solving large, sparse linear systems of the form $A x = b$ , where $A$ is symmetric positive definite (SPD). Unlike general iterative methods such as Jacobi and Gauss-Seidel, CG is specifically designed for SPD matrices and has become fundamental in scientific computing and numerical simulation.

Formulation as Quadratic Optimization

The conjugate gradient method can be elegantly understood as an optimization algorithm for minimizing the quadratic function: $f (x) = \frac{1}{2} x^{T} A x - b^{T} x, (34.3.1)$ where $A$ is SPD. The unique global minimizer of $f (x)$ is precisely the solution to $A x = b$ .

The classical steepest descent method updates $x$ along the negative gradient direction $- \nabla f (x) = b - A x$ , but this approach can suffer from slow convergence, particularly when $A$ is ill-conditioned. The CG method overcomes this limitation by searching along a carefully chosen sequence of directions $p^{(k)}$ that are $A$ -conjugate (or $A$ -orthogonal), meaning $p^{(i)}^{T} A p^{(j)} = 0$ for $i \neq = j$ . This conjugacy property ensures that progress made along one direction is never undone by subsequent steps, and the minimization along each direction becomes independent of the others.

Line Search

Given a sequence of conjugate directions $p^{(0)}, p^{(1)}, p^{(2)}, \dots$ , the problem of minimizing the quadratic function reduces to finding optimal step sizes $α^{(i)}$ such that $\sum_{i = 0}^{n - 1} α^{(i)} p^{(i)}$ closely approximates the solution $x^{*}$ .

The most straightforward approach is greedy line search: starting from an initial point $x^{(0)}$ , we select a search direction $p^{(0)}$ and then minimize $f (x^{(0)} + α^{(0)} p^{(0)})$ with respect to $α^{(0)}$ . For quadratic functions, this optimization has a simple closed-form solution that avoids matrix inversion: $α^{(0)} = \frac{p ^{(0)} ^{T} ( b - A x ^{(0)} )}{p ^{(0)} ^{T} A p ^{(0)}} .$ The intuition is clear: we start at $x^{(0)}$ , choose a direction $p^{(0)}$ , and move along this direction until the objective function is minimized. While this may not reach the global minimum in one step, it guarantees progress toward the optimal solution.

This procedure is then repeated iteratively: at the new point $x^{(1)} = x^{(0)} + α^{(0)} p^{(0)}$ , we select the next direction $p^{(1)}$ , compute the corresponding step size $α^{(1)}$ , and continue.

The general iteration process can be summarized as: $α^{(i)} x^{(i + 1)} r^{(i + 1)} = \frac{p ^{(i)} ^{T} r ^{(i)}}{p ^{(i)} ^{T} A p ^{(i)}}, = x^{(i)} + α^{(i)} p^{(i)}, = r^{(i)} - α^{(i)} A p^{(i)} (34.3.2)$ where $p^{(0)}, p^{(1)}, p^{(2)}, \dots$ are the search directions and $r^{(i)} = b - A x^{(i)}$ is the residual at step $i$ . Note that the residual can be updated without recomputing $A x^{(i + 1)}$ from scratch: $r^{(i + 1)} = b - A x^{(i + 1)} = b - A (x^{(i)} + α^{(i)} p^{(i)}) = r^{(i)} - α^{(i)} A p^{(i)}$ . In practice, when $∥ r^{(i + 1)} ∥$ becomes sufficiently small (typically below a prescribed tolerance), the iteration process can be terminated early to achieve better computational efficiency.

Conjugate Directions

The choice of search directions is crucial for the method's performance. If the directions $p^{(0)}, p^{(1)}, p^{(2)}, \dots$ are poorly chosen, convergence will be slow. In particular, gradient descent (which uses the steepest descent directions) exhibits slow convergence for ill-conditioned matrices $A$ .

In contrast, if we choose the directions to be mutually $A$ -conjugate: $p^{(i)}^{T} A p^{(j)} = 0, \forall i \neq = j, (34.3.3)$ the algorithm achieves remarkable efficiency: there is no "zigzagging" behavior, and we obtain the exact solution in at most $n$ steps, where $n$ is the dimension of the system:

Without loss of generality, assume $x^{(0)} = 0$ and set $p^{(0)}$ to be the initial residual. Since $x^{(0)} = 0$ , the gradient at $x^{(0)}$ is $A x^{(0)} - b = - b$ , so we set $p^{(0)} = b$ . The remaining directions will be constructed to be $A$ -conjugate to all previous directions.

Let $r^{(k)}$ denote the residual at the $k$ -th step: $r^{(k)} = b - A x^{(k)} . (34.3.4)$ Note that $r^{(k)}$ equals the negative gradient of $f$ at $x^{(k)}$ , so standard gradient descent would move in the direction $r^{(k)}$ .

To construct conjugate directions, we require that each new search direction $p^{(k)}$ be built from the current residual $r^{(k)}$ while being $A$ -conjugate to all previous search directions. We start with the negative gradient $r^{(k)}$ and orthogonalize it w.r.t. $A$ against all previous search directions $p^{(0)}, p^{(1)}, \dots, p^{(k - 1)}$ using the Gram–Schmidt process: $p^{(k)} = r^{(k)} - i = 0 \sum k - 1 \frac{p ^{(i)} ^{T} A r ^{(k)}}{p ^{(i)} ^{T} A p ^{(i)}} p^{(i)} . (34.3.5)$

Algorithmic Simplification

The expressions above can be significantly simplified, leading to the elegant final form of the conjugate gradient algorithm. The key insight lies in proving two fundamental orthogonality relationships.

Orthogonality Properties. We first establish that the following orthogonality relations hold: $p^{(i)}^{T} r^{(j)} = 0, \forall i < j, (34.3.6)$ $r^{(i)}^{T} r^{(j)} = 0, \forall i \neq = j . (34.3.7)$

Proof Sketch. From the residual update formula (Equation (34.3.4)), we can write recursively: $r_{j} = r^{(j - 1)} - α^{(j - 1)} A p^{(j - 1)} = r^{(i)} - k = i \sum j - 1 α^{(k)} A p^{(k)} .$ Taking the dot product with $p^{(i)}$ on both sides: $p^{(i)}^{T} r^{(j)} = p^{(i)}^{T} r^{(i)} - k = i \sum j - 1 α_{k} p_{i}^{T} A p_{k} .$ By the $A$ -conjugacy property (Equation (34.3.3)) and the step size formula $α^{(i)} = \frac{p ^{(i)} ^{T} r ^{(i)}}{p ^{(i)} ^{T} A p ^{(i)}}$ , we can verify that $p^{(i)}^{T} r^{(j)} = 0$ for $i < j$ .

The second orthogonality relation $r^{(i)}^{T} r^{(j)} = 0$ follows from the fact that each residual $r^{(j)}$ lies in the span of the search directions ${p^{(0)}, p^{(1)}, \dots, p^{(j - 1)}}$ by construction, and these directions are built from previous residuals.

Simplified Formulas. Using these orthogonality properties, the conjugate direction formula (Equation (34.3.5)) simplifies dramatically. Since $p^{(i)}^{T} r^{(k)} = 0$ for $i < k$ , we have: $p^{(k)}^{T} r^{(k)} = r^{(k)}^{T} r^{(k)} .$

This allows us to simplify the step size calculation: $α^{(k)} = \frac{r ^{(k)} ^{T} r ^{(k)}}{p ^{(k)} ^{T} A p ^{(k)}} . (34.3.8)$

For the conjugate direction update, we can show that only the most recent direction matters. Using the residual update formula and the orthogonality properties: $p^{(k + 1)} = r^{(k + 1)} - i = 0 \sum k \frac{p ^{(i)} ^{T} A r ^{(k + 1)}}{p ^{(i)} ^{T} A p ^{(i)}} p^{(i)} = r^{(k + 1)} - i = 0 \sum k \frac{r ^{(k + 1)} ^{T} A p ^{(i)}}{p ^{(i)} ^{T} A p ^{(i)}} p^{(i)} = r^{(k + 1)} - i = 0 \sum k \frac{r ^{(k + 1)} ^{T} ( r ^{(i)} - r ^{(i + 1)} )}{α ^{(i)} p ^{(i)} ^{T} A p ^{(i)}} p^{(i)} .$

Applying the orthogonality relations (Equations (34.3.7) and (34.3.8)), this simplifies to: $p^{(k + 1)} = r^{(k + 1)} + i = 0 \sum k \frac{r ^{(k + 1)} ^{T} r ^{(i + 1)}}{α ^{(i)} p ^{(i)} ^{T} A p ^{(i)}} p^{(i)} = r^{(k + 1)} + i = 0 \sum k \frac{r ^{(k + 1)} ^{T} r ^{(i + 1)}}{r ^{(i)} ^{T} r ^{(i)}} p^{(i)} = r^{(k + 1)} + \frac{r ^{(k + 1)} ^{T} r ^{(k + 1)}}{r ^{(k)} ^{T} r ^{(k)}} p^{(k)} .$

This remarkable simplification shows that we only need to retain the most recent search direction $p^{(k)}$ to compute the next one $p^{(k + 1)}$ .

Final Algorithm. The simplified CG algorithm is summarized in Algorithm 34.3.1:

Algorithm 34.3.1 (The Conjugate Gradient Method).

Preconditioning

The convergence rate of the conjugate gradient method is fundamentally determined by the condition number of matrix $A$ . When $A$ is ill-conditioned (i.e., has a large condition number), CG may converge slowly, requiring many iterations to reach acceptable accuracy.

Preconditioning is a crucial technique for accelerating convergence by transforming the original system into an equivalent one with more favorable spectral properties. The idea is to solve a modified system: $M^{- 1} A x = M^{- 1} b,$ where $M$ is a carefully chosen preconditioner matrix that approximates $A$ but is much easier to invert.

A simple yet effective choice is the diagonal (Jacobi) preconditioner, where $M = diag (A)$ contains only the diagonal entries of $A$ . The preconditioned CG algorithm then solves the transformed system while maintaining the essential structure of the original algorithm.

In practice, preconditioning amounts to scaling the residuals at each iteration by $M^{- 1}$ , which is computationally inexpensive when $M$ is diagonal. The preconditioned CG algorithm follows the same iterative structure as standard CG, but operates on the preconditioned residuals, often achieving significantly faster convergence for ill-conditioned problems.

Summary

In this section, we explored fundamental approaches for solving large, sparse linear systems of the form $A x = b$ that arise in optimization-based simulation, particularly when computing search directions in Newton-type methods.

We established the context: at each iteration of projected Newton methods, we solve $Pp = - \nabla E (x)$ where $P$ is typically a symmetric positive definite (SPD) proxy matrix. This reduces to solving linear systems crucial for optimization time integrators.

Direct solvers provide exact solutions through matrix factorization. For SPD systems, Cholesky decomposition $A = L L^{T}$ reduces the problem to two triangular solves. While robust and accurate, direct methods become impractical for very large systems due to computational constraints.

Iterative methods offer alternatives for large, sparse systems. Basic methods include Jacobi (naturally parallelizable) and Gauss-Seidel (faster convergence but sequential). Both converge when the spectral radius of their iteration matrices is less than 1. The Conjugate Gradient (CG) method is a more sophisticated approach for SPD systems. CG constructs $A$ -conjugate search directions ensuring progress is never undone, achieving remarkable efficiency through orthogonality relationships. In exact arithmetic, CG converges in at most $n$ steps, while in practice it often converges much faster with preconditioning.

The choice between direct and iterative methods depends on problem size, sparsity structure, and accuracy requirements. Direct methods excel for moderate-sized problems requiring high precision, while iterative methods, particularly CG with preconditioning, are essential for large-scale simulations.

N/A