Skip to main content

2 posts tagged with "math"

View All Tags

· 7 min read
Yuzhe Qin

Physical simulations are a crucial tool in many fields, from game development and computer graphics to robotics and prototype modeling. One fundamental aspect of these simulations is the concept of rotation. Be it planets whirling around a star in a space simulation, joints operating in a humanoid robot, or an animated character performing a thrilling parkour backflip, rotations are indeed everywhere. This blog post seeks to unravel the complexities of 3D rotations and acquaint you with the diverse rotation representations used in physical simulations.

Challenges of 3D Rotations

3D rotations are crucial for modeling the orientation of objects in space. They enable us to visualize and manipulate 3D models mathematically. However, handling rotations in 3D space can be quite tricky. Many bugs in simulations can be traced back to mismanaged rotations. The complexities arise from the nature of the 3D rotation itself – it isn't commutative (the sequence of rotations is crucial) and interpolation isn't straightforward ( calculating a rotation halfway between two given rotations is complex). Additionally, 3D rotations form a group structure known as the Special Orthogonal Group, SO(3), which isn't a typical Euclidean space where we can perform standard linear operations.

Rotation Representations

1. Rotation Matrices

Rotation matrices are 3x3 matrices that signify a rotation around the origin in 3D space. They provide an intuitive approach to understanding rotation, with each column (or row, depending on convention) of the matrix representing the new directions of the original axes after the rotation.

However, rotation matrices come with their set of limitations. The degree of freedom for rotation in an n-dimensional space is n(n-1)/2. Thus, the 3D rotation resides in a 3-dimensional space (while 2D rotation resides in a 1-dimensional space). This means that 3D rotation matrices consume more memory (9 floating point numbers) than necessary, and maintaining the orthogonality and normalization of the rotation matrix during numerical operations can be computationally burdensome. In practical applications, the majority of libraries, including the simulators we've discussed, employ quaternions as their core representation for rotations.

2. Quaternions

Quaternions are a type of mathematical object that extend complex numbers. They consist of one real component and three imaginary components, often denoted as w+xi+yj+zk. Quaternions have emerged as an extremely effective method of representing rotations in 3D space for computation.

Different from rotation matrix, they merely require four floating point numbers, can be easily interpolated using techniques like Spherical Linear Interpolation (SLERP), and they bypass the gimbal lock problem. However, they are not as intuitive as the other methods, and comprehending how they work necessitates some mathematical background. Also, quaternions have a double covering problem. This means that each 3D rotation can be represented by two different quaternions: one and its negation. In other words, a quaternion q and its negative -q will represent the same 3D rotation.

3. Euler Angles

Euler angles represent a rotation as three angular rotations around the axes of a coordinate system. The axes can be in any order (XYZ, ZYX, etc.), and this order makes a difference, leading to what is known as the "gimbal lock" problem.

Gimbal lock occurs when the axes of rotation align, causing a loss of one degree of freedom. This can lead to unexpected behavior in simulations. And same 3D rotation can be mapped into multiple Euler angles. Euler angles also have issues with interpolation, as interpolating between two sets of Euler angles will not produce a smooth rotation.

4. Axis-Angle Representation

The Axis-Angle representation is another way to understand 3D rotations. In this representation, a 3D rotation is characterized by a single rotation about a specific axis. The amount of rotation is given by the angle, and the direction of rotation is specified by the unit vector along this axis.

This representation is simple and intuitive, but it's not easy to concatenate multiple rotations. Also, like Euler angles, it has a gimbal lock problem when the rotation angle reaches 180 degrees. However, it's very useful in some scenarios such as generating a random rotation, or rotating an object around a specific axis.

Conversion Between Representations

Now, let's discuss the conversion between a rotation matrix and other common rotation representations: Euler angles, quaternions, and the axis-angle representation.

1. Rotation Matrix to Euler Angles

The process of extracting Euler angles from a rotation matrix depends on the Euler angles convention. For the XYZ convention (roll, pitch, yaw), the extraction is:

roll = atan2(R[2, 1], R[2, 2])
pitch = atan2(-R[2, 0], sqrt(R[0, 0]^2 + R[1, 0]^2))
yaw = atan2(R[1, 0], R[0, 0])

where R[i, j] denotes the element at the i-th row and the j-th column of the rotation matrix R.

2. Rotation Matrix to Quaternions

The conversion from a rotation matrix R to a quaternion q = (w, x, y, z) can be computed as:

w = sqrt(1 + R[0, 0] + R[1, 1] + R[2, 2]) / 2
x = (R[2, 1] - R[1, 2]) / (4 * w)
y = (R[0, 2] - R[2, 0]) / (4 * w)
z = (R[1, 0] - R[0, 1]) / (4 * w)

3. Rotation Matrix to Axis-Angle

For converting a rotation matrix to axis-angle representation, the axis a = (a_x, a_y, a_z) and the angle θ can be calculated as:

θ = acos((trace(R) - 1) / 2)
a_x = (R[2, 1] - R[1, 2]) / (2 * sin(θ))
a_y = (R[0, 2] - R[2, 0]) / (2 * sin(θ))
a_z = (R[1, 0] - R[0, 1]) / (2 * sin(θ))

where trace(R) is the sum of the elements on the main diagonal of R.

Common Issues and Bugs

Different Simulator, Different Rotation Convention

Both Euler Angles and Quaternions adhere to multiple conventions. Various software libraries utilize different conventions, which can potentially lead to errors when these libraries are used in tandem, a situation that occurs quite frequently.

For instance, some libraries represent Quaternion as (w, x, y, z), positioning the real part as the first element, while others represent it as (x, y, z, w). The following table illustrates the convention adopted by some widely used software and simulators.

Quaternion ConventionSimulator/Library
wxyzMuJoCo, SAPIEN, CoppeliaSim, IsaacSim, Gazebo, Blender, Taichi, Transforms3d, Eigen, PyTorch3D, USD
xyzwIsaacGym, ROS 1&2, IsaacSim Dynamic Control Extension, PhysX, SciPy, Unity, PyBullet

Besides the convention of quaternion. It's essential to recognize that several popular game engines, including Unity and Unreal Engine 4, operate within a left-handed coordinate framework. Within this system, the positive x-axis extends to the right, the positive y-axis ascends upwards, and the positive z-axis stretches forward. These game engines are not only pivotal in game development but also serve as simulators in various research domains.

Conversely, the majority of simulation engines adopt a right-handed coordinate system. The distinction between left-handed and right-handed coordinate systems is a critical aspect to consider during development.

When integrating different libraries and tools, this variation in coordinate system conventions can lead to discrepancies in spatial calculations and visual representations. As such, maintaining consistency across these systems is key to ensuring accurate and reliable outcomes in your projects.

Conclusion

Understanding 3D rotation representations and their conversion plays a pivotal role in creating sophisticated and realistic physical simulations. While this tutorial provides a comprehensive overview of the primary rotation representations, it's up to developers to determine which representation best suits their specific use-cases and computational constraints.

· 3 min read
Yang You

In this blog, we delve into the mechanics of differentiable simulators and explore why they are sometimes a more advantageous choice compared to reinforcement learning (RL) methods.

What is a Differentiable Simulator?

Imagine a robot in an environment where, in each state sSs \in \mathcal{S}, the agent can execute an action aAa \in \mathcal{A} leading to a subsequent state sSs' \in \mathcal{S}. We can describe this transition with the function f:S×ASf: \mathcal{S} \times \mathcal{A} \to \mathcal{S}. In conventional non-differentiable simulators, this function ff is often treated as a black box, with observed rewards rRr \in \mathcal{R} serving as the primary signal for RL-based learning.

Contrastingly, in differentiable simulators, the function ff is perceived as an end-to-end differentiable operator. This implies that if we define some loss related to the output state l(s)l(s'), it becomes feasible to compute the gradient in relation to the input state and actions, such as l(s)s\frac{\partial l(s')}{\partial s} and l(s)a\frac{\partial l(s')}{\partial a}. This capability enables the optimization of an entire sequence of actions using the chain rule.

Why are Differentiable Simulators Effective for Policy Learning?

As Yann LeCun insightfully noted at NeuIPS 2016, "If intelligence is a cake, the bulk of the cake is unsupervised learning, the icing on the cake is supervised learning, and the cherry on the cake is reinforcement learning" (source). The key takeaway here is the direct supervisory gradient flow in relation to the actions we aim to optimize, in contrast to the indirect reward-based approach of the REINFORCE algorithm in RL.

For instance, consider a task where the goal is to maneuver an object to a target position using a pusher. In an RL scenario, this would require extensive exploration, potentially involving thousands of trials before significant progress is made. Conversely, in a differentiable simulator, each iteration of gradient descent inherently contributes to progress toward the goal.

Miscellaneous Considerations

While differentiable simulators present certain advantages over RL, they are not without their limitations and challenges, such as:

  • Invalid Gradients in Certain Scenarios: Consider a scenario involving a rigid rolling pin used to flatten dough. If the initial sequence of actions fails to make contact with the dough, the resulting gradient will be zero throughout. To address this, some studies, like PlasticineLab, suggest incorporating a contact loss based on proximity to the target object. Others propose 'softening' rigid tools by increasing their influence radius, allowing objects to be affected without direct contact.

  • Limited Efficiency in Long-Horizon Tasks: As discussed in this paper, the dependency of differentiable physics on local gradients poses significant challenges. The loss landscape in these scenarios is often complex and riddled with potentially misleading local optima, which can diminish the reliability of this method for certain tasks.