Unlocking Robotic Grasping: The Fusion of Perception, Mathematics and Control

Imagine reaching out to an object in your field of view, extending your arm toward it, and closing your fingers around it. This task, effortless for us, is far more complicated for robots.

Let's dive into the fascinating world of robotic grasping, where mathematics meets mechanics to mimic human manipulation.

[Figure: 6 DoF pose estimation using NVIDIA's Deep Object Pose Estimation neural network]

The Perception Challenge:
At its core, robotic grasping begins with perception. Just as our eyes and brain work together to identify objects and their positions, robots use advanced computer vision algorithms to detect the 3D pose of objects. These algorithms act as the robot's "eyes," providing crucial information about what to grasp and where.

Teaching a robot to grasp an object starts with a perception model, i.e. a computer vision algorithm that detects the object's 6 DoF (degrees of freedom) pose: its 3D position and 3D orientation in space. This is often achieved using a sophisticated neural network such as NVIDIA's Deep Object Pose Estimation (DOPE). Once the object's pose is determined, inverse kinematics is used to move the end effector (the gripper; think of it as the robot's fingers) to the appropriate position.
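To make this handoff concrete, here is a minimal sketch of turning a camera-frame pose estimate into a grasp target in the robot's base frame, assuming the estimator reports a translation plus an (x, y, z, w) quaternion and that the camera-to-base transform is known from calibration. The numeric values and the `pose_to_matrix` helper are illustrative, not part of DOPE's API.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def pose_to_matrix(translation, quaternion):
    """Build a 4x4 homogeneous transform from a translation
    and an (x, y, z, w) quaternion."""
    T = np.eye(4)
    T[:3, :3] = R.from_quat(quaternion).as_matrix()
    T[:3, 3] = translation
    return T

# Object pose in the camera frame, as a 6 DoF estimator would report it
# (placeholder values).
T_cam_obj = pose_to_matrix([0.10, -0.05, 0.60], [0.0, 0.0, 0.0, 1.0])

# Camera pose in the robot base frame, known from hand-eye calibration
# (placeholder values).
T_base_cam = pose_to_matrix([0.30, 0.00, 0.50], [0.0, 0.7071, 0.0, 0.7071])

# Chain the transforms: the object's pose expressed in the base frame,
# which is what IK needs as its target.
T_base_obj = T_base_cam @ T_cam_obj
print("Grasp target in the base frame:", T_base_obj[:3, 3])
```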

The Mathematics of Movement:

Once a robot "sees" an object, it needs to figure out how to move its arm to reach it. This is where the magic of Inverse Kinematics (IK) comes into play. IK is the mathematical wizardry that transforms a desired end-effector position (where the robot's "hand" should be) into precise joint angles for each part of the robotic arm.

IK calculates the joint angles required to position a robotic arm's end-effector at a desired pose, which is initially estimated by a 3D object/grasp-point detector and then transformed into the robot's base frame. The Jacobian matrix $J$ (a matrix of partial derivatives) relates joint velocities $\dot{\theta}$ (the rates of change of the arm's joint angles) to end-effector velocities $\dot{x}$ (the rate of change of the end-effector's pose):

$\dot{x} = J\dot{\theta}$

For IK, we estimate the small changes in joint angles $\Delta\theta$ needed to achieve a small change in end-effector pose $\Delta x$. Over a sufficiently small time step, velocities become small displacements, so substituting $\Delta x$ for $\dot{x}$ and $\Delta\theta$ for $\dot{\theta}$ in the equation above gives $\Delta x \approx J\Delta\theta$.


[Figure: A 2-jointed robotic arm with parameters $\theta_{1}$ and $\theta_{2}$ that must be estimated using IK, given a target location $(x, y)$ for the end-effector. For a real-world arm, this is lifted to 3D.]
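To make the linearization concrete, here is a minimal numerical check of $\Delta x \approx J\Delta\theta$ for the two-jointed planar arm above; the link lengths, angles, and perturbation are illustrative values.

```python
import numpy as np

L1, L2 = 1.0, 0.8  # link lengths (assumed values)

def fk(theta):
    """Forward kinematics: end-effector (x, y) for angles (theta1, theta2)."""
    t1, t12 = theta[0], theta[0] + theta[1]
    return np.array([L1 * np.cos(t1) + L2 * np.cos(t12),
                     L1 * np.sin(t1) + L2 * np.sin(t12)])

def jacobian(theta):
    """2x2 Jacobian of fk: partial derivatives of (x, y) w.r.t. the angles."""
    t1, t12 = theta[0], theta[0] + theta[1]
    return np.array([[-L1 * np.sin(t1) - L2 * np.sin(t12), -L2 * np.sin(t12)],
                     [ L1 * np.cos(t1) + L2 * np.cos(t12),  L2 * np.cos(t12)]])

theta = np.array([0.4, 0.6])
dtheta = np.array([0.01, -0.02])          # small joint perturbation
dx_true = fk(theta + dtheta) - fk(theta)  # actual change in pose
dx_lin = jacobian(theta) @ dtheta         # linear prediction J * dtheta
print(dx_true, dx_lin)                    # the two agree to first order
```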

This relationship is inverted for IK as $\Delta\theta \approx J^{-1}\Delta x$, where $J^{-1}$ is the inverse of $J$ (or the Moore-Penrose pseudoinverse $J^{+}$ when $J$ is not square or not invertible). Iterative IK algorithms apply this approximation by recalculating the Jacobian at each step and adjusting the joint angles until the end-effector reaches the target pose; because each step is only a local linear approximation, many small steps are taken to converge despite the non-linearity of the kinematics.

The Dance of Control:
After calculating the perfect pose, how does the robot ensure its arm moves accurately? Enter the unsung hero of robotics: PID control. This feedback mechanism continuously adjusts the robot's movements, much like how we unconsciously adjust our hands when reaching for an object.

The desired joint angles $\theta_{desired}$ calculated using IK are passed to a PID controller, which ensures accurate tracking by minimizing the error $e = \theta_{desired} - \theta_{current}$.

The PID controller adjusts the motor command using $u(t) = K_p e(t) + K_i \int_0^t e(\tau) \, d\tau + K_d \frac{de(t)}{dt}$, where $u(t)$ is the control output (the motor command), $e(t)$ is the error at time $t$, and $K_{p}$, $K_{i}$, and $K_{d}$ are the proportional, integral, and derivative gains respectively.
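Here is a minimal discrete-time sketch of this control law; the gains, time step, and the toy one-joint plant standing in for the motor are illustrative, not tuned values.

```python
class PID:
    """Discrete PID: accumulate the integral and difference the error
    at each control tick of duration dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, desired, current):
        error = desired - current                         # e(t)
        self.integral += error * self.dt                  # approximates the integral of e
        derivative = (error - self.prev_error) / self.dt  # approximates de/dt
        self.prev_error = error
        # u(t) = Kp*e + Ki*integral(e) + Kd*de/dt
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)

# One joint tracking its IK-commanded angle, modeled as a toy plant
# whose angular velocity equals the control output.
pid = PID(kp=2.0, ki=0.5, kd=0.1, dt=0.01)
theta, theta_desired = 0.0, 1.0
for _ in range(2000):
    theta += pid.update(theta_desired, theta) * pid.dt
print("final angle:", theta)  # converges toward theta_desired
```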

Implementation:

In practice, there are as many actuators as there are degrees of freedom in the arm. Microcontrollers (like an Arduino or an ESP32) interface with each actuator, running the control loops that hold each joint at its commanded angle. Meanwhile, the object-detection neural network and the inverse-kinematics solver typically run on a more capable processor like an NVIDIA Jetson. This seamless integration of perception, kinematics, and control allows robots to grasp objects with increasing dexterity and precision, bringing us closer to a world where machines can interact with their environment as effortlessly as we do.
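As a sketch of that handoff, the host computer might stream the IK solution's joint targets to the microcontroller over a serial link. The port name, baud rate, and the newline-delimited CSV protocol below are assumptions for illustration, not a standard interface.

```python
import serial  # pyserial

def send_joint_targets(port, angles_deg):
    """Write one comma-separated line of joint targets, e.g. '90.0,45.0,...';
    firmware on the microcontroller would parse it and update its PID setpoints."""
    with serial.Serial(port, baudrate=115200, timeout=1) as link:
        line = ",".join(f"{a:.1f}" for a in angles_deg) + "\n"
        link.write(line.encode("ascii"))

# Six targets for a 6-axis arm, e.g. converted from the IK output.
send_joint_targets("/dev/ttyUSB0", [90.0, 45.0, 30.0, 0.0, 60.0, 90.0])
```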

[Figure: Arduino-powered 6-axis arm from Elephant Robotics]




