Robot Masters Tennis Using Imperfect Data
A new AI system called LATENT enables a humanoid robot to play tennis by learning from imperfect, amateur motion data. This breakthrough bypasses the need for expensive, professional-grade recordings and could speed up robot training for various physical tasks.
Researchers have developed a new AI system that allows a humanoid robot to play tennis, a significant step forward in robotics. The work is a collaboration among Chinese institutions including Tsinghua University and Peking University, along with Galbot and Shanghai AI Laboratory. It tackles a major challenge: teaching robots to perform complex athletic movements in real time. The system, named LATENT, learns from messy, incomplete human motion data, bypassing the need for perfect, professional-level recordings.
The chosen sport, tennis, presents a nightmarish scenario for robots. The ball can travel at over 60 mph, requiring the robot to track its movement, position its body, swing a racket, and make contact within milliseconds. Humans spend years mastering these skills, and even then, perfection is rare. Teaching a robot these abilities has been a long-standing goal in the field.
The Unitree G1 Robot and Its Challenge
The robot used in this project is the Unitree G1, a 4’2″ humanoid about the size of a young child. It has 29 independent joints, allowing for a wide range of motion. For this experiment, its right hand was modified with a 3D-printed grip to hold a tennis racket. While the Unitree G1 had previously shown capabilities in table tennis, the demands of full-court tennis are vastly different: the robot must sprint, cover the full court, and handle much faster balls.
LATENT: Learning from the Imperfect
LATENT, which stands for Learning Athletic Humanoid Tennis Skills from Imperfect Human Motion Data, uses a novel approach. Traditionally, teaching robots complex tasks involves feeding them vast amounts of clean data from experts. However, capturing precise, full-body motion data from professional tennis players during actual matches is extremely difficult. Furthermore, a robot’s physical structure differs from a human’s, so directly copying human movements does not work.
Instead of relying on professional data, the researchers collected just five hours of motion data from five amateur tennis players. These players performed basic movements like forehands, backhands, and shuffles within a small 3 m x 5 m motion capture area. This area is roughly 17 times smaller than a standard doubles tennis court. The collected data consisted of incomplete motion fragments from isolated movements, not full tennis rallies.
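The "17 times smaller" figure can be checked with a quick calculation. The sketch below assumes the comparison is against the full doubles court (23.77 m x 10.97 m); the article does not say which court size was meant, so that choice is an assumption.

```python
# Standard doubles court dimensions in metres (assumed reference for the comparison).
COURT_LENGTH_M = 23.77
COURT_WIDTH_M = 10.97

# Motion capture area reported in the article.
CAPTURE_LENGTH_M = 5.0
CAPTURE_WIDTH_M = 3.0


def area_ratio() -> float:
    """Return how many capture areas fit into one doubles court."""
    court_area = COURT_LENGTH_M * COURT_WIDTH_M
    capture_area = CAPTURE_LENGTH_M * CAPTURE_WIDTH_M
    return court_area / capture_area


print(f"Court is about {area_ratio():.1f}x the capture area")
```

Against the narrower singles court (8.23 m wide) the ratio would be closer to 13, so the article's figure matches the doubles dimensions.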
How LATENT Works
The LATENT system employs a three-layer architecture to process this imperfect data.
- Motion Tracker: This layer translates the raw human movements into a format the robot can physically perform, accounting for the robot’s different proportions and joint limits. It’s not about copying but about functional translation.
- Latent Action Space: This is the core innovation. Instead of learning every tiny muscle movement, the robot learns a compressed representation of actions. It grasps the essence of a forehand, for example, and then fills in the specific details itself. This allows for adaptation and the generation of movements not explicitly seen in the training data.
- High-Level Policy: This acts as the robot’s brain. It tracks the incoming ball, predicts its trajectory, decides on the type of shot (forehand or backhand), and coordinates the robot’s entire body to execute the shot.
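To make the division of labor concrete, here is a toy sketch of how the three layers could fit together. It is not the researchers' implementation: the function names, the linear latent decoder, the drag-free ball model without bounces, and the left/right shot rule are all illustrative assumptions standing in for learned components.

```python
from dataclasses import dataclass

GRAVITY = 9.81  # m/s^2


@dataclass
class Ball:
    x: float   # distance in front of the robot (m)
    y: float   # lateral offset, positive = robot's right (m)
    z: float   # height above the court (m)
    vx: float  # negative when the ball flies toward the robot (m/s)
    vy: float
    vz: float


def retarget(human_angles, joint_limits):
    """Motion tracker stand-in: clip human joint angles to the robot's limits."""
    return [min(max(a, lo), hi) for a, (lo, hi) in zip(human_angles, joint_limits)]


def decode_latent(latent, basis):
    """Latent action space stand-in: expand a compact code into joint targets.

    A real system would use a learned decoder; a linear mix is assumed here.
    """
    n_joints = len(basis[0])
    return [sum(w * row[j] for w, row in zip(latent, basis)) for j in range(n_joints)]


def predict_at_strike_plane(ball, strike_x=0.5):
    """Drag-free ballistic prediction of where the ball crosses x = strike_x."""
    t = (strike_x - ball.x) / ball.vx
    y = ball.y + ball.vy * t
    z = ball.z + ball.vz * t - 0.5 * GRAVITY * t * t
    return t, y, z


def high_level_policy(ball):
    """Pick a shot type and a latent code from the predicted ball position."""
    t, y, _z = predict_at_strike_plane(ball)
    # The racket is in the right hand, so balls arriving on the right are forehands.
    shot = "forehand" if y >= 0.0 else "backhand"
    latent = [1.0, 0.0] if shot == "forehand" else [0.0, 1.0]
    return shot, latent, t


# Hypothetical two-skill basis over three joints, plus per-joint limits (radians).
basis = [[0.8, -0.3, 1.9], [-0.6, 0.4, 0.2]]
limits = [(-1.0, 1.0), (-1.0, 1.0), (-1.5, 1.5)]

ball = Ball(x=10.5, y=0.2, z=1.0, vx=-20.0, vy=0.5, vz=2.0)
shot, latent, time_to_contact = high_level_policy(ball)
joint_targets = retarget(decode_latent(latent, basis), limits)
print(shot, round(time_to_contact, 2), joint_targets)
```

The point of the structure is that the high-level policy only has to choose a small latent vector; the lower layers turn that choice into motions the robot can physically execute.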
Bridging the Simulation-to-Reality Gap
A critical challenge in robotics is the “sim-to-real” gap: policies that work perfectly in simulation often fail in the real world due to unpredictable factors like uneven surfaces, ball variations, or tiny mechanical imperfections. To combat this, the researchers trained the LATENT system extensively in a simulated environment.
They deliberately introduced randomization into the simulation, making the physics imperfect and adding noise to observations. This included altering friction, ball behavior, and the robot’s mass distribution. By training in a deliberately chaotic virtual world, the robot becomes more adaptable when faced with the slightly imperfect reality of a real tennis court. The idea is that if the robot can handle simulated chaos, real-world messiness will seem manageable.
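This kind of domain randomization can be sketched in a few lines. The parameter names and numeric ranges below are illustrative assumptions, not values reported by the researchers; the sketch only shows the pattern of resampling physics per training episode and corrupting observations with noise.

```python
import random
from dataclasses import dataclass


@dataclass
class SimParams:
    friction: float          # ground friction coefficient
    ball_restitution: float  # how bouncy the ball is
    mass_scale: float        # multiplier on the robot's nominal link masses
    obs_noise_std: float     # std dev of Gaussian noise added to observations


def sample_sim_params(rng: random.Random) -> SimParams:
    """Draw one randomized physics configuration (ranges are hypothetical)."""
    return SimParams(
        friction=rng.uniform(0.5, 1.2),
        ball_restitution=rng.uniform(0.65, 0.85),
        mass_scale=rng.uniform(0.9, 1.1),
        obs_noise_std=rng.uniform(0.0, 0.02),
    )


def noisy_observation(true_obs, params: SimParams, rng: random.Random):
    """Add zero-mean Gaussian noise to each observed value."""
    return [v + rng.gauss(0.0, params.obs_noise_std) for v in true_obs]


rng = random.Random(0)
for episode in range(3):
    params = sample_sim_params(rng)  # new physics every episode
    obs = noisy_observation([1.0, 0.5, -0.2], params, rng)
    print(episode, params, obs)
```

Because the policy never sees the same physics twice, it cannot overfit to one idealized simulator and has to learn behavior that tolerates variation.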
Impressive Real-World Results
The results in the real world are striking. The Unitree G1 robot achieved a 91% success rate for forehands and a 78% success rate for backhands. More impressively, it can engage in multi-shot rallies with human players, demonstrating sustained play. The 35 kg robot, a little over 4 feet tall, can sprint at over 6 meters per second (faster than an average jog) and track balls traveling at 15-30 meters per second, making contact within a few milliseconds.
In simulation, the numbers were even higher: a 97% forehand success rate and an 82% backhand success rate. The researchers found that other common AI learning methods failed to achieve comparable results, either being unable to sustain rallies or having drastically lower success rates. This performance was achieved with minimal, imperfect amateur data.
Why This Matters
The true significance of LATENT lies in its ability to learn from messy, imperfect data. This addresses a major bottleneck in robotics: the difficulty and cost of collecting high-quality data. The researchers state that tennis was a proof of concept. The core achievement is demonstrating that robots can learn complex physical tasks without needing perfect, expert-level training data.
This approach could dramatically accelerate the development and deployment of humanoid robots in various fields. Companies like Figure and Tesla, working on robots for factories and warehouses, face the challenge of teaching robots to perform tasks without extensive manual programming. LATENT suggests a solution: capture a few hours of humans performing the task, even imperfectly, and let the AI translate it into robot movements. This makes training robots faster, cheaper, and more accessible.
Future Directions
The research team has identified several areas for future improvement. Currently, the robot relies on external motion capture systems to track the ball. The next step is to equip the robot with onboard cameras and active vision, making it fully self-contained and deployable in real-world environments like construction sites or warehouses without external tracking equipment.
They also aim to explore multi-agent scenarios, such as robots playing doubles or against each other, which would require advanced coordination and strategy. Furthermore, because the LATENT architecture is not specific to tennis, the researchers are testing whether it can generalize to other physical skills like soccer, parkour, dancing, or martial arts.
Source: China’s Tennis Robot Reveals the Next Step for Humanoids (YouTube)





