A medical student in Nigeria named Zeus straps an iPhone to his forehead every evening and records himself folding laundry for $15 an hour. He's one of thousands of contract workers across 50+ countries hired by Palo Alto startup Micro1 to create training data for humanoid robots. Tesla, Figure AI, and Agility Robotics are buying this footage to teach their robots basic human movements—the same way ChatGPT learned language from internet text.
This represents a fundamental shift in robotics training. For decades, engineers programmed robots with explicit instructions. Now they're betting that robots can learn human-like manipulation by watching millions of hours of real humans doing mundane tasks. The approach mirrors how LLMs are trained, but physical-world data is far harder to collect than text: virtual simulations can teach robots to do backflips, yet they can't accurately model the contact physics of grasping a coffee cup or folding a fitted sheet.
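None of the companies named here have published their training code, but a rough mental model for learning from this kind of footage is behavior cloning: supervise a network to map camera frames to the action the human took next. The sketch below is illustrative only; the architecture, the hypothetical 7-DoF action space, and all names are my own assumptions, written in PyTorch.

```python
# Illustrative behavior-cloning sketch, not any company's actual pipeline.
# A small network maps a camera frame to the action the human took next
# (here a hypothetical 7-DoF end-effector command), supervised on footage.
import torch
import torch.nn as nn

class DemoPolicy(nn.Module):
    def __init__(self, action_dim: int = 7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4),   # 84x84 -> 20x20
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2),  # 20x20 -> 9x9
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 9 * 9, action_dim),           # assumes 84x84 RGB input
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        return self.net(frames)

policy = DemoPolicy()
optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)
loss_fn = nn.MSELoss()

def train_step(frames: torch.Tensor, actions: torch.Tensor) -> float:
    """One supervised step: predict the demonstrated action, regress to it."""
    optimizer.zero_grad()
    loss = loss_fn(policy(frames), actions)
    loss.backward()
    optimizer.step()
    return loss.item()

# Stand-in batch: 16 RGB frames at 84x84 paired with 7-DoF action targets,
# which in practice would come from hand/arm tracking on the recorded video.
loss = train_step(torch.randn(16, 3, 84, 84), torch.randn(16, 7))
print(f"batch loss: {loss:.4f}")
```

The hard part isn't this loop; it's extracting clean action labels (tracked hand and arm poses) from millions of hours of first-person video, which is exactly where the recording constraints on workers like Zeus come from.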
What the original reporting doesn't fully capture is how this creates a new category of AI labor: physically embodied data generation. Unlike typical remote work, these jobs require workers to perform precise, repetitive movements while keeping camera angles and lighting consistent. Zeus finds the work boring despite the good pay, a tension between economic opportunity and job satisfaction that will likely define many AI-adjacent gig roles.
For developers building robotics applications, this signals that training data, not compute or algorithms, is becoming the bottleneck. If you're working on embodied AI, start designing your data pipeline now: quality human demonstration data will be expensive and time-consuming to acquire at scale.
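If you do start collecting demonstrations, the unglamorous part matters most: metadata. A minimal, hypothetical episode schema might look like this; every field name below is my own invention, not from Micro1 or any published spec.

```python
# Hypothetical episode schema for human-demonstration footage; field names
# are assumptions for illustration, not from any real dataset spec.
# The point: capture enough metadata (task, camera, QA flags) up front that
# expensive footage stays reusable as models and requirements change.
from dataclasses import dataclass, field

@dataclass
class DemoEpisode:
    episode_id: str
    task: str                     # e.g., "fold_laundry"
    worker_id: str                # anonymized contributor ID
    fps: int                      # capture frame rate
    video_path: str               # raw egocentric footage
    camera_intrinsics: list = field(default_factory=list)  # focal length, etc.
    qa_flags: dict = field(default_factory=dict)           # lighting/framing checks

episode = DemoEpisode(
    episode_id="ep-000413",
    task="fold_laundry",
    worker_id="w-2931",
    fps=30,
    video_path="s3://demos/raw/ep-000413.mp4",
)
print(episode.task, episode.fps)
```

Schemas like this are cheap to write and painful to retrofit; footage collected without camera and QA metadata tends to become unusable the first time a model's input requirements change.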
