Meta is banking on a recent Supreme Court ruling about ISP piracy liability to dodge copyright lawsuits over its AI training data collection. In a statement filed last week, the company argued that the Supreme Court's decision in Cox Communications will help defeat claims that it committed contributory copyright infringement by torrenting roughly 80 terabytes of pirated content. The lawsuit, brought by Entrepreneur Media, alleges that Meta knowingly facilitated infringement by seeding torrents, meaning it uploaded pieces of the files to other peers, which helps speed downloads across a BitTorrent network.

This legal gambit reveals how precarious the foundation of AI training really is. Meta's torrenting strategy wasn't some rogue operation; it was systematic data collection at massive scale. The contributory infringement claim is particularly dangerous for Meta because it's much easier to prove than direct infringement. While authors in the separate Kadrey v. Meta class action struggle to show that Meta downloaded complete works (hard to prove with fragmented torrent files), contributory infringement only requires proving that Meta facilitated the transfers. A judge has already ruled that this claim can proceed.

Meta's defense hinges on the Supreme Court's finding that companies aren't liable for "merely providing a service" with knowledge of infringement, unless they "affirmatively induced" it. But this feels like a stretch: Meta wasn't just providing infrastructure like an ISP. It was actively seeding torrents to harvest training data. The distinction between facilitating piracy and participating in it may not hold up when you're the one doing the seeding.

For AI developers, this case matters beyond Meta's legal troubles. If courts decide that torrenting copyrighted works for training constitutes contributory infringement, it could reshape how companies source data. The days of "move fast and scrape everything" might be ending, forcing a shift toward licensed datasets or synthetic data generation.