Encyclopedia Britannica Sues OpenAI Over Unlawful AI Training and Copyright Theft
- Major Lawsuit: Encyclopedia Britannica and Merriam-Webster sued OpenAI in a Manhattan federal court for blatant copyright and trademark infringement.
- Massive Data Scraping: The publisher alleges OpenAI illegally copied nearly 100,000 proprietary articles to train ChatGPT.
- Brand Damage: Britannica claims ChatGPT cannibalizes web traffic and damages its reputation by citing the encyclopedia in fabricated AI "hallucinations."
Encyclopedia Britannica, along with its subsidiary Merriam-Webster, has filed a lawsuit against OpenAI in a federal court in Manhattan. The suit accuses the Microsoft-backed tech giant of illegally scraping their reference materials to train its artificial intelligence models. The high-stakes legal battle threatens to further expose the aggressive data harvesting practices said to be fueling the generative AI boom.
The formal complaint was filed on Friday. Britannica alleges that OpenAI unlawfully copied close to 100,000 of its proprietary articles, dictionary entries, and other encyclopedic data. According to the complaint, the tech firm used this trove of intellectual property to teach its flagship chatbot, ChatGPT, how to generate human-like text.
Britannica argues that the alleged copying actively damages its core business.
According to the complaint, ChatGPT regularly produces near-verbatim copies of Britannica’s definitions and entries. This direct replication cannibalizes Britannica’s web traffic, because users can read the AI-generated summaries instead of visiting the original, monetized websites.
The lawsuit extends beyond copyright infringement. Britannica also accuses OpenAI of violating its trademarks, alleging that the AI system frequently implies it has explicit permission to reproduce the publisher’s material. The complaint further claims that ChatGPT wrongly cites Britannica in entirely fabricated outputs, commonly known as AI "hallucinations," which harms the brand's reputation for factual accuracy.
AI developers often argue that their systems transform copyrighted material enough to create something new, which would qualify as fair use under US law. Rights holders strongly disagree.
Britannica joins a growing list of authors and news organizations taking legal action against tech companies that scrape data without paying for it. The publisher filed a similar lawsuit against Perplexity AI, an AI search company, last year, and that case is still ongoing. In its latest filing, Britannica is seeking monetary damages but did not specify an amount. The company also wants the court to issue an immediate order stopping OpenAI from continuing the alleged infringement.
