OpenAI blamed NYT for tech problem erasing evidence of copyright abuse

submitted by
Style Pass
2024-11-26 09:00:02

OpenAI keeps deleting data that could allegedly prove the AI company violated copyright laws by training ChatGPT on authors' works. Apparently largely unintentional, the sloppy practice is seemingly dragging out early court battles that could determine whether AI training is fair use.

Most recently, The New York Times accused OpenAI of unintentionally erasing programs and search results that the newspaper believed could be used as evidence of copyright abuse.

The NYT apparently spent more than 150 hours extracting training data, while following a model inspection protocol that OpenAI set up precisely to avoid conducting potentially damning searches of its own database. This process began in October, but by mid-November, the NYT discovered that some of the data gathered had been erased due to what OpenAI called a "glitch."

Looking to update the court about potential delays in discovery, the NYT asked OpenAI to collaborate on a joint filing admitting the deletion occurred. But OpenAI declined, instead filing a separate response calling the newspaper's accusation that evidence was deleted "exaggerated" and blaming the NYT for the technical problem that triggered the data deletion.
