aner said in #3334 2w ago:
In 2023 the Times sued both Microsoft and OpenAI, claiming both had used millions of NYT articles w/o permission to train LLMs. A little over a week ago a court determined that OpenAI does, in fact, have to preserve + segregate all output log data. OpenAI, of course, is appealing and making public gestures toward data privacy and security, arguing that the Times's pursuit of evidence to support its initial claim will set a bad precedent. OpenAI says all regular ChatGPT users will be affected, with their output data preserved and identified (though enterprise customers and API customers with zero-data-retention agreements are excluded). The Times is pushing for deleted GPT chats + API content, which is normally removed after ~30 days under OpenAI's current data retention policies. OpenAI is storing the output data in a separate, secure system under legal hold. Training policies will not change.
The main thing that comes to my mind is all of the documents and information that analysts (IB, PE, etc.) and consultants have been improperly uploading to GPT for quick analysis + summarization. Those users, the ones not on an enterprise-wide subscription, are likely not complying with their company data policies (NDA violations abound), but OpenAI's data retention policy, if it was to be fully trusted, at least protected them by deleting that material within ~30 days. With those chats now preserved under legal hold, that protection is gone.