Artificial intelligence firm Anthropic hits out at copyright lawsuit filed by music publishing corporations, claiming the content ingested into its models falls under ‘fair use’ and that any licensing regime created to manage its use of copyrighted material in training data would be too complex and costly to work in practice
GenAI tools ‘could not exist’ if firms are made to pay copyright
Reproduction of copyrighted material would be breaking the law. Studying it and using it as reference when creating original content is not.
I’m curious why we think otherwise when it is a student obtaining an unauthorized copy of a textbook to study, or researchers getting papers from sci-hub. Probably because it benefits corporations and they say so?
While I would like to be in a world where knowledge is free, this is apples and oranges.
OpenAI can purchase a textbook and read it. If their AI uses the knowledge gained to explain maths to an individual, without reproducing the original material, then there’s no issue.
The difference is the student in your example didn’t buy their textbook. Someone else bought it and reproduced the original for others to study from.
If OpenAI were pirating textbooks, that would be a wholly separate issue.
The fact that the “AI” can spit out whole passages verbatim when given the right prompts suggests that there is a big problem here, and they haven’t a clue how to fix it.
It’s not “learning” anything other than the probable order of words.
What about these:
https://arxiv.org/abs/2310.02207
https://notes.aimodels.fyi/researchers-discover-emergent-linear-strucutres-llm-truth/
https://notes.aimodels.fyi/self-rag-improving-the-factual-accuracy-of-large-language-models-through-self-reflection/