Is the data input into ChatGPT not a large enough source of new data to matter?
People are constantly inputting novel data, telling ChatGPT about mistakes it made and suggesting approaches to try, and so on.
For local tools like Claude Code, it feels like there's an even bigger goldmine of data: a user asks Claude Code to do something, and when it fails they do it themselves... and if only Anthropic could slurp up the human-produced correct solution, that would be high-quality training data.
I know paid Claude Code doesn't slurp up local code, and my impression is that paid ChatGPT also doesn't use input for training... but perhaps that's the next thing to compromise on in the quest for more data.
NO! CLEARLY THE ENTIRE CORPUS OF HUMAN LITERATURE AND THE INTERNET DOESN'T CONTAIN ENOUGH INFORMATION TO EDUCATE AN EXPERT!!!! I JUST NEED ANOTHER BILLION DOLLARS PLS PLS PLS I PROMISE THE SCALING LAWS ARE ACTUALLY LAWS THIS TIME