Hacker News

I am very interested in what LLMs will be able to do when trained on something other than the content on the Internet, which is primarily generated to sell advertising views.


I highly doubt it’s trained on that. I’m sure it was curated and trained on the good stuff.


Did you arrive at this certainty through reading something other than what OpenAI has published? The document [0] that describes the training data for GPT-2 makes this assertion hilarious to me.

[0]: https://github.com/openai/gpt-2/blob/master/model_card.md#da...



