
Exactly. Even this paper shows that model creativity drops significantly and that the models experience mode collapse like we saw in GANs, yet the companies keep using RLHF...

https://arxiv.org/abs/2406.05587



A nice talk about a researcher's experience/benchmarks with raw GPT-4, before and after RLHF:

https://www.youtube.com/watch?v=qbIk7-JPB2c


Yup, I remember that! Microsoft removed that part of the paper.



