Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

What the author is doing here is pre-training. This is something usually model makers like Google and Meta need to do. Most business are much better off doing fine-tuning or to a lesser extent continued pre-training. The author is doing this for academic reasons.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: