Hacker News

> Language models are mathematical, statistical beasts. The beast generally doesn't do well with open ended questions (known as "zero-shot"). It shines when you give it something to work off of ("one-shot").
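To make the quoted distinction concrete, here is a minimal illustrative sketch (my own example prompts, not from the comment): a zero-shot prompt gives the model no worked examples, while a one-shot prompt includes exactly one example to pattern-match against.

```python
# Illustrative prompt strings only -- no real API call is made here.
# "Zero-shot": the task is stated with no worked examples.
zero_shot_prompt = "Classify the sentiment of this review: 'The pizza arrived cold.'"

# "One-shot": a single worked example precedes the actual task,
# giving the model a pattern to complete.
one_shot_prompt = (
    "Review: 'Great service, will definitely come back!'\n"
    "Sentiment: positive\n\n"
    "Review: 'The pizza arrived cold.'\n"
    "Sentiment:"
)

# The one-shot prompt carries one completed example plus the open task,
# so the "Sentiment:" label appears twice; the zero-shot prompt has none.
print(one_shot_prompt.count("Sentiment:"))
```

The same idea extends to "few-shot" prompting, where several worked examples are included before the real question.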

That is the usage that is advertised to the general public, so I think it's fair to critique it by way of this usage.



Yeah, the "you're using it wrong" argument falls flat on its face when the technology is presented as an all-in-one magic answer box. Why give these companies the benefit of the doubt instead of holding them accountable for what they claim this tech to be? https://www.youtube.com/watch?v=9bBfYX8X5aU

I like to ask these chatbots to generate 25 trivia questions and answers from "golden-age" Simpsons. They fabricate complete BS for a noticeable number of them. If I can't rely on them for something as low-stakes as TV trivia, it seems absurd to rely on them for anything else.


Whenever I read something like this I do definitely think "you're using it wrong". This request would certainly have tripped up earlier models, but newer ones have no trouble generating it, with sources for each question. Example:

https://chatgpt.com/share/69160c9e-b2ac-8001-ad39-966975971a...

(the seven minutes of thinking is just because ChatGPT is unusually slow right now for any question)

These days I'd trust it to accurately generate 100 questions about Homer alone. LLMs really are far better than they used to be if you use them right.



