(Continuing from my other post)

The first thing I checked was how they verified the proofs were correct. The answer: they had other AI researchers review them, and those reviewers said there were serious problems with the paper's methodology and that the result would not amount to a gold medal.

https://x.com/j_dekoninck/status/1947587647616004583

This is why we do not take things at face value.





That tweet is aimed at Google. I don't know much about Google's effort at the IMO, but OpenAI was the primary newsmaker in that event, and they reportedly did not use hints or external tools. If you have information to the contrary, please share it so I can update that particular belief.

Gemini 2.5 has since been superseded by 3.0, which is less likely to need hints. 2.5 was not as strong as the contemporary GPT model, but 3.0 with Pro Thinking mode enabled is up there with the best.

Finally, saying, "Well, they were given some hints" is like me saying, "LOL, big deal, I could drag a Tour peloton up the Col du Galibier if I were on the same drugs Lance was using."

No, in fact I could do no such thing, drugs or no drugs. Similarly, a model that can't legitimately reason will not be able to solve these types of problems, even if given hints.



