Really impressive performance from the Moondream model, but looking at the resul...

Jackson__ · 2025-09-27T16:35:29 1758990929

Funnily enough, Gemini is also the only one able to read a D20. ChatGPT consistently gets it wrong, and Claude mostly argues it can't read the face of the die that's facing up because it's obstructed (it's not lol).

KronisLV · 2025-09-27T19:59:36 1759003176

I'm not sure why they haven't been acquired yet by one of the big players, since Moondream is clearly pretty good! It definitely seems like something Anthropic/OpenAI/whoever would want to fold into their platforms. Everyone involved in creating it should probably be swimming in money, and with the reach of the big orgs, visual use cases for LLMs should become far less useless.

ekidd · 2025-09-27T13:55:37 1758981337

Gemini is really fantastic at anything that's OCR-adjacent, but it promptly falls over on most other image-related tasks.