
> The problem is that once you have the vectors representing the answers you need something like another model that goes back to a word representation of said answers. Something like a diffusion model but for text.

Could it be just a smaller llm that takes as input both the semantic vector and the prompt, and is trained to predict the output tokens based on those? A model with high linguistic abilities and very little reasoning skills.
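To make the idea concrete, here is a toy sketch (pure Python, no ML libraries) of a "decoder" conditioned on both a semantic vector and the prompt tokens. The four-word vocabulary, the hand-set embeddings, and the way the vector is mixed in (simple addition instead of attention or prefix-tuning) are all illustrative assumptions, not a real trained model:

```python
import math

VOCAB = ["the", "cat", "sat", "<eos>"]
DIM = 3  # embedding / semantic-vector dimensionality

# Hand-set token embeddings (a real model would learn these).
EMB = {
    "the": [1.0, 0.0, 0.0],
    "cat": [0.0, 1.0, 0.0],
    "sat": [0.0, 0.0, 1.0],
    "<eos>": [0.3, 0.3, 0.3],
}

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def next_token(semantic_vec, prompt_tokens):
    # Condition on the semantic vector by adding it to the mean of the
    # prompt embeddings -- a crude stand-in for real conditioning.
    ctx = [0.0] * DIM
    for t in prompt_tokens:
        for i in range(DIM):
            ctx[i] += EMB[t][i]
    n = max(len(prompt_tokens), 1)
    state = [ctx[i] / n + semantic_vec[i] for i in range(DIM)]
    # Score each vocab item by dot product with the state, then softmax.
    logits = [sum(state[i] * EMB[w][i] for i in range(DIM)) for w in VOCAB]
    probs = softmax(logits)
    best = max(range(len(VOCAB)), key=lambda j: probs[j])
    return VOCAB[best], probs
```

The point is only the interface: the same prompt yields different continuations depending on the semantic vector, e.g. `next_token([0.0, 0.0, 2.0], ["the", "cat"])` picks "sat" because the vector points at that embedding.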



I think what you suggest would be very similar to an encoder-decoder architecture, which has largely been abandoned in favor of decoder-only architectures (https://cameronrwolfe.substack.com/p/decoder-only-transforme...). So I am guessing that what you suggest has already been tried and didn't work out, but I'm not sure why (the problems I mentioned above, or something else).

Sorry, that's where the limit of my knowledge is. I work on ML stuff, but mostly on "traditional" deep learning, so I am not up to speed with the genAI field (also, the sheer amount of papers coming out makes it basically impossible to stay up to date if you're not in the field).




