That's even worse for autonomous cars: there is so much data and noise that there is no way to reproduce the issue, it's complete chaos.
Whereas with an LLM, if we control the seed we can 100% reproduce the same result.
>> if we control the seed we can 100% reproduce the same result
No, that's the problem. You can't. You should be able to, but you can't. If you could, they wouldn't be scary. But even at temperature zero we get different results, because no one gave enough of a shit when coding them, and no one gives enough of a shit to try to fix the issue.
This is what in any other industry would be called gross negligence.
A lot of stuff behind the scenes is going on to batch and route queries to GPT-4 models that are in perturbed states already[1]. This isn't gross negligence, this is basic capitalism. If you want sole access to a GPT-4 MoE cluster starting fresh, it's gonna cost you.
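To add some color on why batching alone can break determinism: floating-point addition is not associative, so when the same query is batched with different neighbors, the GPU kernels may accumulate partial sums in a different order and the final logits drift by a few ulps. This is a minimal sketch of the underlying arithmetic effect only (plain Python floats, not GPT-4 specifics):

```python
# Floating-point addition is not associative: grouping the same three
# numbers differently changes the final bits of the result. The same
# thing happens inside a matmul when a kernel's reduction order changes.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # -> 0.6000000000000001
right = a + (b + c)  # -> 0.6

print(left == right)  # False: same inputs, same seed, different answer
```

Once a tiny logit difference flips which token wins the argmax at one step, the generations diverge completely from that point on, even at temperature zero.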
Interesting article. I can see how it makes sense for OpenAI or anyone running an LLM to take advantage of any entropy that presents itself, as a shortcut to non-repetitive answers. But I'm not sure if you're saying that these LLMs take on new characteristics as they get more randomized, or just that it would be hard to get your hands on a fresh one to test the determinism of?
>Whereas with an LLM if we control the seed we can 100% reproduce the same result
No, you can't. For the latest GPT models and the way they are served, this no longer holds, which makes the proposed experiment moot. Some of the reasons are explained pretty well here: https://152334h.github.io/blog/non-determinism-in-gpt-4/