It's intended for SQL generation and similar tasks with cheap fine-tuning and inference, not for answering general-knowledge questions. Their blog post is pretty clear about that. If you just want a chatbot, this isn't the model for you. If you want to let people without SQL training ask questions of your data, it might be really useful.
Sorry, it sounds like you know a lot more than I do about this, and I'd appreciate it if you'd connect the dots. Is your comment a dig at either Snowflake or Llama? Where are you finding the unquantized size of Llama 3 70B? Isn't it extremely rare to do inference with large unquantized models?
For decent performance, you need to keep all the parameters in memory for both. That said, with a RAID-0 of two PCIe 5 SSDs (or four PCIe 4 SSDs) you might get 1 t/s loading experts from disk on Snowflake Arctic... but that is slooow.
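For a rough sanity check on that 1 t/s figure, here's a back-of-envelope sketch. The numbers are assumptions, not measurements: Arctic reportedly activates roughly 17B parameters per token, fp16 is 2 bytes per parameter, and a RAID-0 of two PCIe 5 SSDs might sustain around 24 GB/s sequential reads.

```python
# Back-of-envelope estimate of tokens/sec when streaming MoE expert
# weights from disk for each token. All figures below are assumptions.

def tokens_per_sec(active_params_b: float, bytes_per_param: float,
                   disk_gb_per_s: float) -> float:
    """Best case: one full read of the active parameters per token,
    fully overlapped with compute (so disk bandwidth is the bottleneck)."""
    bytes_per_token = active_params_b * 1e9 * bytes_per_param
    return disk_gb_per_s * 1e9 / bytes_per_token

# Assumed: ~17B active params/token, fp16 (2 bytes/param),
# ~24 GB/s from a RAID-0 of two PCIe 5 SSDs.
est = tokens_per_sec(active_params_b=17, bytes_per_param=2, disk_gb_per_s=24)
print(f"~{est:.1f} tokens/sec")  # on the order of 1 t/s
```

In practice random access patterns and imperfect overlap with compute would push this lower, which is consistent with "around 1 t/s, but slow."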