  > "...It matches the full-precision (i.e., FP16 or BF16)

Wait... WHAT?!

When did //HALF PRECISION// become //FULL PRECISION//?

FWIW, I cannot find where you're quoting from. I cannot find "matches" in TFA or at the GitHub link. In the paper I see:

  3.2 Inference Accuracy
  
  The bitnet.cpp framework enables lossless inference for ternary BitNet b1.58 LLMs. To evaluate inference accuracy, we randomly selected 1,000 prompts from WildChat [ZRH+24] and compared the outputs generated by bitnet.cpp and llama.cpp to those produced by an FP32 kernel. The evaluation was conducted on a token-by-token basis, with a maximum of 100 tokens per model output, considering an inference sample lossless only if it exactly matched the full-precision output.
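
Here is how I read that evaluation, as a rough Python sketch. The function names (`is_lossless`, `lossless_rate`) are my own and not from the paper or bitnet.cpp, and I am assuming each output is already a list of token IDs. Note that the reference in this passage is the FP32 kernel output, so "full-precision" here means FP32.

```python
from typing import Sequence

MAX_TOKENS = 100  # per-output cap stated in the quoted section


def is_lossless(candidate: Sequence[int], reference: Sequence[int],
                max_tokens: int = MAX_TOKENS) -> bool:
    """Return True only if both outputs match exactly, token by token,
    over the first `max_tokens` tokens. (Hypothetical helper.)"""
    return list(candidate[:max_tokens]) == list(reference[:max_tokens])


def lossless_rate(candidates: Sequence[Sequence[int]],
                  references: Sequence[Sequence[int]],
                  max_tokens: int = MAX_TOKENS) -> float:
    """Fraction of samples whose output exactly matches the reference.
    (Hypothetical helper; `references` stands in for FP32 kernel outputs.)"""
    if len(candidates) != len(references):
        raise ValueError("need one reference output per candidate output")
    if not candidates:
        return 0.0
    matches = sum(is_lossless(c, r, max_tokens)
                  for c, r in zip(candidates, references))
    return matches / len(candidates)
```

Under this reading a single differing token within the first 100 makes the whole sample count as not lossless, so it is a strict all-or-nothing check per prompt.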