Of course if you are happy with spaces instead of newlines as in your C64 code then you can even do it in one big string so you don't have to pay for concatenation.
Precalculating the entire thing and delivering nothing but a bunch of print statements is certainly an option, and it's completely in the spirit of the kinds of things one does to speed up an 8-bit computer (1), but the toy problem of "how can I speed up this for-next loop and the various logic inside it" was more fun. :)
(Also, the spaces-instead-of-newlines choice was made to avoid scrolling the screen. The C64 is slow enough that having the ROM routines copy ~2k bytes of screen/color memory to scroll on every second line of output makes changes in text-generation speed pretty much inconsequential. If that code's optimized for anything, it's size, not speed. (2))
(Now that I think about it, there's also the fact that while the C64's BASIC interpreter is limited to about 250 tokens per line, the screen editor limits you to 80 characters per line.)
and now I am wondering how fast some form of
for i=1 to 100 step 15
print i;i+1;"fizz";i+3;"buzz fizz";i+6;i+7;"fizz buzz";i+10;"fizz";i+12;i+13;"fizzbuzz";:next
would run. Okay, I fired up VICE: 49 jiffies, a ton faster than the 134 jiffies my previous best took. And if not for the 80-character input limit it could be one line. Hah. Thanks! Finding the right level to unroll a loop at is always your best bet in the 8-bit world. :)
(with the caveat that this generates fizzbuzz for 1-105 instead of 1-100, and logic to stop at 100 and/or logic to stop at any arbitrary limit would probably slow it down)
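For what it's worth, one way to stop at an arbitrary limit without adding a test inside the loop is to run only the whole 15-number cycles in the unrolled loop and then handle the leftover numbers once afterwards. Here's a rough sketch of that idea in Python rather than C64 BASIC, purely so the output can be checked against a plain implementation. The names `CYCLE`, `fizzbuzz_unrolled` and `fizzbuzz_naive` are made up for illustration, and this says nothing about how many jiffies the equivalent BASIC would take.

```python
# One 15-number FizzBuzz cycle; None means "print the number itself".
CYCLE = [None, None, "fizz", None, "buzz", "fizz", None, None,
         "fizz", "buzz", None, "fizz", None, None, "fizzbuzz"]


def fizzbuzz_unrolled(limit=100):
    """Return the FizzBuzz items for 1..limit, stepping 15 numbers at a time."""
    out = []
    full_end = limit - limit % 15  # last number covered by whole cycles
    # Mirrors "for i=1 to ... step 15" with the body fully unrolled.
    for i in range(1, full_end + 1, 15):
        out += [str(i), str(i + 1), "fizz", str(i + 3), "buzz", "fizz",
                str(i + 6), str(i + 7), "fizz", "buzz", str(i + 10), "fizz",
                str(i + 12), str(i + 13), "fizzbuzz"]
    # Leftover limit % 15 numbers, handled once outside the hot loop.
    for offset in range(limit % 15):
        out.append(CYCLE[offset] or str(full_end + 1 + offset))
    return out


def fizzbuzz_naive(limit=100):
    """Reference version using the usual modulo tests, for comparison only."""
    out = []
    for n in range(1, limit + 1):
        word = ("fizz" if n % 3 == 0 else "") + ("buzz" if n % 5 == 0 else "")
        out.append(word or str(n))
    return out


if __name__ == "__main__":
    # Space-separated, like the no-scroll output in the thread.
    print(" ".join(fizzbuzz_unrolled(100)))
```

For a fixed limit of 100 the tail is always 91 through 100, so in BASIC it could presumably be a single extra print statement after the loop instead of a second loop. I haven't timed that.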
1: Just this morning I was reading a series of blog posts where one of the original authors of Lemmings wrote a port of it to the ZX Spectrum Next. They ended up writing some code that reads the data files from the PC version and generates lengthy chunks of hilariously unrolled assembly source, with one routine per image frame to write it to the screen in the fastest way possible, because a more traditional software sprite routine could only handle about ten lemmings at once. That's absolutely insane and absolutely beautiful: https://lemmings.info/porting-lemmings-to-the-zx-spectrum-ne...