I think a simple way to take emotion out of this is to ask if a computer can beat humans at math. The answer to that is pretty much "duh". Symbolic solvers and numerical methods outperform humans by a wide margin and allow us to reach fundamentally new frontiers in mathematics.

But whether this is a good example of that is a separate question. I think there is a certain dishonesty in the tagline: "I asked a computer to improve on the state-of-the-art and it did!", with a buried footnote that the benchmark wasn't actually state-of-the-art and that an improved solution was already known (albeit structured a bit differently).

When you're solving already-solved problems, it's hard to avoid bias, even just in how you ask the question and otherwise nudge the model. I see it a lot in my field: researchers publish revolutionary results that, upon closer inspection, work only for their known-outcome test cases and not much else.

Another piece of info we're not getting: why this particular, seemingly obscure problem? Is there something special about it, or is it data dredging (i.e., we tried 1,000 papers and this is the only one where it worked)?