I think it’s possible to infer from what has been said[1], and from the lack of a 5.1 “Thinking mini” version, that it has been folded into 5.1 Instant, which now decides when and how much to “think”. I also suspect 5.1 Thinking will be expected to adapt dynamically to fill that role somewhat, given the changes there.
[1] “GPT‑5.1 Instant can use adaptive reasoning to decide when to *think before responding*”
> That said, Claude is still quite behind GPT-5 in its ability to review code, and so I'm not sure how much to expect from Sonnet 4.5 in this new domain. OpenAI could probably do better.
It’s always interesting to see others’ opinions, as it’s still so variable and “vibe” based. Personally, for my use, the idea that any GPT-5 model is superior to Claude just doesn’t resonate - and I use both regularly for similar tasks.
I also find the subjective nature of these models interesting, but in this case the difference in my experiences between Sonnet 4.5 and GPT-5 Codex, and especially GPT-5 Pro, for code review is pretty stark. GPT-5 is consistently much better at hard logic problems, which code review often involves.
I have had GPT-5 point out dozens of complex bugs to me. Often in these cases I will try to see if other models can spot the same problems, and Gemini occasionally has, but the Claude models never have (using Opus 4, 4.1, and Sonnet 4.5). These are bugs like race conditions or deadlocks that involve complex interactions between different parts of the codebase. GPT-5 and Gemini can spot these types of bugs with decent accuracy, while I’ve never had Claude point out a bug like this.
If you haven’t tried it, I would try the codex /review feature and compare its results to asking Sonnet to do a review. For me, the difference is very clear for code review. For actual coding tasks, both models are much more varied, but for code review I’ve never had an instance where Claude pointed out a serious bug that GPT-5 missed. And I use these tools for code review all the time.
I've noticed something similar. I've been working on some concurrency libraries for Elixir and Claude constantly gets things wrong, but GPT-5 can recognize the techniques I'm using and the tradeoffs.
Try the TypeScript codex CLI with the gpt-5-codex model with reasoning always set to high, or GPT-5 Pro with max reasoning. Both are currently undeniably better than Claude Opus 4.1 or Sonnet 4.5 (max reasoning or otherwise) for all code-related tasks. Much slower but more reliable and more intelligent.
I've been a Claude Code fanboy for many months but OpenAI simply won this leg of the race, for now.
Same. I switched from sonnet 4 when it was out to codex. Went back to try sonnet 4.5 and it really hates to work for longer than like 5 minutes at a time
Codex meanwhile seems to be smarter and plugs away at a massive todo list for like 2 hours
I think saying they are stuck with standard email protocols is a bit of a stretch. JMAP is not widely implemented outside of Fastmail and certainly isn’t used by Apple Mail, which actually uses a proprietary IMAP extension (XAPPLEPUSHSERVICE).
There isn’t a skill issue at play here - it’s that Apple have closed the previously unofficial route others have been using (including MXRoute), which has brought the issue to a head. There’s some discussion on the Apple developer forums - https://developer.apple.com/forums/thread/778671.
That linked topic says Apple is working with the developer to handle it. I don't see any evidence of favoritism.
Fastmail clearly reached out to Apple eons ago, which is evidence of skill on Fastmail's part and not favoritism on Apple's part. Admittedly, Fastmail very likely has lots of contacts in Apple (Jeremy Howard isn't nobody), but that's also a skill issue.
Having friends inside of Apple that provide you with something no one else can obtain is a fairly decent definition of favoritism.
You cannot send Push notifications to the stock iOS Mail app no matter how hard you try. They can. There are functions inside of iOS that are made better because of this (auto copied 2FA codes, for example).
There's no evidence "nobody else can obtain" it. The parent's link to the Apple developer forum shows somebody else obtaining it. There's no evidence that MXroute couldn't do the same.
There is no defined process for obtaining it. If you'd like to tear down that statement into little pieces and dissect it, I recommend getting a new hobby because it's not that interesting.
MXroute doesn't go around threatening US companies with EU law from Texas. As for your requirement for evidence, this situation does not require your approval unless you work for Apple and can offer some help in the matter. The tweet you are critiquing is me (owner of MXroute) attempting to gain Apple's attention to get what Fastmail and that EU user have obtained. I'll continue doing what I'm doing, if that's alright with you. I'm well aware of the situation and what others have done. What I need at this point is eyes on the prize. I'll get what I'm after, but a public statement that I currently cannot get what I'm after is entirely appropriate for the avenue I've chosen to do so.
It's a mistake to assume that I'm merely flailing my arms chaotically and generically playing the role of Karen.
This is a pretty unfortunate context for this, but I just wanted to say that I've been using your service for years, and me and my handful of customers are 100% satisfied.
It sucks that Apple won't just embrace interop with host mail services in some formal way; I'm not sure what this specific flavor of gatekeeping achieves. It also sucks that HN contrarians are reflexively dumping on this, instead of supporting entrepreneurs like one might hope in such a community. Anyway, keep up the great work, I'm with you long term.
Why would you expect a defined process? It's proprietary software on a closed platform.
I'll just take your word for it that there's no way for Texans to get the same treatment as these other companies. Oh well. Take it up with your legislature, I guess.
Of the many email hosts out there, only a handful having this access, with no route for others to even request the same, is special treatment. In the case of Fastmail, they’ve had this access since 2015.
It’s also very interesting that Apple reached out to the user in the developer forum thread after they raised it as an EU DMA issue.
No, I went straight to trying to shame them on Twitter. The part where I said "We’ve tried to talk. @Apple just stops responding once they realize what we’re asking" was just a joke, you got me.
Fastmail have been singled out as they appear to have been given special treatment in the form of an APNS topic ID. Other hosts have been using a reverse-engineered endpoint to generate certificates, which has recently been closed.
There’s some discussion on the Apple developer forums - https://developer.apple.com/forums/thread/778671. The solution for the OP there seems to have been they also will get special treatment, but there remains no route for others to use to get the same.
Apple uses a proprietary IMAP extension that, until recently, any developer could use by generating an APNS certificate using a reverse-engineered endpoint from macOS Server. They’ve since closed this.
I’m not aware of any UK-based registrars that rival US registrars such as Porkbun on both price and service, but there’s also no reason not to use those providers regardless of where you’re located.
There will certainly be a level where this is viable, if not sensible. Rather than having “casual” users cover the cost of “extreme” users, letting them specify an API key will likely be beneficial (albeit for a small number of users).
In my experience, there’s a vast difference between “education” aimed at the individual and what is delivered in accredited academic courses. The commercial aspect (tailoring it to get people to buy and stick with it) is no doubt a factor.