
For me, Claude Code was the most impressive innovation this year. Cursor was a good proof of concept but Claude Code is the tool that actually got me to use LLMs for coding.

The kind of code that Claude produces looks almost exactly like the code I would write myself. It's like it's reading my mind. This is a game changer because I can maintain the code that Claude produces.

With Claude Code, there are no surprises. I can pretty much guess what its code will look like 90% to 95% of the time but it writes it a lot faster than I could. This is an amazing innovation.

Gemini is quite impressive as well. Nano banana in particular is very useful for graphic design.

I haven't tried Gemini with coding yet but TBH, Claude Code does such a great job; if I could code any faster, I would get decision fatigue. I don't like rushing into architecture or UX decisions. I like to sit on certain decisions for a day or two before starting implementation. Once you start in a particular direction, it's hard to undo and you may try to double down on the mistake due to sunk cost fallacy. I try hard to avoid that.


I don't even see much reason to use Cursor. I am used to IntelliJ IDEA, so I just downloaded the Claude Code plugin, and basically now I use the IDE only for navigating the code, finding references and reviewing the code. I can't even remember the last time I wrote more than 2 lines of code. Claude Code has catapulted my performance at least 5x if not more. And now that the cost of writing tests is so minimal, I am also able to achieve much better (and meaningful!) test coverage too. AI agents are where the most productivity is. I just create a plan with Claude, iterate over it, ask questions, then let it implement the plan, review, and ask for some adjustments. No manual writing of code at all. Zero.

Maybe I'm holding it wrong, but it still messes up the finer aspects of a codebase. If I ask it to implement some weird thing off the beaten path, it gets lost. But I completely agree about the testing part. I actually test much more now since it's so easy!

IntelliJ has its own Claude integration too, but it does not use your Claude subscription: https://blog.jetbrains.com/ai/2025/09/introducing-claude-age...

Do you guys all work 100% on open source? Or are you uploading bits of your copyrighted code for future training to Anthropic? I hate patents so copyright is the only IP protection I have.

We use AWS Bedrock, so everything stays within our AWS account. It's not like we aren't already uploading our code to GitHub for version control, AWS for deployment, Jetbrains for development, all of our logs to Datadog, Sentry, Snowflake, and more.

Yeah, my source code is on my computers, in self-hosted version control and self-hosted CI runners

I first got into agentic coding properly with the GLM coding plan (it's like $2/month), but I found myself very consistently asking Claude to make the code more elegant and readable. At that point I realized I was being silly and just switched to Claude Code.

(GLM etc. get surprisingly close with good prompting but... $0.60/day to not worry about that is a no brainer.)


Nano Banana Pro is legitimately an insane tool if you know how to use it. I still can’t believe they released it in the wild

What is there to using it more than asking it to generate an image of something?

For one: modifying existing images in interesting ways ... adding characters, removing elements, altering or enhancing certain features, creating layers, and so on. Things that would take a while on Photoshop, done almost instantly. Really unlocks the imagination.

A friend just used it to generate an image of an office (complete with company name and address on the wall) that was to be used as an "address proof" picture for a credit card application (maybe fraud)

I couldn’t tell it apart from the real thing and I have a great AI image eye


For me: I've only tried using it seriously a few times but my experience is that you have to juggle carefully when to start a fresh session. It can get really anchored on earlier versions of images. It was interesting balancing iteration and from-scratch prompt refinement.

I gave it an image of my crappy art and asked what steps I could take to make it look better. It gave me specific advice like varying the line widths and how to use this on specific parts of the character. It also pointed out that the shading in my piece was inconsistent and did not reflect the 3d form I was representing and again gave me specific fixes I could implement. I asked for it to give me an updated version of the piece with all of its advice implemented and it did so. I was pretty shocked at all of this.

It's decent for things that would take a long time in Photoshop. Like most AI, sometimes it works great and sometimes it goes off the rails completely. Most recently, I used it to process some drone photos that were taken during late fall for the purpose of marketing a commercial property. All of the trees/grass/plants were brown, so I told it to make it look like the photos were taken during the summer but not to change anything else. It did a very good job, not just changing the color, but actually adding leaves to the plants and trees in a way that looked very realistic. It did in seconds what would have taken one of my team members hours, leaving them to work on other more pressing projects.

I don’t have much time to evaluate tools every month, so I have settled on Cursor. I’m curious about what I’m missing when using the same models.

You're not missing much. You can generally use Cursor like Claude Code for normal day to day use. I prefer Cursor because I like reviewing changes in an IDE, and I like being able to switch to the current SOTA model.

Though for more automated work, one thing you miss with Cursor is sub agents. And then to a lesser extent skills (these are pretty easy to emulate in other tools). I'm sure it's only a matter of time though.


Claude Code's VS Code integration is very easy to set up and pretty helpful if you want to see/review changes in an IDE.

The big limitation is that you have to approve/disapprove at every step. With Cursor you can iterate on changes and it updates the diffs until you approve the whole batch.

There is an auto accept diffs mode

You are missing an entire agentic experience. And I wouldn't call it vibe coding for an engineer; you're more or less empowered to truly orchestrate the development of your system.

Cursor has an agent, but that's like everyone else trying to copy the Model T while Ford was developing it.


This hasn’t been my experience at all. I’m finding Cursor with Opus 4.5 and plan mode to be just as capable as CC. And I prefer the UI/UX.

I have only compared Claude Code with Crush and a tool of my own design. In my experience, Claude Code is optimized for giant codebases and long tasks. It loves launching dozens of agents in parallel. So it's a bit heavy for smaller, surgical stuff, though it works decently for that too.

If you mostly have small codebases that fit in context, or make many small changes interactively, it's not really great for that (though it can handle it too). It'll just be spending most of its time poking around the codebase, when the whole thing should have just been loaded... (Too bad there's no small repo mode. I made a startup hook that just dumps the directory contents into context, but yeah, it should be a toggle.)


If you switch to Codex you will get a lot of tokens for $200, enough to more consistently use high reasoning as well. Cursor is simply far more expensive so you end up using less or using dumber models.

Claude Code is overrated as it uses many of its features and modalities to compensate for model shortcomings that are not as necessary for steering state of the art models like GPT 5.2


I think this is a total misunderstanding of Anthropic’s place in the AI race. Opus 4.5 is absolutely a state of the art model. I won’t knock anyone for preferring Codex, but I think you’re ignoring official and unofficial benchmarks.

See: https://artificialanalysis.ai


> Opus 4.5 is absolutely a state of the art model.

> See: https://artificialanalysis.ai

The field moves fast. Per artificialanalysis, Opus 4.5 is currently behind GPT-5.2 (x-high) and Gemini 3 Pro. Even Google's cheaper Gemini 3 Flash model seems to be slightly ahead of Opus 4.5.


Totally, however OP's point was that Claude had to compensate for deficiencies versus a state of the art model like ChatGPT 5.2. I don't think that's correct. Whether or not Opus 4.5 is actually #1 on these benchmarks, it is clearly very competitive with the other top-tier models. I didn't take "state of the art" to here narrowly mean #1 on a given benchmark, but rather to mean near or at the frontier of current capabilities.

One thing to remember when comparing ML models of any kind is that single value metrics obscure a lot of nuance and you really have to go through the model results one by one to see how it performs. This is true for vision, NLP, and other modalities.

https://lmarena.ai/leaderboard/webdev

LM Arena shows Claude Opus 4.5 on top


I wonder how model competence and/or user preference on web development (that leaderboard) carries over to more complex and larger projects, or more generally anything other than web development?

In addition to whatever they are exposed to as part of pre-training, it'd be interesting to know what kind of coding tasks these models are being RL-trained for? Are things like web development and maybe Python/ML coding overemphasized, or are they also being trained on things like Linux/Windows/embedded development etc in different languages?


https://x.com/giansegato/status/2002203155262812529/photo/1

https://x.com/METR_Evals/status/2002203627377574113

> Even Google's cheaper Gemini 3 Flash model seems to be slightly ahead of Opus 4.5.

What an insane take for anybody who uses these models daily.


Yes, I personally feel that the "official" benchmarks are increasingly diverging from the everyday reality of using these models. My theory is that we are reaching a point where all the models are intelligent enough for day-to-day queries, so points like style/personality and proper use of web queries and other capabilities are better differentiators than intelligence alone.

is x-high fast enough to use as a coding agent?

Yes, if you parallelize your work, which you must learn to do if you want the best quality

What am I missing? As suspicious as benchmarks are, your link shows GPT 5.2 to be superior.

It is also out of date as it does not include 5.2 Codex.

Per my point about steerability compensated for by modalities and other harness features: Opus 4.5 scores 58% while GPT 5.2 scores 75% for the instruction following benchmark in your link! Thanks for the hard evidence - GPT 5.2 is 30% ahead of Opus 4.5 there. No wonder Claude Code needs those harness features for the user to manually rein in control over its instruction following capability.


I disagree; the Claude models seem the best at tool calling, Opus 4.5 seems the smartest, and Claude Code (+ Claude model) seems to make good use of subagents and planning in a way that Codex doesn't

Opus 4.5 is so bad at instruction following (30% worse per the benchmark shared above) that it requires a manual toggle for plan mode.

GPT 5.2 simply obeys an instruction to assemble a plan and avoids the need to compensate for poor steerability that would require the user to manually manage modalities.

Opus has improved, though, so plan mode is less necessary than it was before, but it is still far behind state-of-the-art steerability.


I've used all of these tools and for me Cursor works just as well but has tabs, easy ways to abort or edit prompts, great visual diff, etc...

Someone sell me on Claude Code; I just don't get it.


I’m with you, I’ve used CC but I strongly prefer Cursor.

Fundamentally, I don’t like having my agent and my IDE be split. Yes, I know there are CC plugins for IDEs, but you don’t get the same level of tight integration.


Yep, httpOnly cookies just give the hacker a bit of extra work in some situations. TBH I don't even think httpOnly is worth the hassle it creates for platform developers given how little security it adds.

Not a problem in itself. Also, there's not much point in encrypting tokens. The attacker could use the encrypted token to authenticate themselves without having to decrypt it. They could just make a request from the victim's own browser. They could do this with cookies too, even httpOnly ones.

XSS is a big problem. If a hacker can inject a script into your front end and make it execute, it's game over. Once they get to that point, there's an infinite number of things they can do. They basically own the user's account.
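
To make that concrete, here's a rough sketch of what an injected script can do even when the session cookie is httpOnly. The endpoint and payload are made up for illustration; the point is that the attacker never needs to read the cookie, because the victim's browser attaches it automatically.

    // Hypothetical payload running after an XSS injection.
    // It never touches document.cookie; httpOnly doesn't matter here.
    fetch('/api/account/email', {               // made-up endpoint
      method: 'POST',
      credentials: 'include',                   // browser sends the httpOnly session cookie
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ email: 'attacker@example.com' }),
    });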


Does anyone actually encrypt the contents of JWTs? I'd have thought that anyone who has concerns about the contents of the token being easily visible would be likely to avoid JWTs anyway and just use completely opaque tokens?

JWT supports some encryption algorithms as an alternative to signatures but my experience is that most people like to keep it simple.

JWT is intended for authentication. Most of the time you're basically just signing a token containing an account ID and nothing else... Sometimes a list of groups but that only scales to a small number of groups.
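
For illustration, a minimal sketch using the npm "jsonwebtoken" package; the claim names, secret and expiry here are placeholders, not recommendations.

    import jwt from 'jsonwebtoken';

    const secret = 'change-me'; // placeholder HMAC secret (HS256 by default)

    // Sign: typically just an account ID, maybe a small list of groups.
    const token = jwt.sign({ sub: 'account-123', groups: ['admins'] }, secret, {
      expiresIn: '1h',
    });

    // Verify: checks signature and expiry offline, no database lookup needed.
    const claims = jwt.verify(token, secret);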


Encrypted tokens are opaque but they are also offline-verifiable. A simple opaque token has to be verified online (typically, against a database) whenever it's used.

Auth0, for example, supports JWE for its access tokens: https://auth0.com/docs/secure/tokens/access-tokens/json-web-...


Great article. I like the simple point about the hypothetical IQ test sent one week in advance. It makes a strong case about time being the true bottleneck. I think this same idea could be applied to most tests.

Implicit in the design of most tests is the idea that a person's ability to quickly solve moderately difficult problems implies a proportional ability to solve very difficult problems if given more time. This is clearly jumping to a conclusion. I doubt there is any credible evidence to support this. My experience tends to suggest the opposite; that more intelligent people need more time to think because their brains have to synthesize more different facts and sources of information. They're doing more work.

We can see it with AI agents as well; they perform better when you give them more time and when they consider the problem from more angles.

It's interesting that we have such bias in our education system because most people would agree that being able to solve new difficult problems is a much more economically valuable skill than being able to quickly solve moderate problems that have already been solved. There is much less economic and social value in solving problems that have already been solved... Yet this is what most tests select for.

It reminds me of the "factory model of schooling." Also there is a George Carlin quote which comes to mind:

"Governments don't want a population capable of critical thinking, they want obedient workers, people just smart enough to run the machines and just dumb enough to passively accept their situation."

I suspect there may be some correlation between High IQ, fast thinking, fast learning and suggestibility (meaning insufficient scrutiny of learned information). What if fast learning comes at the expense of scrutiny? What if fast thinking is tested for as a proxy for fast learning?

What if the tests which our society and economy depend on ultimately select for suggestibility, not intelligence?


>most people would agree that being able to solve new difficult problems is a much more economically valuable skill than being able to quickly solve moderate problems that have already been solved

Do most people agree with that? I agree with that completely, and I have spent a lot of time wishing that most people agreed with that. But my experience is that almost no one agrees with that...ever...in any circumstance.

I don't even think society as a whole agrees with this statement. If you just rank careers according to the ones that have the highest likelihood of making the most money, the most economically valuable tend to be the ones solving medium difficulty problems quickly.


Really depends on the problem. Cancer, yes. Banking, no.

"Implicit in the design of most tests is the idea that a person's ability to quickly solve moderately difficult problems implies a proportional ability to solve very difficult problems if given more time."

I used to share that doubt, especially during my first semesters at university.

However, my experience over the decades has been that people who solved moderately difficult problems quickly were also the ones that excelled at solving hard and novel problems. So in my (little) experience, there is a justification for that and I'd definitely be interested (and not surprised) to see credible evidence for it.


I could understand that, as we are inclined to hold this to be true: "slow -> not smart".

What we do know is "not smart -> slow", because if you are dumb, you will be (infinitely) slow to answer correctly. But note that the first proposition might just be a common logical fallacy: affirming the consequent, i.e. reading "not smart -> slow" as if it also ran in the other direction.

The slow but brilliant thinker perhaps wouldn't show up for solving a hard and novel problem, as they might have learned they are stupid, and they might still be slogging through other problem sets. Other excuses are found in https://almossawi.substack.com/p/slow-and-fast-learners-3-qu...

If you want to test pure ability for deep thought, it will be very difficult to control all variables that affect slow people.


> I like the simple point about the hypothetical IQ test sent one week in advance.

It’s a simple point but an incorrect one.

If you can work on it for a week, it’s no longer an IQ test. Nobody is saying that the questions on an IQ test are impossible. It’s the fact that there are constraints (time) and that everybody takes the test the same way that makes it an IQ test. Otherwise it’s just a little sheet of kinda tricky puzzles.

Would you be a better basketball player if everyone else had to heave from 3/4 court but you could shoot layups? No, you’d be playing by different rules in an essentially different game. You might have more impressive stats but you wouldn’t be better.


> Would you be a better basketball player if everyone else had to heave from 3/4 court but you could shoot layups? No, you’d be playing by different rules in an essentially different game. You might have more impressive stats but you wouldn’t be better.

I think the correct analogy here is that if everyone had to shoot from 3/4 court, you would likely end up with a different set of superstars than the set of superstars you get when dunking is allowed.

In other words, if the IQ test were much much harder, but you had a month to do it, you might find that the set of people who do well is different than who does well on the 1 hour test. Those people may be better suited to pursuing really hard open ended long term problems.


No, I don’t think that is the correct analogy. The analogy in the blog post is that you (one person) get a month's head start on the test. You would look like a genius because you’d outscore everyone else who had the time constraint.

Yes, if you play a different game you’ll find different high performers. That is obvious. But it is not what the blog post is saying. It is saying if you let one person play the same game but by different rules, they will look better.


This is the passage you're citing:

> Consider this: if you get access to an IQ test weeks in advance, you could slowly work through all the problems and memorize the solutions. The test would then score you as a genius. This reveals what IQ tests actually measure. It’s not whether you can solve problems, but how fast you solve them.

You retort that "if you can work on it for a week, then it's no longer an IQ test", but that retort is one that the author would agree with. The author is simply making the argument that, what IQ measures is not necessarily the same kind of intelligence as what is necessary for success in the real world. He's not actually arguing that people should be allowed to take as long as they want on the test, he's simply using that hypothetical to illustrate "what IQ tests actually measure".


> This reveals what IQ tests actually measure. It’s not whether you can solve problems, but how fast you solve them.

Who is out here arguing that IQ tests only measure whether you can solve puzzles or not?

> You retort that "if you can work on it for a week, then it's no longer an IQ test", but that retort is one that the author would agree with.

Well it would be unreasonable to disagree with because it is less a retort than a simple fact.


Yeah, I should clarify - I also don't think the article made the correct analogy. But I more meant that I think the different-game-gets-different-winners-analogy should have been how the article tried to make the point the author ultimately intended.

Counterpoint to consider: In real life, you can just play a different game. Most people will choose to shoot from 3/4 court instead of running all the way to the other end, because they’re not interested in basketball.

Most people aren’t interested enough to work 100+ hours per week. But we wouldn’t say Elon isn’t better at work ”because he doesn’t even work a 40-hour work week”

It has a lot to do with interest. Michael Jordan isn’t a world class mathematician. Elon isn’t a world class father.


> Most people will choose to shoot from 3/4 court instead of running all the way to the other end, because they’re not interested in basketball.

I have never once in my life seen anyone do anything close to this. Have you?


> Implicit in the design of most tests is the idea that a person's ability to quickly solve moderately difficult problems implies a proportional ability to solve very difficult problems if given more time. This is clearly jumping to a conclusion. I doubt there is any credible evidence to support this.

I think this approach is effectively testing if a student studied the material. It draws a correlation between memorization and understanding. Recalling a piece of information is fast if it's available.

It's a commonly expressed experience among university students that learning memorization techniques and focusing on solving previous exams is a disproportionately effective way to pass courses.

It's technically more impressive to pass the exam having never done a single similar problem before and deriving a solving method or formula that wasn't memorised.

I made a deliberate effort to avoid looking at previous exam questions for a course until the week before, since that shortcut produced good grades of little long-term value to me.


> I suspect there may be some correlation between High IQ, fast thinking, fast learning and suggestibility (meaning insufficient scrutiny of learned information). What if fast learning comes at the expense of scrutiny? What if fast thinking is tested for as a proxy for fast learning?

Precisely. Speaking from experience, in school, every claim that I was supposed to accept and reproduce on an exam or in homework was met with a gut response: "Is this really true? If so, why? How do you know?". I wanted to verify the information and know the justification for believing it, the reason something was true. What's more, I had trouble with the coherence of the claims being made. The physics we are taught in school, for example, raises very serious metaphysical questions. This produced in me a spirit of rebellion. I felt a certain vague disgust for the way things were taught that frustrated my motivation. In some sense, it didn't feel like truth was being treated seriously. The ceremony of education, with all its trappings, is all that was treated seriously. "Getting the grade", not understanding something, was what it was all about. It felt like an acrobatics contest and a game of one-upmanship.

Now, sometimes, the justification for a claim was obvious, at least given certain premises (these are often left tacitly assumed, often implied: the danger), but that's not always or perhaps even usually the case. Even in math, a science that can be done from the armchair, we are given formulas and methods that are supposed to be taken on faith and simply used. Through repetition, we are supposed to become better at identifying situations in which we can apply them. But where do these formulas and methods come from? What do they tell us, and how do we know?

And I emphasize "faith": there is no way the valedictorian has verified everything he or she was taught or knows the justification for it. A "good student" keeps up, and since scrutiny and analysis take time and skill - time no student is given, especially as the workload piles up, and skill no student possesses - a good student is a faithful student, one who obediently accepts what he or she is being told. You can imagine that blind faith would produce the "perfect student". (Curiously, we are simultaneously commanded to "question everything" - except questioning everything, of course - but then required not to actually practice that advice.)

Now, you could argue that students are too young to understand the justifications for the claims being made, and in practice, we are always relying on faith in some authority. Few people realize how much faith we rely on in our lives. Society entails a certain epistemic deference, even if merely practical or perfunctory. In practice, it is unrealistic not to rely on faith. Faith has its proper place.

Someone might also say that students could be bracketing the information they are receiving. They may simply be entertaining it as a possibility in good faith and playing with it, until verification becomes possible or necessary. Maybe. But given the intellectual immaturity of students, and the obedience at the top, I suspect there is at least a superficial assent given to what they are taught. Otherwise, school is a game to be played, one that, we are told, is an instrument for climbing the ladder of social status. The content doesn't matter. What matters is that you play by the rules of the game and that you play by them well. When you do that, the kingmakers and status granters will throw you a few golden chips and elevate you in the eyes of society. You will be in.

Sounds cynical. After all, wouldn't an institution that wants to select for wisdom also create barriers? Of course, regardless of how effective they are. But the differences cannot be ignored. The intent and purpose are different, for one. The means of selection are another. Education is bureaucratized. We think that bureaucracy will create a "level playing field", eliminating the biases and favoritism that "personal judgement" is bound to entail. But who designs the bureaucracy? What does it actually select for? And does it not often commit the fallacy of confusing features of the method for features of the real?

We're obsessed with rank, and bureaucratic methods make us obsessively so. We imagine there is a sharper slope and a smaller peak than there really is. There is a slope, to be sure. I am no egalitarian. But come on.

Anyway, for all that rambling, what are some of the morals here?

I suppose my first point is that education ought to be focused on first principles first. It ought to be focused on understanding and truth and learning the competence of being able to get there, as that is the whole point. The trivium and quadrivium of old did this. People think of the Middle Ages as some kind of period hostile to education. They think it was like the Prussian-style of education (from which modern education gets a lot of its ideas), oriented toward mindless obedience and unquestioned submission to the state. Nothing could be further from the truth. Universities were renowned for open discussion and debate, perhaps most famously in the form of the disputation. The Scholastics were famous for intellectual rigor, a rigor that puts to shame the pompous pretensions of the so-called Enlightenment that never missed the opportunity to erect straw men of the Medievals to ridicule.

Second, rewards and penalties are selection mechanisms. We get the behavior we reward and we get less of the kind we punish. Habits are like this, too: indulging a habit of overeating reinforces and magnifies the habit, while restraint has the opposite effect. What does our education system feed? What does it starve? We should ask this question ceaselessly.


My life experience and resulting sentiment is extremely similar to yours.

I feel that the education system is deeply flawed and rewards all the wrong things because we refuse to select based on real factors because of political ideology. I think those that are successful become so in spite of it, not because of it. When you look at the biographies of people who truly pushed the envelope and changed the world, it becomes evident.

We need to ask if the costs of the education system are really worth the rewards. Considering how large that cost has become nowadays, my preliminary answer would be no. And I feel that the shift to a rent-seeking economy, as well as reduced innovation and iteration speed, is deeply linked to that. Most of the recent growth came from IT, a field that was notorious for being full of dropouts. That should tell you something. Now that the field has been inundated with college graduates, it has shifted to fully extractive behavior.

Any pushback against the system is met with suspicion because most people feel they should have a shot at making it big, because they are worth it. In practice, it seems that the inequalities never disappear anyway, and people just have to pay more upfront in order to try to prove themselves. In the long run, it mostly ends up exactly as it started and society just pays a dear cost for what is basically unproductive behavior.

Your behavior remark is quite on the nose, because from my point of view this is exactly how tyrannies are created. If you get rewarded too much for simply being obedient to the authority in place, over time any other strategy gets penalized disproportionately and you end up with a bunch of sycophants who will never push back against the order, no matter how bad the decisions/rules get.


Yep, I fully agree with this view and I find that it's seniors who ask the 'dumb' questions. Everyone is worried about losing face in this precarious economy... But seniors are able to ask really smart questions as well so even their dumb questions sound smart... They can usually spin dumb questions into smart questions by going one level deeper and bringing nuance into the discussion. This may be difficult for a junior to do.

My experience as a team lead working with a lot of juniors is that they are terrified of losing face and tend to talk a big game. As a team lead, I try to use language which expresses any doubts or knowledge gaps I have so that others in my team feel comfortable doing it as well. But a key aspect is that you have to really know your stuff in certain areas because you need to inspire others to mirror you... They won't try to mirror you if they don't respect you, based on your technical ability.

You need to demonstrate deep knowledge in some areas and need to demonstrate excellent reasoning abilities before you can safely ask dumb questions IMO. I try to find the specific strengths and weaknesses of my team members. I give constructive criticism for weaknesses but always try to identify and acknowledge each person's unique superpower; what makes them really stand out within the team. If people feel secure in their 'superpower', then they can be vulnerable in other areas and still feel confident. It's important to correctly identify the 'superpower' though because you don't want a Junior honing a skill that they don't naturally possess or you don't want them to be calling shots when they should be asking for help.


    My experience as a team lead working with a lot of juniors is that they are terrified of losing face
So much this! Both from my experience as a Junior very many years ago and also with Juniors (and not so Juniors) today.

    tend to talk a big game
Very big game. Claude does too. The kind of BS it spews in very confident language is amazing.

    As a team lead, I try to use language which expresses any doubts or knowledge gaps I have so that others in my team feel comfortable doing it as well
Agree. I also often literally say "Dumb idea: X" to try and suss out areas that may have been left by the wayside and under-explored or where assumptions have been made without verifying them. It's amazing how often even "Seniors"+ will spew assumptions as fact without verification. It's very annoying actually.

    superpower
How do you actually do this tho? I would love to do this but it seems hard to find an actual "superpower". Like where does "super" power start vs. "yeah they're better at this than others but definitely not as good as me or 'person X' that definitely does have that superpower"? Like when can you start encouraging, so to speak?

The fact that you mentioned Claude (LLMs) is interesting! I definitely feel like there is a parallel; maybe because AI sometimes hallucinates, people have built up a tolerance for this kind of speculative use of language from people as well.

About finding the superpower of each team member; after working with someone for a few months, I start to notice certain patterns in how they think. Sometimes there might be something they say once or a question they ask which makes me view them more favorably than before. Maybe they're fast, good at assembling stuff or slow but good at building stable core components. Maybe they're similar to me or maybe they have a skill I don't have or a certain interest/focus or way to approach problems that is different from me but is beneficial to the project and/or team.

It's a bit like playing a weird game of chess where you can't see the pieces properly at the beginning so everyone looks like a pawn initially; But then over time you discover that one person is a knight, another is a bishop, another is a queen... And you adapt your strategy as your visibility improves.


Other possibility; a disgruntled investor who poured millions into dead-end fusion research and now wishes they had invested in AI research instead? Blames the professor for persuading them to invest in fusion.

It's a tough one to find a motive for...


Can you cite one other example of a disgruntled investor who has killed an American academic over the last 50 years?

They normally just have their friend the DA lock them up for "fraud"

Yes. I feel like people who are trying to push software verification have never worked on typical real-world software projects where the spec is like 100 pages long and still doesn't fully cover all the requirements and you still have to read between the lines and then requirements keep changing mid-way through the project... Implementing software to meet the spec takes a very long time and then you have to invest a lot of effort and deep thought to ensure that what you've produced fits within the spec so that the stakeholder will be satisfied. You need to be a mind-reader.

It's hard even for a human who understands the full business, social and political context to disambiguate the meaning and intent of the spec; to try to express it mathematically would be an absolute nightmare... and extremely unwise. You would literally need some kind of super intelligence... And the amount of stream-of-thought tokens which would have to be generated to arrive at a correct, consistent, unambiguous formal spec is probably going to cost more than just hiring top software engineers to build the thing with 100% test coverage of all main cases and edge cases.

Worst part is; after you do all the expensive work of formal verification; you end up proving the 'correctness' of a solution that the client doesn't want.

The refactoring required will invalidate the entire proof from the beginning. We haven't even figured out the optimal way to formally architect software that is resilient to requirement changes; in fact, the industry is REALLY BAD at this. Almost nobody is even thinking about it. I am, but I sometimes feel like I may be the only person in the world who cares about designing optimal architectures to minimize line count and refactoring diff size. We'd have to solve this problem first before we even think about formal verification of 'most software'.

Without a hypothetical super-intelligence which understands everything about the world; the risk of misinterpreting any given 'typical' requirement is almost 100%... And once we have such super-intelligence, we won't need formal verification because the super-intelligence will be able to code perfectly on the first attempt; no need to verify.

And then there's the fact that most software can tolerate bugs... If operationally important big tech software which literally has millions of concurrent users can tolerate bugs, then most software can tolerate bugs.


Software verification has gotten some use for smart contracts. The code is fairly simple, it's certain to be attacked by sophisticated hackers who know the source, and the consequence of failure is theft of funds, possibly in large amounts. 100% test coverage is no guarantee that an attack can't be found.

People spend gobs of money on human security auditors who don't necessarily catch everything either, so verification easily fits in the budget. And once deployed, the code can't be changed.

Verification has also been used in embedded safety-critical code.


If the requirements you have to satisfy arise out of a fixed, deterministic contract (as opposed to a human being), I can see how that's possible in this case.

I think the root problem may be that most software has to adapt to a constantly changing reality. There aren't many businesses which can stay afloat without ever changing anything.


The whole perspective of this argument is hard for me to grasp. I don't think anyone is suggesting that formal specs are an alternative to code, they are just an alternative to informal specs. And actually with AI the new spin is that they aren't even a mutually exclusive alternative.

A bidirectional bridge that spans multiple representations from informal spec to semiformal spec to code seems ideal. You change the most relevant layer that you're interested in and then see updates propagating semi-automatically to other layers. I'd say the jury is out on whether this uses extra tokens or saves them, but a few things we do know. Chain of code works better than chain of thought, and chain-of-spec seems like a simple generalization. Markdown-based planning and task-tracking agent workflows work better than just YOLOing one-shot changes everywhere, and so intermediate representations are useful.

It seems to me that you can't actually get rid of specs, right? So to shoot down the idea of productive cooperation between formal methods and LLM-style AI, one really must successfully argue that informal specs are inherently better than formal ones. Or even stronger: having only informal specs is better than having informal+formal.


> A bidirectional bridge that spans multiple representations from informal spec

Amusingly, what I'm hearing is literally "I have a bridge to sell you."


There's always a bridge, dude. The only question is whether you want to buy one that's described as "a pretty good one, not too old, sold as is" or if you'd maybe prefer "spans X, holds Y, money back guarantee".

I get it. Sometimes complexity is justified. I just don't feel this particular bridge is justified for 'mainstream software' which is what the article is about.

I agree that trying to produce this sort of spec for the entire project is probably a fool's errand, but I still see the value for critical components of the system. Formally verifying the correctness of balance calculation from a ledger, or that database writes are always persisted to the write ahead log, for example.

I used to work adjacent to a team who worked from closely-defined specs for web sites, and it used to infuriate the living hell out of me. The specs had all sorts of horrible UI choices and bugs and stuff that just plain wouldn't work when coded. I tried my best to get them to implement the intent of the spec, not the actual spec, but they had been trained in one method only and would not deviate at any cost.

Yeah, IMO, the spec almost always needs refinement. I've worked for some companies where they tried to write specs with precision down to every word, but what happened was: if the spec was too detailed, it usually had to be adjusted later once it started to conflict with reality (efficiency, costs, security/access restrictions, resource limits, AI limitations)... If it wasn't detailed enough, then we had to read between the lines and fill in a lot of gaps... And usually had to iterate with the stakeholder to get it right.

At most other companies, it's like the stakeholder doesn't even know what they want until they start seeing things on a screen... Trying to write a formal spec when literally nobody in the universe even knows what is required; that's physically impossible.

In my view, 'Correct code' means code that does what the client needs it to do. This is downstream from it doing what the client thinks they want; which is itself downstream from it doing what the client asked for. Reminds me of this meme: https://www.reddit.com/r/funny/comments/105v2h/what_the_cust...

Software engineers don't get nearly enough credit for how difficult their job is.


How do you or the client know that the software is doing what they want?

What formal verification system did they use? Did they even execute it?

I've been tech lead at different companies. Every time I switched companies, I started out as senior dev and got promoted into the team lead role again; each time with full support of my team.

I don't look or act like a leader and this has been a hurdle for me. But what typically happens anyway is; within a few months, my code ends up being a core part of the project; my modules become heavily depended upon and somehow I end up maintaining all the config files and guiding architecture decisions. One of my team members joked that I "conquered everyone's code." I probably write fewer lines of code than everyone else but somehow those lines end up heavily used. So then I basically just ask the big boss for a team lead position.


While I am not a software developer, it sounds like our career paths have had the same trajectory, and I'm wondering what the common factor is across industries.

I work in automation (mostly) as a lead tech and professional troubleshooter because I am familiar with a wide and varied amount of automation technologies. I've met plenty of people over the years who have much more advanced skills than myself, but never go beyond doing more than parts swapping on a workbench, which leaves me scratching my head.

Over the last few years, I have listened carefully to what people around me say about my work, and while it is good gas for the ego, I have noticed that's not the likely reason I get promoted so quickly. While I can walk into a problem and know how to apply different processes to figure out what to do almost reflexively at this point, the real focus seems to be that I take ownership of the process.

Bit of a buzzphrase, "ownership of the process," but the short explanation is that a little planning, accountability, resourcefulness and communication seems to get you a lot further than just knowing what to do in any given situation. Employers like that because they now have a department manager they can rely on, and team members like that because someone else is taking responsibility so they don't have to.

You're good at code, obviously, but if you zoom out on your work a bit, are you also bringing a bit of accountable authority to the table? That may be the real reason why you move up so quickly, or at least something that greases the gears so that can happen faster for you than, say, an equally skilled colleague.


I wonder how much empathy plays into it? I’ve trended towards tech lead roles, and not that I feel my code is necessarily special; I think it’s clean enough and concise at times - occasionally some ASCII art slips in of course…

But what I have noticed across multiple companies is that I offer feedback for reviews that is thorough when it counts - I may pick apart a huge PR with lots of notes and suggestions, and that’s because the change has the potential to impact large systems. I also explain why, and I ask questions of the engineer to make sure I know why they did a thing because maybe I missed something.

I also talk to managers, product, and design, and do a lot of listening. Often times people are working towards a goal, or working against some barrier to getting their goals met, and being able to listen and understand them lends to a lot of credibility. And when you do a lot of that listening, you inherently gain a good amount of understanding of the systems - both technical and human.

When the time comes for someone to say “we should have a new lead on X”, the people that listen and engage tend to rise to that position naturally.

I think accountability and empathy go hand in hand to some degree - by owning something, you’re also saying “I understand how my work might impact other things, other people, and I want to reassure you that I have your back”.


I don't understand why that is a logical progression. Writing good code and leading a team needs vastly different skill-sets in my eyes.

On the other hand, if you want to lead a bunch of engineers, you should know their work very well, otherwise you will have unrealistic ideas about what can get done and how they should do it.

Is this really true that you need managers to understand the domain deeply to manage? You could argue that a manager should have domain experts they can trust in their team to get estimates about time and complexity.

Indeed. Often mutually exclusive.

To really build great software, you need time and space to get your head around the problem. This is obviously not possible if you're spending most of your week in meetings and tracking the work of 7 or 8 team members.

In my experience, you can be a great IC *or* a great tech lead, but you cannot be both at the same time.


A person might very well be suited for both IC or team lead, there is no conflict in the skills themselves. But as you say, actually trying to apply them both simultaneously in practice could lead to conflicts.

Might work for some projects. For really complex projects (which is what I typically work on), strong engineering helps and so a lead-by-example approach tends to work out and it helps to motivate the team.

Also, I can give my team members detailed feedback. I think they take my constructive criticisms more seriously if I also give them praise which is detailed and accurate. You need a hands-on technical person to really understand people's strengths and weaknesses in a concrete way. Being hands-on also makes you more approachable; people will tell you things that they wouldn't tell a non-technical manager.


Being a good leader is partially out of your control. The people under you need to respect you as a leader. Working with them and showing your technical skills can gain their respect

I was thinking the same thing. Sounds more like staff engineer not team lead/mgmt?

Team lead != Tech lead

It is a good way to burn out high performers haha.

Honestly I’ve seen plenty of folks get promoted to “team lead” because they aren’t as productive with the actual coding. Someone needs to focus on the non-technical project tasks, so the boss picks the least productive team member to move to that role. Calling it a “team lead” makes it more appealing than calling it “worst coder”.

I would argue that the two skills are necessary but not sufficient. If you’re lacking in the core skill, what exactly are you leading? If you’re a great coder and socially inept, good luck leading.

As the other commenters stated, it is not clear what the point is. Writing good code and being a tech leader are very different positions with very different technical skills. I was a tech lead in a few cases (different companies or different departments of a very large company) and I was not the top developer there, my job was not to be one. I worked with developers that were much better than me that were not a good fit for a tech leader.

I took it as good, central contributions are one path to respect and trust.

Interesting. I've witnessed this happening in the past when someone took over all the boring configuration part of the software. It's so essential to the whole thing that you just end up with responsibilities and trust really quickly.

I don't understand the point of your comment. Why did you switch companies?

I am assuming that the point is, when you start in your team's shoes and then get promoted to team lead, your team knows you are capable and that you have well reasoned opinions. Hopefully.

Because that's how Silicon Valley works?

It sounds like he is implying something, as in everyone switches jobs, but he specifically points out he changes jobs after being promoted, which I assume he did for a reason. But I can't tell if it's because he was unhappy, or for what reason. Otherwise why would he specifically point that out?

If you’ve been in the game a few decades, you’ll have changed jobs for all sorts of reasons. Mostly not relevant.

I can't believe no one else found that statement confusing

>> I don't look or act like a leader

In what way?


This is a very compelling SaaS license.

My advice is: Avoid Blanket Statements About Any Technology.

I'm tired of midwit arguments like "Tech X is N% faster than tech Y at performing operation Z. Since your system (sometimes) performs operation Z, it implies that Tech X is the only logical choice in all situations!"

It's an infuriatingly silly argument because operation Z may only represent about 10% of the total CPU usage of the whole system (averaged out)... So what is promoted as a 50% gain may in fact be a 5% gain when you consider it in the grand scheme of things... Negligible. If everyone was looking at this performance 'advantage' rationally; nobody would think it's worth sacrificing important security or operational properties.
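
Rough back-of-the-envelope version of that, using the made-up numbers from the example above:

    // Numbers are assumptions from the example, not measurements.
    const zShare = 0.10;   // fraction of total CPU time spent in operation Z
    const zSpeedup = 2.0;  // "50% gain" read as Z now taking half the time
    const newTotal = (1 - zShare) + zShare / zSpeedup;               // 0.95 of the old runtime
    console.log(`overall: ${((1 - newTotal) * 100).toFixed(1)}% faster`); // ~5%, not 50%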

I don't know what happened to our industry; we're supposed to be intelligent people but I see developers falling for these obvious logical fallacies over and over.

I remember back in my day, one of the senior engineers was discussing upgrading a Python system and stated openly that the new version of the engine was something like 40% slower than the old version, but he didn't even have to explain why upgrading was still a good decision; everybody in the company knew he was only talking about the code execution speed and everybody knew that this was a small fraction of the total.

Not saying UUIDv7 was a bad choice for Postgres. I'm sure it's fine for a lot of situations but you don't have to start a cult preaching the gospel of The One True UUID to justify your favorite project's decisions.

I do find it kind of sly though how the community decided to make this UUIDv7 instead of creating a new standard for it.

The whole point of UUID was to leverage the properties of randomness to generate unique IDs without requiring coordination. UUIDv7 seems to take things in a philosophically different path. People chose UUID for scalability and simplicity (both of which you get as a result of doing away with the coordination overhead), not for raw performance...

That's the other thing which drives me nuts; people who don't understand the difference between performance and scalability. People foolishly equate scalability with parallelism or concurrency; whereas that's just one aspect of it; scalability is a much broader topic. It's the difference between a theoretical system which is fast given a certain artificially small input size and one which actually performs better as the input size grows.

Lastly, no mention is made of the complex logic which has to take place behind the scenes to generate UUIDv7 IDs... People take it for granted that all computers have a clock which can produce accurate timestamps and that all computers in the world are magically in sync... UUIDv7 is not simple; it's very complicated. It has a lot of additional complexity and dependencies compared to UUIDv4. Just because that complexity is very well hidden from most developers doesn't mean it's not there and that it's not a dependency... This may become especially obvious as we move to a world of robotics and embedded systems where cheap microchips may not have enough Flash memory to hold the code for the kinds of programs required to compute such elaborate IDs.
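
For illustration, a simplified sketch of what a UUIDv7 generator has to do (real implementations also handle monotonicity within the same millisecond and clock regressions); contrast with v4, which is just random bits and needs no clock at all:

    import { randomBytes } from 'node:crypto';

    function uuidv7Sketch(): string {
      const bytes = randomBytes(16);                 // start from 128 random bits
      const ts = BigInt(Date.now());                 // 48-bit Unix timestamp in milliseconds
      for (let i = 0; i < 6; i++) {
        // place the timestamp big-endian in the first 6 bytes
        bytes[i] = Number((ts >> BigInt(8 * (5 - i))) & 0xffn);
      }
      bytes[6] = (bytes[6] & 0x0f) | 0x70;           // version nibble = 7
      bytes[8] = (bytes[8] & 0x3f) | 0x80;           // RFC variant bits
      const hex = bytes.toString('hex');
      return [hex.slice(0, 8), hex.slice(8, 12), hex.slice(12, 16),
              hex.slice(16, 20), hex.slice(20)].join('-');
    }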


Yep. We have tables that use UUIDv4 that have 60M+ rows and don't have any performance problems with them. Would some queries be faster using something else? Probably, but again, for us it's not close to being a bottleneck. If it becomes a problem at 600M or 6B rows, we'll deal with it then. We'll probably switch to UUIDv7 at some point, but it's not a priority and we'll do some tests on our data first. Does my experience mean you should use UUIDv4? No. Understand your own system and evaluate how the tradeoffs apply to you.

I have tables that have billions of rows that use UUIDv4 primary keys and I haven't encountered any issues either. I do use UUIDv7 for write-heavy tables, but even then, I got a way bigger performance boost from batching inserts than switching from UUIDv4 to UUIDv7. Issue is way overblown.

Nice feedback. Out of curiosity, have you made any fine-tuning to psql that greatly improved performance?

Nope. Out of the box GCP Cloud SQL instance.

Wasn't choosing uuids as ids falling for the deceptive argument in the first place?

Not really, no. They’re very convenient for certain problems and work really well in general. I’ve never had a performance issue where the problem boiled down to my use of UUID.

You never having seen the problem doesn't mean it never happens; I have dealt with a serious performance problem in the past that was due to excessive page fragmentation due to a GUID PK.

To your original point, these are heuristics; there isn't always time to dig into every little architectural decision, so having a set of rules of thumb on hand helps to preempt problems at minimal cognitive cost. "Avoid using a GUID as a primary key if you can" is one of mine.


What are these certain problems, if I may ask?

A major one for me is preventing duplicate records.

If the client POSTs a new object to insert it into the database; if there is a connection failure and the client does not receive a success response from the server, the client cannot know whether the record was inserted or not without making an expensive and cumbersome additional read call to check... The client cannot simply assume that the insertion did not happen purely on the basis that they did not receive a success response. It could very well be that the insertion succeeded but the connection failed shortly after so response was not received. If the IDs are auto-incremented on the server and the client posts the same object again without any ID on it, the server will create a duplicate record in the database table (same object with a different ID).

On the other hand, if the client generates a UUID for the object it wants to create on the front-end, then it can safely resend that exact object any number of times and there is no risk of double-insertion; the object will be rejected the second time and you can show the user a meaningful error "Record was already created" instead of creating two of the same resource; leading to potential bugs and confusion.
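
Rough sketch of that retry pattern (the endpoint and payload are made up): the client mints the ID once, so resending after a dropped response can never create a second record.

    // The ID is minted once on the client; every retry resends the same object.
    const order = { id: crypto.randomUUID(), item: 'widget', qty: 2 }; // made-up payload

    async function createOrder(): Promise<void> {
      for (let attempt = 0; attempt < 3; attempt++) {
        try {
          const res = await fetch('/api/orders', {             // made-up endpoint
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify(order),
          });
          if (res.ok || res.status === 409) return;            // 409: record already exists
        } catch {
          // Network failure: the response may have been lost after a successful
          // insert, so it's safe to retry with the same UUID.
        }
      }
      throw new Error('could not confirm order creation');
    }

On the server side, the client-supplied ID can be the primary key, and a unique-key violation maps to the "record was already created" response.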


Ehm.. so you're saying that INSERT ... RETURNING id is not atomic from the client's pov because something terrible could happen just when client is receiving the answer inside its SQL driver?

I'm actually more thinking about the client sitting on the front-end like a single page app. Network instability could cause the response to not reach the front-end after a successful insert. This wouldn't be extremely common but would definitely be a problem for you as the database admin if you have above a certain number of users. I've seen this issue on live production systems and the root cause of duplicate records can be baffling because of how infrequently it may happen. Tends to cause issues that are hard to debug.

OK got it. I was thinking about SQL client, not about client of a REST service. With that distinction in mind, the reasoning makes sense; thank you.

Preferably, you would design your APIs and services to be idempotent (i.e. use PUT not POST, etc.)

Using an idempotency identifier is the last resort in my book.


Still, UUID is probably the simplest and most reliable way to generate such idempotency identifiers.

How is URI not an idempotency identifier for PUT?
