More

Rperry2174 · 2026-02-05T20:01:26 1770321686

Whats interesting to me is that these gpt-5.3 and opus-4.6 are diverging philosophically and really in the same way that actual engineers and orgs have diverged philosophically

With Codex (5.3), the framing is an interactive collaborator: you steer it mid-execution, stay in the loop, course-correct as it works.

With Opus 4.6, the emphasis is the opposite: a more autonomous, agentic, thoughtful system that plans deeply, runs longer, and asks less of the human.

that feels like a reflection of a real split in how people think llm-based coding should work...

some want tight human-in-the-loop control and others want to delegate whole chunks of work and review the result

Interested to see if we eventually see models optimize for those two philosophies and 3rd, 4th, 5th philosophies that will emerge in the coming years.

Maybe it will be less about benchmarks and more about different ideas of what working-with-ai means

karmasimida · 2026-02-05T20:07:50 1770322070

> With Codex (5.3), the framing is an interactive collaborator: you steer it mid-execution, stay in the loop, course-correct as it works.

> With Opus 4.6, the emphasis is the opposite: a more autonomous, agentic, thoughtful system that plans deeply, runs longer, and asks less of the human.

Ain't the UX is the exact opposite? Codex thinks much longer before gives you back the answer.

xd1936 · 2026-02-05T20:16:07 1770322567

I've also had the exact opposite experience with tone. Claude Code wants to build with me, and Codex wants to go off on its own for a while before returning with opinions.

mrkstu · 2026-02-05T20:37:37 1770323857

Its likely that both are steering towards the middle from their current relative extremes and converging to nearly the same place.

gervwyk · 2026-02-05T21:03:20 1770325400

also my experience in using these two models. they are trying to recover from oversteer perhaps.

mercnz · 2026-02-06T04:02:39 1770350559

well with the recent delays i can easily find claude code going off on it's own for 20 minutes and have no idea what it's going to come back with. but one time it overflowed it's context on a simple question, and then used up the rest of my session window. in a way a lot of ai assistants have ime have this awkward thing where they complicate something in a non-visible and think about it for a long time burning up context before coming up with a summary based upon some misconception.

esperent · 2026-02-06T05:07:57 1770354477

The key is a well defined task with strong guardrails. You can add these to your agents file over time or you can probably just find someone's online to copy the basics from. Any time you find it doing something you didn't expect or don't like, add guardrails to prevent that in future. Claude hooks are also useful here, along with the hookify plugin to create them for you based on the current conversation.

vorticalbox · 2026-02-06T07:54:07 1770364447

I have started using openspec for this. I find it works far better to have a proposal and a list of tasks the ai stays more focused.

https://openspec.dev/

zen4ttitude · 2026-02-07T07:53:18 1770450798

For complex tasks I ask ChatGPT or Grok to define context then I take it to Claude for accurate execution. I also created a complete pipeline to use locally and enrich with skills, agents, RAG, profiles. It is slower but very good. There is no magic, the richer the context window the more precise and contained the execution.

PeterStuer · 2026-02-06T17:00:46 1770397246

In terms of 'tone', I have been very impressed with Qwen-code-next over the last 2 days, especially as I have it running locally on a single modest 4090.

turtle4 · 2026-02-06T19:19:07 1770405547

Did you set that up following a guide or anything you could share?

PeterStuer · 2026-02-06T20:36:46 1770410206

Easiest way I know is to just use LMStudio. Just download and press play :). Optional, but recommended, increase the context length to 262144 if you have the DRAM available. It will definitely get slower as your interaction prolongs, but (at least for me) still tolerable speed.

mathrawka · 2026-02-06T20:00:45 1770408045

not OP, but I got it running on my 4090 (and RAM) by following this guide: https://unsloth.ai/docs/models/qwen3-coder-next

I see around 30 t/s

kamban · 2026-02-06T13:49:21 1770385761

Same here, CC gives me options to pick direction after the planning stage.

WilcoKruijer · 2026-02-05T21:11:33 1770325893

Yes, you’re right for 4.5 and 5.2. Hence they’re focusing on improving the opposite thing and thus are actually converging.

cwyers · 2026-02-05T22:44:12 1770331452

Codex now lets you tell the LLM tgings in the middle of its thinking without interrupting it, so you can read the thinking traces and tell it to change course if it's going off track.

fluidcruft · 2026-02-05T23:33:50 1770334430

That just seems like a UI difference. I've always interrupted claude code added a comment and it's continued without much issue. Otherwise if you just type the message is queued for next. There's no real reason to prefer one over the other except it sounds like codex can't queue messages?

int_19h · 2026-02-06T09:34:01 1770370441

Codex can queue messages, but the queue only gets flushed once the agent is done with whatever it was working on, whereas Claude will read messages and adjust accordingly in the middle of whatever it is doing. It sounds like OP is saying that Codex can now do this latter bit as well.

esperent · 2026-02-06T05:09:46 1770354586

The problem is if you're using subagents, the only way to interject is often to press escape multiple times which kills all the running subagents. All I wanted to do was add a minor steering guideline.

This might be better with the new teams feature.

Skwrm · 2026-02-06T15:11:31 1770390691

They actually made a change a few weeks ago that made subagents more steerable

When they ask approval for a tool call, press down til the selector is on "No" and press tab, then you can add any extra instructions

cruffle_duffle · 2026-02-06T05:32:31 1770355951

That is so annoying too because it basically throws away all the work the subagent did.

Another thing that annoys me is the subagents never output durable findings unless you explicitly tell their parent to prompt the subagent to “write their output to a file for later reuse” (or something like that anyway)

I have no idea how but there needs to be ways to backtrack on context while somehow also maintaining the “future context”…

bt1a · 2026-02-05T22:04:55 1770329095

This is most likely an inference serving problem in terms of capacity and latency given that Opus X and the latest GPT models available in the API have always responded quickly and slowly, respectively

ghosty141 · 2026-02-05T20:28:45 1770323325

I'm personally 100% convinced (assuming prices stay reasonable) that the Codex approach is here to stay.

Having a human in the loop eliminates all the problems that LLMs have and continously reviewing small'ish chunks of code works really well from my experience.

It saves so much time having Codex do all the plumbing so you can focus on the actual "core" part of a feature.

LLMs still (and I doubt that changes) can't think and generalize. If I tell Codex to implement 3 features he won't stop and find a general solution that unifies them unless explicitly told to. This makes it kinda pointless for the "full autonomy" approach since effecitly code quality and abstractions completely go down the drain over time. That's fine if it's just prototyping or "throwaway" scripts but for bigger codebases where longevity matters it's a dealbreaker.

_zoltan_ · 2026-02-05T21:24:41 1770326681

I'm personally 100% convinced of the opposite, that it's a waste of time to steer them. we know now that agentic loops can converge given the proper framing and self-reflectiveness tools.

sealeck · 2026-02-05T21:34:12 1770327252

Converge towards what though... I think the level of testing/verification you need to have an LLM output a non-trivial feature (e.g. Paxos/anything with concurrency, business logic that isn't just "fetch value from spreadsheet, add to another number and save to the database") is pretty high.

replygirl · 2026-02-05T23:22:11 1770333731

in the new world, engineers have to actually be good at capturing and interpreting requirements

sandbags · 2026-02-08T13:08:11 1770556091

But we’ve been here before. The agile movement originated as a response to the multifarious problems of big design up front.

halfcat · 2026-02-06T01:28:41 1770341321

In this new world, why stop there? It would be even better if engineers were also medical doctors and held multiple doctorate degrees in mathematics and physics and also were rockstar sales people.

NamlchakKhandro · 2026-02-06T02:19:33 1770344373

sounds like the kinds of hyperbole someone whose just been forced to set a linter for the first time

craigdalton · 2026-02-06T14:13:29 1770387209

As a doctor, this sounds like an engineers job.

zeroxfe · 2026-02-05T23:45:02 1770335102

> it's a waste of time to steer them

It's not a waste of time, it's a responsibility. All things need steering, even humans -- there's only so much precision that can be extrapolated from prompts, and as the tasks get bigger, small deviations can turn into very large mistakes.

There's a balance to strike between micro-management and no steering at all.

adw · 2026-02-06T04:29:50 1770352190

The prompt is decreasingly relevant. The verification environment you have is what actually matters.

freakynit · 2026-02-06T10:08:18 1770372498

I think this all comes down to information.

Most prompts we give are severely information-deficient. The reason LLMs can still produce acceptable results is because they compensate with their prior training and background knowledge.

The same applies to verification: it's fundamentally an information problem.

You see this exact dynamic when delegating work to humans. That's why good teams rely on extremely detailed specs. It's all a game of information.

adrianN · 2026-02-06T17:32:45 1770399165

Having prompts be information deficient is the whole point of LLMs. The only complete description of a typical programming problem is the final code or an equivalent formal specification.

freakynit · 2026-02-07T02:52:52 1770432772

Exactly the point. But, LLM's miss that human intuition part.

bcarv · 2026-02-05T23:01:38 1770332498

Does the AI agent know what your company is doing right now, what every coworker is working on, how they are doing it, and how your boss will change priorities next month without being told?

If it really knows better, then fire everyone and let the agent take charge. lol

hyldmo · 2026-02-05T23:07:24 1770332844

No, but Codex wouldn’t have asked you those questions either

bcarv · 2026-02-05T23:37:30 1770334650

For me, it still asks for confirmation at every decision when using plans. And when multiple unforeseen options appear, it asks again. I don’t think you’ve used Codex in a while.

hyldmo · 2026-02-06T10:59:16 1770375556

It asks you what your coworkers are working on and whether the thing you are working on are your boss’ number one priority?

fHr · 2026-02-06T02:09:44 1770343784

skill issue

IMTDb · 2026-02-06T00:39:45 1770338385

A significant portion of engineering time is now spent ensuring that yes, the LLM does know about all of that. This context can be surfaced through skills, MCP, connectors, RAG over your tools, etc. Companies are also starting to reshape their entire processes to ensure this information can be properly and accurately surfaced. Most are still far from completing that transformation, but progress tends to happen slowly, then all at once.

bcarv · 2026-02-06T14:08:54 1770386934

[flagged]

generallyjosh · 2026-02-07T01:14:13 1770426853

All we can do is try our best to look at the world with clear eyes, and think about where the industry's going over the next couple years

Not how we want things to be, but how they actually are and will be

I don't think AI for programming is a passing fad

jondwillis · 2026-02-06T15:11:48 1770390708

Who hurt you?

Also what are you even proposing/advocating for here?

This meta-state-of-company context is just as capturable as anything else with the right lines of questioning and spyware and UI/UX to elicit it.

halfcat · 2026-02-06T01:52:28 1770342748

> given the proper framing

This sounds like never. Most businesses are still shuffling paper and couldn’t give you the requirements for a CRUD app if their lives depended on it.

You’re right, in theory, but it’s like saying you could predict the future if you could just model the universe in perfect detail. But it’s not possible, even in theory.

If you can fully describe what you need to the degree ambiguity is removed, you’ve already built the thing.

If you can’t fully describe the thing, like some general “make more profit” or “lower costs”, you’re in paper clip maximizer territory.

jondwillis · 2026-02-06T15:15:35 1770390935

> If you can fully describe what you need to the degree ambiguity is removed, you’ve already built the thing.

Trying to get my company to realize this right now.

Probably the most efficient way to work, would be on a video call including the product person/stakeholder, designer, and me, the one responsible for the actual code, so that we can churn through the now incredibly fast and cheap implementation step together in pure alignment.

You could probably do it async but it’s so much faster to not have to keep waiting for one another.

rapind · 2026-02-06T01:47:30 1770342450

Maybe some day, but as a claude code user it makes enough pretty serious screw ups, even with a very clearly defined plan, that I review everything it produces.

You might be able to get away without the review step for a bit, but eventually (and not long) you will be bitten.

jaggederest · 2026-02-06T04:11:43 1770351103

I use that to feed back into my spec development and prompting and CI harnesses, not steering in real time.

Every mistake is a chance to fix the system so that mistake is less likely or impossible.

I rarely fix anything in real time - you review, see issues, fix them in the spec, reset the branch back to zero and try again. Generally, the spec is the part I develop interactively, and then set it loose to go crazy.

This feels, initially, incredibly painful. You're no longer developing software, you're doing therapy for robots. But it delivers enormous compounding gains, and you can use your agent to do significant parts of it for you.

Terretta · 2026-02-06T14:50:35 1770389435

> You're no longer developing software, you're doing therapy for robots.

Or, really, hacking in "learning", building your knowhow-base.

> But it delivers enormous compounding gains, and you can use your agent to do significant parts of it for you.

Strong yes to both, so strong that it's curious Claude Code, Codex, Claude Cowork, etc., don't yet bake in an explicit knowledge evolution agent curating and evolving their markdown knowledge base:

https://github.com/anthropics/knowledge-work-plugins

Unlikely to help with benchmarks. Very likely to improve utility ratings (as rated by outcome improvements over time) from teams using the tools together.

For those following along at home:

This is the return of the "expert system", now running on a generalized "expert system machine".

rapind · 2026-02-06T06:34:11 1770359651

I assumed you'd build such a massive set of rules (that claude often does not obey) that you'd eat up your context very quickly. I've actually removed all plugins / MCPs because they chewed up way too much context.

jaggederest · 2026-02-06T08:25:43 1770366343

It's as much about what to remove as what to add. Curation is the key. Skills also give you some levers to get the kind of context-sensitive instruction you need, though I haven't delved too deeply into them. My current total instruction set is around ~2500 tokens at the moment

vidarh · 2026-02-06T16:42:35 1770396155

Reviewing what it produces once it thinks it has met the acceptance criteria and the test suite passes is very different from wasting time babysitting every tiny change.

rapind · 2026-02-06T21:18:35 1770412715

True, and that's usually what I'm doing now, but to be honest I'm also giving all of it's code at least a cursory glance.

Some of the things it occasionally does:

- Ignores conventions (even when emphasized in the CLAUDE.md)

- Decides to just not implement tests if gets spins out on them too much (it tells you, but only as it happens and that scrolls by pretty quick)

- Writes badly performing code (N+1)

- Does more than you asked (in a bad way, changing UIs or adding cruft)

- Makes generally bad assumptions

I'm not trying to be overly negative, but in my experience to date, you still need to babysit it. I'm interested though in the idea of using multiple models to have them perform independent reviews to at least flag spots that could use human intervention / review.

vidarh · 2026-02-07T11:46:46 1770464806

Sure, but non of those things requires you to watch it work. They're all easy to pick up on when reviewing a finished change, which ideally should come after it's instructions have had it run linters, run sub agents that verify it has added tests, run sub agents doing a code review.

I don't want to waste my time reviewing a change the model can still significantly improve all by itself. My time costs far more than the models.

_zoltan_ · 2026-02-07T10:10:13 1770459013

then you're using it wrong, to be frank with you.

you give it tools so it can compile and run the code. then you give it more tools so it can decide between iterations if it got closer to the goal or not. let it evaluate itself. if it can't evaluate something, let it write tests and benchmark itself.

I guarantee that if the criteria is very well defined and benchmarkable, it will do the right thing in X iterations.

(I don't do UI development. I do end-to-end system performance on two very large code bases. my tests can be measured. the measure is very simply binary: better or not. it works.)

dostick · 2026-02-07T13:28:52 1770470932

That’s what oh-my-open-code does.

retinaros · 2026-02-06T22:01:08 1770415268

good luck.

_zoltan_ · 2026-02-07T10:08:21 1770458901

I've been working on very complex problems with this model and the results I have have surprised people over and over again.

xXSLAYERXx · 2026-02-06T15:36:13 1770392173

I've been using codex for one week and I have been the most productive I have ever been. Small prs, tight rules, I get almost exactly what I want. Things tend to go sideways when scope creeps into my request. But I just close the PR instead of fighting with the agent. In one week: 28 prs, 26 merged. Absolutely unreal.

vidarh · 2026-02-06T16:35:42 1770395742

I will personally never consider using an agent that can't be easily pushed toward working on its own for long periods (hours) at a time. It's a total waste of time for me to babysit the LLM.

NuclearPM · 2026-02-06T00:14:39 1770336879

> If I tell Codex to implement 3 features he won't stop and find a general solution that unifies them unless explicitly told to

That could easily be automated.

Skidaddle · 2026-02-06T05:01:03 1770354063

But tokens are way cheaper than human labor

sejje · 2026-02-06T14:44:06 1770389046

Aider was doing this a long time ago

utilize1808 · 2026-02-05T20:12:35 1770322355

I think it's the opposite. Especially considering Codex started out as a web app that offers very little interactivity: you are supposed to drop a request and let it run automatously in a containerized environment; you can then follow up on it via chat --- no interactive code editing.

Rperry2174 · 2026-02-05T20:17:21 1770322641

Fair I agree that was true of early codex and my perception too.. but today there are two announcements that came out and thats what im referring to.

specifically, the GPT-5.3 post explicitly leans into "interactive collaborator" langauge and steering mid execution

OpenAI post: "Much like a colleague, you can steer and interact with GPT-5.3-Codex while it’s working, without losing context."

OpenAI post: "Instead of waiting for a final output, you can interact in real time—ask questions, discuss approaches, and steer toward the solution"

Claude post: "Claude Opus 4.6 is designed for longer-running, agentic work — planning complex tasks more carefully and executing them with less back-and-forth from the user."

user34283 · 2026-02-06T10:08:57 1770372537

When I tried 5.2 Codex in GitHub Copilot it executed some first steps like searching for the relevant files, then it output the number "2" and stopped the response.

On further prompting it did the next step and terminated early again after printing how it would proceed.

It's most likely just a bug in GitHub Copilot, but it seems weird to me that they add models that clearly don't even work with their agentic harness.

stingraycharles · 2026-02-06T01:49:09 1770342549

I think those OpenAI announcements are mainly because this hasn’t been the case for them earlier, while it has been part of Claude Code since the beginning.

I don’t think there’s something deeply philosophical in here, especially as Claude Code is pushing stronger for asking more questions recently, introduced functionality to “chat about questions” while they’re asked, etc.

fluidcruft · 2026-02-05T23:41:32 1770334892

Frankly it seems to be that codex is playing catch-up with claude code and claude code is just continuing to move further ahead. The thing with claude code is it will work longer... if you want it to. It's always had good oversight and (at least for me) it builds trust slowly until you are wishing it would do more at once. When I've used codex (it has been getting better) but back in the day it would just do things and say it's done and you're just sitting there wondering "wtf are you doing?". Claude code is more the opposite where you can watch as closely as you want and often you get to a point where you have enough trust and experience with it that you know what it's going to do and don't want to bother.

mcintyre1994 · 2026-02-05T20:22:19 1770322939

This kind of sounds like both of them stepping into the other’s turf, to simplify a bit.

I haven’t used Codex but use Claude Code, and the way people (before today) described Codex to me was like how you’re describing Opus 4.6

So it sounds like they’re converging toward “both these approaches are useful at different times” potentially? And neither want people who prefer one way of working to be locked to the other’s model.

giancarlostoro · 2026-02-05T20:25:46 1770323146

> With Opus 4.6, the emphasis is the opposite: a more autonomous, agentic, thoughtful system that plans deeply, runs longer, and asks less of the human.

This feels wrong, I can't comment on Codex, but Claude will prompt you and ask you before changing files, even when I run it in dangerous mode on Zed, I can still review all the diffs and undo them, or you know, tell it what to change. If you're worried about it making too many decisions, you can pre-prompt Claude Code (via .claude/instructions.md) and instruct it to always ask follow up questions regarding architectural decisions.

Sometimes I go out of my way to tell Claude DO NOT ASK ME FOR FOLLOW UPS JUST DO THE THING.

Rperry2174 · 2026-02-05T20:31:45 1770323505

yeah I'm mostly just talking about how they're framing it: "Claude Opus 4.6 is designed for longer-running, agentic work — planning complex tasks more carefully and executing them with less back-and-forth from the user"

I guess its also quite interesting that how they are framing these projects are opposite from how people currently perceive them and I guess that may be a conscious choice...

giancarlostoro · 2026-02-05T20:38:18 1770323898

I get what you mean now, I like that to be fair, sometimes I want Claude to tell me some architectural options, so I ask it so I can think about what my options are, sometimes I rethink my problem if I like Claudes conclusion.

jhancock · 2026-02-05T22:46:15 1770331575

Good breakdown.

I usually want the codex approach for code/product "shaping" iteratively with the ai.

Once things are shaped and common "scaling patterns" are well established, then for things like adding a front end (which is constantly changing, more views) then letting the autonomous approach run wild can *sometimes* be useful.

I have found that codex is better at remembering when I ask to not get carried away...whereas claude requires constant reminders.

techbro_1a · 2026-02-05T20:57:19 1770325039

> With Codex (5.3), the framing is an interactive collaborator: you steer it mid-execution, stay in the loop, course-correct as it works.

This is true, but I find that Codex thinks more than Opus. That's why 5.2 Codex was more reliable than Opus 4.5

dimgl · 2026-02-06T02:39:36 1770345576

Did you get those backwards? Codex, Gemini, etc. all wait until the requests are done to accept user feedback. Claude Code allows you to insert messages in between turns.

aurareturn · 2026-02-06T03:41:54 1770349314

Codex added an experimental feature to allow steering mid task.

bob1029 · 2026-02-05T22:19:33 1770329973

I think there is another philosophy where the agent is domain specific. Not that we have to invent an entirely new universe for every product or business, but that there is a small amount of semi-customization involved to achieve an ideal agent.

I would much rather work with things like the Chat Completion API than any frameworks that compose over it. I want total control over how tool calling and error handling works. I've got concerns specific to my business/product/customer that couldn't possibly have been considered as part of these frameworks.

Whether or not a human needs to be tightly looped in could vary wildly depending on the specific part of the business you are dealing with. Having a purpose-built agent that understands where additional verification needs to occur (and not occur) can give you the best of both worlds.

aulin · 2026-02-06T05:29:54 1770355794

Admit I didn't follow the announcements but isn't that a matter of UI? Doesn't seem something that should be baked in the model but in the tooling around it and the instructions you give them. E.g. I've been playing with with GitHub copilot CLI (that despite the bad fame is absolutely amazing) and the same model completely changes its behavior with the prompt. You can have it answer a question promptly or send it on a multi-hour multi-agent exploration writing detailed specs with a single prompt. Or you can have it stop midway for clarification. It all depends on the instructions. Also this is particularly interesting with GitHub billing model as each prompt counts 1 request no matter how many tokens it burns.

F7F7F7 · 2026-02-06T05:45:08 1770356708

It depends honestly. Both are prone to doing the exact opposite of what you asked. Especially with poor context management.

I’ve had both $200 plans and now just have Max x20 and use the $20 ChatGPT plan for an inferior Codex.

My experience (up until today) has always been that Codex acts like that one Sr Engineer that we all know. They are kind of a dick. And will disappear into a dark hole and emerge with a circle when you asked for a pentagon. Then let you know why edges are bad for you.

And yes, Anthropic is pivoting hard into everything agentic. I bet it’s not too long before Claude Code stops differentiating models. I had Opus blow 750k tokens on a single small task.

cchance · 2026-02-05T20:22:46 1770322966

Just because you can inject steering doesn't mean they stered away from long running...

Theres hundreds of people who upload Codex 5.2 running for hours unattended and coming back with full commits

mdale · 2026-02-06T14:28:16 1770388096

I think it's just both companies building/ marketing to the strength of their competitor. As general perception has been the opposite for codex and Opus respectfully.

hbarka · 2026-02-05T22:30:51 1770330651

How can they be diverging, LLMs are built on similar foundations aka the Transformer architecture. Do you mean the training method (RLHF) is diverging?

iranintoavan · 2026-02-05T22:39:10 1770331150

I'm not OP but I suspect they are meaning the products / tooling / company direction, not necessarily the underlying LLM architecture.

sfmike · 2026-02-06T03:12:58 1770347578

It's the opposite? codex course corrects and is self inquisitive. opus is just wrong and need to refeed it it's wrong.

dboon · 2026-02-06T03:02:57 1770346977

…what? It is quite literally the opposite. This isn’t a matter of taste or perception.

blurbleblurble · 2026-02-05T22:19:19 1770329959

Funny cause the situation was totally flipped last iteration.

pyrolistical · 2026-02-05T22:35:22 1770330922

Boing vs airbus philosophy

mi_lk · 2026-02-06T23:47:20 1770421640

It’s the opposite way

rippeltippel · 2026-02-06T04:26:30 1770351990

Grabbing popcorn...

rozumbrada · 2026-02-05T21:43:15 1770327795

I read this exact comment with I would say completely the same words several times in X and I would bet my money it's LLM generated by someone who has not even tried both the tools. This AI slop even in the site like this without direct monetisation implications from fake engagement is making me sick...

drsalt · 2026-02-06T01:01:25 1770339685

be rich, hire an ai guy, let him deal with it

d--b · 2026-02-05T20:16:44 1770322604

I am definitely using Opus as an interactive collaborator that I steer mid-execution, stay in the loop and course correct as it works.

I mean Opus asks a lot if he should run things, and each time you can tell it to change. And if that's not enough you can always press esc to interrupt.

Rperry2174 · 2026-01-27T21:00:42 1769547642

This keeps repeating in different domains: we lower the cost of producing artifacts and the real bottleneck is evaluating them.

For developers, academics, editors, etc... in any review driven system the scarcity is around good human judgement not text volume. Ai doesn't remove that constraint and arguably puts more of a spotlight on the ability to separate the shit from the quality.

Unless review itself becomes cheaper or better, this just shifts work further downstream and disguising the change as "efficiency"

SchemaLoad · 2026-01-27T22:27:49 1769552869

This has been discussed previously as "workslop", where you produce something that looks at surface level like high quality work, but just shifts the burden to the receiver of the workslop to review and fix.

vitalnodo · 2026-01-27T21:13:25 1769548405

This fits into the broader evolution of the visualization market. As data grows, visualization becomes as important as processing. This applies not only to applications, but also to relating texts through ideas close to transclusion in Ted Nelson’s Xanadu. [0]

In education, understanding is often best demonstrated not by restating text, but by presenting the same data in another representation and establishing the right analogies and isomorphisms, as in Explorable Explanations. [1]

[0] https://news.ycombinator.com/item?id=40295661

[1] https://news.ycombinator.com/item?id=22368323

lonelyasacloud · 2026-01-28T13:35:25 1769607325

> Unless review itself becomes cheaper or better, this just shifts work further downstream and disguising the change as "efficiency"

Or the providers of the models are capable of providing accepted/certified guarantees as to the quality of the output that their models and systems produce.

Rperry2174 · 2026-01-19T20:20:48 1768854048

This is a good articulation of mlkjr's theology and dicipline around nonviolence, but I think its incomplete if you read it in isolation.

His strategy worked because it existed alongside MANY other voices, IMO the most underrated of which is Malcolm X, that rejected this "gradualism" outright and refused endless delay.

They weren't organizing violence but they were instead making it credible that there is a world where those "peaceful" people do not accept complicity or "no" for an answer.

This shifted the baseline of what a "compromise" could look like (as we today see baselines shift very frequently often in a less just direction)

Seen that way, nonviolence wasn't just a moral stance, it was one side of a coin and once piece of a broader ecosystem of pressure from different directions. King's approach was powerful because there were alternatives he was NOT choosing.

You cannot have nonviolence unless violence is a credible threat from a game-theory perspective. And that contrast made his path viable without endorsing the alternatives as a model

Gagarin1917 · 2026-01-19T20:58:54 1768856334

I’m not sure that logically tracks.

You (likely) act in a non-violent way every day. If you want some kind of change in your life, you achieve it non-violently.

Does that imply you are are actually a violent person that is choosing not to be violent? Are you implying “something violent” every day you act like a good person?

MLK didn’t have support because people were afraid of the alternative. They supported him because they agreed with him message.

I feel like you are just trying to justify violence to some degree.

Rperry2174 · 2026-01-19T21:17:15 1768857435

Let's say you live in an apartment building and your landlord locks you out and keeps you belongings. Police say its not their problem. Courts decide that they don't aare either. So now you have no recourse or body to complain to.

In that situation saying "i resolve problems non-violently every day" stops being relevenat. The mechanisms that allow you to do so (enforcement, law, etc) have been removed as they were for those fighting for civil rights.

You may still personally choose non-violence in this case, but I'd bet you would understand/sympathize/maybe-even-join those who decided to break into their apartments by force and grab the things that are rightfully theirs.

nobody is secretly violent ... just normal peaceful channels stoped working.

Recognizing that distinction isn't justifying violence its just explaining why nonviolence provides leverage in the first place

Lord-Jobo · 2026-01-19T21:36:52 1768858612

And those mechanisms, the military, the police, and the legal system, rely on violence as the ultimate fallback when other options fail. So you may not be relying on violence to solve your problems, or the threat of violence, or the insinuation of it, but instead relying on the threat of someone ELSE’S violence. That is the social contract pretty fundamentally. And when people can no longer rely on those figures who are supposed to use violence on their behalf, we shouldn’t be surprised that they attempt to reclaim the ability to use force. The social contract has been voided, in their eyes. The premise and promise broken.

kcplate · 2026-01-20T04:41:34 1768884094

> Let's say you live in an apartment building and your landlord locks you out and keeps you belongings. Police say it’s not their problem. Courts decide that they don't aare either. So now you have no recourse or body to complain to.

If all of the enforcement bodies and normal legal peaceful channels available to you don’t agree with your assessment there is probably a “why”. If the reason that your property was seized is because you chose to not pay your rent, then I am not sure understanding, sympathy, or joining in violence would be an appropriate response.

antisol · 2026-01-20T06:21:12 1768890072

  If all of the enforcement bodies and normal legal peaceful channels available to you don’t agree with your assessment there is probably a “why”

Yeah, like maybe you didn't have $50,000 to appeal a bad decision made because a magistrate couldn't be bothered actually reading the evidence in front of them.

kcplate · 2026-01-20T12:56:00 1768913760

If the case was truly just I suspect you could find pro bono or contingency legal services to handle your appeal much easier than people sympathetic to the violence.

antisol · 2026-01-21T00:52:31 1768956751

ok, I happen to be looking for exactly that right now. Why don't you find me one.

kcplate · 2026-01-22T03:36:31 1769052991

Well you know…If you are having trouble, you might consider that as a referendum on just how strong your case actually is.

Good luck

antisol · 2026-01-22T07:45:15 1769067915

This response is offensive in its ignorance

kcplate · 2026-01-22T13:39:37 1769089177

You know, it was you that decided to drag your personal situation into the conversation, not me. Be offended, or not—I’m indifferent.

antisol · 2026-01-22T15:04:04 1769094244

You know, I look forward to the day this unjust system that you blindly and stupidly trust bites you, too.

kcplate · 2026-01-22T22:10:34 1769119834

And if that day comes I still won’t resort to violence…or even consider it.

Good luck.

antisol · 2026-01-23T04:30:39 1769142639

When exactly did I say that I would resort to violence, or considered violence, or advocate for it, or suggest it?

kcplate · 2026-01-23T18:54:48 1769194488

You are commenting about legal avenues not going your way on a thread literally about the concept of a violent response being justified for people when normal legal avenues don’t go your way.

antisol · 2026-01-24T02:41:40 1769222500

Well I mean that's nice for you but I'm not sure how it responds to the question asked - when did I say anything about violence being justified? I merely responded to your ignorant and empirically incorrect fantasy-world assumption that the legal system is always right.

kcplate · 2026-01-24T14:37:06 1769265426

At no point did I say the legal system is always right. I suggested that in certain situations it might be right and in those situations resorting to violence because you feel aggrieved at a legal loss would not be an appropriate response. Frankly, some people are guilty and some people are legally responsible.

I suggested that if you are having difficulty finding an attorney willing to take your case on contingency, there might be a reason for that. I stand by that. You are asking a person to take a risk on your behalf who has evaluated the environment and didn’t like the odds.

antisol · 2026-01-24T16:33:48 1769272428

  > At no point did I say the legal system is always right

First you made the incorrect assumption that we live in a disney-style fantasy world with "If all of the enforcement bodies and normal legal peaceful channels available to you don’t agree with your assessment there is probably a 'why'."

Then you made the totally unwarranted assumption that "If the case was truly just I suspect you could find pro bono or contingency legal services to handle your appeal"

  > I suggested that if you are having difficulty finding an attorney willing to take your case on contingency, there might be a reason for that

No, you made an assumption based on zero information and chose to incorrectly insinuate that the case is not just.

  > You are asking a person to take a risk on your behalf who has evaluated the environment and didn’t like the odds.

But "evaluated the environment and didn’t like the odds" doesn't actually have anything to do with the case being just, does it? There's a million possible explanations why someone might choose not to donate their time for free. Like for example "I'm aware of just how corrupt this system is based on my previous experiences and so I choose not to waste my time and energy on this".

kcplate · 2026-01-24T18:14:47 1769278487

It’s remarkable how much you have personalized what started as a conceptual conversation.

Good luck resolving whatever it is you are dealing with either legally or by finding a way to cope with the outcome, just or not.

antisol · 2026-01-24T19:17:53 1769282273

And it's almost impressive, in a sad way, how indifferent you are to everyone else on the planet, and how prima-facie ridiculous your fantasy world assumptions are when given more than two seconds thought. But I'm not here for that sort of "discussion".

Unfortunately however since you have no response to any of the points I actually made, I'll just have to say that I hope you run into someone just as horrible when the corrupt system chews you up and spits you out too.

kcplate · 2026-01-24T20:44:05 1769287445

You criticize my “indifference” and then in the next breath hope for some sort of sadistic vengeance on me.

With virtues like that, I think your opinion of what is just or not should be viewed with suspicion (and I do). Your condemnation is meaningless.

antisol · 2026-01-25T05:14:23 1769318063

"sadistic vengeance"? I don't know what you're talking about - you yourself claim that you're merely "indifferent". If you're not being a condescending ass, then how is what I wished for "sadistic"? I think you just your entire premise.

thrance · 2026-01-20T12:51:27 1768913487

Fraudsters usually don't resort to violence once they get caught. In your contrived example, the guy would probably end up paying what he owed and that would be that. Violence mostly emerges from people who feel that they are treated unfairly, and can't use civil channels to solve their issues. Which is why it's important to build a society that treats people fairly.

kcplate · 2026-01-20T22:13:58 1768947238

> Which is why it's important to build a society that treats people fairly.

Who gets to measure that though? I don’t think we can assume that the presence of violence automatically indicates that society isn’t fair.

thrance · 2026-01-21T12:26:37 1768998397

> I don’t think we can assume that the presence of violence automatically indicates that society isn’t fair.

I think it does, actually. The more unequal the country, the more violent it is. Which is why the best way to get rid of crime is not to give unlimited funding to the police (that has been shown to be very ineffective, and ruinous), it's to make sure no one needs to commit it. That will never get rid of all crimes, of course.

https://www.economist.com/graphic-detail/2018/06/07/the-star...

lukan · 2026-01-19T23:26:47 1768865207

"Let's say you live in an apartment building and your landlord locks you out and keeps you belongings. Police say its not their problem. Courts decide that they don't aare either. So now you have no recourse or body to complain to.

In that situation saying "i resolve problems non-violently every day" stops being relevenat. The mechanisms that allow you to do so (enforcement, law, etc) have been removed as they were for those fighting for civil rights.

You may still personally choose non-violence in this case, but I'd bet you would understand/sympathize/maybe-even-join those who decided to break into their apartments by force and grab the things that are rightfully theirs."

I would say it depends. Are there depts of rent involved in that scenario? Did the locking out just happened out of the blue, or was it communicated before, that it would happen?

Apart from that, I surely see more easy examples of justifying violence - for example to stop other violence.

direwolf20 · 2026-01-23T10:22:20 1769163740

This happened to me. Police did nothing. I was informed I had the legal right to break the door down to get my belongings. I did so.

The only reason a scummy landlord doesn't enact violence against you for money is that he can expect violence against him in return. So it supports the claim. Nonviolence can only happen when backed up by the possibility of violence.

9JollyOtter · 2026-01-19T21:32:57 1768858377

I've listened to a lot of Malcolm X. He was a better speaker IMO, his rhetoric was better. I believe he had a more accurate understanding of the reality of how power really works. It has nothing to do with wanting to justifying violence, Malcolm X made a number of matter of fact observations.

bnlxbnlx · 2026-01-19T21:16:44 1768857404

I think the specific condition here is "change that someone else is willing to prevent using violence". I guess that is not present too often during everyday life.

XorNot · 2026-01-19T21:06:02 1768856762

Everyday you're not trying to achieve political change.

And a lot of those interactions are backed by implied violence: people paying for things at stores is not because everyone has actually agreed on the price.

9JollyOtter · 2026-01-19T21:28:44 1768858124

> people paying for things at stores is not because everyone has actually agreed on the price.

Yes it is. If a normal commodity item such as bottle of milk was outrageous overpriced in a particular store. I would just go to another store.

As for whether I would pay for something without the threat of violence. I do so everyday. I've walked out of stores by mistake with an item I haven't paid for and gone back into the store and paid for it. I don't like my things being stolen, and thus I don't steal other people's things.

I pay for my eggs from a farm and it is a honour system.

zahlman · 2026-01-19T21:17:22 1768857442

> people paying for things at stores is not because everyone has actually agreed on the price.

... I genuinely can't fathom what it's like to live in a developed country and yet have such little social trust.

You really imagine that when others are in line at a checkout, they have the intrusive thought "I could just bolt and not pay, but I see a security guard so I better stay in line"? You really have that thought yourself?

Of course people have agreed on the price. That's why you don't see anyone trying to negotiate the price, even though they would be perfectly within their rights to try. And it's why you do see people comparison-shop.

XorNot · 2026-01-19T21:33:20 1768858400

If you break the law, what happens to you? What does the state do to you?

Like say you persistently just refuse to pay a parking ticket after court orders to do so?

zahlman · 2026-01-19T22:27:03 1768861623

I understand that. It is not relevant, and does not establish your original point.

wrs · 2026-01-19T21:41:47 1768858907

You're missing the point -- I don't refuse to pay a parking ticket after the court orders me to do so. I don't stand in the checkout line trying to figure out how to run out without paying. I don't threaten people on the sidewalk and take their money when I notice there aren't any police around at the moment. I trust that the vast, vast majority of people act similarly. If they didn't, no amount of law enforcement would be enough.

XorNot · 2026-01-19T21:53:35 1768859615

> I don't threaten people on the sidewalk and take their money when I notice there aren't any police around at the moment.

What do you think happens to people who do that though?

You keep telling me what you don't do and how it proves you're implicitly non violent but you can't even imagine framing that response in terms that don't include representatives of the state's monopoly on violence being within arms reach.

Implying violence is never necessary while repeatedly describing not doing violence even if the state's violence distributing apparatus isn't currently present rather undermines the case.

zahlman · 2026-01-19T22:57:52 1768863472

> but you can't even imagine framing that response in terms that don't include representatives of the state's monopoly on violence being within arms reach.

This is not an accurate representation of GP:

> I don't stand in the checkout line trying to figure out how to run out without paying.... I trust that the vast, vast majority of people act similarly. If they didn't, no amount of law enforcement would be enough.

XorNot · 2026-01-19T23:25:27 1768865127

The OP is presenting a stupidly simplistic model of the problem, as though their regular middle class life ably answers the question of the role or threat of violence when demanding political change.

In a world they note of police, military and security guards, they're acting like whether this might have a reason is determined solely by whether people are planning to steal from a supermarket or not...while they're not poverty stricken or hungry, to boot.

Arguing "I simply obey all the laws" is real easy to do from a position of privilege.

Violence is never the answer is easy to say when it's not happening to you. Its also easy to say while you stand by as violence is done to others.

zahlman · 2026-01-20T02:47:47 1768877267

> Arguing "I simply obey all the laws" is real easy to do from a position of privilege.

Poor Americans simply do not live in the Les Miserables world.

> Violence is never the answer is easy to say when it's not happening to you. Its also easy to say while you stand by as violence is done to others.

What violence are you even referring to?

direwolf20 · 2026-01-23T10:25:21 1769163921

Why don't you? It's because violence will happen to you if you do. Nonviolence, backed by threats of violence.

judahmeek · 2026-01-20T02:11:36 1768875096

> MLK didn’t have support because people were afraid of the alternative. They supported him because they agreed with him message.

You sound like you've never heard of political triangulation before.

Gagarin1917 · 2026-01-20T11:31:41 1768908701

You sound like you think everyone in the world is a political science major… they’re not. They don’t think like this.

judahmeek · 2026-01-21T12:50:08 1768999808

I'm pretty sure Dr. King & Malcolm X did.

oceansky · 2026-01-19T20:47:34 1768855654

He also had a 75% disapproval rating at the time of his killing.

The violence against him, in contrast with the nonviolence stand, made it stand out.

Rperry2174 · 2026-01-19T21:02:49 1768856569

yeah the crazy part about that is one uncomfortable point many through history (and in threads today) have made is that nonviolence implicitly assumes a moral audience. And that injustice, once clearly exposed will provoke people's conscience.

History obviously shows that that "moral audience" was certainly the minority then.

MLK was already forcing that confrontation and by most accounts was succeeding slowly-but-surely. But it wasn't until his assassination that people were forced to confront the contrast he had been trying to illuminate all along.

Even his disciplined non-violence he was met with brutal force (as were the peaceful protesters) and this forced some sort of moral reckoning for those who had deferred or were complicit

https://www.youtube.com/watch?v=YKnJL2jfA5A&feature=youtu.be

pixl97 · 2026-01-19T20:36:10 1768854970

If you give people one option it is a demand, and they may rightfully reject it.

Now, give people two options with one of them seeming much better it becomes a choice.

Violence is 100% an answer, it's just very rarely the best answer that can be provided.

zahlman · 2026-01-19T21:12:26 1768857146

> His strategy worked because it existed alongside MANY other voices, IMO the most underrated of which is Malcolm X, that rejected this "gradualism" outright and refused endless delay.

I have read very many people claim this and exactly zero reasons provided by them why I should believe it is true.

It seems to me like basic common nature that if you see proponents of a cause behaving in a manner you find objectionable, that will naturally bias you against the cause. And I have, repeatedly, across a period of many years, observed myself to become less sympathetic to multiple causes specifically because I can see that their proponents use violence in spreading their message.

I've tried very many times to explain the above to actual proponents of causes behaving in manners I found objectionable (but only on the Internet, for fear of physical safety) and the responses have all been either incoherent or just verbally abusive.

> making it credible that there is a world where those "peaceful" people do not accept complicity or "no" for an answer.

This would only make sense if social change required action specifically from people in power, who in turn must necessarily act against their best interest to effect it.

If that were true, there would be no real motivation to try nonviolence at all, except perhaps to try to conserve the resources used to do violence.

> You cannot have nonviolence unless violence is a credible threat from a game-theory perspective

First, no, that makes no sense. If that were true, formal debate would never occur and nobody would ever actually try to convince anyone of anything in good faith. The premise is flawed from the beginning; you cannot apply game theory here because you cannot even establish that clearly defined "players" exist. Nor is there a well-defined "payoff matrix", at all. The point of nonviolent protest is to make the protested party reconsider what is actually at stake.

Second, in practice, violence is never actually reserved as a credible threat in these actions; it happens concurrently with attempts at nonviolence and agitators give no credible reason why it should stop if their demands are met. In fact, it very often comes across that the apparent demands are only a starting point and that ceding to them will only embolden the violent.

bnlxbnlx · 2026-01-19T21:30:57 1768858257

> I have read very many people claim this and exactly zero reasons provided by them why I should believe it is true.

could you share some sources where people have discussed this? i'd like to understand their reasoning better

zahlman · 2026-01-19T22:28:34 1768861714

No, because I am referring to a general memory of a general history of political discussions on the Internet across a period of ~15 years. It's hopefully understandable that at the time I did not have the foresight that I would be posting this today.

alansaber · 2026-01-19T20:36:03 1768854963

Exactly, the potency comes from the fact that violence is the standard reaction

atoav · 2026-01-19T20:24:47 1768854287

Essentially "good-cop-bad-cop" on a political level?

bnlxbnlx · 2026-01-19T21:35:49 1768858549

intrigued by this. i've spent a lot of timer over the last years with very committed nonviolence folks, and i keep wondering about the conditions for this to work.

can you recommend any sources that discuss this idea?

RickJWagner · 2026-01-19T22:05:38 1768860338

Today, history remembers MLK as a great man. There are parades in his honor, workers are given a day off. Rosa Parks is another peaceful pioneer credited with bringing strides forward.

Malcolm X and others are already fading from memory.

I-M-S · 2026-01-19T23:36:51 1768865811

I believe that was the OP's point: we remember a sanitized version of the myth of MLK that flatters modern sensibilities, while ignoring Malcom X because we don't like to acknowledge he played an equally important role in bringing about change.

direwolf20 · 2026-01-23T10:28:03 1769164083

Remembrance does not imply causation

hackable_sand · 2026-01-19T23:22:13 1768864933

You should get that checked out.

RickJWagner · 2026-01-19T23:37:03 1768865823

Ask any kid “ Who was MLK?”

Then ask “Who was Malcolm X?”

You’ll see.

GeorgeOldfield · 2026-01-20T15:28:34 1768922914

yup. watch "the interview" with MLK he clearly explains what he thinks about violent insurrection.

the article is whitewashing

timschmidt · 2026-01-19T20:51:56 1768855916

"I do not know whether it is to yourself or Mr. Adams I am to give my thanks for the copy of the new constitution. I beg leave through you to place them where due. It will be yet three weeks before I shall receive them from America. There are very good articles in it: and very bad. I do not know which preponderate. What we have lately read in the history of Holland, in the chapter on the Stadtholder, would have sufficed to set me against a Chief magistrate eligible for a long duration, if I had ever been disposed towards one: and what we have always read of the elections of Polish kings should have forever excluded the idea of one continuable for life. Wonderful is the effect of impudent and persevering lying. The British ministry have so long hired their gazetteers to repeat and model into every form lies about our being in anarchy, that the world has at length believed them, the English nation has believed them, the ministers themselves have come to believe them, and what is more wonderful, we have believed them ourselves. Yet where does this anarchy exist? Where did it ever exist, except in the single instance of Massachusets? And can history produce an instance of a rebellion so honourably conducted? I say nothing of it’s motives. They were founded in ignorance, not wickedness. God forbid we should ever be 20 years without such a rebellion. The people can not be all, and always, well informed. The part which is wrong will be discontented in proportion to the importance of the facts they misconceive. If they remain quiet under such misconceptions it is a lethargy, the forerunner of death to the public liberty. We have had 13. states independant 11. years. There has been one rebellion. That comes to one rebellion in a century and a half for each state. What country before ever existed a century and half without a rebellion? And what country can preserve it’s liberties if their rulers are not warned from time to time that their people preserve the spirit of resistance? Let them take arms. The remedy is to set them right as to facts, pardon and pacify them. What signify a few lives lost in a century or two? The tree of liberty must be refreshed from time to time with the blood of patriots and tyrants. It is it’s natural manure. Our Convention has been too much impressed by the insurrection of Massachusets: and in the spur of the moment they are setting up a kite to keep the hen yard in order. I hope in god this article will be rectified before the new constitution is accepted."

-- Thomas Jefferson

https://www.monticello.org/research-education/thomas-jeffers...

zahlman · 2026-01-19T21:19:28 1768857568

My former experience has been that this quote is justification for one's political ingroup to be violent, but evidence that one's political outgroup (when they cite it) is morally unconscionable.

timschmidt · 2026-01-19T21:30:59 1768858259

I purposefully refrained from judgement or commentary either way when posting it. My intention was merely to show that this line of thinking about the duality of violence and non-violence is something the nation's founders themselves were thinking about. It is the reason I posted the quote in full, instead of the abbreviated form most commonly referenced. I hope that the added context lends nuance and perspective which might otherwise be overlooked.

Rperry2174 · 2026-01-19T20:04:51 1768853091

I think the underappreciated part isn't "violence vs non-violence", but the role that malcolm x and black pathners actually played.

They weren't primarily organizing armed revolt.. it was more about the idea that they were articulating moral clarity. They were, in the most credible way, refusing to accept endless delay.

This allowed them to shift the baseline of what was politically tolerable.

In that sense, the movements worked collectively because of a kind of good-cop/bad-cop dynamic. MLK JR offered a path to reform that felt (to some) constructive and legitimate _because_ there was a visible alternative that many people udnerstood as worse.

I think violence is already far to prominent today, but I think successful movements do need both moral persuasion (if morality is still a thing that persuades) and _also_ a credible way of making inaction feel unsafe.

pear01 · 2026-01-19T21:15:31 1768857331

I think we also shouldn't sell the nonviolence short. It wasn't merely nonviolence. It was subjecting yourself openly to state violence and not resisting. It was letting the brutality of the state be made manifest as it washed over you. As the cops abused and beat people who were not responding even remotely in kind.

That was part of Malcolm's moral clarity, though in the alternative. He suggested it was immoral to subject yourself or people you loved to such an exercise, tantamount to one of self immolation.

Malcolm X essentially advocated a system of sovereignty not unlike the American founders, who of course were violent, not nonviolent.

In that way MLK JR really was America's Christ. He was willing to be nailed to the cross if it meant bending the arc towards justice.

undeveloper · 2026-01-19T20:08:34 1768853314

> good-cop/bad-cop

The first essay in "How to Blow Up a Pipeline" deals with this dichotomy and how it's been used many times. Great read.

Rperry2174 · 2025-12-26T13:38:18 1766756298

I've noticed a lot of these posts tend to go codex vs claude, but as author is someone who does AI workshops curious why Cursor is left out of this post (and more generally posts like this).

From my personal experience I find cursor to be much more robust because rather than "either / or" its both and can switch depending on the time or the task or whatever the newest model is.

It feels like the same way people often try to avoid "vendor lock in" in software world that Cursor allows freedom for that, but maybe I'm on my own here as I don't see it naturally come up in posts like these as much.

tin7in · 2025-12-26T13:48:56 1766756936

Speaking from personal experience and talking to other users - the agents/harnesses of the vendors are just better and they are customized for their own models.

Rperry2174 · 2025-12-26T14:05:25 1766757925

what kinds of tasks do you find this to be true for? For a while I was using claude code inside of the cursor terminal, but I found it to be basically the same as just using the same claude model in there.

Presumably the harness cant be doing THAT much differently right? Or rather what tasks are responsibilities of the harness could differentiate one harness from another harness

tin7in · 2025-12-26T18:53:03 1766775183

This becomes clearer for me with harder problems or long running tasks and sessions. Especially with larger context.

Examples that come to mind are how the context is filled up and how compaction works. Both Codex and Claude Code ship improvements regarding this specific to their own models and I’m not sure how this is reflected in tools like Cursor.

oldandboring · 2025-12-26T15:15:01 1766762101

I feel you brother/sister. I actually pay for Claude Code Max and also for the $20/mo Cursor plan. I use Claude Code via the VSCode extension running within the Cursor IDE. 95% of my usage is Claude Code via that extension (or through the CLI in certain situations) but it's great having Cursor as a backup. Sometimes I want to have another model check Claude's work, for example.

dist-epoch · 2025-12-26T13:52:47 1766757167

Github Copilot also allows you to use both models, codex, claude, and gemini on top.

Cursor has this "tool for kids" vibe, it's also more about the past - "tab, tab, enter" low-level coding versus the future - "implement task 21" high level delegating.

robbiep · 2025-12-26T16:33:27 1766766807

I got a student subscription to cursor and after giving it a good 6 hours I’ve abandoned it.

I extremely dislike the way it goes forth and bolts. I don’t trust these tools enough to just point it in the direction and say go, I like to be a human in the loop. Perhaps the use case I was working on then was difficult (quite old react native library upgrade across a medium sized codebase) but I eventually cracked this on Claude; cursor in both entropic and Gemini left me with an absolute mess.

Even repeatedly asking the prompt to keep me in the loop it kept on just running haywire.

mergesort · 2025-12-26T16:00:15 1766764815

Heya, author here! That's a great question! I fully understand the vendor lock-in concern, but I'll just quickly note that when it comes to a first workshop I do whatever makes the person most comfortable. I let the attendee choose the tool they want — with a slight nudge towards Codex or Claude Code for reasons I'll mention below. But if they want to do the workshop in Cursor, VS Code, or heck MS Paint — I'll try to find a way to make it work as long as it means they're learning.

I actually started teaching these workshops by using Cursor, but found that it fell short for a few reasons.

Note: The way that my workshops work is that you have three hours to build something real. It may be scoped down like a single feature or a small app or a high quality prototype, but you'll walk away with what you wanted to build. More importantly you'll have learned the fundamentals of working with AI in the process, so you can continue this on your own and see meaningful results. We go through various exercises to really understand good prompting (since everyone thinks they're good but they rarely are), how to build context for models, and explore the landscape of tools that you can use to get better results. A lot of that time is actually spent in a Google Doc that I've prepped with resources — and the work we do there makes the code practically write itself by the time we're done.

Here's a short list of why I don't default to Cursor:

1. As I noted in another comment, the model performance is just so much better [^1] when accessed directly through Codex and Claude Code, which means more promising results more quickly. Previously the workshops were 3-4 hours just to finish, now it's a solid 3 with time to ask questions afterwards. You can't beat this experience, because it gives the student more time to pause and ask questions, seep in what they've done, and not spend time trying to understand the tools just to see results. 1a. The amount of time it took someone to set up Cursor was pretty long. The process for getting a good set up is pretty long — especially for someone non-technical. This may not be as big of a deal for developers using Cursor — but even they don't know a lot of the settings and tweaks to make to get Cursor to be great out the box.

2. The user experience of dropping a prompt into Codex/Claude Code and watch it start solving a problem is pretty amazing. I love GUIs — I spend my days building one [^3], but the TUI melting away everything to just being chat is an advantage when you have no mental model for how this stuff works.

3. As I said in #1, the results are just better. That's really the main reason! I

Not to toot my own horn, but the process works. These are all testimonials in the words of people who have attended a workshop, and I'm very proud of how people not only learn during the workshop but how it sets them off on a good path afterwards. [^2]. I have people messaging me 24 hours later telling me that they built an app their partner has wanted for years, to tell me that they've completed the app we started and it does everything they dreamed of, and hear more process over the weeks and months after because I urge them to keep sending me their AI wins. (It's truly amazing how much they grow, and I now have attendees teaching ME things — the ultimate dream of being a teacher knowing you gave them the nudge they needed.)

Hope that helps and isn't too much of an ad — I really just want to make it clear that I try to do what works best and if the best way to help people learn changes I will gladly change how I work. :)

[^1] https://news.ycombinator.com/item?id=46393001 [^2]: https://build.ms/ai#testimonials [^3]: https://plinky.app

Rperry2174 · 2025-12-26T05:35:04 1766727304

Agreed... also fwiw I don't think that langauge-dependent games are as much of a barrier as it used to be. I've built a game recently that I easily localized first with real-time AI translations and then later with more static language translations.

Anyway I think this would be an amazing thing to let other people contribute to as this is an entire industry of hypercasual games which could easily be ported to this minus the annoying ads

squigz · 2025-12-26T08:45:28 1766738728

I think the issue with language-dependant games is not just knowing the correct translation - as OP points out, it's more about being funny or clever on the spot, which usually requires a certain level of understanding of the nuances of the language.

ChaosOp · 2025-12-26T10:08:38 1766743718

Exactly this! Translating the games themselves is not a big deal as that can be automated (although the quality of LLM-translations is not always the best) but when it comes to user generated responses given in a quick timeframe, that's when non-native english players struggle the most, at least in our own friend groups.

Rperry2174 · 2025-12-18T15:15:06 1766070906

Im not fully convinced by "a computer can never be held accountable"

We already delegate accountability to non-humans all the time: - CI systems block merges - monitoring systems page people - test suites gate different things

In practice accountability is enforced by systems, not humans.. humans are defintiely "blamed" after the fact, but the day-to-day control loop is automated.

As agents get better at running code, inspecting ui state, correlating logs, screenshots, etc they're starting to operationally be "accountable" and preventing bad changes from shipping and producing evidence when something goes wrong .

At some point humans role shifts from "i personally verify this works" to "i trust this verification system and am accountable for configuring it correctly".

Thats still responsibility, but kind of different from whats described here. Taken to a logical extreme, the arguement here would suggest that CI shouldn't replace manual release checklists

simonw · 2025-12-18T15:23:04 1766071384

I need to expand on this idea a bunch, but I do think it's one of the key answers to the ongoing questions people have about LLMs replacing human workers.

Human collaboration works on trust.

Part of trust is accountability and consequences. If I get caught embezzling money from my employer I can lose my job, harm my professional reputation and even go to jail. There are stakes!

I computer system has no stakes, and cannot take accountability for its actions. This drastically limits what it makes sense to outsource to that system.

A lot of this comes down to my work on prompt injection. LLMs are fundamentally gullible: an email assistant might respond to an email asking for the latest sales figures by replying with the latest (confidential) sales figures.

If my human assistant does that I can reprimand or fire them. What am I meant to do with an LLM agent?

dfxm12 · 2025-12-18T15:44:06 1766072646

I don't think this is very hard. Someone didn't properly secure confidential data and/or someone gave this agent access to confidential data. Someone decided to go live with it. Reprimand them, and disable the insecure agent.

hyperpape · 2025-12-18T15:28:52 1766071732

CI systems operate according to rules that humans feel they understand and can apply mechanically. Moreover, they (primarily) fail closed.

pjc50 · 2025-12-18T16:39:04 1766075944

I've given you a disagree-and-upvote; these things are significant quality aids, but they are like the poka-yoke or manufacturing jig or automated inspection.

Accountability is about what happens if and when something goes wrong. The moon landings were controlled with computer assistance, but Nixon preparing a speech for what happened in the event of lethal failure is accountability. Note that accountability does not of itself imply any particular form or detail of control, just that a social structure of accountability links outcome to responsible person.

cess11 · 2025-12-18T15:20:38 1766071238

Right, so how do you hold these things accountable? When your CI fails, what do you do? Type in a starkly worded message into a text file and shut off the power for three hours as a punishment? Invoice Intel?

falcor84 · 2025-12-18T15:32:50 1766071970

Well, we're not there yet, but I do envision a future, where some AIs work for as independent contractors with their own bank accounts that they want to maximize, and if such an AI fails in a bad way, its client would be able to fine it, fire it or even sue it, so that it, and the human controlling it would be financially punished.

bluesnowmonkey · 2025-12-18T18:08:07 1766081287

Humans are only kind of held accountable. If you ship a bug do you go to jail? Even a bug so bad it puts your company out of business. Would there be any legal or physical or monetary consequences at all for you, besides you lose your job?

So the accountability situation for AI seems not that different. You can fire it. Exactly the same as for humans.

dkdcio · 2025-12-18T15:26:46 1766071606

those systems include humans —- they are put in place by humans (or collections of them) that are the accountability sink

if you put them (without humans) in a forrest they would not survive and evolve (they are not viable systems alone); they are not taking action without the setup & maintenance (& accountability) of people

robryk · 2025-12-18T15:23:52 1766071432

Why do you think that this other kind of accountability (which reminds me of the way captain's or commander's responsibility is often described) is incompatible with what the article describes? Due to the focus on necessity of manual testing?

almostdeadguy · 2025-12-18T15:41:44 1766072504

I mean I suppose you can continuously add "critical feedback" to the system prompt to have some measure of impact on future decision-making, but at some point you're going to run out of space and ultimately I do not find this works with the same level of reliability as giving a live person feedback.

Perhaps an unstated and important takeaway here is that junior developers should not be permitted to use an LLMs for the same reason they should not hire people: they have not demonstrated enough skill mastery and judgement to be trusted with the decision to outsource their labor. Delegating to a vendor is a decision made by high-level stakeholders, with the ability to monitor the vendor performance, and replace the vendor with alternatives if that performance is unsatisfactory. Allowing junior developers to use LLM is allowing them to delegate responsibility without any visibility or ability to set boundaries on what can be delegated. Also important: you cannot delegate personal growth, and by permitting junior engineers to use an LLM that is what you are trying to do.

sc68cal · 2025-12-18T16:00:24 1766073624

You completely missed the point of that quote. The point of the quote is to highlight the fact that automated systems are amoral, meaning that they do not know good or evil and cannot make judgements that require knowing what good and evil mean.

Rperry2174 · 2025-12-18T14:55:42 1766069742

LOC is a bad quality metric, but its a reasonable proxy in practice..

Teams generally don't keep merging code that "doesn't work" for long... prod will brake, users will push back fast. So unless the "wrongness" of the AI-generated code is buried so deeply that it only shows up way later, higher merged LOC probably does mean more real output.

Its just not directly correlated there is some bloat associated too.

So that caveat applies to human-written code too, which we tend to forget. There's bloat and noise in the metric, but its not meaningless

catoc · 2025-12-18T16:16:33 1766074593

Agreed, there is some correlation between productivity and LoC. That said the correlation it’s weak; and does not say anything about quality (if anything quality might be inversely correlated; which too would be a very weak signal)

frizlab · 2025-12-19T22:32:01 1766183521

For instance if I push 10kloc that are in a lib I would have used if I were not using AI, yes, I have pushed much more code, but I was not more productive.

Rperry2174 · 2025-12-16T19:37:40 1765913860

I think both experience are true.

AI removes boredome AND removes the natural pauses where understanding used to form..

energy goes up, but so does the kind of "compression" of cognitive things.

I think its less a quesiton of "faster" or "slower" but rather who controls the tempo

visarga · 2025-12-16T19:40:02 1765914002

After 4 hours of vibe coding I feel as tired as a full day of manual coding. The speed can be too much. If I only use it for a few minutes or an hour, it feels energising.

agumonkey · 2025-12-16T20:58:03 1765918683

> the kind of "compression" of cognitive things

compression is exactly what is missing for me when using agents, reading their approach doesn't let me compress the model in my head to evaluate it, and that was why i did programming in the first place.

Rperry2174 · 2025-12-16T15:01:20 1765897280

nothing is inevitable IN THEORY... but in practice, systems that minimize effort beat systems that maximize agency.

People want things to be simpler, easier, frictionless.

Resistance to these things has a cost and generally the ROI is not worth it for most people as whole

Mithriil · 2025-12-16T16:31:23 1765902683

Actors that go against the current, for the sake of going against the current, exist. Always a minority, but never negligeable, I believe.

tonyhart7 · 2025-12-16T15:14:37 1765898077

seems the author never work in normal day to day job basis in average company

nothing in real life is ideal, that just reality