I'm personally 100% convinced of the opposite: it's a waste of time to steer them. We know now that agentic loops can converge given the proper framing and self-reflection tools.
Converge towards what though... I think the level of testing/verification you need to have an LLM output a non-trivial feature (e.g. Paxos/anything with concurrency, business logic that isn't just "fetch value from spreadsheet, add to another number and save to the database") is pretty high.
In this new world, why stop there? It would be even better if engineers were also medical doctors and held multiple doctorate degrees in mathematics and physics and also were rockstar sales people.
It's not a waste of time, it's a responsibility. All things need steering, even humans -- there's only so much precision that can be extrapolated from prompts, and as the tasks get bigger, small deviations can turn into very large mistakes.
There's a balance to strike between micro-management and no steering at all.
Most prompts we give are severely information-deficient. The reason LLMs can still produce acceptable results is because they compensate with their prior training and background knowledge.
The same applies to verification: it's fundamentally an information problem.
You see this exact dynamic when delegating work to humans. That's why good teams rely on extremely detailed specs. It's all a game of information.
Having prompts be information deficient is the whole point of LLMs. The only complete description of a typical programming problem is the final code or an equivalent formal specification.
Does the AI agent know what your company is doing right now, what every coworker is working on, how they are doing it, and how your boss will change priorities next month without being told?
If it really knows better, then fire everyone and let the agent take charge. lol
For me, it still asks for confirmation at every decision when using plans. And when multiple unforeseen options appear, it asks again. I don’t think you’ve used Codex in a while.
A significant portion of engineering time is now spent ensuring that yes, the LLM does know about all of that. This context can be surfaced through skills, MCP, connectors, RAG over your tools, etc. Companies are also starting to reshape their entire processes to ensure this information can be properly and accurately surfaced. Most are still far from completing that transformation, but progress tends to happen slowly, then all at once.
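To make that concrete, here's a minimal sketch of the "RAG over your tools" idea: rank internal docs against the task and prepend the best matches to the agent's prompt. Every name in it (Doc, the sources, the scoring function) is a hypothetical stand-in for a real embedding-based retrieval stack.

```python
# Minimal sketch of "RAG over your tools". The scoring here is naive term
# overlap standing in for real embedding similarity; sources are invented.
from dataclasses import dataclass

@dataclass
class Doc:
    source: str  # e.g. "jira", "confluence", "slack"
    text: str

def score(query: str, doc: Doc) -> float:
    # Stand-in for cosine similarity over embeddings: naive term overlap.
    q = set(query.lower().split())
    d = set(doc.text.lower().split())
    return len(q & d) / (len(q) or 1)

def build_context(query: str, docs: list[Doc], k: int = 3) -> str:
    """Format the k most relevant internal docs for injection into a prompt."""
    top = sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]
    return "\n\n".join(f"[{d.source}] {d.text}" for d in top)
```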
This sounds like never. Most businesses are still shuffling paper and couldn’t give you the requirements for a CRUD app if their lives depended on it.
You’re right, in theory, but it’s like saying you could predict the future if you could just model the universe in perfect detail. But it’s not possible, even in theory.
If you can fully describe what you need to the degree ambiguity is removed, you’ve already built the thing.
If you can’t fully describe the thing, like some general “make more profit” or “lower costs”, you’re in paper clip maximizer territory.
> If you can fully describe what you need to the degree ambiguity is removed, you’ve already built the thing.
Trying to get my company to realize this right now.
Probably the most efficient way to work would be on a video call with the product person/stakeholder, the designer, and me, the one responsible for the actual code, so that we can churn through the now incredibly fast and cheap implementation step together in pure alignment.
You could probably do it async but it’s so much faster to not have to keep waiting for one another.
Maybe some day, but as a Claude Code user I see it make enough pretty serious screw-ups, even with a very clearly defined plan, that I review everything it produces.
You might be able to get away without the review step for a bit, but eventually (and not long) you will be bitten.
I use that to feed back into my spec development and prompting and CI harnesses, not steering in real time.
Every mistake is a chance to fix the system so that mistake is less likely or impossible.
I rarely fix anything in real time - you review, see issues, fix them in the spec, reset the branch back to zero and try again. Generally, the spec is the part I develop interactively, and then set it loose to go crazy.
This feels, initially, incredibly painful. You're no longer developing software, you're doing therapy for robots. But it delivers enormous compounding gains, and you can use your agent to do significant parts of it for you.
> You're no longer developing software, you're doing therapy for robots.
Or, really, hacking in "learning": building your know-how base.
> But it delivers enormous compounding gains, and you can use your agent to do significant parts of it for you.
Strong yes to both, so strong that it's curious Claude Code, Codex, Claude Cowork, etc., don't yet bake in an explicit knowledge evolution agent curating and evolving their markdown knowledge base:
Unlikely to help with benchmarks. Very likely to improve utility ratings (as rated by outcome improvements over time) from teams using the tools together.
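A sketch of what such a curation step could look like, assuming a per-repo markdown file the agent appends to and prunes; the file name and dedupe logic are made up:

```python
# Hypothetical post-session curation pass over a markdown knowledge base:
# append the new lesson, skip exact duplicates, keep the file small enough
# to always fit in context. A real curator would merge and rewrite entries.
from pathlib import Path

KB = Path("LESSONS.md")  # invented file name; could just as well be CLAUDE.md

def add_lesson(lesson: str) -> None:
    lines = KB.read_text().splitlines() if KB.exists() else []
    entry = f"- {lesson.strip()}"
    if entry not in lines:  # crude dedupe
        lines.append(entry)
    KB.write_text("\n".join(lines) + "\n")

add_lesson("Run the linter before declaring a task done.")
```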
For those following along at home:
This is the return of the "expert system", now running on a generalized "expert system machine".
I assumed you'd build such a massive set of rules (that Claude often does not obey) that you'd eat up your context very quickly. I've actually removed all plugins/MCPs because they chewed up way too much context.
It's as much about what to remove as what to add. Curation is the key. Skills also give you some levers to get the kind of context-sensitive instruction you need, though I haven't delved too deeply into them. My current total instruction set is ~2,500 tokens at the moment.
Reviewing what it produces once it thinks it has met the acceptance criteria and the test suite passes is very different from wasting time babysitting every tiny change.
True, and that's usually what I'm doing now, but to be honest I'm also giving all of its code at least a cursory glance.
Some of the things it occasionally does:
- Ignores conventions (even when emphasized in the CLAUDE.md)
- Decides to just not implement tests if it spins out on them too much (it tells you, but only as it happens, and that scrolls by pretty quick)
- Writes badly performing code (N+1 queries; see the sketch after this list)
- Does more than you asked (in a bad way, changing UIs or adding cruft)
- Makes generally bad assumptions
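To illustrate the N+1 item above, here's the pattern in miniature; the schema and data are invented for illustration:

```python
# The N+1 pattern: one query for the parents, then one query per parent
# for the children. The fix is a single joined round trip.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY);
    CREATE TABLE order_items (id INTEGER PRIMARY KEY, order_id INTEGER);
    INSERT INTO orders VALUES (1), (2);
    INSERT INTO order_items (order_id) VALUES (1), (1), (2);
""")

# N+1: 1 query for the orders, then N more (one per order) for the items.
for (order_id,) in conn.execute("SELECT id FROM orders").fetchall():
    conn.execute(
        "SELECT * FROM order_items WHERE order_id = ?", (order_id,)
    ).fetchall()

# The fix: one round trip, joined in SQL and grouped in memory if needed.
rows = conn.execute("""
    SELECT o.id, i.id FROM orders o
    JOIN order_items i ON i.order_id = o.id
""").fetchall()
```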
I'm not trying to be overly negative, but in my experience to date, you still need to babysit it. I'm interested though in the idea of using multiple models to have them perform independent reviews to at least flag spots that could use human intervention / review.
Sure, but none of those things requires you to watch it work. They're all easy to pick up on when reviewing a finished change, which ideally should come after its instructions have had it run linters, run sub-agents that verify it has added tests, and run sub-agents doing a code review.
I don't want to waste my time reviewing a change the model can still significantly improve all by itself. My time costs far more than the model's.
You give it tools so it can compile and run the code. Then you give it more tools so it can decide between iterations whether it got closer to the goal or not. Let it evaluate itself. If it can't evaluate something, let it write tests and benchmark itself.
I guarantee that if the criteria are very well defined and benchmarkable, it will do the right thing in X iterations.
(I don't do UI development. I do end-to-end system performance on two very large code bases. My tests can be measured, and the measure is simply binary: better or not. It works.)
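A minimal sketch of that binary better-or-not gate; the command being benchmarked is a placeholder for a real workload:

```python
# Sketch of a binary "better or not" gate an agent can run between
# iterations. The benchmarked command is a placeholder.
import statistics
import subprocess
import time

def bench(cmd: list[str], runs: int = 5) -> float:
    """Median wall-clock seconds for the command over several runs."""
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        subprocess.run(cmd, check=True, capture_output=True)
        samples.append(time.perf_counter() - t0)
    return statistics.median(samples)

def better(candidate_s: float, baseline_s: float, margin: float = 0.02) -> bool:
    # Binary verdict: the candidate must beat the baseline by > 2%.
    return candidate_s < baseline_s * (1 - margin)
```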
There should never have been an "artisan era". We use computers to solve problems. You should always have been getting stuff done instead of bikeshedding over nitty-gritty details, like when people in the office spend weeks optimizing code... just to get the exact same output, in the exact same time, but now "nicer".
> Plenty of people are writing code without being paid for it.
This is rhetorically a non sequitur. As in: if you get paid (X), then you get stuff done (Y). But if you're not paid (~X), then what?
Not being paid doesn't mean one does or doesn't get stuff done; it has no bearing on it. So the parent wasn't saying anything about people who don't get paid (they can do whatever they want), but yes, at a job where you're paid, you'd better get stuff done instead of bikeshedding.
It depends on how much money and energy, in the form of man-hours, went into writing it in an artisan way in the first place. I've been in a lot of PR reviews where it was clear that the amount of back and forth we had was simply not worth it for the code we wrote.
I think you're both right. There's a time and place for beautifully crafted code, but there's also a place for a hot mess that barely passes its own non-existing tests, and for anything in between.
> there's also a place for a hot mess that barely passes its own non-existing tests
For a long time that place has been "the commercial software marketplace". Let's all stop pretending that the code coming out of shops until now has been something you'd find at a guild craft expo. It's always been a ball of spit and duct tape, which is why AI code is often spit and duct tape.
Yeah. Exactly the same as there should never be an “artisan era” for chairs, tables, buildings, etc.
Hell, even art! Why should art even be a thing? We are machines driven by neurons; feelings do not exist.
Might be your life, it ain’t mine. I’m an artisan of code, and I’m proud to be one. I might finally use AI one of these days at work because I’ll have to, but I’ll never stop cherishing doing hand-crafted code.
>> Yeah. Exactly the same as there should never be an “artisan era” for chairs, tables, buildings, etc.
It's funny you bring up those examples, because they have all moved on to the mass-manufacturing era. You can still get artisan-quality stuff, but it typically costs a lot more and there's a lot less of it, which is why mass manufacturing won. The same is going to happen with software. LLMs are just the beginning.
I live in a city where new houses are being built. They are ugly. Meanwhile, the ones that have stood for a long time have charm and feel homely.
I don’t know, I’m probably just a regular old man yelling at clouds, but I still think we’re going in the wrong direction. For pretty much everything. And for what? Money. Yay!
You're continuing to make good arguments for why mass production should exist _alongside_ artisanal craftsmanship. Broad availability of functional housing, albeit of questionable aesthetic appeal, is a good thing for easing housing shortages[0]; and it is also a good thing for (fewer) well-built, charming, individual homes to be available to those who want to spend more and get more.
[0] I'm extremely aware that there are other contributing factors to housing shortages. Tax Billionaires, etc. My metaphor still works despite not being total.
The difference is that end users don't interact with the code that the artisan created, and don't care what it "feels like". One type of code that I do agree should be artisanal is the interface end of libraries.
Yes, it's like artisanal plumbing or electrical wiring... all hidden behind walls. A plumber might take pride in the quality of his soldered joints, but artisanal? Who wants to pay for that?
> just to have the exact same output, exact same time, but now "nicer".
The majority of code work is maintaining someone else's code. That's the reason it is "nicer".
There is also the matter of performance and reducing redundancy.
Two recent pull requests I saw that were AI-generated did neither. Both attempted to recreate functionality from scratch rather than using industry-tested modules. One used csv instead of polars for the intensive work (illustrated in the sketch below).
So while they worked, they became an unmaintainable mess.
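For anyone unfamiliar with the csv-vs-polars gap, an illustrative contrast; the file and column names are invented, and this is obviously not the actual PR code:

```python
# Illustrative contrast, not the actual PR code. The stdlib csv module
# parses row by row in pure Python; polars does the same aggregation in
# native, vectorized code, which matters a lot on large files.
import csv
import polars as pl

# Row-by-row with csv: fine for small files, slow for intensive work.
total = 0.0
with open("sales.csv", newline="") as f:  # invented file/column names
    for row in csv.DictReader(f):
        total += float(row["amount"])

# The same sum in polars: one vectorized expression.
total_pl = pl.read_csv("sales.csv").select(pl.col("amount").sum()).item()
```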
You use computers to solve problems. I use computers to communicate and create art. For me, the code I write is first and foremost a form of self expression. No one paid me to write 99% of the code I've written in my life.
For a long time computers were so expensive they could only be used to do things that generate enough money to justify their purchase. But those days are long gone so computers are for much much more than just solving problems and getting stuff done. Code can be beautiful in its own right.
This exact mindset is what has led to the transition from quality products to commercialized crapware, not just in software but across all industries.
It sounds like you hate your job? To be sure, I've done plenty of grinding over my career as a software engineer, but in fact I coded as a hobby before it turned into a career, continued to code on the side, and now that I'm retired I still code.
I love my job FWIW. I work at performance engineering and we work with the most complex systems in the world (GB200/B300/...). Couldn't be happier.
But I just don't care if I have 5 layers of abstraction and SOLID principles and clean code and.... bah. I get it. I have an MSc in it and I've been doing this as a hobby and then professionally for decades now. It just doesn't matter. At the end of the day, we get paid to ship something that solves a problem.
It might be a novel problem. And it might be at the frontier of what we can do today. But it's still a problem that needs solving and the path we take is irrelevant from a user's perspective as long as it solves the problem.
I don't think they hate their job, just seem to be frustrated at slow bureaucratic processes and long code reviews which I've experienced too. After a while it can get aggravating as to why some people want to nitpick minute details of the code which slows down development overall. I am talking about cases where the initially submitted PR is perfectly fine, not grossly incorrect.
Oh wow, if we're talking about code reviews that's a different topic. I've never, FWIW, encountered "artisans" in code reviews. More like "that's not how I would have coded itsans" and "let me show you some new tricksans".
Yeah, to hell with code reviews. The best years of my career were when I was given carte blanche control over an entire framework, etc. When code reviews came along coding at work sucked.
If anything, the code reviews killed the artisanship.
90% of the CRs I've ever gotten have been "artisanal" just because nitpicking superficial nonsense is easier than meaningful critique, and even when the code is perfectly fine it looks more productive from a manager's perspective to nitpick a function name than to just respond with "lgtm".
Yeah, that's what I understood them to mean from "like when in the office people have been spending weeks on optimizing code... just to have the exact same output, exact same time, but now 'nicer'." Either way, there comes a time when the juice isn't worth the squeeze, so to speak, in terms of code optimization.
It's not that simple. That's how I started as well but now I have hooked up Gemini and GPT 5.2 to review code and plans and then to do consensus on design questions.
And then there's Ralph with cross-LLM consensus in a loop. It's great.
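A sketch of what a cross-model consensus gate can look like; the model IDs are invented and ask_model() is a placeholder for whichever API clients you actually wire up, not Ralph's real implementation:

```python
# Sketch of a cross-model consensus gate: several reviewers vote, and the
# change only proceeds on majority approval. Model IDs are invented and
# ask_model() is a placeholder for a real API client.
REVIEWERS = ["model-a", "model-b", "model-c"]

def ask_model(model: str, prompt: str) -> str:
    raise NotImplementedError("wire up the real API client here")

def consensus_review(diff: str, threshold: int = 2) -> bool:
    prompt = f"Review this diff. Reply APPROVE or REJECT:\n{diff}"
    approvals = sum(
        "APPROVE" in ask_model(model, prompt).upper() for model in REVIEWERS
    )
    return approvals >= threshold  # majority gates the merge
```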
8x GPUs per box. This has been the data center standard for the last eight-ish years.
Furthermore, they're usually NVLink-connected within the box (SXM modules instead of PCIe cards, although the physical link to the host is still PCIe).
This is important because the daughterboard provides PCIe switches, which usually connect NVMe drives, NICs, and GPUs together such that within that subcomplex there isn't any PCIe oversubscription.
Since last year, I'd argue the standard for a lot of providers is the GB200.
Fascinating! So each GPU is partnered with disks and NICs such that there's no bandwidth oversubscription within its 'slice'? (idk what the word is) And each of these 8 slices wires up via NVLink back to the host?
Feels like there's some amount of (software) orchestration for making data sit on the right drives or traverse the right NICs; guess I never really thought about the complexity of this kind of scale.
I googled GB200; it's cool that Nvidia sells you a unit rather than expecting you to DIY a PC yourself.
Usually it's 2-2-2 (2 GPUs, 2 NICs, and 2 NVMe drives on a PCIe complex). No NVLink here, this is just PCIe: under this PCIe switch chip there is full bandwidth; above it, bandwidth is usually limited. So, for example, going GPU-to-GPU over PCIe will walk:
GPU -> PCIe switch -> PCIe switch (most likely the CPU, with limited bandwidth) -> PCIe switch -> GPU
NVLink comes into the picture as a separate, second link between the GPUs: if you need to go GPU-to-GPU, you can use NVLink.
You never needed to DIY your stuff, at least not for the last 10 years: most hardware vendors (Supermicro, Dell, ...) will sell you a complete system with 8 GPUs.
What's nice about GH200/GBx00/VR systems is that you can use chip-to-chip NVLink between the CPU and GPU, so the CPU can access GPU memory coherently and vice versa.
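You can probe this topology from software: `nvidia-smi topo -m` prints the link matrix directly, and a few lines of PyTorch (assuming it's installed and the GPUs are visible) will tell you which pairs can do peer-to-peer:

```python
# Quick topology probe: which GPU pairs can talk peer-to-peer (NVLink or
# PCIe P2P under the same switch) versus bouncing through host memory?
import torch

n = torch.cuda.device_count()
for i in range(n):
    for j in range(n):
        if i != j:
            p2p = torch.cuda.can_device_access_peer(i, j)
            print(f"GPU{i} -> GPU{j}: {'P2P' if p2p else 'via host'}")
```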