More

joshuaissac · 2026-02-27T18:27:48 1772216868

> Oppression would be quite impossible throughout history if people weren't willing to oppress their own kind to the benefit of others.

Isn't the opposite far more common? When oppression happens, it is typically people oppressing the out-group for the benefit of the in-group.

carlosjobim · 2026-02-27T19:11:43 1772219503

My impression is that the foreign/out group delegate the actual oppressing to local representatives, who are more than eager to do it towards their own kind.

joshuaissac · 2026-02-27T02:51:51 1772160711

If all of your scripts run in the same venv (for a given user), can you inject that into the PATH and rely on env just finding the right interpreter?

I suppose it would also need env to be able to handle paths that have spaces in them.

joshuaissac · 2026-02-15T21:50:12 1771192212

My reading of their comment is that a proof space is a concept where a human guesses that a proof of some form q exists, and the AI searches a space S(q) where most points may be not valid proofs, but if there is a valid proof, it will hopefully be found.

So it is not a space of proofs in the sense that everything in a vector space is a vector. More like a space of sequences of statements, which have some particular pattern, and one of which might be a proof.

measurablefunc · 2026-02-15T21:59:14 1771192754

So it's not a proof space then. It's some computable graph where the edges are defined by standard autoregressive LLM single step execution & some of the vertices can be interpreted by theorem provers like Lean, Agda, Isabelle/HOL, Rocq, etc. That's still not any kind of space of proofs. Actually specifying the real logic of what is going on is much less confusing & does not lead readers astray w/ vague terms like proof spaces.

joshuaissac · 2026-02-13T22:58:13 1771023493

English version: https://github.com/datascale-ai/data_engineering_book/blob/m...

dang · 2026-02-13T23:13:23 1771024403

Oh thanks! I've switched the top URL to that now. Submitted URL was https://github.com/datascale-ai/data_engineering_book.

I hope xx123122 won't mind my mentioning that they emailed us about this post, which originally got caught in a spam filter. I invited them to post a comment giving the background to the project but they probably haven't seen my reply yet. Hopefully soon, given that the post struck a chord!

Edit: they did, and I've moved that post to the toptext.

xx123122 · 2026-02-14T09:40:00 1771062000

Huge thanks, dang! I really appreciate you rescuing the post from the filter and switching the URL to the English version.And thanks for pinning the context comment; it helps a lot since the project is quite extensive. We're thrilled it struck a chord.

xx123122 · 2026-02-14T09:41:23 1771062083

Thanks for sharing the direct link! Much appreciated.

joshuaissac · 2026-02-13T15:45:37 1770997537

Something strange I noticed with the Malayalam course (maybe also other languages with a non-Latin script): when a word is shown, two Latin transliterations are shown underneath. The second one looks like an IAST or ISO-15919 transliteration. The first one is often wrong and sometimes even nonsensical. Why not have only the second transliteration?

joshuaissac · 2026-02-12T17:29:43 1770917383

AI-generated code is meant for the machine, or for the author/prompter. AI-generated text is typically meant for other people. I think that makes a meaningful difference.

ripe · 2026-02-12T17:47:02 1770918422

Code can be viewed as design [1]. By this view, generating code using LLMs is a low-effort, low-value activity.

[1] Code as design, essays by Jack Reeves: https://www.developerdotstar.com/mag/articles/reeves_design_...

acedTrex · 2026-02-12T17:43:04 1770918184

Compiled code is meant for the machine, Written code is for other humans.

gordonhart · 2026-02-12T17:51:22 1770918682

For better or worse, a lot of people seem to disagree with this, and believe that humans reading code is only necessary at the margins, similarly to debugging compiler outputs. Personally I don't believe we're there yet (and may not get there for some time) but this is where comments like GP's come from: human legibility is a secondary or tertiary concern and it's fine to give it up if the code meets its requirements and can be maintained effectively by LLMs.

threetonesun · 2026-02-12T18:25:21 1770920721

I rarely see LLMs generate code that is less readable than the rest of the codebase it's been created for. I've seen humans who are short on time or economic incentive produce some truly unreadable code.

Of more concern to me is that when it's unleashed on the ephemera of coding (Jira tickets, bug reports, update logs) it generates so much noise you need another AI to summarize it for you.

gordonhart · 2026-02-12T19:25:47 1770924347

The main coding agent failure modes I've seen:

- Proliferation of utils/helpers when there are already ones defined in the codebase. Particularly a problem for larger codebases

- Tests with bad mocks and bail-outs due to missing things in the agent's runtime environment ("I see that X isn't available, let me just stub around that...")

- Overly defensive off-happy-path handling, returning null or the semantic "empty" response when the correct behavior is to throw an exception that will be properly handled somewhere up the call chain

- Locally optimal design choices with very little "thought" given to ownership or separation of concerns

All of these can pretty quickly turn into a maintainability problem if you aren't keeping a close eye on things. But broadly I agree that line-per-line frontier LLM code is generally better than what humans write and miles better than what a stressed-out human developer with a short deadline usually produces.

adwf · 2026-02-13T01:02:45 1770944565

Oh god, the bad mocks are the worst. Try adding instructions not to make mocks and it creates "placeholders", ask it to not create mocks or placeholders and it creates "stubs". Drives me mad...

To add to this list:

- Duplicate functions when you've asked for a slight change of functionality (eg. write_to_database and write_to_database_with_cache), never actually updating all the calls to the old function so you have a split codebase.

- On a similar vein, the backup code path of "else: do a stupid static default" instead of erroring, which would be much more helpful for debugging.

- Strong desires to follow architecture choices it was trained on, regardless of instruction. It might have been trained on some presumably high quality, large and enterprise-y codebases, but I'm just trying to write a short little throwaway program which doesn't need the complexity. KISS seems anathema to coding agents.

ziml77 · 2026-02-13T04:16:09 1770956169

I'm sort of happy to see all these things I run into listed out as issues people have so I know it's not just me experiencing and being bothered by these behaviors.

ziml77 · 2026-02-13T03:55:58 1770954958

All of these bother me, but the null/default-value returns drive me insane. It makes the code more verbose and difficult to follow, and in many cases makes the code force its way through problems that should be making it stop. Please, LLM, please just throw an exception!

hinkley · 2026-02-12T17:52:51 1770918771

And Sturgeon tells us 90% of people are wrong, so what can you do.

philipp-gayret · 2026-02-12T17:48:50 1770918530

Compiled natural language is meant for the machine, Written natural language is for other humans.

CivBase · 2026-02-12T18:21:45 1770920505

If AI is the key to compiling natural language into machine code like so many claim, then the AI should output machine code directly.

But of course it doesn't do that becaude we can't trust it the way we do a traditional compiler. Someone has to validate its output, meaning it most certainly IS meant for humans. Maybe that will change someday, but we're not there yet.

jvanderbot · 2026-02-12T17:33:27 1770917607

This is precisely correct IMHO.

Communication is for humans. It's our super power. Delegating it loses all the context, all the trust-building potential from effort signals, and all the back-and-forth discussion in which ideas and bonds are formed.

ginsider_oaks · 2026-02-12T19:33:14 1770924794

> Programs must be written for people to read, and only incidentally for machines to execute.

from the preface of SICP.

bananaflag · 2026-02-13T00:57:35 1770944255

Well SICP was already considered here to be obsolete with the rise of the library-abstraction culture.

copperx · 2026-02-13T06:22:52 1770963772

And AI is great at performing basic science on libraries.

so-cal-schemer · 2026-02-16T22:35:38 1771281338

Is this sarcasm?

SICP is fundamentally about abstraction..

everforward · 2026-02-12T18:19:43 1770920383

A lot of writing (maybe most) is almost the same. Code is a means of translating a process into semantics a computer understands. Most non-fiction writing is a means of translating information or an idea into semantics that allow other people to understand that information or idea.

I don’t think either is inherently bad because it’s AI, but it can definitely be bad if the AI is less good at encoding those ideas into their respective formats.

askvictor · 2026-02-12T21:57:38 1770933458

At the same time, AI-generated code has to be correct and precise, whereas AI-generated text doesn't. There's often no 'correct solution' in AI-generated text.

joshuaissac · 2026-02-12T16:48:54 1770914934

Following Postel's law results in the normalisation and proliferation of defective implementations. The actual standard becomes irrelevant, and new implementations have to be coded against the defective ones.

My opinion is that Postel's law should be approached in the same way that Linus Torvalds did CVS when designing Git. If in doubt about an implementation decision, consider what Postel's law would recommend, and then do the exact opposite.

joshuaissac · 2026-02-11T12:53:31 1770814411

> FIPS-compliant bindings (OpenSSL)

Using FIPS mode can be insecure because the latest FIPS-compliant version can be years older than the latest non-FIPS one with all the updates.

The only time it makes sense to use the FIPS version is where there is a legal or contractual requirement that trumps security considerations.

fuzzzerd · 2026-02-11T14:43:15 1770820995

While I think this is good advice, the fact that it's true feels backward to me. "We have a legal or contractual obligation to be less secure than we otherwise would be." Just seems silly.

tristor · 2026-02-11T15:31:55 1770823915

Welcome to the reality of most of the "information security" business, which is mostly just compliance by checkbox. A significant proportion of encrypted Internet traffic that is transiting government agencies or major enterprises gets decrypted in flight for inspection, literally inserting a black-box with privileged MITM capabilities into otherwise secure protocols, purely for the purpose of checking a compliance box, and that's not even the worst sin.

There's no insecurity like compliant cybersecurity :)

joshuaissac · 2026-02-02T08:15:50 1770020150

You can also sponsor the development of Neovim. The money goes to funding developers.

https://neovim.io/sponsors/

joshuaissac · 2026-01-30T20:46:50 1769806010

Reverse model aimed at estate agents already posted in this thread by someone: https://news.ycombinator.com/item?id=46829566