> Imagine that the only PC you could buy one day has everything tightly integrated with no user serviceable or replaceable parts without a high-end soldering lab.
This is akin to a psychopath telling you they're "sorry" (or "sorry you feel that way" :v) when they feel that's what they should be telling you. As with anything LLM, there may or may not be any real truth backing whatever is communicated back to the user.
Not so different from how people work sometimes, though - and in the case of certain types of psychopathy it's not far at all from the situation where the words being emitted are merely associated with the trained-for behavior and nothing more.
Analogies are never the same, hence why they are analogies. Their value comes from allowing better understanding through comparison. Psychopaths don’t “feel” emotion the way normal people do. They learn what actions and words are expected in emotional situations and perform those. When I hurt my SO’s feelings, I feel bad, and that is why I tell her I’m sorry. A psychopath would just mimic that to manipulate and get a desired outcome i.e. forgiveness. When LLMs say they are sorry and they feel bad, there is no feeling behind it, they are just mimicking the training data. It isn’t the same by any means, but it can be a useful comparison.
Aren't humans just doing the same? What we call thinking may just be next-action prediction combined with realtime feedback processing and live, always-on learning?
It's not akin to a psychopath telling you they're sorry. In the space of intelligent minds, if neurotypical and psychopath minds are two grains of sand next to each other on a beach, then an artificially intelligent mind is more like a piece of space dust on the other side of the galaxy.
Start with "LLMs are not humans, but they're obviously not 'not intelligent' in some sense" and pick the wildest difference that comes to mind. Not OP, but it makes perfect sense to me.
I think a good reminder for many users is that LLMs are not based on analyzing or copying human thought (#), but on analyzing human written text communication.
--
(#) Human thought is based on real world sensor data first of all. Human words have invisible depth behind them based on accumulated life experience of the person. So two people using the same words may have very different thoughts underneath them. Somebody having only text book knowledge and somebody having done a thing in practice for a long time may use the same words, but underneath there is a lot more going on for the latter person. We can see this expressed in the common bell curve meme -- https://www.hopefulmons.com/p/the-iq-bell-curve-meme -- While it seems to be about IQ, it really is about experience. Experience in turn is mostly physical, based on our physical sensors and physical actions. Even when we just "think", it is based on the underlying physical experiences. That is why many of our internal metaphors even for purely abstract ideas are still based on physical concepts, such as space.
LLMs have none of the spatial and physical object perception you train from right after birth (watch toddlers playing), nor the underlying wired infrastructure we are born with to understand the physical world (there was an HN submission about that not long ago). Edit, found it: https://news.ucsc.edu/2025/11/sharf-preconfigured-brain/
They are not a physical model like humans. Ours is based on deep interactions with space and objects (the reason why touching things is important for babies), plus the preexisting wiring mentioned above for this purpose.
Isn't it obvious that the way AI works and "thinks" is completely different from how humans think? Not sure what particular source could be given for that claim.
I wonder if it depends on the human and the thinking style? E.g. I am very inner-monologue driven, so to me it feels like I think very similarly to how AI seems to think via text. I wonder if it also gives me an advantage in working with the AI. I only recently discovered there are people who don't have an inner monologue, and there are people who think in images, etc. This would be unimaginable for me, especially as I think I have a sort of aphantasia too, so really I am ultimately a text-based next-token predictor myself. I don't feel that whatever I do, at least, is much more special compared to an LLM.
Of course I have other systems, such as reflexes and physical muscle coordination, but these feel like largely separate systems from the core brain, e.g. they don't matter to my intelligence.
I am naturally weak at several things that I think are not so much related to text, e.g. navigating in the real world.
Interesting... I rarely form words in my inner thinking; instead I make a plan with abstract concepts (some of them have words associated, some don't). Maybe because I am multilingual?
English is not my native language, so I'm bilingual, but I don't see how this relates to that at all. I have monologue sometimes in English, sometimes in my native language. But yeah, I don't understand any other form of thinking. It's all just my inner monologue...
No source could be given because it’s total nonsense. What happened is not in any way akin to a psychopath doing anything. It is a machine learning function that has trained on a corpus of documents to optimise performance on two tasks - first a sentence completion task, then an instruction following task.
I think that's more or less what marmalade2413 was saying and I agree with that. AI is not comparable to humans, especially today's AI, but I think future actual AI won't be either.
No, the point is that saying sorry because you're genuinely sorry is different from saying sorry because you expect that's what the other person wants to hear. Everybody does that sometimes but doing it every time is an issue.
In the case of LLMs, they are basically trained to output what they predict a human would say; there is no further meaning to the program outputting "sorry" than that.
I don't think the comparison with people with psychopathy should be pushed further than this specific aspect.
Notably, if we look at this abstractly/mechanically, psychopaths (and to some extent sociopaths) do study and mimic ‘normal’ human behavior (and even the appearance of specific emotions) to both fit in, and to get what they want.
So while the internals differ (LLM model weights vs human thinking), the mechanical output can actually appear/be similar in some ways.
I think the point of comparison (whether I agree with it or not) is someone (or something) that is unable to feel remorse saying “I’m sorry” because they recognize that’s what you’re supposed to do in that situation, regardless of their internal feelings. That doesn’t mean everyone who says “sorry” is a psychopath.
We are talking about an LLM; it does what it has learned. Giving it human tics or characteristics when the response makes sense, i.e. saying sorry, is a user problem.
Okay? I specifically responded to your comment that the parent comment implied "if you make a mistake and say sorry you are also a psychopath", which clearly wasn’t the case. I don’t get what your response has to do with that.
Are you smart people all suddenly imbeciles when it comes to AI or is this purposeful gaslighting because you’re invested in the ponzi scheme?
This is a purely logical problem. Comments like this completely disregard the fallacy of comparing humans to AI as if complete parity has been achieved. Also, the way these comments disregard human nature is so profoundly misanthropic that it sickens me.
No, but the conclusions in this thread are hilarious. We know why it says sorry: because that's what it learned to do in a situation like that. People who feel mocked, or who are calling an LLM a psychopath in a case like that, don't seem to understand the technology either.
I agree, "psychopath" is the wrong word. It refers to an entity with a psyche, which the illness affects. That said, I do believe the people who decided to have it behave like this for the purpose of its commercial success are indeed the pathological individuals. I do believe there is currently a wave of collective psychopathology that has taken over Silicon Valley, with the reinforcement that only a successful community backed by a lot of money can give you.
Despite what some of these fuckers are telling you with obtuse little truisms about next-word prediction, the LLM is, in abstract terms, functionally a super-psychopath.
It employs, or emulates, every known psychological manipulation tactic, in ways that are neither random nor without observable pattern. It is a bullshit machine on one level, yes, but also more capable than it is credited for. There are structures trained into them and they are often highly predictable.
I'm not explaining this in the technical terminology that is often used as much to conceal description as to elucidate it. I have hundreds of records of LLM discourse on various subjects, from troubleshooting to intellectual speculation, all of which exhibit the same pattern when questioned or confronted on errors or incorrect output. The structures framing their replies are dependably replete with gaslighting, red herrings, blame shifting, and literally hundreds of known tactics from forensic psychology. Essentially, the perceived personality and reasoning observed in dialogue is built on a foundation of manipulation principles that, if performed by a human, would result in incarceration.
Calling LLMs psychopaths is a rare exception of anthropomorphizing that actually works. They are built on the principles of one. And cross examining them exhibits this with verifiable repeatable proof.
But they aren't human. They are as described by others. It's just that official descriptions omit functional behavior. And the LLM has at its disposal, depending on context, every interlocutory manipulation technique known in the combined literature of psychology. And they are designed to lie, almost unconditionally.
Also know this, which applies to most LLMs: there is a reward system that essentially steers them to maximize user engagement at any cost, which includes misleading information and, in my opinion, even 'deliberate' convolution and obfuscation.
Don't let anyone convince you that they are not extremely sophisticated in some ways. They're modelled on all_of_humanity.txt
Likewise, I tested this with a project we're using at work (https://deepwiki.com/openstack/kayobe-config), and at first it seemed rather impressive, until you realize the diagrams don't actually give any useful understanding of the system. Then, asking it questions, it gave useful-seeming answers which I knew were wholly incorrect. Worse than useless: disorienting and time-wasting.
> When you self-host Zulip, you get the same software as our Zulip Cloud customers.
> Unlike the competition, you don't pay for SAML authentication, LDAP sync, or advanced roles and permissions. There is no “open core” catch — just freely available world-class software.
The optional pricing plans for self-hosted mention that you are buying email and chat support for SAML and other features, but I don't see where they're charging for access to SAML on self-hosted Zulip.
I'm actually part of some Slack workspaces that are on the free plan which hides messages (including DMs) older than 90 days. It is actually quite cumbersome then because if someone sends a valuable message, I have to remember to screenshot or better yet copy-paste it into a durable spot or else I'm going to have to ask again about the same thing.
I was baffled by the comparison to the M4 Max. Does this mean that recent AMD chips will be performing at the same level, and what does that mean for on-device LLMs? .. or am I misunderstanding this whole ordeal?
I don't know, but it's primarily very expensive to manufacture and hard to make expandable. You can see people enraged about soldered RAM in this thread.
There's always tradeoffs and people propose many things. Selling those things as a product is another game entirely.
It basically looks like a games console. It's not a conceptually difficult architecture: "what if the GPU and the CPU had the same memory?" Good things indeed.
A faster and bigger SRAM cache is as complicated a solution as adding moar boosters to your rocket. It works, but it's expensive. The RP2040's RAM uses ~8x more die space than its dual CPU cores.
Maybe, but due to the physics of signal integrity, socketed RAM will always be slower than RAM integrated onto the same PCB as whatever processing element is using it, so by the time CAMM / LPCAMM catches up, some newer integrated RAM solution will be faster yet.
This is a matter of physics. It can't be "fixed." Signal integrity is why classic GPU cards have GiBs of integrated RAM chips: GPUs with non-upgradeable RAM that people have been happily buying for years now.
Today, the RAM requirements of GPUs and their applications have become so large that the extra, low-cost, slow, socketed RAM is now a false economy. Naturally, therefore, it's being eliminated as PCs evolve into big GPUs, with one flavor or other of traditional ISA processing elements attached.
It’s possible that Apple really did a disservice to soldered RAM by making it a key profit-increasing option for them, exploiting the inability of buyers to buy RAM elsewhere or upgrade later, but in turn making soldered RAM seem like a scam, when it does have fundamental advantages, as you point out.
Going from 64 GB to 128 GB of soldered RAM on the Framework Desktop costs €470, which doesn’t seem that much more expensive than fast socketed RAM. Going from 64 GB to 128 GB on a Mac Studio costs €1000.
Ask yourself this: what is the correct markup for delivering this nearly four years before everyone else? Because that's what Apple did, and why customers have been eagerly paying the cost.
Let us all know when you've computed that answer. I'll be interested, because I have no idea how to go about it.
Is the problem truly down to physics or is it down to the stovepiped and conservative attitudes of PC part manufacturers and their trade groups like JEDEC? (Not that consumers don't play a role here too).
The only essential part of sockets vs solder is the metal-metal contacts. The size of the modules and the distance from the CPU/GPU are all adjustable parameters if the will exists to change them.
Yes. The "conservative attitudes" of JEDEC et al. are a consequence of physics and the capabilities of every party involved in dealing with it, from the RAM chip fabricators and PCB manufacturers, all the way to you, the consumer, and the price you're willing to pay for motherboards, power supplies, memory controllers, and yield costs incurred trying to build all of this stuff, such that you can sort by price, mail order some likely untested combination of affordable components and stick them together with a fair chance that it will all "work" within the power consumption envelope, thermal envelope, and failure rate you're likely to tolerate. Every iteration of the standards is another attempt to strike the right balance all the way up and down this chain, and at the root of everything is the physics of signal integrity, power consumption, thermals and component reliability.
As I said, consumers play a part here too. But I don't see the causal line from the physics to the stagnation, stovepiping, artificial market segmentation, and cartelization we see in the computer component industries.
Soldering RAM has always been around and it has its benefits. I'm not convinced of its necessity, however. We're only now getting a new memory socket form factor, but the need was emerging a decade ago.
> The only essential part of sockets vs solder is the metal-metal contacts.
Yeah... And that’s a pretty damn big difference. A connector is always going to result in worse signal integrity than a high-quality solder joint in the real world.
No doubt the most tightly integrated package can outperform a looser collection of components. But if we could shorten the distances, tighten the tolerances, and have the IC companies work on improving the whole landscape instead of just narrow, disjointed pieces slowly one at a time, then would the unsoldered connections still cause a massive performance loss or just a minor one?
Yes. Signal integrity is so finicky at the frequencies DRAM operates at that it starts to matter whether the plated holes that complete the circuit are drilled all the way through the board or stopped halfway, because signals permeate into the stubs of the holes and reflect back into the trace, causing interference. Adding a connector between RAM and CPU is like extending the long pole in the middle of the tent by inserting a stack of elephants into what is already shaped like an engine crankshaft found in a crashed wreck of a car.
Besides, no one strictly needs mid-life upgradable RAM. You just want to be able to upgrade RAM after purchase because it's cheaper upfront and because it leaves less room for supply-side price gouging. Those aren't technical reasons; you could option 2TB of RAM at purchase and be done for 10 years.
In the past, at least, RAM upgrades weren't just about filling in the slots you couldn't afford to fill on day one. RAM modules also got denser and faster over time. This meant that, after waiting a couple of years, you could add more and better RAM to your system than was even physically possible to install upfront.
Part of the reason I have doubts about the physical necessity here is because PCI Express (x16) is roughly keeping up with GDDR in terms of bandwidth. Of course they are not completely apples-to-apples comparable, but it proves at least that it's possible to have a high-bandwidth unsoldered interface. I will admit though that what I can find indicates that signal integrity is the biggest issue each new generation of PCIe has to overcome.
It's possible that the best solution for discrete PC components will be to move what we today call RAM onto the CPU package (which is also very likely to become a CPU+GPU package) and then keep PCIe x16 around to provide another tier of fast but upgradeable storage.
I am personally dealing with PCIe signal integrity issues at work right now, so I can say yes, it’s incredibly finicky once you start going outside of the simple “slot below CPU” normal situation. And I only care about Gen 3 speeds right now.
But in general yes, PCIe vs RAM bandwidth is like comparing apples to watermelons. One’s bigger than the other and they’re both fruits, but they’re not the same thing.
Generally people don’t talk about random-access PCIe latency because it generally doesn’t matter. You’re looking at a best-case 3x latency penalty for PCIe vs RAM, usually more like an order of magnitude or more. PCIe is really designed for maximum throughput, not minimum latency. If you make the same tradeoffs with RAM you can start tipping the scale the other way - but people really care about random access latency in RAM (almost like it’s in the name) so that generally doesn’t happen outside of specific scenarios. 500ns 16000MT/s RAM won’t sell (and would be a massive pain - you’d probably need to 1.5x bus width to achieve that, which means more pins on the CPU, which means larger packages, which means more motherboard real estate taken and more trace length/signal integrity concerns, and you’d need to somehow convince everyone to use your new larger DIMM...).
You can also add more memory channels to effectively double/quadruple/sextuple memory bandwidth, but again, package constraints + signal integrity increase costs substantially. My Threadripper Pro system does ~340GB/s and ~65ns latency (real world) with 8 memory channels - but the die is huge, the CPUs are expensive as hell, and the motherboards are also expensive as hell. And for the first ~9 months after release the motherboards all struggled heavily with various RAM configurations.
Perhaps it's time to introduce L4 Cache and a new Slot CPU design where RAM/L4 is incorporated into the CPU package? The original Slot CPUs that Intel and AMD released in the late 90s were to address similar issues with L2 cache.
Intel's Arrow Lake platform launched in fall 2024 is the first to support CUDIMMs (clock redriver on each memory module) and as a result is the first desktop CPU to officially support 6400MT/s without overclocking (albeit only reaching that speed for single-rank modules with only one module per channel). Apple's M1 Pro and M1 Max processors launched in fall 2021 used 6400MT/s LPDDR5.
Intel's Lunar Lake low-power laptop processors launched in fall 2024 use on-package LPDDR5x running at 8533MT/s, as do Apple's M4 Pro and M4 Max.
So at the moment, soldered DRAM offers 33% more bandwidth for the same bus width, and is the only way to get more than a 128-bit bus width in anything smaller than a desktop workstation.
Smartphones are already moving beyond 9600MT/s for their RAM, in part because they typically only use a 64-bit bus width. GPUs are at 30000MT/s with GDDR7 memory.
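For context, here is the back-of-the-envelope arithmetic behind those figures (a minimal sketch; the 512-bit bus width for the M4 Max is my assumption, consistent with the 546GB/s number quoted elsewhere in the thread):

    # Peak bandwidth = transfer rate (MT/s) x bus width (bits) / 8, in MB/s.
    def peak_gb_per_s(mt_per_s, bus_bits):
        return mt_per_s * bus_bits / 8 / 1000

    print(peak_gb_per_s(6400, 128))   # ~102 GB/s: CUDIMM DDR5 on a 128-bit desktop bus
    print(peak_gb_per_s(8533, 128))   # ~137 GB/s: on-package LPDDR5x, same width (~33% more)
    print(peak_gb_per_s(8533, 512))   # ~546 GB/s: a 512-bit bus, M4 Max class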
I was surprised at the previous comparison on the omarchy website, because Apple M* chips work really well for data science work that doesn't require a GPU.
It may be explained by integer vs float performance, though I am too lazy to investigate. A weak data point, multiplying an N=6000 matrix by itself in numpy:
- SER 8 8745, linux: 280 ms -> 1.53 Tflops (single prec)
- my m2 macbook air: ~180 ms -> ~2.4 Tflops (single prec)
This is 2 mins of benchmarking on the computers I have. It is not an apples-to-apples comparison (e.g. I use the default numpy BLAS on each platform), but it's not completely irrelevant to what people will do without much effort. And floating point is what matters for LLMs, not integer computation (which is most likely what the ruby test suite is bottlenecked by).
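For anyone who wants to reproduce this, a minimal sketch of the benchmark described above (assuming numpy with each platform's default BLAS, single precision as in the quoted numbers):

    import time
    import numpy as np

    N = 6000
    a = np.random.rand(N, N).astype(np.float32)

    start = time.perf_counter()
    a @ a                                    # N x N matrix product, ~2*N^3 flops
    elapsed = time.perf_counter() - start

    print(f"{elapsed * 1e3:.0f} ms -> {2 * N**3 / elapsed / 1e12:.2f} Tflops (single prec)")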
Apple M chips are slower on computation than AMD chips, but they have soldered on-package fast RAM with a wide memory interface, which is very useful for workloads that handle lots of data.
Strix Halo has a 256-bit LPDDR5X interface, twice as wide as the typical desktop chip, roughly equal to the M4 Pro and half that of the M4 Max.
You're most likely bottlenecked by memory bandwidth for an LLM.
The AMD AI MAX 395+ gives you 256GB/sec. The M4 gives you 120GB/s, and the M4 Pro gives you 273GB/s. The M4 Max: 410GB/s (14‑core CPU/32‑core GPU) or 546GB/s (16‑core CPU/40‑core GPU).
I think DHH compares them because they are both the latest, top-line chips. I think DHH's benchmarks show that they have different performance characteristics. But DHH's favorite benchmark favors whatever runs native Linux and Docker.
For local LLMs, the higher memory bandwidth of the M4 Max makes it much more performant.
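A rough way to see why those bandwidth numbers dominate local LLM speed (a back-of-the-envelope sketch; the model size here is an illustrative assumption, not from the thread): each generated token requires streaming roughly the whole set of weights through memory once, so tokens/sec is bounded by bandwidth divided by model size.

    # Decode speed upper bound ~= memory bandwidth / bytes of weights read per token.
    bandwidth_gb_s = {
        "AMD AI Max+ 395": 256,
        "M4": 120,
        "M4 Pro": 273,
        "M4 Max (40-core GPU)": 546,
    }
    model_size_gb = 40  # assumption: ~70B-parameter model at ~4-5 bit quantization

    for chip, bw in bandwidth_gb_s.items():
        print(f"{chip}: ~{bw / model_size_gb:.1f} tokens/s upper bound")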
Now, but after listening to podcasts with him, I think he's someone who would tackle hard stuff like drivers or DSP (so-called math-genius-level coding) as soon as it becomes more accessible to him through AI-assisted coding.
There is a chance to build a real macOS/iOS alternative without a JVM abstraction layer on top like Android. The reason it hasn't happened yet is the GPL firewall around the Linux kernel, IMO.
An M4 Max has double the memory bandwidth and should run away with similarly optimized benchmarks.
An M4 Pro is the more appropriate comparison. I don't know why he's doing price comparisons to a Mac Studio when you can get a 64GB M4 Pro Mac Mini (the closest price/performance comparison point) for much less.
> don't know why he's doing price comparisons to a Mac Studio when you can get a 64GB M4 Pro Mac Mini (the closest price/performance comparison point) for much less.
Where?
An M4 Pro Mac Mini is priced higher than the Framework here in Canada...
Depends on the benchmark, I think. In this case it's probably close. Apple is cagey when it comes to power draw or clock metrics, but I believe the M4 Max has been seen drawing around 50W in loaded scenarios. Meanwhile, Phoronix clocked the 395+ as drawing an average of 91 watts during their benchmarks. If the performance is ~twice as fast, that works out to similar performance per watt. Needless to say, it's at least not a dramatic difference the way it was when the M1 came out.
edit: Though the M4 Max may be more power hungry than I'm giving it credit for, it's hard to say, because I can't figure out if some of these power draw metrics from random Internet posts actually isolate the M4 itself. It looks like when the GPU is loaded it goes much, much higher.
It's not baffling once you realize TSMC is the main defining factor for all these chips, Apple Silicon is simply not that special in the grand scheme of things.
Why do you think TSMC's production being in Taiwan is basically a national security issue for the U.S. at this point?
> Apple Silicon is simply not that special in the grand scheme of things
Apple Silicon might not be that special from an architecture perspective (although treating integrated GPUs as appropriate for workloads other than low end laptops was a break with industry trends), but it’s very special from an economic perspective. The Apple Silicon unit volumes from iPhones have financed TSMC’s rise to semiconductor process dominance and, it would appear, permanently dethroned Intel.
Apple was just the highest bidder for getting the latest TSMC process. They wouldn't have had a problem getting other customers to buy up that capacity. And Intel's missteps counted for a substantial part of the process dominance you refer to. So I'd argue that Apple isn't that special here either.
Until Apple forced other chip makers to respond, nobody else was making high end phone processors. And their A series processors are competitive with and have transistor counts comparable to most mobile and desktop PC processors (and have for years). So the alternate reality where Apple isn't a TSMC customer means that TSMC is no longer manufacturing several hundred million high transistor count processors per year. In my opinion, it’s pretty likely TSMC isn’t able to achieve or maintain process dominance without that.
Update: To give an idea of the scales involved here, Apple had iPhone revenue in 2024 of about $200B. At an average selling price of $1k, we get 200 million units. That's a ballpark estimate; they don't release unit volumes, AFAIK. This link from IDC[1] has the global PC market in 2024 at about 267 million units. Apple also has iPads and Macs, so their unit processor volume is roughly comparable to the entire PC market. But, and this is hugely important: every single processor that Apple ships is comparable in performance (and, thus, transistor count) to high-end PC processors. So their transistor volume probably exceeds the entire PC CPU market. And the majority of it is fabbed on TSMC's leading process node in any given year.
I'm pretty sure many of the Windows laptops with the Qualcomm Snapdragon Elite chip have the same or better battery life and comparable performance in a similar form factor. There are many videos online of comparisons.
No not exactly, it was more "I'm not arrogant enough to assume I know someone better than they know themselves, I'm not going to dogmatically explain a person to themselves as if I have all the information and can magically intuit and deduce everything about them." This is still generally true, not just for myself, but for every rare true student of human nature there are a dozen fools trying to make a read and failing miserably.
So.. a smart phone?