whizzter's comments | Hacker News

I think the claim might harken back to the days when programming was a new thing and mathematicians, physicists, etc. were the ones most often getting started with it; if they had by training gotten used to 1-based indexing in mathematics, it was probably a bit of a pain to adapt (and it's why R, MATLAB, etc. use 1-based indexing).

Thus, 1 probably wasn't "easier"; it just adhered to an existing orthodoxy that the "beginners" of the time came from.


Yeah, mgmt (and more than anything, query tools) is gonna be a PITA.

But looking at it in a different way: say you're building something like Google Sheets.

One could place user management in one single-threaded database (even at 200k users you probably don't have too many administrators modifying it concurrently) whilst "documents" get their own databases. I'm prototyping one such document-centric tool and the per-document DB thinking has come up; debugging a user's problems could be as "simple" as cloning a SQLite file.
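
A rough sketch of the shape I mean (hypothetical paths and schema, Python's built-in sqlite3, not my actual prototype):

    import sqlite3
    from pathlib import Path

    DATA_DIR = Path("data")                               # hypothetical layout
    (DATA_DIR / "documents").mkdir(parents=True, exist_ok=True)

    # user management lives in one shared, low-contention database
    users_db = sqlite3.connect(DATA_DIR / "users.sqlite")

    def open_document_db(doc_id: str) -> sqlite3.Connection:
        # each document is its own SQLite file (doc_id assumed validated upstream);
        # debugging a user's problem is then "just" copying data/documents/<doc_id>.sqlite
        conn = sqlite3.connect(DATA_DIR / "documents" / f"{doc_id}.sqlite")
        conn.execute("""CREATE TABLE IF NOT EXISTS cells (
            row INTEGER NOT NULL, col INTEGER NOT NULL, value TEXT,
            PRIMARY KEY (row, col))""")
        return conn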

Now on the other hand, if it's some ERP/CRM/etc. system with tons of linked data, that naturally won't fly.

Tool for the job.


I think a lot of people have a hard time differentiating the underlying systems from what they _see_, and use that to bash MS products.

I heard that it was perhaps recently fixed, but copying many small files used to be multiple times faster via something like Total Commander vs the built-in File Explorer (large files go equally fast).

People seeing how slow Explorer was to copy would probably presume that it was a lower level Windows issue if they had a predisposed bias against Microsoft/Windows.

My theory about Explorer's sluggishness is that they added visual feedback to the copying process at some point, and for whatever reason that visual feedback is synchronous/slow (perhaps capped at the framerate, thus 60 files a second), whilst TC does the updating in the background and just renders status periodically while the copying thread(s) can run at the full speed of whatever the OS is capable of under the hood.
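
Roughly the pattern I mean, as a toy sketch in Python (not what either program actually does internally, and the paths are made up): the copy thread only bumps a counter, while a separate loop repaints on its own schedule:

    import shutil, threading, time
    from pathlib import Path

    progress = {"done": 0}             # written by the copy thread, read by the UI loop
    finished = threading.Event()

    def copy_all(files, dest):
        for f in files:
            shutil.copy2(f, dest)      # copy at full speed, no per-file UI round-trip
            progress["done"] += 1      # cheap counter bump instead of a synchronous repaint
        finished.set()

    def show_progress(total):
        # repaint a few times per second; the copy thread never waits for this
        while not finished.is_set():
            print(f"copied {progress['done']}/{total}", end="\r", flush=True)
            time.sleep(0.25)
        print(f"copied {progress['done']}/{total}")

    files = [p for p in Path("src").glob("*") if p.is_file()]        # hypothetical source dir
    worker = threading.Thread(target=copy_all, args=(files, "dst"))  # "dst" must exist
    worker.start()
    show_progress(len(files))
    worker.join()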


I dunno about Windows Explorer, but macOS' Finder seems to hash completed transfers over SMB (this must be something it can trigger the receiver to do in SMB itself; it doesn't seem slow enough for the sender to be doing it on a remote file) and remove transferred files that don't pass the check.

I could see that or other safety checks making one program slower than another that doesn’t bother. Or that sort of thing being an opportunity for a poor implementation that slows everything down a bunch.
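
If it is doing something like that, the shape would roughly be the following (pure guesswork about what Finder/SMB actually do, just to illustrate verify-after-copy):

    import hashlib, shutil
    from pathlib import Path

    def sha256_of(path: Path) -> str:
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def copy_verified(src: Path, dst: Path) -> bool:
        # copy, then hash both sides; over a network share the second hash re-reads the remote copy
        shutil.copy2(src, dst)
        if sha256_of(src) != sha256_of(dst):
            dst.unlink(missing_ok=True)   # remove the transferred file that failed the check
            return False
        return True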


A problem with Explorer, one that it shares with macOS Finder[1], is that they are very much legacy applications with features piled on top. Explorer was never expected to be used for heavy I/O work and tends to do things the slowest way possible, including doing things in ways that are optimized for the "random first-time user of Windows 95 who will have maybe 50 files in a folder".

[1] Finder has parts that show continued use of code written for MacOS 9 :V


This blows my mind. $400B in annual revenue and they can't spare the few parts per million it would take to spruce up the foundation of their user experience.

This is speculation based on external observation, nothing internal other than rumours:

A big chunk of that, one that has been increasing over the last decade, is fear that they will break compatibility - or an overall drop in shared knowledge. To the point that the more critical the part, the less anyone wants to touch it (I've heard that ntfs.sys is essentially untouchable these days, for example).

And various rules that used to be sacrosanct are no longer followed, like the "main" branch of the Windows source repository having to always build cleanly every night (fun thing - Microsoft is one of the origins of nightly builds as a practice).


> to bash MS products.

Microsoft gives them a lot of ammo. While, as I said, Microsoft et al. have seen that SMB is indeed efficient, at the same time security has been neglected to the point of being farcical. You can see this in headlines as recent as last week: Microsoft is only now, in 2025, deprecating RC4 authentication, and this includes SMB.

So while one might leverage SMB for high throughput file service, it has always been the case that you can't take any exposure for granted: if it's not locked down by network policies and you don't regularly ensure all the knobs and switches are tweaked just so, it's an open wound, vulnerable to anything that can touch an endpoint or sniff a packet.


Plenty of other workloads benefit from high-performance file access, and with network speeds and disk speeds getting higher whilst single-core perf has more or less plateaued in comparison, it's more and more important to support data paths where kernel switching won't become a bottleneck.

Even in a distributed database you want roughly increasing (even if not strictly monotonic) keys, since the underlying B-tree or whatever will very likely behave badly for entirely random data.

UUIDv7 is very useful for these scenarios since:

A: A hash or modulus of the key will be practically random due to the lower bits being random or pseudo-random (i.e. it distributes well between nodes).

B: The first bits are sortable, thus the underlying storage on each node won't go bananas.
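
A hand-rolled sketch of both properties in Python (not any particular library, and treat the exact bit fiddling as approximate; RFC 9562 has the real layout):

    import os, time, uuid

    def uuid7() -> uuid.UUID:
        ts_ms = time.time_ns() // 1_000_000           # 48-bit ms timestamp up front (B: sortable)
        rand = int.from_bytes(os.urandom(10), "big")  # 80 random low bits (A: spreads well)
        val = (ts_ms << 80) | rand
        val = (val & ~(0xF << 76)) | (0x7 << 76)      # version 7
        val = (val & ~(0x3 << 62)) | (0x2 << 62)      # RFC 4122/9562 variant bits
        return uuid.UUID(int=val)

    k = uuid7()
    print(k, "-> node", k.int % 16)   # the modulus is effectively random, so keys spread across
                                      # nodes, while sort order still roughly follows insertion time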


I fully agree that's wrong (I can't imagine the overhead on some of the larger tables I have if that had happened). That said, people often want weird customizations in medium-sized tables that would set one on the path to annoying 100-column tables if we couldn't express customizations in a "simple" JSON column (that is more or less polymorphic).

A typical example is a price-setting product I work on: there are price ranges that are universal (and DB columns reflect that part), but customers all have weird custom requests for pricing, like rebates on the 3rd weekend after X-mas (but only if the customer is related to Uncle Rudolph who picks his nose).
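
Simplified (and hypothetical, this isn't the real product's schema) it looks something like this:

    import json, sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("""CREATE TABLE price_range (
        id INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL,
        valid_from TEXT NOT NULL,
        valid_to TEXT NOT NULL,
        base_price REAL NOT NULL,   -- the universal part lives in real columns
        custom_rules TEXT           -- customer-specific oddities ride along as one JSON blob
    )""")

    # the "3rd weekend after X-mas" style one-off, stored as data instead of schema
    rules = [{"type": "rebate", "percent": 5, "when": "weekend_after_xmas", "nth": 3}]
    db.execute("INSERT INTO price_range VALUES (1, 42, '2024-01-01', '2024-12-31', 99.0, ?)",
               (json.dumps(rules),))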


But if you have to model those custom pricing structures anyway, the question is what you gain by not reflecting them in the database schema.

There's no reason to put all those extra fields in the same table that contains the universal pricing information.


A lot of unnecessary complexity/overhead for a minor, seldom-touched part of a much larger, already complex system?

I'll give a comparison.

JSON

- We have some frontend logic/view (that can be feature-flagged per customer) to manage updating the data, which otherwise mostly tags along as a dumb "blob" (auto-expanded to a regular part of the JSON object's maps/arrays at the API boundary, making frontend work easier: objects on the frontend, "blobs" on the backend/DB).

- Inspecting specific cases (most of the time it's just null data) is just copying out and formatting the special data.

- If push comes to shove, all modern databases support JSON queries so you can pick out specifics IF needed (this has happened once or twice with larger customers over the years); see the example query after these lists.

- We read and apply the rules when calculating prices with a "plugin system"

DB Schema (extra tables)

- Now you have to wade through lots of customer-specific tables just to find the tables that take most of the work time (customer specifics are seldom what needs work once set up). We already have some older customer-specific stuff from the early days (I'm happy that it hasn't happened much lately).

- Those _very_ few times you actually need to inspect the specific data by query, you might win on this (but as mentioned above, JSON queries have always solved it).

- Loading the universal info now needs to query X extra tables (even when 90%-95% of the data has no special cases).

- Adding new operations on prices, like copying, etc., now needs logic for each customer-specific table to properly make it tag along.

- "properly" modelled this reaches the API layer as well

- Frontend specialization is still needed

- Calculating prices still needs its customization.
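
For the JSON-query point in the first list, this is the kind of one-off query I mean (SQLite's json_each/json_extract shown here, on a cut-down hypothetical table; Postgres and friends have their own JSON operators):

    import json, sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE price_range (id INTEGER PRIMARY KEY, custom_rules TEXT)")
    db.execute("INSERT INTO price_range VALUES (1, ?)",
               (json.dumps([{"type": "rebate", "percent": 5}]),))

    # pick out the rows that carry a particular kind of customer-specific rule
    rows = db.execute("""
        SELECT pr.id, rule.value
        FROM price_range AS pr, json_each(pr.custom_rules) AS rule
        WHERE json_extract(rule.value, '$.type') = 'rebate'
    """).fetchall()
    print(rows)   # only the rows whose JSON blob actually contains a rebate rule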

I don't really see how my life would have been better for managing all the extra side effects of bending the code to suit these weird customer requests (some from companies that aren't customers anymore) when 90-95% of the time it isn't used and is seldom touched with mature customers.

I do believe in the rule of 3: if the same thing pops up three times, I consider whether it needs to graduate to more "systematic" code. So often, when you abstract after seeing something even 2 times, it never appears again, leaving you with some abstraction to maintain.

JSON columns, like entity-attribute-value tables or goto statements, all have real downsides and shouldn't be plonked in without a reason, but it'd be hell if I had to work with overly complex schemas/models because people start putting special cases into core pieces of code just because they heard that a technique was bad.


Italy isn't a puny country; it's over 1000 km between Sicily and the Alps (like LA to Albuquerque). It seems the fault lines reach northern Italy (to about 100 km from the Alps), but the number of larger quakes seems smaller there.

I sometimes feel like I go on and on about this... but there is a difference between applications and pages (even if it's blurry at times), and Next is the result of people doing pages adopting React, which was designed for applications, when they shouldn't have.

People literally play the games they work on all the time; it's more or less what most do.

Pay $2000 for indie games so studios could grow up without being beholden to shareholders, and perhaps we could get that "perfect" QA, etc.

It's a fucking market economy and people aren't making Pong-level games that can be simply tuned; you really get what you pay for.


It's a valid issue; those of us who worked back in the day on GD/DVD etc. games really ran into bad loading walls if we didn't duplicate data for straight streaming.

Data sizes have continued to grow and HDD seek times haven't gotten better due to physics (even if streaming throughput probably has kept up), so the assumption isn't too bad considering history.

It's good that they actually revisited it _when they had time_, because launching a game, especially a multiplayer one, will run into a lot of breaking bugs, and this (while a big one, pun intended) is still by most classifications a lower-priority issue.

