Programming embedded devices is not the same thing as "embedded-systems programming". The latter means, first and foremost, that the software is not allowed to crash, ever, for any reason, because people's lives depend on it.
I did some initial requirements work on a system to monitor continuous-web papermaking machinery; the line had to be stopped, physically and completely, within 100ms if anything went wrong, because an uncontained web of paper can literally cut people in half. They wanted, in order to be able to hire, to use one of the embedded flavors of a well-known consumer-grade OS, and I had to prove to them that there was no way to make any of them safe, at any cost. And they knew their hardware, because they had built it themselves.
The absolute last resort is a watchdog timer that hits the reset button if N milliseconds go by without the software telling it it's okay. This is what you have to implement if you are dealing with buggy and undocumented hardware -- as, all too often, you are. Sometimes you can get some doco for $ and an NDA, but then in order to get the real doco it is much more $$$ and a much tighter NDA, and the existence of that option is not even divulged until after things have already gone very far south.
If it were only a matter of reading the top-level doco for this or that chip, there would be no issue.
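To make the watchdog idea above concrete, here is a minimal software model in C. It is only a sketch: on real parts the counter and the reset line live in a hardware peripheral, and the names used here (`watchdog_t`, `watchdog_init`, `watchdog_kick`, `watchdog_expired`) are made up for illustration, not taken from any vendor's API. The caller is assumed to supply a free-running millisecond tick.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of a watchdog timer. On real hardware the
 * countdown and the reset line are handled by a peripheral, not a struct. */
typedef struct {
    uint32_t timeout_ms;   /* N: longest allowed gap between kicks */
    uint32_t last_kick_ms; /* tick value at the most recent kick */
} watchdog_t;

/* Arm the watchdog with a timeout, starting the countdown at now_ms. */
void watchdog_init(watchdog_t *wd, uint32_t timeout_ms, uint32_t now_ms)
{
    wd->timeout_ms = timeout_ms;
    wd->last_kick_ms = now_ms;
}

/* The software calls this to say "I'm okay"; it restarts the countdown. */
void watchdog_kick(watchdog_t *wd, uint32_t now_ms)
{
    wd->last_kick_ms = now_ms;
}

/* Returns true once timeout_ms or more have passed since the last kick,
 * which is the point where real hardware would pull the reset line.
 * Unsigned subtraction keeps the comparison correct when the millisecond
 * tick counter wraps around. */
bool watchdog_expired(const watchdog_t *wd, uint32_t now_ms)
{
    return (uint32_t)(now_ms - wd->last_kick_ms) >= wd->timeout_ms;
}
```

The usual advice is to kick from the one place in the main loop that proves all the critical work actually finished, not from a timer interrupt, since an interrupt can keep firing happily while the main loop is hung.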
Why do the hardware companies make things so difficult?
If I were selling hardware I’d want it to be as open and well documented as possible, so that more people buy it and so that I get credit for all the great stuff people make with my products.
There are a few reasons, using the hardware manufacturers’ logic.
1) The more you open up your design and its behaviour, the more your competitors can learn about your product and how to possibly improve their own. Even stuff as basic as what specific features/capabilities a specific SKU at a specific price point has can be useful information.
2) The behaviour may be sufficiently undefined as to make documenting it impractical (or a bad look). Specs may also be padded (“up to 14 bits of SNR” may mean you’re getting 8 most of the time unless you’ve got a golden sample, and you’re not getting the distribution without paying big bucks and signing a big NDA). This ties in with 1) - if your competitors know your exact yields, they might be able to advertise being better/more reliable more truthfully, or even cheap out on their manufacturing a bit to drop their own yields down to match or just slightly beat yours.
3) The behaviour might be unknown. There’s obviously a crazy amount of validation testing that goes into high-end chips, but even the best test plan can miss things. This is especially true when you’re talking about high-speed stuff and anything involving power delivery/voltage fluctuations, or async/pipeline executions, or a million other things that can go wrong. Again ties into 1) - if your competitor knows that your chip might deadlock the radio with an obscure pattern of inputs and control signals, that could give them insight into how you’ve laid out your silicon and might give them optimization ideas.
4) If all the available info is given out freely, then potential customers can easily compare manufacturers and pick the best one. The manufacturers don’t want this, unless they’re the best, for obvious reasons. And, because everything’s locked down so tightly, no one knows if they’re the best until the chips are on the market and the volume contracts are already signed. And those contracts are hard to break, since the specs agreed upon are pretty vague due to 1-3.
5) The manufacturer knows their chips suck, but needs them moved anyway. This is rarely the case with non-discount manufacturers, but it can happen. In this case, you don’t want to give away anything you don’t have to, because most info you give out is going to drive customers away to a better option. A good example in the consumer space is Intel refusing to publish acceptable voltage specs for their 12th-14th gen Core chips, which resulted in motherboard manufacturers overvolting and killing high-end CPUs to try to meet the frequency specs Intel was advertising. If Intel had been truthful in their voltage and frequency specs, only a minuscule percentage of chips could actually have hit the advertised frequency at safe voltages, and 99% would have had worse performance than expected, which would almost definitely have resulted in lower sales.
6) The behaviour may be highly dependent on external factors. Basic example: a chip with external DRAM might have its execution pipeline stalled more or less frequently based on the DRAM spec, or a wobbly voltage regulator might be known to cause lockups when certain operations are executing. Are you going to tell your customer about those problems, or just say “we recommend high-speed DRAM and high-quality VRMs”? Especially if the other guy just says “we recommend high-speed DRAM and high-quality VRMs”?
The world would likely be a better place without such logic, but the incentive is there. Until someone comes and breaks the paradigm, I don’t see things changing.