Most consumers aren't running LLMs locally. For most people, on-device AI is whatever Windows 11 is doing, and Windows 11's AI features are going over like a lead balloon. The only open-weight models that come close to the major frontier models require hundreds of gigabytes of high-bandwidth RAM/VRAM, and your average PC buyer isn't interested in running their own local LLM anyway. The AMD AI Max and Apple M chips are good for the niche that is. Consumer dedicated GPUs just don't have enough VRAM to load most modern open-weight LLMs.
I remember when LLMs were taking off and open-weight models were nipping at the heels of the frontier models, people would say there's no moat. The new moat is high-bandwidth RAM, as we can see from the recent RAM pricing madness.
Your average PC buyer doesn’t know what an LLM is, let alone why they should run one locally.
They just want a good PC that runs Word and Excel and likely find the fact that Copilot keeps popping up in Word every time they open a new document to be annoying rather than helpful.
Pebble seems to be the only watch company that I feel understands what a smartwatch should be. It feels like everyone else is trying to make a phone replacement, whereas Pebbles are more of an extension of your phone. Sure, it can do some things without the phone, but it isn't trying to make calls, access mobile data with its own SIM, or use GPS. Those kinds of watches feel like they're designed for athletes who want to leave their phone at home. I'm not that guy. My phone comes with me everywhere, so why would I want phone-lite on my wrist when the phone in my pocket can do it better?
Pebble also gets battery life: two weeks, compared to one day on my Pixel Watch 3. Want to use that cool sleep-tracking feature on your smartwatch? Guess what? It's on the charger.
I'm a runner, so obviously biased, but I implore everyone (to the point of annoyance) to check out some of the cheaper Garmin smartwatches. They use MIP displays so the battery life runs about 2 weeks, and you get phone notifications, the ability to find your phone, sleep scoring, step counting, and heart rate monitoring. Then there's the obvious GPS run recording, which you don't have to use. There's more stuff as well that I don't really use, like NFC card payments and music controls, but overall it hits a nice balance of features versus battery life.
For the sake of fair comparison, my wife had an Apple watch, which looked better and had way more features, but the 1 day battery life became such a frustration it sat in a dresser drawer. My last Garmin lasted 5 years with daily use and sports, and only died because I took it into the sea on vacation after the waterproof seal failed on the screen. I replaced it the day I got back with the successor model and couldn't be happier.
I'm not shilling for Garmin (or at least not being paid to); I love the Pebbles and I'm very much looking forward to the launch, as I want a more fashionable smartwatch. Apple, Samsung et al. have kinda tainted the smartwatch market with feature vomit, when in fact there's a lot of good stuff out there; it's just not as hip.
The difference between such Garmin watches and Pebble Round 2 seems to be trading off hardware like built-in GPS and NFC for open source software and thinness. 100% worthwhile trade IMO.
Isn't this thread about the Pebble Round 2, which is the closest in form factor to a Garmin? Moreover, GPS is lacking on every Pebble, and that's a major use case for me, so Garmins and Pebbles can't really be compared: two different use cases.
Even the PT2 is significantly thinner than Garmins, by roughly 2mm. That's about 15% thinner, which is a big deal in my book. It's not as thin as AWs, but not off by much. Based on photos, it looks like the PR2 will be another 2mm thinner than the PT2, which would make it thinner than AWs.
Every Garmin I've tried, cheap or expensive, has been a complete mess: laggy, and deeply unstable in its connection and in which notifications it supported (if any).
I'd love an alternative, but from the models I've tried I don't think Garmin is anywhere near what I liked about Pebbles. Closer than some brands, but not anywhere near what I'd consider "close". Bangle.js is closer, for all its (many) flaws.
I switched over to Coros devices and have been fairly impressed with the performance over the last 5 months. Excellent battery life on the APEX series.
Seconding the Garmin watches. I've been using a Forerunner for years and the battery still lasts over a week with an always-on display. It does about as much as a Pebble and even has a documented SDK with sideloading. The refresh rate is very low but totally fine for a watch; I think newer models have much better displays. The only thing I wish were better is exporting all the tracked health data in a format I can use myself. Using the official tool for that, I couldn't even get heart rate data.
> They use MIP displays so the battery life runs about 2 weeks
Double-check this because they have a lot of OLED models now alongside their MIP ones. Battery life is more or less the same either way with AOD off, but with AOD enabled the OLEDs fall behind the MIPs.
OLED is also a heck of a lot less readable in sunlight. They're constantly improving, but the brightness they'd need to pump out to actually compete is extreme, and they're still nowhere near it.
I find the Garmin UI to be awful, and the Pebble UI to be a breeze. Also, Garmins are pretty bulky compared to Pebbles, and many of them don't have buttons that can be used to control music, for those of us who find touchscreen interfaces to be lacking.
If you don't want to use the touchscreen (I'm in the same boat, totally get it), you need to avoid their "lifestyle" ranges (Venu, Vivoactive, etc.) and stick to the "outdoors"/"sport" ranges (Forerunner being the most entry-level of these); those have 5 side buttons (3 left, 2 right) and a UI designed around button-only use.
Man I wish the Garmin UI was better. I have a vivomove luxe that looks absolutely gorgeous (more hybrid smart watches PLEASE!) but the UI is so hard to navigate that I wear my ugly old Apple Watch much more.
Huh? Are you sure you aren’t thinking of another brand? Literally every garmin watch I can find on their site has buttons.
The size is also very much watch specific. They will all be thicker than a pebble, but they’ll also all have far more features. Like pulse ox, which is one of the main drivers of thickness.
Nope, some are touchscreen only or have only 1-2 buttons. The Garmins with 5 buttons (e.g. Forerunner 55 at ~$170) are decent once you get used to the button mechanisms, but Pebble's UX for "productivity notifications" has always been top-tier.
I seriously cannot understand how to see old notifications on my Garmin. On the Pebble it was just "scroll up" (or down, I don't remember). But on the Garmin it's like multiple button pushes, and even then the list of notifications is not complete. I basically figure if I don't see a notification when it comes in I won't be able to find it in the Garmin's history.
Appreciate the help! That contains some notifications, but not all. For example, none of my text messages or emails are there. It's mostly a bunch of alerts from my security system/cameras, for some reason.
Garmin Lily 2 Classic (shazam!). Certain Venu and VivoActive models seem to be 1-button or 2-button.
And also by "touch screen only", I mean like: "can you set an alarm with the buttons like a CASIO from 1982?" ...if you have to use the touch-screen for swiping like a monkey in a one square inch area to set (or turn on) an alarm, then the watch "doesn't have buttons" IMHO.
Pebble had Up/Ok/Down on the right side, and "Android-Back" on the lower-left. So you just generally navigated tree-like menus, and you could set shortcuts to long-presses of up/ok/down (ie: start/request Uber, next train from nearest station, music controls).
I can't wait to have it again. While Apple says "you don't need to be tied to your phone!" with their watches, Pebble actually delivered on it. You still needed your phone nearby or in Bluetooth range, but you could comfortably "leave it" on the table, or in the bedroom or wherever, and not worry about missing an important phone call, while still getting "just enough" of the internet dripping through that you didn't need the phone unless you were actually picking it up for a task.
When I was looking a couple years ago, most Garmins had at least 2 buttons, but only those with 5 supported music control via buttons.
I think I have used the pulse oximeter maybe 1x/year, and that's counting during COVID shutdowns, when people talked about pulse ox more than in normal times.
I will keep my Garmin and will use it when exercising. But I would never buy another one as long as I can get Pebbles instead.
I had two Pebble watches, and I used them daily for years. I rarely use my Pixel Watch 3, mainly because of charging. I only have one proprietary charger for the watch, and sometimes it's on my desk, sometimes near my bed, sometimes somewhere I can't find. I don't need my watch, but I do need my phone, so I charge the phone and forget that my watch exists for a few months at a time. I think the biggest hurdle for me and watches is daily charging. I will not buy another smartwatch unless the battery lasts at least a week. The Pebble Round 2 having a two-week battery is great!
I've looked at Garmin, because I have the Fitbit Sense 2 and was looking for something with a reasonable battery life.
However, I think Garmin has made the mistake of overcomplicating their product offerings. I ended up pre-ordering a Pebble because I instinctively don't like a company that tries to segment their smartwatch market that hard.
> For the sake of fair comparison, my wife had an Apple watch, which looked better and had way more features, but the 1 day battery life became such a frustration it sat in a dresser drawer.
To each their own, but it sounds like your wife just couldn't get into the "happy path" routine of an Apple Watch user.
I've been using an Apple Watch since Series 5 introduced the always-on display. I wear it for roughly 23 hours a day, and charge it whenever I'm in the bathroom. I'm fine with this routine 99% of the time, but I'm also not someone who'd camp or stay outdoors for more than a night.
Before that, I was using an Amazfit Bip and was really proud of its 30+ day battery life. I very much prefer the features the Apple Watch has.
as someone who only recharges their garmin watch maybe once a month (with dozens of hours of activity tracking), lol at daily recharging of a watch. that completely eliminates it as a possible product for me.
even after a few years with battery degradation I rarely recharge my watch more than once every 2-3 weeks.
it's kind of wild to me that folks would daily recharge a watch.
That happy path works okay for a while but provides very little margin for when the battery inevitably starts to degrade. I’m a few years in and now every few days it’s started to die at around 8pm (yet claims the battery health is still just barely outside replacement range which is … quite convenient for Apple).
As a reluctant runner, I still don't see any value in a smartwatch. I just use my phone and it does everything I want, which is basically, play podcasts and record my run for Strava.
I did previously have a smartwatch which did heart rate monitoring, but really, once I'd confirmed that when I exercised harder my heart rate went up, I lost interest in it.
I don't see the point of smartwatches either. I wear a Casio G-Shock with the backlight button that sits right up on the front of the watch. I'm on my second watch now because my sister gifted it to me; the first is ticking away happily with zero charges or battery changes to date.
0 reasons to change.
My sister, OTOH, has an Apple Watch that she never charges; it lies in a drawer, which I hear about whenever she's trying to find her phone. The conversation ends with "eh, I should charge it maybe."
If I ever buy a smartwatch, it will likely be the Pebble.
I tried a Garmin for a while but the UI bugs/inconsistencies/onboarding process put me off a lot so I eventually got rid of it. Using an old Apple watch SE at the moment and apart from the minor inconvenience of charging it overnight (no need for sleep tracking) it does everything better.
Not defending Garmin, but they completely redid the onboarding process last year. Watch data and everything transfers right over to new devices now. It took me like 10 minutes to set up my new Fenix a few months ago.
The problem I have with Garmin is the lack of support for older devices. They practically bricked my old bike computer. Unfortunately it's been a while and I can't recall the details of the issue. I have since switched to Wahoo, but have only had it for about 3-4 years now.
Exactly this. Pebbles feel like they were built from the ground up to be a watch, whereas the Apple Watch and Android Wear feel like they started from a phone and stripped things away until it became a watch.
Separately, it baffles me that Garmin, despite them having also built a watch OS from the ground up, never understood watch/limited-button UX. Their Instinct and Forerunner watches have all sorts of wonky, hidden and arcane interactions with buttons (long press this to X, press this here to Y). Pebble proves that a simple, shallow, and linear menu system works great!
> Pebble proves that a simple, shallow, and linear menu system works great!
Hard to say this is true when Garmin watches are far more successful than Pebble. That aside, the Forerunner is a sports watch first, where you want lots of physical buttons that don't get bothered by sweat. The better Garmin comparison is the Venu series which only have two buttons https://www.garmin.com/en-US/c/wearables-smartwatches/?serie....
I'm making a subjective comparison here, true. But spend fifteen minutes with each company's watches and you'll see what I mean.
> Hard to say this is true when Garmin watches are far more successful than Pebble.
A company's success != UX efficacy. That's like saying Apple's products had terrible UX in 1997 because they were flailing up against their Microsoft counterparts of the same era, despite the fact that Apple's UX guidelines of the nineties are regularly raised here as a rubric for UX evaluation, even against Apple's own modern products!
> The better Garmin comparison is the Venu series which only have two buttons
I'm not sure you've ever used a Pebble, but Pebble OS is entirely button-driven with four buttons, whereas the Forerunner and Instinct have five. I've never used a Venu, but isn't it primarily touchscreen-driven?
(yes, the upcoming pebble watches do have touchscreens, but I believe that's just for use in apps and watchfaces, not navigating the system)
> Hard to say this is true when Garmin watches are far more successful than Pebble
This may not be true for long, honestly. Pebble hasn't made watches in years, and I wouldn't be surprised if within 2-4 years they were selling more units than Garmin. The Pebble UI is a dream, especially compared to Garmin. I could never get my parents to get a Garmin, but a Pebble could totally work for them. Super intuitive, hardly needs charging, gives them notifications when they're in a different room than their phone, always-on/always-readable screen.
Very unlikely. The reason Garmin watches are successful is because they've carved out an audience (athletes, health and exercise focused). Pebble might have a nice UI but most people would be better off with an Apple Watch or whatever the current flavour of the week is on Android
I think a lot of people bought AWs because they seemed like the right thing to get, integrated easily, and were more or less easy to use.
But most people I know who have AWs don't use most of the functionalities they provide. If you went up to 20 random AW wearers and ask them if they would give up a bunch of features they don't use (like the awful Siri assistant) in exchange for 15-30x the battery life, I think a lot of them would say yes.
Add onto that the fact that Pebbles are cheaper than AWs, and I think we're going to see a non-trivial number of people "upgrading" from AWs to Pebbles when the batteries start to degrade.
Ironically, I just talked to all my mates about our Apple Watches, and universally Siri on your wrist for setting timers and replying to messages with voice, completely hands free, was the killer app that everyone agreed on.
Setting a timer is as simple as bringing your wrist to your face and saying the amount of time.
I literally only use Siri on my Apple Watch, I’ve only triggered it accidentally on my iPhone and have the hot word disabled on all my other devices. Of course, all I ever use it for is setting timers and alarms on the watch, but still…
You might be thinking of Garmin's now-smartwatch devices. The first Forerunners looked like something you'd strap on a bike's handlebars. They weren't referred to as smartwatches but as "personal trainers," and didn't seem to display the time of day, so they'd barely classify as watches. Pebbles and the predecessor InPulse seem to have always been smartwatches, though the idea seems to have started with wanting to avoid taking out one's phone while on the bike. Garmin pivoted, but I don't think Pebble did.
What I want is a smart watch that lets me map hardware buttons and rotary knobs to arbitrary actions on my Android phone. For example I want to be able to use the knob to control the volume of whatever I'm listening to on my phone.
I worked on a prototype of this idea back in school[0]
No rotary dial, and... I think no, the closest you'd get is to run an app on the watch (so you can capture the three not-"back" buttons) plus a companion app on your phone to receive events from it and do [stuff]. But that'd mean you'd have to open the app on your watch to control what's on your phone.
The media controls might be able to do the volume part automatically out of the box though? I forget exactly what their UI did when I had music playing, but I think it had volume and skip controls. That's nowhere near arbitrary control though.
I love the bells-and-whistles watches, and I'm not an athlete. I love that if I forget my phone, I still get all the notifications. I can pay with it. I can watch my home security. Something like the Pebble wouldn't work for me anymore, despite my being a first-day Kickstarter backer. The charging doesn't bother me, because for me a watch is mainly for when I'm not home.
I use a Fenix 7x pro solar and it's one of my favorite pieces of hardware right now. I dread having yet another item to charge and keep track of, but this thing lasts a full month if I'm not actively tracking workouts. My only complaint is it's not hackable like a pebble, but honestly I'm not sure what functionality I'd add to it, other than Doom for lols. I really just use it to tell the time, see phone notifications as they arrive, altitude/baro/compass when outdoorsing, and for heart rate tracking. Works great with Gadgetbridge, handles the abuse of my physical job, and doesn't get in my way like other smart watches I've tried, where I had to remember to charge it every other day or I couldn't track my sleep. This watch lives on my wrist and tells me days in advance when it needs a charge!
I've never tried Pebbles but Huawei also has the same philosophy, mine has 2 weeks battery life and does all I need (which also doesn't include replacing the phone). I don't understand why people would buy watches with 1 or 2 days of battery life.
I like Amazfit well enough but I found the UI to be English-second and therefore a bit confusing. Also, I don't trust a company like Amazfit to have my GPS location at all times.
> Pebble seems to be the only watch company that I feel understands what a smartwatch should be.
So, why do you think Pebble didn’t succeed? I think that’s because you’re a minority, and demand for a Pebble-like product is too low at the price point where it would be a viable business.
IIRC they got out over their skis financially. Eric did a podcast interview where he talked about what went wrong, I think it was this one. [1]
He's self-funding this company and doing pre-orders, which means that risk should not exist this time around.
But to GP's point, I agree that Pebble knows what smartwatches are, and they make the best ones. But it turns out that lots of people want (or have been convinced by marketing that they want) a wrist-worn computer, which has been a boon for Apple/Google.
I think the new Pebbles will convert a lot of people because the battery life jumps by well over an order of magnitude, going from ~1 day to ~1 month. That and the slick user interface should be attractive to folks who are considering upgrading their AWs as the battery degrades. Some will realize that they don't need all the computer-y functionality that the AW provides and just go with a Pebble. The fact that they're a bit cheaper, and available in a nice-looking round case, is an added bonus.
Like a lot of people, I assumed I would like AWs, and that they would continue to evolve to better and better battery life. But they haven't approached Pebble territory and I can see that the functionality they provide is not worth the tradeoff for me. I just don't care to tap at a computer on my wrist. Maybe other people do, but I'd bet that Eric's going to win over a lot of AW users who realized they are overkill.
The finance comments may be right. Another factor would be marketing budgets and existing brand recognition. The watches are marketed as having all of these features, and I think customers got lost in the feature comparison instead of thinking about whether they really want a smartwatch to do all that. Many customers aren't asking whether they really want to pay another monthly fee for a watch data SIM. I think people lost sight of the fact that their phone can do all that stuff, and they're going to be bringing their phone with them anyway, so why would they need redundant functionality that's generally worse than what their phone can do? If Pebble gets a marketing budget, I hope they focus the messaging on what makes their watches stand out.
I'm not much of an athlete, and I don't have a smartwatch, but the idea of leaving my phone at home sometimes without being totally disconnected from comms does appeal to me somewhat.
Would that not just be replacing a phone with another smaller phone strapped to your wrist?
Doesn't the fact that you are connected and communicable make whatever device you choose to use essentially a phone?
I will say, if it is possible, going out without any form of internet/comms enabled device can be very liberating. We all used to do it, and I think many of us have gotten used to the idea that we need to be on call or have some sort of utility in case of emergencies that are very unlikely to happen.
Google has really stepped up their game with the Pixel Watch 4 in terms of battery life. I easily get at least 3 days, compared to < 1 day for my Pixel Watch 3.
exactly this. My Galaxy Wear sits in a drawer 10 months per year, as I only use it as a wrist-strapped mini-smartphone when I go to the beach. It's too bulky and cumbersome to wear every day. The Pebble Time 2 I plan to use every day, as it does exactly what I want in a smartwatch sans wireless payments
I've been wondering when we will see general-purpose consumer FPGAs, and eventually ASICs, for inference. This reminds me of Bitcoin mining. Bitcoin mining started with GPUs; I think I remember a brief FPGA period that transitioned to ASICs. My limited understanding of Google's tensor processing unit chips is that they are effectively transformer ASICs. That's likely a wild over-simplification of Google's TPU, but Gemini is proof that GPUs are not needed for inference.
I suspect GPU inference will come to an end soon, as it will likely be wildly inefficient compared to purpose-built transformer chips. All those Nvidia GPU-based servers may become obsolete should transformer ASICs become mainstream. GPU Bitcoin mining is just an absolute waste of money (cost of electricity) now; I believe the same will soon be true for GPU-based inference. The hundreds of billions of dollars being invested in GPU-based inference seem like an extremely risky bet that transformer ASICs won't happen, even though Google has already widely deployed its own TPUs.
FPGAs will never rival GPUs or TPUs for inference. The main reason is that GPUs aren't really GPUs anymore. 50% of the die area or more is fixed-function matrix multiplication units and their associated dedicated storage. This just isn't general purpose anymore. FPGAs cannot rival this with their configurable DSP slices. They would need dedicated systolic blocks, which they aren't getting. The closest thing is the Versal ML tiles, and those are entire processors, not FPGA blocks. Those have failed by being impossible to program.
> FPGAs will never rival GPUs or TPUs for inference. The main reason is that GPUs aren't really GPUs anymore.
Yeah. Even for Bitcoin mining GPUs dominated FPGAs. I created the Bitcoin mining FPGA project(s), and they were only interesting for two reasons: 1) they were far more power efficient, which in the case of mining changes the equation significantly. 2) GPUs at the time had poor binary math support, which hampered their performance; whereas an FPGA is just one giant binary math machine.
I have wondered if it is possible to make a mining algorithm FPGA-hard in the same way that RandomX is CPU-hard and memory-hard. Relative to CPUs, the "programming time" cost is high.
My recollection is that ASIC-resistance involves using lots of scratchpad memory and mixing multiple hashing algorithms, so that you'd have to use a lot of silicon and/or bottleneck hard on external RAM. I think the same would hurt FPGAs too.
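If it helps make that recipe concrete, here's a toy Ruby sketch of the scratchpad half of the idea (purely illustrative, not a real proof-of-work; the pad size, round count, and function name are all made up):

```ruby
require 'digest'

# Toy sketch of ASIC/FPGA resistance: a large, header-derived scratchpad plus
# data-dependent reads, so dedicated silicon has to pay for real RAM traffic
# instead of winning with cheap fixed pipelines.
def memory_hard_mix(header, pad_bytes: 2 * 1024 * 1024, rounds: 64)
  words = pad_bytes / 32
  # Fill the scratchpad from the header so it can't be precomputed once and reused.
  pad = Array.new(words) { |i| Digest::SHA256.digest("#{header}:#{i}") }

  acc = Digest::SHA256.digest(header)
  rounds.times do
    # The next read location depends on everything hashed so far,
    # which forces unpredictable memory access patterns.
    idx = acc.unpack1('Q<') % words
    acc = Digest::SHA256.digest(acc + pad[idx])
    pad[idx] = acc
  end
  acc.unpack1('H*')
end

puts memory_hard_mix('block-header-bytes')
```

Real designs like scrypt or RandomX go much further (multiple primitives, bigger pads, random programs), but the scratchpad plus data-dependent access is the part that hurts fixed silicon.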
> Those have failed by being impossible to program.
I think you spoke too soon about their failure; soon they will be much easier to program [1].
Interestingly, Nvidia GPUs are now also moving to a tile-based programming model that targets portability for NVIDIA Tensor Cores [2]. There were recent discussions on the topic here on HN [3].
[1] Developing a BLAS Library for the AMD AI Engine [pdf]:
The AMD NPU and Versal ML tiles (same underlying architecture) have been a complete failure. Dynamic programming models like cuTile do not work on them at all, because they require an entirely static graph to function. AMD is going to walk away from their NPU architecture and unify around their GPU IP for inference products in the future.
I think quantisation will get to the point where the GPUs that run these models are more FPGA-like than graphics renderers.
If you quantize far enough, things begin to look more like gates than floating point units. At that level an FPGA wouldn't run your model; it would be your model.
I feel like your entire comment is a self-contradicting mess.
You say FPGAs won't get dedicated logic for ML, then you say they did.
Why does it matter whether the matrix multiplication units inside the AI Engine are a systolic array or not? The multipliers support 512-bit inputs, which means a 4x8 times 8x4 multiply for bfloat16 at one per cycle, and bigger multiplications with smaller data types. Since it is a VLIW processor, it is much easier to achieve full utilisation of the matrix multiplication units, because you can run loads, stores, and tile processing all simultaneously in the same cycle.
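For anyone following along, the arithmetic behind that figure works out like this (just restating the numbers above, not quoting a datasheet):

```ruby
# 512-bit operands holding bfloat16 values:
bits_per_operand   = 512
bits_per_bf16      = 16
values_per_operand = bits_per_operand / bits_per_bf16  # => 32, i.e. a 4x8 (or 8x4) tile

# (4x8) * (8x4) -> 4x4 output tile, so multiply-accumulates per cycle:
macs_per_cycle = 4 * 8 * 4                              # => 128
puts "#{values_per_operand} bf16 values per operand, #{macs_per_cycle} MACs per cycle"
```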
The only thing that might be a challenge is arranging the communication between the AI Engines, but even that should be blatantly obvious. If you are doing matrix multiplication, you should be using the entire array in exactly the pattern you think they should be using internally.
Who knows, maybe there is a way to implement flash attention like that too.
The Versal stuff isn't really an FPGA anymore. Some of the chips have PL on them, but many don't. The consumer NPUs from AMD are the same Versal AIE cores with no PL. They just aren't configurable blocks in fabric anymore and don't have the same programming model. So I'm not contradicting myself here.
That being said, Versal AIE for ML has been a terrible failure. The reasons why are complicated. One is that the SRAM is not a unified pool: it's partitioned into tiles and can't be accessed by all cores. Additionally, access to this SRAM is only via DMA engines, not directly from the cores. Third, the datapaths feeding the VLIW cores are statically set and require a software reconfiguration to change at runtime, which is slow. Programming this thing makes the Cell processor look like a cakewalk. You've got to program DMA engines, program hundreds of VLIW cores, and explicitly set up the on-chip network fabric. I could go on.
Anyway, my point is FPGAs aren't getting ML slices. Some FPGAs do have a completely separate thing that can do ML, but what is shipped is terrible. Hopefully that makes sense.
I don't think this is correct. For inference, the bottleneck is memory bandwidth, so if you can hook up an FPGA with better memory, it has an outside shot at beating GPUs, at least in the short term.
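To put rough numbers on that, here's a back-of-the-envelope sketch; every figure below is an illustrative assumption, not a benchmark:

```ruby
# Why single-stream decoding is bandwidth-bound: each generated token has to
# stream (roughly) every weight once, so bandwidth, not FLOPS, sets the ceiling.
params          = 70e9      # a 70B-parameter model
bytes_per_param = 1.0       # ~8-bit quantized weights
hbm_bandwidth   = 3.0e12    # ~3 TB/s, ballpark for a modern HBM accelerator

weight_bytes = params * bytes_per_param
tokens_per_s = hbm_bandwidth / weight_bytes
puts format('~%.0f tokens/s per sequence (ignoring KV cache and overheads)', tokens_per_s)
# => roughly 43 tokens/s at batch size 1, with the math units mostly idle;
#    that's why hardware with better effective bandwidth gets a look in.
```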
I mean, I worked with FPGAs that outperformed H200s on Llama3-class models a while and a half ago.
Show me a single FPGA that can outperform a B200 at matrix multiplication (or even come close) at any usable precision.
B200 can do 10 peta ops at fp8, theoretically.
I do agree memory bandwidth is also a problem for most FPGA setups, but Xilinx ships HBM on some SKUs and they are not competitive at inference as far as I know.
I'd like to know more. I expect these systems are 8xvh1782. Is that true? What's the theoretical math throughput - my expectation is that it isn't very high per chip. How is performance in the prefill stage when inference is actually math limited?
This is a common misunderstanding among industry observers (as opposed to practitioners). Each generation of (NVIDIA) GPU is an ASIC with a different ISA, etc. Bitcoin mining simply was not important enough (last year only about $23B of Bitcoin was mined in total, at ~$100,000 per coin). There is ample incentive to implement every possibly useful instruction in the GPU (without worrying about backward compatibility, thanks to PTX).
Transformer ASICs won't happen (defined as: no chip with a single instruction to do SDPA, from anything that is not broadly marketed as a GPU, will reach annualized sales of more than $3B). Mark my words. I am happy to take a bet on longbets.org with anyone on this for $1000, and my half will go to the PSF.
I don't know if they'll reach $3B, but at least one company is using FPGA transformers (that perform well) to get revenue in before going to ASIC transformers:
TPUs aren't transformer ASICs. The Ironwood TPU that Gemini was trained on was designed before LLMs became popular with ChatGPT's release. The architecture was general enough that it ended up being efficient for LLM training.
A special-purpose transformer inference ASIC would be like Etched's Sohu chip.
It all comes down to memory and fabric bandwidth. For example, the state-of-the-art developer-friendly (PCIe 5.0) FPGA platform is the Alveo V80, which rocks four 200G NICs. Basically, Alveo currently occupies the niche of being the only platform on the market to allow programmable in-network compute. However, what's available in terms of bandwidth lags behind even pathetic platforms like BlueField. Those in the know are aware of the challenges in actually saturating it for inference in practical designs. I think Xilinx is super well positioned here, but without some solid hard IP it's still a far cry from purpose-built silicon.
As far as I understand, all the purpose-built inference silicon out there is not being sold to competitors but kept in-house: Google's TPU, Amazon's Inferentia (horrible name), Microsoft's Maia, Meta's MTIA. It seems that custom inference silicon is a huge part of the AI game. I doubt GPU-based inference will be relevant/competitive for long.
Gemini is likely the most widely used gen AI model in the world considering search, Android integration, and countless other integrations into the Google ecosystem. Gemini runs on their custom TPU chips. So I would say a large portion of inference is already using ASIC. https://cloud.google.com/tpu
"Soon" was the wrong word; I should have said it's already happening. Google Gemini already runs on their own TPU chips. Nvidia just dropped $20B to buy the IP for Groq's LPU (custom silicon for inference); $20B says Nvidia sees the writing on the wall for GPU-based inference. https://www.tomshardware.com/tech-industry/semiconductors/nv...
The only time FPGAs/ASICs are better is if there are gains to be made by innovating on the hardware architecture itself. That's pretty hard to do considering GPUs are already heavily optimized for this use case.
There were, in the past.
Google had Coral TPU and Intel the Neural Compute Stick (NCS).
NCS is from 2018 so it's really outdated now.
It was mainly oriented toward edge computing, so the FLOPS were not comparable to a desktop computer.
There are also CPU extensions like AVX512-VNNI and AVX512-BF16. Maybe the idea of communicating out to a card that holds your model will eventually go away. Inference is not too memory bandwidth hungry, right?
I'm considering hosting a separate pg db just to be able to access certain extensions. I'm interested in this extension as well as https://wiki.postgresql.org/wiki/Incremental_View_Maintenanc... (also not available on RDS). Then I'd use logical replication for the specific data source tables (though I guess it would need to be DMS).
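Roughly what I have in mind, sketched with the pg gem; the hosts, tables, and credentials are placeholders, and it assumes the managed source actually allows plain logical replication (hence the DMS caveat above):

```ruby
require 'pg'

# On the managed source database: publish only the tables worth mirroring.
src = PG.connect(host: 'source-db.example.com', dbname: 'app',
                 user: 'app', password: ENV.fetch('SRC_PASSWORD'))
src.exec('CREATE PUBLICATION ext_pub FOR TABLE orders, customers')

# On the self-hosted Postgres that has the extra extensions installed:
dst = PG.connect(host: 'localhost', dbname: 'analytics',
                 user: 'postgres', password: ENV.fetch('DST_PASSWORD'))
dst.exec(<<~SQL)
  CREATE SUBSCRIPTION ext_sub
    CONNECTION 'host=source-db.example.com dbname=app user=replicator password=replica-pw'
    PUBLICATION ext_pub
SQL
```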
Ruby is my favorite language for writing CLI scripts/apps. The TUI options for Ruby have been feeling a bit dated, and I've secretly been wanting Charm for Ruby for a while. I'm very excited to use this.
The number of times I've been bitten by type safety issues is far smaller than the hassle of maintaining types would justify. Seriously, it is a much smaller issue than people make it out to be. I will say that I do get bitten by the occasional `NoMethodError` on `nil`, but it really doesn't happen often. Since Ruby is very dynamic, it's hard to say how many of those errors would be caught even with type annotations. I also don't find myself needing to write specs to cover the different cases of type checking. For me it is a tradeoff with productivity.
That said, I do like it when an LSP can show some nice method signature info, and types are helpful in that way. I think it depends. At the surface level, I like some of the niceties that type annotations can bring, but I've seen how tricky defining more complex objects can get. Occasionally I would spend way too much time fighting types in Elixir with Dialyzer, and I've often not enjoyed TypeScript for the verbosity. So I understand the cost of defining types. To me, the cost often outweighs the benefit of type annotations.
I fully agree with this. I'm building a site in OCaml, and just this week I spent 90 minutes debugging some weird error I didn't understand, because an implicit type was being pulled through in a global context. It was pretty irritating.
Maybe this isn't a fair comparison, since I'm pretty new to OCaml and I'm sure an experienced developer would have seen what was happening much quicker than I did. But I'm not sure I've spent 90 minutes TOTAL on type errors doing Python web dev.
Maybe I'm exaggerating, and I probably just don't remember the first time I hit a type error, but my experience was that I would very occasionally hit one and then just fix it. Pretty easy.
When I'm writing code that will be distributed to other devs, I feel type annotations make more sense because it helps document the libraries and there is less ambiguity about what a method will take. As with everything, "it depends"
That's true, but it can also add unnecessary constraints if done thoughtlessly.
E.g. if you require an input to be StringIO, instead of requiring an object that responds to "read".
Too often I see people add typing (be it with a project like this, or with is_a? or respond_to?) that makes assumptions about how the caller will want to use it, rather than stating actual requirements.
That is why I prefer projects to be very deliberate and cautious about how they use types, and keep it to a minimum.
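A tiny Ruby illustration of that difference (the method and the requirement are arbitrary):

```ruby
require 'stringio'

# Over-constrained: insists on a class, so File, Tempfile, sockets, etc. are
# rejected even though they'd work fine.
def word_count_strict(io)
  raise ArgumentError, 'expected a StringIO' unless io.is_a?(StringIO)
  io.read.split.size
end

# States the actual requirement: anything that responds to #read is welcome.
def word_count(io)
  raise ArgumentError, 'expected something that responds to #read' unless io.respond_to?(:read)
  io.read.split.size
end

puts word_count(StringIO.new('pebble time 2'))  # => 3
puts word_count(File.open(__FILE__))            # also fine, unlike the strict version
```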
There are [RBS](https://github.com/ruby/rbs) (part of Ruby 3) and [Sorbet](https://sorbet.org/). To be honest, these aren't widely used as far as I am aware. I don't know if it is runtime overhead, ergonomics, lack of type checking interest in the Ruby community, or something else. Type enforcement isn't a big part of Ruby and doesn't seem to be gaining much momentum.
> lack of type checking interest in the Ruby community
IMHO, if we wanted to write types in our programming language, we would not have chosen Ruby for our programming tasks. We would have chosen one of the zillion other languages. There were a lot of them when Ruby got traction about 20 years ago, and many more have been created since then. It's not surprising that one of the main proponents of typing in Ruby is Shopify, because their path away from Ruby is very costly.
In my case one of the reasons I invested in Ruby is precisely because I did not have to write types.
Does it make Ruby slower than Java, my main language in 2005? Yes.
Is it fast enough for my customers? Yes. Most of them decided to use Ruby, then hired me.
Do I have to write unit tests to check for types? I don't.
Occasional problems that static types would have prevented? Once or twice per year. Overall that's a good tradeoff, because pleasing the type checker for non-trivial types can be a time-consuming task, and some errors happen at runtime anyway, when the real world hits a carefully type-checked code base, or a carelessly dynamically typed one, with its data. Think of an API suddenly returning bad JSON, or maybe an HTML 500 page. Static or dynamic typing, neither will help with that.
I too feel the type safety concern people have with Ruby is overblown. The number of actual wrong-type issues I encounter is hardly enough to justify the costs of static typing. The biggest type issue is `nil` values and the resulting `NoMethodError` on `nil`. A guard clause or the safe navigation operator is usually sufficient protection for that. That said, I usually don't find myself needing to write much defensive code even for those cases.
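For what it's worth, this is about the full extent of the defensive code I mean (a made-up example, just to show the two idioms):

```ruby
# Hypothetical data, only so the example runs on its own.
Address = Struct.new(:city)
Coupon  = Struct.new(:percent_off)
Order   = Struct.new(:address, :coupon)

def shipping_summary(order)
  return 'no address on file' if order.address.nil?  # guard clause up front

  # &. returns nil instead of raising NoMethodError when there's no coupon
  discount = order.coupon&.percent_off || 0
  "#{order.address.city} (#{discount}% off)"
end

puts shipping_summary(Order.new(Address.new('Lisbon'), nil))  # => "Lisbon (0% off)"
puts shipping_summary(Order.new(nil, Coupon.new(10)))         # => "no address on file"
```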
I’ve been leaning hard into Sorbet runtime types for DSPy.rb[0] and finding real value. T::Struct at API boundaries, typed props for config, runtime validation where data enters the system.
For generating (with LLMs) API clients and CLIs it’s especially useful—define the shape once, get validation at ingress/egress for free.
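Roughly what that pattern looks like, for anyone curious; this is a minimal sketch using sorbet-runtime only, and the struct and its fields are invented rather than taken from DSPy.rb:

```ruby
require 'sorbet-runtime'

# Declare the shape once at the boundary...
class SearchResult < T::Struct
  const :title, String
  const :url,   String
  const :score, Float, default: 0.0
end

# ...and construction becomes the validation step: if the LLM or API hands back
# a field of the wrong type, this raises right where the data enters the system.
raw = { title: 'Pebble Time 2', url: 'https://example.com', score: 0.9 }
result = SearchResult.new(**raw)
puts result.title
```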
I was hoping LSP support would be implemented. I know there are existing MCP servers that can do something kind of similar, but I doubt the agent would be smart enough to consistently use the LSP MCP. Here's hoping for fewer greps.
I would much rather buy a dumb TV. I feel that the smart TV experience is what eventually makes TVs feel dated and slow. I would rather buy a standalone streamer that I can plug in. Buying a new $100 streamer every couple of years is cheaper and produces less e-waste than buying a new giant TV.
I isolate smart TVs and other IOT devices to a separate network/subnet, and usually block their network access unless they need an update.