
As this is HN, I'm curious whether anyone here is interested in starting a business hosting these large open-source LLMs?

I just finished a test running this 768 GB DeepSeek-R1 model locally on a cluster of computers with 800 GB/s memory bandwidth (faster than the machine in the Twitter post). Extrapolating to a cluster with 6000 GB/s aggregate memory bandwidth, I'm sure we can reach higher speeds than Groq and Cerebras [1] on these large models.
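As a rough sanity check on the speed claim above (a sketch, not a measurement: decode on these models is usually memory-bandwidth bound, and the 37B-active-parameters and 8-bit-weights figures are assumptions about the DeepSeek-R1 MoE configuration):

```python
# Upper-bound decode speed for a bandwidth-bound model: each generated token
# must stream the active weights through memory once.

def tokens_per_sec(bandwidth_gb_s: float, active_params_gb: float) -> float:
    """Theoretical ceiling on tokens/second for bandwidth-bound decoding."""
    return bandwidth_gb_s / active_params_gb

# DeepSeek-R1 is a mixture-of-experts model: ~671B total parameters but only
# ~37B active per token, so at 8-bit weights roughly 37 GB is read per token.
ACTIVE_GB = 37.0

print(tokens_per_sec(800, ACTIVE_GB))   # ~21.6 tok/s at 800 GB/s
print(tokens_per_sec(6000, ACTIVE_GB))  # ~162 tok/s at 6000 GB/s aggregate
```

The aggregate figure assumes near-perfect scaling across nodes, which real interconnects won't deliver, so treat it as a ceiling rather than a forecast.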

We might even be cheap enough in OPEX to retrain these models.

Would anyone with cofounder or commercial skills be willing to set up this hosting service with me? It would take less than $30K in investment but could be profitable in weeks.

[1] https://hc2024.hotchips.org/assets/program/conference/day2/7...



The Azure(tm) and AWS version of rent-a-second are in the works as we speak. So yes, rent-a-brain/vegetable and no, I will bet you $40k you will not beat either AWS or Microsoft to the punch. Zero chance of that. They will have their excess computational power with extremely discounted electric rates in place before Friday morning.


I wonder if the real market is actually bringing this stuff inhouse.

Given the propensity for these big tech companies to hoover up/steal any information they can gather, running these models locally, with local fine tuning looks quite attractive.


> Given the propensity for these big tech companies to hoover up/steal any information they can gather

At the end of the day you still have to sell this product to the sorts of companies that are far and away all Microsoft 365/Google Workspace clients, and we're going to have to figure that out one way or another.


I'd wager this is the real market. Ship some company a server rack with Deepseek R1 for a $1M annual rental fee + upgrades to the latest models.

think inside the box


Cerebras already does this.


I thought Cerebras had moved to a cloud model so that they could more easily manage/patch their systems?


Both. The cloud model is also for renting models across several Cerebras Wafer Scale Engines.


Oh, they will deploy the racks at customers' sites?


I don't know what their policy is now, but they talk about it in the Hot Chips 2024 presentation.

I myself just have proof of a single customer having their own private Cerebras rack. There are rumors about several more customers with on-prem Cerebras.


This is what has me most excited. AI has its limited uses for now, but with the current requirement of handing over all your data to big brother it was not even worth considering. Now that on-prem is reasonable and doesn't require you to beg Nvidia for H100s, it might actually be usable.


It's virtually impossible to not hand over data now. I'd say at this point it's more likely that any attempt to draw less attention actually has the opposite effect in reality.

The reality is that many datasets include stuff that has been absolutely stolen or haphazardly stored online, stuff like emails, texts, conversations, our credit histories, really any information about people that exists in a large quantity has already made its way into these LLMs one way or another.

An AI trained solely/additionally on specific datasets, with the intent of predicting human behavior exactly, should yield highly accurate predictions from rather general demographic data combined with an individual's location history/current location, plus really any substantial third thing - browsing history, purchases, activity on social media. That should be all that's necessary to rather accurately put everyone into a box.

The most frustrating thing is that there is a box for the people that refuse to be put into a box.


AWS already has platforms for running and fine tuning OSS models that can run privately inside a VPC. If Azure and GCP don’t have equivalent capabilities already, it is surely imminent. Seems pretty hard or impossible to beat cloud providers at their own game.


> If Azure and GCP don’t have equivalent capabilities already, it is surely imminent

GCP offers hundreds of models in its Vertex AI, including all "open source" (actually open weights) models, and the ability to fine tune for your specific needs. This blog post is from 2023 [1].

(disclaimer: I work at Google, but not on the Cloud team)

[1] https://cloud.google.com/blog/products/ai-machine-learning/s...


If the hardware isn't in your physical possession you can't know that your data isn't being hoovered up. You can't end-to-end encrypt compute tasks (fully homomorphic encryption is fiendishly uneconomical).


True, but at this point we're leaving the realm of cryptography and theoretical infosec, and enter the realm of real-world security. In this realm, permissions are established by armies of lawyers across organizations and governments defining who can or cannot do things, and what happens when transgressions occur; here, "defense in depth" carries all the way to the threat of men with guns escorting you to jail.

So it's true that you can't encrypt compute tasks of this type end-to-end, so you can't know if unauthorized parties mine your data. However, Microsoft is very unlikely to mine your data (for "you" being e.g. any of the many multinational corporations that already run all their office work through Azure-hosted Outlook, Office, SharePoint, etc.), or to let others mine it, because if it ever came out, your customers' lawyers would be after you, your lawyers would be after Microsoft, and the whole thing would explode into a multiple-billion-dollars shitshow and might even get a government or two involved.

That's the working assumption that makes Microsoft well-positioned to eat any fledgling self-hosted DeepSeek market in the business space. They already have things set up at a level that is trusted by governments as well as corporations in critical industrial sectors, with huge financial and legal exposure.

(Presumably Google and Amazon are in a similar position here, though I've only seen this personally with Microsoft/Azure, so that's what I can comment on.)


> "defense in depth" carries all the way to the threat of men with guns escorting you to jail.

For a civil matter like contract breach, there is zero chance it ends with jail time.


That's the typical case, true - but for many (most?) of the big multinationals, the worst case scenario for a hack involves people dying or some piece of critical infrastructure exploding.

On top of that, "everything is securities fraud" - and since that does carry potential jail time, corporations generally try to avoid pissing off parties that would be able to frame a contract breach (and its consequences) in terms of investment fraud.

EDIT:

For starters, almost all data a multinational corporation generates and processes is subject to export control regulations, which are broad, full of special cases, vary over time, space and politics, and most importantly, violations of them come with huge fines and criminal penalties[0] for both businesses and individuals involved. The only reason Microsoft can get a corporation like this to migrate to O365 and run their back-office in Azure cloud is by solid, tested contractual guarantees that the data will be processed in ways that will keep the customer compliant with applicable regulations. Now, I'm not a lawyer, but it's not particularly hard to draw a line from "Microsoft snooping on enterprise customers" to securities fraud.

I mean, even in context of hosting a DeepSeek derivative, we're talking about a cloud service offering enterprise customers secure training on company data. "Company data" may involve, e.g. detailed documentation or specs for software for designing advanced optical systems, which may sound benign until you make the connection[1]: "advanced optics" includes applications in advanced laser systems, which basically means weapons (e.g. ranging, missile targeting, anti-missile countermeasures). Obviously, regulators around the world (and the US in particular) would be very unhappy to see such information crossing through the wrong borders. For both the affected customers and the cloud service, this is high stakes game; a random startup isn't in a position to enter it.

--

[0] - E.g. in US, up to $1M per violation and up to 20 years in prison, possibly at the same time; see https://www.bis.doc.gov/index.php/enforcement/oee/penalties.

[1] - This was a real intro example used in export control training I went through some years ago.


Yes, technically you can go to prison for securities fraud, and everything could be securities fraud, if you have multiple shareholders and play in that sandbox.

A small random startup is unlikely to play in the securities sandbox until it has enough resources to hire enough lawyers to keep itself out of prison and the fines "reasonable" (i.e. not enough to incentivize doing anything about the fine being imposed other than at least temporarily stopping the action).

When was the last time securities fraud ended in jail time at any S&P 500 company? My quick web search returned no instances ever (but I could be wrong).


Sure - but that's just, to the extent of our knowledge, regulations working as intended.

My point here is that OP's startup won't be able to compete with incumbents for enterprise money, and since the incumbents already provide this kind of service cheaply and reliably for customers of any size, all while handling applicable security concerns, OP's startup won't be able to compete with them for smaller customers either.


Agreed. Unless they can add a hook (some cool unique feature to get them traction), it probably won't work out well.


FWIW, I think one possible hook would be to package up "training and deploying model on site" into a product - because after Azure, GCP and AWS, the next set of players best-positioned to make use of cheap frontier model training are... the very enterprise customers who would buy from aforementioned cloud providers instead of doing it themselves. Simplifying internal deployments could convince at least some of them to pay you instead of the Big Cloud.


Unfortunately US law appears to compel American companies to share your data without your knowledge to the US government.

Given the current US government is headed by a person that just looks to take what he wants - your assurances aren't comforting.


> Unfortunately US law appears to compel American companies to share your data without your knowledge to the US government.

Sure, but that's not some unexpected gotcha - it's just a plain fact of geopolitical reality, managed by international treaties and accounted for in laws and contracts around the world. A multinational enterprise isn't like a person subscribing to a free plan of a random SaaS because the "sign up" button was the right shade of green - there are armies of lawyers on both sides, tasked with navigating applicable regulations (including GDPR and export control laws) and finding out a way to make things work.

When they can't, the deal simply doesn't happen.


If what you say is true, no European country would be using any US systems, as I think it's obvious (despite the various attempts at various fig leaves) that US and EU law is fundamentally incompatible in terms of privacy.

What actually happens is you have people seeing no evil, hearing no evil and speaking no evil - by going lalalala - hoping that because everybody else is doing it they won't get fired.

This happens because alternatives seem too hard.


If your data is too sensitive for AWS, you're in a different realm than most enterprise users.



> you can't know that your data isn't being hoovered up

There is no evidence of this happening in the last 20 years. None.

And if there was it would be the complete unravelling of the entire cloud concept.

So you're talking about solving a problem no one has.


It's not so much about hoovering as targeted spying.

Plenty of evidence of companies and governments using spying for commercial/national (sometimes the same) advantage.

So let's say you are a big company, and suddenly the US government decides you are a competitor in a nationally strategic industry - is your data safe if held by a US company?


Pretty much. Especially in Europe, there are lots of big companies and public sector institutions that would pay serious € if they could run these.


Spot on! I concur that most European businesses and public sector institutions would be eager to rent this because they are not allowed by law to use US datacenters like AWS or Azure.


Not true, for two reasons: an Azure data center in Europe is fine if the data stays there, and more generally there is the EU-US Data Privacy Framework https://en.m.wikipedia.org/wiki/EU%E2%80%93US_Data_Privacy_F...


That's not the only issue. They want a guarantee that the model wasn't trained on copyrighted material.


Now that is a real feature for now. A lot of hesitation in embracing generative AI in large enterprises stems from uncertainty about copyright issues. Anyone who trained an o1-level model from scratch on public/properly licensed data only would be able to provide a very valuable service to those enterprise customers.

However, if both training and operating costs of a DeepSeek-like model are as small as they are, the companies best able to offer this service are... Microsoft, Amazon and Google. And second best are... teams inside the would-be customer enterprises themselves. $6M to train and $6K to run is effectively free for such companies; there is no moat here. The services that enterprise customers would happily buy instead of building are... operations, and assuming legal liability if the model turns out not to be safe from copyright infringement lawsuits. But those are exactly the services those companies are already buying from Microsoft, Amazon and Google.


This would result in some refreshing models; I guess they would be trained mostly on out-of-copyright stuff from 75+ years ago and wouldn't have knowledge of the modern world.

Maybe they could skin the robotic bureaucrats in vintage sci-fi appearance as well, to have the whole consistent experience when you go to the building-permits bot; there could be small talk about the latest Beatles record, etc.


Enforcing copyright on training data to this extent would actually create a temporary moat for the biggest players - they can afford to hire a lot of cheap labor to supplement the training dataset with human-authored original works that skirt IP protections by interpreting, parodying, commenting on or otherwise describing the protected works without actually infringing on them. As long as they keep those datasets private, everyone else is shit out of luck.

(I'm reiterating my prediction wrt. AI and moats - the only mid-term moat there can be is in human labor. Hardware vendors benefit from selling better hardware to more people for less; software and research are cheap to scale, datasets eventually leak or get reproduced. Human labor is the one thing that doesn't scale, and except for an economic crisis, only ever gets more expensive with time. Whatever edge one can get by applying human labor that cannot be substituted by AI - like RLHF and its evolutions - is the one that will last all the way to AGI; past that, moats won't matter anymore.)

One of the many reasons I'm firmly on the side of making the training of large neural models exempt of copyright considerations for everyone.


Isn't training already exempt from copyright? Copyright is, at its core, about licensing who is allowed to distribute copies of content (not ideas, but the exact same text, etc.).

edit: apparently in the EU the situation is complicated by new AI specific legislation in the works: https://www.morganlewis.com/pubs/2024/02/eu-ai-act-how-far-w...


I think the important metric will be whether we can compete against the price of AWS or Microsoft in running large LLMs, not their time to market. Competing on cost against overpriced hyperscalers is not very hard, and $30K is a small investment, not a gamble. If it failed, worst case you would only lose $3000-$6500 or so.


> $30K is a small investment, not a gamble. If it failed, worst case you would only lose $3000-$6500 or so.

As someone not familiar with investment sourcing or SME financing: could you break down the maths/accounting? How do you go from sinking $30K into a business to losing $6.5K if you turn the lights off at the end?


You buy the hardware (48 servers), rent part of a colocation rack with a 10 Gbps or 100 Gbps internet transit link, get a payment processor, and make a webpage and GitHub demo with the API. Breakdown: $3000 labour, $20.5K hardware, $800 monthly rental fees, $376 car fees. When you shut down within a year, the $20.5K of popular off-the-shelf hardware can easily be sold for $17K, a fact you can check against 25 years of data.

I would invest more than the initial $30K on optimization after the servers have found paying customers and thus proven commercial viability. I would invest in software development, fine-tuning, retraining and, above all, reverse engineering GPU and neural-engine instruction sets and adapting these open-source models to the more than 2 quadrillion operations per second that these 48 servers can do.
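The worst-case figure can be reproduced from the numbers above (a sketch using the poster's own figures; the $17K resale value is their claim, not a verified price):

```python
# Worst-case loss model using the breakdown quoted above.
labour = 3_000          # initial labour fee
hardware = 20_500       # 48 off-the-shelf servers
rent_per_month = 800    # colocation rack + transit
resale = 17_000         # claimed resale value of the hardware

def worst_case_loss(months: int) -> int:
    """Sunk cost if you shut down after `months` and sell the hardware."""
    return labour + hardware + rent_per_month * months - resale

print(worst_case_loss(0))   # 6500: immediate shutdown, upper end of the quoted range
print(worst_case_loss(12))  # 16100: the quoted range no longer holds after a year
```

Note the quoted $3000-$6500 range only holds if the shutdown happens almost immediately; each month of colocation adds another $800 to the loss.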


Where can you get 10 or 100 Gbps flat with a full 48U rack and power for $800?

Because if that exists, I want to buy them all.


Email in my profile. 100 Gbps plus rack is 5 times the cost of 10 Gbps. You'll need 100 Gbps routers and SmartNICs too.


48 servers at $60K also does not add up for me at all. Even considering second-hand ones, I could hardly find Zen 4 under ~$10K. And this is without GPUs.


Yeah, this guy doesn't know what he's talking about. I buy hardware and transit, so I know he's just making up numbers.


The numbers really were too good to be true, but there's always a chance there's something I'm not seeing correctly, so I wanted to give the benefit of the doubt. Otherwise, the idea is not too crazy regardless of much higher input costs.


So, $0 budget for software dev / sales / support?


Well, when you run an AI company, you must test your product, right? What better way to test it than by building your own webpage, admin panel, etc.?


I broke down the first $30K investment cost for the release of the online API product, which does not need further software development, sales or support.

You would be wise to do the software development I mentioned, and to do more sales and support than was covered under my initial $3000 labour fee. But you can pay for that out of revenues; it would not be part of the initial investment to see if it is viable as a business.


That's what the AI is for, no? /s



Hardware requirements? I think I can hit the memory bandwidth building from parts I have in my house. Maybe even 2x. Asking for fun not profit.


I'd love to visit your house then. You have 768-1400 GB DRAM with 6000 GB/s memory bandwidth? Nice house.

In my house I currently have almost 900 GB/s of memory bandwidth in aggregate, but only 132 GB of total DRAM.


I’ve got a terabyte of DDR4 and a bunch of old Threadrippers. They can take 256 GB each, with 8 memory channels.


Yep, that's the right stuff. Now simply cross-connect all the free PCIe lanes of all the Threadrippers and you have a nice cluster for LLMs.
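For scale, the peak bandwidth of one such box is channels × transfer rate × bus width (a sketch; DDR4-3200 is an assumed speed for these 8-channel Threadripper Pro boards, and sustained bandwidth in practice lands below this theoretical peak):

```python
# Theoretical peak memory bandwidth of one 8-channel DDR4 node.
channels = 8
mega_transfers_per_s = 3200   # DDR4-3200 (assumed module speed)
bytes_per_transfer = 8        # each channel is 64 bits wide

peak_gb_s = channels * mega_transfers_per_s * bytes_per_transfer / 1000
print(peak_gb_s)  # 204.8 GB/s per node; ~5 such nodes for ~1 TB/s aggregate
```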


Did you implement token generation for Deepseek R1 using PBLAS?



