EDIT: if you really need GPU compute power, go buy a GPU from the computer store around the corner - they are cheap, fast and available.
I gave up on EC2 when they started requiring you to "request quota" to start a GPU instance.
You have to request quota even if you want to run a single instance.
You have to specify which specific instance type you want.
You have to specify which region you want it in.
Then you sit back and wait for some human to "approve your request".
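For reference, the request itself is a single Service Quotas API call; the waiting is all on the human side. A minimal boto3 sketch — the quota code below is illustrative, look up the real one with list_service_quotas:

    import boto3

    # Illustrative quota code -- list the real ones with
    # client.list_service_quotas(ServiceCode="ec2").
    client = boto3.client("service-quotas", region_name="us-east-1")
    resp = client.request_service_quota_increase(
        ServiceCode="ec2",
        QuotaCode="L-417A185B",   # e.g. "Running On-Demand P instances"
        DesiredValue=8.0,         # a vCPU count, not an instance count
    )
    print(resp["RequestedQuota"]["Status"])  # PENDING until a human decides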
In my case this took 24 hours.
In this time I literally could have walked (not driven) down to the local computer store, bought a computer, pushed it back to my house in a shopping cart, spent a few hours configuring it and still be left with 16 hours to have a sleep, eat and do some other things.
AWS's quota system is so far from "scalable" and "elastic" that it's effectively useless. You can't design any sort of infrastructure around a quota system like that.
I dumped AWS at that point. Mind you, Azure has exactly the same quota system.
Seriously, just rent a server from Ionos or Hetzner or get a fast Internet connection and self host. It's faster and better and cheaper than any of the big clouds.
I can understand why, though. One employer's first use of a GPU instance was when we were hacked and someone fired up a few to mine crypto; only a day of that cost us a few $1000, which, fortunately, AWS refunded. It's quite likely that most users never touch GPU instances, so one spinning up is a good signal that you've been hacked.
They have a quota system for sending emails too, and it's not because they need to purchase any hardware to send more emails. It's because those are also a magnet for hackers.
If AWS were really clear about what you were spending, it would be easy to run an app that tracks your AWS usage and alerts you to anomalous patterns.
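You can get part of the way there from the outside today. A minimal sketch, assuming you poll Cost Explorer daily; the 2x-baseline threshold is an arbitrary illustrative choice, and CE data lags by up to a day:

    import boto3
    from datetime import date, timedelta

    ce = boto3.client("ce")
    end = date.today()
    start = end - timedelta(days=14)

    resp = ce.get_cost_and_usage(
        TimePeriod={"Start": start.isoformat(), "End": end.isoformat()},
        Granularity="DAILY",
        Metrics=["UnblendedCost"],
    )
    daily = [float(r["Total"]["UnblendedCost"]["Amount"])
             for r in resp["ResultsByTime"]]
    baseline = sum(daily[:-1]) / len(daily[:-1])
    if daily[-1] > 2 * baseline:
        print(f"Anomalous spend: ${daily[-1]:.2f} today "
              f"vs ~${baseline:.2f}/day baseline")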
My bank contacts me if there are any questionable transactions.
I think you would be shocked at how often banks have similar restrictions on corporate accounts. You'd be equally surprised at how often companies are hacked.
In that world, you'd better make it very difficult to adjust your alerts downward: otherwise, the very first thing a hacker would do is take over the alert system, and only then start using your account for their profit-making ventures.
They do this. There's a fine line to walk between the overhead (contacting customers, monitoring) and false positives.
But if weird stuff happens, AWS will take action. I'm sure part of it is because they care, but another part is that they don't want to be left holding the bag.
AWS will tell you as much as you want to know about how much you’re spending. Cost breakdowns and billing alerts are more configurable than you’ll see pretty much anywhere else. This is a miscategorisation of the problem with AWS billing. The actual problem - which is well understood - is that it’s complicated. There’s a reason that AWS has an unwritten ‘one-time get out of jail free card’ system if you accidentally mess up and charge $50,000 to your corporate credit card. And there’s a reason that a condition of this escape hatch is that you set up billing alerts, like you should as a responsible person hooking up a PAYG service to what’s sometimes an effectively infinite line of credit.
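For what it's worth, the billing-alert setup the parent is describing is only a few lines of boto3; account ID, budget amount and email address are placeholders:

    import boto3

    budgets = boto3.client("budgets")
    budgets.create_budget(
        AccountId="123456789012",              # placeholder
        Budget={
            "BudgetName": "monthly-cap",
            "BudgetLimit": {"Amount": "100", "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        NotificationsWithSubscribers=[{
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": 80.0,             # percent of the limit
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [{"SubscriptionType": "EMAIL",
                             "Address": "you@example.com"}],
        }],
    )

Note this alerts; it does not stop spend, which is the crux of the rest of this thread.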
That's like saying the Internet will tell you all you want to know.
Yes, (most of) the detail is in the CUR/CBR - but you have to be smart enough to understand it.
It's misleading to say that AWS will tell you all you want to know. Try getting them to explain, in a simple email, your detailed network costs when you're using network load balancers.
Lambdalabs used to have spot instances for GPUs and now they are completely sold out. Even vast.ai hardly has any A100 instances. We are starting to see paperclip-maximizer economics with GPUs, since they are so damned useful.
Well, paperclips are clearly useful; not so sure about ChatGPTs... so far it looks like their effect is net negative - more cheating and more spam of all kinds.
It's okay for an infinitely burstable product to delay provisioning by 24 hours and require you to capacity-plan in specific detail before you can use anything?
I don't agree. Their reality distortion field is fooling customers; AWS has its cake and eats it too. Try being a startup and spinning up 1000 c6in.8xlarge instances for a 2-hour batch job.
Also, I felt that Amazon was playing sketchy little games with spot instances.
It's not impossible but quite technically challenging to find out how much you are actually paying for a spot instance.
It felt really dodgy to me that I might have started a spot instance thinking I was paying the minimum listed rate, but somewhere amongst the formulas and terms and conditions AWS actually decided I would be paying the maximum rate.
They don't actually tell you up front what your spot instances are costing. No doubt it would not be hard for them to do so, so it seems a deliberate strategy to hide this information.
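The trailing prices are at least queryable via the API; what you can't get is a firm rate up front. A minimal sketch, with illustrative instance type and region:

    import boto3
    from datetime import datetime, timedelta, timezone

    ec2 = boto3.client("ec2", region_name="us-east-1")
    resp = ec2.describe_spot_price_history(
        InstanceTypes=["p3.2xlarge"],          # illustrative
        ProductDescriptions=["Linux/UNIX"],
        StartTime=datetime.now(timezone.utc) - timedelta(hours=6),
    )
    for h in resp["SpotPriceHistory"]:
        print(h["AvailabilityZone"], h["SpotPrice"], h["Timestamp"])

What you're actually billed is whatever the per-AZ spot price is while your instance runs, which you only learn as it happens.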
It's just not worth the game playing when self hosting or renting a server from Ionos is cheaper and takes away the uncertainty.
Comparing with self-hosting or renting a server is just dishonest. At that point you may as well compare with spinning up an actual EC2 instance. Your alternatives miss the main original value proposition of EC2 altogether, let alone spot instances.
I don't really buy the idea that companies can't afford to run their own computers due to hardware and infrastructure costs, nor that clouds don't require hiring specialised technical experts.
Pay per second Linux boxes with public IPs. A lot of the VPS/server hosts I used in the past took minutes to boot an instance and billed you by the hour (or maybe by the month, although that was uncommon), not so with EC2.
There’s much less value if you never need to scale of course, which is true of 90% of businesses. But doing a dumb experiment on a 10 cent spot instance for an hour where it would’ve taken a week to arrange space on the company VMWare box? That is awesome.
EC2 was billed by the hour initially. Minimum one hour charge as soon as you started an instance. It was still better than the alternatives; nobody else had per-hour billing with no commitments in 2006. Moving to per-second billing was a much later change (in 2017).
> Pay per second Linux boxes with public IPs. A lot of the VPS/server hosts I used in the past took minutes to boot an instance and billed you by the hour (or maybe by the month, although that was uncommon), not so with EC2.
It's only worth it if the pay-per-second box has an equivalent or better cost-performance ratio than the "legacy" alternative, otherwise it's still cheaper to get one "legacy" VPS/dedicated server, use it for 10 seconds and let the rest go to "waste".
Wow, never realized that that could be a great money laundering strategy.
1. Take cash, load it onto prepaid cards.
2. Buy GPU instances.
3. Mine crypto.
4. Pay for compute with #1
5. Sell crypto
6. "Clean" cash
I just don't have the mind of a criminal I guess, it sounds like this was figured out a long time ago.
There are (were) enough random kids in their basements making millions on crypto that it probably wouldn't raise too many eyebrows at the IRS if you actually tried to report your crypto earnings cleanly; they would likely spend their efforts chasing down the people who weren't reporting.
Fortunately, your evil ML plan is going to fail at step 1. It's basically impossible to get cash loaded onto prepaid cards anywhere, and AWS and other hosting providers wouldn't accept prepaid cards either. Instead you just buy crypto with cash P2P and report it as mined or whatever.
On the other hand, credit card fraud is a real problem, especially because Amazon doesn't want to put any KYC burden on its customers or try to track down anomalous spending. After all, a good chunk of AWS profits comes from how bad people are at tracking their cloud costs.
If Amazon were to implement a built-in option to flag suspicious jumps in AWS costs, it would cut not only fraud but overall AWS profits.
That would be true regardless of the means of obtaining the GPUs -- I was just replying to the issue of the cash->GPU funnel, not asserting that any other part of the operation is easy.
That's a good way of monetizing stolen credit cards, but it would be a very bad way to do money laundering, because crypto is effectively "cash", so it still isn't clean. It doesn't have a good origin story.
Basically for money laundering you want to have a good story for your money that you can get tax authorities and banks to believe. "I found some crypto" would only work at small scale, and crypto is considered "high risk" anyway so you'll get higher scrutiny.
If I was in the business of laundering money and I wanted to do it via crypto, I think I would create an NFT instead then buy it from myself. Then I have funds I can transfer to cash, and I can claim I got them through my great artistic abilities.
The NFT route sounds better, but assuming the police, FBI, DEA, etc. are smart, which they probably are, they can see that you are not famous in the NFT world - so why did someone pay so much? You would need to be almost capable of making a million legitimately anyway (pure NFT grift) in order to launder a million (from some other source).
There are a lot of systems and features that are hard (impossible?) to design in a way where you can predict (every dimension of) spend; you can only react to spend events.
For example, a theoretical "prepaid AWS" might allow you to put a hold on a vCPU-month of account credits to start a 1vCPU instance. But what about the bandwidth egress fees when someone makes requests to said instance? Those are going to be completely variable, depending on how much traffic the instance receives.
Yeah, but there are plenty of organizations with very modest or predictable loads that would be well served by knowing their monthly spend was capped at $X prepaid.
As a nobody, I want to use AWS, but I refuse to have the unlimited liability in case I screw up something and wake up to a $30k bill. Hell, they could even over-charge me on the credits, and I would gladly take that deal if I knew that once the kitty ran dry, services would stop.
Would there be edge cases and complications to resolve (e.g. what about storage)? Sure, but AWS pays some smart people a lot of money to figure out tricky things.
I work at AWS in Professional Services; of course, all opinions are my own.
The answer for you is LightSail.
LightSail is a standard VPS. But if you want to upgrade to “real AWS” later on, you can. The only thing that I’m aware of that could cause bills to go up is egress over your allowance.
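A minimal sketch of what that looks like, with illustrative blueprint/bundle IDs (list the real options with get_blueprints() and get_bundles(); the bundle is the fixed monthly price tier, which is the whole appeal):

    import boto3

    ls = boto3.client("lightsail", region_name="us-east-1")
    ls.create_instances(
        instanceNames=["known-cost-box"],
        availabilityZone="us-east-1a",
        blueprintId="ubuntu_22_04",   # illustrative OS image
        bundleId="small_2_0",         # illustrative fixed-price tier
    )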
To be honest with you, I would be slightly afraid of screwing up and having an unexpected bill from AWS if I were doing a personal project and I do this for a living. There have been plenty of times where I left something expensive running or provisioned an expensive service (Kendra) and forgot to shut it down until I ended up on a list of “people with the highest spend” on our internal system on one of my non production accounts.
If AWS truly cared about customers they would implement spending limits. Note the plural: Customers don't want their S3 data deleted because some GPU stuff went crazy.
How would that actually work? When you reach your spending limit, delete your data from S3 that you’re being charged for? Stop allowing egress traffic? Stop allowing any API calls that cost money? Stop your EC2 instance?
AWS has over 200 services. How would you implement that conceptually?
I know it’s a real concern when learning AWS for most people. I first learned AWS technologies at a 60 person company where I had admin access from day one to the AWS account and then went to AWS where I can open as many accounts as I want for learning. So I haven’t had to deal with that issue.
But what better way would you suggest than LightSail where you have known costs up front?
I think it could be done reactively, as long as two things are true:
1. spending limits are fine-grained: rather than one global budget for your entire AWS project, each billable SKU inside a project would have its own separately configurable spending limit. The goal here isn't to say "I ran out of money; stop trying to charge me more money"; it's rather to say "I have budgeted X for the base spend for the static resources, which I will continue paying; but I have budgeted Y for the unpredictable/variable spend, and have exceeded that limit, so stop allowing anything to happen that will generate unpredictable/variable spend."
This way, you can continue to pay for e.g. S3 storage, while capping spend on S3 download (which would presumably make reading from buckets in the project impossible while this is in effect); or you can continue paying for your EC2 instances, while capping egress fees on them (which would presumably make you unable to make requests to the instances, but they'd still be running, so you wouldn't lose the state for any ephemeral instances.)
2. AWS "eats" the credit-spend events of a billing SKU between the time it detects budget-overlimit of that billing SKU, and the time it finishes applying policy to the resource that will stop it from generating any more credit-spend events on that billing SKU. (This is why this kind of protection logic can never be implemented the way people want by a third party: a third party can only watch AWS audit events and react by sending API requests; it has no authority to retroactively say "and anything that happens in between the two, disregard that at billing time, since that spend was our fault for not reacting faster.")
Note that implementing #2 actually makes implementing #1 much easier. To implement #1 alone, you'd have to have each service run an internal accounting-quota system that predicts how much spend "would be" happening in the billing layer, and can respond by disabling features in (soft) realtime for specific users when those users exceed a credit quota configured in some other service.

But if you add #2, then that accounting logic can be handled centrally and asynchronously in an accounting service which consumes periodic batched pushes of credit-spend-counter increments from other services. The accounting service could emit a CQRS command "disable services generating billable SKU X for customer Y starting from timestamp Z" to a message queue; the service itself could see it (and react by writing to an in-memory blackboard that endpoints A/B/C are disabled for user Y), but the invoicing service could also see it, and recompute the invoice for customer Y for the current month with all spend events for billing SKU X after timestamp Z dropped.
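To make the shape concrete, a toy sketch of that central accounting service - every name here is hypothetical, and the queue is just a list standing in for a real message bus:

    from dataclasses import dataclass
    from datetime import datetime

    @dataclass
    class SpendEvent:
        customer: str
        sku: str          # e.g. "s3-egress-bytes"
        amount: float     # credits
        at: datetime

    class AccountingService:
        def __init__(self, budgets, command_queue):
            self.budgets = budgets      # {(customer, sku): limit}
            self.spent = {}             # {(customer, sku): running total}
            self.queue = command_queue  # consumed by serving + invoicing

        def ingest(self, batch):
            # Periodic batched pushes of credit-spend-counter increments.
            for ev in batch:
                key = (ev.customer, ev.sku)
                self.spent[key] = self.spent.get(key, 0.0) + ev.amount
                limit = self.budgets.get(key)
                if limit is not None and self.spent[key] > limit:
                    # Serving disables the endpoints generating this SKU;
                    # invoicing drops spend on it after ev.at (point #2:
                    # the provider "eats" the in-between spend).
                    self.queue.append(
                        ("disable", ev.customer, ev.sku, ev.at))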
In Washington State we have an account for toll infrastructure. You set a top-up amount and a minimum. When the minimum is reached, the top-up is charged to your card and applied. If that fails a bill is mailed to you. If you fail to pay that then civil penalties are assessed.
Your point? The goal here would be to have a policy that protects you from someone stealing your toll badge (i.e. hacking into your AWS account) and running you through a toll bridge a million times (i.e. generating huge variable spend.) I don't see how what you're saying relates.
Devil's advocate: 10 year old accounts are probably just as likely as any to get hacked into and used for crypto mining, and honestly I bet a majority of their customers don't even use GPU instances.
One could argue that older accounts are even more likely to get hacked, as they are more likely to have older passwords that are weaker that may have been leaked along the way, along with various other accumulated security issues (leaked API keys, out of date 2FA choices etc).
Yep, Amazon's threat model has less to do with how reputable a given account is, since credential theft is rampant. Even at massive institutional customers paying on purchase orders, I've had to apply for quotas.
I think it’s tough to blame Amazon here. p4d server hardware costs around $100k, and they were almost certainly seeing countless hours of use on these from stolen AWS accounts and credit cards. This doesn’t excuse the requirement to specify region and instance types in advance, though the region part at least can be explained by the fact that Amazon tries to make regions as independent as possible.
>and still be left with 16 hours to have a sleep, eat and do some other things.
I hope somewhere in those 16 hours you bother to return the shopping cart. Or are you one of those people who just leaves it wherever because you can't be bothered? I will be bringing this up at the next tenants' meeting.
Google Cloud does this too, for GPUs on their compute instances, but the approval was near instant in my experience (I assume that if there is manual approval, it's triggered by conditions I didn't meet).
They closed last year; it's now a 50-minute bus ride and then ~10 minutes walking (one way). Apparently we were one of the last customers to buy most parts for a desktop there. I fear we may not be able to refer to corner stores much longer if they're not of a type most people actually need on a weekly basis, with things moving to online sales.
> In my case this took 24 hours.
Even shipping is faster than that, so yeah, point still taken.
Having your own GPU makes sense if it works out cheaper than the cloud equivalent (not just the GPU bit but everything else - bandwidth/storage fees, etc).
Given the prices of cloud infrastructure, it doesn't take much before walking to your local computer store and buying a GPU (or more!) becomes more cost-effective.
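Back-of-envelope, with purely illustrative prices - a consumer GPU box vs. an hourly cloud GPU rate, ignoring power, storage and bandwidth on both sides:

    # Illustrative numbers only: adjust for your actual GPU and region.
    box_cost = 1800.0    # USD, one-time, for a desktop with a decent GPU
    cloud_rate = 1.10    # USD per hour for a comparable cloud GPU instance

    break_even_hours = box_cost / cloud_rate
    print(f"Break-even after ~{break_even_hours:,.0f} GPU-hours "
          f"(~{break_even_hours / 24:.0f} days of continuous use)")
    # -> roughly 1,636 hours, i.e. a bit over two months of 24/7 use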
> and still be left with 16 hours to have a sleep, eat and do some other things.
Well there's your problem. You don't need to sit back the whole time while you're waiting for approval, it's okay to leave your chair.
Jokes aside, I think this probably depends on use-case. You can't easily return the computer. And you can't easily buy (and then return) 100 computers if you have a large one-time computation to run.
Do they still wait until the very last moment before launching to tell you that the account you're logged in under doesn't actually have permission to launch EC2 instances?
AWS is following the same path as other web technologies offered by big tech. In the beginning, they want to lure as many people as possible to their new, shiny product, so they offer irresistible prices. The next step, which is the important one, is to secure corporate and government contracts for that technology. Next thing you know, they will see small clients as a nuisance, and do everything to make their lives more difficult.