The complexity implied by anything "better" than three nines is a recipe for disaster.
In reality, neither you, nor Amazon, nor anyone else has any idea how durable S3 is. But if they _did_, it wouldn't matter because unexpected interactions, cascading failures, and SNAFU will keep it from ever being realized.
Much better to have more frequent, very boring failures than to have rare spectacular ones.
The author is proposing to serve his site entirely from S3, claiming it's better than using a couple of nginx boxes because S3 has eleven nines of durability.
Durability means you will get your data eventually (it will not be lost). Availability means you will get your data right now, which is probably what he really cares about in terms of serving live internet traffic.
Put another way: S3 not infrequently has availability hiccups (files are temporarily unavailable, resulting in a disruption of service), without taking durability hits (your files haven't been lost, you just can't see them right now).
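To put rough numbers on the two terms, here is a minimal sketch. It assumes that N nines of availability means a fraction 10^-N of the year may be downtime, and that N nines of durability means each object has a 10^-N chance of being lost in a given year. Both readings, and both function names, are illustrative assumptions, not how any provider measures or guarantees these figures.

```python
HOURS_PER_YEAR = 365 * 24  # ignoring leap years


def allowed_downtime_hours(nines: int) -> float:
    """Hours per year a service may be unreachable at N nines of availability.

    Assumption: N nines means an unavailable fraction of 10**-N.
    """
    return HOURS_PER_YEAR * 10 ** (-nines)


def expected_objects_lost_per_year(object_count: int, nines: int) -> float:
    """Expected number of objects lost per year at N nines of durability.

    Assumption: each object is lost independently with probability
    10**-N per year.
    """
    return object_count * 10 ** (-nines)


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(f"{n} nines of availability: "
              f"{allowed_downtime_hours(n):.2f} hours of downtime per year")
    lost = expected_objects_lost_per_year(10_000_000, 11)
    print(f"10 million objects at eleven nines of durability: "
          f"{lost:.6f} expected losses per year")
```

Under these assumptions, three nines of availability still permits about 8.76 hours of downtime a year, which is the number that matters for serving live traffic. The durability figure says nothing about it.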
How do the traffic exemptions in AWS fit into this net neutrality debate?
There are quite a few ways in which traffic between different AWS components is discounted[1], or "favored" over other traffic, and this doesn't bother me a bit - it seems quite reasonable.
I lean (slightly) to the net neutrality argument, but I have a hard time seeing how the billing practices of AWS are neutral within that framework.
How does the OP feel?
[1] For instance, bandwidth between EC2 and S3 is free.
Any chance you could build S3 support into Gmvault, so you could cron it on a Linux box to connect to Google and push all the data into an S3 bucket? If money is an issue, I'd be interested in footing the bill.
This is something I have in mind: Cloud save.
It will be added to the roadmap.
Please contact me and I will keep you in the loop.
Regarding the money, I am thinking of adding a donate button to support the project.
Very nice, but what if you get a gag order from the FBI or NSA? Then you would have to go to prison if you upheld the promises on your web site and disclosed what happened.
But I would be interested in your comments regarding this theoretical situation. Surely you must have thought about it.
Read through it again - it is a positive, affirmative statement that we make each week (and make on three continents). A judge (or LEA, whatever) would have to compel us to make false public statements on an ongoing basis, and would have to further compel foreign (Swiss) nationals to do likewise.
Can we be held in contempt, etc., for refusing to make false public statements? Perhaps.
In reality, since rsync.net is not actually an ISP (we take pains to make sure we do not count as an ISP, since it allows us to skip things like the OP has posted) and since we host no publicly available materials, we're not likely to get a warrant. If we do, it's likely to be an extremely mundane act of discovery, etc. That would get added to the warrant canary and we would continue updating it.
In our 11 years of running this service (7 years under the "rsync.net" brand) we've not gotten a single one.
But the parent to these comments was speaking of taking a stand, which is why this was instituted - people do indeed need to make a stand. We refuse to live in a world with Lettres de Cachet, and that's that.
The key is that our service is cold storage only. All access, regardless of protocol, is with a username and password - there is no anonymous access to data stored here.
So there is no "hosting" or publishing of any kind.
The unintended consequence of this that we are really starting to appreciate is that we are NOT an ISP. The definition is fluid, and there's no guarantee about future regulation, but up to this point every one of the major "provider" laws has not applied to us as we are currently structured.
So the reporting, the LEA interfaces, the logging, etc. - we have no more responsibility to perform these items than your bakery does.
I understand the attraction of implementing web-based "messaging" (chat) in JavaScript. But why wouldn't I just point that JavaScript back to myself?
Why would I route the product of JS based chat through a third party when it could just communicate with the server it got the HTTP from in the first place?
My guess is that this is for folks who don't have any control over their back end - it's just a web-serving black box, and this is just some more content to paste into it. Is that about right?
The missing piece, though, is the revenue model - the users who would generate more than 30 million messages in a month are the same users who actually might have their own back end, and the wherewithal to use it. I would think if you need to use third-party JavaScript snippets, you're ipso facto a smaller, lower-volume user ...
"Why would I route the product of JS based chat through a third party when it could just communicate with the server it got the HTTP from in the first place ?"
I'll take a stab at it with an anecdote.
First, it can be used for more than chat. Anything where a message bus would meet the need could work on top of this.
The project I work on uses PubNub (http://www.pubnub.com/) instead of App Engine's Channel API because we wanted a reliable way to broadcast to several listeners.
At the time, the Channel API would only do point-to-point (it still does), and if you wanted broadcast you had to maintain connection state with all listeners somehow. So you would have to invent your own keep-alive protocol (not my cup of tea), etc.
So now, when the server needs to notify all listening clients of something, a JSON message is put in a scheduled task queue, and the call goes out to PubNub in a few ms and arrives at clients a few ms after that. It's pretty impressive.
Looks like spire.io provides similar services. Essentially, a cloud-based message bus that supports broadcast/fan-out.
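For anyone unclear on what broadcast/fan-out buys you over point-to-point, here is a toy in-process sketch. It is not the API of PubNub, spire.io, or App Engine; the `MessageBus` class, its methods, and the channel name are all made up for illustration. The point is only that the publisher makes one call and the bus, not the publisher, tracks who is listening.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[dict], None]


class MessageBus:
    """Hypothetical in-memory bus: one publish reaches every subscriber."""

    def __init__(self) -> None:
        # channel name -> handlers currently listening on that channel
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Handler) -> None:
        """Register a listener; the bus owns the listener list."""
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, message: dict) -> int:
        """Deliver one message to all listeners; return how many received it."""
        handlers = list(self._subscribers.get(channel, []))
        for handler in handlers:
            handler(message)
        return len(handlers)


# Two clients listen on the same channel; the server publishes once.
bus = MessageBus()
inbox_a: List[dict] = []
inbox_b: List[dict] = []
bus.subscribe("updates", inbox_a.append)
bus.subscribe("updates", inbox_b.append)
delivered = bus.publish("updates", {"event": "score_changed"})
print(f"delivered to {delivered} listeners")
```

With point-to-point only, the server would have to keep that subscriber list itself and send once per client, which is where the connection-state and keep-alive work comes from.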
"The missing piece, though, is the revenue model - the users who would generate more than 30 million messages in a month are the same users who actually might have their own back end, and the wherewithal to use it"
This is true, but if you had a service with the potential to generate, say, 50 million messages per month, would you spend $60/month and use this, or several thousand dollars to develop your own?
(Also, note that a big market for this is mobile, not just JavaScript on websites.)
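A back-of-the-envelope sketch of that trade-off, with loudly labeled assumptions: the $60/month is the figure from the comparison above, while the $3,000 build cost and the function name are hypothetical stand-ins. The sketch also ignores ongoing maintenance unless you pass in a running cost.

```python
import math


def break_even_months(build_cost: float,
                      hosted_fee_per_month: float,
                      own_running_cost_per_month: float = 0.0) -> float:
    """Months until building your own costs less than paying the hosted fee.

    Returns math.inf if running your own costs at least as much per month
    as the hosted service, since the build cost is then never recovered.
    """
    monthly_saving = hosted_fee_per_month - own_running_cost_per_month
    if monthly_saving <= 0:
        return math.inf
    return build_cost / monthly_saving


# Hypothetical: $3,000 to build, versus $60/month hosted.
months = break_even_months(build_cost=3000, hosted_fee_per_month=60)
print(f"break-even after {months:.0f} months")
```

With those numbers the home-built version takes 50 months to pay for itself, and longer once you count any monthly cost of running it.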
We've[1] been doing this for 11 years now, just as you describe. We built the bare metal ourselves, we own it, and the buck stops here.
Most importantly, unlike the OP who speaks of "the big hosting guys don't have a track record of building complex systems software" and your own post speaking of "complex software", we run an architecture that is as simple as possible.
Our failures are always boring ones, just like our business is.
You are correct that a chain of vendors, ending in a behemoth[2] that nobody will ever interact with and that will never take responsibility, is a bad model.
So too is a model whose risk you cannot assess. You have no idea how to model the risk of data in an S3 container. You can absolutely model the risk of data in a UFS filesystem running on FreeBSD[3].
I've been working on some improved labeling for certain grocery products:
https://kozubik.com/items/ThisisCandy/