kozubik's comments

kozubik · 2025-06-23T16:58:20

"... it’s an issue with bad ingredient labeling ..."

I've been working on some improved labeling for certain grocery products:

AyyEye · 2025-06-23T19:18:36

Awesome idea. Are you doing anything else in a similar vein?

kozubik · on May 9, 2012

The complexity implied by anything "better" than three nines is a recipe for disaster.

In reality, neither you, nor Amazon, nor anyone else has any idea how durable S3 is. But if they _did_, it wouldn't matter because unexpected interactions, cascading failures, and SNAFU would keep it from ever being realized.

Much better to have more frequent, very boring failures than to have rare spectacular ones.

jorgeortiz85 · on May 9, 2012

The author is proposing to serve his site entirely from S3, claiming it's better than using a couple of nginx boxes because S3 has eleven nines of durability.

Durability means you will get your data eventually (it will not be lost). Availability means you will get your data right now, which is probably what he really cares about in terms of serving live internet traffic.

Put another way: S3 not infrequently has availability hiccups (files are temporarily unavailable, resulting in a disruption of service), without taking durability hits (your files haven't been lost, you just can't see them right now).
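
To put rough numbers on the distinction, here is a minimal sketch that treats "N nines" as a failure fraction of 10^-N. The function names and the object count in the usage note are illustrative assumptions, not Amazon's own model, and real failures are not spread this evenly.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def failure_fraction(nines: int) -> float:
    """Fraction of time (or of objects) that falls outside an 'N nines' figure."""
    return 10.0 ** (-nines)


def downtime_seconds_per_year(nines: int) -> float:
    """Availability view: seconds per year the service may be unreachable."""
    return failure_fraction(nines) * SECONDS_PER_YEAR


def expected_objects_lost_per_year(nines: int, objects: int) -> float:
    """Durability view: expected number of objects lost outright per year."""
    return failure_fraction(nines) * objects
```

Under these assumptions, three nines of availability allows about 8.76 hours of downtime per year, while eleven nines of durability over ten million objects works out to an expected loss of 0.0001 objects per year. The two figures answer different questions, which is the point being made above.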

amouat · on May 9, 2012

Unfortunately frequent boring failures do not preclude rare spectacular ones.

kozubik · on May 8, 2012

How do the traffic exemptions in AWS fit into this net neutrality debate?

There are quite a few ways in which traffic between different AWS components is discounted[1], or "favored" over other traffic, and this doesn't bother me a bit - it seems quite reasonable.

I lean (slightly) to the net neutrality argument, but I have a hard time seeing how the billing practices of AWS are neutral within that framework.

How does the OP feel?

[1] For instance, bandwidth between EC2 and S3 is free.

kozubik · on May 7, 2012

No promises, but I think we will enable this on rsync.net storage arrays...

We already have s3cmd in our environment, so you can:

ssh user@rsync.net s3cmd get s3://rsynctest/mscdex.exe

So if we put this into the environment, you could call it over SSH:

ssh user@rsync.net gmvault sync foo.bar@gmail.com

... which is fantastic.

More like this.

zoobert · on May 7, 2012

Awesome. Let me know if you need some support.

toomuchtodo · on May 8, 2012

Any chance you could build S3 support into Gmvault? So you could cron it on a linux box to connect to Google and push all the data into an S3 bucket? If money is an issue, I'd be interested in footing the bill.

zoobert · on May 9, 2012

This is something I have in mind: Cloud save. It will be added to the roadmap. Please contact me and I will keep you in the loop. Regarding the money, I am thinking of adding a donate button to support the project.

kozubik · on May 4, 2012

Done and done:

http://www.rsync.net/resources/notices/canary.txt

We've been running it for seven years now:

http://blog.kozubik.com/john_kozubik/2010/08/the-warrant-can...

cnbeuiwx · on May 4, 2012

Very nice, but what if you get a gag order from the FBI or NSA? Then you would have to go to prison if you upheld your promises on the web site and disclosed what happened.

But I would be interested in your comments regarding this theoretical situation. Surely you must have thought about it.

kozubik · on May 4, 2012

The gag order is the whole point.

Read through it again - it is a positive, affirmative statement that we make each week (and make on three continents). A judge (or LEA, whatever) would have to compel us to make false public statements on an ongoing basis, and would have to further compel foreign (Swiss) nationals to do likewise.

Can we be held in contempt, etc., for refusing to make public false statements? Perhaps.

In reality, since rsync.net is not actually an ISP (we take pains to make sure we do not count as an ISP, since it allows us to skip things like what the OP has posted) and since we host no publicly available materials, we're not likely to get a warrant. If we do, it's likely to be an extremely mundane act of discovery, etc. That would get added to the warrant canary and we would continue updating it.

In our 11 years of running this service (7 years under the "rsync.net" brand) we've not gotten a single one.

But the parent to these comments was speaking of taking a stand, which is why this was instituted - people do indeed need to make a stand. We refuse to live in a world with Lettres de Cachet, and that's that.
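
From the reader's side, a weekly affirmative statement implies a simple staleness check: if the latest statement is older than the publishing interval, treat the canary as tripped. The sketch below assumes you have already obtained the date of the most recent statement by some means; it does not parse canary.txt, whose format is not described here. The function name, the grace period, and the handling of future dates are all hypothetical choices.

```python
from datetime import date, timedelta


def canary_is_fresh(
    last_statement: date,
    today: date,
    interval_days: int = 7,   # weekly statements, per the comment above
    grace_days: int = 2,      # hypothetical slack for a late update
) -> bool:
    """Return True if the most recent statement still counts as current."""
    if last_statement > today:
        # A statement dated in the future proves nothing; treat it as stale.
        return False
    return today - last_statement <= timedelta(days=interval_days + grace_days)
```

A check like this only tells you the statement is recent. Deciding whether the statement is authentic (for example, by verifying a signature) is a separate problem this sketch does not address.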

cnbeuiwx · on May 4, 2012

Thank you. This planet needs more people like you on it. I will also do everything in my power to prevent the future I see coming.

wow123 · on May 5, 2012

11 yrs without ever receiving a warrant.

That seems quite impressive.

And it suggests to me your customers are well-behaved. Is that how you would characterise them?

I also think it's a great selling point.

Maybe it's desirable not to have "unruly neighbors" in your "cloud service neighborhood".

We've seen plenty of examples of what can happen when such neighbors draw attention to themselves.

kozubik · on May 5, 2012

The key is that our service is cold storage only. All access, regardless of protocol, is with a username and password - there is no anonymous access to data stored here.

So there is no "hosting" or publishing of any kind.

The unintended consequence of this that we are really starting to appreciate is that we are NOT an ISP. The definition is fluid, and there's no guarantee about future regulation, but up to this point every one of the major "provider" laws has not applied to us as we are currently structured.

So the reporting, the LEA interfaces, the logging, etc. - we have no more responsibility to perform these items than your bakery does.

We are not a web host, and we are not an ISP.

kozubik · on May 3, 2012

I don't understand the business model at all.

I understand the attraction of implementing web based "messaging" (chat) in javascript. But why wouldn't I just point that javascript back to myself?

Why would I route the product of JS based chat through a third party when it could just communicate with the server it got the HTTP from in the first place?

My guess is that this is for folks that don't have any control over their back end - it's just a web serving black box, and this is just some more content to paste into it. Is that about right?

The missing piece, though, is the revenue model - the users who would generate more than 30 million messages in a month are the same users who actually might have their own back end, and the wherewithal to use it. I would think if you need to use third party javascript snippets, you're ipso facto a smaller, lower volume user ...

tbatterii · on May 4, 2012

"Why would I route the product of JS based chat through a third party when it could just communicate with the server it got the HTTP from in the first place ?"

I'll take a stab at it with an anecdote.

First it can be used for more than chat. Anything where a message bus would meet the need could work on top of this.

The project I work on uses PubNub (http://www.pubnub.com/) instead of App Engine's Channel API because we wanted a reliable way to broadcast to several listeners.

At the time, the Channel API would only do point-to-point (it still does), and if you wanted broadcast you had to maintain connection state with all listeners somehow. So you would have to invent your own keep-alive protocol (not my cup of tea), etc.

So now, when the server needs to notify all listening clients of something, a JSON message is put in a scheduled task queue, and the call goes out to PubNub in a few ms and arrives at clients a few ms after that. It's pretty impressive.

Looks like spire.io provides similar services: essentially a cloud-based message bus that supports broadcast/fan-out.

PubNub is supposedly servicing 100K messages per second now, and I would guess it's not just chat. (http://techcrunch.com/2012/03/21/as-developers-seek-more-int...)
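
The difference between point-to-point and broadcast can be sketched with a tiny in-process message bus. This is not the PubNub, spire.io, or Channel API interface; the class and method names are hypothetical, and a hosted service would also handle the network connections and keep-alives that this toy version ignores.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Listener = Callable[[str], None]


class MessageBus:
    """Toy fan-out bus: one publish call reaches every listener on a channel."""

    def __init__(self) -> None:
        self._listeners: DefaultDict[str, List[Listener]] = defaultdict(list)

    def subscribe(self, channel: str, listener: Listener) -> None:
        # The bus, not the publisher, keeps track of who is listening.
        self._listeners[channel].append(listener)

    def publish(self, channel: str, message: str) -> int:
        # Copy the list so a listener that subscribes during delivery
        # does not change what we are iterating over.
        listeners = list(self._listeners.get(channel, []))
        for listener in listeners:
            listener(message)
        return len(listeners)
```

The publisher makes a single call and never tracks individual listeners, which is the bookkeeping the comment above describes handing off to a third party.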

nl · on May 4, 2012

"The missing piece, though, is the revenue model - the users who would generate more than 30 million messages in a month are the same users who actually might have their own back end, and the wherewithal to use it"

This is true, but if you had a service with the potential to generate, say, 50 million messages per month, would you spend $60/month and use this, or several thousand dollars to develop your own?

(Also, note that a big market for this is mobile, not just javascript on websites)
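
The build-versus-buy question above comes down to a break-even calculation. A minimal sketch, in which the $60/month figure comes from the comment and everything else (the function name, the build cost you plug in, the optional running cost of your own back end) is a hypothetical input:

```python
from typing import Optional


def break_even_months(
    build_cost: float,
    hosted_fee: float,
    own_running_cost: float = 0.0,
) -> Optional[float]:
    """Months until building your own costs less than paying the hosted fee.

    Returns None when running your own back end costs at least as much per
    month as the hosted service, so the build cost is never recovered.
    """
    monthly_saving = hosted_fee - own_running_cost
    if monthly_saving <= 0:
        return None
    return build_cost / monthly_saving
```

For example, with an assumed build cost of $3,000 and no running costs, break_even_months(3000, 60) gives 50 months, which is more than four years before the in-house version pays for itself.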

kozubik · on May 4, 2012

Ok, fair enough. I'm still wrapping my head around JSAI (javascript as infrastructure) so bear with me ...

kozubik · on April 28, 2012

We've[1] been doing this for 11 years now, just as you describe. We built the bare metal ourselves, we own it, and the buck stops here.

Most importantly, unlike the OP who speaks of "the big hosting guys don't have a track record of building complex systems software" and your own post speaking of "complex software", we run an architecture that is as simple as possible.

Our failures are always boring ones, just like our business is.

You are correct that a chain of vendors, ending in a behemoth[2] that nobody will ever interact with and that will never take responsibility, is a bad model.

So too is a model whose risk you cannot assess. You have no idea how to model the risk of data in an S3 container. You can absolutely model the risk of data in a UFS filesystem running on FreeBSD[3].

[1] rsync.net

[2] Amazon

[3] ZFS deployment occurs in May, 2012

kozubik · on April 24, 2012

Unlimited storage is a farce, and places you in an antagonistic relationship with your provider:

http://blog.kozubik.com/john_kozubik/2009/11/flat-rate-stora...

... which is the last thing you want with serious backups.

Dylan16807 · on April 24, 2012

From what I've seen it's not a big problem here because it's unlimited storage for one desktop. The high-end users are only able to go so high.