
Stopped reading at Kafka. Java technology that combines bad networking with bad messaging and bad queueing.

I would have expected one of the big Internet websites to use better technology.



Better tech? Which is?


Well to begin with, Java is not a serious programming language to do systems programming in, since it has poor control of networking, threads and memory.

Then it's a message bus built on top of TCP. Anyone with a basic understanding of networking can see that if you have a producer that wants to send the same data to multiple consumers efficiently, you should use multicast.

Kafka also lacks proper mechanisms to throttle the speed of producers when consumers are too slow, which is the first thing you should ever be concerned about whenever you introduce a queue.

If you want something somewhat decent within Javaland you could try Aeron.


Kafka can buffer indefinitely (so long as you give it the disk space) and you can throttle to your heart's content within the consumer's loop or just consumer.pause().

I don't think you know the basics of Kafka's API or how it works internally.


This person doesn't appear to know the basics of a single thing they're maligning, but certainly isn't letting that slow them down.


The fact that you think this is a proper way to deal with a queue is quite concerning.

You completely missed the point. It's the producer that needs to be paused, and the reason for that is that memory is not infinite. You cannot just keep buffering until the consumer catches up, because it may never catch up.
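
To make the point concrete, here is a minimal sketch of producer-side backpressure using a bounded in-memory queue from the Java standard library. This is not Kafka code; the class and method names are hypothetical and chosen only for illustration. The producer is refused once the queue is full instead of buffering without limit.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a bounded queue; not how Kafka works.
final class BoundedQueueBackpressure {

    // Tries to enqueue `count` messages into a queue that nobody drains.
    // Returns how many were accepted before the producer was pushed back.
    static int produceUntilBlocked(int capacity, int count) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        int accepted = 0;
        for (int i = 0; i < count; i++) {
            try {
                // offer() with a timeout waits for free space, then gives up.
                // put() would wait indefinitely instead.
                if (!queue.offer(i, 10, TimeUnit.MILLISECONDS)) {
                    break; // queue full: the producer has to slow down or drop
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            accepted++;
        }
        return accepted;
    }

    public static void main(String[] args) {
        // With capacity 4 and no consumer, only 4 messages are accepted.
        System.out.println(produceUntilBlocked(4, 100));
    }
}
```

Whether the producer should then block, drop, or signal upstream is a design choice this sketch leaves open.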


While it is true that Kafka does not guarantee long-term steady state behavior, it is modeled as an impedance adapter of infinite capacity. It presents no impedance to the producer, has storage that is much larger than the arrival rate, and drains to the consumer as available. Any feedback between the consumer and producer has to happen out of band, which is fine.


And that is one of the major defects I pointed out.

It's not "fine".


I'm a pretty serious systems programmer with 30 years in the industry and I would never even consider using IP multicast for any purpose.


The fact that the poster mentioned IP multicast and Aeron implies they are talking about HFT and Stock Exchange environments, where high performance switches with explicit support for low latency multicast are the norm and not the exception.

There is a Signals and Threads podcast episode that goes a bit into the history of it.

https://www.youtube.com/watch?v=triyiLwqWUI (Transcript) https://signalsandthreads.com/multicast-and-the-markets/


Pretty much any L3 switch supports UDP multicast. What can be more rare is PIM support which is a router feature.

If anything, in HFT environments people use L2 switches for latency reasons. Those operate at the Ethernet level, so they don't really care about IP at all.

Anyway I don't see why that's specific to electronic trading or even just a low-latency concern. Sending the same traffic to hundreds of people with unicast means using hundreds of times the bandwidth, which is a huge problem.
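
A rough worked example of that arithmetic, as a sketch only. The feed rate and receiver count are made-up numbers, and the class and method names are hypothetical. It compares the sender's egress bandwidth when it sends one copy per receiver (unicast) with sending a single copy that the network replicates (multicast).

```java
// Hypothetical illustration: sender-side egress for one stream fanned out to N receivers.
final class FanOutBandwidth {

    // Unicast: the sender transmits one full copy of the stream per receiver.
    static long unicastEgressBps(long streamBps, int receivers) {
        return streamBps * receivers;
    }

    // Multicast: the sender transmits one copy; switches replicate it downstream.
    static long multicastEgressBps(long streamBps, int receivers) {
        return receivers > 0 ? streamBps : 0;
    }

    public static void main(String[] args) {
        long streamBps = 100_000_000L; // assumed 100 Mbit/s feed
        int receivers = 300;           // assumed receiver count
        System.out.println(unicastEgressBps(streamBps, receivers));   // 30 Gbit/s
        System.out.println(multicastEgressBps(streamBps, receivers)); // 100 Mbit/s
    }
}
```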


Oh I'm well aware, I've loaded more than enough Metamako alpha firmware releases for one career.

But there are a lot of professional software devs with zero real networking experience. Sure, they may understand the textbook definition of TCP, or they may even have seen pictures of fibre with labels like 'this is how far light travels in a nanosecond'. But they would have no idea how to calculate the serialization latency of a 10G link, let alone know the duty cycle required to saturate one.
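
For reference, serialization latency is just the frame size in bits divided by the link rate. A minimal sketch follows, with hypothetical class and method names. It counts only the bytes passed in and ignores Ethernet preamble, inter-frame gap, and any encoding overhead.

```java
// Hypothetical illustration: time to clock a frame onto the wire.
final class SerializationDelay {

    // Nanoseconds needed to serialize `frameBytes` at `bitsPerSecond`.
    static double nanos(long frameBytes, double bitsPerSecond) {
        return frameBytes * 8 * 1e9 / bitsPerSecond;
    }

    public static void main(String[] args) {
        // 1500 bytes at 10 Gbit/s: 12,000 bits / 10^10 bit/s = 1.2 microseconds.
        System.out.println(nanos(1500, 10e9) + " ns");
        // A 64-byte minimum-size frame on the same link.
        System.out.println(nanos(64, 10e9) + " ns");
    }
}
```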

But none of that matters in the cloud (which is where Twitter is stacking their Jenga tower in the original blog post). Even if both Google and AWS have custom silicon (or at least FPGAs) doing hardware offload for their internal SDN encapsulation protocols at the server level, and even if their custom switches all support it, it doesn't matter. They hide all of that from you, the customer, and rarely even acknowledge its existence.


One does wish the cloud network were a little more exposed to the tenant so we can do fancy stuff. Amazon's EFA is as close as they get. On the other hand I suspect Google of having bog-standard merchant silicon switches and maybe custom silicon NICs in their latest and greatest machines but for the most part off-the-shelf stuff at the host as well.


Still stuck with IPv4?? :)


> is not a serious programming language to do systems programming in, since it has poor control of networking, threads and memory.

Tell that to the Netty folks.


Here we go. You must be some HFT semi-god. For the rest of us, Java and TCP are just fine.


You still haven't answered the question. What is a better programming language?


Guessing he'll say Erlang


The established languages for systems programming are C and C++.


Yes, please elaborate - what’s better and why?


Elaborate?



