Ask HN: How “normal” is it for a developer to be on-call?
5 points by manuelabeledo on Jan 28, 2022 | 20 comments
A few years ago, my managers decided that developers should be on call for incidents. We didn't have any sort of first responder or site reliability engineering staff, so almost all incidents, no matter how small, would be handled by developers.

Things improved a little bit after some time, but the idea of on-call developers is still very much ingrained in the incident management process.

Now, personally, I don't quite understand why a developer would need to be woken up at midnight to diagnose something that cannot be patched by him or her alone.

Perhaps the issue is that our process is dysfunctional, but I'm wondering if it is necessary, or just common across the industry.



This is mostly a cost-saving technique that companies quietly adopted as if it were quite normal. Anybody who complained or criticized was branded as not being a team player or being old-school. Companies found a way to get rid of the operations team and had developers do the same work without paying them extra. Well, there are some companies like Google who pay for being on-call, but the vast majority do not. So you end up having to cancel your weekend hike when you're on-call, you know, just in case something blows up. You get woken up in the middle of the night to look at an issue which could be in some part of the system you are not familiar with. But who cares, this is progress, right? Right?

But there is an upside to the devops model. Now developers think through the part after the code has been deployed: they care about debuggability, about resilience, about logging correctly, and all those things which they casually glossed over before.

To summarize, in my opinion, being on-call is okay as long as you get paid extra for it; otherwise you're just being used. Have a good day!


> Anybody who complained or criticized was branded as not being a team player or being old-school.

This, plus "you build it, you run it", which I don't agree with, given that most mid- to large-sized companies run devops teams, effectively hiding infrastructure details from developers.


> Anybody who complained or criticized was branded as not being a team player or being old-school

truly being a team player would be to realise the team identity does not include the company but is instead built around professional identity as fellow bit-gardening day labourers who exchange their time and energy providing engineering and operations services to the owners of the company in exchange for money -- then banding together collectively (perhaps starting a union?) to increase negotiating power to obtain better pay and working conditions

if anyone has a particularly pithy response to "you build it, you run it" along the lines of "if you'll pay both of my invoices for performing both jobs, OK", i'd love to hear it


It's not normal. It is scope creep.

My partner is an executive chef at a private school. It's one of our country's top schools that gets the occasional visit from people with royal titles (and one might soon be King), so think super rich kids: there is a lot of pampering. All of a sudden, with this Covid thing we're all working through, there are texts and phone calls at 5 AM: "There's a kid with a new allergy coming in today. Make changes in the lunch menu to accommodate them." This is literally a situation where one could ask, "Shouldn't this be an email, CC'd to others involved, too? And maybe even the day before so appropriate food items can be ordered?"

It's super annoying and clearly goes against how things are normally done in a well managed business. A lot of companies are clearly taking advantage of all the workarounds this pandemic has forced us to contend with. Except usually the abuse isn't even related to pandemic conditions.

A lot of people are technically working 24x7 nowadays.


When I've been involved in a "developer on call" arrangement it's been nowhere near as arduous as yours sounds - we were supporting users of our software but not an online service, so "help, the central server has gone down!" was less of an issue.

We had an informal rotation in the dev team so there was always a named person who'd make sure any customer issue got an intelligent first response - up until a certain point in the evening. That was to ensure we were able to ask for data we needed from customers in other time zones, then look properly when the team were in the next day.

I have known people on 24-hour call in companies that had infrastructure that was critical to their business (i.e. "as a service" sort of deals).

(personally, where I see developers are already doing some out-of-hours support work voluntarily, I think overformalising would put people off and be counterproductive)


Ah yes the devops scam. Makes sense on paper except that execs don’t care about ops or oncall load. So it ends up being yet another way to squeeze the code monkeys.


We did start this whole thing as part of a “devops transformation”, so you may be onto something.

Funnily enough, we went through something similar during our “agile transformation”.


I would go so far as to say it is a best practice. It aligns incentives. If you know you're going to have to wake up at 2am when your code blows up, you are more likely to make it resilient.


Then the CEO should also be woken up to be sure the incentives are truly aligned. After all, it is their website that is blowing up, and being woken at 2am will make them more likely to ensure best practices company wide.


If the CEO is equipped to diagnose and resolve incidents, then yes absolutely they should be in the on call rotation.


I agree with this; however, companies should pay extra for the on-call hours. In our team, I am on-call once every 6 weeks, so I am "working" about 9 weekends a year. As long as I am paid for those 18 days, which is a bit over 3/4 of a month (considering 22 working days a month), I am cool with it. But most companies don't.
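
A quick sanity check of those numbers, as a minimal Python sketch. The 6-week rotation and the 22 working days per month come from the comment; the 52-week year, the 2-day weekend, and rounding up to whole weekends are assumptions made here for illustration, and the function name is hypothetical.

```python
import math

WEEKS_PER_YEAR = 52          # assumption: a plain 52-week year
DAYS_PER_WEEKEND = 2         # assumption: Saturday and Sunday
WORKING_DAYS_PER_MONTH = 22  # figure taken from the comment


def on_call_load(rotation_weeks=6):
    """Return (weekends per year, weekend days, equivalent working months)."""
    # One on-call weekend per rotation cycle, rounded up to whole weekends.
    weekends = math.ceil(WEEKS_PER_YEAR / rotation_weeks)
    days = weekends * DAYS_PER_WEEKEND
    months = days / WORKING_DAYS_PER_MONTH
    return weekends, days, months


weekends, days, months = on_call_load()
print(weekends, days, round(months, 2))  # prints: 9 18 0.82
```

So under these assumptions the on-call weekends add up to roughly 0.8 of a working month per year.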


> If you know you're going to have to wake up at 2am when your code blows up, you are more likely to make it resilient.

Isn't it also true that a developer who routinely misses sleep is less productive and makes more mistakes during working hours?


On-call incidents shouldn't be routine. A rotation spreads responsibility across the team. Ultimately, someone has to be responsible for incidents, and it aligns incentives best for it to be the engineers who maintain the code.


> it aligns incentives best for it to be the engineers who maintain the code.

How many at a time? It has to be two or more, otherwise anyone could push changes to production at any time, without supervision.


I recommend one per team. You want somebody who is knowledgeable on the specifics of any part of the system that could break. I don't think having more than 1 is a requirement though, if we're talking about small companies. As long as there is a "break glass in case of emergency" way to push to production that is auditable.


Do you trust every single dev to push the right workaround?


Why is your productivity more valuable than the Ops/SRE team that probably costs the company more to employ?


It may not be. It all depends on how much value you put on resilience and quality in software. Again, a sleep-deprived developer has a higher chance of making mistakes than a rested one.

But I have heard your argument before, and it kind of rings hollow. Unless you are willing to allow developers to push changes to production in the middle of the night and without supervision, what exactly is the benefit of having them on-call? Saving money?


What exactly is the benefit of having a non developer on call who has no fucking clue what the problem is because they didn’t write the code, they don’t know the AC’s, etc? That doesn’t save any money or productivity either.

Your point is correct, though, that the responder does need the ability to make a hot fix. Non-actionable alerts are of negative value to the company.

You should read The Phoenix Project, that will help you understand the rationale.


> You should read The Phoenix Project, that will help you understand the rationale.

What makes you think I haven't read it? When my company went through its so-called "devops transformation", the first thing our devops advocate did was give away a dozen or so copies.

I understand the rationale. I just happen to not agree with it.

In my mind, it's naive to believe that a team would have more than two or three competent coders with a thorough knowledge of the code base, the high-level vision of where that code fits and interacts with the rest of the infrastructure, the skills to quickly fix things, and, on top of that, the willingness to be on-call several days a month.

And then there is the fact that nobody looks for solutions at 2am, just workarounds.



