53

I manage 20 software engineers divided into 4 sub-teams. Every team has good work standards and a high level of ownership except one. That team has one senior guy and three juniors. Every time there is a critical bug (one impacting the business), this senior guy pushes the work to the next day by saying things like "I can't finish it today," "I will look into it tomorrow," "Do we really need it today?," or "How are we going to test that tonight?" Even when I told him I needed it now, he said he had something else to do and sneaked off when I was not there. He has also told the juniors to push back their work.

Last week, I told them in a team meeting that I expect a higher level of ownership. If they promise something, they should do it. If there is a critical bug, they must fix it even if they have to stay late.

Today, there was a critical bug and this senior guy said the same thing again: "I can't finish it today. I have a meeting with friends and I have to go." Then he sneaked out while I was talking to my manager.

This is not the mentality I want my team to have. I plan to tell him that he has to change his work style or find a new job, and then wait for his answer. Is that too direct? Is there an alternative way to deal with issues like this?

Update

In this particular example, the bug prevents 90+% of users from logging into the system. On average, this has happened once a month this year, while it happened twice in all of last year. Critical bugs are well defined: they are bugs that 1) prevent users from logging into the system or 2) prevent users from purchasing products. Only these two types of bugs count as critical.

What we did to prepare every release:

  1. We had thorough plans where everyone understood the requirements; we actually planned down to field names and functions. I implemented a rule for all teams that requirements can't change after the sprint starts. We also had test cases ready before the sprint started.
  2. We add a buffer to all tasks: if we think we can finish something in 1 day, we put down 1.5 days. We found that some people consistently underestimate tasks.
  3. The first deadline was the end of January, which is when they thought they could get it done, including tests. This is another rule I implemented in all teams: POs tell us what they want and we tell them how long it will take. So I told the other teams that everything would be ready by the third week of February.
  4. By the end of January they said all features were done, with tests for the test cases. We deployed them to our test environment and found a bug where users can't log in. It turned out that they had not written all the tests. I asked them how long it would take to fix the bugs and write the tests; they said two weeks.
  5. For the first two weeks of February, I told everyone that we would only test and fix critical bugs. Again, critical bugs are either 1) users can't log in or 2) users can't purchase products in the app. Everything else went into our backlog.
  6. Weeks 3-4 of February came after we released it to customers. We spent these two weeks fixing the non-critical bugs (the ones we logged in #4): reproducible crashes and other less important bugs like layout issues. Again, all these fixes have tests.
  7. We released it to customers with all tests green. After deployment, we found that some numbers were off, so we retested everything and found the same issue coming back: users can't log in.
  8. The last time they stayed late at night, I gave them an extra 2 days off.
Pierre Arlaud
Code Project
  • 2
    Comments are not for extended discussion; this conversation has been moved to chat. – Neo Mar 15 '19 at 12:07
  • 40
    Can I suggest that everybody rethinks their answers in the light of the above edits. – DJClayworth Mar 15 '19 at 13:25
  • More questions for CodeProject: Is this senior you write about the only person who could fix this bug? And who is responsible for pre-deployment testing? – DJClayworth Mar 15 '19 at 13:59
  • 35
    Are you telling me (on average), once a month, 90% of your user base is not able to use your service (during the day)?. Boy, you have to have bad QA. You must review your processes ASAP! – Marcel Mar 15 '19 at 15:18
  • @DJClayworth no he is not the only one who can fix it but it will be faster because he wrote it. The team owns everything from writing code, testing, and deploying. – Code Project Mar 15 '19 at 16:18
  • 2
    @CodeProject Does your team have dedicated roles for testing and deploying, or do you just have a group of developers that do everything? – 17 of 26 Mar 15 '19 at 16:31
  • 1
    Is there a consequence to this team if they stop what they're doing to immediately swap contexts? It sounds like you're really tight about releases. If they're under constant pressure and consistently burned out, and there are harsh penalties for falling behind on anything or for making mistakes on critical fixes, I'd completely get them being reluctant to jump into critical bugs. – AJFaraday Mar 15 '19 at 16:43
  • @17of26 We do not have dedicated testers. Developers do everything from writing code, writing tests (unit and UI), and releasing – Code Project Mar 15 '19 at 16:43
  • 24
    That's one of your problems. Developers do not make good testers. https://www.joelonsoftware.com/2000/04/30/top-five-wrong-reasons-you-dont-have-testers/ If instead of 20 devs you had 15 devs and 5 testers, your product would be in better shape and you'd be paying less salary. – 17 of 26 Mar 15 '19 at 16:45
  • 1
    Do you measure code coverage metrics for the tests? – Technophile Mar 15 '19 at 17:10
  • 6
    If the product is critical to your business, I don't understand why you don't have an out of hours team on rotation for situations just like this one? – JoeTomks Mar 15 '19 at 17:25
  • 39
    "they must fix it even if they have to stay late." uhh oh nope you're dead wrong –  Mar 15 '19 at 17:42
  • 3
    I agree with getting a set of dedicated testers, ideally one per developer team. One thing I haven't seen anyone mention is that the concept of estimating against time is frowned on in Agile, because humans suck at estimating time. This is why estimates are usually done with points in relation to some non-linear sequence. The idea is that after several sprints, you should have an idea of how long all of the work will take on average, but individual stories may take longer than their point estimates. I.e. a 2 point story will take a day on average, but might take 3 once in a while. – jhyatt Mar 15 '19 at 18:19
  • 2
    @17of26 I disagree with your assertion, as a developer-in-test :-) I find that having been a developer for two decades makes me a very good tester: I know how developers think, how they'll take shortcuts, and it helps me design fun tests to trip them up during integration testing :-D – Aaron F Mar 15 '19 at 18:41
  • 10
    @AaronF You are the exception to the rule because you actually like testing :) – 17 of 26 Mar 15 '19 at 18:45
  • 18
    If there is a critical bug, they must fix it even if they have to stay late. Is this a joke OP? Be less of a slave driver and more of a leader. RE: Last time they stayed late at night, I gave them extra 2 days off. This is also just stupid. Have you considered that 2 extra days don't matter if they had something important you made them miss on the one day they stayed late? Are you aware how bad sleep and burnout accumulate doing this? Their obligation to you is limited, you are using ownership as some sort of culty weapon to control them and it appears the senior sees right through it. – CL40 Mar 16 '19 at 03:03
  • 10
    Reading your update, the critical bug apparently has existed for over a month, and blocks one of the most basic functionality (login). This tells me that something is seriously wrong in the structure of your dev/integration environments, service architecture and/or planning. Staying late fixing the bug is not going to work. – frankhond Mar 16 '19 at 08:38
  • 3
    "If there is a critical bug, they must fix it even if they have to stay late." The critical bug more than likely got introduced by poor planning - maybe not enough testing or trying to rush your developers. This is your fault. You need to take responsibility for the poor planning. It's starting to sound like the senior engineer on that team is the only senior there, as they are protecting their juniors from you. – UKMonkey Mar 16 '19 at 09:28
  • 3
    @AaronF I think the core idea is that a developer makes for a bad tester of their own code. I like Joel's writing but the QA one is a bit wonky even if the overall advice is sound. I also don't like the idea that testers are cheap and you should somehow base your decision on that. A good tester is worth their weight in gold. Aside from the metaphor In some areas they really aren't cheaper than developers but run around the same price. And even then I don't think "how much do I have to pay people" should be a factor. You need testers, just like you need developers or desks. – VLAZ Mar 16 '19 at 11:07
  • 6
    "the bug prevents 90+% of users from logging into the system. On average, this happens once a month this year while it happened twice last year." If you can quote statistics on a single bug that covers a period of months (even years), then it is not a critical, because apparently it hasn't been important at all these past few months to fix, which means that your senior may be correct to ignore your call for urgency. Why is it critical now, when it apparently wasn't critical all these months? – Mark Rotteveel Mar 16 '19 at 11:28
  • 3
    "Ownership" != "company vassal". – BittermanAndy Mar 16 '19 at 15:49
  • 7
    @Aaron You misunderstood Joel's post. You shouldn't test your own code, because most people will subconsciously avoid the areas they know are brittle. A developer makes a great tester for other people's code for the reason you named. But considering the going rate for developers compared to testers that's a rather expensive proposition. – Voo Mar 16 '19 at 17:10
  • 1
    Step 5-6 is where you went wrong. Inadequate test cases (I realise tests can't cover 100% of everything, but you stated that they "didn't write all the tests" which I assume it means test that were planned to be written, but not implemented) then took the decision to release anyway with only critical bugs fixed. I would have cancelled/postponed that release in favour of a more thorough investigation of the testing etc (because of "unknown unknowns"). Not an answer because it doesn't address your problem, but is this something you considered? – seventyeightist Mar 16 '19 at 19:07
  • 7
    In this particular example, the bug prevents 90+% of users from logging into the system. On average, this happens once a month this year while it happened twice last year - say WHAT?!?!? Your product has gone BELLY UP once a month this year, and a couple times last year AND YOU ALLOW THIS TO CONTINUE?!?!? *ARE YOU KIDDING ME?!?!?!?* Your absolutely top priority is to stabilize this system - not just band-aid it, but do whatever is necessary so that this cannot happen again. You CANNOT be locking your users out, or they WILL find someone more reliable. COUNT ON IT! – Bob Jarvis - Слава Україні Mar 16 '19 at 23:36
  • 6
    This question seems to have confused "ownership" with "willing to work arbitrarily large amounts of unpaid overtime with little to no advance notice". They're not at all the same thing. Though ensuring your developers have some actual ownership (as in, shares/equity in the business) can help. Otherwise, why should they care? – aroth Mar 17 '19 at 01:18
  • 1
    @Voo It's not just that a dev may avoid a difficult area to test (anyone might). It's that a dev has certain assumptions about how things will be used that testers won't ("Why did you click that?" "Why did you do it in that order?") and that if they didn't think of a scenario in dev, they won't in test either. A new set of eyes will. Which doesn't mean that the dev should do no testing; they should write unit tests. It's just that unit tests are not the only form of test and should not be relied on to find all bugs. Or as the only thing you do. You need integration and smoke tests. – Gabe Sechan Mar 17 '19 at 07:02
  • 2
    Nowhere in your lengthy question did you mention the root-cause of "[Jan] test-cases supposedly finished. Found bug where user can't login... [Feb] released... [After deployment] retested everything and found the same issue coming back - [up to 90% of] users can't login." So what's your root-cause analysis of how that slipped past your tests? Did you ever write a testcase that detected that? If no, why not? If yes, why did you release before the bug was fixed? Like, what on earth does "After deployment, we found that some numbers are off (?!) so we retested everything" mean?! – smci Mar 17 '19 at 08:20
  • 2
    Honestly it sounds like you're ineffective at testing things, even when you know where the bugs are. You need to do a post-mortem with someone and figure out what your test process flaw was on that one. Not flog the employees into unpaid overtime. I also don't like that when you mention mistakes that sound clearly like your responsibility, it's always "we" released the product knowing it had severe bugs that the testcases probably didn't cover, or that you can't measure test coverage accurately. That one's on you, stop trying to pass the buck. – smci Mar 17 '19 at 08:25
  • 4
    Unpaid overtime, daily floggings and so on do not make up for a basic lack of methodology. If you're embarrassed at what a methodology review might reveal about your process, hire an external consultant. One obvious thing mentioned already by others is people should not just be testing their own code. Hiring one tester per team is also a good idea. – smci Mar 17 '19 at 08:29
  • 6
    I really strongly object to your title. More like "Our product has severe useability bugs, we've never written testcases that adequately cover them, our coverage metrics are unreliable, yet we keep releasing new code - what should we do differently?" – smci Mar 17 '19 at 08:37
  • @seventyeightist I found that out after the app was released, and since it's on the App Store, it's out of my hands. Our plan to tackle it is to make tests more visible on a dashboard. – Code Project Mar 17 '19 at 16:24
  • 1
    How often do emergencies happen? If an employer plays the "you must stay late to fix an urgent issue" all the time, then the employee is correct that it's not genuinely urgent. – Harper - Reinstate Monica Mar 17 '19 at 18:14
  • @smci They caught the bug the first time, and again on retest. It's the engineer in question that's unreliable. – Mars Mar 18 '19 at 00:14
  • @Harper OP literally states it's normally twice a year, yet this year, with this team (not the other 3 under OP's care), it's twice a month. And if you read OP's definition of critical, it's pretty critical.... – Mars Mar 18 '19 at 00:16
  • Do the other teams get these "critical bugs" as well or is it just that one team with the senior dev? I'm assuming (!) that you have each team correcting their own issues (with some particular area of the product)? – seventyeightist Mar 18 '19 at 19:21
  • @Mars: totally wrong. The OP states *"the [login] bug prevents 90+% of users from logging into the system. On average, this happens once a month this year while it happened twice last year". So how I summarized it is totally correct: "severe useability bugs, we've never written testcases that adequately cover them, our coverage metrics are unreliable..."*. Do not buy the rest of what the OP says, semantics about "ownership" are a smokescreen. Clearly product has always had critical bug(s) in login, they've never shipped a release that fixed it, nor ever written a testcase that covered it – smci Mar 20 '19 at 08:59
  • @Mars: ...and the killer part is *"7. We released it to customers with all tests green. After deployment, we found that some numbers are off so we retested everything and found the same issue coming back - users can't login."* Can you spot the process errors in that? The coverage metrics (of login functionality) are garbage. That simple. Do you honestly believe that after a year of this, they have a testcase that covers it or not? Almost noone here does. Where does the OP talk about root-causing it? Nowhere. Where's the process? – smci Mar 20 '19 at 09:11
  • @smci It looks like we're reading it 2 different ways. I believe OP means 2 crit bugs all year, not 2 per month for a year. Crit bugs don't necessarily mean login bugs, so I think its wrong to assume its always the same, or even related, bugs. – Mars Mar 20 '19 at 12:37
  • @smci #7 says "no, we designed it right and someone screwed up. " or it says "the senior engineer in question lied." The reason OP isn't talking about the root-cause is because that's a separate question... Question a (root-cause) = What how do we fix it so this doesn't happen again? question b (the question being asked) = Someone on this team screwed up and the person responsible isn't willing to do what's necessary to minimize the damage – Mars Mar 20 '19 at 12:41
  • @smci I'm curious about 2 things: If the senior IS at fault, do you feel they should put in overtime to fix it? (For the sake of leniency, let's say paid OT). And under what condition is it not senior's mistake? – Mars Mar 20 '19 at 13:02
  • @Mars ^^^ It doesn't help our opinion of the OP's basic communication that armies of people here still can't decode precisely what they're saying happened, and OP now won't respond to clarify basics. ^^ To me, no it doesn't say either of those. It says the bug existed a year before this release, they never wrote tests that adequately covered it, OP was aware of that before he/she authorized this release, now they want to somehow belatedly use this as a pretext for never-ending firefighting. Clearly the OP did not have a releasable product. Do not buy the story about scapegoating the developer. – smci Mar 20 '19 at 13:04
  • @smci We're not here to discuss the reality, we're here to discuss the story. If you want to ask what I think if the situation isn't as OP says, then you should ask that question ;) – Mars Mar 20 '19 at 13:06
  • ^^ Given it is impossible to even understand precisely what the OP is blaming the SrDev for, but that all the bugs should have been caught with any decent software process, the SrDev is not responsible for the OP's behavior and general disregard for software process. I don't think many are interested in a sideshow about how many unplanned nights of overtime the OP believes they are entitled to. – smci Mar 20 '19 at 13:09
  • "We're not here to discuss the reality, we're here to discuss the story." is a very strange proposition to make. Where the OP is vague or self-contradictory on so many basics that it calls into question both their set of facts, competence and basic communication, we absolutely have to try to find the objective reality. – smci Mar 20 '19 at 13:11
  • @smci Sorry, I don't see where the OP is contradicting them self... – Mars Mar 20 '19 at 13:13
  • You know what I think the most likely situation was? A merge conflict handled incorrectly without proper retesting. Or a seemingly unrelated bug, where the person who fixed it didn't imagine the repercussion and only tested related to the bug. This happens often when tests are not automated--you test only what you fixed (yes, that is not ideal and I push for test automation, but in my experience, it isn't very common) – Mars Mar 20 '19 at 13:16
  • Which is a miss that needs to be fixed through a better process. Doesn't change the fact that it was the senior's miss or that immediate firefighting may be required. I think this question is step 1 and the process fix is step 2 – Mars Mar 20 '19 at 13:20
  • @smci By the end of Jan they said all features are done with tests in test cases. We deployed them to our test environment and found a bug where user can't login. It turned out that they did not write all the tests I did A. Oh, actually I didn't do A. And then again for this bug. I did B. Oh... I guess I didn't do B. – Mars Mar 20 '19 at 13:28
  • @CodeProject you should read https://www.amazon.com/Phoenix-Project-DevOps-Helping-Business/dp/0988262592 seems you have constant fire fighting and unplanned work in your organization – Nahum Mar 20 '19 at 13:37
  • @Mars: there's no need to keep repeating back to me the same unconvincing story which I'd already read five times before you posted. Note that every time OP is involved in a mistake, the royal 'we' slips in, never 'I'. ('We' knew critical login bug(s) existed for a year ongoing and never fixed it(/them), 'we' never wrote the testcase, 'we' decided to release it, 'we' found that some numbers are off. In your hypothesis, 'we' failed to notice the merge conflict and 'we' failed to rerun the test suite before release. This is just blame-spreading)... – smci Mar 20 '19 at 22:48
  • As to the OP deciding to release without all the tests written (and they should be prioritized, with the login bug tests high up), that was the OP right? And I don't like the vagueness of "It turned out that they did not write all the tests" a) which specific people? SrDev or others? b) how the hell did the OP not notice? c) when did "it turn out" - just before the release? That's not a software process. Trying to blame this on one failed merge by one person is not cutting it. The OP owns the process.) – smci Mar 20 '19 at 22:54
  • @smci Very interesting take! I saw that as the exact opposite! Instead of saying "SrDev/SrDev's juniors didn't write the tests," OP put themselves in there to make it a "we". to make it sound as if OP wasn't placing blame. How the hell did OP not notice? OP is running a team of 20. Chances are, OP isn't the one reviewing code, or even looking at the code period. OP might not even know how to read code. That how the hell did the OP not notice should probably be directed at SrDev...OP owns the process OP may not know anything more than what is reported to OP.... – Mars Mar 21 '19 at 05:18
  • A lot of this convo is just projecting based on our own environments/experiences and not really helpful.. I think its time to put it to rest :) – Mars Mar 21 '19 at 05:19
  • 1
    @Mars: I agree that the OP's account and overuse of 'we did X' or the passive voice, when describing things that went wrong, makes it impossible to know who did/did not do what and who was/was not responsible, and renders this objectively unanswerable. But it also doesn't inspire confidence in the OP's communication, and hence their version of things. Anyway yes might as well leave it. I doubt OP will come back and clarify the missing information. – smci Mar 21 '19 at 05:21
  • If there is a critical bug, they must fix it even if they have to stay late. How about you as a manager, buddy? I hope you fix at least half of them before asking others. Because you are a leader, and you explained nothing understandable about how you keep a buffer for critical bugs or how you pay overtime. – Prasad Raghavendra Dec 09 '19 at 17:29

21 Answers

344

You seem to be confusing two things:

  • Them working any number of hours to deal with unexpected or unplanned issues.

  • Them being responsible and providing quality work in a predictable way.

Ownership is not about the team working the whole night to meet your promises to customers. Ownership is about knowing what's in the code and how it works, having a plan, and being able to tell you how and when things will be done. Ownership is developers making the right decisions so the code works correctly not just tonight, but in the years to come.

Sorry if this is a bit rough, but I've had too many managers tell me variations of your post. More often than not it boils down to:

  • lack of clear mandate
  • changing requirements
  • short term focus
  • constant urgency

Would you please elaborate, in the question, on what you as a manager did to prepare those releases and empower your team, and on how you listened to their feedback? Then we can talk about ownership.

Jeffrey
  • 2
    Sometimes it's difficult for me to put my thoughts into words. You've done it very succinctly. Thank you. If I could up-vote this to infinity, I would. – joeqwerty Mar 14 '19 at 18:04
  • 173
    Well said, especially the last point. If everything is urgent, then nothing is. – 17 of 26 Mar 14 '19 at 18:10
  • 18
    While I kind of agree in principle, IMHO we're ignoring an important fact. If your manager asks you to start fixing something TODAY then you DO it today. You do not postpone it to tomorrow because you think it can wait, no matter what. You may not finish (because, rightly, you may not want to work overtime) but you start ASAP; it's not your call. It's close to insubordination and as such you risk being disciplined. – Adriano Repetti Mar 14 '19 at 18:46
  • 132
    @AdrianoRepetti I strongly disagree. Poor planning on my manager's part does not mean life-or-death urgency on my part. Yes, I do what my manager wants to the best of my ability, but I also try to keep my manager's expectations in check. If they are asking me to do something that is unreasonable, I am not going to stress myself out trying to do it. – David K Mar 14 '19 at 19:49
  • 28
    @AdrianoRepetti In the original question it mentions that the employee says that he "can't finish it today", so for all you know he does start working on it, but it isn't possible to finish until the end of the workday. – Maxim Mar 14 '19 at 20:18
  • 3
    @AdrianoRepetti I will push the "START" task button today - marking the task as "IN PROGRESS" and tomorrow I will think about working on it. – emory Mar 14 '19 at 22:37
  • 6
    Without commenting on the merits of the answer, it doesn't answer the question OP (currently has) posted. – user1717828 Mar 15 '19 at 01:06
  • 7
    +1 for the logic and explanation. But nothing here has to do with ownership, unless the employees participate in ownership (real ownership) of the company. Sometimes people talk loosely about a feeling of ownership, meaning a feeling of responsibility, but that's not ownership. A business or other property owner may want others to feel or act like they too participate in owning his property, but that does not make that ownership real. Such use is essentially ideological. – Drew Mar 15 '19 at 04:34
  • 2
    this is exactly what I've always wanted to say to my managers but never could – Pixelomo Mar 15 '19 at 05:37
  • 3
    @david as an employee you do not educate your manager disobeying his orders. People are rightly fired because of that. Yes, I can see that many people dream to do it but that doesn't make it a sensible advice. – Adriano Repetti Mar 15 '19 at 07:15
  • 5
    @DavidK Throwing around words like "poor planning" when talking about Critical Bugs is pointless rhetoric. If your manager could plan for that, they would be better served using their precognitive abilities to play the stock-market or to win the lottery. The only "poor planning" is not ensuring that there is always someone "on-call" to provide out-of-hours support when (absolutely) necessary. (And "90% of users can't even get past the login screen" is a situation I would call "absolutely necessary") – Chronocidal Mar 15 '19 at 08:52
  • 4
    How is this being upvoted? This should be a comment as it boils down to asking for rephrasing and elaborating. – LVDV Mar 15 '19 at 09:41
  • 2
    @LVDV The whole answer, including "Would you please elaborate", seems to me less of a genuine request for rephrasing and more of a thinly veiled slight at the OP's world view. And 'Re-examine your view of the situation because X, Y and Z' is kind of an answer... – Brent Hackers Mar 15 '19 at 12:30
  • 5
    @AdrianoRepetti As a manager, you don't ignore employee feedback. As a manager, you don't ignore time or resource constraints. As a manager, you don't ignore issues raised during work that are blocking progress - and still all of those are very common "manager practices". If your employee isn't obeying you, find out why. Sometimes the problem is with them, but sometimes the problem is with you. – T. Sar Mar 15 '19 at 12:39
  • 20
    @Chronocidal Critical bugs in production can almost always be traced back to poor planning. If a critical bug gets into production, it means that not enough time or resources were spent in the testing phase. Critical bugs will often get to the testing phase because not enough time or resources were spent on development. – 17 of 26 Mar 15 '19 at 12:46
  • 7
    @17of26 You are a 100% correct. Heck, this is a login error that affects 90%+ of the user base - how in the world this wasn't caught? – T. Sar Mar 15 '19 at 13:18
  • 13
    @AdrianoRepetti: Real world example: Over 6 months, about 80 of my days (= 4 months) have been spent on "urgent fire" emergencies. Every single project has developer absences every single day because some developer needs to emergency fix something in a past project. Over these 6 months, management has specifically blocked me from implementing any form of testing or code review because "it takes extra time". Projects are sometimes intentionally put on hold until they become more urgent than the others. Jumping when management tells you to jump is exactly how this situation came to be. – Flater Mar 15 '19 at 14:25
  • 2
    While I can agree on taking ownership being more about delivering quality than putting in overtime, the rest of your answer doesn't fit the updated question. Their methodology and planning seems spot-on. It looks like the team and senior dev in question don't keep to the same standards as the others by omitting mandatory tests and undermining quality. Good manager, bad dev / team performance. – Søren D. Ptæus Mar 15 '19 at 15:21
  • 5
    @SørenD.Ptæus The management failure in this scenario is not getting to the root causes of critical bugs continually making it into production. Making devs work overtime to fix the critical bugs is treating the symptoms and not the disease. – 17 of 26 Mar 15 '19 at 15:30
  • 1
    The "a critical bug affecting 90% of users shouldn't have gotten into production" hindsight approach is nice and all, but I don't get how most people here agree with working on your features instead of fixing the way the company makes money. Even if you take a "it's treating the symptoms and not the disease" perspective, the lead dev was obviously not going back to his desk to fix the disease (that would be more of a manager's duty). – R. Schmitz Mar 15 '19 at 17:08
  • 3
    Something like this - if 5% of users can't login, that's a developer problem (wrote it in a way that's got a bad path that can fail, QA didn't have visibility (or time to play)). If 90% of users can't login, that's called "QA did nothing". Or the app's UX is horrendous (i.e. the "happy path" is weird to go through and there's not enough guidance or error messaging). – Delioth Mar 15 '19 at 18:31
124

Even when I told him I needed it now, he said he had something else to do and sneaked off when I was not there.

Today, there was a critical bug and this senior guy said the same thing again: "I can't finish it today. I have a meeting with friends and I have to go." Then he sneaked out while I was talking to my manager.

In both of these examples, you refer to him as sneaking off, but by your own words he told you that he wasn't going to do this work and then didn't do it. Sneaking off implies he's being deceptive or dishonest, but it sounds like he's being transparent, and you ought to recognize that. I've worked with people who say they'll handle things and then disappear, and those people deserve to be fired. Someone who informs you of their bandwidth and then follows through is different entirely. This person's integrity isn't an issue; he is only unemployable if his results aren't sufficient.

Last week, I told them in a team meeting that I expect a higher level of ownership. If they promise something, they should do it. If there is a critical bug, they must fix it even if they have to stay late.

This is a reasonable statement and a level of ownership that senior engineers should generally accept with some caveats:

  1. Critical bugs must actually be critical. For example, in my own career I have stayed late to fix "critical" bugs that were then not deployed into production for two months. In those cases, it was a manager freaking out about something and wanting it now instead of actually a critical bug. Of course, there have been actually critical issues as well.
  2. Staffing levels must be generally sufficient. Meeting release dates and fixing issues are important, but if we are always late because we have 3 people doing 4+ people's work, that's a different situation.

Is there an alternative way to deal with issues like this?

Some development methodologies have built-in ways to manage these issues. In Agile development, for example, sprints are a way of promising what work will be delivered. It also includes built-in ways of measuring velocity (the amount of work being accomplished) and usually goes along with software (JIRA is the most popular one, I believe) that makes it visible whether a team or individuals are meeting those goals. In Agile development, if you need to change course mid-sprint, such as taking time to fix a critical bug, it inherently means you're changing the scope. Normally, you take things out in order to add whatever it is that must be added. This process makes it really easy to evaluate whether "I can't get to it today" is because he's working hard on other important goals or because he is just being difficult.
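
As a rough illustration of that scope trade-off (the numbers and task names below are invented, not taken from the question), here is a minimal sketch of the arithmetic a team can walk through when an unplanned critical bug lands mid-sprint:

```python
# Hypothetical sprint-scope check: an unplanned critical bug has to displace
# roughly its own size in planned work. All names and figures are illustrative.

sprint_capacity_points = 20          # what the team committed to
completed_points = 8                 # already finished this sprint
critical_bug_points = 3              # rough size of the unplanned fix

planned_remaining = [("feature A", 5), ("feature B", 4), ("refactor C", 3)]

remaining_capacity = sprint_capacity_points - completed_points
overrun = (sum(p for _, p in planned_remaining) + critical_bug_points) - remaining_capacity

# Defer the smallest planned items until the overrun is covered.
deferred, freed = [], 0
for name, points in sorted(planned_remaining, key=lambda item: item[1]):
    if freed >= overrun:
        break
    deferred.append(name)
    freed += points

print(f"Over capacity by {overrun} points; defer: {deferred}")
```

Making the displaced work explicit ("we take the bug, and refactor C moves to the next sprint") turns "stay late" into a visible scope decision.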

IMO, it's a fantastic method of software development that unfortunately is almost never done correctly.

UPDATE: In response to the question edits, this bug is absolutely critical in nature (at most companies it'd be called a showstopper rather than merely critical) and should be fixed immediately. I would follow the technique I described above: taking things off his plate in exchange for him working on it now.

It sounds like this project has been a mess and very stressful for everyone involved, but a bug that makes it so 90% of users can't log in is worth staying late for. You need to assess whether or not this employee has completely checked out (in which case you have to help him move onto other employment) or if the project has just worn him down and he needs a break.

dbeer
  • 6
    I agree with most of your comment but I'm not sure Agile applies here. I'm assuming when the OP says "critical bug" he means something that has come up in released software that really has to be fixed right now (e.g. the recent Facebook outage... I suspect a lot of people were burning the midnight oil). It's true that Agile will let you measure impact on the schedule but OP doesn't even mention the existing work schedule. – DaveG Mar 14 '19 at 19:26
  • 58
    +1 for "Critical bugs must actually be critical." Just last week I saw a "critical" item that was ultimately ranked 6th in priority... I've learned, for better or worse, that "critical" is a word which can just be ignored. – zr00 Mar 14 '19 at 20:30
  • 2
    @DaveG I don't know if they are using Agile or not; I'm recommending it as a process that makes the impact of asking for a bug fix today more clear to all parties. My experience is that all parties involved have a better experience when the impact of escalating a bug is more transparent. – dbeer Mar 14 '19 at 21:23
  • 10
    To add to this, most engineers are flexible about working extra hours for genuinely "critical" bugs. But OP, do you then give them hour-for-hour time off in lieu? If not, expect your staff to start working to rule as this team do, because busting a gut for the company does not in the end make any sense if the company doesn't give anything back. – Graham Mar 14 '19 at 23:34
  • 3
    I love having 5 “high”, “show-stopping” defects that boil down to “this menu is lime, but is supposed to be chartreuse”. – zero298 Mar 15 '19 at 02:04
  • 1
    @Graham Quite right. I once worked for a company where my team got berated by the CEO for getting in at 11am after working till midnight the night before to meet a management-imposed deadline. I fired back that if he was unhappy, we would all be happy to henceforth keep strictly to working our contracted hours. Luckily for him, he had enough sense to shut up and never criticise any of us for arriving late again. – Mark Amery Mar 15 '19 at 15:37
  • @Graham The question has been updated; "Last time they stayed late at night, I gave them extra 2 days off." – R. Schmitz Mar 15 '19 at 17:14
  • 2
    "If they promise something, they should do it." -- It sounds like this guy is the only one who's actually living up to that, by not promising things he isn't going to do. If "do what you commit to" is the standard OP wants to impose, it's probably everybody else that needs the talking-to... – Tiercelet Mar 15 '19 at 20:23
  • 1
    While I agree with your answer, the fact bug prevents 90+% of users from logging into the system. On average, this happens once a month this year means no time is spent on testing critical functions. That time not allocated is managerial issue with "do this instead, it's more important". If that happened where I work/-ed more than once and I've warned them, I'm not staying late next time when we've not had time allowed to make sure it never happens again (e.g. unit and functional testing). – rkeet Mar 16 '19 at 09:32
  • 2
    @zarose that's called "priority inflation". You put a priority process in place: Normal (5 days) High(1 day) and critical (within the hour) and it starts getting abused. Once people find out, they're getting service faster, suddenly everything starts being critical. Where I work, we managed to put a process into place for showstoppers, where for every critical issue a team was formed to fix it, with a manager having the oversight. And to much annoyance of management I followed that process to the letter..."but I just want you to go fix it!" me: "but process". It reduced critical issues by 90%. – Pieter B Mar 16 '19 at 17:52
  • I think OP meant that OP reached out to his boss to have his boss kick subordinate's butt... and while he was doing that, subordinate departed the premises immediately. This is not the first time 'round with this sort of thing, so OP reasonably guessed subordinate's motive for leaving was to avoid the butt-kicking. Hence, "sneaking off". – Harper - Reinstate Monica Mar 17 '19 at 18:17
  • @PieterB Oooooooo that's a pretty nifty process I hadn't considered before! Definitely tucking that one away in my back pocket. – zr00 Mar 18 '19 at 14:31
76

In my office we often quote the following:

“Poor planning on your part does not necessitate an emergency on mine.”

In my experience, developers are often motivated to help with a problem that appeared because of a mistake on their side or because of something unforeseen.

But all too often, issues arise that are not only unsurprising but predicted. Before you decide to give your developer an ultimatum and likely make him look for a new job, you should ask yourself the following:

  • Have you done enough to avoid "critical" bugs in the first place? Did you give developers enough time to implement testing, code reviews, refactorings and monitoring?

  • Are you making sure that new features get activated when there is enough time to fix them? (as opposed to late in the evening or on a Friday).

  • If critical bugs are common, are you paying enough for overtime or on-call duty?

  • Do the developers you want to show ownership actually "own" the release process? Would they be able to stop a feature release if they thought it was buggy?

  • Are your deadlines realistic and agreed on with the dev team?

If all of the questions can be answered with a clear "yes", then you might have to let go of your senior developer.

If any of the answers is "No" or "I am not sure", then I would start looking for the problem in management and fix these problems first.

Helena
  • 2
    I have listed the things I have done to prevent critical bugs, but obviously it's not enough. Of course I give them enough time, because they tell me when they think they can finish it, and this time includes writing tests and code review. On top of that, I add 30-40% buffer time plus another 2 weeks for testing. Critical bugs weren't a common thing until lately, when we had them twice in three months. And yes, the team owns the release process through CI/CD. I believe the deadline is realistic because everyone agrees with it from the start to the end of the sprint (I check in every time we have a standup meeting) – Code Project Mar 15 '19 at 03:24
  • "Have you done enough to avoid "critical" bugs in the first place? Did you give developers enough time to implement testing, code reviews, refactorings and monitoring?"

    I strongly agree with this point. There are processes that ensure good-quality software. If a manager prevents his/her engineers from completing these processes, then crises that occur as a result are the manager's problem, not the engineers'.

    – user2818782 Mar 15 '19 at 06:36
  • 10
    Developer who goes "not my bug, I don't care how much the company loses because of it" is... a bad person to have on a project, no matter how technically good developer they are. Any developer who doesn't care should just find a new job where they would care and/or where emergencies don't really happen and/or the company does better job at avoiding emergencies. Staying but not caring is not good for anybody. – hyde Mar 15 '19 at 10:12
  • 13
    @hyde: On the flip side, quite some companies excel in creating such developers. And the company in the question sounds like one. You can cycle the people, but that won't solve the problem. You end up creating just more cynics. – MSalters Mar 15 '19 at 11:08
  • 24
    @CodeProject "And yes, the team own the release process through ci/cd" - that's... not what Helena means. You're talking about the team controlling the mechanism by which code gets deployed. Helena is talking about the team owning the decision about whether to release - about them being able to decide whether it's advisable to release a feature as it stands, and decide not to (and to let a deadline slip) if they think it's not ready. Your comment - which focuses most on defending the deadlines you impose on them - suggests to me that they do not in fact have such ownership. – Mark Amery Mar 15 '19 at 15:28
  • TL;DR: "Mirror. Here. Use it". – user13655 Mar 17 '19 at 18:18
47

You claim a lack of ownership by the team. Everything your developers build is owned by the company, not by them. When you say that your employees should "own" the results of their work, does it also mean that they will receive the profits that those results make for the company? If it doesn't mean that, they don't truly own the work, and you cannot ask for ownership from them.

If there is a critical bug, they must fix it even if they have to stay late.

Your solution to fixing critical problems by making your people stay late is convenient for the company and the employees pay the price. Again, that would be OK if they also get a share of the profits. Do they?

In this particular example, the bug prevents 90+% of users from logging into the system. On average, this has happened once a month this year, while it happened twice in all of last year.

When this happens so often and you don't install organizational procedures to reduce the impact of those errors, it is you as an organization that is at fault.

Actually, your current approach to fixing "critical" problems and your contemplation of firing your employee could be considered signs of a dysfunctional organization. Your employee's behavior might be his way of reacting to that. Your update to the original question with a list of what you think you are doing right (as opposed to thinking about what you might be doing wrong) also shows that you might have an issue accepting that you as a manager are part of the problem.

There are a lot of things management can do to improve quality and reduce urgency before you ask employees to stay late:

  1. No matter how much you think you focus on quality, the results show that you don't. You have to seriously improve the quality of your development process, which could mean measures like reviews, inspections, pair programming, increased testing, redesign of critical components, improved architecture and design, etc. You had better start analyzing the organizational issues that cause those problems instead of writing down the list of measures you have already implemented; obviously, they are not working.
  2. Why does your employee have to stay late to fix the error? Can you do your releases in the early morning to give your developers the entire working day to fix issues?
  3. Have you thought about using feature toggles or other measures to quickly revert to the previous version of the feature to give your team time to fix the problem? (A minimal toggle sketch follows this list.)
  4. You cannot blame your employees for having plans for the evening when issues pop up on short notice. You can set up a system of stand-by duty on days of critical releases. Then people know beforehand that they might have to stay late and can prepare accordingly.
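
As a rough illustration of point 3, here is a minimal feature-toggle sketch (the flag name, the file-based config, and the login functions are assumptions made for the example, not details from the question). The idea is that a risky feature can be switched off without redeploying anything:

```python
# Minimal feature-toggle sketch. In practice the flags would come from a
# remote config service so they can be flipped at runtime without a release.
import json
import os

def load_flags(path="feature_flags.json"):
    """Read toggles from a config file; a missing file means all defaults."""
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)

FLAGS = load_flags()

def is_enabled(flag, default=False):
    return bool(FLAGS.get(flag, default))

# Hypothetical stand-ins for the stable and the new code path.
def legacy_login(user, password):
    return f"{user}: logged in via the proven path"

def new_login_flow(user, password):
    return f"{user}: logged in via the new path"

def login(user, password):
    # The risky new path is guarded; turning "new_login_flow" off in the
    # config routes everyone back to the proven implementation.
    if is_enabled("new_login_flow"):
        return new_login_flow(user, password)
    return legacy_login(user, password)

print(login("alice", "not-a-real-password"))
```

Whether you prefer toggles or redeploying the previous build (as debated in the comments below), the point is the same: have a rehearsed way to get users back to a working state in minutes, not hours.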
Sefe
  • 6
    The third point on this is very standard for critical functions on all of our workplace programs. +1 – IT Alex Mar 15 '19 at 14:23
  • Please do not do point 3. Instead containerize your application. Deploy a new version when ready. If buggy, you can instantly re-deploy the previous version. Feature toggles are a great way to have useless code floating around because it's never used / "turned on". +1 for the remainder of the answer :) – rkeet Mar 16 '19 at 09:45
  • 3
    @rkeet: totally disagree. it's extremely important to be able to toggle on/off specific features at runtime without having to redeploy anything. And this has absolutely nothing to do with having your applications containerized or not. I don't want to have to involve thirdparties / release managers / platform supporters just to disable / enable a simple feature that's causing havoc if I can avoid it. – devoured elysium Mar 16 '19 at 16:43
  • @devouredelysium If a feature needs to be turned off because it's causing havoc, you have broken production. If you have broken production, that entire build is not ready. If a build is not ready, it should not be in production. Assuming the previous build works, redeploy that and fix your broken production. Containerized deployment takes like 2 seconds anyhow. If your chose third parties / rm's / platform supporters are as unreliable as it comes across as you think they are, you need to re-evaluate your choice in software. – rkeet Mar 16 '19 at 17:42
  • 1
    @rkeet - your points make sense in a server-side environment, but as has gradually come out in the comments, the question is actually about a mobile application - which is a situation where on the iOS side updates have to go through app store approval latency and on the Android side there is an untestable variety of platforms implementations, and on both where the installation of updates is subject to manual end-user approval. Feature toggles or even "this version is broken, you must update it to get to any other screen" server responses are clever attempts to deal with these realities. – Chris Stratton Mar 16 '19 at 18:13
  • 1
    @rkeet: so by your reasoning, if your right arm hurts, it must be because your whole body is damaged? If feature A is problematic (and it can even be because another system is malfunctioning and it's preferable just for safety to also disable this functionality) then you shut off feature A, you don't roll back a whole sprint worth of functionality. – devoured elysium Mar 16 '19 at 20:17
  • @ChrisStratton Hadn't read anywhere it concerned app stores / mobile apps. Then, and only then, I suppose it could make sense. Better to make sure it never reaches production. – rkeet Mar 16 '19 at 20:48
  • @devouredelysium biology doesn't come into play here. But playing on that, yes, if your arm hurts, it is the body that is damaged, is your arm not part of your body? Weird analogy. Whatever. And yes, I would roll back a whole sprint of functionality. Not throwing it away, just taking it down to be fixed. (In your analogy, you'd apparently rip off the arm to fix in a hospital, or put a tournequette on it till it's fixed...). In fact, I recently rolled back a release with 3 weeks worth of work. It causes errors (though hotfixable) it would've caused overtime. No need, roll back, tomorrow new day – rkeet Mar 16 '19 at 20:52
  • @rkeet it's a little hard to have any customers if your mobile app never "reaches production" so the mechanisms proposed are mostly definitely meant to. Arguably, the asker might consider going further and making their app essentially a viewer for dynamic content downloaded from a server and at most cached; then they actually can roll back changes to much of what matters easily. But that can be mathematically demonstrated to be the extreme of making the entire thing a collection of fine grained toggles. – Chris Stratton Mar 16 '19 at 20:55
  • 1
    @rkeet: and are customers (and other service's owners) going to be willing to have your old version running for hours (or even days) until you find the problem, fix it and properly test it? What if then it's not really fixed and you have to roll it back yet again? lol. So your plan is to have the whole office looking after you instead of properly isolating the problem (shutting it off) and calmly fix it while keeping everything else running smoothly as always? Good luck. – devoured elysium Mar 16 '19 at 21:00
  • @devouredelysium I suppose you don't mind staring at blank, not loading, pages. Or "500 Internal Server Error". Or "Incorrect login, please try again" or whatever else. I prefer a working version. If a feature is broken, yes, take the bugger offline and fix it. Don't let me, user, be troubled by your failings. – rkeet Mar 16 '19 at 21:03
  • 1
    @rkeet: if the feature is broken, as the OP original stated, you disable the whole feature. The user won't see it. You don't leave it broken. That's precisely the idea of having a killswitch. – devoured elysium Mar 16 '19 at 21:04
  • 1
    @devouredelysium You do realize that if you'd deploy a previous version, the whole broken thing is not in it right? You then fix it without bothering your end-users who judge you on their experience. When you've fixed it: re-deploy. – rkeet Mar 16 '19 at 21:19
  • @rkeet: I have written feature toggles or other measures to quickly revert to the previous version. My main point is to make it easy to revert. How to do that should be the OP's decision, but he should definitely find a solution. – Sefe Mar 16 '19 at 21:28
  • @rkeet: and you leave all the features of all the other teams, possibly even with other minor bug fixes included, grinding to a halt? lol. I won't continue this discussion. Absolutely no one in the business does anything remotely similar to what you're describing (as the post we're commenting on testifies) if they can avoid it. – devoured elysium Mar 16 '19 at 23:03
  • @rkeet your solution to this problem requires specific project and ci/cd setup. What if db migration is not safe to be rolled back? What if other already deployed services depend on some other properly created new feature? What if new build fixed some critical bug? Etc, etc... – Askar Kalykov Mar 17 '19 at 17:10
40

Working in software this is very common.

Treat your people as professionals. You're talking about ownership but then demanding that a 'critical' bug must be fixed NOW.

Is the bug actually 'critical'? Is it the result of unclear requirements? Our old friend 'scope-creep'?

In each of these you (as the manager) need to manage expectations. Not every bug is 'critical'. Requirements can suck. Project scope changes.

Instead of demanding they drop everything for something 'critical', work with your teams to determine when it will be fixed. Then hold them to this estimate.

I've been putting 'critical' in quotes because after 30+ years in this field (yikes, I'm old) this term is badly misused. Not everything can be 'critical'.

JazzmanJim
  • 30
    Holding people to their estimate is pointless - it's called an estimate because they don't actually know when it'll be fixed. – Erik Mar 14 '19 at 19:47
  • @Erik In that case they should know exactly what went wrong to cause their estimate to be off. – JMac Mar 14 '19 at 20:11
  • 26
    @JMac when it's solved they will know what was wrong and where the time went, if you want to have a retrospective. But until it's solved they can only tell you what the time has gone to trying (or what other obligations have gotten in the way), and maybe their current hunch for what to check / try next. Some discussion along the way can be productive and insight can come even from the act of conversation; but there's a point where discussion and reporting itself become a self-defeating source of delay. – Chris Stratton Mar 14 '19 at 20:20
  • @ChrisStratton It doesn't really make holding people to their estimate "pointless" though. I'm not saying they need to take time away from the solution to articulate where their time went; but when someone provides an estimate they should be held to it, to a reasonable extent. If holding people to their estimates is pointless, so is getting the estimate. The fact is, estimates are useful and commonly employed, often required to organize work in a way to meet deadlines. I was mostly getting at the point that you should be able to support your estimate, and changes to it. It's not a guess. – JMac Mar 14 '19 at 20:47
  • 6
    Indeed, estimates on complex problems usually aren't meaningful, and everyone with any sense knows that. At best for something large with the right guess multiplying factor you may come out approximate on average, but the specific time sinks are rarely those expected, so it's really just a test of one's pessimism skill. – Chris Stratton Mar 14 '19 at 21:01
  • 1
    @ChrisStratton I would argue that there is a world of difference between "often wrong" and "aren't meaningful". No matter how incorrect your estimations are, there should still be meaning behind them, and any estimate you give will definitely be interpreted with meaning. The person giving the estimate isn't solely responsible that the estimate is right, the system is far too complex for that, but they should definitely be held to it to an extent, or else project planning becomes essentially useless. It's not like "we'll finish when we finish" is acceptable to most clients. – JMac Mar 14 '19 at 21:11
  • 6
    In that case you want to have people make an honest estimate, then multiply that by 5, and then tell you that amount of time, just to make sure that it is likely that the issue will be fixed within that timeframe. That's no longer an estimate but safe expectation management. –  Mar 15 '19 at 08:11
  • 2
    Here is nice talk about estimates: https://www.youtube.com/watch?v=QVBlnCTu9Ms (I don't agree in every aspect with everything, but it makes a few good points). Hint, the title is "#NoEstimates" – Frank Hopkins Mar 15 '19 at 12:21
  • 2
    "yikes I'm old" - That means your input is very valuable. – DxTx Mar 15 '19 at 16:18
  • @DT While I agree with JimmyB, that's not necessarily true. I've met a fair few older developers, managers and bosses by now. Quite a few them might've been around the block, that doesn't mean they learned from it ;) (though JimmyB seems to have picked up on things ;) ) – rkeet Mar 16 '19 at 09:40
  • @Erik I would argue that holding people to estimates is reasonable because estimates should always be overestimates. If you estimate 2 weeks, and deliver at one, no one is going to complain that they've a week to do extra testing or that you've a week to spend on some other project. Project planning completely depends on this fact; and if you want a project manager to be your friend, don't deliver late. If you deliver late on the other hand, you don't know what impact on the rest of the project you're going to have - which can result in an urgent issue. – UKMonkey Mar 16 '19 at 11:11
  • @UKMonkey once a manager told us all estimates have to be accurate and we absolutely have to finish within that timeframe. From that day on until he left the company about two months later, all my estimates were "two years". I managed to finish all tasks within that time frame, which gives me 100% correctness of the estimates. For some strange reason, as far as I know, none of my estimates were communicated to the customer. I wonder why. – Josef Mar 16 '19 at 14:32
  • @Josef lucky for you they didn't. Professional services engineers are expected to make 3-5x their salary from clients; and not doing so puts their salary at risk. The usual way of estimating is "how long you think it'll take x 3" – UKMonkey Mar 16 '19 at 14:59
  • @UKMonkey but if someone asks me to give a "estimate" that is the absolute maximum, I will give that. If you estimate three times what you think how long it takes there will be tasks where it takes longer than that estimate. – Josef Mar 17 '19 at 22:01
  • @Josef you've never been in the situation where the number you've given has been sent to the client and money charged based on that - you acknowledge this; but had it been, clients would've always rejected your number, and as an engineer failing to bring money to the company, your contract would've been terminated.

    An estimate is an estimate sure; but if you're picking stupid numbers then you're making no friends at best, and at worst you can be putting your job on the line.

    – UKMonkey Mar 17 '19 at 22:28
  • 1
    @UKMonkey those aren't estimates, those are pessimistic "done by" dates. If people need those, they need to tell me that, and I'll give them my normal estimate x 5. But if you ask for an estimate, I'll give you an honest estimate. If numbers need to go to people who fundamentally don't understand what an estimate is, the person I'm giving the estimate to needs to apply whatever nonsense they want to my numbers. I'd rather be honest with the people I'm working with. – Erik Mar 18 '19 at 07:39
  • @UKMonkey I am often in the situation that numbers I give are sent to clients. I often discuss numbers with clients in projects I manage. I never acknowledged anything else. If someone asks me for a worst case "done by" date, they get a worst case. – Josef Mar 18 '19 at 10:04
35

With the updated question, it is now clear that you are trying to fix the wrong problem. The senior engineer's behavior is a symptom of a fundamentally broken software development process and/or dysfunctional company.

If you have critical bugs getting into production every month, then you have at least one of the following problems:

  • Incompetent engineers
  • Unmaintainable code base
  • Inadequate testing

Given how much manpower you have at your disposal (20 engineers is a LOT of resources), it's likely a combination of all three.

My guess is that the senior engineer is fed up with the constant firefighting, and rightfully so.

You need to dig deeper and fix the underlying problems that are creating the need for people to continually work late. Convincing one engineer to work late more often is not going to help the big picture.

Now, what to do about it...

Step 1: Figure out why testing is not catching these critical bugs

The first thing you absolutely need to do is stop these critical bugs from ever reaching production. Every bug that reaches production is a failure in the testing process.

Go back over every critical bug that was discovered in production and determine exactly why it was not caught in testing. Add more automated test coverage, manual test coverage, or testing resources as necessary.
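
For example, here is a minimal sketch of an automated smoke test covering the two flows the question defines as critical (logging in and purchasing). The endpoints, credentials, and response fields are hypothetical placeholders, since the question never describes the actual stack; the point is only that a check this small, run against a staging build before every release, would catch the "nobody can log in" class of bug.

    # Hypothetical smoke test (pytest + requests); URLs, credentials and
    # response fields are placeholders, not the OP's real API.
    import requests

    BASE_URL = "https://staging.example.com"  # assumed staging environment

    def test_login_returns_a_session_token():
        resp = requests.post(
            f"{BASE_URL}/api/login",
            json={"username": "smoke-user", "password": "smoke-password"},
            timeout=10,
        )
        assert resp.status_code == 200, f"login failed: {resp.status_code}"
        assert "token" in resp.json(), "no session token in login response"

    def test_purchase_endpoint_is_reachable():
        # The second class of critical bug in the question: users can't purchase.
        resp = requests.get(f"{BASE_URL}/api/products", timeout=10)
        assert resp.status_code == 200

Run under pytest in CI, a failure here blocks the release candidate instead of becoming a production incident.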

Step 2: Determine the root cause of every critical bug

For every critical bug, find out:

  • Who created the bug
  • When the bug was created
  • Why the code was being modified
  • Where the bug was introduced in the code

By doing this analysis, you will discover some patterns. Maybe there are one or two developers who keep introducing these bugs. Perhaps there is one code module that is very difficult to modify without causing problems. Or it's possible that the code as a whole is very difficult to work with.
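
To make the pattern-finding step concrete, this can be as simple as keeping a small log of every critical incident and tallying it. The sketch below is only an illustration; the log entries are invented examples, not data from the question.

    # Tally a hand-maintained incident log by author, module and reason for
    # the change. All entries here are invented examples.
    from collections import Counter

    incidents = [
        {"author": "dev_a", "module": "auth",     "reason": "new feature"},
        {"author": "dev_b", "module": "auth",     "reason": "refactor"},
        {"author": "dev_a", "module": "checkout", "reason": "rushed bug fix"},
    ]

    for field in ("author", "module", "reason"):
        print(field, Counter(entry[field] for entry in incidents).most_common())

Even a few months of such a log usually makes the fragile module, or the rushed-fix pattern, obvious.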

17 of 26
  • 1,712
  • 13
  • 13
  • 22
    And STOP ADDING MORE BROKEN FEATURES until the critical bugs in the existing code are fixed! – shoover Mar 15 '19 at 15:33
  • 5
    Sometimes these issues aren't even "bugs" but rather a fundamentally wrong design that is incompatible with the environment in which the code runs; you can fix an endless stream of "bugs" that surface every time the code encounters previously unseen (but perfectly compliant) behavior of interacting systems or OS layers, or you can take the time to fix the underlying design mistakes. – Chris Stratton Mar 15 '19 at 15:57
  • 2
    @ChrisStratton Exactly, which is why it is absolutely necessary to understand the root cause of these critical bugs. – 17 of 26 Mar 15 '19 at 15:59
  • 2
    I very much like this answer, but I'd expect a SENIOR engineer to go to the manager to discuss the problem when he is "fed up" with something; to stop doing "firefighting" is not, IMHO, the response I want to see from a professional, experienced engineer. – Adriano Repetti Mar 15 '19 at 19:17
  • Every bug that reaches production is a failure in the testing process - Almost correct. "Testing" is part of development. With 20 developers, you should be employing unit / functional testing (preferably both). As such, testing is an integral part of the entire development process, not a stand-alone process. I would reword it as: Every bug that reaches production is a failure in the development process. Also, I think you're missing the "management wants features XYZ by ABC". Always a giant damper on proper work. Then you get the OP demanding you stay late... then you look for a new job. – rkeet Mar 16 '19 at 09:49
  • 6
    @AdrianoRepetti - who says they didn't? Developers usually get into this cynical phase when they learn that management in their company doesn't listen. – Davor Mar 16 '19 at 12:05
  • @Davor because OP didn't mention it (and probably he wouldn't even need to ask if someone did it). We're making way too many assumptions here, I'd love to keep them at the minimum whenever possible. Judging from what OP wrote all I can say is that process should be improved (it always can be) but they have a serious (technical) problem with this senior (whether or not OP is entitled to expect overtime work it really depends on the culture). – Adriano Repetti Mar 16 '19 at 14:22
  • I mean: code you wrote and had plenty of time to test... stopped 90% of users from using the service? S* happens, but as a developer I'd be ashamed to answer "let's see tomorrow", no matter what my reasons are (especially if I didn't speak out beforehand). – Adriano Repetti Mar 16 '19 at 14:27
  • 1
    @AdrianoRepetti maybe he has, and nothing happened. We're only seeing one side here – user87779 Mar 16 '19 at 16:04
  • @user definitely. That's why I was talking about assumptions. Literally everything or nothing might be true, so IMHO we should (mostly) stick to what we know because the OP wrote it. – Adriano Repetti Mar 16 '19 at 16:28
  • @AdrianoRepetti I would agree about that for the comments (and note that your comment on "expectations" IS based on a few assumptions). But the answers should go with different assumptions to broaden the help to people other than the OP as well. Fwiw, I would say the only things we KNOW are that there is a failure in testing and there is a failure in design. The fact is that the most important thing to correct now is testing and BROAD design practices, not one employee's (of 20) performance – user87779 Mar 16 '19 at 17:22
29

I want to make one additional point. Rushing out a bug fix often leads to technical debt. If your senior developer is questioning how it will be tested tonight then that is a good question that a senior developer should be asking! I’ve worked at places where urgency is prioritized over quality and this has had negative long term consequences. Ultimately, your team will have reduced capacity because it is always fighting fires.

  • 4
    Yes, and don't automatically assume that just because the other teams would work back and fix the bug that they're in the right on this. They may even feel the same way, but don't want to pick a fight with management over it. – Matthew Barber Mar 15 '19 at 00:52
  • The question about how we are going to test it was asked about two months ago, and we fixed it by having tests run against all commits and PRs. – Code Project Mar 15 '19 at 03:28
  • 1
    @CodeProject the sequence of events in your edit is actually a pretty good example of what this response is warning about. The tests you ran in late February did not catch this issue, so no, the testing concern was not actually fixed. Likely there are areas of your codebase (or its interaction with the underlying mobile OS or remote services) which are not yet properly understood, and as a result the bug fixes continue to contain unsafe assumptions that break in situations outside those you've thought to test for. Taking the time to really understand it will be needed; tests can't catch everything. – Chris Stratton Mar 15 '19 at 04:04
  • 13
    If you really want "ownership" you are likely going to have to be open to letting your senior people determine more of the agenda in order to include the things they need to do to really get a handle on the underlying issues. In contrast, if you dictate the goals to the degree where they can only address symptoms then in actuality you have taken ownership in a way that precludes them from having an opportunity to do so. – Chris Stratton Mar 15 '19 at 04:36
  • 1
    @ChrisStratton I agree with the part that we need to take time to understand our code base. For the second comment, we actually had 2 sprints where they decided what they wanted to work on, which was refactoring code, fixing bugs, and adding more tests. Any thoughts on how to tackle this issue? – Code Project Mar 15 '19 at 07:05
  • Let's admit that the process can be definitely improved. Let's also be honest and admit that there is the remote possibility that the team is not cough cough competent enough. Cough. – Adriano Repetti Mar 15 '19 at 19:12
  • 3
    Rushing a fix can also have catastrophic short-term consequences. – gnasher729 Mar 15 '19 at 21:14
23

It sounds like you have a huge testing problem. You ask why everyone does not drop all outside commitments to put out a fire, but the real question is: why are fires starting every month?

Do you have any QA/testing? Why did they not find that the first and most basic step (logging in) does not work? How did something that does not work at all get pushed to production?

Also, why is your response to users not being able to log in to get everyone to stay late rushing "critical" fixes, instead of having a system admin revert the update so that it can be attempted again later, after the issues have been fixed?
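
If the backend is something the team deploys itself, the revert can be close to a one-liner. The sketch below assumes a Kubernetes-style deployment purely for illustration (the question never states the stack, and as the OP notes in a later comment, a mobile client already shipped through an app store cannot be rolled back this way).

    # Generic rollback sketch, assuming a Kubernetes-style backend deployment.
    # The deployment name is hypothetical; "kubectl rollout undo" reverts a
    # deployment to its previous revision.
    import subprocess

    def rollback(deployment: str) -> None:
        subprocess.run(
            ["kubectl", "rollout", "undo", f"deployment/{deployment}"],
            check=True,
        )

    if __name__ == "__main__":
        rollback("login-service")  # hypothetical deployment name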

"How are we going to test that tonight?" This is the correct response. When there is a critical issue and you are being pressured to fix it right now how will developers set aside time to properly review the changes are correct/high quality and how is QA meant to check that everything else is still working after the change. It sounds like you are also asking for these changes at the end of the day where everyone is tired and their thinking ability is at its lowest making it even more likely other issues will sneak in to this critical fix.

Qwertie
  • 337
  • 1
  • 7
  • 3
    The question and its updates make it clear that there is testing and that the testing was critical to the release decision - but also that the testing that there is, is not catching the flaws. Some modern environments have enough distinct moving parts that expecting tests to catch everything is naive, since unsafe code can work or break depending on conditions outside of the test environment. What that points to is a system with aspects that no person on the team fully understands. – Chris Stratton Mar 15 '19 at 05:15
  • 5
    @ChrisStratton If your testers are unable to actually test things then I would suggest that that is a problem itself. That also does not explain why there is no process in place to simply roll back changes that went bad. The developers likely have no control over the testing and ops procedures and are sick of constantly having to deal with failures in the process. – Qwertie Mar 15 '19 at 05:18
  • 5
    No, neither testers nor automated test harnesses can cover every eventuality. They have their role, but the set of possible interactions is larger than you can enumerate and easily involves more physical variety than you can either reasonably purchase or simulate to include in your test coverage. Patching what broke the tests last time is not enough - it has to be right by design, and not have fundamentally unsafe parts that just happen to work in all the tests tried. Software deployed to customers is not necessarily as easy to just roll back as something server side. – Chris Stratton Mar 15 '19 at 05:30
  • 1
    +1 this is the answer I was composing in my head – Phil Mar 15 '19 at 12:07
  • 12
    @ChrisStratton Both of those are true, but they should be known and accounted for. If you are unable to get sufficient test coverage, and you are not able to roll back the deployment, then you should arrange in advance for a developer to be on hand to cope with any issues that may arise. I react much better to "Can you work late next Friday to cover any issues arising from the release?" than to "It's all gone to hell again, cancel your plans". – Phil Mar 15 '19 at 12:11
  • 2
    @ChrisStratton reading OP's further comments it sounds like their idea of testing is limited to having the developers write unit tests. – Aaron F Mar 15 '19 at 17:59
  • @Aaron untrue, they described testing on multiple phone models. But neither will protect against fundamental misunderstanding of the mobile OS. – Chris Stratton Mar 15 '19 at 18:07
  • 1
    @ChrisStratton this is the comment I was referring to. It sounds to me like there are no testers and not very much testing going on. They need a QA team, not having the same developers who wrote the code test it as well. – Aaron F Mar 15 '19 at 18:36
  • @user87779 - not as much as you would think - when a mobile app sits between an often widely misunderstood mobile OS and remote services, the scope of interaction easily exceeds achievable test coverage. You might note that the asker has described a situation which passed tests but was still buggy in a way that only showed up after release. That's quite common - fundamentally wrong code seems to work until it is exposed to just the right set of circumstances to expose the flaw. Or some particular mobile device actually mis-implements an API - that is less common, but it does happen. – Chris Stratton Mar 16 '19 at 17:15
  • @user87779 - I see plenty of both. But then, I'm often engaged specifically to solve these kinds of problems - ie, others' code with unsafe assumptions that were not triggered in the original tests, or situations where the platform actually deviates from its specification. – Chris Stratton Mar 16 '19 at 17:22
10

When are people most productive? When is the team most able to handle critical bugs? There have been studies answering these questions about when humans are best able to handle certain tasks.

You have a critical bug, and you want the senior a) to switch mental gears, b) to pick up a new "critical" task, and c) to work "till whenever" to fix it. And you expect this critical patch to work? Honestly, what do you expect for the product, the team, and the team members if your wants were satisfied?

Let go of your ego, and your irrational beliefs.

paulj
  • 1,298
  • 8
  • 13
  • 3
    So it's late in the day and you find a bug where 90% of users can't log into PROD... you are saying that if studies show that your best work is done at 10 in the morning, you should wait until then to work on this? That doesn't sound silly to you? – JeffC Mar 15 '19 at 16:50
  • 4
    @JeffC If 90% of users cannot log in, why was this not picked up during testing before rollout? That is not a bug, that is a system error. Where is the dedicated support team out of the four teams mentioned above, even if rotating? – paulj Mar 15 '19 at 17:24
  • I don't disagree with your comment... but I do disagree with your answer. Either way, the point is when the PROD issue is found is the time to fix it, not to wait until the optimal, most productive time for the employees. – JeffC Mar 16 '19 at 00:39
  • 2
    @JeffC I would say it's time to roll back from the experimental branch they should be deploying this code on. Then methodically fix the issue without rushing and possibly introducing other bugs – user87779 Mar 16 '19 at 17:24
10

The term you are looking for is Discretionary Effort not Ownership.

I am assuming that your employees are meeting their contractual obligations (otherwise your course of action is clear).

You have no right to expect discretionary effort. That is what it is, by definition. Fundamentally, this is not something that you can speak to them about and expect a change; you are likely to get the opposite response. They are under no obligation to give it. Threats about firing them are likely to get an overwhelmingly poor response, as well as being illegal.

I don't have any good suggestions on how you can improve things. The very fact that you can rely on Discretionary Effort by some of your people suggests to me the culture is not necessarily broken.

Fixing this will take time, so instead, I can offer stop-gap measures:

Fix the bus-factor of 1

Why can only a single employee resolve this issue?

Have an on-call roster

With reimbursement agreed upon with the individual employees, not what you think it is worth.

Roll out updates at better times

It may not be possible, but rolling things out at better times can increase the chance for someone to assist.

The worth of your software is a function of how well it is supported, so you shouldn't use Discretionary Effort as a crutch. If you want your software to be supported to a level, you need to ensure you have things in place to ensure it.

Gregory Currie
  • 59,575
  • 27
  • 157
  • 224
  • You missed the part where the OP said the senior told the team to go home. Bus-factor > 1. Also, I'm curious about this whole "discretionary effort" thing. Whatever term you choose to use, in the US, engineers are professionals, meaning they essentially get paid to do a job, not to work for X hours (they decide, or in most cases give up the right to decide, how long things should take). The engineers delivered a product that they greenlit, which turned out to be defective. If that is the case, I think an employer legally has the right to expect OT, even unpaid OT... – Mars Mar 15 '19 at 09:23
  • But if you have info that says otherwise, I'm all ears – Mars Mar 15 '19 at 09:24
  • 3
    I can't speak for the US, but in Australia "workers" fall into two categories, employees and contractors. Engineers may be employed as either. Most of the time, workers are employees. To the best of my knowledge, every single engineer (200+) I've worked with has been an employee. – Gregory Currie Mar 15 '19 at 09:33
  • 3
    Regarding unpaid overtime, in Australia, forcing an employee to work unpaid overtime, even if they made an error, would be considered illegal. – Gregory Currie Mar 15 '19 at 09:34
  • Interesting. In the US, software engineers have no right to overtime pay (although I think it's at least common practice for an overtime pay clause to be in their contract) – Mars Mar 15 '19 at 09:35
  • 1
    I think the two terms that separate them in the US are "contractual engineers" and "salaried engineers". You may very well be correct that most/all engineers are contractual engineers. (Also something that may be interesting is the legal definition of the term "engineer" - in Europe it has a specific, well-defined meaning - this may not be true for the USA, where perhaps "anyone" can be an engineer?) – Gregory Currie Mar 15 '19 at 09:38
  • Interesting. I've worked in the US and Japan, and I see a person who said they delivered X but actually delivered Y. I expect him to deliver X ASAP or accept that I will no longer employ them for not giving me what was promised. But, different cultures and laws apparently! – Mars Mar 15 '19 at 09:38
  • A "professional" (a subcategory of salaried employee) is exempt from overtime restrictions as they are considered knowledgeable enough to determine their own workload/pay balance – Mars Mar 15 '19 at 09:40
  • As far as I recall, engineer does not have a legal meaning in the US – Mars Mar 15 '19 at 09:42
  • 1
    The legal meaning of "engineer" varies by state in the US. Regarding software engineers, it's very common for us to be either contractors or employees in the US--and I've recently worked with an Australia-based contractor. (The contract involved still wouldn't permit "forcing" to work overtime.) – chrylis -cautiouslyoptimistic- Mar 15 '19 at 10:36
  • "Professional" has nothing to do with it. If you are salaried and make over a certain amount (~40k, I think, though those numbers are in flux based on the discretion of executive orders) then you are not legally entitled to paid overtime (though your company may still choose to have an overtime policy that includes extra pay, it isn't required by law). – David Rice Mar 15 '19 at 14:08
8

So, you expect your employees to give up their social and/or family lives at the drop of a hat in order to fix problems?

Are they really all that critical?

Managers always seem to think that everything is critical because saying no is hard. This is a strong potential reason why your lead dev is pushing back. They are trying to protect their boundaries because you won't. And they are trying to protect their team's boundaries because you won't.

If they truly are all that critical, then what is going wrong that allows these issues to happen?

If your product quality is that bad, then you need to move over and let your developers devise a plan to get the product back on track. Poor quality isn't just about bugs. Poor quality derails predictability. If you are consistently going off plan because your quality is this bad, then fix your quality. And you don't fix it by asking developers to do it in their personal time. If that is the expectation you set, then you are telling your developers the business does not care about quality and therefore does not value predictability. If you do not value predictability, then stop complaining.

If they truly are all critical, then why don't you plan an on-call rotation?

Not only does this protect employees' personal time and protect the business's needs, it also creates incentive for developers to fix the systemic problems that are causing them to fire fight so much. (maybe you need more or better tests, maybe you have broken legacy code, etc.)

Why don't you stay late and fix things?

You're complaining that somebody doesn't step up to work through the night to fix a problem. Why don't you work through the night to fix it? I think you'll find the same conclusions as your team lead.

Your behavior

You have threatened to fire your employees for not doing something which you yourself refuse to do. You are complaining this happens a lot, yet you have not planned for it with an on-call rotation or by repaying technical debt.

Reading your list of steps to plan a release, what stands out to me is the frequent use of "I told them to..." and the granularity of planning all the way down to function names. You plan out minor details that are easily changeable, but won't plan a support process for your product.

This is 100% your problem.

Your team

It sounds to me like you have a bunch of smart, honest professionals who know how to make good software, but their manager likes to dictate to them how to do their job and, when the manager's approach causes a problem, to force them to work more hours.

Have you stepped back and asked your team how to get fewer critical bugs? Have you asked your team how they think they should handle responsibility for unexpected critical issues?

Your team lead is right to push back on your expectations. And I'm glad to hear that he is encouraging his team to say no to things. He is trying to protect the team because you aren't.

In my time as a team lead, I can tell you that one of the hardest but most important lessons is learning how to say no. Maybe you can learn something from this employee of yours.

Brandon
  • 869
  • 6
  • 10
  • They are critical. These two incidents we have had were a) users can't log in and b) users can't purchase products. As I wrote in my question, only these two issues are classified as critical. – Code Project Mar 15 '19 at 16:08
  • What was going wrong was that we did not test on all devices as we agreed. This is a mobile app and the rule is to test on 6 devices * 4 OS versions (24 combinations in total). – Code Project Mar 15 '19 at 16:09
  • This is a mobile app, and an on-call rotation will not help as it's already released to the App Store. – Code Project Mar 15 '19 at 16:10