Friday, March 13, 2020

Team John, part 2

Yesterday's post ended a bit abruptly while I had a lot more to say on the subject of Team John. The reason for this is a silly one, but this is just a silly blog, so I'll do with it however I will, including using run-on sentences as I feel like it: I am trying to get back to using 750words.com. It's a great site that simply tracks the number of words as you write them to encourage you to write at least 750 words each day. I'm picking this back up, as I have been reading Stephen King's marvelous "On Writing". King encourages people that want to write to start with at least 1000 words a day. Being more of an underachiever, I'm starting by going back to 750 each day. I'll have more to say on Mr. King and his book hopefully in coming days and weeks. The 750 word goal also works to make sure that I don't spend an inordinate amount of time writing drivel and sometimes even forces me to stop when I have more to say.

So I have more to say on Team John. In the previous post I talked a little about how we implemented it with the current team. Other than what I think were good ideas, tying it to our oncall schedule and having two developers on in a rolling fashion, it probably wasn't exactly helpful in understanding much beyond the message that I think Team John is a good idea for teams. I did hit on the primary purpose: Team John shields the team from interruptions, and interruptions are very costly to teams. As an Agile, or more specifically a Lean, practice, the reason Team John works goes beyond just funneling interruptions away from the main development team. It goes to queuing theory.

Queuing theory is really the study of how queues or "waiting in lines" affect system throughput. As a study it was first used in telecommunications and then was adopted by Toyota in manufacturing and then came from there to software development through Lean teachings. As a practice it is as old as when people first began lining up to wait for something. It's helpful if you think about waiting in line at Wendy's or at the local grocery store. Will opening more registers help people waiting in line move faster? Yes, it will, but up to a point, right? I can open up four registers at Wendy's, but if I only have one person back on the grill, I can only move people as fast as burgers come off the grill. Or think about those videos you've seen of an automotive assembly line. The line moves along at a regular pace with each man or woman doing the job at their station. Highly repetitive, but also very precise - not only in the placement of each bolt but also in the time the entire task takes.
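The Wendy's example can be put into a few lines of Python. This is only a rough sketch: the station names and the customers-per-hour numbers are made up for illustration, and it assumes every order has to pass through every station.

```python
def throughput(stage_rates):
    """Customers served per hour, capped by the slowest station."""
    return min(stage_rates.values())


def bottleneck(stage_rates):
    """Name of the station that is holding the line back."""
    return min(stage_rates, key=stage_rates.get)


# Hypothetical capacities, in customers per hour (not real Wendy's numbers).
one_register = {"registers": 30, "grill": 40, "fries": 60}
four_registers = {"registers": 120, "grill": 40, "fries": 60}

for name, stations in [("one register", one_register),
                       ("four registers", four_registers)]:
    print(name, throughput(stations), bottleneck(stations))
```

With these made-up numbers, quadrupling the registers only moves the line from 30 to 40 customers an hour, because the grill takes over as the slowest station.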

Applying just those two ideas, it's easy to see that the entire process can only move as fast as its slowest component (we call that a bottleneck) and that a sustained pace is better than randomly throwing work at people. Like I said, these aren't new concepts but have been around at least as long as people have been waiting around for hamburgers. So how does Team John help the process?

First, when a Scrum team has planned out a sprint, they have planned an amount of work that is sustainable if they work at a regular pace throughout the sprint. Unplanned work coming to a development team means a) it will be of some volume that the team couldn't previously deduce and b) it will have some deadline that the team couldn't previously expect (presumably sometime within that sprint, however). There is a rhythm to a sprint, particularly when a team has a sprint goal and is doing a similar group of repetitive tasks towards that goal. Unplanned work throws that cadence off kilter.

Secondly, Team John helps teams deal with those bottlenecks. That guy on the grill can become a bottleneck. If you picture the auto assembly line again, it's easy to see that the line can only move as fast as the slowest person working it. The other thing to realize, particularly in software development but also in fast food service, is that you can never completely predict how a relatively short run on the system will unfold. One hour I might have everyone order singles with cheese, a Frosty, and a small fries. The next I may have people each ordering custom orders ("Hold the ketchup, hold the pickles!"), one of a variety of beverages, and all wanting large fries. What if I have one or two people just waiting, ready to place toppings or fill beverages or put down more fries as needed? I wouldn't need to always anticipate exactly where those bottlenecks are going to be.
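Here is a rough sketch of that floating-helper idea in Python. Everything in it is an assumption made for illustration: the station names, the "units of work per hour" numbers, and the simple rule that the floater goes to whichever station is furthest behind.

```python
def served_fraction(capacity, load):
    """Share of the hour's orders that get through, set by the worst station."""
    return min(min(1.0, capacity[s] / load[s]) for s in load)


def with_floater(capacity, load, floater_units):
    """Send one floater (worth floater_units of work) to the worst station."""
    worst = min(load, key=lambda s: capacity[s] / load[s])
    boosted = dict(capacity)
    boosted[worst] += floater_units
    return served_fraction(boosted, load)


# Hypothetical units of work each station can do per hour.
capacity = {"toppings": 50, "drinks": 50, "fries": 50}

# Hypothetical demand: an hour of plain singles vs. an hour of custom orders.
plain_hour = {"toppings": 40, "drinks": 40, "fries": 40}
custom_hour = {"toppings": 80, "drinks": 50, "fries": 60}

print(served_fraction(capacity, plain_hour))
print(served_fraction(capacity, custom_hour))
print(with_floater(capacity, custom_hour, floater_units=30))
```

The point of the sketch is that the floater is never assigned to a station ahead of time. The helper goes wherever the bottleneck turns out to be that hour, which is roughly what Team John does for a development team.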

I probably have more on the subject, in fact I know I have more, but I'm at almost 850 words now and need to leave some for tomorrow.

Thursday, March 12, 2020

Team John, part 1

I was introduced to Agile concepts around 2004 or 2005, primarily eXtreme Programming ideas from articles by Kent Beck and Ward Cunningham. Fairly shortly thereafter, 2007 or 2008, I attended my first Mile High Agile conference here in Denver. At either that first one, or possibly the next year (my memory is starting to go), I was introduced to the very energetic V. Lee Henson and a number of his really great ideas in implementing Agile including the "Team John" concept. Briefly, a software development Team John is an engineer who works separately from the rest of the development team on bugs, production incidents that arise, work that comes in sideways last minute ("Hey, can you pull some numbers from the database for me?"), or tech debt. The idea is that this developer works on all the other shit that isn't part of the team's sprint work that would otherwise lead to interruptions. The responsibility rotates to a new developer with a new week or sprint.

It seems counterintuitive that removing a developer from the mix would allow the rest of the team to go faster, but I think any development team that regularly needs to deal with incidents or gets hit with work sideways from outside the Scrum process will immediately see the value. Interruptions kill a developer's productivity. That loss in productivity is attributed to context switching: The very act of switching from one task to another results in a tremendous amount of lost productivity. Psychological research has shown that context switching can cost as much as 40% of a person's productive time, and the more complex the tasks, the greater the loss.

In his blog, Joel Spolsky, the creator of Trello and one of the founders of StackOverflow, puts it this way, "The trick here is that when you manage programmers, specifically, task switches take a really, really, really long time. That’s because programming is the kind of task where you have to keep a lot of things in your head at once. The more things you remember at once, the more productive you are at programming." Spolsky goes on to sum up, "In fact, the real lesson from all this is that you should never let people work on more than one thing at once."

The primary purpose of Team John is to shield the team from interruptions or at the very least funnel them all to a single person. With one of the teams I currently support we were getting a number of ad hoc requests coming into our sprints sideways, whether from our business partners ("Hey, can you get some numbers for us so that we can see how the latest campaign is working?"), or elsewhere within the technology department ("These three servers have critical updates and need to be patched by Friday"), or from within our own group ("This service is causing alerts. Does anyone even know what it does?"). Then you had production bugs and incidents which had to be addressed right away, and suddenly we would have three developers dropping what they were doing to look at a single bug. I brought up the idea of using the Team John concept - we already had an oncall rotation for who would handle incidents outside of business hours. We decided whoever was oncall would also handle the interruptions that arose day-to-day. They could work on regular sprint work if there were no interruptions, but at least each individual developer wasn't being pulled away almost hourly.

We quickly discovered that there was a lot of Team John work once we all started funneling it to a single person. Not enough work to keep that one developer busy the entire time (thankfully there weren't that many bugs and incidents), but enough that they really ended up contributing very little on the actual sprint. Meanwhile the rest of the team was able to accomplish more work than it had in previous sprints. I think there were six or seven developers on the team at the time. By taking one out of the mix to handle interruptions our velocity increased by roughly 15%.
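To get a feel for what that 15% means per developer, here is some back-of-the-envelope arithmetic in Python. It assumes the earlier velocity was spread evenly across everyone on the team and that the Team John developer contributed nothing to the sprint, both of which are simplifications. The function name is just something made up for this sketch.

```python
def per_developer_gain(team_size, velocity_gain=0.15):
    """Relative change in output per developer still doing sprint work.

    Assumes the old velocity was split evenly across team_size developers
    and the new, higher velocity comes from team_size - 1 developers.
    """
    remaining = team_size - 1
    return (1 + velocity_gain) * team_size / remaining - 1


for team_size in (6, 7):
    gain = per_developer_gain(team_size)
    print(f"{team_size} developers: about {round(gain * 100)}% more per developer")
```

Under those assumptions, each developer left on sprint work was getting roughly a third more done (about 38% for a team of six, about 34% for a team of seven), which is in the same neighborhood as the context-switching losses mentioned earlier.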

Rather than have Team John work on sprint work intermittently, we had the developer begin to pick up some of the technical debt we had accumulated and try to pay that down in between other interruptions. As we began doing that, some Team John work would carry over to the next week (our oncall schedule rotates weekly), leaving the next person in the rotation without much context for what was being worked on. After a while, and after adding a couple more developers, we decided that the primary and secondary oncall developers (the secondary was always the previous week's primary) would both be on Team John. This way as the primary rolled over to be secondary, they could bring the new primary up to speed on what was being worked on. With the addition of the second developer on Team John I don't believe the velocity increased, but it definitely did not decrease.