
Continuous deployment and continuous learning

At long last, some of the actual implementers of the advanced systems we built at IMVU for rapid deployment and rapid response are starting to write about it. I find these on-the-ground descriptions of the system and how it works so much more credible than just theory-type posts that I am excited to share them with you. I can personally attest that these guys know what they are talking about; I saw them do it first-hand. I will always be full of awe and gratitude for what they accomplished.
Continuous Deployment at IMVU: Doing the impossible fifty times a day by Timothy Fitz
Continuous Deployment isn’t just an abstract theory. At IMVU it’s a core part of our culture to ship. It’s also not a new technique here, we’ve been practicing continuous deployment for years; far longer than I’ve been a member of this startup.

It’s important to note that the system I’m about to explain evolved organically in response to new demands on the system and in response to post-mortems of failures. Nobody gets here overnight, but every step along the way has made us better developers.

The high level of our process is dead simple: Continuously integrate (commit early and often). On commit automatically run all tests. If the tests pass deploy to the cluster. If the deploy succeeds, repeat.

Our test suite takes nine minutes to run (distributed across 30-40 machines). Our code pushes take another six minutes. Since these two steps are pipelined, that means at peak we’re pushing a new revision of the code to the website every nine minutes. That’s 6 deploys an hour. Even at that pace we’re often batching multiple commits into a single test/push cycle. On average we deploy new code fifty times a day.
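To make that loop concrete, here's a rough sketch of the commit-test-deploy cycle in Python. The script names and the polling loop are my own stand-ins, not IMVU's actual pipeline, which was event-driven and distributed across a build cluster.

import subprocess
import time

def run(command):
    # Returns True only if the command exits cleanly.
    return subprocess.run(command, shell=True).returncode == 0

while True:
    if run("git pull --ff-only") and run("./run_all_tests.sh"):
        # Only reached when every test passed; a failed deploy stops the cycle here too.
        run("./deploy_to_cluster.sh")
    time.sleep(60)  # crude polling; the real system reacted to each commit as it arrived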
We call this process continuous deployment because it seemed to us like a natural extension of the continuous integration we were already doing. Our eventual conclusion was that there was no reason to have code that had passed the integration step but was not yet deployed. Every batch of software for which that is true is an opportunity for defects to creep in: maybe someone is changing the production environment in ways that are incompatible with code-in-progress; maybe someone in customer support is writing up a bug report about something that's just being fixed (or worse, the symptom is now changing); and no matter what else is happening, any problems that arise due to the code-in-progress require that the person who wrote it still remember how it works. The longer you wait to find out about the problem, the more likely it is to have fallen out of the human-memory cache.

Now, continuous deployment is not the only possible way to solve these kinds of problems. In another post I really enjoyed, Timothy explains five other non-solutions that seem like they will help, but really won't.
1. More manual testing.
This obviously doesn’t scale with complexity. This also literally can’t catch every problem, because your test sandboxes or test clusters will never be exactly like the production system.
2. More up-front planning
Up-front planning is like spices in a cooking recipe. I can’t tell you how much is too little and I can’t tell you how much is too much. But I will tell you not to have too little or too much, because those definitely ruin the food or product. The natural tendency of over planning is to concentrate on non-real issues. Now you’ll be making more stupid mistakes, but they’ll be for requirements that won’t ever matter.
3. More automated testing.
Automated testing is great. More automated testing is even better. No amount of automated testing ensures that a feature given to real humans will survive, because no automated tests are as brutal, random, malicious, ignorant or aggressive as the sum of all your users will be.
4. Code reviews and pairing
Great practices. They’ll increase code quality, prevent defects and educate your developers. While they can go a long way to mitigating defects, ultimately they’re limited by the fact that while two humans are better than one, they’re still both human. These techniques only catch the failures your organization as a whole already was capable of discovering.
5. Ship more infrequently
While this may decrease downtime (things break and you roll back), the cost on development time from work and rework will be large, and mistakes will continue to slip through. The natural tendency will be to ship even more infrequently, until you aren’t shipping at all. Then you’ve gone and forced yourself into a total rewrite. Which will also be doomed.
What all of these non-solutions have in common is that they treat only one aspect of the problem, but at the expense of another aspect. This is a common form of sub-optimization, where you gain efficiency in one of the sub-parts at the expense of the efficiency of the overall process. You can't make these global efficiency improvements until you get clear about the goal of your development process.

That leads to a seemingly-obvious question: what is progress in software development? It seems like it should be the amount of correctly-working code we've written. Heck, that's what it says right there in the agile manifesto. But, unfortunately, startups can't afford to adopt that standard. As I've argued elsewhere, my belief is that startups (and anyone else trying to find an unknown solution to an unknown problem) have to measure progress with validated learning about customers. In a lot of cases, that's just a fancy name for revenue or profit, but not always. Either way, we have to recognize that the biggest form of waste is building something that nobody wants, and continuous deployment is an optimization that tries to shorten this code-data-learning feedback loop.

Assuming you're with me so far, what will that mean in practice? Throwing out a lot of code. That's because as you get better at continuous deployment, you learn more and more about what works and what doesn't. If you're serious about learning, you'll continuously learn to prune the dead weight that doesn't work. That's not entirely without risk, which is a lesson we learned all-too-well at IMVU. Luckily, Chad Austin has recently weighed in with an excellent piece called 10 Pitfalls of Dirty Code.

IMVU was started with a particular philosophy: We don't know what customers will like, so let's rapidly build a lot of different stuff and throw away what doesn't work. This was an effective approach to discovering a business by using a sequence of product prototypes to get early customer feedback. The first version of the 3D IMVU client took about six months to build, and as the founders iterated towards a compelling user experience, the user base grew monthly thereafter.

This development philosophy created a culture around rapid prototyping of features, followed by testing them against large numbers of actual customers. If a feature worked, we'd keep it. If it didn't, we'd trash it.

It would be hard to argue against this product development strategy, in general. However, hindsight indicates we forgot to do something important when developing IMVU: When the product changed, we did not update the code to reflect the new product, leaving us with piles of dirty code.

So that you can learn from our mistakes, Chad has helpfully listed ten reasons why you want to manage this dirty-code (sometimes called "technical debt") problem proactively. If we could do it over again, I would have started a full continuous integration, deployment, and refactoring process from day one, complete with five why's for root cause analysis. But, to me anyway, one of the most inspiring parts of the IMVU story is that we didn't start with all these processes. We hadn't even heard of half of them. Slowly, painfully, incrementally, we were able to build them up over time (and without ever having a full-stop-let's-start-over timeout). If you read these pieces by the guys who were there, you'll get a visceral sense for just how painful it was.

But it worked. We made it. So can you.


Refactoring yourself out of business

Let me start out by saying I am a big fan of refactoring, the ongoing process of changing code so that it performs the same behaviors but has more elegant structure. It's an essential discipline of good software development, especially in startups. Nonetheless, I want to talk about a dysfunction I've seen in several startups: they are literally refactoring themselves to death.

Here's a company I met with recently. They have a product that has a moderate number of customers. It's well done, looks professional, and the customers that use it like it a lot. Still, they're not really hitting it out of the park, because their product isn't growing new users as fast as they'd like, and they aren't making any money from the current users. They asked for my advice, and we went through a number of recommendations that readers of this blog will already be able to guess: adding revenue opportunities, engagement loop optimization, and some immediate split-testing to figure out what's working and what's not. Most of all, I encouraged them to start talking to their most passionate customers and running some big experiments based on that feedback.

I thought we were having a successful conversation. Towards the end, I asked when they'd be able to make these changes, so that we could meet again and have data to look at together. I was told they weren't sure, because all of their engineers were currently busy refactoring. You see, the code is a giant mess, has bugs, isn't expandable, and is generally hard to modify without introducing collateral damage. In other words, it is dreaded legacy code. The engineering team has decided it's reached a breaking point, and is taking several weeks to bring it up to modern standards, including unit tests, getting started with continuous integration, and a new MVC architecture. Doesn't that sound good?

I asked, "how much money does the company have left?" And it was this answer that really floored me. They only have enough money to last another three months.

I have no doubt that the changes the team is currently working on are good, technically sound, and will deliver the benefits they've claimed. Still, I think it is a very bad idea to take a large chunk of time (weeks or months) to focus exclusively on refactoring. The fact that this time probably represents a third of the remaining life of the company (these projects inevitably slip) only makes the problem more acute.

The problem with this approach is that it effectively suspends the company's learning feedback loop for the entire duration of the refactoring. Even if the refactoring is successful, it means time invested in features that may prove irrelevant once the company starts learning again. Add to that the risk that the refactoring never completes (because it becomes a dreaded rewrite).

Nobody likes working with legacy code, but even the best engineers constantly add to the world's store of legacy code. Why don't they just learn to do it right the first time? Because, unless you are working in an extremely static environment, your product development team is learning and getting better all the time. This is especially true in startups; even modest improvements in our understanding of the customer lead to massive improvements in developer productivity, because we have a lot less waste of overproduction. On top of that, we have the normal productivity gains we get from: trying new approaches on our chosen platform to learn what works and what doesn't; investments in tools and learning how to use them; and the ever-increasing library of code we are able to reuse. That means that, looking back at code we wrote a year or two ago, even if we wrote it using all of our best practices from that time, we are likely to cringe. Everybody writes legacy code.

We're always going to have to live with legacy code. And yet it's always dangerous to engage in large refactoring projects. In my experience, the way to resolve this tension is to follow these Rules for Refactoring:
  • Insist on incremental improvements. When sitting in the midst of a huge pile of legacy code, it's easy to despair of your ability to make it better. I think this is why we naturally assume we need giant clean-up projects, even though at some level we admit they rarely work. My most important lesson in refactoring is that small changes, if applied continuously and with discipline, actually add up to huge improvements. It's a version of the law of compounding interest. Compounding is not a process that most people find intuitive, and that's as true in engineering as it is in finance, so it requires a lot of encouragement in the early days to stay the course. Stick to some kind of absolute-cost rule, like "no one is allowed to spend more time on the refactoring for a given feature than the feature itself, but also no one is allowed to spend zero time refactoring." That means you'll often have to do a refactoring that's less thorough than you'd like. If you follow the suggestions below, you'll be back to that bit of code soon enough (if it's important).

  • Only pay for benefits to customers. Once you start making lots of little refactorings, it can be tempting to do as many as possible, trying to accelerate the compounding with as much refactoring as you can. Resist the temptation. There's an infinite amount of improvement you can make to any piece of code, no matter how well written. And every day, your company is adding new code that could also be refactored as soon as it's created. In order to make progress that is meaningful to your business, you need to focus on the most critical pieces of code. To figure out which parts those are, you should only ever do refactoring to a piece of code that you are trying to change anyway.

    For example, let's say you have a subsystem that is buggy and hard to change. So you want to refactor it. Ask yourself how customers will benefit from having that refactoring done. If the answer is that they are complaining about bugs, then schedule time to fix the specific bugs that they are suffering from. Only allow yourself to do the refactorings that touch the areas of code causing the bugs you're fixing. If the problem is that the code is hard to change, wait until the next new feature that trips over that clunky code. Refactor then, but resist the urge to do more. At first, these refactorings will have the effect of making everything you do a little slower. But don't just pad all your estimates with "extra refactoring time" and make them all longer. Pretty soon, all these little refactorings actually cause you to work faster, because you are cleaning up the most-touched areas of your code base. It's the 80/20 rule at work.

  • Only make durable changes (under test coverage). There's no point refactoring code if it's just going to go back to the way it was before, or if it's going to break something else while you're doing it. The only way to make sure refactorings are actually making progress (as opposed to just making work) is to ensure they are durable. What I mean is that they are somehow protected from inadvertent damage in the future.

    The most common form of protection is good unit-test coverage with continuous integration, because that makes it almost impossible for someone to undo your good work without knowing about it right away. But there are other ways that are equally important. For example, if you're cleaning up an issue that only shows up in your production deployment, make sure you have sufficient alerting/monitoring so that an immediate alarm would go off if your fix became undone (see the monitoring sketch after this list). Similarly, if you have members of your team who are not on the same page as you about the right way to structure a certain module, it's pointless to just unilaterally "fix" it if they are going to "fix" it right back. Perhaps you need to hash out your differences and get some team-wide guidelines in place first?

  • Share what you learn. As you refactor, you get smarter. If that's not a team-wide phenomenon, then it's still a form of waste, because everyone has to learn every lesson before it starts paying dividends. Instead of waiting for that to happen, make sure there is a mechanism for sharing refactoring lessons with the rest of the team. Often, a simple mailing list or wiki is good enough, but judge based on the results. If you see the same mistakes being made over again, intervene.

  • Practice five whys. As always with process advice, I think it's essential that you do root cause analysis whenever it's not working. I won't recap the five-why's process here (you can read a previous post to find out more); the key idea is to refine all rules based on the actual problems you experience. Symptoms that deserve analysis include: refactorings that never complete, making incremental improvements but still feeling stuck in the mud, making the same mistakes over and over again, schedules slipping by increasing amounts, and, of course, month-long refactoring projects when you only have three months of cash burn left.
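The durability rule above mentions alerting as a way to pin a production fix in place. Here's a rough sketch, in Python, of the kind of check I mean; the URL, the two-second threshold, and the email addresses are made-up examples, so substitute whatever monitoring and notification channel your team already uses.

import smtplib
import time
import urllib.request
from email.message import EmailMessage

def checkout_is_fast(url="https://example.com/checkout", limit_seconds=2.0):
    # The refactoring fixed a slow code path; this check fails loudly if that ever regresses.
    start = time.time()
    urllib.request.urlopen(url, timeout=10).read()
    return (time.time() - start) <= limit_seconds

def alert(subject, body):
    message = EmailMessage()
    message["Subject"], message["From"], message["To"] = subject, "alerts@example.com", "team@example.com"
    message.set_content(body)
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(message)

if not checkout_is_fast():
    alert("Checkout latency regression", "The refactored checkout path is slow again.")

Run from cron every few minutes, a check like this makes the refactoring durable in the same sense a unit test does: nobody can quietly undo it.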
Back to my conversation with the company I met recently. Given their precarious situation, I really struggled with what advice to give. On the one hand, I think they have an urgent problem, and need to invest 100% of their energy into finding a business model (or another form of traction). On the other, they already have a team fully engaged on making their product architecture better. In my experience, it's very hard to be an effective advocate for "not refactoring" because you can come across as anti-quality or even anti-goodness. In any event, it's enormously disruptive to suddenly rearrange what people are working on, no matter how well-intentioned you are.

I did my best to help, offering some suggestions of ways they could incorporate a few of these ideas into their refactoring-in-progress. At a minimum, they could ensure their changes are durable, and they can always become a little more incremental. Most importantly, I encouraged the leaders of the company to bring awareness of the business challenges to the product development team. Necessity is the mother of invention, after all, and that gives me confidence that they will find answers in time.



Continuous integration step-by-step

Let's start with the basics: Martin Fowler's original article lays out the mechanics of how to set up a CI server and the essential rules to follow while doing it. In this post I want to talk about the nuts and bolts of how to integrate continuous integration into your team, and how to use it to create two important feedback loops.

First, a word about why continuous integration is so important. Integration risk is the term I use to describe the costs of having code sitting on some, but not all, developers' machines. It happens whenever you're writing code on your own machine, or you have a team working on a branch. It also happens whenever you have code that is checked-in, but not yet deployed anywhere. The reason it's a risk is that, until you integrate, you don't know if the code is going to work. Maybe two different developers made changes to the same underlying subsystem, but in incompatible ways. Maybe operations has changed the OS configuration in production in a way that is incompatible with some developer's change.

In many traditional software organizations, branches can be extremely long-lived, and integrations can take weeks or months. Here's how Fowler describes it:
I vividly remember one of my first sightings of a large software project. I was taking a summer internship at a large English electronics company. My manager, part of the QA group, gave me a tour of a site and we entered a huge depressing warehouse stacked full with cubes. I was told that this project had been in development for a couple of years and was currently integrating, and had been integrating for several months. My guide told me that nobody really knew how long it would take to finish integrating.
For those of you with some background in lean manufacturing, you may notice that integration risk sounds a lot like work-in-progress inventory. I think they are the same thing. Whenever you have code that is un-deployed or un-integrated, it's helpful to think of it as a huge stack of not-yet-installed parts in a widget factory. The more code, the bigger the pile. Continuous integration is a technique for reducing those piles of code.

Step 1: get a continuous integration server.
If you've never practiced CI before, let me describe what it looks like briefly. Whenever you check-in code to your source control repository, an automated server notices, and kicks off a complete "build and test" cycle. It runs all the automated tests you've written, and keeps track of the results. Generally, if all tests pass, it's happy (a green build) and if any tests fail, it will notify you by email. Most CI servers also maintain a waterfall display that shows a timeline of every past build. (To see what this looks like, take a look at the CI server BuildBot's own waterfall).
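If you're wondering what the setup actually looks like, here's a minimal master configuration sketched against BuildBot's modern buildbot.plugins API (which postdates the setup described in this post); the repository URL, worker credentials, and test command are placeholders you'd replace with your own.

# master.cfg
from buildbot.plugins import changes, schedulers, steps, util, worker

c = BuildmasterConfig = {}
c['workers'] = [worker.Worker("worker1", "password")]
c['protocols'] = {'pb': {'port': 9989}}

# Watch the repository and kick off a build on every commit.
c['change_source'] = [changes.GitPoller("https://example.com/repo.git",
                                        branches=["master"], pollInterval=60)]
c['schedulers'] = [schedulers.SingleBranchScheduler(
    name="on-commit",
    change_filter=util.ChangeFilter(branch="master"),
    treeStableTimer=None,
    builderNames=["all-tests"])]

# The build itself: check out the code, then run the entire automated test suite.
factory = util.BuildFactory([
    steps.Git(repourl="https://example.com/repo.git", mode="incremental"),
    steps.ShellCommand(command=["./run_all_tests.sh"]),  # any failing test turns the build red
])
c['builders'] = [util.BuilderConfig(name="all-tests",
                                    workernames=["worker1"],
                                    factory=factory)]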

Continuous integration works to reduce integration risk by encouraging all developers to check in early and often. Ideally, they'll do it every day or even multiple times per day. That's the first key feedback loop of continuous integration: each developer gets rapid feedback about the quality of their code. As they introduce more bugs, they have slower integrations, which signals to them (and others) that they need help. As they get better, they can go faster. In order for that to work, the CI process has to be seamless, fast, and reliable. As with many lean startup practices, it's getting started that's the hard part.

Step 2: start with just one test.
You may already have some unit or acceptance tests that get run occasionally. Don't use those, at least not right away. The reason is that if your tests are only being run by some people or in some situations, they probably are not very reliable. Starting with crappy tests will undermine the team's confidence in CI right from the start. Instead, I recommend you set up a CI server like BuildBot, and then have it run just a single test. Pick something extremely simple that you are convinced could never fail (unless there's a real problem). As you gain confidence, you can start to add in additional tests, and eventually make it part of your team-wide TDD practice.
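For concreteness, that single test can be as trivial as this (Python's unittest here, but any xUnit-style framework works the same way):

import unittest

class SmokeTest(unittest.TestCase):
    def test_the_world_still_makes_sense(self):
        # Can only fail if the checkout, test runner, or environment is broken --
        # which is exactly the kind of failure you want CI to surface first.
        self.assertEqual(1 + 1, 2)

if __name__ == "__main__":
    unittest.main()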

Step 3: integrate with your source control system.
Most of the times I've tried to introduce TDD, I've run into this problem: some people write and run tests religiously, while others tend to ignore them. That means that when a test fails, it's one of the testing evangelists who inevitably winds up investigating and fixing it - even if the problem was caused by a testing skeptic. That's counter-productive: the whole point of CI is to give each developer rapid feedback about the quality of their own work.

So, to solve that problem, add a commit hook to your source control system, with this simple rule: nobody can check in code while the build is red. This forces everyone to learn to pay attention to the waterfall display, and makes a failed test automatically a big deal for the whole team. At first, it can be frustrating, especially if there are any intermittent or unreliable tests in the system. But you already started with just one test, right?

The astute among you may have noticed that, since you can't check in when the build is red, you can't actually fix a failing test. There are two ways to modify the commit hook to solve that problem. The first, which we adopted at IMVU, was to allow any developer to add a structured phrase to their check-in comment that would override the commit hook (we used the very creative "fixing buildbot"). Because commits are mailed out to the whole team, anyone who was using this for nefarious purposes would be embarrassed. The alternative is to insist that the build be fixed on the CI server itself. In that case, you'd allow only the CI account to check in during a red build.
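A hook along these lines takes only a few lines of code. Here's a sketch in Python, written as a git-style commit-msg hook; the CI status URL and the override phrase are my own stand-ins for whatever your CI server and team conventions actually are.

import sys
import urllib.request

CI_STATUS_URL = "http://buildbot.example.com/last-build-status"  # hypothetical status endpoint
OVERRIDE_PHRASE = "fixing buildbot"

def build_is_green():
    with urllib.request.urlopen(CI_STATUS_URL, timeout=5) as response:
        return response.read().decode().strip() == "green"

def main():
    # git passes the path of the commit message file as the first argument.
    commit_message = open(sys.argv[1]).read() if len(sys.argv) > 1 else ""
    if OVERRIDE_PHRASE in commit_message.lower():
        return 0  # visible override: the whole team sees this phrase in the commit mail
    if build_is_green():
        return 0
    sys.stderr.write("Build is red. Fix it (or use the override phrase) before checking in.\n")
    return 1

if __name__ == "__main__":
    sys.exit(main())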

Either way, attaching consequences to the status of the build makes it easier to get everyone on the team to adopt it at once. Naturally, you should not just impose this rule from on high; you have to get the team to buy-in to trying it. Once it's in place, it provides an important natural feedback loop, slowing the team down when there are problems caused by integration risk. This provides the space necessary to get to the root cause of the problem. It becomes literally impossible for someone to ignore the failures and just keep on working as normal.

As you get more comfortable with continuous integration, you can take on more advanced tactics. For example, when tests fail, I encourage you to get into the habit of running a five whys root-cause analysis to take corrective action. And as the team grows, the clear-cut "no check-ins allowed" rule becomes too heavy-handed. At IMVU, we eventually built out a system that preserved the speed feedback, but had finer-grained effects on each person's productivity. Still, my experience working with startups has been that too much time spent talking about advanced topics can lead to inaction. So don't sweat the details - jump in and start experimenting.




The engineering manager's lament

I was inspired to write The product manager's lament while meeting with a startup struggling to figure out what had gone wrong with their product development process. When a process is not working, there's usually somebody who feels the pain most acutely, and in that company it was the product manager. Last week, I found myself in a similar situation, but this time talking to the engineering manager. I thought I'd share a little bit about his story.

This engineering manager is a smart guy, and very experienced. He has a good team, and they've shipped a working product to many customers. He's working harder than ever, and so is his team. Yet, they feel like they are falling further and further behind with every release. The more features they ship, the more things that can go wrong. Systems seem to randomly fail, and as soon as they are fixed, a new one falls over.

Even worse, when it comes time to "fix it right" the team gets pushback from the business leaders, who want more features. If engineers want more time to spend making their old code more pretty, they are invited to do so on the weekends.

Unfortunately, the weekends are already taken, because features that used to ship on a Friday now routinely cause collateral damage somewhere else on the platform, and so the team is having to work nights and weekends cleaning up after each launch.

Every few months, the situation comes to a head, where the engineers finally scream "enough!" and force the whole company to accept a rewrite of some key system. The idea is that once we move to the new system (or coding standard, or API, or ...) then all the problems will be solved. The current code is spaghetti, but the new code will be elegant. Sometimes the engineers win the argument, and sometimes they are overruled. But it doesn't seem to matter. The rewrite seldom works, and even when it does, a few months later they are back in the same dilemma, finding their now-old code falling apart. It's become "legacy code" and part of the problem.

It's tempting to blame the business leaders ("MBA-types") for this mess. And in a mess this big, there is certainly blame to go around. But they are pushing for the things that matter to customers - features. And they are cognizant that their funding is limited, and if they don't find out which features are absolutely critical for their customers soon, they won't be able to survive. So they are legitimately suspicious when, instead of working on adding monetization to their product, some engineer wants to take a few weeks off to go polish code that is supposed to be already done.

What's wrong with this picture? And why is the engineering manager suffering so badly? I can relate to his experience all too well. When I was working my first programming jobs, I was introduced to the following maxim: "time, quality, money - pick two." That was the watchword of our profession, and I was taught to treat with disdain those "suits" who were constantly asking for all three. We treated them like they had some kind of brain defect. If they wanted a high-quality product done fast, why didn't they want to pay for it? And if money was so tight, why were they surprised when we cut corners to get the product out fast? Or went past the deadline to get it done right?

I really believed this mantra, for a time. But it started to smell bad. In one company, we had a QA team as large as our engineering team, dozens of people who worked all day every day to find and report bugs in our product. And this was a huge product, which took years to develop. It was constantly slipping, because we had a hard time adding new features while trying to fix all the bugs that the QA team kept finding. And yet, it was incredibly expensive to have all these QA testers on staff, too. I couldn't see that we were managing to pick even one. Other, more veteran programmers told me they had seen this in many companies too. They just assumed it was the way software companies worked.

Suffice to say, I no longer believe this.

In teams that follow the "pick two" agenda, which two has to be resolved via a power play. In companies with a strong engineering culture, the engineers pick quality. It's their professional pride on the line, after all. So they insist on having the final say on when a feature is "done" enough to show to customers. Business people may want to speed things up by spending more money, but enough people have read the Mythical Man-Month to know that doesn't work.

In teams that have a business culture, the MBA's pick time. After all, our startup is on a fixed budget. They set deadlines, schedules, and launch plans, and expect the engineering team to do what it takes to hit them. If quality suffers, that's just the way it is. Or, if they care a lot about quality, they will replace anyone who ships without quality. Unfortunately, threats work a lot better at incentivizing people to CYA than getting them to write quality software.


A situation where one faction "wins" at another's expense is seldom conducive to business success. As I evolved my thinking, I started to frame the problem this way: How can we devise a product development process that allows the business leaders to take responsibility for the outcome by making conscious trade-offs?

When I first encountered agile software techniques, in the form of extreme programming, I thought I had found the answer. I explained it to people this way: agile lets you make the trade-offs visible to the whole company, so that they can make informed choices. How? By shipping software early, you give them continuous feedback about how well it's working. They can use the software themselves, since every iteration produces working (if incomplete) code. And if they want to invest in higher quality, they can. But, if they want to invest in more experiments (features), they can do that too. But in neither case should they be surprised by the result. Sound good?

It didn't work. The business leaders I've run this system with ran into the same traps as I had in previous jobs. I had just passed the burden on to them. But of course they didn't feel responsible for the outcome - that was my job. So I wound up with the worst of both worlds: handing the steering wheel over to someone else, but then still being blamed for the bad results.

Even worse, agile wasn't really helping me ship higher-quality software. We were using it to get features to market faster, and that was working well. But we were cutting corners in the development methodology as well as in the code, in the name of increased speed. And because we had to spend more and more time fixing things, we started slowing down, even as we tried to speed up. That's the same pain the engineering manager I met with was experiencing. As the situation deteriorates, he's got to work harder and harder just to keep the product from regressing.

It was my own failure to ship quality software in the early days of IMVU that really got me thinking about this problem in a new way. I now believe that the "pick two" concept is fundamentally flawed, and that lean startups can achieve all three simultaneously: quickly bring high-quality software to market at low cost. Here's why.

First of all, it's a myth that cutting corners saves time. When we ship software with defects, we wind up having to waste time later dealing with those defects. If each new feature contains a few recurring problems, then over time we'll become swamped with the overhead of fixing and won't be able to ship any new features.

So far, that sounds like just another argument for "doing things right" the first time, no matter how long they take. But that's problematic too. The biggest form of waste is building software nobody wants. The second biggest form of waste is fixing bugs in software nobody wants. If we defer fixing bugs in order to bring a product to market sooner, and this allows us to find out if we're on the right track or not, then it was worthwhile to ship with those bugs.

Here's how I've resolved the paradox in my own thinking. There are two kinds of bugs:
  • One kind are what I call defects: situations where the software doesn't behave in a predictable way. Examples: intermittently failing code, obfuscated code that's hard to use properly, code that's not under test coverage (and so you don't know what it does), bad failure handling, etc.

  • The second kind of bugs are the type your traditional QA tester will find: situations where the software doesn't do what the customer expects. For example, you click on a button labeled "Quit" and in response the software erases your hard drive. That's a bug. But if the software reliably erases your hard drive every time, that's not a defect.
The resolution to the paradox is to realize that only defects cause you future headaches, and cannot be deferred. That's why we need continuous integration and test-driven development. Whenever we add a feature or fix a bug, we need to make sure that code never goes bad, never mysteriously stops working. Those are the kinds of indefinite costs that make our team grind to a halt over time. Traditional bugs don't - you can choose to fix them or not, depending on what your team is trying to accomplish.

Defects are what make refactoring difficult. When you improve code, you always test it at least a little bit. If what you see is what will ultimately make it into production, it's pretty easy to make sure you did a good job. But code that is riddled with defects is a chameleon - one moment it works, the next it doesn't anymore. That leads to fear of refactoring, which leads to spaghetti code.

By shipping code without defects, the team actually speeds up over time. Why? Because we never have to revisit old code to see why it stopped working. We can add new team members, and they can get started right away (as an aside, new engineers at IMVU were always required to ship something to customers on their first or second day). And the whole team is getting smarter and learning, so our effectiveness increases. Plus, we get the benefit of code reuse, and all the great libraries and tools we've built. Every iteration, we get a little more done.

So how can I help the engineering manager in pain? Here's my diagnosis of his problem:
  • He has some automated tests, but his team doesn't have a continuous integration server or practice TDD. Hence, the tests tend to go stale, or are themselves intermittent.
  • No amount of fixing is making any difference, because the fixes aren't pinned in place by tests, so they get dwarfed by the new defects being introduced with new features. It's a treadmill situation - they have to run faster and faster just to stay at the level of quality/features they're at today.
  • The team can't get permission from the business leaders to get "extra time" for fixing. This is because the team is constantly telling them that features are done as soon as they can be seen in the product. Because there are no tests for new features (or operational alerts for the production code), the code that supports those new features could go bad at any moment. If the business leaders were told "this feature is done, but only for an indeterminate amount of time, after which it may stop working suddenly" they would not be so eager to move on to the next new feature.
Here's what I've counseled him to try:
  • Get started with continuous integration. Start with just one test, a good one that runs reliably, and make sure it gets run on every checkin.
  • Tie the continuous integration server in with source control, so that nobody can check in while the tests are failing.
  • Practice five why's to get to the root cause of future problems. Use those opportunities to add tests or alerts that would have prevented that problem. Make the investment proportional to the problem caused, so that everyone (even the business leaders) feels good about it.
My prediction is that these three practices will quickly change the feeling around the office, because the most important code will wind up under test quite soon (after all, it's the code that people care the most about, and so when it fails, the team notices and fixes it right away). With the most important code under test, the level of panic when something goes wrong will start to decrease (because it will tend to be in less important parts of the product). And as the tension goes down, it will be easier to get the whole team (including the MBA's) to embrace TDD and other good practices as further refinements.

Good luck, engineering manager. May your team, one day soon, refactor with pride.

A new version of the Joel Test (draft)

(This article is a draft - your comments are especially welcome as I think through these issues. Please leave feedback!)

I am convinced one of Joel Spolsky's lasting contributions to the field of managing software teams will turn out to be the Joel Test, a checklist of 12 essential practices that you could use to rate the effectiveness of a software product development team. He wrote it in 2000, and as far as I know has never updated it.

I have been thinking a lot about what a new version of this test would look like, given what I've seen work and not work in startups. Like many forms of progress, most of the items on the new test don't replace items from Joel's - they either supplement or extend the ideas on which the original is based.

Let's start with the original list:

  1. Do you use source control? This is still an essential practice, especially on the web. There was a time when "web content" was considered "not code" and therefore not routinely source controlled. But I have not seen that dysfunction in any of the startups I advise, so hopefully it's behind us. Joel mentions that "CVS is fine" and so is Subversion, its successor. I know plenty of people who prefer more advanced source control systems, but my belief is that many agile practices diminish the importance of advanced features like branching.
  2. Can you make a build in one step? You'd better. But if you want to practice rapid deployment, you need to be able to deploy that build in one step as well. If you want to do continuous deployment, you'd better be able to certify that build too, which brings us to...
  3. Do you make daily builds? Daily builds are giving way to true continuous integration, in which every checkin to the source control system is automatically run against the full battery of automated tests. At IMVU, our engineering team accumulated thousands upon thousands of tests, and we had a build cluster (using BuildBot) that ran them. We did our best to keep the runtime of the tests short (10 minutes was our target), and we always treated a failing test as a serious event (generally, you couldn't check in at all if a test was failing). For more on continuous deployment, see Just-in-time Scalability.
  4. Do you have a bug database? Joel's Painless Bug Tracking is still the gold standard.
  5. Do you fix bugs before writing code? Increasingly, we are paying better lip service to this idea. It's incredibly hard to do. Plus, as product development teams in lean startups become adept at learning-and-discovery (as opposed to just executing to spec), it's clear that some bugs shouldn't be fixed. See the discussion of defects later in this post for my thoughts on how to handle those.
  6. Do you have an up-to-date schedule? This, along with #7 "Do you have a spec?" are the parts of the Joel Test I think are most out-of-date. It's not that the idea behind them is wrong, but I think agile team-building practices make scheduling per se much less important. In many startup situations, ask yourself "Do I really need to accurately know when this project will be done?" When the answer is no, we can cancel all the effort that goes into building schedules and focus on making progress evident. Everyone will be able to see how much of the product is done vs undone, and see the finish line either coming closer or receding into the distance. When it's receding, we rescope. There are several ways to make progress evident - the Scrum team model is my current favorite.
  7. Do you have a spec? I think the new question needs to be "does the team have a clear objective?" If you have a true cross-functional team, empowered (a la Scrum) to do whatever it takes to succeed, it's likely they will converge on the result quickly. You can keep the team focused on customer-centric results, rather than conformance to spec. Now, all well-run teams have some form of spec that they use internally, and Joel's advice on how to craft that spec is still relevant. But increasingly we can move to a world where teams are chartered to accomplish results instead of tasked with creating work on spec.
  8. Do programmers have quiet working conditions? Joel is focused on the fact that in many environments, programmers are considered "just the hired help" akin to manual labor, and not treated properly. We always have to avoid that dysfunction - even the lean manufacturing greats realized that they couldn't afford to see their manual-labor workforce that way. I think we need to modify this question to "Do programmers have access to appropriate working conditions?" We want every knowledge worker to be able to retreat into a quiet haven whenever they need deep concentration. But it's not true that energized programmers primarily do solitary work; certainly that's not true of the great agile teams I've known. Instead, teams should have their own space, under their control, with the tools they need to do the job.
  9. Do you use the best tools money can buy? Joel said it: "Top notch development teams don't torture their programmers." Amen.
  10. Do you have testers? I think reality has changed here. To see why, take a look at Joel's Top Five (Wrong) Reasons You Don't Have Testers. Notice that none of those five reasons deals with TDD or automated testing, which have changed the game. Automated testing dramatically reduces the cost of certifying changes, because it removes all of the grunt work QA traditionally does in software. Imagine a world where your QA team never, ever worries about bug regressions. They just don't happen. All of their time is dedicated to finding novel reproduction paths for tricky issues. That's possible now, and it means that the historical ratio of QA to engineering is going to have to change (on the other hand, QA is now a lot more interesting of a job).
  11. Do new candidates write code during their interview? Completely necessary. I would add, though, a further question: Do new employees write code on their first day? At IMVU, our rule was that a new engineer needed to push code to production on their first day. Occasionally, it'd have to be their second day. But if it languished until the third day, something was seriously wrong. This is a test of many key practices: do you have a mentoring system? Is your build environment difficult to set up? Are you afraid someone might be able to break your product without your automated defenses knowing about it?
  12. Do you do hallway usability testing? I love Joel's approach to usability, and I still recommend his free online book on UI design. Some people interpret this to mean that you have to do your usability "right" the first time. I strongly disagree. Usability design is a highly iterative process, and the more customers who are involved (via in-person interview, split-test experiment, etc) the better.

Now let's take a look at some new questions:

Do you work in small batches? Just like in lean manufacturing, it's generally more efficient to drive down the batch size. I try to encourage engineers to check in anytime they have the software in a working state in their sandbox. This dramatically reduces the waste of integration risk. We rarely have code conflicts, since nobody gets out of sync for very long. And it's way easier to deploy small bits of code, since if something goes wrong, the problem is automatically localized and easy to revert.

Do you routinely split-test new features? I hope to write at a future date about how to build your application so that A/B tests are just as easy as not doing them.
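As a preview, the core of it is a helper that makes assigning a user to a variant a one-liner. This is an illustrative sketch, not IMVU's actual implementation; the function name and hashing scheme are invented.

import hashlib

def ab_variant(experiment, user_id, variants=("control", "treatment")):
    # Deterministic assignment: the same user always lands in the same bucket.
    digest = hashlib.sha1(("%s:%s" % (experiment, user_id)).encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

variant = ab_variant("new-signup-flow", user_id=42)
print(variant)  # log the assignment next to conversion events so the two funnels can be compared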

Do you practice Five Why's? Joel himself has written about this topic, in the context of doing root cause analysis to provide excellent quality of service without SLAs. I'm not aware of anyone using this tool as extensively as we did at IMVU, where it became the key technique we used to drive infrastructure and quality improvements. Instead of deciding upfront what might go wrong, we use what actually went wrong to teach us which prevention tactics we need. Our version of this was to insist that, for every level of the problem that the post-mortem analysis uncovered, we'd take at least one corrective action. So if an employee pushed code that broke the site, we'd ask: why didn't our cluster immune system catch that? why didn't our automated tests catch it? why couldn't the engineer see the problem in their sandbox? why didn't they write better code? why weren't they trained adequately? And we'd make all five of those fixes.

Do you write tests before fixing bugs? If a bug is truly a defect, then it's something that we don't want to ever see again. Fixing the underlying problem in the code is nice, but we need to go further. We need to prevent that bug from ever recurring. Otherwise, the same blind spot that led us to create the bug in the first place is likely to let it happen again. This is the approach of test-driven development (TDD). Even if you've developed for years without automated tests, this one practice is part of a remarkable feedback loop. As you write tests for the bugs you actually find and fix, you'll tend to spend far more time testing and refactoring the parts of the code that are slowing you down the most. As the code improves, you'll spend less time testing. Pretty soon, you'll have forgotten that pesky impulse to do a ground-up rewrite.
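Here's what that looks like in miniature. The parse_price function and its comma bug are invented for illustration; the point is that the test gets written (and seen to fail) before the fix goes in.

import unittest

def parse_price(text):
    # The fixed version; the original lacked the .replace() and choked on "$1,234.50".
    return float(text.lstrip("$").replace(",", ""))

class PriceRegressionTest(unittest.TestCase):
    def test_handles_thousands_separators(self):
        # Written first, this failed against the original code -- and now the bug can never silently return.
        self.assertEqual(parse_price("$1,234.50"), 1234.50)

if __name__ == "__main__":
    unittest.main()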

Can you tell defects from polish? Bugs that slow you down are defects, and have to be fixed right away. However, bugs that are really problems with the experience design of your product should only be fixed if they are getting in the way of learning about customers. This is an incredibly hard distinction to understand, because we're so used to a model of product development teams as pure "execution to spec" machines. In that model, anything that the product owner/designer doesn't like is a bug, and Joel's right that we should always fix before moving on (else you pile up an infinite mound of debt). However, in the learning phase of a product's life, we're still trying to figure out what matters. If we deploy a half-done feature, and customers complain about some UI issues (or split-tests demonstrate them), we should refine and fix. But oftentimes, nobody cares. There are no customers for that feature, UI issues or no. In that case, you're better off throwing the code away, rather than fixing the UI. The hardest part is forcing yourself to make this decision binary: either continue to invest and polish or throw the code out. Don't leave it half-done and move on to new features; that's the fallacy Joel tried to warn us about in the first place.

Do your programmers understand the product they are building and how it relates to your company's strategy? How can they iterate and learn if they don't know what questions are being asked at the company's highest levels? At IMVU, we opened up our board meetings to the whole company, and invited all of our advisers to boot. Sometimes it put some serious heat on the management team, but it was well worth it because everyone walked out of that room feeling at a visceral level the challenges the company faced.

What other questions would you ask a brand-new startup about its product development practices? What answers would predict success?

Smarticus — 10 things you could be doing to your code right now


A great checklist of techniques and tools for making your development more agile, written from a Rails perspective. Of the techniques he mentioned, I think four are fundamental and critical for any lean startup:

TDD (or the even more politely named TATFT)
Continuous integration
Automate your deployments
Collect statistics

The tools to help you do these things are getting better and better every day, but don't confuse tools with process. Whatever state your code or team is in, you can always start going faster. Or to borrow from a military context (John Boyd) "people-ideas-hardware, in that order."

Test-Driven Development as andon cord


You cannot control what you cannot see, and the hardest part of managing software projects is that the final product is so intangible. Automated tests make defects visible. In almost all Agile Development systems, thousands of automated tests are run against every change to the software. These tests are written by the engineers themselves, and they must all pass every time. Even one test failing is considered a “red alert” situation that requires all work to stop. This is the equivalent of the andon cord in a Toyota production plant. It signals the presence of a defect in the assembly process and directs all attention to fixing it before work can continue. As the team gets better at making changes without introducing new problems, it can speed up. A failing test provides a natural feedback mechanism, slowing the team down when it is going too fast.


Refactoring for TDD and interaction design


In TDD, we follow a rhythm of “test-code-refactor.” This basic pattern is useful in all aspects of product development. The basic idea is to avoid building something based on what you think it might need to do in the future. Instead, we build for today, but then constantly look for ways to reconfigure what we’ve created to make it more general, more flexible, more useful. This process is called refactoring. In code, it helps us to build reusable components that are suitable for assembly into complex systems without having to guess which components are needed (or how they should work). In effect, the specific examples drive the spec for what the general pieces should look like.
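Here's a toy illustration of that rhythm, with invented names: the test comes first, then the simplest code that passes it, and generalization waits until a real second use demands it.

import unittest

# 1. Test: written before the code exists, it is the spec for today's need only.
class SlugifyTest(unittest.TestCase):
    def test_lowercases_and_replaces_spaces(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

# 2. Code: the simplest thing that passes.
# 3. Refactor: when a second caller needs punctuation stripping or unicode handling,
#    generalize then -- letting the concrete uses drive what the general piece becomes.
def slugify(title):
    return title.strip().lower().replace(" ", "-")

if __name__ == "__main__":
    unittest.main()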

The same process works in Interaction Design. The temptation is to try to build a toolkit of general pieces that can be assembled into many kinds of final product. This rarely works, because it is hard to build pieces that work optimally for diverse uses without damaging each use case (which leads to mediocre or compromised results). Much better is to infer the right toolkit by searching for commonalities among actually-built experiences.