Friday, May 02, 2008

Software Plasticity and Risk

I've been toying with the concept of plasticity with respect to software. I've also been thinking about the software development process and how plasticity relates to the risks introduced by making changes to software.

Think about using a scale from 1 to 10 to describe how plastic a software product is. A one (the least plastic) would be a consumer electronics device where the software is embedded and impossible to update, such as a portable DVD player or a digital clock radio. A ten would be something like a website (perhaps the Netflix website or Instant Viewing PC client, and I happen to be hiring one or two QA Engineers), where any defects that are discovered can be fixed quickly by pushing new code to the servers, and all users get the new code instantly.

Then, think about the risks you run by making changes. My canonical examples to contrast the risk of changes usually go something like this:

1. We're updating the encryption key that we use to encrypt the credit cards for our members. We test this very thoroughly. The risks to the business are very large if there's a screwup and the benefits of avoiding screwups are equally large. Therefore, it makes a lot of sense to invest a lot in testing.

2. We're modifying how we format the pages for different genres of movies so that boxshots are right aligned instead of left aligned. There's not much risk of damaging the business there. We may not even do any formal QA and instead leave it up to the UI engineers to make sure they get that one right.

I haven't worked out the details yet for measuring risk vs. plasticity, but I imagine some kind of matrix with plasticity across the top and different software components down the side. Each cell, at the intersection of a row and a column, would hold a metric that boils down four things: the cost to fix, the cost to deploy, the risk of the change and the risk to the business. Then, during the course of normal development, or in the case of an emergency, evaluate the proposed change on those four dimensions to come up with the metric for that change, and compare it to the matrix to see where the change lies. For a fix to production, if there's a net positive risk/reward, make the change immediately. Likewise, for a change that's part of the normal development cycle, if there's a positive risk/reward, allocate the QA resources to fully test it. If there's not adequate reward for testing, or the risk to the business is low enough, dedicate the limited QA resources elsewhere in that push cycle.
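
Since the details aren't worked out, here is only a rough sketch of what scoring a change on those four dimensions might look like. Everything in it is an assumption made for illustration: the 1 to 10 scale on every dimension, the names Change, exposure and qa_plan, the way the numbers are combined, the threshold of 100, and the scores given to the two example changes.

```python
from dataclasses import dataclass

SCALE = range(1, 11)  # assumption: every dimension is scored from 1 to 10


@dataclass(frozen=True)
class Change:
    """A proposed change, scored by hand on hypothetical 1-10 scales."""
    name: str
    plasticity: int        # 1 = cannot be updated, 10 = instant push to servers
    cost_to_fix: int       # 1 = trivial to repair, 10 = very expensive
    cost_to_deploy: int    # 1 = trivial to ship, 10 = very expensive
    risk_of_change: int    # 1 = unlikely to break anything, 10 = very likely
    risk_to_business: int  # 1 = cosmetic damage, 10 = severe damage

    def __post_init__(self):
        for field in ("plasticity", "cost_to_fix", "cost_to_deploy",
                      "risk_of_change", "risk_to_business"):
            if getattr(self, field) not in SCALE:
                raise ValueError(f"{field} must be between 1 and 10")


def exposure(change: Change) -> int:
    """One made-up way to boil the dimensions down to a single number.

    Less plastic software and costlier fixes or deploys make a mistake
    harder to recover from, so they scale up the risk product.
    """
    recovery_difficulty = ((11 - change.plasticity)
                           + change.cost_to_fix
                           + change.cost_to_deploy)
    return change.risk_of_change * change.risk_to_business * recovery_difficulty


def qa_plan(changes, threshold=100):
    """Names of changes that deserve full QA, highest exposure first."""
    risky = [c for c in changes if exposure(c) >= threshold]
    return [c.name for c in sorted(risky, key=exposure, reverse=True)]


# The two examples from above, with invented scores.
key_rotation = Change("rotate card encryption key", plasticity=10,
                      cost_to_fix=6, cost_to_deploy=2,
                      risk_of_change=7, risk_to_business=10)
boxshot_align = Change("right-align boxshots", plasticity=10,
                       cost_to_fix=1, cost_to_deploy=2,
                       risk_of_change=2, risk_to_business=1)

print(exposure(key_rotation))                  # 630
print(exposure(boxshot_align))                 # 8
print(qa_plan([boxshot_align, key_rotation]))  # ['rotate card encryption key']
```

With these invented numbers the encryption key change lands far above the threshold and gets the QA time, while the boxshot change falls well below it and is left to the UI engineers. A real version would need scores and a threshold calibrated against actual incidents rather than guesses.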

