Over-engineering #3

January 25, 2009

[Image: MacBook headphone jack]

This is the headphone jack from a Macbook.  I know what you’re thinking… Mac is a statement in engineering genius!  This guy’s got a lot of gall to be writing about the Macbook.  Well, you’d be wrong.  I ran into this fun situation recently while visiting my in-laws.

My in-laws are Mac people.  I don’t hold it against them.  When we got to their house my mother-in-law was mildly distraught.  Her precious Mac would no longer play sound!  However, if they plugged in headphones it would play sound, but if they took the headphones out it wouldn’t play anything.

It sure sounded like the internal speaker had gotten damaged somehow.  That was, until my father-in-law told me that when the Mac starts up without headphones plugged in, it still makes its telltale Mac startup noise.  So the internal speaker clearly wasn’t totally broken.

Here’s what we knew:

  1. The startup sound came out of the internal speakers.
  2. If headphones were plugged in, all Mac sounds came out of the headphones.
  3. If headphones weren’t plugged in, the headphone jack would turn red BUT the volume slider on the menu bar would be locked.  Also, when pressing the volume buttons on the keyboard it would display a little X below the volume.  You couldn’t change it.
  4. If you started playing sounds without the headphones in, the volume control would be locked.  Once the headphone plug was fully in, the volume control would become unlocked.

Clearly something was odd about this engineering.  The Mac recognizes three states rather than the expected two.

  1. The internal speakers are on, nothing is in the headphone jack.  Sound comes out through the speakers.
  2. The internal speakers are off, something is partially in the headphone jack.  No sound comes out anywhere, but the volume control is locked.
  3. The internal speakers are off, something is in the headphone jack.  Sound comes out of the headphones, the volume control is enabled.
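
The three states above can be written down as a tiny lookup.  This is only a sketch of the observed behavior, not of how Apple actually wires or codes the jack.  The names JackState and audio_behavior are made up for illustration, and the "volume control enabled" value for the empty state is my assumption, since the observations only say it is locked in the halfway state.

```python
from enum import Enum


class JackState(Enum):
    """The three jack states inferred from the observed symptoms."""
    EMPTY = "nothing in the jack"
    PARTIAL = "something partially in the jack"
    FULL = "plug fully seated"


def audio_behavior(state):
    """Return (where sound comes out, whether the volume control works)."""
    if state is JackState.EMPTY:
        # Assumption: the volume control works normally with nothing plugged in.
        return ("speakers", True)
    if state is JackState.PARTIAL:
        # The odd middle state: speakers off, headphones not on, volume locked.
        return (None, False)
    return ("headphones", True)


for state in JackState:
    output, volume_enabled = audio_behavior(state)
    print(f"{state.value}: sound -> {output}, volume control enabled -> {volume_enabled}")
```

Written this way, the middle state stands out as the only one where the user gets nothing at all.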

And it was this set of facts that led me to the solution and conclusion that this headphone jack was overengineered.  So, I took the headphone plug, put it in about halfway and started jiggling it left and right.  And ta-da, the music started playing out of the main speakers again.  Problem solved.

Why do I consider this overengineering?  What’s the point of the second state in the three possible states above?  Clearly, you have some way of knowing when the headphone plug is fully in because that’s when you enable the volume control.  So what’s the point of the halfway control?  Why would I want to disable the internal speaker and not enable the headphones?  It’s just dumb.


Underpromising and overdelivering, not such a good idea

January 23, 2009

Back in December, after the horrid experience I had with the local power company, I wrote a post about communicating when all else fails.  In it I discussed a variety of ways in which communication could have been greatly improved.  But, as I was thinking the other day, there was one point that I alluded to that has more relevance in the business world than some of the others.  “Underpromise and overdeliver.”  If you haven’t said it yourself, you’ve probably heard it from someone.

The idea seems ok on the surface.  If I say to you “you’ll have it by Friday” and then I give it to you on Tuesday, how unhappy will you really be?  I mean, after all, you were OK with getting it Friday, so early delivery should make you excited, right?  Wrong!

Let’s return to the ice storm.  The message we were given was “power will be restored by the end of the week.”  There’s something like, who knows, 20,000 people in my town.  So how true is that statement for most people?  In its most generous of meanings, that statement is true: you will most likely have power restored by the end of the week.  But, for most people the statement is also false.  In fact, your chances are really good that you will have power well before then.  The act of fixing the power grid is a right skewed curve, like this:

[Image: right-skewed distribution]

Why is this interesting?  Well, the morning after the storm, almost nobody has power, but then the crews work immediately on the main lines.  Once they get the main lines back up, power is quickly restored to many people, hence the quick rise in the curve.  Indeed, the pace of recovery is quite quick, perhaps only a day or two for most people.  In reality, 55% of the town had power by the evening of the third day.  And yet, it took almost 12 days for the last people to get power back.  Clearly, the probability that you’d have power back by day X had a very long right tail.  Some people, a very few, were in the cold a long time.  Most of us were not.
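
To put rough numbers on that long tail, here is a back-of-the-envelope sketch.  It assumes restoration times follow a simple exponential curve, which is my stand-in and not the power company’s actual data, and it calibrates that curve to the one figure above: 55% restored by the evening of day three.  The function name fraction_restored is made up for illustration.

```python
import math

# The one observed data point from the storm: 55% of the town by day 3.
OBSERVED_DAY = 3
OBSERVED_FRACTION = 0.55

# Assumption: time-to-restoration is exponential, a crude right-skewed model.
rate = -math.log(1 - OBSERVED_FRACTION) / OBSERVED_DAY


def fraction_restored(day):
    """Modeled share of the town with power back by the end of `day`."""
    return 1 - math.exp(-rate * day)


for day in (1, 3, 7, 12):
    print(f"day {day:2d}: {fraction_restored(day):.0%} restored (modeled)")
```

Even under this crude model, "end of the week" describes the unlucky minority, not the typical resident.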

Well, so what, you say?  What’s the big deal that the town underpromised 60-70% of their residents and overdelivered?  I’ll tell you what the big deal is.  Money and lots of it!

Every day you are without power you have to make a choice – if I stay here, what are the chances that I’ll get power back?  If the odds aren’t good, well, you make certain financial decisions.  On day two, because the town employees were still saying “end of the week” I went out and spent several hundred on a generator for my home.  Other people spent nights in hotels.  That might be $150 or more a night AND if you didn’t really need the room but you occupy it (i.e. use the resource unnecessarily), then you deny someone else who does need it access to a warm place to sleep.  So, how acceptable would it have been to me if on day two someone told me “end of the week” and then I drove home with the generator, used it for a little on the expectation that I’d need it for several days, and power came on three hours later?  I’d be furious, that’s how I’d feel!  By the way, you can’t return a used generator… stores don’t allow it because they don’t want to become a generator rental company during major outages.  If you spend the money, it’s yours to keep.

Well guess what, as a systems organization, it’s the same deal.  When you underpromise to your customers and then overdeliver, how do you know that they haven’t made an expensive financial decision based on when you said you could deliver it?  Maybe they retained extra staff to do the work manually until you automated the process.  Now they’re paying for people they don’t need.  Maybe they bought extra hardware to cover until you could resolve your performance issues.  Great, now they own hardware that they wouldn’t have otherwise bought.

Sure, underpromising and overdelivering means you’re never late, but that doesn’t mean you’ve done your customer a favor.  Underpromising is as much a lie as overpromising.  PROMISE and DELIVER!


My instructions for a good data presentation

January 22, 2009

Good presentations are like mom and apple pie.  Everyone talks about giving them, nobody gives you anything specific to do to get from ho hum to hurrah with your presentations.  Well guess what, I’m not going to either!  So if that’s what you came looking for, sorry to disappoint.

I will, however, give you my instructions for giving a good presentation that includes statistics and data, especially in the business world.

  1. Get a projector.  Most charts of statistical analysis are hard enough to read without having to squint at a tiny PowerPoint slide deck.  Plus it saves a tree and plays into #2.
  2. Don’t hand out slides.  You can hand them out afterwards if people ask for them.  If you hand them out before, they will be flipping through them and not paying attention to you.  Handing out slides makes a mess out of #4.  It is hard to tell a compelling story when they already know what the end is.
  3. Stand up.  You cannot give a compelling presentation sitting in your chair.  People will not be looking at you.  You will, I assure you, have to point at your charts, so you’d better be standing near where they are projected to do so.
  4. Tell a story.  Nobody, not even a geek like me, is interested in a random assortment of charts that you want me to draw a conclusion from.  Put them in an order that makes sense, and walk me through your thinking.  That’s what analysis is all about, your thinking process.  Now, that doesn’t mean you can’t give them an executive summary at the start so they know what they’ll be hearing, but make it a teaser.  “Despite commonly held beliefs, it is unwise for us to invest in automating this part of the factory.”  See, you told them the conclusion, but they’re sure as heck going to be curious as to why you think that.
  5. Know the data/chart/analysis tool.  I can only assume you have the data/chart/analysis tool/whatever for a reason.  If you don’t know it inside and out, DON’T USE IT.  There’s a good chance that someone in the audience will ask you what a residual is if you show residual plots from a linear regression.  There’s some chance that someone with a little knowledge will ask you a hard question.  If it happens, and you don’t know the answer, say goodbye to the credibility of your presentation.  A personal favorite: “Why didn’t you use a two-sample t-test instead of a Kruskal-Wallis?”  “Because the data was extremely non-normal and Kruskal-Wallis is more robust in that situation, even though a two-sample t-test can survive some non-normality in the data.  And yes, I did check for homoscedasticity in the samples.”  OK, so that last comment was unnecessary, but it will shut the know-it-all up.
  6. Be prepared to explain simple statistical concepts.  Going to use a Pearson correlation?  You’d better be able to explain the possible range of result values (-1 to 1) and what being closer to -1 or 1 means.  Plus, you should have some idea on what’s considered no relationship, a weak relationship or a strong relationship.  For the uninitiated, they aren’t going to know what you are talking about unless you explain it.  Be prepared to teach.
  7. Big bold takeaways.  Regardless of all the data in the chart, overlay a box (or several) on it and tell them what you want them to take away from it.
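
For #6, it helps to have a toy example in your back pocket.  The sketch below computes a Pearson correlation by hand on three made-up data sets so you can show an audience what +1, -1 and 0 look like.  The function name pearson_r and all of the numbers are illustrative only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-movement of x and y around their means...
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # ...scaled by how much each one varies on its own.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # moves perfectly with xs
falling = [10, 8, 6, 4, 2]   # moves perfectly against xs
unrelated = [2, 1, 3, 1, 2]  # no linear relationship with xs

for label, ys in (("rising", rising), ("falling", falling), ("unrelated", unrelated)):
    print(f"{label}: r = {pearson_r(xs, ys):+.2f}")
```

Real data lands somewhere in between, which is exactly why you need to be ready to say what you consider weak or strong.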

Point is, I don’t want to see you mumble through your charts and data the way you would slides of your summer vacation.  Both are very boring to me.


Overcomplicating Things

January 21, 2009

How overjoyed I was when I was approached by a senior manager and his team to help develop a true MBF.  They were interested in the end to end quality of the products they test.

They had gone out to their customers and gotten VOC, and those customers told them what quality meant to them.  Though there are many things one might consider quality code - maintainability, stability, etc. - our customers overwhelmingly told us one thing: NO DEFECTS!

Great, we knew what our customers didn’t want.  Now, what’s the opportunity to have a defect?  Oh… well… um… unfortunately that question hasn’t been well answered by the software industry.

I do want to digress and point you towards a presentation I saw given by Gary Gack today regarding measuring software productivity which I thought was very interesting.  Productivity and quality both need that same “opportunity” measure to make them meaningful.  A good measure of the quality of a product is (defects / opportunities) and a good measure of productivity is (effort / opportunity).  In this case, opportunity might mean lines of code or function points or who knows what.  Mr. Gack presented the standard case against these types of measures – lots of challenges abound.  He argues that instead of measuring the opportunity (size of the work) measure how leanly you do the work.   Of course, that fixes productivity but does nothing for quality of code.  We’ll come back to that in my next post perhaps.
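
Here is a small sketch of why the “opportunity” denominator matters.  The two projects and their numbers are invented, and lines of code stands in for opportunity only because it is one of the candidates mentioned above.

```python
def defect_rate(defects, opportunities):
    """Quality measure: defects per opportunity."""
    return defects / opportunities


def effort_per_opportunity(effort_hours, opportunities):
    """Productivity measure: effort per opportunity."""
    return effort_hours / opportunities


# Hypothetical projects; opportunity is measured in lines of code here.
big = {"defects": 30, "effort_hours": 4000, "loc": 10_000}
small = {"defects": 12, "effort_hours": 600, "loc": 2_000}

for name, p in (("big", big), ("small", small)):
    print(
        f"{name}: {defect_rate(p['defects'], p['loc']):.4f} defects/LOC, "
        f"{effort_per_opportunity(p['effort_hours'], p['loc']):.2f} hours/LOC"
    )
```

The big project has more defects in absolute terms, but fewer per opportunity, which is the whole reason for dividing.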

Anyway, long digression… the problem was just like everyone else, this QA department didn’t know what the opportunity was.  So they chose one.  GASP!  Horrors, you say!?!?  I disagree for one simple reason.  Rather than avoiding the measurement because they didn’t know what exactly an “opportunity for a defect” might be, they chose to acknowledge that whatever operational definition they came up with would be imperfect AND they would work to counteract its imperfections with additional measures to balance the scorecard.

By balancing, in this case, I don’t mean having measures for cost, quality and speed, but instead having measures that counteract or watch over the potential gaming that could be done to the main measurement.

Up to this point, I am a happy boy.  Nay, I am practically shedding tears of joy.  Never before in this organization have I seen some senior leader actually take this kind of initiative to try and at least get us in the ballpark when it came to where our quality stood.

So, let’s dissect the proposed measurement and see where it went awry.  I’m happy to say up front that they performed a proper analysis of where they ended up and corrected it before moving on.

So, back to the original question… what is the opportunity for a defect?  Most people choose lines of code (LOC) or function points.  This team started with something they felt confident they could measure – test cases.

I know, I know.  Test cases is no measure of opportunity… or is it?  In our world, it well may be.  We perform testing looking for good coverage over all the code created, so QA examines the requirements, writes scenarios and test cases from those requirements and executes them all.  Since QA does not selectively skip testing (i.e. perform risk based testing), the number of test cases written probably correlates pretty well to the amount of opportunity.  Yes, it is true, a test case is an opportunity to find a defect, not an opportunity to create a defect.  It’s a subtle distinction, but important.

Of course, the team took it further.  What about the complexity of the project and what about the amount of people involved?  Aren’t these things important indicators of how much opportunity there is for a defect as well?  After all, just like you probably initially reacted to it, test cases seems like a bad way to measure opportunity.  So they added those in to create their opportunity measure called “Weighted Test Cases (WTC)”.  WTC is Test Cases * Project Size * Project Complexity.  Ignore for a moment how size and complexity are figured out.  Ultimately, their output measure of quality for a project would be (Defects / WTC).
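
As a sketch, the proposed measure looks like this.  The post leaves open how size and complexity were scored, so the 1-to-3 style scores in the example are purely my assumption, as are the function names.

```python
def weighted_test_cases(test_cases, size, complexity):
    """WTC = Test Cases * Project Size * Project Complexity."""
    return test_cases * size * complexity


def defects_per_wtc(defects, test_cases, size, complexity):
    """The team's first output measure of quality: Defects / WTC."""
    return defects / weighted_test_cases(test_cases, size, complexity)


# Hypothetical project: 200 test cases, size score 2, complexity score 3.
wtc = weighted_test_cases(200, size=2, complexity=3)
quality = defects_per_wtc(40, 200, size=2, complexity=3)
print(f"WTC = {wtc}, defects per WTC = {quality:.4f}")
```

Notice how quickly the multipliers inflate the denominator; that is worth keeping in mind for the analysis that follows.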

Here’s my sanity check for “do I have the right opportunity.”  Do a correlation test.  That’s the whole thinking behind having the opportunity as part of your measure in the first place, right?  If I have more opportunities, then having more defects doesn’t necessarily mean I’m doing worse.  Fewer opportunities, fewer defects.

After they got their data together for their first go at it, we did some analysis.  First we looked at the WTC denominator of this equation.  Test Cases * Size * Complexity.  Hmm, size… test cases… might these two things be related?  I mean, after all, if you have more test cases you probably need to exert more effort to run all those tests.

Indeed they were, and quite strongly.  Ok, so either size or test cases doesn’t belong in the denominator.  Since test cases was our starting point, we dropped size.  And for good measure we dropped complexity as well.  Now we had Defects / Test Case.  This seemed better.

Next, we went after the numerator.  Are all defects created equal?  No, I’m afraid not.  Some defects have a high severity, some a medium and some a low.  So maybe weighted defects (defect * severity) is closer to what we want instead?

Sure enough, when we compared the correlations of Defects to Test Cases and Weighted Defects to Test Cases, the Weighted Defects came out with a stronger relationship.  Interesting!  So it appears that more test cases doesn’t just mean that you’ll get more defects, but you’ll also get more defects of greater severity.  This makes sense intuitively.  Bigger projects are more complicated and have more chance to make big mistakes.  (By the way, that same Gary Gack presentation alluded to something similar in productivity.  Larger projects have lower productivity on average than smaller ones.  There’s a nonlinear relationship between the two.)

Finally, we arrived at a simpler solution. From Defects / (Test Cases * Complexity * Size) to Weighted Defects / Test Cases.

And lastly, the team added some countermeasures.  Why?  Well, “test cases” is a proxy for opportunity, but only if the testing process remains stable.  If testing gets better (i.e., more defects found per test case) then the quality of the code looks worse even though it may not be.  If testing gets worse (i.e., “hey, we can make the denominator bigger if we write lots of teensy test cases”) then quality would look artificially better.  So, the team added another measure – defect containment rate (DCR) – a Capers Jones favorite.  By having DCR alongside the quality, if containment went up in proportion to the change in quality, then we’d know that quality remained the same while the appraisal process (testing) had improved.

And on the other side we decided to measure effort / test case executed.  Since smaller test cases would take less effort to execute, if we saw a drop off in the effort per case we would know that people were trying to increase the denominator of our main measurement artificially.
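
Put together, the scorecard might look something like the sketch below.  The numbers are invented: the second period has the same defects and the same testing effort, but the test cases have been split into twice as many smaller ones.  The function and key names are mine, not the team’s.

```python
def defect_containment_rate(found_in_test, found_in_prod):
    """Share of all known defects that were caught before production."""
    return found_in_test / (found_in_test + found_in_prod)


def scorecard(weighted_defects, test_cases, found_in_test, found_in_prod, effort_hours):
    """Main quality measure plus the two countermeasures that watch over it."""
    return {
        "weighted_defects_per_test_case": weighted_defects / test_cases,
        "defect_containment_rate": defect_containment_rate(found_in_test, found_in_prod),
        "effort_per_test_case": effort_hours / test_cases,
    }


# Hypothetical periods: only the number of test cases changes.
before = scorecard(weighted_defects=120, test_cases=400,
                   found_in_test=90, found_in_prod=30, effort_hours=800)
after = scorecard(weighted_defects=120, test_cases=800,
                  found_in_test=90, found_in_prod=30, effort_hours=800)

for name in before:
    print(f"{name}: {before[name]:.2f} -> {after[name]:.2f}")
```

The main measure “improves” while containment stays flat and effort per test case is cut in half, which is the signature of someone padding the denominator.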

Alas, those last two paragraphs have little to do with my takeaway lesson.  Ready for this one?  Start simple.  Even though you can imagine why something so basic as “test cases” isn’t a good proxy for the opportunity to create a defect, you could be wrong.


Screwed up by the queue

January 12, 2009

I’ve written about Boston Market before as an interesting example of queuing.  Today I was at Quiznos and watched something else interesting to me that I wondered about.  It was early, before noon, by the time I sat down with my lunch.  I had just run out to a local tire store to get a flat repaired – darn my luck.  I had managed to drive over, who knows where, a 2 1/2 inch screw which promptly lodged in the tread of my tire and made a nice sized hole.

But I digress, I was in a bit of a rush when I got to Quiznos because I had used up the vast majority of my lunch break waiting on a tire repair.  So, as I sat there eating I took note of some standard features of fast food joints.  For example, on the wall were posted some nice sandwich-making job aids.  I couldn’t read them clearly from where I was sitting, but they had organized their sandwiches by major type of meat that was on them, so there was a nice picture of a cow and a chicken and a salami.  A salami is an animal, right? :)

In addition, the store was set up for a standard single queue.  It’s a small store, so the queuing is a bit of a problem.  The queue doesn’t have a lot of room to wind around, but that’s irrelevant to this story.  In fact, since it was before noon, the queue was empty until these two guys walked in.

The first guy, an older gentleman of husky build, with greying hair and a tan Dickies coat on, stepped up to the counter.  “Do you have a $5 meatball sub?”

“4.79 for the regular and 5.29 for the large,” the woman behind the counter responded.

“Oh, I see, I see!” replied the man as if he now saw the menus clearly.  But clearly he didn’t see, because he stood there, probably 10 seconds (it felt like a minute or two).  If you’re not a Quiznos frequenter, you wouldn’t know that for the past few months they had been running $5 large subs of all different kinds.  Then, they recently stopped doing it and now offer “everyday values.”  Same subs, $0.29 more expensive for a large. 

Anyway, he continued to stand there and stare, kind of stepping forward and then back away from the counter, like he was uncertain or nervous.  Finally, he started again “do you have meatball subs?”  Which was particularly amusing to me because right on the door that he just walked in was a large sign, in bold letters “NOW SERVING MEATBALL SUBS.”

“Yes,” the woman behind the counter replied, pointing toward the everyday value subs sign.  The man still looked confused, so she walked closer to the sign and pointed up towards it.  The sign didn’t just say “meatball,” if it serves as any excuse to this ridiculous interaction.  It in fact says “Primo Meatball.”  I wondered for a moment what makes a primo meatball and what the odds are that Quiznos is actually serving something that would meet my definition of “primo.”

Finally slightly less confused, the man says “ok, I’ll have a large meatball.”

“Large white or wheat,” asks the woman.  I’m not sure if it’s a different bread if you ask for a small, so why she added the large qualifier, I don’t know.  That confused him again.

“Large.”

“White or wheat,” she repeated.  He stared blankly.  I started thinking he might have suffered a head injury in his line of work.  Still nonplussed, the man continued to stare.  The woman pulled out a loaf of white bread and showed it to him with a look on her face that said “this one?”

“Yes, that one,” he finally answered.

HOORAY!!!  He had successfully ordered a sandwich, I thought!

“With American cheese,” he added… oh good lord!  They proceeded to have an entire conversation, drawing in the other employee behind the counter, about the fact that they did not have American cheese.  He finally, after some debate about his options, opted for mozzarella, which is apparently their standard for a meatball sub anyway.

So, what’s the point of my entire story?  Remember how I started out this story about Quiznos’ queuing system… can you imagine what the single queue looked like now?  Queues that contain widgets to be processed are one thing, but queues where the contents of the queue have minds (or a lack thereof) of their own are a whole other thing.  For a moment, the chaos of McDonald’s multi line queuing made sense.  If one line is incapacitated by a moron, the others can move forward, and although processing is somewhat crippled, it is not totally brought to a halt.

Quiznos doesn’t really have that luxury.  Not having any pre-made sandwich inventory, the person taking your order also assembles it to the point where it goes into the toaster.  And since it was early, they only had one order-taker/sandwich pre-assembler on hand.  The queue was hopelessly immobile.  Foiled completely by one indecisive person.  The other end of the queue, the guy who takes the toasted sandwich out, adds lettuce and rings you up was empty.  He was bored.  No throughput at all.  For some reason he wasn’t serving the clogged end of the queue.
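
Since I am not a queuing expert, take this as a toy sketch rather than real queuing theory.  It compares how long each customer waits before ordering when one order-taker serves a single line versus two order-takers each serving their own line.  The ordering times (one five-minute customer followed by four 45-second customers) are invented, and customers are simply dealt into lines in turn.

```python
def waits_single_line(service_times):
    """Seconds each customer waits when one order-taker serves one line."""
    waits, clock = [], 0
    for t in service_times:
        waits.append(clock)  # wait until everyone ahead has finished
        clock += t
    return waits


def waits_separate_lines(service_times, lines):
    """Waits when customers are dealt in turn into independent lines,
    each with its own order-taker."""
    clocks = [0] * lines
    waits = []
    for i, t in enumerate(service_times):
        lane = i % lines
        waits.append(clocks[lane])
        clocks[lane] += t
    return waits


# Hypothetical ordering times in seconds; the first customer is the slow one.
orders = [300, 45, 45, 45, 45]

single = waits_single_line(orders)
multi = waits_separate_lines(orders, lines=2)
print("single line:", single, "average", sum(single) / len(single))
print("two lines:  ", multi, "average", sum(multi) / len(multi))
```

Anyone stuck behind the slow customer still suffers in the two-line setup, but the other line keeps moving, so the whole store is never at a standstill.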

I think Quiznos and others who would employ this model need a timer of some form and a mechanism to deal with troublemakers in the queue.  For some reason, Seinfeld’s Soup Nazi comes to mind.  “No sandwich for you!”  Hey, it may be harsh but at least service to the rest of the more compliant people in the queue is vastly improved.  I’ll throw this out there, since I am not a queuing expert.  How should such a morass be handled?


Risk based testing doesn’t change the goal

January 8, 2009

I was just out to lunch with a coworker and friend who was telling me that the quality assurance department was switching over to risk based testing.  It’s a simple concept as I understand it – test more where there is more risk, test less where there is less risk.  Risk is determined via experience, typically in the form of some scoring system which rates how risky a given change or application is.  The higher the score, the more or different types of testing you do.

Now I’m not a testing expert by any means, but the conversation turned to how they were going to measure their success.  Prior to risk based testing, the measure of success for the department was defect containment rate (DCR).

Defect containment rate is fairly basic as well.  It’s simply (every defect you find in testing) / (every defect you find in testing + every defect you find in production).  In effect, if you find 75 defects while testing and after the code reaches production 25 more defects are found then you have a 75% (75 / (75 + 25)) defect containment rate.  Generally, the higher your DCR the better.

But, no, I’m told by my friend that the new measurement will be on defects found for the areas QA tested.  So, by that logic, if through risk based testing you determined function A wasn’t risky enough to be worth testing, and it breaks in production, that defect shouldn’t be counted against you… Such a decision would only serve to affect the denominator.  You’d still report all the bugs you found in test, but for each prod defect that was found, you’d get to decide whether or not you meant to test for that bug.  Suddenly, 25 defects in production might only count as 10 or 15 if you deemed the remainder as “things we weren’t looking for.”  Now instead of 75% containment (75 / (75 + 25)) you’d have an 88% (75 / (75 + 10)) containment rate.  Hey, you improved!!!  Wrong!
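
The arithmetic is easy to check.  The sketch below uses the numbers from the example above; the function name is mine.

```python
def defect_containment_rate(found_in_test, found_in_prod):
    """Defects found in testing divided by all defects found anywhere."""
    return found_in_test / (found_in_test + found_in_prod)


# Every production defect counts, as the customer experiences it.
full = defect_containment_rate(found_in_test=75, found_in_prod=25)

# Only the production defects QA "meant to test for" count.
scoped = defect_containment_rate(found_in_test=75, found_in_prod=10)

print(f"all production defects counted: {full:.0%}")
print(f"only 'in scope' defects counted: {scoped:.0%}")
```

The customer still hit 25 defects in both cases; only the number on the report changed.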

Something is amiss here!  Since when does changing the way you do things change what is important to your customer?  If your prior measurement – defect containment rate – measured what your customer expected of you, where’d you get the free pass to not accomplish that goal anymore?

You don’t design metrics around what will look good.  Looking good, as I’ve written before, is NOT actually good.  Actually being good, and meeting your customers’ needs is the goal.  If you refuse to measure how defects impact your customer just because you weren’t looking for those defects, it doesn’t make the defects go away.


Today may be just like yesterday should have been

January 7, 2009

OK, I admit, it’s an odd title for this entry.  Here’s the story.  Some time ago, several years now, a group of people got together and decided what a process organization should look like at our company.  I’m sure it was modelled on other companies’ ideas.  Then, that same team set about to make it a reality.  They divided the group into three distinct parts – strategic process, tactical process and metrics.

The tactical process group did targeted improvements.  If someone came to us and needed a process fixed, that was the work of the tactical group.  It was easy to get started on tactical things so this group had an abundance of work.  However, leadership recognized that a purely tactical approach was problematic.  Based on the way the end-to-end process was, the tactical approach was too narrowly focused.  For any given sub-process the assumption was made that the inputs and outputs of that process were basically immutable.  You could only change inside the process so long as you could use the same input and provide the same output.  It’s suboptimizing, but it is better than not optimizing at all.

The strategic group was there as more R&D.  It was supposed to look at the process holistically, take it apart, reassemble it… generally understand it and THEN act upon changing it.  Specifically, the strategic team would determine what the tactical team would focus on. 

Finally, the metrics team was a service group who had the skillset to query various databases and produce charts and graphs and other reports about the process that were needed.

At least, that was the idea of what the group would be.  Then reality hit.  The tactical team had too much work, so they started stealing resources from the strategic team.  It wasn’t long before the strategic team was doing the same thing that the tactical team was doing.  The metrics team had become mired down in manual processes to collect data and produce charts – ironic for a process organization, no?

Anyway, a year or two on, there was only one manager and one team left standing – the tactical team.  The strategic folks had been absorbed, the management moved on and the metrics merged in with some other group who also produced charts.  After all, without the strategic team defining how the end to end process should be measured, there wasn’t much for metrics to do.

Lo and behold, we all get invited to a planning session for the next year.  We’re asked “what are the things that a process organization should do?”  After about an hour or two of people throwing out ideas and whatnot, you could see that the proposed answer looked very familiar.  A process organization, at least so far as we could tell, ought to have a strategic, tactical and support (metrics) arm.  It makes sense.  You need governance.  You need metrics to support that governance.  And finally, you need a handful of tactical people to get their hands dirty, provide facilitation skills for tough business process decisions, etc.

Why does what should be done today look just like yesterday?  And why are we planning for today?  We never did the original plan, so it’s not like this new plan was informed by our past failures.  Well, other than to say we’re about to repeat it apparently.

On one hand, yesterday’s team structure seemed sensible.  We thought, at the time, that it was the right way to do things.  But it didn’t happen.  The company didn’t support the structure.  So, now we’re at today… do we go back to a failed model and hope it works this time?  Do we just keep doing what we’re doing and leave it at that?

I can’t say that I know the answer, but I will say this: doing the same thing over and over again and expecting a different result is the definition of insanity.  As an organization – your organization, my organization – if you aren’t collecting data and learning from your failures, what exactly are you basing your new team structure/process/whatever on?


It’s already too late

January 1, 2009

I may be reading too much, or something entirely wrong into this post over at Curious Cat Management.  I agree with John that cost cutting and lean are not equivalent.  And it brings up a good point.  If you had started out as a lean organization, you never would have brought on excesses which would then have to be cut.

But you didn’t, did you?  The economy was good, and companies, being made of people, do exactly what individuals do when things are feeling good.  They spend money!  Money burns a hole in the pocket of a company just like it does in the pocket of an individual.

So you’ve gone out and spent it.  You hired new people to work on projects and things that don’t add value to your customer.  You probably don’t even know what your customers’ needs really are.  But gosh darn it, if you hire enough people and they do enough stuff, you can be sure that the right thing (plus some extraneous garbage that they really don’t care about) is getting built.

The thing is, I agree and disagree with John.  He’s totally right.  Run a lean organization and when a recession comes a-callin’ you’ll be ready.  You won’t have these excesses which need to be stripped away just so you can survive.  Instead of being like a bear, hibernating through the hard times, shedding the fat it built up over the fall, your company will be like a wolf.  Alive and well and still moving at full speed in the winter.  Lean.  (I’m leaving off “and mean”, since it suggests being not nice to your employees, and I just don’t agree with that.)

But, it’s too late for that.  You are a bear already.  Fat, sleepy, and with a whole lot to burn off before Spring returns.  You haven’t got a choice, unfortunately.  Getting lean for you does mean cost cutting.  It does mean re-examining everything you do and deciding not to do some things anymore.  It does mean that people will probably lose their jobs and departments disappear.

It has to happen at least once.  You can either continue to live the cycle of the bear, getting fat, then shedding people like mad until good times return OR you can do it one last time and decide to be a wolf in the spring.  Think of it as reincarnation; your old ways are going to have to die before you can come back as something new.