Human Error Reduction and Cross Training Improvement Experiment in IT

     Many articles and studies have been published on human error, especially in the medical field, where people’s lives and wellbeing can be affected.  However, my focus for this article is on human error reduction in an IT manufacturing environment, where mistakes could mean downtime for our customer: the factory.

   In recent years, IT in many organizations has been the focus of cost optimization.  “Doing more with less” is the IT motto.  On the technology side, to reduce TCO (Total Cost of Ownership), many companies have moved to virtualization and have automated and streamlined their operations to increase the number of systems they manage.  Companies have incorporated LEAN and Six Sigma to address the behavioral side as well.  Likewise, the demand for higher computing reliability keeps increasing, placing even more stress on IT staff.  Unfortunately, adding more systems (i.e. more responsibility) increases the total number of daily maintenance tasks the IT organization needs to perform.  There will be more reboots, patches, application upgrades and user management.  It is widely observed that the likelihood of human error rises as we increase the number of tasks individuals perform per unit of time.  This is especially true for the tasks we are most familiar with.  We tend to feel overconfident and either skip steps or perform a step incorrectly because we are in a hurry.
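   To put a rough number on this, here is a minimal sketch of how the chance of at least one mistake grows with task count.  It assumes a fixed, independent error rate per task, which is a simplification of mine (hurry and overconfidence would likely push the per-task rate up as well), and the 1% rate is purely illustrative.

```python
def prob_at_least_one_error(per_task_error_rate: float, tasks: int) -> float:
    """Chance that at least one of `tasks` independent tasks goes wrong.

    Assumes every task fails independently with the same probability,
    which is an illustrative simplification, not a measured model.
    """
    if not 0.0 <= per_task_error_rate <= 1.0:
        raise ValueError("per_task_error_rate must be between 0 and 1")
    if tasks < 0:
        raise ValueError("tasks must not be negative")
    # Probability that all tasks succeed is (1 - p) ** n; take the complement.
    return 1.0 - (1.0 - per_task_error_rate) ** tasks


if __name__ == "__main__":
    # Hypothetical 1% error rate per task, at growing daily workloads.
    for tasks in (10, 50, 100):
        print(tasks, round(prob_at_least_one_error(0.01, tasks), 3))
```

   Even under this optimistic assumption, a 1% error rate per task gives roughly a 63% chance of at least one mistake across 100 tasks.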

  To counter human error, many organizations have adopted the “peer” installer practice along with checklists, where an individual performing a potentially risky task in the environment enlists a peer colleague to watch over the mouse clicks and key entries.  Unfortunately, it has been demonstrated that this practice on its own is not enough.  Due to complacency and overconfidence born of expertise (on the peer’s side as well), mistakes are still overlooked.

  I am a huge proponent of velocity, which makes the following experiment proposal even more painful and counterintuitive.  However, velocity should not compromise safety and quality.  The goal of the experiment is to reduce the likelihood of human error by:

  1. Slowing the rate of change: Disassociating the person who usually performs a specific task, or who discovers a task needing to be completed, from the actual task execution.  Tasks are recorded and completed offline in an organized manner.
  2. Cross training by reversing the roles:  Encouraging a peer who is not yet comfortable with the task to serve as the executor, while the person who is familiar with it acts as the observing peer.
  3. Encouraging a meaningful peer execution:  The peer who is most proficient with the task serves as the coach while the peer executing the task is the student, making both parties tread more carefully and thoughtfully before acting.
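
  The role reversal in points 2 and 3 can be sketched as a simple assignment rule.  This is only an illustration under assumptions of mine: the `Pairing` type, the `assign_roles` helper and the numeric proficiency scores are hypothetical, and a real team would probably weigh availability and workload too.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pairing:
    task: str
    executor: str  # the less familiar peer, who performs the task (the student)
    coach: str     # the most proficient peer, who watches and guides


def assign_roles(task: str, proficiency: dict[str, int]) -> Pairing:
    """Pick the least proficient person to execute and the most proficient to coach.

    `proficiency` maps a person's name to a hypothetical skill score for this
    task (higher means more familiar).
    """
    if len(proficiency) < 2:
        raise ValueError("peer execution needs at least two people")
    # Sort names from least to most proficient; ties keep insertion order,
    # so the first and last entries are always two different people.
    ranked = sorted(proficiency, key=lambda name: proficiency[name])
    return Pairing(task=task, executor=ranked[0], coach=ranked[-1])


if __name__ == "__main__":
    scores = {"alice": 5, "bob": 1, "carol": 3}  # illustrative values only
    print(assign_roles("patch database server", scores))
```

  The task is deliberately taken away from the person who knows it best, which is the point of the experiment: the expert has to explain each step, and the student has no habits to rush through.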

     Certain ground rules need to be put in place to keep the organization from grinding to a halt, such as deciding which tasks should be performed in this fashion.  Focus should be placed on the tasks that can cause the most impact and that provide the highest return on investment, training-wise.
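
     One possible way to apply such a ground rule is to score each task and run only the top few through the slower peer process.  The sketch below is an assumption on my part: the `Task` fields, the 1 to 5 scales and the impact-times-training-value score are hypothetical, not an established method.

```python
from typing import NamedTuple


class Task(NamedTuple):
    name: str
    downtime_impact: int  # hypothetical scale: 1 (minor) to 5 (factory stops)
    training_value: int   # hypothetical scale: 1 (little to learn) to 5 (a lot)


def select_for_peer_execution(tasks: list[Task], max_tasks: int) -> list[Task]:
    """Return the tasks with the highest impact-times-training-value score.

    Only these go through the coached peer process; everything else keeps
    its normal, faster path so overall velocity does not collapse.
    """
    if max_tasks < 0:
        raise ValueError("max_tasks must not be negative")
    ranked = sorted(
        tasks,
        key=lambda t: t.downtime_impact * t.training_value,
        reverse=True,
    )
    return ranked[:max_tasks]


if __name__ == "__main__":
    backlog = [  # illustrative values only
        Task("reset user password", 1, 1),
        Task("patch production database", 5, 4),
        Task("upgrade line-control application", 4, 5),
    ]
    for task in select_for_peer_execution(backlog, max_tasks=2):
        print(task.name)
```

     Capping the number of selected tasks is what keeps the experiment from slowing everything down; the cap itself would need tuning for each team.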

     In summary, I expect that this experiment will have the desired effect, but I am unsure whether it will hinder velocity within the organization to the point where it is crippling.  Likewise, this experiment requires a conscious effort from the participants, which may make it even more difficult to implement.  Finding the correct balance of cross-peer implementation will provide the most benefit.