---
title: "High reliability organizations"
date: 2022-06-01
tags: management, social science
toc: false
---

[cite/t:@dietterich2018_robus_artif_intel_robus_human_organ] is an interesting article about how to make AI robust. It argues that in high-risk situations, the combined AI and human system must operate as a high reliability organization (HRO). Only such an organization can have sufficiently strong safety and reliability properties to ensure that powerful AI systems will not amplify human mistakes.

Reliability and high-reliability organizations

The concept of the high reliability organization (HRO) comes from [cite/t:@weick1999_organ]. Examples of HROs include nuclear power plants, aircraft carriers, air traffic control systems, and space shuttles. They share several characteristics: an unforgiving environment, vast potential for error, and consequences of dramatic scale when a failure occurs.

[cite/t:@weick1999_organ] use the concept of "mindfulness", a kind of "enriched awareness" (which I interpret as "awareness with explicit processes"), consisting of the five elements listed below. This mindfulness leads to the capacity to discover and manage unexpected events, which in turn leads to reliability.

Characteristics of a high reliability organization

An HRO is an organization with the following five attributes.

Preoccupation with failure

There are many possible failures, most of them extremely rare. Consequently, HROs study all forms of failure and near misses with extreme care and attention to detail. They also study the absence of failure: why things did not fail, and whether no flaws were identified simply because nobody was attentive enough to potential flaws. HROs encourage everyone to report all mistakes and anomalies.

Reluctance to simplify interpretations

HROs avoid having a single interpretation for a given event. They encourage generating multiple, complex, contradictory interpretations for every phenomenon. People are encouraged to have different views and different backgrounds (important for Hiring), and are re-trained often. To resolve the contradictions and opposing views, interpersonal and human skills are highly valued, possibly more than technical skills.

Sensitivity to operations

HROs rely heavily on "situational awareness". Basically, we have to check that there are no emergent phenomena (cf. Complex systems and Compositionality): all outputs should always be explained by the known inputs. Otherwise, there might be other forces at work that need to be identified and dealt with. A small group of people may be dedicated to this awareness at all times.

Commitment to resilience

HROs train people to be experts at combining all processes and events to improve their reactions and their improvisation skills. Everyone should be an expert at managing surprise. This can include rapid formation of ad hoc teams to improvise solutions to novel problems.

Underspecification of structures

There is no fixed reporting path: anyone can raise an alarm and halt operations. Everyone can make decisions related to their technical expertise. Information is spread directly through the organization, so that people with the right expertise are warned first. Power is delegated to operational personnel, but management remains fully available at all times.

HROs vs non-HROs

Non-HROs increasingly exhibit some properties of HROs. This may be due to the fact that highly competitive environments with short cycles create unforgiving conditions (high performance standards, low tolerance for errors). However, most everyday organizations do not put failure at the heart of their thinking.

Failures in non-HROs come from the same sources as in HROs: cultural assumptions about the effectiveness or accuracy of previous precautionary measures.

Preoccupation with failure also reveals the couplings and the complex interactions in the systems being operated. This in turn leads to uncoupling and less emergent behaviour over time. People come to understand long-term, complex interactions better.

Reliability vs performance, and the importance of learning

An interesting discussion concerns the (alleged) trade-off between reliability and performance. It is often assumed that HROs focus on reliability at the cost of throughput. As a consequence, it may not make sense for ordinary organizations to put as much emphasis on safety and reliability, since doing so costs money.

However, investments in safety can also be viewed as investments in learning. HROs view safety and reliability as a process of search and learning (constant search for anomalies, learning the interactions between the parts of a complex system, ensuring we can link outputs to known inputs). As such, investments in safety encourage collective knowledge production and dissemination.

Mindfulness also stimulates intrinsic motivation and perceptions of efficacy and control, which increase individual performance. (People who strongly believe they are in control of their own output are more motivated and more efficient.)

HROs may encourage mindfulness out of operational necessity, given the catastrophic consequences of any failure, but non-HROs can adopt the same practices to boost efficiency and learning and so gain a competitive advantage.

Additional lessons that can be learned from HROs (implicit in previous discussion):

  1. The expectation of surprise is an organizational resource because it promotes real-time attentiveness and discovery.
  2. Anomalous events should be treated as outcomes rather than accidents, to encourage search for sources and causes.
  3. Errors should be made as conspicuous as possible to undermine self-deception and concealment.
  4. Reliability requires diversity, duplication, overlap, and a varied response repertoire, whereas efficiency requires homogeneity, specialization, non-redundancy, and standardization.
  5. Interpersonal skills are just as important in HROs as are technical skills.

References