DesignWIKI

Fil Salustri's Design Site


Systemic Flaw

A systemic flaw is an inconsistency between elements or flows in a system, or an unmanaged error that can occur in a system.

What is a systemic flaw?

No engineered system is “perfect”; however, some systems are better than others. When designing interventions, one generally wishes to improve on whatever system is currently in place. Identifying systemic flaws in the current situation is an important part of determining what constitutes a proper improvement.

Systemic flaws come in three types.

Mismatched interfaces:
Systems interact via interfaces, across which mass, energy, and information flow. If the output from one system is not properly matched to the input of its co-system, then significant failures can result. That is, given two systems that interact across a boundary, if the interface of one system doesn't accept or produce a flow that matches what the corresponding interface in the other system produces or accepts, then you have a mismatch.
An example of this is a spike in the domestic electrical supply, which can happen during an electrical storm. To account for such circumstances, resettable fuses (i.e., circuit breakers) are built into homes and into many electrical products.
When you specify mismatched interfaces, be sure to identify exactly what the interfaces are. This can be done easily by referring to an ID code for each flow you drew in your system diagram. Otherwise, you have to specify: (1) the two systems that interact, (2) the specific interfaces in each system through which a flow occurs, and (3) the values of the flow on each of the two interfaces.
Insufficient transformation:
Every system exists for the sake of transforming available inputs into needed outputs. If the nature of that transformation is not sufficient to meet the goals for implementing the system to begin with, then the system is by definition unsuccessful.
An example of this is a cheap pen that doesn't write on cheap paper, or that produces an uneven line of ink on the page, or that has ink that dries in the ballpoint too quickly leading to so-called “hard starts.”
Brittleness:
While not all errors can be eliminated, they can often be managed in some way. Even a small error can lead to a catastrophic failure if the system's design does not manage it well.
An example of this is how a user of a wheelchair cannot exit a tall building if the elevators are not working, or how a single bad transaction appears to have caused a significant crash of the cryptocurrency Ethereum in 2017.

All systems have emergent behaviour: they react to inputs in ways that result from their internal operation as a whole and not from any one of their elements or co-systems. (See Wikipedia for some examples of emergent structures in natural and artificial systems. Emergence is in fact a whole field of study in systems engineering.) This means that you can't determine the flaws in a system by looking only at its elements; you also have to look at how all the elements interact. Therefore, one must look for systemic flaws.

How do we diagnose systemic flaws?

Diagnosing systemic flaws is done by considering the system model of the reference design and the usage scenarios and interaction error charts that you've already developed (if you're following the design roadmap).

Review the system model and usage scenarios (USs), looking for occurrences of each type of flaw listed above.

Mismatched interfaces

Review each system interface flow in your system model. For each flow of mass, energy, and information, ask yourself: do the quality and quantity of the flow at its source match the quality and quantity expected by the receiving system? If the flows don't align, then you've got a mismatched interface.

Perform the same kind of analysis for each interaction error chart.

Here are some examples:

  • If the rate of flow of traffic on a highway cannot match the rate at which new traffic merges with the highway, then you'll get traffic congestion.
    • Notice that this is a time-dependent phenomenon: the rate at which vehicles enter a highway will increase and decrease cyclically over a day. This poses trade-off problems: you can theoretically eliminate congestion by making the highway huge, but the cost of such an undertaking will make it prohibitive considering that maximum congestion only happens for a few hours each day (excluding long weekends, of course).
  • If the rate at which patients arrive at a clinic is greater than the rate at which the doctors can process them, then wait times in the clinic's waiting area will increase unacceptably.
  • If electrical spikes and surges (or lags and dropouts) are more (or less) than a particular piece of electrical equipment can handle, then the equipment may be damaged, or at least not function correctly.
  • If a blender is too heavy for someone to lift, then the otherwise “portable” blender could well injure the user.
  • If information is provided to a person too fast, they will be unable to process/react to it; if it is provided too slowly, they will lose interest/focus/attention. Either way, there is a potential for the person to underperform as a result.
  • If the control button does not light up as expected, the user may mistakenly execute incorrect actions, leading to unnecessary harm or damage.
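
The rate mismatches in the examples above can be explored with a very small simulation. The following is a minimal sketch, not a validated queueing model: the function name, the hourly arrival figures, and the service rate are all hypothetical values chosen only for illustration.

```python
def backlog_over_time(arrivals_per_hour, service_rate_per_hour):
    """Return how many people are still waiting at the end of each hour."""
    waiting = 0
    history = []
    for arrivals in arrivals_per_hour:
        # The queue grows by the arrivals and shrinks by at most the service rate.
        waiting = max(0, waiting + arrivals - service_rate_per_hour)
        history.append(waiting)
    return history


# Hypothetical clinic: arrivals peak mid-morning; doctors see 10 patients per hour.
arrivals = [6, 12, 15, 9, 4, 3]
print(backlog_over_time(arrivals, 10))  # [0, 2, 7, 6, 0, 0]
```

Even this crude model shows the time dependence: the backlog builds only while arrivals exceed the service rate, and it clears again once arrivals drop.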

Notice that we're not interested only in the typical flow rates through interfaces, but also minima and maxima that we might expect in typical situations. Determining what flows a system must handle depends on the environment in which it will be used.

In all such cases, you need to quantify both “ends” of the interface to know just how bad the interface mismatch is.
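
One way to quantify both ends of an interface is to record the range of the flow that the source can produce and the range that the receiver accepts, and then compute how far the source falls outside the accepted range. The sketch below assumes a single numeric flow quantity per interface; the class name, field names, and voltage figures are hypothetical and are not taken from any real product.

```python
from dataclasses import dataclass


@dataclass
class FlowRange:
    system: str      # which system this end of the interface belongs to
    interface: str   # the specific interface in that system
    minimum: float   # lowest value expected in typical situations
    maximum: float   # highest value expected in typical situations
    unit: str


def mismatch(source: FlowRange, sink: FlowRange):
    """Return (shortfall, excess) of the source relative to what the sink accepts."""
    if source.unit != sink.unit:
        raise ValueError("both ends must be quantified in the same unit")
    shortfall = max(0.0, sink.minimum - source.minimum)
    excess = max(0.0, source.maximum - sink.maximum)
    return shortfall, excess


# Hypothetical figures for illustration only.
supply = FlowRange("domestic supply", "wall outlet", 100.0, 140.0, "V")
device = FlowRange("appliance", "power input", 108.0, 132.0, "V")
print(mismatch(supply, device))  # (8.0, 8.0)
```

A result of (0.0, 0.0) means the source stays within what the receiver accepts; any non-zero value is the size of the mismatch at that end of the range.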

Insufficient transformation

Review each subsystem of your design intervention with respect to the functional requirements (FRs) and constraints it's supposed to satisfy. Each subsystem has to provide constrained functionality within a context, so refer back to your usage scenarios. Do you expect it to function appropriately in those scenarios?

Identify each shortcoming in this regard, particularly for borderline or extreme (but still reasonable) cases. How will you measure that shortcoming? Avoid relative assessments (e.g., too short, too heavy, too hot) at all costs. Instead, attach values to each shortcoming, expressed as a difference between what is currently expected and what your users need or want. Make sure all quantifications are justified with respect to evidence.

Review each interaction error chart. Are there requirements you can add that will mitigate or even prevent those interaction errors in your design?

Brittleness

Review each of your usage scenarios, focusing on the error conditions that you've identified. Trace the nature of each error back to a system flow or element/subsystem. How can you change the specification of that flow or element (from a strictly functional point of view) to address the error? Be specific; simply saying “failsafe” (or some other generic principle) is not enough.

Exercise for the reader: Consider an existing elevator, the design of which is such that a user in a wheelchair has no recourse at all if the elevator car does not come to pick the user up.

You cannot simply design out all such failures, because no product is perfect. What alternatives can you imagine you could put in place to assist such users?

Document each error and how you will alter system and interface specifications to eliminate, mitigate, or at least manage the error.

If you find there are too many errors to reasonably deal with within the scope and timeframe of your project, you can use FMEA to analyze and ultimately prioritize those errors, so that you handle the most important ones first, thus improving your design the fastest.
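
As a rough illustration of that prioritization, one common FMEA convention rates each failure mode for severity, occurrence, and detection (often on 1 to 10 scales) and multiplies the three into a risk priority number (RPN). The sketch below assumes that convention; the class name, the failure modes, and all the ratings are hypothetical, and your project may use a different scoring scheme.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    description: str
    severity: int    # 1 (negligible) to 10 (catastrophic)
    occurrence: int  # 1 (rare) to 10 (frequent)
    detection: int   # 1 (almost always detected) to 10 (practically undetectable)

    @property
    def rpn(self) -> int:
        # Risk priority number: a higher value means the error deserves earlier attention.
        return self.severity * self.occurrence * self.detection


def prioritize(modes):
    """Return the failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical ratings for illustration only.
modes = [
    FailureMode("elevator out of service", severity=9, occurrence=3, detection=2),
    FailureMode("kettle runs dry", severity=7, occurrence=4, detection=6),
    FailureMode("control button light fails", severity=5, occurrence=2, detection=7),
]
for mode in prioritize(modes):
    print(mode.rpn, mode.description)
```

With these made-up ratings, the kettle error (RPN 168) would be handled first, then the button light (70), then the elevator (54).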

Deliverables

To document your analysis of systemic flaws, write a free-form text paragraph for each systemic flaw you found. Make sure each flaw is identified by a short descriptive phrase and references specific US steps or IECs, as well as Personas.

Group the flaws by their type. Expect to have one section for each type of systemic flaw, and one paragraph describing each flaw.
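
If you keep your flaws in a structured form while you work, the grouping described above can be generated mechanically. This is only a sketch of one possible bookkeeping approach, not a required tool: the `Flaw` class and `render` function are hypothetical names, and the two sample entries are abbreviated from the examples on this page.

```python
from collections import defaultdict
from dataclasses import dataclass

# The three types of systemic flaw, in the order they are presented on this page.
TYPES = ["Mismatched interfaces", "Insufficient transformation", "Brittleness"]


@dataclass
class Flaw:
    title: str        # short descriptive phrase
    kind: str         # one of TYPES
    description: str  # free-form paragraph, citing US steps, IECs, and Personas


def render(flaws):
    """Return plain text with one section per flaw type and one paragraph per flaw."""
    grouped = defaultdict(list)
    for flaw in flaws:
        if flaw.kind not in TYPES:
            raise ValueError(f"unknown flaw type: {flaw.kind}")
        grouped[flaw.kind].append(flaw)
    lines = []
    for kind in TYPES:
        if not grouped[kind]:
            continue  # skip types for which no flaw was found
        lines.append(kind)
        for flaw in grouped[kind]:
            lines.append(f"{flaw.title}. {flaw.description}")
    return "\n".join(lines)


flaws = [
    Flaw("Blender too heavy", "Mismatched interfaces",
         "Bart is too weak to lift the blender from the high shelf (see US 2.1b)."),
    Flaw("Electric kettle runs dry", "Mismatched interfaces",
         "Per IEC #23, the kettle runs even if all water has evaporated."),
]
print(render(flaws))
```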

Note: we don't fix the flaws, only explain them.

Example systemic flaw specification

Blender too heavy.
Bart is too weak to lift the blender from the high shelf it is normally stored on (see US 2.1b). There is an unacceptable risk of the blender falling on him, and of him straining his wrists/arms/shoulders/back.

The systemic flaw specified in the blender example is a mismatch between the energy required to lift a blender (of a presumed known weight) and the energy that the user doing the lifting can safely and comfortably expend.

This specification could lead to a requirement on the maximum allowed weight of the blender. If this simply is not physically possible, it might alternatively lead to a requirement that documentation make clear the blender must not be stored above a certain height.
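
To attach numbers to this mismatch, the energy needed to lift the blender can be estimated as its weight times the lift height (mass × g × height). The sketch below uses that textbook approximation; the blender mass, lift height, and the user's comfortable energy limit are hypothetical values, and a real analysis would need evidence for each of them.

```python
G = 9.81  # gravitational acceleration in m/s^2


def lift_energy(mass_kg, height_m):
    """Energy in joules needed to raise a mass through a vertical height."""
    return mass_kg * G * height_m


def max_mass(energy_limit_j, height_m):
    """Largest mass in kg that can be lifted through height_m within an energy limit."""
    return energy_limit_j / (G * height_m)


# Hypothetical values for illustration only.
blender_mass_kg = 4.5
lift_height_m = 0.6
user_limit_j = 20.0

required = lift_energy(blender_mass_kg, lift_height_m)
print(f"required: {required:.1f} J, user limit: {user_limit_j:.1f} J")
print(f"maximum allowed mass: {max_mass(user_limit_j, lift_height_m):.2f} kg")
```

Under these assumptions the required energy exceeds the user's limit, and the second function turns that limit into a candidate value for a maximum-weight requirement.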

Example systemic flaw specification

Electric kettle runs dry.
Per IEC #23, the kettle runs even if all water has evaporated. The continued flow of electricity into the kettle generates heat that has no way to exit the system.

In this case, the input electricity is matched by the evaporation of boiled water in the kettle. Once the water evaporates completely, the flow becomes mismatched. This flaw could also be reasonably identified as an insufficient transformation (i.e., insufficient flow of heat out of the system per unit of electricity into the system).

This could damage the kettle and, possibly, harm a user touching it if it has overheated.
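
A simple energy balance gives a feel for when the flows in this example become mismatched. The sketch below assumes that all electrical energy goes into the water, that water boils at 100 °C, and that there are no heat losses, so it estimates a lower bound on the time to boil dry. The kettle power, water mass, and starting temperature are hypothetical.

```python
SPECIFIC_HEAT_WATER = 4186.0        # J/(kg*K), approximate
LATENT_HEAT_VAPORIZATION = 2.26e6   # J/kg at 100 degrees C, approximate


def seconds_until_dry(water_kg, start_temp_c, power_w):
    """Estimate the time for a kettle to heat and fully evaporate its water."""
    heat_to_boil = water_kg * SPECIFIC_HEAT_WATER * (100.0 - start_temp_c)
    heat_to_evaporate = water_kg * LATENT_HEAT_VAPORIZATION
    return (heat_to_boil + heat_to_evaporate) / power_w


# Hypothetical kettle: 1 kg of water at 20 degrees C, 2000 W heating element.
t = seconds_until_dry(1.0, 20.0, 2000.0)
print(f"{t / 60:.1f} minutes until the kettle runs dry")
```

After that time, every joule of input electricity heats the kettle itself, which is the point at which the flaw described above takes effect.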

However

TODO Describe consequences and counter-indications.

See Also

design/systemic_flaw.txt · Last modified: 2020.03.12 13:30 (external edit)