What this article answers
Article summary
To replace fragile onion logic in PLC programs, engineers should use explicit finite state machines. A state-machine architecture can reduce scan-order ambiguity, isolate fault handling, and make abnormal-condition testing observable in simulation before logic reaches a live process.
A common misconception is that ladder logic becomes “advanced” when it becomes dense. In practice, density is often just ambiguity wearing a hard hat. Machines do not fail because a rung looked sophisticated; they fail because the sequence was not deterministic when a real input changed at the wrong moment.
During a recent internal benchmark of 200 user-submitted mixing-sequence exercises in OLLA Lab, 82% of deeply nested “onion logic” programs entered an unrecoverable sequence lock when subjected to a 100 ms intermittent sensor-loss event, while refactored explicit-state versions recovered deterministically in 100% of those same scenario tests [Methodology: n=200 submitted mixing-sequence tasks, comparator = original nested-latch implementation versus refactored explicit-state implementation, time window = Ampergon Vallis Lab internal review cycle completed Q1 2026]. This internal benchmark supports a narrow architectural point: explicit state models were more fault-recoverable under the tested conditions. It does not support broad claims about all PLC code, all industries, or operator competence.
The practical distinction is simple: syntax is not deployability. A program that works on the happy path but freezes on a flickering permissive is not ready for commissioning, however neat the rungs look.
What is “onion logic” in PLC programming?
Onion logic is a ladder-logic anti-pattern in which machine behavior is controlled through layers of interdependent booleans, scattered latches, and nested permissives whose combined execution path is difficult to reason about during abnormal conditions.
The name is informal, but the failure mode is real. Each new condition wraps around the previous one until the sequence becomes dependent on hidden interactions between rung order, latch history, and transient inputs. It usually works during demonstrations. Commissioning is less polite.
The 3 symptoms of onion logic
- Interdependent latches: Sequence progress depends on multiple `(S)/(R)` or `(OTL)/(OTU)` instructions distributed across many rungs, often without a single authoritative source of machine state.
- Scan-cycle vulnerability: Program behavior changes depending on rung order, one-scan timing, or whether a permissive drops before or after an unlatch condition is evaluated.
- Happy-path bias: The sequence runs correctly when every sensor behaves cleanly, but jams, half-latches, or requires manual reset after a mid-cycle interruption.
Why onion logic becomes fragile
Onion logic increases effective cyclomatic complexity. In plain terms, the number of possible execution paths grows faster than a junior engineer can reliably trace under fault conditions, especially when several booleans can remain true from prior scans.
That matters because PLC troubleshooting is not done in a vacuum. It is done while a conveyor is stopped, a pump is unavailable, or a batch step is waiting on a permissive that “should be true.” “Should” is not a diagnostic method.
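To put a rough number on that growth, here is a small Python sketch. Python is used only as a neutral illustration language, not as PLC code. It counts the on/off combinations of n independent latch bits and sets that against a single state variable with five allowed values. The latch counts are arbitrary, and the count is an upper bound: many combinations are unreachable in a real program, but someone still has to prove that.

```python
from itertools import product

def latch_combinations(n_latches: int) -> int:
    """Count every on/off combination of n independent latch bits."""
    return sum(1 for _ in product((False, True), repeat=n_latches))

# Assumed for illustration: one integer state variable with five legal values.
STATES = (0, 10, 20, 30, 99)

# Combinations a reviewer may have to rule out for 3, 5, and 8 latch bits.
GROWTH = {n: latch_combinations(n) for n in (3, 5, 8)}
```

Eight scattered latches give 256 candidate memory images to reason about; the state variable gives five, regardless of how many rungs read it.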
Why do explicit state machines provide better fault recovery?
Explicit state machines provide better fault recovery because they make machine intent singular, observable, and deterministic. At any given time, the machine occupies one defined state, and transitions occur only when explicit conditions are met.
This is the architectural difference that matters. Onion logic asks, “Which collection of bits currently implies where I am?” A state machine asks, “What state am I in, and what condition permits the next one?” The second question is much easier to answer at 3 AM.
### Operational definition: what is an explicit state machine in ladder logic?
An explicit state machine is a control architecture in which:
- a machine’s sequence status is represented by a single authoritative state variable, often an integer;
- each state is mutually exclusive from the others;
- transitions are explicitly defined by observable conditions;
- abnormal conditions route the machine to a defined fault or hold state;
- physical outputs are derived from the active state rather than scattered across unrelated sequence rungs.
A simple example might use:
- `0 = Idle`
- `10 = Starting`
- `20 = Running`
- `30 = Stopping`
- `99 = Fault`
This approach aligns with established software-structuring principles under IEC 61131-3, which supports structured program organization, clear execution behavior, and maintainable control logic. The standard does not prescribe one universal sequence pattern for every machine, but the preference for explicit, readable architecture is not controversial.
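As a language-neutral illustration of "one authoritative state variable," the sketch below models the five example states in Python. The class and method names are hypothetical; on a PLC this would simply be an integer tag such as `Machine_State`.

```python
from enum import IntEnum

class MachineState(IntEnum):
    """The five example states, using the numbering listed above."""
    IDLE = 0
    STARTING = 10
    RUNNING = 20
    STOPPING = 30
    FAULT = 99

class Machine:
    """Holds exactly one state value, so states are mutually exclusive by construction."""

    def __init__(self) -> None:
        self.state = MachineState.IDLE

    def is_in(self, state: MachineState) -> bool:
        return self.state == state
```

Because there is only one variable, "which state am I in?" always has exactly one answer. No combination of bits needs to be decoded.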
Explicit state vs. onion logic
| Engineering Aspect | Explicit State Machine | Onion Logic |
|---|---|---|
| State representation | One authoritative state variable, typically integer-based | Many overlapping booleans and latches |
| Sequence visibility | Current machine phase is directly observable | Sequence position must be inferred |
| Fault handling | Explicit jump to a defined fault or hold state | Fault recovery depends on unlatching the right conditions |
| Output mapping | Outputs derived in a dedicated routine from active state | Outputs often scattered across sequence rungs |
| Troubleshooting | Ask: “Why didn’t state X transition to Y?” | Ask: “Which bit failed to set or reset?” |
| Scan-order sensitivity | Reduced when transitions are cleanly partitioned | Often highly sensitive to rung order |
| Maintainability | Easier to review, test, and revise | Degrades quickly as conditions accumulate |
How does scan-cycle behavior make onion logic fail?
Onion logic fails under real scan-cycle behavior because PLCs do not evaluate intent; they evaluate instructions in order, one scan at a time, using the current state of memory and inputs.
That sounds obvious, but many sequence bugs are just delayed appreciation of that fact. A sensor can drop for 50 ms. A feedback can arrive one scan later than expected. A latch can remain set because the unlatch rung was never made true under the exact sequence of events that occurred.
Typical failure mechanisms
- One-scan races: A transition bit sets and resets in adjacent logic, leaving downstream sequence logic in an indeterminate condition.
- Latched memory persistence: A sequence step remains true because the reset path depends on a permissive that disappeared during the fault.
- Rung-order dependence: Moving one rung for readability changes behavior because the logic was relying on implicit execution order rather than explicit state transitions.
- Sensor bounce or intermittent loss: A noisy or briefly lost field signal causes the machine to partially advance, then lose the conditions needed either to continue or to recover.
These are not exotic edge cases. They are ordinary plant-floor events. Hardware is under no obligation to flatter your sequence design.
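The sketch below reproduces one of these mechanisms in a few lines of Python standing in for a scan loop. It is a deliberately fragile, hypothetical nested-latch sequence; the tag names (`fill_latch`, `mix_latch`, `fill_done_pulse`, `level_ok`) are invented for illustration. A one-scan completion pulse coincides with a brief permissive loss, the advance is missed, and the first latch has no other way to clear.

```python
def onion_scan(mem: dict, inp: dict) -> None:
    """One scan of a fragile nested-latch sequence (hypothetical example)."""
    # Rung 1: latch the fill step on start, if the permissive is healthy.
    if inp["start_pb"] and inp["level_ok"]:
        mem["fill_latch"] = True
    # Rung 2: advance to mixing on the one-scan fill_done pulse,
    # but only while the permissive is true.
    if mem["fill_latch"] and inp["fill_done_pulse"] and inp["level_ok"]:
        mem["mix_latch"] = True
    # Rung 3: the only unlatch path for fill_latch depends on mix_latch.
    if mem["mix_latch"]:
        mem["fill_latch"] = False

def scan(start_pb=False, level_ok=True, fill_done_pulse=False) -> dict:
    """Build the input image for one scan."""
    return {"start_pb": start_pb, "level_ok": level_ok,
            "fill_done_pulse": fill_done_pulse}

def run(scans: list) -> dict:
    """Execute the scans in order and return the final latch memory."""
    mem = {"fill_latch": False, "mix_latch": False}
    for inp in scans:
        onion_scan(mem, inp)
    return mem

# Happy path: the pulse arrives while the permissive is healthy.
clean = [scan(start_pb=True), scan(), scan(fill_done_pulse=True), scan()]

# Same sequence, but the permissive drops on the scan that carries the pulse.
glitch = [scan(start_pb=True), scan(),
          scan(level_ok=False, fill_done_pulse=True), scan(), scan()]
```

`run(clean)` ends with `mix_latch` set and `fill_latch` cleared. `run(glitch)` ends with `fill_latch` still set and nothing left in the program that can clear it, which is the sequence lock described above.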
How do you build a finite state machine in ladder logic?
You build a finite state machine in ladder logic by separating state transition logic from output action logic. The transition routine decides when the machine changes state. The output routine decides what the machine does while in that state.
That separation is the core discipline. If transitions and outputs are mixed together everywhere, the architecture drifts back toward onion logic.
### Step 1: Define the machine states
Start by assigning mutually exclusive state values.
- `0 = Idle`
- `10 = Start_Request`
- `20 = Starting`
- `30 = Running`
- `40 = Stopping`
- `99 = Fault`
Use numbering that leaves room for insertion later. Engineers who number every state 1, 2, 3 usually meet state 2.5 eventually.
### Step 2: Define the transition conditions
Each transition should answer one narrow question:
- What condition allows entry to the next state?
- What condition blocks progress?
- What condition forces a fault or hold state?
- What condition permits reset or recovery?
Transitions should be explicit and testable. Avoid hidden dependency on side effects from unrelated rungs.
### Step 3: Write transition logic first
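Below is a compact ladder-style example for state transitions. To stay short, it uses the five-state numbering from the earlier simple example (`0 = Idle`, `10 = Starting`, `20 = Running`, `99 = Fault`) rather than the six-state list in Step 1: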
Language: Ladder Diagram - State Transition Logic

    Rung 1: Idle (0) -> Starting (10)
    EQU(Machine_State, 0) --- XIC(Start_PB) --- XIC(System_Ready) --- MOV(10, Machine_State)

    Rung 2: Starting (10) -> Running (20)
    EQU(Machine_State, 10) --- XIC(Motor_Run_Fdbk) --- MOV(20, Machine_State)

    Rung 3: Any active state -> Fault (99) on trip
    NEQ(Machine_State, 0) --- XIC(Trip_Active) --- MOV(99, Machine_State)

    Rung 4: Fault (99) -> Idle (0) after reset and safe conditions
    EQU(Machine_State, 99) --- XIC(Reset_PB) --- XIO(Trip_Active) --- XIC(System_Ready) --- MOV(0, Machine_State)
The important point is not the exact syntax. The important point is that the current state and the transition condition are visible, singular, and reviewable.
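For readers who want to exercise the four rungs without a PLC, the following Python sketch mirrors them one-for-one. It is an approximation of scan behavior, not vendor code: each `if` reads the live state value, just as each rung reads `Machine_State` after any earlier `MOV` in the same scan. `BASE_TAGS` is a hypothetical default input image.

```python
# Hypothetical default input image: nothing pressed, system healthy.
BASE_TAGS = {
    "Start_PB": False,
    "System_Ready": True,
    "Motor_Run_Fdbk": False,
    "Trip_Active": False,
    "Reset_PB": False,
}

def transition_scan(state: int, tags: dict) -> int:
    """Evaluate the four transition rungs once, top to bottom."""
    # Rung 1: Idle (0) -> Starting (10)
    if state == 0 and tags["Start_PB"] and tags["System_Ready"]:
        state = 10
    # Rung 2: Starting (10) -> Running (20)
    if state == 10 and tags["Motor_Run_Fdbk"]:
        state = 20
    # Rung 3: any non-idle state -> Fault (99) on trip
    if state != 0 and tags["Trip_Active"]:
        state = 99
    # Rung 4: Fault (99) -> Idle (0) after reset and safe conditions
    if (state == 99 and tags["Reset_PB"]
            and not tags["Trip_Active"] and tags["System_Ready"]):
        state = 0
    return state
```

One behavior worth noticing: because later rungs see the updated state, the machine can pass through two transitions in a single scan if `Motor_Run_Fdbk` is already true when Start is pressed. If one transition per scan is required, computing the next state into a separate variable and copying it at the end of the routine is one common option.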
### Step 4: Map outputs from state, not from sequence fragments
A separate routine should derive outputs from the active state.
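For example, using the same five-state numbering as the transition rungs (`20 = Running`, `30 = Stopping`, `99 = Fault`):
- If `Machine_State = 20` (Running), command `Motor_Run = 1`
- If `Machine_State = 30` (Stopping), command `Motor_Run = 0`
- If `Machine_State = 99` (Fault), de-energize non-safe outputs and assert fault indication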
This reduces scattered output coils and makes machine behavior easier to audit.
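A minimal Python sketch of that output routine follows, assuming the five-state numbering from the earlier simple example (0, 10, 20, 30, 99). `Fault_Lamp` is a hypothetical tag, commanding the motor during Starting is an assumption (the Starting state waits on run feedback), and falling back to fault outputs for an undefined state value is a design choice, not something the article prescribes.

```python
# One row per state: the complete output image for that state.
OUTPUTS_BY_STATE = {
    0:  {"Motor_Run": False, "Fault_Lamp": False},  # Idle
    10: {"Motor_Run": True,  "Fault_Lamp": False},  # Starting (assumed)
    20: {"Motor_Run": True,  "Fault_Lamp": False},  # Running
    30: {"Motor_Run": False, "Fault_Lamp": False},  # Stopping
    99: {"Motor_Run": False, "Fault_Lamp": True},   # Fault
}

# Assumption: an undefined state value is treated like a fault.
SAFE_OUTPUTS = {"Motor_Run": False, "Fault_Lamp": True}

def map_outputs(state: int) -> dict:
    """Derive every output from the active state and nothing else."""
    return dict(OUTPUTS_BY_STATE.get(state, SAFE_OUTPUTS))
```

Each output appears in exactly one place, so auditing "what drives `Motor_Run`?" means reading one table rather than searching the program for coils.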
### Step 5: Define fault behavior deliberately
A fault state should not be a vague “something went wrong” bucket. It should define:
- which outputs de-energize,
- which alarms assert,
- whether restart is automatic or manual,
- what reset conditions are required,
- and what evidence the operator or engineer can observe.
Determinism matters most when things go wrong. Normal operation flatters weak architecture.
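One way to keep those five decisions from staying implicit is to write them down as data. The Python sketch below is a hypothetical record, not a standard structure; `Fault_Lamp` is an invented tag, while `Motor_Run`, `Trip_Active`, and `System_Ready` come from the rung example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FaultPolicy:
    """Hypothetical record of what the Fault (99) state must guarantee."""
    deenergize: tuple        # outputs forced off while faulted
    alarms: tuple            # alarm tags asserted while faulted
    auto_restart: bool       # False means a manual reset is required
    reset_conditions: tuple  # (tag, required_value) pairs checked before reset
    evidence: tuple          # tags an operator or engineer can observe

MOTOR_FAULT_POLICY = FaultPolicy(
    deenergize=("Motor_Run",),
    alarms=("Fault_Lamp",),
    auto_restart=False,
    reset_conditions=(("Trip_Active", False), ("System_Ready", True)),
    evidence=("Machine_State", "Trip_Active", "Fault_Lamp"),
)

def reset_allowed(policy: FaultPolicy, tags: dict, reset_pb: bool) -> bool:
    """Return True only when every documented reset condition is met."""
    conditions_met = all(tags[tag] == required
                         for tag, required in policy.reset_conditions)
    operator_ok = policy.auto_restart or reset_pb
    return conditions_met and operator_ok
```

With `auto_restart=False`, clearing the trip alone never leaves the fault state; the operator's reset is part of the contract and can be tested as such.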
What does “Simulation-Ready” mean for PLC state-machine work?
“Simulation-Ready” means the engineer can prove, observe, diagnose, and harden control logic against realistic process behavior in a risk-contained environment before that logic reaches a live process.
This is an operational definition, not a compliment. It does not mean “familiar with ladder syntax,” “ready for unsupervised commissioning,” or “automatically employable.” It means the engineer can subject a sequence to abnormal conditions, inspect the resulting behavior, and revise the logic based on evidence.
Observable behaviors of a Simulation-Ready engineer
A Simulation-Ready engineer can:
- force and monitor discrete and analog I/O;
- compare expected sequence behavior against observed machine response;
- inject a fault, such as intermittent sensor loss or missing feedback;
- verify that the logic transitions to a safe, defined state;
- revise transition permissives or fault handling;
- rerun the scenario and demonstrate improved deterministic behavior.
That is the difference between writing code and validating control behavior. The gap is expensive.
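The inject, observe, and rerun loop can be rehearsed in miniature with plain Python. The sketch below is a stand-in for a simulator, not an OLLA Lab API: the three-state machine, the `Sensor_OK` tag, and the 10 ms scan time are all assumptions chosen to make a 100 ms sensor loss span ten scans.

```python
SCAN_MS = 10  # assumed scan time, for illustration only

def scan(state: int, tags: dict) -> int:
    """Minimal hypothetical machine: 0 = Idle, 20 = Running, 99 = Fault."""
    if state == 0 and tags["Start_PB"] and tags["Sensor_OK"]:
        state = 20
    if state == 20 and not tags["Sensor_OK"]:
        state = 99  # sensor loss while running routes to a defined fault state
    if state == 99 and tags["Reset_PB"] and tags["Sensor_OK"]:
        state = 0
    return state

def run_scenario(events: dict, total_ms: int) -> list:
    """Apply timed tag changes at the start of each scan and record (time, state)."""
    tags = {"Start_PB": False, "Reset_PB": False, "Sensor_OK": True}
    state, trace = 0, []
    for t in range(0, total_ms, SCAN_MS):
        tags.update(events.get(t, {}))
        state = scan(state, tags)
        trace.append((t, state))
    return trace

EVENTS = {
    0:   {"Start_PB": True},
    10:  {"Start_PB": False},
    50:  {"Sensor_OK": False},  # inject a 100 ms sensor loss
    150: {"Sensor_OK": True},
    200: {"Reset_PB": True},
    210: {"Reset_PB": False},
}

trace = dict(run_scenario(EVENTS, 250))
```

The trace is the evidence: the state is 20 before the loss, 99 on the scan where the sensor drops, still 99 after the sensor returns, and 0 only after the reset. If a revision changes any of those checkpoints, the rerun shows it.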
How does OLLA Lab simulate state-transition failures?
OLLA Lab simulates state-transition failures by giving engineers a browser-based ladder environment, simulation controls, visible variables and I/O, and scenario-based digital twins that allow fault injection without live equipment risk.
This is where the product becomes operationally useful. The value is not that it draws ladder logic in a browser. Many tools can draw. The value is that the logic can be exercised against simulated equipment behavior and abnormal conditions in a contained environment.
Relevant OLLA Lab capabilities for state-machine validation
- Web-based ladder logic editor: Engineers can build state transitions using contacts, coils, timers, counters, comparators, math, logic, and PID-related instructions.
- Simulation mode: Logic can be run, stopped, and observed without physical hardware.
- Variables panel and I/O visibility: Tags, inputs, outputs, analog values, and state variables can be monitored and adjusted directly.
- 3D / WebXR / VR simulations: Ladder logic can be compared against simulated equipment behavior in realistic machine or process contexts.
- Digital twin validation workflow: The engineer can test whether sequence logic produces the intended equipment behavior before any live deployment discussion.
- Scenario-based industrial presets: More than 50 named presets across manufacturing, water and wastewater, HVAC, chemical, pharma, warehousing, food and beverage, and utilities provide context-specific sequences, hazards, and interlocks.
A practical OLLA Lab test case
In a conveyor-jam scenario, an engineer can:
- set `Machine_State = 20` to represent Running;
- observe the digital twin conveyor operating;
- drop a permissive or feedback input mid-sequence using the variables panel;
- verify whether the state transitions to `99 = Fault` or hangs in an inconsistent condition;
- revise the transition logic;
- rerun the scenario to confirm deterministic recovery.
That is a credible rehearsal task because it mirrors a real commissioning question: what happens when the machine does not get the clean signal sequence the original author assumed?
How should engineers document state-machine skill as engineering evidence?
Engineers should document state-machine skill as a compact body of engineering evidence, not as a screenshot gallery.
A screenshot proves that a screen existed. It does not prove that the logic was correct, tested, or revised after failure.
Required evidence structure
Use this six-part structure:
- System Description: Define the machine or process, its purpose, and its sequence boundaries.
- Operational definition of “correct”: State what successful behavior means in observable terms: start conditions, outputs, transitions, alarms, safe stop behavior, and reset requirements.
- Ladder logic and simulated equipment state: Show the state architecture, key tags, and the corresponding simulated equipment response.
- The injected fault case: Identify the abnormal condition introduced: sensor dropout, failed feedback, analog excursion, timeout, jam, E-stop chain interruption, or similar.
- The revision made: Document the exact logic change: added timeout, revised permissive, explicit fault transition, debounce handling, output remap, or reset condition.
- Lessons learned: Explain what the fault revealed about the original architecture and why the revision improved determinism or recoverability.
This structure is useful because it makes the engineering reasoning inspectable. Employers and senior reviewers do not need a portfolio of pretty interfaces. They need evidence that you can think through failure.
Which standards and literature support this architecture?
Explicit state architecture is supported indirectly by established control-software and safety-engineering principles, even where the standards do not prescribe one exact ladder pattern.
Relevant standards and technical grounding
- IEC 61131-3: Supports structured programming approaches for industrial-control software and clear organization of logic, functions, and execution behavior.
- IEC 61508: Emphasizes systematic integrity, fault response, and the importance of reducing dangerous design ambiguity in safety-related systems.
- exida guidance and functional safety practice: Reinforce the need for deterministic behavior, traceability, and testable fault handling in control-system design.
- Digital twin and simulation literature: Recent industrial literature consistently supports simulation as a means to validate behavior, reduce commissioning risk, and expose sequence faults before deployment, while also warning that simulation quality depends on model fidelity and test design.
- Immersive and interactive training literature: Research in engineering training environments suggests that interactive simulation can improve procedural understanding and fault recognition, particularly when learners can test cause-and-effect rather than only read static examples.
The bounded conclusion is straightforward: simulation does not replace field experience, but it is a defensible place to rehearse failure logic that would be unsafe, expensive, or impractical to test on a live process.
When should you still be cautious with state machines?
State machines are not magic. They can still be poorly designed, overcomplicated, or implemented without enough attention to safe-state behavior, mode handling, or operator recovery.
Common state-machine mistakes
- too many states without clear hierarchy;
- unclear distinction between mode, state, and fault;
- transitions triggered by noisy signals without debounce or validation;
- fault states that do not define output behavior;
- recovery paths that allow unsafe or unintended restart;
- output logic that quietly reintroduces scattered dependencies.
A bad state machine is still easier to diagnose than bad onion logic, but that is not the same as saying it is good. Architecture improves your odds; it does not suspend engineering discipline.
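For the noisy-signal mistake in particular, a scan-count debounce is one simple validation step. The Python sketch below is illustrative only; the class name and the three-scan threshold are assumptions, and in ladder the same effect is usually built with on-delay and off-delay timers.

```python
class Debounce:
    """Accept a new input value only after it holds for n consecutive scans."""

    def __init__(self, scans_required: int, initial: bool = False) -> None:
        self.scans_required = scans_required
        self.stable = initial   # last validated value
        self._count = 0         # consecutive scans disagreeing with stable

    def update(self, raw: bool) -> bool:
        """Call once per scan with the raw input; returns the validated value."""
        if raw == self.stable:
            self._count = 0
        else:
            self._count += 1
            if self._count >= self.scans_required:
                self.stable = raw
                self._count = 0
        return self.stable

# A permissive that flickers for one scan, then genuinely drops.
raw_signal = [True, False, True, False, False, False, True]
sensor = Debounce(scans_required=3, initial=True)
validated = [sensor.update(value) for value in raw_signal]
```

The single-scan flicker is ignored, and the three-scan loss is accepted. The cost is latency: a real loss is also reported three scans late, so the threshold has to be chosen against how quickly the process needs the fault state, not against how quiet the trend looks.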
What is the practical takeaway for PLC engineers?
The practical takeaway is that explicit state machines are usually the better architecture when sequence clarity, fault recovery, and commissioning visibility matter.
If the machine has multiple phases, interlocks, abnormal conditions, or recovery requirements, a single authoritative state model will generally outperform layered latch logic in maintainability and diagnosability. That is especially true for junior and intermediate engineers who need an architecture they can reason about under pressure.
OLLA Lab fits into this workflow as a bounded validation environment. It allows engineers to build ladder logic, observe I/O, compare state against simulated equipment behavior, inject faults, and revise the sequence before any live deployment context exists. That is a serious use case. No fireworks required.
Related Resources
- For a deeper breakdown of latch behavior, review “Seal-In” vs. “Latch”: Why Professional Engineers Choose Carefully.
- To see this architecture applied to a continuous process, follow Step-by-Step Build: The Automated Mixer State Machine.
- Mastering state architecture is a core competency in our Ladder Logic Mastery Hub.
- Stop guessing how your logic will handle a sensor failure. Open the Conveyor Jam Scenario in OLLA Lab.