How to Program Latch and First-Out Alarms for Intermittent Signal Loss

Learn how to capture transient PLC faults with latch logic and preserve the initiating cause with First-Out alarms, then validate the sequence in OLLA Lab using a square-wave input test.

Direct answer

To diagnose intermittent signal loss in PLC systems, engineers typically need two things: a latch mechanism that captures a transient fault and First-Out alarm logic that preserves the initiating cause during a cascade. In OLLA Lab, a square-wave input test can be used to validate that behavior safely before live commissioning.

Intermittent faults are often not "mystery trips." They are fast trips. A loose sensor lead, fretted terminal, or bouncing contact can change state quickly enough for the PLC to react and shut the process down, while the HMI never displays a stable active alarm.

During internal stress testing in OLLA Lab, applying a 10 Hz square-wave disturbance to an unlatched motor permissive produced 600 on/off cycles per minute, with two state changes per cycle. [Methodology: 1 digital input test case / unlatched permissive baseline / single 60-second run] This supports one narrow point: transient faults can cycle far faster than operators can reliably observe on a screen. It does not prove field failure rates or plant-wide alarm performance.

A Simulation-Ready engineer is not merely someone who can draw ladder syntax. It is someone who can prove, observe, diagnose, and harden logic against realistic abnormal behavior before it reaches a live process. That distinction matters. Syntax is cheap; commissioning mistakes are not.

What causes the "vibration bug" in industrial control systems?

The "vibration bug" is usually an intermittent electrical discontinuity, not a software ghost. Common causes include contact fretting, loose terminations, degraded cables, connector wear, and mechanical switch bounce under impact or vibration.

The control consequence is straightforward. A digital input that should remain stable begins fluttering between `True` and `False`. If that input is part of a permissive, trip, proof, or run feedback chain, the PLC can react within a single scan cycle even when the disturbance is too brief for an HMI refresh, or an operator watching the screen, to register it.

This timing mismatch is the real problem. PLC scan times are commonly in the millisecond range, while HMI updates are slower and often further delayed by communications, polling intervals, and display logic. The machine stops. The alarm clears. Operations calls it random. It usually is not.
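The mismatch can be illustrated with a minimal Python sketch. It is not OLLA Lab code: the 10 ms scan interval and the 1-second HMI poll are assumed values chosen for illustration, and `square_wave` and `count_observed_changes` are hypothetical helper names.

```python
def square_wave(t_ms: int, freq_hz: int) -> bool:
    """Ideal 50% duty square wave: True during the first half of each period."""
    period_ms = 1000 // freq_hz
    return (t_ms % period_ms) < period_ms // 2


def count_observed_changes(sample_interval_ms: int, duration_ms: int, freq_hz: int) -> int:
    """Count the state changes seen by an observer sampling at a fixed interval."""
    changes = 0
    previous = square_wave(0, freq_hz)
    for t_ms in range(sample_interval_ms, duration_ms, sample_interval_ms):
        current = square_wave(t_ms, freq_hz)
        if current != previous:
            changes += 1
        previous = current
    return changes


# Assumed observers: a PLC scanning every 10 ms and an HMI polling once per second.
plc_changes = count_observed_changes(sample_interval_ms=10, duration_ms=60_000, freq_hz=10)
hmi_changes = count_observed_changes(sample_interval_ms=1000, duration_ms=60_000, freq_hz=10)

print(plc_changes, hmi_changes)
```

Under these assumptions the fast observer sees over a thousand transitions in 60 seconds, while the slow one always samples the same phase of the wave and sees none. Real HMI behavior depends on the driver and polling configuration, so treat the numbers as a demonstration of aliasing rather than a prediction.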

Common sources of intermittent signal loss

  • Fretting corrosion on terminal blocks or relay contacts
  • Loose field wiring at marshalling panels or device terminals
  • Degraded M12 or similar connectors on high-vibration equipment
  • Limit switch bounce during impact or poor mechanical alignment
  • Cable fatigue near moving equipment, hinges, or drag chains
  • Sensor power interruptions caused by marginal supply or grounding faults

A useful correction here: not every flickering input needs debounce first. If the signal represents a true loss of permissive or proof, masking it too early can hide the initiating fault. Noise filtering and fault capture are related problems, but they are not the same problem.

How do you use a square wave to simulate a loose wire?

A square wave is the correct test pattern for discrete intermittent fault injection because it forces deterministic boolean toggling. In practical terms, it behaves like a wire or contact repeatedly making and breaking connection.

In OLLA Lab, the square-wave signal can be bound to a digital input and used to stress the logic path that depends on that input. This is where the platform becomes operationally useful: you are no longer asking whether the rung looks reasonable, but whether the control sequence actually traps a transient fault, preserves the initiating cause, and drives the process to a safe state.

Waveform application in fault testing

| Waveform | Best use | Engineering purpose |
|---|---|---|
| Sine Wave | Analog drift or cyclic process variation | Observe gradual value change and threshold behavior |
| Sawtooth | Command ramps or tracking behavior | Test analog following and reset patterns |
| Square Wave | Discrete boolean toggling | Simulate loose wires, bouncing contacts, or intermittent proofs |

The point is not waveform variety for its own sake. The point is matching the fault model to the failure mode. A loose wire is not a sine wave.

How do you program a basic latch circuit to capture transient faults?

A latch circuit is used to preserve evidence of a fault after the initiating condition disappears. Without that retention, the PLC may trip correctly but leave no durable indication of what happened.

There are two common ladder approaches: a seal-in pattern and explicit latch/unlatch instructions. Both can be valid, but they are not interchangeable.

Seal-In vs. OTL/OTU

Standard Seal-In (self-holding): uses the output's own contact in a parallel branch to hold the state after the initiating condition occurs.

  • Often appropriate for non-retentive control behavior
  • Typically drops out on power loss
  • Common in motor control and straightforward alarm hold circuits

Latch/Unlatch (OTL/OTU or equivalent retentive instructions): uses explicit retentive memory behavior.

  • Bit remains `True` until a deliberate reset occurs
  • Holds its state regardless of later rung conditions, rather than relying on a seal-in branch staying true
  • Requires disciplined reset design and operator acknowledgment logic

The engineering choice depends on what must be remembered, for how long, and across what system state changes. Retention is useful; accidental retention is a nuisance with paperwork attached.
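The difference can be sketched in Python as a rough model of the two patterns. The `FaultMemory` class and its `power_cycle` method are hypothetical, and the restart behavior shown (non-retentive coil cleared, latch bit kept) is an assumption that varies by controller and memory configuration.

```python
from dataclasses import dataclass


@dataclass
class FaultMemory:
    seal_in: bool = False   # ordinary coil held by its own contact
    latched: bool = False   # retentive latch/unlatch bit

    def scan(self, trigger: bool, reset: bool) -> None:
        # Seal-in rung: (trigger OR own contact) AND NOT reset drives the coil.
        self.seal_in = (trigger or self.seal_in) and not reset
        # Latch rung, followed by the unlatch rung.
        if trigger:
            self.latched = True
        if reset:
            self.latched = False

    def power_cycle(self) -> None:
        # Assumption: the non-retentive coil is cleared on restart,
        # while the retentive bit keeps its last state.
        self.seal_in = False


mem = FaultMemory()
mem.scan(trigger=True, reset=False)    # transient fault appears for one scan
mem.scan(trigger=False, reset=False)   # fault condition disappears
held = (mem.seal_in, mem.latched)      # both patterns still hold the fault

mem.power_cycle()
mem.scan(trigger=False, reset=False)
after_restart = (mem.seal_in, mem.latched)  # only the retentive bit survives
```

Both patterns hold the fault after the trigger clears. They differ in what they remember across a restart, which is the question the design has to answer deliberately.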

### Example: basic latched fault capture

Objective: Capture a brief loss of `Motor_Run_Proof` so the alarm remains visible after the signal recovers.

| Motor_Commanded_On | /Motor_Run_Proof |--------------------(OTL Fault_Motor_Run_Proof_Latched) |

| Reset_Faults_PB |-----------------------------------------(OTU Fault_Motor_Run_Proof_Latched) |

Operational meaning:

  • If the motor is commanded on and run proof is lost, latch the fault bit.
  • The fault remains active even if run proof returns.
  • A deliberate reset is required after diagnosis.

That is the minimum structure needed to trap a transient. It is not yet First-Out logic.
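The two rungs above can be approximated scan by scan in Python. This is a sketch of the rung semantics only, not vendor code: `scan_latch` is a hypothetical helper, and the sample sequence is invented to show a one-scan dropout.

```python
def scan_latch(latched: bool, commanded_on: bool, run_proof: bool, reset_pb: bool) -> bool:
    """One scan of the two rungs: the OTL rung first, then the OTU rung."""
    if commanded_on and not run_proof:
        latched = True    # OTL Fault_Motor_Run_Proof_Latched
    if reset_pb:
        latched = False   # OTU Fault_Motor_Run_Proof_Latched
    return latched


# Run proof drops out for a single scan (index 3), then recovers.
run_proof_samples = [True, True, True, False, True, True]

fault = False
history = []
for run_proof in run_proof_samples:
    fault = scan_latch(fault, commanded_on=True, run_proof=run_proof, reset_pb=False)
    history.append(fault)

# The fault bit stays set after the proof signal returns.
print(history)

# A deliberate reset clears it once the proof signal is healthy again.
fault_after_reset = scan_latch(fault, commanded_on=True, run_proof=True, reset_pb=True)
```

Because the unlatch rung is evaluated last, a held reset button wins over an active fault within the same scan. Whether that is acceptable is a reset-design decision, not a property of the latch itself.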

What is First-Out alarm logic and how does ISA-18.2 support it?

First-Out alarm logic preserves the initiating alarm when one fault triggers a cascade of secondary alarms. In practice, it answers the only question operators and technicians actually need first: what happened first?

ISA-18.2 is an alarm management standard for the process industries. While implementations vary by system and philosophy, the standard's alarm rationalization principles strongly support alarm designs that prevent alarm floods and preserve meaningful operator response. First-Out logic is a common and defensible method for doing exactly that during cascade failures.

Here is the failure pattern. A vibration-induced intermittent trip causes the main motor to stop. Once the motor stops:

  • flow may fall,
  • pressure may collapse,
  • temperature control may drift,
  • downstream permissives may drop out.

If every resulting condition alarms equally, the initiating cause is buried under consequences. That is not better visibility. It is just louder confusion.

Why First-Out matters during cascade failures

  • It preserves the initiating event for diagnosis
  • It suppresses misleading alarm floods
  • It improves operator response quality
  • It supports post-event troubleshooting and alarm review
  • It aligns with rational alarm design principles under ISA-18.2-style governance

Standard First-Out rung concept

A common pattern is to set a global `First_Out_Active` bit when the first qualifying alarm occurs, then block subsequent candidates from claiming first position.

// Candidate 1: Vibration / intermittent proof fault

| /First_Out_Active | Fault_Vibration_Latched |---------(OTL FirstOut_Vibration) |

| /First_Out_Active | Fault_Vibration_Latched |---------(OTL First_Out_Active) |

// Candidate 2: Low flow consequence

| /First_Out_Active | Alarm_Low_Flow |-----------------(OTL FirstOut_Low_Flow) |

| /First_Out_Active | Alarm_Low_Flow |-----------------(OTL First_Out_Active) |

// Candidate 3: Low pressure consequence

| /First_Out_Active | Alarm_Low_Pressure |-------------(OTL FirstOut_Low_Pressure) |

| /First_Out_Active | Alarm_Low_Pressure |-------------(OTL First_Out_Active) |

// Reset

| Reset_First_Out_PB |----------------------------------(OTU First_Out_Active) |

| Reset_First_Out_PB |----------------------------------(OTU FirstOut_Vibration) |

| Reset_First_Out_PB |----------------------------------(OTU FirstOut_Low_Flow) |

| Reset_First_Out_PB |----------------------------------(OTU FirstOut_Low_Pressure) |

Engineering intent:

  1. The first valid alarm sets its own First-Out bit.
  2. The same event sets `First_Out_Active`.
  3. Once `First_Out_Active` is set, later alarms cannot overwrite the initiating cause.
  4. Reset occurs only through a deliberate diagnostic workflow.

Image alt text: Screenshot of the OLLA Lab Variables Panel showing a First-Out alarm sequence. The vibration sensor bit is latched true, while subsequent low-flow and low-pressure alarms are blocked from activating the primary HMI alert.

A practical note: First-Out logic should be designed with clear qualification rules. If you allow nuisance alarms, stale bits, or poorly reset state into the candidate pool, the sequence will faithfully preserve the wrong answer.
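The First-Out rung pattern can be modeled in a few lines of Python to check the blocking behavior. This is a sketch under assumptions: `new_state` and `scan_first_out` are hypothetical helpers, the candidate names are shortened from the rung tags, and rung order is treated as the tie-break when two candidates appear in the same scan.

```python
CANDIDATES = ["Vibration", "Low_Flow", "Low_Pressure"]  # rung order doubles as tie-break order


def new_state() -> dict:
    """All First-Out bits cleared, as after a reset."""
    state = {"First_Out_Active": False}
    for name in CANDIDATES:
        state["FirstOut_" + name] = False
    return state


def scan_first_out(state: dict, alarms: dict, reset_pb: bool = False) -> dict:
    """One scan: candidate rungs in order, then the reset rungs."""
    for name in CANDIDATES:
        if alarms[name] and not state["First_Out_Active"]:
            state["FirstOut_" + name] = True   # OTL FirstOut_<name>
            state["First_Out_Active"] = True   # OTL First_Out_Active
    if reset_pb:
        for key in state:
            state[key] = False                 # OTU on every First-Out bit
    return state


state = new_state()
# Scan 1: only the vibration fault is present.
scan_first_out(state, {"Vibration": True, "Low_Flow": False, "Low_Pressure": False})
# Scan 2: the consequences arrive, but the first position is already claimed.
scan_first_out(state, {"Vibration": True, "Low_Flow": True, "Low_Pressure": True})

winner = [key for key, value in state.items() if value and key != "First_Out_Active"]
print(winner)
```

One limitation is visible in the model: alarms that become true within the same scan cannot be ordered by time, so the earlier rung wins. If that ambiguity matters, the candidate order should be chosen deliberately rather than by accident.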

How does Ampergon Vallis validate First-Out sequences?

Ampergon Vallis validates First-Out behavior by injecting a controlled intermittent fault into a simulated input path and observing whether the logic captures the initiating event, suppresses downstream alarm clutter, and places the process in a safe state. That is the bounded claim.

In OLLA Lab, "digital twin validation" should be understood operationally, not romantically. It means comparing ladder state, I/O state, and simulated equipment response under a defined fault case to verify that the control philosophy behaves as intended before live deployment.

Typical OLLA Lab workflow for intermittent fault validation

  1. Bind the signal source: Assign the OLLA Lab square-wave generator to a target digital input such as `DI_03_Vibration_Sw` or `Motor_Run_Proof`.
  2. Run the simulated process: Start the relevant 3D or web-based equipment scenario and establish normal operating state.
  3. Inject the intermittent fault: Trigger the square wave at the selected frequency to create repeatable `True/False` toggling.
  4. Observe the Variables Panel: Confirm the initiating fault bit latches, the First-Out bit claims correctly, and secondary alarms are blocked from taking first position.
  5. Verify safe-state behavior: Check that outputs, permissives, and sequence state move to the intended safe condition.
  6. Reset and retest: Clear the sequence deliberately, then rerun the disturbance to confirm repeatability.

This workflow is useful because it rehearses a class of commissioning task that is awkward, risky, or physically abusive to reproduce on real equipment. Intentionally chattering a live field device is a poor way to make friends with maintenance.

What should "correct" behavior look like during an intermittent fault test?

Correct behavior must be defined before the test starts. Otherwise, engineers end up admiring animation while learning very little.

For a latched First-Out design, "correct" usually means:

  • the initiating fault is captured even if the signal recovers,
  • the process transitions to a defined safe state,
  • secondary alarms may still exist internally but do not overwrite the First-Out indication,
  • reset is deliberate and controlled,
  • the sequence is repeatable across multiple test runs.
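These criteria can be written down as explicit checks before the test runs. The sketch below assumes a hypothetical trace format, one dictionary per scan with the fields `input_ok`, `fault_latched`, `motor_output`, and `first_out`. It is not an OLLA Lab export format, and the trace is expected to end before the reset is pressed.

```python
def check_trace(trace: list) -> dict:
    """Evaluate per-scan records from the start of a test up to, but not including, reset."""
    first = next((i for i, rec in enumerate(trace) if rec["fault_latched"]), None)
    if first is None:
        return {"fault_captured": False}
    after = trace[first:]
    recovered = [rec for rec in after if rec["input_ok"]]
    return {
        "fault_captured": True,
        # The latch must still be set on scans where the input has recovered.
        "latch_held_after_recovery": bool(recovered) and all(r["fault_latched"] for r in recovered),
        # The commanded output must be off by the end of the trace.
        "safe_state_reached": not after[-1]["motor_output"],
        # Only one First-Out value may appear once the fault is captured.
        "first_out_stable": len({rec["first_out"] for rec in after}) == 1,
    }


# Illustrative trace: a one-scan dropout followed by recovery.
good_trace = [
    {"input_ok": True,  "fault_latched": False, "motor_output": True,  "first_out": None},
    {"input_ok": False, "fault_latched": True,  "motor_output": False, "first_out": "Vibration"},
    {"input_ok": True,  "fault_latched": True,  "motor_output": False, "first_out": "Vibration"},
    {"input_ok": True,  "fault_latched": True,  "motor_output": False, "first_out": "Vibration"},
]

results = check_trace(good_trace)
all_passed = all(results.values())
```

Running the same checks over several recorded runs is one simple way to test repeatability: every run should return the same set of passing results.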

This is the practical meaning of Simulation-Ready in a controls context: the engineer can state the expected behavior, inject the fault, observe the result, and revise the logic if the behavior does not match the control philosophy.

A compact engineering evidence package

When documenting this work, build evidence rather than a screenshot gallery. Use this structure:

  1. System description: Define the machine or process segment, the relevant I/O, and the control objective.
  2. Operational definition of "correct": State exactly what the logic should do during fault, alarm, safe-state transition, and reset.
  3. Ladder logic and simulated equipment state: Show the rung logic together with the equipment state or sequence state it governs.
  4. The injected fault case: Record the input disturbed, waveform used, frequency, duration, and expected consequence chain.
  5. The revision made: Document what changed after testing: latch placement, reset conditions, alarm qualification, or First-Out blocking logic.
  6. Lessons learned: Capture the engineering takeaway, especially where the original logic failed or produced ambiguous operator information.
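One way to keep the package consistent between test runs is to record it as structured data. The sketch below uses hypothetical class names, `FaultCase` and `EvidencePackage`, and the field values are placeholders drawn from the examples in this article, not results from a real project.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class FaultCase:
    input_tag: str
    waveform: str
    frequency_hz: float
    duration_s: float
    expected_consequences: list


@dataclass
class EvidencePackage:
    system_description: str
    definition_of_correct: str
    logic_and_equipment_state: str
    fault_case: FaultCase
    revision_made: str
    lessons_learned: str


# Placeholder content for illustration only.
package = EvidencePackage(
    system_description="Motor with run-proof feedback on a digital input",
    definition_of_correct="Fault latches, First-Out holds the initiating cause, reset is deliberate",
    logic_and_equipment_state="Latch rung and First-Out rungs, shown with motor state",
    fault_case=FaultCase(
        input_tag="Motor_Run_Proof",
        waveform="square",
        frequency_hz=10.0,
        duration_s=60.0,
        expected_consequences=["low flow", "low pressure"],
    ),
    revision_made="(fill in after testing)",
    lessons_learned="(fill in after testing)",
)

# asdict converts nested dataclasses too, so the record serializes directly.
report = json.dumps(asdict(package), indent=2)
print(report)
```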

That format is more persuasive than a polished dashboard image because it shows reasoning, fault handling, and revision discipline. Employers and reviewers generally prefer evidence of judgment over evidence of mouse clicks.

When should you use debounce logic instead of latch-plus-First-Out logic?

Debounce logic is appropriate when the signal is noisy but not representing a meaningful abnormal condition that must be captured immediately. Latch-plus-First-Out logic is appropriate when a transient indicates a real fault, trip, or loss of proof that must be preserved for diagnosis.

The distinction is simple:

  • Debounce asks: "Should I ignore this brief instability?"
  • Latch + First-Out asks: "If this instability is real, how do I preserve the initiating cause?"

Many systems need both, but in the right order and with the right intent. Filtering a nuisance input can improve robustness. Filtering away the only evidence of a dangerous or production-critical event is a different achievement.
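The contrast is easy to see when both treatments are applied to the same raw signal. In this Python sketch, `debounce` and `latch` are hypothetical helpers, the debounce is a simple consecutive-scan filter, and the three-scan threshold is an arbitrary assumption.

```python
def debounce(samples: list, n_scans: int) -> list:
    """Filtered fault is True only once the raw fault has persisted for n_scans consecutive scans."""
    out, count = [], 0
    for raw_fault in samples:
        count = count + 1 if raw_fault else 0
        out.append(count >= n_scans)
    return out


def latch(samples: list) -> list:
    """Latched fault is set by the first True sample and held until an external reset."""
    out, latched = [], False
    for raw_fault in samples:
        latched = latched or raw_fault
        out.append(latched)
    return out


# Raw fault signal: a one-scan glitch at index 1, then a sustained fault from index 4.
raw = [False, True, False, False, True, True, True, False]

debounced = debounce(raw, n_scans=3)
latched = latch(raw)

print(debounced)
print(latched)
```

The debounce output ignores the one-scan glitch and drops out again when the sustained fault clears, so by itself it leaves no lasting record. The latch records the very first glitch. Which of those outcomes is correct depends on whether that first glitch was noise or evidence.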

What standards and technical literature support this approach?

The approach is supported by established alarm management, functional safety, and simulation literature, though each source governs a different part of the problem.

Relevant standards and literature

  • ISA-18.2 supports disciplined alarm philosophy, rationalization, prioritization, and management of alarm floods in process environments.
  • IEC 61508 provides the broader functional safety framework for safety-related electrical, electronic, and programmable systems. It does not prescribe your exact First-Out rung, but it does reinforce the need for deterministic behavior, validation, and lifecycle discipline.
  • exida guidance and industry practice consistently emphasize proof of behavior, abnormal condition handling, and validation before deployment in safety-relevant contexts.
  • Digital twin and simulation literature in industrial engineering and control domains supports simulation as a valid environment for testing control responses, operator interaction, and fault scenarios before live operation.

The narrow claim here is defensible: simulation is a credible place to validate alarm and fault-handling logic before commissioning. The broader claim that simulation alone confers site competence would be unserious.

Conclusion

Intermittent signal loss is difficult to diagnose because the fault can disappear before the operator ever sees it. The engineering answer is not guesswork. It is to trap the transient with a latch, preserve the initiating cause with First-Out logic, and validate the sequence against a realistic fault injection before the code reaches live equipment.

That is where OLLA Lab fits credibly. It is a web-based ladder logic and digital twin simulator that lets engineers build logic, run simulation, monitor I/O and variables, and inject repeatable fault behavior such as square-wave boolean toggling to test whether the sequence actually holds up. Used properly, it is a rehearsal environment for commissioning judgment, not a substitute for it.

Editorial transparency

This blog post was written by a human, with all core structure, content, and original ideas created by the author. However, this post includes text refined with the assistance of ChatGPT and Gemini. AI support was used exclusively for correcting grammar and syntax, and for translating the original English text into Spanish, French, Estonian, Chinese, Russian, Portuguese, German, and Italian. The final content was critically reviewed, edited, and validated by the author, who retains full responsibility for its accuracy.

About the Author: PhD. Jose NERI, Lead Engineer at Ampergon Vallis

Fact-Check: Technical validity confirmed on 2026-03-23 by the Ampergon Vallis Lab QA Team.

© 2026 Ampergon Vallis. All rights reserved.