What is Failure Analysis? 

Failure Analysis (FA) is the process of collecting and analyzing data to determine the cause of a failure.
It is vital in identifying liabilities, improving design, and correcting malfunctions.

Failure Analysis Process

Failure Analysis Challenges

Assigning Priorities

In the example to the right, the graph shows how many failures of each type have been recorded across all customers using the same model of automated bottle filler. Each fault may take a significant amount of effort to resolve. In some cases, it will be impossible to reproduce the failure. In others, the time between failures may be so long that the issue cannot be resolved in a reasonable amount of time. As a result, if the cost of performing failure analysis exceeds the value generated by identifying the root cause, that specific quality issue may never be addressed, even though the failures have a clear impact on the perceived value of the product.
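The cost-versus-value trade-off described above can be sketched as a simple ranking. The following Python example is only an illustration: the fault names, counts, and dollar figures are hypothetical placeholders, not values from the graph, and a real model would also need to weigh reproducibility and time between failures.

```python
from dataclasses import dataclass


@dataclass
class FaultType:
    name: str
    occurrences: int          # failures recorded across all customers
    cost_per_failure: float   # assumed cost of each occurrence (service, downtime)
    fa_cost: float            # assumed cost of performing failure analysis

    @property
    def net_value(self) -> float:
        # Value of eliminating the fault minus the cost of finding its root cause.
        return self.occurrences * self.cost_per_failure - self.fa_cost


def prioritize(faults):
    """Return only the faults worth analyzing, highest net value first."""
    worthwhile = [f for f in faults if f.net_value > 0]
    return sorted(worthwhile, key=lambda f: f.net_value, reverse=True)


# Hypothetical example data, for illustration only.
faults = [
    FaultType("bottle jam", occurrences=40, cost_per_failure=200.0, fa_cost=3000.0),
    FaultType("nozzle drip", occurrences=5, cost_per_failure=100.0, fa_cost=4000.0),
    FaultType("conveyor stall", occurrences=12, cost_per_failure=900.0, fa_cost=2500.0),
]

for fault in prioritize(faults):
    print(f"{fault.name}: net value {fault.net_value:.0f}")
```

In this made-up data set the "nozzle drip" fault drops off the list entirely, which mirrors the situation where a quality issue is never addressed because analyzing it costs more than it returns.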



Key information FA engineers wish they had: 

Automated Bottle Filler


True Time of Use

  • How long was this machine powered but idle?
  • How long was this machine operating?
  • How hard was the machine working, at capacity or a lesser level?


Warranty Limits

  • Was the product misused?
  • Was the product damaged due to improper operation?
  • Was the product serviced and maintained correctly?


The Operating Environment

  • Was this machine operated at extremely high or low temperatures?
  • Did this machine experience excessive vibration?
  • Was the humidity extremely high or low?


Conditions Before Failure

  • Was the machine vibrating?
  • Was the machine making an unusual noise?
  • Was it producing product in spec and at the proper speed?

Questions on how to leverage data to improve the quality of your products?

Download our whitepaper on leveraging the data in your machines. Learn more here:

Leveraging IIoT Data Whitepaper

A Strategy for Deploying a Historical Data Failure Analysis Support System

Step 1: Organize Data Inventory

Many devices in a system have multiple status registers that are not used by the control solution. For example, many microcontrollers have an embedded temperature sensor. A Modbus TCP enabled device typically uses a microcontroller, and the value of its temperature sensor is often mapped to a register on that device. This data can be captured by the PLC and pushed to data storage in the cloud. Devices like the eWON Flexy can transport the data if this function is beyond the capability of the PLC. The Flexy offers data access at almost no incremental cost relative to a Cosy remote access solution. As a result, any machine with a remote access requirement can be upgraded to a remote data access solution for very little incremental cost.
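To make the idea of an unused status register concrete, the sketch below builds a standard Modbus TCP "Read Holding Registers" request and decodes the reply using only the Python standard library. The register address (100) and the sample reply bytes are hypothetical; the actual address and scaling of a temperature register depend on the device and must come from its register map.

```python
import struct

READ_HOLDING_REGISTERS = 0x03  # standard Modbus function code


def build_read_request(transaction_id, unit_id, start_address, quantity):
    """Build a Modbus TCP 'Read Holding Registers' request frame."""
    pdu = struct.pack(">BHH", READ_HOLDING_REGISTERS, start_address, quantity)
    # MBAP header: transaction id, protocol id (always 0), length, unit id.
    # The length field counts the unit id byte plus the PDU.
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


def parse_read_response(frame):
    """Return the list of 16-bit register values from a response frame."""
    function_code, byte_count = struct.unpack(">BB", frame[7:9])
    if function_code != READ_HOLDING_REGISTERS:
        raise ValueError(f"unexpected function code {function_code:#x}")
    return list(struct.unpack(f">{byte_count // 2}H", frame[9:9 + byte_count]))


# Hypothetical: the device maps its temperature sensor to register 100.
request = build_read_request(transaction_id=1, unit_id=1, start_address=100, quantity=1)

# Hypothetical reply carrying one register with the raw value 0x011F.
sample_response = bytes.fromhex("0001000000050103" "02" "011F")
registers = parse_read_response(sample_response)
```

In practice the request would be sent over a TCP socket to the device (port 502 by convention), and the PLC or gateway would forward the decoded values to storage.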

Step 2: Adding Critical Sensors

Any critical sensors that are required for failure analysis or warranty resolution but are not supported by the current system should be added. This can be done by integrating a new sensor into the machine itself and using the PLC to collect the data. Sensors can also be added as an overlay network that has no impact on the underlying functionality of the machine. The eWON Flexy is a good match for anyone trying to add sensors to a machine without disrupting the PLC subsystem.

Step 3: Monitoring Sensors

All sensors on the machine should be monitored and logged over time during the final production system test. Recording the value of a sensor over time during a known test creates a set of time series data that serves as a signature for a properly operating machine. The signature of each sensor during the test is stored in the FA database. Once a machine has been deployed to the field, any on-site testing signatures should be stored in the same location. Ideally, portions of the on-site field test are identical to portions of the final system test. The signatures can then be compared to verify that the shipping and installation process did not affect the machine.
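One simple way to compare two signatures is a root-mean-square (RMS) deviation with a pass/fail tolerance. This is a minimal sketch, not a prescribed method: it assumes both tests sample the sensor at the same points, and the sample values and tolerance below are hypothetical.

```python
import math


def rms_deviation(reference, measured):
    """Root-mean-square difference between two equally sampled signatures."""
    if len(reference) != len(measured):
        raise ValueError("signatures must have the same number of samples")
    if not reference:
        raise ValueError("signatures must not be empty")
    squared = [(r - m) ** 2 for r, m in zip(reference, measured)]
    return math.sqrt(sum(squared) / len(squared))


def matches_signature(reference, measured, tolerance):
    """True when the measured signature stays within the assumed tolerance."""
    return rms_deviation(reference, measured) <= tolerance


# Hypothetical motor-current samples (amps) from the same test step.
factory_test = [1.0, 1.5, 2.0, 2.0, 1.5, 1.0]
field_test = [1.0, 1.5, 2.1, 2.0, 1.4, 1.0]

print(matches_signature(factory_test, field_test, tolerance=0.1))
```

A single RMS figure hides where in the test the deviation occurred, so a real comparison would likely also keep the per-sample differences for the engineer to inspect.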

Step 4: Leveraging Machine Data

Once the machine is operational, its data can be streamed to the Failure Analysis database or batch uploaded on a periodic basis. A data scientist can later use this data to analyze how the machine is being used by each customer. The data is also useful for creating a predictive maintenance model.
When a machine is failing on-site or has been returned to the factory, the original production system tests can be executed on the machine. Comparing sensor signatures from prior testing can quickly shed light on the failure mechanism.
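As an illustration of usage analysis, the sketch below totals how long a machine spent in each state, which speaks to the "True Time of Use" questions raised earlier. The log format, a time-ordered list of (timestamp in seconds, state) pairs, and the state names are assumptions made for this example.

```python
from collections import defaultdict


def time_in_state(samples):
    """Sum the seconds spent in each state.

    `samples` is a time-ordered list of (timestamp_seconds, state) pairs;
    each state is assumed to hold until the next sample arrives.
    """
    totals = defaultdict(float)
    for (start, state), (end, _next_state) in zip(samples, samples[1:]):
        totals[state] += end - start
    return dict(totals)


# Hypothetical log: idle for 10 minutes, running for an hour, idle again, then off.
log = [(0, "idle"), (600, "running"), (4200, "idle"), (4800, "off")]
usage = time_in_state(log)

for state, seconds in usage.items():
    print(f"{state}: {seconds / 60:.0f} min")
```

The final sample only marks the end of the previous interval, so the last state in the log accumulates no time until another sample follows it.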

Step 5: Machine Failure Segmentation

When machine failures occur, they are segmented into two major buckets: those that require service (a broken part) and those that can be resolved by the machine operator (jams or positioning issues). In both cases, stored time series operation data can be used to help determine the cause of the malfunction. This data is particularly useful when the flaw is intermittent and resolvable by the machine operator. Without this data, service personnel must witness a failure, which can take a significant amount of time for an issue that occurs infrequently.
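For an intermittent fault, the stored data lets an engineer look at what the machine was doing just before the event instead of waiting to witness it. The following sketch pulls the samples leading up to a failure from a sorted time series; the function name, the lookback window, and the vibration readings are hypothetical.

```python
from bisect import bisect_left, bisect_right


def window_before(timestamps, values, failure_time, lookback):
    """Return (timestamp, value) pairs in [failure_time - lookback, failure_time].

    `timestamps` must be sorted in ascending order and aligned with `values`.
    """
    lo = bisect_left(timestamps, failure_time - lookback)
    hi = bisect_right(timestamps, failure_time)
    return list(zip(timestamps[lo:hi], values[lo:hi]))


# Hypothetical vibration readings logged every 10 seconds.
timestamps = [0, 10, 20, 30, 40, 50]
vibration = [0.2, 0.2, 0.3, 0.6, 0.9, 0.1]

# An operator cleared a jam that was reported at t = 40 s.
pre_failure = window_before(timestamps, vibration, failure_time=40, lookback=20)
print(pre_failure)
```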

Unlock new services with your machine data!

Learn more about the Ewon Flexy