How to Prioritize Safeguards When They Are All Critical

Written by Risk Alive

June 3, 2020

Ronjit Mukherjee
ACM Facility Safety Inc.
926 5 Ave SW #300, Calgary, AB T2P 0N7
rmukherjee@acm.ca 

Richard Carter
ACM Facility Safety Inc.
926 5 Ave SW #300, Calgary, AB T2P 0N7
rcarter@acm.ca

Prepared for Presentation at American Institute of Chemical Engineers
2019 Spring Meeting and 15th Global Congress on Process Safety

New Orleans, LA
March 31 – April 3, 2019

Abstract

With the industry norm of budget constraints and the continuous effort of trying to do more with less, the ability to correctly prioritize work has become increasingly critical.  This becomes especially important when it relates to risk-reducing Safeguards and the prioritization of Safeguards for maintenance.  With a growing list of equipment and corresponding Safeguards, the ability to stay on top of maintenance prioritization can be significantly impacted. This can lead to critical Safeguards having delayed or missed maintenance, and in turn lead to unintended and significant risk exposure.

Correctly identifying and prioritizing critical Safeguards would require hundreds of hours of effort from cross-disciplinary Subject Matter Experts (SMEs).  Fortunately, these are the same SMEs involved in developing Process Hazard Analysis (PHA) documentation.  By using PHAs as a data source, the documentation of hazardous scenarios and Safeguards can be used to prioritize critical Safeguards using the following criteria:

  • Frequency of a Cause
  • Severity of a Consequence and its corresponding Tolerable Frequency
  • Probability of Failure on Demand (PFD) of Safeguards intended to prevent that Consequence

These criteria are used to determine the risk reduction associated with each occurrence of a Safeguard, and by accumulating the risk reduction, the total risk-reducing benefit of each Safeguard can be determined to assist in prioritization.  By basing Safeguard prioritization on SME-approved data, and by using a simulation tool to account for the plant reality of Safeguards failing and PHA recommendations being implemented, confidence in Safeguard maintenance prioritization will increase.  An optimized and reflective prioritization list will reduce the probability of missing maintenance on critical Safeguards, which in turn will minimize risk exposure to site personnel.

1. Introduction

In the process industries, effort is placed on trying to reduce the risk of hazardous events.  The key items involved in reducing the risk of these scenarios are Safeguards.  A Safeguard can be an engineered system or an administrative control that either interrupts the chain of events following an initiating event or mitigates the consequences. Safeguards are used to minimize the impact on people, equipment and the environment of critical consequences ranging from equipment overpressure to fires and explosions.  Examples of Safeguards include:

  • Relief and Vent Devices
  • Control Systems
  • Interlocks and Emergency Shutdown Systems
  • Administrative Controls including Alarms, Procedures, Operator Rounds

The effectiveness of a Safeguard is highly dependent on the maintenance of that equipment.  Without regularly scheduled maintenance the probability of safeguard failure will increase, and when a Safeguard fails it will no longer be effective in reducing risk.  Unfortunately, the reality of the industry is that there will always be a constraint on budgets and people, which means there will be equipment that we are not able to maintain at the desired level and associated reliability. The scenarios which are protected by the equipment maintained at a lower level are then more likely to occur, increasing the risk to site personnel, the environment, and facility production.

As with all problems related to resource constraints, the key solution is correct prioritization.  Thus, the key solution to the problem of maintenance constraints is Safeguard prioritization.  Although organizations in industry have made steps in the right direction to create prioritized Safeguard lists through Safety Critical Equipment identification and Alarm Rationalization, there are several current issues which may hinder the goal of ensuring critical Safeguards are maintained correctly to keep high risk scenarios at a tolerable level.  By improving the methodologies for prioritizing Safeguard maintenance, the opportunity for unanticipated risk exposure can be minimized, and the following issues related to the current approach to critical Safeguard lists can be resolved:

  • Within current lists of critical Safeguards, such as the outputs of Alarm Rationalization, in which alarms are tiered into different priorities, or Safety Critical Equipment lists, in which equipment related to high risk scenarios is documented, it can be difficult to identify the relative importance of each unique Safeguard. Distinguishing between Safeguards in a large group becomes all the more important when maintenance budgets and resources are further constrained.
  • Current documentation of identification of critical Safeguards is not easily understood by personnel unfamiliar with the process. It is important that the reasoning for identifying Safeguards as critical is recorded clearly, so that anyone who reviews the document will be able to understand the justification. Through regular swings in the economy, it is commonplace for management to direct engineers and operations to find areas of cost savings.  If engineers or operations review critical Safeguard lists to identify areas of savings, and due to poor documentation unnecessarily declassify critical Safeguards, it can result in missed maintenance of critical Safeguards and increased risk exposure.
  • Current Safeguard lists are static; however, in a processing facility Safeguards fail or are bypassed, and new Safeguards are implemented, changing the importance of the remaining functional Safeguards on site.

Fortunately, the solution to improve this methodology is available today to all members of the process industry.  This paper presents how integration of PHAs with the creation of a prioritized Safeguard list can resolve these current issues and will show that:

  • Although all Safeguards reduce risk, they do not all have equal weighting in their risk reduction and therefore they do not have equal importance for ensuring regular maintenance.
  • PHAs already have enough documentation of what scenarios Safeguards are protecting against and provide an easy-to-understand reasoning of the Safeguard criticality.
  • Using tools to simulate failing Safeguards and implementing recommendations can help keep an active and reflective list of Safeguard criticality.

The PHAs analyzed in this paper were Hazard and Operability (HAZOP) studies performed on existing processing sites. Data from the Center for Chemical Process Safety (CCPS) publication on Layer of Protection Analysis (LOPA) was used for the study when risk reduction credits were applied to existing Safeguards [2]. 

It should be noted that:

  • Irrespective of how much maintenance is performed on a safeguard, there will always be a limit on the safeguard's PFD due to its design and features. Maintenance simply ensures that the reliability of a safeguard does not degrade to a point that increases the risk of a scenario past its target tolerable frequency.
  • There are some safeguards identified in a PHA that are purely personnel related, such as procedures. Although these safeguards do not relate to equipment maintenance, they do relate to refresher training and procedure knowledge verification.
  • Many companies perform not only HAZOPs but also LOPAs. As LOPAs may not only go into more detail on scenarios analyzed in HAZOP, but also may identify new scenarios, it is important to consolidate the learnings from both PHA methodologies.  This would create a data set of HAZOP scenarios without LOPA equivalents, LOPA scenarios that supersede certain HAZOP scenarios, and LOPA scenarios that were not identified in HAZOP.
  • As with all knowledge gathering sessions that are driven by SMEs, PHAs are only as good as the people involved. As some companies use the mentality of “use whoever is available” for attendance of PHA sessions, the quality of these risk assessments may be below average.  This would also mean that the documentation is less valuable as an input into safeguard maintenance prioritization.  The higher the quality of PHA sessions, the more valuable they are for prioritizing safeguard maintenance and for PHA analytics in general.
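The combined HAZOP/LOPA data set described in the list above can be sketched as a simple merge. This is a minimal illustration only: it assumes that HAZOP and LOPA scenarios can be matched by a shared scenario ID, which in practice may require manual mapping by the PHA team. The function name, the record fields, and the `origin` labels are hypothetical and do not come from the paper.

```python
def merge_pha_scenarios(hazop, lopa):
    """Combine scenario records keyed by a shared (assumed) scenario ID.

    Where a scenario appears in both studies, the LOPA record supersedes
    the HAZOP record. Each merged record is tagged with its origin.
    """
    merged = {}
    for scenario_id in sorted(hazop.keys() | lopa.keys()):
        if scenario_id in lopa:
            origin = "LOPA supersedes HAZOP" if scenario_id in hazop else "LOPA only"
            merged[scenario_id] = {**lopa[scenario_id], "origin": origin}
        else:
            merged[scenario_id] = {**hazop[scenario_id], "origin": "HAZOP only"}
    return merged
```

Tagging each record with its origin keeps the three groups named above distinguishable after the merge, so a reviewer can still see which scenarios were refined by LOPA.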

2. Safeguard Risk Reduction

2.1 Calculating the Risk Reduction Benefit Provided by Safeguards

PHA studies are used, and in some countries are required by regulatory bodies, to identify process hazards and operational issues for facilities. These studies enable management, engineers and site personnel to make decisions that reduce the risk of hazards identified in PHAs. 

Although the outputs of PHAs are often focused on the recommendations and future action items that are identified, hundreds of hours of Subject Matter Expert effort are put into breaking down hazardous scenarios and the existing Safeguards which protect against them.  This information can be used to identify the risk reduction associated with each Safeguard, which in turn will directly correlate with the maintenance prioritization of that Safeguard.  Simply put, the more risk reduction a Safeguard provides to a facility, the more critical it is to ensure its maintenance.

Figure 1.  Sample PHA Safeguard Summary

In order to utilize this PHA dataset to determine the actual risk reduction provided by Safeguards and assist in identifying critical Safeguards, the following need to be considered:

  • The current frequency of the consequence and/or the frequency of the cause.
  • The probability of failure on demand (PFD) of Safeguards on related consequences, which is a measure of the safeguard reliability.
  • The severity level and associated tolerable frequency of each cause/consequence pair the Safeguard is associated with. These tolerable frequencies are usually identified on the Corporate Risk Matrix.

In the presented case studies, the following assumptions were made when calculating the risk reduction provided by each Safeguard.  This was based on the Risk Matrix shown in Figure 2:

  • Frequencies of causes were assigned based on the correlations provided by the PHA Team (Table 1).
  • A PFD of 0.1 was assigned to each Safeguard given 1 “credit” of risk reduction in the PHA. Safeguards given 2 “credits” were assigned a PFD of 0.01 [2].
  • Tolerable Frequencies were determined based on the below assumption for each severity (Table 2). Each consequence receptor (e.g. Health and Safety, Environment, etc.) was assumed to have the same relationship between severity and tolerable frequency, which is found from LOPA documentation.  Note that certain LOPA documentation may specify different relationships between severity and tolerable frequency for different consequence receptors (e.g. for the same severity, Health and Safety may have a tolerable frequency of 0.0001, whereas Financial may have a tolerable frequency of 0.001).

Companies can use specific initiating event frequencies (IEFs) and PFDs from their internal equipment databases to improve the precision of the risk reduction calculations. Only safeguards that are credited for risk reduction in the PHA have been reviewed for this report.

Table 1.  Frequency Correlation for Risk Matrix Likelihoods

Likelihood    Cause Frequency (/yr)
1             0.0001
2             0.001
3             0.01
4             0.1
5             1

Table 2.  Target Tolerable Frequency for Risk Matrix Severities

Severity Code    Tolerable Frequency (/yr)
A                1
B                0.1
C                0.01
D                0.001
E                0.0001

Figure 2.  Case Study Risk Matrix
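The assumptions in Table 1, Table 2 and the PFD-per-credit rule can be captured as lookup tables. The sketch below transcribes those values; the two helper functions and their names are illustrative assumptions, not part of the authors' tooling.

```python
# Values transcribed from Table 1 (likelihood -> cause frequency per year).
CAUSE_FREQUENCY_PER_YEAR = {1: 0.0001, 2: 0.001, 3: 0.01, 4: 0.1, 5: 1.0}

# Values transcribed from Table 2 (severity code -> tolerable frequency per year).
TOLERABLE_FREQUENCY_PER_YEAR = {"A": 1.0, "B": 0.1, "C": 0.01, "D": 0.001, "E": 0.0001}

# PFD assumed per PHA risk reduction credit (1 credit -> 0.1, 2 credits -> 0.01).
PFD_BY_CREDITS = {1: 0.1, 2: 0.01}


def mitigated_frequency(likelihood, credits):
    """Hypothetical helper: cause frequency multiplied by the PFD of each credited Safeguard."""
    frequency = CAUSE_FREQUENCY_PER_YEAR[likelihood]
    for credit in credits:
        frequency *= PFD_BY_CREDITS[credit]
    return frequency


def required_risk_reduction_factor(likelihood, severity):
    """Hypothetical helper: unmitigated cause frequency divided by the tolerable frequency."""
    return CAUSE_FREQUENCY_PER_YEAR[likelihood] / TOLERABLE_FREQUENCY_PER_YEAR[severity]
```

A site with its own IEF and PFD data could replace these dictionaries without changing the helpers.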

The methodology used to calculate risk reduction provided by Safeguards included the following criteria:

  • Safeguards reduce only the frequency of a consequence occurring. The severities of the worst credible consequences identified in the PHA were assumed to be constant before and after the consideration of Safeguards.
  • The assumed tolerable frequency of a consequence is indicative of its relative impact. The less tolerable a consequence is considered, the more severe its impact.
  • If existing Safeguards are not as effective as described in the session, the corresponding risk reduction provided for the Safeguard may change.
  • The assumed frequency of the cause and the PFD of the corresponding Safeguards can be used to determine the current projected frequency of the consequence scenario.
  • The assumed frequency of the cause and the PFDs of the Safeguards remaining once the Safeguard of focus has been excluded can be used to determine the projected frequency of the consequence if the Safeguard of focus were to fail or be bypassed.
  • The risk reduction of a Safeguard is the change in risk associated with the simulated failure or bypassing of that Safeguard.
  • Risk reduction across different severities and consequence receptors was aggregated to determine the cumulative risk reduction associated with a Safeguard. This means that the more times a Safeguard was used in a PHA, the more risk reduction it will provide.  These cumulative values can be broken down into different scenarios, consequence receptors and severities as required.
  • These calculated Safeguard risk reductions are snapshots in time that can be continually updated as Safeguards fail or are bypassed. This will be further elaborated in Section 3.
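One way to write the criteria above as equations is sketched below. The notation is an assumption introduced here for illustration: for scenario s, f_{c,s} is the cause frequency, S_s is the set of credited Safeguards, and f_{tol,s} is the tolerable frequency. Expressing risk as the ratio of scenario frequency to tolerable frequency is also an assumption; the paper states only that the tolerable frequency indicates relative impact.

```latex
\begin{align}
f_s &= f_{c,s} \prod_{i \in S_s} \mathrm{PFD}_i
  && \text{current projected frequency} \\
f_s^{(-k)} &= f_{c,s} \prod_{i \in S_s,\, i \neq k} \mathrm{PFD}_i
  && \text{frequency if Safeguard } k \text{ fails or is bypassed} \\
\Delta R_{k,s} &= \frac{f_s^{(-k)} - f_s}{f_{\mathrm{tol},s}}
  && \text{risk reduction of Safeguard } k \text{ in scenario } s \\
C_k &= \sum_{s \,:\, k \in S_s} \Delta R_{k,s}
  && \text{cumulative criticality of Safeguard } k
\end{align}
```

As a worked example under these assumptions, take a Likelihood 4 cause (0.1 /yr), two Safeguards with 1 credit each (PFD 0.1), and a Severity D consequence (tolerable frequency 0.001 /yr). The current frequency is 0.1 × 0.1 × 0.1 = 0.001 /yr, the frequency without one Safeguard is 0.1 × 0.1 = 0.01 /yr, and the risk reduction of that Safeguard is (0.01 − 0.001) / 0.001 = 9.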

By using these assumptions and criteria, it is possible to calculate a “Criticality” value for each Safeguard that indicates the relative importance of that Safeguard in preventing hazardous scenarios. This Safeguard criticality considers how often the Safeguard is required for risk reduction in the PHA, the reliability of that safeguard, and the severity of the hazardous event or events it is protecting against.
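A minimal sketch of how such a "Criticality" value might be accumulated from PHA records is shown below. The `Scenario` structure, the Safeguard tags, and the choice to normalize risk by the tolerable frequency are assumptions made for illustration; the authors' actual tool may differ.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass
from math import prod


@dataclass
class Scenario:
    """One cause/consequence pair from a PHA worksheet (hypothetical structure)."""
    cause_frequency: float             # events per year, e.g. from Table 1
    tolerable_frequency: float         # events per year, e.g. from Table 2
    safeguard_pfds: dict[str, float]   # Safeguard tag -> PFD (0.1 per credit)


def scenario_frequency(scenario: Scenario, excluded: str | None = None) -> float:
    """Projected consequence frequency, optionally with one Safeguard removed."""
    pfds = [pfd for tag, pfd in scenario.safeguard_pfds.items() if tag != excluded]
    return scenario.cause_frequency * prod(pfds)


def criticality(scenarios: list[Scenario]) -> dict[str, float]:
    """Sum, per Safeguard, the risk increase caused by its simulated failure."""
    totals: dict[str, float] = defaultdict(float)
    for scenario in scenarios:
        current = scenario_frequency(scenario)
        for tag in scenario.safeguard_pfds:
            without = scenario_frequency(scenario, excluded=tag)
            # Assumption: risk is expressed relative to the tolerable frequency.
            totals[tag] += (without - current) / scenario.tolerable_frequency
    return dict(totals)


if __name__ == "__main__":
    example = [
        Scenario(0.1, 0.001, {"PSV-101": 0.01, "PAH-101": 0.1}),
        Scenario(0.01, 0.0001, {"PSV-101": 0.01}),
    ]
    ranked = sorted(criticality(example).items(), key=lambda item: item[1], reverse=True)
    for tag, value in ranked:
        print(f"{tag}: {value:.1f}")
```

Because the totals are summed over every scenario in which a Safeguard is credited, a Safeguard used many times, or used against severe consequences, rises in the ranking, which matches the three factors named in the paragraph above.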

2.2 Examples of Safeguard Risk Reduction Calculated from PHAs

Below are “snapshots” which present the initial static risk reduction of each Safeguard identified in three PHAs from different facilities.  Figure 3 to Figure 8 below show:

  • Safeguards sorted in descending order of the risk reduction, indicating the priority of that Safeguard for maintenance
  • Risk reduction filtered to show only the Health and Safety Consequence Receptor, as this is often the focus of Safety Critical Equipment lists
  • Bowtie Visualizations associated with critical Safeguards to showcase how linking to the original PHA Data can make it easier for reviewers unfamiliar with the original study to understand why a Safeguard may be considered critical. On the bowtie visualization, blue rectangles represent the current risk-reducing Safeguards and black rectangles represent additional PHA recommendations for potential future Safeguards.

Figure 3 and Figure 4 showcase a three-day PHA session, in which 82 unique risk-reducing Safeguards were identified.  Observing the risk reduction for each Safeguard associated with Health and Safety related scenarios, we see that:

  • The top risk-reducing Safeguard, a mechanical relief valve, is utilized 35 times for risk reduction and, as can be seen on the Bowtie Visualization in Figure 4 (SG 18), is applied to high risk scenarios, contributing to its importance. The importance of the top Safeguard stems from its protection against overpressure of the 1st stage discharge piping, which could result in leaks or rupture and loss of containment of gas leading to fire.  By visualizing this scenario, we can understand the logic of how this Safeguard may be identified as safety critical.
  • The majority of risk reduction contribution comes from Mechanical Safeguards, such as Pressure Safety Valves (PSVs) and other relief devices.
  • Occupancy, a Modifier, is within the Top 10 risk-reducing items identified in the PHA session, indicating the importance of communicating to site personnel the potential risks in the Flare Knock Out Drum area.
  • The categories listed in the Donut Chart are Mechanical (MEC), BPCS Trips/Interlocks (BPCS-T), Compressor Panel Trips/Interlocks (CP-T), Procedure (PRO), Operator Rounds (ROUND), Occupancy (OCC), and Other.
  • The consequence receptors listed in the bottom right panel and the right hand side of the Bowtie are People (PPL), Environment (ENV), Asset (AST), Financial (FIN), Operational Costs (OPS), Regulatory (REG) and Reputation (REP).

Twenty (approximately 25%) unique Safeguards contribute to more than 90% of the risk reduction identified in the PHA.  This indicates that a large concentration of the risk reduction is in only 25% of the identified Safeguards, making it logical to focus on this subset for ensuring regular maintenance.  It should be noted that the risk assessments for larger facilities are likely broken down into multiple PHAs.  These PHAs can be combined to give a complete picture of risk and risk reduction on a facility level.

Figure 3.  Safeguard Criticality for Compressor Station Filtered for H&S (Top 10)

Figure 4.  Bowtie Visualization for Top Safeguard of Compressor Station

Figure 5 and Figure 6 showcase a four-day PHA session, in which 40 unique risk-reducing Safeguards were identified.  While analyzing the risk reduction for each Safeguard associated with Health and Safety related scenarios, it can be seen that:

  • The top risk-reducing Safeguard, a BPCS alarm with operator action (BPCS), is utilized only 4 times for risk reduction, but as seen on the Bowtie Visualization it is applied to high risk scenarios which have no other current Safeguards, contributing to its importance. The importance of the top Safeguard stems from its protection against overpressure of a tank and the potential for fire or explosion. By visualizing the scenario, we can understand the logic of how this Safeguard may be identified as safety critical.
  • The majority of risk reduction contribution comes from BPCS Alarms with Operator Action, which implies the importance of not only maintaining the physical alarm equipment functionality, but also training operations personnel to ensure they are aware of what actions to take when the alarm activates.
  • The top 5 (approximately 12.5%) unique Safeguards contribute to more than 90% of the risk reduction identified in the PHA. This shows a high concentration of risk reduction in few Safeguards, providing a focus on which Safeguards require maintenance prioritization.
  • The categories listed in the Donut Chart are Mechanical (MEC), BPCS Alarm with Operator Action (BPCS), and Operator Rounds (ROUND).
  • The consequence receptors listed in the bottom right panel and the right hand side of the Bowtie are Health and Safety (H&S), Environment (ENV), Economic (ECO), Regulatory (REG) and Reputation (REP).

Figure 5.  Safeguard Criticality for Butane and Propane Loading Station Filtered for H&S (Top 10)

Figure 6.  Bowtie Visualization for Top Safeguard of Butane and Propane Loading Station

Figure 7 and Figure 8 showcase a seven-day PHA session, in which 166 unique risk-reducing Safeguards were identified.  Observing the risk reduction for each Safeguard associated with Health and Safety related scenarios, it can be seen that:

  • The top risk-reducing item, excluding Modifiers of Ignition or Occupancy considered in the PHA, is from personnel using building entry procedures with personal O2 monitors, and as can be seen on the Bowtie Visualization it is applied to high risk scenarios. The importance of the top Safeguard stems from low level of oxygen in a building due to nitrogen leaks. By seeing this scenario, we can understand the logic as to why this Safeguard may be identified as safety critical.
  • The majority of risk reduction contribution comes from Mechanical Relief Devices, which implies the importance of having a stringent preventative maintenance plan to ensure the relief valves are maintained to achieve the required reliability level.
  • The top 30 (approximately 20%) unique Safeguards contribute to more than 90% of the risk reduction identified in the PHA. This shows a high concentration of risk reduction in relatively few Safeguards, allowing for easier focus to ensure maintenance prioritization.
  • The categories listed in the Donut Chart are Mechanical (MEC), PLC Trip/Interlock (PLC-T), Local Trips/Interlocks (Local-T), Shutdown PLC (SD-PLC), Procedure (PRO), and Modifiers.
  • The consequence receptors listed in the bottom right panel and the right hand side of the Bowtie are People (PPL), Environment (ENV), Asset (AST), Financial (FIN), Operational Costs (OPS), Regulatory (REG) and Reputation (REP).

Figure 7.  Safeguard Criticality for Refrigeration Plant Filtered for H&S (Top 10)

Figure 8.  Bowtie Visualization for Top Safeguard of Refrigeration Plant

In the above examples, most facility risk reduction can be found within less than 25% of Safeguards, which will allow maintenance schedulers to focus in on the critical few.  PHA data, either through utilizing Bowties or simply through PHA documentation reference, can provide an efficient way to identify critical Safeguards and communicate the reasons behind their criticality ranking.
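The "critical few" can be identified from any set of cumulative risk reduction values by ranking them and taking the smallest top group that reaches a chosen share of the total. The sketch below assumes the criticality values are already available as a mapping from Safeguard tag to risk reduction; the function name and the 90% default are illustrative choices.

```python
def safeguards_covering(criticality, share=0.90):
    """Return the top-ranked Safeguards whose cumulative risk reduction reaches `share` of the total."""
    total = sum(criticality.values())
    if total <= 0:
        return []
    ranked = sorted(criticality.items(), key=lambda item: item[1], reverse=True)
    selected = []
    running = 0.0
    for tag, value in ranked:
        selected.append(tag)
        running += value
        if running >= share * total:
            break
    return selected
```

Dividing the length of the returned list by the number of unique Safeguards gives the kind of percentage quoted for the three case studies.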

3. Impact of Non-Effective Safeguards on Critical Safeguard Lists

3.1 Determining the Change in Critical Safeguard Lists

As mentioned in the prior sections, simply calculating the risk reduction of Safeguards based on the initial dataset may not be enough to reflect real processing facility scenarios, where Safeguards unexpectedly fail or are bypassed, and new recommendations are implemented to become Safeguards.

When the functionality of a Safeguard changes and it may no longer be relied upon to reduce risk, its impact on the frequency of the consequence and the criticality of the remaining Safeguards may need to be updated.  For example, if a Safeguard fails, the current frequency and risk of a consequence scenario will increase, increasing the requirement and criticality of the remaining functional Safeguards to act when required. This simulation methodology will improve the ability of personnel to ensure critical Safeguard lists are up to date and reflect the current status of the processing facility, which in turn can minimize the risk exposure of site personnel. 
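The effect of a single Safeguard failure on one scenario can be sketched as follows. The function, the Safeguard tags and the numbers are hypothetical; risk reduction is again expressed relative to the tolerable frequency, which is an assumption rather than the authors' stated formula.

```python
from math import isclose, prod


def scenario_status(cause_freq, tolerable_freq, pfds, failed=()):
    """Evaluate one scenario, treating Safeguards listed in `failed` as unavailable."""
    working = {tag: pfd for tag, pfd in pfds.items() if tag not in failed}
    frequency = cause_freq * prod(working.values())
    # Treat values within floating-point tolerance of the target as meeting it.
    exceeds = frequency > tolerable_freq and not isclose(frequency, tolerable_freq)
    risk_reduction = {
        # frequency / pfd is the frequency if this Safeguard were also to fail.
        tag: (frequency / pfd - frequency) / tolerable_freq
        for tag, pfd in working.items()
    }
    return {
        "frequency": frequency,
        "exceeds_tolerable": exceeds,
        "risk_reduction": risk_reduction,
    }


# Hypothetical scenario: 0.1 /yr cause, 0.001 /yr tolerable, two 1-credit Safeguards.
PFDS = {"LEL alarm": 0.1, "Low-flow trip": 0.1}
baseline = scenario_status(0.1, 0.001, PFDS)
after_failure = scenario_status(0.1, 0.001, PFDS, failed={"LEL alarm"})
```

In this illustration the scenario frequency rises tenfold when the alarm is unavailable, and the risk reduction carried by the remaining trip rises by the same factor, which is the mechanism that moves it up the criticality list.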

The below figures showcase three simulation scenarios which demonstrate the impact on Safeguard criticality and the maintenance focus of remaining functional Safeguards when changes in a plant occur:

  • Figure 9 and Figure 10 showcase the baseline case of a Butane and Propane Loading Facility and highlight an LEL detector
  • Figure 11 and Figure 12 showcase the change that occurs if the LEL Detector were to fail or be bypassed and its effect on the criticality of remaining Safeguards
  • Figure 13 and Figure 14 showcase the change that occurs to the baseline Safeguard criticality if a risk-reducing recommendation were to be implemented.

Figure 9 shows that the LEL Detection alarm is utilized 66 times within the facility, causing it to be the 9th most critical Safeguard identified in the PHA.  The bowtie visual in Figure 10 shows that for one of the hazardous scenarios that the LEL alarm protects against, there is another Safeguard currently present, an interlock to trip the butane loading pumps on low flow to prevent damage to the pump seals causing loss of containment of butane.

Figure 9.  Butane and Propane Loading Station Safeguard Criticality (Baseline)

Figure 10.  Bowtie Visualization for LEL Detection in Butane and Propane Loading Station

Figure 11 and Figure 12 show that once the LEL Detector has failed or been bypassed, the criticality of the remaining Safeguards has changed.  For example, the Interlock to trip the Butane Pump has moved from Rank 11 to Rank 5, as its criticality has increased without the LEL detector available to help in reducing the risk of the scenario.  This example showcases the importance of ensuring Safeguard Prioritization Lists are not static, but rather adjust to represent current conditions.

Figure 11.  Butane and Propane Loading Station Safeguard Criticality After Failure of LEL Detector

Figure 12.  Bowtie Visualization of Butane and Propane Loading Station After Failure of LEL Detector

Figure 13 and Figure 14 showcase what would happen to Safeguard Criticality and prioritization for maintenance if a recommendation were to be implemented.  When Recommendation # 36 is implemented, it becomes the 2nd most critical Safeguard for maintenance, while the Butane Pump Interlock drops in rank from # 11 to # 12.  This shows that when recommendations are implemented, the risk on current scenarios may change, which in turn changes the risk reduction of remaining Safeguards. Without updating the Safeguard criticality list, the high criticality and importance of maintenance of the new Safeguard may not be identified.

Figure 13.  Butane and Propane Loading Station Safeguard Criticality After Implementation of PHA Recommendation # 36

Figure 14.  Bowtie Visualization of Butane and Propane Loading Station After Recommendation # 36 Implementation

Table 3 shows a summary of how the Butane Pump Interlock Safeguard changes in criticality in each of the above scenarios.

Table 3.  Changes in Safeguard Criticality of Butane Loading Pump Trip over Simulations

Simulation                           Rank
Baseline                             11
Failure/Bypass of LEL Detector       5
Implementing Recommendation # 36     12

As can be seen from the above simulations, common process facility changes such as the effectiveness of current Safeguards and the implementation of recommendations can change the criticality and prioritization of other Safeguards.  With current static critical Safeguard lists, teams will not be aware of the impacts of changes to maintenance prioritization, but by keeping an up-to-date list reflecting the current in-plant status of safeguards, the risk exposure in a facility can be reduced by ensuring maintenance is focused on the highest priority safeguards to maintain the intended required reliability levels.   As mentioned previously, it should be noted that irrespective of how much maintenance is performed on a safeguard, there is a limit on its PFD due to its inherent design and features.
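One possible way to keep such a list up to date is to store a status for every Safeguard and recommendation and to recompute the ranking whenever a status changes. The sketch below is illustrative only: the scenarios, tags, PFDs and the "REC-A" recommendation are hypothetical, and the criticality measure (risk increase on simulated failure, relative to the tolerable frequency) is an assumption.

```python
from math import prod

# Hypothetical scenarios; "REC-A" stands in for a PHA recommendation not yet implemented.
SCENARIOS = [
    {"cause_freq": 0.1, "tolerable_freq": 0.01,
     "safeguards": {"LEL alarm": 0.1, "Pump low-flow trip": 0.1}},
    {"cause_freq": 1.0, "tolerable_freq": 0.001,
     "safeguards": {"PSV": 0.01, "REC-A": 0.1}},
    {"cause_freq": 0.01, "tolerable_freq": 0.01,
     "safeguards": {"LEL alarm": 0.1}},
]

BASELINE_STATUS = {
    "LEL alarm": "functional",
    "Pump low-flow trip": "functional",
    "PSV": "functional",
    "REC-A": "proposed",
}

SIMULATIONS = {
    "Baseline": {},
    "Failure/bypass of LEL alarm": {"LEL alarm": "failed"},
    "Implementing REC-A": {"REC-A": "functional"},
}


def criticality_ranks(scenarios, status):
    """Rank functional Safeguards (1 = most critical) by cumulative risk reduction."""
    totals = {}
    for scenario in scenarios:
        active = {tag: pfd for tag, pfd in scenario["safeguards"].items()
                  if status.get(tag) == "functional"}
        current = scenario["cause_freq"] * prod(active.values())
        for tag, pfd in active.items():
            without = current / pfd  # frequency if this Safeguard also failed
            gain = (without - current) / scenario["tolerable_freq"]
            totals[tag] = totals.get(tag, 0.0) + gain
    ordered = sorted(totals, key=totals.get, reverse=True)
    return {tag: position for position, tag in enumerate(ordered, start=1)}


def rank_over_simulations(tag):
    """Rank of one Safeguard under each simulated plant state (None if not functional)."""
    return {
        name: criticality_ranks(SCENARIOS, {**BASELINE_STATUS, **changes}).get(tag)
        for name, changes in SIMULATIONS.items()
    }
```

Calling `rank_over_simulations("Pump low-flow trip")` produces a small summary in the same shape as Table 3, using the hypothetical data above rather than the case study values.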

4. Conclusions

Current industry methodology for determining critical Safeguards produces a static list that may not be reflective of a process facility on a day-to-day basis, as there are changes in the effectiveness of Safeguards or recommendations implemented.  Also, there is difficulty in understanding the relative criticality differences of various Safeguards, and documentation gaps which may lead independent reviewers or management to question why something is critical.  These cumulative issues may lead to poor prioritization of maintenance, critical Safeguards with reduced availabilities and reliabilities, and a greater chance for process incidents with associated higher risk exposures.

By determining baseline Safeguard criticality from PHAs, industry can have a consistent way of determining the importance of Safeguards by taking advantage of the hundreds of SME hours that are put into developing PHA scenarios and the associated identified Safeguards.  By utilizing simulation modelling, the Criticality List can also be easily updated to reflect risk changes from implemented recommendations and from Safeguards failing or being bypassed. This can provide an understanding of changes in criticality of the remaining functional Safeguards, revealing the change in the maintenance requirements of those Safeguards.  Finally, by linking a critical Safeguard list to the corresponding PHA documentation, the reasoning behind the criticality of a Safeguard can be preserved. This improves the ability of engineers, operations and maintenance schedulers to make informed, effective decisions when considering modifying or removing existing Safeguards.

These documentation and simulation tools will allow management to better focus on Critical Safeguard maintenance at any point in the facility operation and ensure the long-term reliability of the most impactful Safeguards, as well as providing consistent input for Critical Safety Equipment and Alarm Rationalization.

Disclaimer:

The information in this paper is general in nature only and should not be relied upon without first obtaining advice from a qualified professional person. The advice and strategies herein may not be suitable for your situation. Any use which a third party makes of this paper, or any reliance on or decisions made based on it, are the responsibility of such third party. Neither the author nor ACM Risk Sciences and Development Inc. shall be responsible for damages, if any, suffered by any third party as a result of decisions made or actions taken based on this paper.

5. References

[1] Center for Chemical Process Safety (CCPS). Guidelines for Enabling Conditions and Conditional Modifiers in Layer of Protection Analysis. Hoboken, NJ, USA: John Wiley & Sons, Inc., 2012.

[2] Center for Chemical Process Safety (CCPS). Layer of Protection Analysis: Simplified Process Risk Assessment. New York, NY, USA: American Institute of Chemical Engineers, 2001.

https://www.aiche.org/academy/videos/conference-presentations/how-prioritize-safeguards-when-they-are-all-critical
