Susan Barney, EIT
Risk Alive Analytics Inc.
#300, 926 5th Ave SW, Calgary, AB, Canada
Prepared for Presentation at American Institute of Chemical Engineers
2019 Spring Meeting and 15th Global Congress on Process Safety
New Orleans, LA
March 31 – April 3, 2019
There is growing concern among owners/operators regarding the quality and consistency of the Process Hazard Analyses (PHAs) of their facilities. When a PHA is conducted, whether through an internal or contracted facilitator, many factors can strongly influence the data. These factors affect the integrity of the data and can lead to increased risk exposure in the facility.
While PHAs have been conducted within the industry for many years, there are still distinct differences between the quality and completeness of data generated within each session. Using data analytics and visualizations, the industry can move from inconsistent data quality to a place of quality awareness. This knowledge can reduce risk exposure by helping remove inconsistencies within an organization’s PHA data. Additionally, this can increase the percentage of critical scenarios being captured during a PHA session, reducing the likelihood of missing a high-risk scenario and the potential corresponding recommendation required to adequately address the risk. This concept can be adopted for a newly purchased or a newly built facility by having a guideline to start with for the baseline PHA, or for the revalidation of a PHA.
To achieve these visualized analytics, the PHA data undergoes a series of transformations during mining that bring it into a comparable state. The data must be generalized so that specifics such as tag numbers are removed and comparisons can be drawn. It is then organized into subsections of process safety vulnerabilities, which summarize the threats the facility is exposed to.
From this generalized information, the most critical vulnerabilities across multiple facilities can be extracted and assessed in the next PHA. With the aid of a subject matter expert, it can also be determined whether any vulnerabilities, such as prior incidents, were not considered in the PHA. This can create a more thorough PHA with the confidence that most scenarios have been analyzed.
Every organization that processes hazardous materials should have a rigorous Process Safety Management (PSM) program. Process Hazard Analysis (PHA) is a foundational and key component of a complete PSM program. PHA studies are performance based, and an organization must choose which type is used to assess the risk at its facilities. The selection should be appropriate to the complexity of the process and must identify, evaluate, and offer recommendations to mitigate the hazards uncovered in the session. For the purpose of this paper, we focus on the Hazard and Operability (HAZOP) study as the PHA method for comparing data quality.
HAZOP sessions require the experience, knowledge and technical backgrounds of the team members, along with systematic methodologies, to comprehensively uncover the hazards and risks present at the facility. Risk can be perceived as a function of frequency and consequence severity. The risk in each scenario captured within a HAZOP session should therefore be understood by the whole team. Thorough analysis of the hazards during HAZOP sessions makes the necessary information available to key stakeholders, enabling them to make suitable decisions to manage the risk.
The concern of owners and operators regarding the quality and consistency in the HAZOP has grown in the last few years. Whenever an incident occurs at a processing facility, the HAZOP team will typically return to the documentation to determine if that scenario was captured during the study. This reactive approach can be beneficial to help with future HAZOPs to ensure all previous incidents are captured, but a proactive approach would help in avoiding the incident. The overall culture of process safety should be proactive rather than reactive, as when lives are at risk, personnel safety needs to be treated with the utmost importance.
The most dangerous type of risk to a facility is the unknown risk. This unknown risk exposure can arise from an incomplete, non-concise, poor quality HAZOP session. The risk gaps that are created from these inconsistencies can be improved upon, and the starting point is the data that is already available.
Over decades of conducting thousands of risk assessments for some of the world’s largest operating companies, ACM Facility Safety has accumulated an extensive database of process safety information (PSI). Many methods and tools have been developed to help clients get the most out of their HAZOP sessions and make their facilities safer. One of the latest of these tools is Risk Alive Analytics, a web-based service that uses the PSI data from HAZOP sessions to find, and bring to the surface, information that is typically hidden or not considered. Having this information can help eliminate many risk gaps. One of the analytics within Risk Alive is the Facility Unit Comparison module. Data can be generalized and, with the aid of subject matter experts, conclusions can be drawn from the comparison of similar facilities to find this hidden information. This can help organizations become much more proactive in their knowledge of risk exposure and their understanding of potential hazards in their respective industries.
2 HAZOP Introduction and Required Information
HAZOPs are used as a formal and systematic examination of the intentions of new and existing facilities to assess the hazard potential of mal-operation or malfunction of individual items of equipment and the consequential effects on the facility as a whole. The HAZOP generates a record and provides proof that a recognized form of hazard analysis has been completed for a project or a facility. For many operations, a HAZOP study provides all the safety assurance that is necessary to satisfy the regulatory authorities. If an unexpected incident occurs, the investigating board will typically use the HAZOP reports to determine whether the organization completed its due diligence. If it is found that the organization did not fully investigate the extent of a foreseeable hazard, it could face significant regulatory and financial consequences.
The personnel requirements for a HAZOP session include:
- The HAZOP team typically consists of five to ten individuals who are familiar with the project and can use critical thinking to provide insight and knowledge on the information being discussed.
- The team members should be competent in three main areas: they are knowledgeable, experienced with the process, and trained on the methodology and how it relates to process safety.
- One or two individuals to facilitate and document the findings of the sessions. They must be familiar and experienced with the HAZOP methodology and able to properly record and report upon the technical discussions.
One of the root issues of HAZOPs is that they are a highly subjective and qualitative approach. Subjectively developed reports are often difficult to return to and use as cold eyes documentation, because many decisions that are critical to scenario analysis are made in session and exist only in the minds of the attendees. If suitable members are on the team and the information is recorded properly, the subjective nature of the study should not heavily affect the data. It is also important that a well-defined risk matrix is used to understand the consequences. Valuable insights from experienced members of the team are generated during the session. In many instances, however, certain data is not documented in order to save time, such as the initiating event probability (which determines the probability before safeguards) or the magnitude of likelihood reduction from a safeguard. This lack of explicit information can make it challenging to remember what was discussed during the session. It would be unreasonable to expect someone to recall a conversation from months ago based on memory alone.
The minimum data that should be recorded to complete a PHA session includes:
- Worst Credible Consequence Description
- Initiating Event Probability, Worst Credible Consequence Severity and Risk Ranking (Before the consideration of Safeguards)
- Safeguard Description
- Magnitude of Likelihood Reductions from the Consideration of Safeguards
- Frequency, Worst Credible Consequence Severity and Risk Ranking (After the consideration of Safeguards)
- Magnitude of Likelihood Reductions from the Consideration of Recommendations
- Responsibility for ensuring the implementation of recommendations
- Proposed Risk Ranking (After Recommendations Are Implemented)
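The list above can be read as a record schema with a completeness check. The following is a minimal sketch, assuming one record per scenario; the class name, field names and the `missing_fields` helper are illustrative assumptions and do not reflect the schema of any particular PHA software.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ScenarioRecord:
    """One HAZOP scenario row. Field names are hypothetical, chosen to
    mirror the minimum data list in the text."""
    consequence_description: Optional[str] = None
    initiating_event_probability: Optional[int] = None
    severity_before_safeguards: Optional[int] = None
    risk_ranking_before_safeguards: Optional[str] = None
    safeguard_description: Optional[str] = None
    safeguard_likelihood_reduction: Optional[int] = None
    risk_ranking_after_safeguards: Optional[str] = None
    recommendation_likelihood_reduction: Optional[int] = None
    recommendation_owner: Optional[str] = None
    proposed_risk_ranking: Optional[str] = None


def missing_fields(record: ScenarioRecord) -> List[str]:
    """Return the names of fields left blank (None or whitespace only)."""
    missing = []
    for f in fields(record):
        value = getattr(record, f.name)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(f.name)
    return missing


# Invented example: only two of the ten fields were filled in during the session.
example = ScenarioRecord(
    consequence_description="Overpressure of separator leading to loss of containment",
    safeguard_description="Pressure safety valve on separator",
)
print(missing_fields(example))
```

A scenario with no recommendations may legitimately leave the recommendation fields blank, so in practice the check would need to distinguish "not applicable" from "not recorded".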
Globally, there is a very large number of facilities that process different types of hazardous materials. Each of these facilities conducts and documents a form of PHA. This data represents a significant number of session hours in which multiple subject matter experts sat in one room discussing potential hazards to a specific facility. The data generated within a PHA session is therefore quite valuable because of the experience, insights and expertise that went into generating it.
As methodologies of hazard identification have evolved over time, it is important to ensure the data being compared is in a comparable state. One of the main difficulties in comparing data is the quality of the data itself. As the data generated during a HAZOP session can be influenced by many different factors, there can be large variations within it. Although many things can lead to data variations, the items of focus in this paper are:
- Missing information. Some organizations may decide not to fill in certain columns of data to save time within the session. These may include the initiating event probability, the risk ranking, or the magnitude of likelihood reduction from a safeguard or recommendation.
- The quality of data recorded within the session. Examples include brief consequence text, not capturing the full scenario or skipping steps in the sequence, and using very generic safeguards that do not capture the information required.
Examples below will describe in more detail how the quality and completeness of the data can strongly affect the outcome of the session and the follow up action items.
2.1 The Variation in HAZOP Data Quality
These examples show how the quality of the data recorded within the session does not always capture the proper information required. Figure 1 below shows consequence descriptions from the same type of facility across three different organizations.
Figure 1: Consequence Variation between HAZOP Files
It would be challenging for someone to look back on the results post-session and understand the sequence of events associated with the first consequence or the true risk the team was trying to capture. Time would have to be spent looking at drawings, the associated cause, the safeguards and any other applicable information to determine what was being discussed. If the information had been accurately recorded as in the second description, it would be easily understood how the consequence could occur. The third description is quite lengthy and wordy. Here the key information can also be difficult to extract and understand, as there is a large volume of text to review. Ideally, someone who is familiar with the processing unit should be able to look at a consequence text and determine what it is describing without any major questions.
Figure 2 below shows two examples of safeguards that were used within the same HAZOP. Underneath each example is the number of times that safeguard was used.
Figure 2: Safeguard Description from Real HAZOP Files
In this example, a generic safeguard such as the top safeguard in Figure 2 does not provide information on where the safeguard is located, its set points, or what type of conditions it protects against. This safeguard was used 125 times within the respective HAZOP. This raises the concern of whether the safeguard is truly risk reducing each time it is applied to a consequence scenario. Are the members in the session discussing the set points, locations and functionality of the Pressure Safety Valves? The bottom safeguard in Figure 2 has a more specific description. It includes the location, the pressure rating and the condition it protects against. Recording safeguards in this manner creates more explicit information regarding the protection layer.
To understand why the quality and consistency of a PHA is important, it is essential to understand how the information within the HAZOP report can be used post-session. As the session is focused on identifying hazards and how they create risk within an operating facility, data in which the hazard is not fully understood raises questions regarding the risk at the facility.
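A rough screen for heavily reused generic safeguards can be sketched as follows. This is a simple heuristic under stated assumptions, not the method used by the tool described in this paper: a safeguard is flagged when it is reused, is short, and contains no digits (so no tag number or set point). The thresholds, the function name and the sample safeguard texts are all invented for illustration.

```python
import re
from collections import Counter
from typing import Dict, Iterable


def flag_generic_safeguards(
    safeguards: Iterable[str], min_words: int = 6, min_uses: int = 2
) -> Dict[str, int]:
    """Return {safeguard text: use count} for reused, short descriptions
    that contain no digits. Thresholds are arbitrary assumptions."""
    counts = Counter(s.strip() for s in safeguards)
    flagged = {}
    for text, uses in counts.items():
        has_number = bool(re.search(r"\d", text))
        if uses >= min_uses and len(text.split()) < min_words and not has_number:
            flagged[text] = uses
    return flagged


# Invented sample data: the tag number and set pressure are hypothetical.
sample = ["Pressure Safety Valves"] * 3 + [
    "PSV-101 on amine regenerator overhead, set at 350 kPag, "
    "protects against blocked outlet"
]
print(flag_generic_safeguards(sample))
```

A flagged entry is only a prompt for review; a subject matter expert would still need to confirm whether each use of the safeguard is valid for its scenario.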
Other than the risk identified within the session, there are also unconscious and unknown risks. An unconscious risk is one where the hazard is not perceived. An unknown risk is one where the full range of consequences is not known or fully understood. Both types of risk can lead to missed hazardous scenarios.
In Figure 3 below, columns of data are left blank: there is no initiating event probability and no mention of which safeguards are valid for a magnitude of likelihood reduction. When this report is viewed after the session, the reader would have to determine the likelihood reduction required to bring this scenario’s risk to a tolerable level. This could pose some difficulty, as they may not be aware of the applicability of each of the safeguards listed. To complete this step in the HAZOP, they would first have to determine the probability of the initiating event and then use best practice guidelines to determine which safeguards are valid in the relevant scenario.
Figure 3: Screenshot of example PHA Pro File of HAZOP Session Data
If these columns of data had been recorded within the session when they were discussed, no questions would arise when the report is viewed post-session. The inherent safety of the design must be understood during the session. If the design is robust and built to be inherently safer, the initiating event probability can be reduced, because the hazard has been removed at the source. If the design is not built to be inherently safer, passive, active or procedural constraints are required to complement the design and control the hazard. These constraints are not required for the process itself. They are required to take the design to a safe state if the initiating cause occurs, by reducing the likelihood of the consequence. This reflects the earlier point that it is essential to have competent team members within the session who thoroughly understand the design and can think critically about the inherent design of the system. An individual going back to this report post-session will most likely not have the knowledge to determine the initiating event probability. If this information is intentionally left blank during the session, it can create confusion post-session.
3 Introduction to Facility Unit Comparison
A common issue that has been discussed within organizations is “How can we tell if our processing units are being assessed in a complete and consistent way?” To understand this, the HAZOP data must be reviewed to draw comparisons between facilities of similar design. It should now be understood that the quality and completeness of the data recorded within the session is affected by many factors.
The traditional PHA session does not leverage the accumulated knowledge within an organization or across the industry, only that within the room. To address this gap, an analytical tool has been developed. This tool is designed to help in two main ways:
- It can help any organization understand the similarities and differences between their own facilities’ PHAs, or between their PHAs and other organizations’ PHAs from similar facilities.
- It can help prepare for the next scheduled PHA session by providing information that can be reviewed prior to the session or that can aid the discussions during the session, such as:
  - Sparking conversations on causes or consequences that may not have been considered, leading the team to determine whether they are exposed to such scenarios.
  - Providing a reference for how other organizations categorized the severity of a consequence. Although different organizations have different definitions for similar severities, health and safety severities normally have overlapping definitions.
  - Showing safeguards that have been applied to scenarios with similar consequences, for a group that is having difficulty determining a proper risk reducing safeguard.
This method of comparing data can be achieved through visual representation of overlap or uniqueness between similarly designed facilities. Information can be provided to the organization helping them understand which threats/causes have not yet been considered and which safeguards have been deemed risk reducing. Threats that have not yet been considered fit into two groups:
- First are threats that were not considered, but are protected against
- Second are threats that were not considered and are not protected against
It is important to gather this information to ensure a high percentage of critical scenarios are captured. Using this analytical tool and the knowledge it comes with, organizations can begin to create PHAs that are more consistent and complete, ensuring they are leaving no potentially high risk unassessed.
To create this visual, the HAZOP data goes through a series of transformations during mining that bring it into a comparable state. The data must be generalized so that specifics such as tag numbers are removed and comparisons can be drawn. Comparable data is created using a combination of artificial intelligence and machine learning, along with the knowledge of a process safety engineer and review by a subject matter expert.
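As an illustration of the generalization step only, tag numbers can be stripped from free text with a regular expression. This sketch is a deliberately simple stand-in and not the artificial intelligence and machine learning pipeline described above. The tag pattern (one to five capital letters, an optional hyphen, two to five digits, an optional suffix letter) is an assumption about tag formats, and the sample texts are invented.

```python
import re

# Assumed tag format, e.g. "PSV-101", "LV1234", "P-301A". Real tagging
# conventions vary between organizations.
TAG_PATTERN = re.compile(r"\b([A-Z]{1,5})-?\d{2,5}[A-Z]?\b")


def generalize(text: str) -> str:
    """Replace each tag with its equipment-type prefix and collapse spaces."""
    without_tags = TAG_PATTERN.sub(r"\1", text)
    return " ".join(without_tags.split())


print(generalize("LV-1234 fails open on V-200"))
print(generalize("High H2S in feed to R-301A"))
```

Single-digit chemical formulas such as H2S or SO2 are left untouched because the pattern requires at least two digits. Two causes that differ only in tag number then map to the same generalized text, which is what allows scenarios from different facilities to be compared.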
Figure 4 below demonstrates how the visual is displayed and the main three components in the hierarchical format including the Vulnerability, Mode of Failure and the Generalized Cause or Protection Layer.
- Vulnerability is the first node on the left-hand side of the visual, which is defined as the final point of failure or damage within a consequence. Vulnerabilities are identified across comparable facilities to determine the similarities and differences.
- The next node in the middle of the visual is the Mode of Failure, which describes the general process upset that leads to the vulnerability. Although the Mode of Failure identified often overlaps with the PHA Deviation/Guideword, it is not a consistent pattern, and the consequence description is normally the key source of data.
- The final node on the visual can toggle between two modes: the Generalized Cause and the Protection Layer. The generalized information is used to increase readability and comparability, as many organizations only include the tag number on the cause, safeguard or recommendation but not where it is located or what it is protecting against.
Figure 4: Facility Unit Comparison Visual
When these visuals are completed, a subject matter expert should always review the data to ensure that the “unique” Vulnerabilities and Modes of Failure are comparable to the other facilities. This is important because the visual is not intended to say that the organization needs to consider every Vulnerability or Mode of Failure listed within the Facility Unit Comparison. What it can do is help create a more complete HAZOP session if this information is reviewed prior to the next session to determine whether there is any unknown risk that has not yet been considered. It can also help organizations determine where they have inconsistencies within their facilities, including:
- SIS system requirement comparisons
- Consistency in risk ranking of scenarios
- Completeness of consequence scenarios considered
Figure 5: Facility Unit Comparison Sample Visual
The tool has a drop-down list of all the vulnerabilities that have been identified within the facility. In Figure 5, it can be seen which facility considered which vulnerability. Below the drop-down list, the comparable facilities are displayed, each with a corresponding color dot. In Figure 5, the first vulnerability, “Damage to Catalyst Bed”, was considered by Facilities A, B and D. In the area below, the modes of failure related to damage to the catalyst bed can be observed. This demonstrates that “High H2S in Gas Feed” as a mode of failure for damage to the catalyst bed was considered by Facilities B and D. The final node shows the generalized causes/protection layers associated with this mode of failure and, subsequently, the vulnerability.
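The hierarchy behind the visual can be sketched as a nested mapping from Vulnerability to Mode of Failure to the set of facilities that considered it. This is a minimal sketch and not the tool's actual data model; the function names are assumptions. The sample records reproduce the Figure 5 example from the text, except that the mode of failure shown for Facility A ("Loss of Feed Flow") is invented, since the text does not state it.

```python
from collections import defaultdict
from typing import Iterable, Optional, Set, Tuple


def build_comparison(records: Iterable[Tuple[str, str, str]]):
    """records: (facility, vulnerability, mode_of_failure) tuples."""
    tree = defaultdict(lambda: defaultdict(set))
    for facility, vulnerability, mode in records:
        tree[vulnerability][mode].add(facility)
    return tree


def facilities_considering(tree, vulnerability: str, mode: Optional[str] = None) -> Set[str]:
    """Facilities that recorded the vulnerability, optionally for one mode."""
    modes = tree.get(vulnerability, {})
    if mode is not None:
        return set(modes.get(mode, set()))
    found = set()
    for facilities in modes.values():
        found |= facilities
    return found


all_facilities = {"Facility A", "Facility B", "Facility C", "Facility D"}
records = [
    ("Facility A", "Damage to Catalyst Bed", "Loss of Feed Flow"),  # invented mode
    ("Facility B", "Damage to Catalyst Bed", "High H2S in Gas Feed"),
    ("Facility D", "Damage to Catalyst Bed", "High H2S in Gas Feed"),
]
tree = build_comparison(records)
considered = facilities_considering(tree, "Damage to Catalyst Bed")
print(sorted(considered))
print(sorted(all_facilities - considered))  # facilities with no such scenario
```

The set difference in the last line corresponds to the "uniqueness" side of the visual: facilities that did not record a vulnerability that comparable facilities did.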
4 Facility Unit Comparison Case Studies
The concept of Facility Unit Comparison can now be used to demonstrate some case studies on how the data comparison can be utilized and what actions can come from this. Organizations have been able to improve their quality and consistency, while increasing the safety of their facilities and optimizing the number of protection layers a facility requires, potentially resulting in reduced capital and operating costs.
4.1 Case Study #1 from Facility Unit Comparison
A Facility Unit Comparison was completed for a client with the goal of comparing HAZOPs from ten (10) similar facilities to determine their overlap and uniqueness. The facilities are in different locations across multiple countries and had different teams conducting each HAZOP session. Figure 6 displays the generalized cause of interest in bold on the right-hand side: Valve Failure related to the Amine Regenerator Reflux Pump.
Figure 6: Case Study #1 Facility Unit Comparison
The cause was considered by three of the facilities during their respective HAZOPs. Each consequence related to this cause was classified as a health and safety impact and involved damage to the same piece of equipment, the Amine Regenerator Reflux Pump. When the native data associated with this cause was investigated, as seen in Table 1, it was found that the teams had identified conflicting initiating event probabilities, worst credible consequence severities and, in turn, risk rankings for the same cause and consequence description. Although there may be cases where these differences make sense because of design or operational differences, it is important to investigate them to see whether there is valid justification for the inconsistency.
Table 1: Partial Drill-down Data related to Generalized Cause in Figure 6
| Facility | Cause | Consequence Category | Worst Credible Consequence Severity | Initiating Event Probability | Risk Ranking (Before Safeguards) | Recommendations |
| Facility 1 | Valve Failure | Health and Safety | 2 | 4 | Medium | No Recommendations |
| Facility 8 | Valve Failure | Health and Safety | 6 | 5 | Very High | Three Recommendations |
| Facility 6 | Valve Failure | Health and Safety | 4 | 4 | High | One Recommendation |
As seen in Table 1, because the same cause and consequence were assigned different risk rankings, a different number of recommendations was applied to each scenario to reach a tolerable level. It needed to be determined which of the initiating event probabilities, worst credible consequence severities and risk rankings were correct, and how many recommendations were needed given the existing protection layers. This discussion required a detailed consequence analysis with process safety subject matter experts and operations specialists. The investigation determined that the amount of hazardous material released during this consequence would be negligible from this scenario alone, and therefore four (4) recommendations could potentially be eliminated.
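The kind of inconsistency found in Table 1 can be surfaced by grouping scenarios on their generalized cause and consequence category and reporting any field whose values differ within a group. The sketch below uses the Table 1 values; the dictionary keys and the function name are assumptions for illustration.

```python
from collections import defaultdict

# Values taken from Table 1; recommendation counts are 0, 3 and 1.
rows = [
    {"facility": "Facility 1", "cause": "Valve Failure", "category": "Health and Safety",
     "severity": 2, "probability": 4, "risk": "Medium", "recommendations": 0},
    {"facility": "Facility 8", "cause": "Valve Failure", "category": "Health and Safety",
     "severity": 6, "probability": 5, "risk": "Very High", "recommendations": 3},
    {"facility": "Facility 6", "cause": "Valve Failure", "category": "Health and Safety",
     "severity": 4, "probability": 4, "risk": "High", "recommendations": 1},
]


def find_inconsistencies(rows, keys=("severity", "probability", "risk")):
    """Group rows by (cause, category) and list fields with differing values."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["cause"], row["category"])].append(row)
    report = {}
    for scenario, members in groups.items():
        differing = {}
        for key in keys:
            values = {m[key] for m in members}
            if len(values) > 1:
                differing[key] = sorted(values, key=str)
        if differing:
            report[scenario] = differing
    return report


report = find_inconsistencies(rows)
print(report)
print(sum(r["recommendations"] for r in rows))  # total recommendations across the group
```

A flagged group is a starting point for investigation only, since design or operational differences may justify different rankings.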
This is an example of how conducting a complete HAZOP session with the appropriate individuals in the room, while ensuring the discussions are accurately recorded, can change the outcome of the session. An investigation was completed on how much money the organization saved by removing each of the recommendations. The results are displayed in Table 2.
Table 2: Recommendation estimations over a five-year period
| Recommendation | Estimated CapEx cost | Estimated OpEx cost | Total Cost over 5 years |
| Recommendation 1 | $10,000 (assuming software and document updates only) | None expected | $10,000 |
| Recommendation 2 | $100,000 (assuming new instrument added to SIS) | $10,000 annually for testing and verification (SIS) | $150,000 |
| Recommendation 3 | $100,000 (assuming cabling required) | $10,000 annually for testing and verification (SIS) | $150,000 |
| Recommendation 4 | $50,000 (assuming installation of a new horn) | $5,000 annually for testing | $75,000 |
| Total Cost | $260,000 | $25,000/year | $385,000 |
The cost estimate is representative of how much can be saved within an organization without changing the risk exposure of the facility. Each of the recommendations was estimated, and it was found that a total of $385,000 over a 5-year period could have been eliminated from this one scenario alone. It is believed within this organization, and across other organizations, that identifying these types of inconsistencies may lead to further savings from avoided over-instrumentation in the future.
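The totals in Table 2 follow from adding the capital cost to five years of annual operating cost for each recommendation. A short check of that arithmetic, using the Table 2 figures (the variable and function names are illustrative):

```python
# (name, estimated CapEx in dollars, estimated annual OpEx in dollars), from Table 2
recommendations = [
    ("Recommendation 1", 10_000, 0),
    ("Recommendation 2", 100_000, 10_000),
    ("Recommendation 3", 100_000, 10_000),
    ("Recommendation 4", 50_000, 5_000),
]


def total_cost(capex: int, annual_opex: int, years: int = 5) -> int:
    """Undiscounted cost over the period: CapEx plus OpEx for each year."""
    return capex + annual_opex * years


per_item = {name: total_cost(capex, opex) for name, capex, opex in recommendations}
total_capex = sum(capex for _, capex, _ in recommendations)
total_opex_per_year = sum(opex for _, _, opex in recommendations)
grand_total = sum(per_item.values())

print(per_item)
print(total_capex, total_opex_per_year, grand_total)
```

The calculation is undiscounted, matching the simple totals shown in the table.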
4.2 Case Study #2 from Facility Unit Comparison
Another Facility Unit Comparison was completed with major findings related to consistency and completeness within the HAZOP. It was determined that the same generalized safeguard “Alarm – High SO2 in the Incinerator” was being used at the same facility (Unit A) for 1 and 2 magnitudes of likelihood reduction. In the other comparable units, it is seen that the same generalized safeguard consistently has 1 magnitude of likelihood reduction applied to it. This is demonstrated in Figure 7 below.
Figure 7: Facility Unit Comparison Findings #2
This information was reviewed by a subject matter expert, and the resulting best practice for assigning magnitudes of likelihood reduction to alarm safeguards is as follows:
- If both alarms are ringing into the boardroom at the same time and the operator is expected to take an action for both, then only one magnitude of likelihood reduction should be given.
- If one alarm rings in as an early warning signal in advance of the other alarm and there are different actions to be taken due to the progression of the process event, then two magnitudes of likelihood reduction may be justifiable.
As a result of this Facility Unit Comparison, the organization is now aware of this inconsistency. This will spark conversation with the associated personnel to ensure they are following best practice by consistently applying magnitudes of likelihood reduction for alarm safeguards. If this safeguard, when given two magnitudes of likelihood reduction, is what brings the residual risk to a tolerable level, the scenario should be re-evaluated to ensure it is adequately protected against.
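A check for this kind of inconsistency can be sketched by collecting the likelihood reduction credits given to each generalized safeguard per unit and flagging units where the same safeguard received more than one value. The function names are assumptions, and apart from Unit A the unit names are invented, as the text does not name the comparable units.

```python
from collections import defaultdict

SAFEGUARD = "Alarm - High SO2 in the Incinerator"

# (unit, generalized safeguard, magnitudes of likelihood reduction credited)
entries = [
    ("Unit A", SAFEGUARD, 1),
    ("Unit A", SAFEGUARD, 2),
    ("Unit B", SAFEGUARD, 1),  # hypothetical unit name
    ("Unit C", SAFEGUARD, 1),  # hypothetical unit name
]


def credit_spread(entries):
    """Return {safeguard: {unit: sorted list of distinct credits}}."""
    by_safeguard = defaultdict(lambda: defaultdict(set))
    for unit, safeguard, credit in entries:
        by_safeguard[safeguard][unit].add(credit)
    return {
        sg: {unit: sorted(credits) for unit, credits in units.items()}
        for sg, units in by_safeguard.items()
    }


def units_with_mixed_credit(spread, safeguard):
    """Units where the same safeguard was given more than one credit value."""
    return sorted(u for u, credits in spread.get(safeguard, {}).items() if len(credits) > 1)


spread = credit_spread(entries)
print(spread[SAFEGUARD])
print(units_with_mixed_credit(spread, SAFEGUARD))
```

Whether the higher credit is justified still depends on the alarm arrangement described in the best practice above, which this check cannot determine on its own.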
4.3 Case Study #3 from Facility Unit Comparison
Another Facility Unit Comparison finding related to the quality and consistency in PHA is shown in Figure 8 below. This example highlights the completeness of assessing all scenarios within the PHA study.
Figure 8: Facility Unit Comparison Findings #3
As seen in Figure 8, seven of the ten facilities considered damage to the amine regenerator, with six different modes of failure. The three facilities that did not consider damage to the amine regenerator did have the equipment within their scope. This began conversations to determine why this had not been considered or recorded within the HAZOP, and whether there were any unknown risks related to damage. In addition, the information within the Facility Unit Comparison can help identify the scenarios that may have been missed.
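The completeness check in this case study amounts to comparing the equipment in each facility's scope against the equipment for which a damage scenario was recorded. A minimal sketch follows; the function name and the facility numbering are assumptions, and which three facilities omitted the scenario is chosen arbitrarily here because the text does not say.

```python
def coverage_gaps(in_scope, considered):
    """Return {facility: sorted equipment in scope with no recorded scenario}."""
    gaps = {}
    for facility, equipment in in_scope.items():
        missing = equipment - considered.get(facility, set())
        if missing:
            gaps[facility] = sorted(missing)
    return gaps


# All ten facilities have the amine regenerator in scope.
in_scope = {f"Facility {i}": {"Amine Regenerator"} for i in range(1, 11)}
# Seven recorded a damage scenario for it (the choice of 1 to 7 is arbitrary).
considered = {f"Facility {i}": {"Amine Regenerator"} for i in range(1, 8)}

gaps = coverage_gaps(in_scope, considered)
print(gaps)
```

Checking scope first avoids flagging facilities that simply do not have the equipment.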
In addition to these three examples, there are many other ways that Facility Unit Comparison has helped to improve the quality and consistency of HAZOPs. It brings together a large amount of data in a valuable way to draw conclusions on the question, “How can we tell if our processing units are being assessed in a complete and consistent way?”
By using this information, organizations can be much more confident in the quality and consistency of their HAZOPs and reduce the risk of missing scenarios. To ensure a higher quality PHA, organizations must ensure that proper methodologies are being followed and that the appropriate individuals are part of the discussion. Additionally, ensuring all required information is recorded accurately will improve the thoroughness of the PHA. The case studies within this paper demonstrate examples of findings from comparing different facilities’ HAZOP data. This knowledge can help reduce risk exposure and increase the percentage of critical scenarios captured during the session.
This concept of creating higher quality data can be adopted by many different types of industries. In a world where the role of artificial intelligence is becoming more prominent, it will aid in simplifying the creation of higher quality data. The HAZOP should no longer feel like a check box exercise but a method of completely understanding all the hazards and risks within a facility. This will help reduce the number of safety related incidents and create a safer working environment for all employees of an organization.
By taking advantage of the data that is available within the industry and using data analytics, artificial intelligence, machine learning and the expertise of subject matter experts, organizations will be able to develop a culture where their approach to safety is much more proactive and the number of process safety incidents is minimized.
 B. Skelton, Process Safety Analysis an Introduction, Gulf Publishing Company, Houston, TX, 1997
 US Department of Labor, Process Safety Management, Available at: https://www.osha.gov/Publications/osha3132.html#pha, Accessed on January 15, 2019
 ACM Facility Safety Inc., PHA / HAZOP Facilitation Workshop, ACM Facility Safety Inc., Calgary, AB, Canada, 2005-2018
 Center for Chemical Process Safety of the American Institute of Chemical Engineers, Inherently Safer Chemical Processes A Lifecycle Approach, John Wiley & Sons, Inc., Hoboken, New Jersey, 2009