Welcome to the fifth installment in our six-piece series on ICA Engineering’s unique proposition, System Lifecycle Analysis. If you haven’t already, check out our previous articles on the first steps you can take to ensure business continuity.
In an ever-more competitive world, anything that causes production to stop can be a disaster for industrial organizations, both financially and reputationally. We hear a lot about cybercrimes like the Colonial Pipeline attack of 2021, but utility providers and factories are at risk every day from much more mundane issues, such as a failure of industrial automation systems or spare parts that go out of stock. Since the beginning of the COVID-19 pandemic, many parts that were traditionally stock items at local distributors have become long-lead items.
System lifecycle management means understanding every part of your system and having contingencies in place to deal with all kinds of business-halting situations. Let’s go over the steps that maximize business continuity — in particular, careful risk analysis and remediation planning.
System Lifecycle Management: Why It Matters
In February 2021, bad actors were able to access the industrial systems of a Florida water treatment plant. They changed how much sodium hydroxide was entering the water supply — right before the Super Bowl. Thankfully, the water treatment facility had vigilant employees who spotted the issue and rectified it before any harm was done. However, it’s possible that with more secure remote systems, the attack could have been avoided altogether.
System lifecycle management helps assess the possibility of risks like this and puts remediation plans in place. From cyberattacks to equipment failure, from pandemics that cause staff illness to supply chain shortages, numerous factors impact business continuity.
A holistic process that assesses remote access, industrial automation, and every other part of your industrial systems prepares your organization for a wide range of eventualities and improves the odds of a good outcome even in the most challenging circumstances.
The Four Main Steps of System Lifecycle Management
ICA Engineering identifies four main steps for system lifecycle management:
- Inventory and cataloging
- Gap analysis
- Risk analysis
- Remediation planning
Each of these steps may break down further into smaller actions.
Inventory and Gap Analysis
In our last article, we explored cataloging and gap analysis in more detail. Creating an accurate and thorough inventory is vital. Every controller, human-machine interface (HMI), switch, communications driver, and piece of large plant equipment needs to be logged in detail, including make, model, and software and firmware versions where relevant. Careful cataloging allows lifecycle management specialists to note which equipment needs spare parts in stock, which might need maintenance soon, and which legacy systems can be upgraded to make them more secure. It’s essential to include remote systems and anything connected to industrial automation as well as equipment that’s manually handled by employees.
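As a rough illustration of what a catalog entry might hold, here is a minimal Python sketch. The field names, device tags, and vendor names are all hypothetical and are not taken from ICA Engineering’s process; a real inventory would carry far more detail.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Asset:
    """One catalog entry. Every field name here is an illustrative assumption."""
    tag: str                         # site-specific identifier
    kind: str                        # e.g. "controller", "HMI", "switch"
    make: str
    model: str
    firmware: Optional[str] = None   # None where a version is not relevant
    remote_access: bool = False      # reachable from outside the plant network?
    spares_in_stock: int = 0


def missing_spares(catalog):
    """Return the tags of assets that have no spare parts on the shelf."""
    return [asset.tag for asset in catalog if asset.spares_in_stock == 0]


# Hypothetical example data.
catalog = [
    Asset("PLC-01", "controller", "ExampleCo", "X100", "2.4.1", spares_in_stock=1),
    Asset("HMI-03", "HMI", "ExampleCo", "P7", "1.0.9", remote_access=True),
    Asset("SW-02", "switch", "ExampleNet", "S24", spares_in_stock=2),
]

print(missing_spares(catalog))
```

Even a simple structure like this makes it possible to ask questions of the inventory, such as which devices have no spares, instead of relying on someone’s memory.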
Gap analysis compares how things are now with where they need to be. This could highlight systems that aren’t secure enough, equipment that’s hit its end-of-life, or software that’s become incompatible with business-critical devices.
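One narrow slice of that comparison, firmware versions, can be sketched in a few lines of Python. The device tags and version numbers below are made up, and the code assumes purely numeric dotted versions such as "2.4.1"; real vendor version strings are often messier.

```python
def version_tuple(version):
    """Turn '2.4.1' into (2, 4, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def find_gaps(current, required):
    """Return {tag: (current, required)} for devices below the required version.

    A device that is required but missing from the current inventory is
    reported with a current version of None.
    """
    gaps = {}
    for tag, needed in required.items():
        have = current.get(tag)
        if have is None or version_tuple(have) < version_tuple(needed):
            gaps[tag] = (have, needed)
    return gaps


# Hypothetical "where we are" and "where we need to be".
current = {"PLC-01": "2.4.1", "HMI-03": "1.0.9"}
required = {"PLC-01": "2.4.0", "HMI-03": "1.2.0", "SW-02": "3.0.0"}

print(find_gaps(current, required))
```

Here the HMI is flagged because it runs an older version than required, and the switch is flagged because it does not appear in the current inventory at all.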
Risk Analysis and Creating a Remediation Plan
The final two steps are assessing your industrial organization’s potential points of exposure to risk and creating a meaningful plan to deal with those situations.
Once you understand the infrastructure of your industrial systems, you can start assessing which risks are most likely to impact your business continuity. Understanding how all the pieces of your organization interconnect will help you determine the potential impact should any one of them fail. Risks common to many industrial settings include:
- Lost configuration and device failure
- Production environment change
- Unauthorized access to documents, equipment, or areas of your facility
Staffing problems and a lack of qualified personnel can also reduce production. “The Great Resignation” has led many highly skilled staff to take their experience and deep knowledge of systems and equipment elsewhere. New employees may have the drive and will to learn new skills, but it takes time and extensive training to fill those gaps when experienced workers retire or simply leave. They may find legacy technology unfamiliar and non-intuitive. This situation is exacerbated if you don’t have a working knowledge base that covers all aspects of machinery, computers, and other equipment. Relying on one person to be the expert in IT or for a particular piece of machinery only works until that person leaves.
Problems also occur as devices become obsolete. End-of-life equipment may not receive support from vendors, leaving organizations with useless equipment that can’t be upgraded or repaired. Equipment failure should be a situation that’s quickly remedied, but if you can no longer get spare parts for that piece of equipment, production may be slowed or halted as you seek alternative solutions.
Supply chain issues are another risk, and one that’s become a significant problem since the pandemic. Bottlenecks at ports cause shortages in raw materials and equipment parts, while delays for logistics partners cause customer and client dissatisfaction. Ensuring you have clear channels of communication along your supply chain and that your third-party vendors have adequate cybersecurity to protect their links in the chain is a key part of reducing risk.
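One common way to compare risks like these is to score each one for likelihood and impact and rank by the product. The Python sketch below uses that convention; the 1-to-5 scales, the example risks, and their scores are assumptions made for illustration, not figures from ICA Engineering.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (minor) to 5 (production halted)

    @property
    def score(self):
        """Simple likelihood x impact score; higher means more urgent."""
        return self.likelihood * self.impact


def rank(risks):
    """Return the risks sorted from highest score to lowest."""
    return sorted(risks, key=lambda risk: risk.score, reverse=True)


# Hypothetical risk register.
register = [
    Risk("Lost controller configuration", 3, 4),
    Risk("Spare part unavailable for end-of-life drive", 4, 5),
    Risk("Unauthorized remote access", 2, 5),
    Risk("Sole system expert leaves", 3, 3),
]

for risk in rank(register):
    print(f"{risk.score:>2}  {risk.name}")
```

A single number hides a lot of nuance, but a ranked list gives the team a defensible starting point for deciding which remediation plans to write first.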
Remediation planning, or hazard planning, means putting contingencies in place to help prevent shutdowns at your facility if any of the scenarios highlighted in your risk analysis occur. A plan should take into account:
- What risk it addresses
- Prevention strategies
- Methods of mitigating damage and keeping production continuous
- If production does stop, the length of time to recovery
- How to log incidents and report them to the relevant authorities where necessary
- How to use information about incidents to improve the hazard plan for potential future events
Effective remediation planning considers all the risks your organization could plausibly face. A suite of plans covering a range of situations lets you tailor your response to the size of your business and the scope of the incident. The prevention strategies you identify might require you to take action right now, well ahead of any incident. This action could include improving your in-house knowledge base, upgrading safety processes and equipment, or even updating legacy systems.
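To make the elements of a plan concrete, here is a minimal Python sketch of a plan record that also logs incidents and flags when the plan should be revisited. The class, its field names, and the review rule (recovery took longer than the target) are hypothetical choices for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class RemediationPlan:
    risk: str                      # what risk the plan addresses
    prevention: list               # actions to take ahead of any incident
    mitigation: list               # actions to limit damage and keep producing
    target_recovery_hours: float   # planned time to recovery if production stops
    report_to: list = field(default_factory=list)   # who must be notified
    incidents: list = field(default_factory=list)   # (summary, hours) log

    def log_incident(self, summary, actual_recovery_hours):
        """Record what happened and how long recovery actually took."""
        self.incidents.append((summary, actual_recovery_hours))

    def needs_review(self):
        """True if any logged incident took longer to recover than planned."""
        return any(hours > self.target_recovery_hours
                   for _, hours in self.incidents)


# Hypothetical plan and incidents.
plan = RemediationPlan(
    risk="End-of-life drive fails with no spare available",
    prevention=["Stock one spare drive", "Schedule migration to supported model"],
    mitigation=["Switch to standby line"],
    target_recovery_hours=4.0,
    report_to=["Plant manager"],
)
plan.log_incident("Drive failed, spare fitted from stock", 3.0)
print(plan.needs_review())
plan.log_incident("Drive failed, spare had to be ordered", 12.0)
print(plan.needs_review())
```

Keeping the incident log alongside the plan is one simple way to feed real events back into the next revision.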
Look out for the final article in our series, which will explore an example of exactly how ICA’s System Lifecycle Analysis proposition positively impacted one particular industrial organization. If this topic has left you wanting to know more about how system lifecycle management aids business continuity, consult ICA Engineering for more information.