Many industrial leaders operate their businesses with a false sense of security. Ransomware events such as those faced by Norsk Hydro, Hexion, and Momentive in 2019, which caused considerable operational disruption, serve as a warning to all industrial organizations. According to IBM, cyberattacks against industrial targets doubled in 2019. Such attacks not only bring down IT infrastructure, where the average cost of downtime reached $5 million per hour according to ITIC for nine critical verticals including manufacturing, utilities, and healthcare, but also affect operational technology (OT) networks with even costlier impact. Ponemon Institute surveys conducted in 2018 and 2019 involving utilities and manufacturing companies reveal that
56% of respondents reported at least one shutdown or operational data loss per year with many reporting outages, damage, injury, and even environmental consequences from incidents involving OT,
insider threats represented the majority of attacks in OT, and
45% of organizations experienced attacks involving IoT/OT assets.
While cyber-attacks continue to rise, many companies feel ill-prepared to manage their OT cybersecurity risk. The same Ponemon Institute surveys highlighted several key challenges facing these organizations: a lack of alignment between OT and IT security, difficulty finding and building industrial cyber skills among employees, and an incorrect belief that protections designed for IT are effective for OT. The reasons include
the IT/IS teams’ insufficient knowledge of OT systems, operation and environment,
a lack of cybersecurity expertise in OT teams,
a lack of adequate engagement from operations leadership in cybersecurity, and
occupational biases in IT/IS teams.
The combination of rapidly expanding OT connectivity in Industry 4.0, the increasing rate and sophistication of cyber-attacks, and uncertainty on how to effectively manage OT cybersecurity leaves many industrial companies exposed to significant financial and operational risk.
In the following sections we review the problems with some conventional cybersecurity methods, key considerations for OT, and recommendations for effective risk management for industrial companies.
Centralized cybersecurity approaches – high cost, low effectiveness for OT
The IT/IS team is often requested to lead OT cybersecurity efforts to exploit potential cost synergies and their knowledge of cybersecurity. It is natural for them to explore and strongly advocate for opportunities to apply IT security practices to OT. These IT security practices mostly depend on centralization, early detection, and standardization to achieve economy of scale. In theory, extending them to OT should deliver cost efficiency at scale by using existing network security tools such as segmentation, centralized security operations to monitor and respond when an intrusion is detected, and applying standardized endpoint security tools to edge devices and systems. It also minimizes the need for change management and cross-functional engagement as it is largely managed by the central IS team using the latest and greatest technical tools.
There are, however, fundamental differences in the intended use of these two networks. On the IT side, devices are mostly general-purpose computing devices (e.g. computers, phones, and servers), network performance variation is more of a nuisance than a real disruption to core operations, and standardized endpoint security tools (e.g. anti-virus) are easily applied and maintained over time. In contrast, cyber-physical systems connected to OT networks are special-purpose devices (e.g. an anesthesia machine or a CNC machine) installed to function in a specific manner and support specific processes in operations. Therefore, standardized endpoint security tools cannot be applied to these special-purpose devices without ensuring the integrity of their functional performance in the given environment. Further, OT network performance degradation due to security tools can be operationally disruptive.
As illustrated in the following figure, extending centralized IT security practices to OT environments can be both ineffective and costly. Worse yet, the approach may create a false sense of security in the organization. Even advanced technical tools such as artificial intelligence (AI) based intrusion detection systems (IDS), advanced threat detection (ATD), and micro-segmentation (i.e. software-defined networking) may fail to deliver much value to OT cybersecurity in most cases. The key reasons are:
1. Benefits are limited mostly to managing broad network-based attacks:
The approach is heavily geared towards managing the risk of broad network-based attacks. In this case, attackers cast a broad net by injecting malware into a network and hoping that the malware reaches a device that fits the target profile. Granular segmentation, in theory, can reduce the risk of an end device being infected by restricting and managing traffic to the device through various network-based policies. Segmentation, however, may not help against a targeted attack, for example one that reaches a cyber-physical system using already compromised access credentials. Targeted attacks, in which attackers use knowledge of an organization’s vulnerabilities, operating environment, and compromised user credentials, account for 86% of attacks in the manufacturing sector. In fact, in 70% of the cases, publicly known vulnerabilities are exploited.
This approach also falls short in managing the risk of insider attacks, which are on the rise. According to a Nucleus Cyber report, 60% of companies experienced insider attacks in 2019. Malicious, unskilled, or misled insiders can pose big risks to organizations.
2. Granular segmentation may not actually be feasible:
Granular segmentation often comes at the expense of network performance. Further, OT environments present considerable diversity in device types, intended functions, and associated processes. Hence, deploying and managing policies for a multitude of scenarios adds significant cost, because policy management requires a highly skilled workforce. Unmanaged policies, at the same time, can lead to either operational disruptions or security vulnerabilities. When a Midwest-based hospital failed to keep the policies and parameters for its medical devices in step with needed operational changes, the network started to reject genuine medical devices and caused disruptions in patient care. Hence, granular segmentation may not be applicable for many OT applications.
3. IDS in OT environments can be very expensive:
There is value in detecting intrusions early, before any of the OT devices are affected. An infection of an OT device could lead to malfunction or downtime resulting in operational disruption. Trade-offs involving placement of IDS include network performance, accuracy of detection, and cost. Similar to network segmentation, bringing IDS capability closer to the OT devices, and eventually onto the device itself (i.e. host-based IDS), would increase detection accuracy. This action, however, can add significant data load onto the network and impact network performance. It also adds substantial cost for adding IDS sensors onto or closer to the installed OT devices, ensuring functional as well as network performance for the systems, and managing the integrity of the sensors over time.
On the other hand, network-based IDS, typically placed further away from the end devices, does not affect network performance as much. It does, however, lead to lower detection accuracy. As a result, full-time monitoring (i.e. analyst-in-the-loop) resources would be required. There is considerable cost associated with triaging the alerts generated by the IDS to find the ones deemed reliable, as shown in the adjacent figure. The diversity of OT devices on a network makes the situation worse at scale than shown in the figure. Diverse devices generate more alerts, and evaluating those alerts often requires field personnel with knowledge of the OT device functionality as well as an operational break (i.e. downtime). False positives can create organizational fatigue and a lack of trust in the system. Even the most advanced IDS with AI could lead to significant incremental costs in OT applications with questionable return.
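The triage burden described above can be illustrated with simple base-rate arithmetic. The following is a minimal sketch, not a model of any particular IDS: the function name and every input value (event volume, attack rate, detection rate, false-positive rate) are hypothetical assumptions chosen only to show how the numbers interact.

```python
def alert_triage_load(events_per_day, attack_rate, detection_rate, false_positive_rate):
    """Return (alerts per day, share of alerts that are real attacks).

    All inputs are illustrative assumptions, not measured values.
    """
    attacks = events_per_day * attack_rate
    benign = events_per_day - attacks
    true_alerts = attacks * detection_rate        # real attacks that raise an alert
    false_alerts = benign * false_positive_rate   # benign events that raise an alert
    total_alerts = true_alerts + false_alerts
    reliable_share = true_alerts / total_alerts if total_alerts else 0.0
    return total_alerts, reliable_share


# Hypothetical network: 1,000,000 events per day, 1 in 100,000 malicious,
# 99% detection rate, 1% false-positive rate.
total_alerts, reliable_share = alert_triage_load(1_000_000, 0.00001, 0.99, 0.01)
print(f"alerts per day: {total_alerts:.0f}, reliable share: {reliable_share:.4%}")
```

With these made-up inputs, analysts would face roughly 10,000 alerts per day, of which fewer than 1 in 1,000 corresponds to a real attack. Even a small false-positive rate dominates the workload when genuine attacks are rare.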
4. Many OT devices and the associated risks can remain unmanaged:
The centralized approach to cybersecurity also relies primarily on active and/or passive scanning operations to discover devices and identify associated product-level vulnerabilities. In many instances, active scanning shuts down or reboots OT/IoT devices. In such cases, advanced passive scanning is used along with other analytical methods. Many devices, such as the one shown in the preceding figure where a motor vibration tester is in a peer-to-peer connection with the motor/drive, are usually not identified by such scanning operations. Further, many intermittently connected OT devices can remain misidentified. It is hard to manage risk relating to devices that are not identified, tracked, and managed. In general, about 15 – 40% of devices may remain unmanaged in this approach, leaving substantial unaccounted risk.
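One way to estimate how many devices scanning has missed is to reconcile scan results against an independently maintained asset register. The sketch below assumes such a register exists (for example, from a plant-floor walk-down); the function names and device identifiers are hypothetical.

```python
def unmanaged_devices(asset_register, discovered_by_scan):
    """Devices listed in the register that no scan has identified."""
    return sorted(set(asset_register) - set(discovered_by_scan))


def unmanaged_share(asset_register, discovered_by_scan):
    """Fraction of registered devices that scanning did not identify."""
    register = set(asset_register)
    if not register:
        return 0.0
    return len(register - set(discovered_by_scan)) / len(register)


# Hypothetical identifiers, for illustration only.
register = ["cnc-01", "plc-07", "hmi-03", "drive-11", "vibration-tester-02"]
scanned = ["cnc-01", "plc-07", "hmi-03"]

print(unmanaged_devices(register, scanned))
print(f"{unmanaged_share(register, scanned):.0%} of registered devices unmanaged")
```

In this made-up example, the peer-to-peer tester and its drive never appear in the scan results, so 2 of 5 registered devices (40%) would carry risk that nobody is tracking.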
5. Many critical vulnerabilities including human factors go unaddressed:
The centralized approach is often limited to identifying product-level vulnerabilities. Various studies, however, indicate that up to 95% of breaches are human-enabled. For example, misconfigurations such as missing isolation, weak passwords, delayed updates, disabled QoS, incorrect permissions, inconsistent system integration, and missing authentication are frequently exploited in cyber-attacks. The centralized approach does not adequately address such human factors in cybersecurity, and leaves the organization with a false sense of security. Semiconductor giant TSMC faced a virus outbreak that brought three of its plants down. The cause was mis-operation during the software installation process for a new tool, which allowed a virus to spread once the tool was connected to the network.
6. Ineffective incident response:
The centralized approach assumes relatively standardized response mechanisms. An incident response involving an OT network, however, often requires coordination across many domains. In most cases, isolating just an infected device without bringing the whole network and associated processes down may not be as feasible in OT as it would be in IT. In other words, it is difficult to have standardized responses in the case of OT devices.
Extending IT security practices based on centralization, early detection, and standardization onto the OT side can result in high costs and large unmanaged risk. Applied generally, without consideration of the OT devices and environment, this approach leads to low ROI while providing a false sense of security to the organization.
Optimal risk-based approach to OT cybersecurity
To be efficient, cybersecurity needs to be risk-based so that it prioritizes the areas of greatest impact. Cybersecurity of OT needs more of an operational approach through which all vulnerabilities impacting a device are proactively identified and risk is managed by a cross-functional team. As discussed in the preceding sections, cyber-risk associated with a device includes not only technology but also people, policies, and process considerations. A security control implementation may sometimes come at the expense of operational flexibility or other operational values. Hence, operational risk-return trade-offs have to be part of risk management decisions involving cyber-physical systems. In this way, the most effective security tools are applied to the biggest risks to the business. An example of how the risk-based approach can be optimally applied to OT cybersecurity is illustrated in the following figure.
An optimal solution may include perimeter controls, IDS at the enterprise level to detect any unusual activities before the OT network is infected, datalink (VLAN) or network layer segmentation, demilitarized zones (DMZ) where data and services can be shared, cybersecurity training for all personnel, and active device-level risk management using relevant controls. It would ensure effectiveness, efficiency and adaptability in cybersecurity risk management over time.
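The risk-return trade-off behind a risk-based approach can be sketched as a simple ranking exercise. This is an illustrative assumption about how such a trade-off might be quantified, not a method taken from this paper: the `Device` fields, the expected-loss formula (likelihood times impact), and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    likelihood: float      # assumed annual probability of a disruptive compromise (0..1)
    impact: float          # assumed cost of that disruption, in dollars
    control_cost: float    # assumed annual cost of the proposed control, in dollars
    risk_reduction: float  # assumed fraction of expected loss the control removes (0..1)

    @property
    def expected_loss(self) -> float:
        return self.likelihood * self.impact

    @property
    def net_benefit(self) -> float:
        # Expected loss avoided by the control, minus what the control costs.
        return self.expected_loss * self.risk_reduction - self.control_cost


def prioritize(devices):
    """Rank devices so controls with the highest net benefit come first."""
    return sorted(devices, key=lambda d: d.net_benefit, reverse=True)


# Hypothetical devices and figures, for illustration only.
fleet = [
    Device("hvac-controller", 0.30, 50_000, 10_000, 0.5),
    Device("cnc-line-1", 0.10, 2_000_000, 20_000, 0.5),
    Device("packaging-plc", 0.05, 1_000_000, 5_000, 0.8),
]

for device in prioritize(fleet):
    print(f"{device.name}: net benefit ${device.net_benefit:,.0f}")
```

In this made-up fleet, the CNC line ranks first even though it is the least likely to be attacked after the packaging PLC, because its disruption is the most costly. The HVAC controller shows a negative net benefit, which suggests the proposed control costs more than the loss it is expected to avoid. In practice, the inputs would come from the cross-functional team described above rather than from a central security team alone.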
Successful execution of the suggested strategy, however, requires a considerable level of change management. Hence, industrial organizations must make OT cybersecurity a top business priority, link it to the board of directors’ risk oversight process, and operationalize a robust cybersecurity program via cross-functional engagement. The program needs to be championed by an operating executive who is also accountable for operational excellence. Doing this ensures the right stakeholders are involved and the right trade-offs are made around operational performance, cost, and security. More importantly, it helps institutionalize a culture of security – the best defense mechanism! This is analogous to Toyota’s approach to quality.
Overall, cyber-physical systems together with the Internet of Things (IoT), Big Data and Cloud Computing in Industry 4.0 promise to deliver improved productivity, quality, and compliance. According to some experts, increased interconnectedness and digital collaboration across the full supply-chain can further reduce operational costs by at least 30% and reduce inventory requirements by as much as 70%. The exponentially increasing connectivity, however, also raises concerns around cybersecurity. Hence, organizations will have to make OT cybersecurity a core competency to win in today’s digitally connected environment.
ResiliAnt has developed a proprietary solution to help organizations manage their IIoT/OT cybersecurity risks. It includes a platform that helps address all of the challenges mentioned above, from tracking inventory, vulnerabilities, and threats to training personnel, mitigating risks, and responding effectively when an incident takes place. If you are interested in learning more about ResiliAnt’s solution, you can reach us at firstname.lastname@example.org.