The 4 types of OT monitoring and which to choose

Raphael Arakelian has been testing OT monitoring systems for their effectiveness for many years. He has developed a guide that identifies 4 categories of OT monitoring. In this episode of OT Security Made Simple, Raphael explains to host Klaus Mochalski and our listeners the differences and which type works best depending on the operational goal(s) within a company.

Duration:

40 min

Guest in this episode:

Raphael Arakelian

OT & IoT cybersecurity manager at PwC Canada

Transcript

Klaus Mochalski

Hello and welcome to a new episode of the OT Security Made Simple Podcast. I'm Klaus Mochalski, your podcast host, and I'm a founder of Rhebo. My guest today is Raphael Arakelian. He is with PwC Canada, and he is an OT and IoT Cybersecurity Manager. We'll spend our time today talking about a classification scheme for OT security solutions that I believe you have developed. But before we dive into this, a few words about yourself, Raphael.

Raphael Arakelian

Thanks, Klaus, for the introduction. Thank you, Rhebo, for having me. I'm very excited for our conversation today. I've been with PwC Canada for the past few years. I specialize in OT and IoT cyber security, specifically in OT cyber security monitoring. I lead our national team here in Canada, where we do head-to-head evaluations of OT cyber monitoring products. We also do proof of concepts, implementations, and also configuration or monitoring services. This is definitely a topic that I'm very passionate about and happy to talk about here today.

Klaus Mochalski

Very good. We spent some time preparing our session today, and we had some interesting conversations about your classification scheme. We don't have the liberty to use slides today, so I think it's up to you to tell our listeners what your scheme looks like. What different categories do you see and what's special about these categories? I think most importantly, how do they serve your customers? How do they help your customers pick the right OT monitoring solution?

Raphael Arakelian

Yeah, so first a bit of a background before we jump into the categories. Earlier this year I developed a guide for asset owners to help them conduct head-to-head evaluations in proof of concepts for OT cyber monitoring tools. This was part of my presentation or technical session for CISA's ICS-JWG annual event, which stands for the Industrial Control Systems Joint Working Group, which takes place annually in Salt Lake City.

So this is a guide that I developed based on the thought leadership that I've put together over the past few years based on my experience both in the field and in a lab environment that I continuously research products on. The guide itself, as I mentioned, helps asset owners through a step-by-step structure to collect requirements about what they're looking for in a cyber monitoring product, understanding the different types of OT cyber monitoring, and then there's various phases which at the end are concluded by the company picking a product. So the categorization, which I talked about early on in this guide in the initial phases, identifies that there are four main different types of OT cyber monitoring, and I classify them as follows:

network-based OT monitoring,
host-based OT monitoring,
integration-based monitoring,
and then the fourth and final one is targeted active scanning.

Klaus Mochalski

Okay, that's interesting. So you mentioned four categories. Let me repeat, if I remember correctly: it's host-based monitoring, second is network-based monitoring, third is integration-based monitoring, and the fourth category is active scanning.

Raphael Arakelian

Yes, targeted active scanning, and not to be confused with the traditional or IT type active scanning.

Klaus Mochalski

Right. Maybe let's start from first to fourth and just give a quick summary of what you mean by each category, and most importantly, how they are distinguished from each other?

Raphael Arakelian

Sure. First, I'll give a summary of each one, and after I've done all of them, I'll talk about the pros and cons for each of them.

We'll start with the first one, which is typically seen as the de facto method of OT cyber monitoring. It's the network-based monitoring. When we talk about this type of monitoring, we're talking about the tool essentially leveraging network traffic or network data in an environment, whether that's through port mirroring or through the use of [network] taps. The tool will take in that traffic, dissect it, and analyze it. Based on that, it will identify assets, it will identify attributes, vulnerabilities, and will also do network-based threat detection. This is network-based monitoring. You're essentially dependent on the network infrastructure of the environment.

The host-based monitoring, which, unlike network-based monitoring, is not reliant on the network infrastructure, is reliant on individual endpoints. So host-based monitoring will use agents that will be deployed on specific endpoints. Typically, these are IT-type endpoints. So Windows machines, Linux, etc. There are very few vendors that also provide host-based monitoring by deploying agents on OT-type endpoints. The majority is focused on IT-type endpoints. And typically what's happening with host-based monitoring is instead of network traffic, traditionally through switches, it's actually examining the various types of audit trails and log files that are on the individual endpoint or host to be able to do those various activities that I mentioned for network-based monitoring.

The third type, integration-based monitoring, is a term that I like to use specifically for tools that are focusing on the data sources that are available in an operational environment. Organizations might have a SCADA application, they might have an OPC software or server that's running in the environment. Integration-based monitoring will take those data sources, which are available and considered as a source of truth, and it will help either create a configuration management database or it will give visibility into the vulnerabilities. So it's focused more on asset inventories and on the identification of vulnerabilities, and it focuses less on threat detection. That's what integration-based monitoring is. Integration-based monitoring can be a standalone product, although it's very rare. Most vendors actually use integration-based monitoring as a supplemental feature for network-based monitoring or host-based monitoring. So it's typically an add-on, although there are very few vendors that also only focus on integration-based monitoring.

Klaus Mochalski

So, is this something that we would usually source directly from your asset vendor? Let's say you have a Honeywell or Siemens environment, and so you would turn to these vendors for an integration-based monitoring solution.

Raphael Arakelian

Yes, that is correct. Well, not with the OEM directly, but the OT monitoring vendor would have developed custom integrations with your existing inventories to ingest that data and then to be able to analyze it for those different functions that I mentioned – whether it's asset inventorying or vulnerability detection.

I'd also mention that any ingestion of configuration files or existing asset inventories, even if it's just an Excel sheet, even though it's a manual process, I still classify that as integration-based monitoring because you're not taking in live data from the network infrastructure, like the switches or from the endpoints themselves. You're taking data that is available to you and running the processes that you need for monitoring. So that's integration-based monitoring.
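As a rough illustration of the manual case described above, the following Python sketch ingests an exported asset inventory (a spreadsheet saved as CSV) and matches it against a list of known-vulnerable firmware versions. The column names, sample assets, and advisory entries are all hypothetical; this is not any vendor's format or method, only a minimal sketch of working from an existing data source instead of live network or endpoint data.

```python
import csv
import io

# Hypothetical inventory export (for example, a spreadsheet saved as CSV).
INVENTORY_CSV = """asset_id,vendor,model,firmware
plc-01,ExampleVendor,X100,1.2
plc-02,ExampleVendor,X100,1.4
hmi-01,OtherVendor,P7,3.0
"""

# Hypothetical advisory list: (vendor, model, firmware) known to be vulnerable.
ADVISORIES = {
    ("ExampleVendor", "X100", "1.2"): "EXAMPLE-ADVISORY-001",
}


def load_inventory(text):
    """Parse the CSV export into a list of asset dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def find_vulnerable(assets, advisories):
    """Return (asset_id, advisory_id) pairs for assets matching an advisory."""
    findings = []
    for asset in assets:
        key = (asset["vendor"], asset["model"], asset["firmware"])
        if key in advisories:
            findings.append((asset["asset_id"], advisories[key]))
    return findings


assets = load_inventory(INVENTORY_CSV)
findings = find_vulnerable(assets, ADVISORIES)
print(findings)
```

The point of the sketch is that nothing here touches a switch or an endpoint: the inventory is treated as the source of truth, which is also why its accuracy depends entirely on how current that export is.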

The fourth one, targeted active scanning, as I mentioned, is not to be confused with traditional or IT type scanning. I know vendors typically in the OT cyber monitoring space, they avoid using the words active and scanning in the same sentence. Instead, there are about maybe a dozen different names for this. So active queries, safe queries, polling, smart polling, safe polling, probing, selective probing. But I like to group them under this one term, which is targeted active scanning. Because at the end of the day, even though IT-type scanning has all this bad preconception about it, which is true, what happens with targeted active scanning is that the tool is actually scanning endpoints. It's an active communication, but it's very targeted. That means the tool is replicating the known communication, which will typically happen between an engineering workstation and the specific endpoints. It's very targeted, it's very specific, and research shows, and this is something that I'm currently working on, that there is a negligible impact on the endpoints that get scanned with this method.

Klaus Mochalski

Probably, especially for new infrastructures and also I think one of the advantages – but we'll get to the pros and cons later – is the superior data quality that you get in a shorter period of time as compared to the passive solutions.

Raphael Arakelian

Correct. So we will jump into the pros and cons. The last thing I'll mention about targeted active scanning is that similar to integration-based monitoring, it can be its own standalone product or it can be an augmentation or an add-on to network-based or host-based or even integration-based. There are vendors out there, even though they're few, they only focus on targeted active scanning. That's the core of their product. But then there are a bunch of vendors that have targeted active scanning, as I mentioned, as an add-on to compensate for the limitations of whatever their core product is.

Klaus Mochalski

Right. So most of the passive monitoring vendors, one way or the other, have integrated active components into their products that you can enable or that the customer can enable if they choose to do so.

Raphael Arakelian

Yes, certainly. Many of them have gone down that path.

Klaus Mochalski

Okay. So maybe before we get into the pros and cons, our Rhebo product is, I think, natively best categorized as a network-based monitoring product. But we also discussed the overlap here, and I think this is particularly interesting. What would I, as an asset owner, do in terms of classification, and which may be supportive for my later selection process?

Our [the Rhebo] system usually uses network traffic, so I would classify it as network-based. But we also have installations where we integrate directly with the asset. So we may run on an industrial switch, we may run on a PLC, and there are other solutions, other vendors out there doing the same. But even though we are then running on a specific asset and we can potentially run on hundreds or even tens of thousands of assets, even IoT assets, we still operate mostly on network traffic. So would you classify such a solution then as network-based or host-based?

Raphael Arakelian

So in the instances you mentioned, the PLCs for example, is what would end up running on the PLC a sort of agent?

Klaus Mochalski

You could call it an agent, but it's basically a capturing process which captures network traffic of the network interface through different means that the operating system provides. And then the software is doing exactly the same as if the data came from a network tap. It's just not coming from a physical tap, but from, let's call it a virtual tap, running on the PLC as part of the software agent.

Raphael Arakelian

Okay, and the other question I would ask is what's running on the PLC, does it have visibility into the internal processes of the PLCs?

Klaus Mochalski

This would actually be my next question. We actually can do that. We have example installations where we purely work on network traffic, even though we are running on a PLC, but we also have installations where we, in addition to capturing network traffic, also tap into local sources like log files and the audit trails that you mentioned earlier. We are combining these two data sources to basically understand in a shorter period of time what's going on and potentially also provide higher quality analysis on a certain event that's currently happening.

Raphael Arakelian

In the latter example that you gave me where log files are being examined, I would be very confident to call that host-based monitoring. In the example where you're still deploying sort of an agent on the PLC, but it's only looking at network traffic, to be honest, I would still classify it as host-based monitoring just because in terms of the type of the deployment and the effort that's needed and the preconception around it, it would still fall under host-based monitoring. If you're only looking at the network traffic, it would probably not have as much visibility as the classical host-based monitoring deployment. Those are my initial thoughts, I would say.

Klaus Mochalski

Okay. I think it's important to understand for our listeners that the boundaries between these categories can be fluid. And maybe it makes sense now to look at each category and talk about the pros and cons of the solution. And this may then also help with the differentiation between one category or the other, especially if you want to make your first choice of what solution actually to implement.

Raphael Arakelian

Yes. We'll start off with the same order.

So for network-based monitoring, typically based on what I've seen and based on requirements from organizations, network-based monitoring will typically involve a type of deployment that is relatively quick. I know that this also might be fluid or subjective. Of course, there are exceptions to the rule. But overall, when I look at various organizations across various sectors, with network-based monitoring, you have different options. Either your network infrastructure is mature enough, which means you have switches that support port mirroring. In those cases, the deployment will be of relative ease. And if the organization doesn't have a mature network infrastructure, it can either upgrade the infrastructure or it can introduce network taps. If we put cost aside, generally because of the dependency on the network traffic, you will have high coverage through a relatively rapid deployment.

Of course, when you're upgrading the infrastructure, that will take time. But as I mentioned, generally, the organizations will have a few options available to them. That flexibility is important to keep in mind. The other pro, when it comes to network-based monitoring, is that it has a non-intrusive nature. Since it's listening in passively, the non-intrusive nature is beneficial from an operational standpoint. But then this relates to the third benefit or the third pro which is that – because of this non-intrusive nature – it's relatively easy to onboard the OT stakeholders, site owners, asset owners, technicians, etc, to this vision of cybersecurity monitoring because it is not disruptive to their day-to-day processes.

Klaus Mochalski

They don't have to worry that you're breaking something in the process of deploying the solution.

Raphael Arakelian

Yes. It's typically much lower risk, and then that provides peace of mind theoretically, and that can help things move faster. The other pro, which I alluded to earlier, is that with this type of monitoring you're typically doing asset management, configuration management, vulnerability management, and threat detection as well. So there's a broad coverage of cybersecurity controls.

Now, when we shift to the cons, it's interesting because it's a double-edged sword. The ease of deployment, the high coverage can shift into cons because, even though you might have a high coverage of assets, the visibility into the individual assets is going to be low and there's going to be a fair amount of configuration required and simulation of traffic to be able to enrich the information about the assets. And sometimes for many organizations, that's not really sustainable if they're talking about thousands of assets. So that's one of the limitations for network-based monitoring.

The other limitation – again, there's a high coverage of cybersecurity controls – but when it comes to threat detection, there is actually a high rate of false positives. And that's because the specificity or the visibility into the assets is low and you're also relying on network traffic. The confidence of what's showing up as alerts is going to be relatively low. I guess that's for network-based monitoring.

We'll jump to the host-based monitoring if you don't have comments or questions. Right, no? That's good. For host-based monitoring, the pros are that you will have high confidence in the visibility or specificity into individual endpoints. That's because you have an agent that's specifically residing on that endpoint and it's aware of the processes and various logs and trails that are being assessed or analyzed. So, you have high confidence in what's being seen. I would say the other benefit is that host-based monitoring will be able to cover all those cybersecurity controls that network-based monitoring covers, in addition to some new ones, like patch management, for example, which typically will not be conducted through network-based monitoring. It's something that can be conducted through host-based monitoring. I would say those are the strong benefits, and they typically complement the limitations of network-based monitoring.

In terms of limitations though, the deployment process for host-based monitoring is going to be slower. You are dealing with individual assets that require the deployment. In the OT environment, there's typically a lack of a centralized capability to push the agents on the individual endpoints. So you have to basically do that on each one. The other downside or con is that there are typically more reservations from the OT stakeholders to deploy agents on endpoints because there might be risks associated with disruptions.

Although based on research and based on what we've seen in OT environments and agents that have been specifically designed for OT endpoints, the risk is quite low because the workload on those endpoints is the same as an IT-type agent. Of course, there's always a risk and it's always relative when you compare it with network-based, but there is that preconception. So sometimes it might be more difficult to onboard stakeholders to that vision. I would say those are the main downsides.

The other downside with host-based monitoring that also comes to mind is, you will have more limitations in terms of compatibility because the agent has to be compatible with the endpoint itself, with the OS of the endpoint. Most vendors, like I mentioned, focus on the IT type. So if your environment doesn't have, let's say, IT-type endpoints and the vendor doesn't offer capabilities on the OT-type endpoints, then you won't have a lot of options on going down that route. There are more infrastructure dependencies compared to network-based monitoring because [for network-based monitoring] you most likely will have a network switch in your environment.

Klaus Mochalski

Especially in multi-vendor environments, this could become a problem that you never get to the same level of coverage that you would get with a network-based solution. Because the different vendors are lacking the support for the agents that you would need to deploy.

Raphael Arakelian

Yes, that is correct. Typically, if we were to compare the two: network-based monitoring, more coverage, less visibility or specificity; host-based monitoring, less coverage, but very high specificity or visibility into individual endpoints. That's why they complement each other.

Klaus Mochalski

To me, that sounds like a hybrid solution which would be an ideal choice here where you basically complement the pros and cons. So, basically deploying both where they provide the best fit and let's say support from the vendor, this might make for the best solution possible. But before discussing this, let's also look very quickly at the pros and cons for the two other categories.

Raphael Arakelian

Sure. So for the integration-based, the pros are that you are dependent on, you're relying on sources of truth that you know are applicable to your environment because of what you're utilizing – the SCADA, OPC, etc. You will have a high confidence or visibility into the asset inventories that are being created or ingested from that. Also, you would be able to track configurations, vulnerabilities – again, without having dependencies on the network traffic or the individual endpoints or the rollout. From that perspective, you'll have very high confidence in those types of cybersecurity controls.

The cons, once again, we see the double-edged sword coming in. Because this type of monitoring requires a dependency on the database itself or the software that you're working with, then there is a high level of customization that's required. So if you have a Siemens platform, an ABB platform, that would require the vendor, the OT monitoring vendor, to actually have an integration with each of those. And that takes a lot of time and effort, and there should be a relatively decent return on investment for the vendor to be able to commit to that. There may not be a lot of options on how to do that.

The other con might not be a con, it's more about a taste on how to approach OT cyber monitoring. But with this approach, you're not focusing on threat detection. You're focusing on developing a comprehensive asset inventory so you can track your configuration items, vulnerabilities, etc. That's all for integration-based monitoring.

For targeted active scanning, the pros are going to be very high visibility into individual assets. Because you are basically retrieving specific individual responses from those endpoints based on communication that they're typically accustomed to transmitting. You're going to get high visibility into the information about the assets, into their configurations. When you do that, you're also going to have high confidence in the vulnerabilities that are being detected. That's one of the main advantages of targeted active scanning.

When it comes to the cons, there is also that preconception sometimes, and even manufacturers or OEMs might not want scanning to be conducted on their equipment. We also see that sometimes with the deployment of agents. So there is that reservation. The other con is in terms of deployments, some vendors require their solution to be software that has to be installed on a workstation. So there's going to be limited compatibility there. Some sites might not have workstations. There are specific industries that don't have that flexibility.

The other challenge is that with targeted active scanning, although every type of monitoring requires configuration, this type of methodology will require a bit more effort in terms of the configuration. Because you want to make sure that the profiles that you're using or the targeted scans that you're using are matching up with the endpoints and what they're expecting to be seeing. Of course, your environment is changing, so you always need to make sure you're on top of that, so you don't send packets that are not expected by the endpoints and cause potential disruptions. And of course, there's also another consideration, which is the capability of the vendor. Because there's so many protocols, so many vendors, so many different types of devices, you may not have the full coverage of capabilities. The most common manufacturers, most common protocols will typically be covered, but there's a fair percentage that your vendor might not be able to cover, and that's something to keep in mind.
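The configuration concern described above, making sure scan profiles match the endpoints before anything is sent, can be sketched as a simple planning step. This Python sketch sends no packets; it only decides which endpoints have a matching profile and which should be skipped. The profile names, vendor names, and the way endpoints are described are hypothetical and do not reflect any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    name: str
    vendor: str
    protocol: str


# Hypothetical profiles: (vendor, protocol) -> name of the query profile that
# mimics what an engineering workstation would normally send.
PROFILES = {
    ("ExampleVendor", "modbus"): "examplevendor-modbus-identify",
    ("OtherVendor", "s7"): "othervendor-s7-identify",
}


def plan_scans(endpoints, profiles):
    """Split endpoints into those with a matching profile and those to skip."""
    planned, skipped = [], []
    for ep in endpoints:
        profile = profiles.get((ep.vendor, ep.protocol))
        if profile is None:
            # No known-good profile: do not send anything to this endpoint.
            skipped.append(ep.name)
        else:
            planned.append((ep.name, profile))
    return planned, skipped


endpoints = [
    Endpoint("plc-01", "ExampleVendor", "modbus"),
    Endpoint("plc-02", "OtherVendor", "s7"),
    Endpoint("rtu-01", "UnknownVendor", "dnp3"),
]
planned, skipped = plan_scans(endpoints, PROFILES)
print(planned)
print(skipped)
```

The skipped list corresponds to the coverage gap mentioned above: devices for which the vendor has no profile simply do not get queried, and the plan needs to be rebuilt whenever the environment changes.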

Klaus Mochalski

Right. So, it's pretty much a trust issue, if you think about deploying them and you're worried about disruption. And then in the long run, it becomes a support issue where in the network-based solution you are up against a certain number of network protocols that you need to support. While with this targeted active scanning approach, you are up against many different products, product categories, but even individual products that may behave differently, even for a single vendor. So you need to rely on the long-term support of your selected vendor in this case.

Raphael Arakelian

Yes, because a high level of customization will still be required.

Klaus Mochalski

You have looked at the market of solutions for quite some time. You've seen deployments at customers, advised customers. You built this classification scheme. Do you see the market of OT monitoring solutions heading in a certain direction? Is it like a hybrid solution using all of them? Or do you think one or two of these categories will prevail in the market in the long run?

Raphael Arakelian

That's a good question. I'm not sure if I can say that a specific category will prevail. I think there will always be needs for all four of them. I think more and more what we will see is what I refer to in the guide as a decoupling of threat detection from OT cyber monitoring, because historically, like I mentioned, through network-based monitoring, you were able to do asset management, configuration management, et cetera, but also threat detection.

Now I see more organizations, especially with more vendors playing that role, start to shift their focus on asset inventories only or tracking configurations, tracking vulnerabilities without the threat detection. But that doesn't mean that the types of monitoring that focus on threat detection will become obsolete. I think at the worst case, it will be split 50-50 in the market between these two approaches. I think eventually what we will see is these types of solutions coming together. Of course, that's going to be potentially in the top 5 to 10% of mature OT organizations doing that. But I think more and more, we're going to see the shift on asset inventories, but then also complementing that to the limitations of either network-based or host-based monitoring.

Klaus Mochalski

It seems that a long-term vision could be that when they start to think about open interfaces, that the solutions can actually be used in conjunction and work together and share data either at a central vantage point like a SIEM system, or even by exchanging data directly so that you can combine the strengths of different classes of products.

Raphael Arakelian

Yes, and we've actually already started seeing that in the market. There are at least a couple of vendors that I can think of that have been doing this mix and match of capabilities or approaches. Over time, we will see how successful they will be.

Klaus Mochalski

Right. This brings me to the most important question of this episode. We are always trying to give our listeners a key message, a key learning towards the end of each episode. Let's assume I am an asset owner. I'm pretty mature with regard to cyber security. I did my homework with network segmentation. I have proper firewalls and other security controls in place. I'm not using OT monitoring today, but I'm planning to implement it. What would you recommend to me?

And also we all know that budgets for these deployments, they're usually limited in the first instance. So we probably don't have this ideal solution where we can start to mix and match from different vendors and deploy everything in one step. So what's your recommendation for first steps for an asset owner to take when they're choosing one of the different classes that you've mentioned?

Raphael Arakelian

Yeah, that's a great question. So from my perspective, Klaus, regardless of the maturity of the organization or where they are in their OT cyber security journey, there are important questions that they need to ask themselves before they decide on their approach of OT cyber security monitoring.

They need to understand which cyber security controls they would like to focus on. Which are the ones they're lacking? Maybe this is based on best practices, maybe it's based on compliance-related issues, but it's very important to reflect on those. But there's also other questions that might sound detailed, but they're very important.

What are the types of endpoints they have in their environments?

Are there different types of manufacturers? How many are they?

What is the state of their network infrastructure maturity?

Do they have capabilities to match, let's say, what's required for network-based monitoring?

Do they have the capabilities for deploying agents? Do they have the willingness for that?

What is the culture of their OT team? What is the relationship between cyber and OT?

Do they have any limitations regarding the deployment in terms of space or the power requirements?

Although they are detailed, by going through these questions and reflecting on them – and this is something that I've actually laid out in the guide itself that I've developed – the organization will be able to have a stronger grip on understanding its technical and business requirements. And once the organization can do that, it then has to revisit the strengths and limitations of every type of monitoring and think about it with the context of its requirements. Then the organization has to choose the type of monitoring.

Of course, it might not happen directly. They might be required to conduct a head-to-head evaluation between different types of monitoring, and that's totally plausible and can be encouraged. But by going through this exercise, they'll have a better understanding of which strengths they're aiming for and which limitations that they're able to either accept or they're able, as an organization, to keep up with and work around it. I think that's the main recommendation.
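One part of this exercise, checking which types of monitoring cover the cybersecurity controls an organization wants to focus on, can be sketched in a few lines of Python. The mapping below only restates what was said in this conversation about each type and is a deliberate simplification, not a complete capability matrix; the control labels and the `shortlist` function are hypothetical names chosen for this sketch.

```python
# Controls each type was described as covering in this conversation.
# This is a simplification for illustration, not a complete capability matrix.
CONTROLS_BY_TYPE = {
    "network-based": {"asset", "configuration", "vulnerability", "threat_detection"},
    "host-based": {"asset", "configuration", "vulnerability", "threat_detection", "patch"},
    "integration-based": {"asset", "configuration", "vulnerability"},
    "targeted active scanning": {"asset", "configuration", "vulnerability"},
}


def shortlist(required_controls):
    """Return the monitoring types whose described controls cover all requirements."""
    required = set(required_controls)
    return [name for name, controls in CONTROLS_BY_TYPE.items() if required <= controls]


print(shortlist({"asset", "vulnerability"}))
print(shortlist({"threat_detection"}))
print(shortlist({"patch"}))
```

A shortlist like this is only a first filter. The other questions listed above (endpoint types, network infrastructure maturity, willingness to deploy agents, team culture, space and power constraints) still decide which of the shortlisted types is actually workable.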

Klaus Mochalski

Right. Instead of focusing on the pros and cons and picking, let's say, the best class of product, you would rather have to look at yourselves as an organization to understand:

where you stand, what your specific requirements are,
what you need to protect,
what you want to achieve,
what other resources are available.

To me, this sounds like something where we arrive quite often. If we ask these questions, you should pick a certain cybersecurity regulation framework like the NIS framework, or especially in the industry, the IEC 62443. This is probably the best guide to help you through this discovery process of what you have in place, where you stand, what the gaps are.

And then, if you understand this and also the risk level you are willing to accept and what you are willing to invest to reduce risk, then decide for the right solutions that you need to deploy selectively. And maybe different solutions are a fit for different parts of your organization. So this is usually the process that you would go through, if I understand correctly.

Raphael Arakelian

Yes, absolutely. That's a great summary because basically what we want organizations to avoid is doing panic spending or rushing into an approach or purchasing something or evaluating something very limited, which they end up not utilizing, not having the capability to leverage.

Because then they've invested in something that's not helping them, and this preconception perpetuates across peers that we invested in this technology. Yes, the technology is not perfect. It has limitations, but it's important for you to understand it in the beginning, to understand if it is a fit for you or not. Because there might be other methods out there that will help you secure your environments.

Klaus Mochalski

Right. Methodology, again, is more important than the tool at the end, and this would be a general recommendation for everyone, no matter what their maturity is. Look at the security framework of their choice or the ones that fit best to the industry sector they are in and then take it and go through the process. Maybe get help from a consultancy and then have the selection process at the very end of this whole discovery journey.

Raphael Arakelian

Okay. I would say that's a very fair summary.

Klaus Mochalski

Very good. It's interesting that no matter what the specific technical topic is, we always or we quite often arrive at this conclusion. It's something that we can't emphasize enough. I think this is a great summary. I would like to thank you for being on the show. Thank you for presenting this classification scheme. I think there are lots of details that we could discuss potentially in a follow-up episode in more detail. I would be very curious to see how customers are using this methodology. This is something where I would be very curious to see results from your work in the future.

Raphael Arakelian

Absolutely. Thanks so much, Klaus. It's been a great opportunity to have this conversation with you, and hopefully your audience and listeners will be able to benefit from it, and it will encourage them towards the important journey of OT cyber security monitoring.

Klaus Mochalski

All right. Thank you very much for being on the show, Raphael.
