OT Security Made Simple | A week in the life of an OT security manager

This time we take a direct look at the day-to-day work of a security team at a distribution network operator. We welcome Daniel Beyer (area manager for systems engineering and OT) and Sebastian Miethe (network engineering and IT security manager) from Thüringer Energienetze (TEN). The two frontline OT security experts tell us why visibility across all networks, systems and interfaces is key, why a SIEM alone is worth naught, and how systems engineering teams can integrate OT security into their daily routine without falling into alarm fatigue.

 

 

 

Transcript

Klaus Mochalski

Hello and welcome to a new episode of the Rhebo Podcast. I'm sitting here today with Daniel Beyer and Sebastian Miethe from Thüringer Energienetze (TEN) [Remark: the largest energy distribution system operator in Thuringia, Germany]. A warm welcome to you all!

We agreed in advance that today we simply want to tell our listeners how a relatively long-standing customer works with our system [Remark: the OT intrusion detection system Rhebo Industrial Protector]. We want to explore what experience has been gained, how the system is operated and also how over the years the application has evolved.

At the very end, we also want to take a look at the implementation of cybersecurity requirements defined by the German Federal Office for Information Security (BSI). But before we start, a quick introduction. Let's start with you, Daniel.

Daniel Beyer

My name is Daniel Beyer. I'm the head of systems engineering and OT at TEN, and I'm responsible for TEN's operational technology [OT] network. I am responsible for ensuring that it is always operational. And on top of that, I am also responsible for IT security. Mr. Miethe, who sits next to me, is in charge of that.

Sebastian Miethe

I'm still a relatively new colleague at TEN. I think I've been here for about a year and five months. My background is actually in electrical engineering with a specialization in communications technology, and I'm now in Daniel's department, where I'm responsible for OT and IT security.

Klaus Mochalski

Perfect. So you are exactly the right experts. I think that's extremely valuable for our listeners: that it's not always just us as manufacturers who tell them what our customers' experiences are like, but that it comes from the original source. TEN has been a Rhebo customer since 2019. That's when we conducted our security audit [the Rhebo Industrial Security Assessment]. Daniel, I think you were there from the beginning. Can you talk a bit about what the evolution has been like? How did you come to do our security audit in 2019? Was there already a clear plan? How did you proceed, and how has the installation [of the OT monitoring & anomaly detection Rhebo Industrial Protector] developed until today?

Daniel Beyer

We started doing the first security audit in 2019. The motivation for us came from the fact that we had introduced a new control system at our company and a colleague and I had managed this system together. And then we became more and more aware that the requirements are getting higher and stricter, also in IT security and especially in all these network things. And we realized that there are quite a few things that we would like to monitor and – matter of fact – should monitor.

And then there was an event here in Leipzig at the municipal utilities [Netz Leipzig] where I saw the Rhebo system for the first time and found it relatively innovative. I went back after the event and talked to my colleague about it. He said we should test it.

As a result, we had our first security audit. And afterwards, we set up the first instance of Rhebo Industrial Protector, which focused primarily on the control system. That’s how it started. And yes, we've been using it ever since and continue to expand it.

Klaus Mochalski

Perhaps very briefly some background: our security audit or Rhebo Industrial Security Assessment is always the first step with new customers. Its results form the basis to decide together with the customer exactly what the right deployment of the OT IDS looks like. In fact, the standard deployment scenario in the first step is that we monitor the communication to and from the control room. So that coincides with your experience.

And actually because you mentioned Netz Leipzig, that's one of our very first customers. They've been using the system since 2015. So, of course, it's all the more beautiful to see how that inspires others to work with it.

Sometimes, through the security audit things come to light that aren't always pretty for the infrastructure operator. What was it like for you?

Daniel Beyer

I can still remember it very well. We set the Rhebo system up and let it run for three months. For us, a very big effect or a very good result was that we first saw that our infrastructure was well structured. And in the process, of course, some weak points emerged that we were not so aware of. But we were able to mitigate these with the help of our colleagues who were involved in the project. And afterwards we knew what we had to focus on and what Rhebo was actually capable of.

Klaus Mochalski

Yes, over time we have learned more and more how to deal with these delicate results, which are often presented by our technicians. And in the beginning it sometimes came across very direct and blunt. And the customers almost went through something like psychological training, you know. We are often in a situation where we are presenting to a group of technicians. But sometimes the head of the department or the managing director is also present at the concluding workshop of the audit. And then these weak points come to light. Of course, this is not always a happy situation.

But the good thing is that from there we can work together to improve these things. The most important thing is the continuous improvement process that takes place. What were your next steps? When did you decide to include other locations in the monitoring and which are these today?

Daniel Beyer

So at that point we had the system running well, and could get a picture of what communication went towards the control system. We captured all the outer edges, so to speak, and were able to see what was coming into the control system. So the next question was: What is going on in our other network areas? And then we added other network interfaces and segments, which were linked to the Rhebo Controller. And from there we expanded it step by step. That's how it worked for us. Ultimately, we expanded it to the substations. Sebastian can say more about it.

Klaus Mochalski

The substations?

Daniel Beyer

Right into the substations. So we are now trying to cover the entire line of communication, for example, from the substation to our control system. And we also have monitoring in the other direction, i.e. from the enterprise network. So we have all interfaces under control.

Klaus Mochalski

This is actually a trend that we are also observing - the expansion of OT monitoring to substations. Because many substations are currently being digitized. New components are being installed that speak IEC 61850. This means that there are new possibilities not only for automation, but also for monitoring. Sebastian, can you tell us a little bit about what you are currently doing and what you are monitoring?

Sebastian Miethe

We have Rhebo in four places or four network segments. One at the outer edge of OT, not to the Internet, but to the network of our parent company. Initially, we didn't know what came over to us from there. That was interesting. Then one at the OT network connection for the 104 protocol [IEC 60870-5-104]. And the latest two are actually in a service LAN, over which third-party apps run.

Daniel Beyer

Management networks.

Sebastian Miethe

Exactly, but reaching up to the substations. And next we will install a Rhebo sensor everywhere, in all instances and digitalized substations.

Klaus Mochalski

This joint journey with you is really exciting. Because we are also observing the trend that the infrastructures you operate are becoming more and more distributed. More and more distributed components are being added and connected digitally. All the way to things like energy storage systems and charging stations where you have to figure out which operator manages these systems in the respective local markets.

Such systems get added in large numbers. And it is important for us to be able to offer a solution that allows us to monitor such systems by simple means and with manageable effort.

Effort is the perfect keyword, because we get this question very often. How much effort do you currently use on average to operate the Rhebo intrusion detection system, which we used to call anomaly detection system? Many of our new customers are actually afraid of the operational effort, of the so-called false positives, that is, alarms that are not really relevant from an operational point of view. What's your experience there over the last few years?

Sebastian Miethe

Well, I can only speak for the last year and a half. Would you like to start, Daniel?

Daniel Beyer

Well, in the beginning, we spent a lot of time on it, but over time, as you said, the system learns. And then we also gained more experience, and knew what we had to pay attention to. That's why the effort decreased. And we knew how to deal with the false positives. We were able to observe them [using the monitoring function for separate events], and we were able to assess them better. At the beginning with one Rhebo instance, it took us half an hour a day, later an hour a week. When we scaled the instances it changed again, of course. Sebastian has more details about this.

Sebastian Miethe

Currently, we have 1 or 2 people who look at it every day, at least every evening. I also try to look at it for half an hour every day. And we have established that we always get notifications sent to us of events where we say they are really important to us. As a rule, these are mainly security-related things where we get a notification right away. And our goal is also to integrate the Rhebo system with our SIEM systems, so that we also get a connection there and better monitoring.

Klaus Mochalski

That would be my next question. Email notification doesn't sound like the final solution yet?

Sebastian Miethe

Right.

Klaus Mochalski

This is also something we often get as a request. Which interfaces into a SIEM system could we provide? Our answer is always: “First of all, potentially every one. We support every SIEM.” [Note: Rhebo Industrial Protector is already available as an app for IBM QRadar and provides standard interfaces for integration with other SIEM systems].

From our point of view, the problem is not so much the technical interface, but the agreement between the often different parties, which may also operate the SIEM in a SOC, in a Security Operations Center, and the operators of the intrusion detection system. So currently there is flexibility. A great deal of data can potentially be made available. 

The actual question is, if the receiving party knows what to do with this data stream. Of course, we can provide selective data to make it easier. Do you already have an idea of how this will work for you? Who will monitor the data? Will there be a single party or will there always be several parties which process this data?

Sebastian Miethe

For the time being, there will be several parties. We will also remain in the mode that we first have to log in to the web interface of Rhebo Industrial Protector to look at the notification history. But yes, the Rhebo system will send notifications to our SIEM in the future.

Daniel Beyer

And email, unfortunately. We still have to use this because of our processes. It will probably be phased out at some point, when the SIEM is running properly. But to date, it's still the case because that’s how the infrastructure has developed. Many infrastructures at energy distribution network operators are based on email. That is simply our experience.

Klaus Mochalski

Earlier, you said you need an hour a week on the system. That is actually a very manageable amount of time if you consider what an incident would cost a network operator such as yourselves. Not only in terms of purely monetary costs, but also in terms of the impact on customers and energy supply.

How many alarms or notifications do you have to deal with in the current setting and do they end up on your desk? Or do you have a 24/7 rotation principle?

Sebastian Miethe

The number of email notifications fluctuates. There are perhaps up to ten per day.

Klaus Mochalski

Do you process them immediately or do you sift through them briefly and then put them on hold for the week's review?

Sebastian Miethe

We don't have a 24/7 operation yet, so it’s definitely during working hours. I try to make sure that this is worked through at least weekly. And we also have other colleagues who are looking at it, plus two students I'm training on it. So I would say that as of today, we need more than one hour per week for the system. But again, we now have four Rhebo instances running and no longer just one as we did at the beginning.

Klaus Mochalski

Because of the substations in particular?

Daniel Beyer

In contrast to the control system, there is quite different traffic in substations, it must be said. So there are different scenarios. And that takes a bit more time.

Klaus Mochalski

Control system is a good keyword. A while ago, we signed a partnership with PSI, a major control system manufacturer. One of the ideas behind this partnership is ultimately to integrate the data that we generate into the control system. So that, so to speak, you end up with a dashboard where you have the normal operational dashboard combined with a smaller part that shows the security parameters.

How do you guys see that? I know there are customers who say “That's a great idea. Then we can report at the central point maybe in a simple way and with a traffic light system. If there is a security issue, then the processes can start.” But there are also others who have said quite clearly that “security is a separate issue. We don't want to mix that with what the control system covers.” What is your take on this?

Daniel Beyer

At a Rhebo event we once asked for an export function for some rough parameters which we make available in the control center via IEC-104, because at that time, it could not be done otherwise. 

But the journey, as we also hear from the community, is going in the direction of so-called information dispatches, which will look at such things. And then of course it would make sense if it could be integrated in the PSI core, because that is the tool with which such a dispatching and such a reporting center works. And I think that would actually be a good way to go.

Sebastian Miethe

I also think it makes a lot of sense, because maybe the dispatchers can't deduce that much from mere traffic lights on a dashboard. They can definitely deduce that something is wrong and can then start the incident response or say “okay, I'll now contact the on-call service from IT and they can do something about it.” So I think for the 24/7 operation, that makes a lot of sense.

Daniel Beyer

Yes, absolutely, that's how I see it too. 

Klaus Mochalski

I also think that if you configure the traffic lights and the parameter limits accordingly, you can ensure that the incident response is only triggered when a certain criticality of events is reached. Then you can use existing resources and processes to deal with security issues.
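
[Note: The idea of traffic lights with configurable limits can be illustrated with a minimal Python sketch. It is not taken from the Rhebo or PSI products. The severity scale from 0 to 10, the two limits and all names are assumptions made purely for illustration.]

```python
from dataclasses import dataclass

# Hypothetical limits; real values would be tuned per site and per process.
YELLOW_LIMIT = 4.0
RED_LIMIT = 7.0


@dataclass(frozen=True)
class Event:
    description: str
    severity: float  # assumed scale: 0.0 (harmless) to 10.0 (critical)


def traffic_light(events):
    """Return 'green', 'yellow' or 'red' based on the highest severity seen."""
    worst = max((e.severity for e in events), default=0.0)
    if worst >= RED_LIMIT:
        return "red"
    if worst >= YELLOW_LIMIT:
        return "yellow"
    return "green"


def should_trigger_incident_response(events):
    """In this sketch, only a red light starts the incident response."""
    return traffic_light(events) == "red"
```

With limits like these, a dispatcher would only be pulled into a security process once an event crosses the red limit, while yellow events could wait for the regular review.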

Daniel Beyer

This also provides a lot of time [for reaction] when the teams can inform each other immediately because Rhebo is there 24/7. That is very helpful.

Klaus Mochalski

Another thing I am curious about is, what are you actually observing? We are all familiar with the reports in the newspapers when a medium-sized company is affected by a ransomware attack. These are the few moments where such information leaks out. I know we can't talk about details now, but in general.

 

I also say again and again in my webinars and in the podcast that, in our experience, most of our customers are very well positioned, especially in the area of critical infrastructure here in Germany, even before we get in there. And that we generally don't expect an attack to take place there every day, every week, every month. Rather, we enable the small things:

  • reducing the attack surface,
  • uncovering a vulnerability in the operating system,
  • detecting maintenance technicians who make nonsense configurations.

To get to the bottom of things in the OT.

What are the things that you can think of that have occurred in the last three years that might have raised your pulse a bit? 

Daniel Beyer

Let me get back to the very beginning again. As you said, we often saw technicians hanging around in some process networks. We were able to identify them pretty well with Rhebo, because our computers are identified by special names and suffixes. So we knew what was happening. We could raise cyber awareness with the people so this is not happening much any longer.
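
[Note: Recognizing devices by a naming convention can be sketched in a few lines of Python. The suffixes "-eng" and "-ot" and the function names are hypothetical. TEN's actual naming scheme is not described in this conversation.]

```python
# Hypothetical naming convention: company devices end with one of these suffixes.
DEFAULT_SUFFIXES = ("-eng", "-ot")


def is_company_device(hostname, suffixes=DEFAULT_SUFFIXES):
    """True if the hostname ends with one of the assumed company suffixes."""
    return hostname.lower().endswith(suffixes)


def unknown_devices(hostnames, suffixes=DEFAULT_SUFFIXES):
    """Return the hostnames that do not follow the naming convention."""
    return [h for h in hostnames if not is_company_device(h, suffixes)]
```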

Otherwise, we did not have serious cases in this respect. The only thing we have is that we are also the data hub for the entire Group. And we see when colleagues from other divisions in the Group's office network - which is why we now have a sensor there - don't retrieve data from us as they should, for example. Of course, our pulse goes up when an immense amount of data is moved back and forth. Even though it remains with the company, we still see that something is not right.

Klaus Mochalski

It's exciting that you say that. I remember an early installation at a large automobile manufacturer. And we found a whole series of exciting things there, too. And the biggest excitement was an access event to a local production plant from the corporate headquarters. That made for a huge storm. That was much more important than all the unpatched Siemens PLCs that we found as well.

Sebastian Miethe

Exactly. We can finally see these data flows or data retrievals, which we wouldn't see without Rhebo. Or, for example, what we see again and again is an insecure log-in, when someone logs in from somewhere via an old protocol and does so in plain text. We get notified in Rhebo Industrial Protector and can act on it so that it's no longer used or that it at least gets its own network segment. Yes, these are some of the events.
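
[Note: A very simplified way to spot such plain-text log-ins is to look for connections to the well-known ports of protocols that transmit credentials unencrypted by default. The following Python sketch assumes flow records as simple tuples. It does not reflect how Rhebo Industrial Protector detects these events, and the function name is made up for illustration.]

```python
# Well-known TCP ports of protocols that send credentials unencrypted by default.
CLEARTEXT_PORTS = {21: "FTP", 23: "Telnet", 80: "HTTP", 110: "POP3", 143: "IMAP"}


def find_cleartext_sessions(flows):
    """Return (src, dst, protocol) for every flow to a cleartext port.

    flows: iterable of (src_ip, dst_ip, dst_port) tuples (assumed format).
    """
    findings = []
    for src, dst, port in flows:
        protocol = CLEARTEXT_PORTS.get(port)
        if protocol is not None:
            findings.append((src, dst, protocol))
    return findings
```

A port-based check like this only gives a first hint. A monitoring system that inspects the traffic itself can confirm whether credentials were actually sent in plain text.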

Klaus Mochalski

And this feeds into the continuous improvement process. 

Daniel Beyer

Exactly.

Klaus Mochalski

That you always get the reminder, even if it's not super critical to switch it off. But you get a reminder that at some point it should be replaced and no longer used, for example, old SMB versions etc.

Daniel Beyer

We often had this with camera systems that were relatively old. When we checked the operating system with Rhebo we found out that it is so old that it no longer worked. We were then able to take them directly off the network.

Klaus Mochalski

That is also a problem that is not going away. We have evaluated the RISSA results [Note: short for Rhebo Industrial Security Assessment], i.e. our security audit from last year, and there have been significant shifts compared to previous years. And in fact, in the past we have only talked about vulnerable software in general. Because that occurs so often, we now distinguish between vulnerable operating systems, vulnerable software, and vulnerable firmware. And each category typically occurs in over 50% of these audits. So that means it's still a very prevalent issue. Even in 2023.

My final topic for today is the requirements of the Federal Office for Information Security (BSI) that apply to critical infrastructure and important companies in Germany. The general requirements came with the amended IT Security Law, with a deadline of May 1, 2023 for the implementation of an intrusion detection system. Last year, on rather short notice, the BSI released a guide that specifies the requirements. We made it readable by creating a tabular overview, because some of the requirements are a bit unclear, perhaps even vaguely formulated.

How did you deal with this? Did you actually use the BSI guide and what is the current status?

Sebastian Miethe

We have this document of course. And yes, it is quite a bit of prose. We had to go through it, categorize it and put it in table form, where you can then check off what needs to be done and what doesn’t. To decide where we have to work on our systems. 

In the end, we decided to take a two-track approach: the Rhebo system as network-based anomaly detection and an overall SIEM system to meet the requirements.

Klaus Mochalski

That's exciting about the SIEM system. There was actually quite a bit of back and forth, even if you followed the revision of the document. In the penultimate revision, a SIEM system was actually explicitly required, and it could even have been read to mean that the SIEM system alone was a safe bet.

Daniel Beyer

Many companies actually went down that path.

Klaus Mochalski

Though, a SIEM alone is far from being the effective solution. Because its performance depends on what data I process with the SIEM system. In fact, a SIEM system alone is of no use. In the end, this paragraph was removed from the document.

But you are taking a two-track approach. That's good. Of course, as a solution provider, our view is that an intrusion detection system or SIEM system can only analyze protocol parameters properly when you monitor the network communication. Because otherwise the data sources for SIEM systems are restricted to log messages.

Sebastian Miethe

From end devices.

Klaus Mochalski

End devices that produce log messages. And this means that, in the end, the programmer of the software on the end device decides which event is logged and which is not. In other words, you rely on certain things.

The core of an anomaly detection system is that every change in the network communication, even events that we never foresaw, generates a notification. And that is not covered by log messages. That's why we think behavioral network monitoring is a very important element.
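
[Note: The principle of reporting every change against a learned baseline can be shown with a deliberately simple Python sketch. It is an illustration only and not Rhebo's actual detection logic. The class name, the learning phase and the reduction of a communication to a (source, destination, protocol) triple are all assumptions.]

```python
class CommunicationBaseline:
    """Learns which communication relations are normal, then flags new ones."""

    def __init__(self):
        self._known = set()
        self._learning = True

    def stop_learning(self):
        """End the learning phase; from now on unknown relations are anomalies."""
        self._learning = False

    def observe(self, src, dst, protocol):
        """Return True if this relation is an anomaly, False otherwise."""
        key = (src, dst, protocol)
        if self._learning:
            self._known.add(key)
            return False
        return key not in self._known
```

Because the check is against what has been seen rather than against a list of known attacks, a relation nobody anticipated is reported as well.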

Let me come back to the guide. At the end of the day, there are these implementation levels from 0 to 5, which used to be called maturity levels. Level 3 is the standard that everyone is trying or has tried to achieve by May 1, 2023. Where do you currently stand and are there any plans beyond that?

Daniel Beyer

Yes, we reached level 3 in April. And the next audit is in two years. Then we'll have to move up a level. In any case, it is our goal to achieve this.

Sebastian Miethe

In any case, we were able to convince our auditor with the combined systems. And he also thought it was very good that we use the network-based detection of Rhebo, so including all the organizational processes and guidelines we have reached level 3.

Klaus Mochalski

Congratulations on that. Do you have the feeling that you are in good company there? Have all critical infrastructures succeeded, or are there many companies that still need to improve?

Daniel Beyer

We only know of one company that reached only level 2 but was able to improve. Otherwise, we have no insight at all. To be honest, it is also not communicated so transparently by the BSI.

Klaus Mochalski

Of course, on the one hand it is understandable that such information is not shared publicly. But especially in the community of critical infrastructure operators, an exchange is actually quite good. To talk about experiences, processes.

And of course tools like SIEM systems we talked about. In the end, the efficiency and effectiveness of a SIEM depends on the use cases that are considered. How well are the company and infrastructure mapped to the SIEM? That's where an exchange of experience would be worth a lot. In any case, you should report back [to the BSI] that this is made possible in some form.

Sebastian Miethe

I'm also a member of a project group called IKT. There was already a good exchange with other network operators from Hamburg to Thuringia, before the audit [according to the IT Security Law] about what everyone was doing and what they thought was right. Now, there needs to be a debriefing. Then we'll probably have more knowledge about how it went with the others. 

Klaus Mochalski

It's definitely exciting. I think we also have to give some props to the BSI, because I think they get a lot of, often diametrically opposed, requirements and wishes from the various companies.

Daniel Beyer

I think so too.

Klaus Mochalski

Some would like to share information, while others would prefer to offer nothing else at all. And that is of course difficult to balance.

I think it would be very good if we meet again in, let's say, two years and then see where we, where you stand then. What are your plans for the next two years with the system? What are the next big steps you plan to take? You already mentioned implementation level 4.

Sebastian Miethe

That's definitely the case. Maybe even level 5, let’s see where we head.

With regard to Rhebo, the cooperation with the PSI systems is still interesting for us in any case. I think we will perhaps also establish a fifth instance so that we have even more insights into the core system of the control system.

Klaus Mochalski

Is this about monitoring communication to the control system or is it about integration into the control station itself?

Daniel Beyer

Integration. We have now done everything for sources outside the core system, monitoring everything. Now we want to take a look inside the system. We don't want to make any mistakes and find out too late.

Sebastian Miethe

Additionally, of course, the integration of all electrical substations. This is an ongoing process.

Sebastian Miethe

And what is also an issue is the better evaluation of the Syslog data from Rhebo in our SIEM system. So the integration of Rhebo and correlation of Rhebo Industrial Protector network data with the end device logs.

Klaus Mochalski

This is a good idea to combine network traffic data with log information, i.e., with log data from a wide variety of systems, firewalls, and server systems. Because then you can analyze things much more quickly, where today you might still have to do a bit of network forensics. And in this respect, this is also a very important development in terms of efficient and fast work.
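
[Note: Such a correlation can be pictured as a join on host and time. The Python sketch below pairs each network event with the device log entries of the same host within a time window. The tuple format, the five-minute window and the function name are assumptions for illustration and do not describe a specific SIEM product.]

```python
from datetime import datetime, timedelta


def correlate(network_events, device_logs, window=timedelta(minutes=5)):
    """Pair each network event with log entries of the same host close in time.

    Both inputs are iterables of (timestamp, host, message) tuples (assumed format).
    Returns a list of (network_message, [related_log_messages]).
    """
    matches = []
    for n_time, n_host, n_msg in network_events:
        related = [
            l_msg
            for l_time, l_host, l_msg in device_logs
            if l_host == n_host and abs(l_time - n_time) <= window
        ]
        matches.append((n_msg, related))
    return matches
```

An analyst would then see a network notification next to the log lines of the affected device, instead of searching for them by hand.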

Daniel Beyer

Definitely.

Klaus Mochalski

This was great fun. I think those were very good insights for our listeners. Thank you very much, Daniel. Thank you very much, Sebastian. I'm definitely looking forward to the next episode. Until then, I wish you all the best, and no incidents or at least no serious ones.

Daniel Beyer

We would like to see the same. Thanks.

Sebastian Miethe

Thank you very much!