Essential Strategies to master Incident Response in Cloud

Apr 2, 2024


How do you build a Robust Detection Framework? Ashish spoke to Andrew Tabona, SVP of Cyber Threat Management and Incident Response at a Fortune 500 company about challenging the conventional wisdom of applying on-premise incident response plans to cloud environments. They speak about the critical metrics of mean time to detect, respond, and recover, and why mastering the fundamentals is key to effective cloud security. The conversation also covers practical strategies for building a detection framework, the importance of a balanced approach to log ingestion, and the nuanced differences in incident response between cloud and traditional on-premise environments.

Questions asked:
00:00 Introduction
03:20 A bit about Andrew Tabona
04:26 What is Threat Detection and Response?
06:14 Why incident response is different in Cloud?
09:18 Benefits of doing Incident Response in Cloud?
10:29 Is CSPM your incident response tool?
12:33 Where to start with Detection in Cloud?
16:35 Getting buy in from other teams for threat detection
20:15 Should you build or buy a cybersecurity solution?
22:34 Responding to incidents in a Cloud Context
26:01 Containing incidents in a Cloud Context
28:34 What kind of access do IR teams need?
30:36 Balancing the signal to noise ratio
32:10 Where to start with Threat Detection and Response
34:37 Challenges an organisation might face
35:58 Threat Detection and Response in MultiCloud
37:52 Showing ROI of Cybersecurity to the business
38:57 Where to learn about IR and Threat Detection?
41:09 Fun Section
44:14 Where you can connect with Andrew

Andrew Tabona: [00:00:00] People will say I have an incident response plan. Can't I just use that in the cloud to respond to cloud incidents? But yeah, so it's this sweet balance between cost and making sure you have what you need in the event of investigating an incident. Assess what you're needing to protect.

Assess what cloud native or third party controls you have in place today. And then look at what are the associated risks or threats that you think you're going to come up against, depending on your industry, environment, the data, the assets you have, etc. If you think about the way that the cloud threat landscape is evolving, the bottom line was that we said we're not going to be able to keep up, right?

And we're not going to be able to have the parity we need across all the clouds that we use. It's really just about knowing what there is and where to go to find what you need, right? So many people get caught up [00:01:00] with trying to learn the advanced topics when actually a lot of it is down to the fundamentals.

Three metrics to keep in mind are mean time to detect, mean time to respond, and mean time to recover.
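As a rough illustration of the three metrics just named, here is a minimal Python sketch that computes them from incident timestamps. The incident records are made up, and the choice of start and stop points (for example, measuring recovery from the moment of detection) is an assumption; teams define these boundaries differently.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records; every timestamp here is illustrative.
incidents = [
    {
        "started": datetime(2024, 1, 5, 9, 0),
        "detected": datetime(2024, 1, 5, 11, 0),
        "contained": datetime(2024, 1, 5, 12, 0),
        "recovered": datetime(2024, 1, 5, 17, 0),
    },
    {
        "started": datetime(2024, 2, 10, 14, 0),
        "detected": datetime(2024, 2, 10, 18, 0),
        "contained": datetime(2024, 2, 10, 21, 0),
        "recovered": datetime(2024, 2, 11, 1, 0),
    },
]


def mean_hours(records, start_key, end_key):
    """Average elapsed hours between two timestamps across incidents."""
    return mean(
        (r[end_key] - r[start_key]).total_seconds() / 3600 for r in records
    )


# Assumed definitions: detect = start to detection, respond = detection to
# containment, recover = detection to full recovery.
mttd = mean_hours(incidents, "started", "detected")
mttr = mean_hours(incidents, "detected", "contained")
mttrec = mean_hours(incidents, "detected", "recovered")

print(f"MTTD {mttd:.1f}h, MTTR {mttr:.1f}h, mean time to recover {mttrec:.1f}h")
```

Tracking the same three numbers over time, with the same definitions, matters more than the exact boundaries chosen.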

Ashish Rajan: Have you been using a CSPM or a Cloud Security Posture Manager to build your threat detection response capability? If you are, then you might want to listen to this. Or maybe if you're looking at building your threat detection response capability in the cloud, then this is the episode for you.

I had the opportunity to talk to Andrew Tabona, who is the SVP for threat detection and incident response in a Fortune 500 company, where we spoke about how he built his entire threat detection response across multiple cloud environments. What were some of the nuances around what we can carry over from on premise onto the cloud world?

What are some of the incident response changes you need to consider? Not just from a strategic perspective, technical perspective, team skills perspective, and where just having a CSPM is not the answer. What are some of the things you would lack if you just rely on your CSPM? And if you are thinking of building a threat detection [00:02:00] response, specifically across multiple clouds, where should you start and what are the challenges you might face?

I can't wait for you guys to hear and give me some feedback on social media for how much you enjoy this episode. Specifically on threat detection and response, because incident response, threat detection, threat hunting in the cloud space is not spoken about enough. And I think Shilpi and I are making it a mission.

At least the cloud security podcast team is making it as a mission to talk more about it in 2024. Like we do a whole training on cloud security bootcamp for cloud purple teaming. It just needs to be spoken more. So I hope you enjoy this episode and get to learn some of the offensive side and maybe building some defense for the offensive side as well within your organization as you do some threat hunting.

As always, if you're here for the second or third time watching or listening to this on iTunes or Spotify, feel free to drop us a review or rating. It definitely helps us quite a bit. If you're watching on LinkedIn, definitely give us a... As always, we appreciate all the love that you share with us online. We did 1.5 million downloads plus views last year and can't even imagine what we'll hit this year, but it would not have been possible without your support. So thank you [00:03:00] so much for doing all that and continue to support us for all these years. I look forward to seeing you on the next episode. Enjoy. Peace.

Welcome to another episode of Cloud Security Podcast. This is a new setup, so we'll get, we'll find out what happens in the end. But I have this great honor to welcome Andrew Tabona, who's fortunately a UK talent as well. But welcome Andrew to the show.

How are you, man?

Andrew Tabona: Thank you. Thanks so much for having me.

Ashish Rajan: Can you share a bit about yourself and your cybersecurity experience as well for the audience, man?

Andrew Tabona: Yeah, of course. Of course. So I have a background in digital forensics. So I did my master's degree in digital forensics and e-discovery, which was really the stepping stone for me to move into InfoSec.

As a personality, I love to investigate and ask a bunch of questions. So it fits nicely with the way I like to operate. Anyway, after that, I worked at an investment bank for a while before landing where I am now at a Fortune 500 financial services company. And I worked my way up in the team. I started as a cybersecurity analyst in 2016, I believe that was, and I now lead [00:04:00] the cyber threat management function, which includes IR, red team and threat intelligence. So these days, as much as I love to get hands on, I'm more about the strategy and helping to build mature cyber programs, finding suitable technologies and processes and stuff like this to plug a gap.

I'm talking to a lot of executives and just supporting my amazing team. So with whatever they need, really.

Ashish Rajan: That's awesome, man. And actually it's funny cause you've been in this space for a while and. I want to level the playing field for people who probably would not have heard of threat detection and response.

How would you describe threat detection and response for people who probably haven't heard of the term before?

Andrew Tabona: So I think in its simplest form, I would say threat detection is about being alerted or being made aware of anomalous behavior in your environment, I guess that warrants further investigation, right?

So you're looking for different patterns of behavior or indicators of compromise [00:05:00] that kind of signal when something or some activity looks like it could be a threat. And this could be anything from a domain that's known to be malicious or a pattern of behavior, say a connection from a country that's never connected to your environment before, let's say using an access key that's been like dormant for six months.

And right after it authenticates, it starts doing discovery or enumerating resources in your cloud environment, right? That's anomalous, right? That's something that warrants further investigation. And then on the flip side, there's also the incident response element to that, where it's really just about preparing for, managing and learning from a cyber incident. That's the crux of it.
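The dormant-key pattern Andrew describes (a key unused for six months, a never-seen country, then immediate discovery activity) can be sketched as a simple rule. This is a minimal Python illustration, not a production detection: the event shape, the 180-day threshold, the action-name prefixes and the count of three discovery calls are all assumptions made for the example.

```python
from datetime import datetime, timedelta

DORMANCY = timedelta(days=180)                    # assumed "six months" threshold
DISCOVERY_PREFIXES = ("List", "Describe", "Get")  # crude stand-in for enumeration
DISCOVERY_THRESHOLD = 3                           # assumed number of discovery calls


def is_suspicious(key_last_used, known_countries, events):
    """Flag a session that matches all three parts of the pattern.

    events: list of dicts with 'time', 'country' and 'action', sorted by time.
    """
    if not events:
        return False
    first = events[0]
    dormant = first["time"] - key_last_used >= DORMANCY
    new_country = first["country"] not in known_countries
    discovery_calls = sum(
        1 for e in events if e["action"].startswith(DISCOVERY_PREFIXES)
    )
    return dormant and new_country and discovery_calls >= DISCOVERY_THRESHOLD


# Hypothetical session: country code "XX" has never connected before.
login = datetime(2024, 4, 2, 3, 15)
events = [
    {"time": login, "country": "XX", "action": "GetCallerIdentity"},
    {"time": login + timedelta(seconds=20), "country": "XX", "action": "ListBuckets"},
    {"time": login + timedelta(seconds=45), "country": "XX", "action": "DescribeInstances"},
]
last_used = login - timedelta(days=200)
alert = is_suspicious(last_used, {"GB", "US"}, events)
print("investigate" if alert else "no alert")
```

Requiring all three conditions together keeps the rule quiet; any one of them alone is common in normal operations.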

So what you do is you follow an industry standard framework, if you like, and that helps to shape your incident response program. It helps you to know who's who, what's what. Ensure that you have the detective and preventative controls in [00:06:00] place, the right skills to investigate, coordination with internal and external stakeholders, regulatory reporting if you're in a regulated environment, legal issues, and so on and so forth, there's a big element to this.

Ashish Rajan: I feel like it was worthwhile noting as well cause you've been in this field for a while and starting the forensic, the whole threat detection and response, we've been doing this on premise for a long time, right? And a lot of people, when I talk to in the cloud context, initially, at least initially I used to get the response that, isn't that just another environment that I'm doing threat detection in?

Why is it that I have to do this, give this special attention to cloud in a different way? What are your thoughts on that?

Andrew Tabona: People will say I have an incident response plan. Can't I just use that in the cloud to respond to cloud incidents? But look, in my experience, the principles of IR, the life cycle or the framework that you follow is pretty much the same.

So whether you're following the SANS PICERL framework or the NIST IR framework, that's the same, but the way you do [00:07:00] things and how you prepare and the logging and telemetry that's available to you in the cloud is different, right? So if I have to break it down, I would say there's a couple of elements to this.

So you think about in the cloud, the scale and complexity of the cloud there's hundreds of services out there. And honestly, it feels every week or so, they're releasing more and more of these services. It's very hard to keep up. And then you have your developers, they want the freedom and the autonomy to use these new services, like almost immediately, having some sort of architectural patterns and standards helps, but obviously developers, they tend to sway from that.

Sometimes they like to do their own thing. So I guess it's making sure that you make them aware. I always think of this example, right? A few years ago, you remember AWS had this wizard where, when you were creating an EC2 instance and configuring the security group, right?

You just hit next, next. And it was basically any, right? [00:08:00] We had a phenomenal amount of exposures from that, right, in the industry, and then obviously AWS went out and fixed that. But it's just being aware of that, right? And then the other piece I think of is log sources. So you think about the way the logs are formatted and analyzed is different across CSPs, right?

And then you've got the microservices, the functions as a service, the auto scaling, all having these logs up and down within minutes, right? So like all these events happening within minutes. And then just one last thing I find really is the forensics side. You touched on it earlier. So forensic acquisition with on prem, like traditional forensics, you have all the artifacts you need generally in one location.

With the cloud, it's spread out, right? It's spread out across regions, availability zones, each within different services themselves. And I feel like most tools out there, at least in my experience, aren't really built [00:09:00] for the cloud when it comes to forensics. It still really relies on taking an image of something or a snapshot and taking that back to a lab, so to speak, and then analyzing the artifacts there. Things are getting better, of course, but I don't feel we're quite there yet. So that's a bit of a challenge.

Ashish Rajan: Yeah. Would you say there are benefits as well to being in cloud, doing incident response in cloud?

Andrew Tabona: Yes. So first I'll say, yeah, so you think about the cloud model.

And the shared responsibility model and so on. So it's like the higher up you go in that cloud model, the less control you have on what and how you can forensically analyze things. But the good thing about cloud in my experience is that there's a lot of logs out there. And if you choose the right telemetry and you know where to look, you think about the control plane, the data plane, there's good stuff that can paint a picture or help you to paint a picture of what happened.

I would say, just generally with the cloud, the benefits I find is, there's the faster deployments as [00:10:00] well, I think is something that I find helps me or my team or my organization, right? So you think about infrastructure as code say Terraform, right? If you have a cloud incident and you need to rebuild that environment, that's part of the recovery phase.

You can do that within a matter of, minutes or hours, depending on the scale versus, sometimes days to rebuild that in on prem, right? So I would say definitely keep that in mind and utilize that.

Ashish Rajan: I think another question that comes up as you were talking about these incidents and how to detect these things as well.

A lot of people also have bought into the idea of a CSPM or a Cloud Security Posture Manager. And I would love for you to demystify this. If I have a CSPM, isn't that kind of my incident response tool as well?

Andrew Tabona: Not by itself. I had this as well when I was thinking of bringing in a CDR, right? Cloud Detection Response Platform into the organization.

Some folks turned around and said we have a CSPM. We're good, no? Why do we [00:11:00] need another tool? That's right. It just seems to be this thing, this sort of misconception. But look, CSPM, you break it down, right? CSPM for me is about hygiene. It's about cleaning up misconfigurations and policy violations.

But it doesn't really excel at alerting or detecting real time attacks, like somebody's actively doing something in your environment that you need to know about. For that, I think you either build something yourself using the cloud native tools and services that are available, right?

Or you go out and look at buying a solution that, that does that for you. So CSPMs are great. Don't get me wrong, but they're not the be all and end all when it comes to IR in the cloud.

Ashish Rajan: Yeah. And also sometimes the challenge, at least I seem to think, is the real time versus, hey, 24 hours later kind of thing. Have you seen that as well, where you get a change which may be a threat right now, versus how [00:12:00] some CSPMs require 24 hours, or at least some of them require 24 hours, before they detect the fact that, hey, Ashish's account is compromised or something?

Andrew Tabona: Yeah, exactly. Exactly. So it's not, in some cases, it's not real time.

So you could have someone who makes a configuration change. And then this CSPM is set up to run every four hours, every six hours or whatever, so you get that alert after the fact or way after the fact. And then by the time it turns into an alert that somebody looks at, like you say, 24 hours could have passed and 24 hours is a very long time in the cloud.

Ashish Rajan: I'm so with you, man. I think maybe it's also a good time to lay the foundation, since you've done this in your own organization now. So I'm curious, how can we level this for people who are trying to build detection in cloud? How do you start preparing to even start doing detection in cloud in an organization?

What's your recommendation for that?

Andrew Tabona: I think there's a few elements to this, but one of the big questions we started asking ourselves internally was firstly log ingestion. We need to make sure that [00:13:00] we have the right stuff to give us that visibility if there was an attack.

So when it comes to log ingestion, you think of things like actually who needs to be involved in the planning of a detection because it's not just the detection engineers, right? There's multiple people. Up to and including, the business application. Cloud account owners, right?

You need to get that context. So thinking about who owns this space, who can contribute to the building out of detections. In my experience, it was a whole host of people. It was the cloud sec guys, it was the threat intel and red team, it was security engineering, the business. All these kinds of teams need to be involved in my experience.

And then you start looking at what telemetry should we collect and what artifacts do we have to highlight anomalies? How many logs should we collect? So it would be lovely to collect everything.

Ashish Rajan: I think that's not really, I'm like, usually the answer is give me everything. Isn't that like a security term that is very popular as well.

Give me [00:14:00] everything.

Andrew Tabona: Yeah, I've had this thing. I think if you are in an organization that can afford to collect everything, that's great. But I actually don't even recommend that because I think you're just burning CO2 for nothing, right? Because there's no need to store all that.

Just think about it, right? So I need network, I need data. And then obviously customize it to whatever that environment or the services that are running in that environment are, and then you think about where should we keep it? Like, where should you keep the logs? Just put it in a SIEM, for example? Should you look at some kind of data lake? Should you keep it within the CSP itself?

The other thing we thought about was how long should we keep it for? Cause there's this whole thing around how long do attackers remain in the environment and how long they live in there, et cetera. So it's this sweet balance between cost and making sure you have what you need in the event of investigating an incident, but I would argue as well.

If I come from a regulated [00:15:00] environment, think carefully about how long you keep stuff and take guidance from your legal team, because sometimes the longer you keep something, you're actually exposing yourself, or making yourself more liable when it comes to discovery, right? If you're in a legal case and somebody says, give me logs from two years ago, and you have them, then you're putting yourself in a sticky situation.

Ashish Rajan: Yeah. Yeah. What's a good balance to find between that?

Andrew Tabona: Totally. So I would say in a nutshell, assess what you're needing to protect. Yeah. Assess what cloud native or third party controls you have in place today and what logging specifically those controls can give you. And then look at what are the associated risks or threats that you think you're going to come up against.

It's depending on your industry, environment, the data, the assets you have, etc.

Ashish Rajan: Yeah. Yeah. Bringing all the different thinking brains in the organization together, whether it's your SecOps team or DevOps team for that matter, if it's a cloud team. In terms of building detection [00:16:00] in cloud, are there components that you think of where, oh, okay, do I need detection first? I almost feel a large organization would have a lot of applications to look at. You can't just start looking at everything in one go. How do you even get the buy in from these people? Cause I would find that sometimes when a threat detection team reaches out and says, Hey, we want to audit what you're doing.

It comes across a bit more adversarial sometimes and could be taken the wrong way. What would your suggestion be? Because I find a lot of people struggle with the idea of conveying the message for how do you even start getting the right information because, to what you were saying earlier, there are new services coming out every day.

As we are doing this recording, I'm pretty sure that there are services being released as well. And unless we are working together, we're definitely not going to go forward. We may have that mentality, but it's just the way it's worded could be different. What was your approach to find some of the applications you could work with to start developing, say, the kind of log ingestion, the kind of input you wanted? Was there something that worked for you that you could share?

Andrew Tabona: Yeah, of course. Of course, for me, it was down to [00:17:00] prioritizing the critical assets within those environments. So what are we trying to protect? Why is it important to the business? So you start prioritizing things.

If someone tells you, hey, if this service goes down or we're attacked in this part of the environment, we're going to lose $10,000 a minute, you probably want to think about prioritizing that, right? Or if somebody says, hey, we're holding PII in this database, right? So you focus on that, right?

And stack rank based on the critical assets or the crown jewels that you're trying to protect. And there are some quick wins that you can implement. I would say, once you have that information, basically map it against the threats. Use threat intelligence and threat modeling.

That's one thing we did. We looked at everything and we said bring up MITRE, right? Talk to your red team guys, maybe get some external expertise if you need it. And start looking at what threats are we going to be up against, right? In this environment or in the industry right [00:18:00] now, and then start stack ranking your detections based on that and the importance to the business, I think that worked for us.
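One way to picture the stack ranking described here is a simple score that multiplies how critical the asset is to the business by how likely the threat is. The sketch below is only an illustration: the 1 to 5 scales, the multiplication and the candidate detections are all assumptions, not the scoring Andrew's team used.

```python
from dataclasses import dataclass


@dataclass
class DetectionCandidate:
    name: str
    asset_criticality: int  # 1 (low) to 5 (crown jewel), agreed with the business
    threat_likelihood: int  # 1 (unlikely) to 5 (active in our industry), from threat intel

    @property
    def priority(self) -> int:
        # Assumed scoring: business importance multiplied by threat likelihood.
        return self.asset_criticality * self.threat_likelihood


# Hypothetical backlog of detections waiting to be built.
candidates = [
    DetectionCandidate("Unusual API calls in sandbox account", 1, 4),
    DetectionCandidate("Dormant access key used on payments account", 5, 4),
    DetectionCandidate("Snapshot of PII database shared externally", 5, 3),
]

ranked = sorted(candidates, key=lambda c: c.priority, reverse=True)
for c in ranked:
    print(f"{c.priority:>2}  {c.name}")
```

The value of writing it down like this is less the arithmetic and more that the business and the threat teams each own one of the two inputs.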

You also, Ashish, asked about how do you get support? And I think that was a really important thing to me because, generally the security team, like if somebody goes up to the business or one of the DevOps guys and says, Hey, we need to do this. It's, Oh, here come the security guys again.

Like going down. So I think there's a couple of things, right? So first is you need to get, at least in my experience, you need to get support from the top, right? So if you have support from the top, then that sort of trickles or cascades down to the tower leads and the managers and engineers and everything, right?

Secondly, build relationships with the people on the ground in the trenches way before you want to go in there and start talking about, probing and putting accounts and monitoring and whatever on their environments. And by that, just involve them, give them a seat at the table, right?

Ask [00:19:00] their opinion, right? Maybe go and share some information about the threat landscape, make them realize that, hey, this is important, right? So I remember once we, we were actually at an offsite and we had all these senior leaders and then we purposely brought in an external partner of ours to talk about the cloud threat landscape and people were terrified of doing it, right?

And honestly, from that talk alone, we got a lot of support from the top. Yeah, it's a multifaceted approach, but I think it, it comes down to relationships a lot of it.

Ashish Rajan: Yep. So we've spoken about the kind of telemetry to identify, start bringing up risk, the kind of people that you should be looking at bringing together, especially for the crown jewels you may have.

We also spoke about, the skillset in the team as well. And I think, how in the beginning you were talking about there could be a level of automation or API keys or whatever you can buy a CDR or a CSPM, whatever, is there like a build versus buy thing in this as well? While people are building detection, are [00:20:00] there sets that are already available that you guys found that was a good starting point to go to the team or like using a platform or open source or something, was there something that you found that was helpful as a starting point?

Cause I'm assuming not everyone is aware of all the threats in the world that say AWS or Azure or Google Cloud may have.

Andrew Tabona: So that's the thing for us, right? So we had this long debate internally about are we going to build or are we going to buy? And even though we have incredible detection engineers, I call them ninja level.

They're really good folks. If you think about the way that the cloud threat landscape is evolving, the bottom line was that we said, we're not going to be able to keep up, and we're not going to be able to have with the resources we have anyway, the parity we have, the parity we need, sorry, across all the clouds that we use, right?

We made the decision to go out and buy a CDR, and that was because, specifically, [00:21:00] that's their bread and butter, right? They do this day in and day out they take what they learn from IR engagements, and they have experts in that field, in the threat side of the house. That's pretty cool. And they factor that into their detection engine, and then they help you customize and give you those high fidelity alerts.

You could go out there and use EventBridge and things like this in AWS, and Lambda functions, to go and build some sort of level of response, and maybe use things like Splunk to build out your detections, but you're going to be fighting a losing battle. What we decided to do was like buy and then supplement and customize, right?

So you have the base of it with your CDR. And then of course, you're free to build your own custom detection and supplement what they give you based on your own environment, which would be very useful.

Ashish Rajan: Which to your point is also worthwhile calling out that not every tool would have all detection for specific applications or specific scenarios in your environment as well.

Andrew Tabona: Definitely. So you're going to have to do some level of customization [00:22:00] when it comes to that. And sometimes it's not even possible because you don't have that level of visibility or telemetry that can give you a really solid detection. I guess that's where the behavior comes in, right?

And the TTPs around bridging everything together, and using one signal with another signal to give you like, hey, something weird is going on here, worth taking a look. I can't tell you explicitly, but something needs to be investigated here.
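Combining one signal with another, as described above, can be sketched as a small correlation step: each weak signal carries a weight, and a principal is only flagged when enough distinct signals land inside one time window. Everything in this Python sketch is an assumption for illustration, including the signal names, the weights, the 30-minute window and the threshold.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=30)  # assumed correlation window
THRESHOLD = 5                   # assumed score at which to raise an alert

# Hypothetical weights: no single signal is strong enough to alert on its own.
WEIGHTS = {"new_source_country": 2, "resource_enumeration": 2, "logging_change": 3}


def correlate(signals):
    """Return principals whose distinct signals inside WINDOW reach THRESHOLD."""
    by_principal = defaultdict(list)
    for s in sorted(signals, key=lambda s: s["time"]):
        by_principal[s["principal"]].append(s)

    flagged = {}
    for principal, items in by_principal.items():
        for i, start in enumerate(items):
            # Distinct signal names seen within WINDOW of this starting signal.
            in_window = {
                s["signal"] for s in items[i:] if s["time"] - start["time"] <= WINDOW
            }
            score = sum(WEIGHTS.get(name, 0) for name in in_window)
            if score >= THRESHOLD:
                flagged[principal] = sorted(in_window)
                break
    return flagged


t0 = datetime(2024, 4, 2, 10, 0)
signals = [
    {"principal": "key-A", "signal": "new_source_country", "time": t0},
    {"principal": "key-A", "signal": "resource_enumeration", "time": t0 + timedelta(minutes=5)},
    {"principal": "key-A", "signal": "logging_change", "time": t0 + timedelta(minutes=10)},
    {"principal": "key-B", "signal": "resource_enumeration", "time": t0},
]
flagged = correlate(signals)
print(flagged)
```

The output names the signals that contributed, which gives the analyst a starting point even when no single event is conclusive.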

Ashish Rajan: Taking that thread forward as well, when things are detected and people have to respond to it as well.

I imagine there's a lot of things that as a general IR principle, people will be doing from a containment perspective, response, respecting all of that. Are there any levels to these in what people can expect when they're building response in a cloud context? Like how far and you could share the experience from a difference between an on [00:23:00] premise versus a cloud as well.

Are there different levels of responses that people may expect or would have to build in a cloud context?

Andrew Tabona: I think in a nutshell, so from an on prem perspective when it comes to things like containment, you have most of what you need in a box. You start looking at those artifacts.

Maybe you supplement, of course, with network logs on firewalls and whatever. But it's more within your control. So then when you start doing things like traditional forensics and piecing a timeline together of events, there's typically one or two pivot points you can go across, right? Like they become clear as you're starting to do that forensics.

I see the full package review like in one central location. When it comes to the cloud, that's a bit different. Like it's just spread out across lots of different logs within that environment. So it's a bit different, but it's really just about knowing what there is and where [00:24:00] to go to find what you need, right?

And I always keep preaching about this, people laugh at me, but it's really all about asking the right question or asking the right questions, right? So many times we've been on an incident and we don't have all the answers, right? I don't think anybody does. I speak to people at the CrowdStrikes and Mandiants of the world and they do this day in, day out, and even they don't know this stuff, they face new stuff every day. But I feel like if you ask the right questions, you're gonna get what you need, right? You're gonna get some sort of picture at the end of it. So that's what I would go across. And I would also say, in my experience, so many people get caught up with trying to learn the advanced topics, when actually a lot of it is down to the fundamentals.

If you master the fundamentals, like in say use RTC or even networking or say VPC flow logs, how traffic ingresses and egresses out of your environment, all this stuff, A, you're going to be better equipped to ask the right questions. [00:25:00] And B, you're going to find the answers quicker, right?

At least in my experience. Yeah.

Ashish Rajan: Yeah. And I think this goes back to what you were saying about understanding the landscape as well, which in an on premise context where things were not changing that dramatically, you could know what the starting point was and potentially how to follow the breadcrumbs in a way, as people like to say it, whereas to exactly what you said earlier with the example of multi region kind of thing in a cloud context, you may start in the UK region, but land in US East Coast or Iceland, the breadcrumb could go anywhere in order to figure out where in this whole of AWS, and that's just, by the way, one account, that's not even like multiple accounts.

And add the whole, another layer of multiple clouds and stuff. I know what you mean. Are there challenges with containment as well? Cause a lot of, conversations around, at least in the initial stages, people started talking about auto remediation, auto mitigation. Are there any specific nuances to say containment or responses as well?

That people should look out for when thinking from a, Hey, this is different to on premise.

Andrew Tabona: [00:26:00] Yeah. So I think with the cloud, the main difference is the leveraging of automation. I think it's super important to have that in your strategy. And this is where we talk about playbooks, right? So whether it's cloud or on prem, I think obviously having a playbook is important, but more importantly in the cloud when you're looking at those human readable playbooks, the theory playbooks or whatever as I like to call them sometimes, that's showing you, walking you through the IR life cycle for different cloud threat scenarios, right?

This is how we prepare, analyze, contain, etc. And then, as you mentioned, the automated or semi automated playbooks that run either by themselves when certain criteria are met, or when a SOC or IR analyst presses a button, right? Like it's there. It's ready to be executed and run in the background, but you want a human to make that decision.

And what I mean by this is, things like say creating a lambda function or having a CLI [00:27:00] command at the ready for things like deleting IAM roles or, revoking sessions, creating an EBS snapshot and things like this, right? What I found actually, I just, I want to touch on this point. Like you alluded to people are nervous about automating or fully automating these kinds of response or remediation actions.

And that's true. That's fine. That's a good concern. I would say the way we're approaching it is like a multistage approach. First, implement these semi automated playbooks in the background so that your analysts do have to make that decision. Over time, take that data, look at those trends and start to ask questions like how many times was it successful when a SOC analyst hit the revoke sessions playbook, right?

Executed that. With that, you are building data and confidence that you're not going to have, or there's less chance of you [00:28:00] having false positives, right? With this playbook. You can then take that to the business and your executives. And say, look, over six months, we had one false positive out of a hundred executions.

Are you guys comfortable with us semi automating? Oh, sorry, fully automating this. And with that, by the way, our response time is going to go down by X percent. You start talking about these numbers and this data, and ears are gonna be listening much more than they would if you just went in and said, hey, we're putting this in place tomorrow. So it can be super, super helpful in that scenario.
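As an illustration of the data-driven case Andrew describes, here is a minimal Python sketch that tracks analyst-triggered playbook executions and checks whether a playbook has earned full automation. The record shape, the playbook name and the thresholds (`min_runs`, `max_fp_rate`) are assumptions made for this example, not anything stated in the conversation.

```python
from dataclasses import dataclass


@dataclass
class PlaybookRun:
    """One analyst-triggered execution of a semi automated playbook."""
    playbook: str
    false_positive: bool  # True if the action turned out to be unwarranted


def false_positive_rate(runs: list[PlaybookRun], playbook: str) -> float:
    """Share of executions of `playbook` that were false positives."""
    relevant = [r for r in runs if r.playbook == playbook]
    if not relevant:
        raise ValueError(f"no recorded executions for {playbook!r}")
    return sum(r.false_positive for r in relevant) / len(relevant)


def ready_for_full_automation(
    runs: list[PlaybookRun],
    playbook: str,
    min_runs: int = 100,        # assumed: enough history to trust the number
    max_fp_rate: float = 0.02,  # assumed: tolerance agreed with the business
) -> bool:
    """True once there is enough history and the false positive rate is low."""
    relevant = [r for r in runs if r.playbook == playbook]
    return (
        len(relevant) >= min_runs
        and false_positive_rate(runs, playbook) <= max_fp_rate
    )


# The example from the conversation: one false positive in a hundred executions.
history = [
    PlaybookRun("revoke-sessions", false_positive=(i == 0)) for i in range(100)
]
rate = false_positive_rate(history, "revoke-sessions")
print(f"false positive rate: {rate:.0%}")
print("promote to full automation:", ready_for_full_automation(history, "revoke-sessions"))
```

The thresholds are the part to negotiate with executives. The code only makes the evidence easy to present.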

Ashish Rajan: To your point about being able to do responses, what kind of access do you think works best for IR teams in a cloud environment? And I know we're talking about in general across the board. So the technical specifics may be different, but in general, like earlier, it would just be an appliance being dropped down and suddenly we're doing like a memory snapshot and all of that.

There's a technical limitation to how far people can go with that in a cloud context due to shared [00:29:00] responsibility, which you called out earlier. What do you find is the level of access for an IR team that works in your experience?

Andrew Tabona: The way we approached it is to have a read only access to all accounts within all the CSPs because we need to jump in and look at what's going on and start reading and getting some sort of level of understanding, right? And see what's enabled, what's not, et cetera. What you don't want obviously is God level access for everyone from the get go.

That's a bad idea. But this sort of concept of break glass accounts or just in time access, right? Depending on how you look at it, it's something that worked for us. So if you have a break glass account, a power access user, privileges that you have to check out of a PAM, right? A Privileged Access Management tool or some sort of a credential vault on an as needed basis.

That is going to give you the best balance of not over provisioning, [00:30:00] right? And you also have an audit trail, right? Of why the hell did somebody check out this account and log into my AWS or Azure account. And actually even better, we alluded to it earlier, but if you can programmatically build out using whatever functions or Azure automation playbooks or run books, whatever you choose, that just in time access to perform containment and remediation actions. That's also another model you can look at because if you're nervous about the break glass accounts or you get some hesitancy around that, then I would say go down the programmatic model or have both.
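To make the break glass idea concrete, the following Python sketch models a checkout that requires a reason, expires on its own, and leaves an audit trail. `BreakGlassVault` is a hypothetical stand-in for a PAM tool or credential vault. The one hour limit and all names are assumptions for illustration, and it does not call any real cloud or PAM API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Checkout:
    """Audit record for one break glass checkout."""
    analyst: str
    account: str
    reason: str
    granted_at: datetime
    expires_at: datetime


@dataclass
class BreakGlassVault:
    """Hypothetical stand-in for a PAM tool or credential vault."""
    max_duration: timedelta = timedelta(hours=1)  # assumed time limit
    audit_log: list[Checkout] = field(default_factory=list)

    def check_out(self, analyst: str, account: str, reason: str,
                  now: Optional[datetime] = None) -> Checkout:
        """Grant time-limited access and record why it was requested."""
        if not reason.strip():
            raise ValueError("a reason is required for the audit trail")
        now = now or datetime.now(timezone.utc)
        grant = Checkout(analyst, account, reason, now, now + self.max_duration)
        self.audit_log.append(grant)
        return grant

    def is_active(self, grant: Checkout, now: Optional[datetime] = None) -> bool:
        """Access is only valid inside the granted window."""
        now = now or datetime.now(timezone.utc)
        return grant.granted_at <= now < grant.expires_at


vault = BreakGlassVault()
start = datetime(2024, 4, 2, 2, 0, tzinfo=timezone.utc)
grant = vault.check_out(
    "analyst-1", "prod-account", "contain suspected key compromise", now=start
)
print(vault.is_active(grant, now=start + timedelta(minutes=30)))
print(vault.is_active(grant, now=start + timedelta(hours=2)))
```

The audit log answers the "why did somebody check out this account" question, and the expiry keeps the elevated access from becoming standing access.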

Ashish Rajan: Worthwhile calling out as well. Responding to incidents also means, and I'm sure it's the case in most organizations, that the signal to noise ratio is always hard to manage, and hopefully for most companies out there, most of the incidents that are being raised are false positives.

And you basically find that, oh, it's not really an incident, but it still has to be triaged. What have you found as a way to balance that? Because there's an opportunity for people to not [00:31:00] repeat that signal to noise ratio blowout, for lack of a better word, from on premise in the cloud context.

Was there something that you found was helpful for the team to do IR better in a cloud or a multi cloud context?

Andrew Tabona: Yeah, I'd say this whole signal to noise ratio comes down to the fidelity of the alerts, and you want to be aiming for high fidelity because it's sort of a known thing, right?

If you start bombarding the SOC with a whole bunch of noise that they're wasting time on and that takes their attention away from the real stuff, the serious stuff that's happening, then they're eventually going to get burnt out. I took the approach of, I would rather invest the time in creating two high fidelity alerts that I know are going to give me some value versus just churning out 12 alerts which give us little to no value, right? Just because, oh, we have 12 alerts, now we're covering this part of the MITRE cloud framework. Trust me. It's much better to say I have five as an [00:32:00] example, but these five, I know that if they trigger, there's an 80 to 90 percent chance it's something weird that's going on. So think about that and invest the time wisely as well.
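One way to put a number on alert fidelity is to measure, per detection rule, what fraction of triaged alerts turned out to be real. The Python sketch below does that under assumed inputs: the rule names and triage outcomes are invented for illustration, and the 0.8 cut-off simply mirrors the "80 to 90 percent" figure Andrew mentions.

```python
from collections import Counter


def rule_fidelity(triaged_alerts):
    """Return {rule: fraction of its alerts triaged as true positives}."""
    totals, hits = Counter(), Counter()
    for rule, was_true_positive in triaged_alerts:
        totals[rule] += 1
        if was_true_positive:
            hits[rule] += 1
    return {rule: hits[rule] / totals[rule] for rule in totals}


def noisy_rules(triaged_alerts, min_fidelity=0.8):
    """Rules whose fidelity falls below the (assumed) acceptable threshold."""
    fidelity = rule_fidelity(triaged_alerts)
    return sorted(rule for rule, score in fidelity.items() if score < min_fidelity)


# Hypothetical triage history: (rule name, was it a true positive?)
triage = (
    [("root-login-without-mfa", True)] * 9
    + [("root-login-without-mfa", False)]
    + [("any-s3-list-call", False)] * 19
    + [("any-s3-list-call", True)]
)

fidelity = rule_fidelity(triage)
for rule, score in sorted(fidelity.items()):
    print(f"{rule}: {score:.0%}")
print("candidates to tune or retire:", noisy_rules(triage))
```

Rules that land in the noisy list are the ones costing the SOC attention without giving much value back.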

Ashish Rajan: Are there any challenges that people would face as they go down this? Like we're obviously talking about how they can start building detection capability. We spoke about skillset across multiple clouds and automation playbook as well on how to approach the playbook.

Are there any low hanging fruit that they can go for in the beginning when they're trying to start building this?

Andrew Tabona: Yes, there are a few I can think of, right? The first one is build out an organizational RACI. So basically, who's responsible for what, right? Because speed is essential for containment, especially in the cloud.

So having that predefined RACI and understanding with the business, like how much the IR team can do on their own to put the fire out, so to speak, is going to be really important. And then just mapping out: what does the DevOps team do? What does the application owner do, or the server [00:33:00] owner or the account owner do?

Where are we going to get the business context of this account from, who's going to gather the logs, who's going to rotate the compromised keys, these kinds of things. Having it in a table or a RACI ahead of time is really going to help you. Like, as the IR team, we need to jump in and stop the bleeding as soon as possible.

The last thing you want to do is call somebody at 2 a.m. in the morning and ask him, hey, are you the account owner? Can you come and take some action in this account? And he's like, no, it's not me. Or he says, okay, I can be there in 30 minutes. You need to be able to do this stuff in seconds or minutes.

But again, on the flip side, I would say we also have to be conscious of business operations. And I mentioned this earlier, so you don't really want to be willy nilly pressing the buttons and killing services and shutting down EC2 instances just like that when you don't know what's going on. Having said that, if I, [00:34:00] as an incident responder, feel like if I do not take this action, it is going to have a huge impact on the reputation or the revenue of the business,

then I will take that action and ask for forgiveness later, right? 99.9 percent of the time I can justify why I'm doing something. So I think keeping that in mind is important.
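A predefined RACI can be as simple as a lookup table the on-call responder consults instead of phoning around at 2 a.m. The Python sketch below uses the tasks and roles mentioned in the conversation, but who is assigned to what is entirely hypothetical. Each organization has to agree on that mapping with the business.

```python
# R = Responsible, A = Accountable, C = Consulted, I = Informed.
# Task and role names come from the discussion; the assignments are made up.
RACI = {
    "provide business context": {
        "R": "application owner", "A": "account owner",
        "C": ["DevOps team"], "I": ["IR team"],
    },
    "gather logs": {
        "R": "IR team", "A": "IR team",
        "C": ["DevOps team"], "I": ["account owner"],
    },
    "rotate compromised keys": {
        "R": "DevOps team", "A": "account owner",
        "C": ["IR team"], "I": ["application owner"],
    },
}


def responsible_for(task: str) -> str:
    """Who does the work for this task during an incident."""
    try:
        return RACI[task]["R"]
    except KeyError:
        raise KeyError(f"no RACI entry for {task!r}; define it before an incident") from None


def tasks_owned_by(role: str) -> list[str]:
    """Everything a given role is Responsible for, e.g. to brief them up front."""
    return sorted(task for task, row in RACI.items() if row["R"] == role)


print(responsible_for("rotate compromised keys"))
print(tasks_owned_by("IR team"))
```

Failing loudly on an unknown task is deliberate: a missing row is a gap to fix ahead of time, not something to discover mid-incident.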

Ashish Rajan: What could be some of the challenges that people may face as they're trying to develop this detection capability in their organization?

Something that you can share that they can watch out for or they could probably prepare for beforehand?

Andrew Tabona: Yeah, I think, so it comes down to a few things. I think having the right skills and getting the right support is a longer journey than people think. And what I would say is, use as many avenues and as many relationships as you can to garner that support and that sponsorship, if you like, from the top.

And then, we spoke about a lot of this earlier, but just really thinking about what you want to detect on, [00:35:00] making sure you're not killing the SOC, so to speak, with the amount of detections, and mapping out where you want to be. I think that's important. A lot of people forget about this, right?

If you think about the maturity level of your cloud IR journey, if you like. I've seen people and spoken to people that just jump in, they get excited, get knee deep and they just go for it, but actually the sort of level zero in my mind, if you like, is mapping out the strategy saying, by this time we want to be here, by this time we want to be here, we're going to need this.

Here's the dollar amount. Here's the skills, blah, blah. Like just thinking about all that and going through that journey over months. And, sometimes years. Yeah.

Ashish Rajan: That brings me to another point because you guys are doing this across multiple clouds. For people, should they do all the clouds in one go in that strategy map of this?

Or should they do one cloud at a time? Because there's a lot. What would your recommendation be?

Andrew Tabona: So in short, I would say what's [00:36:00] worked for us is start with one, master one, understand and basically test what your model is and what your framework wants to be, right?

What your process is going to be, and then replicate that across the other clouds. Of course, there's going to be nuances along the way. Once you have that foundation, it's easier to replicate. And also you think about the skills piece of it, right? It's very hard to find someone who is a master of all clouds.

To the point that they understand the ins and outs.

Ashish Rajan: So you're saying people who are on LinkedIn and say multi cloud, they're not truly multi cloud? I see so many people with multi cloud, and I'm like, you're like a jack of all trades or king of none, or whatever that phrase is?

Andrew Tabona: We can all have that foundation level of understanding, I'm sure, but really and truly knowing GCP, Azure, AWS, Oracle, whatever, to a master ninja level, as we were talking about before, is very tough and very expensive to bring in house. So what we found [00:37:00] was, start with, say, AWS. At the same time, start looking for that skill or those skills, or training for those skills, so that when you're ready, you have someone who can help you replicate that. So I would say, at least for me, that's worked.

Ashish Rajan: And would you say, and I may be bringing the leadership hat on for this for a second, how do you show ROI on the time invested in doing this? Because say, for example, you run a purple team exercise, and just like any other lunch and learn that people have tried, it's: hey, why am I even coming into this place?

It's a waste of time. Why do you want my developers to come for this? There's a whole conversation on that level as well that a lot of leadership would have to face. Maybe the technical folks don't have to, but leadership has to justify what's the ROI for it.

Was there anything that you found from an ROI perspective that is a good example that people can use for their justification as well?

Andrew Tabona: Yeah. I think three metrics to keep in mind are mean time to detect, mean time to respond, and mean time to [00:38:00] recover. So if you can show that the time and money you spent has resulted in faster detection and faster response and faster recovery times, all of which, as a by product, lower the business impact and effectively the impact on the bottom line of the company, then you're speaking their exact language.

So I think that's something we began to measure many years ago in on prem, but taking that to the cloud, especially adapting that to the cloud, it's actually, I would say, easier in a way because the metrics are more readily available.

And when you start talking about automating things, you can show, hey, it used to take us like 17 minutes to execute or contain this type of attack, and it now takes us 2 minutes because we're using semi automation, for example.

They start to go, wow that's a real ROI, right? Yeah. Yeah. Like I said, talk their language.
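The three metrics can be computed directly from incident timestamps. The Python sketch below is one possible way to do it, with assumed definitions: detect is measured from when the activity started, respond from detection to containment, and recover from containment to recovery. Organizations define these intervals differently, so treat the boundaries and the sample incidents as illustrative.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    started: datetime    # when the malicious activity began
    detected: datetime   # when an alert fired or an analyst noticed
    contained: datetime  # when the response action completed
    recovered: datetime  # when normal service was restored


def mean_minutes(deltas) -> float:
    """Average a series of timedeltas and express the result in minutes."""
    return mean(d.total_seconds() for d in deltas) / 60


def ir_metrics(incidents: list[Incident]) -> dict[str, float]:
    return {
        "mean_time_to_detect": mean_minutes(i.detected - i.started for i in incidents),
        "mean_time_to_respond": mean_minutes(i.contained - i.detected for i in incidents),
        "mean_time_to_recover": mean_minutes(i.recovered - i.contained for i in incidents),
    }


def percent_reduction(before: float, after: float) -> float:
    """How much a metric improved, as a percentage of the old value."""
    return (before - after) / before * 100


t0 = datetime(2024, 1, 1, 0, 0)
minutes = lambda n: timedelta(minutes=n)
incidents = [
    # Manual containment: 17 minutes from detection to containment.
    Incident(t0, t0 + minutes(10), t0 + minutes(27), t0 + minutes(87)),
    # Semi automated containment: 2 minutes from detection to containment.
    Incident(t0, t0 + minutes(20), t0 + minutes(22), t0 + minutes(52)),
]

metrics = ir_metrics(incidents)
for name, value in metrics.items():
    print(f"{name}: {value:.1f} min")
print(f"containment time reduced by {percent_reduction(17, 2):.1f}%")
```

Going from 17 minutes to 2 minutes works out to roughly an 88 percent reduction, which is the kind of single number that lands with executives.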

Ashish Rajan: Yeah. I appreciate that. And maybe this is like [00:39:00] the last technical question: what do you find has been the best way to learn the whole building of the IR team? I guess it's not just one cloud, but multiple clouds.

I don't know if when you were trying to go through this process of starting this journey. Was there something helpful for you that you found either for yourself or your team that you could probably share with everyone? That would be great as well.

Andrew Tabona: Yeah, definitely. So there's a few things. One was, immediately when we started to get into this, we built an internal sandbox within each of the clouds.

And that allowed our engineers and analysts to go in and play around in a safe environment. We didn't care if they spun up and spun down resources, we could tear things down and build them up pretty easily. So they learned the different services each CSP has to offer. A lot of folks found online training platforms like A Cloud Guru and Cloud Academy and things like Hack The Box and so on.

They found those really good for providing labs and challenges where they can get hands on. [00:40:00] A few times a year we also run cyber range exercises. We partner with an external vendor that comes in and gives us a scenario. Nobody knows what the scenario is. They just literally go on that day and they're like, hey, this is happening right now, and people have a chance to get hands on with the different tools and technologies within the different CSPs.

And then you mentioned purple team, and that's been huge for us, running purple team exercises, where you have this concept of red team versus blue team. Basically, this is where you've partnered with the business as well, right? Because you need an account that has real life or real world resources in it, right?

So you go to the business and say, hey, can you give us a non prod account where we can come in and run this exercise? We're actually going to learn what security gaps there are, what security control gaps there are. So you get the benefit of running a sort of pen test, if you like. We as an IR and SOC team are going to learn about the [00:41:00] environment and about how we respond to an attack, or certain TTPs, within that environment.

So that's been really big for us. Yeah, that's what I would say, really.

Ashish Rajan: Awesome. No, thank you for sharing that. Those are the technical questions I had. I got three personal questions for you as well, man, not too personal, but just to get to know you a bit more. What do you do outside of this IR world of yours, man? What keeps you busy outside of this world of IR and cloud response and all of that?

Andrew Tabona: These days it's spending time with my family and my young daughter. And it's fun to take her to places and just see her enjoy herself and learn and grow in that respect.

I also love to swim. Swimming is my passion, it's my thing. And previously I used to do technical writing gigs, just as a means for me to learn and share and disseminate that knowledge. So yeah, that's really it. These days, it's my family that I hang out with outside of the work world.

Ashish Rajan: That's pretty awesome, man. And this probably [00:42:00] leads to another question, which is what is something that you're proud of, but that is not on your social media?

Andrew Tabona: Okay. So one thing is, I guess, when I did my master's degree, there were two elements to that, right? One was, to do that, I was offered a scholarship.

I was offered an EU wide scholarship, and they were giving away ten scholarships, right? And I was fourth out of the ten to be given that, so I'm quite proud of that. And B is, I walked into that. So I left my job, my full-time job. I moved. Oh, you had a full-time job. Yeah. I left my full-time job to be a full-time student again.

Oh my god, my master's degree. Wow. In a new country and everything. I sold everything up and moved to Scotland. And I walked into that course not knowing anything about forensics, e-discovery, or very little about InfoSec. And I walked out with a distinction and an academic publication, all that stuff.

So I'm very humbled and appreciative of that opportunity.

Ashish Rajan: That is definitely something to be very proud of, that is pretty awesome, by the way, and that is something to be [00:43:00] proud of as well. I don't know how many people take that kind of bet, man, but I'm glad you did and I'm glad you aced it as well. Third and final question. What is your favorite restaurant or cuisine that you can share?

Andrew Tabona: Oh, now this. We need another podcast, but

Ashish Rajan: if you had to pick one, I know it's a hard one, food.

Andrew Tabona: Yeah. I'm really big into Asian cuisine. So my wife is originally from Singapore. So I was introduced to Southeast Asian cuisine that way, when I met her. I've since become a very big foodie, but yeah, I love going out there and to all the different countries within that region and just trying different things.

One thing that I keep going back to is chicken rice. It's a big thing, and I just love having that when I go back.

Ashish Rajan: Fair. And I think to be specific it's not just boiled chicken rice, it's that the, it's the Singaporean style of chicken rice, which is the yeah.

Just the roast, I think. I don't know if it is roasted, but it is. I know what you mean. I'm just going to explain: the first time you have it, you're like, this cannot be that [00:44:00] great, and then you'll be like, oh my God, this is so great.

Andrew Tabona: I'm telling you, some people call it like a heart attack on a plate, because you're basically burning the chicken fat and putting that into the rice.

Oh yeah, I've been having it a few times.

Ashish Rajan: Fair enough. Yeah, that's awesome, man. Yeah, I'll definitely put Singaporean cuisine in there, man. Thank you so much for your time, man. Where can people find you on the internet? They want to connect and talk more about this with you, man.

Andrew Tabona: Yeah, no, thank you for inviting me, Ashish.

I really appreciate the opportunity. Always happy to share and talk about these things. I'm on LinkedIn. A couple of years ago I killed all my social media, like Facebook and Instagram. It was just consuming my life more than I wanted, but I'm on LinkedIn. So happy to connect and have a conversation offline if people want to.

Ashish Rajan: I will definitely include that in the show notes as well. But dude, thank you so much for coming in again. And for everyone else who's watching, if you guys have any questions, feel free to drop them in, but definitely feel free to connect with Andrew. I'll put the link in the show notes. I will see you next episode.

Thanks everyone for your time. See you next episode. Peace.

Thank you for listening or watching this episode of Cloud Security Podcast. We have been running for the past five years, so I'm sure we haven't [00:45:00] covered everything cloud security yet. And if there's a particular cloud security topic that we can cover for you in an interview format on Cloud Security Podcast or make a training video on tutorials on Cloud Security Bootcamp, definitely reach out to us on info@cloudsecuritypodcast.tv. By the way, if you're interested in AI and cybersecurity, as many cybersecurity leaders are, you might be interested in our sister podcast called AI Cybersecurity Podcast which I run with former CSO of Robinhood, Caleb Sima, where we talk about everything AI and cybersecurity. How can organizations deal with cybersecurity on AI systems, AI platforms, whatever AI has to bring next as an evolution of ChatGPT, and everything else continues.

If you have any other suggestions, definitely drop them on info@cloudsecuritypodcast.tv. I'll drop that in the description and the show notes as well. So you can reach out to us easily. Otherwise, I will see you in the next episode. Peace.

‍
