Starting SOC Automation

Tyler Wall
18 min read · Feb 24, 2024

This article will discuss the maturity models of Security Operations Centers, how to know where your SOC is at, and how to embrace SOC automation and stay ahead of the curve.

Graphics Designed by Matthew Peterson https://www.mttptrsn.com/

Automation within the Security Operations Center (SOC) is generally referred to as Security Automation and Orchestration (SAO) or Security Orchestration, Automation, and Response (SOAR). As an analyst, you are increasingly likely to encounter some type of security automation within organizations. To what extent may depend on the maturity of your organization and its SOC. We will dive into maturity models and how they relate to automation a bit later in this article. First, what is security automation?

What Is SOC Automation?

No, SOC automation does not refer to robots becoming self-aware. Threat intelligence feeds do not suggest that “judgment day” is close on the horizon. Simply stated, automation is the machine implementation of low-level security-related actions. These actions are small pieces of a larger task. Generally, a task will be made from a number of actions. Similarly, a process will encompass a number of tasks. Tasks can be partially or fully automated with the goal of reducing human intervention in security operations. Orchestration, while very closely tied to automation, takes advantage of multiple automation tasks across multiple systems or platforms. Orchestration is used to automate or semiautomate more complex workflows and processes.
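The action, task, and process hierarchy described above can be sketched as a small data model. This is a minimal illustration only; the class names `Action`, `Task`, and `Process` and the `automation_level` helper are assumptions made for this example, not part of any SOAR product.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Action:
    """One low-level step, such as looking up a single file hash."""
    name: str
    run: Callable[[Dict], Dict]  # receives the shared context, returns it updated
    automated: bool = True       # False marks a step a human still performs


@dataclass
class Task:
    """A task is made from a number of actions."""
    name: str
    actions: List[Action] = field(default_factory=list)

    def automation_level(self) -> str:
        """Report whether the task is manual, partially, or fully automated."""
        if not self.actions:
            return "manual"
        automated = sum(1 for action in self.actions if action.automated)
        if automated == len(self.actions):
            return "full"
        return "partial" if automated else "manual"


@dataclass
class Process:
    """A process encompasses a number of tasks."""
    name: str
    tasks: List[Task] = field(default_factory=list)

    def summary(self) -> Dict[str, str]:
        """Map each task name to its automation level."""
        return {task.name: task.automation_level() for task in self.tasks}
```

Orchestration would sit one level above this: something that runs tasks like these across several systems rather than inside one.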

We have heard criticism from SOC analysts and others in the security community regarding automation. The overwhelming theme seems to be that analysts are worried that automation will take their jobs. At first glance, I can see where they are coming from. If a machine can do it faster and more efficiently, then what is the analyst to do? Believe me, I get it! As a SOC lead, I want to challenge my analysts to do a detailed analysis of events. This takes a good amount of time and is not possible with the volume of events seen on a daily basis. I want them to look for trends, examine data over a larger period of time, and then find the reason that these events are taking place. I want them to ask themselves questions like: “Do I have to respond to 50 events per day on an IPS signature because the webserver is vulnerable?” Then present that data back to your SOC leadership, and take the initiative to get the business to patch the vulnerability.

What we are attempting to convey is that SOC automation should not be seen as a limitation to your career, but rather as a springboard that can help you become a better analyst. We will go over a number of reasons for automation in the next section that should paint a clearer picture of the benefit of automation not only to the SOC but to the individual analysts as well. Let’s dig into why automation is a positive addition to any SOC.

Why Automate?

There are a number of reasons for a SOC to automate, but be assured that replacing analysts is generally not the goal. The SOC analyst is a valuable resource that will always be needed to perform where machines cannot. Whether part of a maturity initiative or new business requirements, leadership is often left taking on additional services with the same or fewer resources. Taking into account that SOC leadership is being pressured to deliver more, combined with the shortage of skilled cybersecurity professionals, it is easy to see why automation is a no-brainer.

I have spent time in the trenches working through an endless queue of events. When I was a junior analyst, there were times when I would have a number of events that were generated for antivirus detections where the files were quarantined. Over half of the events in that day were “potentially unwanted applications” (PUA) which were adware/toolbar related. The tool did its job, the files were quarantined, yet I still had a number of events that needed to be addressed. I had to manually add the appropriate notes and close each ticket. If I had automation in place, then it would have made my life a lot easier. I would have been able to focus on more in-depth analysis and look for a common source of the adware, but due to the sheer volume of events, it was not an option at that time.

For me, automation is a force multiplier when it comes to helping analysts with the flood of events they handle on a daily basis. By eliminating the need for analysts to do monotonous tasks, they are free to spend more time performing higher-level analysis of events. Senior analysts will have more time to dedicate to training junior analysts and more time can be spent on developing documentation. With the ever-changing pace of a SOC, we all know this is always needed.

One of the first reasons a SOC may choose to automate is to streamline existing processes. Many SOAR platforms have C-level dashboards that are designed to show the amount of time and money saved by automating actions. While I do agree to an extent that this can be important, focusing on this alone may not necessarily be the best fit for all organizations. There are a number of other reasons that I believe are equally important to the operation of a healthy SOC.

One of my favorite reasons for automating is to reduce analyst fatigue. I cannot be the only analyst that has ever spent what seems like hours a day pressing “Ctrl+C” and “Ctrl+V.” I have gone home at the end of the day brain-fried, wondering if a monkey could do the job just as well.

As I mentioned earlier, security analysts are the most important resource that a SOC has. These analysts are inundated day-in and day-out with an abundance of information that needs to be collected, categorized, classified, analyzed, and interpreted. Reducing the volume of events that need to be analyzed is one way to achieve this.

Reducing analyst fatigue benefits the SOC by reducing overall stress and making it a fun and challenging place to work. Isn’t the saying: “Happy SOC, Happy Life”? Good leadership should strive to do all that they can to promote morale and a healthy workplace environment. Doing the same repetitive actions day-in and day-out will desensitize you and cause you to skip steps or cut corners. This fatigue increases the possibility for mistakes to be made.

Reducing mistakes leads me to another popular reason for automating, which is standardizing processes. Analysts can get trapped in an endless screen-switching cycle during an investigation by checking documentation, following defined steps, and moving between multiple consoles. When automating security-related tasks, we drive consistency and reduce the likelihood for errors. Consistency is key in security operations. During incident response when we implement automation, we can ensure that processes are consistently followed.

As a SOC analyst, it is very easy to cast wide nets in order to collect as much information as possible. Sometimes the rules we write just need to be broad. The events generated by a rule may only be an indicator when correlated to another event or other condition. Sure, you could write a correlation rule, but maybe you are in the infancy of tuning a rule, and thus analysts receive a large number of false-positive detections. What if we could use automation to tune out these false positives? Reducing the overall volume of false positives is one such use case that I have spent a good amount of time automating. I will give an example of this later in the article.

Each analyst has their own preference for sources of information, and this can sometimes create false positives or lead an analyst down the wrong rabbit hole. As mentioned previously, consistency is important for a number of reasons, but in addition to those already mentioned, another reason to automate is for the reduction of information bias. There are some reputation and intelligence data sharing services that are higher fidelity than others. Open source feeds can be a double-edged sword. On one side, they may have larger reference sets and be of good quality, but on the other side, I have found that it is easier for one wrong attribution to skew a full dataset. When the sources from which data is ingested and consumed are defined by the team, reputation checking and intelligence enrichment can be easily automated within your playbooks.

Every few months, it seems like there is a new attack pattern, and threats are becoming more complex each and every day. Organizations need to be prepared for this evolution of complex threats. Adversaries today are utilizing automation to conduct attacks against your organization. Security operations need to keep up with the speed at which attackers are evolving, and the only way to do this is through automation and orchestration. As you implement new automation playbooks, the end goal should be to reduce the mean time to detection (MTTD) and mean time to response (MTTR). Each step that is automated shaves fractions of seconds from these SOC metrics. While at first glance it may not seem that a machine could save much time per single action, the accumulation of all of these small actions over time will add up to significant time savings. The decrease of these metrics will satisfy senior management while also providing the numerous benefits mentioned previously.
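To make the "small savings add up" point concrete, here is a back-of-the-envelope calculation. The function name and every number in the example call are hypothetical; substitute figures measured in your own SOC.

```python
def daily_hours_saved(events_per_day: int,
                      actions_per_event: int,
                      seconds_saved_per_action: float) -> float:
    """Total analyst time saved per day, in hours."""
    total_seconds = events_per_day * actions_per_event * seconds_saved_per_action
    return total_seconds / 3600


# Hypothetical SOC: 600 events a day, 3 automated actions per event,
# 20 seconds saved per action -> 36,000 seconds in total.
print(daily_hours_saved(600, 3, 20))  # 10.0
```

Even when the per-action saving really is a fraction of a second, the same multiplication applies; the event and action counts are what drive the total.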

SOC Maturity

I would like to preface this section by stating that I do not think many organizations would expect that they could fully automate every process from beginning to end. I believe there are just so many situations that require an analyst to make a decision that a machine just cannot do. There have been many horror stories of automation putting blocks in place based upon the wrong classification of the data. These instances have had catastrophic effects on businesses and their reputations. Until an organization has a high confidence level with the data being provided, I would personally suggest adding in some checks and balances into automation processes. These checks and balances should require human interaction and approval before blocking controls are put in place. All of these steps can be built into your playbooks to ensure that you can not only take advantage of automation to the fullest extent possible but also keep automation from taking an incorrect action.
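One way to picture these checks and balances is an approval gate in front of any blocking action. The sketch below is an assumption-laden illustration: `BlockRequest`, `decide_block`, the confidence score, and the threshold are all invented for this example, and `ask_analyst` and `apply_block` stand in for whatever approval and blocking integrations your platform provides.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BlockRequest:
    indicator: str     # e.g. an IP address or domain to block
    confidence: float  # 0.0 to 1.0, however your data source scores it


def decide_block(request: BlockRequest,
                 auto_threshold: float,
                 ask_analyst: Callable[[BlockRequest], bool],
                 apply_block: Callable[[str], None]) -> str:
    """Block automatically only above the threshold; otherwise ask a human."""
    if request.confidence >= auto_threshold:
        apply_block(request.indicator)
        return "blocked-automatically"
    # Below the threshold, a person must approve before anything is blocked.
    if ask_analyst(request):
        apply_block(request.indicator)
        return "blocked-after-approval"
    return "not-blocked"
```

Setting `auto_threshold` above 1.0 forces every block through an analyst, which may be the right starting point until you trust the data.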

The goal of this article is not to go into a deep dive on the topic of maturity models. There are a few different ways to go about measuring the maturity of your SOC. You can write your own framework or use an industry standard framework to accomplish the same goal. The benefit to using a standardized framework is that it is recognized and probably being used by other organizations within your industry. Both solutions are designed to provide a situational summary of where the SOC is in its maturity, taking into account all of its processes.

Figure 1–1 Sample Maturity Phases

When assessing the maturity of the SOC and its automation, it’s easy enough to start with a staged approach similar to the one shown in Figure 1–1. I put this graphic together to illustrate that once you have completed an inventory of the processes and actions that your SOC is doing today, you can then map your current state and measure your progress toward your goals. Set small goals to get you to the next phase. If you have not begun your automation journey, don’t be afraid of starting now. Each action you automate will get you closer to your goals.

As a junior analyst, you will begin to see areas for improvement in the processes that you and your team use every day. Document any process gaps and look for actions that can be automated. Take time to gather all of the appropriate data, and do the analysis. Can any of these actions be automated? What benefit do you see it providing the team? Be able to articulate how you believe automating an action will improve the function. By presenting a process improvement or resolution to a problem and not just the gap, you will set yourself as a leader among your peers, and SOC leadership will see you as a true problem solver.

How to Start Automating

There is no one-size-fits-all solution for every organization. In my experience, it has been the most beneficial for analysts within the SOC that are intimately familiar with their processes and procedures to spend a little bit of time analyzing the work they perform each day. Categorize your tasks by the time required to complete them, and then by the complexity of the task. Start with the tasks that are simple and do not take a lot of time to complete, and leave the complex tasks for after you are comfortable with the process flow. Chances are that there are a number of these simple tasks, and by automating them you will make a good amount of progress. Figure 1–2 may help you categorize your tasks and allow you to focus on automation tasks that will provide the most value up front.

Figure 1–2 Security Task Categorization
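The categorization can also be done in a few lines of code once the inventory exists. This is a rough sketch; the `SocTask` fields, the 1-to-5 complexity scale, and the sample task names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SocTask:
    name: str
    minutes: int     # typical time to complete the task once
    complexity: int  # 1 = simple ... 5 = complex, on a scale you define


def automation_order(tasks: List[SocTask]) -> List[SocTask]:
    """Simple, quick tasks first; complex, slow tasks last."""
    return sorted(tasks, key=lambda task: (task.complexity, task.minutes))


inventory = [
    SocTask("phishing triage", minutes=30, complexity=4),
    SocTask("close quarantined PUA ticket", minutes=2, complexity=1),
    SocTask("file hash lookup", minutes=5, complexity=1),
]
for task in automation_order(inventory):
    print(task.name)
```

Sorting on complexity before time reflects the advice above: leave the complex tasks until you are comfortable with the process flow.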

When starting with a simple task that takes a short time to complete, look for repetitive actions without complex conditions. If you have different actions that you take based upon the output of an action, it will add complexity to the playbook. I have found that it is very easy to start working through a use case, only to find out halfway through it that one small attribute changes the whole thing. Spend time dissecting the actions and whiteboard the process flow. Make every effort to break it down to the smallest steps that you can. A very simple example of automating a task such as this may be getting the reputation of a file. This might make it a bit easier to help you envision the steps taken.

Figure 1–3 Simple use case of getting a file reputation

In this simple example, I have broken down the task into four small actions that an analyst would need to take:

1. Gather the file hash.

2. Open a web browser.

3. Paste the hash into the browser and submit it.

4. Make a decision based upon the file reputation.

The decision made upon the file reputation may then feed another action or a process flow further downstream. A playbook can be this small. Keep in mind that it is possible to have a playbook that calls other playbooks synchronously, waiting for the first one to complete before calling another.
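The four actions above might look like this as a tiny playbook. It is a sketch under stated assumptions: `lookup` is a hypothetical stand-in for whichever reputation service your team has agreed on, and the verdict strings and next-step names are invented for the example.

```python
import hashlib
from typing import Callable, Dict


def file_sha256(data: bytes) -> str:
    """Action 1: gather the file hash."""
    return hashlib.sha256(data).hexdigest()


def file_reputation_playbook(data: bytes,
                             lookup: Callable[[str], str]) -> Dict[str, str]:
    """Hash the file, ask a reputation source, and pick the next step."""
    file_hash = file_sha256(data)
    # Actions 2 and 3: the machine submits the hash instead of a browser.
    verdict = lookup(file_hash)
    # Action 4: make a decision based upon the file reputation.
    if verdict == "malicious":
        next_step = "escalate"
    elif verdict == "clean":
        next_step = "close"
    else:
        next_step = "analyst-review"  # anything unexpected goes to a human
    return {"file_hash": file_hash, "verdict": verdict, "next_step": next_step}
```

Because the lookup is passed in, the same playbook can be exercised with a fake source in testing and a real integration in production.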

At first glance, it may not look as though automating this task would save much time. What if the hash was a false-positive detection? What if we could automatically close the event based on the file reputation? What if we could collect the false-positive file and submit it back to the vendor to be reevaluated? Not only would automation help by eliminating the noise of false-positive detections, but it would reduce the number of tickets you would need to respond to. Now, this short, simple action has saved a significant amount of time when scaled to the number of events that need to be investigated in a day.

Sample Use Cases

I have come across a number of use cases discussed in different articles around the Web. Maybe some of them will work for you, or maybe they will just spark some ideas on what can be done. Like I mentioned earlier in this article, there is no one size fits all. Vendors supply sample playbooks that are generally meant as teaching points to what their product can do. Unfortunately, not every solution will be able to be integrated with your automation platform. You will encounter situations that may not work in your environment, just as you will also encounter situations that the vendor has not specifically encountered before. This is to be expected and is all a part of the journey of SOC automation. I wanted to highlight a couple use cases that I have personally encountered that I have had good success with. They do not cover every use case or reason that a SOC may choose to automate; however, they may act as a starting point or inspiration for your automation endeavors.

A use case that I have encountered was reducing a number of false-positive detections from an email hygiene provider. The team utilized a service that sends alerts for a malicious email that was delivered. There were times that after the alert was sent, the email was reclassified as clean. I wrote an automation playbook that would call the email hygiene provider’s API to check for the “false-positive” flag. If the alert was a false positive, an analyst ticket would not be created.
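The shape of that playbook is roughly the following. This is not the original code: `fetch_alert` and `create_ticket` are hypothetical callables standing in for the provider's API and your ticketing system, and the `false_positive` field name is an assumption.

```python
from typing import Callable, Dict, Optional


def handle_email_alert(alert_id: str,
                       fetch_alert: Callable[[str], Dict],
                       create_ticket: Callable[[str, Dict], str]) -> Optional[str]:
    """Create an analyst ticket unless the provider has reclassified the email."""
    details = fetch_alert(alert_id)  # re-check the alert's current status
    if details.get("false_positive", False):
        return None                  # reclassified as clean: no ticket
    return create_ticket(alert_id, details)
```

A missing flag is treated as "not a false positive" here, so an incomplete API response still produces a ticket rather than silently dropping the alert.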

Another use case, which was a bit more advanced, was providing paging to on-call analysts when critical events came in. We started by defining the type of events that would cause an analyst to be paged out. Once that was complete, we began to figure out how to collect the on-call person and their page address. This took a bit of custom Python code using the “beautifulsoup” (Beautiful Soup) library. The playbook would scrape an intranet page and parse out the email address to page and send an alert to that analyst with the context of the critical event. Once that step was complete, the playbook would monitor a mailbox for a read-receipt for the page. If the page was not acknowledged within an hour, the playbook would send the same page to the on-call escalation point.
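The scrape-and-escalate logic can be sketched as below. To keep the example free of third-party dependencies it swaps in Python's standard `html.parser` in place of Beautiful Soup, so it is not the original code. The page layout (an element with the id `oncall-email`) and the function names are assumptions.

```python
from html.parser import HTMLParser
from typing import Optional


class OnCallParser(HTMLParser):
    """Capture the text of the first element with a given id attribute."""

    def __init__(self, element_id: str):
        super().__init__()
        self.element_id = element_id
        self._capturing = False
        self.email: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if dict(attrs).get("id") == self.element_id:
            self._capturing = True

    def handle_data(self, data):
        if self._capturing and self.email is None and data.strip():
            self.email = data.strip()
            self._capturing = False


def find_on_call_email(html: str, element_id: str = "oncall-email") -> Optional[str]:
    """Parse the intranet page and return the address to page, if present."""
    parser = OnCallParser(element_id)
    parser.feed(html)
    parser.close()
    return parser.email


def needs_escalation(minutes_since_page: int, acknowledged: bool,
                     timeout_minutes: int = 60) -> bool:
    """True when the page is still unacknowledged after the timeout."""
    return not acknowledged and minutes_since_page >= timeout_minutes
```

Fetching the intranet page, sending the page, and watching the mailbox for a read-receipt are left out; each would be its own action in the playbook.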

The most common automation use case that I have helped to put in place is the enrichment of events with threat intelligence. In this environment, events are sent from the SIEM to the automation platform for processing, and a ticket is created in a temporary ticket queue. The playbook will extract indicators such as file hash, file path, source and destination IP addresses, etc. Depending on the event type, these indicators are enriched from various sources that are predefined by the SOC. The data is used to populate notes in the event and add context to the event for the analyst that works it. Once all of this enrichment is complete, the playbook will move the ticket from the temporary queue to the SOC analyst queue. The reasons for moving it to the analyst queue after all the enrichment is done are to prevent a ticket state change and to ensure that any error checking added to the playbook is complete first. I want the analyst to have all the data they need to make a decision on the event, instead of having only partially complete data.
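A stripped-down version of this enrichment flow is shown below. The ticket layout, the queue names, and the `enrichers` map (indicator type to named lookup functions) are assumptions made for the sketch; real lookups would call the sources your SOC has predefined.

```python
from typing import Callable, Dict

# indicator type -> {source name -> lookup function}; defined by the SOC
Enrichers = Dict[str, Dict[str, Callable[[str], str]]]


def enrich_and_release(ticket: Dict, enrichers: Enrichers) -> Dict:
    """Attach enrichment notes, then move the ticket to the analyst queue."""
    notes = list(ticket.get("notes", []))
    for indicator_type, value in ticket.get("indicators", {}).items():
        for source_name, lookup in enrichers.get(indicator_type, {}).items():
            try:
                result = lookup(value)
            except Exception as exc:  # a failed lookup becomes a note, not a crash
                result = f"lookup failed ({exc})"
            notes.append(f"{source_name}: {indicator_type} {value} -> {result}")
    # Only now, with every lookup attempted, does the ticket change queues.
    return {**ticket, "notes": notes, "queue": "soc-analyst"}
```

Returning a new ticket rather than editing the original keeps the temporary-queue copy untouched until the enriched version is ready.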

Summary

Security automation is a tool that assists your SOC analysts and allows them to be more effective with their work. In my opinion, it is not designed to be a replacement for an analyst. We invest in automation technology to make us more efficient at our jobs, and we are going to be required to make decisions where a machine cannot. I don’t want to focus directly on best practices for writing automation playbooks, but more of the overall process and how it relates to the SOC. With that in mind, I want to leave you with a few tips for success.

If you have not already begun your automation journey, talk with your team about the benefits of security automation. Get everyone on board with the idea and comfortable with how you envision the playbooks working for the team:

  1. Do a full inventory of the tasks your SOC performs. Break them down by the time and complexity required to complete them.
  2. Define your use cases before automating any actions. Focus initially on tasks that are simple and can be completed quickly. This will provide you with some quick wins.
  3. Don’t write long complicated playbooks. Break them down to specific tasks as much as possible. You can use a parent playbook to call multiple child playbooks.
  4. Don’t be afraid to challenge the status quo. When you start automating processes, you may discover a new and better way to do something. Embrace these efficiencies, and automation will show its value to your organization.
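Tip 3 can be illustrated with a parent playbook that calls small child playbooks one after another, each waiting for the previous one to finish. The function names, the event format, and the `known_bad` set are all hypothetical.

```python
from typing import Callable, Dict, Iterable

Playbook = Callable[[Dict], Dict]


def run_parent_playbook(event: Dict, children: Iterable[Playbook]) -> Dict:
    """Call each child in order, passing along the growing context."""
    context = dict(event)
    for child in children:
        context = child(context)  # synchronous: next child waits for this one
    return context


def extract_hash(context: Dict) -> Dict:
    # Assumes the hash is the last word of the raw alert text.
    return {**context, "file_hash": context["raw"].split()[-1]}


def check_reputation(context: Dict) -> Dict:
    known_bad = {"abc123"}  # placeholder for a real reputation lookup
    verdict = "malicious" if context["file_hash"] in known_bad else "unknown"
    return {**context, "verdict": verdict}


def decide(context: Dict) -> Dict:
    action = "escalate" if context["verdict"] == "malicious" else "analyst-review"
    return {**context, "action": action}


result = run_parent_playbook({"raw": "AV detection hash abc123"},
                             [extract_hash, check_reputation, decide])
print(result["action"])
```

Each child does one thing, so it can be reused by other parent playbooks or replaced without touching the rest.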

While security automation may be in its infancy, there is much that can be done to improve the operations within your SOC. I hope I was able to provide some insight into why you need to begin automating sooner rather than later. I have highlighted a number of reasons for automating and provided some possible use cases for quick wins. Take the lead, and show the rest of your team that automation is not a limitation but a force multiplier that will help you all become better analysts.

ARTICLE QUIZ (ANSWERS FOLLOW)

_______ is the machine implementation of low-level security-related actions which are smaller pieces of a larger task.

Ⓐ SOC Automation

Ⓑ Process

Ⓒ Orchestration

Ⓓ Inventory

_______ takes advantage of multiple automation tasks across multiple systems or platforms.

Ⓐ Automation

Ⓑ Process

Ⓒ Orchestration

Ⓓ Inventory

A _______ is made up of a number of actions that are fully or partially automated while a _______ encompasses a number of the former.

Ⓐ process, task

Ⓑ task, process

Ⓒ process, response

Ⓓ response, task

All the following are true regarding automation except:

Ⓐ It will replace analysts in the next five years.

Ⓑ It streamlines existing processes.

Ⓒ It frees up analysts from monotonous tasks.

Ⓓ It manages the flood of events coming in daily.

All the following are reasons to implement SOC automation except:

Ⓐ Reduce analyst fatigue

Ⓑ Reduce mistakes

Ⓒ Reduce productivity

Ⓓ Reduce labor hours to increase skilled training

Which of the following is true regarding how to start automating the Security Operations Center (SOC)?

Ⓐ Start with complex changes

Ⓑ Someone who is intimately familiar with the Security Operations Center (SOC) processes and procedures should start by taking an inventory of the SOC tasks.

Ⓒ Figure out who to fire first.

Ⓓ Make tasks more complicated than they should be.

All of the following are true about playbooks except:

Ⓐ They can be small.

Ⓑ They can call other playbooks synchronously.

Ⓒ They’re only used in fantasy football.

Ⓓ They should not cause incorrect or damaging actions.

ARTICLE QUIZ SOLUTIONS

_______ is the machine implementation of low-level security-related actions which are smaller pieces of a larger task.

Ⓐ SOC Automation

SOC Automation is the machine implementation of low-level security-related actions which are smaller pieces of a larger task.

_______ takes advantage of multiple automation tasks across multiple systems or platforms.

Ⓒ Orchestration

Orchestration takes advantage of multiple automation tasks across multiple systems or platforms.

A _______ is made up of a number of actions that are fully or partially automated while a _______ encompasses a number of the former.

Ⓑ task, process

A task is made up of a number of actions that are fully or partially automated and a process encompasses a number of tasks.

All the following are true regarding automation except:

Ⓐ It will replace analysts in the next five years.

The claim that automation will replace analysts in the next five years is not entirely true. While SOC automation aims to reduce the amount of manual labor, SOC automation should be a springboard that frees up an analyst to work on more challenging tasks, preparing them to move out of the SOC into more advanced roles or to become a SOC Automation Engineer responsible for automating SOC Analyst tasks. A smaller number of SOC analysts will always be needed to review the SOC automation’s work, assist in the SOC automation efforts, and handle exceptions.

All the following are reasons to implement SOC automation except:

Ⓒ Reduce productivity

Reducing productivity is not a reason to implement SOC automation.

Which of the following is true regarding how to start automating the Security Operations Center (SOC)?

Ⓑ Someone who is intimately familiar with the Security Operations Center (SOC) processes and procedures should start by taking an inventory of the SOC tasks.

Someone who is intimately familiar with the Security Operations Center (SOC) processes and procedures should start by taking an inventory of the SOC tasks.

All of the following are true about playbooks except:

Ⓒ They’re only used in fantasy football.

There are many constructive uses for playbooks other than in fantasy football, including in SOC Automation.

Tyler Wall is the founder of Cyber NOW Education. He holds bills for a Master of Science from Purdue University, and also CISSP, CCSK, CFSR, CEH, Sec+, Net+, A+ certifications. He mastered the SOC after having held every position from analyst to architect and is the author of three books, 100+ professional articles, four online courses, and regularly holds webinars for new cybersecurity talent.

You can connect with him on LinkedIn.

Get 20% off all courses in our On-Demand catalog with coupon code “Welcome20”

Download the Azure Security Labs eBook from the Secure Style Store. These labs walk you through several hands-on fun labs in Microsoft Azure, leaving you with the know-how to create a gig in Fiverr or Upwork to start your cybersecurity freelancing.

Also available in the Secure Style Store, download the Job Hunting Application Tracker for FREE to keep track of all your job applications.

Check out my latest book Jump-start Your SOC Analyst Career: A Roadmap to Cybersecurity Success published June 1st, 2024 and winner of the 2024 Cybersecurity Excellence Awards.
