Ransomware remains at the forefront of criminal activity, with criminal organizations seeing no organization as off-limits, as evidenced by the recent attacks against the UK’s NHS. Total payments to ransomware gangs in 2023 surpassed $1 billion for the first time in history. Security teams need a consistent and effective way of preventing, containing, and recovering from ransomware. In this article I’ll outline what an incident response (IR) process for ransomware looks like with security automation underpinning it.
End-to-End Incident Response
The incident response process starts at the alert stage. With a daily deluge of alerts in the typical SOC, a true IR solution needs to shrink your investigation team’s queue of events and increase the accuracy of each incident they look at. That’s why our IR process starts with alert management, specifically deduplication and correlation. Our first set of workflows is designed to check that incoming alerts are unique. Alerts that are not unique are either dismissed or correlated to their related incident.
At the incident level, our workflows and case management are built to help with enrichment, classification, containment, analysis, eradication, communication and coordination, recovery, and reporting. All of these are key to ensuring the incident has been dealt with and your systems hardened to make future breaches more difficult.
Estimated Time Savings
By automating these processes, a security team can save significant amounts of time:
- Deduplication and Correlation: From hours to minutes per incident.
- Enrichment: Several hours of manual data gathering reduced to a few minutes.
- Classification and Containment: Immediate containment actions save critical time, potentially reducing response time by several hours.
- Analysis and Eradication: Automated analysis and eradication steps can save days of manual effort.
- Recovery and Reporting: Automated recovery processes and instant report generation can save additional hours to days.
Overall, using the SOAR playbook can transform a multi-day response process into a streamlined effort that may take only a few hours or less. The tools used in this example can be swapped for other tools you may be using.
Stage 1: Deduplication
Alerts that are ingested into a SOAR tool are not always new alerts. Sometimes a data source updates an alert or ticket, so we need a way to identify when an ingested alert is actually a revised version of an existing incident in the platform. Typically the original data source will maintain a unique identifier within the alert, such as an alert ID. In Smart SOAR, event playbooks are used to check for duplicates on ingestion. That way the duplicate never hits the analysts’ queue and incident reporting numbers aren’t inflated by repeated alerts.
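To make the duplicate check concrete, here is a minimal Python sketch of deduplication keyed on the source's alert ID. It is an illustration only, not how Smart SOAR event playbooks are implemented; the `Alert` and `DedupIndex` names and their fields are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    source: str      # name of the originating tool (hypothetical field)
    alert_id: str    # unique identifier assigned by that tool
    payload: dict    # remaining alert content


@dataclass
class DedupIndex:
    """Tracks which (source, alert_id) pairs have already been ingested."""
    seen: dict = field(default_factory=dict)

    def ingest(self, alert: Alert) -> str:
        key = (alert.source, alert.alert_id)
        if key not in self.seen:
            self.seen[key] = alert.payload
            return "new"            # first time we see this alert
        if self.seen[key] == alert.payload:
            return "duplicate"      # exact repeat: dismiss
        self.seen[key] = alert.payload
        return "update"             # revised version: update the existing incident


index = DedupIndex()
first = index.ingest(Alert("edr", "A-100", {"severity": "medium"}))
repeat = index.ingest(Alert("edr", "A-100", {"severity": "medium"}))
revised = index.ingest(Alert("edr", "A-100", {"severity": "high"}))
print(first, repeat, revised)  # new duplicate update
```

Keying on the pair of source and alert ID, rather than the ID alone, avoids collisions when two tools happen to reuse the same identifier.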
Stage 2: Correlation
Because a SOAR tool is an aggregator of security alerts from every tool in a security team’s environment, it is up to the SOAR to determine links between these alerts and consolidate them within the same incident. Otherwise, teams might miss key relationships between suspicious activities.
Similar to deduplication, the correlation stage of a Smart SOAR event playbook triggers on ingestion. In the case of ransomware, you may want to search for existing incidents with the same source or destination IP, file hash, domains, URLs, device ID, or user account. It is reasonable to choose a timeframe of 24 hours to 7 days for this correlation search. Alerts containing similar artifacts beyond seven days can be linked with existing incidents but escalated as separate incidents to allow your security team to close out incidents until further evidence of a breach is received.
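The correlation search described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: alerts and incidents are plain dictionaries, the field names in `ARTIFACT_FIELDS` are hypothetical, and the seven-day window is just the upper bound suggested above.

```python
from datetime import datetime, timedelta

# Hypothetical normalized field names for the artifacts listed above.
ARTIFACT_FIELDS = ("src_ip", "dst_ip", "file_hash", "domain", "url", "device_id", "user")


def shared_artifacts(alert: dict, incident: dict) -> set:
    """Return the artifact fields whose values match in both records."""
    return {
        f for f in ARTIFACT_FIELDS
        if alert.get(f) is not None and alert.get(f) == incident.get(f)
    }


def correlate(alert: dict, incidents: list, window: timedelta = timedelta(days=7)):
    """Split matching incidents into 'merge' (inside window) and 'link' (outside)."""
    merge, link = [], []
    for inc in incidents:
        if not shared_artifacts(alert, inc):
            continue
        age = abs(alert["time"] - inc["time"])
        (merge if age <= window else link).append(inc["id"])
    return merge, link


incidents = [
    {"id": "INC-1", "time": datetime(2024, 6, 1, 9, 0), "file_hash": "abc123"},
    {"id": "INC-2", "time": datetime(2024, 5, 1, 9, 0), "src_ip": "10.0.0.5"},
    {"id": "INC-3", "time": datetime(2024, 6, 2, 9, 0), "user": "bob"},
]
alert = {
    "time": datetime(2024, 6, 3, 12, 0),
    "file_hash": "abc123",
    "src_ip": "10.0.0.5",
    "user": "alice",
}
merge, link = correlate(alert, incidents)
print(merge, link)  # ['INC-1'] ['INC-2']
```

Here `INC-1` shares a file hash within the window, so the alert would be consolidated into it, while `INC-2` shares a source IP but is over a month old, so it is only linked and the alert is escalated separately.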
Stage 3: Enrichment
Enrichment in SOAR is artifact-based. Since alerts are normalized on ingestion, all artifacts are stored in standard fields and can be automatically enriched. Common artifacts for ransomware alerts include file, network, process, system, and access indicators. The enrichment stage is meant to accomplish three goals:
- Enhance the quality and context of the alert data.
- Automate the enrichment process to provide sufficient information for initial classification.
- Prepare the alert for more in-depth incident analysis, if needed.
Enrichment Playbook Sections and Tasks
Threat Intelligence
- File Hash Lookup: Check file hashes against threat intelligence databases and malware repositories (e.g., VirusTotal, ReversingLabs).
- IP and Domain Reputation: Query IP and domain reputation services (e.g., AlienVault OTX, Cisco Talos) for known malicious activity.
- URL Analysis: Analyze URLs using web reputation services (e.g., Google Safe Browsing, Web of Trust).
- Process and Scheduled Task Verification: Verify process names and scheduled tasks against known malicious patterns or signatures.
Internal Data Correlation
- Historical Data Check: Search internal logs and historical incident data for matches to extracted artifacts.
- Automated Searches: Perform automated searches across SIEM logs, IAM logs, and other data sources.
Contextual Enrichment
- Asset Information: Retrieve asset details for affected systems (e.g., owner, criticality, location) from the CMDB.
- User Context: Pull user information for involved accounts (e.g., role, department, recent activity) from the identity management system.
- Network Context: Obtain network topology and segment information to understand the potential impact and spread.
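Because enrichment is artifact-based, the playbook sections above can be thought of as a dispatch from artifact type to lookup. The Python sketch below shows that idea only. The lookup functions are hypothetical stubs with hard-coded data standing in for real integrations (threat intelligence services, the CMDB, the identity system), and none of the names reflect a real product API.

```python
from typing import Callable


# Hypothetical stand-ins for real integrations; data is hard-coded for the example.
def lookup_hash(value: str) -> dict:
    known_bad = {"abc123"}
    return {"malicious": value in known_bad}


def lookup_asset(value: str) -> dict:
    cmdb = {"srv-db-01": {"owner": "dba-team", "criticality": "high"}}
    return cmdb.get(value, {"owner": "unknown", "criticality": "unknown"})


# Map each normalized artifact type to the enricher that handles it.
ENRICHERS: dict[str, Callable[[str], dict]] = {
    "file_hash": lookup_hash,
    "device_id": lookup_asset,
}


def enrich(artifacts: dict) -> dict:
    """Run the enricher registered for each artifact type; skip unknown types."""
    results = {}
    for kind, value in artifacts.items():
        enricher = ENRICHERS.get(kind)
        if enricher is not None:
            results[kind] = {"value": value, **enricher(value)}
    return results


report = enrich({"file_hash": "abc123", "device_id": "srv-db-01", "note": "ignored"})
print(report)
```

Adding a new enrichment source in this design means registering one more function in `ENRICHERS`, which mirrors how normalized fields let every alert type reuse the same enrichment tasks.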
Display
All enriched, relevant information is organized and displayed to the investigator in the Incident’s Investigation window. The investigation window holds all findings, data correlations, indicators of attack (IoA), and indicators of compromise (IoC), centralizing all key information for the investigation team.
Stage 4: Classification
Signs of ransomware may include files with unfamiliar extensions on important databases, such as ‘.encrypted’, ‘.locky’, or ‘.crypt’. Look out for a ransom note named ‘READ_ME.txt’ in various directories. Other indicators may be unusual network activity from internal IPs to external IPs on high-numbered ports, and system logs showing unauthorized access attempts and use of administrative privileges on affected machines.
Inside the incident overview, a Pending Task marked ‘Classification’ is available and tagged as required in order for the incident to move to the next phase.
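The file-based signs mentioned in this stage lend themselves to a simple automated check that gives the analyst evidence for the classification task. The Python sketch below is an assumption-laden illustration: the extension and ransom-note lists contain only the examples named above, the function name is hypothetical, and real ransomware families use many other names.

```python
from pathlib import PurePosixPath

# Only the examples from the text; a real list would be far longer.
SUSPICIOUS_EXTENSIONS = {".encrypted", ".locky", ".crypt"}
RANSOM_NOTE_NAMES = {"read_me.txt"}


def ransomware_indicators(file_paths: list) -> dict:
    """Group file paths by the ransomware indicator they match."""
    hits = {"suspicious_extensions": [], "ransom_notes": []}
    for path in file_paths:
        p = PurePosixPath(path)
        if p.suffix.lower() in SUSPICIOUS_EXTENSIONS:
            hits["suspicious_extensions"].append(path)
        if p.name.lower() in RANSOM_NOTE_NAMES:
            hits["ransom_notes"].append(path)
    return hits


observed = [
    "/data/db/customers.mdf.locky",
    "/data/db/READ_ME.txt",
    "/home/alice/report.docx",
]
hits = ransomware_indicators(observed)
print(hits)
```

A check like this only surfaces indicators. The true-positive decision still belongs to the analyst completing the required Classification task.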
Stage 5: Containment
Once an alert is classified as a true positive, the containment path triggers to stop the spread of the ransomware and limit its impact on the machines it has infected. The workflow is designed to accomplish five goals:
- Disconnect Affected Systems: Isolate the affected systems from the network to prevent further spread.
- Segregate Networks: Move unaffected systems to a segregated network to protect them from potential spread.
- Block Suspicious IPs and Domains: Update firewall rules to block communication with known malicious IPs and domains associated with the ransomware.
- Stop Processes: Identify and terminate malicious processes on affected systems.
- Calculate Time to Contain: Take the difference between the time of detection and the completion time of this workflow to track how long it took to contain the breach.
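The five goals above can be sketched as an ordered workflow that ends with the Time to Contain calculation. This Python example is a sketch under assumptions: each action is a hypothetical stub where a real playbook would call an EDR or firewall integration, and the `clock` parameter exists only so the example is deterministic.

```python
from datetime import datetime, timedelta


def run_containment(steps, detected_at, clock=datetime.now):
    """Run containment steps in order, then report time to contain.

    `steps` is a list of (name, action) pairs; each action is a hypothetical
    callable wrapping a real integration (EDR isolation, firewall update, ...).
    """
    completed = []
    for name, action in steps:
        action()                # an exception here stops the workflow
        completed.append(name)
    contained_at = clock()      # workflow completion time
    return completed, contained_at - detected_at


# Stub actions stand in for real integrations in this example.
steps = [
    ("isolate_hosts", lambda: None),
    ("segregate_networks", lambda: None),
    ("block_iocs", lambda: None),
    ("stop_processes", lambda: None),
]
detected = datetime(2024, 6, 3, 12, 0)
fixed_clock = lambda: datetime(2024, 6, 3, 12, 18)  # deterministic for the example
done, ttc = run_containment(steps, detected, clock=fixed_clock)
print(done, ttc)
```

With detection at 12:00 and workflow completion at 12:18, the Time to Contain in this example is 18 minutes.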
The IR team would likely perform several critical tasks post-containment to ensure that the transition to recovery is smooth, thorough, and minimizes the risk of reinfection or further damage. These tasks typically fall under the analysis and eradication phases, which are crucial for understanding the incident’s scope and completely removing the threat from the environment.
Stage 6: Incident Analysis
The goal of this phase is to understand the full extent of the ransomware attack and gather forensic evidence. Now that the malware has been contained, security automation tools can coordinate with malware analysis tools to automate the upload and retrieval of detailed static and dynamic malware analysis reports. The playbook also summarizes the information collected from the enrichment and containment stages to communicate the extent of the attack.
Similar to the enrichment stage, the information gathered in this stage is presented to the user in the investigation tab in an easy-to-read format. Recommendations can be added on the right-hand side either manually or through an integrated automation with a large language model.
Stage 7: Eradication
The eradication phase is a critical part of our SOAR playbook for ransomware, ensuring that all traces of the malware are completely removed from affected systems. Here’s a detailed breakdown of the steps involved in the eradication process, as depicted in our workflow diagram.
The first step focuses on deleting malicious files. Using CrowdStrike Falcon, we automatically remove any ransomware and associated malicious files from the compromised devices.
Next, we address compromised user accounts. Through Microsoft Entra ID, we enforce a password reset for all affected accounts, mitigating the risk of attackers using stolen credentials to regain access. To further secure the network, we revoke all active sign-in sessions for these accounts, ensuring that any ongoing malicious activities are terminated and requiring re-authentication with the new credentials.
Following this, we initiate a full scan on all affected endpoints using CrowdStrike Falcon. This scan goes beyond the initial malware removal to ensure that no residual threats remain. By checking for hidden or dormant malware, we add an extra layer of security.
Finally, we retrieve and review the results of the full system scans. These comprehensive scan results are analyzed to confirm the successful removal of all malicious files. This crucial verification step validates the eradication process, ensuring that the systems are thoroughly clean and secure.
[Figure: CrowdStrike scan results displayed to users in the Investigation summary.]
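The final verification step, reviewing the full scan results, can be expressed as a small check. The Python sketch below assumes each scan result has already been normalized into a dictionary with `host`, `status`, and `detections` keys. Those keys are hypothetical and do not represent the CrowdStrike Falcon API response format.

```python
def eradication_verified(scan_results: list) -> tuple:
    """Return (all_clean, hosts_needing_attention).

    A host is flagged if its scan did not complete or reported any detection.
    """
    flagged = [
        r["host"] for r in scan_results
        if r["status"] != "completed" or r["detections"] > 0
    ]
    return (not flagged, flagged)


# Hypothetical normalized scan results for three endpoints.
results = [
    {"host": "ws-014", "status": "completed", "detections": 0},
    {"host": "ws-027", "status": "completed", "detections": 2},
    {"host": "srv-db-01", "status": "failed", "detections": 0},
]
all_clean, flagged = eradication_verified(results)
print(all_clean, flagged)  # False ['ws-027', 'srv-db-01']
```

Treating an incomplete scan the same as a detection is a deliberately conservative choice: an endpoint is not considered clean until a scan has actually finished without findings.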
By following these detailed steps, organizations can be confident that ransomware has been removed from their devices. Each step leverages SOAR to speed up the process while ensuring accuracy and consistency in the response efforts. This approach significantly reduces the risk of reinfection and helps remove a threat quickly.
Stage 8: Recovery
The recovery stage is crucial for restoring normal operations. It includes bringing systems back online, updating security measures, and validating the recovery process. Here’s a detailed overview of the steps in the recovery phase, as depicted in our workflow diagram.
We start by unblocking IP addresses that were blocked to prevent the spread of ransomware. Using FortiGate, we review and unblock IP addresses that are verified to be safe. This step restores normal network traffic and access, ensuring legitimate communication resumes without disruption.
Next, we reconnect endpoints that were isolated during the containment and eradication phases. Using CrowdStrike Falcon, we confirm that all affected endpoints are free of malware and securely reintegrate them into the network. This ensures business operations can resume without the risk of reinfection.
Lastly, we update network segmentation to strengthen security and prevent future incidents. Through FortiGate, we reconfigure segmentation policies to better isolate critical systems and sensitive data. This proactive measure reduces the attack surface and enhances containment of any future incidents.
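The gating logic in the first two recovery steps (unblock only verified-safe IPs, reconnect only clean endpoints) can be sketched as a planning function. This Python example is illustrative only: the inputs are plain lists and dictionaries, the function name is hypothetical, and the actual unblock and reconnect actions would go through your firewall and EDR integrations.

```python
import ipaddress


def recovery_plan(blocked_ips: list, verified_safe_ips: list, endpoints: dict) -> dict:
    """Decide which IPs to unblock and which endpoints to reconnect.

    `endpoints` maps a hostname to True if its post-eradication scan was clean.
    """
    # Parse addresses so that equivalent spellings compare equal.
    safe = {ipaddress.ip_address(ip) for ip in verified_safe_ips}
    unblock = [ip for ip in blocked_ips if ipaddress.ip_address(ip) in safe]
    reconnect = [host for host, clean in endpoints.items() if clean]
    hold = [host for host, clean in endpoints.items() if not clean]
    return {"unblock": unblock, "reconnect": reconnect, "hold": hold}


plan = recovery_plan(
    blocked_ips=["203.0.113.7", "198.51.100.20"],
    verified_safe_ips=["198.51.100.20"],
    endpoints={"ws-014": True, "ws-027": False},
)
print(plan)
```

Anything not explicitly verified stays blocked or isolated, so a missing verification result fails safe rather than reopening a path for reinfection.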
Reporting
In Smart SOAR, reporting is captured in three ways:
- MITRE ATT&CK TTPs
- KPIs
- Incident reports
In this example, the incident has been tagged with an exfiltration label and added to the SOC-wide dashboard that monitors tactics, techniques, and procedures (TTPs).
KPIs are automatically tracked throughout the incident and included in the incident overview. KPIs include Time to Respond, Time to Contain, and Time to Close. These metrics can be customized depending on your reporting requirements and are all managed within the playbook engine itself.
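As a rough illustration of how these KPIs fall out of incident timestamps, here is a short Python sketch. The KPI definitions in the docstring are assumptions, as are the timeline key names. Smart SOAR's own definitions may differ and, as noted above, can be customized.

```python
from datetime import datetime


def incident_kpis(timeline: dict) -> dict:
    """Compute KPI durations in minutes from incident timestamps.

    Assumed definitions (adjust to your reporting requirements):
      time_to_respond = first response action - detection
      time_to_contain = containment complete  - detection
      time_to_close   = incident closed       - detection
    """
    detected = timeline["detected"]

    def minutes_since_detection(event: str) -> float:
        return (timeline[event] - detected).total_seconds() / 60

    return {
        "time_to_respond": minutes_since_detection("responded"),
        "time_to_contain": minutes_since_detection("contained"),
        "time_to_close": minutes_since_detection("closed"),
    }


kpis = incident_kpis({
    "detected": datetime(2024, 6, 3, 12, 0),
    "responded": datetime(2024, 6, 3, 12, 5),
    "contained": datetime(2024, 6, 3, 12, 18),
    "closed": datetime(2024, 6, 3, 16, 0),
})
print(kpis)  # {'time_to_respond': 5.0, 'time_to_contain': 18.0, 'time_to_close': 240.0}
```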
Finally, a comprehensive incident report can be exported automatically and includes all activities, findings, IoCs and more.
Communication & Coordination
Before we end, I’d like to note the communication and coordination capabilities within Smart SOAR. Stages in the workflow can be assigned to users in a given role, so when they open the incident they can review and respond to the tasks relevant to them. Whether the assignee is the firewall team, legal, or an end user, access can be scoped so that the incident is addressed appropriately and no team member can take actions beyond those required of them.
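The role-scoped access described above amounts to filtering tasks by the roles allowed to act on them. The Python sketch below shows the idea with hypothetical task and role names. It is not a representation of Smart SOAR's permission model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    allowed_roles: frozenset  # roles permitted to view and complete the task


def visible_tasks(role: str, tasks: list) -> list:
    """Return the names of the tasks a given role may act on."""
    return [t.name for t in tasks if role in t.allowed_roles]


# Hypothetical tasks and roles for a ransomware incident.
tasks = [
    Task("Approve firewall block list", frozenset({"firewall_team"})),
    Task("Review breach notification wording", frozenset({"legal"})),
    Task("Confirm device is back online", frozenset({"end_user", "ir_team"})),
]
print(visible_tasks("legal", tasks))      # ['Review breach notification wording']
print(visible_tasks("end_user", tasks))   # ['Confirm device is back online']
```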
Inside the incident workspace, you can also create ad hoc tasks. These are tasks that aren’t defined in the incident playbook but are still required to be completed. For example, the incident response team can assign the patching of an endpoint to their device manager to ensure actions are taken promptly and are tracked in the ticket.
Additionally, the internal messaging system can be used to coordinate with teammates outside of the ticket itself.
Takeaways
Automation ensures that the response is consistent, reducing the risk of human error and ensuring all steps are completed accurately every time. By automating repetitive tasks, skilled analysts can focus on more complex and strategic activities, improving overall efficiency and effectiveness. This workflow can handle large volumes of alert and incident data simultaneously, making it easier to scale security operations as the organization grows.
Metrics such as Time to Respond, Time to Contain, and Time to Close are tracked automatically, which makes improvements measurable and demonstrates the effectiveness of security operations to stakeholders.
Faster containment and eradication reduce the overall impact on business operations, minimizing downtime and associated costs.
Having a SOAR playbook for ransomware offers substantial time savings, enhanced efficiency, and numerous strategic benefits that are crucial for a security operations manager looking to optimize their incident response processes.