When a security incident occurs, it must be resolved as quickly as possible to protect the proper functioning of the organization. In large environments, multiple incidents occur per day, but a solid layered defense, a sound SOC operation that generates and handles the necessary alerts, and an Incident Response operation that quickly eliminates incidents with impact usually keep the effect on daily operations limited. Because defenses are layered, a partial bypass usually does not let the attacker complete the attack chain, so the impact remains limited; such a bypass must still be investigated to determine whether the security controls involved can be improved. However, once all layers of defense and detection have failed and either critical business systems or a multitude of less important systems have been impacted, a trained, multi-disciplinary team is the key to recovery.
After the Detection/Identification phase, that is, once it has been determined that the enterprise systems are being used illegitimately, a strategic battle with a partially invisible enemy ensues. As in classic combat, our choices, resilience and striking power will decide the outcome. When a major incident is declared, one has a limited amount of information and a multitude of pressing questions. Making decisions based on limited information is inherently flawed, but it is the reality of most incidents. The first priority is a good understanding of the situation. What impact has already been identified, on which systems, and what is the function of these systems within the business? Who is responsible for these systems and the underlying processes, and who are the key users and consumers? Based on this information, an incident response team is assembled that has the knowledge needed to coordinate the further course of the incident, based on the intelligence gathered and the insights it brings. An essential member of such a team is always at least one incident responder who knows how threat actors operate and why, and who can connect the facts at hand from that perspective. While the team is being assembled, the incident responders search for additional information and perform the first investigative actions to create context and build a narrative. Essential here are the identification of patient zero and of the infection and spread pathway. If these are not known, an important blind spot exists, which will have to be taken into account during the rest of the incident process.
The first objective of the Incident Response team is to develop a containment strategy. Based on the collected information, additional investigations are ordered (for example, to support certain assumptions with facts) and the team identifies what is needed to stop the threat and its impact as quickly as possible. It is important that containment is coordinated and applied at all layers simultaneously, so that the threat actors are not provoked into desperate actions once they realize their presence has been detected. New information may come to light at any time, so we must always be ready to adjust our strategy. Typical actions in the containment phase include system or network isolation, removing the vulnerabilities used in the attack, configuring additional detection methods and indicators in the security infrastructure, and resetting the passwords of abused accounts, but also securing the backups of impacted systems and communicating with affected parties.
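The idea of applying containment at all layers at once can be sketched in a few lines of Python. This is a minimal illustration, not a real playbook: the action functions (`isolate_host`, `reset_password`, `secure_backup`) and the target names are hypothetical placeholders that only return a status string, where a real environment would call its own EDR, identity and backup tooling.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical containment actions (assumptions for illustration only).
# Each one stands in for a call to real tooling and just reports what it did.
def isolate_host(target: str) -> str:
    return f"isolated {target}"

def reset_password(target: str) -> str:
    return f"reset password for {target}"

def secure_backup(target: str) -> str:
    return f"secured backup of {target}"

def run_containment(plan):
    """Submit every (action, target) pair at once; return results in plan order."""
    with ThreadPoolExecutor(max_workers=max(1, len(plan))) as pool:
        futures = [pool.submit(action, target) for action, target in plan]
        return [future.result() for future in futures]

# Hypothetical plan covering the network, identity and backup layers.
plan = [
    (isolate_host, "srv-01"),
    (reset_password, "svc-account"),
    (secure_backup, "srv-01"),
]
results = run_containment(plan)
for line in results:
    print(line)
```

Keeping the plan as plain data makes it easy to review and adjust before execution, which matters because the strategy may change as new information arrives.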
Once containment is accomplished and telemetry confirms that it was successful, the task of eradicating the impact (and thus the presence of the threat actors) begins. Where speed is essential in the containment phase, thoroughness is essential in this one. We don't want to leave a backdoor, scheduled task or malicious user account active that lets the attacker return later or relaunches the malware. All specialists on the team are assigned research tasks in their area of expertise to maximize the probability of detecting abnormal elements. If any uncertainties or assumptions remain at the base of our strategy, these avenues should also be examined. This is also the phase in which the integrity of impacted data and systems is evaluated to decide whether they should, and can, be restored from a previous backup.
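One way to support this hunt for leftover accounts and scheduled tasks is to compare the current state of a system against a known-good baseline. The sketch below assumes such a baseline exists and that both inventories have already been collected; the category names and entries are hypothetical sample data.

```python
def find_unexpected(baseline, current):
    """Return, per category, the items present now but absent from the baseline."""
    findings = {}
    for category, items in current.items():
        extra = items - baseline.get(category, set())
        if extra:
            findings[category] = extra
    return findings

# Hypothetical known-good inventory taken before the incident.
baseline = {
    "accounts": {"alice", "bob"},
    "scheduled_tasks": {"nightly-backup"},
}

# Hypothetical inventory collected during eradication.
current = {
    "accounts": {"alice", "bob", "svc_update"},
    "scheduled_tasks": {"nightly-backup", "updater"},
}

findings = find_unexpected(baseline, current)
for category, items in sorted(findings.items()):
    print(category, sorted(items))
```

Anything the comparison flags is only a lead for a specialist to examine, not proof of malicious activity, and an attacker who modified an existing item would not show up in a simple set difference.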
Once the impact has been removed, what remains is to return as closely as possible to the state that existed before the incident began. We do this by restoring any impacted (and recoverable) data and by continuing to monitor the environment using all the intelligence gathered in the previous phases. Once we decide that we are sufficiently certain the incident has ended (which may take a long time), we enter the final phase of the incident.
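Monitoring with the intelligence gathered earlier can be as simple as checking new events against the collected indicators. The following is a minimal sketch under stated assumptions: the event fields (`host`, `dst_ip`, `sha256`) and the indicator values are hypothetical, and the IP addresses come from ranges reserved for documentation.

```python
def match_indicators(events, indicators):
    """Return the events whose destination IP or file hash is a known indicator."""
    return [
        event
        for event in events
        if event.get("dst_ip") in indicators or event.get("sha256") in indicators
    ]

# Hypothetical indicators collected during the earlier phases.
indicators = {"203.0.113.7", "hypothetical-sha256-value"}

# Hypothetical events observed after recovery.
events = [
    {"host": "ws-12", "dst_ip": "198.51.100.20"},
    {"host": "srv-01", "dst_ip": "203.0.113.7"},
    {"host": "ws-03", "sha256": "hypothetical-sha256-value"},
]

hits = match_indicators(events, indicators)
for hit in hits:
    print("indicator seen on", hit["host"])
```

A hit during this phase suggests that containment or eradication was incomplete and that the strategy needs to be revisited.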
After the fact, we analyze the whole process, dare to look into our own hearts and admit our flaws. While the incident is in progress there should be no time for finger-pointing or looking for scapegoats; this adds no value at that moment. However, it is important that everyone can learn from the mistakes that were made, both during the incident and in the run-up to it. Therefore, a root cause analysis and a lessons-learned analysis are performed as standard. These then form the input for other processes, to ensure that this fox isn't caught twice in the same snare.
The above describes what happens when the Incident Response team works on a (major) incident. However, the most important phase has not yet been mentioned. It is also the phase in which the incident responders work when they are not handling an actual case: 'Preparation'. Preparation is essential, and the only way to expect a good response is to give the IR team the chance to prepare and practice. A security Incident Response Plan contains the policies that ensure all potential stakeholders are aware in advance of the IR team's modus operandi, powers and limitations. This plan should be communicated to the entire organization. Tabletop exercises, which ideally also involve non-IR services, ensure that a well-oiled machine is ready when the need is dire and the organization's reputation and/or value are at stake.
© g3rt