No matter how well you manage your security posture, there is always a chance that you will become a victim of a cyber attack. That is why every organization, no matter the size, should be prepared to react to a cyber incident. The key element of such preparation is a cyber incident response plan (IRP).
Elements of a cybersecurity incident response plan
When building your IR plan, there are many elements to consider, and each of them is equally important. If any of these elements is ignored, it may be impossible to react efficiently, which could cause chaos in the organization and, in turn, have a severe impact on business operations, information security, and more.
Incident response team
It is not optimal for any organization to have a separate team on standby, waiting for an incident. Therefore, when building a computer security incident response team (CSIRT), you must include existing human resources. Such a team is assembled only in the case of an incident but each resource must be aware of their role in the team and the impact that it will have on their everyday work.
- Decision-makers: The key resources in a CSIRT are the key stakeholders: people who are able to make decisions. This means that your team must include top management and possibly company executives. It is quite common for an incident response team to be led by the chief information security officer (CISO), the chief security officer (CSO), the chief information officer (CIO), or even the chief technology officer (CTO). However, depending on the organizational structure, the team might also involve the chief operations officer (COO) and the chief executive officer (CEO). When reacting to an incident, promptness is key. Therefore, decisions must be made quickly and cannot be challenged.
- Technical resources: A cyber incident response team must include people who are able to investigate the incident and identify the root cause, work with technical assets, as well as restore/repair affected systems and other assets and prevent further damage. This means that the team must involve your security operations center but also system administrators, IT operations, and in some cases even developers. Since this is the personnel that will be handling most of the work involved, they must be aware of priorities and task assignments. You must consider the impact of this on business continuity. For example, your team must still be able to maintain and secure unaffected systems so that your business does not come to a complete standstill until the incident is resolved.
- Legal and compliance resources: In many organizations, a cyber incident might involve sensitive data and therefore have legal consequences as well as affect compliance with GDPR, PCI DSS, HIPAA, and more. Therefore, representatives of your legal and compliance departments must also be involved in the CSIRT for the purposes of risk assessment (not limited to security risks). Just as in the case of technical resources, they must be aware of priorities. However, to maintain business continuity, it might be impossible for them to dedicate their full attention to the incident.
- Communications: Almost every cyber incident will in some way affect external parties, for example, your customers, your partners, or the general public (depending on the nature of your organization). Therefore, your incident response team has to include resources from your customer service department, public relations, account management, and more. Note that clear communication involving public disclosure (including technical details) is good practice and helps your brand image.
- External resources: You might consider involving external resources such as forensic experts, risk management analysts, and more. If so, you must select and build a relationship with such parties before an incident occurs, so that they are ready to help when needed. This might involve additional contracts or agreements that need to be in place continuously.
Independent of whether the resources involved in your CSIRT are internal or external, you must consider the following factors:
- Responsibilities: Every responder involved in the incident response team must clearly know the scope of their roles, responsibilities, and priorities in relation to their everyday work. Responsibilities must not clash and if external resources are involved, they should have a go-to internal contact if internal business decisions are required.
- Contact information: Incidents may occur outside business hours and usually require real-time response. You cannot afford to wait with containment till the next business day because the criminal may take that time to wreak even more havoc. Therefore, for effective incident response, you must have out-of-office contact information for every resource involved and the resources must be aware of the fact that in the case of a cyber incident, they will be contacted outside business hours.
- Backup resources: For every key team member, you must have a backup. You cannot afford to wait until, for example, your team manager is back from vacation.
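The contact and backup requirements above lend themselves to an automated check. The following is a minimal sketch, assuming the roster is kept as simple records. The `Responder` fields, the `roster_gaps` helper, and the sample names and phone numbers are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Responder:
    name: str
    role: str                          # e.g. "decision-maker", "technical", "legal"
    after_hours_phone: Optional[str]   # out-of-office contact, None if missing
    backup: Optional[str] = None       # name of the designated backup responder


def roster_gaps(roster: List[Responder]) -> List[str]:
    """Return a list of problems: missing contacts or missing/unknown backups."""
    names = {r.name for r in roster}
    gaps = []
    for r in roster:
        if not r.after_hours_phone:
            gaps.append(f"{r.name}: no out-of-office contact")
        if r.backup is None:
            gaps.append(f"{r.name}: no backup assigned")
        elif r.backup not in names:
            gaps.append(f"{r.name}: backup {r.backup} is not on the roster")
    return gaps


# Hypothetical sample roster
roster = [
    Responder("Alice", "decision-maker", "+1-555-0100", backup="Bob"),
    Responder("Bob", "technical", None, backup="Carol"),
    Responder("Dana", "legal", "+1-555-0103"),
]

for gap in roster_gaps(roster):
    print(gap)
```

Running a check like this whenever the roster changes could surface gaps during preparation rather than in the middle of an incident.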
Since a cyber incident always involves some technical assets, their clear visibility is the key to an effective response. If assets are not well-defined, enumerated, and their relationship is not clear, it might be impossible to contain and fully resolve the incident.
- Asset identification: You should have a clear view of all your technical assets, both those within the company itself and the external ones. This is a good everyday practice, but it becomes even more important when the assets are affected by an incident.
- Asset relationships: Many technical assets are interconnected and therefore, a criminal might breach one of the assets and escalate to others. Depending on the technical structure of your business, potentially every asset might be affected by an incident and should be part of an investigation and remediation. For example, if a criminal accesses a web application via an SQL injection, they will most certainly access the database server (which may be a separate system), potentially reaching the operating system, and potentially using the internal network to access other systems. Understanding how assets are interconnected is of utmost importance.
- Asset ownership: Some of the technical assets that are interconnected might be outside of your business ownership. For example, you might be working with cloud service providers or partners. Your organization might also be divided into separate entities with different management. This is where the technical assets interweave with human resources and where you might have to consider technical aspects in the composition of your CSIRT. In the case of an incident, you cannot afford to suddenly discover that you are unable to contain or repair because you have no control over the asset. Every asset should have a well-defined responsible representative who has full control over it.
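Asset relationships and ownership can be modeled as a simple graph. The sketch below assumes a hand-maintained inventory that maps each asset to its owner and to the assets reachable from it. The asset names, owners, and the `potentially_affected` helper are hypothetical. It walks the connections breadth-first from a breached asset to list everything that may need to be investigated, then flags assets without a responsible owner.

```python
from collections import deque

# Hypothetical inventory: asset -> owner and directly connected assets
ASSETS = {
    "web-app":      {"owner": "web team",       "connects_to": ["db-server"]},
    "db-server":    {"owner": "dba team",       "connects_to": ["internal-net"]},
    "internal-net": {"owner": "it operations",  "connects_to": ["file-server"]},
    "file-server":  {"owner": None,             "connects_to": []},
    "crm-saas":     {"owner": "cloud provider", "connects_to": []},
}


def potentially_affected(start, assets):
    """Breadth-first walk over asset connections, starting at the breached asset."""
    seen = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in assets[current]["connects_to"]:
            if neighbor not in seen:
                seen.append(neighbor)
                queue.append(neighbor)
    return seen


scope = potentially_affected("web-app", ASSETS)
unowned = [asset for asset in scope if ASSETS[asset]["owner"] is None]

print("Investigate:", scope)
print("No responsible owner:", unowned)
```

In this made-up inventory, a breach of the web application puts the database server, the internal network, and the file server in scope, and the file server has nobody who could authorize containment.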
A security incident response plan might involve tools that must be identified as well as potentially purchased and implemented before any incidents happen and before you start incident response activities:
- Identification tools: There are many different IT security tools with different functionalities that might help you identify an incident. Examples include an intrusion detection system (IDS) to detect a possible intrusion, a vulnerability scanner to identify a vulnerability (although you should run one regularly anyway as part of your automation), manual penetration testing tools to confirm a vulnerability, as well as other threat detection, web security, network security, and security information and event management (SIEM) tools.
- Planning and modeling tools: You can use additional tools to model your asset structure, organize the activities during incident response, provide threat intelligence, follow a selected methodology, and more. Such tools may be project planning software and different types of modeling software.
- Communication tools: During incident response procedures, some of the regular business communication tools might be considered unsafe. For example, if an incident involves a breach of the internal email server, you cannot use internal email to communicate during incident response because there is a risk that the attacker will be aware of your activities and will be able to counteract them. Therefore, you should have a backup communication plan.
- Other tools: Other tools may also be involved. For example, meeting rooms might be considered a tool for the incident response team to work together. If you include external personnel, they must also be equipped with suitable tools and authorizations to access your systems and potentially, your premises.
Clear incident definition
Every organization may define types of incidents differently, depending on the business impact and other factors. For example, one organization might not consider a minor denial of service (DoS) attack to be a cyber incident because it does not affect business continuity, but for another organization, even an hour of unavailability might mean serious business consequences. Also, some organizations might consider minor internal security breaches as insider threat incidents and others might not (for example, an employee of one department accessing resources from another department, to which they should have no access). Other factors to consider might be the source of the attack (for example, a lone script kiddie vs. a criminal organization).
Therefore, one of the key elements of the IRP is a very clear definition of what types of cyber threats and security events may be considered incidents and when they become actual incidents. For example, is a trojan found on an employee’s computer and delivered via phishing considered an incident? Is a customer reporting a low-impact cross-site scripting (XSS) vulnerability being exploited on your marketing site considered an incident? Is a minor data breach caused by an employee publicly exposing a spreadsheet file that contains only a couple of marketing email addresses considered an incident?
A good starting point for your own definition of an incident is the official NIST definition: “violation, or imminent threat of violation, of computer security policies, acceptable use policies, or standard security practices.” However, you should come up with your own, more detailed definition, that considers factors specific to your organization such as potential business impact, potential data loss, and more.
A clear definition is very important to decision-makers because they have to declare whether an incident occurred or not. An incident is not a grey zone: it either starts the process involving the entire team or it does not. Every incident that is declared should be treated equally, without severity assessment. Since the activities involved in the process are extensive and may have a business continuity impact, the decision-maker must clearly know when to “press the red button”.
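One way to keep the declaration binary is to write the definition down as explicit criteria. The following is a minimal sketch under assumed criteria. The `SecurityEvent` fields and both thresholds are hypothetical, and every organization would substitute its own.

```python
from dataclasses import dataclass


@dataclass
class SecurityEvent:
    description: str
    downtime_minutes: int = 0
    records_exposed: int = 0
    production_system: bool = False


# Hypothetical thresholds; each organization would define its own.
MAX_TOLERATED_DOWNTIME_MINUTES = 60
MAX_TOLERATED_RECORDS_EXPOSED = 0


def is_incident(event: SecurityEvent) -> bool:
    """Binary decision: the full response process either starts or it does not."""
    if event.records_exposed > MAX_TOLERATED_RECORDS_EXPOSED:
        return True
    if (event.production_system
            and event.downtime_minutes >= MAX_TOLERATED_DOWNTIME_MINUTES):
        return True
    return False


minor_dos = SecurityEvent("brief DoS on the shop", downtime_minutes=10,
                          production_system=True)
exposed_sheet = SecurityEvent("public spreadsheet with email addresses",
                              records_exposed=2)

print(is_incident(minor_dos))      # False under these thresholds
print(is_incident(exposed_sheet))  # True under these thresholds
```

The function deliberately returns no severity level, which matches the idea that a declared incident always triggers the same process.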
Note: An incident and a disaster are different concepts. Therefore, disaster recovery and incident recovery should not be covered by the same processes and should be subject to separate planning. Disaster recovery is the process of recovering from natural or human-induced disasters, for example, fires or someone accidentally deleting the entire database. Disaster recovery might involve different resources and, for example, does not have to involve the security team as much as incident recovery.
Incident response phases
The incident response process is divided into several phases that should be included in the plan. These phases should be followed strictly, no matter the temptation.
- Preparation: This is the most important phase of incident response and it involves defining all of the above elements: the CSIRT, assets, and the scope of what is considered an incident. It also involves training the resources and even performing trials, tabletop exercises, and mock attacks to see whether everything is working as intended. The key to the success of the preparation phase is to avoid any chaos in the organization in the case that an incident is declared.
- Identification: This phase involves two key activities. The first is the preliminary investigation that leads to the declaration of an incident. It involves only part of the team: the decision-makers and the technical resources that provide intelligence. Note that the report of a potential incident might also come from external sources, for example, from your customers, partners, or even law enforcement, so communications personnel might also be involved. The second activity follows if an incident is declared: a detailed investigation to determine which assets are potentially affected by the incident and must be involved in the next phases. For example, if an attacker breaches your web application, you must identify whether this affects connected servers or even the entire network. Note that after identification is complete, your communications and legal/compliance resources should already start working on their tasks.
- Containment: Once the character and the scope of the incident are clearly identified by technical resources, you must decide which assets must be contained. Containment is absolutely necessary for short-term mitigation and this phase cannot be skipped, even if you are tempted to eradicate the threat as soon as possible. If the incident is not contained, the attacker might still be working on escalation in parallel with your team and keep spreading to other, currently unaffected systems. Containment means isolating the affected assets from unaffected assets. However, the affected assets are often not taken offline (even temporarily) because this may make eradication more difficult. The containment phase ends with a decision that affected assets are securely isolated and the attacker is cut off.
- Elimination: After the affected assets are contained, your technical resources start eliminating the consequences of the incident. This means, for example, removing malware, fixing vulnerabilities, restoring systems from safe backups, patching, etc. The elimination phase ends with a decision that all the technical consequences of the incident are eradicated and the systems are secured.
- Recovery: The secured systems must now be taken back online and reconnected to other assets, and all the technical and business processes should go back to normal operations. The recovery phase ends with a decision that the entire technical infrastructure is working as well as before the incident. Note that the recovery phase also involves the completion of work by your communications and legal/compliance resources. The end result of this phase is for your decision-makers to declare the incident as closed and for your team members to go back to their regular activities.
- Lessons learned: This activity does not have to be performed immediately after the incident is closed. Some time after the incident, it is useful to reassemble the key resources from the incident response team, especially all the decision-makers, and analyze how well the incident was handled. As a result, the process might go back to the preparation phase to involve more resources in the team, shift responsibilities, or provide extra training if not all team members performed well enough.
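The strict ordering of the phases above can be sketched as a small state machine. This is only an illustration: the `Phase` enum, the `NEXT_PHASE` table, and the `IncidentWorkflow` class are hypothetical names, and real tooling would also track who made each decision and when.

```python
from enum import Enum


class Phase(Enum):
    PREPARATION = 1
    IDENTIFICATION = 2
    CONTAINMENT = 3
    ELIMINATION = 4
    RECOVERY = 5
    LESSONS_LEARNED = 6


# Each phase has exactly one allowed successor; lessons learned feeds
# back into preparation.
NEXT_PHASE = {
    Phase.PREPARATION: Phase.IDENTIFICATION,
    Phase.IDENTIFICATION: Phase.CONTAINMENT,
    Phase.CONTAINMENT: Phase.ELIMINATION,
    Phase.ELIMINATION: Phase.RECOVERY,
    Phase.RECOVERY: Phase.LESSONS_LEARNED,
    Phase.LESSONS_LEARNED: Phase.PREPARATION,
}


class IncidentWorkflow:
    def __init__(self):
        self.phase = Phase.PREPARATION
        self.history = [Phase.PREPARATION]

    def advance(self, target):
        """Move to the target phase, refusing any attempt to skip a phase."""
        expected = NEXT_PHASE[self.phase]
        if target is not expected:
            raise ValueError(
                f"cannot move from {self.phase.name} to {target.name}; "
                f"next phase is {expected.name}"
            )
        self.phase = target
        self.history.append(target)


workflow = IncidentWorkflow()
workflow.advance(Phase.IDENTIFICATION)
try:
    # Tempting, but containment has not happened yet.
    workflow.advance(Phase.ELIMINATION)
except ValueError as error:
    print(error)
```

Encoding the order this way makes it impossible to record an elimination step before containment has been completed.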
An IRP for web security?
Even if your primary business is associated with the web and you’re most concerned with web-related threats, you cannot limit the cyber incident response plan to web security only. Because IT systems in every organization are interconnected, the incident response plan template must involve all of your organization and related parties as well as all the assets. Only then can you expect complete success in eliminating incident consequences.