Chapter 10 - Slide Deck Updated Part 1
Contingency Planning
Agenda
- Contingency Planning
- Business Impact Analysis
- Incident Response Planning
- Disaster Recovery Planning
- Business Continuity Planning
Fundamentals of Contingency Planning
- What Is Contingency Planning (CP)?
- CP is the organizational process of preparing for unexpected adverse events.
- These events (sometimes called incident candidates) can:
- Threaten information assets
- Disrupt business operations
- CP covers incident response, disaster recovery, business continuity, and business impact analysis (BIA).
- Goal: Restore operations with minimal disruption and cost.
- Example: A server room flood threatens systems; CP ensures continuity via cloud failover.
Key Components of CP
- Business Impact Analysis (BIA):
- Identifies mission-critical functions and systems.
- Helps prioritize recovery efforts.
- Incident Response Plan (IRP):
- Defines steps to respond to immediate incidents (e.g., malware outbreak).
- Disaster Recovery Plan (DRP):
- Focuses on restoring systems at the primary site after a major event.
- Business Continuity Plan (BCP):
- Enables continued operations at an alternate site if primary recovery fails.
- Example Flow: Phishing attack → IRP triggered → Data loss → DRP initiated → Prolonged outage → BCP activated.
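The example flow above can be read as a simple escalation rule. The following is a minimal sketch, not a prescribed procedure: the three boolean flags and the function name `plans_to_activate` are illustrative assumptions, and real escalation criteria would come from the organization's own plans.

```python
from enum import Enum


class Plan(Enum):
    IRP = "Incident Response Plan"
    DRP = "Disaster Recovery Plan"
    BCP = "Business Continuity Plan"


def plans_to_activate(incident_detected: bool,
                      major_loss: bool,
                      primary_site_unavailable: bool) -> list:
    """Return the plans in the order they would be activated.

    The flags are hypothetical simplifications of the example flow.
    """
    activated = []
    if incident_detected:
        activated.append(Plan.IRP)          # immediate response to the incident
        if major_loss:
            activated.append(Plan.DRP)      # restore systems at the primary site
            if primary_site_unavailable:
                activated.append(Plan.BCP)  # continue operations at an alternate site
    return activated


# Phishing attack -> data loss -> prolonged outage
phishing_flow = plans_to_activate(True, True, True)
print([plan.name for plan in phishing_flow])
```

In this sketch a contained incident activates only the IRP; each later plan is added only when the earlier one is not enough.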
Unified vs. Modular Planning Approaches
- Unified Plan:
- Common in smaller organizations.
- Simple, integrated recovery strategies.
- Modular Plans:
- Preferred by large, complex organizations.
- Separate but interlinked IR, DR, BC, and BIA plans.
- Choice depends on resources, complexity, and philosophy.
CP Planning Prerequisites
- Planning methodology: Clear, repeatable process.
- Policy environment: Management support and documented authority.
- Budget and resources: Financial and technical support.
- Business Impact Analysis: Identifies what must be protected and recovered.
NIST's 7-Step CP Process (SP 800-34 Rev. 1)
- Develop CP Policy:
- Provides the authority and guidance necessary to develop an effective contingency plan
- Conduct BIA:
- Identify/prioritize critical systems and functions.
- Identify Preventive Controls:
- Measures taken to reduce the effects of system disruptions.
- These can increase system availability and reduce contingency life cycle costs.
- Create Contingency Strategies:
- Define detailed recovery methods to ensure quick and effective recovery following a disruption.
- Develop the CP:
- Detailed guidance and procedures for restoration unique to each business unit.
- Test, Train, Exercise:
- Ensures readiness and uncovers gaps.
- Maintain the Plan:
- Update regularly to reflect organizational changes.
- Example: A tested plan reveals a communication gap—training resolves it.
Contingency Planning Life Cycle
The contingency planning life cycle includes the following steps:
- Form the CP team.
- Develop the CP policy statement.
- Develop subordinate planning policies (IR/DR/BC).
- Form subordinate planning teams (IR/DR/BC).
- Conduct the business impact analysis (BIA).
- Integrate the business impact analysis (BIA).
- Identify preventive controls.
- Determine mission/business processes and recovery criticality.
- Identify resource requirements.
- Identify recovery priorities for system resources.
- Create response strategies (IR/DR/BC).
- Develop subordinate plans (IR/DR/BC).
- Organize response teams (IR/DR/BC).
- Ensure plan testing, training, and exercises.
- Ensure plan maintenance.
- Review/revise as needed
Importance of CP Policy
- Articulates executive intent and strategic importance.
- Defines:
- Scope and purpose
- Team responsibilities
- Risk assessment/BIA frequency
- Testing and maintenance cycles
- Assigns Roles: COO (CPMT lead), CISO (IR lead), legal, IT, operations.
Specialized Planning Teams
- IRPT: Designs/manages incident response procedures.
- DRPT: Handles disaster recovery planning and processes.
- BCPT: Plans for alternate site operations continuity.
- CMPT: Develops crisis management strategy.
- Teams may overlap in small orgs, but ideally are distinct to avoid role conflict.
Staffing Considerations
- Planning teams ≠ response teams (but include some overlap for continuity).
- Avoid role conflicts during real incidents.
- Example: A DR team member can’t also manage BC tasks at a different site simultaneously.
Common Pitfalls
- Many orgs undervalue CP, leading to:
- Delayed recovery
- Permanent data loss
- Business failure
- Lack of testing = false sense of preparedness.
- CP must be a high priority, not an afterthought.
Business Impact Analysis
BIA Planning Considerations
- Scope –
- Determine:
- Which business units to cover
- Which systems to include
- The nature of the risk being evaluated.
- Plan –
- Make sure the proper data is collected to enable a comprehensive analysis
- Getting the correct information to address the needs of decision makers is important.
- Balance between objective and subjective information –
- Weigh the information available, especially when a large amount of data has been collected.
- Some information is objective in nature; some is subjective or anecdotal.
- Facts should be weighted properly against opinions.
- However, the knowledge and experience of key personnel can sometimes be invaluable.
- Objective – Tailor the analysis to decision-maker needs.
- Follow-Up –
- Communicate periodically to ensure process owners and decision makers will support the process and the end result of the BIA.
Three BIA Phases (Per NIST SP 800-34 Rev. 1)
- Determine Mission/Business Processes and Recovery Criticality
- Identify Resource Requirements
- Prioritize System Resources for Recovery
Step 1 – Determine Business Process Criticality
- Assess each unit’s function and value to core operations.
- Evaluate how failure would impact mission success.
- Use Weighted Table Analysis (WTA) to:
- Define criteria (e.g., revenue impact, compliance, customer service).
- Assign weights to each criterion.
- Score functions and compute importance.
- Example: Sales platform vs. HR database—restore sales first due to revenue generation.
- Use of BIA Questionnaires
- Collects consistent data across departments.
- Includes:
- Process descriptions
- Dependencies
- Impact assessments
- Can be filled out by functional managers.
- Resources for Templates:
- NIST: SP 800-34 Rev. 1 BIA Template
- FEMA, Ready.gov, DRJ
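The Weighted Table Analysis (WTA) described in this step can be sketched in a few lines. The criteria names follow the examples above, but the weights and the 1 to 5 scores are hypothetical values chosen only to illustrate the arithmetic.

```python
# Criterion weights (hypothetical values; they should sum to 1.0).
WEIGHTS = {"revenue_impact": 0.5, "compliance": 0.3, "customer_service": 0.2}

# Scores from 1 (low impact) to 5 (high impact); hypothetical values.
FUNCTION_SCORES = {
    "Sales platform": {"revenue_impact": 5, "compliance": 3, "customer_service": 5},
    "HR database": {"revenue_impact": 1, "compliance": 4, "customer_service": 2},
}


def weighted_importance(scores, weights):
    """Sum of (weight x score) across all criteria."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


def rank_functions(function_scores, weights):
    """Return (function, importance) pairs, most important first."""
    ranked = [(name, weighted_importance(scores, weights))
              for name, scores in function_scores.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


ranking = rank_functions(FUNCTION_SCORES, WEIGHTS)
for name, importance in ranking:
    print(f"{name}: {importance:.2f}")
```

With these assumed numbers the sales platform ranks above the HR database, matching the example's conclusion that sales is restored first.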
Step 2 – Identify Recovery Objectives & Timeframes
- Use NIST’s recovery metrics:
- RTO (Recovery Time Objective):
- The maximum amount of time that a system resource can remain unavailable before there is an unacceptable impact on other system resources.
- RPO (Recovery Point Objective):
- The point in time before a disruption or system outage to which business process data can be recovered after an outage, given the most recent backup copy of the data.
- MTD (Maximum Tolerable Downtime):
- The total amount of time the system owner or authorizing official is willing to accept outage or disruption. The MTD includes all impact considerations.
- WRT (Work Recovery Time):
- The amount of time needed to make business functions work again after the technology element is recovered. Recovery of the technology element itself is covered by the RTO.
- Example:
- RTO = 2 hrs, RPO = 10 minutes, WRT = 4 hrs, MTD = 6 hrs total to be fully functional again
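The example figures imply that technology recovery (RTO) plus work recovery (WRT) must fit inside the MTD: 2 hrs + 4 hrs = 6 hrs. A minimal sketch of that check follows, together with a simple RPO check. The function names are hypothetical, and the assumption that worst-case data loss equals one backup interval is a simplification.

```python
from datetime import timedelta


def meets_mtd(rto, wrt, mtd):
    """True if system recovery plus work recovery fits within the MTD."""
    return rto + wrt <= mtd


def backup_interval_meets_rpo(backup_interval, rpo):
    """True if the worst-case data loss (one full interval) is within the RPO."""
    return backup_interval <= rpo


rto = timedelta(hours=2)
rpo = timedelta(minutes=10)
wrt = timedelta(hours=4)
mtd = timedelta(hours=6)

print(meets_mtd(rto, wrt, mtd))                               # 2 h + 4 h fits in 6 h
print(backup_interval_meets_rpo(timedelta(minutes=5), rpo))   # 5-minute replication
print(backup_interval_meets_rpo(timedelta(hours=24), rpo))    # nightly backup only
```

Under these assumptions a nightly backup alone cannot satisfy a 10-minute RPO, which is why short RPOs usually push toward journaling or replication.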
- Cost-Benefit Analysis of Recovery Times
- Shorter RTO = Higher cost
- Longer downtime = Higher operational loss
- Graph (Figure 10-5): Balance between:
- Cost to recover (mirror site, backups)
- Cost of disruption (lost revenue, reputation)
- Example: Web retail site may require real-time recovery = costly mirrored site.
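The balance between cost to recover and cost of disruption can be illustrated by adding the two for each candidate strategy and picking the lowest total. All strategy names, costs, and downtime figures below are hypothetical; they are not taken from Figure 10-5.

```python
# (strategy name, cost to recover, expected downtime in hours); hypothetical figures
STRATEGIES = [
    ("mirrored site", 200_000, 0),
    ("hot standby", 80_000, 4),
    ("restore from backups", 10_000, 48),
]


def total_cost(recovery_cost, downtime_hours, loss_per_hour):
    """Cost to recover plus cost of disruption for one strategy."""
    return recovery_cost + downtime_hours * loss_per_hour


def cheapest_strategy(strategies, loss_per_hour):
    """Name of the strategy with the lowest combined cost."""
    best = min(strategies, key=lambda s: total_cost(s[1], s[2], loss_per_hour))
    return best[0]


# A web retailer losing a lot per hour of downtime vs. a low-impact internal system.
print(cheapest_strategy(STRATEGIES, loss_per_hour=100_000))
print(cheapest_strategy(STRATEGIES, loss_per_hour=1_000))
```

With these assumed numbers, a high hourly loss favors the costly mirrored site, while a low hourly loss favors plain backups.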
Step 3 – Prioritize Information Assets
- Classify data into Critical, Very Important, Routine, etc.
- Helps prioritize what must be restored first.
- Example: Payment gateway database gets higher priority than training schedules.
Step 4 – Identify Recovery Resource Requirements
- List supporting assets for each business process.
- Use a Resource/Component Table to document:
- Hardware/software needs
- Network dependencies
- Staff roles
- Example Table Entry:
- Process: Customer Billing
- Resource: Accounts Receivable Application
- Details: Linux server + SQL database, ~$8,000
Step 5 – Prioritize System Resource Recovery
- Use weighted scoring or simple labels (Primary/Secondary).
- Avoid overcomplicating the process:
- Large orgs → detailed weighted analysis.
- Smaller orgs → fast classification scheme.
- Output: Custom “to-do” list for disaster and continuity planning teams.
Incident Response Plan
- Incident Response (IR) refers to a planned, coordinated approach to:
- Detecting
- Reacting to
- Recovering from information security incidents.
- Many organizations already engage in IR (e.g., reacting to a system crash), even if informally.
- IR is essential to maintain system integrity, business continuity, and data protection.
- Example: Employee accidentally deletes a key database—IT acts to restore from backup and prevent recurrence.
What is an Incident?
- Adverse Event: Any unexpected event that could harm information assets.
- Incident: An adverse event that materializes into a real threat.
- Not every adverse event becomes an incident—but all incidents begin as adverse events.
- Example: Unusual login attempts → Adverse event. Confirmed unauthorized access → Incident.
Incident Response vs. Incident Reaction
- IR refers to the entire planning and coordination process.
- Reaction refers to what the organization actually does after detecting an incident.
- This deck uses:
- IR = broad process (planning + action).
- Reaction = specific post-detection response.
- Example: The IR team prepares a phishing response plan → Actual blocking/removal of phishing email = reaction.
Incident Response Planning (IRP)
- IRP is the formal preparation effort to guide organizational IR activities.
- Includes development of:
- IR policies
- IR plan (IRP document)
- Formation of IRP Team (IRPT)
- Requires senior management support and cross-functional coordination.
- IRPT = specialized team trained to handle incidents as per documented plan.
Who is Involved?
- IRP Team (IRPT):
- Often includes members from IT, InfoSec, legal, and communications.
- Coordinates detection, containment, eradication, and recovery steps.
- Management: Provides oversight and resources.
- End users: Often the first line of detection (e.g., reporting suspicious email or activity).
When is the IR Plan Activated?
- Trigger point: When an incident is detected, no matter how minor.
- Early detection and response can limit damage and cost.
- Example: If a USB drive containing sensitive data is reported missing, the IR plan is activated to assess and mitigate data loss.
Getting Started with Incident Response
- Initiated by: Contingency Planning Management Team (CPMT).
- The Incident Response Planning Team (IRPT) is responsible for:
- Creating and documenting the incident response policy.
- Defining the scope and structure of incident handling.
- Outlining the organization’s response approach to different types of incidents.
- Guiding users on how to help, not hinder, response efforts.
- Do: Report anomalies promptly.
- Don’t: Attempt personal troubleshooting that may destroy evidence
IRPT’s Core Responsibilities
- Establish formal policy for incident response activities.
- Define incident categories and related protocols.
- Advise users on how to recognize and report potential incidents.
- Design training programs and communication protocols.
- Ensure IR plan aligns with business objectives and legal requirements.
- The Computer Security Incident Response Team (CSIRT) is formed by the IRPT to implement and execute the IR plan.
- Composed of:
- Technical IT staff (admins, network engineers)
- Managerial IT personnel (e.g., CIO or IT directors)
- InfoSec specialists (e.g., security analysts, threat hunters)
- Some IRPT members may overlap with the CSIRT for continuity and clarity.
CSIRT in Action – A Practical Example
- Scenario: A phishing campaign targets internal users.
- Detection: Alert from email filter → triaged by CSIRT.
- Containment: Block malicious sender and quarantine affected inboxes.
- Recovery: Scan affected devices, restore clean backups if needed.
- User Guidance: IRPT trains staff to avoid clicking suspicious links in the future.
What’s Next – The NIST Incident Response Lifecycle
- The CSIRT operates within a structured lifecycle defined by NIST SP 800-61 Rev. 2:
- Preparation
- Detection & Analysis
- Containment, Eradication, and Recovery
- Post-Incident Activity
- This model ensures a systematic and repeatable response process.
Introduction to the NIST Cybersecurity Framework (CSF)
- Developed by NIST to enhance critical infrastructure cybersecurity.
- Known as the Framework for Improving Critical Infrastructure Cybersecurity.
- Designed to complement existing IR standards, including:
- NIST SP 800-61 Rev. 2 (Incident Handling Guide)
- NIST SP 800-184 (Cybersecurity Event Recovery)
- Built on foundational practices outlined in earlier NIST Special Publications.
Mapping CSF to IR and Recovery
- The CSF includes five core functions, which closely align with the IR lifecycle:
- Identify → Supports risk management and governance.
- Protect → Emphasizes controls: policy, training, technology.
- Detect → Involves recognizing signs of security incidents.
- Respond → Concerns action taken once an incident is detected.
- Recover → Focuses on restoring systems and operations.
CSF Function 1 – Identify
- Supports:
- Asset management
- Risk assessments
- Governance programs
- Key Objective: Understand what needs protection and why.
- Example: Inventorying all data centers and defining their importance to operations.
CSF Function 2 – Protect
- Implementation of preventive measures, such as:
- Access controls
- Security awareness training
- Data protection tools
- Builds a defense-in-depth strategy.
- Example: Requiring MFA for system logins to prevent unauthorized access.
CSF Function 3 – Detect
- Focuses on real-time monitoring and alerting.
- Enables organizations to identify incidents as they occur.
- Utilizes:
- IDS/IPS systems
- SIEM tools
- Threat intelligence feeds
- Example: SIEM detects unusual login patterns indicating a possible breach.
CSF Function 4 – Respond
- Encompasses:
- Incident handling
- Communication plans
- Legal coordination
- Forensics and containment
- Aligned with: NIST SP 800-61 Rev. 2
- Example: After detecting ransomware, CSIRT isolates infected machines and begins recovery.
CSF Function 5 – Recover
- Objective: Restore services and reduce long-term impact.
- Informed by:
- NIST SP 800-184: Guide for Cybersecurity Event Recovery
- Includes:
- System restoration
- Lessons learned
- Improvement planning
- Example: Restoring clean backups and implementing safeguards to prevent repeat incidents.
Introduction to the IR Policy
Key Components of the IR Policy (NIST SP 800-61 Rev. 2)
- Statement of Management Commitment
- Confirms executive support and assigns authority to the CSIRT.
- Ensures organizational alignment with IR goals.
- Purpose and Objectives
- Clarifies why the policy exists.
- Describes what the IR process aims to achieve (e.g., minimize downtime, preserve evidence).
- Scope
- Specifies:
- Who is covered (e.g., employees, contractors).
- What is covered (systems, data, networks).
- When it applies (during and after incidents).
Key Components of the IR Policy (Continued)
- Definitions of InfoSec Incidents and Related Terms
- Standardizes language for clarity.
- Example: Clearly distinguish between “incident,” “event,” and “breach.”
- Organizational Structure and Responsibilities
- Roles and authority levels (e.g., CSIRT’s ability to disconnect systems).
- Requirements for:
- Reporting incidents.
- Monitoring activity.
- Interacting with external stakeholders (e.g., law enforcement, partners).
- Example: Policy may authorize the CSIRT to pull a compromised server offline immediately without waiting for higher approval.
Additional Policy Elements
- Incident Severity and Prioritization
- Defines how incidents are ranked by impact (e.g., critical, high, medium, low).
- Helps triage response efforts and allocate resources appropriately.
- Performance Measures
- Metrics to evaluate IR effectiveness (e.g., time to detect, contain, recover).
- Aligns with broader InfoSec performance measurement frameworks (see Chapter 9).
- Reporting and Contact Protocols
- Specifies:
- How to report an incident.
- What forms or systems to use.
- Who should be contacted (internal and external).
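The performance measures mentioned above (time to detect, contain, recover) can be derived from four timestamps per incident. This is a minimal sketch; the function name, the dictionary keys, and the sample timestamps are hypothetical.

```python
from datetime import datetime


def ir_performance(occurred, detected, contained, recovered):
    """Return elapsed time for each stage of the response."""
    return {
        "time_to_detect": detected - occurred,
        "time_to_contain": contained - detected,
        "time_to_recover": recovered - contained,
    }


# Hypothetical incident timeline.
metrics = ir_performance(
    occurred=datetime(2024, 3, 1, 8, 0),
    detected=datetime(2024, 3, 1, 9, 30),
    contained=datetime(2024, 3, 1, 11, 0),
    recovered=datetime(2024, 3, 1, 17, 0),
)
for name, elapsed in metrics.items():
    print(f"{name}: {elapsed}")
```

Tracking these values across incidents gives a trend that can be compared against the targets set in the policy.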
What Is Incident Response Planning (IRP)?
- IRP is the structured development of plans, policies, and teams to manage InfoSec incidents.
- Incident response is a reactive process: the plan is activated after an incident is detected, not before.
- Falls under the responsibility of:
- CIO, CISO, or designated IT manager
- With support from CPMT, system administrators, and key stakeholders
- Example: Water damage to an office activates IRP, not DRP, unless broader infrastructure is impacted.
What Qualifies as an InfoSec Incident?
- An event is classified as an incident if it meets all of the following:
- Targets information assets
- Has a realistic chance of success
- Threatens confidentiality, integrity, or availability (CIA)
- IR focuses on incidents, not on prevention—that's InfoSec’s job.
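The three-part test above can be expressed as a single predicate: an event is an incident only if all three conditions hold. The class and field names in this sketch are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AdverseEvent:
    description: str
    targets_information_assets: bool
    realistic_chance_of_success: bool
    threatens_cia: bool  # confidentiality, integrity, or availability


def is_incident(event: AdverseEvent) -> bool:
    """An event qualifies as an incident only if all three criteria are met."""
    return (event.targets_information_assets
            and event.realistic_chance_of_success
            and event.threatens_cia)


blocked_scan = AdverseEvent("Port scan dropped at the firewall", True, False, False)
stolen_login = AdverseEvent("Confirmed unauthorized access", True, True, True)

print(is_incident(blocked_scan))
print(is_incident(stolen_login))
```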
IR Plan Elements (NIST SP 800-61 Rev. 2)
- A comprehensive IR plan should include:
- Mission: Purpose of the response effort.
- Strategies and goals: Desired outcomes.
- Senior management approval: Legitimacy and authority.
- Organizational approach: Structure and responsibilities.
- Communication plans: Internal and external.
- Performance metrics: KPIs to track effectiveness.
- Capability roadmap: For continuous improvement.
- Integration: How it fits within the broader organization.
Three Sets of Incident Response Procedures
- Before the Incident (Preparation):
- Backups, training, SLAs, test plans, DR/BC links
- Example: Weekly data backup plan with off-site storage
- During the Incident:
- Real-time actions by technical and managerial teams
- Assigned by role (admin vs. comms vs. legal)
- After the Incident:
- Cleanup, recovery, and lessons learned
- Restoring systems and documentation of events
Role of the CSIRT in IRP Execution
- The CSIRT executes the IR plan:
- Detect, respond to, and recover from incidents
- May be formal or informal based on org size
- Acts like a firefighting unit:
- Each member knows their specific role
- Coordinates as a unified team
- Example: One team isolates affected systems while another handles comms and compliance.
IR Phases – Detect, React, Recover
- Detection:
- Recognize that an incident is underway
- Reaction:
- Contain and mitigate damage
- Aligns with "Respond" in the NIST CSF
- Recovery:
- Return systems to pre-incident condition
Incident Handling Checklist (NIST SP 800-61 Rev. 2)
- Detection and Analysis
- Determine incident occurrence
- Analyze precursors/indicators
- Correlate and research
- Document and gather evidence
- Prioritize based on impact
- Report internally/externally
- Containment, Eradication, and Recovery
- Preserve/document evidence
- Contain and eradicate
- Recover systems and monitor
- Post-Incident Activity
- Create report
- Conduct lessons learned session (mandatory for major events)
Data Protection Strategies for IR Preparation
- Traditional backups:
- On-site/off-site, disk-to-disk-to-tape, RAID
- Electronic vaulting:
- Batch data transfers via secure lines
- Remote journaling:
- Real-time transaction replication (vs. full backups)
- Database shadowing:
- Real-time mirroring to two locations
- 3-2-1 Rule:
- 3 copies of data, 2 media types, 1 off-site
- Example: Daily on-site backups, weekly cloud storage
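The 3-2-1 rule can be checked mechanically against an inventory of copies. The sketch below assumes a simple inventory format; the class name, media labels, and sample copies are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    label: str
    media: str      # e.g. "disk", "tape", "cloud"
    offsite: bool


def satisfies_3_2_1(copies) -> bool:
    """At least 3 copies, on at least 2 media types, with at least 1 off-site."""
    media_types = {copy.media for copy in copies}
    return (len(copies) >= 3
            and len(media_types) >= 2
            and any(copy.offsite for copy in copies))


inventory = [
    BackupCopy("production data", "disk", offsite=False),
    BackupCopy("daily on-site backup", "disk", offsite=False),
    BackupCopy("weekly cloud backup", "cloud", offsite=True),
]
print(satisfies_3_2_1(inventory))
```

Here the production data is counted as one of the three copies, which is a common reading of the rule; an organization may choose to count backups only.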
Incident Detection
- Goal: Distinguish between routine system activity and real incidents.
- This corresponds to the Detection & Analysis phase of the NIST incident response lifecycle, which follows Preparation.
- Begins with incident classification—deciding if an adverse event is an actual incident.
- Sources for detection:
- End-user reports
- Intrusion Detection/Prevention Systems (IDPS)
- Antivirus/antimalware alerts
- Admin observations
- Example: A help desk user notices strange pop-ups → report to CSIRT → IR plan initiated.
What Is Incident Classification?
- Definition: The process of evaluating an adverse event to determine whether it qualifies as an InfoSec incident.
- Requires training, clear definitions, and consistent procedures.
- Classification is critical for deciding the response path (IR vs. DR/BC).
Three Types of Incident Indicators
- Possible Indicators
- May suggest an incident but need further investigation.
- Examples:
- Unfamiliar files found by users or admins
- Unknown processes running in the background
- Unusual system crashes or sudden reboots
- Resource spikes or drops (e.g., CPU, RAM, disk space)
- Tools: Windows Task Manager, UNIX/Linux resource monitors
- Probable Indicators
- Stronger evidence of malicious or abnormal activity.
- Examples:
- System activity at odd hours (e.g., midnight traffic spikes)
- New user accounts with no documentation
- User-reported attacks
- Alerts from IDPS (though these may include false positives)
- Definite Indicators
- Confirmed signs that an incident is happening or has occurred.
- Examples:
- Dormant accounts being used unexpectedly
- Log file modifications with no authorized changes
- Presence of hacker or penetration tools in unauthorized locations
- Notification by trusted partner or external organization
- Web defacement or extortion message from a hacker
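One way to use the three indicator categories is to map each observed sign to its category and report the strongest one seen. This is a minimal sketch: the indicator keys are hypothetical shorthand for the examples listed above, and a real classification would rely on analyst judgment rather than a lookup table.

```python
from enum import IntEnum


class Indicator(IntEnum):
    POSSIBLE = 1
    PROBABLE = 2
    DEFINITE = 3


# Hypothetical keys derived from the examples on this slide.
INDICATOR_CATEGORY = {
    "unfamiliar_files": Indicator.POSSIBLE,
    "unknown_processes": Indicator.POSSIBLE,
    "unusual_crashes": Indicator.POSSIBLE,
    "activity_at_odd_hours": Indicator.PROBABLE,
    "undocumented_new_accounts": Indicator.PROBABLE,
    "idps_alert": Indicator.PROBABLE,
    "dormant_account_in_use": Indicator.DEFINITE,
    "unauthorized_log_changes": Indicator.DEFINITE,
    "web_defacement": Indicator.DEFINITE,
}


def strongest_indicator(observed):
    """Return the highest category among observed signs, or None if none are known."""
    categories = [INDICATOR_CATEGORY[sign] for sign in observed
                  if sign in INDICATOR_CATEGORY]
    return max(categories, default=None)


print(strongest_indicator(["unknown_processes", "idps_alert"]))
print(strongest_indicator(["unfamiliar_files", "dormant_account_in_use"]))
```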
Incident Detection Results That May Indicate an Incident
- Treat all unusual results as potential incidents—better to overreact than ignore.
- Possible outcomes of actual or attempted incidents:
- Loss of availability (system crash or downtime)
- Loss of integrity (corrupted or altered data)
- Loss of confidentiality (data leak or unauthorized access)
- Violation of policy (e.g., unapproved file sharing)
- Violation of law/regulation (e.g., unauthorized access to protected health info)
Why Early Detection Matters
- Prevents small issues from becoming large-scale incidents or disasters.
- Helps the IR team:
- Activate predefined IR procedures
- Contain and analyze the situation quickly
- Reduces:
- Downtime
- Reputational harm
- Financial loss
From Detection to Reaction
- Once an incident is confirmed and classified, the IR plan moves from Detection to Reaction.
- NIST SP 800-61, Rev. 2 combines this with recovery into “Containment, Eradication, and Recovery”, while the NIST CSF separates them as “Respond” and “Recover.”
- The Response Phase aims to:
- Stop the incident
- Minimize its impact
- Prepare for recovery
Notification of Key Personnel
- The CSIRT activates the alert roster to notify the appropriate individuals.
- Two types of rosters:
- Sequential: One person contacts everyone (accurate but slow).
- Hierarchical: Each person calls others in a tree structure (faster, but with a risk of miscommunication).
- Tools: Automated systems (e.g., Preparis Portal) can streamline communication.
- Example: A ransomware attack triggers automated SMS, email, and voice alerts to CSIRT and IT leadership.
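The speed difference between the two roster types can be illustrated with a small simulation. The sketch assumes every call takes one time unit and that each person calls their assigned contacts one after another; the names and both call trees are hypothetical.

```python
from collections import deque


def notification_times(call_tree, root):
    """Return the time unit at which each person is notified.

    call_tree maps a caller to the ordered list of people they call.
    """
    times = {root: 0}
    queue = deque([root])
    while queue:
        caller = queue.popleft()
        # The caller's 1st call completes one unit after they were notified, etc.
        for position, callee in enumerate(call_tree.get(caller, []), start=1):
            times[callee] = times[caller] + position
            queue.append(callee)
    return times


# Sequential roster: one person contacts everyone.
sequential = {"lead": ["a", "b", "c", "d", "e", "f"]}

# Hierarchical roster: each person calls two others.
hierarchical = {"lead": ["a", "b"], "a": ["c", "d"], "b": ["e", "f"]}

sequential_done = max(notification_times(sequential, "lead").values())
hierarchical_done = max(notification_times(hierarchical, "lead").values())
print(sequential_done, hierarchical_done)
```

Under these assumptions the sequential roster needs 6 time units to reach six people, while the hierarchical roster needs 4; the gap widens as the roster grows.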
Alert Messages and Communication
- Alert messages include just enough information so responders can act without delay.
- Alert message example: “Ransomware detected on finance server. Disconnect server and contact network admin. Follow IR SOP.”
- Alert rosters must be:
- Regularly updated
- Tested
- Rehearsed
- General management, legal, HR, comms, and external partners may also need to be notified depending on the incident.
Documenting the Incident
- Document:
- Who did what
- What happened
- When actions were taken
- Where the incident occurred
- Why/How the event unfolded
- Purpose:
- Enables case studies
- Aids in legal defense
- Supports training and simulation
- Example: Documentation helps prove compliance with due care in a breach affecting customer data.
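The who/what/when/where/why items above map naturally onto a structured log record. The following is a sketch only; the class name, field names, and sample entries are hypothetical, and a real log would also need tamper protection if it may be used as evidence.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class IncidentLogEntry:
    when: datetime   # when the action was taken
    who: str         # who did it
    what: str        # what happened or what was done
    where: str       # affected system or location
    why_how: str     # why/how the event unfolded


def timeline(entries):
    """Return entries in chronological order for after-action review."""
    return sorted(entries, key=lambda entry: entry.when)


log = [
    IncidentLogEntry(datetime(2024, 3, 1, 9, 45), "network admin",
                     "Disconnected server from network", "finance server",
                     "Containment per IR SOP"),
    IncidentLogEntry(datetime(2024, 3, 1, 9, 30), "help desk",
                     "Ransomware alert received", "finance server",
                     "Antimalware detection"),
]
for entry in timeline(log):
    print(entry.when.isoformat(), entry.who, "-", entry.what)
```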
Containment Strategies – Stopping the Attack
- Identify affected systems quickly, without waiting for a full forensic analysis.
- Common containment methods:
- Disable compromised user accounts
- Reconfigure firewalls
- Shut down affected apps/services (e.g., mail server)
- Disconnect infected network segments
- In extreme cases, power down all systems
- Example: Email phishing attack contained by disabling external email gateway temporarily.
Balancing Containment vs. Operations
- Not all containment steps are ideal:
- Disconnecting circuits may stop the attack but also halt business.
- Use adaptive methods:
- Apply IP filtering
- Block specific ports or traffic types
- Monitor activity while developing longer-term solutions
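The adaptive methods above (IP filtering and port blocking) amount to a narrow deny rule rather than a full disconnect. The sketch below uses Python's standard `ipaddress` module to show the decision logic only; the blocked network (a reserved documentation range) and the blocked ports are hypothetical, and real enforcement would happen on a firewall or router.

```python
import ipaddress

# Hypothetical containment rules for an ongoing incident.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]  # reserved documentation range
BLOCKED_PORTS = {445, 3389}


def is_allowed(source_ip: str, destination_port: int) -> bool:
    """Deny traffic from blocked networks or to blocked ports; allow the rest."""
    address = ipaddress.ip_address(source_ip)
    if any(address in network for network in BLOCKED_NETWORKS):
        return False
    return destination_port not in BLOCKED_PORTS


print(is_allowed("203.0.113.7", 443))    # source is in a blocked network
print(is_allowed("198.51.100.10", 443))  # unrelated source, normal web port
print(is_allowed("198.51.100.10", 445))  # unrelated source, blocked port
```

Only the traffic matching the rules is stopped, so unaffected business traffic keeps flowing while a longer-term fix is developed.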
Preparedness Across the Organization
- Preparedness must go beyond the CISO and CSIRT.
- Why?
- Team members may be sick, traveling, or otherwise unavailable.
- Everyone should:
- Know basic IR steps
- Understand their role in an emergency
- Example: Receptionist reporting suspicious USB drives helps avoid data exfiltration.
Incident Escalation – When IR Isn’t Enough
- Some incidents escalate beyond the IR plan’s scope:
- Infrastructure-wide damage
- Physical destruction
- Extended outages
- Criteria for escalation should be:
- Defined during the BIA
- Documented in the IR plan
- Escalation triggers:
- Major financial or operational impact
- Need for law enforcement or emergency services
- Example: A DDoS attack affecting critical customer services may escalate to disaster recovery activation and regulatory reporting.
Transitioning from Response to Recovery
- Recovery phase begins after containment and regaining system control.
- The focus shifts from mitigation to restoration.
- Recovery includes:
- Restoring systems and data
- Rebuilding trust
- Preventing recurrence
Initial Recovery Tasks
- Notify appropriate personnel:
- IT operations, system owners, data custodians, and department heads.
- Begin damage assessment immediately:
- Evaluate extent of loss to confidentiality, integrity, and availability (CIA).
- Use:
- Incident documentation
- Logs (IDS, system, config)
- Backup records
- Example: If an HR server was breached, assess if PII was exfiltrated or altered.
Incident Damage Assessment
- A critical first step in recovery.
- Can take days or weeks depending on incident scale.
- Outputs:
- Scope of infected/affected systems
- Type of data loss or corruption
- Entry vectors and spread pattern
- Damage documentation must be handled with care—may be used in legal or civil proceedings.
Steps in the Recovery Process (Donald Pipkin Framework)
- Identify vulnerabilities that enabled the incident and remediate them.
- Example: Apply patches, close open ports, disable unused services.
- Repair or install safeguards that failed or were missing.
- Firewalls, endpoint detection, email filtering, MFA, etc.
- Evaluate and upgrade monitoring tools.
- Deploy or enhance SIEM, intrusion detection, and alerting capabilities.
Data and Service Restoration
- Restore data from backups using recovery processes:
- Full backup + incremental backups
- Database journals or remote journaling
- Example: Use last clean backup from Monday, apply logs from Tues-Wed.
- Reinstate services and processes:
- Validate and restart suspended or compromised systems
- Review service dependencies and integrity checks
- Monitor systems continuously:
- Post-recovery surveillance to catch repeat attacks or missed threats
- Use enhanced logging, alerts, and analytics
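The "full backup + incremental backups" restore described above works by starting from the last clean full backup and replaying later changes in order. The sketch below models data as key/value records and is a simplification: record names and values are hypothetical, and deletions are not handled.

```python
def restore(full_backup, incrementals):
    """Rebuild state from a full backup plus incrementals, oldest first."""
    state = dict(full_backup)          # start from the last clean full backup
    for changes in incrementals:       # replay each day's changes in order
        state.update(changes)
    return state


monday_full = {"invoice-1": "draft", "invoice-2": "sent"}
tuesday_changes = {"invoice-1": "sent", "invoice-3": "draft"}
wednesday_changes = {"invoice-3": "sent"}

restored = restore(monday_full, [tuesday_changes, wednesday_changes])
print(restored)
```

Order matters: applying Wednesday's changes before Tuesday's would leave `invoice-3` in its older state.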
Restoring Organizational Confidence
- Communicate with stakeholders:
- Internal: Staff, management, users
- External (if necessary): Clients, partners, regulators
- Tailor the message:
- Minor event: Quick update, emphasize prevention
- Major breach: Detailed reassurance, timeline for full restoration
- Goal: Prevent confusion, panic, or loss of trust
Post-Incident Review – After-Action Review (AAR)
- After-Action Review (AAR):
- A structured and non-blaming discussion after an incident.
- Allows all participants to share perspectives and reflect on:
- What happened
- What worked
- What needs improvement
- Moderated by a facilitator; results are documented and shared.
- Purpose:
- Refine the IR plan
- Train new team members
- Preserve institutional knowledge
What Happens in an AAR?
- Each team member:
- Reviews their roles and actions
- Identifies gaps and strengths
- The team verifies:
- The accuracy of documentation
- That incident records are clear and complete
- Outcome:
- Updates to IR plans, SOPs, and training content
- May become a case study for future simulation
Common IR Mistakes (McAfee’s Top 10)
- No clear chain of command
- No central operations center
- Not understanding attacker tactics ("know your enemy")
- Missing or weak containment strategies
- Not recording IR activities across all stages
- Missing real-time documentation and timelines
- Confusing containment with remediation
- Inadequate network security and monitoring
- Weak or nonexistent system logging
- Poor or missing antivirus/antimalware coverage
NIST SP 800-61 Rev. 2 – Best Practices Summary
- Acquire tools/resources ahead of time:
- Contact lists, forensics tools, network diagrams
- Prevent incidents with sound security controls and awareness
- Use layered detection tools:
- IDS/IPS, antivirus, file integrity tools
- Let outsiders report incidents (publish contact methods)
- Establish baseline logging and auditing:
- More detailed on critical systems
NIST Recommendations (Cont.)
- Profile network/system behavior to detect anomalies
- Understand normal vs. abnormal activity
- Develop