The GameDay is over. Systems are restored. Participants are gathered. Now comes the phase that determines whether your exercise was worth the investment: extracting learning and converting it into action.
Many organizations run GameDays, collect observations, and then... nothing changes. The debrief is skipped or rushed, action items are captured but never tracked, and the same weaknesses persist until a real incident exposes them painfully.
Effective learning from GameDays requires structured debriefs that surface insights, rigorous action item tracking that ensures follow-through, and documentation that builds organizational memory. This is where GameDays deliver their highest return on investment.
By the end of this page, you will understand how to conduct effective debriefs that surface actionable insights, categorize and prioritize findings, create action items that actually get completed, document learnings for organizational memory, and build the feedback loop that makes each GameDay more valuable than the last.
A structured debrief ensures all participants contribute, diverse perspectives emerge, and the conversation stays productive rather than devolving into blame or celebration without substance.
The ideal debrief occurs immediately after the exercise, lasts 30-60 minutes, and includes all exercise participants. If time permits, extending debriefs to 90 minutes for particularly rich GameDays can be valuable.
Facilitating effective discussion:
The debrief facilitator (often the Game Master, but can be a separate role) must keep discussion productive:
The debrief must be blameless. If participants fear that their honest observations will lead to negative consequences—for themselves or colleagues—they'll sanitize their input. The goal is systemic improvement, not individual accountability. Mistakes made during a learning exercise are not performance failures—they're the discoveries that make the exercise valuable.
The quality of debrief output depends on the quality of questions asked. Generic prompts produce generic answers. Targeted questions uncover specific, actionable insights.
Question categories for comprehensive coverage:
The 'Five Whys' for significant findings:
For particularly important observations, use the 'Five Whys' technique to find root causes:
Finding: The runbook had an outdated server hostname.

An illustrative chain might run: Why was the hostname wrong? The server was migrated last quarter. Why wasn't the runbook updated? The migration checklist didn't mention documentation. Why was it missing from the checklist? No one owns the runbook. Why does no one own it? Ownership was never assigned. Why was it never assigned? There is no documentation ownership or maintenance process.

Root cause: Lack of documentation ownership and maintenance processes.
This root cause suggests systemic improvements (documentation ownership, change management procedures) rather than just fixing the one hostname.
Pay special attention to moments where things worked out only due to luck: 'Luckily, Alice happened to be online and knew the workaround.' These 'luckys' are hidden fragility—in a different circumstance, they would have been failures. Convert luckys into action items that remove the luck dependency.
GameDays typically produce a mix of findings across several categories. Organizing findings by category helps ensure follow-up work addresses systemic issues rather than just surface symptoms.
| Category | Description | Example Findings | Typical Resolution |
|---|---|---|---|
| Technical Gaps | Missing or broken technical capabilities | Failover didn't complete, monitoring gap, missing automation | Engineering work: code changes, infrastructure updates, tool implementation |
| Documentation Issues | Missing, outdated, or unclear documentation | Runbook steps outdated, architecture diagram missing, procedure unclear | Documentation updates, ownership assignment, review processes |
| Process Gaps | Missing or ineffective procedures | Unclear escalation path, no customer notification process, undefined roles | Process definition, workflow creation, responsibility assignment |
| Knowledge Gaps | Missing skills or system understanding | On-call unfamiliar with system, diagnostic tooling unknown, tribal knowledge | Training, cross-training, documentation, knowledge sharing sessions |
| Tooling Deficiencies | Lacking tools or tool problems | Dashboard missing key metrics, log searches slow, missing permissions | Tool improvements, access provisioning, dashboard creation |
| Organizational Issues | Cultural or structural challenges | Hesitation to escalate, unclear ownership, siloed teams | Leadership discussions, team structure changes, culture initiatives |
Prioritizing findings:
Not all findings are equally important. Prioritize based on:
Severity: How bad would the impact be if this issue manifested during a real incident?
Likelihood: How often would this issue actually manifest?
Effort: How much work is required to address?
Prioritize high-severity, high-likelihood items first. Quick wins (low-effort fixes for real problems) should be addressed immediately.
A rich GameDay might produce 20+ findings. Don't try to address all of them immediately. Select the top 3-5 highest-priority items for immediate action. Track the rest as backlog to be addressed in coming weeks or future sprints. Attempting to fix everything at once leads to nothing getting fixed well.
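The severity/likelihood/effort triage above can be made mechanical when a findings list is long. The sketch below is one illustrative scoring scheme, not a prescribed formula; the 1-3 scales and the division by effort are assumptions chosen so that quick wins surface near the top:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    """One GameDay finding, scored 1 (low) to 3 (high) on each axis."""
    title: str
    severity: int    # impact if this manifested during a real incident
    likelihood: int  # how often it would actually manifest
    effort: int      # work required to address

    @property
    def priority(self) -> float:
        # High severity and likelihood raise priority; high effort
        # lowers it, so low-effort fixes for real problems rank early.
        return (self.severity * self.likelihood) / self.effort

# Hypothetical findings for illustration.
findings = [
    Finding("Runbook hostname outdated", severity=2, likelihood=3, effort=1),
    Finding("Failover did not complete", severity=3, likelihood=2, effort=3),
    Finding("Dashboard missing key metric", severity=2, likelihood=2, effort=1),
]

# Work the top 3-5 first; track the rest as backlog.
for f in sorted(findings, key=lambda f: f.priority, reverse=True):
    print(f"{f.priority:.1f}  {f.title}")
```

Any monotonic combination of the three factors works; the point is to force an explicit ranking rather than debating all 20+ findings at once.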
Findings become improvements only when they are converted into specific, actionable, owned tasks. Vague action items become orphaned items that never get done.
Characteristics of effective action items:
Action item template:
```markdown
## Action Item: [Concise Title]

**Finding:** [The observation from the GameDay]

**Impact:** [Why this matters—what happens if we don't fix it]

**Proposed Solution:** [Specific action to take]

**Owner:** [Single person responsible for completion—not a team]

**Due Date:** [Specific date or sprint/quarter]

**Success Criteria:** [How we know it's done]

**Priority:** [Critical/High/Medium/Low]

**Notes:** [Any context, dependencies, or alternatives considered]
```

Tracking action items to completion:
Action items tracked in a shared document that no one looks at are action items that never get done. Effective tracking requires:
Many organizations have 'action item graveyards'—tracking systems full of items from past incidents and exercises that never got addressed. This undermines the entire practice. If you consistently generate more action items than you complete, either generate fewer items (focusing on the most critical) or allocate more capacity for follow-through. Tracked-but-ignored items are worse than untracked items because they create false comfort.
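Whatever the tracking tool, the mechanics reduce to regularly surfacing open items that have slipped. A minimal sketch, assuming items carry the fields from the action item template (the statuses and example data here are hypothetical):

```python
from datetime import date

# Illustrative action-item records; fields mirror the template above.
items = [
    {"id": 1, "action": "Fix failover timeout", "owner": "alice",
     "due": date(2024, 3, 1), "status": "open"},
    {"id": 2, "action": "Assign runbook owner", "owner": "bob",
     "due": date(2024, 2, 15), "status": "done"},
]

def overdue(items, today):
    """Return open items past their due date. Reviewing this list
    every sprint is what keeps the tracker from becoming an
    'action item graveyard'."""
    return [i for i in items if i["status"] == "open" and i["due"] < today]

for item in overdue(items, today=date(2024, 3, 10)):
    print(f"OVERDUE: #{item['id']} {item['action']} (owner: {item['owner']})")
```

The key design choice is that the overdue report is generated, not maintained by hand: stale items surface automatically instead of waiting for someone to remember them.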
GameDay learnings should persist beyond the immediate follow-up. Documentation creates organizational memory that benefits team members who didn't participate, new hires, and future planning efforts.
Essential GameDay documentation:
The GameDay Report template:
```markdown
# GameDay Report: [Exercise Name]

## Executive Summary
[2-3 sentences: What did we do, what did we learn, what are we doing about it]

## Exercise Details
- **Date:** [Date and time]
- **Duration:** [Actual duration]
- **Environment:** [Production/Staging/etc.]
- **Participants:** [List of roles and names]

## Objectives and Outcomes
| Objective | Outcome | Assessment |
|-----------|---------|------------|
| [Objective 1] | [What happened] | ✅ Met / ⚠️ Partial / ❌ Not Met |

## Scenario Summary
[Brief description of failure scenarios and system/team response]

## Timeline of Key Events
[Condensed timeline from the scribe notes]

## Key Findings

### What Went Well
1. [Finding 1]
2. [Finding 2]

### Areas for Improvement
1. [Finding 1 - with context and impact]
2. [Finding 2 - with context and impact]

### Surprises
1. [Unexpected observation]

## Action Items
| ID | Action | Owner | Due | Priority | Status |
|----|--------|-------|-----|----------|--------|
| 1 | [Action] | [Name] | [Date] | [P1-P4] | Open |

## Recommendations for Future GameDays
- [Suggestions for next exercise]

## Appendix
- Link to full timeline
- Link to observer notes
- Link to recording (if applicable)
```

Sharing learnings broadly:
GameDay insights often apply beyond the immediate participating teams:
Maintain a searchable archive of GameDay reports. When planning new exercises, review past reports to avoid repeating scenarios that have already been validated, and to explicitly re-test areas where previous GameDays revealed issues. This library becomes invaluable institutional knowledge.
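"Searchable" need not mean a dedicated tool: if reports live as markdown files in one directory, a grep-style helper is often enough. A minimal sketch, assuming a hypothetical `reports/` directory with one markdown file per GameDay:

```python
from pathlib import Path

def search_reports(archive_dir: str, keyword: str):
    """Return (filename, line) pairs from the report archive that
    mention the keyword, case-insensitively."""
    hits = []
    for path in sorted(Path(archive_dir).glob("*.md")):
        for line in path.read_text().splitlines():
            if keyword.lower() in line.lower():
                hits.append((path.name, line.strip()))
    return hits

# e.g. before planning a failover exercise, check what past
# GameDays already found:
# for name, line in search_reports("reports", "failover"):
#     print(name, "->", line)
```

A consistent report template (like the one above) makes this kind of simple search surprisingly effective, because findings and action items always appear under predictable headings.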
GameDays themselves should improve over time. Part of learning from each exercise is learning how to run better exercises.
Questions to ask about the exercise itself: Were the scenarios realistic and appropriately challenging? Were the scope, duration, and environment right? Did safety mechanisms and abort criteria work as intended? Did the debrief surface specific, actionable insights? Use the answers to refine the next exercise.
Closing the action item loop:
The ultimate validation of GameDay learning is whether action items actually improve resilience. This can be verified in two ways: re-run the original failure scenario in a later GameDay and confirm the fix holds, or observe that a real incident of the same class was handled better.

Track a 'Validation' field on action items: an item moves from resolved to validated only once a re-test or a real incident demonstrates the improvement.

This closes the loop from discovery to improvement to verification.
Track program-level metrics: number of GameDays per quarter, findings generated, action items completed, action items validated, time from finding to resolution. These metrics demonstrate program value and identify improvement areas for the practice itself.
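These program-level metrics fall out directly from the action-item log. A sketch of the computation, where the record fields and example dates are assumptions rather than a standard schema:

```python
from datetime import date

# Hypothetical action-item log spanning several GameDays.
log = [
    {"found": date(2024, 1, 10), "resolved": date(2024, 1, 24), "validated": True},
    {"found": date(2024, 1, 10), "resolved": date(2024, 2, 21), "validated": False},
    {"found": date(2024, 4, 2),  "resolved": None,              "validated": False},
]

completed = [i for i in log if i["resolved"]]
metrics = {
    "items_total": len(log),
    "items_completed": len(completed),
    "items_validated": sum(1 for i in log if i["validated"]),
    # Mean days from finding to resolution, over completed items only.
    "avg_days_to_resolve": sum(
        (i["resolved"] - i["found"]).days for i in completed) / len(completed),
}
print(metrics)
```

Trending these numbers quarter over quarter shows whether the practice is keeping up with its own findings, or quietly building a graveyard.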
Even with good structure, debriefs can go wrong. Recognizing common pitfalls helps facilitators steer toward productive outcomes.
| Pitfall | Signs | How to Address |
|---|---|---|
| The Blame Game | Discussion focuses on who made mistakes rather than systemic causes | Redirect: 'What about our systems or processes allowed this to happen?' Reinforce blamelessness. |
| Victory Lap Syndrome | Team celebrates success without examining what could be better | Prompt explicitly: 'If we ran this at 3 AM with junior on-call, would it go as well?' Seek hidden fragility. |
| Dominant Voices | One or two people do all the talking; others stay silent | Use structured rounds: 'Let's hear one observation from each person.' Directly invite quiet participants. |
| Scope Creep | Discussion expands beyond the exercise to general system complaints | Acknowledge, capture for later: 'That's valid—let's table it for a separate discussion and stay focused on today's exercise.' |
| Solution Jumping | Rushing to 'how do we fix it' before fully understanding the finding | Pause: 'Before solutions, let's make sure we understand the problem. Why did this happen? What's the root cause?' |
| Vague Conclusions | Findings sound profound but aren't actionable: 'We need better communication' | Demand specificity: 'What specifically about communication? Between whom? During which phase?' |
| No Action Items | Discussion is interesting but ends without concrete next steps | Block time specifically for action item generation. Don't leave without at least 3 specific, owned items. |
| Meeting Fatigue | Participants are exhausted after the exercise and rush through debrief | Take a 10-minute break first. Consider scheduling debrief for next morning if exercise ends late. |
The facilitator's meta-observation:
Good facilitators observe not just the exercise, but the debrief itself. After a few debriefs, patterns emerge:
Address these patterns proactively. A dysfunctional debrief process undermines the entire GameDay practice.
The value of GameDays is realized in the learning phase—through effective debriefs, rigorous action tracking, and organizational memory building. Let's consolidate the essential practices:
What's next:
With planning, execution, and learning covered, one question remains: How often should you run GameDays, and how do you sustain the practice over time? The final page addresses GameDay frequency and long-term program sustainability.
You now understand how to extract maximum value from GameDays through effective debriefs, rigorous action tracking, and persistent documentation. Learning is where chaos engineering investment pays dividends. Next, we'll explore GameDay frequency and sustaining the practice long-term.