/feature-results - Post-Launch Analysis
Document feature results and capture learnings.
Quick Start
/feature-results
Then provide:
- Feature name and ship date
- Original hypothesis (or I'll pull it from your PRD)
- Actual results (metrics, or I'll query your analytics MCP)
I'll generate a results doc with: executive summary, results vs targets, root cause analysis (if missed), segment breakdown, estimate calibration, stakeholder communication guidance, and concrete next steps.
Output: Saved to thoughts/shared/pm/analyses/feature-results-[name]-[date].md
Time: ~10 min with data ready, ~20 min if we need to dig
Pre-Launch Detection
Before generating a results analysis, verify the feature has actually shipped.
Check for launch signals:
- Does the PRD in
thoughts/shared/pm/prds/show status "Shipped" or "Launched"? - Does the PM mention actual results data (not projections)?
If the feature hasn't launched yet:
It looks like [feature] hasn't shipped yet. Post-launch analysis works best with real data.
For pre-launch work, you might want:
- `/feature-metrics` - Define what you'll measure before launch
- `/impact-sizing` - Estimate expected impact
- `/experiment-decision` - Plan your A/B test or rollout
- `/launch-checklist` - Make sure you're ready to ship
Want to proceed anyway (e.g., analyzing a soft launch or pilot), or switch to one of these?
When to Use
- After A/B test completes
- After feature reaches full rollout
- During quarterly reviews
- For portfolio analysis
Quick Start Prompt
When PM types /feature-results, respond:
Let's document the results of your feature launch.
Tell me:
1. What feature shipped?
2. What were the original goals/metrics?
3. What actually happened?
I'll help you create a results doc that captures outcomes and learnings.
Results Doc Structure
1. Executive Summary
- One paragraph: What shipped, did it work, what's next
2. Original Hypothesis
- What we expected to happen
- Why we believed it
3. Results
- Primary metric: Target vs. Actual
- Guardrail metrics: Any issues?
- Unexpected findings
4. Analysis
- Why did results occur?
- What surprised us?
- What didn't we anticipate?
5. Learnings
- What would we do differently?
- What should we apply elsewhere?
6. Next Steps
- Keep / Iterate / Kill?
- Follow-up experiments
- Related opportunities
Output Template
# Feature Results: [Feature Name]
**Ship Date:** [Date]
**Analysis Date:** [Date]
**Owner:** [PM Name]
---
## Executive Summary
[1-2 sentences: What shipped, key result, recommendation]
**Verdict:** [Ship to 100% / Iterate / Kill]
---
## Original Hypothesis
**If we** [built X],
**then** [Y would happen],
**because** [Z assumption].
**Target:** [Primary metric] would improve by [X%]
---
## Results
### Primary Metric
| Metric | Baseline | Target | Actual | Verdict |
| ------ | -------- | ------ | ------ | ---------- |
| [Name] | [X] | [Y] | [Z] | [Hit/Miss] |
### Statistical Significance
- Sample size: [N]
- Confidence level: [X%]
- P-value: [X]
### Guardrail Metrics
| Metric | Acceptable Range | Actual | Status |
| -------- | ---------------- | -------- | ------------ |
| [Metric] | [range] | [actual] | [OK/Warning] |
### Unexpected Findings
- [Finding 1]
- [Finding 2]
---
## Analysis
### Why These Results?
[Explain the underlying reasons for the results]
### What Surprised Us?
[Things we didn't expect]
### Segment Breakdown
| Segment | Result | Notes |
| ----------- | ------- | --------- |
| [Segment 1] | [+/-X%] | [insight] |
| [Segment 2] | [+/-X%] | [insight] |
---
## Learnings
### What Worked
- [Learning 1]
- [Learning 2]
### What Didn't Work
- [Learning 1]
- [Learning 2]
### Apply Elsewhere
- [Insight that applies to other features]
---
## Next Steps
**Recommendation:** [Ship / Iterate / Kill]
**If shipping:**
- [ ] Roll out to 100%
- [ ] Monitor for [X] weeks
- [ ] Update documentation
**If iterating:**
- [ ] [Specific change 1]
- [ ] [Specific change 2]
- [ ] Re-run experiment
**If killing:**
- [ ] Document why
- [ ] Rollback plan
- [ ] Communicate to stakeholders
---
## Appendix
- [Link to experiment dashboard]
- [Link to original PRD]
- [Link to detailed data]
Root Cause Analysis
When results miss targets, diagnose using this framework:
Discovery Problem (Users don't know it exists)
- Check: Feature awareness rate, in-app visibility, announcement reach
- Common causes: Buried in UI, no onboarding tooltip, weak announcement
- Fix: Improve discovery through tooltips, email campaign, in-app prompts
Quality Problem (Users try it but it doesn't work well)
- Check: Error rates, completion rates, time-on-task, feedback sentiment
- Common causes: Bugs, poor UX, confusing workflow, slow performance
- Fix: Fix bugs, simplify UX, improve performance
Relevance Problem (Users try it but it doesn't solve their problem)
- Check: Use case analysis, segment breakdown, feature-problem fit
- Common causes: Wrong target audience, edge cases not handled, misaligned with actual needs
- Fix: Narrow targeting, add missing use cases, revisit problem definition
Frequency Problem (Users use it once but don't come back)
- Check: Repeat usage rates, habit formation metrics, triggers for return
- Common causes: Not integrated into daily workflow, one-time value, no triggers
- Fix: Add recurring value hooks, notifications, integration with other workflows
Estimate vs Actual Calibration
| Metric | Original Estimate | Actual Result | Accuracy |
|---|---|---|---|
| [Primary metric] | [From impact-sizing] | [Actual] | [Over/Under by X%] |
| [Secondary metric] | [From impact-sizing] | [Actual] | [Over/Under by X%] |
| [Guardrail metric] | [Expected range] | [Actual] | [Within/Outside range] |
Calibration insight: [Are we consistently over/under-estimating? By how much? What should we adjust for next time?]
Historical calibration pattern:
- Look at past 3-5 feature results. If we consistently over-estimate by 30-50%, apply a discount factor to future impact-sizing.
- If we consistently under-estimate, we may be too conservative or missing amplification effects.
- Track this over time to improve estimation accuracy.
Segment Breakdown Guidance
Always analyze results across these default segments. Segment analysis often reveals that an overall miss is actually a win for one segment and a miss for another.
| Segment Dimension | Why It Matters |
|---|---|
| New vs existing users | New users may not find the feature; existing users may resist change |
| Plan tier (Free/Team/Business/Enterprise) | Feature value often varies dramatically by tier |
| Company size | Small teams adopt differently than large orgs |
| Power users vs casual users | Power users may love it; casual users may be confused |
| Geographic region (if applicable) | Cultural, language, and connectivity differences |
Segment analysis template:
| Segment | N (sample) | Primary Metric | vs Target | Insight |
|---|---|---|---|---|
| New users | [N] | [result] | [+/-X%] | [why] |
| Existing users | [N] | [result] | [+/-X%] | [why] |
| Free tier | [N] | [result] | [+/-X%] | [why] |
| Paid tier | [N] | [result] | [+/-X%] | [why] |
| Power users | [N] | [result] | [+/-X%] | [why] |
| Casual users | [N] | [result] | [+/-X%] | [why] |
How to Communicate Results
If results beat targets: Lead with the win, credit the team, connect to strategy. Share in #product-team, include in weekly status update, highlight in next board deck.
If results miss targets: Lead with what you learned, not what went wrong. Frame as: "Here's what we hypothesized, here's what actually happened, here's what we're doing about it." Never hide bad results -- own them fast.
If results are ambiguous: Be honest about uncertainty. "We saw X but can't conclusively attribute it to Y because Z. Here's our plan to get clearer signal."
Audience-Specific Framing
| Audience | Lead With | Include | Avoid |
|---|---|---|---|
| CEO/Board | Impact on key business metrics + strategic implications | Revenue impact, user growth, competitive position | Technical details, minor metrics |
| CPO/Manager | Learnings + next steps + resource ask | What worked, what didn't, proposed plan | Blame, excuses |
| Engineering team | What worked technically + what to improve | Performance data, technical wins, tech debt notes | Business jargon |
| Sales/CS | Customer-facing impact + talking points | User quotes, feature highlights, objection handlers | Internal debates, negative framing |
| Design | User behavior changes + UX insights | Usage patterns, qualitative feedback, heatmaps | Pure metric tables without context |
Example Patterns
Pattern 1: Clear Win
Scenario: Feature shipped, primary metric exceeded target by 20%. Guardrails held.
Root cause: N/A (success)
Action: Scale to 100%, plan v2 based on segment insights, draft announcement via /catalyst-pm-ops:slack-message, update /catalyst-pm-ops:status-update with win.
Communication: Lead with the number, credit the team, connect to quarterly goals.
Pattern 2: Mixed Results
Scenario: Primary metric hit target but guardrail metric degraded (e.g., page load time increased 15%).
Root cause: Quality problem -- feature added weight to core flow.
Action: Investigate quality regression, hold at current rollout %, fix guardrail before scaling. Create follow-up ticket via /catalyst-pm-ops:create-tickets.
Communication: "Feature is working as intended for users, but we identified a performance regression we need to fix before full rollout."
Pattern 3: Clear Miss
Scenario: Primary metric at 60% of target after 4 weeks. Root cause: Discovery problem -- only 12% of eligible users found the feature. Among those who found it, conversion was above target. Action: Don't kill the feature -- improve discovery (add tooltip, email campaign, onboarding mention), re-measure in 4 weeks. Communication: "The feature works for users who find it, but we have a distribution problem. Plan: improve discovery and re-measure."
Pattern 4: Kill Decision
Scenario: Primary metric at 30% of target after 6 weeks. Root cause is relevance -- users tried it but it doesn't solve their actual problem.
Root cause: Relevance problem -- hypothesis was wrong.
Action: Document the learning, roll back, redirect engineering effort. Draft decision doc via /decision-doc.
Communication: "We tested our hypothesis that X would solve Y. Data shows it doesn't. Here's what we learned and where we're redirecting effort."
Tips
- Be honest about failures - Negative results are still valuable
- Capture learnings immediately - Don't wait; context fades
- Share widely - Others can learn from your experiments
- Connect to strategy - How does this inform our roadmap?
- Run root cause analysis - Don't just report the miss; diagnose why
- Check segments - An overall miss may hide a segment win
- Calibrate estimates - Track accuracy over time to improve future sizing
Context Routing Strategy
When the PM uses /feature-results, I automatically:
1. Pull Original PRD & Hypothesis
Source: thoughts/shared/pm/prds/, conversation history
- What I look for: Original hypothesis, target metrics, success criteria
- How I use it: Compare actual results against original targets
- Example: Auto-populate "Original Hypothesis" section from the PRD you shared
2. Query Analytics MCPs
Source: PostHog, PostHog, Posthog, PostHog (if connected)
- What I look for: Real-time metrics for the feature being analyzed
- How I use it: Populate results tables with actual data
- Fallback: If no MCP connected, I'll ask you to provide the metrics
3. Check Related Experiments
Source: thoughts/shared/pm/metrics/, past feature results
- What I look for: Similar features' results, patterns in what works
- How I use it: Add context like "This compares to our pricing redesign which saw 12% improvement"
4. Extract Guardrail Metrics
Source: thoughts/shared/pm/metrics/, PRD success metrics section
- What I look for: Metrics that must not decline (retention, NPS, etc.)
- How I use it: Pre-populate guardrail metrics table with relevant metrics
5. Identify Learnings for Portfolio
Source: Cross-reference with other feature results
- What I look for: Patterns across multiple experiments
- How I use it: Suggest "Apply Elsewhere" section based on similar features
6. Route Results for Sharing
Routing logic:
- Ship to 100%: Draft announcement for
/catalyst-pm-ops:slack-message - Iterate: Create follow-up experiment plan with
/experiment-decision - Kill: Draft decision doc for
/decision-docexplaining sunset - Executive update: Format for
/catalyst-pm-ops:status-updatewith business impact
Output Quality Self-Check
Before delivering the feature results doc, verify:
- Executive summary is 1-2 sentences max and includes a clear verdict (Ship/Iterate/Kill)
- Original hypothesis is stated in If/Then/Because format
- Primary metric has baseline, target, and actual with statistical significance noted
- Guardrail metrics are checked and status is clear (OK/Warning/Breach)
- Root cause analysis is included if results missed targets (Discovery/Quality/Relevance/Frequency)
- Segment breakdown covers at least 3 dimensions (new vs existing, tier, user type)
- Estimate vs actual calibration table is populated with accuracy percentages
- Stakeholder communication section has audience-appropriate framing
- Next steps are specific, actionable, and have owners
- Learnings include at least one "apply elsewhere" insight
- No corporate jargon -- results are written in plain, direct language
- Linked to original PRD and any related experiments