🎯 Learning Objectives
- Diagnose why a prompt is failing
- Apply systematic refactoring techniques
- Reduce ambiguity and instruction drift
- Design and run iterative prompt tests
- Build versioned prompt libraries
- Benchmark prompt performance across variations
1. Diagnosing Prompt Failures
| Failure Mode | Symptom | Root Cause |
|---|---|---|
| Vague Task | Generic, unfocused output | Task instruction lacks specificity |
| Missing Audience | Wrong depth or tone | No audience context provided |
| Conflicting Instructions | Diluted, compromised output | Two instructions contradict each other |
| No Format | Wall of text | Output format not specified |
| Context Overload | Model ignores key instructions | Too much competing information |
| Hallucination | Plausible but wrong facts | Prompt never asks the model to flag uncertainty |
Diagnostic prompt:
Analyse the following prompt and identify every reason it might produce a poor output.
For each issue: (1) Name the failure mode (2) Explain why (3) Suggest the specific fix.
[Paste your prompt here]
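To run the diagnostic prompt over many drafts, it can help to wrap it in a small helper. The following is a minimal Python sketch, not a prescribed implementation: the names `DIAGNOSTIC_TEMPLATE` and `build_diagnostic_prompt` are hypothetical, and sending the result to a model is left out because it depends on whichever client you use.

```python
# Template text copied from the diagnostic prompt above; {prompt} marks
# where the draft under review is pasted in.
DIAGNOSTIC_TEMPLATE = (
    "Analyse the following prompt and identify every reason it might "
    "produce a poor output.\n"
    "For each issue: (1) Name the failure mode (2) Explain why "
    "(3) Suggest the specific fix.\n\n"
    "{prompt}"
)


def build_diagnostic_prompt(prompt: str) -> str:
    """Wrap a draft prompt in the diagnostic instructions (hypothetical helper)."""
    cleaned = prompt.strip()
    if not cleaned:
        raise ValueError("prompt must not be empty")
    return DIAGNOSTIC_TEMPLATE.format(prompt=cleaned)


if __name__ == "__main__":
    print(build_diagnostic_prompt("Write a report about our validation processes."))
```

Keeping the template in one constant means every draft is diagnosed with the same wording, which makes the diagnoses easier to compare.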
2. Prompt Refactoring: The RACE Method
- R: Remove conflicting or redundant instructions
- A: Add missing components (role, audience, format)
- C: Clarify vague terms with specific definitions
- E: Enforce format and output structure explicitly
❌ Before RACE
Write a detailed but concise report about our validation processes.
Make it technical but easy to understand.
Cover everything important but keep it short.
✅ After RACE
You are a validation manager writing for a quality director.
Summarise current validation process status.
Audience: Technical but not in day-to-day operations.
Format: 3 sections (Equipment, Process, Cleaning). 3 bullets each.
Length: 250 words max. Flag items needing action: [ACTION REQUIRED].
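The "Add" step of RACE can be partly automated with a rough checklist. The sketch below is a heuristic only, assuming prompts follow labelling conventions like those in the refactored example ("You are ...", "Audience:", "Format:", "Length:"). The patterns and the function name `missing_components` are hypothetical, and a prompt that words these components differently will be reported as missing them.

```python
import re

# Patterns assumed for illustration; adapt them to your own prompt conventions.
REQUIRED_COMPONENTS = {
    "role": re.compile(r"\byou are\b", re.IGNORECASE),
    "audience": re.compile(r"\baudience\s*:", re.IGNORECASE),
    "format": re.compile(r"\bformat\s*:", re.IGNORECASE),
    "length": re.compile(r"\b(length\s*:|\d+\s+words)", re.IGNORECASE),
}


def missing_components(prompt: str) -> list[str]:
    """Return the names of components the heuristics could not find."""
    return [
        name
        for name, pattern in REQUIRED_COMPONENTS.items()
        if not pattern.search(prompt)
    ]


if __name__ == "__main__":
    before = (
        "Write a detailed but concise report about our validation processes.\n"
        "Make it technical but easy to understand.\n"
        "Cover everything important but keep it short."
    )
    print(missing_components(before))
```

A check like this only covers missing components. Conflicting or vague wording (the R and C steps) still needs a human read or the diagnostic prompt.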
3. Iterative Prompt Testing
// Test: Does adding a role improve output quality?
Version A (control):
Explain the risks of poor data integrity in pharma.
Version B (with role):
You are a regulatory auditor.
Explain the risks of poor data integrity in pharma.
// Score each on: Specificity, Relevance, Usability (1-5)
What to Test
- Role vs no role
- With examples vs without
- Format specified vs unspecified
- Word limit vs no limit
- Chain-of-thought vs direct answer
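Scores from a test like the one above are easier to compare once they are averaged per version. The following is a minimal sketch that assumes human reviewers supply the 1-5 scores; the `Rating` class and `summarise` function are hypothetical names, and the scores in the usage lines are placeholders, not measured results.

```python
from dataclasses import dataclass
from statistics import mean

CRITERIA = ("specificity", "relevance", "usability")


@dataclass
class Rating:
    """One reviewer's 1-5 scores for a single output of a prompt version."""
    version: str
    specificity: int
    relevance: int
    usability: int

    def __post_init__(self) -> None:
        # Reject scores outside the 1-5 scale used in this module.
        for name in CRITERIA:
            score = getattr(self, name)
            if not 1 <= score <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {score}")


def summarise(ratings: list[Rating]) -> dict[str, dict[str, float]]:
    """Average each criterion per version and add an overall mean."""
    summary: dict[str, dict[str, float]] = {}
    for version in sorted({r.version for r in ratings}):
        rows = [r for r in ratings if r.version == version]
        averages = {name: mean(getattr(r, name) for r in rows) for name in CRITERIA}
        averages["overall"] = mean(averages[name] for name in CRITERIA)
        summary[version] = averages
    return summary


if __name__ == "__main__":
    # Placeholder scores for illustration only.
    ratings = [Rating("A", 3, 3, 2), Rating("B", 4, 5, 4)]
    for version, averages in summarise(ratings).items():
        print(version, averages)
```

Collecting several ratings per version, and changing only one item from the list above between versions, makes it more likely that a difference in the averages reflects the change being tested.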
4. Versioning & Prompt Libraries
PROMPT CARD
──────────────────────────────────────
ID: QA-004
Category: Quality Assurance
Use Case: SOP Compliance Review
Version: v2.1 (2025-03)
Quality Score: 4.2/5.0
Human Review: Required - QA specialist
──────────────────────────────────────
PROMPT:
[Full prompt text here]
KNOWN LIMITATIONS:
- Does not catch formatting-only issues
- Regulatory ref numbers must be verified manually
──────────────────────────────────────
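A prompt card maps naturally onto a small record type, which makes a library searchable and keeps old versions intact. The sketch below is one possible shape under stated assumptions: `PromptCard`, `revise`, and `PromptLibrary` are hypothetical names, the library is held in memory only, and a revised card starts with no quality score because the previous score described a different prompt text.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class PromptCard:
    """Hypothetical record mirroring the fields of the prompt card above."""
    card_id: str
    category: str
    use_case: str
    version: str
    prompt: str
    human_review: str
    quality_score: Optional[float] = None  # None until this version is scored
    known_limitations: tuple[str, ...] = ()


def revise(card: PromptCard, new_version: str, new_prompt: str) -> PromptCard:
    """Return a new card for a revised prompt; the original card is unchanged."""
    return replace(card, version=new_version, prompt=new_prompt, quality_score=None)


class PromptLibrary:
    """In-memory store keyed by (card ID, version) so no version is overwritten."""

    def __init__(self) -> None:
        self._cards: dict[tuple[str, str], PromptCard] = {}

    def add(self, card: PromptCard) -> None:
        key = (card.card_id, card.version)
        if key in self._cards:
            raise ValueError(f"{card.card_id} {card.version} already exists")
        self._cards[key] = card

    def history(self, card_id: str) -> list[PromptCard]:
        """All stored versions of one card, in the order they were added."""
        return [c for (cid, _), c in self._cards.items() if cid == card_id]
```

Making the card immutable (`frozen=True`) forces every change through `revise`, so each edit to a prompt produces a new, separately scored version.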
✏️ Module 09 Exercise
Take a prompt producing disappointing results. Run it through the diagnostic prompt. Categorise the failure mode. Apply the RACE method. Test before and after versions side by side. Document the single change that made the biggest difference.
Key Takeaways
- Prompt failure has diagnosable root causes: always diagnose before fixing
- The RACE method (Remove, Add, Clarify, Enforce) is a systematic refactoring framework
- Isolate one variable at a time when A/B testing prompts
- Prompt libraries transform individual expertise into team capability
- Version control for prompts is as important as version control for code
- The highest-performing prompts result from multiple refactoring cycles, not first drafts