When to Use AI vs. Standard Automation
Not every automation needs AI. Adding AI where it’s not needed adds cost, latency, and unpredictability.
Use AI when:
- Classifying or categorizing unstructured input (emails, tickets, form responses)
- Extracting structured data from unstructured text (invoices, contracts, emails)
- Generating personalized text (follow-up emails, summaries, reports)
- Making decisions that require understanding context, not just matching rules
- Processing natural language where simple keyword matching would fail
Do not use AI when:
- A simple if/else rule handles it reliably
- Input is already structured data — use data transformation instead
- Output must be 100% deterministic — AI introduces variance
- Cost per execution would make the automation uneconomical at scale
Default Models (March 2026 — reviewed quarterly)
| Use Case | Model | Notes |
|---|---|---|
| General generation, summarization | Claude 3.5 Sonnet | Strong reasoning, cost-effective |
| Complex multi-step reasoning | Claude 3 Opus | Higher cost — use only when needed |
| Fast classification / simple tasks | GPT-4o Mini | Low cost, high speed |
| Structured JSON extraction | GPT-4o or Claude 3.5 Sonnet | Both handle JSON output well |
| Image / document processing | GPT-4o Vision | When visual input is required |
Note: Mahmoud reviews and updates this table every quarter.
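The defaults above can be encoded as a single routing table so workflows select a model by task type instead of hard-coding model names in each step. A minimal sketch; the task keys are illustrative and should be updated alongside the table each quarter:

```python
# Default model per task type, mirroring the quarterly-reviewed table above.
# Task keys are illustrative, not a fixed taxonomy.
DEFAULT_MODELS = {
    "generation": "Claude 3.5 Sonnet",
    "complex_reasoning": "Claude 3 Opus",
    "classification": "GPT-4o Mini",
    "extraction": "GPT-4o",
    "vision": "GPT-4o Vision",
}

def model_for(task: str) -> str:
    """Return the default model for a task, failing loudly on unknown tasks."""
    try:
        return DEFAULT_MODELS[task]
    except KeyError:
        raise ValueError(f"No default model for task {task!r}; see the model table")
```

Failing loudly on an unknown task key keeps model selection auditable: a new task type forces a deliberate table update rather than silently falling back to an arbitrary model.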
Prompt Engineering Standards
Prompt quality determines AI output quality. Every production prompt must be written, reviewed, and version-controlled.
Prompt Structure
SYSTEM PROMPT:
You are [specific role relevant to the task].
Your job is to [one clear objective].
Rules:
- [Explicit constraint 1]
- [Explicit constraint 2]
- Always respond in [JSON / plain text / structured list]
- If [edge case condition], [specific instruction]
- Never [explicit prohibition]
USER PROMPT:
[Use XML tags to delimit variable content from instructions]
<email>
{{email_body}}
</email>
Classify this email and respond with only valid JSON:
{"category": "...", "priority": "...", "summary": "..."}
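The `{{email_body}}` placeholder in the template above can be filled with a small substitution helper that refuses to render if a variable is missing. A sketch, assuming `{{name}}`-style placeholders; the function name and regex approach are illustrative, not an established utility:

```python
# Render a user prompt by substituting {{variables}}, keeping dynamic
# content inside XML tags so it cannot be confused with instructions.
import re

USER_PROMPT_TEMPLATE = """<email>
{{email_body}}
</email>
Classify this email and respond with only valid JSON:
{"category": "...", "priority": "...", "summary": "..."}"""

def render_prompt(template: str, variables: dict) -> str:
    """Substitute {{key}} placeholders; raise if any key is missing."""
    def substitute(match):
        key = match.group(1)
        if key not in variables:
            raise KeyError(f"Missing prompt variable: {key}")
        return str(variables[key])
    return re.sub(r"\{\{(\w+)\}\}", substitute, template)
```

Note that the literal JSON schema in the template uses single braces, so it passes through untouched; only double-braced placeholders are substituted.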
Prompt Writing Rules
- One objective per prompt. Two tasks = two prompts.
- Specify output format explicitly. Need JSON? Say “respond with only valid JSON” and provide the exact schema.
- Delimit variable content. Use XML tags (`<email>`, `<document>`, `<input>`) to separate dynamic data from instructions.
- Define edge cases. Tell the AI what to do if input is blank, in a different language, or in an unexpected format.
- Test adversarially. What happens with blank input? Gibberish? A different language?
- Set temperature correctly. Classification/extraction: 0. Creative generation: 0.3–0.7.
Version Control
- Prompts stored in GitHub under `/workflows` as `.txt` or `.md` files
- Every change is a commit: `docs: update classification prompt v[X] — [reason]`
- Never change a production prompt without testing on 10+ real examples first
Handling AI Output
AI output must never be passed directly into downstream systems without validation.
Validation Rules
- JSON output: Parse and validate schema before using any field. If parsing fails → error path.
- Classification: Validate returned class is in the expected set. Unexpected value → fallback or human review queue.
- Generated text: Check minimum length, that key fields are populated, and that the text is free of obvious failure modes (e.g., refusals or placeholder text) before sending to a client or end user.
- Extracted data: Validate field formats (email, date, numeric range) before writing to destination systems.
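The JSON and classification rules above can be combined into one validator that either returns clean data or raises, so the workflow's error path is triggered uniformly. A sketch; the schema fields and category set are assumptions for illustration:

```python
# Parse and validate AI JSON output before any downstream use.
# ALLOWED_CATEGORIES and REQUIRED_FIELDS are illustrative examples.
import json

ALLOWED_CATEGORIES = {"billing", "support", "sales", "spam"}
REQUIRED_FIELDS = ("category", "priority", "summary")

def validate_ai_output(raw: str) -> dict:
    """Return validated data, or raise ValueError so the workflow
    can route to its error path / human review queue."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Output is not valid JSON: {exc}")
    for field in REQUIRED_FIELDS:
        if field not in data or not str(data[field]).strip():
            raise ValueError(f"Missing or empty field: {field}")
    if data["category"] not in ALLOWED_CATEGORIES:
        raise ValueError(f"Unexpected category: {data['category']!r}")
    return data
```

Raising a single exception type keeps the n-step workflow simple: every validation failure, whatever its cause, lands on the same fallback branch.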
Required Fallback Strategy for Every AI Step
AI succeeds + output valid → proceed to next step
AI succeeds + output invalid → [retry / default value / human queue / stop + alert]
AI fails (timeout / API error) → retry once after 30 seconds
→ still failing: alert Discord #monitoring + stop automation
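The fallback flow above can be sketched as a small wrapper around any AI call. This is one possible shape, assuming the invalid-output branch routes straight to the human queue; the `alert` callback and injectable `sleep` are illustrative:

```python
# Run an AI call with the standard fallback strategy:
# one retry after a delay on failure, alert + stop otherwise.
import time

def run_ai_step(call, validate, alert, sleep=time.sleep, retry_delay=30):
    """call() performs the AI request; validate() checks its output;
    alert() notifies monitoring (e.g., the Discord #monitoring channel)."""
    for attempt in (1, 2):
        try:
            raw = call()
        except Exception:                  # timeout / API error
            if attempt == 1:
                sleep(retry_delay)         # retry once after 30 seconds
                continue
            alert("AI step failed twice; stopping automation")
            raise
        try:
            return validate(raw)           # output valid -> proceed
        except ValueError:
            alert("AI output failed validation; routing to human queue")
            raise
```

Injecting `sleep` keeps the wrapper testable without real 30-second waits, and re-raising after the alert ensures the automation actually stops rather than continuing with bad data.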
Cost Monitoring
- Log token usage for every AI call in execution logs
- Mahmoud reviews API cost per client monthly
- If a client’s AI usage significantly exceeds estimate → flag to PMO before next invoice cycle
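The monthly review can be backed by a simple over-budget check against per-client estimates. A sketch; the 1.5× threshold for "significantly exceeds" and the dict shapes are assumptions, not policy:

```python
# Flag clients whose AI spend significantly exceeds estimate.
# threshold=1.5 is an illustrative definition of "significantly".
def flag_over_budget(usage_by_client: dict, estimates: dict,
                     threshold: float = 1.5) -> list:
    """Return clients whose actual spend exceeds estimate * threshold.
    Clients with no estimate are never flagged."""
    return [client for client, spend in usage_by_client.items()
            if spend > threshold * estimates.get(client, float("inf"))]
```

Running this before the invoice cycle turns the "flag to PMO" rule into a mechanical check instead of a manual scan of execution logs.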