When to Use AI vs. Standard Automation

Not every automation needs AI. Adding AI where it’s not needed adds cost, latency, and unpredictability.

Use AI when:

  • Classifying or categorizing unstructured input (emails, tickets, form responses)
  • Extracting structured data from unstructured text (invoices, contracts, emails)
  • Generating personalized text (follow-up emails, summaries, reports)
  • Making decisions that require understanding context, not just matching rules
  • Processing natural language where simple keyword matching would fail

Do not use AI when:

  • A simple if/else rule handles it reliably
  • Input is already structured data — use data transformation instead
  • Output must be 100% deterministic — AI introduces variance
  • Cost per execution would make the automation uneconomical at scale
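The decision above can be enforced as a pre-check that routes a record to a plain rule, an AI step, or human review before any model is called. A minimal sketch; the field names (`amount`, `free_text`) are hypothetical examples, not a shared schema:

```python
def route(record: dict) -> str:
    """Decide whether a record needs an AI step or a deterministic rule."""
    # Structured numeric field: a simple if/else rule handles it reliably.
    if "amount" in record and record["amount"] is not None:
        return "rule"
    # Unstructured free text that needs context to interpret: use AI.
    if record.get("free_text"):
        return "ai"
    # Neither applies cleanly: send to a human.
    return "human_review"
```

Running the pre-check costs nothing; calling a model on input a rule could have handled costs money on every execution.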

Default Models (March 2026 — reviewed quarterly)

  Use Case                            Model                          Notes
  General generation, summarization   Claude 3.5 Sonnet              Strong reasoning, cost-effective
  Complex multi-step reasoning        Claude 3 Opus                  Higher cost; use only when needed
  Fast classification / simple tasks  GPT-4o Mini                    Low cost, high speed
  Structured JSON extraction          GPT-4o or Claude 3.5 Sonnet    Both handle JSON output well
  Image / document processing         GPT-4o Vision                  When visual input is required

Note: Mahmoud reviews and updates this table every quarter.


Prompt Engineering Standards

Prompt quality determines AI output quality. Every production prompt must be written, reviewed, and version-controlled.

Prompt Structure

SYSTEM PROMPT:
  You are [specific role relevant to the task].
  Your job is to [one clear objective].

  Rules:
  - [Explicit constraint 1]
  - [Explicit constraint 2]
  - Always respond in [JSON / plain text / structured list]
  - If [edge case condition], [specific instruction]
  - Never [explicit prohibition]

USER PROMPT:
  [Use XML tags to delimit variable content from instructions]

  <email>
  {{email_body}}
  </email>

  Classify this email and respond with only valid JSON:
  {"category": "...", "priority": "...", "summary": "..."}
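A sketch of assembling that user prompt in code, so the XML delimiters are applied consistently rather than by hand. The function name and template are illustrative, not an existing shared helper:

```python
def build_user_prompt(email_body: str) -> str:
    """Wrap dynamic content in XML tags so the model can distinguish
    data from instructions, matching the template above."""
    return (
        "<email>\n"
        f"{email_body}\n"
        "</email>\n\n"
        "Classify this email and respond with only valid JSON:\n"
        '{"category": "...", "priority": "...", "summary": "..."}'
    )
```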

Prompt Writing Rules

  1. One objective per prompt. Two tasks = two prompts.
  2. Specify output format explicitly. Need JSON? Say “respond with only valid JSON” and provide the exact schema.
  3. Delimit variable content. Use XML tags (<email>, <document>, <input>) to separate dynamic data from instructions.
  4. Define edge cases. Tell the AI what to do if input is blank, in a different language, or in an unexpected format.
  5. Test adversarially. What happens with blank input? Gibberish? A different language?
  6. Set temperature correctly. Classification/extraction: 0. Creative generation: 0.3–0.7.
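Rule 5 can be made repeatable with a small harness that runs the standard edge cases through a prompt and checks the result stays inside the expected label set. A sketch, assuming a `classify()` callable that returns a label; the test inputs are illustrative:

```python
# Standard adversarial inputs: blank, gibberish, another language,
# and a malformed / unexpected format.
ADVERSARIAL_CASES = [
    "",
    "asdf qwer zxcv",
    "Bonjour, j'ai un probleme de facturation",
    "{" * 500,
]

def run_adversarial_suite(classify, expected_classes):
    """Run each edge case through classify() and record whether the
    returned label stays inside the expected set."""
    results = {}
    for case in ADVERSARIAL_CASES:
        results[case] = classify(case) in expected_classes
    return results
```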

Version Control

  • Prompts are stored in GitHub under /workflows as .txt or .md files
  • Every change is a commit: docs: update classification prompt v[X] — [reason]
  • Never change a production prompt without testing on 10+ real examples first

Handling AI Output

AI output must never be passed directly into downstream systems without validation.

Validation Rules

  • JSON output: Parse and validate schema before using any field. If parsing fails → error path.
  • Classification: Validate returned class is in the expected set. Unexpected value → fallback or human review queue.
  • Generated text: Check minimum length, that key fields are populated, and that obvious failure text (e.g. refusals, apologies, placeholder tokens) is absent before sending to a client or end user.
  • Extracted data: Validate field formats (email, date, numeric range) before writing to destination systems.
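For a JSON classification output, the first three rules collapse into a single gate function that either returns parsed data or an error reason for the error path. A sketch; the field names and category set are examples, not a fixed schema:

```python
import json

# Example label set; replace with the classes your prompt defines.
EXPECTED_CATEGORIES = {"billing", "support", "sales", "other"}

def validate_ai_output(raw: str):
    """Parse and validate AI output before any downstream use.
    Returns (parsed_dict, None) on success or (None, reason) on failure."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None, "parse_error"          # parsing fails -> error path
    for field in ("category", "priority", "summary"):
        if field not in data:
            return None, f"missing_{field}"
    if data["category"] not in EXPECTED_CATEGORIES:
        return None, "unexpected_category"  # -> fallback or human review
    return data, None
```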

Required Fallback Strategy for Every AI Step

AI succeeds + output valid      → proceed to next step
AI succeeds + output invalid    → [retry / default value / human queue / stop + alert]
AI fails (timeout / API error)  → retry once after 30 seconds
                                   → still failing: alert Discord #monitoring + stop automation
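The fallback ladder above can be sketched as a wrapper around any AI step. The `ai_call`, `validate`, and `alert` callables are placeholders for the real workflow pieces (the actual Discord alert is out of scope here), and this sketch picks "human queue" from the bracketed options for invalid output:

```python
import time

def call_with_fallback(ai_call, validate, alert, retry_delay=30):
    """Initial call plus one retry on API failure; valid output proceeds,
    invalid output goes to the human queue, double failure alerts and stops."""
    for attempt in range(2):                 # initial call + one retry
        try:
            raw = ai_call()
        except Exception:                    # timeout / API error
            if attempt == 0:
                time.sleep(retry_delay)      # retry once after the delay
                continue
            alert("AI step failed twice; stopping automation")
            return {"status": "stopped"}
        data, err = validate(raw)
        if err is None:
            return {"status": "ok", "data": data}
        # Invalid output: one of [retry / default / human queue / stop].
        return {"status": "human_queue", "reason": err}
```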

Cost Monitoring

  • Log token usage for every AI call in execution logs
  • Mahmoud reviews API cost per client monthly
  • If a client’s AI usage significantly exceeds estimate → flag to PMO before next invoice cycle
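A sketch of the per-call log entry with an estimated cost attached. The per-1K-token rates below are placeholders, not real pricing; look up the current provider rates before using anything like this for billing:

```python
from datetime import datetime, timezone

# Placeholder rates per 1K tokens; NOT real provider pricing.
RATES_PER_1K = {"input": 0.003, "output": 0.015}

def log_ai_call(client: str, model: str, input_tokens: int, output_tokens: int) -> dict:
    """Build a structured execution-log entry with estimated cost per call."""
    cost = (input_tokens / 1000) * RATES_PER_1K["input"] \
         + (output_tokens / 1000) * RATES_PER_1K["output"]
    return {
        "ts": datetime.now(timezone.utc).isoformat(),
        "client": client,
        "model": model,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "estimated_cost_usd": round(cost, 6),
    }
```

Summing `estimated_cost_usd` per client over a month gives the figure for the monthly review.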