Prompt Cache

A lightweight caching layer that avoids regenerating identical content. In production it saved approximately 60% of API quota by catching duplicate prompts before they reached the API.

How It Works

1. Normalize the prompt (lowercase, collapse whitespace).
2. Combine it with the context keys (for example user name, language, model).
3. Hash the combined key with SHA-256.
4. Check the cache table for an existing result.
5. On a miss, call the API and store the result. On a hit, return the cached result immediately.
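
The first three steps can be sketched as follows. This is a minimal illustration, not the contents of scripts/prompt_cache.py: the function names normalize_prompt and make_cache_key are hypothetical, and joining the parts through JSON is an assumed way of keeping field boundaries unambiguous.

```python
import hashlib
import json
import re


def normalize_prompt(prompt: str) -> str:
    """Lowercase the prompt and collapse runs of whitespace to single spaces."""
    return re.sub(r"\s+", " ", prompt.strip().lower())


def make_cache_key(prompt: str, *context: str) -> str:
    """Return the SHA-256 hex digest of the normalized prompt plus context values."""
    # Serializing the parts as a JSON list keeps field boundaries explicit,
    # so ("ab", "c") and ("a", "bc") cannot produce the same key.
    parts = [normalize_prompt(prompt), *context]
    combined = json.dumps(parts, ensure_ascii=False)
    return hashlib.sha256(combined.encode("utf-8")).hexdigest()
```

With this sketch, "Tell  me a STORY" and "tell me a story" map to the same key for the same context, while changing any context value (such as the language) yields a different key.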

Usage

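import prompt_cache

# Example wrapper (get_story is an illustrative name).
# generate_story is your own API-calling coroutine, not part of prompt_cache.
async def get_story(prompt, child_name, language):
    # Check the cache before calling the expensive API
    cached = await prompt_cache.get_cached(
        prompt=prompt,
        child_name=child_name,
        language=language,
    )

    if cached:
        return cached  # Free! No API call needed.

    # Cache miss: call the API
    result = await generate_story(prompt, child_name, language)

    # Store for next time
    await prompt_cache.set_cached(prompt, child_name, language, result)
    return result

# e.g. story = await get_story("Tell me a story about clouds", "Sophie", "fr")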

Schema

CREATE TABLE IF NOT EXISTS prompt_cache (
    prompt_hash TEXT NOT NULL,
    child_name TEXT NOT NULL,
    language TEXT NOT NULL,
    story_json TEXT,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (prompt_hash, child_name, language)
);
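
As a rough illustration of how a lookup and a store might run against this table, here is a sketch using Python's built-in sqlite3 module. The choice of SQLite, the synchronous calls, and the helper names lookup and store are all assumptions; the actual implementation exposes async get_cached and set_cached and may use a different database driver.

```python
import sqlite3
from typing import Optional

SCHEMA = """
CREATE TABLE IF NOT EXISTS prompt_cache (
    prompt_hash TEXT NOT NULL,
    child_name TEXT NOT NULL,
    language TEXT NOT NULL,
    story_json TEXT,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (prompt_hash, child_name, language)
);
"""


def lookup(conn: sqlite3.Connection, prompt_hash: str,
           child_name: str, language: str) -> Optional[str]:
    """Return the cached story JSON, or None on a cache miss."""
    row = conn.execute(
        "SELECT story_json FROM prompt_cache "
        "WHERE prompt_hash = ? AND child_name = ? AND language = ?",
        (prompt_hash, child_name, language),
    ).fetchone()
    return row[0] if row else None


def store(conn: sqlite3.Connection, prompt_hash: str,
          child_name: str, language: str, story_json: str) -> None:
    """Insert a result, overwriting any row with the same primary key."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO prompt_cache "
            "(prompt_hash, child_name, language, story_json) "
            "VALUES (?, ?, ?, ?)",
            (prompt_hash, child_name, language, story_json),
        )


# Hypothetical values, using an in-memory database for demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
store(conn, "abc123", "Sophie", "fr", '{"title": "Les nuages"}')
hit = lookup(conn, "abc123", "Sophie", "fr")   # the stored JSON string
miss = lookup(conn, "abc123", "Sophie", "en")  # None: different language
```

Because the primary key covers all three columns, the same prompt hash can hold separate results per child name and language.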

Adapt the Keys

The default implementation uses (prompt, child_name, language) as the cache key. Adapt to your domain:

Chat completions: (system_prompt, user_message, model)
TTS: (text, voice_id, model_id)
Image gen: (prompt, seed, model, size)
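
One way to support different key tuples without rewriting the hashing logic is a helper that accepts arbitrary named fields. The sketch below is an assumption about how such an adaptation could look, not part of the shipped script; make_key and the model and voice names are hypothetical. Text fields should still be normalized before being passed in, as described under How It Works.

```python
import hashlib
import json


def make_key(**fields: object) -> str:
    """Build a SHA-256 cache key from arbitrary named fields."""
    # sort_keys makes the key independent of keyword argument order.
    payload = json.dumps(fields, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical field values for each domain listed above.
chat_key = make_key(
    system_prompt="you are a storyteller.",
    user_message="tell me a story about clouds",
    model="example-chat-model",
)
tts_key = make_key(text="hello there", voice_id="voice-1", model_id="example-tts-model")
image_key = make_key(prompt="a cloud at sunset", seed=42,
                     model="example-image-model", size="512x512")
```

Including the field names in the hashed payload means a TTS key and a chat key cannot collide even if their values happen to match.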

Files

scripts/prompt_cache.py — Cache implementation (35 lines)