Chat API Orchestration

Guide for implementing a single-endpoint chat API with deterministic request flow.

What This Skill Does

Design /api/{user_id}/chat endpoint contract
Implement deterministic 5-step request flow
Validate incoming chat requests
Resolve or create conversation_id
Orchestrate agent runner calls
Return structured responses

What This Skill Does NOT Do

Implement the agent/LLM logic itself
Handle conversation persistence (see conversation-state-management skill)
Manage authentication tokens (assumes auth exists)
Deploy or scale the API

Before Implementation

Gather context to ensure successful implementation:

| Source | Gather | |--------|--------| | Codebase | Existing routers, auth dependencies, error handling patterns | | Conversation | User's agent runner interface, specific response fields needed | | Skill References | API contract, request flow, error handling patterns | | User Guidelines | Project conventions, existing schema patterns |

Core Architecture

Single Endpoint Design

POST /api/{user_id}/chat
     │
     ▼
┌─────────────────────────────────────────┐
│         DETERMINISTIC FLOW              │
│                                         │
│  1. Validate Request                    │
│  2. Authenticate User                   │
│  3. Resolve Conversation                │
│  4. Execute Agent                       │
│  5. Return Response                     │
│                                         │
└─────────────────────────────────────────┘

Why Single Endpoint?

| Benefit | Description | |---------|-------------| | Simplicity | One URL to remember, one contract to maintain | | Stateless | Each request is self-contained | | Idempotent design | Can safely retry failed requests | | Clear ownership | user_id in path ensures proper scoping |

The 5-Step Deterministic Flow

Every request follows this exact sequence. No shortcuts, no variations.

┌─────────────────────────────────────────────────────────────┐
│ Step 1: VALIDATE REQUEST                                    │
│   • Check required fields (message)                         │
│   • Validate field types and constraints                    │
│   • Return 400 if invalid                                   │
└─────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│ Step 2: AUTHENTICATE USER                                   │
│   • Verify JWT from Authorization header                    │
│   • Extract user_id from token                              │
│   • Verify path user_id matches token user_id               │
│   • Return 401/403 if unauthorized                          │
└─────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│ Step 3: RESOLVE CONVERSATION                                │
│   • If conversation_id provided: load existing              │
│   • If not provided: create new conversation                │
│   • Verify conversation ownership                           │
│   • Return 404 if conversation not found                    │
└─────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│ Step 4: EXECUTE AGENT                                       │
│   • Build context from conversation history                 │
│   • Add user message to context                             │
│   • Call agent runner with context                          │
│   • Persist user message + agent response                   │
│   • Return 503 if agent fails                               │
└─────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│ Step 5: RETURN RESPONSE                                     │
│   • Format structured response                              │
│   • Include conversation_id for continuation                │
│   • Return 200 with ChatResponse                            │
└─────────────────────────────────────────────────────────────┘

Implementation Workflow

Step 1: Define Request/Response Schemas

See references/api-contract.md for complete schemas.

# schemas/chat.py
from pydantic import BaseModel, Field
from typing import Optional
from datetime import datetime


class ChatRequest(BaseModel):
    """Incoming chat request."""
    message: str = Field(..., min_length=1, max_length=10000)
    conversation_id: Optional[str] = Field(
        None,
        description="Existing conversation ID, or omit to create new"
    )
    model: Optional[str] = Field("gpt-4", description="Model to use")


class ChatResponse(BaseModel):
    """Chat API response."""
    conversation_id: str
    message: str
    role: str = "assistant"
    created_at: datetime
    usage: Optional[dict] = None

Step 2: Create the Router

See references/request-flow.md for detailed implementation.

# routers/chat.py
from fastapi import APIRouter, Depends, HTTPException, status

router = APIRouter(prefix="/api", tags=["chat"])


@router.post("/{user_id}/chat", response_model=ChatResponse)
async def chat(
    user_id: str,
    request: ChatRequest,
    current_user: AuthenticatedUser = Depends(get_current_user),
    session: Session = Depends(get_session),
    agent: AgentRunner = Depends(get_agent_runner)
):
    """
    Single chat endpoint with deterministic flow.

    1. Validate (handled by Pydantic)
    2. Authenticate (handled by dependency)
    3. Resolve conversation
    4. Execute agent
    5. Return response
    """
    # Step 2: Verify user_id matches authenticated user
    if user_id != current_user.user_id:
        raise HTTPException(
            status_code=status.HTTP_403_FORBIDDEN,
            detail="Cannot access another user's chat"
        )

    # Step 3: Resolve conversation
    conversation_id = await resolve_conversation(
        session=session,
        user_id=user_id,
        conversation_id=request.conversation_id
    )

    # Step 4: Execute agent
    response = await execute_agent(
        session=session,
        agent=agent,
        conversation_id=conversation_id,
        user_id=user_id,
        message=request.message,
        model=request.model
    )

    # Step 5: Return response
    return response

Step 3: Implement Conversation Resolution

# services/conversation.py
async def resolve_conversation(
    session: Session,
    user_id: str,
    conversation_id: Optional[str]
) -> str:
    """
    Resolve or create conversation.

    - If conversation_id provided: verify exists and owned by user
    - If not provided: create new conversation

    Returns: conversation_id (existing or new)
    """
    if conversation_id:
        # Load existing
        conv = get_conversation(session, conversation_id, user_id)
        if not conv:
            raise HTTPException(
                status_code=status.HTTP_404_NOT_FOUND,
                detail="Conversation not found"
            )
        return conv.id
    else:
        # Create new
        conv = create_conversation(session, user_id=user_id)
        return conv.id

Step 4: Implement Agent Execution

# services/agent.py
async def execute_agent(
    session: Session,
    agent: AgentRunner,
    conversation_id: str,
    user_id: str,
    message: str,
    model: str
) -> ChatResponse:
    """
    Execute agent with conversation context.

    1. Load conversation history
    2. Add user message
    3. Call agent
    4. Persist response
    5. Return structured response
    """
    # Load context
    history = get_messages(session, conversation_id, user_id)
    context = [{"role": m.role, "content": m.content} for m in history]

    # Add user message
    add_message(session, conversation_id, user_id, "user", message)
    context.append({"role": "user", "content": message})

    try:
        # Call agent
        result = await agent.run(messages=context, model=model)

        # Persist response
        add_message(session, conversation_id, user_id, "assistant", result.content)

        return ChatResponse(
            conversation_id=conversation_id,
            message=result.content,
            role="assistant",
            created_at=datetime.now(timezone.utc),
            usage=result.usage
        )
    except AgentError as e:
        raise HTTPException(
            status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
            detail=f"Agent error: {str(e)}"
        )

Error Handling

See references/error-handling.md for complete patterns.

| Step | Error | Status | Response | |------|-------|--------|----------| | 1. Validate | Invalid request | 400 | {"detail": "Field X is required"} | | 2. Auth | Missing token | 401 | {"detail": "Authentication required"} | | 2. Auth | Wrong user | 403 | {"detail": "Cannot access another user's chat"} | | 3. Resolve | Conv not found | 404 | {"detail": "Conversation not found"} | | 4. Execute | Agent fails | 503 | {"detail": "Agent error: ..."} | | * | Unexpected | 500 | {"detail": "Internal server error"} |

Idempotency Considerations

For safe retries, consider adding request deduplication:

class ChatRequest(BaseModel):
    message: str
    conversation_id: Optional[str] = None
    request_id: Optional[str] = Field(
        None,
        description="Client-generated UUID for idempotency"
    )

If request_id was seen recently, return cached response instead of re-executing.

Output Checklist

Before delivering implementation:

[ ] ChatRequest schema with message, conversation_id, model
[ ] ChatResponse schema with conversation_id, message, role, created_at
[ ] POST /{user_id}/chat endpoint
[ ] Path user_id validated against token user_id
[ ] Conversation resolution (create or load)
[ ] Agent execution with context
[ ] Message persistence (user + assistant)
[ ] Error handling for all 5 steps
[ ] Consistent error response format

Reference Files

| File | When to Read | |------|--------------| | references/api-contract.md | Complete request/response schemas and examples | | references/request-flow.md | Detailed implementation of each step | | references/error-handling.md | Error codes, formats, and recovery patterns |