Architecture

Glass Pipeline is a stateless extraction service. Each request flows through a deterministic sequence of stages, and the factory-specific stages (normalization and matching) are scoped to the requesting factory.

Pipeline Flow

1. PDF Parse: PyMuPDF text extraction + pdfplumber table detection + page rendering
2. Classify: Haiku VLM identifies document type (order, email, drawing, quote)
3. Route: Email → divide-and-conquer | Order → direct extraction | Drawing → vision
4. Extract: Sonnet VLM reads page images + OCR table hints → structured rows
5. Normalize: Deterministic rules engine (factory-scoped, no LLM, <1ms)
6. Match: Snake API batch matching (SAT + Fuzzy, factory-scoped, ~3ms/row)
7. Return: Unified payload: measurements + matching + client_info + confidence
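The flow above can be pictured as an ordered list of stage functions applied to per-request state. The following is only a minimal sketch under that assumption: `RequestState`, `run_pipeline`, `choose_route`, and the stub stages are hypothetical names, not the service's actual code. Only the routing table is taken from the Route step.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class RequestState:
    """Per-request state; nothing is kept between requests (stateless service)."""
    factory_id: int
    doc_type: str = ""
    route: str = ""
    trace: List[str] = field(default_factory=list)


# Routing table taken from the Route step above.
ROUTES = {
    "email": "divide_and_conquer",
    "order": "direct_extraction",
    "drawing": "vision",
}


def choose_route(doc_type: str) -> str:
    """Map a classified document type to an extraction strategy."""
    if doc_type not in ROUTES:
        # No route is documented for other types (e.g. "quote"), so fail loudly.
        raise ValueError(f"no documented route for document type {doc_type!r}")
    return ROUTES[doc_type]


Stage = Callable[[RequestState], RequestState]


def run_pipeline(state: RequestState, stages: List[Tuple[str, Stage]]) -> RequestState:
    """Run the stages in a fixed order, recording each stage name as it completes."""
    for name, stage in stages:
        state = stage(state)
        state.trace.append(name)
    return state


def classify_stub(state: RequestState) -> RequestState:
    # Placeholder for the classifier's answer (hypothetical stub).
    state.doc_type = "email"
    return state


def route_stage(state: RequestState) -> RequestState:
    state.route = choose_route(state.doc_type)
    return state


# Only two of the seven stages are stubbed here to keep the sketch short.
STAGES = [("classify", classify_stub), ("route", route_stage)]
```

Calling `run_pipeline(RequestState(factory_id=3), STAGES)` runs the two stubbed stages in order. The remaining stages would slot into `STAGES` the same way.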

Email Divide & Conquer

Complex email documents (threads with attachments) are the hardest case in glass extraction. Our approach:

Email PDF received
    │
    ├─ Split: detect email markers (De:/From:/Envoyé:)
    │   ├─ Email body segments (context)
    │   └─ Attachment pages (tables, drawings)
    │
    ├─ Extract context from email body (Haiku)
    │   → compositions, instructions, modifications
    │
    └─ Extract measurements from attachments
        with email context injected into prompt
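As an illustration of the split step, the sketch below sorts pages into email-body pages and attachment pages by looking for the markers named above at the start of a line. It assumes per-page text is already available from the parser. The function names, the returned dictionary shape, and the prompt wording are hypothetical. A real splitter would likely also need to handle body pages that continue without a marker.

```python
import re
from typing import Dict, List

# Markers from the diagram above; optional whitespace before the colon
# covers French typography ("Envoyé :").
EMAIL_MARKER = re.compile(r"^\s*(De|From|Envoyé)\s*:", re.MULTILINE)


def split_email_pages(page_texts: List[str]) -> Dict[str, List[int]]:
    """Return page indices grouped into email body pages and attachment pages."""
    body: List[int] = []
    attachments: List[int] = []
    for index, text in enumerate(page_texts):
        if EMAIL_MARKER.search(text):
            body.append(index)
        else:
            attachments.append(index)
    return {"body": body, "attachments": attachments}


def build_extraction_prompt(email_context: str, page_number: int) -> str:
    """Inject the context extracted from the email body into the attachment prompt."""
    return (
        "Context from the email thread:\n"
        f"{email_context}\n\n"
        f"Extract the measurement rows from attachment page {page_number}."
    )
```

Keeping the context extraction separate from the measurement extraction means each attachment page is read with the thread's instructions in view, without sending the whole thread as images.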

Components

| Component | Technology | Role |
|---|---|---|
| PDF Parser | PyMuPDF + pdfplumber | Text, tables, page images |
| VLM | Claude Sonnet 4.6 (Bedrock) | Vision extraction |
| Classification | Claude Haiku 4.5 (Bedrock) | Doc type, client info |
| Normalizer | Python (deterministic) | Factory-scoped rules |
| Matcher | Snake API (SAT + Fuzzy) | Article matching |
| Server | FastAPI + uvicorn | Async HTTP |
| Infra | EC2 t3.medium + nginx + certbot | HTTPS, reverse proxy |

Multi-Region Fallback

Sonnet: eu-west-3 → eu-central-1 → us-east-1 → us-west-2
Haiku:  eu-west-3 → us-west-2 → us-east-1

On a 500, 503, or timeout, the call automatically moves to the next region in the chain. Non-retryable errors (400/403) skip the fallback and fail immediately.
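A minimal sketch of this fallback loop follows. It assumes that a non-retryable error is surfaced to the caller without trying further regions. `RegionError`, `call_with_fallback`, and the `invoke` callable are hypothetical stand-ins for whatever client wrapper the service uses. Only the region chains and status codes come from the text above.

```python
from typing import Callable, Optional, Sequence

# Region chains as listed above.
SONNET_REGIONS = ("eu-west-3", "eu-central-1", "us-east-1", "us-west-2")
HAIKU_REGIONS = ("eu-west-3", "us-west-2", "us-east-1")

RETRYABLE_STATUS = {500, 503}


class RegionError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed model call."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def call_with_fallback(regions: Sequence[str], invoke: Callable[[str], str]) -> str:
    """Try each region in order; move on only for retryable failures."""
    last_error: Optional[Exception] = None
    for region in regions:
        try:
            return invoke(region)
        except TimeoutError as exc:
            last_error = exc  # timeout: try the next region
        except RegionError as exc:
            if exc.status not in RETRYABLE_STATUS:
                raise  # e.g. 400/403: another region would not help
            last_error = exc
    raise RuntimeError("all regions failed") from last_error
```

A 400 or 403 usually signals a malformed request or a permissions problem, so repeating it elsewhere would only add latency. That is the reasoning behind re-raising at once in this sketch.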

Factory Scoping

| ID | Factory | Normalization | Snake Scope |
|---|---|---|---|
| 1 | VIT | FE→rTherm, bare spacer→alu gris | VIT articles |
| 3 | Monce | FE→LowE, standard rules | Monce articles |
| 4 | VIP | FE→rTherm, bare spacer→alu gris | Riou articles |
| 9 | Eurovitrage | FE→LowE | Eurovitrage articles |
| 10 | TGVI | FE→LowE | TGVI articles |
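One way to picture factory scoping is a lookup table keyed by factory ID that a deterministic normalizer consults for each row. The sketch below encodes only the FE and bare-spacer rules listed above. The row field names (`coating`, `spacer`), the reading of "bare spacer" as a spacer with no material given, and the function name are all assumptions. Monce's "standard rules" are not specified here and are left out.

```python
from typing import Dict, Optional

# Per-factory rules transcribed from the table above.
FACTORY_RULES: Dict[int, Dict[str, Optional[str]]] = {
    1: {"name": "VIT", "fe_maps_to": "rTherm", "bare_spacer": "alu gris"},
    3: {"name": "Monce", "fe_maps_to": "LowE", "bare_spacer": None},
    4: {"name": "VIP", "fe_maps_to": "rTherm", "bare_spacer": "alu gris"},
    9: {"name": "Eurovitrage", "fe_maps_to": "LowE", "bare_spacer": None},
    10: {"name": "TGVI", "fe_maps_to": "LowE", "bare_spacer": None},
}


def normalize_row(factory_id: int, row: Dict[str, str]) -> Dict[str, str]:
    """Apply one factory's rules to a copy of the row; no model call involved."""
    rules = FACTORY_RULES[factory_id]  # unknown factory IDs raise KeyError
    out = dict(row)
    if out.get("coating") == "FE":
        out["coating"] = rules["fe_maps_to"]
    # Assumption: a "bare" spacer is one with no material specified.
    if rules["bare_spacer"] and not out.get("spacer"):
        out["spacer"] = rules["bare_spacer"]
    return out
```

For example, `normalize_row(1, {"coating": "FE", "spacer": ""})` yields an rTherm coating with an alu gris spacer, while the same row for factory 9 yields LowE and leaves the spacer empty.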