REST API
Extract structured data and convert documents with simple HTTP calls. No SDK required — use curl, Postman, or any language.
Authentication
All API requests require a ClicheFactory API key, passed via the X-API-KEY header.
Sign in to create and manage API keys.
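As a minimal sketch using Python's standard library, a request can be prepared with the key attached (the base URL and key below are placeholders; substitute your real values):

```python
import urllib.request

API_BASE = "https://api.example.com"   # placeholder; use your actual API base URL
API_KEY = "cliche-your-key"            # placeholder API key

# Every request must carry the key in the X-API-KEY header.
req = urllib.request.Request(f"{API_BASE}/v1/extract", method="POST")
req.add_header("X-API-KEY", API_KEY)
```

The same header works identically with curl (`-H "X-API-KEY: ..."`), Postman, or any other HTTP client.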
POST /v1/extract
Single-call extraction. Send a document and a JSON schema, get structured data back.
Request
Content-Type: multipart/form-data
| Field | Type | Required | Description |
|---|---|---|---|
| file | binary | Yes | Document file (PDF, image, DOCX, XLSX, CSV, EML, etc.). |
| schema | string (JSON) | Yes* | JSON Schema describing the fields to extract. *Optional when artifact_id is provided. |
| mode | string | No | Extraction mode: fast, standard, or robust. Default: standard. See Extraction Modes. |
| artifact_id | string | No | Trained pipeline artifact ID. Schema, mode, and pipeline config auto-resolve from the artifact. |
| allow_partial | boolean | No | Return partial results on validation failure instead of erroring. |
| BYOK extraction LLM (optional). See BYOK. Sending any non-empty llm_api_key switches the call to BYOK billing. | | | |
| llm_provider_model | string | No | Extraction model in LiteLLM form, e.g. openai/gpt-4o-mini, gemini/gemini-2.0-flash, anthropic/claude-3-5-sonnet-latest. |
| llm_api_key | string | No | Your provider API key. Required to actually run BYOK; without it the other llm_* fields are ignored. |
| llm_api_base | string | No | Optional API base URL (self-hosted or proxy). |
| llm_max_tokens | integer | No | Optional max-tokens override. |
| llm_temperature | number | No | Optional temperature override (0.0–2.0). |
| llm_num_retries | integer | No | Optional retry count for the BYOK extraction LLM. |
| BYOK OCR LLM (optional, used by robust mode). Independent from the extraction LLM above. | | | |
| ocr_provider_model | string | No | OCR model in LiteLLM form. Falls back to the extraction model when omitted. |
| ocr_api_key | string | No | OCR provider API key. |
| ocr_api_base | string | No | Optional OCR API base URL. |
| ocr_max_tokens | integer | No | Optional max-tokens override. |
| ocr_temperature | number | No | Optional temperature override (0.0–2.0). |
| ocr_num_retries | integer | No | Optional retry count for the BYOK OCR LLM. |
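Client-side, the non-file form fields can be assembled with a small helper before the multipart request is built (the helper name and its object-schema check are our own conventions, not part of the API):

```python
import json

def build_extract_fields(schema, mode="standard", allow_partial=False):
    """Assemble the non-file form fields for POST /v1/extract."""
    if schema.get("type") != "object":
        raise ValueError("expected a JSON Schema with type 'object'")
    fields = {"schema": json.dumps(schema)}
    if mode != "standard":
        fields["mode"] = mode          # the default mode can simply be omitted
    if allow_partial:
        fields["allow_partial"] = "true"
    return fields

fields = build_extract_fields(
    {"type": "object", "properties": {"total": {"type": "number"}}},
    mode="robust",
)
```

Serializing the schema through json.dumps (rather than hand-writing the string) guards against invalid JSON reaching the API, which would return a 400.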
Example
```shell
curl -X POST "$API_BASE/v1/extract" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@invoice.pdf" \
  -F 'schema={"type":"object","properties":{"invoice_number":{"type":"string"},"total":{"type":"number"}}}'
```
Response
```json
{
  "status": "success",
  "result": {
    "invoice_number": "INV-2024-001",
    "total": 1250.00
  },
  "cost": {
    "credits_used": 3,
    "pages": 1
  },
  "validation_errors": null
}
```
With allow_partial=true, a response may use "status": "partial" and include validation_errors when the model output does not satisfy the schema.
Credits and free-tier pages are still consumed — the extraction ran to completion regardless of schema validation outcome.
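When allow_partial=true, a small response handler can surface validation problems while still returning the data (the handler and the exact error-string format are illustrative, not prescribed by the API):

```python
def handle_extract_response(body):
    """Return the extracted result, logging partial-validation warnings."""
    if body["status"] == "partial":
        # The extraction completed (and was billed), but some fields
        # failed schema validation.
        for err in body.get("validation_errors") or []:
            print(f"validation warning: {err}")
    return body["result"]

data = handle_extract_response({
    "status": "partial",
    "result": {"invoice_number": "INV-2024-001", "total": None},
    "cost": {"credits_used": 3, "pages": 1},
    "validation_errors": ["total: expected number, got null"],
})
```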
Artifact-Based Extraction
When using a trained pipeline, you only need the file and the artifact ID — the schema is optional as the pipeline knows its data model:
```shell
curl -X POST "$API_BASE/v1/extract" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@invoice.pdf" \
  -F "artifact_id=art_abc123"
```
POST /v1/to-markdown
Convert any supported document to markdown text.
Request
Content-Type: multipart/form-data
| Field | Type | Required | Description |
|---|---|---|---|
| file | binary | Yes | Document file. |
| conversion_mode | string | No | default (full OCR pipeline) or fast (VLM-only, no OCR). See Processing & OCR. |
| BYOK OCR LLM (optional). See BYOK. Sending any non-empty ocr_api_key switches the call to BYOK billing. | | | |
| ocr_provider_model | string | No | OCR model in LiteLLM form, e.g. openai/gpt-4o-mini, gemini/gemini-2.0-flash. |
| ocr_api_key | string | No | Your OCR provider API key. |
| ocr_api_base | string | No | Optional OCR API base URL (self-hosted or proxy). |
| ocr_max_tokens | integer | No | Optional max-tokens override. |
| ocr_temperature | number | No | Optional temperature override (0.0–2.0). |
| ocr_num_retries | integer | No | Optional retry count. |
Example
```shell
curl -X POST "$API_BASE/v1/to-markdown" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@document.pdf"
```
Response
```json
{
  "markdown": "# Invoice\n\nInvoice Number: INV-2024-001\n...",
  "plain_text": "Invoice\nInvoice Number: INV-2024-001\n...",
  "meta": {}
}
```
BYOK / Bring Your Own Key
By default POST /v1/extract and POST /v1/to-markdown run on ClicheFactory's full-service infrastructure and bill against your credit balance. To run the same calls against your own provider account (OpenAI, Gemini, Anthropic, or any LiteLLM-compatible endpoint), pass the BYOK form fields below. BYOK calls are billed at a reduced per-page rate — see Pricing.
How it works
- There are two independent BYOK channels: llm_* (the extraction model on /v1/extract) and ocr_* (the OCR model on /v1/extract when mode=robust, and on /v1/to-markdown). You can set either, both, or neither.
- llm_provider_model and ocr_provider_model use LiteLLM model names: openai/gpt-4o-mini, gemini/gemini-2.0-flash, anthropic/claude-3-5-sonnet-latest, etc.
- Sending any non-empty llm_api_key or ocr_api_key switches the request to BYOK billing rates. Without a key, the other BYOK fields are ignored.
- If ocr_* is omitted and llm_* is set, the OCR step inherits the extraction model.
- If you send llm_api_key= with no value (e.g. from an unset templating variable), the field is silently treated as "not provided" and the request runs on full-service rates. This is intentional — it prevents a stray empty form field from accidentally flipping billing — but it does mean a typo in your variable substitution will quietly fall back to full-service. Omit BYOK fields entirely when you don't want them; only send them when the values are real.
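To stay on the safe side of the empty-value pitfall, BYOK fields can be built through a guard that refuses half-configured input (the helper name and fail-loud policy are our own, not part of the API):

```python
def byok_fields(prefix, provider_model=None, api_key=None, api_base=None):
    """Emit llm_*/ocr_* form fields, failing loudly if the key is empty."""
    if not api_key:
        if provider_model or api_base:
            # An empty key means the server ignores the other BYOK fields and
            # bills at full-service rates -- almost certainly a bug here.
            raise ValueError(f"{prefix}_api_key is empty; BYOK fields would be ignored")
        return {}
    fields = {f"{prefix}_api_key": api_key}
    if provider_model:
        fields[f"{prefix}_provider_model"] = provider_model
    if api_base:
        fields[f"{prefix}_api_base"] = api_base
    return fields
```

With this guard, an unset templating variable raises immediately instead of silently running (and billing) on full-service infrastructure.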
Example: BYOK extraction
```shell
curl -X POST "$API_BASE/v1/extract" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@invoice.pdf" \
  -F 'schema={"type":"object","properties":{"total":{"type":"number"}}}' \
  -F "llm_provider_model=openai/gpt-4o-mini" \
  -F "llm_api_key=sk-your-openai-key"
```
Example: BYOK to-markdown
```shell
curl -X POST "$API_BASE/v1/to-markdown" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@document.pdf" \
  -F "ocr_provider_model=gemini/gemini-2.0-flash" \
  -F "ocr_api_key=your-gemini-key"
```
Example: separate extraction and OCR keys (robust mode)
```shell
curl -X POST "$API_BASE/v1/extract" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@invoice.pdf" \
  -F 'schema={"type":"object","properties":{"total":{"type":"number"}}}' \
  -F "mode=robust" \
  -F "llm_provider_model=anthropic/claude-3-5-sonnet-latest" \
  -F "llm_api_key=sk-ant-your-key" \
  -F "ocr_provider_model=openai/gpt-4o-mini" \
  -F "ocr_api_key=sk-your-openai-key"
```
Training
Training jobs are created and managed through the ClicheFactory web app. Once training completes, use the artifact_id with the extract endpoint:
Extract with a Trained Pipeline
```shell
curl -X POST "$API_BASE/v1/extract" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@invoice.pdf" \
  -F "artifact_id=art_abc123"
```
The trained pipeline resolves its own schema and mode — no additional parameters needed. See Training for the full workflow.
Billing
For cost-aware integrations, you can query your credit balance and usage history.
| Endpoint | Method | Description |
|---|---|---|
| /v1/billing/balance/{tenant_id} | GET | Current credit balance for the tenant. |
| /v1/billing/usage/{tenant_id} | GET | Usage history and cost breakdown. |
Credits are deducted for every completed extraction, including partial results (status: partial with validation_errors) — the LLM ran and pages were processed regardless of schema validation outcome. Only hard failures (HTTP 4xx/5xx) do not consume credits. See Pricing for per-page costs by extraction mode.
Advanced: Presign + Canonical Flow
The convenience endpoints above (/v1/extract and /v1/to-markdown) handle file upload internally. If you're building your own client library or need to upload large files efficiently, you can use the three-step flow that the SDK uses internally:
- Presign — POST /v1/uploads/presign returns a presigned upload URL and a file_uri.
- Upload — PUT {presigned_url} uploads the file bytes directly to storage (bypassing the API server).
- Canonical — POST /v1/canonical sends the file_uri with your extraction parameters and returns the result.
This flow is more efficient for large files since bytes go directly to storage, not through the API server. Most users should use the convenience endpoints or the SDK — this section is for library authors and advanced integrations. The file_uri returned by the presign step is an internal reference; users should not construct these URIs manually.
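The three steps can be sketched with the HTTP verbs injected as callables, so the shape of the flow is visible without committing to a particular client library (payload field names beyond those documented above are assumptions):

```python
import json

def extract_via_presign(post, put, file_bytes, file_name, schema):
    """Presign -> upload -> canonical, with post/put supplied by the caller."""
    # Step 1: ask the API for an upload URL and an opaque file_uri.
    presign = post("/v1/uploads/presign", {"file_name": file_name})
    # Step 2: bytes go straight to storage, bypassing the API server.
    put(presign["presigned_url"], file_bytes)
    # Step 3: run the extraction against the internal file reference.
    return post("/v1/canonical", {
        "file_uri": presign["file_uri"],   # opaque; never construct by hand
        "schema": json.dumps(schema),
    })
```

In a real client, post/put would wrap your HTTP library of choice with the X-API-KEY header attached.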
JSON-bodied to-markdown: POST /v1/ocr/to-markdown
A JSON-bodied alternative to the multipart POST /v1/to-markdown. It expects an already-uploaded file_uri (from /v1/uploads/presign) instead of a multipart file, and accepts a nested ocr BYOK config with the same fields exposed flat as ocr_* on the multipart endpoint. Body shape:
```json
{
  "tenant_id": "your-tenant-id",
  "file_uri": "s3://…",
  "file_name": "document.pdf",
  "mode": "default",
  "ocr": {
    "provider_model": "openai/gpt-4o-mini",
    "api_key": "sk-…"
  }
}
```
Same response shape as /v1/to-markdown. tenant_id must match your API key's tenant unless you're using an admin key. Counted against the Heavy rate-limit bucket.
Rate Limits
Rate limits are applied per API key (i.e. per tenant). Two named buckets share the same limiter:
| Bucket | Endpoints | Default |
|---|---|---|
| Heavy | POST /v1/extract, POST /v1/to-markdown, POST /v1/ocr/to-markdown, and POST /v1/canonical with operation: inference.extract. | 5 req/s, burst of 20 |
| General | Everything else (balance lookups, listings, billing reads, etc.). | 50 req/s, burst of 200 |
Excess traffic
When a tenant exceeds its bucket the API returns 429 Too Many Requests with a Retry-After: 1 header:
```http
HTTP/1.1 429 Too Many Requests
Retry-After: 1
Content-Type: application/json

{
  "detail": "Rate limit exceeded"
}
```
Heavy paths are also subject to a small concurrency cap on each API server. Brief bursts of in-flight heavy requests beyond that cap return 503 Service Unavailable with Retry-After: 1; retry after a moment and the request will go through.
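A client-side retry loop that covers both cases and honors Retry-After might look like this (the transport callable is hypothetical; it returns status, headers, and body):

```python
import time

def call_with_retry(send, max_attempts=5):
    """Retry on 429/503, sleeping for the advertised Retry-After interval."""
    for _ in range(max_attempts):
        status, headers, body = send()
        if status not in (429, 503):
            return status, body
        # Both 429 and 503 advertise Retry-After: 1 per the docs above.
        time.sleep(float(headers.get("Retry-After", "1")))
    raise RuntimeError("still rate-limited after retries")
```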
Need higher limits?
If your workload regularly bumps against the defaults, contact support with a short description of what you're building and the throughput you need. Per-tenant limits can be raised on request.
Error Codes
| Status | Meaning |
|---|---|
| 200 | Success. |
| 400 | Bad request — invalid schema JSON, unsupported file type, or missing required fields. |
| 401 | Unauthorized — invalid or missing API key. |
| 403 | Forbidden — insufficient credits for the requested operation. |
| 404 | Not found — invalid artifact ID or job ID. |
| 413 | Payload too large — file exceeds the upload size limit. |
| 422 | Unprocessable entity — some routes return this for validation errors. On POST /v1/extract, prefer allow_partial=true to receive HTTP 200 with status: partial instead (credits are still consumed; see Billing). |
| 500 | Internal server error — extraction failure. Try again or contact support. |
| 502 | Bad gateway — upstream storage or LLM provider error. |
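As one plausible client policy, these statuses can be bucketed into coarse actions (the groupings below are a suggestion for integrators, not API semantics):

```python
RETRYABLE = {429, 500, 502, 503}   # transient; back off and retry

def next_action(status):
    """Map an HTTP status from the API to a coarse client action."""
    if status == 200:
        return "ok"
    if status in (401, 403):
        return "check API key or credit balance"
    if status in RETRYABLE:
        return "retry with backoff"
    return "fix the request"       # 400, 404, 413, 422
```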