REST API

Extract structured data and convert documents with simple HTTP calls. No SDK required — use curl, Postman, or any language.

Authentication

All API requests require a ClicheFactory API key passed via the X-API-KEY header:

curl -H "X-API-KEY: cliche-your-key-here" https://api.clichefactory.com/v1/...

Sign in to create and manage API keys.
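The same header in Python, using only the standard library. The key and tenant ID below are placeholders, not real values; note that urllib stores header names title-cased internally, but HTTP header names are case-insensitive on the wire.

```python
import urllib.request

API_KEY = "cliche-your-key-here"  # placeholder
BASE_URL = "https://api.clichefactory.com/v1"

def authed_request(path: str) -> urllib.request.Request:
    """Build a GET request carrying the required X-API-KEY header."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        headers={"X-API-KEY": API_KEY},
    )

req = authed_request("/billing/balance/your-tenant-id")
```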

POST /v1/extract

Single-call extraction. Send a document and a JSON schema, get structured data back.

Request

Content-Type: multipart/form-data

Field | Type | Required | Description
file | binary | Yes | Document file (PDF, image, DOCX, XLSX, CSV, EML, etc.)
schema | string (JSON) | Yes* | JSON Schema describing the fields to extract. *Optional when artifact_id is provided.
mode | string | No | Extraction mode: fast, standard, or robust. Default: standard. See Extraction Modes.
artifact_id | string | No | Trained pipeline artifact ID. Schema, mode, and pipeline config auto-resolve from the artifact.
allow_partial | boolean | No | Return partial results on validation failure instead of erroring.

BYOK extraction LLM (optional). See BYOK. Sending any non-empty llm_api_key switches the call to BYOK billing.

llm_provider_model | string | No | Extraction model in LiteLLM form, e.g. openai/gpt-4o-mini, gemini/gemini-2.0-flash, anthropic/claude-3-5-sonnet-latest.
llm_api_key | string | No | Your provider API key. Required to actually run BYOK; without it the other llm_* fields are ignored.
llm_api_base | string | No | Optional API base URL (self-hosted or proxy).
llm_max_tokens | integer | No | Optional max-tokens override.
llm_temperature | number | No | Optional temperature override (0.0–2.0).
llm_num_retries | integer | No | Optional retry count for the BYOK extraction LLM.

BYOK OCR LLM (optional, used by robust mode). Independent from the extraction LLM above.

ocr_provider_model | string | No | OCR model in LiteLLM form. Falls back to the extraction model when omitted.
ocr_api_key | string | No | OCR provider API key.
ocr_api_base | string | No | Optional OCR API base URL.
ocr_max_tokens | integer | No | Optional max-tokens override.
ocr_temperature | number | No | Optional temperature override (0.0–2.0).
ocr_num_retries | integer | No | Optional retry count for the BYOK OCR LLM.
Example
curl -X POST "https://api.clichefactory.com/v1/extract" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@invoice.pdf" \
  -F 'schema={"type":"object","properties":{"invoice_number":{"type":"string"},"total":{"type":"number"}}}'
Response
{
  "status": "success",
  "result": {
    "invoice_number": "INV-2024-001",
    "total": 1250.00
  },
  "cost": {
    "credits_used": 3,
    "pages": 1
  },
  "validation_errors": null
}

With allow_partial=true, a response may use "status": "partial" and include validation_errors when the model output does not satisfy the schema. Credits and free-tier pages are still consumed — the extraction ran to completion regardless of schema validation outcome.
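A minimal sketch of client-side response handling covering both outcomes. The sample payload mirrors the documented response shape; the validation error strings are illustrative, not a documented format.

```python
def handle_extract_response(response: dict) -> dict:
    """Return the extracted result, surfacing schema violations on partials."""
    status = response["status"]
    if status == "success":
        return response["result"]
    if status == "partial":
        # Partial results still consumed credits; log the schema violations.
        for err in response.get("validation_errors") or []:
            print("validation error:", err)
        return response["result"]
    raise RuntimeError(f"unexpected status: {status}")

# Sample payload shaped like a partial /v1/extract response.
sample = {
    "status": "partial",
    "result": {"invoice_number": "INV-2024-001", "total": None},
    "cost": {"credits_used": 3, "pages": 1},
    "validation_errors": ["total: expected number, got null"],
}
result = handle_extract_response(sample)
```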

Artifact-Based Extraction

When using a trained pipeline, you only need the file and the artifact ID — the schema is optional as the pipeline knows its data model:

curl -X POST "https://api.clichefactory.com/v1/extract" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@invoice.pdf" \
  -F "artifact_id=art_abc123"

POST /v1/to-markdown

Convert any supported document to markdown text.

Request

Content-Type: multipart/form-data

Field | Type | Required | Description
file | binary | Yes | Document file.
conversion_mode | string | No | default (full OCR pipeline) or fast (VLM-only, no OCR). See Processing & OCR.

BYOK OCR LLM (optional). See BYOK. Sending any non-empty ocr_api_key switches the call to BYOK billing.

ocr_provider_model | string | No | OCR model in LiteLLM form, e.g. openai/gpt-4o-mini, gemini/gemini-2.0-flash.
ocr_api_key | string | No | Your OCR provider API key.
ocr_api_base | string | No | Optional OCR API base URL (self-hosted or proxy).
ocr_max_tokens | integer | No | Optional max-tokens override.
ocr_temperature | number | No | Optional temperature override (0.0–2.0).
ocr_num_retries | integer | No | Optional retry count.
Example
curl -X POST "https://api.clichefactory.com/v1/to-markdown" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@document.pdf"
Response
{
  "markdown": "# Invoice\n\nInvoice Number: INV-2024-001\n...",
  "plain_text": "Invoice\nInvoice Number: INV-2024-001\n...",
  "meta": {}
}

BYOK / Bring Your Own Key

By default POST /v1/extract and POST /v1/to-markdown run on ClicheFactory's full-service infrastructure and bill against your credit balance. To run the same calls against your own provider account (OpenAI, Gemini, Anthropic, or any LiteLLM-compatible endpoint), pass the BYOK form fields below. BYOK calls are billed at a reduced per-page rate — see Pricing.

How it works
  • There are two independent BYOK channels: llm_* (the extraction model on /v1/extract) and ocr_* (the OCR model on /v1/extract when mode=robust and on /v1/to-markdown). You can set either, both, or neither.
  • llm_provider_model and ocr_provider_model use LiteLLM model names: openai/gpt-4o-mini, gemini/gemini-2.0-flash, anthropic/claude-3-5-sonnet-latest, etc.
  • Sending any non-empty llm_api_key or ocr_api_key switches the request to BYOK billing rates. Without a key, the other BYOK fields are ignored.
  • If ocr_* is omitted and llm_* is set, the OCR step inherits the extraction model.

Empty strings are dropped, not errors. If you send llm_api_key= with no value (e.g. from an unset templating variable), the field is silently treated as "not provided" and the request runs at full-service rates. This is intentional — it prevents a stray empty form field from accidentally flipping billing — but it also means a typo in your variable substitution will quietly fall back to full-service. Omit BYOK fields entirely when you don't want them; send them only when the values are real.
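A sketch of the safe pattern: build BYOK form fields only when a key is actually present, so an empty templating variable can never silently change the billing mode of a call. The environment variable names here are illustrative, not part of the API.

```python
import os

def byok_fields(prefix: str = "llm") -> dict:
    """Return BYOK form fields for `prefix` ("llm" or "ocr"), or {} when
    the corresponding key variable is unset or empty."""
    key = os.environ.get(f"{prefix.upper()}_API_KEY", "").strip()
    model = os.environ.get(f"{prefix.upper()}_PROVIDER_MODEL", "").strip()
    if not key:
        # Omit BYOK fields entirely rather than sending empty strings.
        return {}
    fields = {f"{prefix}_api_key": key}
    if model:
        fields[f"{prefix}_provider_model"] = model
    return fields

os.environ["LLM_API_KEY"] = ""             # simulate an unset templating variable
empty = byok_fields("llm")                 # no BYOK fields sent at all
os.environ["LLM_API_KEY"] = "sk-demo-key"  # placeholder key
os.environ["LLM_PROVIDER_MODEL"] = "openai/gpt-4o-mini"
fields = byok_fields("llm")
```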
Example: BYOK extraction
curl -X POST "https://api.clichefactory.com/v1/extract" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@invoice.pdf" \
  -F 'schema={"type":"object","properties":{"total":{"type":"number"}}}' \
  -F "llm_provider_model=openai/gpt-4o-mini" \
  -F "llm_api_key=sk-your-openai-key"
Example: BYOK to-markdown
curl -X POST "https://api.clichefactory.com/v1/to-markdown" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@document.pdf" \
  -F "ocr_provider_model=gemini/gemini-2.0-flash" \
  -F "ocr_api_key=your-gemini-key"
Example: separate extraction and OCR keys (robust mode)
curl -X POST "https://api.clichefactory.com/v1/extract" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@invoice.pdf" \
  -F 'schema={"type":"object","properties":{"total":{"type":"number"}}}' \
  -F "mode=robust" \
  -F "llm_provider_model=anthropic/claude-3-5-sonnet-latest" \
  -F "llm_api_key=sk-ant-your-key" \
  -F "ocr_provider_model=openai/gpt-4o-mini" \
  -F "ocr_api_key=sk-your-openai-key"

Training

Training jobs currently run in BYOK mode only — every training job needs a model provider key (OpenAI, Gemini, or Anthropic). Full-service training is on the roadmap. (Extraction and to-markdown also support BYOK via the BYOK fields above, but default to full-service.)

Training jobs are created and managed through the ClicheFactory web app. Once training completes, use the artifact_id with the extract endpoint:

Extract with a Trained Pipeline
curl -X POST "https://api.clichefactory.com/v1/extract" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@invoice.pdf" \
  -F "artifact_id=art_abc123"

The trained pipeline resolves its own schema and mode — no additional parameters needed. See Training for the full workflow.

Billing

For cost-aware integrations, you can query your credit balance and usage history.

Endpoint | Method | Description
/v1/billing/balance/{tenant_id} | GET | Current credit balance for the tenant.
/v1/billing/usage/{tenant_id} | GET | Usage history and cost breakdown.

Credits are deducted for every completed extraction, including partial results (status: partial with validation_errors) — the LLM ran and pages were processed regardless of schema validation outcome. Only hard failures (HTTP 4xx/5xx) do not consume credits. See Pricing for per-page costs by extraction mode.

Advanced: Presign + Canonical Flow

The convenience endpoints above (/v1/extract and /v1/to-markdown) handle file upload internally. If you're building your own client library or need to upload large files efficiently, you can use the three-step flow that the SDK uses internally:

  1. Presign: POST /v1/uploads/presign returns a presigned upload URL and a file_uri.
  2. Upload: PUT {presigned_url} uploads file bytes directly to storage (bypasses the API server).
  3. Canonical: POST /v1/canonical sends the file_uri with your extraction parameters and returns the result.

This flow is more efficient for large files since bytes go directly to storage, not through the API server. Most users should use the convenience endpoints or the SDK — this section is for library authors and advanced integrations. The file_uri returned by the presign step is an internal reference; users should not construct these URIs manually.
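A sketch of the step-3 payload. The operation value is taken from the Rate Limits section; the remaining body field names are assumptions about the canonical request shape, not documented, and the file URI is a placeholder for the value returned by the presign step.

```python
import json

def build_canonical_body(file_uri: str, schema: dict) -> str:
    """Assemble a JSON body for POST /v1/canonical (field names assumed)."""
    return json.dumps({
        "operation": "inference.extract",
        "file_uri": file_uri,
        "schema": json.dumps(schema),
    })

# Step 1: POST /v1/uploads/presign  -> presigned upload URL + file_uri
# Step 2: PUT the file bytes to the presigned URL (no API key needed there)
# Step 3: POST /v1/canonical with the body below
body = build_canonical_body(
    "file-uri-from-presign",  # placeholder; real value comes from step 1
    {"type": "object", "properties": {"total": {"type": "number"}}},
)
```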

JSON-bodied to-markdown: POST /v1/ocr/to-markdown

A JSON-bodied alternative to the multipart POST /v1/to-markdown. It expects an already-uploaded file_uri (from /v1/uploads/presign) instead of a multipart file, and accepts a nested ocr BYOK config with the same fields exposed flat as ocr_* on the multipart endpoint. Body shape:

{
  "tenant_id": "your-tenant-id",
  "file_uri": "s3://…",
  "file_name": "document.pdf",
  "mode": "default",
  "ocr": {
    "provider_model": "openai/gpt-4o-mini",
    "api_key": "sk-…"
  }
}

Same response shape as /v1/to-markdown. tenant_id must match your API key's tenant unless you're using an admin key. Counted against the Heavy rate-limit bucket.
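The same body assembled in Python, following the documented shape. The tenant ID, file URI, and API key are placeholders; the real file URI comes from the presign step.

```python
import json

body = {
    "tenant_id": "your-tenant-id",
    "file_uri": "file-uri-from-presign",  # placeholder; from /v1/uploads/presign
    "file_name": "document.pdf",
    "mode": "default",
    "ocr": {  # optional nested BYOK config, same fields as the flat ocr_* params
        "provider_model": "openai/gpt-4o-mini",
        "api_key": "sk-placeholder-key",
    },
}
payload = json.dumps(body)
```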

Rate Limits

Rate limits are applied per API key (i.e. per tenant). Two named buckets share the same limiter:

Bucket | Endpoints | Default
Heavy | POST /v1/extract, POST /v1/to-markdown, POST /v1/ocr/to-markdown, and POST /v1/canonical with operation: inference.extract | 5 req/s, burst of 20
General | Everything else (balance lookups, listings, billing reads, etc.) | 50 req/s, burst of 200
Excess traffic

When a tenant exceeds its bucket the API returns 429 Too Many Requests with a Retry-After: 1 header:

HTTP/1.1 429 Too Many Requests
Retry-After: 1
Content-Type: application/json

{
  "detail": "Rate limit exceeded"
}

Heavy paths are also subject to a small concurrency cap on each API server. Brief bursts of in-flight heavy requests beyond that cap return 503 Service Unavailable with Retry-After: 1; retry after a moment and the request will go through.
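Client-side handling for both the 429 and 503 cases can honor Retry-After with a bounded number of retries. This sketch injects a fake transport so the logic runs offline; in real use, `send` would perform the HTTP call.

```python
import time

def with_retries(send, max_attempts: int = 5):
    """Call `send()` until it returns a non-429/503 status or attempts run out."""
    for _ in range(max_attempts):
        status, headers, body = send()
        if status not in (429, 503):
            return status, body
        # Both 429 and 503 responses carry Retry-After: 1 per the docs above.
        time.sleep(float(headers.get("Retry-After", "1")))
    return status, body

# Fake transport: rate-limited, then over the concurrency cap, then success.
responses = iter([
    (429, {"Retry-After": "0"}, '{"detail": "Rate limit exceeded"}'),
    (503, {"Retry-After": "0"}, ""),
    (200, {}, '{"status": "success"}'),
])
status, body = with_retries(lambda: next(responses))
```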

Need higher limits?

If your workload regularly bumps against the defaults, contact support with a short description of what you're building and the throughput you need. Per-tenant limits can be raised on request.

Error Codes

Status | Meaning
200 | Success.
400 | Bad request — invalid schema JSON, unsupported file type, or missing required fields.
401 | Unauthorized — invalid or missing API key.
403 | Forbidden — insufficient credits for the requested operation.
404 | Not found — invalid artifact ID or job ID.
413 | Payload too large — file exceeds the upload size limit.
422 | Validation error on some routes. On POST /v1/extract, prefer allow_partial=true to receive HTTP 200 with status: partial and validation_errors instead.
500 | Internal server error — extraction failure. Try again or contact support.
502 | Bad gateway — upstream storage or LLM provider error.