REST API
Extract structured data and convert documents with simple HTTP calls. No SDK required — use curl, Postman, or any language.
Authentication
All API requests require a ClicheFactory API key, passed via the X-API-KEY header.
Sign in to create and manage API keys.
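As a minimal sketch using Python's standard library, a request can be prepared with the key attached (the base URL and key below are placeholders; substitute your real values):

```python
import urllib.request

API_BASE = "https://api.example.com"   # placeholder; use your actual API base URL
API_KEY = "cliche-your-key"            # placeholder API key

# Every request must carry the key in the X-API-KEY header.
req = urllib.request.Request(f"{API_BASE}/v1/extract", method="POST")
req.add_header("X-API-KEY", API_KEY)
```

The same header works identically with curl (`-H "X-API-KEY: ..."`), Postman, or any other HTTP client.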
POST /v1/extract
Single-call extraction. Send a document and a JSON schema, get structured data back.
Request
Content-Type: multipart/form-data
| Field | Type | Required | Description |
|---|---|---|---|
| file | binary | Yes | Document file (PDF, image, DOCX, XLSX, CSV, EML, etc.). |
| schema | string (JSON) | Yes* | JSON Schema describing the fields to extract. *Optional when artifact_id is provided. |
| mode | string | No | Extraction mode: fast, standard, or robust. Default: standard. See Extraction Modes. |
| artifact_id | string | No | Trained pipeline artifact ID. Schema, mode, and pipeline config auto-resolve from the artifact. |
| allow_partial | boolean | No | Return partial results on validation failure instead of erroring. |
| BYOK extraction LLM (optional). See BYOK. Sending any non-empty llm_api_key switches the call to BYOK billing. | | | |
| llm_provider_model | string | No | Extraction model in LiteLLM form, e.g. openai/gpt-4o-mini, gemini/gemini-2.0-flash, anthropic/claude-3-5-sonnet-latest. |
| llm_api_key | string | No | Your provider API key. Required to actually run BYOK; without it the other llm_* fields are ignored. |
| llm_api_base | string | No | Optional API base URL (self-hosted or proxy). |
| llm_max_tokens | integer | No | Optional max-tokens override. |
| llm_temperature | number | No | Optional temperature override (0.0–2.0). |
| llm_num_retries | integer | No | Optional retry count for the BYOK extraction LLM. |
| BYOK OCR LLM (optional, used by robust mode). Independent from the extraction LLM above. | | | |
| ocr_provider_model | string | No | OCR model in LiteLLM form. Falls back to the extraction model when omitted. |
| ocr_api_key | string | No | OCR provider API key. |
| ocr_api_base | string | No | Optional OCR API base URL. |
| ocr_max_tokens | integer | No | Optional max-tokens override. |
| ocr_temperature | number | No | Optional temperature override (0.0–2.0). |
| ocr_num_retries | integer | No | Optional retry count for the BYOK OCR LLM. |
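Client-side, the non-file form fields can be assembled with a small helper before the multipart request is built (the helper name and its object-schema check are our own conventions, not part of the API):

```python
import json

def build_extract_fields(schema, mode="standard", allow_partial=False):
    """Assemble the non-file form fields for POST /v1/extract."""
    if schema.get("type") != "object":
        raise ValueError("expected a JSON Schema with type 'object'")
    fields = {"schema": json.dumps(schema)}
    if mode != "standard":
        fields["mode"] = mode          # the default mode can simply be omitted
    if allow_partial:
        fields["allow_partial"] = "true"
    return fields

fields = build_extract_fields(
    {"type": "object", "properties": {"total": {"type": "number"}}},
    mode="robust",
)
```

Serializing the schema through json.dumps (rather than hand-writing the string) guards against invalid JSON reaching the API, which would return a 400.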
Example
```shell
curl -X POST "$API_BASE/v1/extract" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@invoice.pdf" \
  -F 'schema={"type":"object","properties":{"invoice_number":{"type":"string"},"total":{"type":"number"}}}'
```
Response
```json
{
  "status": "success",
  "result": {
    "invoice_number": "INV-2024-001",
    "total": 1250.00
  },
  "cost": {
    "credits_used": 3,
    "pages": 1
  },
  "validation_errors": null
}
```
With allow_partial=true, a response may use "status": "partial" and include validation_errors when the model output does not satisfy the schema.
Credits and free-tier pages are still consumed — the extraction ran to completion regardless of schema validation outcome.
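When allow_partial=true, a small response handler can surface validation problems while still returning the data (the handler and the exact error-string format are illustrative, not prescribed by the API):

```python
def handle_extract_response(body):
    """Return the extracted result, logging partial-validation warnings."""
    if body["status"] == "partial":
        # The extraction completed (and was billed), but some fields
        # failed schema validation.
        for err in body.get("validation_errors") or []:
            print(f"validation warning: {err}")
    return body["result"]

data = handle_extract_response({
    "status": "partial",
    "result": {"invoice_number": "INV-2024-001", "total": None},
    "cost": {"credits_used": 3, "pages": 1},
    "validation_errors": ["total: expected number, got null"],
})
```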
Artifact-Based Extraction
When using a trained pipeline, you only need the file and the artifact ID — the schema is optional as the pipeline knows its data model:
```shell
curl -X POST "$API_BASE/v1/extract" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@invoice.pdf" \
  -F "artifact_id=art_abc123"
```
POST /v1/to-markdown
Convert any supported document to markdown text.
Request
Content-Type: multipart/form-data
| Field | Type | Required | Description |
|---|---|---|---|
| file | binary | Yes | Document file. |
| conversion_mode | string | No | default (full OCR pipeline) or fast (VLM-only, no OCR). See Processing & OCR. |
| BYOK OCR LLM (optional). See BYOK. Sending any non-empty ocr_api_key switches the call to BYOK billing. | | | |
| ocr_provider_model | string | No | OCR model in LiteLLM form, e.g. openai/gpt-4o-mini, gemini/gemini-2.0-flash. |
| ocr_api_key | string | No | Your OCR provider API key. |
| ocr_api_base | string | No | Optional OCR API base URL (self-hosted or proxy). |
| ocr_max_tokens | integer | No | Optional max-tokens override. |
| ocr_temperature | number | No | Optional temperature override (0.0–2.0). |
| ocr_num_retries | integer | No | Optional retry count. |
Example
```shell
curl -X POST "$API_BASE/v1/to-markdown" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@document.pdf"
```
Response
```json
{
  "markdown": "# Invoice\n\nInvoice Number: INV-2024-001\n...",
  "plain_text": "Invoice\nInvoice Number: INV-2024-001\n...",
  "meta": {}
}
```
BYOK / Bring Your Own Key
By default POST /v1/extract and POST /v1/to-markdown run on ClicheFactory's full-service infrastructure and bill against your credit balance. To run the same calls against your own provider account (OpenAI, Gemini, Anthropic, or any LiteLLM-compatible endpoint), pass the BYOK form fields below. BYOK calls are billed at a reduced per-page rate — see Pricing.
How it works
- There are two independent BYOK channels: llm_* (the extraction model on /v1/extract) and ocr_* (the OCR model on /v1/extract when mode=robust, and on /v1/to-markdown). You can set either, both, or neither.
- llm_provider_model and ocr_provider_model use LiteLLM model names: openai/gpt-4o-mini, gemini/gemini-2.0-flash, anthropic/claude-3-5-sonnet-latest, etc.
- Sending any non-empty llm_api_key or ocr_api_key switches the request to BYOK billing rates. Without a key, the other BYOK fields are ignored.
- If ocr_* is omitted and llm_* is set, the OCR step inherits the extraction model.
- If you send llm_api_key= with no value (e.g. from an unset templating variable), the field is silently treated as "not provided" and the request runs on full-service rates. This is intentional — it prevents a stray empty form field from accidentally flipping billing — but it does mean a typo in your variable substitution will quietly fall back to full-service. Omit BYOK fields entirely when you don't want them; only send them when the values are real.
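To stay on the safe side of the empty-value pitfall, BYOK fields can be built through a guard that refuses half-configured input (the helper name and fail-loud policy are our own, not part of the API):

```python
def byok_fields(prefix, provider_model=None, api_key=None, api_base=None):
    """Emit llm_*/ocr_* form fields, failing loudly if the key is empty."""
    if not api_key:
        if provider_model or api_base:
            # An empty key means the server ignores the other BYOK fields and
            # bills at full-service rates -- almost certainly a bug here.
            raise ValueError(f"{prefix}_api_key is empty; BYOK fields would be ignored")
        return {}
    fields = {f"{prefix}_api_key": api_key}
    if provider_model:
        fields[f"{prefix}_provider_model"] = provider_model
    if api_base:
        fields[f"{prefix}_api_base"] = api_base
    return fields
```

With this guard, an unset templating variable raises immediately instead of silently running (and billing) on full-service infrastructure.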
Example: BYOK extraction
```shell
curl -X POST "$API_BASE/v1/extract" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@invoice.pdf" \
  -F 'schema={"type":"object","properties":{"total":{"type":"number"}}}' \
  -F "llm_provider_model=openai/gpt-4o-mini" \
  -F "llm_api_key=sk-your-openai-key"
```
Example: BYOK to-markdown
```shell
curl -X POST "$API_BASE/v1/to-markdown" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@document.pdf" \
  -F "ocr_provider_model=gemini/gemini-2.0-flash" \
  -F "ocr_api_key=your-gemini-key"
```
Example: separate extraction and OCR keys (robust mode)
```shell
curl -X POST "$API_BASE/v1/extract" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@invoice.pdf" \
  -F 'schema={"type":"object","properties":{"total":{"type":"number"}}}' \
  -F "mode=robust" \
  -F "llm_provider_model=anthropic/claude-3-5-sonnet-latest" \
  -F "llm_api_key=sk-ant-your-key" \
  -F "ocr_provider_model=openai/gpt-4o-mini" \
  -F "ocr_api_key=sk-your-openai-key"
```
Training
Training jobs are created and managed through the ClicheFactory web app. Once training completes, use the artifact_id with the extract endpoint:
Extract with a Trained Pipeline
```shell
curl -X POST "$API_BASE/v1/extract" \
  -H "X-API-KEY: cliche-your-key" \
  -F "file=@invoice.pdf" \
  -F "artifact_id=art_abc123"
```
The trained pipeline resolves its own schema and mode — no additional parameters needed. See Training for the full workflow.
Billing
For cost-aware integrations, you can query your credit balance and usage history.
| Endpoint | Method | Description |
|---|---|---|
| /v1/billing/balance/{tenant_id} | GET | Current credit balance for the tenant. |
| /v1/billing/usage/{tenant_id} | GET | Usage history and cost breakdown. |
Credits are deducted for every completed extraction, including partial results (status: partial with validation_errors) — the LLM ran and pages were processed regardless of schema validation outcome. Only hard failures (HTTP 4xx/5xx) do not consume credits. See Pricing for per-page costs by extraction mode.
Advanced: Presign + Canonical Flow
The convenience endpoints above (/v1/extract and /v1/to-markdown) handle file upload internally. If you're building your own client library or need to upload large files efficiently, you can use the three-step flow that the SDK uses internally:
- Presign — POST /v1/uploads/presign returns a presigned upload URL and a file_uri.
- Upload — PUT {presigned_url} uploads the file bytes directly to storage (bypassing the API server).
- Canonical — POST /v1/canonical sends the file_uri with your extraction parameters and returns the result.
This flow is more efficient for large files since bytes go directly to storage, not through the API server. Most users should use the convenience endpoints or the SDK — this section is for library authors and advanced integrations. The file_uri returned by the presign step is an internal reference; users should not construct these URIs manually.
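The three steps can be sketched with the HTTP verbs injected as callables, so the shape of the flow is visible without committing to a particular client library (payload field names beyond those documented above are assumptions):

```python
import json

def extract_via_presign(post, put, file_bytes, file_name, schema):
    """Presign -> upload -> canonical, with post/put supplied by the caller."""
    # Step 1: ask the API for an upload URL and an opaque file_uri.
    presign = post("/v1/uploads/presign", {"file_name": file_name})
    # Step 2: bytes go straight to storage, bypassing the API server.
    put(presign["presigned_url"], file_bytes)
    # Step 3: run the extraction against the internal file reference.
    return post("/v1/canonical", {
        "file_uri": presign["file_uri"],   # opaque; never construct by hand
        "schema": json.dumps(schema),
    })
```

In a real client, post/put would wrap your HTTP library of choice with the X-API-KEY header attached.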
JSON-bodied to-markdown: POST /v1/ocr/to-markdown
A JSON-bodied alternative to the multipart POST /v1/to-markdown. It expects an already-uploaded file_uri (from /v1/uploads/presign) instead of a multipart file, and accepts a nested ocr BYOK config with the same fields exposed flat as ocr_* on the multipart endpoint. Body shape:
```json
{
  "tenant_id": "your-tenant-id",
  "file_uri": "s3://…",
  "file_name": "document.pdf",
  "mode": "default",
  "ocr": {
    "provider_model": "openai/gpt-4o-mini",
    "api_key": "sk-…"
  }
}
```
Same response shape as /v1/to-markdown. tenant_id must match your API key's tenant unless you're using an admin key. Counted against the Heavy rate-limit bucket.
Rate Limits
Rate limits are applied per API key (i.e. per tenant). Two named buckets share the same limiter:
| Bucket | Endpoints | Default |
|---|---|---|
| Heavy | POST /v1/extract, POST /v1/to-markdown, POST /v1/ocr/to-markdown, and POST /v1/canonical with operation: inference.extract. | 5 req/s, burst of 20 |
| General | Everything else (balance lookups, listings, billing reads, etc.). | 50 req/s, burst of 200 |
Excess traffic
When a tenant exceeds its bucket the API returns 429 Too Many Requests with a Retry-After: 1 header:
```http
HTTP/1.1 429 Too Many Requests
Retry-After: 1
Content-Type: application/json

{
  "detail": "Rate limit exceeded"
}
```
Heavy paths are also subject to a small concurrency cap on each API server. Brief bursts of in-flight heavy requests beyond that cap return 503 Service Unavailable with Retry-After: 1; retry after a moment and the request will go through.
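A client-side retry loop that covers both cases and honors Retry-After might look like this (the transport callable is hypothetical; it returns status, headers, and body):

```python
import time

def call_with_retry(send, max_attempts=5):
    """Retry on 429/503, sleeping for the advertised Retry-After interval."""
    for _ in range(max_attempts):
        status, headers, body = send()
        if status not in (429, 503):
            return status, body
        # Both 429 and 503 advertise Retry-After: 1 per the docs above.
        time.sleep(float(headers.get("Retry-After", "1")))
    raise RuntimeError("still rate-limited after retries")
```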
Need higher limits?
If your workload regularly bumps against the defaults, contact support with a short description of what you're building and the throughput you need. Per-tenant limits can be raised on request.
Error Codes
| Status | Meaning |
|---|---|
| 200 | Success. |
| 400 | Bad request — invalid schema JSON, unsupported file type, or missing required fields. |
| 401 | Unauthorized — invalid or missing API key. |
| 403 | Forbidden — insufficient credits for the requested operation. |
| 404 | Not found — invalid artifact ID or job ID. |
| 413 | Payload too large — file exceeds the upload size limit. |
| 422 | Unprocessable entity — some routes return this for validation errors. On POST /v1/extract, prefer allow_partial=true to receive HTTP 200 with status: partial instead (credits are still consumed; see Billing). |
| 500 | Internal server error — extraction failure. Try again or contact support. |
| 502 | Bad gateway — upstream storage or LLM provider error. |
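As one plausible client policy, these statuses can be bucketed into coarse actions (the groupings below are a suggestion for integrators, not API semantics):

```python
RETRYABLE = {429, 500, 502, 503}   # transient; back off and retry

def next_action(status):
    """Map an HTTP status from the API to a coarse client action."""
    if status == 200:
        return "ok"
    if status in (401, 403):
        return "check API key or credit balance"
    if status in RETRYABLE:
        return "retry with backoff"
    return "fix the request"       # 400, 404, 413, 422
```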