Troubleshooting
Common issues, diagnostic tools, and performance tips.
clichefactory doctor
The doctor command checks your configuration, dependencies, and system binaries. Run it first when something isn't working.
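# Assumption: factory is importable from the top level of the clichefactory package.
from clichefactory import factory

client = factory(api_key="your-key")
client.doctor()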
# The MCP server exposes it as a tool with no parameters.
Example Output
mode: service
api_key: cliche-...redacted
Dependencies:
clichefactory: 0.4.2 ✓
pydantic: 2.9.2 ✓
docling: 2.31.0 ✓
System binaries:
tesseract: not found (optional)
pandoc: 3.1.12 ✓
soffice: not found (optional)
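When attaching doctor output to a bug report, it can help to list the entries reported as not found. The following is a minimal sketch that scans text in the format shown above. It assumes the output is available as a plain string, which this page does not confirm, and `missing_items` is a hypothetical helper, not part of the SDK.

```python
def missing_items(doctor_output: str) -> list[str]:
    """Return the names of entries whose status contains 'not found'."""
    missing = []
    for line in doctor_output.splitlines():
        # Split "name: status" at the first colon; section headers have an empty status.
        name, sep, status = line.partition(":")
        if sep and "not found" in status:
            missing.append(name.strip())
    return missing


sample = """\
mode: service
Dependencies:
docling: 2.31.0 ✓
System binaries:
tesseract: not found (optional)
pandoc: 3.1.12 ✓
soffice: not found (optional)
"""

print(missing_items(sample))
```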
Common Errors
401 Unauthorized — Invalid API key
Cause: The API key is missing, incorrect, or has been revoked.
Fix: Check that your key starts with cliche- and matches the one in your account settings. Ensure it's set in the right config file or environment variable.
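A quick local sanity check can rule out the most common key mistakes before you compare against account settings. This sketch only tests the `cliche-` prefix mentioned above plus stray whitespace. The environment variable name `CLICHEFACTORY_API_KEY` and the helper `check_api_key` are hypothetical, chosen for illustration.

```python
import os


def check_api_key(key):
    """Return a list of problems with the key; an empty list means it looks plausible."""
    if not key:
        return ["key is missing or empty"]
    problems = []
    if key != key.strip():
        problems.append("key has leading or trailing whitespace")
    if not key.strip().startswith("cliche-"):
        problems.append("key does not start with 'cliche-'")
    return problems


# CLICHEFACTORY_API_KEY is a hypothetical variable name, not confirmed by the docs.
print(check_api_key(os.environ.get("CLICHEFACTORY_API_KEY")))
```

A key that passes this check can still be revoked or belong to another account, so a 401 after a clean result points to the account side.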
403 Forbidden — Insufficient credits
Cause: Your credit balance is too low for the requested operation.
Fix: Check your balance and top up credits. Each extraction mode has a different per-page cost.
400 Bad Request — Invalid schema
Cause: The JSON schema is malformed or contains unsupported types.
Fix: Validate your schema at jsonschema.dev or check the Schemas reference. Ensure the JSON string is properly escaped if passing via CLI or curl.
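Escaping problems are easiest to avoid by building the schema as a Python dict and letting the standard library serialize and quote it. The sketch below uses only `json` and `shlex`; the helper names and the sample schema are illustrative. It checks that the text is well-formed JSON, not that it is a valid or supported schema.

```python
import json
import shlex


def schema_to_cli_arg(schema: dict) -> str:
    """Serialize a schema to compact JSON and quote it for a POSIX shell."""
    return shlex.quote(json.dumps(schema, separators=(",", ":")))


def is_wellformed_json(text: str) -> bool:
    """Return True if text parses as JSON (says nothing about schema validity)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


schema = {
    "type": "object",
    "properties": {"total": {"type": "number"}},
    "required": ["total"],
}

# Paste the printed value after the schema option of your command.
print(schema_to_cli_arg(schema))
```

Note that `shlex.quote` targets POSIX shells; Windows shells such as cmd.exe and PowerShell use different quoting rules.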
File not found
Cause: The file path passed to the SDK, CLI, or MCP tool doesn't exist.
Fix: Use an absolute path. For MCP, the AI assistant must pass the full file path — relative paths may resolve against the wrong working directory.
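One way to catch path problems early is to resolve the path and confirm the file exists before handing it to any tool. This is a small sketch using `pathlib`; `to_absolute` is a hypothetical helper, not part of the SDK.

```python
from pathlib import Path


def to_absolute(path_str: str) -> Path:
    """Expand ~, resolve against the current working directory, and require an existing file."""
    path = Path(path_str).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"No file at {path}")
    return path


# Example call (hypothetical path):
# to_absolute("~/invoices/2024-03.pdf")
```

The error message includes the fully resolved path, which makes it obvious when a relative path was resolved against an unexpected working directory.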
Empty or garbled extraction results
Cause: Poor OCR quality from low-resolution scans, complex table layouts, or handwriting.
Fix: Try the enhanced parser for complex documents (local mode). Use robust extraction mode for a verification pass. For scanned documents, ensure the original is at least 200 DPI.
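If a scan carries no resolution metadata, the effective DPI can be estimated from its pixel dimensions and the physical page size. The sketch below is plain arithmetic; the function name is hypothetical, and the page size must be supplied by you.

```python
def effective_dpi(pixel_width: int, pixel_height: int,
                  page_width_in: float, page_height_in: float) -> float:
    """Estimate scan resolution as the smaller of horizontal and vertical pixels per inch."""
    return min(pixel_width / page_width_in, pixel_height / page_height_in)


# A US Letter page (8.5 x 11 in) scanned to 1275 x 1650 pixels.
dpi = effective_dpi(1275, 1650, 8.5, 11.0)
print(dpi, dpi >= 200)  # 150.0 False: below the 200 DPI guideline
```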
BYOK: model API errors
Cause: Your LLM API key is invalid, rate-limited, or the model is unavailable.
Fix: Verify your model API key and check the provider's status page. Run doctor to validate the configuration. See BYOK for supported providers.
MCP: tools not appearing
Cause: The MCP server config is incorrect, or clichefactory-mcp isn't installed.
Fix: Verify the JSON config is valid and the command/args are correct. Restart the IDE after changes. See MCP Setup.
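A JSON syntax error, such as a trailing comma, is a frequent reason a server entry is silently ignored. The sketch below parses a config and checks one server entry. It assumes the common `mcpServers` layout used by several MCP clients; your IDE may use a different key, and the entry name `clichefactory` and empty `args` are placeholders. See MCP Setup for the actual values.

```python
import json


def check_mcp_config(text: str, server_name: str) -> list[str]:
    """Return a list of problems found in an MCP client config; empty means none found."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} at line {exc.lineno}, column {exc.colno}"]

    # Assumed layout: {"mcpServers": {"<name>": {"command": "...", "args": [...]}}}
    servers = config.get("mcpServers") if isinstance(config, dict) else None
    entry = servers.get(server_name) if isinstance(servers, dict) else None
    if not isinstance(entry, dict):
        return [f"no entry for {server_name!r} under 'mcpServers'"]

    problems = []
    if not isinstance(entry.get("command"), str):
        problems.append("'command' must be a string")
    if not isinstance(entry.get("args", []), list):
        problems.append("'args' must be a list")
    return problems


good = '{"mcpServers": {"clichefactory": {"command": "clichefactory-mcp", "args": []}}}'
print(check_mcp_config(good, "clichefactory"))
```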
System Dependencies
Some features require system binaries. All of them are optional: the default OCR engine (rapidocr) and the default parser work without any system dependencies.
Tesseract (optional — for tesseract OCR engine)
# macOS
brew install tesseract
# Ubuntu / Debian
sudo apt install tesseract-ocr
# Windows (via scoop)
scoop install tesseract
OCR Language Packs (optional — for Tesseract non-English OCR)
Tesseract needs traineddata files for each language. English is included by default.
# macOS
brew install tesseract-lang
# Ubuntu — install a specific language pack
sudo apt install tesseract-ocr-<lang>
OCR engines support multiple languages. See your OCR engine's documentation for available language packs.
Pandoc (optional — for .doc and .odt files)
# macOS
brew install pandoc
# Ubuntu / Debian
sudo apt install pandoc
LibreOffice (optional — fallback for .doc and .odt)
# macOS
brew install --cask libreoffice
# Ubuntu / Debian
sudo apt install libreoffice
Used as a fallback if Pandoc is not available. Provides the soffice binary for document conversion.
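To confirm that an installed binary is actually visible to Python, you can look it up on `PATH` with the standard library. This sketch is independent of the doctor command and is not how doctor is implemented; it only checks the three binary names mentioned on this page.

```python
import shutil

OPTIONAL_BINARIES = ("tesseract", "pandoc", "soffice")


def find_binaries(names=OPTIONAL_BINARIES) -> dict:
    """Map each binary name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


for name, path in find_binaries().items():
    print(f"{name}: {path or 'not found (optional)'}")
```

If a binary is installed but reported as not found, the shell that launched Python (or your IDE) probably has a different `PATH` from your terminal.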
Performance Tips
| Goal | Recommendation |
|---|---|
| Fastest extraction | Use fast mode for one-shot extraction, skipping the full parsing step. |
| Best accuracy | Use robust mode for a verification pass on high-stakes documents. |
| Complex scans | Use the enhanced parser (local mode) for documents with complex tables, mixed layouts, or handwriting. |
| High volume | Use extract_batch / extract-batch with max_concurrency tuned to your rate limits. |
| Non-English docs | Use Tesseract with the appropriate language pack. See your OCR engine's docs for available packs. |
| Repeated formats | Train a model (BYOK) — trained pipelines are faster and more accurate than generic extraction on recurring document types. |
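To illustrate what a concurrency cap does in the high-volume row, here is a generic sketch using `asyncio.Semaphore`. It is not the implementation of extract_batch; `run_bounded` and `fake_extract` are hypothetical stand-ins, and the real extraction call is replaced by a short sleep.

```python
import asyncio


async def run_bounded(items, worker, max_concurrency: int):
    """Run worker(item) for every item with at most max_concurrency running at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(item):
        async with semaphore:
            return await worker(item)

    # gather returns results in the same order as the input items.
    return await asyncio.gather(*(guarded(item) for item in items))


async def fake_extract(path):
    # Stand-in for a real extraction call; it only sleeps briefly.
    await asyncio.sleep(0.01)
    return f"extracted:{path}"


results = asyncio.run(run_bounded(["a.pdf", "b.pdf", "c.pdf"], fake_extract, max_concurrency=2))
print(results)
```

Setting the cap too high tends to trade throughput for rate-limit errors, so it is usually worth starting low and raising it while watching for failed requests.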
Getting Help
If you're stuck:
- Run the doctor command and include its output when reporting issues.
- Check the Core Concepts page to verify your configuration.
- Contact us at info@clichefactory.com.