Infratex API
Programmatic access to document parsing, indexing, search, and AI-powered responses.
Authentication
All requests require an API key in the Authorization header. Create and manage keys from your dashboard.
Authorization: Bearer infratex_sk_your_key_here
Keys are scoped to your account. All usage is billed to your credit balance. Revoke a key at any time — it takes effect immediately.
Installation
Install the official SDK for your language. The Python and Node.js clients both wrap the REST API with typed methods, automatic retries, and streaming support.
pip install infratex
Then initialize the client with your API key:
from infratex import Infratex
client = Infratex(api_key="infratex_sk_...")
Source code: infratex-python and infratex-node.
Base URL
https://api.infratex.io
All paths in this documentation are relative to this base URL. The SDKs handle this automatically.
Errors
All errors return a consistent JSON envelope:
{
"error": {
"code": "insufficient_credit",
"message": "Insufficient credit for this operation",
"request_id": "a1b2c3d4..."
}
}
| Status | Code | Meaning |
|---|---|---|
| 401 | unauthorized | Missing or invalid API key |
| 402 | insufficient_credit | Not enough balance |
| 403 | account_frozen | Account suspended |
| 404 | not_found | Resource does not exist |
| 409 | document_not_ready | Document not in required status |
| 409 | index_not_ready | Requested scope is missing a ready index for the selected method |
| 409 | no_indexed_documents | No documents are indexed with the selected retrieval method for that scope |
| 413 | payload_too_large | File exceeds size limit |
| 413 | context_too_large | Input context exceeds 250K token limit |
| 422 | validation_error | Invalid request body |
| 500 | internal_error | Server error |
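A client can map this envelope onto a typed exception before retrying or surfacing the failure. The sketch below is illustrative and not part of the official SDK; `InfratexError` and `raise_for_error` are hypothetical names, while the envelope fields (`code`, `message`, `request_id`) match the shape shown above.

```python
# Illustrative error handling (hypothetical helper, not the official SDK).

class InfratexError(Exception):
    def __init__(self, status: int, code: str, message: str, request_id: str):
        super().__init__(f"{status} {code}: {message} (request_id={request_id})")
        self.status = status
        self.code = code
        self.request_id = request_id

def raise_for_error(status: int, body: dict) -> None:
    """Raise InfratexError for any 4xx/5xx response body; no-op otherwise."""
    if status < 400:
        return
    err = body.get("error", {})
    raise InfratexError(
        status,
        err.get("code", "unknown"),
        err.get("message", ""),
        err.get("request_id", ""),
    )
```

Callers can then branch on `e.code` (for example, top up on insufficient_credit) and log `request_id` when contacting support.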
Documents
Upload PDFs or ordered page-image batches, extract structured markdown, and manage your document library.
POST Upload & Parse PDF
Maximum file size: 50 MB. This route accepts PDF files only.
POST /api/v1/documents
Content-Type: multipart/form-data
file: <binary PDF> (required, max 50 MB)
method: "standard" | "max" | "experimental" | "legacy" | "cost-efficient" (default: "standard")
collection_id: "col-uuid" (optional: assign to a collection on upload)
Raw HTTP response 202:
{
"id": "8f2a9a0c-...",
"status": "pending",
"method": "standard",
"filename": "report.pdf",
"page_count": 42,
"collection_id": "col-uuid"
}
The REST API is async-first for uploads: POST /api/v1/documents creates the document resource and queues parsing on the backend. Poll GET /api/v1/documents/{id} until the status becomes parsed or indexed, then fetch the extracted content with GET /api/v1/documents/{id}/markdown.
The official Python and Node.js SDKs keep documents.upload(...) ergonomic by waiting for parsing by default, while also exposing wait=False / { wait: false } for queue-first control and documents.get(..., wait=True) when you want to resume later.
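For raw-HTTP clients that skip the SDKs, the poll loop can be hand-rolled. In this sketch, `fetch` stands in for any callable that GETs /api/v1/documents/{id} and returns the parsed JSON resource; the interval and timeout defaults are illustrative, not official guidance.

```python
import time

# Hand-rolled polling sketch (names and defaults are illustrative).

def wait_until_parsed(fetch, document_id, interval=2.0, timeout=600.0):
    """Poll until the document reaches parsed/indexed, or fail on error/timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        doc = fetch(document_id)
        if doc["status"] in ("parsed", "indexed"):
            return doc
        if doc["status"] == "error":
            raise RuntimeError(f"parsing failed: {doc.get('error_message')}")
        time.sleep(interval)
    raise TimeoutError(f"document {document_id} not parsed within {timeout}s")
```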
POST Upload & Parse Images
Use /api/v1/documents/images when the input is an ordered batch of page images instead of a PDF. Supported formats are PNG, JPEG, and WebP. Image uploads support only the standard and max parsing methods.
POST /api/v1/documents/images
Content-Type: multipart/form-data
files: <binary image> (required, repeat once per page image)
method: "standard" | "max" (default: "standard")
collection_id: "col-uuid" (optional: assign to a collection on upload)
The image route is also async-first and returns the same queued Document resource shape as PDF uploads. The dashboard uses it for image-based uploads, and the backend preserves image order as page order.
The official SDKs expose the same ergonomic default here too: documents.upload_images(...) / documents.uploadImages(...) wait by default, and support queue-first control through wait=False / { wait: false }.
Parsing Methods & Pricing
| Method | Price | Description |
|---|---|---|
| standard | $0.005/page | Best quality. Recommended for most documents. |
| max | $0.008/page | Gemini-based high-fidelity parsing plus brief visual interpretation notes for charts, figures, screenshots, and photos. |
| cost-efficient | $0.002/page | Faster, lower cost. Good for simple layouts. |
| experimental | $0.005/page | OCR-based pipeline for scanned documents. |
| legacy | $0.005/page | Previous generation parser. |
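Since parsing is billed per page, an upload's cost is simply pages times the per-page rate above. A small pre-upload estimator, with the caveat that the prices are hard-coded from this table and will go stale if pricing changes:

```python
# Per-page parse prices in USD, copied from the table above.
PARSE_PRICE_PER_PAGE = {
    "standard": 0.005,
    "max": 0.008,
    "cost-efficient": 0.002,
    "experimental": 0.005,
    "legacy": 0.005,
}

def parse_cost(method: str, pages: int) -> float:
    """Estimated parse cost in USD for a document of `pages` pages."""
    return round(PARSE_PRICE_PER_PAGE[method] * pages, 6)
```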
GET List Documents
GET /api/v1/documents?limit=50&offset=0&status=parsed&collection_id=col-uuid
Returns {"documents": [...], "total": 142}. All query parameters are optional. Filter by status: pending, processing, parsed, indexed, or error.
GET Get Document
GET /api/v1/documents/{id}
Returns document metadata, processing status, and per-method index resources in the indexes array. Markdown is retrieved separately from /api/v1/documents/{id}/markdown.
GET Download Markdown
GET /api/v1/documents/{id}/markdown
Returns the raw extracted markdown as text/markdown.
PATCH Update Document
PATCH /api/v1/documents/{id}
Content-Type: application/json
{
"filename": "new-name.pdf", // optional
"collection_id": "col-uuid", // optional: move to collection
"remove_collection": true // optional: remove from collection
}
DELETE Delete Document
DELETE /api/v1/documents/{id}
Returns 204. Removes the document and all associated index data.
Indexes
Build retrieval artifacts on a parsed document. Required before /api/v1/searches or /api/v1/responses can use that document with the selected retrieval method. Billed per 1M tokens processed.
POST Create Index
POST /api/v1/documents/{document_id}/indexes
Content-Type: application/json
{
"method": "vector" | "hybrid" // default: "vector"
}
Raw HTTP response 202:
{
"id": "idx-uuid",
"document_id": "8f2a9a0c-...",
"filename": "report.pdf",
"method": "vector",
"status": "pending",
"chunk_count": null,
"node_count": null,
"has_ast": false,
"has_description": false,
"processing_time_ms": null,
"error_message": null,
"created_at": "2026-04-02T12:00:00Z",
"updated_at": "2026-04-02T12:00:00Z"
}
The REST API is async-first for indexing as well: POST /api/v1/documents/{document_id}/indexes creates or resets the method-specific index resource and queues background processing. Poll GET /api/v1/documents/{document_id}/indexes or GET /api/v1/documents/{document_id}/indexes/{method} until status becomes indexed or error.
The official SDKs keep documents.index(...) ergonomic by waiting for the queued index by default, while still exposing wait=False / { wait: false } if you want queue-first behavior.
When using the hybrid method, node_count reflects the number of AST nodes created, and has_ast / has_description indicate whether structural analysis and document summaries were generated.
GET List Indexes
GET /api/v1/documents/{document_id}/indexes
GET Get Index
GET /api/v1/documents/{document_id}/indexes/{method}
Index Methods & Pricing
| Method | Price | Description |
|---|---|---|
| vector | $0.10/1M tokens | Standard vector embeddings. Fast and cost-effective. |
| hybrid | $0.50/1M tokens | Vector + structural analysis with AST and LLM summaries. Deeper understanding of complex documents. |
Document Lifecycle
Upload (PDF) → Parse (markdown, status: parsed) → Index (vector/hybrid, status: indexed) → Search / Respond
A document must be in parsed or indexed status before indexing. You can re-index a document with a different method at any time.
Search
Semantic search across your indexed documents. Returns the most relevant sections. Billed per query.
POST Request
POST /api/v1/searches
Content-Type: application/json
{
"method": "vector" | "hybrid", // default: "vector"
"query": "What was the Q3 revenue?", // required, max 4000 chars
"limit": 5, // 1-50, default 5
"collection_id": "col-uuid" // optional: choose collection_id or document_ids
}
Response 200
{
"method": "vector",
"query": "What was the Q3 revenue?",
"results": [
{
"document_id": "8f2a9a0c-...",
"document_name": "report.pdf",
"score": 0.92,
"content": "Total revenue for Q3 was $45.2M...",
"title": "",
"summary": "",
"node_id": null,
"chunk_index": 12,
"metadata": null,
"source": "vector"
}
]
}
With hybrid search, results include title and summary from the document's AST, and node_id identifies the exact section. With vector search, these fields are empty/null.
Use exactly one of document_ids or collection_id to scope the search. If neither is provided, all documents with a ready index for the selected method are searched. If you scope the request explicitly, every requested document must already have a ready index for that same method or the API returns index_not_ready. Billing is captured only after the search succeeds.
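The exactly-one-scope rule is easy to enforce client-side before spending a billed query. A sketch of a payload builder; this helper is purely illustrative, since the API just accepts plain JSON:

```python
# Illustrative request-body builder for POST /api/v1/searches.

def build_search_payload(query, method="vector", limit=5,
                         document_ids=None, collection_id=None):
    """Build a search body, enforcing the scope rule: at most one of
    document_ids / collection_id (omit both to search every document
    with a ready index for the selected method)."""
    if document_ids and collection_id:
        raise ValueError("use exactly one of document_ids or collection_id")
    payload = {"method": method, "query": query, "limit": limit}
    if document_ids:
        payload["document_ids"] = document_ids
    if collection_id:
        payload["collection_id"] = collection_id
    return payload
```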
Search Pricing
| Method | Price |
|---|---|
| vector | $0.01/query |
| hybrid | $0.05/query |
AI Responses
Ask a question and get an AI-generated answer grounded in your documents. The system retrieves relevant context, then generates a cited response streamed as Server-Sent Events. Billed per token based on the selected model. Retrieval method affects retrieval behavior, not response pricing.
POST Request
POST /api/v1/responses
Content-Type: application/json
{
"method": "vector" | "hybrid", // default: "vector"
"model": "fast" | "pro", // default: "fast"
"message": "Summarize the key risk factors", // required, max 8000 chars
"limit": 5, // results to retrieve, 1-20
"collection_id": "col-uuid", // optional: use exactly one scope selector
"reasoning": true | false // default: false — enable extended reasoning
}
For single-turn requests, pass exactly one scope selector: document_ids or collection_id. If neither is provided, the response is generated from all documents with a ready index for the selected retrieval method.
For multi-turn threads, first create a conversation with its document scope, then call /api/v1/responses with conversation_id. When conversation_id is present, do not send document_ids or collection_id in the same request.
Responses enforce the same method-specific readiness invariant as /api/v1/searches: the chosen scope must already have ready vector or hybrid indexes that match the request method.
Multi-turn Example
POST /api/v1/conversations
Content-Type: application/json
{
"title": "Q3 Analysis",
"collection_id": "col-uuid"
}
POST /api/v1/responses
Content-Type: application/json
{
"conversation_id": "conv-uuid",
"method": "hybrid",
"model": "pro",
"message": "How does that compare with the previous quarter?",
"limit": 8,
"reasoning": true
}
Response — Server-Sent Events
The response streams as SSE. Both models support reasoning — thinking events stream before the answer:
data: {"type": "sources", "content": [{"id": 1, "document_id": "...", "snippet": "..."}]}
data: {"type": "thinking", "content": "Let me analyze the risk factors..."}
data: {"type": "thinking", "content": " across the documents..."}
data: {"type": "text", "content": "Based on the documents, "}
data: {"type": "text", "content": "the key risk factors are..."}
data: {"type": "done"}
Both fast and pro models produce reasoning when reasoning is set to true. The fast model reasons briefly before answering. The pro model reasons extensively for complex queries. Reasoning is false by default in the API. Pass a conversation_id for multi-turn conversations after creating the thread separately.
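A minimal consumer can fold the event types above into one result by splitting the stream on data: lines. This sketch assumes you already have an iterable of decoded lines (for example from an HTTP client's line iterator); it is not the SDK's streaming interface:

```python
import json

# Minimal SSE accumulator for the event shapes documented above.

def collect_response(lines):
    """Fold sources/thinking/text events into a single result dict."""
    sources, thinking, text = [], [], []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank lines and comments between events
        event = json.loads(line[len("data: "):])
        if event["type"] == "sources":
            sources = event["content"]
        elif event["type"] == "thinking":
            thinking.append(event["content"])
        elif event["type"] == "text":
            text.append(event["content"])
        elif event["type"] == "done":
            break
    return {"sources": sources,
            "thinking": "".join(thinking),
            "text": "".join(text)}
```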
Models
| Model | Description |
|---|---|
| fast | Quick and efficient. Best for most queries. |
| pro | More intelligent model for complex tasks. Higher accuracy on nuanced questions. |
Response Pricing
| Model | Input Tokens | Output Tokens |
|---|---|---|
| fast | $1.00/1M tokens | $5.00/1M tokens |
| pro | $3.00/1M tokens | $16.00/1M tokens |
Reasoning tokens are included in output token billing for both models. The maximum input context is 250,000 tokens. If exceeded, the API returns a context_too_large error — start a new conversation or reduce scope.
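Token billing makes per-request costs easy to estimate from reported usage. A sketch with the rates above hard-coded (so it goes stale if pricing changes); remember that reasoning tokens count toward output_tokens:

```python
# (input_rate, output_rate) in USD per 1M tokens, from the table above.
RESPONSE_PRICE_PER_M = {"fast": (1.00, 5.00), "pro": (3.00, 16.00)}

def response_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD; output_tokens should include reasoning tokens."""
    in_rate, out_rate = RESPONSE_PRICE_PER_M[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
```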
Field Extraction
Pull typed fields out of any parsed document. You define the fields you want (name, type, description), and the API returns a strict JSON object that matches your schema, plus per-field source citations. Works on documents in parsed / indexed / done status.
The pipeline routes each document to one of three execution tiers automatically based on size and index availability:
| Tier | When | How it works |
|---|---|---|
| S | markdown ≤ 30k tokens | Single-pass extraction over the full document. |
| M | 30k – 150k tokens, hybrid index built | AST-guided: a router picks relevant sections, the extractor sees only those. |
| L | > 150k tokens, or any size with vector-only index | Per-field hybrid retrieval; array fields run as parallel map-reduce with a deterministic Python-side numeric filter. |
Tier L requires at least one indexed vector or hybrid index — otherwise the request returns 409 index_required_for_tier_l.
Field types
Each field declares a name (snake_case), a type, a description, an optional instructions hint, and a required flag. Optional fields can come back as null.
| type | JSON shape returned | Notes |
|---|---|---|
| string | string or null | Free text. |
| number | number or null | No currency symbols or commas — pure numeric. |
| integer | integer or null | |
| boolean | true / false / null | |
| date | ISO-8601 string (YYYY-MM-DD) or null | |
| enum | one of enum_values | Provide enum_values: [...]. |
| object | nested object | Provide a typed properties: [...] list. |
| array | array | Provide a typed items: {...} shape. Use array<object> for table-like fields. |
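Because inline_fields is plain JSON, a tiny builder can cut the repetition when declaring many fields. This helper is purely illustrative; the API simply takes dicts in the shape above:

```python
# Illustrative convenience builder for inline_fields entries.

def field(name, type_, description, required=False, **extra):
    """Return one field spec dict; extra kwargs pass through unchanged
    (e.g. enum_values, items, properties, instructions)."""
    spec = {"name": name, "type": type_,
            "description": description, "required": required}
    spec.update(extra)
    return spec

inline_fields = [
    field("company_name", "string", "Issuing company", required=True),
    field("agreement_date", "date", "Signature date"),
    field("status", "enum", "Deal status", enum_values=["open", "closed"]),
]
```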
POST Create Extraction Run
Submits an asynchronous extraction. Returns 202 Accepted immediately with a run_id; the actual extraction runs on a background worker. Poll the run to get the result.
POST /api/v1/documents/{document_id}/extractions
Content-Type: application/json
{
"inline_fields": [
{"name": "company_name", "type": "string", "description": "Issuing company", "required": true},
{"name": "agreement_date", "type": "date", "description": "Signature date", "required": false},
{"name": "total_amount", "type": "number", "description": "Total contract amount USD", "required": false},
{"name": "parties", "type": "array", "description": "All parties to the agreement",
"required": true,
"items": {
"name": "row", "type": "object", "description": "row",
"properties": [
{"name": "name", "type": "string", "description": "party name", "required": true},
{"name": "role", "type": "string", "description": "party role", "required": false}
]
}}
],
"inline_system_prompt": "This is a commercial subscription agreement.",
"model": "fast", // "fast" or "pro" (default: "fast")
"include_evidence": true // include per-field source citations in GETs (default: false)
}
Two ways to define fields: send inline_fields directly (one-off), or save a named template via the dashboard and pass template_id instead. Templates support edit/delete and version-pin the snapshot used by each run. Provide exactly one of inline_fields or template_id.
Response 202: {"id": "run-uuid", "status": "pending", "tier": null, ...}
tier is populated once the worker picks up the job.
GET Get Extraction Run
Polls a single run. Repeat until status is done or error.
GET /api/v1/extractions/{run_id}?include_evidence=true
Response shape:
{
"id": "run-uuid",
"document_id": "doc-uuid",
"template_id": null,
"status": "done",
"tier": "S" | "M" | "L",
"model": "fast" | "pro",
"result": {
"company_name": "Verse AB",
"agreement_date": "2021-05-03",
"total_amount": 1000000
},
"evidence": {
"company_name": { "source_quote": "## VERSE AB" },
"agreement_date": { "source_quote": "Latest update: 3 May 2021" },
"total_amount": { "source_quote": "Total subscription monies (SEK) 1,000,000" }
},
"processing_time_ms": 5215,
"input_tokens": 7284,
"output_tokens": 916,
"error_message": null,
"created_at": "...",
"updated_at": "..."
}
Evidence is always captured server-side. Pass ?include_evidence=true to receive it. For Tier M / L runs, evidence also carries an internal _tier_m / _tier_l audit object showing which sections the router or retriever picked.
GET List Runs for a Document
GET /api/v1/documents/{document_id}/extractions?limit=50&offset=0&include_evidence=false
GET Export a Run (CSV / XLSX)
Downloads the tabular fields (any array<object>) of a completed run. CSV mode: one array per run becomes a single CSV; multiple arrays become a ZIP. XLSX mode: one sheet per array field.
GET /api/v1/extractions/{run_id}/export?format=csv
GET /api/v1/extractions/{run_id}/export?format=xlsx
Requires the run to be done and the snapshot to contain at least one array<object> field with rows; otherwise the endpoint returns 409 extraction_not_done or 422 no_tabular_fields.
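If you prefer to build the CSV yourself from a run's result, the flattening is straightforward: each object in an array<object> field becomes one row keyed by property name. A client-side sketch approximating what format=csv returns for a single array field:

```python
import csv
import io

# Client-side approximation of the single-array CSV export:
# header from the first row's keys, one CSV row per object.

def rows_to_csv(rows):
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```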
Array fields with filters (Tier L map-reduce)
For documents large enough to route to Tier L, an array<object> field can carry an instructions hint that the engine parses as a deterministic post-filter:
{
"name": "transactions",
"type": "array",
"description": "All transactions on the statement",
"instructions": "only transactions where amount > 100,000",
"required": true,
"items": {
"name": "row", "type": "object", "description": "row",
"properties": [
{"name": "date", "type": "date", "description": "tx date", "required": true},
{"name": "amount", "type": "number", "description": "signed", "required": true},
{"name": "description", "type": "string", "description": "memo", "required": true}
]
}
}
The LLM extracts every row it sees. The filter is then applied in Python — never by the model — so it is reproducible and auditable. Supported operators: >, >=, <, <=, between … and …, with K / M / B suffixes and $, €, £, ¥ currency prefixes. Both the filtered set (in result) and the unfiltered set (in evidence.<field>.unfiltered_rows) are returned.
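To illustrate the filter semantics, here is a client-side reimplementation of the documented grammar (comparison operators, between, K/M/B suffixes, currency prefixes). It is an approximation for understanding and pre-validating instructions; the server-side parser is authoritative and may differ in detail:

```python
import re

# Illustrative reimplementation of the documented post-filter grammar.

_SUFFIX = {"K": 1e3, "M": 1e6, "B": 1e9}

def parse_amount(token: str) -> float:
    """Parse '100,000', '$1.5M', '€2B', etc. into a plain float."""
    token = token.strip().lstrip("$€£¥").replace(",", "")
    if token and token[-1].upper() in _SUFFIX:
        return float(token[:-1]) * _SUFFIX[token[-1].upper()]
    return float(token)

def filter_rows(rows, field, instruction):
    """Apply a deterministic numeric filter to extracted rows."""
    m = re.search(r"between\s+(\S+)\s+and\s+(\S+)", instruction, re.I)
    if m:
        lo, hi = sorted((parse_amount(m.group(1)), parse_amount(m.group(2))))
        return [r for r in rows if lo <= r[field] <= hi]
    m = re.search(r"(>=|<=|>|<)\s*(\S+)", instruction)
    if not m:
        return rows  # no recognizable filter: pass everything through
    op, value = m.group(1), parse_amount(m.group(2))
    ops = {">": lambda a: a > value, ">=": lambda a: a >= value,
           "<": lambda a: a < value, "<=": lambda a: a <= value}
    return [r for r in rows if ops[op](r[field])]
```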
Extraction Pricing
| Model | Input Tokens | Output Tokens |
|---|---|---|
| fast | $1.00/1M tokens | $5.00/1M tokens |
| pro | $3.00/1M tokens | $16.00/1M tokens |
Identical re-runs of the same (document, content_hash, model) tuple are served from a server-side cache at no cost.
Collections
Organize documents into collections. Use collections to scope search and retrieval.
POST Create Collection
POST /api/v1/collections
Content-Type: application/json
{"name": "Q3 Reports"} // required, max 255 chars
Response 201: {"id": "col-uuid", "name": "Q3 Reports", "created_at": "..."}
GET List Collections
GET /api/v1/collections
GET Get Collection
GET /api/v1/collections/{id}
PATCH Rename Collection
PATCH /api/v1/collections/{id}
{"name": "New Name"}
DELETE Delete Collection
DELETE /api/v1/collections/{id}
Returns 204. Documents inside are not deleted — they become unassigned.
Conversations
Conversations are an optional managed-session layer on top of /api/v1/responses. Each conversation stores message history and the document scope used for the thread, so follow-up turns stay grounded in the same corpus.
POST Create Conversation
POST /api/v1/conversations
Content-Type: application/json
{
"title": "Q3 Analysis",
"collection_id": "col-uuid"
}
A conversation can be scoped to exactly one collection or an explicit list of documents. That scope is locked for the life of the thread and becomes the retrieval scope for all follow-up responses.
Response 201: {"id": "conv-uuid", "title": "Q3 Analysis", "collection_id": "col-uuid", "document_ids": [], ...}
GET List Conversations
GET /api/v1/conversations?limit=50&offset=0
GET Get Conversation with Messages
GET /api/v1/conversations/{id}
Returns the conversation with its full message history, persisted thread scope, and per-message metadata such as retrieval method, model, reasoning, sources, and token usage when available.
DELETE Delete Conversation
DELETE /api/v1/conversations/{id}
Account & Billing
Check your balance and monitor usage programmatically.
GET Get Account
GET /api/v1/account
Response 200:
{
"tenant": {
"id": "a1b2c3d4-...",
"name": "Acme Corp",
"email": "dev@acme.com",
"credit_balance_micros": 5000000,
"is_admin": false,
"is_frozen": false,
"frozen_reason": null,
"deleted_at": null,
"created_at": "2025-01-15T10:30:00Z"
}
}
Balances are in micros: 1,000,000 = $1.00. The example above shows a $5.00 balance.
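Integer micros avoid floating-point drift in billing arithmetic, so keep calculations in micros and convert only for display. A small formatting sketch:

```python
# Display-side conversion only: do arithmetic in integer micros,
# format at the edge (1,000,000 micros = $1.00).

def micros_to_usd(micros: int) -> str:
    sign = "-" if micros < 0 else ""
    return f"{sign}${abs(micros) / 1_000_000:.2f}"
```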
GET Get Billing Summary
GET /api/v1/billing
Response 200:
{
"balance_micros": 5000000,
"recent_transactions": [
{
"id": "tx-uuid",
"transaction_type": "usage_charge",
"status": "posted",
"amount_micros": -25000,
"balance_after_micros": 4975000,
"currency": "USD",
"source_type": "documents.parse.standard",
"description": "Parsing Standard • 5 pages from report.pdf",
"created_at": "2025-01-20T14:30:00Z"
}
],
"recent_credit_transactions": [],
"recent_usage": [
{
"id": "evt-uuid",
"service_key": "documents.parse.standard",
"status": "posted",
"unit_count": 5,
"unit_label": "pages",
"unit_price_micros": 5000,
"total_cost_micros": 25000,
"document_id": "doc-uuid",
"created_at": "2025-01-20T14:30:00Z"
}
],
"spend_by_service": [
{ "service_key": "documents.parse.standard", "total_cost_micros": 250000, "total_units": 50 }
],
"daily_spend": [
{ "day": "2025-01-20", "total_cost_micros": 25000, "total_units": 5, "by_service": {} }
],
"totals": {
"lifetime_spent_micros": 1500000,
"lifetime_parse_pages": 300,
"lifetime_searches": 42,
"thirty_day_spend_micros": 500000,
"thirty_day_transactions": 15
}
}
All monetary values are in micros (1,000,000 = $1.00). Covers the last 30 days of activity, with lifetime totals in the totals object.
Quick Start
Upload a document, let the parser finish, index it, search, and get an AI response.
# 1. Create a document and queue parsing
curl -X POST https://api.infratex.io/api/v1/documents \
-H "Authorization: Bearer $INFRATEX_KEY" \
-F "file=@report.pdf"
# 2. Poll until status becomes parsed or indexed
curl -X GET https://api.infratex.io/api/v1/documents/{id} \
-H "Authorization: Bearer $INFRATEX_KEY"
# 3. Download the extracted markdown
curl -X GET https://api.infratex.io/api/v1/documents/{id}/markdown \
-H "Authorization: Bearer $INFRATEX_KEY"
# 4. Queue the vector index
curl -X POST https://api.infratex.io/api/v1/documents/{id}/indexes \
-H "Authorization: Bearer $INFRATEX_KEY" \
-H "Content-Type: application/json" \
-d '{"method": "vector"}'
# 5. Poll the method-specific index until status becomes indexed
curl -X GET https://api.infratex.io/api/v1/documents/{id}/indexes/vector \
-H "Authorization: Bearer $INFRATEX_KEY"
# 6. Search that document
curl -X POST https://api.infratex.io/api/v1/searches \
-H "Authorization: Bearer $INFRATEX_KEY" \
-H "Content-Type: application/json" \
-d '{"query": "What is the revenue?", "method": "vector", "limit": 5, "document_ids": ["{id}"]}'
# 7. Get an AI-generated response from that same document
curl -X POST https://api.infratex.io/api/v1/responses \
-H "Authorization: Bearer $INFRATEX_KEY" \
-H "Content-Type: application/json" \
-d '{"message": "Summarize the findings", "method": "vector", "limit": 5, "document_ids": ["{id}"]}'
Rate Limits
The core API endpoints (documents, indexes, searches, responses) are not rate-limited. Usage is metered and billed per operation. Ensure your account has sufficient credit balance before making requests.
SDKs
Official client libraries are available for Python and Node.js. Both provide typed interfaces, automatic retries, and first-class streaming support.
pip install infratex
Built on httpx. Source and docs: github.com/beltromatti/infratex-python
npm install infratex
TypeScript, uses native fetch. Source and docs: github.com/beltromatti/infratex-node
In both SDKs, documents.upload(...), documents.upload_images(...) / documents.uploadImages(...), and documents.index(...) share the same contract: they wait by default, but each also exposes queue-first control when you pass wait=False / { wait: false }.
The matching getters also line up: documents.get(..., wait=True) resumes a queued PDF upload or image-batch upload until parsed, and documents.get_index(..., wait=True) / documents.getIndex(..., { wait: true }) resumes a queued index until ready.
The REST API follows standard conventions and works with any HTTP client. All curl examples in this documentation translate directly to any language.
MCP
Infratex exposes a remote Model Context Protocol server for AI agents that need direct access to the same parsing, indexing, retrieval, and grounded answering pipeline as the REST API. It is retrieval-first and uses the same tenant, billing, and readiness invariants as the core backend.
The MCP server is stateless and remote. Use your existing API keys and connect to https://api.infratex.io/mcp. The dashboard also includes an operational overview at /dashboard/mcp.
Authentication & Transport
{
"mcpServers": {
"infratex": {
"url": "https://api.infratex.io/mcp",
"headers": {
"Authorization": "Bearer infratex_sk_..."
}
}
}
}
Use the same infratex_sk_... keys you use for the REST API. Every MCP tool call is tenant-scoped and billed exactly like the equivalent backend operation.
Core Tools
| Tool | Purpose | Default Behavior |
|---|---|---|
| list_collections | List available collections | Read-only |
| list_documents | List tenant documents with optional filters | Read-only |
| create_document | Queue PDF parsing from a base64 payload | wait=false |
| create_document_images | Queue parsing for an ordered image batch | wait=false |
| get_document | Read document metadata and status | Read-only |
| get_document_markdown | Fetch extracted markdown | Read-only |
| list_indexes | List vector and hybrid indexes for a document | Read-only |
| create_index | Queue vector or hybrid indexing | wait=false |
| get_index | Read method-specific index status | Read-only |
| search_documents | Run vector or hybrid retrieval | Sync |
| answer_documents | Generate a cited answer over indexed documents | Sync |
Recommended Agent Workflow
# 1. Queue parsing
create_document(wait=False) # or create_document_images(wait=False)
# 2. Poll until parsed
get_document(document_id=...)
# 3. Queue indexing
create_index(document_id=..., method="vector", wait=False)
# 4. Poll until indexed
get_index(document_id=..., method="vector")
# 5. Retrieve or answer
search_documents(...)
answer_documents(...)
create_document, create_document_images, and create_index are async-first by default because parsing and indexing can be long-running. Set wait=true only when an agent should block until the resource is ready. This keeps MCP aligned with the production backend, where documents and indexes are queued resources.
Scope & Behavior
Retrieval tools accept the same scope rules as REST: pass exactly one of document_ids or collection_id, or omit both to search across all ready documents for the selected retrieval method.
search_documents and answer_documents enforce the same method-specific readiness check as /api/v1/searches and /api/v1/responses. If you ask for hybrid, the scope must already have ready hybrid indexes; vector indexes are not reused implicitly.
MCP is meant for agents and orchestration. For end-user applications or direct integrations, the REST API and official SDKs remain the primary surface.