Every document submitted to POST /v1/documents is processed asynchronously. The API returns 202 Accepted immediately with a document object; the actual extraction happens in the background. This guide explains the status lifecycle and the three consumption modes.

Status lifecycle

A document moves through the following states:
Status               Meaning
received             The file has been accepted and is waiting to enter the queue.
queued               The document is in the worker queue, waiting for a free slot.
processing           OCR and extraction are running.
completed            Extraction finished; all fields passed confidence checks.
partially_completed  Extraction finished but one or more fields triggered the human-review threshold.
failed               Processing encountered an unrecoverable error.
deleted              The document was deleted via DELETE /v1/documents/{id}.
Terminal states are completed, partially_completed, failed, and deleted. Once a document reaches a terminal state it does not change again.
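
Client code often needs to tell terminal from non-terminal states. The following is a minimal sketch of such a check; the status names come from the table above, while the helper name `is_terminal` and the choice to raise on unknown values are illustrative assumptions, not part of the API.

```python
# Status names are taken from the lifecycle table; the helper itself is hypothetical.
TERMINAL_STATES = frozenset({"completed", "partially_completed", "failed", "deleted"})
NON_TERMINAL_STATES = frozenset({"received", "queued", "processing"})


def is_terminal(status: str) -> bool:
    """Return True if a document in this status will not change again."""
    if status in TERMINAL_STATES:
        return True
    if status in NON_TERMINAL_STATES:
        return False
    # Failing loudly on an unrecognised status avoids polling forever on a typo.
    raise ValueError(f"unknown document status: {status!r}")
```

Raising on an unknown value is a design choice for this sketch; a client that must tolerate statuses added later might prefer to treat them as non-terminal.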

Consumption modes

1. Webhook push

Register a URL once; Folio calls it whenever a document reaches a terminal state.
curl -X POST https://api.glialhealth.com/v1/webhook-endpoints \
  -H "Authorization: Bearer sk_test_..." \
  -H "Content-Type: application/json" \
  -d '{"url": "https://yourapp.example.com/folio-webhook"}'
Your server receives a POST with the event payload, so no polling is required. See Webhooks for signature verification and event types.
Best for: production integrations, high volume, latency-sensitive pipelines.
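
On the receiving side, a webhook listener can be very small. The sketch below uses only Python's standard library and makes several assumptions: that the event payload is a JSON object, that any 2xx response counts as an acknowledgement, and that port 8000 is where your listener runs. The names `parse_event`, `FolioWebhookHandler`, and `serve` are hypothetical. Signature verification is deliberately omitted here; see Webhooks for that step before using anything like this in production.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_event(body: bytes) -> dict:
    """Decode a webhook body; raises ValueError if it is not a JSON object."""
    event = json.loads(body.decode("utf-8"))
    if not isinstance(event, dict):
        raise ValueError("expected a JSON object")
    return event


class FolioWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            event = parse_event(body)
        except ValueError:
            # Malformed body: reject it without processing.
            self.send_response(400)
            self.end_headers()
            return
        # Hand the event to your own processing here (queue, database, ...).
        print("received event:", event)
        # ASSUMPTION: a 2xx status acknowledges delivery.
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the listener until interrupted."""
    HTTPServer(("0.0.0.0", port), FolioWebhookHandler).serve_forever()
```

Calling `serve()` starts the listener; the URL registered with the webhook endpoint would then need to route to that port.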

2. Long-poll (?wait=<seconds>)

Pass ?wait=<seconds> on GET /v1/documents/{id}. The server holds the HTTP connection open until the document reaches a terminal state or the timeout expires, whichever comes first. The response is the updated document object.
curl "https://api.glialhealth.com/v1/documents/doc_01j9xkqz3b0000000000000000?wait=30" \
  -H "Authorization: Bearer sk_test_..."
If the document finishes within the window, you get the terminal status in one round-trip. If the timeout expires first, the response contains the current (non-terminal) status; call again in a loop until the status is terminal.
wait=0 (the default) is an immediate poll with no hold: useful for checking status without blocking.
Best for: scripts, CLIs, short-lived jobs where a persistent webhook listener is impractical.
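
The loop described above might look like the following sketch. It assumes the document object exposes its state in a `status` field (the field name is not confirmed on this page), and the names `fetch_document`, `wait_until_terminal`, and `max_calls` are hypothetical. The fetch function is passed in as a callable so the loop can be exercised without network access.

```python
import json
import urllib.request
from typing import Callable

API_BASE = "https://api.glialhealth.com/v1"
TERMINAL_STATES = {"completed", "partially_completed", "failed", "deleted"}


def fetch_document(doc_id: str, api_key: str, wait: int = 30) -> dict:
    """GET /v1/documents/{id}?wait=<seconds> and return the decoded document object."""
    req = urllib.request.Request(
        f"{API_BASE}/documents/{doc_id}?wait={wait}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    # The client-side timeout exceeds the server-side hold so the hold can run its course.
    with urllib.request.urlopen(req, timeout=wait + 10) as resp:
        return json.load(resp)


def wait_until_terminal(fetch: Callable[[], dict], max_calls: int = 20) -> dict:
    """Call fetch() repeatedly until the document is terminal or max_calls is used up."""
    for _ in range(max_calls):
        doc = fetch()
        # ASSUMPTION: the document object carries its state in a "status" field.
        if doc["status"] in TERMINAL_STATES:
            return doc
    raise TimeoutError(f"document not terminal after {max_calls} long-poll calls")
```

A call such as `wait_until_terminal(lambda: fetch_document(doc_id, api_key))` would then block for at most roughly `max_calls` times the wait window.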

3. Fetch result directly

Once you know (via webhook or poll) that the document is in a terminal state, fetch the full extraction result:
curl https://api.glialhealth.com/v1/documents/doc_01j9xkqz3b0000000000000000/result \
  -H "Authorization: Bearer sk_test_..."
A 409 means the document is not yet in a terminal state. A 200 returns the full result object including extract, tables, review_status, flags, and provenance fields.
Best for: fetching the structured payload after any notification method has confirmed completion.
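
The 409/200 distinction can be handled in client code roughly as follows. This is a sketch using Python's standard library; the function name `fetch_result` and the injectable `opener` parameter are hypothetical conveniences, and any status other than 200 or 409 is simply re-raised, since this page does not describe other error responses.

```python
import json
import urllib.error
import urllib.request

API_BASE = "https://api.glialhealth.com/v1"


def fetch_result(doc_id, api_key, opener=urllib.request.urlopen):
    """Return the result object, or None if the API answers 409 (not terminal yet)."""
    req = urllib.request.Request(
        f"{API_BASE}/documents/{doc_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    try:
        with opener(req) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 409:
            # The document has not reached a terminal state; try again later.
            return None
        raise
```

Returning `None` on 409 keeps the "not ready yet" case separate from real failures, which still surface as exceptions.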

Choosing a mode

Scenario                                   Recommended mode
Production server-to-server integration    Webhook push
CLI or notebook (fire and wait)            Long-poll (?wait=30)
Check status from a cron job               Poll (?wait=0) then fetch result
Re-process or debug a past document        Fetch result directly by ID

Listing documents

You can list all documents for your organisation using GET /v1/documents. The endpoint supports cursor-based pagination via limit and starting_after, and accepts a status filter:
curl "https://api.glialhealth.com/v1/documents?status=partially_completed&limit=50" \
  -H "Authorization: Bearer sk_test_..."
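
Walking through every page of a filtered listing could be sketched as below. The query parameters `status`, `limit`, and `starting_after` come from this guide, but the response shape is an assumption: the sketch supposes a JSON object with a `data` array and a `has_more` flag, and that each document has an `id` usable as the next `starting_after` cursor. Check the API reference for the actual field names. The helper names `list_url` and `iter_documents` are hypothetical, and the page fetcher is passed in as a callable.

```python
from typing import Callable, Iterator, Optional
from urllib.parse import urlencode

API_BASE = "https://api.glialhealth.com/v1"


def list_url(status: Optional[str] = None, limit: int = 50,
             starting_after: Optional[str] = None) -> str:
    """Build the GET /v1/documents URL for one page."""
    params = {"limit": limit}
    if status:
        params["status"] = status
    if starting_after:
        params["starting_after"] = starting_after
    return f"{API_BASE}/documents?{urlencode(params)}"


def iter_documents(fetch_page: Callable[[str], dict],
                   status: Optional[str] = None, limit: int = 50) -> Iterator[dict]:
    """Yield documents page by page until the listing is exhausted."""
    cursor = None
    while True:
        page = fetch_page(list_url(status, limit, cursor))
        docs = page["data"]  # ASSUMED response field
        yield from docs
        if not docs or not page.get("has_more"):  # ASSUMED response field
            return
        cursor = docs[-1]["id"]  # ASSUMED: the last document's id is the cursor
```

Here `fetch_page` would be any function that performs an authenticated GET on the given URL and returns the decoded JSON body.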