Every document submitted to POST /v1/documents is processed asynchronously. The API returns 202 Accepted immediately with a document object; the actual extraction happens in the background. This guide explains the status lifecycle and the three consumption modes.
Status lifecycle
A document moves through the following states:
| Status | Meaning |
|---|---|
| received | The file has been accepted and is waiting to enter the queue. |
| queued | The document is in the worker queue, waiting for a free slot. |
| processing | OCR and extraction are running. |
| completed | Extraction finished; all fields passed confidence checks. |
| partially_completed | Extraction finished but one or more fields triggered the human-review threshold. |
| failed | Processing encountered an unrecoverable error. |
| deleted | The document was deleted via DELETE /v1/documents/{id}. |
Terminal states are completed, partially_completed, failed, and deleted. Once a document reaches a terminal state it does not change again.
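Client code usually needs only one question answered about a status: is it terminal? The sketch below encodes the table above as a Python enum. The names DocumentStatus and is_terminal are illustrative and not part of any official SDK, and treating an unrecognised status string as non-terminal is an assumption made here so that callers keep waiting rather than stop early.

```python
from enum import Enum


class DocumentStatus(str, Enum):
    """Status values copied from the lifecycle table."""
    RECEIVED = "received"
    QUEUED = "queued"
    PROCESSING = "processing"
    COMPLETED = "completed"
    PARTIALLY_COMPLETED = "partially_completed"
    FAILED = "failed"
    DELETED = "deleted"


TERMINAL_STATUSES = frozenset({
    DocumentStatus.COMPLETED,
    DocumentStatus.PARTIALLY_COMPLETED,
    DocumentStatus.FAILED,
    DocumentStatus.DELETED,
})


def is_terminal(status: str) -> bool:
    """Return True if the status string is one of the four terminal states."""
    try:
        return DocumentStatus(status) in TERMINAL_STATUSES
    except ValueError:
        # Assumption: an unknown status string is treated as non-terminal.
        return False
```

For example, is_terminal("partially_completed") is true and is_terminal("queued") is false.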
Consumption modes
1. Webhook push (recommended for production)
Register a URL once; Folio calls it whenever a document reaches a terminal state.
curl -X POST https://api.glialhealth.com/v1/webhook-endpoints \
-H "Authorization: Bearer sk_test_..." \
-H "Content-Type: application/json" \
-d '{"url": "https://yourapp.example.com/folio-webhook"}'
Your server receives a POST with the event payload. No polling required. See Webhooks for signature verification and event types.
Best for: production integrations, high volume, latency-sensitive pipelines.
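On the receiving side, a webhook listener can be very small. The following is a minimal sketch using only the Python standard library, not a production implementation. It makes no assumption about the fields in the event payload beyond it being a JSON object; the names parse_event, handle_event, FolioWebhookHandler and serve are hypothetical, and replying with a 2xx status to acknowledge delivery is an assumption. Signature verification is deliberately left out; see Webhooks.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional


def parse_event(raw_body: bytes) -> Optional[dict]:
    """Decode a webhook body; return None if it is not a JSON object."""
    try:
        event = json.loads(raw_body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return None
    return event if isinstance(event, dict) else None


def handle_event(event: dict) -> None:
    """Hypothetical hook: replace with your own processing or queueing."""
    print("Folio event received:", event)


class FolioWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        event = parse_event(self.rfile.read(length))
        # Verify the request signature here before trusting the payload.
        if event is None:
            self.send_response(400)
        else:
            handle_event(event)
            # Assumption: any 2xx response acknowledges the delivery.
            self.send_response(204)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Listen on all interfaces and dispatch incoming webhook calls."""
    HTTPServer(("", port), FolioWebhookHandler).serve_forever()
```

Calling serve() starts the listener. In practice it would sit behind HTTPS at the URL registered above, and handle_event would hand the event to a queue so the response returns quickly.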
2. Long-poll (?wait=<seconds>)
Pass ?wait=<seconds> on GET /v1/documents/{id}. The server holds the HTTP connection open until the document leaves queued/processing, or the timeout expires (whichever comes first). The response is the updated document object.
curl "https://api.glialhealth.com/v1/documents/doc_01j9xkqz3b0000000000000000?wait=30" \
-H "Authorization: Bearer sk_test_..."
If the document finishes within the window, you get the terminal status in one round-trip. If the timeout expires first, the response carries the current (non-terminal) status; call again in a loop until a terminal status comes back.
wait=0 (the default) is an immediate poll with no hold: useful for checking status without blocking.
Best for: scripts, CLIs, short-lived jobs where a persistent webhook listener is impractical.
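The loop described above might look like the sketch below, using only the Python standard library. It assumes the document object exposes its state in a field named status; the function names, the overall deadline, and the short pause between calls (added in case the server returns immediately) are illustrative choices, not documented behaviour.

```python
import json
import time
import urllib.request

API_BASE = "https://api.glialhealth.com"
TERMINAL = {"completed", "partially_completed", "failed", "deleted"}


def fetch_document(doc_id: str, api_key: str, wait: int) -> dict:
    """GET /v1/documents/{id}?wait=<seconds> and return the parsed body."""
    req = urllib.request.Request(
        f"{API_BASE}/v1/documents/{doc_id}?wait={wait}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    # The client timeout exceeds the server hold so the socket outlives it.
    with urllib.request.urlopen(req, timeout=wait + 10) as resp:
        return json.load(resp)


def wait_for_terminal(doc_id, api_key, wait=30, deadline_s=300.0,
                      pause_s=1.0, fetch=fetch_document) -> dict:
    """Long-poll until the document is terminal or the deadline passes."""
    deadline = time.monotonic() + deadline_s
    while True:
        doc = fetch(doc_id, api_key, wait)
        # Assumption: the document object has a "status" field.
        if doc.get("status") in TERMINAL:
            return doc
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{doc_id} still {doc.get('status')!r}")
        time.sleep(pause_s)  # avoid a tight loop on immediate responses
```

The fetch parameter exists so the loop can be exercised with a stub instead of the network.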
3. Fetch result directly
Once you know (via webhook or poll) that the document is in a terminal state, fetch the full extraction result:
curl https://api.glialhealth.com/v1/documents/doc_01j9xkqz3b0000000000000000/result \
-H "Authorization: Bearer sk_test_..."
A 409 means the document is not yet in a terminal state. A 200 returns the full result object including extract, tables, review_status, flags, and provenance fields.
Best for: fetching the structured payload after any notification method has confirmed completion.
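A client can fold the 409 case into its return value instead of treating it as a failure. The sketch below, standard library only, returns None when the result is not ready and re-raises any other HTTP error. The function name fetch_result and the opener parameter are hypothetical and exist only for illustration and testing.

```python
import json
import urllib.error
import urllib.request
from typing import Optional

API_BASE = "https://api.glialhealth.com"


def fetch_result(doc_id: str, api_key: str,
                 opener=urllib.request.urlopen) -> Optional[dict]:
    """Return the result object, or None while the document is not terminal."""
    req = urllib.request.Request(
        f"{API_BASE}/v1/documents/{doc_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    try:
        with opener(req) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 409:
            # 409: the document has not reached a terminal state yet.
            return None
        raise  # any other HTTP error is left for the caller to handle
```

Returning None keeps the calling code simple: a None result means poll or wait for the webhook, anything else is the full payload.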
Choosing a mode
| Scenario | Recommended mode |
|---|---|
| Production server-to-server integration | Webhook push |
| CLI or notebook — fire and wait | Long-poll (?wait=30) |
| Check status from a cron job | Poll (?wait=0) then fetch result |
| Re-process or debug a past document | Fetch result directly by ID |
Listing documents
List all documents for your organisation with GET /v1/documents. The endpoint supports cursor-based pagination via limit and starting_after, plus a status filter:
curl "https://api.glialhealth.com/v1/documents?status=partially_completed&limit=50" \
-H "Authorization: Bearer sk_test_..."
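To walk every page, a client repeats the request and passes the last ID it saw as starting_after. The sketch below shows that loop with the Python standard library. The response shape is an assumption, not something this guide specifies: it supposes a body with a data array of document objects, each carrying an id, and a has_more boolean. Adjust the field names to match the actual list response. The names fetch_page and iter_documents are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Iterator, Optional

API_BASE = "https://api.glialhealth.com"


def fetch_page(api_key: str, params: dict) -> dict:
    """GET /v1/documents with the given query parameters."""
    url = f"{API_BASE}/v1/documents?{urllib.parse.urlencode(params)}"
    req = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def iter_documents(api_key: str, status: Optional[str] = None,
                   limit: int = 50, fetch=fetch_page) -> Iterator[dict]:
    """Yield documents page by page, following the starting_after cursor."""
    params = {"limit": limit}
    if status:
        params["status"] = status
    while True:
        page = fetch(api_key, params)
        # Assumption: the list response has "data" and "has_more" keys.
        items = page.get("data", [])
        yield from items
        if not items or not page.get("has_more"):
            return
        # Assumption: the cursor is the "id" of the last item on the page.
        params = {**params, "starting_after": items[-1]["id"]}
```

Because iter_documents is a generator, a caller can stop early without fetching the remaining pages.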