Get Extraction

GET /v1/extractions/{id}

Retrieve a single extraction record by ID. This endpoint is primarily used to poll for the result of an async extraction.

Try it

Test this endpoint interactively in the Swagger UI.

Authorization required

Include your API key as a Bearer token in the Authorization header.

Request

Headers

Header         Value           Required
Authorization  Bearer <token>  Yes

Path Parameters

Param  Type    Required  Description
id     string  Yes       The extraction ID returned from the Run Extraction endpoint.

Code Examples

bash
curl https://api.docmap.io/v1/extractions/extract_9k2m4n6p8q0r1s3t \
  -H "Authorization: Bearer dm_live_abc123def456ghi789jkl012mno345"
typescript
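const apiKey = process.env.DOCMAP_API_KEY

const response = await fetch(
  'https://api.docmap.io/v1/extractions/extract_9k2m4n6p8q0r1s3t',
  { headers: { 'Authorization': `Bearer ${apiKey}` } },
)

// Fail fast on 401/403/404 instead of destructuring an error body
if (!response.ok) {
  throw new Error(`Request failed with status ${response.status}`)
}

const { data } = await response.json()
console.log(data.status, data.extractedData)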
python
import os

import requests

# Read the key from the environment rather than hardcoding it
api_key = os.environ["DOCMAP_API_KEY"]

response = requests.get(
    "https://api.docmap.io/v1/extractions/extract_9k2m4n6p8q0r1s3t",
    headers={"Authorization": f"Bearer {api_key}"},
    timeout=30,
)
response.raise_for_status()  # surface 401/403/404 as exceptions

data = response.json()["data"]
print(data["status"], data["extractedData"])

Response

Status: 200 OK

The response body is wrapped in a data object containing a single extraction record.

Fields

The fields are identical to those in the Run Extraction response.

Field             Type                                   Description
id                string                                 Unique extraction ID.
userId            string                                 ID of the user who owns this extraction.
templateId        string                                 ID of the template used for extraction.
templateName      string                                 Display name of the template used.
fileName          string                                 Original file name of the uploaded document.
status            "processing" | "completed" | "failed"  Current extraction status.
extractedData     object | null                          Extracted data matching the template fields. null while processing or if failed.
error             string | null                          Error message if failed. null otherwise.
variables         Variable[]                             Array of template variable definitions used during extraction.
source            "dashboard" | "api"                    How the extraction was triggered.
runId             string | null                          Batch run ID, if one was provided.
processingTimeMs  number | null                          Total processing duration in milliseconds. null while still processing.
createdAt         string                                 ISO 8601 timestamp of when the extraction was created.
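
For callers that prefer typed values over raw JSON, the sketch below converts two of the fields listed above: createdAt into a timezone-aware datetime and processingTimeMs into a timedelta. The name parse_timing is hypothetical and not part of any DocMap SDK, and the code assumes createdAt ends in "Z", as it does in the examples on this page.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def parse_timing(record: dict) -> Tuple[datetime, Optional[timedelta]]:
    """Convert createdAt and processingTimeMs into Python time types.

    Hypothetical helper, not part of any official SDK. Assumes createdAt
    is an ISO 8601 string with a trailing "Z", as in the examples here.
    """
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11,
    # so rewrite it as an explicit UTC offset first.
    created_at = datetime.fromisoformat(record["createdAt"].replace("Z", "+00:00"))

    # processingTimeMs is null (None) while the extraction is still processing.
    ms = record["processingTimeMs"]
    processing_time = None if ms is None else timedelta(milliseconds=ms)
    return created_at, processing_time


created, took = parse_timing(
    {"createdAt": "2024-11-20T14:30:00.000Z", "processingTimeMs": 3842}
)
print(created.isoformat(), took.total_seconds())
```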

Example (completed)

json
{
  "data": {
    "id": "extract_9k2m4n6p8q0r1s3t",
    "userId": "uid_a1b2c3d4e5f6",
    "templateId": "tmpl_8f3a2b1c4d5e6f7g",
    "templateName": "Invoice Template",
    "fileName": "invoice-2024-001.pdf",
    "status": "completed",
    "extractedData": {
      "vendor_name": "Acme Corp",
      "invoice_number": "INV-2024-001",
      "total_amount": 1250.00
    },
    "error": null,
    "variables": [
      {
        "name": "vendor_name",
        "type": "string",
        "description": "Name of the vendor or supplier"
      }
    ],
    "source": "api",
    "runId": null,
    "processingTimeMs": 3842,
    "createdAt": "2024-11-20T14:30:00.000Z"
  }
}

Example (still processing)

json
{
  "data": {
    "id": "extract_9k2m4n6p8q0r1s3t",
    "userId": "uid_a1b2c3d4e5f6",
    "templateId": "tmpl_8f3a2b1c4d5e6f7g",
    "templateName": "Invoice Template",
    "fileName": "invoice-2024-001.pdf",
    "status": "processing",
    "extractedData": null,
    "error": null,
    "variables": [
      {
        "name": "vendor_name",
        "type": "string",
        "description": "Name of the vendor or supplier"
      }
    ],
    "source": "api",
    "runId": null,
    "processingTimeMs": null,
    "createdAt": "2024-11-20T14:30:00.000Z"
  }
}
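
The two examples above differ only in status, extractedData, and processingTimeMs, so a caller can branch on status alone. A minimal sketch, assuming a hypothetical ExtractionFailed exception and unwrap_extraction helper (neither is part of the DocMap API):

```python
from typing import Optional


class ExtractionFailed(Exception):
    """Hypothetical exception raised when status is "failed"."""


def unwrap_extraction(body: dict) -> Optional[dict]:
    """Return extractedData when completed, None while still processing.

    `body` is the full parsed response, including the outer "data" wrapper.
    """
    record = body["data"]
    status = record["status"]
    if status == "completed":
        return record["extractedData"]
    if status == "failed":
        # `error` holds the failure message when status is "failed".
        raise ExtractionFailed(record["error"])
    if status == "processing":
        return None
    # Guard against a status value this page does not document.
    raise ValueError(f"Unexpected status: {status!r}")


# Trimmed-down versions of the two example responses.
completed = {
    "data": {
        "status": "completed",
        "extractedData": {"vendor_name": "Acme Corp"},
        "error": None,
    }
}
processing = {
    "data": {"status": "processing", "extractedData": None, "error": None}
}

print(unwrap_extraction(completed))
print(unwrap_extraction(processing))
```

Returning None for "processing" keeps the helper usable inside a polling loop, while a failed extraction surfaces as an exception rather than a silently empty result.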

Polling Pattern

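When using async extractions, poll this endpoint until the status is no longer "processing" (that is, until it is "completed" or "failed"). The examples below give up after 30 attempts spaced 2 seconds apart, about 60 seconds in total: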

typescript
async function pollExtraction(extractionId: string, apiKey: string) {
  const maxAttempts = 30
  const intervalMs = 2000

  for (let i = 0; i < maxAttempts; i++) {
    const response = await fetch(
      `https://api.docmap.io/v1/extractions/${extractionId}`,
      { headers: { 'Authorization': `Bearer ${apiKey}` } },
    )

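    // Stop polling on 401/403/404; retrying will not change the outcome
    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`)
    }

    const { data } = await response.json()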

    if (data.status !== 'processing') {
      return data
    }

    await new Promise((resolve) => setTimeout(resolve, intervalMs))
  }

  throw new Error('Extraction timed out')
}
python
import time
import requests

def poll_extraction(extraction_id: str, api_key: str):
    max_attempts = 30
    interval_s = 2

    for _ in range(max_attempts):
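        response = requests.get(
            f"https://api.docmap.io/v1/extractions/{extraction_id}",
            headers={"Authorization": f"Bearer {api_key}"},
            timeout=30,
        )
        # Stop polling on 401/403/404; retrying will not change the outcome
        response.raise_for_status()

        data = response.json()["data"]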

        if data["status"] != "processing":
            return data

        time.sleep(interval_s)

    raise TimeoutError("Extraction timed out")

TIP

A polling interval of 2 seconds is recommended. Most extractions complete within 5 to 30 seconds.
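
The attempt-count loops above bound the number of requests rather than the elapsed time. As an alternative sketch, the variant below polls against a wall-clock deadline and takes the HTTP call as a parameter, so it can be exercised without network access. The name poll_until_done and its fetch, sleep, and clock parameters are hypothetical; the 60-second default simply mirrors 30 attempts at the recommended 2-second interval.

```python
import time
from typing import Callable


def poll_until_done(
    fetch: Callable[[], dict],
    timeout_s: float = 60.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch() until the record leaves "processing" or the deadline passes.

    `fetch` is a caller-supplied function (hypothetical) that performs the
    GET request and returns the inner "data" record as a dict.
    """
    deadline = clock() + timeout_s
    while True:
        record = fetch()
        if record["status"] != "processing":
            return record
        # Give up if the next wait would run past the deadline.
        if clock() + interval_s > deadline:
            raise TimeoutError("Extraction timed out")
        sleep(interval_s)


# Demo with canned responses instead of real HTTP calls.
canned = iter([{"status": "processing"}, {"status": "completed"}])
print(poll_until_done(lambda: next(canned), sleep=lambda s: None)["status"])
```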

Errors

Status  Code          Description
401     UNAUTHORIZED  Missing, invalid, or expired API key / token.
403     FORBIDDEN     The extraction belongs to a different user.
404     NOT_FOUND     No extraction exists with the specified ID.
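
Because each documented error corresponds to a distinct HTTP status, client code can translate the status into the code from the table before trying to read the body. The sketch below is illustrative only: ApiError and check_status are hypothetical names, and since the shape of the error response body is not documented on this page, only the HTTP status is inspected.

```python
# Status-to-code mapping copied from the errors table.
ERROR_CODES = {
    401: "UNAUTHORIZED",
    403: "FORBIDDEN",
    404: "NOT_FOUND",
}


class ApiError(Exception):
    """Hypothetical exception carrying the HTTP status and documented code."""

    def __init__(self, status: int, code: str):
        super().__init__(f"{status} {code}")
        self.status = status
        self.code = code


def check_status(status: int) -> None:
    """Raise ApiError for non-2xx statuses; return None on success."""
    if 200 <= status < 300:
        return
    # Statuses outside the table get a placeholder code (an assumption).
    raise ApiError(status, ERROR_CODES.get(status, "UNKNOWN"))


check_status(200)  # no exception
try:
    check_status(404)
except ApiError as exc:
    print(exc.status, exc.code)  # prints: 404 NOT_FOUND
```

With the requests library, this could be called as check_status(response.status_code) before response.json().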

DocMap API Documentation