List Extractions

GET /v1/extractions

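Lists extraction records for the authenticated user. Results are sorted in reverse chronological order (newest first). Supports cursor-based pagination and filtering by template, batch run ID, or date range.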

Try it

Test this endpoint interactively in the Swagger UI.

Authentication required

Include your API key in the Authorization header.

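Request

Headers

Header	Value	Required
`Authorization`	`Bearer <token>`	Yes

Query parameters

Parameter	Type	Required	Description
`templateId`	`string`	No	Filter results by a specific template.
`runId`	`string`	No	Filter results by a specific batch run. Returns all matching results (`limit` and `cursor` are ignored).
`limit`	`number`	No	Maximum number of results to return. Default: `50`, maximum: `100`.
`cursor`	`string`	No	Pagination cursor. Pass the `createdAt` value of the last record on the previous page to fetch the next page.
`dateFrom`	`string`	No	Only include extractions created on or after this date (`YYYY-MM-DD`).
`dateTo`	`string`	No	Only include extractions created on or before this date (`YYYY-MM-DD`).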
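
The constraints in the parameter table can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official SDK: the helper name `build_query` is hypothetical, and clamping `limit` into the range 1 to 100 is an assumption about reasonable client behavior, since the table only states the default and the maximum.

```python
from datetime import date
from typing import Optional


def build_query(
    template_id: Optional[str] = None,
    run_id: Optional[str] = None,
    limit: Optional[int] = None,
    cursor: Optional[str] = None,
    date_from: Optional[date] = None,
    date_to: Optional[date] = None,
) -> dict:
    """Build the query-parameter dict for GET /v1/extractions (hypothetical helper)."""
    params = {}
    if template_id is not None:
        params["templateId"] = template_id
    if date_from is not None:
        params["dateFrom"] = date_from.isoformat()  # date.isoformat() yields YYYY-MM-DD
    if date_to is not None:
        params["dateTo"] = date_to.isoformat()
    if run_id is not None:
        # limit and cursor are ignored when runId is set, so leave them out.
        params["runId"] = run_id
        return params
    if limit is not None:
        # Assumption: clamp on the client; the documented maximum is 100.
        params["limit"] = max(1, min(limit, 100))
    if cursor is not None:
        params["cursor"] = cursor
    return params
```

The resulting dict can be passed as `params=` to `requests.get`, in the same way as the Python code examples on this page.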

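Pagination

The response includes a hasMore boolean. When it is true, pass the createdAt value of the last record as the cursor parameter to fetch the next page.

When filtering by runId, all matching extractions are returned in a single response (batch runs are typically small, so pagination is not needed).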
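
The cursor loop can also be written as a generator, which lets a caller stop early without fetching every page. This is a minimal sketch: `iter_extractions` and the `fetch_page` callable (which takes a query-parameter dict and returns the parsed JSON body) are hypothetical names, not part of the API. Injecting the fetcher keeps the pagination logic testable without network access.

```python
from typing import Callable, Iterator


def iter_extractions(
    fetch_page: Callable[[dict], dict], limit: int = 50
) -> Iterator[dict]:
    """Yield extraction records one by one, following the cursor until hasMore is false."""
    params = {"limit": limit}
    while True:
        page = fetch_page(params)
        records = page["data"]
        yield from records
        # Stop when the server reports no further pages. The empty-page check is
        # a defensive guard: without a last record there is no createdAt to use.
        if not page["hasMore"] or not records:
            return
        params = {"limit": limit, "cursor": records[-1]["createdAt"]}
```

In practice `fetch_page` might wrap a `requests.get(...)` call and return `response.json()`.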

Code examples

bash

# List all extractions (default limit of 50)
curl https://api.docmap.io/v1/extractions \
  -H "Authorization: Bearer dm_live_a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0"

# Filter by template and limit results
curl "https://api.docmap.io/v1/extractions?templateId=48291&limit=10" \
  -H "Authorization: Bearer dm_live_a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0"

# Filter by batch run ID
curl "https://api.docmap.io/v1/extractions?runId=b4704c6e-8917-4671-8c92-178aec3eba92" \
  -H "Authorization: Bearer dm_live_a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0"

# Filter by date range
curl "https://api.docmap.io/v1/extractions?dateFrom=2025-01-01&dateTo=2025-01-31" \
  -H "Authorization: Bearer dm_live_a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0"

# Paginate through results
curl "https://api.docmap.io/v1/extractions?limit=10&cursor=2025-01-15T09:30:00.000Z" \
  -H "Authorization: Bearer dm_live_a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0"

typescript

const apiKey = process.env.DOCMAP_API_KEY

// List all extractions
const response = await fetch('https://api.docmap.io/v1/extractions', {
  headers: { 'Authorization': `Bearer ${apiKey}` },
})

const { data, hasMore } = await response.json()
console.log(`Found ${data.length} extractions, hasMore: ${hasMore}`)

// Paginate through all results
let cursor: string | undefined
const allExtractions = []

do {
  const url = new URL('https://api.docmap.io/v1/extractions')
  url.searchParams.set('limit', '50')
  if (cursor) url.searchParams.set('cursor', cursor)

  const res = await fetch(url, {
    headers: { 'Authorization': `Bearer ${apiKey}` },
  })
  const page = await res.json()

  allExtractions.push(...page.data)
  cursor = page.hasMore ? page.data.at(-1)?.createdAt : undefined
} while (cursor)

console.log(`Total: ${allExtractions.length} extractions`)

// Filter by template
const filtered = await fetch(
  'https://api.docmap.io/v1/extractions?templateId=48291&limit=10',
  { headers: { 'Authorization': `Bearer ${apiKey}` } },
)

const { data: filteredData } = await filtered.json()

python

import os

import requests

# Read the API key from the environment instead of hardcoding it
api_key = os.environ["DOCMAP_API_KEY"]
headers = {"Authorization": f"Bearer {api_key}"}

# List all extractions
response = requests.get(
    "https://api.docmap.io/v1/extractions",
    headers=headers,
)
result = response.json()
print(f"Found {len(result['data'])} extractions, hasMore: {result['hasMore']}")

# Paginate through all results
all_extractions = []
cursor = None

while True:
    params = {"limit": 50}
    if cursor:
        params["cursor"] = cursor

    response = requests.get(
        "https://api.docmap.io/v1/extractions",
        headers=headers,
        params=params,
    )
    response.raise_for_status()  # Fail fast on 401 and other HTTP errors
    page = response.json()

    all_extractions.extend(page["data"])

    if not page["hasMore"]:
        break
    cursor = page["data"][-1]["createdAt"]

print(f"Total: {len(all_extractions)} extractions")

# Filter by date range
response = requests.get(
    "https://api.docmap.io/v1/extractions",
    headers=headers,
    params={"dateFrom": "2025-01-01", "dateTo": "2025-01-31"},
)

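Response

Status: 200 OK

The response body contains a data array of extraction records and a hasMore boolean indicating whether more pages are available.

Top-level fields

Field	Type	Description
`data`	`ExtractionRecord[]`	Array of extraction records.
`hasMore`	`boolean`	`true` if more results exist beyond the current page. Always `false` when filtering by `runId` (all results are returned).

Extraction record fields

Each extraction record in the array contains the same fields as the Run Extraction response:

Field	Type	Description
`id`	`string`	Unique extraction ID (prefixed with `extract-`).
`userId`	`string`	ID of the user who owns this extraction.
`templateId`	`string`	ID of the template used for the extraction.
`templateName`	`string`	Display name of the template used.
`fileName`	`string`	Original file name of the uploaded document.
`status`	`"processing"` | `"completed"` | `"failed"`	Current extraction status: `"processing"` while in progress, `"completed"` on success, `"failed"` on error.
`extractedData`	`object` | `null`	Extracted data matching the template fields. `null` if the extraction failed.
`error`	`string` | `null`	Error message describing what went wrong. `null` if the extraction succeeded.
`variables`	`Variable[]`	Array of template variable definitions used during the extraction.
`source`	`"dashboard"` | `"api"`	How the extraction was triggered.
`runId`	`string` | `null`	Batch run ID, if one was provided.
`processingTimeMs`	`number` | `null`	Total processing time in milliseconds.
`createdAt`	`string`	ISO 8601 timestamp of when the extraction was created.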
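
Because `status`, `extractedData`, and `error` are tied together, a client will often want to split a page of records by outcome. The sketch below illustrates one way to do that with plain dicts; the function names `group_by_status` and `failure_report` are hypothetical and only rely on the fields listed in the table.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_by_status(records: List[dict]) -> Dict[str, List[dict]]:
    """Group extraction records by their status field."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["status"]].append(record)
    return dict(grouped)


def failure_report(records: List[dict]) -> List[Tuple[str, str]]:
    """Return (fileName, error) pairs for records whose status is "failed"."""
    return [
        (record["fileName"], record["error"])
        for record in records
        if record["status"] == "failed"
    ]
```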

Example

json

{
  "data": [
    {
      "id": "extract-KmL9nOpQrStUvWxYz",
      "userId": "L5kM9nRpQ7vX3yZ1wD4eF6gHjA2",
      "templateId": "48291",
      "templateName": "Invoice Template",
      "fileName": "invoice-2024-001.pdf",
      "status": "completed",
      "extractedData": {
        "vendor_name": "Acme Corp",
        "invoice_number": "INV-2024-001",
        "total_amount": 1250.00
      },
      "error": null,
      "variables": [
        {
          "name": "vendor_name",
          "type": "string",
          "description": "Name of the vendor or supplier"
        }
      ],
      "source": "api",
      "runId": null,
      "processingTimeMs": 3842,
      "createdAt": "2025-01-20T14:30:00.000Z"
    }
  ],
  "hasMore": true
}
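
The `createdAt` value in a response like this one is a string, so comparing or formatting it usually means parsing it first. The following sketch shows one way to do so with the standard library; `parse_created_at` is a hypothetical helper name. The trailing `Z` is rewritten to `+00:00` because `datetime.fromisoformat` only accepts `Z` directly from Python 3.11 onward.

```python
from datetime import datetime, timezone


def parse_created_at(value: str) -> datetime:
    """Parse an ISO 8601 createdAt string into a timezone-aware datetime."""
    # Rewrite a trailing "Z" so the call also works on Python versions before 3.11.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


# A trimmed copy of the example record, keeping only the fields used here.
record = {
    "id": "extract-KmL9nOpQrStUvWxYz",
    "processingTimeMs": 3842,
    "createdAt": "2025-01-20T14:30:00.000Z",
}

created = parse_created_at(record["createdAt"])
print(record["id"], created.astimezone(timezone.utc).date(), record["processingTimeMs"], "ms")
```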

Errors

Status	Error code	Description
`401`	`UNAUTHORIZED`	Missing, invalid, or expired API key or token.
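
A client can turn the 401 status into a dedicated exception so that callers can tell authentication problems apart from other failures. This is a minimal sketch under stated assumptions: `UnauthorizedError` and `ensure_authorized` are hypothetical names, and the check looks only at the HTTP status code because the shape of the error body is not described here.

```python
class UnauthorizedError(Exception):
    """Raised when the API responds with 401 UNAUTHORIZED."""


def ensure_authorized(status_code: int) -> None:
    """Raise UnauthorizedError for a 401 status; do nothing otherwise."""
    if status_code == 401:
        raise UnauthorizedError("API key or token is missing, invalid, or expired")
```

With `requests`, this could be called as `ensure_authorized(response.status_code)` before reading `response.json()`.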
