📚 API Documentation
Submit PDF for PageIndex Computation
- Endpoint (POST): https://api.vectify.ai/pageindex
- Description: Initiates the conversion of a PDF document into a structured hierarchical tree format. Immediately returns a task identifier (task_id).
Request Body:
- file (binary, required): PDF file to be processed
Optional Parameters:
- model: OpenAI model to use (default: "gpt-4o-2024-11-20")
- toc_check_page_num: Number of initial pages to check for table of contents (default: 20)
- max_page_num_each_node: Max pages allowed for each node (default: 10)
- max_token_num_each_node: Max tokens allowed for each node (default: 20000)
- if_add_node_id: Include a node ID for each node ("yes" / "no") (default: "yes")
- if_add_node_text: Include node text for each node ("yes" / "no") (default: "no")
- if_add_node_summary: Include a summary for each node ("yes" / "no") (default: "no")
- if_add_doc_description: Include a description for the document ("yes" / "no") (default: "yes")
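Because the optional parameters are passed as plain form fields, a misspelled name is easy to miss. The helper below is a minimal client-side sketch, not part of the API: `build_options` is a hypothetical name, and the table of names and defaults is copied from the list above. It returns only the overrides, so that the server-side defaults apply to everything else. How the service itself treats unknown field names is not documented here.

```python
from typing import Any, Dict

# Names and defaults copied from the "Optional Parameters" list above.
DEFAULTS: Dict[str, Any] = {
    "model": "gpt-4o-2024-11-20",
    "toc_check_page_num": 20,
    "max_page_num_each_node": 10,
    "max_token_num_each_node": 20000,
    "if_add_node_id": "yes",
    "if_add_node_text": "no",
    "if_add_node_summary": "no",
    "if_add_doc_description": "yes",
}

# The four if_add_* switches take the strings "yes" or "no".
YES_NO_FLAGS = {name for name in DEFAULTS if name.startswith("if_add_")}


def build_options(**overrides: Any) -> Dict[str, Any]:
    """Validate overrides locally and return them as a form-data dict.

    Hypothetical helper: it only checks names and yes/no values on the
    client side before the request is sent.
    """
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise ValueError(f"Unknown parameter(s): {', '.join(unknown)}")
    for name, value in overrides.items():
        if name in YES_NO_FLAGS and value not in ("yes", "no"):
            raise ValueError(f'{name} must be "yes" or "no", got {value!r}')
    return dict(overrides)
```

The returned dict can be passed as the `data=` argument of `requests.post`, for example `data=build_options(toc_check_page_num=15, if_add_node_summary="yes")`.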
Example Request:
import requests

with open('./2023-annual-report.pdf', 'rb') as file:
    response = requests.post(
        "https://api.vectify.ai/pageindex",
        headers={'api_key': 'YOUR_API_KEY_HERE'},
        files={'file': file}
    )
Example Request with Optional Parameters:
import requests

with open('./2023-annual-report.pdf', 'rb') as file:
    response = requests.post(
        "https://api.vectify.ai/pageindex",
        headers={'api_key': 'YOUR_API_KEY_HERE'},
        files={'file': file},
        data={
            "toc_check_page_num": 15,
            "max_page_num_each_node": 8,
        }
    )
See here for the example PDF document.
Example Response:
{
  "task_id": "abc123def456"
}
Check Status and Retrieve Results
- Endpoint (POST): https://api.vectify.ai/pageindex/status
- Description: Checks computation status and retrieves results once processing is complete.
Request Body:
- task_id (string, required): Task ID returned in the submit response
Computation Status:
The status returned from the endpoint indicates the progress of PDF processing tasks:
- queued: Task is queued and waiting to begin processing
- processing: Task is currently being processed
- completed: Task processing is complete; results are ready
- failed: Task processing encountered an error
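The four status values above lend themselves to a simple polling loop: keep asking while the task is queued or processing, and stop on completed or failed. The following is a rough sketch under stated assumptions rather than an official client. `wait_for_result` and `fetch_status_http` are hypothetical names, and the polling interval and timeout are arbitrary choices. The status fetcher is injected as a callable, so the loop can be exercised without network access.

```python
import time
from typing import Any, Callable, Dict


def fetch_status_http(task_id: str, api_key: str) -> Dict[str, Any]:
    """Call the status endpoint once and return the decoded JSON body."""
    import requests  # third-party; imported lazily so the loop below has no hard dependency

    response = requests.post(
        "https://api.vectify.ai/pageindex/status",
        headers={"api_key": api_key},
        json={"task_id": task_id},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()


def wait_for_result(
    task_id: str,
    fetch_status: Callable[[str], Dict[str, Any]],
    interval_s: float = 5.0,      # assumed polling interval, not an API requirement
    timeout_s: float = 600.0,     # assumed overall deadline
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the task is completed; raise on failure or timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        payload = fetch_status(task_id)
        status = payload.get("status")
        if status == "completed":
            return payload
        if status == "failed":
            raise RuntimeError(f"Task {task_id} failed")
        # "queued" and "processing" both mean: wait and ask again.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Task {task_id} still {status!r} after {timeout_s}s")
        sleep(interval_s)
```

A typical call would be `wait_for_result(task_id, lambda t: fetch_status_http(t, "YOUR_API_KEY_HERE"))`.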
Example Request:
import requests

response = requests.post(
    "https://api.vectify.ai/pageindex/status",
    headers={'api_key': 'YOUR_API_KEY_HERE'},
    json={"task_id": "abc123def456"}
)
Example Response (Processing):
{
  "task_id": "abc123def456",
  "status": "processing"
}
Example Response (Completed):
{
  "task_id": "abc123def456",
  "status": "completed",
  "result": [
    ...
    {
      "title": "Financial Stability",
      "node_id": "0006",
      "start_index": 21,
      "end_index": 22,
      "summary": "The Federal Reserve maintains financial stability by...",
      "child_nodes": [
        {
          "title": "Monitoring Financial Vulnerabilities",
          "node_id": "0007",
          "start_index": 22,
          "end_index": 28,
          "summary": "The Federal Reserve's monitoring focuses on..."
        },
        {
          "title": "Domestic and International Cooperation and Coordination",
          "node_id": "0008",
          "start_index": 28,
          "end_index": 31,
          "summary": "In 2023, the Federal Reserve collaborated internationally..."
        }
      ]
    }
    ...
  ]
}
See here for a complete example output structure generated by PageIndex from the above example PDF document.
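Since the result is a tree of nodes nested under child_nodes, a short recursive walk is usually enough to turn it into a readable outline. The sketch below assumes only the field names visible in the example response above (title, node_id, start_index, end_index, child_nodes). `iter_nodes` and `outline` are hypothetical helper names, and nodes without child_nodes are treated as leaves.

```python
from typing import Any, Dict, Iterator, List, Tuple

Node = Dict[str, Any]


def iter_nodes(nodes: List[Node], depth: int = 0) -> Iterator[Tuple[int, Node]]:
    """Yield (depth, node) pairs in document order (pre-order traversal)."""
    for node in nodes:
        yield depth, node
        # Leaf nodes in the example response simply omit "child_nodes".
        yield from iter_nodes(node.get("child_nodes", []), depth + 1)


def outline(nodes: List[Node]) -> List[str]:
    """Render the tree as indented lines: node_id, title, index range."""
    lines = []
    for depth, node in iter_nodes(nodes):
        indent = "  " * depth
        node_id = node.get("node_id", "?")  # absent when if_add_node_id is "no"
        span = f"[{node['start_index']}-{node['end_index']}]"
        lines.append(f"{indent}{node_id} {node['title']} {span}")
    return lines


# Sample data trimmed from the example response above.
sample = [
    {
        "title": "Financial Stability",
        "node_id": "0006",
        "start_index": 21,
        "end_index": 22,
        "child_nodes": [
            {"title": "Monitoring Financial Vulnerabilities",
             "node_id": "0007", "start_index": 22, "end_index": 28},
            {"title": "Domestic and International Cooperation and Coordination",
             "node_id": "0008", "start_index": 28, "end_index": 31},
        ],
    }
]

for line in outline(sample):
    print(line)
```

With a real response, the same call would be `outline(response.json()["result"])`.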
⚠️ API Response Codes
- 200: Request successful
- 400: Bad request due to missing/invalid parameters (Resolution: Check request parameters)
- 401: Unauthorized; invalid or missing API key (Resolution: Ensure API key is correct)
- 404: Task or PDF file not found (Resolution: Verify task_id and PDF path)
- 413: File size too large (Resolution: Use smaller file or contact support)
- 500: Internal server error (Resolution: Retry later; if persistent, contact support)
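One possible way to surface these codes in client code is to map each one to its suggested resolution and raise a single exception type. This is an illustrative sketch only: `PageIndexAPIError` and `raise_for_api_status` are hypothetical names, the hint strings are copied from the list above, and treating only 500 as retryable is an assumption based on its "Retry later" resolution.

```python
# Resolution hints copied from the response code list above.
RESOLUTIONS = {
    400: "Check request parameters",
    401: "Ensure API key is correct",
    404: "Verify task_id and PDF path",
    413: "Use smaller file or contact support",
    500: "Retry later; if persistent, contact support",
}


class PageIndexAPIError(Exception):
    """Hypothetical client-side error carrying the status code and a hint."""

    def __init__(self, status_code: int, hint: str) -> None:
        super().__init__(f"HTTP {status_code}: {hint}")
        self.status_code = status_code
        self.hint = hint
        # Assumption: only 500 is worth retrying automatically.
        self.retryable = status_code == 500


def raise_for_api_status(status_code: int) -> None:
    """Return silently on 200; otherwise raise with the documented hint."""
    if status_code == 200:
        return
    hint = RESOLUTIONS.get(status_code, "Unlisted status code; inspect the response body")
    raise PageIndexAPIError(status_code, hint)
```

After any of the requests shown earlier, `raise_for_api_status(response.status_code)` would be called before reading `response.json()`.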