Document Parser API
Integrate Airparser into your applications with our comprehensive API. Parse documents, create extraction schemas, and receive extracted data via webhooks.
📤 Parse Documents via API
Send PDFs and emails to your Airparser inbox for automatic parsing. Documents are processed immediately and extracted data is available via webhooks or integrations.
cURL - Parse Document
curl \
-X POST \
https://api.airparser.com/inboxes/<INBOX_ID>/upload \
-F 'file=@./receipt.pdf' \
-H "X-API-Key: <YOUR_API_KEY>"
Python - Parse Document
import requests
headers = {"X-API-Key": "<YOUR_API_KEY>"}
url = "https://api.airparser.com/inboxes/<INBOX_ID>/upload"
# Send the document as multipart/form-data in the "file" field
with open("invoice.pdf", "rb") as f:
    response = requests.post(url, files={"file": f}, headers=headers)
print("response:", response.status_code, response.text)
Node.js - Parse Document
const fetch = require("node-fetch");
const fs = require("fs");
const FormData = require("form-data");
const APIKEY = "<YOUR_API_KEY>";
const inboxId = "<INBOX_ID>";
const filePath = "/path/to/your/file.pdf";
const metadata = { foo: "bar" };
async function parseDocument(inboxId, filePath, metadata) {
  const url = `https://api.airparser.com/inboxes/${inboxId}/upload`;
  const fileStream = fs.createReadStream(filePath);
  const form = new FormData();
  form.append("file", fileStream);
  form.append("meta", JSON.stringify(metadata));
  try {
    const response = await fetch(url, {
      method: "POST",
      body: form,
      headers: {
        "X-API-Key": APIKEY,
      },
    });
    const docId = await response.json();
    console.log(response.status);
    console.log("Document id:", docId);
  } catch (e) {
    console.error("Error:", e.message);
  }
}
parseDocument(inboxId, filePath, metadata);
PHP - Parse Document
<?php
$apikey = '<API_KEY>';
$url = 'https://api.airparser.com/inboxes/<INBOX_ID>/upload';
$filepath = './invoice.pdf';
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_HTTPHEADER, array(
    'X-API-Key: ' . $apikey
));
curl_setopt($curl, CURLOPT_POST, true);
$meta = array(
    'foo' => 'bar',
    'my_id' => 42,
);
$metaJson = json_encode($meta);
curl_setopt($curl, CURLOPT_POSTFIELDS, array(
    'file' => curl_file_create($filepath, 'application/pdf', 'invoice.pdf'),
    'meta' => $metaJson
));
$response = curl_exec($curl);
curl_close($curl);
echo $response;
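The Node.js and PHP examples above attach a `meta` form field containing a JSON string, while the Python example uploads the file only. The following is a minimal sketch of the same upload with metadata in Python. The helper names `build_upload_request` and `upload_document` are hypothetical, and the sketch assumes the third-party `requests` library is installed.

```python
import json

API_BASE = "https://api.airparser.com"


def build_upload_request(inbox_id, api_key, meta=None):
    """Return (url, headers, form_fields) for an upload call.

    Hypothetical helper: it only assembles the pieces shown in the
    examples above and performs no network I/O.
    """
    url = f"{API_BASE}/inboxes/{inbox_id}/upload"
    headers = {"X-API-Key": api_key}
    form_fields = {}
    if meta is not None:
        # The Node.js and PHP examples send metadata as a JSON string
        # in a form field named "meta".
        form_fields["meta"] = json.dumps(meta)
    return url, headers, form_fields


def upload_document(inbox_id, api_key, file_path, meta=None):
    """Upload one file and return (HTTP status code, response body text)."""
    import requests  # third-party; imported here so the helper above works without it

    url, headers, form_fields = build_upload_request(inbox_id, api_key, meta)
    with open(file_path, "rb") as f:
        # Passing both data= and files= makes requests build one multipart body.
        response = requests.post(
            url, headers=headers, data=form_fields, files={"file": f}
        )
    return response.status_code, response.text
```

A call such as `upload_document("<INBOX_ID>", "<YOUR_API_KEY>", "invoice.pdf", meta={"foo": "bar", "my_id": 42})` would mirror the PHP example. Keeping request assembly separate from sending makes the URL, header, and `meta` encoding easy to check without touching the network.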
⚙️ Create Extraction Schema
Define your extraction fields programmatically. Create schemas with scalar fields, lists, and enums to structure your parsed data.
API Endpoint
POST /inboxes/<inbox_id>/schema
Content-Type: application/json
X-API-Key: <YOUR_API_KEY>
Schema Example
{
  "fields": [
    {
      "type": "scalar",
      "data": {
        "name": "invoice_number",
        "description": "Invoice reference number",
        "type": "string",
        "default_value": ""
      }
    },
    {
      "type": "list",
      "data": {
        "name": "items",
        "description": "List of items in the invoice",
        "attributes": [
          {
            "name": "description",
            "description": "Item description",
            "type": "string",
            "default_value": ""
          },
          {
            "name": "amount",
            "description": "Item amount",
            "type": "decimal",
            "default_value": "0.00"
          }
        ]
      }
    },
    {
      "type": "enum",
      "data": {
        "name": "payment_status",
        "description": "Current payment status",
        "values": ["paid", "pending", "overdue"]
      }
    }
  ]
}
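One way to produce this payload programmatically is sketched below in Python, using only the standard library. The helper functions (`scalar_field`, `list_field`, `enum_field`, `build_schema_request`) are hypothetical names chosen for illustration, and the base URL is assumed to be the same host as the upload endpoint, since the endpoint above is given only as a path.

```python
import json
import urllib.request

# Assumption: the schema endpoint lives on the same host as the upload endpoint.
API_BASE = "https://api.airparser.com"


def scalar_field(name, description, type_="string", default_value=""):
    """Build a scalar field entry in the shape shown in the schema example."""
    return {
        "type": "scalar",
        "data": {
            "name": name,
            "description": description,
            "type": type_,
            "default_value": default_value,
        },
    }


def list_field(name, description, attributes):
    """Build a list field; each attribute is a dict with name, description, type, default_value."""
    return {
        "type": "list",
        "data": {"name": name, "description": description, "attributes": attributes},
    }


def enum_field(name, description, values):
    """Build an enum field restricted to the given values."""
    return {
        "type": "enum",
        "data": {"name": name, "description": description, "values": list(values)},
    }


def build_schema_request(inbox_id, api_key, fields):
    """Prepare (but do not send) the POST request for the schema endpoint."""
    body = json.dumps({"fields": fields}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/inboxes/{inbox_id}/schema",
        data=body,
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


fields = [
    scalar_field("invoice_number", "Invoice reference number"),
    list_field(
        "items",
        "List of items in the invoice",
        [
            {"name": "description", "description": "Item description",
             "type": "string", "default_value": ""},
            {"name": "amount", "description": "Item amount",
             "type": "decimal", "default_value": "0.00"},
        ],
    ),
    enum_field("payment_status", "Current payment status", ["paid", "pending", "overdue"]),
]
request = build_schema_request("<INBOX_ID>", "<YOUR_API_KEY>", fields)
```

Passing `request` to `urllib.request.urlopen` would send it. The sketch stops short of that so the payload can be inspected first.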
🔧 Post-Processing
Modify parsed data before it's exported to integrations or webhooks. Add custom formatting, business logic, and data transformations using Python.
Merge Fields
# Merge first and last name into full name
data['fullname'] = data['first_name'] + " " + data['last_name']
# Using f-strings (recommended)
data["fullname"] = f"{data['first_name']} {data['last_name']}"
# Using format() function
data['fullname'] = '{} {}'.format(data['first_name'], data['last_name'])
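All three variants above raise a `KeyError` if either key is absent from `data`, and the first raises a `TypeError` if a value is not a string. The following is a minimal sketch of a more defensive merge, assuming `data` behaves like an ordinary Python dict and that a field may be missing or empty. The function name `merge_name` is hypothetical.

```python
def merge_name(data):
    """Set data['fullname'] from first_name and last_name, skipping empty parts."""
    parts = [data.get("first_name"), data.get("last_name")]
    # Keep only non-empty strings, so a missing field neither raises
    # KeyError nor leaves a stray space in the result.
    data["fullname"] = " ".join(
        p.strip() for p in parts if isinstance(p, str) and p.strip()
    )
    return data


sample = {"first_name": "Ada", "last_name": None}
print(merge_name(sample)["fullname"])  # prints: Ada
```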
Conditional Processing
# Mark invoices above a threshold amount as high priority
if float(data.get('total_amount', 0)) > 1000:
    data['priority'] = 'high'
else:
    data['priority'] = 'normal'
# Add timestamp
import datetime
data['processed_at'] = datetime.datetime.now().isoformat()
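`float()` raises `ValueError` on strings such as `"1,250.00"` or `"$980"`, and `TypeError` on `None`. Whether extracted amounts arrive in such forms depends on the field type, so the sketch below is written under that assumption. It wraps the same logic in functions that can be tried locally on a sample dict. The names `parse_amount` and `post_process` are hypothetical, and the parser assumes `.` as the decimal separator and `,` as the thousands separator.

```python
import datetime


def parse_amount(value, default=0.0):
    """Convert values such as '1,250.00' or '$980' to float; fall back to default."""
    if isinstance(value, (int, float)):
        return float(value)
    if not isinstance(value, str):
        return default
    # Drop currency symbols, spaces and thousands separators.
    cleaned = "".join(ch for ch in value if ch.isdigit() or ch in ".-")
    try:
        return float(cleaned)
    except ValueError:
        return default


def post_process(data, threshold=1000):
    """Apply the priority rule and timestamp from the example above."""
    amount = parse_amount(data.get("total_amount"))
    data["priority"] = "high" if amount > threshold else "normal"
    data["processed_at"] = datetime.datetime.now().isoformat()
    return data


print(post_process({"total_amount": "1,250.00"})["priority"])  # prints: high
```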
🔗 Webhooks
Receive parsed data in real-time via webhooks. Set up your server to handle incoming webhook events from Airparser.
🚀 Setup Steps
- Copy your webhook endpoint URL
- Go to Integrations → Webhooks in your Airparser account
- Click "Create a webhook" and paste your URL
- Test the webhook with a sample document
Python
import json
from flask import Flask, request
app = Flask(__name__)
@app.route('/webhook', methods=['POST'])
def webhook():
    payload = request.data
    data = json.loads(payload)
    # Process the parsed data
    # data contains all extracted fields
    return {'success': True}
if __name__ == '__main__':
    app.run(port=4242)
Node.js
const express = require('express');
const app = express();
app.post('/webhook', express.raw({type: 'application/json'}), (request, response) => {
  const payload = request.body;
  const data = JSON.parse(payload);
  // Process the parsed data
  // data contains all extracted fields
  // Return 200 to acknowledge receipt
  response.send();
});
app.listen(4242, () => console.log('Running on port 4242'));
PHP
<?php
$payload = @file_get_contents('php://input');
$data = json_decode($payload, true);
// Process the parsed data
// $data contains all extracted fields
// Always return 200 to acknowledge receipt
http_response_code(200);
echo "OK";
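Before registering an endpoint, it can help to exercise a handler locally. The sketch below uses only the Python standard library: it starts a small receiver on a free local port, posts a sample JSON body to it, and returns the status code. The sample payload is hypothetical (the exact shape of Airparser's webhook body is not shown here), and `WebhookHandler`, `send_test_event`, and `run_local_check` are illustrative names.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # parsed payloads, kept in memory for illustration only


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length)
        try:
            data = json.loads(raw)
        except ValueError:
            # Body was not valid JSON: reject it.
            self.send_response(400)
            self.end_headers()
            return
        received.append(data)  # process the parsed data here
        # Return 200 to acknowledge receipt.
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(b'{"success": true}')

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def send_test_event(host, port, payload, path="/webhook"):
    """POST a JSON payload to the receiver and return the HTTP status code."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        body = json.dumps(payload).encode("utf-8")
        conn.request("POST", path, body=body,
                     headers={"Content-Type": "application/json"})
        return conn.getresponse().status
    finally:
        conn.close()


def run_local_check(payload):
    """Start the receiver on a free port, send one event, then shut down."""
    server = HTTPServer(("127.0.0.1", 0), WebhookHandler)  # port 0: OS picks a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return send_test_event("127.0.0.1", server.server_port, payload)
    finally:
        server.shutdown()
        server.server_close()


# Hypothetical sample payload using a field name from the schema example.
status = run_local_check({"invoice_number": "INV-1"})
print(status)
```

This does not replace step 4 of the setup (testing with a sample document); it only checks that the handler parses JSON and acknowledges with 200.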
📚 Additional Resources
📖 Full Documentation
Complete API reference with all endpoints, parameters, and examples.
💬 Need Help?
Get support from our developer team for integration questions.