Developer reference

ParseMyStatement API & MCP

Convert bank statement PDFs into normalized JSON, CSV, and Excel. This page documents the product conversion pipeline, authenticated web APIs, developer API v1, and the Model Context Protocol (MCP) endpoint for agents.

Conversion pipeline

Every PDF follows the same path whether you use the web UI, /api/convert, or the developer API. A Python worker consumes jobs from Redis and returns structured transactions.

  1. Text extraction — pdfplumber reads native PDF text with layout mode for column alignment (HSBC, ICICI, etc.).
  2. OCR (when needed) — Scanned pages are sent to Mistral mistral-ocr-latest with table structure preserved as HTML.
  3. Normalization — DeepInfra (deepseek-ai/DeepSeek-V4-Flash by default) extracts transactions, meta fields, and merchant classifications via the OpenAI-compatible Python SDK into strict JSON.
  4. Post-processing — Balance reconciliation fixes swapped debit/credit; classification rules replace generic bank labels (e.g. "ONLINE TRANSACTION") with merchant names.
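
The reconciliation idea in step 4 can be sketched as follows. This is not the worker's actual code: the function names, the 0.01 tolerance, and the use of Decimal are assumptions made for illustration. A row is accepted if the previous balance minus debit plus credit equals the reported balance; if only the swapped form matches, the two amounts are exchanged.

```python
from decimal import Decimal
from typing import Optional, Tuple

Amount = Optional[Decimal]
TOLERANCE = Decimal("0.01")  # assumed rounding tolerance, not from the docs


def _explains_balance(prev: Decimal, debit: Amount, credit: Amount,
                      expected: Decimal) -> bool:
    """True if prev - debit + credit lands on the reported balance."""
    computed = prev - (debit or Decimal(0)) + (credit or Decimal(0))
    return abs(computed - expected) <= TOLERANCE


def reconcile_row(prev_balance: Decimal, debit: Amount, credit: Amount,
                  balance: Decimal) -> Tuple[Amount, Amount]:
    """Return (debit, credit), swapped when only the swap fits the balance."""
    if _explains_balance(prev_balance, debit, credit, balance):
        return debit, credit
    if _explains_balance(prev_balance, credit, debit, balance):
        return credit, debit  # columns were swapped during extraction
    return debit, credit  # a swap does not explain it; leave the row as-is


# A debit of 462.98 that was extracted into the credit column.
fixed = reconcile_row(Decimal("14411.39"), None, Decimal("462.98"),
                      Decimal("13948.41"))
```

Rows that neither form explains are returned unchanged here, so a later step (or a human) can inspect them.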

Required environment variables: DEEPINFRA_API_KEY, MISTRAL_API_KEY, REDIS_URL, DATABASE_URL. Run the worker with ./python_worker/run-worker.sh alongside pnpm dev.

Web & session APIs

POST /api/convert

Upload a PDF. Authenticated users get stored documents; guests receive an inline JSON response after the worker finishes.

Auth: Session cookie (authenticated) or guest (rate-limited)

Multipart form field: file (PDF only).

curl -X POST https://parsemystatement.com/api/convert \
  -H "Cookie: <session>" \
  -F "file=@statement.pdf"

Stored documents (authenticated)

  • GET /api/documents — list documents (24h retention)
  • GET /api/documents/status — latest processing state from Redis
  • GET /api/documents/:id/json — normalized statement JSON
  • GET /api/documents/:id/csv — CSV export
  • GET /api/documents/:id/xlsx — Excel export
  • GET /api/documents/:id/input — original PDF bytes

Billing

  • POST /api/billing/checkout — start Dodo Payments checkout
  • GET /api/billing/subscription — current plan
  • POST /api/billing/webhooks — payment webhooks (server-to-server)

Developer API v1

Machine-to-machine access with named API keys. Keys are shown once at creation (psm_...); only a hash is stored. Manage keys in the developer console or via the routes below.
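
A minimal sketch of the "shown once, stored as a hash" pattern, for readers building something similar. The hash algorithm is not stated on this page; SHA-256, the token length, and all function names below are assumptions.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

KEY_PREFIX = "psm_"


def hash_key(raw_key: str) -> str:
    """Hex digest that would be persisted instead of the raw key (assumed SHA-256)."""
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()


def generate_key() -> Tuple[str, str]:
    """Return (raw_key, stored_hash); the raw key is shown to the user once."""
    raw_key = KEY_PREFIX + secrets.token_urlsafe(32)
    return raw_key, hash_key(raw_key)


def verify_key(presented: str, stored_hash: str) -> bool:
    """Constant-time comparison of the presented key against the stored hash."""
    return hmac.compare_digest(hash_key(presented), stored_hash)


raw_key, stored_hash = generate_key()
```

Because only the digest is kept, a lost key cannot be recovered and has to be rotated.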

OpenAPI stub: /developers/openapi.json

Authentication

Authorization: Bearer psm_<your-secret-key>

API keys

POST /api/v1/developer/api-keys

Create a named key. Response includes rawKey — save it immediately.

Auth: None (console); protect in production

GET /api/v1/developer/api-keys

List keys with aggregated request stats.

Auth: None

POST /api/v1/developer/api-keys/:id/revoke

Revoke a key.

Auth: Bearer (must match key)

POST /api/v1/developer/api-keys/:id/rotate

Rotate secret; returns new rawKey once.

Auth: Bearer (must match key)

GET /api/v1/developer/api-keys/:id/stats

Per-key latency and success/failure counts.

Auth: Bearer

Upload → poll → result

No webhooks. Clients poll until status is done or failed.

POST /api/v1/developer/uploads

Upload PDF (multipart field file). Returns 202 with upload id.

Auth: Bearer

GET /api/v1/developer/uploads/:id

Check status: queued | processing | done | failed.

Auth: Bearer

GET /api/v1/developer/uploads/:id/result

Fetch outputJson and outputCsv when done.

Auth: Bearer

Example flow

# 1. Upload
curl -X POST https://parsemystatement.com/api/v1/developer/uploads \
  -H "Authorization: Bearer $PSM_API_KEY" \
  -F "file=@statement.pdf"
# => { "upload": { "id": "upload_...", "status": "queued" } }

# 2. Poll status
curl https://parsemystatement.com/api/v1/developer/uploads/upload_abc123 \
  -H "Authorization: Bearer $PSM_API_KEY"

# 3. Fetch result when status is done
curl https://parsemystatement.com/api/v1/developer/uploads/upload_abc123/result \
  -H "Authorization: Bearer $PSM_API_KEY"
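
The polling step can be wrapped in a small loop. The sketch below uses only the Python standard library. It assumes the status endpoint returns the same {"upload": {...}} envelope shown for the upload response, which this page does not confirm; the function names, the 2-second interval, and the 5-minute timeout are also illustrative choices.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "https://parsemystatement.com/api/v1/developer"
TERMINAL_STATES = {"done", "failed"}


def fetch_status(upload_id: str, api_key: str) -> str:
    """GET /uploads/:id and return its status (response envelope is assumed)."""
    request = urllib.request.Request(
        f"{BASE_URL}/uploads/{upload_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    return body["upload"]["status"]


def poll_until_finished(get_status: Callable[[], str],
                        interval: float = 2.0,
                        timeout: float = 300.0,
                        sleep: Callable[[float], None] = time.sleep) -> str:
    """Call get_status until it returns done or failed, or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"upload still {status!r} after {timeout}s")
        sleep(interval)
```

A caller would pass something like lambda: fetch_status("upload_abc123", api_key) as get_status. Keeping the HTTP call separate from the loop makes the loop easy to exercise with a fake status source.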

Statement JSON schema

Normalized output shape (web, developer, and MCP tools):

{
  "bank_name": null,
  "account_holder": "Jane Doe",
  "account_number": "0006",
  "currency": "INR",
  "statement_period": { "from": null, "to": null },
  "opening_balance": null,
  "closing_balance": null,
  "transactions": [
    {
      "date": "2026-05-16",
      "description": "ZOMATO",
      "debit": 462.98,
      "credit": null,
      "balance": 13948.41,
      "reference": "UPI20260516000771719",
      "classification": "ZOMATO"
    }
  ]
}
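
One way a client might load this shape into typed records is sketched below. The dataclass and helper names are hypothetical, and converting amounts to Decimal is a client-side choice rather than something the API prescribes.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Transaction:
    date: str
    description: str
    debit: Optional[Decimal]
    credit: Optional[Decimal]
    balance: Optional[Decimal]
    reference: Optional[str]
    classification: Optional[str]


def _amount(value) -> Optional[Decimal]:
    """Convert a JSON number to Decimal via str to avoid float artifacts."""
    return None if value is None else Decimal(str(value))


def parse_transactions(statement: dict) -> List[Transaction]:
    """Build Transaction records from the normalized statement JSON."""
    return [
        Transaction(
            date=row["date"],
            description=row["description"],
            debit=_amount(row.get("debit")),
            credit=_amount(row.get("credit")),
            balance=_amount(row.get("balance")),
            reference=row.get("reference"),
            classification=row.get("classification"),
        )
        for row in statement.get("transactions", [])
    ]


def totals(transactions: List[Transaction]) -> Tuple[Decimal, Decimal]:
    """Return (total_debit, total_credit), skipping null amounts."""
    debit = sum((t.debit for t in transactions if t.debit is not None), Decimal(0))
    credit = sum((t.credit for t in transactions if t.credit is not None), Decimal(0))
    return debit, credit


SAMPLE = {"transactions": [{
    "date": "2026-05-16", "description": "ZOMATO", "debit": 462.98,
    "credit": None, "balance": 13948.41,
    "reference": "UPI20260516000771719", "classification": "ZOMATO",
}]}
parsed = parse_transactions(SAMPLE)
```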

CSV columns: Date, Description, Credit, Debit, Balance, Reference, Classification. Classifications avoid generic bank labels when a merchant can be inferred from UPI or POS text.
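
For clients that only fetch the JSON, the documented column order can be reproduced locally. This is a sketch, not the service's exporter: it assumes each column maps to the lowercase JSON key of the same name and that null values become empty cells.

```python
import csv
import io

CSV_COLUMNS = ["Date", "Description", "Credit", "Debit",
               "Balance", "Reference", "Classification"]


def statement_to_csv(statement: dict) -> str:
    """Render transactions in the documented column order."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(CSV_COLUMNS)
    for row in statement.get("transactions", []):
        cells = []
        for column in CSV_COLUMNS:
            value = row.get(column.lower())
            cells.append("" if value is None else value)  # null -> empty cell (assumed)
        writer.writerow(cells)
    return buffer.getvalue()


csv_text = statement_to_csv({"transactions": [{
    "date": "2026-05-16", "description": "ZOMATO", "debit": 462.98,
    "credit": None, "balance": 13948.41,
    "reference": "UPI20260516000771719", "classification": "ZOMATO",
}]})
```

Note that Credit precedes Debit in the CSV, while debit precedes credit in the JSON sample, so positional parsing of one format should not be reused for the other.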

Model Context Protocol (MCP)

Agents can connect over streamable HTTP JSON-RPC. Discovery files are published for crawlers and MCP clients.

  • MCP endpoint: https://parsemystatement.com/api/mcp
  • Server card: https://parsemystatement.com/.well-known/mcp/server-card.json
  • AI plugin manifest: https://parsemystatement.com/.well-known/ai-plugin.json
  • Agent catalog: https://parsemystatement.com/.well-known/agents.json

Available tools

  • upload_bank_statement (arguments: api_key, file_name, file_base64)
  • get_upload_status (arguments: api_key, upload_id)
  • get_upload_result (arguments: api_key, upload_id)

Initialize (JSON-RPC)

curl -X POST https://parsemystatement.com/api/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
      "protocolVersion": "2025-06-18",
      "capabilities": {},
      "clientInfo": { "name": "my-agent", "version": "1.0.0" }
    }
  }'

Call a tool

curl -X POST https://parsemystatement.com/api/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
      "name": "upload_bank_statement",
      "arguments": {
        "api_key": "psm_...",
        "file_name": "hsbc.pdf",
        "file_base64": "<base64>"
      }
    }
  }'

After upload_bank_statement, poll with get_upload_status until status is done, then call get_upload_result. The Python worker must be running.
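
The tools/call payloads for this flow can be assembled with a small helper. In the sketch below the helper names are hypothetical, and file_base64 is assumed to be plain standard base64 of the PDF bytes with no data-URI prefix, which this page does not specify.

```python
import base64
import itertools
import json

_request_ids = itertools.count(1)  # JSON-RPC ids only need to be unique per session


def tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 tools/call request body."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def upload_request(api_key: str, file_name: str, pdf_bytes: bytes) -> dict:
    """Request body for upload_bank_statement (encoding is an assumption)."""
    return tool_call("upload_bank_statement", {
        "api_key": api_key,
        "file_name": file_name,
        "file_base64": base64.b64encode(pdf_bytes).decode("ascii"),
    })


def status_request(api_key: str, upload_id: str) -> dict:
    """Request body for get_upload_status."""
    return tool_call("get_upload_status",
                     {"api_key": api_key, "upload_id": upload_id})


upload_body = upload_request("psm_example", "hsbc.pdf", b"%PDF-1.7")
status_body = status_request("psm_example", "upload_abc123")
```

Each dict can be serialized with json.dumps and sent as the POST body to the MCP endpoint, as in the curl examples.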

Errors & limits

  • 401 — missing or invalid developer API key
  • 202 — upload accepted or result not ready yet
  • 400 — invalid PDF or missing form fields
  • 429 — guest or API rate limit exceeded
  • 503 — Redis queue unavailable or worker heartbeat missing
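
A client might translate these codes into actions along the following lines. The retry policy is an assumption: this page does not say whether 429 or 503 responses carry a Retry-After header, so the sketch falls back to exponential backoff with jitter, and the function names are illustrative.

```python
import random
from typing import Callable

RETRYABLE = {429, 503}  # rate limit, queue or worker unavailable


def next_action(status_code: int) -> str:
    """Map an HTTP status to one of: ok, poll, retry, fail."""
    if status_code == 202:
        return "poll"   # accepted, or result not ready yet
    if 200 <= status_code < 300:
        return "ok"
    if status_code in RETRYABLE:
        return "retry"  # transient; try again after a delay
    return "fail"       # 400, 401 and anything else: fix the request first


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  jitter: Callable[[], float] = random.random) -> float:
    """Exponential backoff with full jitter; attempt starts at 0."""
    return min(cap, base * 2 ** attempt) * jitter()
```

Treating 400 and 401 as non-retryable avoids hammering the API with a request that cannot succeed until the file or the key is fixed.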

Plan limits (documents/pages per month) apply to authenticated users. See pricing on the homepage and src/lib/billing-store.ts for caps.