Czech Bank Statement Parser API

REST API for extracting structured transaction data from Czech bank statement PDFs. Upload a PDF — get back JSON with account info, statement metadata, a full list of transactions with semantic tags, and a cryptographic verification of the PDF's embedded digital signature (bank issuer identity, integrity check, tampering detection — see the Signature object).

Supported banks: KB (/0100), MONETA (/0600), Air Bank (/3030), ČSOB (/0300), mBank (/6210), Raiffeisenbank (/5500), Fio (/2010), UniCredit (/2700), Česká spořitelna (/0800), Partners Banka (/6363).

Base URL

API endpoint
https://api.ginibooster.cz

All API endpoints are prefixed with /api/v1; the only exception is the file download route GET /uploads/<file_id>. The full URL for each endpoint is shown in the examples below.

Authentication

Every request to a business endpoint must include a Bearer token in the Authorization header:

header
Authorization: Bearer YOUR_API_KEY

API keys are issued manually by the operator — contact ginibooster@gmail.com to request one. The same header accepts a JWT access token obtained via /api/v1/auth/login for browser sessions.

New accounts created via /auth/register must be approved by an administrator before they can call the business endpoints; until approval the API returns 403. API-key callers bypass this check — issuing a key implies approval.

Error codes

Status  Meaning                                  Body
200     Parsing successful                       {"result": {...}, "file_id": "..."}
400     Bad request: missing or invalid file     {"error": "..."}
401     Missing or invalid Bearer token          {"error": "..."}
403     Account awaits administrator approval    {"error": "..."}
422     File is not a recognised bank statement  {"error": "...", "file_id": "..."}
500     Unexpected server error                  {"error": "...", "file_id": "..."}
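
One way a client might branch on these status codes is sketched below. The names ParseOutcome and classify_response are hypothetical and not part of the API; the function takes a status code and an already decoded JSON body, so it can be tried without a network call. Treating only 500 as retryable is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ParseOutcome:
    ok: bool
    retryable: bool              # client-side hint, not an API field
    message: str
    file_id: Optional[str] = None
    result: Optional[dict] = None


def classify_response(status: int, body: dict[str, Any]) -> ParseOutcome:
    """Map a documented status code and JSON body to a client-side outcome."""
    file_id = body.get("file_id")  # present on 200, 422 and 500 per the table
    if status == 200:
        return ParseOutcome(True, False, "parsed", file_id, body.get("result"))
    message = body.get("error", "unknown error")
    # Assumption: only an unexpected server error is worth retrying.
    return ParseOutcome(False, status == 500, f"{status}: {message}", file_id)
```

Keeping file_id on failed outcomes makes it possible to reference the upload in a later bug report.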

POST /api/v1/bs_parse

Upload a Czech bank statement PDF. Returns structured JSON with account details, statement period, all parsed transactions, and a verification of the PDF's embedded digital signature (see the Signature object).

Request

Two content types are accepted:

Option A — Content-Type: application/json

Field     Required  Type    Description
file      yes       string  PDF file encoded as Base64
filename  no        string  Original filename (must end with .pdf). Default: upload.pdf

Option B — Content-Type: multipart/form-data

Field  Required  Type  Description
file   yes       file  PDF file, max 20 MB
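
The Option A body can be assembled and sanity-checked locally before sending. The sketch below is illustrative only: build_json_payload and MAX_BYTES are hypothetical names, applying the 20 MB limit to the JSON option is an assumption (the limit is documented for multipart uploads), and the case-insensitive extension check and %PDF- header check are client-side conveniences, not documented server rules.

```python
import base64

# Assumption: the 20 MB multipart limit also applies here, with 1 MB = 2**20 bytes.
MAX_BYTES = 20 * 1024 * 1024


def build_json_payload(pdf_bytes: bytes, filename: str = "upload.pdf") -> dict[str, str]:
    """Build the Option A JSON body for /api/v1/bs_parse."""
    if not filename.lower().endswith(".pdf"):
        raise ValueError("filename must end with .pdf")
    if len(pdf_bytes) > MAX_BYTES:
        raise ValueError("file exceeds 20 MB")
    if not pdf_bytes.startswith(b"%PDF-"):
        # Cheap local check so obviously wrong files never leave the client.
        raise ValueError("file does not look like a PDF")
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return {"file": encoded, "filename": filename}
```

The returned dict can be passed as the json= argument of requests.post, as in the Python example further down.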

Example — cURL (Base64 JSON)

shell
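# -w 0 disables line wrapping (GNU coreutils); a portable alternative is: base64 < statement.pdf | tr -d '\n'
B64=$(base64 -w 0 statement.pdf)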
curl -X POST https://api.ginibooster.cz/api/v1/bs_parse \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"file\":\"$B64\",\"filename\":\"statement.pdf\"}"

Example — cURL (multipart)

shell
curl -X POST https://api.ginibooster.cz/api/v1/bs_parse \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@statement.pdf"

Example — Python (Base64 JSON)

python
import base64, requests

with open("statement.pdf", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

resp = requests.post(
    "https://api.ginibooster.cz/api/v1/bs_parse",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"file": b64, "filename": "statement.pdf"},
)
print(resp.json())

Response — 200 OK

json
{
  "file_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890.pdf",
  "result": {
    "bank": "KB",
    "account": {
      "number":   "123456789/0100",
      "iban":     "CZ18 0100 0000 0001 2345 6789",
      "bic":      "KOMBCZPP",
      "owner":    "Jan Novák",
      "currency": "CZK",
      "type":     "Osobní účet"
    },
    "statement": {
      "number":          "12",
      "period_from":    "2026-01-01",
      "period_to":      "2026-01-31",
      "date_issued":    "2026-02-01",
      "opening_balance": 12500.00,
      "closing_balance": 18340.50,
      "total_credited":  35000.00,
      "total_debited":   -29159.50
    },
    "transactions": [ /* see Transaction object below */ ],
    "signature":    { /* see Signature object below */ }
  }
}

POST /api/v1/report_bug

Submit a parsing error report linked to a previously uploaded file. Reports are stored server-side and reviewed manually.

Request

Content-Type: application/json

Field          Required  Type    Description
text           yes       string  Description of the parsing issue
file_id        no        string  UUID from a previous /bs_parse response
original_name  no        string  Original filename of the uploaded PDF
bank           no        string  Detected bank name from the parse result

Example — cURL

shell
curl -X POST https://api.ginibooster.cz/api/v1/report_bug \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "file_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890.pdf",
    "original_name": "statement.pdf",
    "bank": "KB",
    "text": "Transaction on 2026-01-15 is missing from the output."
  }'

Response — 200 OK

json
{ "ok": true }
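
A report body can be derived from an earlier /api/v1/bs_parse response so that file_id and bank do not have to be copied by hand. This is a minimal sketch; build_bug_report is a hypothetical helper, and it only fills the optional fields it can find.

```python
from typing import Optional


def build_bug_report(parse_response: dict, text: str,
                     original_name: Optional[str] = None) -> dict:
    """Assemble a /api/v1/report_bug body from a previous parse response."""
    if not text.strip():
        raise ValueError("text is required")
    report = {"text": text}
    # file_id is also returned with 422 and 500 responses, which have no result.
    if parse_response.get("file_id"):
        report["file_id"] = parse_response["file_id"]
    bank = (parse_response.get("result") or {}).get("bank")
    if bank:
        report["bank"] = bank
    if original_name:
        report["original_name"] = original_name
    return report
```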

GET /uploads/<file_id>

Download or display a previously uploaded PDF by its UUID. Useful for embedding an in-browser PDF preview alongside the parse result.

Path parameter

Parameter  Required  Type    Description
file_id    yes       string  UUID returned by /api/v1/bs_parse, e.g. a1b2c3d4…pdf

Example — cURL

shell
curl https://api.ginibooster.cz/uploads/a1b2c3d4-e5f6-7890-abcd-ef1234567890.pdf \
  -H "Authorization: Bearer YOUR_API_KEY" \
  --output statement.pdf

Response

Returns the raw PDF file with Content-Type: application/pdf.

Response object

Top-level shape returned by /api/v1/bs_parse.

Top-level fields

Field         Type    Notes
bank          string  One of KB, MONETA, AIRBANK, CSOB, MBANK, RAIFFEISENBANK, FIO, UNICREDIT, SPORITELNA, PARTNERS_BANKA
account       object  See account below
statement     object  See statement below
transactions  array   See the Transaction object
signature     object  See the Signature object; always present, reports integrity and signer identity

account

Field     Type         Notes
number    string|null  Local format, e.g. 123456789/0100
iban      string|null
bic       string|null
owner     string|null  Account holder name
currency  string       Always "CZK"
type      string|null  Account type label

statement

Field            Type         Notes
number           string|null  Statement serial number
period_from      string|null  ISO date YYYY-MM-DD
period_to        string|null  ISO date YYYY-MM-DD
date_issued      string|null  ISO date
opening_balance  float|null
closing_balance  float|null
total_credited   float|null   Sum of incoming transactions
total_debited    float|null   Negative value
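
Because total_debited is negative, the four balance fields can be checked against each other with a plain sum. The sketch below assumes the usual identity opening + credited + debited = closing, which the sample response satisfies but which is not stated as a guarantee of the API; totals_reconcile is a hypothetical helper name.

```python
import math
from typing import Optional


def totals_reconcile(statement: dict, tolerance: float = 0.005) -> Optional[bool]:
    """Check opening + credited + debited == closing; None if a field is missing."""
    keys = ("opening_balance", "total_credited", "total_debited", "closing_balance")
    values = [statement.get(key) for key in keys]
    if any(value is None for value in values):
        return None  # any of these fields may be null
    opening, credited, debited, closing = values
    # Amounts are floats, so compare within half a haléř rather than exactly.
    return math.isclose(opening + credited + debited, closing, abs_tol=tolerance)
```

With the sample statement, 12500.00 + 35000.00 - 29159.50 equals the closing balance of 18340.50, so the check passes.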

Transaction object

Each item in transactions[]. Unused fields are null. Debit amounts are negative floats. Dates are ISO YYYY-MM-DD.

Field                 Type         Notes
date_posted           string|null  Accounting date
date_valuta           string|null  Value date
amount                float|null   Negative = debit, positive = credit
currency              string       Always "CZK"
type                  string|null  Transaction category from the bank
description           string|null  Payment message / note
counterparty_account  string|null  E.g. 987654321/0600
counterparty_name     string|null
vs                    string|null  Variable symbol
ks                    string|null  Constant symbol
ss                    string|null  Specific symbol
fee                   float|null   Transaction fee (Air Bank only)
transaction_id        string|null  Bank-assigned transaction ID
balance_after         float|null   Running balance after this transaction
raw_line              string|null  Raw PDF text line (debug only)
tags                  string[]     Semantic tags, see below

Example transaction

json
{
  "date_posted":          "2026-01-05",
  "date_valuta":          "2026-01-05",
  "amount":               35000.00,
  "currency":             "CZK",
  "type":                 "Příchozí úhrada",
  "description":          "Mzda leden 2026",
  "counterparty_account": "987654321/0600",
  "counterparty_name":    "Acme s.r.o.",
  "vs":                   "20260105",
  "ks":                   "0308",
  "ss":                   null,
  "fee":                  null,
  "transaction_id":       "20260105001",
  "balance_after":        47500.00,
  "raw_line":             null,
  "tags":                 ["salary"]
}
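
When balance_after is populated, it can be used to spot rows that the parser may have missed or misread. The following sketch assumes transactions[] is in chronological order and that fees are already reflected in amount; neither is stated in this document, and balance_breaks is a hypothetical helper.

```python
import math


def balance_breaks(transactions: list[dict], opening_balance: float,
                   tol: float = 0.005) -> list[int]:
    """Return indexes where balance_after does not follow from the previous balance."""
    breaks = []
    balance = opening_balance
    for index, tx in enumerate(transactions):
        amount, after = tx.get("amount"), tx.get("balance_after")
        if amount is None:
            continue  # nothing to add for this row
        balance += amount
        if after is not None:
            if not math.isclose(balance, after, abs_tol=tol):
                breaks.append(index)
            balance = after  # resync so a single gap is reported only once
    return breaks
```

For the example transaction above and an opening balance of 12500.00, the expected balance is 47500.00, so no break is reported.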

Transaction tags

Tags in tags[] are assigned automatically after parsing. A transaction can have multiple tags simultaneously.

Available tags: salary, social, gambling.

Tag definitions

Tag       Applied when
salary    KS=138, or SEPA salary reference, or salary keywords in description (mzda, plat…). Excludes ČSSZ payments.
social    Counterparty is ČSSZ, or social benefit keywords in description (nemocenská, mateřská…).
gambling  Gambling brand names in description or counterparty name (Fortuna, Tipsport, Betano, Sazka, …).
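
Tags are assigned by the service, so client code only needs to read them. As an illustration, the sketch below sums amounts per tag; totals_by_tag and the "untagged" bucket are hypothetical, and a transaction carrying several tags is counted once under each of them.

```python
from collections import defaultdict


def totals_by_tag(transactions: list[dict]) -> dict[str, float]:
    """Sum transaction amounts per semantic tag."""
    totals: dict[str, float] = defaultdict(float)
    for tx in transactions:
        amount = tx.get("amount")
        if amount is None:
            continue
        # Transactions without tags go into a client-side "untagged" bucket.
        for tag in tx.get("tags") or ["untagged"]:
            totals[tag] += amount
    return dict(totals)
```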

Signature object

Returned as result.signature. Every parse result includes this object. When the PDF carries no embedded digital signature, present is false, verdict is "unsigned", and the remaining fields are null apart from warnings, which is always an array. The verifier never raises: the worst case is verdict "error" with a human-readable entry in warnings[].

Verdicts

Verdict             Meaning
valid               Intact, signer matches detected bank, cert chain rooted in a trusted CA
valid_untrusted     Intact, signer matches detected bank, but cert chain is not in our local trust store
valid_unknown_bank  Intact, but the bank could not be detected from the PDF content to compare the signer against
signer_mismatch     Intact, but the certificate's organisation / IČO points to a different bank than what the content says (strong fraud signal)
tampered            File bytes were modified after signing, or content was appended past the signed region (coverage=partial)
unsigned            No embedded signature. Expected for Air Bank, ČSOB, mBank, UniCredit, Partners Banka.
error               Verification threw an exception, see warnings[] for details

Fields

Field               Type          Notes
present             boolean       true if a signature is embedded in the PDF
verdict             string        See table above
intact              boolean|null  Cryptographic integrity: true if bytes have not been altered since signing
coverage            string|null   entire_file, entire_revision (legitimate: bank added annotations after signing; Česká spořitelna always), or partial (content appended after signature, a red flag)
bank_match          boolean|null  true when signer's NTRCZ-IČO (or organisation) points to the same bank as detected from PDF content
trusted             boolean|null  true when the cert chain ends at a root in our trust store
trust_error         string|null   Why trusted is false (missing root CA, revoked cert, SHA-1 legacy hash …)
signer              object|null   Subject fields of the signing cert: subject_cn, organization, ntrcz_id, country, issuer_cn, issuer_organization
signed_at           string|null   Declared signing time (ISO 8601 with timezone). Cryptographically attested only when timestamp_attested=true.
timestamp_attested  boolean|null  true if the signature carries a verified RFC 3161 timestamp token from a trusted TSA
subfilter           string|null   PDF signature SubFilter: adbe.pkcs7.detached (most banks), adbe.pkcs7.sha1 (FIO legacy), ETSI.CAdES.detached (PAdES)
hash_algorithm      string|null   Digest algorithm: sha256, sha512, or sha1 (legacy, FIO only)
warnings            string[]      Non-fatal advisories: legacy_sha1_signature, expected_signed_but_unsigned, signer_bank_mismatch, multi_sig_ignored:N, error:<detail>
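
Since signed_at is only a declared time unless timestamp_attested is true, it can help to carry both values together in client code. A small sketch, with signing_time as a hypothetical helper:

```python
from datetime import datetime
from typing import Optional, Tuple


def signing_time(signature: dict) -> Tuple[Optional[datetime], bool]:
    """Return (signing time, attested) from a Signature object."""
    raw = signature.get("signed_at")
    if raw is None:
        return None, False
    # signed_at is ISO 8601 with a UTC offset, which fromisoformat parses.
    moment = datetime.fromisoformat(raw)
    # Only a literal true counts; null and false both mean "declared only".
    return moment, signature.get("timestamp_attested") is True
```

A caller that needs a trustworthy time should ignore the datetime when the second element is False.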

Per‑bank signing behaviour

Based on a 500‑file scan of real statements. For banks that never sign, an unsigned verdict is not a red flag.

Behaviour                Banks
Always signs             KB, Česká spořitelna, Raiffeisenbank, Fio banka
Sometimes signs (~67 %)  MONETA Money Bank
Never signs              Air Bank, ČSOB, mBank, UniCredit, Partners Banka
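
The verdict and the per-bank behaviour can be combined into a simple acceptance policy. The sketch below is one possible policy, not a recommendation from the API: the outcome labels and assess_signature are hypothetical, and the bank codes are the values of the top-level bank field matched by name to the banks listed above.

```python
# Banks that always sign, using the codes from the top-level "bank" field.
ALWAYS_SIGNS = {"KB", "SPORITELNA", "RAIFFEISENBANK", "FIO"}


def assess_signature(bank: str, signature: dict) -> str:
    """Return 'verified', 'unsigned_ok', 'review' or 'reject' (hypothetical labels)."""
    verdict = signature.get("verdict")
    if verdict in ("tampered", "signer_mismatch"):
        return "reject"
    if verdict == "valid":
        return "verified"
    if verdict == "unsigned":
        # Missing signature is only suspicious for banks that always sign.
        # Assumption: MONETA, which signs only sometimes, is treated as fine.
        return "review" if bank in ALWAYS_SIGNS else "unsigned_ok"
    # valid_untrusted, valid_unknown_bank, error and anything unexpected.
    return "review"
```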

Example — valid signature (KB)

json
{
  "present":            true,
  "verdict":            "valid",
  "intact":             true,
  "coverage":           "entire_file",
  "bank_match":         true,
  "trusted":            true,
  "trust_error":        null,
  "signer": {
    "subject_cn":          "Elektronická pečeť Komerční banky, a.s.",
    "organization":        "Komerční banka, a.s.",
    "ntrcz_id":            "NTRCZ-45317054",
    "country":             "CZ",
    "issuer_cn":           "Komerční banka Qualified CA/RSA",
    "issuer_organization": "Komerční banka, a.s."
  },
  "signed_at":          "2026-03-25T21:01:07+00:00",
  "timestamp_attested": false,
  "subfilter":          "adbe.pkcs7.detached",
  "hash_algorithm":     "sha256",
  "warnings":           []
}

Example — unsigned (Air Bank)

json
{
  "present":            false,
  "verdict":            "unsigned",
  "intact":             null,
  "coverage":           null,
  "bank_match":         null,
  "trusted":            null,
  "trust_error":        null,
  "signer":             null,
  "signed_at":          null,
  "timestamp_attested": null,
  "subfilter":          null,
  "hash_algorithm":     null,
  "warnings":           []
}

Example — tampered

json
{
  "present":            true,
  "verdict":            "tampered",
  "intact":             false,
  "coverage":           "partial",
  "bank_match":         true,
  "trusted":            false,
  "trust_error":        "signature broken or data appended after signing",
  "signer": {
    "organization":        "Raiffeisenbank a.s.",
    "ntrcz_id":            "NTRCZ-49240901",
    "country":             "CZ"
  },
  "signed_at":          "2026-03-01T16:48:59+01:00",
  "timestamp_attested": false,
  "subfilter":          "adbe.pkcs7.detached",
  "hash_algorithm":     "sha256",
  "warnings":           []
}

POST /api/v1/id_parse

Upload a photo of a Czech identity card (Občanský průkaz) and receive structured JSON with extracted fields. The OCR pipeline (Tesseract + RapidOCR) runs entirely on our own servers — your images are never forwarded to a third-party AI provider. Handles rotated images, both card generations (pre-2012 and current), front and back sides.

Request

Two content types are accepted:

Option A — Content-Type: application/json

Field     Required  Type    Description
file      yes       string  Image file encoded as Base64
filename  no        string  Original filename (must end with .jpg, .jpeg, .png, or .webp). Default: upload.jpg

Option B — Content-Type: multipart/form-data

Field  Required  Type  Description
file   yes       file  JPG / PNG / WEBP image, max 20 MB

Example — cURL (Base64 JSON)

shell
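# -w 0 disables line wrapping (GNU coreutils); a portable alternative is: base64 < id_card_front.jpg | tr -d '\n'
B64=$(base64 -w 0 id_card_front.jpg)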
curl -X POST https://api.ginibooster.cz/api/v1/id_parse \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"file\":\"$B64\",\"filename\":\"id_card_front.jpg\"}"

Example — cURL (multipart)

shell
curl -X POST https://api.ginibooster.cz/api/v1/id_parse \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@id_card_front.jpg"

Example — Python (Base64 JSON)

python
import base64, requests

with open("id_card_front.jpg", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

resp = requests.post(
    "https://api.ginibooster.cz/api/v1/id_parse",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"file": b64, "filename": "id_card_front.jpg"},
)
print(resp.json())

Response — 200 OK (front side)

json
{
  "file_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890.jpg",
  "result": {
    "document_type": "cz_id",
    "side":          "front",
    "side_confidence":   0.97,
    "extraction_method": "local",
    "front": {
      "document_number": "AB1234567",
      "surname":         "NOVÁK",
      "given_names":     "JAN",
      "date_of_birth":   "12.05.1990",
      "sex":             "M",
      "place_of_birth":  "PRAHA",
      "nationality":     "ČESKÁ REPUBLIKA",
      "date_of_issue":   "01.06.2020",
      "date_of_expiry":  "01.06.2030",
      "personal_number": "900512"
    },
    "back": null,
    "confidence": {
      "document_number": 1.00,
      "surname":         0.98,
      "given_names":     0.97,
      "date_of_birth":   0.94
    }
  }
}

Response — 200 OK (back side)

json
{
  "file_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890.jpg",
  "result": {
    "document_type": "cz_id",
    "side":          "back",
    "side_confidence":   0.95,
    "extraction_method": "local",
    "front": null,
    "back": {
      "address":         "Praha, Nové Město, Vzorová 1234/5",
      "personal_number": "900512/1234",
      "marital_status":  "ženatý",
      "mrz_line1":       "IDCZEAB1234567<<<<<<<<<<<<<<<<<<<<<<<<0",
      "mrz_line2":       "9005123M3005121CZE<<JAN"
    },
    "mrz_extracted": {
      "document_number": "AB1234567",
      "date_of_birth":   "12.05.1990",
      "date_of_expiry":  "12.05.2030",
      "sex":             "M",
      "nationality":     "CZE",
      "valid":           true
    }
  }
}

Response — 200 (not a Czech ID)

If the image is not recognised as a Czech ID, the API still returns 200 OK, with document_type and side set to "unknown" and both front and back set to null.

json
{
  "file_id": "...",
  "result": {
    "document_type": "unknown",
    "side":          "unknown",
    "front":         null,
    "back":          null
  }
}
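
Because an unrecognised image still yields 200 OK, the status code alone does not tell a client whether any fields were extracted. A minimal sketch of the check, with extracted_side as a hypothetical helper:

```python
from typing import Optional


def extracted_side(response: dict) -> Optional[dict]:
    """Return the populated front or back object, or None if nothing was recognised."""
    result = response.get("result") or {}
    if result.get("document_type") != "cz_id":
        return None  # "unknown": the image was not recognised as a Czech ID
    side = result.get("side")
    if side not in ("front", "back"):
        return None
    return result.get(side)
```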

ID document object

Top-level shape returned by /api/v1/id_parse.

Top-level fields

Field              Type         Values
document_type      string       "cz_id" · "unknown"
side               string       "front" · "back" · "unknown"
side_confidence    number       0.0 to 1.0; confidence of the side classification
extraction_method  string       "local": fields extracted on our servers (Tesseract + RapidOCR)
front              object|null  Populated when side = "front"
back               object|null  Populated when side = "back"
confidence         object       Per-field confidence scores (0.0 to 1.0) for fields the OCR returned
mrz_extracted      object|null  Back side only. Fields recovered algorithmically from the MRZ zone with ICAO 9303 checksum verification (document_number, date_of_birth, date_of_expiry, sex, nationality, valid)
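
For readers unfamiliar with the ICAO 9303 checksum mentioned for mrz_extracted, the standard check digit calculation is sketched below. This is the generic algorithm from the specification, written as a client-side illustration; it is not the service's own code, and mrz_check_digit is a hypothetical name.

```python
def mrz_check_digit(field: str) -> int:
    """ICAO 9303 check digit: weights 7, 3, 1 repeating, sum taken modulo 10."""
    weights = (7, 3, 1)
    total = 0
    for index, char in enumerate(field):
        if "0" <= char <= "9":
            value = int(char)
        elif "A" <= char <= "Z":
            value = ord(char) - ord("A") + 10  # A=10 ... Z=35
        elif char == "<":
            value = 0  # filler character
        else:
            raise ValueError(f"invalid MRZ character: {char!r}")
        total += value * weights[index % 3]
    return total % 10
```

A field passes when the digit printed after it in the MRZ equals the computed value, for example mrz_check_digit("740812") gives 2 for the specimen date of birth used in the ICAO documentation.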

front object

Field            Type         Notes
document_number  string|null  9-character document number (ČÍSLO DOKLADU)
surname          string|null  PŘÍJMENÍ
given_names      string|null  JMÉNO
date_of_birth    string|null  DD.MM.YYYY
sex              string|null  "M" or "F"
place_of_birth   string|null  MÍSTO NAROZENÍ
nationality      string|null  STÁTNÍ OBČANSTVÍ
date_of_issue    string|null  DD.MM.YYYY (DATUM VYDÁNÍ)
date_of_expiry   string|null  DD.MM.YYYY (PLATNOST DO)
personal_number  string|null  6-digit partial rodné číslo shown on front
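
Dates on the ID card come back as DD.MM.YYYY strings, unlike the ISO dates used for bank statements, so they need their own parsing. A small sketch follows; parse_cz_date and is_expired are hypothetical helpers, and treating the card as valid through the expiry date itself is an assumption. OCR output that is not a valid date raises ValueError.

```python
from datetime import date, datetime
from typing import Optional


def parse_cz_date(value: Optional[str]) -> Optional[date]:
    """Convert a DD.MM.YYYY string to a date; None stays None."""
    if not value:
        return None
    return datetime.strptime(value.strip(), "%d.%m.%Y").date()


def is_expired(front: dict, today: date) -> Optional[bool]:
    """True if date_of_expiry is before today; None if the field is missing."""
    expiry = parse_cz_date(front.get("date_of_expiry"))
    return None if expiry is None else expiry < today
```

Calling parse_cz_date("12.05.1990").isoformat() yields "1990-05-12", which matches the date format used elsewhere in the API.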

back object

Field            Type         Notes
address          string|null  Full address (street, city, district)
personal_number  string|null  Full rodné číslo from MRZ, e.g. 900512/1234
marital_status   string|null  Rodinný stav if visible
mrz_line1        string|null  First MRZ line (starts with IDCZE)
mrz_line2        string|null  Second MRZ line
mrz_line2string|nullSecond MRZ line