Document Verifier API

The Document Verifier API provides an asynchronous API to upload new documents, acknowledge processed jobs, and obtain verified documents.

Overview

Base URL for all requests: https://verifier.buildsimple.de/documentverifier/api/

Example process sequence

The sequence diagram shows a scenario with two apps calling the API. One app, the producer or importer, is responsible for uploading the documents from various sources, such as scanners, e-mail inboxes, or business apps. The second app, the consumer or exporter, is responsible for downloading the verified documents (the image and the verified properties). This app then acknowledges whether everything has been successfully processed and whether the verified documents can be deleted or archived:

[Figure: sequence diagram of the Document Verifier API]
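The consumer side of this sequence can be sketched in Python as follows. This is only an illustration under assumptions: the function name `run_consumer_once` and its four callable arguments are hypothetical stand-ins for the HTTP calls described later (lease, download, acknowledge) and for your own export step, and choosing ARCHIVED as the acknowledgement status is just one of the two documented options.

```python
def run_consumer_once(lease_jobs, download_document, acknowledge, export):
    """Lease verified documents, export each one, and acknowledge the successes.

    All four arguments are caller-supplied callables (hypothetical names):
    they stand in for the HTTP calls and for the local export step.
    """
    acknowledged = []
    for job in lease_jobs():
        document = download_document(job["documentId"])
        if export(document):
            # Tell the API that the document may now be archived.
            acknowledge(job["jobId"], "ARCHIVED")
            acknowledged.append(job["jobId"])
        # Jobs that could not be exported are not acknowledged here.
    return acknowledged


# Dry run with in-memory stand-ins instead of real HTTP calls.
calls = []
done = run_consumer_once(
    lease_jobs=lambda: [{"jobId": "job-1", "documentId": "doc-1"},
                        {"jobId": "job-2", "documentId": "doc-2"}],
    download_document=lambda document_id: {"id": document_id},
    acknowledge=lambda job_id, status: calls.append((job_id, status)),
    export=lambda document: document["id"] != "doc-2",  # pretend doc-2 fails
)
```

Passing the HTTP operations in as callables keeps the loop testable without network access.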

Job queue

The Document Verifier API uses the concept of a pull queue. With a pull queue, many workers can each pull multiple verified documents of a certain document type at once. This process is called batching.

Pull queues do not dispatch tasks at all. They depend on worker services to "lease" tasks from the queue on their own initiative. Pull queues give you more power and flexibility over when and where tasks are processed, but they also require you to do more process management. When a task is leased, the leasing worker declares a deadline. The worker must complete the task and delete it before the deadline arrives; otherwise the Task Queue service allows another worker to lease it.
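To make the lease semantics concrete, here is a minimal in-memory sketch of a pull queue in Python. It is not the implementation used by the service: the class `PullQueue` and its methods are hypothetical, and the current time is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class PullQueue:
    """Toy in-memory pull queue (hypothetical, for illustration only)."""
    # task id -> time in seconds at which the task may be leased again
    available_at: dict = field(default_factory=dict)

    def add(self, task_id: str) -> None:
        self.available_at[task_id] = 0.0

    def lease(self, now: float, lease_secs: float, max_tasks: int) -> list:
        """Lease up to max_tasks tasks that no other worker currently holds."""
        free = [t for t, ready in self.available_at.items() if ready <= now]
        leased = free[:max_tasks]
        for task_id in leased:
            # The task stays hidden from other workers until the deadline passes.
            self.available_at[task_id] = now + lease_secs
        return leased

    def complete(self, task_id: str) -> None:
        """The worker finished in time, so the task is deleted from the queue."""
        self.available_at.pop(task_id, None)


queue = PullQueue()
for name in ("doc-1", "doc-2", "doc-3"):
    queue.add(name)

batch = queue.lease(now=0, lease_secs=60, max_tasks=2)   # worker A takes two tasks
queue.complete(batch[0])                                  # A finishes only the first
early = queue.lease(now=30, lease_secs=60, max_tasks=5)  # worker B, before A's deadline
late = queue.lease(now=61, lease_secs=60, max_tasks=5)   # worker B, after A's deadline
```

In this run, worker B only sees the task that A never leased while A's lease is active, and it can pick up A's unfinished task once the 60-second deadline has passed.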

Authentication

The Document Verifier uses Basic Authentication over HTTPS to authenticate a request. To get valid credentials, log in to the Document Verifier app. There you can generate username/password credentials for two users:

  • Import@DocumentVerifierAPI – the producer, which uploads documents
  • Export@DocumentVerifierAPI – the consumer, which downloads the verified results

On the "Document entry" and "Data export" admin pages you will also find the value of the X-Tenant-Id HTTP header, which you must set.
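As a rough sketch, the two headers can be assembled in Python like this. The helper name `build_headers`, the password and the tenant id are placeholders chosen for illustration; only the header names and the Basic Authentication scheme come from the description above.

```python
import base64


def build_headers(username: str, password: str, tenant_id: str) -> dict:
    """Return the Basic Authentication header plus the tenant header."""
    # Basic Authentication is the base64 encoding of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {
        "Authorization": f"Basic {token}",
        "X-Tenant-Id": tenant_id,
    }


# Placeholder credentials and tenant id, for illustration only.
headers = build_headers("Import@DocumentVerifierAPI", "secret", "my-tenant")
```

Because Basic Authentication only encodes the credentials and does not encrypt them, these headers should only ever be sent over HTTPS.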

HTTP response status codes

Any request can return the following response status codes:

  • 401: Authorization failed. Operation not allowed
  • 403: Authorization failed due to invalid credentials

All other responses can be found in the description of the individual requests.

API Requests

POST documents

https://verifier.buildsimple.de/documentverifier/api/documents

Customers can upload documents from various sources to this endpoint. A workflow is started for each document; it sends the document to the Entity Extraction API and then places it in the inbox of an employee of the specified client.

The return value contains a documentId that you can use to check the status of the document or download the verified document.

HATEOAS – The response contains a link to check the status of the document:

Link: <https://verifier.buildsimple.de/documentverifier/api/documents/{documentId}/status>; rel="document"; type="GET"
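If you want to follow this link programmatically, a small parser is enough. The following Python sketch assumes the header value uses plain ASCII double quotes around the parameter values; `parse_link_header` is a hypothetical helper name, and the document id in the sample is the one from the example response.

```python
import re


def parse_link_header(value: str) -> list:
    """Split a Link header value into dicts with the URL and its parameters."""
    links = []
    # Each link looks like: <url>; key="value"; key="value"
    for match in re.finditer(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)', value):
        url, raw_params = match.groups()
        params = dict(re.findall(r'([\w-]+)="([^"]*)"', raw_params))
        links.append({"url": url, **params})
    return links


header = (
    "<https://verifier.buildsimple.de/documentverifier/api/documents/"
    'a46a5432-bd17-498a-9650-68fdefa808dc/status>; rel="document"; type="GET"'
)
links = parse_link_header(header)
```

Each entry then carries the target under `url` together with the `rel` and `type` parameters.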

HTTP response status codes

  • 200 OK – Document uploaded successfully
  • 400 Bad Request
    • Missing or invalid input parameter
    • Unsupported file format

Headers

X-Tenant-Id         {{tenantId}}

Body

formdata

file
The document file or stream to process.
mandatory – a document file stream.

entityData
The entity data file or stream.
optional – If you supply entity data, no external entity extraction is called. The entity data must correspond to the schema of the given API (form param 'entityDataApi').

entityDataVendor
The vendor name of the entity extraction API.
optional – default is BUILDSIMPLE, valid values: BUILDSIMPLE|NONE

entityDataApi
The external entity extraction API name.
optional – default is ENTITYEXTRACTIONAPI, valid values: ENTITYEXTRACTIONAPI|DOCUMENTVERIFIERAPI|NONE

priority
The priority of the corresponding inbox task to be started for this document.
optional – default is NORMAL, valid values: HIGH|NORMAL

Example Request documents

curl --location --request POST "https://verifier.buildsimple.de/documentverifier/api/documents" \
  --header "X-Tenant-Id: {tenantId}" \
  --form "file=@" \
  --form "entityData=@" \
  --form "entityDataVendor=Buildsimple" \
  --form "entityDataApi=EntityExtractionAPI"

Example Response 200 OK

{
    "documentId": "a46a5432-bd17-498a-9650-68fdefa808dc",
    "processId": "DV190000000821",
    "processingTime": 2370
}
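For callers that do not use curl, the upload can be sketched with the Python standard library as below. The function `build_upload_request` is a hypothetical helper that only builds the request and does not send it. The content type of the file part is an assumption, and the caller is expected to pass the authentication and X-Tenant-Id headers.

```python
import urllib.request
import uuid

BASE_URL = "https://verifier.buildsimple.de/documentverifier/api"


def build_upload_request(file_name, file_bytes, headers, priority="NORMAL"):
    """Build (but do not send) a multipart/form-data POST for the documents endpoint."""
    boundary = uuid.uuid4().hex
    priority_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="priority"\r\n'
        "\r\n"
        f"{priority}\r\n"
    ).encode("utf-8")
    file_head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        # Assumption: a generic content type is acceptable for the file part.
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("utf-8")
    closing = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = priority_part + file_head + file_bytes + closing
    request = urllib.request.Request(
        f"{BASE_URL}/documents", data=body, headers=dict(headers), method="POST"
    )
    request.add_header("Content-Type", f"multipart/form-data; boundary={boundary}")
    return request


# Placeholder file content and tenant id, for illustration only.
request = build_upload_request("invoice.pdf", b"%PDF-1.4 ...", {"X-Tenant-Id": "my-tenant"})
```

Sending the request, for example with `urllib.request.urlopen(request)`, and reading `documentId` from the JSON response is left to the caller.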

GET documents/{documentId}/status

https://verifier.buildsimple.de/documentverifier/api/documents/{documentId}/status

Get the status of a document. The status changes while the document is processed.

HTTP response status codes

  • 200 OK

Headers

X-Tenant-Id         {{tenantId}}

Example Request documents/{documentId}/status

curl --location --request GET "https://verifier.buildsimple.de/documentverifier/api/documents/{documentId}/status" \
  --header "X-Tenant-Id: {tenantId}"

Example Response 200 OK

{
    "statusName": "VERIFIED"
}
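Because the status changes over time, a client typically polls this endpoint. The following Python sketch shows one way to do that. The helper `wait_for_status`, the interval and the attempt limit are assumptions, and `fetch_status` is a caller-supplied function that performs the GET request and returns the parsed JSON body.

```python
import time


def wait_for_status(fetch_status, target="VERIFIED", interval_secs=5.0,
                    max_attempts=10, sleep=time.sleep):
    """Poll fetch_status() until it reports `target`; return the last status seen."""
    status = None
    for attempt in range(max_attempts):
        status = fetch_status()["statusName"]
        if status == target:
            break
        if attempt < max_attempts - 1:
            sleep(interval_secs)  # wait before asking again
    return status


# Stand-in for the HTTP call: canned responses instead of calling the API.
responses = iter([
    {"statusName": "STARTED"},
    {"statusName": "COMPLETED"},
    {"statusName": "VERIFIED"},
])
final = wait_for_status(lambda: next(responses), sleep=lambda secs: None)
```

Since verification is done by a person, reaching VERIFIED can take much longer than the automated steps, so a generous interval is usually appropriate.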

POST jobs?action=LEASE

https://verifier.buildsimple.de/documentverifier/api/jobs?action=LEASE&numberOfDocuments=5&leaseSecs=60&docType=INVOICE

Verified documents are queued as jobs and have the status VERIFIED. This endpoint lets clients lease these verified documents for a specified time in order to process them on site. The result is a list of jobs. Each job represents a processed document and has two properties of interest: documentId, which can be used to download the verified document properties or image, and jobId, which is used to acknowledge a processed job.

HATEOAS – The response contains links to the leased documents:

Link: <https://verifier.buildsimple.de/documentverifier/api/documents/{documentId}>; rel="document"; type="GET"

HTTP response status codes

  • 202 Accepted
  • 400 Bad Request – Missing or invalid input parameter
  • 405 Method Not Allowed

Headers

X-Tenant-Id          {{tenantId}}

Params

action         LEASE
The action that will be applied to the jobs.
optional – default is LEASE, valid values: LEASE

numberOfDocuments         5
The number of documents to lease (up to a maximum of 50).
optional – default 5 documents

leaseSecs         60
The duration of the lease in seconds (up to a maximum of 300 seconds, i.e. 5 minutes). The lease duration needs to be long enough for the slowest task to finish before the lease period expires.
optional – default 60 seconds

docType         INVOICE
The document type to lease. If you do not specify a document type, the system searches for all jobs.
optional – valid values for the standard model: INVOICE|CONTRACT

Example Request jobs?action=LEASE

curl --location --request POST "https://verifier.buildsimple.de/documentverifier/api/jobs?action=LEASE&numberOfDocuments=5&leaseSecs=60&docType=INVOICE" \
 --header "X-Tenant-Id: {tenantId}"

Example Response 202 Accepted

[
    {
        "documentId": "9f630043-a008-4e03-a97f-570da5940c89",
        "jobId": "7de6a16d-f844-4ff7-871c-eaa2a7914aac",
        "jobType": "io-extract_INVOICE",
        "jobState": "WAIT",
        "createdBy": "ec2amaz-b0io0l7.ad02.isr|qtp1145954315-1018|1#ISRRoleSystem",
        "creationDate": 1551978726174,
        "executionDate": 1551978786158
    }
]
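A client can check the documented limits before calling the endpoint. The sketch below builds the lease URL in Python. `build_lease_url` is a hypothetical helper, and the lower bound of 1 for both numeric parameters is an assumption, since only the maximums (50 documents, 300 seconds) are documented.

```python
from urllib.parse import urlencode

BASE_URL = "https://verifier.buildsimple.de/documentverifier/api"


def build_lease_url(number_of_documents=5, lease_secs=60, doc_type=None):
    """Build the URL for POST jobs?action=LEASE, enforcing the documented maximums."""
    if not 1 <= number_of_documents <= 50:
        raise ValueError("numberOfDocuments must be between 1 and 50")
    if not 1 <= lease_secs <= 300:
        raise ValueError("leaseSecs must be between 1 and 300")
    params = {
        "action": "LEASE",
        "numberOfDocuments": number_of_documents,
        "leaseSecs": lease_secs,
    }
    if doc_type is not None:
        # Omitting docType makes the system search for all jobs.
        params["docType"] = doc_type
    return f"{BASE_URL}/jobs?{urlencode(params)}"


url = build_lease_url(number_of_documents=5, lease_secs=60, doc_type="INVOICE")
```

With these arguments the helper reproduces the URL shown at the top of this request's description.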

GET jobs

https://verifier.buildsimple.de/documentverifier/api/jobs?state=WAIT

Get a list of all verified document jobs.

HTTP response status codes

  • 200 OK
  • 400 Bad Request

Headers

X-Tenant-Id          {{tenantId}}

Params

state         WAIT
The job state to filter by.
optional – valid values are: LEASED|WAIT|ERROR|ALL

Example Request jobs

curl --location --request GET "https://verifier.buildsimple.de/documentverifier/api/jobs?state=WAIT" \
  --header "X-Tenant-Id: {tenantId}"

Example Response 200 OK

[
    {
        "documentId": "e860632b-2d68-45b0-8e39-a51ffcebb0a0",
        "jobId": "7b65ea76-964e-4d4a-b68b-9a1682eb78d5",
        "jobType": "io-extract_INVOICE",
        "jobState": "WAIT",
        "createdBy": "ec2amaz-b0io0l7.ad02.isr|Camunda [Camunda] AutotaskExecutor|1#ISRRoleSystem",
        "creationDate": 1556543685538,
        "executionDate": 1556543685538
    }
]
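The job list can be reduced to the fields a consumer usually needs, as in the Python sketch below. It assumes that creationDate is a Unix timestamp in milliseconds, which matches the magnitude of the sample values but is not stated explicitly. `summarize_jobs` is a hypothetical helper, and the payload is a shortened copy of the example response.

```python
import json
from datetime import datetime, timezone

payload = """
[
    {
        "documentId": "e860632b-2d68-45b0-8e39-a51ffcebb0a0",
        "jobId": "7b65ea76-964e-4d4a-b68b-9a1682eb78d5",
        "jobType": "io-extract_INVOICE",
        "jobState": "WAIT",
        "creationDate": 1556543685538,
        "executionDate": 1556543685538
    }
]
"""


def summarize_jobs(raw_json: str) -> list:
    """Return the ids, state and creation time of each job in the response."""
    return [
        {
            "jobId": job["jobId"],
            "documentId": job["documentId"],
            "jobState": job["jobState"],
            # Assumption: timestamps are epoch milliseconds.
            "created": datetime.fromtimestamp(job["creationDate"] / 1000, tz=timezone.utc),
        }
        for job in json.loads(raw_json)
    ]


jobs = summarize_jobs(payload)
```

Under the millisecond assumption, the sample job was created on 29 April 2019 (UTC).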

GET documents/{documentId}

https://verifier.buildsimple.de/documentverifier/api/documents/{documentId}?alt=media

Get a verified document. Call POST documentverifier/api/jobs?action=LEASE to get leased jobs that contain the corresponding document IDs in the property documentId.

The result contains the following elements:

status – The status of the document:

  • STARTED – the verifier workflow process was started
  • ARCHIVED – the provided document was archived to the repository
  • UPLOADED – the document was uploaded to the Entity Extraction API
  • EXTRACTED – the data determined by the Entity Extraction API was transferred
  • ENRICHED_HOCR – the document was enriched with hOCR data
  • CLASSIFIED – the result of the Classification API was saved
  • NORMALIZED – the field values were normalized
  • SCORED – the score of each field was calculated
  • COMPLETED – the document enrichment process has finished and the document was assigned to a verifier user
  • VERIFIED – the document was verified by a verifier user
  • TRAINED – the system has sent the training data
  • MARKED_FOR_DELETION – the document was marked for deletion by a verifier user
  • ARCHIVED – the document was acknowledged as ARCHIVED by a consumer app
  • DELETED – the document was acknowledged as DELETED by a consumer app

response – The response of the verifier user:

  • OK – the user could verify the document without any problems
  • DELETE – the user has marked the document for deletion
  • BAD_IMAGE – the user could not verify the document due to poor image quality

sourceDocument – data of the source document

fields – field names and values of verified entities

groups – field names and values of the verified group entities

auditTrail – The audit trail of the enrichment process:

  • dataChanges – logs of changes at field level
  • events – events in the following categories
    • system – system events
    • comment – user comments
    • milestone – system and user milestones

HTTP response status codes

  • 200 OK
  • 404 Not Found

Headers

X-Tenant-Id      {tenantId}

Accept              application/json

  • application/json (default)
  • application/octet-stream

Params

alt         media
Alternative to setting the Accept header to application/octet-stream.
optional – valid values: media

Example Request documents/{documentId}

curl --location --request GET "https://verifier.buildsimple.de:443/documentverifier/api/documents/59a59a31-c63b-4f90-869c-ee839b2fa589" \
  --header "Accept: application/json" \
  --header "X-Tenant-Id: {tenantId}"

Example Response 200 OK

{
    "id": "22f32468-6d02-4ffd-977a-554431c2638b",
    "processId": "DV190000001085",
    "classification": "INVOICE",
    "language": "DE",
    "status": "ARCHIVED",
    "response": "OK",
    "sourceDocument": {
        "externalId": "/30ccb55f-1fa6-4bea-baac-661c8215cd67",
        "externalClient": "Apache Jackrabbit Oak",
        "externalType": "ContentEntry",
        "contentName": "Dell.pdf",
        "mediaType": "application/pdf",
        "creationDate": 1556609038377,
        "lastModificationDate": 1556609038377,
        "size": 127293
    },
    "fields": {
        "invoice_deliveryDate": {
            "value": 1246485600000,
            "valueClass": "Date"
        },
        "invoice_deliveryNumber": {
            "value": "A0571564",
            "valueClass": "String"
        },
        "invoice_dueDate": {
            "value": "15 Tage",
            "valueClass": "String"
        },
        "invoice_invoiceCurrency": {
            "value": "EUR",
            "valueClass": "String"
        },
        "invoice_invoiceDate": {
            "value": 1279144800000,
            "valueClass": "Date"
        },
        "invoice_invoiceGrossAmount": {
            "value": 275.97,
            "valueClass": "Double"
        },
        "invoice_invoiceNumber": {
            "value": "5402056711",
            "valueClass": "String"
        },
        "invoice_orderNumber": {
            "value": "646073443",
            "valueClass": "String"
        },
        "recipient_accountNumber": {
            "value": "DE2116202",
            "valueClass": "String"
        },
        "recipient_city": {
            "value": "Braunschweig",
            "valueClass": "String"
        },
        "recipient_company": {
            "value": "ISR Information Products AG",
            "valueClass": "String"
        },
        "recipient_street": {
            "value": "Lange Str. 61",
            "valueClass": "String"
        },
        "recipient_zip": {
            "value": "38100",
            "valueClass": "String"
        },
        "vendor_bankName": {
            "value": "Citibank",
            "valueClass": "String"
        },
        "vendor_bic": {},
        "vendor_city": {
            "value": "Frankfurt",
            "valueClass": "String"
        },
        "vendor_iban": {
            "value": "DE11502109000209865084",
            "valueClass": "String"
        },
        "vendor_name": {
            "value": "DELL GmbH",
            "valueClass": "String"
        },
        "vendor_street": {
            "value": "Unterschweinstiege 10",
            "valueClass": "String"
        },
        "vendor_taxIdNumber": {},
        "vendor_vatNumber": {
            "value": "646073443",
            "valueClass": "String"
        },
        "vendor_zip": {
            "value": "60549",
            "valueClass": "String"
        }
    },
    "groups": {
        "items": [
            {
                "id": "3ecab64e-b5bd-4aa3-836c-bae9d6e4e780",
                "fields": {
                    "item_group_description": {
                        "value": "Logitech Cordless Desktop EX 110",
                        "valueClass": "String"
                    },
                    "item_group_materialNumber": {
                        "valueClass": "String"
                    },
                    "item_group_quantity": {
                        "value": 7,
                        "valueClass": "Double"
                    },
                    "item_group_singleNetAmount": {
                        "value": 33.13,
                        "valueClass": "Double"
                    },
                    "item_group_taxRate": {},
                    "item_group_totalNetAmount": {
                        "value": 231.91,
                        "valueClass": "Double"
                    }
                }
            },
            {
                "id": "905b37be-a816-44f6-a772-6b45be211517",
                "fields": {
                    "item_group_description": {
                        "valueClass": "String"
                    },
                    "item_group_materialNumber": {
                        "valueClass": "String"
                    },
                    "item_group_quantity": {
                        "value": 1,
                        "valueClass": "Double"
                    },
                    "item_group_singleNetAmount": {
                        "value": 0,
                        "valueClass": "Double"
                    },
                    "item_group_taxRate": {},
                    "item_group_totalNetAmount": {}
                }
            }
        ],
        "taxRates": [
            {
                "id": "1dd5378f-a26a-47e2-83dd-89556a7ff2e2",
                "fields": {
                    "invoice_taxRateGroup_netAmount": {
                        "value": 231.91,
                        "valueClass": "Double"
                    },
                    "invoice_taxRateGroup_taxAmount": {
                        "value": 44.06,
                        "valueClass": "Double"
                    },
                    "invoice_taxRateGroup_taxRate": {
                        "value": 19,
                        "valueClass": "Double"
                    }
                }
            }
        ]
    },
    "auditTrail": {
        "dataChanges": [
            {
                "eventTimestamp": 1556544118265,
                "user": {
                    "id": "buildsimple",
                    "email": "info@buildsimple.de",
                    "name": "Lastname, Firstname"
                },
                "fields": [
                    {
                        "fieldType": "FIELD",
                        "fieldName": "recipient_street",
                        "newValue": "Lange Str. 61",
                        "valueClass": "String",
                        "valueEvent": "UPDATE"
                    }
                ]
            }
        ],
        "events": [
            {
                "id": "a0987361-0454-462b-a03e-c6fabbd9230d",
                "eventTimestamp": 1555590568987,
                "user": {
                    "id": "Service",
                    "name": "Product, Service"
                },
                "category": "system",
                "message": "Enrichment - Documents and results of extraction merged"
            },
            {
                "id": "c4456d75-fefa-49eb-a580-12639db08781",
                "eventTimestamp": 1556544192338,
                "user": {
                    "id": "buildsimple",
                    "email": "info@buildsimple.de",
                    "name": "Lastname, Firstname"
                },
                "category": "comment",
                "message": "Ok!"
            },
            {
                "id": "b74cef4a-7f89-4376-9f1f-9573587784c2",
                "eventTimestamp": 1556544196249,
                "user": {
                    "id": "buildsimple",
                    "email": "info@buildsimple.de",
                    "name": "Lastname, Firstname"
                },
                "category": "milestone"
            }
        ]
    }
}
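Downstream systems usually want plain values rather than the value/valueClass wrappers. The Python sketch below flattens the fields object. `field_values` is a hypothetical helper, and it assumes that values of class Date are Unix timestamps in milliseconds, which the sample data suggests but the description does not state.

```python
from datetime import datetime, timezone


def field_values(fields: dict) -> dict:
    """Flatten {"name": {"value": ..., "valueClass": ...}} into {"name": value}."""
    result = {}
    for name, entry in fields.items():
        value = entry.get("value")  # empty entries such as {} have no value
        if value is not None and entry.get("valueClass") == "Date":
            # Assumption: Date values are epoch milliseconds.
            value = datetime.fromtimestamp(value / 1000, tz=timezone.utc)
        result[name] = value
    return result


# A few entries copied from the example response above.
sample = {
    "invoice_invoiceDate": {"value": 1279144800000, "valueClass": "Date"},
    "invoice_invoiceGrossAmount": {"value": 275.97, "valueClass": "Double"},
    "vendor_bic": {},
}
values = field_values(sample)
```

Under that assumption the sample invoice date is 2010-07-14 22:00 UTC, which is midnight on 15 July 2010 in Central European Summer Time. Converting such values to a calendar date therefore probably requires the local time zone; this is an inference from the sample data, not documented behavior.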

PATCH jobs/{jobId}?status=DELETED

https://verifier.buildsimple.de/documentverifier/api/jobs/{jobId}?status=DELETED

Acknowledge a leased job and set the status of the document to DELETED. To lease a job, call POST jobs?action=LEASE. Each job has a property jobId that you must provide as a path parameter to acknowledge the job.

Headers

X-Tenant-Id         {{tenantId}}

Params

status         DELETED
The status to set on the document to which the job refers.
mandatory – valid values: DELETED|ARCHIVED

Example Request jobs/{jobId}?status=DELETED

curl --location --request PATCH "https://verifier.buildsimple.de/documentverifier/api/jobs/{jobId}?status=DELETED" \
  --header "X-Tenant-Id: {tenantId}"
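The same acknowledgement can be prepared with the Python standard library, as sketched below. `build_ack_request` is a hypothetical helper that only constructs the request. The job id is taken from the lease example, the tenant id is a placeholder, and the caller is expected to add the authentication header.

```python
import urllib.request

BASE_URL = "https://verifier.buildsimple.de/documentverifier/api"


def build_ack_request(job_id: str, status: str, headers: dict) -> urllib.request.Request:
    """Build (but do not send) the PATCH request that acknowledges a leased job."""
    if status not in ("DELETED", "ARCHIVED"):
        raise ValueError("status must be DELETED or ARCHIVED")
    url = f"{BASE_URL}/jobs/{job_id}?status={status}"
    return urllib.request.Request(url, headers=dict(headers), method="PATCH")


request = build_ack_request(
    "7de6a16d-f844-4ff7-871c-eaa2a7914aac", "ARCHIVED", {"X-Tenant-Id": "my-tenant"}
)
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```

Validating the status locally avoids a round trip for values the endpoint does not accept.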