Document Verifier API

The Document Verifier API provides an asynchronous interface for uploading new documents, acknowledging processed jobs, and obtaining verified documents.

START

Overview

Base URL for all requests: https://verifier.buildsimple.de/documentverifier/api/

Example Process Sequence


The sequence diagram shows a scenario with two apps calling the API. One app, the producer or importer, is responsible for uploading the documents from various sources, such as scanners, e-mail inboxes, or business apps. The second app, the consumer or exporter, is responsible for downloading the verified documents (the image and the verified properties). This app then acknowledges whether everything has been successfully processed and whether the verified documents can be deleted or archived:

[Sequence diagram: Document Verifier API]

Job queue


The Document Verifier API uses the concept of a pull queue. With pull queues, many workers can each pull multiple verified documents of a given document type at once. This process is called batching.

Pull queues do not dispatch tasks at all. They depend on worker services to "lease" tasks from the queue on their own initiative. Pull queues give you more control and flexibility over when and where tasks are processed, but they also require you to do more process management. When a task is leased, the leasing worker declares a deadline. The worker must complete the task and delete it before the deadline arrives; otherwise the Task Queue service allows another worker to lease it.
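
The lease semantics described above can be illustrated with a small in-memory model. This is only a sketch of the general pull-queue idea and not the service's implementation; the names PullQueue, Task, lease and delete are hypothetical, and the current time is passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Task:
    task_id: str
    lease_deadline: Optional[float] = None  # None means "not leased"


@dataclass
class PullQueue:
    """Toy model of a pull queue: workers lease tasks, then delete them."""
    tasks: Dict[str, Task] = field(default_factory=dict)

    def add(self, task_id: str) -> None:
        self.tasks[task_id] = Task(task_id)

    def lease(self, count: int, lease_secs: float, now: float) -> List[str]:
        """Lease up to `count` tasks that are free or whose lease has expired."""
        leased: List[str] = []
        for task in self.tasks.values():
            if len(leased) == count:
                break
            if task.lease_deadline is None or task.lease_deadline <= now:
                task.lease_deadline = now + lease_secs
                leased.append(task.task_id)
        return leased

    def delete(self, task_id: str) -> None:
        """Called by the worker once the task has been completed."""
        self.tasks.pop(task_id, None)


queue = PullQueue()
for name in ("a", "b", "c"):
    queue.add(name)

first = queue.lease(count=2, lease_secs=60, now=0)    # worker 1 leases a and b
queue.delete("a")                                     # worker 1 completes only a
second = queue.lease(count=5, lease_secs=60, now=30)  # b is still leased, c is free
third = queue.lease(count=5, lease_secs=60, now=61)   # the lease on b has expired
```

In this model a task that was leased but never deleted becomes available again once its deadline has passed, which is why the lease duration should cover the slowest expected task.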

Authentication


The Document Verifier uses Basic Authentication over HTTPS to authenticate requests. To obtain valid credentials, log in to the Document Verifier app. There you can generate username/password credentials for two users:

  • Import@DocumentVerifierAPI – the producer, which uploads documents

  • Export@DocumentVerifierAPI – the consumer, which downloads the verified results
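
As an illustration of what Basic Authentication means on the wire, the following sketch builds the Authorization header from a username and password. The helper name basic_auth_header and the password "secret" are placeholders; the real password is the one generated in the app.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Return the Authorization header for HTTP Basic Authentication."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder password; the user name is the producer account listed above.
headers = basic_auth_header("Import@DocumentVerifierAPI", "secret")
```

Because the credentials are only Base64-encoded and not encrypted, they should only ever be sent over HTTPS.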

PUBLIC


These requests are accessible to all users with valid credentials.


POST documents

{{scheme}}://{{server}}:{{port}}/documentverifier/api/documents

Customers can upload documents from various sources to this endpoint. A workflow is started for each document; it sends the document to the Entity Extraction API and then places it in the inbox of an employee of the specified client.

The return value contains a job id that you can use to check the status of the document or download the verified document.

 

Body formdata

file          The document file to process.

Headers

X-Tenant-Id          {{tenantId}}

Example Request documents

# curl generates the multipart Content-Type header (including the boundary) itself when --form is used.
# Replace /path/to/document.pdf with the file to upload.
curl --request POST \
  --url '{{scheme}}://{{server}}:{{port}}/documentverifier/api/documents' \
  --header 'X-Tenant-Id: {{tenantId}}' \
  --form 'file=@/path/to/document.pdf'

 

Example Response 200 OK

{
    "processId": "DV180000001909",
    "dataId": "c0351e4f-a14d-4418-8a51-ffe1705d5a2d",
    "processingTime": 734
}
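
For clients that do not use curl, the upload can be sketched with the Python standard library. The example below only builds the multipart request and does not send it. The function name build_upload_request, the file name, the tenant value and the credentials string are placeholders, and the part content type application/octet-stream is an assumption; the form field name file and the X-Tenant-Id header come from the description above.

```python
import urllib.request
import uuid

BASE_URL = "https://verifier.buildsimple.de/documentverifier/api/"


def build_upload_request(filename, content, tenant_id, auth_header):
    """Build (but do not send) a multipart/form-data POST to documents."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        content,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        BASE_URL + "documents",
        data=body,
        method="POST",
        headers={
            "X-Tenant-Id": tenant_id,
            "Authorization": auth_header,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Placeholder file content, tenant and credentials.
upload = build_upload_request("invoice.pdf", b"%PDF-1.4 ...", "my-tenant", "Basic <credentials>")

# Sending requires real credentials, so it is left commented out:
# with urllib.request.urlopen(upload) as response:
#     print(response.read().decode("utf-8"))
```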

GET documents/{id}/status

{{scheme}}://{{server}}:{{port}}/documentverifier/api/documents/{{id}}/status

Get the status of a document. The status changes while the document is processed.

 

Headers

X-Tenant-Id         {{tenantId}}

Example Request documents/{id}/status

curl --request GET \
  --url '{{scheme}}://{{server}}:{{port}}/documentverifier/api/documents/{{id}}/status' \
  --header 'X-Tenant-Id: {{tenantId}}'

 

Example Response 200 OK

{
    "statusName": "VERIFIED"
}
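
Since the status changes while the document is processed, a producer will typically poll this endpoint. The following is a minimal polling sketch, assuming the caller supplies a fetch_status function that performs the GET request and returns the decoded JSON. The function name, the interval, the attempt limit and the intermediate status "PROCESSING" are illustrative assumptions; only the status VERIFIED appears in this documentation.

```python
import time
from typing import Callable


def wait_until_verified(
    fetch_status: Callable[[], dict],
    interval_secs: float = 5.0,
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until statusName is VERIFIED; return False if attempts run out."""
    for attempt in range(max_attempts):
        if fetch_status().get("statusName") == "VERIFIED":
            return True
        if attempt < max_attempts - 1:
            sleep(interval_secs)
    return False


# Simulated responses; "PROCESSING" is a made-up intermediate status name.
responses = iter([{"statusName": "PROCESSING"}, {"statusName": "VERIFIED"}])
verified = wait_until_verified(lambda: next(responses), sleep=lambda secs: None)
```

Injecting the fetch and sleep functions keeps the loop independent of any particular HTTP library and makes it easy to try out without a server.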

POST jobs?action=LEASE

{{scheme}}://{{server}}:{{port}}/documentverifier/api/jobs?action=LEASE&numberOfDocuments=5&leaseSecs=60&docType=INVOICE

Verified documents are queued as jobs and have the status VERIFIED. This endpoint allows clients to lease these verified documents for a specified time in order to process them on site. The result is a list of jobs. Each job represents a processed document and has a property id, which can be used to download the verified document properties or image.

HATEOAS – The response contains links to the leased documents:

Link: <https://verifier.buildsimple.de/documentverifier/api/documentverifier/documents/{{id}}>; rel="document"; type="GET"
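
A client that wants to follow these links needs to split the Link header into its URL and parameters. The sketch below is a simplified parser that handles only quoted parameters of the form shown above; it is not a complete implementation of the Link header grammar. The function name parse_link_header is hypothetical, and the id in the sample value is taken from the example response of this endpoint.

```python
import re


def parse_link_header(value: str) -> list:
    """Split a Link header into dicts holding 'url' plus its quoted parameters."""
    links = []
    for match in re.finditer(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)', value):
        link = {"url": match.group(1)}
        for key, val in re.findall(r'([\w-]+)="([^"]*)"', match.group(2)):
            link[key] = val
        links.append(link)
    return links


header = (
    "<https://verifier.buildsimple.de/documentverifier/api/documentverifier/documents/"
    "5d69014f-05d2-4854-a4ea-7b738b489686>"
    '; rel="document"; type="GET"'
)
links = parse_link_header(header)
document_urls = [link["url"] for link in links if link.get("rel") == "document"]
```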

 

Headers

X-Tenant-Id          {{tenantId}}

 

Params

action          LEASE
The action applied to the jobs. Optional; default: LEASE. Valid values: LEASE.

numberOfDocuments          5
The number of documents to lease (up to a maximum of 50). Optional; default: 5 documents.

leaseSecs          60
The duration of the lease in seconds (up to a maximum of 300 seconds, i.e. 5 minutes). The lease must be long enough for the slowest task to finish before it expires. Optional; default: 60 seconds.

docType          INVOICE
The document type to lease. Mandatory; valid values: INVOICE|CONTRACT.

 

 

Example Request jobs?action=LEASE

curl --request POST \
  --url '{{scheme}}://{{server}}:{{port}}/documentverifier/api/jobs?action=LEASE&numberOfDocuments=5&leaseSecs=60&docType=INVOICE' \
  --header 'X-Tenant-Id: {{tenantId}}'

 

Example Response 202 Accepted

[
    {
        "id": "5d69014f-05d2-4854-a4ea-7b738b489686",
        "jobUUID": "bdd79c32-dcd9-4ae0-bfb2-4a88c1e3d84c",
        "jobType": "io-extract_INVOICE",
        "jobState": "WAIT",
        "createdBy": "10.3.0.131|Camunda [Camunda] AutotaskExecutor|1#ISRRoleSystem",
        "creationDate": 1542124348892
    }
]
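
The parameter limits listed above can be checked on the client before the request is sent. The following sketch builds the lease URL and extracts the job ids from a response of the shape shown in the example. The function names lease_url and job_ids are hypothetical, and the lower bound of 1 for numberOfDocuments and leaseSecs is an assumption; only the upper bounds and the valid docType values are documented.

```python
from urllib.parse import urlencode

BASE_URL = "https://verifier.buildsimple.de/documentverifier/api/"


def lease_url(doc_type: str, number_of_documents: int = 5, lease_secs: int = 60) -> str:
    """Build the lease URL, enforcing the documented limits on the client side."""
    if doc_type not in ("INVOICE", "CONTRACT"):
        raise ValueError("docType must be INVOICE or CONTRACT")
    if not 1 <= number_of_documents <= 50:  # lower bound is an assumption
        raise ValueError("numberOfDocuments must be between 1 and 50")
    if not 1 <= lease_secs <= 300:  # lower bound is an assumption
        raise ValueError("leaseSecs must be between 1 and 300")
    query = urlencode({
        "action": "LEASE",
        "numberOfDocuments": number_of_documents,
        "leaseSecs": lease_secs,
        "docType": doc_type,
    })
    return f"{BASE_URL}jobs?{query}"


def job_ids(lease_response: list) -> list:
    """Collect the id of every leased job."""
    return [job["id"] for job in lease_response]


url = lease_url("INVOICE", number_of_documents=10, lease_secs=120)
ids = job_ids([{"id": "5d69014f-05d2-4854-a4ea-7b738b489686", "jobState": "WAIT"}])
```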

 

GET jobs

{{scheme}}://{{server}}:{{port}}/documentverifier/api/jobs

Get a list of all verified document jobs.

 

Headers

X-Tenant-Id          {{tenantId}}

 

Params

state          WAIT
The job state to filter by. Optional; valid values: NEW|WAIT|LOCK|ACTIVE|SUSPEND|ERROR.

 

Example Request jobs

curl --request GET \
  --url '{{scheme}}://{{server}}:{{port}}/documentverifier/api/jobs?state=WAIT' \
  --header 'X-Tenant-Id: {{tenantId}}'
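
A small helper can guard against misspelled state values before the request is made. This is a sketch under the assumption that the state parameter may be omitted entirely, as its optional status suggests; the names jobs_url and JOB_STATES are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://verifier.buildsimple.de/documentverifier/api/"
JOB_STATES = ("NEW", "WAIT", "LOCK", "ACTIVE", "SUSPEND", "ERROR")


def jobs_url(state: Optional[str] = None) -> str:
    """Return the jobs URL, optionally filtered by a valid job state."""
    if state is None:
        return BASE_URL + "jobs"
    if state not in JOB_STATES:
        raise ValueError(f"state must be one of {', '.join(JOB_STATES)}")
    return BASE_URL + "jobs?" + urlencode({"state": state})


all_jobs = jobs_url()
waiting_jobs = jobs_url("WAIT")
```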

 

GET documents/{id}

{{scheme}}://{{server}}:{{port}}/documentverifier/api/documents/{{id}}?alt=media

Get a verified document. Call POST documentverifier/api/jobs?action=LEASE to get leased jobs that contain the corresponding document ids.

 

Headers

Accept          application/json

  • application/json (default)

  • application/octet-stream

X-Tenant-Id          {{tenantId}}

 

Params

alt          media
Alternative to sending the header Accept: application/octet-stream. Optional; valid values: media.
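
The two ways of selecting the response format can be sketched as follows. The function names document_request and document_media_url are hypothetical, and the reading that application/octet-stream returns the document image while application/json returns the verified properties is an inference from the overview, not something this endpoint description states explicitly. The requests are only built, not sent.

```python
import urllib.request

BASE_URL = "https://verifier.buildsimple.de/documentverifier/api/"


def document_request(document_id, tenant_id, auth_header, as_binary=False):
    """Build a GET request that selects the format through the Accept header."""
    accept = "application/octet-stream" if as_binary else "application/json"
    return urllib.request.Request(
        f"{BASE_URL}documents/{document_id}",
        method="GET",
        headers={"Accept": accept, "X-Tenant-Id": tenant_id, "Authorization": auth_header},
    )


def document_media_url(document_id):
    """URL form that uses alt=media instead of the Accept header."""
    return f"{BASE_URL}documents/{document_id}?alt=media"


# Placeholder tenant and credentials; the id is the one used in the example request.
doc_id = "151968b7-cdf4-4cf4-a19e-5c2fdaa427fd"
json_request = document_request(doc_id, "my-tenant", "Basic <credentials>")
binary_request = document_request(doc_id, "my-tenant", "Basic <credentials>", as_binary=True)
media_url = document_media_url(doc_id)
```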

 

Example Request documents/{id}

curl --request GET \
  --url '{{scheme}}://{{server}}:{{port}}/documentverifier/api/documents/151968b7-cdf4-4cf4-a19e-5c2fdaa427fd' \
  --header 'Accept: application/json' \
  --header 'X-Tenant-Id: {{tenantId}}'

 

Example Response 200 OK

{
    "id": "151968b7-cdf4-4cf4-a19e-5c2fdaa427fd",
    "processId": "DV180000000004",
    "classification": "INVOICE",
    "language": "DE",
    "sourceDocument": {
        "externalId": "/86a95b12-c22e-424b-a9af-261109bf4b2f",
        "externalClient": "Apache Jackrabbit Oak",
        "externalType": "ContentEntry",
        "contentName": "Computeruniverse.pdf",

 

PATCH jobs/{id}?status=DELETED

{{scheme}}://{{server}}:{{port}}/documentverifier/api/jobs/{{id}}?status=DELETED

Acknowledge a leased job and set the status of the document to DELETED.

 

Headers

X-Tenant-Id         {{tenantId}}

 

Params

status          DELETED
The status to set on the document to which the job refers. Mandatory; valid values: DELETED|ARCHIVED.

 

Example Request jobs/{id}?status=DELETED

curl --request PATCH \
  --url '{{scheme}}://{{server}}:{{port}}/documentverifier/api/jobs/{{id}}?status=DELETED' \
  --header 'X-Tenant-Id: {{tenantId}}'
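
Putting the consumer side together, one lease, download and acknowledge cycle might look like the sketch below. The HTTP calls are passed in as functions so the control flow stays visible; process_batch and its parameter names are hypothetical, and skipping a job whose handling fails (so that it can be leased again after the lease expires) is one possible design, not documented behaviour of the service.

```python
from typing import Callable, List


def process_batch(
    lease: Callable[[], List[dict]],
    download: Callable[[str], dict],
    handle: Callable[[dict], None],
    acknowledge: Callable[[str, str], None],
    final_status: str = "ARCHIVED",
) -> List[str]:
    """Lease jobs, download each document, and acknowledge the ones handled."""
    if final_status not in ("DELETED", "ARCHIVED"):
        raise ValueError("status must be DELETED or ARCHIVED")
    acknowledged: List[str] = []
    for job in lease():
        try:
            handle(download(job["id"]))
        except Exception:
            # Not acknowledged: the job is simply left alone in this sketch.
            continue
        acknowledge(job["id"], final_status)
        acknowledged.append(job["id"])
    return acknowledged


# Stand-ins for the real HTTP calls (POST jobs, GET documents/{id}, PATCH jobs/{id}).
calls = []
done = process_batch(
    lease=lambda: [{"id": "job-1"}, {"id": "job-2"}],
    download=lambda job_id: {"id": job_id, "classification": "INVOICE"},
    handle=lambda document: calls.append(document["id"]),
    acknowledge=lambda job_id, status: calls.append((job_id, status)),
)
```

Acknowledging each job only after it has been handled successfully means that a crash in the middle of a batch leaves the remaining jobs unacknowledged rather than marking them as done.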