Quaero Document Archive REST API
AUTHENTICATION
DOCUMENT TYPES
1. List of document types
2. Fetch one document type
INFORMATION FIELDS
1. List of information fields
2. Fetch one information field
FILE FORMATS
1. List of file formats
2. Fetch one document format
SEQUENCES
1. List of sequences
2. Allocate and fetch the next value in a sequence
SEARCHES
1. Search the archive
  1. Full-text search
  2. Search on a field
2. Constrain the search
DOCUMENTS
PROCESSING QUEUES
UPLOAD
JSON
HATEOAS
1. HATEOAS within other objects
2. HATEOAS objects

Quaero Document Archive REST API

Quaero Archive provides a REST API to allow you to create and access documents in the Archive. It also allows you to access information about document types, formats and other configuration elements of the Archive.

Request

Like all REST APIs, it is accessed via HTTP or HTTPS requests. HTTP is normally available on port 33133, unless you have a non-standard configuration. POST requests must encode their parameters with JSON or YAML. The Content-Type header must match the encoding; application/json for JSON and text/vnd.yaml for YAML.

This documentation assumes you know the basics of HTTP. It will not include all the HTTP headers necessary in each example; headers like User-Agent, Host, TE, and Content-Length might be omitted or inaccurate. If you are using the Quaero C# or Perl library then these headers are added as needed.

    GET /dw/v1/documents/X1337 HTTP/1.1
    X-Comment: Fetch a single document
    Authorization: Bearer cc0149a3addfc20cd9e505230f492265-0-5FDBB800
    Host: HOST.quaero.ca

    POST /dw/v1/documents HTTP/1.1
    X-Comment: Request with parameters as JSON
    Authorization: Bearer 0ffa9c978fb31bb5d84515189c1c0ca6-0-5FDBB900
    Host: HOST.quaero.ca
    Content-Type: application/json
    Content-Length: 30

    {"criteria":{"words":"levis"}}

    POST /dw/v1/documents HTTP/1.1
    X-Comment: Request with parameters as YAML
    Authorization: Bearer 0ffa9c978fb31bb5d84515189c1c0ca6-0-5FDBB900
    Host: HOST.quaero.ca
    Content-Type: text/vnd.yaml
    Content-Length: 30

    criteria:
      words: "levis"

Response

Replies are standard HTTP replies. An HTTP status code of 200 OK means the request was successful. Other HTTP status codes are returned if there was a problem with your request; see Errors below.

REST information replies will be JSON objects. They will always include a status field. status:"OK" means the request was successful. status:"error" indicates an error occurred. Error text will be available in the error field.

REST data replies will be the appropriate content type. For example, a rendered page of a document will be image/png. The document's intermediate PDF will be application/pdf and the document's file will have a content type appropriate for that document.

    GET /dw/v1/documents/X1337 HTTP/1.1
    X-Comment: Fetch a single document
    Authorization: Bearer cc0149a3addfc20cd9e505230f492265-0-5FDBB800
    Host: HOST.quaero.ca

    HTTP/1.1 200 OK
    X-Comment: JSON response
    Connection: close
    Date: Thu, 17 Dec 2020 20:16:57 GMT
    Server: DW-Server-REST-0.03/DW-4.10.1.2565
    Content-Length: 39394
    Content-Type: application/json
    Client-Response-Num: 1

    {"count":"70","status":"OK","documents":[{"N":"1","meta":{"date":"2020-06-02","time":"20:17:46","contrat":"LE-378973","client":"E LEV EMPLOYES LEVIS","NUM":"X1062","code_revenus":"26","format":"pdf","facture":"LE-298325","dw-orig-format":"xlsx","DID":"19720","type":"contrat","pages":"2"},"links":[{"rel":"self","href":"https://dev6.quaero.ca/dw/v1/documents/X1062"},{"rel":"file","href":"https://dev6.quaero.ca/dw/v1/documents/X1062.pdf"},{"rel":"original","href":"https://dev6.quaero.ca/dw/v1/documents/X1... (+ 38882 more bytes not shown)

Response as YAML

REST information replies can also be encoded as YAML, if you include the Accept: text/vnd.yaml header in your HTTP request.

    PUT /queue/todo/test-20122314-12341-123.txt HTTP/1.1
    X-Comment: This provokes an error and gets a YAML response
    Accept: text/vnd.yaml
    Authorization: Bearer cc0149a3addfc20cd9e505230f492265-0-5FDBB800
    Host: HOST.quaero.ca

    HTTP/1.1 404 (Not Found)
    X-Comment: YAML response
    Connection: close
    Date: Thu, 17 Dec 2020 20:25:41 GMT
    Server: DW-Server-REST-0.03/DW-4.10.1.2565
    Content-Length: 90
    Content-Type: text/vnd.yaml
    Client-Date: Thu, 17 Dec 2020 20:25:41 GMT
    Client-Peer: 127.0.0.1:33133
    Client-Response-Num: 1
    X-PID: 18128

    ---
    "status": 'error'
    "error": 'Bad endpoint PUT /queue/todo/test-20122314-12341-123.txt'

Errors

The REST API signals errors with HTTP status codes. For instance, 413 Request Too Large is returned if you attempt to upload a document that is too large.

HTTP error replies also include a JSON body with extra information about the error.

    HTTP/1.1 404 Not Found
    Content-Length: 99
    Content-Type: application/json

    {
      "status": "error",
      "error": "Bad endpoint PUT /queue/todo/test-20122314-12341-123.txt"
    }
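The status and error fields described above lend themselves to a single checking routine on the client side. The following Python sketch is only an illustration: the names ArchiveError and check_reply are hypothetical and are not part of any Quaero library. It assumes you already hold the HTTP status code and the JSON body as text.

```python
import json


class ArchiveError(Exception):
    """Hypothetical exception for replies that report an error."""


def check_reply(http_status, body_text):
    """Parse a JSON information reply; raise ArchiveError if it reports a problem."""
    reply = json.loads(body_text)
    if http_status != 200 or reply.get("status") != "OK":
        # The error text, when present, is carried in the "error" field.
        message = reply.get("error", "unknown error")
        raise ArchiveError(f"HTTP {http_status}: {message}")
    return reply
```

Checking both the HTTP status code and the status field is deliberately redundant; either one alone should normally be enough.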

More information

To find out more about REST, read the Wikipedia REST entry.

To find out more about HTTP, read RFC 7230, RFC 7231 and RFC 7235.

AUTHENTICATION

The first step is to authenticate using credentials generated in the Users administrative tool. This will provide an access token that must be provided for all subsequent API calls.

The API authentication is done by asking for a bearer token using the oauth2/token endpoint. The Authorization header is built from your TokenID and Secret according to RFC 7617. You may find your TokenID and Secret in the Users administrative tool in Quaero Archive.

The endpoint will respond with an access_token and token_type. The token_type and access_token must be included in the Authorization header of all subsequent calls to the API. These are denoted by TOKEN_TYPE and ACCESS_TOKEN respectively in the examples below.

The response will also contain a list of HATEOAS links you may use to access other endpoints.

    POST https://HOST.quaero.ca/dw/v1/oauth2/token
    Authorization: Basic MmEyOTExNWIyNTkwNWM4N...WE1OWM5YzBlZjhmZTcxMTAxOWYzOWZj
    Content-Type: application/json

    {"grant-type":"token"}

    {
      "expires_in" : 43200,
      "status" : "OK",
      "access_token" : "1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500",
      "token_type" : "Bearer",
      "links" : [
        { "rel" : "search", "href" : "https://HOST.quaero.ca/dw/v1/documents" },
        { "rel" : "documents", "href" : "https://HOST.quaero.ca/dw/v1/documents" },
        { "rel" : "types", "href" : "https://HOST.quaero.ca/dw/v1/types" },
        { "rel" : "fields", "href" : "https://HOST.quaero.ca/dw/v1/fields" },
        { "rel" : "formats", "href" : "https://HOST.quaero.ca/dw/v1/formats" },
        { "rel" : "sequences", "href" : "https://HOST.quaero.ca/dw/v1/sequences" },
        { "rel" : "queues", "href" : "https://HOST.quaero.ca/dw/v1/queues" },
        { "rel" : "scans", "href" : "https://HOST.quaero.ca/dw/v1/scans" }
      ]
    }

    wget --http-user=TOKENID --http-password=SECRET \
         --auth-no-challenge \
         --header="Content-Type: application/json" \
         --post-data='{"grant-type":"token"}' \
         https://HOST.quaero.ca/dw/v1/oauth2/token -O-

    curl --user TOKENID:SECRET \
         --header "Content-Type: application/json" \
         --data '{"grant-type":"token"}' \
         https://HOST.quaero.ca/dw/v1/oauth2/token -o-

    string tokenID = "TOKENID";
    string secret = "SECRET";
    REST.Agent agent = new Quaero.REST.Agent ();
    agent.set_endpoint( "http://quaero.local.lcan:33133/dw/v1" );
    agent.set_credentials( tokenID, secret );
    if( !agent.connect() )
        throw new Exception ( "Unable to connect" );
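If you are not using wget, curl or the Quaero library, the two Authorization header values can be built by hand. The sketch below, in Python, follows the Basic scheme of RFC 7617 for the token request and then assembles the header for subsequent calls from a token reply. The function names are hypothetical and chosen only for this illustration.

```python
import base64


def basic_auth_header(token_id, secret):
    """Build the Basic Authorization value for the oauth2/token request."""
    credentials = f"{token_id}:{secret}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def bearer_header(token_reply):
    """Build the Authorization value for later calls from the token reply."""
    return f'{token_reply["token_type"]} {token_reply["access_token"]}'


if __name__ == "__main__":
    # TOKENID and SECRET are placeholders, as in the examples above.
    print(basic_auth_header("TOKENID", "SECRET"))
```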

DOCUMENT TYPES

List of document types

 GET /dw/v1/types
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

    {
      "types" : [
        {
          "name" : "01-facture",
          "links" : [
            { "rel" : "self", "href" : "https://HOST.quaero.ca/dw/v1/types/01-facture" }
          ]
        }
        /* ... more types */
      ],
      "status" : "OK"
    }

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/types -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/types -o-

    // Fetch the list of known document types
    DW.Types list = agent.types();
    foreach( DW.DocType dt in list.types ) {
        Console.WriteLine( dt.name );
    }

Fetch one document type

 GET /dw/v1/types/01-facture
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

    {
      "types" : [
        {
          "rotation" : 0,
          "deduplicate" : false,
          "saisie" : [],
          "name" : "01-facture",
          "default" : false,
          "i18n" : { "en" : "Customer Invoices", "fr" : "Factures clients" },
          "skip-banner" : false,
          "fields" : [ "location", "client-nom", "N", "time", "date",
                       "facture-no", "client-no", "NUM", "reference",
                       "type", "inside", "commande", "pages" ],
          "expire" : "10ans",
          "no-scale" : false,
          "links" : [
            { "rel" : "self", "href" : "https://HOST.quaero.ca/dw/v1/types/01-facture" }
          ]
        }
      ],
      "status" : "OK"
    }

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/types/01-facture -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/types/01-facture -o-

    // Fetch one document type via the REST agent
    DW.DocType report = agent.doc_type( "report" );

    // Fetch one document type via the list of document types
    DW.Types doc_types = agent.types();
    DW.DocType invoice = doc_types.doc_type( "invoice" );

INFORMATION FIELDS

List of information fields

 GET /dw/v1/fields
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

    {
      "fields" : [
        {
          "name" : "run_date",
          "links" : [
            { "rel" : "self", "href" : "https://HOST.quaero.ca/dw/v1/fields/run_date" }
          ]
        }
        /* ... more fields follow */
      ],
      "status" : "OK"
    }

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/fields -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/fields -o-

    // Fetch the list of known information fields
    DW.Fields list = agent.fields();
    foreach( DW.InfoField field in list.fields ) {
        Console.WriteLine( field.name );
    }

Fetch one information field

 GET /dw/v1/fields/pick
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

    {
      "fields" : [
        {
          "searchable" : true,
          "justify" : "left",
          "required" : false,
          "hide" : false,
          "significant" : false,
          "name" : "pick",
          "i18n" : { "en" : "Pick", "fr" : "Pick" },
          "links" : [
            { "rel" : "self", "href" : "https://HOST.quaero.ca/dw/v1/fields/pick" }
          ]
        }
      ],
      "status" : "OK"
    }

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/fields/pick -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/fields/pick -o-

    // Fetch one field via the REST agent
    DW.InfoField f1 = agent.info_field( "NUM" );

    // Fetch one field via the list of fields
    DW.Fields fields = agent.fields();
    DW.InfoField f2 = fields.info_field( "date" );

    // Fetch one field via a document type
    DW.DocType dt = agent.doc_type( "report" );
    DW.InfoField f3 = dt.info_field( "run-date" );

FILE FORMATS

List of file formats

Fetch a list of all accepted file formats.

 GET /dw/v1/formats
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

    {
      "formats" : [
        {
          "format" : "pdf",
          "links" : [
            { "rel" : "self", "href" : "https://HOST.quaero.ca/dw/v1/formats/pdf" }
          ]
        }
      ],
      "status" : "OK"
    }

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/formats -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/formats -o-

    // Fetch the list of file formats
    DW.Formats list = agent.formats();
    foreach( DW.Format format in list.formats ) {
        Console.WriteLine( format.format );
    }

Fetch one document format

 GET /dw/v1/formats/pdf
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

    {
      "formats" : [
        {
          "extension" : "pdf",
          "creator" : "",
          "can-merge" : false,
          "compressable" : true,
          "image" : false,
          "sphinxable" : true,
          "mime_type" : "application/pdf",
          "can-page" : false,
          "format" : "pdf",
          "asis" : false,
          "links" : [
            { "rel" : "self", "href" : "https://HOST.quaero.ca/dw/v1/formats/pdf" }
          ]
        }
      ],
      "status" : "OK"
    }

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/formats/pdf -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/formats/pdf -o-

    // Fetch details of one format via the REST agent
    DW.Format pdf = agent.format( "pdf" );

    // Fetch one format via the list of formats
    DW.Formats formats = agent.formats();
    DW.Format tif = formats.format( "tif" );
    String ct = tif.mime_type;

SEQUENCES

Sequences are a unified way of allocating unique numbers for documents. A sequence value starts with letters (A-Z, generally uppercase) and ends with one or more digits. Sequences are created as needed; accessing an unknown sequence starts its counter at 1.

Every time you access a sequence, it will increment the counter.

Sequence values are formatted with 6 digits (TAG000000) by default, but you can specify more or fewer by adding zeros to the sequence name (e.g. TAG000 if you only want 3). You can also specify the starting counter (e.g. TAG01000 will start the count at 1000), but this only works for sequences that have not been started yet.

Sequences will never wrap around; more digits are added as needed. It is impossible to delete a sequence and start again.
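The naming rules above can be illustrated with a small client-side model. The Python sketch below is not the server's implementation; it only mirrors the documented behaviour (default of 6 digits, width taken from the digits in the name, optional starting counter, no wrap-around). The function names are hypothetical.

```python
import re

DEFAULT_DIGITS = 6  # documented default width (TAG000000)


def parse_sequence_name(name):
    """Split a sequence name into (tag, digit width, starting counter)."""
    match = re.fullmatch(r"([A-Za-z]+)(\d*)", name)
    if match is None:
        raise ValueError(f"not a sequence name: {name!r}")
    tag, digits = match.groups()
    width = len(digits) if digits else DEFAULT_DIGITS
    # All-zero digits only set the width; anything else is a starting counter.
    start = int(digits) if digits and int(digits) > 0 else 1
    return tag, width, start


def format_value(name, counter):
    """Format a counter the way sequence values are described above."""
    tag, width, _ = parse_sequence_name(name)
    # Zero-pad to the requested width; larger counters simply grow wider.
    return f"{tag}{counter:0{width}d}"
```

For example, format_value("TEST000", 2) pads the counter to 3 digits, while a counter above 999 is rendered with as many digits as it needs.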

List of sequences

 GET /dw/v1/sequences
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

    {
      "status" : "OK",
      "sequences" : [
        {
          "name" : "TEST",
          "links" : [
            { "rel" : "self", "href" : "https://HOST.quaero.ca/dw/v1/sequences/TEST" }
          ]
        }
        /* ... loads more sequences */
      ]
    }

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/sequences -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/sequences -o-

    // Get the list of known sequences
    DW.Sequences sequences = agent.sequences();
    foreach( DW.Sequence seq in sequences.sequences ) {
        Console.WriteLine( seq.name );
    }

Allocate and fetch the next value in a sequence

 GET /dw/v1/sequences/TEST000
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

    {
      "status" : "OK",
      "sequences" : [
        {
          "value" : "TEST002",
          "name" : "TEST",
          "links" : [
            { "rel" : "self", "href" : "https://HOST.quaero.ca/dw/v1/sequences/TEST" }
          ]
        }
      ]
    }

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/sequences/TEST000 -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/sequences/TEST000 -o-

    // Fetch one sequence via the REST agent
    DW.Sequence s1 = agent.sequence( "TEST" );
    Console.WriteLine( "TEST=" + s1.value );    // eg TEST000017

    // Fetch the same sequence via the list of sequences
    DW.Sequences sequences = agent.sequences();
    // Note that we also want 3 trailing digits
    DW.Sequence s2 = sequences.sequence( "TEST000" );
    Console.WriteLine( "TEST=" + s2.value );    // eg TEST018

    // Get the subsequent number in the sequence
    s2.next();
    Console.WriteLine( "TEST=" + s2.value );    // eg TEST019

SEARCHES

Search the archive

Full-text search

 POST /dw/v1/documents
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500
 Content-Type: application/json
 Content-Length: 36

 {"criteria":{"words":"something"}}

    {
      "count" : "161",
      "documents" : [
        {
          "N" : "1",   /* note this is 1 based */
          "links" : [
            { "rel" : "self", "href" : "https://HOST.quaero.ca/dw/v1/documents/B21370" },
            { "rel" : "file", "href" : "https://HOST.quaero.ca/dw/v1/documents/B21370.pcl" },
            { "rel" : "page-N",
              "href" : "https://HOST.quaero.ca/dw/v1/documents/B21370/p{page}.png",
              "templateRequired" : [ "page" ] }
          ],
          "meta" : {
            "DID" : "5792",
            "NUM" : "B21370",
            "client" : "E LEV EMPLOYES LEVIS",
            "code_revenus" : "26",
            "contrat" : "LE-378973",
            "date" : "2013-12-04",
            "facture" : "47340 47316 47309 MT-007288",
            "format" : "pcl",
            "page-0-x" : "2550",
            "page-0-y" : "3300",
            "pages" : "1",
            "time" : "07:30:02",
            "type" : "contrat"
          }
        }
        /* Many more documents follow */
      ],
      "status" : "OK"
    }

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" -o log \
         --header="Content-Type: application/json" \
         --post-data='{"criteria":{"words":"something"}}' \
         https://HOST.quaero.ca/dw/v1/documents -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         --header "Content-Type: application/json" \
         --data '{"criteria":{"words":"something"}}' \
         https://HOST.quaero.ca/dw/v1/documents -o-

    // Build an HTTP request
    HttpRequestMessage req = agent.search_req( "something" );
    // Use your threading to get req or ...
    HttpResponseMessage resp = agent.do_req( req );
    // Parse the response
    DW.SearchResults R = agent.parse_resp<DW.SearchResults>( resp );

    // All three above steps can be done as one
    DW.SearchResults R2 = agent.search( "something" );

    Console.WriteLine( "Found {0} documents.", R.count );
    foreach( DW.Document doc in R2.documents ) {
        Console.WriteLine( "{0}. {1} ({2})", doc.N,
                           doc.field( "invoice" ), doc.field( "client" ) );
    }

Search on a field

 POST /dw/v1/documents
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500
 Content-Type: application/json
 Content-Length: 36

 {"criteria":{"fields":["contrat"],"words":"LE-378973"}}

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" -o log \
         --header="Content-Type: application/json" \
         --post-data='{"criteria":{"fields":["contrat"],"words":"LE-378973"}}' \
         https://HOST.quaero.ca/dw/v1/documents -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         --header "Content-Type: application/json" \
         --data '{"criteria":{"fields":["contrat"],"words":"LE-378973"}}' \
         https://HOST.quaero.ca/dw/v1/documents -o-

// TO BE WRITTEN

Constrain the search

All dates may be YYYY, YYYY-MM or YYYY-MM-DD.

Page of results

Use page to fetch results past the first page. It defaults to 1.

 POST /dw/v1/documents
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500
 Content-Length: 67
 Content-Type: application/json

 {"criteria":{"during":"2019","type":"contrat","words":"pill box"},"page":3}

Size of pages

Use limit to set the number of results per page. It defaults to 100.

 POST /dw/v1/documents
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500
 Content-Length: 67
 Content-Type: application/json

 {"criteria":{"during":"2019","type":"contrat","words":"pill box"},"limit":10}

During a date

Find all contracts for 2019:

 POST /dw/v1/documents
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500
 Content-Length: 67
 Content-Type: application/json

 {"criteria":{"during":"2019","type":"contrat","words":"pill box"}}

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" -o log \
         --header="Content-Type: application/json" \
         --post-data='{"criteria":{"during":"2019","type":"contrat"}}' \
         https://HOST.quaero.ca/dw/v1/documents -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         --header "Content-Type: application/json" \
         --data '{"criteria":{"during":"2019","type":"contrat"}}' \
         https://HOST.quaero.ca/dw/v1/documents -o-

    DW.Criteria crit = new DW.Criteria {
        words = "pill box",
        types = new List<String>{ "contrat" },
        during = "2019"
    };
    DW.Search search = new DW.Search { criteria = crit, page = 1, limit = 50 };
    HttpRequestMessage req = agent.search_req( search );
    // Use your threading to get req or do it sync
    HttpResponseMessage resp = agent.do_req( req );
    // Parse response
    DW.SearchResults R = agent.parse_resp<DW.SearchResults>( resp );

    // All three above steps can be done as one
    DW.SearchResults R2 = agent.search( search );
    Console.WriteLine( "Found {0} documents.", R.count );

    // Note that smaller archives can be searched by just a document type and year
    DW.Criteria crit2 = new DW.Criteria {
        types = new List<String>{ "contrat" },
        during = "2019"
    };
    DW.SearchResults R3 = agent.search( crit2 );
    foreach( DW.Document doc in R3.documents ) {
        Console.WriteLine( "{0}. {1}", doc.N, doc.NUM );
    }

    // Get the next page of results
    R3.next();
    if( R3.is_empty )       // Past the end of results
        return;
    foreach( DW.Document doc in R3.documents ) {
        Console.WriteLine( "{0}. {1}", doc.N, doc.NUM );
    }

Between two dates

Find all contracts for the 4th quarter of 2022:

 POST /dw/v1/documents
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500
 Content-Length: 67
 Content-Type: application/json

 {"criteria":{"from":"2022-10-01","to":"2022-12-31","type":"contrat","words":"pill box"}}

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" -o log \
         --header="Content-Type: application/json" \
         --post-data='{"criteria":{"from":"2022-10-01","to":"2022-12-31","type":"contrat"}}' \
         https://HOST.quaero.ca/dw/v1/documents -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         --header "Content-Type: application/json" \
         --data '{"criteria":{"from":"2022-10-01","to":"2022-12-31","type":"contrat"}}' \
         https://HOST.quaero.ca/dw/v1/documents -o-
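The search constraints in this section (words, type, during, from, to, page, limit) can be assembled programmatically before being sent as the body of the POST request. The Python sketch below builds such a body and checks the date formats listed above. The function names are hypothetical, and the sketch does not try to guess which combinations of constraints the server accepts.

```python
import json
import re

# Dates may be YYYY, YYYY-MM or YYYY-MM-DD.
DATE_RE = re.compile(r"\d{4}(-\d{2}){0,2}")


def check_date(value):
    """Return value unchanged if it has an accepted date shape."""
    if not DATE_RE.fullmatch(value):
        raise ValueError(f"bad date: {value!r}")
    return value


def build_search(words=None, doc_type=None, during=None,
                 date_from=None, date_to=None, page=None, limit=None):
    """Assemble the request body for a search on /dw/v1/documents."""
    criteria = {}
    if words is not None:
        criteria["words"] = words
    if doc_type is not None:
        criteria["type"] = doc_type
    if during is not None:
        criteria["during"] = check_date(during)
    if date_from is not None:
        criteria["from"] = check_date(date_from)
    if date_to is not None:
        criteria["to"] = check_date(date_to)
    body = {"criteria": criteria}
    if page is not None:
        body["page"] = page
    if limit is not None:
        body["limit"] = limit
    return body


if __name__ == "__main__":
    print(json.dumps(build_search(words="pill box", doc_type="contrat",
                                  during="2019", page=3)))
```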

DOCUMENTS

Documents are fetched based on their NUM. You can find a document's NUM by searching for it; the NUM is in the meta field. You may also fetch a document using the links in its links array.

Fetching information about one document

 GET /dw/v1/documents/X2020
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

    {
      "documents" : [
        {
          "meta" : {
            "DID" : "6061",
            "MJD" : "58087",
            "NUM" : "X2020",
            "TID" : "761423986",
            "date" : "2017-11-30",
            "dw-orig-format" : "xls",
            "format" : "pdf",
            "pages" : "22",
            "time" : "19:52:51",
            "type" : "contrat",
            "client" : "E LEV EMPLOYES LEVIS",
            "code_revenus" : "26",
            "contrat" : "LE-378973",
            "facture" : "LE-298325"
          },
          "links" : [
            { "rel" : "self", "href" : "https://HOST.quaero.ca/dw/v1/documents/X2020" },
            { "rel" : "file", "href" : "https://HOST.quaero.ca/dw/v1/documents/X2020.pdf" },
            { "rel" : "original", "href" : "https://HOST.quaero.ca/dw/v1/documents/X2020.xls" },
            { "rel" : "first-page", "href" : "https://HOST.quaero.ca/dw/v1/documents/X2020/p0.png" },
            { "rel" : "page-N",
              "href" : "https://HOST.quaero.ca/dw/v1/documents/X2020/p{page}.png",
              "templateRequired" : [ "page" ] }
          ]
        }
      ],
      "status" : "OK"
    }

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/documents/X2020 -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/documents/X2020 -o-

    String NUM = "X1021";
    DW.Document doc = agent.document( NUM );
    Console.WriteLine( "{0} is a {1} format {2}", doc.NUM, doc.format, doc.type );
    Console.WriteLine( "Invoice: {0}", doc.field( "invoice" ) );
    Console.WriteLine( "Client: {0}", doc.field( "client" ) );
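When working with the raw JSON rather than the Quaero library, following a link means looking it up by its rel value in the links array. A minimal Python sketch, assuming the reply has already been parsed into dictionaries and lists; find_link is a hypothetical helper name.

```python
def find_link(obj, rel):
    """Return the href of the first link with the given rel, or None."""
    for link in obj.get("links", []):
        if link.get("rel") == rel:
            return link["href"]
    return None


if __name__ == "__main__":
    # A trimmed-down document object, shaped like the reply above.
    document = {
        "meta": {"NUM": "X2020", "format": "pdf"},
        "links": [
            {"rel": "self", "href": "https://HOST.quaero.ca/dw/v1/documents/X2020"},
            {"rel": "file", "href": "https://HOST.quaero.ca/dw/v1/documents/X2020.pdf"},
        ],
    }
    print(find_link(document, "file"))
```

Looking links up by rel, rather than building URLs by hand, keeps client code independent of the exact URL layout.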

Fetching information about one document without knowing its NUM

If you do not know the document's NUM and you wish to skip the search step, you can add search criteria as URL parameters. The search must return a single document.

 GET /dw/v1/documents/?invoice=LE-298325
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

    {
      "documents" : [
        {
          "meta" : {
            "DID" : "6061",
            "MJD" : "58087",
            "NUM" : "X2020",
            "TID" : "761423986",
            "date" : "2017-11-30",
            "dw-orig-format" : "xls",
            "format" : "pdf",
            "pages" : "22",
            "time" : "19:52:51",
            "type" : "contrat",
            "client" : "E LEV EMPLOYES LEVIS",
            "code_revenus" : "26",
            "contrat" : "LE-378973",
            "invoice" : "LE-298325"
          },
          "links" : [
            { "rel" : "self", "href" : "https://HOST.quaero.ca/dw/v1/documents/X2020" },
            { "rel" : "file", "href" : "https://HOST.quaero.ca/dw/v1/documents/X2020.pdf" },
            { "rel" : "original", "href" : "https://HOST.quaero.ca/dw/v1/documents/X2020.xls" },
            { "rel" : "first-page", "href" : "https://HOST.quaero.ca/dw/v1/documents/X2020/p0.png" }
          ]
        }
      ],
      "status" : "OK"
    }

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         "https://HOST.quaero.ca/dw/v1/documents/?invoice=LE-298325" -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         "https://HOST.quaero.ca/dw/v1/documents/?invoice=LE-298325" -o-

    DW.UCriteria crit = new DW.UCriteria { { "invoice", "FF-123312" } };
    DW.Document doc = agent.document( crit );
    Console.WriteLine( "{0} is a {1} format {2}", doc.NUM, doc.format, doc.type );

Fetch one document file

To fetch the document file, use the original link. In the following example, document B7 is a PostScript file.

 GET /dw/v1/documents/B7.ps
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

Obviously this endpoint returns PostScript (application/postscript), not JSON.

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/documents/B7.ps -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/documents/B7.ps -o-

    HttpRequestMessage req = agent.document_req( "B7", "original" );
    // OR
    DW.Document doc = agent.document( "B7" );
    HttpRequestMessage req = agent.document_req( doc, "original" );
    // OR
    HttpResponseMessage resp = agent.document( "B7", "original" );
    // OR
    DW.Document doc = agent.document( "B7" );
    HttpResponseMessage resp = agent.document( doc, "original" );

    // In all cases, once you get your response, you may copy the data to a file:
    using( Stream output = File.OpenWrite( "B7.ps" ) )
    using( Stream input = resp.Content.ReadAsStreamAsync().Result )
    {
        input.CopyTo( output );
    }

Fetch one document's intermediate PDF form

Some document formats, for instance PowerPoint presentations and Word documents, are stored as an intermediate PDF alongside the original file. These PDFs are available on the file link (see HATEOAS below).

 GET /dw/v1/documents/X2020.pdf
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

Obviously this endpoint returns a PDF, not JSON.

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/documents/X2020.pdf -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/documents/X2020.pdf -o-

    HttpRequestMessage req = agent.document_req( "X2020", "file" );
    // OR
    DW.Document doc = agent.document( "X2020" );
    HttpRequestMessage req = agent.document_req( doc, "file" );
    // OR
    HttpResponseMessage resp = agent.document( "X2020", "file" );
    // OR
    DW.Document doc = agent.document( "X2020" );
    HttpResponseMessage resp = agent.document( doc, "file" );

    // In all cases, once you get your response, you may copy the data to a file:
    using( Stream output = File.OpenWrite( "X2020.pdf" ) )
    using( Stream input = resp.Content.ReadAsStreamAsync().Result )
    {
        input.CopyTo( output );
    }

Fetch one rendered page of a document

You may fetch one page of a document rendered as a PNG using the documents/NUM/pPAGE.png endpoint. NUM is the document number, PAGE is the page number, starting at 0.

To get the first page of document X2020 :

 GET /dw/v1/documents/X2020/p0.png
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

To get the second page of document X2020 :

 GET /dw/v1/documents/X2020/p1.png
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

Obviously this endpoint returns a PNG, not JSON.

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/documents/X2020/p0.png -O-
    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/documents/X2020/p1.png -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/documents/X2020/p0.png -o-
    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/documents/X2020/p1.png -o-

    HttpRequestMessage req = agent.document_req( "X2020", "first-page" );
    // OR
    DW.Document doc = agent.document( "X2020" );
    HttpRequestMessage req = agent.document_req( doc, "first-page" );
    // OR
    HttpResponseMessage resp = agent.file( "X2020", "first-page" );
    // OR
    DW.Document doc = agent.document( "X2020" );
    HttpResponseMessage resp = agent.file( doc, "first-page" );

    // OR access via page number
    // page 1
    DW.Document doc = agent.document( "X2020" );
    HttpRequestMessage req = agent.document_req( doc, 0 );
    // OR page 2
    HttpResponseMessage resp = agent.file( "X2020", 1 );
    // OR page 3
    DW.Document doc = agent.document( "X2020" );
    HttpResponseMessage resp = agent.file( doc, 2 );

    // In all cases, once you get your response, you may copy the data to a file:
    using( Stream output = File.OpenWrite( "X2020.page-1.png" ) )
    using( Stream input = resp.Content.ReadAsStreamAsync().Result )
    {
        input.CopyTo( output );
    }
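Page URLs can also be derived from the page-N link that a document carries, whose href contains a {page} placeholder and whose templateRequired array names the values that must be supplied. The Python sketch below fills in such a template with plain string replacement; it is an illustration under that assumption, and expand_link and page_url are hypothetical names.

```python
def expand_link(link, **values):
    """Fill the {name} placeholders of a templated link."""
    missing = [name for name in link.get("templateRequired", [])
               if name not in values]
    if missing:
        raise KeyError(f"missing template values: {missing}")
    href = link["href"]
    for name, value in values.items():
        href = href.replace("{" + name + "}", str(value))
    return href


def page_url(link, page_number):
    """Return the URL of a page, counting pages from 1 as a reader would.

    The endpoint itself numbers pages from 0, hence the subtraction.
    """
    if page_number < 1:
        raise ValueError("page_number starts at 1")
    return expand_link(link, page=page_number - 1)
```

Keeping the off-by-one conversion in one place avoids mixing the reader's page numbers with the 0-based numbers used by the endpoint.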

Fetch one page as a GIF

 GET /dw/v1/documents/X2020/p2.gif
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

Obviously this endpoint returns a GIF, not JSON.

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/documents/X2020/p2.gif -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/documents/X2020/p2.gif -o-

    DW.Document doc = agent.document( "X2020" );
    HttpRequestMessage req = agent.linked_req( doc, "first-page" );
    // Swap the .png extension of the link for .gif
    String uri = req.RequestUri.ToString();
    Regex re = new Regex( @"\.png$" );
    req.RequestUri = new Uri( re.Replace( uri, ".gif" ) );
    HttpResponseMessage resp = agent.do_req( req );
    using( Stream output = File.OpenWrite( "X2020.page-1.gif" ) )
    using( Stream input = resp.Content.ReadAsStreamAsync().Result )
    {
        input.CopyTo( output );
    }

Fetch one page as a JPEG

 GET /dw/v1/documents/X2020/p3.jpg
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

Obviously this endpoint returns a JPEG, not JSON.

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/documents/X2020/p3.jpg -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/documents/X2020/p3.jpg -o-

    DW.Document doc = agent.document( "X2020" );
    HttpRequestMessage req = agent.linked_req( doc, "first-page" );
    // Swap the .png extension of the link for .jpg
    String uri = req.RequestUri.ToString();
    Regex re = new Regex( @"\.png$" );
    req.RequestUri = new Uri( re.Replace( uri, ".jpg" ) );
    HttpResponseMessage resp = agent.do_req( req );
    using( Stream output = File.OpenWrite( "X2020.page-1.jpg" ) )
    using( Stream input = resp.Content.ReadAsStreamAsync().Result )
    {
        input.CopyTo( output );
    }

Delete a document

 DELETE /dw/v1/documents/X2020
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

    {
      "status" : "OK",
      "documents" : [
        {
          "NUM" : "X2020"
          /* ... all the details of the document you just deleted */
        }
      ]
    }

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         --method=DELETE \
         https://HOST.quaero.ca/dw/v1/documents/X2020 -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         --request DELETE \
         https://HOST.quaero.ca/dw/v1/documents/X2020 -o-

    HttpRequestMessage req = agent.delete_req( "X2020" );
    // OR
    DW.Document doc = agent.document( "X2020" );
    HttpRequestMessage req = agent.delete_req( doc );
    // OR
    DW.Reply repl = agent.delete( "X2020" );
    if( !repl.is_success )
        throw new Exception ( repl.error );
    // OR
    DW.Document doc = agent.document( "X2020" );
    DW.Reply repl = agent.delete( doc );
    if( !repl.is_success )
        throw new Exception ( repl.error );

PROCESSING QUEUES

Queues hold documents that will need processing before being added to the archive. The queues available will depend on the configuration of your system and your processing needs. Every system will have an incoming queue.

Get a list of document queues

 GET /dw/v1/queues
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

    {
      "status" : "OK",
      "queues" : [
        {
          "accepts" : [ "document" ],
          "name" : "aiguillage",
          "links" : [
            { "rel" : "self", "href" : "https://HOST.quaero.ca/dw/v1/queues/aiguillage" }
          ]
        },
        {
          "accepts" : [ "meta", "document" ],
          "name" : "incoming",
          "links" : [
            { "rel" : "self", "href" : "https://HOST.quaero.ca/dw/v1/queues/incoming" }
          ]
        },
        {
          "accepts" : [ "document" ],
          "name" : "pjl-extract",
          "links" : [
            { "rel" : "self", "href" : "https://HOST.quaero.ca/dw/v1/queues/pjl-extract" }
          ]
        }
        /* You might have more queues configured */
      ]
    }

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/queues -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/queues -o-

    DW.Queues queues = agent.queues();

Get a list of scanner queues

 GET /dw/v1/scans
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

Scanner queues hold TIFF files that have been scanned and need processing. Processing steps are image cleanup, image rotation, decoding the barcode and matching the scan to an original document.

    {
      "scans" : [
        {
          "accepts" : [ "document" ],
          "name" : "todo",
          "links" : [
            { "rel" : "self", "href" : "https://HOST.quaero.ca/dw/v1/scans/todo" }
          ]
        }
        /* Other queues depending on your configuration */
      ],
      "status" : "OK"
    }

    wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/scans -O-

    curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
         https://HOST.quaero.ca/dw/v1/scans -o-

    DW.Scanners scanners = agent.scanners();

Get details of a single queue

 GET /dw/v1/queues/incoming
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

{ "status" : "OK", "queues" : [ { "accepts" : [ "meta", "document" ], "name" : "incoming", "links" : [ { "rel" : "self", "href" : "https://HOST.quaero.ca/dw/v1/queues/incoming" } ] } ] }

 wget --header="Authorization: TOKEN_TYPE ACCESS_TOKEN" \
      https://HOST.quaero.ca/dw/v1/queues/incoming -O-

 curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
      https://HOST.quaero.ca/dw/v1/queues/incoming -o-

 DW.Queues queues = agent.queues();
 DW.Queue auto10 = queues.queue( "auto10" );
 DW.Queue incoming = agent.queue( "incoming" );

UPLOAD

Knowing about queues is all very well. You will also want to upload documents and files to those queues so they can be processed and added to the archive.

All requests must include a Host header. When uploading a file the Content-Type and Content-Length headers are required.
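As a rough illustration of these header requirements, the following Python sketch assembles the headers for an upload. The helper name upload_headers and the extension table are assumptions made for this example; the content types are the ones used in the examples in this document.

```python
import os

# Content types used in this document's examples, keyed by file extension.
CONTENT_TYPES = {
    ".json": "application/json",
    ".yml": "text/vnd.yaml",
    ".pdf": "application/pdf",
    ".tif": "image/tiff",
}


def upload_headers(filename, body, token_type, access_token, host):
    """Return the headers for a PUT of `body` (bytes) named `filename`."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in CONTENT_TYPES:
        raise ValueError("no known content type for %r" % filename)
    return {
        "Host": host,
        "Authorization": "%s %s" % (token_type, access_token),
        "Content-Type": CONTENT_TYPES[extension],
        # Content-Length is the size of the body in bytes, not characters.
        "Content-Length": str(len(body)),
    }
```

Most HTTP libraries add Host and Content-Length for you; the sketch only makes explicit what ends up on the wire.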

Note that filenames may only contain letters (A-Z, a-z), numbers (0-9), dashes (-) and underscores (_). Additionally, filenames must start with a letter. No accents, punctuation or emojis are allowed.
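A client can check a filename before uploading it. The Python sketch below is one way to do so. It assumes that the rule applies to the part of the name before the extension, since the examples in this document all carry an extension such as .json or .pdf. The function name is_valid_upload_name is hypothetical.

```python
import os
import re

# Letters, digits, dashes and underscores only, starting with a letter.
NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def is_valid_upload_name(filename):
    """Check the part of `filename` before its extension against the rules."""
    stem, _extension = os.path.splitext(filename)
    return NAME_RE.fullmatch(stem) is not None
```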

Upload a document's meta data

Uploading a document happens in two parts. First you upload its meta-data. The response gives you a HATEOAS link that you then use to upload the document file itself. Content-Type and Content-Length headers are required.

 PUT /dw/v1/queues/auto10/SQW0001.json
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500
 Content-Length: 149
 Content-Type: application/json

 {"name-client":"ZZZ INC.",
  "client":"3437238",
  "invoice":"733179",
  "date":"2019/03/13",
  "company":"000007",
  "type":"invoice"}

Note that the auto10 queue is used in these examples. You will need to specify a queue that exists in your system. See PROCESSING QUEUES.

{ "status" : "OK", "batchID" : "5C895A06-02EA2DAC", "queue" : "auto10", "links" : [ { "rel" : "self", "href" : "https://HOST.quaero.ca/dw/v1/queues/auto10/5C895A06-02EA2DAC" }, { "rel" : "delete", "href" : "https://HOST.quaero.ca/dw/v1/queues/auto10/5C895A06-02EA2DAC/batch" }, { "rel" : "close", "href" : "https://HOST.quaero.ca/dw/v1/queues/auto10/5C895A06-02EA2DAC/waiting" } ] }

 curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
      --header "Content-Type: application/json" \
      --request PUT --upload-file ~/tmp/XZ1000.json \
      https://HOST.quaero.ca/dw/v1/queues/auto10/XZ1000.json -o-

 String meta_file = @"c:\temp\QC00100.json";

 // Get the queue we want to work with
 DW.Queue incoming = agent.queue( "incoming" );

 // Upload the information file
 DW.Batch batch = incoming.upload( meta_file );

 // You can also upload meta by populating a Quaero.DW.Document object
 DW.Document doc = new DW.Document ();
 doc.NUM = agent.sequence( "TG" );
 doc.type = "invoice";
 doc.set( "invoice_no", invoice.no );
 doc.set( "branch", invoice.branch );
 doc.set( "date", invoice.date );
 batch = incoming.upload( doc );
 batch.upload( "invoice.pdf", doc.NUM + ".pdf" );
 batch.close();

Upload a document's meta data using YAML

You may also specify the meta-data as YAML, using the text/vnd.yaml content-type.

 PUT /dw/v1/queues/auto10/SQW0001.yml
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500
 Content-Length: 128
 Content-Type: text/vnd.yaml

 ---
 name-client: "ZZZ INC."
 client: "3437238"
 invoice: "733179"
 date: "2019/03/13"
 company: "000007"
 type: invoice
 end: of-file

 curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
      --header "Content-Type: text/vnd.yaml" \
      --request PUT --upload-file ~/tmp/XZ1000.yml \
      https://HOST.quaero.ca/dw/v1/queues/auto10/XZ1000.yml -o-

 String meta_file = @"c:\temp\QC00100.yml";
 String pdf_file = @"c:\temp\QC00100.pdf";

 // Get the queue we want to work with
 DW.Queue incoming = agent.queue( "incoming" );

 // Upload the information file
 DW.Batch batch = incoming.upload( meta_file );

 // Upload the document file
 batch.upload( pdf_file );

 // Close the batch
 batch.close();

Upload a document's file

A document's file is uploaded to the self HATEOAS link that you received when you uploaded the meta data.

 PUT /dw/v1/queues/auto10/5C895A06-02EA2DAC/SQW0001.pdf
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500
 Content-Type: application/pdf
 Content-Length: 12312

 %PDF-............................................................
 .................................................................

 curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
      --header "Content-Type: application/pdf" \
      --request PUT --upload-file ~/tmp/XZ1000.pdf \
      https://HOST.quaero.ca/dw/v1/queues/auto10/5C895A06-02EA2DAC/XZ1000.pdf -o-

 String document_file = @"c:\temp\QC00100.pdf";
 String meta_file = @"c:\temp\QC00100.json";

 // Get the queue we want to work with
 DW.Queue incoming = agent.queue( "incoming" );

 // Upload the information file
 DW.Batch batch = incoming.upload( meta_file );

 // Upload the document file
 batch.upload( document_file );
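If you are not using the Quaero library, the target of the file upload can be derived from the reply to the meta data upload. The Python sketch below shows one way to do this. It assumes, as in the request above, that the file name is appended to the href of the self link. The function name file_upload_url is hypothetical.

```python
def file_upload_url(meta_reply, filename):
    """Return the URL to PUT `filename` to, based on the reply's self link."""
    for link in meta_reply.get("links", []):
        if link.get("rel") == "self":
            # The document file goes "inside" the batch named by the self link.
            return link["href"].rstrip("/") + "/" + filename
    raise KeyError("reply has no self link")
```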

Finish an upload

To finish your upload, you must send a DELETE request to the close HATEOAS link. This will move the document and information files from a temporary directory into a queue directory, and processing will start on the document.

 DELETE /dw/v1/queues/auto10/5C895A06-02EA2DAC/waiting
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

 curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
      --request DELETE \
      https://HOST.quaero.ca/dw/v1/queues/auto10/5C895A06-02EA2DAC/waiting -o-
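The same request can be prepared from the reply to the meta data upload rather than from a hard-coded URL. The Python sketch below builds, but does not send, the DELETE request using the standard urllib module. The function name close_request is hypothetical, and the reply layout is the one shown in "Upload a document's meta data".

```python
import urllib.request


def close_request(reply, token_type, access_token):
    """Build (but do not send) the DELETE request that closes a batch."""
    close_links = [
        link["href"]
        for link in reply.get("links", [])
        if link.get("rel") == "close"
    ]
    if not close_links:
        raise KeyError("reply has no close link")
    return urllib.request.Request(
        close_links[0],
        method="DELETE",
        headers={"Authorization": "%s %s" % (token_type, access_token)},
    )
```

Passing the returned object to urllib.request.urlopen() would send it.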

Cancel an upload

If you wish to cancel an upload, send a DELETE request to the delete HATEOAS link. This will delete both the file and the meta data you have sent.

 DELETE /dw/v1/queues/auto10/5C895A06-02EA2DAC/batch
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

{ "status" : "OK", "element" : "batch", "queue" : "auto10", "batchID" : "5C895A06-02EA2DAC" }

 curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
      --request DELETE \
      https://HOST.quaero.ca/dw/v1/queues/auto10/5C895A06-02EA2DAC/batch -o-

 Boolean uploading( String[] files ) {
     DW.Queue incoming = agent.queue( "auto10" );
     DW.Batch batch = null;
     foreach( String file in files ) {
         if( ! File.Exists( file ) ) {
             if( batch != null )
                 batch.cancel();     // Cancel the batch if a file doesn't exist
             return false;
         }
         if( batch == null )
             batch = incoming.upload( file );
         else
             batch.upload( file );
     }
     if( batch != null )             // No batch exists if files is empty
         batch.close();
     foreach( String file in files ) {
         File.Delete( file );
     }
     return true;
 }

Upload only the document's meta data

You may wish to upload only meta data. This is useful if you are scanning in documents without barcodes. The user will then view the documents in Quaero Archive but enter the information in your custom application. Your custom application will then upload the meta with this command, which will return a virtual barcode that the user will enter into Quaero Archive to match the scan with the meta data.

The trick is to use the immediate queue, as follows:

 PUT /dw/v1/queues/immediate/meta.yml
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500
 Content-Length: 119
 Content-Type: application/json

 {"name-client":"ZZZ INC.",
  "client":"3437238",
  "invoice":"733179",
  "date":"2019/03/13",
  "company":"000007",
  "type":"invoice"}

Note that depending on your configuration, you might need to set dw-match-field.

Note also that the queue must be immediate and the file name must be meta.yml.

You may also specify the meta-data as YAML, using the text/vnd.yaml content-type.

{ "status" : "OK", "NUM" : "IM001010", "barcode" : "1231234" }

 // Get the queue we want to work with
 DW.Queue immediate = agent.queue( "immediate" );

 // Upload the meta data
 DW.Document doc = new DW.Document ();
 doc.type = "invoice";
 doc.set( "name-client", "ZZZ INC." );
 doc.set( "client", "3437238" );
 doc.set( "date", "2019/03/13" );
 doc.set( "company", "000007" );
 doc.set( "invoice", "733179" );
 DW.Batch batch = immediate.upload( doc );

Upload only the document's meta data, updating the web UI

If you include the dw-for field in your meta data, the user's web UI will be updated with the virtual barcode. The user then only has to press the "Enter barcode" button to match the scan with the meta data.

    PUT /dw/v1/queues/immediate/meta.yml
    Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500
    Content-Length: 119
    Content-Type: application/json

    {"dw-for":"user-id",
     "dw-match-field":"invoice",
     "name-client":"ZZZ INC.",
     "client":"3437238",
     "invoice":"733179",
     "date":"2019/03/13",
     "company":"000007",
     "type":"invoice"}

Note that user-id is the user's exact ID. It is case sensitive.

Note also that the queue must be immediate and the file name must be meta.yml.

You may also specify the meta-data as YAML, using the text/vnd.yaml content-type.

{ "status" : "OK", "NUM" : "IM001010", "barcode" : "1231234" }
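A custom application could assemble the request body and read the virtual barcode from the reply along these lines. This Python sketch is only an illustration. The function names immediate_meta and virtual_barcode are hypothetical, and the reply is assumed to have the shape shown above.

```python
import json


def immediate_meta(fields, user_id=None, match_field=None):
    """Return the JSON body for a meta-only upload to the immediate queue."""
    meta = dict(fields)
    if user_id is not None:
        meta["dw-for"] = user_id              # exact, case-sensitive user ID
    if match_field is not None:
        meta["dw-match-field"] = match_field  # may be needed, depending on configuration
    return json.dumps(meta).encode("utf-8")


def virtual_barcode(reply):
    """Extract the virtual barcode from a successful reply."""
    if reply.get("status") != "OK":
        raise RuntimeError(reply.get("error", "unknown error"))
    return reply["barcode"]
```

The bytes returned by immediate_meta would be sent with PUT to /dw/v1/queues/immediate/meta.yml with a Content-Type of application/json.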

Upload the first scan of a batch

Uploading a scan is done by creating a temporary batch queue and uploading multiple files to the self link. When completed, you either close the batch queue with the close link or delete it with the delete link.

 PUT /dw/v1/scans/todo/scan1-20190504-122343-0001.tif
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500
 Content-Type: image/tiff
 Content-Length: 1231

 II*.....................................
 ........................................

Note that scan filenames have a very specific format:

 scannerID-YYYYMMDD-HHMMSS-0000.tif

scannerID: A short string to identify this scanner. Useful for debugging. It must start with a letter and end with a number. It may only contain lower case letters (a-z) and numbers (0-9). Different scanners may have different processing applied to their scans. Talk to your integrator if this is the case for your setup. When in doubt, use scan1.
YYYYMMDD: The date the scan was made. YYYY is the full year (eg 2019), MM is the month with leading 0 (eg 03 for March) and DD is the day with leading 0 (eg 09).
HHMMSS: The time the scan was made. HH is the hour, with a 24 hour clock (eg 13 for 1 PM), MM is the minutes with leading 0 (eg 09) and SS is the seconds with leading 0 (eg 04).
0000: Sequential number of the scan within the batch. Starts at 1. If a number is missing in the sequence (eg, you uploaded -0001, -0002 and -0004 but not -0003) then the batch will not be processed.

{ "status" : "OK", "links" : [ { "rel" : "self", "href" : "http://dev6:33133/dw/v1/scans/todo/5C8AD76E-2FD1631C" }, { "rel" : "next", "href" : "http://dev6:33133/dw/v1/scans/todo/5C8AD76E-2FD1631C/scan1-20190504-122343-0002.tif" }, { "rel" : "delete", "href" : "http://dev6:33133/dw/v1/scans/todo/5C8AD76E-2FD1631C/batch" }, { "rel" : "close", "href" : "http://dev6:33133/dw/v1/scans/todo/5C8AD76E-2FD1631C/waiting" } ], "queue" : "todo", "batchID" : "5C8AD76E-2FD1631C" }

 curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
      --header "Content-Type: image/tiff" \
      --request PUT --upload-file ~/tmp/scan1-20190504-122343-0001.tif \
      https://HOST.quaero.ca/dw/v1/scans/todo/scan1-20190504-122343-0001.tif -o-

 String file1 = @"c:\temp\scan1-20191014-121212-0001.tif";
 DW.Queue incoming = agent.scanner( "todo" );
 DW.Batch batch = incoming.upload( file1 );
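Because a batch with a badly named or missing scan will not be processed, it can help to generate and check the file names in code. The Python sketch below follows the scannerID-YYYYMMDD-HHMMSS-0000.tif format described in this section. The function names are hypothetical, and the upper limit of 9999 is an assumption drawn from the four-digit sequence field.

```python
import re
from datetime import datetime

# Lower case letters and digits, starting with a letter, ending with a digit.
SCANNER_ID_RE = re.compile(r"[a-z][a-z0-9]*[0-9]")


def scan_filename(scanner_id, scanned_at, sequence):
    """Build a scan file name from its three components."""
    if SCANNER_ID_RE.fullmatch(scanner_id) is None:
        raise ValueError("invalid scanner ID: %r" % scanner_id)
    if not 1 <= sequence <= 9999:
        raise ValueError("sequence must be between 1 and 9999")
    stamp = scanned_at.strftime("%Y%m%d-%H%M%S")
    return "%s-%s-%04d.tif" % (scanner_id, stamp, sequence)


def missing_sequence_numbers(filenames):
    """Return the sequence numbers absent from 1..highest in `filenames`."""
    seen = set()
    for name in filenames:
        last_part = name.rsplit("-", 1)[1]        # eg "0004.tif"
        seen.add(int(last_part.split(".")[0]))    # eg 4
    if not seen:
        return []
    return [n for n in range(1, max(seen) + 1) if n not in seen]
```

Calling missing_sequence_numbers on the list of names you intend to upload, before closing the batch, catches gaps such as the missing -0003 in the example given for the sequence number.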

Upload subsequent scans

You may upload as many files as you need to the self link. When completed, you either close the batch queue with the close link or delete it with the delete link.

 PUT /dw/v1/scans/todo/5C895A06-02EA2DAC/scan1-20190504-122343-0002.tif
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500
 Content-Type: image/tiff
 Content-Length: 43212

 II*.....................................
 ........................................

{ "status" : "OK", "links" : [ { "rel" : "self", "href" : "http://dev6:33133/dw/v1/scans/todo/5C8AD76E-2FD1631C" }, { "rel" : "next", "href" : "http://dev6:33133/dw/v1/scans/todo/5C8AD76E-2FD1631C/scan1-20190504-122343-0003.tif" }, { "rel" : "delete", "href" : "http://dev6:33133/dw/v1/scans/todo/5C8AD76E-2FD1631C/batch" }, { "rel" : "close", "href" : "http://dev6:33133/dw/v1/scans/todo/5C8AD76E-2FD1631C/waiting" } ], "queue" : "todo", "batchID" : "5C8AD76E-2FD1631C" }

 curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
      --header "Content-Type: image/tiff" \
      --request PUT --upload-file ~/tmp/scan1-20190504-122343-0002.tif \
      https://HOST.quaero.ca/dw/v1/scans/todo/5C895A06-02EA2DAC/scan1-20190504-122343-0002.tif -o-

 String file1 = @"c:\temp\scan1-20191014-121212-0001.tif";
 String file2 = @"c:\temp\scan1-20191014-121212-0002.tif";
 DW.Queue incoming = agent.scanner( "todo" );
 DW.Batch batch = incoming.upload( file1 );
 batch.upload( file2 );

Finish a batch

To finish the batch, you must send a DELETE request to the close HATEOAS link. This will move the files from a temporary directory into the normal scanner processing queue.

 DELETE /dw/v1/scans/todo/5C895A06-02EA2DAC/waiting
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

 curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
      --request DELETE \
      https://HOST.quaero.ca/dw/v1/scans/todo/5C895A06-02EA2DAC/waiting -o-

 Boolean uploading( String[] files ) {
     DW.Queue incoming = agent.scanner( "todo" );
     DW.Batch batch = null;
     foreach( String file in files ) {
         if( ! File.Exists( file ) ) {
             if( batch != null )
                 batch.cancel();
             return false;
         }
         if( batch == null )
             batch = incoming.upload( file );
         else
             batch.upload( file );
     }
     if( batch != null )
         batch.close();      // Finish the batch
     foreach( String file in files ) {
         File.Delete( file );
     }
     return true;
 }

Cancel a batch

If you wish to cancel a batch, send a DELETE request to the delete HATEOAS link. This will delete all the files you have sent.

 DELETE /dw/v1/scans/todo/5C895A06-02EA2DAC/batch
 Authorization: Bearer 1b59145e4b23e42b6bb3641308d6c2b5-0-5C815500

 curl --header "Authorization: TOKEN_TYPE ACCESS_TOKEN" \
      --request DELETE \
      https://HOST.quaero.ca/dw/v1/scans/todo/5C895A06-02EA2DAC/batch -o-

 Boolean uploading( String[] files ) {
     DW.Queue incoming = agent.scanner( "todo" );
     DW.Batch batch = null;
     foreach( String file in files ) {
         if( ! File.Exists( file ) ) {
             if( batch != null )
                 batch.cancel();     // Cancel the batch if a file doesn't exist
             return false;
         }
         if( batch == null )
             batch = incoming.upload( file );
         else
             batch.upload( file );
     }
     if( batch != null )
         batch.close();
     foreach( String file in files ) {
         File.Delete( file );
     }
     return true;
 }

JSON

Nearly all REST calls return a JSON object. The exceptions are calls that return images or documents.

Every JSON response object has a status field.

Global fields

status

This element will be OK on success, error on failure. In the latter case, error will be set.

error

Contains a text string that describes an error, helpful for diagnostic purposes. Only set if status is error.

faults

Sometimes an endpoint will detect many things wrong with a request. In these cases it will return an array of strings describing the errors in faults.

name

Simple text name or tag of an object.
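A client will typically check status before using a reply. The Python sketch below turns an error reply into an exception whose message combines error and faults. The names ArchiveError and check_reply are hypothetical, and whether a reply can carry both error and faults at the same time is an assumption; the code simply reports whichever is present.

```python
class ArchiveError(Exception):
    """Raised when a reply does not have status "OK"."""


def check_reply(reply):
    """Return `reply` unchanged on success, raise ArchiveError otherwise."""
    if reply.get("status") == "OK":
        return reply
    problems = []
    if reply.get("error"):
        problems.append(reply["error"])
    problems.extend(reply.get("faults", []))
    raise ArchiveError("; ".join(problems) or "unknown error")
```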

Authentication fields

grant-type

Type of access you are requesting. Must be token.

access_token

An opaque token used to authorize subsequent REST calls.

token_type

Type of token used for authentication. Must be included in the Authorization header.

expires_in

Number of seconds an access_token is valid.
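The Authorization header used throughout the examples is built from token_type and access_token, in that order. A minimal Python sketch, assuming a reply that contains the three fields above (the function names are hypothetical):

```python
import time


def authorization_header(token_reply):
    """Return the value of the Authorization header for later calls."""
    return "%s %s" % (token_reply["token_type"], token_reply["access_token"])


def expiry_time(token_reply, received_at=None):
    """Return the epoch time (in seconds) at which the token stops being valid."""
    if received_at is None:
        received_at = time.time()
    return received_at + int(token_reply["expires_in"])
```

Recording the expiry time when the token is received lets a client request a new token before the old one lapses, rather than waiting for a call to fail.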

Document types

types

An array of simplified DOCUMENT TYPES objects. Each object has a name and links field.

rotation

Initial clockwise rotation of all pages of the document.

deduplicate

Boolean value that determines whether duplicate documents are detected and rejected.

saisie

Array of details of manual data entry.

skip-banner

Boolean value that determines if the first page of a document is skipped when it is displayed.

expire

String that describes how long a document will remain in the archive.

Information fields

fields

An array of INFORMATION FIELDS objects.

searchable

justify

required

hide

significant

Document formats

formats

An array of FILE FORMATS objects.

format

Short name of the format.

asis

Is this format unparsable and unprocessable? If true, then no processing is done, and the file cannot be viewed unless the user installs a viewer extension in their browser. An example unprocessable format is a .step g-code file.

can-page

Can Quaero Archive navigate to a specific page within the file?

creator

Short name of the application that would create this file format. If multiple applications can create the format, then a generic name is used. This field could be used to generate text prompts.

BLANK: This field is an empty string ("") if the format is a generic interchange format.
calc: Any spreadsheet application, eg MS Excel or LibreOffice Calc.
draw: Any vector graphics application, eg Corel Draw or LibreOffice Draw.
impress: Any presentation software, eg PowerPoint.
writer: Any word processing software, eg MS Word or LibreOffice Writer.
libreoffice: Other office formats used by LibreOffice.
AutoCAD: Autodesk-specific CAD software.
CAD: Any other CAD software formats, including 3D formats.
CATIA: Dassault Systèmes CATIA files.
inconnu: Unknown application.

extension

File extension.

can-merge

Can Quaero Archive append multiple files of this format together? For example, many TIFFs can be combined into a multi-page TIFF, but multiple PDFs cannot be merged.

compressable

Is compressing this file worthwhile? Quaero Archive saves a lot of disk space by compressing some file formats. As an example, TIFF files can be compressed by 80% or more. However, some file formats are already compressed and running another layer of compression is a waste of time. For example, PDF and OpenDoc files are already compressed.

image

Is this format an image file? Examples of image formats are PNG and TIFF.

mime_type

Mime type for this format.

sphinxable

Can text be easily extracted from this format for full-text searching?

Note that some document types might be configured for OCR and will be full-text searchable even if the document format has sphinxable = false.

Search criteria

criteria

An object that contains your search criteria.

 { "criteria" : {
     "during": "2019-03",
     "words": "key words",
     "type": "01-invoice"
   },
   "first": 11,
   "limit": 50
 }

It must contain one or more of the following fields.

during

Limit search to this time period. For instance 2019 will limit the search to one year and 2019-03 will limit the search to one month. This field has precedence over from and to.

from

Search starting at this date. This field is ignored if during is specified.

to

Search up to and including this date. This field is ignored if during is specified.

type

Limit the search to this document type.

types

Array of document types to search. Ignored if type is specified.

sort

Information field to sort on.

words

Search for these keywords.

fields

Search within these information fields.

first

First document from the search results to return. Defaults to 1. Use this field to page through search results. Ignored if page is set.

page

Which page of search results to return. Defaults to 1.

limit

Number of documents to return from the search results. Defaults to 100. Maximum is 1000.
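Several of these fields override one another: during over from and to, type over types, and page over first. The Python sketch below builds a request body that only includes the fields that would actually be used. It follows the layout of the example at the start of this section, with first, page and limit placed beside criteria rather than inside it, which is an assumption where the text is not explicit. The function name search_body and its parameter names are hypothetical, and sort and fields are left out for brevity.

```python
def search_body(during=None, date_from=None, date_to=None, doc_type=None,
                doc_types=None, words=None, first=None, page=None, limit=None):
    """Build the body of a search request, dropping fields that are ignored."""
    criteria = {}
    if during is not None:
        criteria["during"] = during          # during has precedence over from/to
    else:
        if date_from is not None:
            criteria["from"] = date_from
        if date_to is not None:
            criteria["to"] = date_to
    if doc_type is not None:
        criteria["type"] = doc_type          # types is ignored if type is given
    elif doc_types:
        criteria["types"] = list(doc_types)
    if words is not None:
        criteria["words"] = words
    if not criteria:
        raise ValueError("at least one search criterion is required")

    body = {"criteria": criteria}
    if page is not None:
        body["page"] = page                  # first is ignored if page is set
    elif first is not None:
        body["first"] = first
    if limit is not None:
        if not 1 <= limit <= 1000:
            raise ValueError("limit must be between 1 and 1000")
        body["limit"] = limit
    return body
```

Serialising the result with json.dumps gives a body suitable for a POST to the search endpoint.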

Internationalisation

Quaero Archive currently supports the English and French languages. Objects such as document types have longer text names in both languages.

i18n

An object containing all the known names for an object.

en

The name in English.

fr

The name in French.

Documents

documents

An array of document objects.

NUM

Unique identifier of the document.

DID

Unique non-repeatable number assigned to the document.

type

Short text name of the document type. See DOCUMENT TYPES.

TID

A unique numeric identifier for the document type.

date

Date the document was created, in the format of "YYYY-MM-DD". Note that this could be different from the date it was uploaded to Quaero Archive.

MJD

"Modified Julian Date" the document was created. This differs from date in that it is a number, not a string.
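For reference, a Modified Julian Date counts days from 17 November 1858, so the conversion to and from a calendar date is a simple subtraction. The Python sketch below shows the arithmetic. It assumes that MJD is a whole number of days for the calendar date in the date field, which this document does not state. The function names are hypothetical.

```python
from datetime import date, timedelta

MJD_EPOCH = date(1858, 11, 17)   # Modified Julian Date 0


def to_mjd(day):
    """Convert a calendar date to a Modified Julian Date (whole days)."""
    return (day - MJD_EPOCH).days


def from_mjd(mjd):
    """Convert a Modified Julian Date (whole days) back to a calendar date."""
    return MJD_EPOCH + timedelta(days=mjd)
```

Because it is a plain number, MJD is convenient for sorting documents and for computing the number of days between two documents.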

time

Time the document was created.

format

File format of the document. See FILE FORMATS.

dw-orig-format

The file format of the original document. For some complex formats, Quaero Archive converts the file to an intermediate PDF. This intermediate PDF is used for display and processing.

pages

Number of pages in the document. Note that some file formats can't be parsed by Quaero Archive and will have pages = 0.

Upload queues

queues

An array of upload queue objects. These objects contain the following fields.

queue

Name of the queue.

batchID

Identifier of the current batch.

accepts

List of types of files this queue will accept. Currently limited to document and meta.

meta: A JSON or YAML file containing the information fields of a document.
document: The document file. In the case of scanner queues, this must be a TIFF. For document queues it can be any of the document formats that Quaero Archive understands.

Scanner queues

scans

An array of scanner queue objects. These objects contain the same fields as queues objects.
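The accepts field lets a client pick a suitable queue before uploading. The Python sketch below lists the queues in a reply that accept a given kind of file. The function name queues_accepting is hypothetical, and the reply is assumed to have the shape shown in PROCESSING QUEUES.

```python
def queues_accepting(reply, kind, key="queues"):
    """Return the names of the queues in `reply` that accept `kind`.

    `kind` is "meta" or "document". Use key="scans" for a reply from
    the scans endpoint.
    """
    return [
        queue["name"]
        for queue in reply.get(key, [])
        if kind in queue.get("accepts", [])
    ]
```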

HATEOAS

HATEOAS stands for Hypermedia as the Engine of Application State, which is a long-winded way of saying that an object returned by the REST API will have a list of links to itself and to related objects.

{ "links" : [ { "rel" : "self", "href" : "https://HOST.quaero.ca/dw/v1/documents/X2020" }, { "rel" : "delete", "method": "DELETE", "href" : "https://HOST.quaero.ca/dw/v1/documents/X2020" }, { "rel" : "file", "href" : "https://HOST.quaero.ca/dw/v1/documents/X2020.pdf" }, { "rel" : "original", "href" : "https://HOST.quaero.ca/dw/v1/documents/X2020.xls" }, { "rel" : "page-N", "href" : "https://HOST.quaero.ca/dw/v1/documents/X2020/p{page}.png", "templateRequired" : [ "page" ] } ] }

self: Link to itself.
close: Link to finish an upload or scan batch. You must DELETE this link to close the batch.
delete: Link to delete a document or an upload or scan batch. You must DELETE this link to cancel the batch.
documents: Link to the document endpoint.
fields: Link to the fields endpoint.
file: Link to download the document file.
first-page: Link to an image of the first page of the document.
formats: Link to the formats endpoint.
next: Link to the next expected scan in the batch.
original: Link to the original document file. This will be different from file for the file formats that use an intermediate PDF, for instance OpenDoc files.
page-N: Templated link to an image of any page of the document. This link has the {page} string expansion, which you must expand to the page number you wish to fetch. Page numbering starts at 0.
pdf: Link to the document file converted to PDF. Not available for file formats that have an intermediate PDF, for instance OpenDoc files, as it is already available as a file relationship.
queues: Link to the queues endpoint.
scans: Link to the scans endpoint.
search: Link to the search endpoint. A POST to this endpoint will perform a search of Quaero Archive.
sequences: Link to the sequences endpoint.
types: Link to the types endpoint.
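In client code, following a relationship comes down to finding the link with the right rel and, for page-N, expanding the {page} template. The Python sketch below does both for an object shaped like the example in this section. The function names find_link and page_image_url are hypothetical.

```python
def find_link(obj, rel):
    """Return the first link of `obj` with the given rel, or None."""
    for link in obj.get("links", []):
        if link.get("rel") == rel:
            return link
    return None


def page_image_url(obj, page):
    """Expand the page-N template of `obj` for a zero-based page number."""
    link = find_link(obj, "page-N")
    if link is None:
        raise KeyError("object has no page-N link")
    if page < 0:
        raise ValueError("page numbering starts at 0")
    return link["href"].replace("{page}", str(page))
```

Looking links up by rel, instead of assembling URLs by hand, keeps a client working if the layout of the URLs changes.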