Record Matching - Match Post Tutorial
Download tutorial source code | Click here to download |
Last update | August 2016 |
Environment | Windows, Linux |
Compilers | JDK 1.7 or greater |
Prerequisites | Components |
Introduction
The Record Matching API allows you to match your own entity data to Refinitiv's identifiers. PermID.org record matching is the first step in integrating your existing data with Refinitiv's PermID. Use the API to resolve your Organization, Instrument, or Quote records to PermID.org URLs, reconciling your data with Refinitiv data. Then use the obtained identifiers to get more information on your records from Refinitiv via the search API, enriching your data with the full extent of Refinitiv’s knowledge.
You can send up to 1000 records to resolve in one request.
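Because a single request is limited to 1000 records, a larger data set has to be spread over several requests. The following is a minimal sketch of one way to do that. It assumes the records are already in memory as CSV lines, that every batch should repeat the header line, and that the header does not count towards the limit. The names `RecordBatcher` and `split` are hypothetical and not part of the tutorial source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the tutorial source.
class RecordBatcher {

    // Limit stated in this tutorial: up to 1000 records per request.
    static final int MAX_RECORDS_PER_REQUEST = 1000;

    /**
     * Splits data rows into batches of at most maxRecords rows.
     * Each batch starts with the header line (assumption: the header
     * does not count towards the record limit).
     */
    static List<List<String>> split(String header, List<String> rows, int maxRecords) {
        if (maxRecords < 1) {
            throw new IllegalArgumentException("maxRecords must be at least 1");
        }
        List<List<String>> batches = new ArrayList<List<String>>();
        for (int start = 0; start < rows.size(); start += maxRecords) {
            int end = Math.min(start + maxRecords, rows.size());
            List<String> batch = new ArrayList<String>();
            batch.add(header);
            batch.addAll(rows.subList(start, end));
            batches.add(batch);
        }
        return batches;
    }

    public static void main(String[] args) {
        String header = "LocalID,RIC,Ticker,Name,Country,Street,City,PostalCode,State,Website";
        List<String> rows = new ArrayList<String>();
        for (int i = 1; i <= 2500; i++) {
            // Made-up placeholder records, for illustration only.
            rows.add(i + ",,,Example Name " + i + ",,,,,,");
        }
        List<List<String>> batches = split(header, rows, MAX_RECORDS_PER_REQUEST);
        System.out.println("Number of requests needed: " + batches.size());
    }
}
```

Each resulting batch can then be written to its own file and submitted as a separate request.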
Description
The service accepts custom-formatted records to be matched in bulk, as either plain text input or file input. This tutorial covers the second option, file input, and illustrates its implementation in Java.
Each line of the submitted file should contain a single record to be matched. For an example, refer to the file exampleRM.csv included with this tutorial. The first line of the file contains headers that name the fields a record may contain. A search is performed for every line in the submitted batch. Each line is treated independently, and its result is returned as part of a single JSON response, which also includes summary fields with the total match results. The response contains the Open PermIDs that the service matched to the submitted records.
Please note that any language that supports HTTP can be used to implement the request.
Setup Steps
The steps include:
- Get your access token from the Try it Out! tab (read the Quick Start guide to learn how to do it)
- Review the example source code
- Build and run
  - Set up Eclipse or another Java project
  - Or build the Java example from the command line
  - Include the HTTP client prerequisite libraries provided with the tutorial
Review the example code
The steps include:
- Create HTTP client
// create HTTP client
CloseableHttpClient httpClient = HttpClientBuilder.create().build();
- Specify the end-point URL (in this tutorial we are using the file-based service)
// specify end-point URL
HttpPost httppost = new HttpPost("https://api.thomsonreuters.com/permid/match/file");
- Read in a specially formatted file; this example comes with the file exampleRM.csv
FileBody file = new FileBody(new File(args[1]));
- From file, create and set entity
// create HTTP entity to post the required file
HttpEntity reqEntity = MultipartEntityBuilder.create()
        .addPart("file", file)
        .build();
httppost.setEntity(reqEntity);
- Set required and optional headers. For the full list of options, please refer to the Record Match Guide
// set required headers
httppost.setHeader("x-openmatch-numberOfMatchesPerRecord", "1");
httppost.setHeader("X-AG-Access-Token", args[0]);
httppost.setHeader("x-openmatch-dataType", "Organization");
- Issue request, collect JSON response
ResponseHandler<String> responseHandler = new BasicResponseHandler();
// execute
String strResponse = httpClient.execute(httppost, responseHandler);
JSONObject jsonResponse = new JSONObject(strResponse);
- Process response (output)
// pretty-print JSON response
int spacesToIndentEachLevel = 2;
System.out.println("JSON response:\n" + jsonResponse.toString(spacesToIndentEachLevel));
Alternatively, the response can be received and simply printed as is without using JSONObject.
- Close HTTP client
// close the client
httpClient.close();
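The example reads the access token from args[0] and the file name from args[1] without checking them first. As a small sketch under that observation, the helper below validates the two arguments and collects the three headers from the walkthrough in one map. The class and method names are hypothetical; only the header names and the "Organization" value come from the tutorial code.

```java
import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the tutorial source.
class MatchRequestSettings {

    /** Returns an error message for unusable arguments, or null if they look usable. */
    static String validateArgs(String[] args) {
        if (args == null || args.length < 2) {
            return "Usage: <access-token> <input-file>";
        }
        if (args[0].trim().isEmpty()) {
            return "Access token must not be empty";
        }
        File input = new File(args[1]);
        if (!input.isFile() || !input.canRead()) {
            return "Cannot read input file: " + args[1];
        }
        return null;
    }

    /** Collects the headers used in the walkthrough, in insertion order. */
    static Map<String, String> buildHeaders(String token, String dataType, int matchesPerRecord) {
        Map<String, String> headers = new LinkedHashMap<String, String>();
        headers.put("X-AG-Access-Token", token);
        headers.put("x-openmatch-dataType", dataType);
        headers.put("x-openmatch-numberOfMatchesPerRecord", String.valueOf(matchesPerRecord));
        return headers;
    }

    public static void main(String[] args) {
        String problem = validateArgs(args);
        if (problem != null) {
            System.err.println(problem);
            return;
        }
        // With the tutorial's HttpPost, this loop would call httppost.setHeader(name, value).
        for (Map.Entry<String, String> h : buildHeaders(args[0], "Organization", 1).entrySet()) {
            boolean secret = h.getKey().equals("X-AG-Access-Token");
            System.out.println(h.getKey() + ": " + (secret ? "<hidden>" : h.getValue()));
        }
    }
}
```

Checking the arguments up front gives a readable message instead of an ArrayIndexOutOfBoundsException when the program is started without a token or file name.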
Build and run
The quickest way to build and run is with an IDE such as Eclipse or NetBeans.
To build and run from the command line (Windows syntax shown; on Linux, use : instead of ; as the classpath separator and / instead of \ in paths):
- Build:
javac -cp ".;prereqs\httpclient-4.4.jar;prereqs\httpcore-4.4.jar;prereqs\java-json.jar;prereqs\httpmime-4.4.jar" tr\test\*.java
- Run:
java -cp ".;prereqs\httpclient-4.4.jar;prereqs\httpcore-4.4.jar;prereqs\commons-logging-1.2.jar;prereqs\java-json.jar;prereqs\httpmime-4.4.jar" tr.test.HTTPClientRecordMatchPost YOURTOKENGOESHERE exampleRM.csv
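The commands above use Windows classpath syntax (; between entries and \ in paths), while the tutorial also lists Linux as a supported environment. The sketch below prints the equivalent classpath for whatever platform it runs on. The class name is hypothetical; the jar names are taken from the run command.

```java
import java.io.File;

// Hypothetical helper, not part of the tutorial source.
class ClasspathPrinter {

    // Jar names taken from the run command above.
    static final String[] JARS = {
        "httpclient-4.4.jar", "httpcore-4.4.jar", "commons-logging-1.2.jar",
        "java-json.jar", "httpmime-4.4.jar"
    };

    /** Joins "." and the prerequisite jars using the given separators. */
    static String classpath(String dirSeparator, String pathSeparator) {
        StringBuilder cp = new StringBuilder(".");
        for (String jar : JARS) {
            cp.append(pathSeparator).append("prereqs").append(dirSeparator).append(jar);
        }
        return cp.toString();
    }

    public static void main(String[] args) {
        // File.separator and File.pathSeparator reflect the platform running this program.
        System.out.println(classpath(File.separator, File.pathSeparator));
    }
}
```

The printed value can be passed to the -cp option of javac and java in place of the quoted string shown in the commands.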
Understanding the input
The input file from our example looks like this. The top line contains headers that describe the fields on the remaining lines. Field names and values are separated by commas; note that some field values are empty:
LocalID,RIC,Ticker,Name,Country,Street,City,PostalCode,State,Website
1,AAPL.O,,Apple,US,"Apple Campus, 1 Infinite Loop",Cupertino,95014,California,
2,AAPL.O,,Apple,,,,,,
3,,TEVA,Teva Pharmaceutical Industries Ltd,IL,,Petah Tikva,,,
4,TATA,,Tata Sky,IN,,,,,
5,IBM.N,,,,,,,,
6,,msft,,,,,,,
7,,GPRO,GoPro Inc,,,,,,
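The expected output later in this tutorial shows the service rejecting rows with the message "At least one of the fields should not be null: [standard identifier, name]". A rough sketch of a local pre-check that flags such rows before submission follows. It checks only the Name column, because the example file has no standard identifier column, and it handles quoted fields such as the street address in row 1. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the tutorial source.
class RecordPreCheck {

    /** Splits one CSV line into fields, honouring double-quoted fields. */
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"' && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // doubled quote inside a quoted field
                    i++;
                } else if (c == '"') {
                    inQuotes = false;
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    /** Finds a column by name, ignoring case; fails if the header lacks it. */
    static int columnIndex(List<String> header, String column) {
        for (int i = 0; i < header.size(); i++) {
            if (header.get(i).trim().equalsIgnoreCase(column)) {
                return i;
            }
        }
        throw new IllegalArgumentException("Missing column: " + column);
    }

    /** Returns the LocalID of every data row whose Name field is empty. */
    static List<String> rowsWithoutName(List<String> lines) {
        List<String> header = parseLine(lines.get(0));
        int idCol = columnIndex(header, "LocalID");
        int nameCol = columnIndex(header, "Name");
        List<String> flagged = new ArrayList<String>();
        for (String line : lines.subList(1, lines.size())) {
            List<String> f = parseLine(line);
            String name = nameCol < f.size() ? f.get(nameCol).trim() : "";
            if (name.isEmpty()) {
                flagged.add(idCol < f.size() ? f.get(idCol) : "?");
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "LocalID,RIC,Ticker,Name,Country,Street,City,PostalCode,State,Website",
            "1,AAPL.O,,Apple,US,\"Apple Campus, 1 Infinite Loop\",Cupertino,95014,California,",
            "2,AAPL.O,,Apple,,,,,,",
            "3,,TEVA,Teva Pharmaceutical Industries Ltd,IL,,Petah Tikva,,,",
            "4,TATA,,Tata Sky,IN,,,,,",
            "5,IBM.N,,,,,,,,",
            "6,,msft,,,,,,,",
            "7,,GPRO,GoPro Inc,,,,,,");
        System.out.println("Rows without a name: " + rowsWithoutName(lines)); // prints [5, 6]
    }
}
```

For the example file this flags the records with LocalID 5 and 6, the same two records that the service reports as errors.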
The request constructed by our example includes the path and the query parameters. The file is submitted as “multipart/form-data”. The mandatory headers are also included with the request (see the code). We ask the match service at api.thomsonreuters.com to run a match on the information contained in our file.
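To make “multipart/form-data” more concrete, the sketch below assembles such a body by hand with a single part named "file", the same part name the tutorial code passes to addPart. In the tutorial itself MultipartEntityBuilder does this work and picks its own boundary. The class name, the boundary value, and the application/octet-stream part type are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the tutorial source.
class MultipartSketch {

    /** Builds a multipart/form-data body with a single part named "file". */
    static byte[] buildBody(String boundary, String fileName, byte[] content) {
        String head = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"file\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: application/octet-stream\r\n" // assumed part type
                + "\r\n";
        String tail = "\r\n--" + boundary + "--\r\n";

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        write(out, head.getBytes(StandardCharsets.UTF_8));
        write(out, content);
        write(out, tail.getBytes(StandardCharsets.UTF_8));
        return out.toByteArray();
    }

    private static void write(ByteArrayOutputStream out, byte[] bytes) {
        out.write(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        String boundary = "example-boundary-1234"; // any token that does not occur in the file
        byte[] csv = "LocalID,Name\n1,Apple\n".getBytes(StandardCharsets.UTF_8);
        byte[] body = buildBody(boundary, "exampleRM.csv", csv);
        // The request would also carry the header:
        //   Content-Type: multipart/form-data; boundary=example-boundary-1234
        System.out.println(new String(body, StandardCharsets.UTF_8));
    }
}
```

The boundary line separates the part headers and file content from the rest of the body, which is why the same boundary value has to appear in the request's Content-Type header.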
CURL request example
To help verify that a submission is correct, a popular approach is to use the command-line utility curl.
So we also include an example of a curl request:
curl "https://api.thomsonreuters.com/permid/match/file" \
  -H "X-AG-Access-Token:____TOKEN__HERE____" \
  -H "x-openmatch-numberOfMatchesPerRecord: 1" \
  -H "x-openmatch-dataType: Organization" \
  -F "file=@____FILE___NAME___HERE______"
Understanding the expected output
This is the response we receive when we submit for matching the example file included with this tutorial.
JSON response:
{
"matched": {
"total": 5,
"excellent": 5
},
"errorCodeMessage": "Success",
"resolvingTimeInMs": 115,
"headersIdentifiedSuccessfully": [
"localid",
"name",
"country",
"street",
"city",
"postalcode",
"state",
"website"
],
"numErrorRecords": 2,
"unMatched": 0,
"requestTimeInMs": 116,
"headersNotIdentified": [
"ric",
"ticker"
],
"outputContentResponse": [
{
"Input_LocalID": "1",
"Original Row Number": "2",
"Input_City": "Cupertino",
"Match OrgName": "Apple Inc",
"Input_PostalCode": "95014",
"Match Ordinal": "1",
"Input_Street": "Apple Campus, 1 Infinite Loop",
"Input_Country": "US",
"ProcessingStatus": "OK",
"Match Level": "Excellent",
"Match Score": "100%",
"Input_Name": "Apple",
"Match OpenPermID": "https://permid.org/1-4295905573",
"Input_State": "California"
},
…
{
"Original Row Number": "6",
"ProcessingStatus": "ERROR: Row Num 6, Cause - At least one of the fields should not be null: [standard identifier, name]. Original Row: 5,IBM.N,,,,,,,,",
"Match Level": "No Match"
},
{
"Original Row Number": "7",
"ProcessingStatus": "ERROR: Row Num 7, Cause - At least one of the fields should not be null: [standard identifier, name]. Original Row: 6,,msft,,,,,,,",
"Match Level": "No Match"
}
],
"headersSupportedWereNotSent": ["standard identifier"],
"errorCode": 0,
"numReceivedRecords": 7,
"uploadedFileName": "exampleRM.csv",
"numProcessedRecords": 7,
"ignore": " "
}
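The output is JSON. It contains results both for matched records and for records that could not be processed (counted by numErrorRecords). For matches, it contains the Refinitiv identifiers, Open PermIDs. In this example, five records were matched and two were rejected: the ric and ticker headers are listed under headersNotIdentified, so the records with LocalID 5 and 6, which contain only a RIC or a ticker, were left with neither a name nor a standard identifier to match on. For more information on JSON, please refer to: http://json.org
This information about the response fields, and more, can be found on developer.permid.org: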
PARAMETER | DESCRIPTION |
---|---|
outputContentResponse | A JSON matching your data to Open PermID URLs. |
errorCode | Present when an error is encountered when processing the request. It is a short code for the error type. |
errorCodeMessage | Present when an error is encountered when processing the request. Contains a descriptive cause of the error encountered. |
headersIdentifiedSuccessfully | List of the column headers in the input CSV that are compliant with the input template and were successfully processed. |
headersNotIdentified | List of the column headers in the input CSV that are not compliant with the input template and were not used for matching. This is to allow you to debug your template to make sure you’ve named columns correctly. |
headersSupportedWereNotSent | List of the column headers that could have been sent in the input CSV, but were not sent. This is to allow you to debug your template to make sure you’ve named columns correctly. |
numReceivedRecords | The number of records for matching sent in your request. |
numProcessedRecords | The number of records for matching that were processed successfully. |
numErrorRecords | The number of records for matching that could not be processed due to an error. |
matched | The number of records processed successfully, for which a match was found. |
unMatched | The number of records processed successfully, for which no match was found. |
requestTimeInMs | How long it took to process the request – time in milliseconds. |
resolvingTimeInMs | How long it took to perform the matching part of the request – time in milliseconds. |
uploadedFileName | In case of permid/match/file: Name of uploaded file. |
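Once the response has been received, the matched Open PermID URLs are usually the part of outputContentResponse that gets used next. As a lightweight sketch that needs no JSON library, the helper below pulls the "Match OpenPermID" values out of the raw response string with a regular expression. This is a shortcut and not a JSON parser: it assumes the values contain no escaped quotes. The class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the tutorial source.
class PermIdExtractor {

    // Matches: "Match OpenPermID": "<value>" and captures <value>.
    private static final Pattern OPEN_PERMID =
            Pattern.compile("\"Match OpenPermID\"\\s*:\\s*\"([^\"]*)\"");

    /** Returns every "Match OpenPermID" value found in the response text, in order. */
    static List<String> extract(String jsonResponse) {
        List<String> urls = new ArrayList<String>();
        Matcher m = OPEN_PERMID.matcher(jsonResponse);
        while (m.find()) {
            urls.add(m.group(1));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Fragment modelled on the example response shown in this tutorial.
        String response = "{ \"outputContentResponse\": [ {"
                + " \"Match OrgName\": \"Apple Inc\","
                + " \"Match OpenPermID\": \"https://permid.org/1-4295905573\","
                + " \"Match Level\": \"Excellent\" },"
                + " { \"Match Level\": \"No Match\" } ] }";
        for (String url : extract(response)) {
            System.out.println(url);
        }
    }
}
```

Since the tutorial already parses the response into a JSONObject, walking outputContentResponse through that object is the more robust route; the regular expression is only meant for quick checks on the raw text.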
Learn more
For more information, developer guides, FAQ, and release notes, check out the documentation.