DSS Extract Endpoint Obsolescence

Introduction

This article has two purposes. It aims to explain:

  1. Why extraction notes are useful, and why you should save and analyze them. This is relevant for anyone using DSS (DataScope Select) to retrieve data.
  2. The planned obsolescence of the DSS Extract endpoint. This change will impact existing and future developments that use the DSS REST API.

Extract endpoint obsolescence

This is only relevant for On Demand extractions made without using the .Net SDK.

Current situation

The DSS REST API delivers 3 On Demand extraction endpoints:

  • Extractions/Extract
  • Extractions/ExtractWithNotes
  • Extractions/ExtractRaw

Extract returns only the data, in JSON format.

ExtractWithNotes delivers the data in JSON format, as well as extraction notes.

ExtractRaw delivers data in compressed CSV format, as well as extraction notes, but uses a different workflow.

For a detailed comparison of these calls, refer to this section of the DSS REST Tutorials Introduction.

Reason for the change

The idea is to encourage a more effective use of the API, by systematically retrieving, storing and analyzing the extraction notes.

Time table and references

The Extract endpoint is already deprecated and will be disabled in the future. It was originally announced that it would be disabled in release 12.3 (January 2019); this is no longer the case. It will instead be disabled in the second half of 2019 (the exact date has yet to be announced). The Developer Community advisory and PCN (Product Change Notification) 10132 have been updated accordingly.

If you use the Extract endpoint, you should plan to replace it with ExtractWithNotes before mid 2019.

Extraction notes details – why are they useful

Extraction notes are extremely useful, for several reasons:

  • They contain a lot of interesting information:
    • Extraction details
    • Instrument specifics
    • Warning messages
    • Error messages
  • You must send them to the DSS support team when you ask for help on extraction issues encountered with particular requests.

There are actually 2 sets of extraction related notes:

  1. The extraction notes
  2. The RIC maintenance notes

They are bundled together in the result of an On Demand request. Scheduled extractions generate them in two separate files.

The extraction notes deliver details of the extraction and its timing, list inactive instruments, and highlight instrument expansion, applied embargoes, warnings, errors, data usage, etc.

The RIC maintenance notes deliver information on name changes and corporate actions.

It is strongly recommended to save all these notes, and to analyze them to detect name changes, warnings or potential issues that might otherwise go unnoticed.

Here is a notes sample (slightly reformatted to enhance readability) from an On Demand Intraday pricing request:

"Notes": [
  "Extraction Services Version 12.2.39800 (46e4b31ee8b8), Built Oct 26 2018 17:16:27
   Processing started at 31102018 03:37:02 PM.
   User ID: 33314
   Extraction ID: 329709808
   Schedule: _OnD_0x0663e6ec9d001825 (ID = 0x0663e6ec9f301825)
   Input List (3 items): _OnD_0x0663e6ec9d001825 (ID = 0663e6ec9d601825) Created: 31102018 03:36:55 PM Last Modified: 31102018 03:36:55 PM
   Schedule Time: 31102018 03:36:56 PM
   Report Template (23 fields): _OnD_0x0663e6ec9d001825 (ID = 0x0663e6ec9d201825) Created: 31102018 03:36:55 PM Last Modified: 31102018 03:36:55 PM
   WARNING: No Pricing for instrument (CSP,438516AC0,CPN), segment 'G' due to instrument not being traded.
   No prices needed currency scaling.
   Real-time data was snapped at 31102018 03:37:02 PM, it was scheduled to snap at 31102018 03:36:56 PM.
   Embargo delay of 15 minutes required by [ GER (DEUTSCHE BOERSE XETRA ULTRA LEVEL 1), GE2 (DEUTSCHE BOERSE - XETRA ULTRA L1 L2) ] for quotes from GER
   The last report will be embargoed until 31102018 03:52:00 PM (15 minutes) due to quote: RIC,ALVG.DE,GER - Last Update Time: 31102018 03:37:00 PM.
   The file _OnD_0x0663e6ec9d001825.0min.csv will be available immediately.
   The file _OnD_0x0663e6ec9d001825.15min.csv will be embargoed until 31102018 03:52:00 PM.
   Processing completed successfully at 31102018 03:37:02 PM, taking 0.56 Secs.
   Extraction finished at 31102018 02:37:02 PM UTC, with servers: x10i07, QSHC18 (0.1 secs), QSSHA1 (0.0 secs)
   Usage Summary for User 33314, Client 11122, Template Type Intraday Pricing
   Base Usage
           Instrument                          Instrument                   Terms          Price      
     Count Type                                Subtype                      Source         Source    
   ------- ----------------------------------- ---------------------------- -------------- ----------------------------------------
         1 Corporate                                                        N/A            N/A
         2 Equities                                                         N/A            N/A
   -------
         3 Total instruments charged.
         0 Instruments with no reported data.
   =======
         3 Instruments in the input list.
   No TRPS complex usage to report -- 3 Instruments in the input list had no reported data.
   Writing RIC maintenance report.
  ",
  "Identifier,IdentType,Source,RIC,RecordDate,MaintType,OldValue,NewValue,Factor,FactorType   "
]
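As a minimal sketch of how the two sets of notes in an On Demand result could be separated: the Python function below assumes, based on the sample above, that the RIC maintenance notes are the array element that starts with the `Identifier,IdentType,` field-name header, and treats every other element as extraction notes. The name `split_notes` and the abbreviated sample are illustrative only.

```python
# Assumption: the maintenance report starts with the CSV field-name header
# shown in the sample; any other element is treated as extraction notes.
MAINTENANCE_HEADER_PREFIX = "Identifier,IdentType,"


def split_notes(notes):
    """Return (extraction_notes_text, maintenance_notes_text)."""
    extraction, maintenance = [], []
    for note in notes:
        if note.lstrip().startswith(MAINTENANCE_HEADER_PREFIX):
            maintenance.append(note)
        else:
            extraction.append(note)
    return "\n".join(extraction), "\n".join(maintenance)


# Abbreviated version of the "Notes" array from the sample.
sample = [
    "Extraction Services Version 12.2.39800 (46e4b31ee8b8), Built Oct 26 2018 17:16:27\n"
    "Processing started at 31102018 03:37:02 PM.\n",
    "Identifier,IdentType,Source,RIC,RecordDate,MaintType,OldValue,NewValue,Factor,FactorType\n",
]
extraction_notes, maintenance_notes = split_notes(sample)
```

Keeping the two texts separate makes it easier to log the extraction notes line by line while handling the maintenance notes as CSV.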

Let us now see in detail what type of information the notes can deliver, and why this is important.

General extraction information

Here is a snippet from extraction notes of an on demand End of Day pricing extraction request:

Extraction Services Version 12.2.39749 (667ea36e4050), Built Oct 18 2018 18:03:24
Holiday Rollover of Universal Close Price waived.
Processing started at 22102018 04:51:04 PM.
User ID: 33314
Extraction ID: 328169652
Schedule: _OnD_0x0660f77ebfd01442 (ID = 0x0660f77ec7101442)
Input List (2 items): _OnD_0x0660f77ebfd01442 (ID = 0660f77ec2501442) Created: 22102018 04:51:02 PM Last Modified: 22102018 04:51:02 PM
Schedule Time: 22102018 04:51:03 PM
Report Template (73 fields): _OnD_0x0660f77ebfd01442 (ID = 0x0660f77ebff01442) Created: 22102018 04:51:02 PM Last Modified: 22102018 04:51:02 PM
Processing completed successfully at 22102018 04:51:04 PM, taking 0.628 Secs.
Extraction finished at 22102018 02:51:04 PM UTC, with servers: x12n01, QSHC17 (0.4 secs), QSSHA1 (0.0 secs)

It contains the extraction start and end times, duration, IDs, etc. This can be useful if you want to do performance analysis or need to investigate suspected issues.
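For performance analysis, these lines can be parsed automatically. The sketch below is one possible approach, assuming the line wording and the `ddmmyyyy hh:mm:ss AM/PM` timestamp layout seen in the snippet above; the function name and the dictionary keys are hypothetical. In the sample the "Processing" timestamps differ by two hours from the UTC "Extraction finished" time, so they are parsed here as naive local times.

```python
import re
from datetime import datetime

# Timestamp layout seen in the notes: day, month and year with no separators.
NOTE_TIME_FORMAT = "%d%m%Y %I:%M:%S %p"
TIMESTAMP = r"(\d{8} \d{2}:\d{2}:\d{2} [AP]M)"


def parse_extraction_info(notes_text):
    """Extract the ID, start/end times and duration from extraction notes."""
    info = {}
    m = re.search(r"Extraction ID: (\d+)", notes_text)
    if m:
        info["extraction_id"] = int(m.group(1))
    m = re.search(r"Processing started at " + TIMESTAMP, notes_text)
    if m:
        info["started"] = datetime.strptime(m.group(1), NOTE_TIME_FORMAT)
    m = re.search(
        r"Processing completed successfully at " + TIMESTAMP + r", taking ([\d.]+) Secs",
        notes_text,
    )
    if m:
        info["completed"] = datetime.strptime(m.group(1), NOTE_TIME_FORMAT)
        info["duration_secs"] = float(m.group(2))
    return info


sample = """Processing started at 22102018 04:51:04 PM.
User ID: 33314
Extraction ID: 328169652
Processing completed successfully at 22102018 04:51:04 PM, taking 0.628 Secs."""
info = parse_extraction_info(sample)
```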

Actual extraction durations are often quite short; what usually takes a bit more time is the wait in queue (especially when the servers are under heavy load). Bear in mind that extraction times can be longer for Time Series requests, because Intraday requests are treated with the highest priority and End of Day requests are also prioritized. Also take note that Intraday pricing requests could be subject to embargoes.

Intraday pricing snap time

When running an Intraday pricing request it can be useful to know the exact time when prices were snapped. Here is the relevant snippet from the extraction notes:

Real-time data was snapped at 31102018 04:17:22 PM, it was scheduled to snap at 31102018 04:17:16 PM.

Note that data was snapped a few seconds later than the scheduled snap time. This is normal; the delay varies depending on server load and the number of instruments.
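If you want to monitor this delay, it can be computed from the note line itself. This is a small sketch, assuming the exact wording and timestamp layout of the line shown above; `snap_delay_seconds` is a hypothetical helper name.

```python
import re
from datetime import datetime

NOTE_TIME_FORMAT = "%d%m%Y %I:%M:%S %p"
SNAP_PATTERN = re.compile(
    r"Real-time data was snapped at (\d{8} \d{2}:\d{2}:\d{2} [AP]M), "
    r"it was scheduled to snap at (\d{8} \d{2}:\d{2}:\d{2} [AP]M)"
)


def snap_delay_seconds(notes_text):
    """Return actual minus scheduled snap time in seconds, or None if absent."""
    m = SNAP_PATTERN.search(notes_text)
    if m is None:
        return None
    snapped = datetime.strptime(m.group(1), NOTE_TIME_FORMAT)
    scheduled = datetime.strptime(m.group(2), NOTE_TIME_FORMAT)
    return (snapped - scheduled).total_seconds()


line = (
    "Real-time data was snapped at 31102018 04:17:22 PM, "
    "it was scheduled to snap at 31102018 04:17:16 PM."
)
delay = snap_delay_seconds(line)  # 6.0 seconds for this sample line
```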

Embargoes

An Intraday pricing extraction can be embargoed (in other words, delayed) if you do not have real-time permission for the selected data venues.

If no embargo applies, the notes will contain this:

No embargo required for this report.

If one or more embargoes apply, you will get details on each of them, one venue per line:

Embargo delay of 15 minutes required by [ GER (DEUTSCHE BOERSE XETRA ULTRA LEVEL 1), GE2 (DEUTSCHE BOERSE - XETRA ULTRA L1 L2) ] for quotes from GER
The last report will be embargoed until 31102018 03:52:00 PM (15 minutes) due to quote: RIC,ALVG.DE,GER - Last Update Time: 31102018 03:37:00 PM.
The file _OnD_0x0663e6ec9d001825.0min.csv will be available immediately.
The file _OnD_0x0663e6ec9d001825.15min.csv will be embargoed until 31102018 03:52:00 PM.

This particular example illustrates a 15 minute embargo applied for Xetra. It also tells us that a partial data delivery will be done immediately for data that is not embargoed (instruments for which we have real-time permissioning), whereas the Xetra data will be delivered 15 minutes later. Partial deliveries can be enabled or disabled using the GUI, in the user preferences. Embargoes and their handling are explained in detail in this section of the .Net SDK tutorial 4.

Thanks to this information you can find out why an intraday request was unexpectedly delayed. This could give you a better understanding of your real-time data requirements, and lead to a discussion with your account manager.
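The embargo lines can also be picked up programmatically, for instance to log which files are delayed. The following sketch assumes the message wording shown in the example above; the function name and the returned dictionary keys are hypothetical.

```python
import re

EMBARGO_DELAY = re.compile(
    r"Embargo delay of (\d+) minutes required by \[(.+?)\] for quotes from (\S+)"
)
FILE_STATUS = re.compile(
    r"The file (\S+) will be (available immediately|embargoed until (.+?))\.$"
)


def parse_embargoes(notes_text):
    """Return (delays, files) found in the extraction notes."""
    delays, files = [], []
    for raw in notes_text.splitlines():
        line = raw.strip()
        m = EMBARGO_DELAY.search(line)
        if m:
            delays.append({"minutes": int(m.group(1)), "source": m.group(3)})
            continue
        m = FILE_STATUS.search(line)
        if m:
            # embargoed_until stays None for files available immediately.
            files.append({"file": m.group(1), "embargoed_until": m.group(3)})
    return delays, files


sample = """Embargo delay of 15 minutes required by [ GER (DEUTSCHE BOERSE XETRA ULTRA LEVEL 1), GE2 (DEUTSCHE BOERSE - XETRA ULTRA L1 L2) ] for quotes from GER
The last report will be embargoed until 31102018 03:52:00 PM (15 minutes) due to quote: RIC,ALVG.DE,GER - Last Update Time: 31102018 03:37:00 PM.
The file _OnD_0x0663e6ec9d001825.0min.csv will be available immediately.
The file _OnD_0x0663e6ec9d001825.15min.csv will be embargoed until 31102018 03:52:00 PM."""
delays, files = parse_embargoes(sample)
```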

Invalid instruments

Invalid instruments are those that are unsupported, not found or inactive.

An inactive instrument had no activity in the request period, and thus does not deliver pricing data, whereas other data (like the currency or exchange code) could be returned.

Note: in the user preferences you can set whether inactive and unsupported instruments are allowed in your instrument lists. You can override this default value for individual extraction requests, using the identifier list validation options:

"ValidationOptions": {
  "AllowInactiveInstruments": true,
  "AllowUnsupportedInstruments": false
},

Here are a few examples of messages related to invalid instruments:

WARNING: No Pricing for instrument (CSP,438516AC0,CPN), segment 'G' due to instrument not being traded.
WARNING: No Pricing for instrument (RIC,CARR.VX,), segment 'E' due to instrument not found.
All identifiers were invalid.  No extraction performed.

Please bear in mind that the extraction notes content differs depending on the type of requested data.

Data usage

You can check your total data usage in the DSS GUI usage dashboard.

For each individual request, a data usage summary is delivered in the extraction notes:

Usage Summary for User 33314, Client 11122, Template Type Intraday Pricing
Base Usage
        Instrument                          Instrument                   Terms          Price   
  Count Type                                Subtype                      Source         Source
------- ----------------------------------- ---------------------------- -------------- ----------------------------------------
      1 Corporate                                                        N/A            N/A
      2 Equities                                                         N/A            N/A
-------
      3 Total instruments charged.
      0 Instruments with no reported data.
=======
      3 Instruments in the input list.

This gives you a more detailed view on how much each request impacts your data usage, and can serve to monitor data consumption, or investigate unexpected increases.
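To monitor consumption per request, the usage summary can be turned into numbers. The sketch below is only one way to do it, assuming the table layout shown above (per-type rows between the two dashed lines, then the totals); the function name and result keys are hypothetical.

```python
import re

ROW = re.compile(r"(\d+)\s+(.+?)(?:\s{2,}|$)")
TOTALS = {
    "charged": re.compile(r"^(\d+) Total instruments charged\.$"),
    "no_data": re.compile(r"^(\d+) Instruments with no reported data\.$"),
    "input_list": re.compile(r"^(\d+) Instruments in the input list\.$"),
}


def parse_usage_summary(notes_text):
    """Return (counts per instrument type, totals) from a usage summary."""
    by_type, totals = {}, {}
    in_rows = False
    for raw in notes_text.splitlines():
        line = raw.strip()
        if line.startswith("-------"):
            in_rows = not in_rows  # first dashed line opens the rows, second closes them
            continue
        if in_rows:
            m = ROW.match(line)
            if m:
                by_type[m.group(2)] = int(m.group(1))
            continue
        for key, pattern in TOTALS.items():
            m = pattern.match(line)
            if m:
                totals[key] = int(m.group(1))
    return by_type, totals


sample = """Usage Summary for User 33314, Client 11122, Template Type Intraday Pricing
Base Usage
        Instrument                          Instrument                   Terms          Price
  Count Type                                Subtype                      Source         Source
------- ----------------------------------- ---------------------------- -------------- --------
      1 Corporate                                                        N/A            N/A
      2 Equities                                                         N/A            N/A
-------
      3 Total instruments charged.
      0 Instruments with no reported data.
=======
      3 Instruments in the input list.
"""
by_type, totals = parse_usage_summary(sample)
```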

TRPS complex usage

Pricing for TRPS (Thomson Reuters Pricing Service) derivatives and structured notes is different from other instrument types, and requires additional permissions. Complex instruments are grouped in 3 tiers; their usage (or the absence of it) is also reported:

No TRPS complex usage to report -- 3 Instruments in the input list had no reported data.

Instrument expansion

This is of importance if you use:

  • Chain RICs: a chain is expanded, at extraction time, into its constituent RICs.

    A request for one chain will therefore result in a request for several RICs (anything from a few to several hundred depending on the chain).

    Example: the 0#.FTSE RIC (FTSE 100 international index) expands to 103 RICs.

  • File codes: they also expand at extraction time, just like chains.

    If you set file code expansion to include delisted RICs (the default value is set in the DSS GUI general preferences) they will expand even more.

  • Non-RIC instrument identifiers: CUSIPs, ISINs and SEDOLs are usually resolved to the primary RIC. But in some cases the mapping between these or other instrument codes and RICs is not 1:1.

Here is an extraction notes snippet mentioning instrument expansion, from an On Demand Intraday Pricing extraction request for ChainRIC 0#.FTSE:

CHAIN RIC 0#.FTSE expanded to 103 RICS: .AD.FTSE to WTB.L.
Total instruments after instrument expansion = 103

There are a few rare cases where an instrument could expand to 0 instruments. This could happen with a sub-chain RIC for instance, as explained on page 55 of the DSS User Guide.

Instrument expansion consequences

Instrument expansion can potentially impact extraction limits, and your data usage.

If an instrument list used in an extraction request is very large and close to the extraction limit for that type of request, instrument expansion could result in exceeding the limit, in which case the instruments in excess would be ignored, and no data returned for them.

As instrument expansion dramatically increases the number of RICs in a request, your data usage also increases.
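To keep an eye on these consequences, the expansion lines in the extraction notes can be parsed and compared against a limit. In the sketch below the limit is supplied by the caller and the 90% threshold is an arbitrary illustration; neither is a real DSS value, and the function name is hypothetical.

```python
import re

CHAIN_EXPANSION = re.compile(r"CHAIN RIC (\S+) expanded to (\d+) RICS")
TOTAL_AFTER = re.compile(r"Total instruments after instrument expansion = (\d+)")


def expansion_report(notes_text, instrument_limit):
    """Summarize instrument expansion and flag totals close to a limit."""
    chains = {
        m.group(1): int(m.group(2)) for m in CHAIN_EXPANSION.finditer(notes_text)
    }
    m = TOTAL_AFTER.search(notes_text)
    total = int(m.group(1)) if m else None
    return {
        "chains": chains,
        "total_after_expansion": total,
        # Illustrative rule: warn once 90% of the caller-supplied limit is reached.
        "near_limit": total is not None and total >= 0.9 * instrument_limit,
    }


sample = """CHAIN RIC 0#.FTSE expanded to 103 RICS: .AD.FTSE to WTB.L.
Total instruments after instrument expansion = 103"""
report = expansion_report(sample, instrument_limit=110)  # 110 is a made-up limit
```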

Suppressed items

Some data items might be suppressed for lack of permission, typically third party and premium data content fields. Here are 2 examples from the extraction notes:

Column 'CIN Code' suppressed for lack of 'CIN Code' permission.
(RIC,23323CCP8=FINR,FNR)  row suppressed for lack of 'DTCC CMORT' permission.

Also, prices that are older than yesterday will be suppressed from extractions for current day prices.
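Suppression messages like the two examples above can be collected so that missing columns or rows are not mistaken for data errors. This is a sketch that assumes only the two message shapes shown here; other wordings may exist, and the function name is hypothetical.

```python
import re

SUPPRESSED = re.compile(
    r"^(?:Column '(?P<column>[^']+)'|\((?P<instrument>[^)]*)\)\s+row)"
    r" suppressed for lack of '(?P<permission>[^']+)' permission\."
)


def parse_suppressions(notes_text):
    """Return one dict per suppressed column or row found in the notes."""
    found = []
    for line in notes_text.splitlines():
        m = SUPPRESSED.match(line.strip())
        if m:
            # Either "column" or "instrument" is set; the other one is None.
            found.append(m.groupdict())
    return found


sample = """Column 'CIN Code' suppressed for lack of 'CIN Code' permission.
(RIC,23323CCP8=FINR,FNR)  row suppressed for lack of 'DTCC CMORT' permission."""
suppressions = parse_suppressions(sample)
```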

Currency scaling

Several API calls allow you to set currency scaling, using the following condition in the extraction request:

"Condition": { "ScalableCurrency": true }

This option allows you to extract certain prices in major currencies instead of minor currencies. It does not convert major currencies to minor currencies. For example, if an instrument is quoted in GBp, this option will convert the pricing to GBP. However, if the instrument is in GBP, no conversion will be performed.

Extraction notes will mention if no scaling was applied:

No prices needed currency scaling.

Here is what you can see when currency scaling was applied; this example is from an Intraday Pricing extraction request for ChainRIC 0#.FTSE:

606 prices were scaled from currency GBp (divided by 100).

The notes therefore tell you whether any scaling was applied, and to how many prices.
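A short sketch of how the scaling messages could be read from the notes, assuming the two message wordings shown above; the function name and dictionary keys are hypothetical.

```python
import re

SCALED = re.compile(
    r"(\d+) prices were scaled from currency (\S+) \(divided by (\d+)\)"
)


def parse_currency_scaling(notes_text):
    """Return a list of scaling events; empty if no scaling is reported."""
    return [
        {"prices": int(count), "currency": currency, "divisor": int(divisor)}
        for count, currency, divisor in SCALED.findall(notes_text)
    ]


scaled = parse_currency_scaling(
    "606 prices were scaled from currency GBp (divided by 100)."
)
not_scaled = parse_currency_scaling("No prices needed currency scaling.")
```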

Warnings

These highlight various cases that are not considered as errors, but that you ought to know about, like: missing prices, delisted instruments, restricted content, value truncation (if you set the number of decimals to be smaller than delivered values), file code changes, etc.

Example from an Intraday pricing request for an instrument that is not traded:

WARNING: No Pricing for instrument (CSP,438516AC0,CPN), segment 'G' due to instrument not being traded.

Example from an Intraday pricing request for delisted instruments:

WARNING: No Pricing for instrument (RIC,CARR.VX,), segment 'E' due to instrument not found.
WARNING: No Pricing for instrument (RIC,RTR.L,), segment 'unknown' due to instrument not found.

Time series request for an instrument where no historical data was found:

WARNING: No Pricing is available for (RIC,UBSG.S,VTX) because no historical data was found.

Thanks to this you understand why no pricing data is returned for specific instruments, whereas other data (like the currency or exchange code) is returned.
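The "No Pricing" warnings shown above follow a regular shape, so the affected instruments and reasons can be extracted for logging. The sketch below covers only the two wordings that appear in these examples; the function name and dictionary keys are hypothetical.

```python
import re

NO_PRICING = re.compile(
    r"WARNING: No Pricing (?:for instrument|is available for) "
    r"\(([^,]*),([^,]*),([^)]*)\)"
    r"(?:, segment '([^']*)')? (?:due to|because) (.+?)\.?$"
)


def parse_pricing_warnings(notes_text):
    """Return one dict per 'No Pricing' warning found in the notes."""
    warnings = []
    for line in notes_text.splitlines():
        m = NO_PRICING.search(line.strip())
        if m:
            id_type, identifier, source, segment, reason = m.groups()
            warnings.append({
                "identifier_type": id_type,
                "identifier": identifier,
                "source": source,
                "segment": segment,  # None when the message has no segment
                "reason": reason,
            })
    return warnings


sample = """WARNING: No Pricing for instrument (CSP,438516AC0,CPN), segment 'G' due to instrument not being traded.
WARNING: No Pricing for instrument (RIC,CARR.VX,), segment 'E' due to instrument not found.
WARNING: No Pricing is available for (RIC,UBSG.S,VTX) because no historical data was found."""
warnings = parse_pricing_warnings(sample)
```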

Errors

A bad request will generate an error immediately: the extraction request will not be processed and no extraction notes will be generated. But an error could also occur during the extraction process.

Here is an example where a request was made for a template that was not permissioned:

ERROR: Report suppressed for lack of 'Premium_ASIA_10PM' template permission.

Maintenance notes

To receive them you must enable RIC Maintenance Reports in the DSS GUI user preferences.

This second type of notes contains information on RIC maintenance changes that have occurred over the past ten days, including: Deletions, Renames, Currency Conversions, Delists, Relists, File Code Updates and Stock Split Adjustment Factors.

Instrument identifiers can change, for instance if a company changes its name, is bought or merges, or simply for normalization reasons. This is not a frequent occurrence. The probability of an instrument identifier change is obviously higher if the request is for many instruments.

Corporate actions like splits, dividends and others can have an impact on prices. Seeing split adjustment factors in the maintenance notes is a clear indication that something of interest happened, and is useful to understand sudden price changes.

RIC maintenance changes could require further actions in your company, like changes to your instrument lists, reference data or analytics.

For full information on individual corporate actions, you can make a Corporate Actions Extraction request, as explained in the DSS REST Tutorial 5.

Here are some sample maintenance notes lines, from an Intraday Pricing request for ChainRIC 0#.FTSE. The first line details the field names. Three data lines follow; they contain stock splits with their adjustment factors and individual dates:

Identifier,IdentType,Source,RIC,RecordDate,MaintType,OldValue,NewValue,Factor,FactorType
0#.FTSE,CHR,LSE,RR.L,25102018,SPLT,,,.994524,
0#.FTSE,CHR,LSE,SLA.L,22102018,SPLT,,,1.142857,
0#.FTSE,CHR,LSE,SLA.L,22102018,SPLT,,,.868511,

If there are no maintenance notes, then only the field names will appear:

Identifier,IdentType,Source,RIC,RecordDate,MaintType,OldValue,NewValue,Factor,FactorType
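Because the maintenance notes are comma-separated with a header line, they can be read with a standard CSV parser. The sketch below uses Python's `csv.DictReader` and assumes the field names shown above; `parse_maintenance_notes` is a hypothetical helper name.

```python
import csv


def parse_maintenance_notes(maintenance_text):
    """Return the data lines as dicts keyed by field name (empty if header only)."""
    # Strip each line: the notes can carry trailing spaces after the last field.
    lines = [ln.strip() for ln in maintenance_text.splitlines() if ln.strip()]
    return list(csv.DictReader(lines))


sample = """Identifier,IdentType,Source,RIC,RecordDate,MaintType,OldValue,NewValue,Factor,FactorType
0#.FTSE,CHR,LSE,RR.L,25102018,SPLT,,,.994524,
0#.FTSE,CHR,LSE,SLA.L,22102018,SPLT,,,1.142857,
0#.FTSE,CHR,LSE,SLA.L,22102018,SPLT,,,.868511,
"""
rows = parse_maintenance_notes(sample)

# Stock splits with their adjustment factors, as (RIC, factor) pairs.
splits = [(r["RIC"], float(r["Factor"])) for r in rows if r["MaintType"] == "SPLT"]
```

An empty result means that only the field names were present, in other words there was nothing to report.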

Notes reference documentation

For more information on extraction and maintenance notes, please refer to the following:

See also chapter 7 of the DSS GUI user guide.

Extraction notes management

As a minimum the notes should be requested, and saved to file for further reference.

Ideally you should also treat them automatically in your applications, to extract all the information that is of interest to you, detect issues and log or flag them for further treatment or human intervention.

As the extraction notes content differs depending on the API call, the best way to proceed is simply to parse the notes line by line to identify, log and treat all lines that contain key words like “inactive”, “invalid”, “permission”, “WARNING”, “ERROR”, etc.

Caveat: the list of key words above is just an example to get you started; you must define your own list based on your use cases and interests.

You must also manage the maintenance notes, if there are data lines after the one containing the field names.
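The line-by-line approach described above could look like the following sketch. The key word list is the example list from this article, matching is case-insensitive by choice, and the function name is hypothetical; adapt all of these to your own use cases.

```python
# Example key words from the article; define your own list for real use.
KEYWORDS = ("inactive", "invalid", "permission", "WARNING", "ERROR")


def flag_lines(notes_text, keywords=KEYWORDS):
    """Return (line number, matched key words, line) for each flagged line."""
    flagged = []
    for number, line in enumerate(notes_text.splitlines(), start=1):
        lowered = line.lower()
        hits = [k for k in keywords if k.lower() in lowered]
        if hits:
            flagged.append((number, hits, line.strip()))
    return flagged


sample = """Processing started at 31102018 03:37:02 PM.
WARNING: No Pricing for instrument (CSP,438516AC0,CPN), segment 'G' due to instrument not being traded.
ERROR: Report suppressed for lack of 'Premium_ASIA_10PM' template permission.
No prices needed currency scaling."""
flagged = flag_lines(sample)
for number, hits, line in flagged:
    print(f"line {number} {hits}: {line}")
```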

How to modify your code

With the .Net SDK

The .Net SDK automatically requests the extraction notes.

For a scheduled request, the data, extraction notes and RIC maintenance notes are in three separate files. All you need to do is retrieve the appropriate files, as shown in the DSS .Net SDK Tutorial 4.

For an On Demand request, the extraction notes must be retrieved from the extraction result, as shown in this very simple extract from the DSS .Net SDK Tutorial 5 code:

if (extractionResult.Notes.Any())
{
    foreach (String note in extractionResult.Notes) Console.WriteLine(note);
}
else Console.WriteLine("Error: no extraction notes returned");

In all cases, once you have the extraction notes you should save them to file, and analyze them. This is illustrated, with a very simple analysis, in the DSS .Net SDK Tutorial 6.

Without the .Net SDK

For a scheduled request, the data, extraction notes and RIC maintenance notes are in three separate files. All you need to do is retrieve the appropriate files, as shown in the DSS REST Tutorial 10, and then save them to file and analyze them.

For an On Demand request, you must select the appropriate endpoint to receive the extraction notes, before receiving and treating them.

Migrating from using the deprecated Extract endpoint to the ExtractWithNotes endpoint is fairly easy; here are the steps to follow:

  1. Change the endpoint to use ExtractWithNotes
  2. Change the data retrieval object name
  3. Handle the extraction notes

Change the endpoint to use ExtractWithNotes

Action point: wherever you are using it, replace this endpoint:

https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/Extract

with this one:

https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ExtractWithNotes

Usage of this endpoint can be seen in the DSS REST Tutorial 2.
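If endpoint URLs are kept in configuration or constants, a small helper can make the replacement and leave the other endpoints untouched. This is only a sketch; `migrate_endpoint` is a hypothetical name, and the helper does nothing more than rewrite the URL string.

```python
DEPRECATED_SUFFIX = "/Extractions/Extract"


def migrate_endpoint(url):
    """Rewrite an Extract URL to ExtractWithNotes; return other URLs unchanged."""
    trimmed = url.rstrip("/")
    if trimmed.endswith(DEPRECATED_SUFFIX):
        return trimmed + "WithNotes"
    return url


old_url = "https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/Extract"
new_url = migrate_endpoint(old_url)
```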

Change the data retrieval object name

The response differs slightly between the two endpoints. Here is a snippet of what the Extract endpoint delivers, for a small End of Day request:

{
    "@odata.context": "https://hosted.datascopeapi.reuters.com/RestApi/v1/$metadata#Collection(ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.ExtractionRow)",
    "value": [
        {
            "IdentifierType": "Ric",
            "Identifier": "EUR=",
…
            "Trade Date": "2018-11-06"
        }
    ]
}

Here is a snippet of what the ExtractWithNotes endpoint delivers, for the same request:

{
    "@odata.context": "https://hosted.datascopeapi.reuters.com/RestApi/v1/$metadata#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.ExtractionResult",
    "Contents": [
        {
            "IdentifierType": "Ric",
            "Identifier": "EUR=",
…
            "Trade Date": "2018-11-06"
        }
    ],
    "Notes": [
        "Extraction Services Version 12.2.39800 … ",
        "Identifier,IdentType,Source,RIC,RecordDate,MaintType,OldValue,NewValue, … "
    ]
}

When migrating from Extract to ExtractWithNotes, the value object is replaced by 2 objects:

  1. Contents – Contains the same data in the same format as the value object
  2. Notes – Contains the extraction notes and the RIC maintenance report

Action point: modify your data handling code to retrieve data from the Contents object.
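As an illustration of this change in data handling code, the sketch below reads the rows from `Contents` and the notes from `Notes`, and falls back to `value` for a response from the old endpoint. The function name is hypothetical and the sample payloads are abbreviated placeholders, not complete responses.

```python
import json


def rows_and_notes(response_body):
    """Return (data rows, notes) from an On Demand JSON response body."""
    payload = json.loads(response_body)
    if "Contents" in payload:
        # ExtractWithNotes: data under "Contents", notes under "Notes".
        return payload["Contents"], payload.get("Notes", [])
    # Deprecated Extract: data under "value", no notes available.
    return payload.get("value", []), []


row = {"IdentifierType": "Ric", "Identifier": "EUR=", "Trade Date": "2018-11-06"}
with_notes_body = json.dumps({
    "Contents": [row],
    "Notes": ["Extraction Services Version 12.2.39800", "Identifier,IdentType,Source,RIC"],
})
legacy_body = json.dumps({"value": [row]})

rows, notes = rows_and_notes(with_notes_body)
legacy_rows, legacy_notes = rows_and_notes(legacy_body)
```

The fallback is only there to ease a gradual migration; once all calls use ExtractWithNotes it can be removed.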

Handle the extraction notes

Action point: add some new code to save and analyze the contents of the Notes object.

This is out of scope for this article, but you can refer to some of our tutorials and sample code that illustrate a few parts of these tasks:

Java samples (available under the downloads tab, and described here):

  • DSS2ImmediateScheduleTermsAndCondition – save and display notes
  • DSS2OnDemandEndOfDay – display notes when retrieving data (most of the Java On Demand samples display the extraction notes in a similar way)
  • DSS2OnDemandEndOfDayRaw – display notes when retrieving compressed data

Python sample (available under the downloads tab):

  • Python Symbology Conversion – display notes when retrieving data
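Alongside these samples, here is a minimal sketch of saving the contents of the Notes object to a file for later reference. The file naming scheme and the function name are assumptions made for illustration; the example writes to a temporary directory.

```python
import tempfile
from pathlib import Path


def save_notes(notes, directory, base_name):
    """Write all notes to <directory>/<base_name>.notes.txt and return the path."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{base_name}.notes.txt"
    path.write_text("\n".join(notes), encoding="utf-8")
    return path


# Abbreviated notes; in practice pass the full "Notes" array of the response.
example_notes = ["Extraction ID: 329709808", "Identifier,IdentType,Source,RIC"]
with tempfile.TemporaryDirectory() as tmp:
    saved = save_notes(example_notes, tmp, "329709808")
    saved_text = saved.read_text(encoding="utf-8")
```

Using the extraction ID in the file name makes it easy to find the notes again when contacting the DSS support team about a particular request.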

Conclusions

You should now have a better understanding of the content of extraction notes, and maybe also some ideas as to how you could leverage them to your advantage.

You are now also aware of the requirement to migrate any code using the Extract endpoint, and how to accomplish that task.

There is also a webinar on this topic; you can view it here.