Introduction

This article demonstrates the Eikon API, focusing on news retrieval in a Jupyter Notebook environment. To that end, we are going to look at new issue news from International Financing Review (IFR), a global capital markets intelligence provider that is part of Thomson Reuters.

We will capture the PRICED or DEAL notifications, which contain structured text that we can extract.

Before we start, let's make sure that:

  • Thomson Reuters Eikon Scripting Proxy is up and running;
  • Thomson Reuters Eikon API library is installed;
  • You have created an application ID for this script.

If you have not yet done this, have a look at the quick start section for this API.

A general note on Jupyter Notebook usage: to execute the code in a cell, press Shift+Enter. While the notebook is busy running your code, the cell will look like this: In [*]. When it has finished, you will see it change to the sequence number of the task and the output, if any. For example,

In [8]: df['Asset Type'].value_counts()

Out[8]: Investment Grade    47
        High Yield          24
        Islamic             10
        Covered              2
        Name: Asset Type, dtype: int64

For more information on the Jupyter Notebook, check out the Project Jupyter site http://jupyter.org or the 'How to set up a Python development environment for Thomson Reuters Eikon' tutorial on the Developer Community portal.

Getting started

Let's start by importing the Eikon API library and pandas:


import eikon as tr

import pandas as pd

Paste your application ID into this line:


tr.set_app_id('your_app_id')

We are going to request emerging market new issue (ISU) Eurobond (EUB) news from the International Financing Review Emerging EMEA service (IFREM), focusing on notifications of issues that have already priced. You can replicate this request in the News Monitor app with the following query:

  • Product:IFREM AND Topic:ISU AND Topic:EUB AND ("PRICED" OR "DEAL")

from datetime import date

start_date, end_date = date(2017, 1, 1), date.today()

q = "Product:IFREM AND Topic:ISU AND Topic:EUB"

headlines = tr.get_news_headlines(query=q, date_from=start_date, date_to=end_date, count=100)

headlines.head()
                         versionCreated           text                                               storyId                                       sourceCode
2018-01-05 11:12:25.000  2018-01-05 11:12:25.000  Slovenia sweeps to 10-year marker                  urn:newsml:reuters.com:20180105:nL8N1P01LX:2  NS:IFR
2018-01-04 15:15:33.050  2018-01-04 15:15:33.050  DEAL: Slovenia prices EUR1.5bn 1% Mar 2028 10y...  urn:newsml:reuters.com:20180104:nIFR1WtLpT:1  NS:IFR
2018-01-04 14:34:38.000  2018-01-04 14:39:30.000  Macedonia hires banks for seven-year euro benc...  urn:newsml:reuters.com:20180104:nL8N1OZ3DS:1  NS:IFR
2018-01-04 14:34:38.000  2018-01-04 14:34:38.000  MACEDONIA HIRES CITIGROUP, DEUTSCHE BANK AND E...  urn:newsml:reuters.com:20180104:nL8N1OZ3DS:1  NS:IFR
2018-01-04 14:27:25.000  2018-01-04 14:27:25.000  Slovenia 10yr allocs out                           urn:newsml:reuters.com:20180104:nL8N1OZ3BY:1  NS:IFR
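
Note that the query string passed to get_news_headlines does not contain the ("PRICED" OR "DEAL") clause, so the result also includes other IFR headlines. One way to narrow it down is to filter the returned DataFrame by headline text. Here is a minimal sketch, assuming that pricing notifications start with DEAL or PRICED, as the second row above does. The sample_headlines frame is a stand-in built from the output above, and the names is_priced and priced are only illustrative.

```python
import pandas as pd

# Stand-in for the DataFrame returned by tr.get_news_headlines
# (values copied from the output shown above)
sample_headlines = pd.DataFrame({
    'text': [
        'Slovenia sweeps to 10-year marker',
        'DEAL: Slovenia prices EUR1.5bn 1% Mar 2028 10y...',
        'Macedonia hires banks for seven-year euro benc...',
        'MACEDONIA HIRES CITIGROUP, DEUTSCHE BANK AND E...',
        'Slovenia 10yr allocs out',
    ],
    'storyId': [
        'urn:newsml:reuters.com:20180105:nL8N1P01LX:2',
        'urn:newsml:reuters.com:20180104:nIFR1WtLpT:1',
        'urn:newsml:reuters.com:20180104:nL8N1OZ3DS:1',
        'urn:newsml:reuters.com:20180104:nL8N1OZ3DS:1',
        'urn:newsml:reuters.com:20180104:nL8N1OZ3BY:1',
    ],
})

# Keep only headlines whose text starts with DEAL or PRICED (assumed convention)
is_priced = sample_headlines['text'].str.contains(r'^(?:DEAL|PRICED)\b')
priced = sample_headlines[is_priced]

print(priced['storyId'].tolist())
```

With the real data you would apply the same mask to headlines instead of sample_headlines.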

In the context of news, each story has its own unique identifier, created according to the RFC 3085 standard. Here is what a story looks like; note that I am using the HTML() function from IPython to display it in the notebook:


from IPython.display import HTML

# Retrieve the story body as an HTML string and render it in the notebook
story_html = tr.get_news_story('urn:newsml:reuters.com:20180104:nIFR1WtLpT:1')

HTML(story_html)
[Status]: PRICED                        [Asset Type]: Investment Grade
[Pricing Date]: 04-Jan-18               [Issuer/Borrower Type]: Sovereign
[Issuer]: Slovenia                      [Offering Type]: Eurobond
[Issuer Long Name]: SLOVENIA, REPUBLIC  [Bookrunners]:
OF (GOVERNMENT)                         Citi/CMZ/GS/HSBC/Jeff/Nova Ljubjanska
[Size]: EUR 1.5bn                       [Coupon]: 1.000 Fxd
[Ratings]: Baa1/A+/A-                   [Price]: 99.6540
[Tenor/Mty]: 10yr 06-Mar-28             [Reoffer Price]: 99.6540
[Issue Type]: bmk, snr unsec            [Yield]: 1.036
[CUSIP/ISIN]: SI0002103776              [Spread]: MS+13
[Law]: Slovenian                        [Price Guidance]: MS+20 area
[Country]: SLOVENIA                     [Listed]: Ljubljana
[Region]: EEMEA                         [Denoms]: 1k/1k
[Settledate]: 11-Jan-18                 [Fees]: Undisclosed
[Format]: Reg S only
[NOTES]: EUR1.5bn 10yr bmk. Baa1/A+/A-. Citi/CMZ/GS/HSBC(B&D)/Jeff/NLB. IPTs
MS+20 area, guidance +17 area, set +13 on bks closed >3.6bn (395m JLM). Long
first cpn. Vs DBR 0.5% 8/27 +59.3 @100.535 / HR 102%. . FTT 3:30pm
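
As an aside, the story identifier passed to get_news_story can itself be taken apart. As I read RFC 3085, a NewsML URN has the form urn:newsml:provider:date:item:revision. The sketch below splits an identifier on that assumption; the names NewsMLUrn and parse_story_id are hypothetical helpers, not part of the Eikon API.

```python
from collections import namedtuple

# Hypothetical container for the four variable parts of a NewsML URN
NewsMLUrn = namedtuple('NewsMLUrn', 'provider date_id item_id revision')

def parse_story_id(story_id):
    """Split 'urn:newsml:<provider>:<date>:<item>:<revision>' into its parts."""
    parts = story_id.split(':')
    # Assumed layout: exactly six colon-separated fields, starting with urn:newsml
    if len(parts) != 6 or parts[:2] != ['urn', 'newsml']:
        raise ValueError('Not a NewsML URN: {!r}'.format(story_id))
    return NewsMLUrn(*parts[2:])

urn = parse_story_id('urn:newsml:reuters.com:20180104:nIFR1WtLpT:1')
print(urn.provider, urn.date_id, urn.item_id, urn.revision)
```

This can be handy for spotting revisions of the same story: two of the headlines above share an item ID and differ only in headline text.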

Now we can parse the data using a regular expression, but first we need to convert the HTML into text. Let's create a function that returns a dictionary from this type of article. I will use the lxml library to convert the HTML and the re module to parse its output.


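from lxml import html
import re

def termsheet_to_dict(storyId):
    """Return the '[Field]: value' pairs of a term sheet story as a dictionary."""
    # Retrieve the story as HTML and strip the markup
    x = tr.get_news_story(storyId)
    story = html.document_fromstring(x).text_content()
    # Each match is a (field, value) tuple; a value ends at the first character
    # outside the class, which in practice is the "[" of the next field
    matches = dict(re.findall(pattern=r"\[(.*?)\]:\s?([A-Z,a-z,0-9,\-,\(,\),\+,/,\n,\r,\.,%,\&,>, ]+)", string=story))
    # Trim surrounding whitespace and line breaks
    clean_matches = {key.strip(): item.strip() for key, item in matches.items()}
    return clean_matches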

Let's test it and see if it works:


termsheet_to_dict('urn:newsml:reuters.com:20170323:nIFR9z7ZFL:1')['NOTES']

'EUR400m (from 300m+) 3yr LPN. RegS. Follows rshow. Exp nr/B+/BB.\r\nAlfa/ING/UBS(B&D). IPTs 2.75% area, guidance 2.625%/2.75% WPIR, set at 2.625% on\r\nbks closed >750m.'
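
To see what the regular expression does without calling the API, you can run it on a plain string. The sketch below applies the same pattern to a few lines copied from the Slovenia story above; parse_termsheet_text and sample are illustrative names, not part of the original code.

```python
import re

# Same pattern as in termsheet_to_dict
PATTERN = r"\[(.*?)\]:\s?([A-Z,a-z,0-9,\-,\(,\),\+,/,\n,\r,\.,%,\&,>, ]+)"

def parse_termsheet_text(text):
    """Extract '[Field]: value' pairs from already de-HTML-ed story text."""
    pairs = re.findall(PATTERN, text)
    return {key.strip(): value.strip() for key, value in pairs}

# A few lines of the story text shown earlier, in its two-column layout
sample = (
    "[Status]: PRICED [Asset Type]: Investment Grade\n"
    "[Issuer Long Name]: SLOVENIA, REPUBLIC [Bookrunners]:\n"
    "OF (GOVERNMENT) Citi/CMZ/GS/HSBC/Jeff/Nova Ljubjanska\n"
    "[Size]: EUR 1.5bn [Coupon]: 1.000 Fxd\n"
)

terms = parse_termsheet_text(sample)
for field, value in terms.items():
    print(field, '->', value)
```

Because the story is laid out in two text columns, the wrapped second line of the issuer's long name ends up at the start of the Bookrunners value. The same effect is visible in the Bookrunners column of the extracted data, so treat that field with some care.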

Let's extract all data for all headlines:


result = []

# Parse every story; skip those that cannot be retrieved or that contain no term sheet fields
for story_id in headlines['storyId']:
    try:
        terms = termsheet_to_dict(story_id)
    except Exception:
        continue
    if terms:
        result.append(terms)

df = pd.DataFrame(result)

df.head()
  Asset Type Bookrunners CUSIP/ISIN Call Country Coupon Denoms Fees Format Guarantor ... Reoffer Price Sector Settledate Size Spread Stabilis Status Tenor/Mty Total Yield
0 Investment Grade OF (GOVERNMENT) Citi/CM... SI0002103776 NaN SLOVENIA 1.000 Fxd 1k/1k Undisclosed Reg S only NaN ... 99.6540 NaN 11-Jan-18 EUR 1.5bn MS+13 NaN PRICED 10yr 06-Mar-28 NaN 1.036
1 Investment Grade Citi/Halyk/JPM\r\nKAZAKHSTANA AO XS1734574137 NaN KAZAKHSTAN 9.500 Fxd 50m+250k Undisclosed Reg S only NaN ... 99.6810 NaN 14-Dec-17 KZT 100bn NaN NaN PRICED 3yr 14-Dec-20 NaN 9.625
2 Investment Grade StCh/DIB/ENBD/Warba\r\n(CEIC) LTD XS1720817540 NaN UNITED ARAB EMIRATES 5.125 Fxd 200k/1 Undisclosed Reg S only Emirates REIT (CEIC) Ltd ... 100.0000 NaN 12-Dec-17 USD 400m MS+291 NaN PRICED 5yr 12-Dec-22 NaN 5.125
3 Investment Grade DB XS1731920291 , 40 day NaN LUXEMBOURG 2.125 Fxd 100k/1k Undisclosed Reg S only NaN ... 100.3230 Financials-Real Estate 06-Dec-17 EUR 225m MS+165\r\nfungw w/ XS1693959931 FCA/ICMA PRICED 7yr 04-Oct-24 825M 2.072
4 High Yield DAIWA/Miz/SMBC Nikko\r\nOF (GOVERNMENT) NaN NaN TURKEY 1.810 Fxd NaN Undisclosed Reg S only NaN ... NaN NaN 07-Dec-17 JPY 60bn PS+170 NaN PRICED 3yr 20-Dec-20 60BN NaN

5 rows × 36 columns
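
All extracted values are strings, so a field such as Size needs converting before it can be used in numeric calculations. Below is a rough sketch, assuming every size looks like the values in the table above (a three-letter currency code, a number, and an m or bn suffix for millions or billions). The helper parse_size is hypothetical and returns None for anything it does not recognise.

```python
import re

# Assumed meaning of the suffixes seen in the Size column
MULTIPLIERS = {'m': 1e6, 'bn': 1e9}

def parse_size(size):
    """Turn a string such as 'EUR 1.5bn' into ('EUR', 1500000000.0)."""
    if not isinstance(size, str):
        return None  # e.g. NaN for stories without a Size field
    match = re.fullmatch(r'([A-Z]{3})\s+(\d+(?:\.\d+)?)\s*(m|bn)', size.strip())
    if match is None:
        return None  # unexpected format, leave it to the caller
    currency, number, unit = match.groups()
    return currency, float(number) * MULTIPLIERS[unit]

# Values copied from the Size column shown above
for raw in ['EUR 1.5bn', 'KZT 100bn', 'USD 400m', 'EUR 225m', 'JPY 60bn']:
    print(raw, '->', parse_size(raw))
```

On the real data, something like df['Size'].apply(parse_size) would give one tuple per row, which you could then split into separate currency and amount columns.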

Now that we have the dataframe in place, we can compute simple statistics on our data. For instance, how many of the reported issues were Investment Grade versus High Yield?


df['Asset Type'].value_counts()

High Yield 9

Investment Grade 6

Name: Asset Type, dtype: int64

What about a specific country?


df[df['Country']=='RUSSIA']['Asset Type'].value_counts()

High Yield 1

Name: Asset Type, dtype: int64
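
Rather than filtering one country at a time, you can tabulate all countries against asset type in one go. Here is a small sketch using pandas.crosstab on a stand-in frame built from the first five rows shown earlier; the names sample and by_country are illustrative.

```python
import pandas as pd

# Stand-in for df, using the Country and Asset Type values of the first five rows
sample = pd.DataFrame({
    'Country': ['SLOVENIA', 'KAZAKHSTAN', 'UNITED ARAB EMIRATES',
                'LUXEMBOURG', 'TURKEY'],
    'Asset Type': ['Investment Grade', 'Investment Grade', 'Investment Grade',
                   'Investment Grade', 'High Yield'],
})

# Rows are countries, columns are asset types, cells are issue counts
by_country = pd.crosstab(sample['Country'], sample['Asset Type'])
print(by_country)

# Share of each asset type across all issues
print(sample['Asset Type'].value_counts(normalize=True))
```

With the real data, replace sample with df.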

Conclusion

You can experiment further by changing the original headline search query, for example by including a RIC in your request.