
New issues from the news in Python

Evgeny Kovalyov
Product Manager, Framework Services

Introduction

This article demonstrates the Eikon API, with a focus on news retrieval in a Jupyter Notebook environment. As an example, we will look at new issue news from International Financing Review (IFR), a global capital markets intelligence provider that is part of Thomson Reuters.

We will capture the PRICED or DEAL notifications, which contain structured text, and extract their fields.

Before we start, let's make sure that:

  • Thomson Reuters Eikon Scripting Proxy is up and running;
  • Thomson Reuters Eikon API library is installed;
  • You have created an application ID for this script.

If you have not yet done this, have a look at the quick start section for this API.
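
The second prerequisite in the list above can be checked from Python before running the rest of the notebook. This is a minimal sketch that only tests whether a package can be imported; `is_installed` is a hypothetical helper name, and the check says nothing about whether the Scripting Proxy is running or the application ID is valid.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module of this name can be imported."""
    return importlib.util.find_spec(module_name) is not None


# "eikon" is the module name used by the import later in this article.
if not is_installed("eikon"):
    print("The eikon package was not found; install it before continuing.")
```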

A general note on Jupyter Notebook usage: to execute the code in a cell, press Shift+Enter. While the notebook is busy running your code, the cell will look like this: In [*]. When it's finished, the marker changes to the sequence number of the task, followed by the output, if any. For example:

In [8]: df['Asset Type'].value_counts()
Out[8]: Investment Grade    47
        High Yield          24
        Islamic             10
        Covered              2
        Name: Asset Type, dtype: int64

For more information on the Jupyter Notebook, see the Project Jupyter site at http://jupyter.org or the 'How to set up a Python development environment for Thomson Reuters Eikon' tutorial on the Developer Community portal.

 

Getting started

Let's start by importing the Eikon API library and pandas:

    	
            

import eikon as tr
import pandas as pd

Paste your application ID into this line:

    	
            
tr.set_app_id('your_app_id')

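We are going to request emerging market new issue (ISU) Eurobond (EUB) news from the International Financing Review Emerging EMEA service (IFREM), focusing on notifications of issues that have already priced. Note that the code below requests all news matching the product and topic codes; stories from which no fields can be extracted are skipped later. You can replicate the narrower request in the News Monitor app with the following query: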

  • Product:IFREM AND Topic:ISU AND Topic:EUB AND ("PRICED" OR "DEAL")
    	
            

from datetime import date

start_date, end_date = date(2017, 1, 1), date.today()
q = "Product:IFREM AND Topic:ISU AND Topic:EUB"
headlines = tr.get_news_headlines(query=q, date_from=start_date,
                                  date_to=end_date, count=100)
headlines.head()

                         versionCreated           text                                               storyId                                       sourceCode
2018-01-05 11:12:25.000  2018-01-05 11:12:25.000  Slovenia sweeps to 10-year marker                  urn:newsml:reuters.com:20180105:nL8N1P01LX:2  NS:IFR
2018-01-04 15:15:33.050  2018-01-04 15:15:33.050  DEAL: Slovenia prices EUR1.5bn 1% Mar 2028 10y...  urn:newsml:reuters.com:20180104:nIFR1WtLpT:1  NS:IFR
2018-01-04 14:34:38.000  2018-01-04 14:39:30.000  Macedonia hires banks for seven-year euro benc...  urn:newsml:reuters.com:20180104:nL8N1OZ3DS:1  NS:IFR
2018-01-04 14:34:38.000  2018-01-04 14:34:38.000  MACEDONIA HIRES CITIGROUP, DEUTSCHE BANK AND E...  urn:newsml:reuters.com:20180104:nL8N1OZ3DS:1  NS:IFR
2018-01-04 14:27:25.000  2018-01-04 14:27:25.000  Slovenia 10yr allocs out                           urn:newsml:reuters.com:20180104:nL8N1OZ3BY:1  NS:IFR
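
The request above returns every new issue story, not only the PRICED or DEAL notifications. One way to narrow the list locally is to test each headline text for those keywords. The sketch below is illustrative only: `is_pricing_notification` is a hypothetical helper, the sample list copies the headline texts shown above (truncated ones are kept exactly as displayed), and matching whole upper-case words is an assumption about how the notifications are labelled.

```python
import re

# Headline texts copied from the sample output; "..." marks display truncation.
sample_headlines = [
    "Slovenia sweeps to 10-year marker",
    "DEAL: Slovenia prices EUR1.5bn 1% Mar 2028 10y...",
    "Macedonia hires banks for seven-year euro benc...",
    "MACEDONIA HIRES CITIGROUP, DEUTSCHE BANK AND E...",
    "Slovenia 10yr allocs out",
]

# Match PRICED or DEAL only as whole, upper-case words.
PRICED_OR_DEAL = re.compile(r"\b(?:PRICED|DEAL)\b")


def is_pricing_notification(text):
    """Return True if the headline text contains PRICED or DEAL."""
    return PRICED_OR_DEAL.search(text) is not None


priced = [h for h in sample_headlines if is_pricing_notification(h)]
```

With the DataFrame returned by the request, the same predicate could be applied as `headlines[headlines['text'].apply(is_pricing_notification)]`.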

In the context of news, each story has its own unique identifier, created according to the RFC 3085 standard. Here's what a story looks like; note that I am using the standard HTML() function from the Notebook to display it:

    	
            

from IPython.display import HTML

html = tr.get_news_story('urn:newsml:reuters.com:20180104:nIFR1WtLpT:1')
HTML(html)

[Status]: PRICED                        [Asset Type]: Investment Grade
[Pricing Date]: 04-Jan-18               [Issuer/Borrower Type]: Sovereign
[Issuer]: Slovenia                      [Offering Type]: Eurobond
[Issuer Long Name]: SLOVENIA, REPUBLIC  [Bookrunners]:
OF (GOVERNMENT)                         Citi/CMZ/GS/HSBC/Jeff/Nova Ljubjanska
[Size]: EUR 1.5bn                       [Coupon]: 1.000 Fxd
[Ratings]: Baa1/A+/A-                   [Price]: 99.6540
[Tenor/Mty]: 10yr 06-Mar-28             [Reoffer Price]: 99.6540
[Issue Type]: bmk, snr unsec            [Yield]: 1.036
[CUSIP/ISIN]: SI0002103776              [Spread]: MS+13
[Law]: Slovenian                        [Price Guidance]: MS+20 area
[Country]: SLOVENIA                     [Listed]: Ljubljana
[Region]: EEMEA                         [Denoms]: 1k/1k
[Settledate]: 11-Jan-18                 [Fees]: Undisclosed
[Format]: Reg S only
[NOTES]: EUR1.5bn 10yr bmk. Baa1/A+/A-. Citi/CMZ/GS/HSBC(B&D)/Jeff/NLB. IPTs
MS+20 area, guidance +17 area, set +13 on bks closed >3.6bn (395m JLM). Long
first cpn. Vs DBR 0.5% 8/27 +59.3 @100.535 / HR 102%. . FTT 3:30pm
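
As an aside, the story identifier passed to get_news_story above can be split into its components. The sketch below assumes the colon-separated layout described in RFC 3085 (provider, date, item identifier, revision); `parse_story_id` and `StoryUrn` are hypothetical names introduced here and are not part of the Eikon API.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical container for the parts of a NewsML URN.
StoryUrn = namedtuple("StoryUrn", "provider date item_id revision")


def parse_story_id(story_id):
    """Split 'urn:newsml:<provider>:<yyyymmdd>:<item>:<revision>' into parts."""
    parts = story_id.split(":")
    if len(parts) != 6 or parts[:2] != ["urn", "newsml"]:
        raise ValueError("not a NewsML URN: %r" % story_id)
    provider, date_id, item_id, revision = parts[2:]
    story_date = datetime.strptime(date_id, "%Y%m%d").date()
    return StoryUrn(provider, story_date, item_id, revision)


story = parse_story_id("urn:newsml:reuters.com:20180104:nIFR1WtLpT:1")
```

The date component makes it possible to group or filter stories by day without downloading them.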

Now we can parse the data using a regular expression, but first we need to convert the HTML into text. Let's create a function that returns a dictionary from this type of article. I will use the lxml library to convert the HTML and re to parse its output.
 

    	
            

from lxml import html
import re

def termsheet_to_dict(storyId):
    # Download the story as HTML and strip the markup
    x = tr.get_news_story(storyId)
    story = html.document_fromstring(x).text_content()
    # Capture every "[Key]: value" pair found in the plain text
    matches = dict(re.findall(pattern=r"\[(.*?)\]:\s?([A-Z,a-z,0-9,\-,\(,\),\+,/,\n,\r,\.,%,\&,>, ]+)",
                              string=story))
    # Trim surrounding whitespace from keys and values
    clean_matches = {key.strip(): item.strip() for key, item in matches.items()}
    return clean_matches
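
To see what the pattern does without calling the API, the parsing step can be run on a plain string. The sketch below reuses the pattern from termsheet_to_dict unchanged; `parse_termsheet_text` is a hypothetical helper, and the sample text is an assumed reconstruction of three lines of the story shown earlier, so the real whitespace may differ.

```python
import re

# Same pattern as in termsheet_to_dict.
FIELD_PATTERN = r"\[(.*?)\]:\s?([A-Z,a-z,0-9,\-,\(,\),\+,/,\n,\r,\.,%,\&,>, ]+)"


def parse_termsheet_text(text):
    """Extract '[Key]: value' pairs from plain text, without any network call."""
    matches = dict(re.findall(FIELD_PATTERN, text))
    return {key.strip(): value.strip() for key, value in matches.items()}


# Assumed plain-text layout of part of the two-column termsheet.
sample = (
    "[Issuer Long Name]: SLOVENIA, REPUBLIC [Bookrunners]:\n"
    "OF (GOVERNMENT) Citi/CMZ/GS/HSBC/Jeff/Nova Ljubjanska\n"
    "[Size]: EUR 1.5bn [Coupon]: 1.000 Fxd\n"
)
fields = parse_termsheet_text(sample)
```

Because a value runs until the next opening bracket, the wrapped second line of the issuer name ends up in the Bookrunners field. Under the assumed layout, this is the likely reason some Bookrunners values in the results contain fragments such as OF (GOVERNMENT).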

Let's test it and see if it works:

    	
            
termsheet_to_dict('urn:newsml:reuters.com:20170323:nIFR9z7ZFL:1')['NOTES']

'EUR400m (from 300m+) 3yr LPN. RegS. Follows rshow. Exp nr/B+/BB.\r\nAlfa/ING/UBS(B&D). IPTs 2.75% area, guidance 2.625%/2.75% WPIR, set at 2.625% on\r\nbks closed >750m.'
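
The extracted value keeps the line breaks of the original story as \r\n sequences. If single-line values are preferred, they can be normalized after extraction. A minimal sketch, where `normalize_whitespace` is a hypothetical helper and not part of the function above:

```python
import re


def normalize_whitespace(value):
    """Collapse every run of whitespace, including line breaks, into one space."""
    return re.sub(r"\s+", " ", value).strip()


# The NOTES value returned by the test call above.
notes = (
    "EUR400m (from 300m+) 3yr LPN. RegS. Follows rshow. Exp nr/B+/BB.\r\n"
    "Alfa/ING/UBS(B&D). IPTs 2.75% area, guidance 2.625%/2.75% WPIR, set at 2.625% on\r\n"
    "bks closed >750m."
)
clean_notes = normalize_whitespace(notes)
```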

Let's extract all data for all headlines:

    	
            

result = []

for storyId in headlines['storyId']:
    try:
        x = termsheet_to_dict(storyId)
    except Exception:
        # Skip stories that cannot be retrieved or parsed
        continue
    # Keep only stories from which at least one field was extracted
    if x:
        result.append(x)

df = pd.DataFrame(result)
df.head()

 

   Asset Type        Bookrunners                              CUSIP/ISIN             Call  Country
0  Investment Grade  OF (GOVERNMENT) Citi/CM...               SI0002103776           NaN   SLOVENIA
1  Investment Grade  Citi/Halyk/JPM\r\nKAZAKHSTANA AO         XS1734574137           NaN   KAZAKHSTAN
2  Investment Grade  StCh/DIB/ENBD/Warba\r\n(CEIC) LTD        XS1720817540           NaN   UNITED ARAB EMIRATES
3  Investment Grade  DB                                       XS1731920291 , 40 day  NaN   LUXEMBOURG
4  High Yield        DAIWA/Miz/SMBC Nikko\r\nOF (GOVERNMENT)  NaN                    NaN   TURKEY

   Coupon     Denoms    Fees         Format      Guarantor                 ...
0  1.000 Fxd  1k/1k     Undisclosed  Reg S only  NaN                       ...
1  9.500 Fxd  50m+250k  Undisclosed  Reg S only  NaN                       ...
2  5.125 Fxd  200k/1    Undisclosed  Reg S only  Emirates REIT (CEIC) Ltd  ...
3  2.125 Fxd  100k/1k   Undisclosed  Reg S only  NaN                       ...
4  1.810 Fxd  NaN       Undisclosed  Reg S only  NaN                       ...

   Reoffer Price  Sector                  Settledate  Size       Spread
0  99.654         NaN                     11-Jan-18   EUR 1.5bn  MS+13
1  99.681         NaN                     14-Dec-17   KZT 100bn  NaN
2  100            NaN                     12-Dec-17   USD 400m   MS+291
3  100.323        Financials-Real Estate  06-Dec-17   EUR 225m   MS+165\r\nfungw w/ XS1693959931
4  NaN            NaN                     07-Dec-17   JPY 60bn   PS+170

   Stabilis  Status  Tenor/Mty       Total  Yield
0  NaN       PRICED  10yr 06-Mar-28  NaN    1.036
1  NaN       PRICED  3yr 14-Dec-20   NaN    9.625
2  NaN       PRICED  5yr 12-Dec-22   NaN    5.125
3  FCA/ICMA  PRICED  7yr 04-Oct-24   825M   2.072
4  NaN       PRICED  3yr 20-Dec-20   60BN   NaN

5 rows × 36 columns
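
All extracted values are strings, so numeric analysis needs a conversion step first. As an illustration, the Size column could be split into a currency code and an amount. This is a sketch under stated assumptions: `parse_size` is a hypothetical helper, only the m and bn suffixes seen in the rows above are handled (read here as millions and billions), and anything else returns None.

```python
import re

# Suffixes seen in the sample rows; treated as millions and billions.
MULTIPLIERS = {"m": 1e6, "bn": 1e9}
SIZE_PATTERN = re.compile(r"^([A-Z]{3})\s+(\d+(?:\.\d+)?)(m|bn)$")


def parse_size(size):
    """Return (currency, amount) for strings like 'EUR 1.5bn', else None."""
    if not isinstance(size, str):
        return None  # missing values such as NaN are not strings
    match = SIZE_PATTERN.match(size.strip())
    if match is None:
        return None
    currency, number, suffix = match.groups()
    return currency, float(number) * MULTIPLIERS[suffix]


sizes = [parse_size(s) for s in ["EUR 1.5bn", "KZT 100bn", "USD 400m"]]
```

On the DataFrame, the helper could be applied with `df['Size'].apply(parse_size)`.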

Now that we have the DataFrame in place, we can compute simple statistics on our data. For instance, how many of the reported issues were Investment Grade versus High Yield?

    	
            
df['Asset Type'].value_counts()

High Yield          9
Investment Grade    6
Name: Asset Type, dtype: int64

What about a specific country?

    	
            
df[df['Country']=='RUSSIA']['Asset Type'].value_counts()

High Yield 1
Name: Asset Type, dtype: int64

Conclusion

You can experiment further by changing the original headline search query, for example, by including a RIC in your request.
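
One way to make such experiments easier is to assemble the query string from its parts. The sketch below is hypothetical: `build_news_query` is not part of the Eikon API, the `R:` prefix for a RIC is an assumption about the news query syntax that should be checked against the News Monitor app, and the RIC value in the test call is a placeholder.

```python
def build_news_query(product, topics, keywords=(), ric=None):
    """Join product, topic, keyword and optional RIC filters with AND."""
    parts = ["Product:" + product]
    parts += ["Topic:" + topic for topic in topics]
    if keywords:
        quoted = ['"%s"' % keyword for keyword in keywords]
        parts.append("(" + " OR ".join(quoted) + ")")
    if ric:
        # Assumed syntax for restricting the search to one instrument.
        parts.append("R:" + ric)
    return " AND ".join(parts)


# Reproduces the News Monitor query given earlier in the article.
q = build_news_query("IFREM", ["ISU", "EUB"], keywords=["PRICED", "DEAL"])
```

The resulting string can be passed as the query argument of get_news_headlines in place of the hard-coded one.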