Last update | January 2019 |
Interpreter | Python 2.x/3.x |
The goal of this tutorial is to demonstrate the Eikon Data API with a focus on news retrieval. For that purpose, we are going to look at new issue news from International Financial Review (IFR), a global capital markets intelligence service that is part of Refinitiv.
We will capture the PRICED or DEAL notifications, which contain structured text that we will then extract.
Before we start, make sure your Eikon environment is set up. If you have not yet done this, have a look at the Quick Start guides for this API.
Let's start by importing the Eikon Data API library and pandas:
import eikon as tr
import pandas as pd
Paste your App Key into this line:
tr.set_app_key('your_app_key')
We are going to request emerging market new issue (ISU) eurobond (EUB) news from the International Financial Review Emerging EMEA service (IFREM), focusing on notifications of already priced issues. You can replicate this request in the News Monitor app with the following query:
from datetime import date
start_date, end_date = date(2016, 1, 1), date.today()
q = "Product:IFREM AND Topic:ISU AND Topic:EUB AND (\"PRICED\" OR \"DEAL\")"
headlines = tr.get_news_headlines(query=q, date_from=start_date, date_to=end_date, count=100)
headlines.head()
| | versionCreated | text | storyId | sourceCode |
|---|---|---|---|---|
| 2017-04-13 07:11:15 | 2017-04-13 07:11:49.650 | PRICED: 4finance USD325m 5NC2 at 10.75%; Leads | urn:newsml:reuters.com:20170413:nIFR184tb5:1 | NS:IFR |
| 2017-04-12 19:58:46 | 2017-04-12 20:50:14.731 | PRICED: Saudi Arabia US$9bn 2-tranche deal | urn:newsml:reuters.com:20170412:nIFR36WjXM:1 | NS:IFR |
| 2017-04-12 19:58:09 | 2017-04-12 20:49:38.608 | PRICED: Saudi Arabia US$9bn 2-tranche deal | urn:newsml:reuters.com:20170412:nIFR1BKY60:1 | NS:IFR |
| 2017-04-11 15:03:51 | 2017-04-11 15:04:33.786 | PRICED: X5 RUB20bn 3yr at 9.25%; Leads | urn:newsml:reuters.com:20170411:nIFR3Wb3bN:1 | NS:IFR |
| 2017-04-10 20:43:18 | 2017-04-10 20:48:04.320 | PRICED: Romania E1.75bn 2-tranche deal | urn:newsml:reuters.com:20170410:nIFRS9KcK:1 | NS:IFR |
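If you want to narrow the result set further on the client side, a quick filter on the `text` column does the job. This is a sketch using a small sample frame in place of a live `headlines` result (the column names match the output above):

```python
import pandas as pd

# Sample rows mimicking the headlines frame above (a live call needs an Eikon session)
headlines = pd.DataFrame({
    'text': [
        'PRICED: 4finance USD325m 5NC2 at 10.75%; Leads',
        'DEAL: Romania E1.75bn 2-tranche deal',
        'PRICED: X5 RUB20bn 3yr at 9.25%; Leads',
    ],
    'storyId': ['urn:a', 'urn:b', 'urn:c'],
})

# Keep only the PRICED notifications
priced = headlines[headlines['text'].str.startswith('PRICED')]
print(len(priced))  # 2
```

The same pattern works on the real frame returned by `get_news_headlines`, since it exposes the same `text` column.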
In the context of news, each story has its own unique identifier, created according to the RFC 3085 standard. Here's what the story looks like; notice that we use the standard HTML() function from IPython to display it:
from IPython.core.display import HTML
html = tr.get_news_story('urn:newsml:reuters.com:20170405:nIFR5LpzRX:1')
HTML(html)
[Status]: PRICED [Asset Type]: High Yield
[Pricing Date]: 05-Apr-17 [Issuer/Borrower Type]: Corporate
[Issuer]: Petra Diamonds [Bookrunners]: Barc/RBC/BMO
[Issuer Long Name]: PETRA DIAMONDS US [Coupon]: 7.250 Fixed
TREASURY PLC [Price]: 99.9930
[Size]: USD 650M [Reoffer Price]: 99.9930
[Ratings]: B2/B+ [Yield]: 7.25
[Tenor/Mty]: 5yr 01-May-22 [Spread]: T+540.1bp
[Issue Type]: Sr Sec Notes [Price Guidance]: 7.5% area
[CUSIP/ISIN]: USG7028AAB91 / [Listed]: Irish
US71642QAB32 [Denoms]: 200k/1k
[Sector]: Materials-Mining [UOP]: Mixed
[Law]: NY [Fees]: Undisclosed
[Country]: UNITED STATES [Format]: 144A/RegS for life
[Region]: US
[Settledate]: 12-Apr-17
[NOTES]: USD600m 5NC2 snr sec second lien. RegS/144a. B2/B+. GloCos
Barc(B&D)/RBC. JBs BMO. IPTs 7.5% area, guidance 7.375% area (+/-0.125%).
Now we can parse the data using a regular expression, but first we need to convert the HTML into plain text. Let's create a function that returns a dictionary from this type of article. We will use the lxml library to convert the HTML and re to parse its output.
from lxml import html
import re

def termsheet_to_dict(storyId):
    """Fetch a story and return its [Field]: Value pairs as a dictionary."""
    x = tr.get_news_story(storyId)
    # Strip the HTML markup, keeping only the text content
    story = html.document_fromstring(x).text_content()
    # Capture every "[Field]: Value" pair from the termsheet text
    matches = dict(re.findall(pattern=r"\[(.*?)\]:\s?([A-Za-z0-9\-()+/\n\r.%&>, ]+)", string=story))
    # Trim stray whitespace from keys and values
    clean_matches = {key.strip(): item.strip() for key, item in matches.items()}
    return clean_matches
Let's test it and see if it works:
termsheet_to_dict('urn:newsml:reuters.com:20170323:nIFR9z7ZFL:1')['NOTES']
'EUR400m (from 300m+) 3yr LPN. RegS. Follows rshow. Exp nr/B+/BB.\r\nAlfa/ING/UBS(B&D). IPTs 2.75% area, guidance 2.625%/2.75% WPIR, set at 2.625% on\r\nbks closed >750m.'
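The values in the resulting dictionary are still strings; for numeric work you may want to normalize them. Below is a hypothetical `parse_size` helper (not part of the tutorial's original code) that splits a Size value like `USD 650M` into a currency code and an absolute amount, assuming the `M`/`BN` suffixes seen in the termsheets above:

```python
import re

def parse_size(size):
    """Split a Size field like 'USD 650M' into (currency, amount_in_units).

    Hypothetical helper: suffix handling covers only the M/BN forms
    observed in the IFR termsheets shown above.
    """
    m = re.match(r'([A-Z]{3})\s*([\d.]+)\s*(M|BN)?', size, re.IGNORECASE)
    if not m:
        return None
    currency = m.group(1)
    amount = float(m.group(2))
    unit = (m.group(3) or '').upper()
    multiplier = {'M': 1e6, 'BN': 1e9}.get(unit, 1)
    return currency, amount * multiplier

print(parse_size('USD 650M'))  # ('USD', 650000000.0)
print(parse_size('RUB 20BN'))  # ('RUB', 20000000000.0)
```

A helper like this could be applied to the dictionary before building the dataframe, so that size-based aggregations become possible.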
Let's extract all data for all headlines:
from time import sleep

result = []
for storyId in headlines['storyId']:
    x = termsheet_to_dict(storyId)
    if x:
        result.append(x)
    sleep(0.5)  # pause between requests; see the note below

df = pd.DataFrame(result)
df.head()
Note: the sleep is to avoid too many requests per second, which would generate an HTTP 429 error.
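The fixed sleep works, but a more robust option is to wrap each request in a retry helper with exponential backoff. Here is a generic sketch; `with_retries` is a hypothetical helper, and it catches any exception rather than assuming a specific Eikon error type:

```python
from time import sleep

def with_retries(fn, attempts=3, base_delay=1.0):
    """Call fn(), retrying with exponential backoff on failure.

    Generic sketch: a real implementation would catch the specific
    rate-limit error instead of every Exception.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts, propagate the error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...

# With a live session you could use it like this:
# story = with_retries(lambda: tr.get_news_story(storyId))
```

This keeps the happy path fast while backing off only when a request actually fails.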
| | 1st Pay | Asset Type | Bookrunners | Business | CUSIP/ISIN | Call | Country | Coupon | DBRS | Denoms | ... | Sector | Settledate | Size | Spread | Stabilis | Status | Tenor/Mty | Total | UOP | Yield |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | NaN | High Yield | Stifel | NaN | XS1597294781 / | 01-May-19 | LATVIA | 10.750 Fixed | NaN | 200k/1k | ... | Financials-Diversified | 28-Apr-17 | USD 325M | T+894 | NaN | PRICED | 5yr 01-May-22 | NaN | NaN | 10.75 |
| 1 | NaN | Islamic | NaN | NaN | NaN | NaN | SAUDI ARABIA | 3.628 Fixed | NaN | 200k/1k | ... | NaN | 20-Apr-17 | USD 4.5BN | MS+140 | NaN | PRICED | 10yr 20-Apr-27 | NaN | NaN | 3.628 Sukuk |
| 2 | NaN | Islamic | NaN | NaN | NaN | NaN | SAUDI ARABIA | 2.894 Fixed | NaN | 200k/1k | ... | NaN | 20-Apr-17 | USD 4.5BN | MS+100 | NaN | PRICED | 5yr 20-Apr-22 | NaN | NaN | 2.894 Sukuk |
| 3 | NaN | High Yield | GS/UBS/VTB | NaN | XS1598697412 Trade House PEREKRIOS... | NaN | RUSSIA | 9.250 Fixed | NaN | 10m/100k | ... | Cons Staples-Retailing Limited Liability... | 18-Apr-17 | RUB 20BN | NaN | NaN | PRICED | 3yr 18-Apr-20 | NaN | NaN | 9.25 |
| 4 | NaN | Investment Grade | Barc/Citi/Erste/ING/SG | NaN | XS1313004928 / | NaN | ROMANIA | 3.875 | NaN | 1k+1k | ... | NaN | 19-Apr-17 | EUR 750M | NaN | NaN | PRICED | 18.5yr 29-Oct-35 | 2BN | NaN | 3.55 Sr Unsec Notes |

5 rows × 42 columns
Now that we have the dataframe in place, we can run simple statistics on our data. For instance, how many of the reported issues were Investment Grade versus High Yield?
df['Asset Type'].value_counts()
Investment Grade 47
High Yield 24
Islamic 10
Covered 2
Name: Asset Type, dtype: int64
What about a specific country?
df[df['Country']=='RUSSIA']['Asset Type'].value_counts()
Investment Grade 6
High Yield 6
Name: Asset Type, dtype: int64
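The same idea extends to two-way counts, for example issues per country broken down by asset type. Here is a sketch using `pandas.crosstab`, with a few sample rows standing in for the parsed frame built above:

```python
import pandas as pd

# Sample rows standing in for the parsed termsheet dataframe (df above)
df = pd.DataFrame({
    'Country': ['RUSSIA', 'RUSSIA', 'ROMANIA', 'SAUDI ARABIA'],
    'Asset Type': ['High Yield', 'Investment Grade', 'Investment Grade', 'Islamic'],
})

# Cross-tabulate issue counts: one row per country, one column per asset type
table = pd.crosstab(df['Country'], df['Asset Type'])
print(table)
```

On the real dataframe this gives a full country-by-asset-type matrix in one call, instead of filtering country by country.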
You can experiment further by changing the original headline search query, for example, by including a RIC in your request.