Introduction

This notebook builds on the previous article, "ESG Disclosures", published on the Refinitiv Developer Portal. Here we further explore the distribution of ESG disclosures and how it has evolved over time.

Refinitiv started collecting Environmental, Social and Governance (ESG) metrics for companies and building its ESG dataset in 2002. The original universe for which Refinitiv collected ESG metrics included 661 companies. At the time of writing, the ESG universe covered by Refinitiv includes over 8,000 companies, and ESG coverage keeps expanding. In this notebook we look at how the disclosure of ESG metrics has evolved over time for the companies with the longest history in the dataset, i.e. the original ESG universe of 2002.

Retrieving ESG data for the original ESG universe from 2002

As in the article referenced above, we retrieve the list of ESG fields from a CSV file in which the metrics available from the Refinitiv ESG data set are divided into several categories. The two categories we use here are "performance" and "policy" metrics. "Policy" metrics are Boolean in nature and specify whether a company has a specific policy, for instance a data privacy or a fair trade policy. "Performance" metrics mostly provide quantified or descriptive values, for example the amount of CO2 emissions, although some provide Boolean values (e.g. TR.ProfitWarnings, which states whether the company issued a profit warning during the year). This categorization into "performance" and "policy" metrics is somewhat subjective. Some metrics could be placed in either category, e.g. TR.PoisonPill, which states whether the company has adopted a poison pill (a shareholders rights plan, macaroni defense or a similar provision protecting the company against hostile takeover). I chose to categorize TR.PoisonPill as a "performance" metric. You may want to rearrange the metrics across categories according to your own perception or investment model.

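# Libraries used throughout this notebook. The `ek` alias is assumed to be the
# Eikon Data API package, with an app key already set via ek.set_app_key().
import eikon as ek
import numpy as np
import pandas as pd
import plotly.graph_objects as go
import matplotlib.pyplot as plt

# The list of companies with ESG scores in 2002
screener_exp = 'SCREEN(U(IN(Equity(active,public,primary))), TR.TRESGScore(Period=FY2002)>=0)'
instr, err = ek.get_data(screener_exp, ['TR.CommonName'])
# Ensure that RICs are in a list object
instr = instr['Instrument'].tolist()
# Remove duplicates, None and empty strings from the list
instr = list(dict.fromkeys(instr))
instr = list(filter(None, instr))
instr = list(filter(str.strip, instr))
# Request parameter selecting fiscal year 2002
param = {'Period':'FY2002'}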

# Read the field names for each category from the CSV file
esg_fields_df = pd.read_csv('ESGFieldCategories.csv')
perf_fields = esg_fields_df['Performance'].dropna().tolist()
pol_fields = esg_fields_df['Policy'].dropna().tolist()
# Performance metrics: fiscal year 2002 and latest available
df_perf_2002, err = ek.get_data(instr, perf_fields, param)
df_perf_2002.set_index('Instrument', inplace = True)
df_perf_latest, err = ek.get_data(instr, perf_fields)
df_perf_latest.set_index('Instrument', inplace = True)
# Policy metrics: fiscal year 2002 and latest available
df_pol_2002, err = ek.get_data(instr, pol_fields, param)
df_pol_2002.set_index('Instrument', inplace = True)
df_pol_latest, err = ek.get_data(instr, pol_fields)
df_pol_latest.set_index('Instrument', inplace = True)
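
To make the expected layout of the field categories file concrete, here is a minimal sketch that parses an in-memory stand-in for ESGFieldCategories.csv. The file contents are an assumption: only the 'Performance' and 'Policy' column names and the TR.ProfitWarnings and TR.PoisonPill fields come from this notebook, and the remaining field names are illustrative placeholders.

```python
import io

import pandas as pd

# Hypothetical stand-in for ESGFieldCategories.csv: one column per category.
# Columns of unequal length leave empty cells, which pandas reads as NaN.
csv_text = """Performance,Policy
TR.ProfitWarnings,TR.PolicyDataPrivacy
TR.PoisonPill,TR.PolicyFairTrade
TR.CO2EmissionTotal,
"""

esg_fields_df = pd.read_csv(io.StringIO(csv_text))
# dropna() removes the padding cells so each list holds only real field names
perf_fields = esg_fields_df['Performance'].dropna().tolist()
pol_fields = esg_fields_df['Policy'].dropna().tolist()
print(perf_fields)
print(pol_fields)
```

Moving a metric from one category to the other then only requires moving its name between the two columns of the file.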

Data logic for determining the distribution of disclosures across ESG metrics

The data logic for determining the distribution of disclosures across ESG metrics is described in the previous article on ESG disclosures referenced above. Here I plot this distribution for the year 2002 and for the latest available ESG data set, for the companies in the original ESG universe of 2002. As the chart below shows, the disclosure rate has improved significantly since 2002, yet there is still a long way to go. Interestingly, there is a handful of metrics where the disclosure rate decreased. Namely, the adoption of the poison pill has become far less prevalent, and salaries and wages from CSR reporting are also disclosed less now than in 2002.

# Count non-NA/null, non-empty observations for each Refinitiv ESG performance metric
df_plot_2002 = df_perf_2002.count() - (df_perf_2002 == '').sum()
df_plot_latest = df_perf_latest.count() - (df_perf_latest == '').sum()
# Append the number of 'True' observations for each Refinitiv ESG policy metric
# (pd.concat replaces Series.append, which was removed in pandas 2.0)
df_plot_2002 = pd.concat([df_plot_2002, (df_pol_2002 == 'True').sum()])
df_plot_latest = pd.concat([df_plot_latest, (df_pol_latest == 'True').sum()])
# Prepare non-NA/null & 'True' observations for chart plot
df_plot = pd.concat([df_plot_2002, df_plot_latest], axis=1, keys=['2002','latest'])
df_plot = 100*df_plot/len(instr)
df_plot.sort_values(by = 'latest', inplace = True, ascending = False)

fig = go.Figure([go.Bar(name = '2002', x=df_plot.index, y=df_plot['2002'].values),
                        go.Bar(name = 'latest', x=df_plot.index, y=df_plot['latest'].values)],
               go.Layout(barmode='group',
                         xaxis_tickangle=-45,
                         title='Percentage of companies in ESG universe of 2002 <br>reporting on each ESG Metric',
                         xaxis = {'tickfont_size':6},
                         yaxis = {'ticksuffix':'%'}))
fig.show()
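
The counting rule used above can be checked on a toy example. This is only a sketch: the company and metric names are made up, and it assumes, as the code above does, that a missing value comes back as either NaN or an empty string and that a policy in place comes back as the string 'True'.

```python
import numpy as np
import pandas as pd

companies = ['C1', 'C2', 'C3', 'C4']  # hypothetical instruments

# Toy performance data: NaN and '' both mean "not disclosed"
perf = pd.DataFrame(
    {'MetricA': [10.5, np.nan, '', 3.2],
     'MetricB': ['', '', np.nan, 'Yes']},
    index=companies)
# Toy policy data: only the string 'True' counts as a disclosure
pol = pd.DataFrame(
    {'PolicyX': ['True', 'False', '', 'True']},
    index=companies)

# Non-null cells minus empty strings gives the disclosures per performance metric
perf_counts = perf.count() - (perf == '').sum()
# For policy metrics, count the companies reporting 'True'
pol_counts = (pol == 'True').sum()

# Percentage of companies disclosing each metric
rate = 100 * pd.concat([perf_counts, pol_counts]) / len(companies)
print(rate)
```

Note that under this rule a policy metric reported as 'False' is not counted as a disclosure, so for policy metrics the chart shows the share of companies that have the policy.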

Now let's group the metrics into 20% buckets by disclosure rate. The share of metrics disclosed by less than 20% of companies in the original ESG universe of 2002 decreased from 75.5% in 2002 to 34.5% now, while the share of metrics disclosed by over 80% of companies increased from 8.4% in 2002 to 27.7% now.

# Break the disclosure rates into 20% brackets
df_group_2002 = df_plot['2002'].groupby(pd.cut(df_plot['2002'], 
                                               np.arange(0, 101, 20),include_lowest=True)).count()*100/len(df_plot)
df_group_latest = df_plot['latest'].groupby(pd.cut(df_plot['latest'], 
                                                   np.arange(0, 101, 20),include_lowest=True)).count()*100/len(df_plot)
fig = go.Figure([go.Bar(name = '2002', x=['<20%','20%-40%','40%-60%','60%-80%','80%-100%'], y=df_group_2002.values),
                 go.Bar(name = 'latest', x=['<20%','20%-40%','40%-60%','60%-80%','80%-100%'], y=df_group_latest.values)],
               go.Layout(barmode='group',
                        title='Percentage of metrics with the rate of disclosures within range',
                        yaxis = {'ticksuffix':'%'}))
fig.show()
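
A small worked example may help to see how the bucketing treats values on the bucket edges. The rates below are made up for illustration, and `observed=False` is passed explicitly only to keep empty buckets in the result on recent pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical disclosure rates (in %) for eight metrics
rates = pd.Series([0.0, 5.0, 20.0, 35.0, 90.0, 100.0, 60.0, 61.0])

# Bucket edges at 0, 20, 40, 60, 80, 100; include_lowest puts 0% in the first bucket
bins = np.arange(0, 101, 20)
buckets = pd.cut(rates, bins, include_lowest=True)

# Share of metrics falling into each bucket, in percent
share = rates.groupby(buckets, observed=False).count() * 100 / len(rates)
print(share)
```

Because `pd.cut` builds right-closed intervals by default, a metric disclosed by exactly 20% of companies lands in the first bucket, so the '<20%' label on the chart effectively means "up to and including 20%".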

Disclosure distribution by company

The histograms below depict the distribution of companies by the number of ESG disclosures in 2002 vs. now. We can see a very strong shift to the right in the distribution, signifying a greatly improved disclosure rate since 2002.

# Number of disclosures per company in 2002: non-empty performance metrics plus 'True' policy metrics
df_dist_perf_2002 = (df_perf_2002.count(axis=1) - (df_perf_2002 == '').sum(axis=1))
df_dist_pol_2002 = (df_pol_2002 == 'True').sum(axis=1)
df_dist_2002 = df_dist_perf_2002 + df_dist_pol_2002
# The same count for the latest available data
df_dist_perf_latest = (df_perf_latest.count(axis=1) - (df_perf_latest == '').sum(axis=1))
df_dist_pol_latest = (df_pol_latest == 'True').sum(axis=1)
df_dist_latest = df_dist_perf_latest + df_dist_pol_latest
fig = plt.figure(figsize=(16,10))
ax = plt.subplot(1,2,1)
df_dist_2002.hist(ax = ax, bins=20)
plt.title('2002', fontsize = 16)
plt.xlabel('Number of ESG disclosures', fontsize = 14)
plt.ylabel('Number of companies', fontsize = 14)
ax = plt.subplot(1,2,2)
df_dist_latest.hist(ax = ax, bins=20)
plt.title('Latest', fontsize = 16)
plt.xlabel('Number of ESG disclosures', fontsize = 14)
plt.ylabel('Number of companies', fontsize = 14)
plt.show()
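
The per-company count behind these histograms is the same rule as before, applied along rows instead of columns. Below is a minimal sketch on toy data, wrapped in a helper function named `disclosures_per_company`, which is a hypothetical name introduced here for illustration only.

```python
import numpy as np
import pandas as pd

def disclosures_per_company(perf, pol):
    """Count disclosed performance and policy metrics for each row (company)."""
    # Non-null cells minus empty strings, counted across columns
    perf_n = perf.count(axis=1) - (perf == '').sum(axis=1)
    # Policy metrics count only when reported as the string 'True'
    pol_n = (pol == 'True').sum(axis=1)
    # Addition aligns on the index, i.e. on the instrument code
    return perf_n + pol_n

perf = pd.DataFrame({'MetricA': [1.0, np.nan], 'MetricB': ['', 'x']},
                    index=['C1', 'C2'])
pol = pd.DataFrame({'PolicyX': ['True', 'True'], 'PolicyY': ['False', 'True']},
                   index=['C1', 'C2'])

counts = disclosures_per_company(perf, pol)
print(counts)
```

Since the two counts are added by index, a company missing from one of the two frames would end up with NaN. If that can happen in your data, `perf_n.add(pol_n, fill_value=0)` is one way to keep such companies in the result.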

Evolution of companies' ESG disclosures since 2002 by sector

Now let's see how the evolution of ESG disclosures since 2002 breaks down by sector. In the chart below we see that all sectors saw a very strong improvement in the ESG disclosure rate. Interestingly, there is not a lot of variation in the average number of ESG disclosures across economic sectors.

df_sector, err = ek.get_data(instr, 'TR.TRBCEconomicSector')
df_sector.set_index('Instrument', inplace = True)
df_sector = pd.concat([df_dist_2002, df_dist_latest, df_sector], axis=1)
df_sector.columns = ['2002','latest','sector']
df_sector = df_sector.groupby(['sector']).mean()
fig = go.Figure([go.Bar(name = '2002', x=df_sector.index, y=df_sector['2002'].values),
                        go.Bar(name = 'latest', x=df_sector.index, y=df_sector['latest'].values)],
               go.Layout(barmode='group',
                         xaxis_tickangle=-45,
                         title='Average number of ESG disclosures by sector'))
fig.show()
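
The sector averages are a plain group-by over the per-company counts. A sketch on toy data follows, in which the instrument codes, the counts and the name of the sector column are all made up.

```python
import pandas as pd

rics = ['A.N', 'B.N', 'C.N']  # hypothetical instruments

# Hypothetical per-company disclosure counts for 2002 and latest
dist_2002 = pd.Series([10, 20, 30], index=rics)
dist_latest = pd.Series([50, 70, 90], index=rics)
# Hypothetical sector lookup, indexed by instrument like the counts
sector = pd.DataFrame({'Sector': ['Energy', 'Energy', 'Financials']}, index=rics)

# Align the three pieces on the instrument index and label the columns
df = pd.concat([dist_2002, dist_latest, sector], axis=1)
df.columns = ['2002', 'latest', 'sector']

# Average number of disclosures per sector
avg = df.groupby('sector').mean()
print(avg)
```

A plain mean gives every company the same weight, so a sector with few companies can be swayed by a single outlier. Adding `df.groupby('sector').size()` next to the averages is a simple way to see how many companies stand behind each bar.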

Evolution of company disclosures for companies with the biggest improvement and the biggest decrease in ESG score since 2002

Finally, let's take a look at how the ESG disclosure rate evolved between 2002 and now for the 5 companies with the biggest improvement in overall ESG Score since 2002 and for the 5 companies with the biggest decrease in ESG Score. We can see from the chart below that the 5 companies with the biggest ESG Score improvement all show a strong and fairly steady trend of increasing ESG disclosures, while the 5 companies with the biggest decrease in ESG Score exhibit no such trend and show little if any increase in the number of ESG disclosures.

df_esg_score, err = ek.get_data(instr,['TR.TRESGScore(Period=FY2002)','TR.TRESGScore'])
df_esg_score['ESG Score Diff'] = df_esg_score.iloc[:,2] - df_esg_score.iloc[:,1]
esg_score_gainers = df_esg_score.nlargest(5,'ESG Score Diff')['Instrument'].tolist()
esg_score_losers = df_esg_score.nsmallest(5,'ESG Score Diff')['Instrument'].tolist()
# Remove duplicates from the combined list (note that this overwrites the full instrument list)
instr = list(set(esg_score_gainers + esg_score_losers))
df_perf, err = ek.get_data(instr, ['TR.ESGPeriodLastUpdateDate.fpa'] + perf_fields,
                                   {'Period':'FY2002;FY2019'})
df_pol, err = ek.get_data(instr, ['TR.ESGPeriodLastUpdateDate.fpa'] + pol_fields,
                                   {'Period':'FY2002;FY2019'})
df_perf['Disclosure'] = df_perf.iloc[:,2:].count(axis=1) - (df_perf.iloc[:,2:] == '').sum(axis=1)
df_pol['Disclosure'] = (df_pol.iloc[:,2:] == 'True').sum(axis=1)
df_perf = df_perf[['Instrument','Financial Period Absolute','Disclosure']]
df_perf = df_perf.set_index(['Financial Period Absolute','Instrument'])
df_pol = df_pol[['Instrument','Financial Period Absolute','Disclosure']]
df_pol = df_pol.set_index(['Financial Period Absolute','Instrument'])
df_disclosure = df_perf + df_pol
df_disclosure = df_disclosure.unstack()
df_disclosure.columns = df_disclosure.columns.get_level_values(1)
df_disclosure.replace({0:np.nan}, inplace = True)
fig = plt.figure(figsize=(16,11))
ax1 = plt.subplot(3,1,1) 
df_disclosure[esg_score_gainers].plot(ax = ax1)
plt.title('ESG disclosures for companies with biggest improvement in ESG score since 2002', fontsize=16)
plt.ylabel('Number of ESG disclosures', fontsize=14)
plt.xlabel('')
ax2 = plt.subplot(3,1,2) 
df_disclosure[esg_score_losers].plot(ax = ax2)
plt.title('ESG disclosures for companies with biggest decrease in ESG score since 2002', fontsize=16)
plt.ylabel('Number of ESG disclosures', fontsize=14)
plt.xlabel('')
plt.show()
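
The two data manipulation steps in this section, picking the biggest gainers and losers and reshaping the yearly disclosure counts into one column per company, can be sketched on toy data. All instrument codes, scores, counts and column names below are made up, and the period labels only mimic the 'FY2002' style used in the requests above.

```python
import pandas as pd

# Hypothetical ESG scores for four instruments
scores = pd.DataFrame({
    'Instrument': ['A.N', 'B.N', 'C.N', 'D.N'],
    'Score 2002': [40.0, 60.0, 55.0, 70.0],
    'Score latest': [80.0, 50.0, 65.0, 30.0]})
scores['ESG Score Diff'] = scores['Score latest'] - scores['Score 2002']

# Top and bottom n by score change (n=1 here, 5 in the notebook)
gainers = scores.nlargest(1, 'ESG Score Diff')['Instrument'].tolist()
losers = scores.nsmallest(1, 'ESG Score Diff')['Instrument'].tolist()

# Hypothetical yearly disclosure counts, indexed by (period, instrument)
idx = pd.MultiIndex.from_product(
    [['FY2002', 'FY2003'], gainers + losers],
    names=['Financial Period Absolute', 'Instrument'])
perf = pd.DataFrame({'Disclosure': [5, 3, 9, 3]}, index=idx)
pol = pd.DataFrame({'Disclosure': [2, 1, 4, 1]}, index=idx)

# Adding the frames aligns on both index levels; unstack() then moves the
# instrument level into columns, giving one time series per company
wide = (perf + pol).unstack()
wide.columns = wide.columns.get_level_values(1)
print(gainers, losers)
print(wide)
```

With one row per period and one column per company, calling `wide[gainers].plot()` draws one line per company, which is the shape the two charts above rely on.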