
Gathering aggregated ESG data on companies

Jonathan Legrand
Developer Advocate

This article looks at building a Python framework to aid in investigating aggregated Environmental, Social and Governance (ESG) data on companies.

It outlines how one may use Refinitiv's Python Data Application Programming Interface (Python DAPI) to gather ESG data on all companies included in the MSCI World Index, and then create graphs to uncover the insight behind the data.

The class built below can be used for other data types, not just ESG data; but it was built for this purpose and only tested as such. It is therefore named ' ESG_Boolean_Data '.

Binary/Boolean Data:

In this article, we will exemplify how one may gather data on

  • Environmental Controversies (AnalyticEnvControv)
  • Material Sourcing (EnvMaterialsSourcing)
  • Environmental Supply Chain Management (EnvSupplyChainMgt)
  • Policy Sustainable Packaging (PolicySustainablePackaging)

Get to the Coding

Import libraries

First, we can use the library ' platform ' to show which version of Python we are using:

# The ' from ... import ' structure here allows us to only import the module ' python_version ' from the library ' platform ':

from platform import python_version

print("This code runs on Python version " + python_version())

This code runs on Python version 3.7.7

We use Refinitiv's Eikon Python Application Programming Interface (API) to access financial data. We can access it via the Python library "eikon" that can be installed simply by using pip install.

import eikon as ek

 

# The key is placed in a text file so that it can be used in this code without exposing the key itself:

eikon_key = open("eikon.txt","r")

ek.set_app_key(str(eikon_key.read()))

# It is best to close the files we opened in order to make sure that we don't stop any other services/programs from accessing them if they need to:

eikon_key.close()

The following are Python built-in modules/libraries, and therefore do not have version numbers separate from the Python version itself.

# datetime will allow us to manipulate Western World dates

import datetime

 

# dateutil will allow us to manipulate dates in equations

import dateutil
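For instance, ' dateutil ' makes calendar arithmetic straightforward. The toy example below (our own illustration, with an arbitrary anchor date) steps a date back 19 years, mirroring the default ' EDate = '-19' ' window used later in this article:

```python
import datetime
from dateutil.relativedelta import relativedelta

# Arbitrary anchor date, for illustration only:
anchor = datetime.date(2020, 7, 1)

# Step back 19 years with a calendar-aware delta:
nineteen_years_back = anchor - relativedelta(years=19)
# nineteen_years_back is datetime.date(2001, 7, 1)
```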

The Python library 'os' will allow us to retrieve the path of our running script

import os

The Python library 'warnings' will allow us to remove library deprecation warnings

import warnings

# warnings.filterwarnings("ignore")

numpy is needed for datasets' statistical and mathematical manipulations

import numpy

print("The numpy library imported in this code is version: " + numpy.__version__)

The numpy library imported in this code is version: 1.18.5

pandas will be needed to manipulate data sets

import pandas

# This line will ensure that all columns of our dataframes are always shown:

pandas.set_option('display.max_columns', None)

print("The pandas library imported in this code is version: " + pandas.__version__)

The pandas library imported in this code is version: 1.0.5

openpyxl and xlsxwriter will be useful in manipulating Excel workbooks

import openpyxl

import xlsxwriter

print("The openpyxl library imported in this code is version: " + openpyxl.__version__)

print("The xlsxwriter library imported in this code is version: " + xlsxwriter.__version__)

The openpyxl library imported in this code is version: 3.0.3
The xlsxwriter library imported in this code is version: 1.2.9

The libraries below are needed to plot graphs of all kinds

import plotly

import plotly.express

from plotly.graph_objs import *

from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot

 

init_notebook_mode(connected = False)

 

import chart_studio

chart_studio.tools.set_config_file(

    plotly_domain='https://chart-studio.plotly.com',

    plotly_api_domain='https://plotly.com')

 

for i,j in zip(["plotly"], [plotly]):

    print("The " + str(i) + " library imported in this code is version: " + j.__version__)

The plotly library imported in this code is version: 4.7.1

Coding

Boolean data:

Creating a Python Class

To make this as useful as possible (and to build a more intuitive session for users of our functionalities), the class below allows anyone to pick and choose the functions and graphs they want, as well as to save an Excel workbook with the produced and 'cleaned' data.
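Before walking through the class itself, the core aggregation it performs can be previewed on toy data: counting 'True' flags per static field element (here: country) per year. All names and values in this sketch are made up for illustration only:

```python
import pandas as pd

# Toy data-frame mimicking the shape of the Refinitiv output used in this article:
toy = pd.DataFrame({
    "Country": ["France", "France", "UK", "UK"],
    "Years":   [2018, 2019, 2018, 2019],
    "Flag":    ["True", "False", "True", "True"],
})

# Count the "True" flags per country per year using combined boolean masks:
counts = {
    year: [len(toy[(toy["Country"] == c) & (toy["Years"] == year) & (toy["Flag"] == "True")])
           for c in ["France", "UK"]]
    for year in [2018, 2019]
}
# counts == {2018: [1, 1], 2019: [0, 1]}
```

The class's ' Df_statics_abs ' function applies this same counting logic, producing one column per year.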

class ESG_Boolean_Data():

    """ ESG_Boolean_Data Python Class Version 1.0: Not for CodeBook use. An example of this article is uploaded in CodeBook.

    This Python class allows its user to aggregate ESG Boolean Data (i.e.: Data that is 'False' or 'True') for companies within Refinitiv's data-base on a per-static field (be it country, economic sector, ...) basis.

    It (i.e.: the ' __init__ ' of ' ESG_Boolean_Data ') takes the following arguments:

    

    Companies (list of strings): list of the instrument RICs (Refinitiv Instrument Codes) defining the companies to look at.

    Defaulted to: ['0#.MIWO00000PUS']. 'MIWO00000PUS' is the Refinitiv Instrument Code (RIC) for the MSCI World Index in USD.

 

    Fields (list of strings): list of the fields wanted from Refinitiv in our informatics.

    Defaulted to: ['TR.AnalyticEnvControv', 'TR.EnvMaterialsSourcing', 'TR.EnvSupplyChainMgt', 'TR.PolicySustainablePackaging'].

    

    Static_fields (list of strings): list of the static fields wanted from Refinitiv in our informatics.

    Defaulted to: ['TR.HeadquartersCountry', 'TR.TRBCEconomicSector']

    

    SDate, EDate, Period, and Frq are string variables defining specifications of the data being retrieved from Refinitiv.

    Defaulted to: SDate = '0', EDate = '-19', Period = 'FY0', and Frq = 'FY'.

    

    save_excel (boolean): if True, it will save an Excel workbook named 'ESG_Boolean_Data.xlsx' in the location where this script is being run filled with a sheet for each data-set returned.

    Defaulted to: save_excel = False.

    

    

    Attributes:

    self.companies: returns a list of the companies asked for.

    self.fields: returns a list of the fields asked for.

    self.SDate: returns the Start Date (SDate) asked for.

    self.EDate: returns the End Date (EDate) asked for.

    self.period: returns the period asked for.

    self.frq: returns the frequency asked for.

    

    self.Vectorise_df: returns a Python function that 'vectorises' pandas data-frames from a format where rows are elements of the static field and columns are years. It takes the following arguments:

    _df_statics (Pandas data-frame): table of our data.

    _static_name (string): name of the static fields.

    _field_name (string): name of the field of interest as shown in '_df_statics'.

    

    self.Df_statics_percent: returns a Python function that constructs a pandas data-frame of the number of 'True' data points per year (columns) per static field element (e.g.: each country in our data-set) as a percentage of all data-points in the data-set. It takes the following arguments:

    _statics (list of strings): list of each element of our static fields (e.g.: the name of each country in our data-set if our static field is 'TR.HeadquartersCountry').

    _years (list of integers): list of the years in our data-set.

    _static_name (string): name of the static field used.

    _df (Pandas data-frame): our table of data in its raw form.

    _field_name (string): name of the metric's field asked for.

    

    self.Df_statics_abs: returns a Python function that constructs a pandas data-frame of the number of 'True' data points per year (columns) per static field element (e.g.: each country in our data-set). It takes the following arguments:

    _statics (list of strings): list of each element of our static fields (e.g.: the name of each country in our data-set if our static field is 'TR.HeadquartersCountry').

    _years (list of integers): list of the years in our data-set.

    _static_name (string): name of the static field used.

    _df (Pandas data-frame): our table of data in its raw form.

    _field_name (string): name of the metric's field asked for.

    

    ESG_Boolean_Data().get_data() collects data via Refinitiv's eikon API and 'completes' it (filling in static fields).

    

    ESG_Boolean_Data().get_data().Aggregate_Graph() aggregates the data on a per-country basis for each year and returns a plotly line graph.

    

    ESG_Boolean_Data().Aggregate_Graph_with_Tot() returns plots that (can) include the total number of datapoints in the data-set for in-chart reference.

    

    ESG_Boolean_Data().Aggregate_Map_Graph() returns a map representing the asked datapoints per year (with a time slider).

    

    

    """

 

    def __init__(self,

                 Companies = ['0#.MIWO00000PUS'],

                 Fields = ['TR.AnalyticEnvControv',

                           'TR.EnvMaterialsSourcing',

                           'TR.EnvSupplyChainMgt',

                           'TR.PolicySustainablePackaging'],

                 Static_fields = ['TR.HeadquartersCountry',

                                  'TR.TRBCEconomicSector'],

                 SDate = '0', EDate = '-19', Period = 'FY0', Frq = 'FY',

                 save_excel = False,

                 Value = "True",

                 Antivalue = "False"):

        

        

        

        def Vectorise_df(_df_statics, _static_name, _field_name):

            ''' Vectorise_df:

            ESG_Boolean_Data.Vectorise_df is a function that takes a Pandas data-frame that has static fields (e.g.: country name)

            as rows and years as columns, and transforms it to a 'vectorised' Pandas data-frame with the 1st column showing the static

            field name (e.g.: Country of Headquarters), its 2nd column displays years and the last one the field in question

            (e.g.: Environmental Controversies per Country of Headquarters). It returns the df_vectorised Pandas data-frame.

            Variables include:

            

            _df_statics (Pandas data-frame): original Pandas data-frame that has static fields (e.g.: country name) as rows and years as columns

            

            _static_name (string): name of the static field categorising through (e.g.: "Country of Headquarters")

            

            _field_name (string): name of the field of interest (e.g.: "Environmental Controversies")

            '''

            

            # Create an empty Pandas data-frame to be populated subsequently

            df_vectorised = pandas.DataFrame()

            

            # Create empty lists to be populated subsequently

            lis_stat, lis_year, lis1 = [], [], []

            

            # Iterate through the range from 0 to the 'height' of our ' _df_statics ' Pandas data-frame (i.e.: number of rows/indices)

            for i in range(len(_df_statics.index)):

                

                # Iterate through the range from 0 to the length of our ' _df_statics ' Pandas data-frame (i.e.: number of columns)

                for j in range(len(_df_statics.columns)):

                    

                    ## Append our previously created empty lists

                    # Append ' lis_stat ' with each index of our ' _df_statics ' Pandas data-frame

                    lis_stat.append(_df_statics.index[i])

                    # Append ' lis_year ' with each year in ' _df_statics '

                    #     The reason for the ' [1] ' here is that ' _df_statics ' has MultiIndex-ed columns,

                    #     we thus have to specify that we want to select the second set of columns

                    lis_year.append(_df_statics.columns[j][1])                    

                    # Append ' lis1 ' with the actual data we are interested in: the data in ' _df_statics '

                    lis1.append(_df_statics.iloc[i,j])

            # Add the three created lists to our Pandas data-frame ' df_vectorised '

            df_vectorised[_static_name] = lis_stat

            df_vectorised["year"] = lis_year

            df_vectorised[str(_field_name + " per " + _static_name)] = lis1

            

            return df_vectorised

        

        

        

        def Df_statics_percent(_statics, _years, _static_name, _df, _field_name, Value, Antivalue):

            '''Df_statics_percent:

            Returns a pandas data-frame of the number of 'True' (or whatever was chosen as 'Value') data points per year (columns) per static field element

            (e.g.: each country in our data-set) as a ratio/percentage. It takes the following arguments:

            

            _statics (list of strings): list of each element of our static fields (e.g.: the name of each country in our data-set if our static field is 'TR.HeadquartersCountry').

            _years (list of integers): list of the years in our data-set.

            _static_name (string): name of the static field used.

            _df (Pandas data-frame): our table of data in its raw form.

            _field_name (string): name of the metric's field asked for (e.g.: "Country of Headquarters")

            '''

            # Create an empty Pandas data-frame to populate subsequently

            df_statics = pandas.DataFrame(index = _statics)

            

            # Iterate through the years in ' _year '

            for year in _years:

                # Create a list ' lis ' with entries counting each "True" value (if the variable ' Value ' was set to "True")

                #     AS A PERCENTAGE / ratio of all recorded values in _df where its static field (e.g.: Country)

                #     is iterated through (e.g.: from France to the U.K.):

                lis = [len(_df[(_df[_static_name] == s) & (_df["Years"] == year) & (_df[_field_name] == Value)]) /
                       max(1, len(_df[(_df[_static_name] == s) & (_df["Years"] == year) & (_df[_field_name] == Value)]) +
                              len(_df[(_df[_static_name] == s) & (_df["Years"] == year) & (_df[_field_name] == Antivalue)]))
                       for s in _statics]

                # Add our newly created list to our data-frame ' df_statics ' under the column name of its year:

                df_statics[str(year)] = lis

            

            # In order to differentiate between each table/data-frame returned, the below creates a two-tier column label to do just that

            columns = pandas.MultiIndex.from_product([[str("Aggregate data on a per " + str(_static_name) + 

                                                           " basis for " + _field_name)],

                                                      list(df_statics.columns)],

                                                     names = ['', 'Year'])

            

            # Finally: add everything to a Pandas data-frame

            df_statics = pandas.DataFrame(df_statics.values, index = df_statics.index, columns = columns)

            

            return df_statics

        

        

        def Df_statics_abs(_statics, _years, _static_name, _df, _field_name, Value):

            '''Df_statics_abs:

            Returns a pandas data-frame of the number of 'True' (or whatever was chosen as 'Value') data points per year (columns) per static field element

            (e.g.: each country in our data-set). It is not in percent, it is in 'absolute', thus the 'abs'.

            It returns the Pandas data-frame. It takes the following arguments:

            

            _statics (list of strings): list of each element of our static fields (e.g.: the name of each country in our data-set if our static field is 'TR.HeadquartersCountry').

            _years (list of integers): list of the years in our data-set.

            _static_name (string): name of the static field used.

            _df (Pandas data-frame): our table of data in its raw form.

            _field_name (string): name of the metric's field asked for (e.g.: "Country of Headquarters")

            '''

            # Create an empty Pandas data-frame to populate subsequently

            df_statics = pandas.DataFrame(index = _statics)

            

            # Iterate through the years in ' _year '

            for year in _years:

                # Create a list ' lis ' with entries counting each "True" value (if the variable ' Value ' was set to "True")

                #     in _df where its static field (e.g.: Country) is iterated through (e.g.: from France to the U.K.):

                lis = [len(_df[(_df[_static_name] == s) & (_df["Years"] == year) & (_df[_field_name] == Value)])
                       for s in _statics]

                # Add our newly created list to our data-frame ' df_statics ' under the column name of its year:

                df_statics[str(year)] = lis

            

            # In order to differentiate between each table/data-frame returned, the below creates a two-tier column label to do just that

            columns = pandas.MultiIndex.from_product([[str("Aggregate data on a per " + str(_static_name) +

                                                           " basis for " + _field_name)], list(df_statics.columns)],

                                                     names = ['', 'Year'])

            

            # Finally: add everything to a Pandas data-frame

            df_statics = pandas.DataFrame(df_statics.values, index = df_statics.index, columns = columns)

            

            return df_statics

        

        

        

        ## Set all attributes:

        self.companies = Companies

        self.fields = Fields

        self.SDate = SDate

        self.EDate = EDate

        self.period = Period

        self.frq = Frq

        self.Vectorise_df = Vectorise_df

        self.Df_statics_percent = Df_statics_percent

        self.Df_statics_abs = Df_statics_abs

        self.Value = Value

        self.Antivalue = Antivalue

        

        

    def get_data(self,

                 Companies = ['0#.MIWO00000PUS'],

                 Fields = ['TR.AnalyticEnvControv',

                           'TR.EnvMaterialsSourcing',

                           'TR.EnvSupplyChainMgt',

                           'TR.PolicySustainablePackaging'],

                 Static_fields = ['TR.HeadquartersCountry',

                                  'TR.TRBCEconomicSector'],

                 SDate = '0', EDate = '-19', Period = 'FY0', Frq = 'FY',

                 save_excel = False):

        

        """ESG_Boolean_Data().get_data() collects data via Refinitiv's eikon API and 'completes' it (filling in static fields).

        It takes the following arguments:

 

        Companies (list of strings): list of the instrument RICs (Refinitiv Instrument Codes) defining the companies to look at.

        Defaulted to: ['0#.MIWO00000PUS']. 'MIWO00000PUS' is the Refinitiv Instrument Code (RIC) for the MSCI World Index in USD.

 

        Fields (list of strings): List of the fields wanted from Refinitiv in our informatics.

        Defaulted to: ['TR.AnalyticEnvControv', 'TR.EnvMaterialsSourcing', 'TR.EnvSupplyChainMgt', 'TR.PolicySustainablePackaging'].

        

        Static_fields (list of strings): list of the static fields wanted from Refinitiv in our informatics.

        Defaulted to: ['TR.HeadquartersCountry', 'TR.TRBCEconomicSector']

 

        SDate, EDate, Period, and Frq are string variables defining specifications of the data being retrieved from Refinitiv.

        Defaulted to: SDate = '0', EDate = '-19', Period = 'FY0', and Frq = 'FY'.

 

        save_excel (boolean): If True, it will save an Excel workbook named 'ESG_Boolean_Data.xlsx' in the location where this script is being run.

        Defaulted to: save_excel = False.

        

        

        Attributes:

        self.fields: returns a list of the fields asked for.

        self.field_names: returns a list of the names of the fields asked for.

        self.df: returns the complete, raw data-frame.

        self.statics: returns a list of arrays of each static field's unique values (e.g.: the countries in the data-set).

        self.static_fields_names: returns a list of our static fields' names.

        self.years: returns a list of the years for which data was collected.

        """

        

        # First: create a list with the static fields asking for the country and sector of each company requested.

        # This list is then appended with the data fields (and their dates) to request.

        All_fields = list(Static_fields)

        

        for i in Fields:

            All_fields.append(i + ".date") # Collect the date for this data point

            All_fields.append(i) # We obviously want the data point in question

        

        # Collect data from Refinitiv

        df, err = ek.get_data(instruments = Companies,

                              fields = All_fields,

                              parameters = {'SDate' : self.SDate,

                                            'EDate' : self.EDate,

                                            'Period' : self.period,

                                            'Frq' : self.frq})

        

        # Collect the name of the fields we are interested in:

        if len(Fields) == 1:

            Fields_Names = [str(df.columns[(len(Static_fields) + 2)])]

        else:

            Fields_Names = list(df.iloc[:, [len(Static_fields) + 2*i  for i in range(1, len(Fields) + 1)]].columns)

        Static_fields_names = list(df.iloc[:,1:len(Static_fields)+1].columns)

        

        

        ## Now we will populate every static column in our newly defined ' df ' pandas data-frame

        # define the difference in time for use further down the code

        time_diff = abs(abs(int(SDate)) - abs(int(EDate)))

        # replace all empty values in static columns with NaN

        df[Static_fields_names] = df[Static_fields_names].replace('', numpy.nan, regex = True)

        # Now we can change these NaN values easily and complete the previously empty values quickly:

        df[Static_fields_names] = df[Static_fields_names].fillna(method = 'ffill', limit = time_diff + 1)

        

        # Construct a straight-forward 'Years' column

        df_years = []

        # the following if/else accounts for the data-frame having one or several "Date" columns

        if list(df.columns).count("Date") == 1:

            for d in df["Date"]:

                if str(d)[:4] == "":

                    df_years.append(numpy.nan)

                else:

                    df_years.append(int(str(d)[:4]))

        else:

            for d in range(len(df["Date"].index)):

                if str(df["Date"].iloc[d,:][0])[:4] == "":

                    df_years.append(numpy.nan)

                else:

                    df_years.append(int(str(df["Date"].iloc[d,:][0])[:4]))

        df["Years"] = df_years

        

        Statics = [df[i].unique() for i in Static_fields_names]

        

        

        # Make a Python variable containing all the years in our data-frame

        if list(df.columns).count("Date") == 1: # if statement here to account for when there's only 1 Date column

            years = [str(i)[:4] for i in df["Date"]]

        else:

            years = [str(i)[:4] for i in df["Date"].iloc[:,0]]

        years = numpy.unique(numpy.array(years))[1:].astype(int) # The 0th value is an empty string

 

        if save_excel == True:

            # ' ExcelWriter ' is needed to create an Excel Workbook with multiple tabs/sheets

            writer = pandas.ExcelWriter("ESG_Boolean_Data.xlsx", engine = 'xlsxwriter')

            df.to_excel(writer, sheet_name = "df")

            writer.save()

            writer.close()

        

        

        ## Set more attributes:

        self.fields = All_fields

        self.field_names = Fields_Names

        self.df = df

        self.statics = Statics

        self.static_fields_names = Static_fields_names

        self.years = years

        

        return self

    

    

    

    def Aggregate_Graph(self,

                        Percent = False,

                        Line_graph_theme = 'solar',

                        Start_graph_year = 2000,

                        End_graph_year = 2019,

                        Type = "line", # "line", "area", "bubble" or "scatter".

                        Line_shape = "linear", # can be 'linear', 'spline', 'hv', 'vh', 'hvh', 'vhv'

                        Bubble_size = 20,

                        save_excel = False,

                        show_figures = False): 

        """ESG_Boolean_Data().get_data().Aggregate_Graph() aggregates the data on a per-country basis for each year and returns a plotly line graph.

        It takes the following arguments:

 

        Percent (boolean): If 'True', will return data as a ratio (i.e.: percentage) of the total number of companies in the data-set. 'False' will return the actual number of companies in question.

        Defaulted to: Percent = False

        

        Line_graph_theme (str): defines the theme of the outputted line graph.

        Defaulted as: Line_graph_theme = 'solar'.

        Available templates: 'pearl', 'white', 'ggplot', 'solar', 'space'.

         

        Start_graph_year (int): start year of the graph outputted.

        Defaulted as: Start_graph_year = 2000.

        

        End_graph_year (int): end year of the graph outputted.

        Defaulted as: End_graph_year = 2019.

        

        Type (str): defines the type of chart produced

        Defaulted as: Type = "line".

        Available templates: "line", "area", "bubble" or "scatter".

        

        Line_shape (str): defines the type of line used in the graphs when ' Type ' is  "line" or "area".

        Defaulted as: Line_shape = "linear".

        Available templates: 'linear', 'spline', 'hv', 'vh', 'hvh', 'vhv'

        

        Bubble_size (int): the maximum bubble size for a "bubble" or "scatter" chart Type.

        Defaulted as: Bubble_size = 20

 

        save_excel (boolean): If True, it will save an Excel workbook named 'ESGAggregated.xlsx' in the location where this script is being run. If an Excel workbook named 'ESGAggregated.xlsx' already exists, it will add sheets to the existing one.

        Defaulted to: save_excel = False.

        

        show_figures (boolean): If True, the charts will be returned one by one.

        Defaulted as: show_figures = False

        

        

        Attributes:

        self.df_statics_list: returns a list of all the data-frames constructed (each one aggregating data on a per-static basis for each year).

        self.df_vectorised_list: returns a list of all the 'vectorised' data-frames constructed. They differ from the ' df_statics_list ' data-frames as they are not formatted with years or static fields as columns or rows.

        self.fig: returns a list of all the figures/charts constructed.

        """

        

        # Collect our previously defined variables of relevance to this ' Aggregate_Graph ' instance

        df = self.df

        years = self.years

        Vectorise_df = self.Vectorise_df

        Df_statics_percent = self.Df_statics_percent

        Df_statics_abs = self.Df_statics_abs

        Value = self.Value

        invalid_Type_exception = "Invalid value assigned to 'Type'. 'Type' can only be 'line', 'area', 'bubble' or 'scatter' as specified by plotly. For more see 'More from plotly.graph_objs' on the right in https://www.programcreek.com/python/example/103216/plotly.graph_objs.Layout"

        

        

        # Create empty lists to populate subsequently

        each_df_statics_list, figures, df_vectorised_list = [], [], []

        

        for static_name, statics in zip(self.static_fields_names, self.statics):

            df_statics_list = [] # Create empty lists to populate subsequently

            

            for field_name, field in zip(self.field_names, self.fields):

                

                # Going through all the possible charts that can be asked, all types need the following

                #     few lines bar the bubble (a.k.a: scatter) chart

                if Type != "bubble" and Type != "scatter":

                    

                    # Account for if user wants values in Percent or Absolute values:

                    if Percent == True:

                        df_statics = Df_statics_percent(_statics = statics,

                                                        _years = years, _df = df,

                                                        _static_name = static_name,

                                                        _field_name = field_name,

                                                        Value = self.Value,

                                                        Antivalue = self.Antivalue)

                        title = str("Ratio of companies with " + field_name + " per " +

                                    str(static_name) + " as a ratio out of all companies reporting this data")

                    elif Percent == False:

                        df_statics = Df_statics_abs(_statics = statics,

                                                    _years = years, _df = df,

                                                    _static_name = static_name,

                                                    _field_name = field_name,

                                                    Value = self.Value)

                        title = str("Number of companies reporting " + str(self.Value) + " for " + field_name

                                    + " per " + str(static_name) + " (out of all companies reporting this data)")

                    else:

                        # ' Percent ' needs to be a boolean value; it is possible for users to misassign it, so the check below catches that:

                        raise Exception("Invalid value assigned to 'Percent'. 'Percent' is a boolean variable in 'ESG_Boolean_Data().get_data().Aggregate_Graph', please assign it a 'True' or 'False' value. Note that its default value is 'False'.")

                    df_statics_list.append(df_statics)

                    

                    # Now we can take care of saving an Excel Workbook if the user wants to

                    if save_excel == True:

                        # ' ExcelWriter ' is needed to create an Excel Workbook with multiple tabs/sheets

                        if os.path.isfile("ESGAggregated.xlsx"): # If file exists:

                            # Open Excel Workbook and load it:

                            writer = pandas.ExcelWriter("ESGAggregated.xlsx", engine = 'openpyxl')

                            writer.book = openpyxl.load_workbook("ESGAggregated.xlsx")

                        else: # If file does not exist:

                            # Create Excel Workbook with specified name, then load it:

                            writer = pandas.ExcelWriter("ESGAggregated.xlsx", engine = 'xlsxwriter')

                        

                        # Save the ' df_statics ' data-frame to the workbook:

                        df_statics.to_excel(writer, sheet_name = (field.replace("TR.", "")[0:21] +

                                                                  "_df_" + str(static_name)[0:5]))

                        writer.save() # Save the Workbook

                        writer.close() # Close the loaded data

                    

 

                    # Vectorise our data-frame to then graph it correctly:

                    df_vectorised = Vectorise_df(_df_statics = df_statics,

                                                 _static_name = static_name,

                                                 _field_name = field_name)

                    # Keep all ' df_vectorised ' for future reference

                    df_vectorised_list.append(df_vectorised)

 

                    # Create our plot:

                    _x = "year"

                    _y = str(field_name + " per " + static_name)

                    _labels = {str(field_name + " per " + static_name): str(field_name), "year" : "Years"}

 

                    if Type == "line":

                        fig = plotly.express.line(df_vectorised, x = _x, y = _y, labels = _labels,

                                                  color = static_name, line_group = static_name,

                                                  hover_name = static_name, title = title,

                                                  line_shape = Line_shape)

 

                    elif Type == "area":

                        fig = plotly.express.area(df_vectorised, x = _x, y = _y, color = static_name,

                                                  line_group = static_name, title = title)

                        

                    else:

                        raise Exception(invalid_Type_exception)

                

                # If a "bubble" or "scatter" plot was asked for, the steps to take are a little different

                elif Type == "bubble" or Type == "scatter":

                    

                    # Get both percent and absolute values:

                    df_statics_perc = Df_statics_percent(_statics = statics,

                                                         _years = years, _df = df,

                                                         _static_name = static_name,

                                                         _field_name = field_name,

                                                         Value = self.Value,

                                                         Antivalue = self.Antivalue)

                    

                    df_statics_abs = Df_statics_abs(_statics = statics,

                                                    _years = years, _df = df,

                                                    _static_name = static_name,

                                                    _field_name = field_name,

                                                    Value = self.Value)

                    

                    ## Vectorise our data-frame to then graph it correctly:

                    # Vectorise our percentage data

                    df_vectorised_perc = Vectorise_df(_df_statics = df_statics_perc,

                                                      _static_name = static_name,

                                                      _field_name = field_name)

                    # Vectorise our absolute data

                    df_vectorised = Vectorise_df(_df_statics = df_statics_abs,

                                                 _static_name = static_name,

                                                 _field_name = field_name)

                    # Concatenate the two together

                    df_vectorised[str(df_vectorised_perc.columns[2] + " in percent")

                                 ] = df_vectorised_perc[df_vectorised_perc.columns[2]]

                    

                    # Put our data together in an easy to use list

                    df_vectorised_list.append(df_vectorised)

                    

                    # Plot our data:

                    fig = plotly.express.scatter(df_vectorised,

                                                 x = "year",

                                                 y = str(field_name + " per " + static_name),

                                                 size = str(df_vectorised_perc.columns[2] + " in percent"),

                                                 color = static_name,

                                                 hover_name = static_name,

                                                 size_max = Bubble_size,

                                                 labels = {str(field_name + " per " + static_name):

                                                           str(field_name), "year" : "Years"},

                                                 title = str("Number of companies reporting 'True' for "

                                                             + field_name + " per " + str(static_name) +

                                                             " (out of all companies reporting this data)"))

                    

                    # Keep both our percent and absolute data-frames for easy reference by the user:

                    df_statics_list.append(df_statics_perc)

                    df_statics_list.append(df_statics_abs)

                    

                    # Now we can take care of saving an Excel Workbook if the user wants to

                    if save_excel == True:

                        # Here we are interested in both data-frames, Percentage (P) and Absolute (A):

                        for DF, PA in zip((df_statics_perc, df_statics_abs), ("P", "A")):

                            # ' ExcelWriter ' is needed to create an Excel Workbook with multiple tabs/sheets

                            if os.path.isfile("ESGAggregated.xlsx"): # If file exists:

                                writer = pandas.ExcelWriter("ESGAggregated.xlsx", engine = 'openpyxl')

                                writer.book = openpyxl.load_workbook("ESGAggregated.xlsx")

                            else: # If file does not exist:

                                # Create Excel Workbook with specified name, then load it:

                                writer = pandas.ExcelWriter("ESGAggregated.xlsx", engine = 'xlsxwriter')

                            # Save the data-frame to the workbook:

                            DF.to_excel(writer, sheet_name = (field.replace("TR.", "")[0:20] +

                                                              "_df" + PA + "_" + str(static_name)[0:5]))

                            writer.save() # Save the Workbook

                            writer.close() # Close the loaded data

                

                else:

                    raise Exception(invalid_Type_exception)

                

                # Specify plot specs.

                fig.layout.template = "plotly_dark"

                figures.append(fig)

                if show_figures == True:

                    iplot(fig, config=dict(showLink=True))

                    

            # Keep all figures in an easy to use list the user can summon

            each_df_statics_list.append(df_statics_list)

        

        ## Set more attributes:

        self.statics_list = each_df_statics_list

        self.fig = figures

        self.df_vectorised_list = df_vectorised_list

        

        return self

    

    

    def Aggregate_Graph_with_Tot(self,

                                 Total_no_in_data_set = True,

                                 Color_theme = "plotly_dark",

                                 Total_data_shape = "lines", # "lines", "markers", "lines+markers"

                                 show_figures = False,

                                 Percent = False,

                                 Type = "Bar"): # can be "Line", "Scattergl", "XAxis", "Bar", "Scatter", "YAxis", "Margin"

        """ESG_Boolean_Data().Aggregate_Graph_with_Tot() returns plots that (can) include the total number of datapoints in the data-set for in-chart reference.

        It takes the following arguments:

        

        Total_no_in_data_set (boolean): if True, then the total number of data-points in the data-set will be showing in the plot.

        Defaulted to: Total_no_in_data_set = True

        

        Color_theme (str): defines the theme of the plotly plot

        Defaulted to: Color_theme = "plotly_dark"

        Available templates: "plotly", "plotly_white", "plotly_dark", "ggplot2", "seaborn", "simple_white", "none"

        

        Total_data_shape (str): defines the type of line (&/or markers) used in the graphs.

        Defaulted as: Total_data_shape = "lines".

        Available options: "lines", "markers", "lines+markers"

        

        show_figures (boolean): If True, the charts will be returned one by one.

        Defaulted as: show_figures = False

        

        Percent (boolean): If 'True', will return data as a ratio (i.e.: percentage) of the total number of companies in the data-set. 'False' will return the actual number of companies in question.

        Defaulted to: Percent = False

        

        Type (str): defines the type of chart produced.

        Defaulted as: Type = "Bar".

        Available options: "Line", "Scattergl", "XAxis", "Bar", "Scatter", "YAxis", "Margin".

        

        To be added in V 2.0:

        save_excel (boolean): If True, it will save an Excel workbook named 'ESGAggregated.xlsx' in the location where this script is being run. If an Excel workbook named 'ESGAggregated.xlsx' already exists, it will add sheets to the existing one.

        Defaulted to: save_excel = False.

        

        

        Attributes:

        self.fig: returns a list of all the figures/charts constructed.

        self.statics_list: returns a list of all the data-frames constructed (each one aggregating data on a per-static basis for each year).

        """

        

        # Collect our previously defined variables of relevance to this ' Aggregate_Graph_with_Tot ' instance

        df = self.df

        years = self.years

        fields = self.fields

        static_fields_names = self.static_fields_names

        statics = self.statics

        field_names = self.field_names

        Df_statics_percent = self.Df_statics_percent

        Df_statics_abs = self.Df_statics_abs

        

        # Warn the user that percentage values will look tiny compared to total if

        #     setting Total_no_in_data_set = True and Percent = True.

        #     With that said, if that's what they want to do, why stop them?

        if Total_no_in_data_set == True and Percent == True:

            warnings.warn('Note that setting Total_no_in_data_set = True and Percent = True means that the bar-graph values will be in percent of Total_no_in_data_set. However, you may still use Total_no_in_data_set = True and Percent = True if you so wish.')

        

        

        # We will populate the list ' figures ' with the figures created.

        # We can go through them as self.fig[i] for the ith figure.

        # Remember that there is one figure per static field per 'dynamic' field.

        figures = []

        

        # Create empty lists to populate subsequently

        df_statics_list, each_df_statics_list, Total_df_list = [], [], []

        

        # Iterate through each field and their (column) name:

        for field_name, field in zip(field_names, fields):

            

            # If wishing to get total number of datapoints in data-set on the figure:

            if Total_no_in_data_set == True:

                Total_df = pandas.DataFrame(index = ["Number of companies in the data-set for " + field_name])

                for year in years:

                    lis = len(df[(df[field_name] != "") & (df["Years"] == year)]["Instrument"].unique())

                    Total_df[str(year)] = lis

 

                # ' Total_df ' might be the same for all ' field_name 's, but THAT IS NOT A GUARANTEE!

                Total_df_list.append(Total_df)

            

            

            if Percent == True: # If wishing to display values in percent on the chart:

                

                # Iterate through each static field (e.g: TR.HeadquartersCountry)

                #     and its column name (e.g.: Country of Headquarters)

                for static_name, static in zip(static_fields_names, statics):

                    # Get percent values:

                    df_statics = Df_statics_percent(_statics = static,

                                                    _years = years, _df = df,

                                                    _static_name = static_name,

                                                    _field_name = field_name,

                                                    Value = self.Value,

                                                    Antivalue = self.Antivalue)

                    

                    # Keep all data-frames in an easy to use list the user can summon

                    df_statics_list.append(df_statics)

            

            # If wishing to display values in absolute terms on the chart:

            elif Percent == False:

                

                # Iterate through each static field (e.g: TR.HeadquartersCountry)

                #     and its column name (e.g.: Country of Headquarters)

                for static_name, static in zip(static_fields_names, statics):

                    

                    # Get absolute values

                    df_statics = Df_statics_abs(_statics = static,

                                                _years = years, _df = df,

                                                _static_name = static_name,

                                                _field_name = field_name,

                                                Value = self.Value)

                    

                    # Keep all data-frames in an easy to use list the user can summon

                    df_statics_list.append(df_statics)

            

            else: # Catch for users mis-specifying the ' Percent ' variable

                warnings.warn("Please note that the variable 'Percent' in 'ESG_Boolean_Data().Aggregate_Graph_with_Tot()' is Boolean and needs to be defined as True or False.")

            

            # Keep all data-frames in an easy to use list the user can summon

            each_df_statics_list.append(df_statics_list)

            

        # ' static_name ' is needed to go through each static field's data-frame within ' df_statics_list '

        for static_name in static_fields_names:

            

            for field_name in field_names: # Iterate through each field and their (column) name:

                

                data = [] # ' data ' is the list that we will populate with plotly traces

                

                if Total_no_in_data_set == True: # If wishing to display the total number of data-points on the chart:

                    for Tot_df_list_element in Total_df_list: # For each field of interest:

                        if field_name in str(Tot_df_list_element.index): # If the field is in our data-frame:

                            _Total_df = Tot_df_list_element

 

                for df_stat_list_element in df_statics_list: 

                    if field_name in str(df_stat_list_element.columns) and static_name in str(df_stat_list_element.columns):

                        _df_static = df_stat_list_element

 

                for s in _df_static.index: # for each static value (row) of interest:

                    

                    # Specify the plot's content:

                    _x = [_df_static.columns[i][1] for i in range(len(_df_static.columns))]

                    _y = list(_df_static.T[s])

                    

                    # Plot our graphs:

                    if Type == "Bar":

                        data.append(plotly.graph_objs.Bar(name = s, x = _x, y = _y))

                    elif Type == "Line":

                        data.append(plotly.graph_objs.Line(name = s, x = _x, y = _y))

                    elif Type == "Scatter":

                        data.append(plotly.graph_objs.Scatter(name = s, x = _x, y = _y))

                    elif Type == "Scattergl":

                        data.append(plotly.graph_objs.Scattergl(name = s, x = _x, y = _y))

                    elif Type == "XAxis":

                        data.append(plotly.graph_objs.XAxis(name = s, x = _x, y = _y))

                    elif Type == "YAxis":

                        data.append(plotly.graph_objs.YAxis(name = s, x = _x, y = _y))

                    elif Type == "Margin":

                        data.append(plotly.graph_objs.Margin(name = s, x = _x, y = _y))

                        

                # Include total number in dataset if asked:

                if Total_no_in_data_set == True:

                    data.append(

                        plotly.graph_objs.Scatter(

                            name = "number of companies in the data-set",

                            x = [_df_static.columns[i][1] for i in range(len(_df_static.columns))],

                            y = [_Total_df.values[0][i] for i in range(len(_Total_df.values[0]))],

                            mode = Total_data_shape))

                

                # Specify the layout of our plotly chart:

                layout = plotly.graph_objs.Layout(

                    barmode = 'stack',

                    title = str("Worldwide number of companies reporting 'True' on " +

                                field_name + " per " + static_name))

                

                # Create the figure:

                fig = plotly.graph_objs.Figure(data = data, layout = layout)

                fig.layout.template = Color_theme

                

                # Show figures one by one if asked:

                if show_figures == True:

                    iplot(fig, filename = 'pandas-bar-chart-layout', config=dict(showLink=True)) # IPython notebook

                

                # Keep all data-frames in an easy to use list the user can summon

                figures.append(fig)

        

        ## Set more attributes:

        self.fig = figures

        self.statics_list = each_df_statics_list

        

        return self

    

 

    

    def Aggregate_Map_Graph(self,

                            Color_theme = "plotly_dark",

                            show_figures = True,

                            Percent = False):

        """ESG_Boolean_Data().Aggregate_Map_Graph() returns a map representing the asked datapoints per year (with a time slider).

        It takes the following arguments:

        

        Color_theme (str): sets the theme of the plotly map displayed.

        Defaulted to: Color_theme = "plotly_dark"

        Available templates (to be verified): "plotly", "plotly_white", "plotly_dark", "ggplot2", "seaborn", "simple_white", "none"

        

        show_figures (boolean): If True, the charts will be returned one by one.

        Defaulted as: show_figures = False

        

        Percent (boolean): If 'True', will return data as a ratio (i.e.: percentage) of the total number of companies in the data-set. 'False' will return the actual number of companies in question.

        Defaulted to: Percent = False

        

        

        Attributes:

        self.fig: returns a list of all the figures/charts constructed.

        self.df_vectorised_list: returns a list of all the data-frames constructed.

        """

        

        

        # Collect our previously defined variables of relevance to this ' Aggregate_Map_Graph ' instance

        df = self.df

        years = self.years

        Vectorise_df = self.Vectorise_df

        Df_statics_percent = self.Df_statics_percent

        Df_statics_abs = self.Df_statics_abs

        

        

        # collect data-frame as provided by plotly on how to create map plots:

        plomap = plotly.express.data.gapminder()

 

        # create ' country_n_iso_df ' data-frame to list countries in ' plomap ' pandas data-frame

        country_n_iso_df = pandas.DataFrame() # create data-frame to populate below

        country_n_iso_df["country"] = pandas.Series(plomap["country"][plomap["country"] != "Korea, Dem. Rep."]).unique()

 

        # populate ' country_n_iso_df '

        place_holder = list(pandas.Series(plomap["iso_alpha"]).unique()) # need this step as the below is 'capricious'

        country_n_iso_df["iso_alpha"] = place_holder

 

        # rename countries in ' country_n_iso_df ' to match Refinitiv's data-sets

        country_n_iso_df.country[country_n_iso_df.country == "United States"] = "United States of America"

        country_n_iso_df.country[country_n_iso_df.country == "Ireland"] = "Ireland; Republic of"

        country_n_iso_df.country[country_n_iso_df.country == "Hong Kong, China"] = "Hong Kong"

 

        # add iso_alpha missing

        country_n_iso_df = country_n_iso_df.append(

            pandas.DataFrame(data = [["Isle of Man", "IMN"],

                                     ["Luxembourg", "LUX"],

                                     ["Macau", "MAC"],

                                     ["Bermuda", "BMU"],

                                     ["Papua New Guinea", "PNG"]],

                             columns = ["country", "iso_alpha"]

                            )).sort_values('country', ignore_index = True)

 

        

        if "TR.HeadquartersCountry" in self.fields:

            # ' int(self.fields.index("TR.HeadquartersCountry")) ' provides the position of "TR.HeadquartersCountry" in our fields.

            statics = self.statics[int(self.fields.index("TR.HeadquartersCountry"))]

        else:

            warnings.warn('The static field needed to use the map function is "HeadquartersCountry". Please make sure that you include "HeadquartersCountry" in the list of fields in the data set; specifically the Static_fields')

        

        df_vectorised_list = [] # empty list to fill in subsequently with vectorised data-frame

        figures = [] # empty list to fill in subsequently with our produced figures

        for field_name, field in zip(self.field_names, self.fields):

            

            if Percent == True: # If wishing to display values in percent on the map:

                df_statics = Df_statics_percent(_statics = statics,

                                                _years = years, _df = df,

                                                _static_name = "Country of Headquarters",

                                                _field_name = field_name,

                                                Value = self.Value,

                                                Antivalue = self.Antivalue)

                title = str("Ratio of companies with " + field_name +

                            " per " + "TR.HeadquartersCountry" + " (out of all companies reporting this data)")

            

            elif Percent == False: # If wishing to display values in absolute terms on the map:

                df_statics = Df_statics_abs(_statics = statics,

                                            _years = years, _df = df,

                                            _static_name = "Country of Headquarters",

                                            _field_name = field_name,

                                            Value = self.Value)

                title = str("Number of companies reporting 'True' for " + field_name +

                            " per " + "TR.HeadquartersCountry" + " (out of all companies reporting this data)")

            else:

                raise Exception("Invalid value assigned to 'Percent'. 'Percent' is a boolean variable in 'ESG_Boolean_Data().get_data().Aggregate_Map_Graph', please assign it a 'True' or 'False' value. Note that its default value is 'False'.")

            

            

            # Vectorise our data-frame to then graph it correctly:

            df_vectorised = Vectorise_df(_df_statics = df_statics,

                                         _static_name = "TR.HeadquartersCountry",

                                         _field_name = field_name)

            df_vectorised_list.append(country_n_iso_df)

            df_vectorised_list.append(df_vectorised)

            

            

            # finally merge/construct our data-frame used to project our map:

            map_df = df_vectorised.merge(country_n_iso_df,

                                         left_on = ['TR.HeadquartersCountry'],

                                         right_on = ['country'],

                                         how = "left")

            

            # Project our plotly map:

            fig = plotly.express.choropleth(

                map_df,

                locations = "iso_alpha",

                color = str(field_name + " per " + "TR.HeadquartersCountry"),

                hover_name = "country",

                animation_frame = "year",

                range_color = [min(list(df_vectorised.iloc[:, 2])),

                               max(max(list(df_vectorised.iloc[:, 2])), 0.0001)])

            fig.layout.template = Color_theme

            figures.append(fig)

            if show_figures == True: # Only show figures one by one if asked:

                iplot(fig, config = dict(showLink = True)) # IPython notebook

        

        ## Set more attributes:

        self.fig = figures

        self.df_vectorised_list = df_vectorised_list

        

        return self
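The merge step at the heart of ' Aggregate_Map_Graph ' can be tried in isolation. Below is a minimal, self-contained sketch; both data-frames are made up for illustration and are not Refinitiv output:

```python
import pandas

# A toy country-to-ISO lookup in the spirit of ' country_n_iso_df ' above:
iso = pandas.DataFrame({"country": ["France", "Japan"],
                        "iso_alpha": ["FRA", "JPN"]})

# Toy 'vectorised' values keyed on the country name:
vec = pandas.DataFrame({"TR.HeadquartersCountry": ["Japan", "France"],
                        "year": [2019, 2019],
                        "value": [3, 1]})

# A left merge keeps every data row and attaches the ISO code that
# plotly.express.choropleth needs for its ' locations ' argument:
map_df = vec.merge(iso, left_on="TR.HeadquartersCountry",
                   right_on="country", how="left")
print(map_df[["country", "iso_alpha", "value"]])
```

A left merge (rather than inner) is used so that rows whose country name fails to match the lookup are kept, surfacing as missing ' iso_alpha ' values rather than silently disappearing.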

Example Uses:

Our Python class can be used in several ways:

Gather Raw Data:

ESG_Boolean_Data Python Class Version 1.0: This Python class allows its user to aggregate ESG Boolean data (i.e.: data that is 'False' or 'True') for companies within Refinitiv's database on a per-static-field basis (be it country, economic sector, ...).

tes0 = ESG_Boolean_Data(Value = "True", Antivalue = "False")

' ESG_Boolean_Data().get_data() ' collects data via Refinitiv's eikon API and 'completes' it (filling in static fields).

tes1 = tes0.get_data(Companies = ['0#.MIWO00000PUS'], # e.g.: SPX, MIWO00000PUS

                     Fields = ['TR.AnalyticEnvControv',

                               'TR.EnvMaterialsSourcing',

                               'TR.EnvSupplyChainMgt',

                               'TR.PolicySustainablePackaging'],

                     Static_fields = ['TR.HeadquartersCountry',

                                      'TR.TRBCEconomicSector'])

' .df ' returns the most complete, raw data-frame.

tes1.df
  Instrument Country of Headquarters TRBC Economic Sector Name Date Environmental Controversies Date Environmental Materials Sourcing Date Environmental Supply Chain Management Date Policy Sustainable Packaging Years
0 ZTS.N United States of America Healthcare 2019-12-31T00:00:00Z FALSE 2019-12-31T00:00:00Z TRUE 2019-12-31T00:00:00Z TRUE 2019-12-31T00:00:00Z TRUE 2019
1 ZTS.N United States of America Healthcare 2018-12-31T00:00:00Z FALSE 2018-12-31T00:00:00Z TRUE 2018-12-31T00:00:00Z TRUE 2018-12-31T00:00:00Z FALSE 2018
2 ZTS.N United States of America Healthcare 2017-12-31T00:00:00Z FALSE 2017-12-31T00:00:00Z FALSE 2017-12-31T00:00:00Z TRUE 2017-12-31T00:00:00Z FALSE 2017
3 ZTS.N United States of America Healthcare 2016-12-31T00:00:00Z FALSE 2016-12-31T00:00:00Z FALSE 2016-12-31T00:00:00Z TRUE 2016-12-31T00:00:00Z FALSE 2016
4 ZTS.N United States of America Healthcare 2015-12-31T00:00:00Z FALSE 2015-12-31T00:00:00Z FALSE 2015-12-31T00:00:00Z TRUE 2015-12-31T00:00:00Z FALSE 2015
... ... ... ... ... ... ... ... ... ... ... ... ...
31751 CARLb.CO Denmark Consumer Non-Cyclicals 2004-12-31T00:00:00Z FALSE 2004-12-31T00:00:00Z FALSE 2004-12-31T00:00:00Z FALSE 2004-12-31T00:00:00Z FALSE 2004
31752 CARLb.CO Denmark Consumer Non-Cyclicals 2003-12-31T00:00:00Z FALSE 2003-12-31T00:00:00Z FALSE 2003-12-31T00:00:00Z FALSE 2003-12-31T00:00:00Z FALSE 2003
31753 CARLb.CO Denmark Consumer Non-Cyclicals 2002-12-31T00:00:00Z FALSE 2002-12-31T00:00:00Z FALSE 2002-12-31T00:00:00Z FALSE 2002-12-31T00:00:00Z FALSE 2002
31754 CARLb.CO Denmark Consumer Non-Cyclicals 2001-12-31T00:00:00Z 2001-12-31T00:00:00Z 2001-12-31T00:00:00Z 2001-12-31T00:00:00Z 2001
31755 CARLb.CO Denmark Consumer Non-Cyclicals 2000-12-31T00:00:00Z 2000-12-31T00:00:00Z 2000-12-31T00:00:00Z 2000-12-31T00:00:00Z 2000
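With a data-frame of this shape, the per-static aggregation the class performs can be sketched with plain pandas. The example below runs on synthetic data (column names mirror the output above, but the values and instruments are invented):

```python
import pandas

# Synthetic long-format data mimicking the shape of ' tes1.df ' above:
df = pandas.DataFrame(
    {"Instrument": ["A.N", "B.N", "C.PA", "D.PA"],
     "Country of Headquarters": ["United States of America",
                                 "United States of America",
                                 "France", "France"],
     "Years": [2019, 2019, 2019, 2019],
     "Environmental Controversies": ["True", "False", "True", "True"]})

# Count companies reporting 'True' per country per year; this is, in
# spirit, what the class's absolute (non-percent) aggregation boils down to:
agg = (df[df["Environmental Controversies"] == "True"]
       .groupby(["Country of Headquarters", "Years"])["Instrument"]
       .nunique()
       .unstack("Years", fill_value=0))
print(agg)
```

' nunique() ' rather than ' count() ' guards against an instrument being counted twice should it appear more than once for the same year.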

Plot Graphs:

' ESG_Boolean_Data().get_data().Aggregate_Graph() ' aggregates the data on a per-country basis for each year and returns a plotly line graph.


tes2 = tes1.Aggregate_Graph(Percent = True,

                            Line_graph_theme = 'solar',

                            Start_graph_year = 2000,

                            End_graph_year = 2019,

                            Line_shape = "linear",

                            save_excel = False,

                            show_figures = False,

                            Type = "area",

                            Bubble_size = 20)

' .fig ' returns a list of all the figures/charts constructed.


# ' config = dict(showLink = True) ' is there to spit out the link at the bottom right:

iplot(tes2.fig[0], config = dict(showLink = True))

Gather Graph Data:

' statics_list ' returns a list of all the data-frames constructed (each one aggregating data on a per-static basis for each year).

tes2.statics_list[0][0]
Aggregate data on a per Country of Headquarters basis for Environmental Controversies
Year 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020
United States of America 0 0 0 0 0 0 0 0 0 0 0 0.00718 0.02055 0.00664 0.00444 0.01099 0.02146 0.02087 0.04324 0.02505 0.03915 0.03889 0
Switzerland 0 0 0 0 0 0 0 0 0 0 0 0 0.02381 0 0 0.02273 0 0 0.06818 0 0 0.02564 0
France 0 0 0 0 0 0 0 0 0 0 0 0.01613 0.04688 0 0.01515 0 0.01493 0.02941 0.05797 0.02817 0.04225 0.03333 0
Japan 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00763 0 0.00366 0.00719 0 0.01365 0.01661 0.00654 0.03053 0
Australia 0 0 0 0 0 0 0 0 0 0 0 0 0.05882 0.05769 0.01852 0.01818 0.01754 0.01724 0.0678 0.04918 0 0.06667 0
Singapore 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Spain 0 0 0 0 0 0 0 0 0 0 0 0 0.05556 0.05556 0.05556 0.05556 0 0 0.05 0 0 0 0
Germany 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.05882 0.11321 0.10345 0.15254 0.18519 0
United Kingdom 0 0 0 0 0 0 0 0 0 0 0 0.01191 0.04706 0.02299 0.01124 0.01111 0.02222 0.01087 0.08696 0.03261 0.04348 0.0875 0
Canada 0 0 0 0 0 0 0 0 0 0 0 0 0.03947 0.0125 0.025 0.025 0.03798 0.04938 0.11494 0.03488 0.05882 0.02174 0
Sweden 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.02941 0 0
Israel 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.1 0 0 0
Austria 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Belgium 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Denmark 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Netherlands 0 0 0 0 0 0 0 0 0 0 0 0 0 0.10526 0 0 0.09091 0.09091 0.08696 0.08 0.07692 0.08333 0
Ireland; Republic of 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Norway 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.22222 0 0.22222 0.11111 0
Isle of Man 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Hong Kong 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.0303 0.02941 0 0 0
New Zealand 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Luxembourg 0 0 0 0 0 0 0 0 0 0 0 0 0.33333 0 0 0 0.25 0 0 0 0 0.25 0
Italy 0 0 0 0 0 0 0 0 0 0 0 0 0.07143 0 0 0 0 0 0.05263 0.04762 0.09091 0 0
China 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Macau 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Finland 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Argentina 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Bermuda 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Papua New Guinea 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Portugal 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
tes2.statics_list[1][0].head(2)
Aggregate data on a per TRBC Economic Sector Name basis for Environmental Controversies
Year 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020
Healthcare 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.03731 0 0.01379 0.0084 0
Consumer Non-Cyclicals 0 0 0 0 0 0 0 0 0 0 0 0 0.01 0 0 0.00926 0.00935 0.00901 0.04237 0.0084 0.01667 0.01087 0

tes2.statics_list[0][1].head(2)
Aggregate data on a per Country of Headquarters basis for Environmental Materials Sourcing
Year 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020
United States of America 0 0 0 0 0.04651 0.04206 0.03597 0.07463 0.08709 0.15909 0.25521 0.3134 0.34703 0.38717 0.39024 0.37363 0.36481 0.33966 0.36577 0.37746 0.41103 0.36111 0.29167
Switzerland 0 0 0 0 0.17241 0.10345 0.125 0.22857 0.14286 0.25641 0.4 0.5 0.57143 0.53488 0.53488 0.52273 0.56818 0.56818 0.54546 0.65909 0.68889 0.71795 1

' df_vectorised_list ' returns a list of all the 'vectorised' data-frames constructed. They differ from the ' statics_list ' data-frames in that years and static fields are not laid out as columns and rows; instead, each (static field, year) pair occupies its own row.

tes2.df_vectorised_list[1].head(2)
  Country of Headquarters year Environmental Materials Sourcing per Country of Headquarters
0 United States of America 1998 0
1 United States of America 1999 0

tes2.df_vectorised_list[0]
  Country of Headquarters year Environmental Controversies per Country of Headquarters
0 United States of America 1998 0
1 United States of America 1999 0
2 United States of America 2000 0
3 United States of America 2001 0
4 United States of America 2002 0
... ... ... ...
685 Portugal 2016 0
686 Portugal 2017 0
687 Portugal 2018 0
688 Portugal 2019 0
689 Portugal 2020 0
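The wide-to-long reshape behind these 'vectorised' frames can be sketched with pandas' ' melt '. The snippet below is an illustrative assumption (with made-up values mirroring the ' statics_list ' layout), not the class's actual code:

```python
import pandas as pd

# Hypothetical wide-format aggregate, mirroring the 'statics_list' layout
# (years as columns, one row per Country of Headquarters):
wide = pd.DataFrame({
    "Country of Headquarters": ["United States of America", "Switzerland"],
    "2018": [0.41103, 0.68889],
    "2019": [0.36111, 0.71795]})

# 'melt' reshapes it into the long, 'vectorised' layout:
# one row per (country, year) pair, ready for plotting libraries.
vectorised = wide.melt(
    id_vars="Country of Headquarters",
    var_name="year",
    value_name="Environmental Materials Sourcing per Country of Headquarters")

print(vectorised.shape)  # 2 countries x 2 years -> (4, 3)
```

This long layout is what plotting libraries such as plotly express expect, which is why the class keeps both shapes.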

 

Plot Graphs with Total:

tes3 = ESG_Boolean_Data().get_data(Companies = ['0#.MIWO00000PUS'],
                                   Fields = ['TR.AnalyticEnvControv',
                                             'TR.EnvMaterialsSourcing',
                                             'TR.EnvSupplyChainMgt',
                                             'TR.PolicySustainablePackaging'],
                                   Static_fields = ['TR.HeadquartersCountry',
                                                    'TR.TRBCEconomicSector'])

' ESG_Boolean_Data().Aggregate_Graph_with_Tot() ' returns plots that can optionally include the total number of datapoints in the data-set for in-chart reference.

tes4 = tes3.Aggregate_Graph_with_Tot(Total_no_in_data_set = True,
                                     Total_data_shape = "markers",
                                     Color_theme = "plotly_dark",
                                     show_figures = False,
                                     Percent = False)

    	
tes4.fig[2]

tes4.statics_list[0][0]
Aggregate data on a per Country of Headquarters basis for Environmental Controversies
Year 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020
United States of America 0 0 0 0 0 0 0 0 0 0 0 3 9 3 2 5 10 11 24 14 22 14 0
Switzerland 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 3 0 0 1 0
France 0 0 0 0 0 0 0 0 0 0 0 1 3 0 1 0 1 2 4 2 3 2 0
Japan 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 1 2 0 4 5 2 8 0
Australia 0 0 0 0 0 0 0 0 0 0 0 0 3 3 1 1 1 1 4 3 0 4 0
Singapore 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Spain 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 1 0 0 0 0
Germany 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 6 6 9 10 0
United Kingdom 0 0 0 0 0 0 0 0 0 0 0 1 4 2 1 1 2 1 8 3 4 7 0
Canada 0 0 0 0 0 0 0 0 0 0 0 0 3 1 2 2 3 4 10 3 5 1 0
Sweden 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
Israel 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
Austria 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Belgium 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Denmark 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Netherlands 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 2 2 2 2 2 2 0
Ireland; Republic of 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Norway 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 2 1 0
Isle of Man 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Hong Kong 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0
New Zealand 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Luxembourg 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0
Italy 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 2 0 0
China 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Macau 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Finland 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Argentina 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Bermuda 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Papua New Guinea 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Portugal 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Map our Data:

tes5 = ESG_Boolean_Data().get_data(Companies = ['0#.MIWO00000PUS'],
                                   Fields = ['TR.AnalyticEnvControv',
                                             'TR.EnvMaterialsSourcing',
                                             'TR.EnvSupplyChainMgt',
                                             'TR.PolicySustainablePackaging'],
                                   Static_fields = ['TR.HeadquartersCountry',
                                                    'TR.TRBCEconomicSector'])

' ESG_Boolean_Data().Aggregate_Map_Graph() ' returns a world map representing the requested datapoints per year (with a time slider).

tes6 = tes5.Aggregate_Map_Graph(Color_theme = "plotly_dark",
                                show_figures = False,
                                Percent = False)

    	
tes6.fig[2]

tes5.df
  Instrument Country of Headquarters TRBC Economic Sector Name Date Environmental Controversies Date Environmental Materials Sourcing Date Environmental Supply Chain Management Date Policy Sustainable Packaging Years
0 ZTS.N United States of America Healthcare 2019-12-31T00:00:00Z FALSE 2019-12-31T00:00:00Z TRUE 2019-12-31T00:00:00Z TRUE 2019-12-31T00:00:00Z TRUE 2019
1 ZTS.N United States of America Healthcare 2018-12-31T00:00:00Z FALSE 2018-12-31T00:00:00Z TRUE 2018-12-31T00:00:00Z TRUE 2018-12-31T00:00:00Z FALSE 2018
2 ZTS.N United States of America Healthcare 2017-12-31T00:00:00Z FALSE 2017-12-31T00:00:00Z FALSE 2017-12-31T00:00:00Z TRUE 2017-12-31T00:00:00Z FALSE 2017
3 ZTS.N United States of America Healthcare 2016-12-31T00:00:00Z FALSE 2016-12-31T00:00:00Z FALSE 2016-12-31T00:00:00Z TRUE 2016-12-31T00:00:00Z FALSE 2016
4 ZTS.N United States of America Healthcare 2015-12-31T00:00:00Z FALSE 2015-12-31T00:00:00Z FALSE 2015-12-31T00:00:00Z TRUE 2015-12-31T00:00:00Z FALSE 2015
... ... ... ... ... ... ... ... ... ... ... ... ...
31751 CARLb.CO Denmark Consumer Non-Cyclicals 2004-12-31T00:00:00Z FALSE 2004-12-31T00:00:00Z FALSE 2004-12-31T00:00:00Z FALSE 2004-12-31T00:00:00Z FALSE 2004
31752 CARLb.CO Denmark Consumer Non-Cyclicals 2003-12-31T00:00:00Z FALSE 2003-12-31T00:00:00Z FALSE 2003-12-31T00:00:00Z FALSE 2003-12-31T00:00:00Z FALSE 2003
31753 CARLb.CO Denmark Consumer Non-Cyclicals 2002-12-31T00:00:00Z FALSE 2002-12-31T00:00:00Z FALSE 2002-12-31T00:00:00Z FALSE 2002-12-31T00:00:00Z FALSE 2002
31754 CARLb.CO Denmark Consumer Non-Cyclicals 2001-12-31T00:00:00Z   2001-12-31T00:00:00Z   2001-12-31T00:00:00Z   2001-12-31T00:00:00Z   2001
31755 CARLb.CO Denmark Consumer Non-Cyclicals 2000-12-31T00:00:00Z   2000-12-31T00:00:00Z   2000-12-31T00:00:00Z   2000-12-31T00:00:00Z   2000
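The TRUE/FALSE strings in ' tes5.df ' lend themselves to simple per-country aggregation. A sketch of the idea (an assumed illustration on made-up rows, not the class's exact code) using pandas:

```python
import pandas as pd

# Miniature stand-in for tes5.df with made-up rows:
df = pd.DataFrame({
    "Country of Headquarters": ["United States of America"] * 2 + ["Denmark"],
    "Years": [2019, 2019, 2019],
    "Environmental Controversies": ["TRUE", "FALSE", "FALSE"]})

# Map the boolean strings to 1/0 so they can be counted or averaged:
flags = df["Environmental Controversies"].map({"TRUE": 1, "FALSE": 0})

grouped = df.assign(flag=flags).groupby(
    ["Country of Headquarters", "Years"])["flag"]
counts = grouped.sum()        # raw counts, as with Percent = False
proportions = grouped.mean()  # proportions, as with Percent = True

print(counts[("United States of America", 2019)])       # 1
print(proportions[("United States of America", 2019)])  # 0.5
```

This mirrors the two views seen above: the proportion tables (e.g. 0.05882) and the raw-count tables (e.g. 24 controversies for the United States in 2016).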

 

References

You can find more details regarding the Eikon Data API and the related technologies used in this notebook in the following resources:

For any question related to this example or Eikon Data API, please use the Developers Community Q&A Forum.