
Refinitiv Data Platform APIs

Introduction To Filings - Python

Preston Tan
Product Manager, Digital Wealth Management
Zoya Farberov
Platform Application Developer

Introducing Filings API Service on Refinitiv Data Platform

A new Filings API service is available on Refinitiv Data Platform (RDP), providing access to Global and EDGAR filing data: over 40 million documents from 135,000 companies worldwide, spanning over 50 years of history from 1968. Automated document feeds and newswires deliver timely and comprehensive collections for the USA, Canada, Japan, Norway, Italy, Australia, Singapore, India, China and Korea.

The Filings service covers search and retrieval of public corporate disclosures. In this article, we review how to search for specific filings documents and how to download them through the API.

Filings Search Using GraphQL

Filings documents can be searched through our GraphQL endpoint. GraphQL is a query language for APIs that lets a client request specific data and receive exactly that data in the response. The capabilities used to search for filings documents include:

  • Filtering
  • Sorting
  • Limit
  • Pagination
  • Keyword Search

To learn more about GraphQL, please visit https://graphql.org/.

Python Environment

For the purpose of demonstration, we use Jupyter Lab with Python 3.8. We discuss the code that is available for download from https://github.com

Valid Credentials - Replace in Code or Read From File

Valid RDP credentials are required to interact with an RDP service:

  • USERNAME
  • PASSWORD
  • CLIENTID
    	
            

USERNAME = "VALIDUSER"
PASSWORD = "VALIDPASSWORD"
CLIENT_ID = "SELFGENERATEDCLIENTID"

def readCredsFromFile(filePathName):
    """Read valid credentials from a file, one per line."""
    global USERNAME, PASSWORD, CLIENT_ID
    # Expected file layout, one entry per line:
    #   RDP machine ID
    #   long password
    #   generated client ID
    with open(filePathName, "r") as credFile:
        USERNAME = credFile.readline().rstrip('\n')
        PASSWORD = credFile.readline().rstrip('\n')
        CLIENT_ID = credFile.readline().rstrip('\n')

# Raw string, so the backslashes in the Windows-style path are not treated as escapes
readCredsFromFile(r"..\creds\credFileHuman.txt")

# Uncomment to make sure that creds are either set in code or read in correctly
#print("USERNAME="+str(USERNAME))
#print("PASSWORD="+str(PASSWORD))
#print("CLIENT_ID="+str(CLIENT_ID))

We include two ways to supply the valid credentials. 

  • One is to replace the placeholders in code ("VALIDUSER", ...) with valid personal credential values. To do so, comment out the call that reads the credentials from file:

        #readCredsFromFile(r"..\creds\credFileHuman.txt")

  • The other way is to store a set of valid RDP credentials in the file "credFileHuman.txt" under the path "../creds" and have the code retrieve the credentials from that file.

        The file is expected to be in a simple format, one entry per line:

    	
            

VALIDUSER

VALIDPASSWORD

SELFGENERATEDCLIENTID 
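The three-line file format above can also be read without global variables. The following is a minimal sketch, not the code from the downloadable notebook: the names `RdpCredentials` and `load_credentials` are hypothetical, and the sketch assumes that surrounding whitespace is never a meaningful part of a credential.

```python
from pathlib import Path
from typing import NamedTuple


class RdpCredentials(NamedTuple):
    """Hypothetical container for the three values stored in the file."""
    username: str
    password: str
    client_id: str


def load_credentials(file_path):
    """Parse a credentials file: username, password, client ID, one per line."""
    raw_lines = Path(file_path).read_text().splitlines()
    # Assumption: surrounding whitespace is not part of any credential
    lines = [line.strip() for line in raw_lines if line.strip()]
    if len(lines) != 3:
        raise ValueError(f"expected 3 non-empty lines, found {len(lines)}")
    return RdpCredentials(*lines)
```

Failing early on a malformed file avoids sending a half-empty credential set to the token endpoint.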

 

Define Token Handling and Obtain a Valid Token

Having a valid token is a prerequisite to requesting any RDP content; the token is passed into the next steps. For additional information on authorization and tokens, refer to RDP Tutorial: Authorization - All about tokens.

The implementation steps that come next may look familiar: with some variation, they come up repeatedly in any RDP service interaction.

    	
            

import json
import os
import time

import requests

# RDP_BASE_URL, CATEGORY_URL, RDP_AUTH_VERSION, ENDPOINT_URL, SCOPE, CLIENT_SECRET
# and TOKEN_FILE are assumed to be defined earlier in the notebook
TOKEN_ENDPOINT = RDP_BASE_URL + CATEGORY_URL + RDP_AUTH_VERSION + ENDPOINT_URL

def _requestNewToken(refreshToken):
    if refreshToken is None:
        # Password grant: used when no refresh token is available
        tData = {
            "username": USERNAME,
            "password": PASSWORD,
            "grant_type": "password",
            "scope": SCOPE,
            "takeExclusiveSignOnControl": "true"
        }
    else:
        # Refresh grant: exchange the refresh token for a new access token
        tData = {
            "refresh_token": refreshToken,
            "grant_type": "refresh_token",
        }

    # Make a REST call to get latest access token
    response = requests.post(
        TOKEN_ENDPOINT,
        headers = {
            "Accept": "application/json"
        },
        data = tData,
        auth = (
            CLIENT_ID,
            CLIENT_SECRET
        )
    )

    if response.status_code != 200:
        raise Exception("Failed to get access token {0} - {1}".format(response.status_code, response.text))

    # Return the new token
    return json.loads(response.text)

def saveToken(tknObject):
    print("Saving the new token")
    # Append the expiry time to token
    tknObject["expiry_tm"] = time.time() + int(tknObject["expires_in"]) - 10
    # Store it in the file; the with-block closes the file after writing
    with open(TOKEN_FILE, "w+") as tf:
        json.dump(tknObject, tf, indent=4)

def getToken():
    try:
        print("Reading the token from: " + TOKEN_FILE)
        # Read the token from a file
        with open(TOKEN_FILE, "r") as tf:
            tknObject = json.load(tf)

        # Is access token valid
        if tknObject["expiry_tm"] > time.time():
            # return access token
            print(tknObject["expiry_tm"])
            print("time.time()="+ str(time.time()))
            return tknObject["access_token"]

        print("Token expired, refreshing a new one...")
        # Get a new token from refresh token
        tknObject = _requestNewToken(tknObject["refresh_token"])

    except Exception as exp:
        print("Caught exception: " + str(exp))
        print("Getting a new token using Password Grant...")
        tknObject = _requestNewToken(None)

    # Persist this token for future queries
    saveToken(tknObject)
    # Return access token
    return tknObject["access_token"]

# Obtain the token that is passed to the requests in the next steps
accessToken = getToken()
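The decision that getToken makes (reuse the cached token, refresh it, or fall back to the password grant) can be isolated into a pure function that is easy to check without calling the service. This is a minimal sketch under stated assumptions: `with_expiry` and `choose_grant` are hypothetical helper names, and only the token field names and the 10-second margin are taken from the code above.

```python
import time

EXPIRY_MARGIN_SECONDS = 10  # the same safety margin that saveToken subtracts


def with_expiry(token, now=None):
    """Return a copy of the token with an absolute expiry time added."""
    now = time.time() if now is None else now
    stamped = dict(token)
    stamped["expiry_tm"] = now + int(token["expires_in"]) - EXPIRY_MARGIN_SECONDS
    return stamped


def choose_grant(token, now=None):
    """Return 'cached', 'refresh_token' or 'password' for a stored token."""
    now = time.time() if now is None else now
    if token is None:
        return "password"       # nothing stored yet
    if token.get("expiry_tm", 0) > now:
        return "cached"         # access token is still valid
    if token.get("refresh_token"):
        return "refresh_token"  # expired, but it can be refreshed
    return "password"


stored = with_expiry({"access_token": "a", "refresh_token": "r", "expires_in": "600"}, now=1000.0)
print(choose_grant(stored, now=1500.0))  # cached
print(choose_grant(stored, now=2000.0))  # refresh_token
```

Passing `now` explicitly keeps the logic deterministic, which is convenient when verifying the expiry arithmetic.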

Define Filings Helper Function requestSearch

    	
            

# RDP_FILINGS_VERSION is assumed to be defined earlier in the notebook
FILINGS_ENDPOINT = RDP_BASE_URL + '/data-store' + RDP_FILINGS_VERSION + '/graphql'

def requestSearch(token, payloadSearch):
    print("requestSearch...")

    headers = {
        'Content-Type': "application/json",
        'Authorization': "Bearer " + token,
        'cache-control': "no-cache"
    }

    # The GraphQL query text is sent as the "query" member of a JSON body
    response = requests.post(FILINGS_ENDPOINT, json={'query': payloadSearch}, headers=headers)
    print("Response status code =" + str(response.status_code))

    if response.status_code == 401:   # error when token expired
        accessToken = getToken()      # token refresh on token expired
        headers["Authorization"] = "Bearer " + accessToken
        response = requests.post(FILINGS_ENDPOINT, json={'query': payloadSearch}, headers=headers)

    print('Raw response=')
    print(response)

    if response.status_code == 200:
        return json.loads(response.text)
    else:
        # An empty string signals a failed request to the caller
        return ''

Search Filings by File Type

The following example searches for all 10-Q filings filed on February 12, 2021.

    	
            

payloadIn = """
{
  FinancialFiling(filter: {AND: [
    {FilingDocument: {DocumentSummary: {FormType: {EQ: "10-Q"}}}},
    {FilingDocument: {DocumentSummary: {FilingDate: {BETWN: {FROM: "2021-02-12T00:00:00Z", TO: "2021-02-12T23:59:59Z"}}}}}]},
    sort: {FilingDocument: {DocumentSummary: {FilingDate: DESC}}},
    limit: 25 ) {
    _metadata {
      totalCount
      cursor
    }
    FilingOrganization {
      Names {
        Name {
          OrganizationName (filter: {OrganizationNameTypeCode: {EQ: "LNG"}}){
            OrganizationName
          }
        }
      }
    }
    FilingDocument {
      Identifiers {
        OrganizationId
        Dcn
      }
      DocId
      FinancialFilingId
      DocumentSummary {
        DocumentTitle
        FeedName
        FormType
        HighLevelCategory
        MidLevelCategory
        FilingDate
        SecAccessionNumber
        SizeInBytes
      }
      FilesMetaData {
        FileName
        MimeType
      }
    }
  }
}
"""

jsonFullResp = requestSearch(accessToken, payloadIn)
print('Parsed json response=')
print(json.dumps(jsonFullResp, indent=2))

# DocId of the first filing in the response
docId = jsonFullResp["data"]["FinancialFiling"][0]["FilingDocument"]["DocId"]
print('DocId is', str(docId))

# Cursor of the last filing in the response, used to request the next page
cursor = jsonFullResp["data"]["FinancialFiling"][-1]["_metadata"]["cursor"]
print('cursor is', str(cursor))

Once we have identified the required DocId or DocIds and the cursor, the next steps use this information to request the required Filings documents.
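Collecting every DocId from a response, rather than only the first one, follows directly from the response shape used above (data, then FinancialFiling, then one entry per filing). The sketch below is illustrative: `extract_doc_ids` and `last_cursor` are hypothetical helper names, and the sample response contains made-up values, not real service output.

```python
def extract_doc_ids(search_response):
    """Collect the DocId of every filing in a parsed search response."""
    if not isinstance(search_response, dict):
        return []  # requestSearch returns '' when the request fails
    filings = (search_response.get("data") or {}).get("FinancialFiling") or []
    return [filing["FilingDocument"]["DocId"] for filing in filings]


def last_cursor(search_response):
    """Return the cursor of the last filing, or None if there are no filings."""
    if not isinstance(search_response, dict):
        return None
    filings = (search_response.get("data") or {}).get("FinancialFiling") or []
    return filings[-1]["_metadata"]["cursor"] if filings else None


# Made-up sample with the same nesting as the search response
sample_response = {"data": {"FinancialFiling": [
    {"_metadata": {"cursor": "cursor-a"}, "FilingDocument": {"DocId": "111"}},
    {"_metadata": {"cursor": "cursor-b"}, "FilingDocument": {"DocId": "222"}},
]}}

print(extract_doc_ids(sample_response))  # ['111', '222']
print(last_cursor(sample_response))      # cursor-b
```

Both helpers return an empty result instead of raising when the search failed, so the calling code can decide how to react.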

Pagination

If a limit is not specified in the query, a response contains at most 200 results. The service uses cursor-based pagination: a cursor is a pointer to a specific item in the dataset, and each cursor is unique to a specific record. The cursor of the last record in a response is used to request the next page.

To view the next 25 results of the previous example (records 26-50), pass the cursor from the last record in the response.

    	
            

payloadIn1 = """
{
  FinancialFiling(filter: {AND: [
    {FilingDocument: {DocumentSummary: {FormType: {EQ: "10-Q"}}}},
    {FilingDocument: {DocumentSummary: {FilingDate: {BETWN: {FROM: "2021-02-12T00:00:00Z", TO: "2021-02-12T23:59:59Z"}}}}}]},
    sort: {FilingDocument: {DocumentSummary: {FilingDate: DESC}}},
    limit: 25,
    cursor: """

payloadIn2 = """
) {
    _metadata {
      totalCount
      cursor
    }
    FilingOrganization {
      Names {
        Name {
          OrganizationName (filter: {OrganizationNameTypeCode: {EQ: "LNG"}}){
            OrganizationName
          }
        }
      }
    }
    FilingDocument {
      Identifiers {
        OrganizationId
        Dcn
      }
      DocId
      FinancialFilingId
      DocumentSummary {
        DocumentTitle
        FeedName
        FormType
        HighLevelCategory
        MidLevelCategory
        FilingDate
        SecAccessionNumber
        SizeInBytes
      }
      FilesMetaData {
        FileName
        MimeType
      }
    }
  }
}
"""

# The cursor is inserted between the two query fragments as a quoted string
payloadPaged = payloadIn1 + "\"" + str(cursor) + "\"" + payloadIn2
print("Request=" + payloadPaged)

jsonFullResp = requestSearch(accessToken, payloadPaged)
print('Parsed json response=')
print(json.dumps(jsonFullResp, indent=2))
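Requesting one page after another can be generalized into a loop that keeps passing the cursor of the last record until a page comes back empty. The following is a sketch only: `iterate_filings` and `fake_fetch_page` are hypothetical names, and the fake page function stands in for a real call so that the loop can be run offline. With the code in this article, the page function would presumably wrap requestSearch and return the list found at `["data"]["FinancialFiling"]`.

```python
def iterate_filings(fetch_page, max_pages=10):
    """Yield filings page by page until a page comes back empty.

    fetch_page(cursor) is a hypothetical callable that returns the list of
    filings for the given cursor (None requests the first page).
    """
    cursor = None
    for _ in range(max_pages):  # upper bound guards against endless loops
        filings = fetch_page(cursor)
        if not filings:
            break
        yield from filings
        # The cursor of the last record points at the start of the next page
        cursor = filings[-1]["_metadata"]["cursor"]


# Offline stand-in for the service: 60 made-up records, 25 per page
RECORDS = [{"_metadata": {"cursor": str(i)}, "n": i} for i in range(60)]


def fake_fetch_page(cursor, limit=25):
    start = 0 if cursor is None else int(cursor) + 1
    return RECORDS[start:start + limit]


collected = list(iterate_filings(fake_fetch_page))
print(len(collected))  # 60
```

The `max_pages` bound is a design choice for the sketch: it caps the number of requests if the service keeps returning data.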

Search by OrganizationId

Search for all filings documents for Tesla in 2021.

    	
            

payloadIn = """
{
  FinancialFiling(filter: {AND: [
    {FilingDocument: {Identifiers: {OrganizationId: {EQ: "4297089638"}}}},
    {FilingDocument: {DocumentSummary: {FilingDate: {BETWN: {FROM: "2021-01-01T00:00:00Z", TO: "2021-12-31T23:59:59Z"}}}}}]},
    sort: {FilingDocument: {DocumentSummary: {FilingDate: DESC}}},
    limit: 10) {
    _metadata {
      totalCount
    }
    FilingOrganization {
      Names {
        Name {
          OrganizationName (filter: {OrganizationNameTypeCode: {EQ: "LNG"}}){
            OrganizationName
          }
        }
      }
    }
    FilingDocument {
      Identifiers {
        OrganizationId
        Dcn
      }
      DocId
      FinancialFilingId
      DocumentSummary {
        DocumentTitle
        FeedName
        FormType
        HighLevelCategory
        MidLevelCategory
        FilingDate
        SecAccessionNumber
        SizeInBytes
      }
      FilesMetaData {
        FileName
        MimeType
      }
    }
  }
}
"""

jsonFullResp = requestSearch(accessToken, payloadIn)
print('Parsed json response=')
print(json.dumps(jsonFullResp, indent=2))
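When the same search is repeated for different organizations or periods, the query text can be generated from parameters instead of being edited by hand. This is a minimal sketch: `build_org_query` is a hypothetical helper, the selected fields are a shortened subset of the query above, and the digits-only check on the OrganizationId is an assumption based on the single example shown here.

```python
def build_org_query(organization_id, date_from, date_to, limit=10):
    """Build a filings search query for one organization and a date range."""
    # Assumption: OrganizationId values consist of digits only
    if not str(organization_id).isdigit():
        raise ValueError("organization_id is expected to contain digits only")
    template = """
{
  FinancialFiling(filter: {AND: [
    {FilingDocument: {Identifiers: {OrganizationId: {EQ: "%s"}}}},
    {FilingDocument: {DocumentSummary: {FilingDate: {BETWN: {FROM: "%s", TO: "%s"}}}}}]},
    sort: {FilingDocument: {DocumentSummary: {FilingDate: DESC}}},
    limit: %d) {
    _metadata { totalCount }
    FilingDocument {
      DocId
      DocumentSummary { DocumentTitle FormType FilingDate }
    }
  }
}
"""
    return template % (organization_id, date_from, date_to, limit)


query = build_org_query("4297089638", "2021-01-01T00:00:00Z", "2021-12-31T23:59:59Z")
print(query)
```

The resulting string can be passed to requestSearch in place of a hand-written payload. Validating the identifier before it is placed into the query text avoids producing a malformed query from unexpected input.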

Keyword Search by Document Text

Another available feature is keyword search against document or section text.

    	
            

payloadIn = """
{
  FinancialFiling(
    sort: {FilingDocument: {DocumentSummary: {FilingDate: DESC}}},
    filter: {FilingDocument: {DocumentSummary: {FilingDate: {BETWN: {FROM: "2020-07-01T00:00:00Z", TO: "2020-08-01T00:00:00Z"}}}}},
    keywords: {searchstring: "FinancialFiling.FilingDocument.DocumentText:COVID-19"},
    limit: 5) {
    _metadata {
      totalCount
    }
    FilingOrganization {
      Names {
        Name {
          OrganizationName(
            filter: {AND: [
              {OrganizationNameLanguageId: {EQ: "505062"}},
              {OrganizationNameTypeCode: {EQ: "LNG"}}]})
          {
            OrganizationName
          }
        }
      }
    }
    FilingDocument {
      DocId
      DocumentSummary {
        DocumentTitle
        FilingDate
        FormType
        FeedName
      }
      DocumentText
    }
  }
}
"""

jsonFullResp = requestSearch(accessToken, payloadIn)
print('Parsed json response=')
print(json.dumps(jsonFullResp, indent=2))

docId = jsonFullResp["data"]["FinancialFiling"][0]["FilingDocument"]["DocId"]
print('DocId is', str(docId))

Download Filings Documents

There are four identifiers, or retrieval methods, that you can use to download a document:

  • FilingId (FilingId, or Financial Filing Id, is an internal permanent identifier assigned to each filings document. This is our strategic filings identifier.)
  • Dcn (Dcn, also known as Document Control Number, is an external identifier and an enclosed film number specific to EDGAR documents.)
  • DocId (DocId, or Document Identifier, is an internal identifier assigned to financial filings documents.)
  • Filename (Filename provides a faster and direct route to download documents without going through a resolver.)

Define Helper Function retrieveURL

    	
            

def retrieveURL(token, retrievalParameters):
    ENDPOINT_DOC_RETRIEVAL = RDP_BASE_URL + '/data/filings' + RDP_FILINGS_VERSION + '/retrieval/search/' + retrievalParameters

    headers = {
        "Authorization": "Bearer " + token,
        "X-API-Key": "155d9dbf-f0ac-46d9-8b77-f7f6dcd238f8",
        "ClientID" : "api_playground"
    }
    print("Next we retrieve: " + ENDPOINT_DOC_RETRIEVAL)

    response = requests.get(ENDPOINT_DOC_RETRIEVAL, headers = headers)
    print("Response status code =" + str(response.status_code))

    if response.status_code == 401:   # error when token expired
        token = getToken()            # token refresh on token expired
        print("Token now is: " + token)
        headers["Authorization"] = "Bearer " + token
        response = requests.get(ENDPOINT_DOC_RETRIEVAL, headers = headers)
        print("Response status code =" + str(response.status_code))

    if response.status_code == 200:
        return json.loads(response.text)
    else:
        # An empty string signals a failed request to the caller
        return ''

Retrieve URL by DocId

    	
            

jsonFullResp = retrieveURL(accessToken, 'docId/54932207')
print('full response is =')
print(json.dumps(jsonFullResp, indent=2))

# The response is keyed by file name; take the first entry
fileName = list(jsonFullResp.keys())[0]
print("fileName is: ")
print(fileName)

signedUrl = jsonFullResp[fileName]["signedUrl"]
print("signedUrl to retrieve is: ")
print(signedUrl)

Retrieve URL by FilingId

    	
            

jsonFullResp = retrieveURL(accessToken, 'filingId/97661417885')
print('full response is =')
print(json.dumps(jsonFullResp, indent=2))

# The response is keyed by file name; take the first entry
fileName = list(jsonFullResp.keys())[0]
print("fileName is: ")
print(fileName)

signedUrl = jsonFullResp[fileName]["signedUrl"]
print("signedUrl to retrieve is: ")
print(signedUrl)
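Both retrieval examples extract the file name and the signed URL in the same way, so that step can be factored into one helper. The sketch below assumes the response shape implied by the code above (a dictionary keyed by file name, each value holding a "signedUrl" member); `first_signed_url` is a hypothetical name and the sample values are made up.

```python
def first_signed_url(retrieval_response):
    """Return (file_name, signed_url) for the first entry of a retrieval response."""
    # retrieveURL returns '' on failure, so reject anything that is not a non-empty dict
    if not isinstance(retrieval_response, dict) or not retrieval_response:
        raise ValueError("empty or failed retrieval response")
    file_name, details = next(iter(retrieval_response.items()))
    return file_name, details["signedUrl"]


# Made-up sample with the assumed response shape
sample_retrieval = {"example.pdf": {"signedUrl": "https://example.com/example.pdf?sig=abc"}}

file_name, signed_url = first_signed_url(sample_retrieval)
print(file_name)   # example.pdf
print(signed_url)  # https://example.com/example.pdf?sig=abc
```

Raising on a failed response makes the failure visible at this step, rather than later as an indexing error on an empty string.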

Download the Document

Now we are ready to download the Filings document and save it in the downloads folder.

    	
            

import os

def retrieveSaveDoc(fileName, signedUrl, token):
    headers = {
        'clientId': CLIENT_ID,
        'Authorization': "Bearer " + token
    }

    response = requests.get(signedUrl, headers = headers, allow_redirects=True)

    if response.status_code == 200:
        filenameWithDir = './downloads/' + str(fileName)
        os.makedirs(os.path.dirname(filenameWithDir), exist_ok=True)
        # The with-block closes the file once the content is written
        with open(filenameWithDir, 'wb') as docFile:
            docFile.write(response.content)
        print("The document ", fileName, " has been downloaded into downloads subfolder")
        return fileName
    else:
        print("Response code on error is:", str(response.status_code))
        return ''

retrieveSaveDoc(fileName, signedUrl, accessToken)
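For large filings, holding the whole document in memory before writing it may be undesirable. The following sketch writes the body in chunks instead. `save_streamed` is a hypothetical helper, not part of the article's code; it relies only on the `iter_content` method of a requests response, and it assumes the request was made with `stream=True` so that the body is not downloaded up front.

```python
import os


def save_streamed(response, file_name, folder="./downloads", chunk_size=65536):
    """Write a response body to folder/file_name chunk by chunk."""
    os.makedirs(folder, exist_ok=True)
    # basename() drops any directory part, so the file always lands in folder
    target = os.path.join(folder, os.path.basename(str(file_name)))
    with open(target, "wb") as out:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                out.write(chunk)
    return target
```

A possible call, assuming the names used earlier in the article, is `save_streamed(requests.get(signedUrl, stream=True), fileName)`.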

At this point, a filings PDF file is stored in the downloads folder.

 

For more common use case examples that can be implemented analogously in Python, see the Filings Developer Guide, section Use Cases.