The EDP Research API is Refinitiv's aggregate delivery system for providing the buy-side with their entitled sell-side research reports in real time. The system delivers asynchronous updates (alerts) via Amazon’s Simple Queue Service (SQS). A serverless application built on AWS services can receive and process messages from the queue without provisioning or managing servers. In this article, we use a set of AWS services to create such an application for the EDP Research API.

I have structured the article in two parts. Part 1 covers basic information about the EDP Research API and Amazon Web Services, along with setup instructions and the application's workflow. Part 2 covers basic information about AWS Step Functions, along with setup and run instructions.

EDP Research API Overview

The EDP Research API uses an alert mechanism to deliver updates. An application first logs in to the Elektron Data Platform and gets an access token, which is used in every request to the Research API. The application can then use the API to subscribe to Research documents. After that, new updates (alerts) are put in an AWS SQS queue. It is the application’s responsibility to keep polling the queue to get new messages.

You can find more information about the Research API from the following resources.

Amazon Web Services Overview

The application in this article uses several Amazon Web Services. Below are short descriptions and resources for each.

  • AWS Lambda Function

AWS Lambda lets you run code without provisioning or managing servers. AWS Lambda supports multiple languages through the use of runtimes. In this article, we use the Python runtime to execute the application’s Python code.

You can also use AWS Lambda to run your code in response to events, such as changes to data in an Amazon S3 bucket or an Amazon DynamoDB table. According to Using Lambda with Amazon SQS, Amazon SQS can also act as an event source that invokes a Lambda function with an event containing queue messages. However, the SQS queue created by the Research API currently doesn’t support this functionality, so in this article we implement a Lambda function that polls the SQS queue manually.

For more information about AWS Lambda: https://docs.aws.amazon.com/lambda/latest/dg/getting-started.html
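
Manual polling of the kind described above could look roughly like the following sketch. It relies on the boto3 SQS client calls `receive_message` and `delete_message`; the `poll_queue` helper, its parameters and the `endpoint` event field are hypothetical names chosen for illustration, and the sketch leaves out the temporary cloud credentials that the real application obtains from EDP.

```python
def poll_queue(sqs_client, queue_url, max_batches=5, wait_seconds=10):
    """Long-poll an SQS queue and return the bodies of all messages received.

    sqs_client is any object exposing the boto3 SQS client methods
    receive_message and delete_message.
    """
    bodies = []
    for _ in range(max_batches):
        response = sqs_client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,        # SQS returns at most 10 per call
            WaitTimeSeconds=wait_seconds,  # long polling, 0 to 20 seconds
        )
        messages = response.get("Messages", [])
        if not messages:
            break  # nothing available right now; the caller can wait and retry
        for message in messages:
            bodies.append(message["Body"])
            # Delete after reading so the message is not delivered again
            sqs_client.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
    return bodies


def lambda_handler(event, context):
    import boto3  # bundled with the Lambda Python runtime

    # "endpoint" is a hypothetical event field carrying the queue URL
    sqs = boto3.client("sqs", region_name="us-east-1")
    return {"messages": poll_queue(sqs, event["endpoint"])}
```

Passing the client in as an argument keeps the polling loop independent of how the credentials are obtained, which also makes it easy to exercise with a stub.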

  • AWS Systems Manager Parameter Store (SSM Parameter Store):

AWS Systems Manager Parameter Store provides secure, hierarchical storage for configuration data management and secrets management. You can store data such as passwords, database strings, and license codes as parameter values. The value can be stored as plain text or encrypted data.

In this article, we use SSM Parameter Store to store the username, password, UUID and access token used by the application. SSM Parameter Store also records the Last Modified Date of each parameter, so we can use this timestamp to check whether the EDP access token has expired.

For more information about SSM Parameter Store: https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-parameter-store.html
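
The expiry check based on the Last Modified Date could be sketched as follows. The token lifetime is passed in as an assumption (the article does not state it), and the helper names and the 30-second safety margin are hypothetical. The `get_parameter` call and the `LastModifiedDate` field of its response are part of the boto3 SSM client.

```python
from datetime import datetime, timedelta, timezone


def is_token_expired(last_modified, lifetime_seconds, now=None, margin_seconds=30):
    """Return True when a token stored at last_modified should be refreshed."""
    now = now or datetime.now(timezone.utc)
    expires_at = last_modified + timedelta(seconds=lifetime_seconds)
    # Refresh slightly early so a token does not expire mid-request
    return now >= expires_at - timedelta(seconds=margin_seconds)


def stored_token_expired(ssm_client, lifetime_seconds, name="EDPAccessToken"):
    """Look up the parameter and compare its LastModifiedDate with the lifetime."""
    parameter = ssm_client.get_parameter(Name=name)["Parameter"]
    # boto3 returns LastModifiedDate as a timezone-aware datetime
    return is_token_expired(parameter["LastModifiedDate"], lifetime_seconds)
```

This works only if the access token parameter is rewritten every time a new token is issued, so that its timestamp tracks the token's age.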

  • Amazon Simple Storage Service (Amazon S3):

Amazon S3 is an object storage service. It is built around the concept of a “bucket”, which is a container for objects stored in Amazon S3. Every object is contained in a bucket.

For more information about S3: https://aws.amazon.com/s3/
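
As a rough illustration of the bucket and object model, the sketch below writes one text document into a bucket with the boto3 `put_object` call. The `store_document` helper and the `<doc_id>.txt` key scheme are assumptions made for this example, not the naming used by the downloadable application.

```python
def store_document(s3_client, bucket, doc_id, text):
    """Store one document as a UTF-8 text object and return its key."""
    key = f"{doc_id}.txt"  # hypothetical key scheme: one object per document ID
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=text.encode("utf-8"),
        ContentType="text/plain; charset=utf-8",
    )
    return key
```

With a real client this would be called as `store_document(boto3.client("s3"), bucket_name, doc_id, text)`, where the bucket already exists.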

  • AWS Step Functions:

AWS Step Functions is a service that lets you coordinate multiple AWS services into a serverless workflow. Workflows are made up of a series of steps, with the output of one step acting as input to the next. It translates the application’s workflow into a state diagram that is easy to understand and monitor.

In this article, we use AWS Step Functions to chain the Lambda functions together following the Research API workflow: get an access token, subscribe to Research and poll SQS. Below is the state diagram generated by Step Functions for the application. At run time, it displays the step currently executing and logs the status and result of each step.

For more information about AWS Step Functions: https://aws.amazon.com/step-functions/
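
To give a feel for how such a workflow is expressed, here is a minimal sketch that builds an Amazon States Language definition as a Python dictionary. It is not the definition used in part 2. The state names `hasMessages`, `wait` and `done`, the output fields `$.hasMessages` and `$.stop`, and the ARNs are hypothetical, and both the per-document fan-out and the data passing between states are omitted.

```python
import json


def build_definition(lambda_arns, wait_seconds=30):
    """Build an Amazon States Language definition as a Python dict.

    lambda_arns maps each Lambda function name to its ARN.
    """
    def task(name, next_state):
        return {"Type": "Task", "Resource": lambda_arns[name], "Next": next_state}

    return {
        "StartAt": "getEDPToken",
        "States": {
            "getEDPToken": task("getEDPToken", "subscribeResearch"),
            "subscribeResearch": task("subscribeResearch", "getCloudCredential"),
            "getCloudCredential": task("getCloudCredential", "getAlertMessage"),
            "getAlertMessage": task("getAlertMessage", "hasMessages"),
            "hasMessages": {
                "Type": "Choice",
                # Both fields are hypothetical and must be present in the
                # getAlertMessage output for the rules to be evaluated
                "Choices": [
                    {"Variable": "$.stop", "BooleanEquals": True, "Next": "done"},
                    {"Variable": "$.hasMessages", "BooleanEquals": True,
                     "Next": "refreshToken"},
                ],
                "Default": "wait",
            },
            "wait": {"Type": "Wait", "Seconds": wait_seconds, "Next": "getAlertMessage"},
            "refreshToken": task("refreshToken", "downloadDocuments"),
            "downloadDocuments": task("downloadDocuments", "getAlertMessage"),
            "done": {"Type": "Succeed"},
        },
    }


def undefined_targets(definition):
    """Return transition targets that do not name a state in the definition."""
    states = definition["States"]
    targets = [definition["StartAt"]]
    for state in states.values():
        targets += [state[key] for key in ("Next", "Default") if key in state]
        targets += [rule["Next"] for rule in state.get("Choices", [])]
    return sorted({target for target in targets if target not in states})
```

Calling `json.dumps(build_definition(arns), indent=2)` produces the JSON text that a state machine definition expects, and `undefined_targets` is a quick sanity check that every transition points at an existing state.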

Application’s workflow

The application is implemented in Python and is based on the sample application in the News and Research alerts in Python tutorial. It repeatedly polls SQS for new updates. Because Lambda is intended for small, simple functions, I split the implementation into multiple Lambda functions, one per step, and integrated them with AWS Step Functions.

  1. Step Functions invokes a Lambda function, “getEDPToken”, which gets the EDP username and password from AWS Systems Manager Parameter Store (SSM Parameter Store) and then uses them to get an EDP access token from the Elektron Data API service. The access token is stored back in the Parameter Store. Below is a code snippet from the “getEDPToken” Lambda function that gets the parameter values from SSM Parameter Store.

 

            import boto3

            # Get parameters' values from SSM Parameter Store
            client = boto3.client('ssm')
            response = client.get_parameters(
                Names=['EDPUsername', 'EDPPassword', 'EDPClientId'],
                # Use True instead if the parameters are stored as SecureString
                WithDecryption=False
            )

            # Index the returned parameters by name for direct lookup
            params = {p['Name']: p['Value'] for p in response['Parameters']}
            username = params['EDPUsername']
            password = params['EDPPassword']
            clientId = params['EDPClientId']
  2. A Lambda function, “subscribeResearch”, is invoked to subscribe to Research alerts and then passes the encryption key and endpoint to the next function. The function first unsubscribes any remaining subscription to prevent a duplicate subscription error.
  3. With the endpoint, a Lambda function, "getCloudCredential", requests cloud credentials from the EDP service.
  4. The application repeatedly polls the SQS queue to check whether there is a new alert message and, if so, gets the Research document ID from it.
  5. A Lambda function, “getAlertMessage”, checks whether messages containing document IDs are available in the queue. If so, the function passes the list of document IDs to the next function to download the documents. If not, a Wait X seconds state is entered to wait for the next polling interval.
  6. A Lambda function, “refreshToken”, is invoked before the download documents state to refresh the token if the stored access token has expired. The “downloadDocuments” function is then invoked for each document ID to download the data from EDP and store the file in AWS S3.
  7. The Research API supports two result types: text or pdf. The application in this article gets Research documents in text format. You can change the type to pdf in the "downloadDocuments" Lambda function. Below is the sample code for pdf.
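import boto3
import requests

#=============================================================================
def downloadDocument(id, docUrl, outputBucket):
    # Stream the PDF from the document URL straight into S3
    s3 = boto3.client('s3')
    response = requests.get(docUrl, stream=True)
    # Stop on an HTTP error rather than uploading an error body as a PDF
    response.raise_for_status()
    s3.upload_fileobj(response.raw, outputBucket, id + ".pdf")

#=============================================================================
def getDocumentUrl(token, docID, uId):
    # Request the document in PDF format instead of text
    document_type = "/pdf"

    p = {'uuid': uId}

    # document_URL is not defined in this snippet; it is expected to be set
    # elsewhere in the function's source
    RESOURCE_ENDPOINT = document_URL + docID + document_type

...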

Below is the connectivity diagram describing how the application integrates with other Amazon Web Services and Elektron Data API.

Environment setup

  1. To use Lambda and other AWS services, you first need an AWS account and an IAM user. The following setup information is quoted from the Get Started with Lambda page. Please follow the instructions if you do not have an AWS account.

"To use Lambda and other AWS services, you need an AWS account. If you don't have an account, visit aws.amazon.com and choose Create an AWS Account. For detailed instructions, see Create and Activate an AWS Account.

As a best practice, you should also create an AWS Identity and Access Management (IAM) user with administrator permissions and use that for all work that does not require root credentials. Create a password for console access, and access keys to use command line tools. See Creating Your First IAM Admin User and Group in the IAM User Guide for instructions."

  2. Download and install the AWS Command Line Interface (CLI), which is used to deploy the Lambda functions in this article.
  3. Set up your AWS credentials in the AWS CLI.

First, you need to get your access key ID and secret access key. You can follow the instructions in this guide to get your credential information. Then run the configure command to set the region, access key ID and secret access key.

As for the default region, all resources are created in “us-east-1” because the Research API creates its SQS queue in that region. To avoid cross-region data transfer costs, we use this region for all Amazon Web Services.

>aws configure
AWS Access Key ID [None]: accesskey
AWS Secret Access Key [None]: secretkey
Default region name [None]: us-east-1
Default output format [None]:
  4. The application reads its settings, including the EDP username, password, access token and refresh token, from SSM Parameter Store, so you need to create the following parameters in the AWS Console.
    1. EDPUsername
    2. EDPPassword
    3. EDPClientId
    4. UUID
    5. EDPAccessToken
    6. EDPRefreshToken
    7. BucketStorage

Once you have created all parameters, the console screen looks like the one below.

  5. The application downloads Research documents and then stores them as file objects in an AWS S3 bucket. You need to create a bucket in the AWS Console.
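
The parameters listed in the setup steps above are created in the console in this article, but they could also be created from a script. The following is a minimal sketch using the boto3 `put_parameter` call. The `create_parameters` helper and the sample values are hypothetical. Parameters are written as plain `String` by default because the “getEDPToken” snippet reads them with `WithDecryption=False`.

```python
def create_parameters(ssm_client, values, secure_names=()):
    """Create or overwrite each parameter and return the names written."""
    written = []
    for name, value in values.items():
        ssm_client.put_parameter(
            Name=name,
            Value=value,
            # SecureString values are encrypted at rest and must be read
            # back with WithDecryption=True
            Type="SecureString" if name in secure_names else "String",
            Overwrite=True,
        )
        written.append(name)
    return written


# Placeholder values for illustration only; replace them with your own
SAMPLE_VALUES = {
    "EDPUsername": "user@example.com",
    "EDPPassword": "change-me",
    "EDPClientId": "change-me",
    "UUID": "change-me",
    "EDPAccessToken": "placeholder",
    "EDPRefreshToken": "placeholder",
    "BucketStorage": "my-research-bucket",
}
```

With a real client this would be called as `create_parameters(boto3.client("ssm", region_name="us-east-1"), SAMPLE_VALUES)`.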

 

Create Lambda Functions

Next, we create the Lambda functions from deployment packages using the AWS Command Line Interface (CLI). The deployment packages and installation scripts can be downloaded from GitHub. After extracting the file, you will see the following structure.

“install.ps1” is a PowerShell script that can be used for installation. You can run the script to set up all Lambda functions. Otherwise, follow the instructions below.

First, open a command line and change the current directory to the location of the extracted files.

  1. IAM Role for AWS service access

An IAM role defines permission policies for an account. As you already know, the application accesses various services, so we need to create an IAM role that contains permission policies for those services.

  • Create IAM Role

This step returns the ARN of the created role. You will need the role’s ARN to create the Lambda functions in the next step.

aws iam create-role --role-name lambda-sqs-ssm --assume-role-policy-document file://role.json

 

Below is a sample of the ARN in the returned message.

  • Attach Role Policy
aws iam attach-role-policy --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess --role-name lambda-sqs-ssm
aws iam attach-role-policy --policy-arn arn:aws:iam::aws:policy/AmazonSSMFullAccess --role-name lambda-sqs-ssm
aws iam attach-role-policy --policy-arn arn:aws:iam::aws:policy/AWSLambdaExecute --role-name lambda-sqs-ssm
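
The same role setup could be scripted with boto3, roughly as sketched below. The trust policy shown is the usual one that allows the Lambda service to assume a role. The contents of the role.json file in the download are not reproduced in this article, so treat this policy as an assumption rather than a copy of that file. `create_lambda_role` is a hypothetical helper name.

```python
import json

# Assumed trust policy: lets the Lambda service assume the role
LAMBDA_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# The same managed policies that the CLI commands attach
MANAGED_POLICIES = [
    "arn:aws:iam::aws:policy/AmazonS3FullAccess",
    "arn:aws:iam::aws:policy/AmazonSSMFullAccess",
    "arn:aws:iam::aws:policy/AWSLambdaExecute",
]


def create_lambda_role(iam_client, role_name="lambda-sqs-ssm"):
    """Create the role, attach the managed policies and return the role ARN."""
    response = iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(LAMBDA_TRUST_POLICY),
    )
    for policy_arn in MANAGED_POLICIES:
        iam_client.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    return response["Role"]["Arn"]
```

The returned ARN is the value needed when creating the Lambda functions.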

 

  2. Create Lambda Functions

Replace $arn_info with the ARN of the IAM role created in the previous step. If the python3.7 runtime is no longer available in your account, substitute a currently supported Python runtime.

aws lambda create-function --function-name getEDPToken --runtime python3.7 --role $arn_info --handler lambda_function.lambda_handler --timeout 5 --zip-file fileb://getEDPToken.zip --region us-east-1

aws lambda create-function --function-name subscribeResearch --runtime python3.7 --role $arn_info --handler lambda_function.lambda_handler --timeout 5 --zip-file fileb://subscribeResearch.zip --region us-east-1

aws lambda create-function --function-name getCloudCredential --runtime python3.7 --role $arn_info --handler lambda_function.lambda_handler --timeout 3 --zip-file fileb://getCloudCredential.zip --region us-east-1

aws lambda create-function --function-name getAlertMessage --runtime python3.7 --role $arn_info --handler lambda_function.lambda_handler --timeout 10 --zip-file fileb://getAlertMessage.zip --region us-east-1

aws lambda create-function --function-name refreshToken --runtime python3.7 --role $arn_info --handler lambda_function.lambda_handler --timeout 5 --zip-file fileb://refreshToken.zip --region us-east-1

aws lambda create-function --function-name downloadDocuments --runtime python3.7 --role $arn_info --handler lambda_function.lambda_handler --timeout 10 --zip-file fileb://downloadDocuments.zip --region us-east-1

 

After this step, you will see the list of Lambda functions in the AWS Console GUI.
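
For reference, the CLI commands above could be replaced by a loop over the boto3 `create_function` call, as in the sketch below. The function names and timeouts mirror the commands in this article. The helper names are hypothetical, and the default runtime simply repeats the one used above and may need updating.

```python
# Function names and timeouts (seconds) taken from the CLI commands above
FUNCTION_TIMEOUTS = {
    "getEDPToken": 5,
    "subscribeResearch": 5,
    "getCloudCredential": 3,
    "getAlertMessage": 10,
    "refreshToken": 5,
    "downloadDocuments": 10,
}


def create_functions(lambda_client, role_arn, read_zip, runtime="python3.7"):
    """Create every function; read_zip(name) must return the package bytes."""
    created = []
    for name, timeout in FUNCTION_TIMEOUTS.items():
        lambda_client.create_function(
            FunctionName=name,
            Runtime=runtime,
            Role=role_arn,
            Handler="lambda_function.lambda_handler",
            Timeout=timeout,
            Code={"ZipFile": read_zip(name)},
        )
        created.append(name)
    return created


def read_zip_from_disk(name):
    # Expects <name>.zip in the current directory, as in the CLI commands
    with open(name + ".zip", "rb") as handle:
        return handle.read()
```

With a real client this would be called as `create_functions(boto3.client("lambda", region_name="us-east-1"), role_arn, read_zip_from_disk)` from the directory containing the extracted packages.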

 

Custom Python Library in Lambda Function

A Lambda function generally runs in a dedicated environment. If your function depends on libraries other than the SDK for Python (Boto 3), you need to create a deployment package that includes those libraries. In this article, the Lambda functions depend on the requests and pycryptodome libraries for REST API calls and decryption. For more information, please refer to the Updating a Function with Additional Dependencies section in AWS Lambda Deployment Package in Python.
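
As an illustration of what such a deployment package contains, the sketch below zips a directory so that its files sit at the root of the archive. It assumes the libraries were first installed next to lambda_function.py, for example with `pip install --target <directory> requests pycryptodome`. `build_package` is a hypothetical helper and not part of the downloadable scripts.

```python
import os
import zipfile


def build_package(source_dir, zip_path):
    """Zip the contents of source_dir so files sit at the archive root.

    zip_path should be outside source_dir so the archive does not
    include itself.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(source_dir):
            for filename in files:
                full_path = os.path.join(root, filename)
                # Store paths relative to source_dir, not absolute paths
                archive.write(full_path, os.path.relpath(full_path, source_dir))
    return zip_path
```

The resulting file can be passed to `--zip-file fileb://...` in the same way as the prebuilt packages.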

 

Conclusion

This article provides a basic understanding of the Research API and a set of Amazon Web Services. It also describes the application workflow, which interacts with various services. Finally, it provides instructions for setting up the environment and the Lambda functions in your AWS account. The next part of this article puts all the created serverless services together with AWS Step Functions.