Input Headers

Note to internal Intelligent Tagging users (internal customers who do not connect through API Gateway) - A different set of input headers is supported for internal Intelligent Tagging. For details, please see the section about Request Headers in the Supplementary Guide for Internal Intelligent Tagging (for internal customers who do not connect through API Gateway).

Additional resource: API User Guide.

The input content sent to Intelligent Tagging is accompanied by a set of parameters specified as key-value pairs in the HTTP headers of the request. The parameters must be sent as US-ASCII characters. Header names and values are not case sensitive. If an HTTP header contains a non-US-ASCII character, the client application must encode it before sending the header to Intelligent Tagging.

The request is an HTTP POST with the following headers:

  • Content-Type (MANDATORY): Indicates the input mime type.

  • omitOutputtingOriginalText: Excludes the original text from the output. Highly recommended for large input files.

  • outputFormat: Defines the output format.

  • x-ag-access-token (MANDATORY for hosted Intelligent Tagging and for Open Calais): The value of this header is your license key. (For Intelligent Tagging On Premise, this header is not supported and not relevant.)

  • x-calais-contentClass: Specifies the genre of the input document. Highly recommended for optimal extraction when input files are research reports in PDF format (relevant to premium users), transcripts, or news stories. (The Research and Transcripts content classes are available to premium users.)

  • x-calais-DocumentTitle: (Relevant when the input content type is text/raw.) For best results when tagging text files, use this header to specify the title of the document.

    Tip: When tagging XML files, you can specify the title by using the Title tag; when tagging PDF files you can specify the title by adding a Title metadata field in the PDF document.

  • x-calais-EnableTickerExtraction (Available to Premium Users): Extends company tagging by also extracting companies based on ticker mentions in the text. 

  • x-calais-EnableTickerRecallOriented (Available to Premium Users): Triggers recall-oriented company tagging for research content such as analyst emails. (This feature is not optimized for documents like long research reports.) This header must be enabled in addition to the x-calais-EnableTickerExtraction header.

  • x-calais-language: Indicates the language of the input text. Overrides the automatic language detection functionality. Highly recommended to use this header when tagging non-English language input, short texts, or texts containing many non-letter symbols.

  • x-calais-pdftagzone (Available to Premium Users): (This header applies to PDF input files only.) (This header is supported only when the x-calais-contentClass is “research.”) Extends tagging to tables in PDF documents. By default, the tagging mechanism does not parse tables.

  • x-calais-selectiveTags: (MANDATORY for new users. Soon-to-be mandatory for all users.) Determines which of the Intelligent Tagging processes are triggered. Use this header to specify all of the metadata types that are relevant to your use case.

  • x-calais-socialTagsImportanceThreshold: Lets you exclude social tags with importance scores below a specified threshold from the output.

  • x-calais-socialTagsResultSize: Limits the total number of social tags in the output.

  • x-calais-source (Available to Premium Users): Use this header to specify the source of the document, for optimized extraction.

  • x-calais-SuppressSocialTagsFin: Excludes from the output specific generic social tags that do not add value when tagging financial documents.

  • x-calais-UseDisclosureExtraction (Available to Premium Users): Uses the information found in the disclosure section of research reports to enhance company tagging. This header can be used when the x-calais-contentClass is Research.

  • x-calais-useSlugline: This header is soon to be deprecated. Instead, use the x-calais-selectiveTags header to trigger slugline tagging. (Slugline tagging is a premium feature.)
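
As a rough illustration of the US-ASCII rule stated above, the following Python sketch assembles a header set and flags any header whose name or value would need encoding before submission. The function name non_ascii_headers and the token placeholder are hypothetical. This page does not state which encoding scheme to apply to a flagged value, so the sketch only detects the problem and leaves the encoding step to the client.

```python
def non_ascii_headers(headers):
    """Return the names of headers whose name or value is not pure US-ASCII."""
    return [
        name
        for name, value in headers.items()
        if not (name.isascii() and str(value).isascii())
    ]


headers = {
    "Content-Type": "text/raw",
    "x-ag-access-token": "YOUR-LICENSE-KEY",  # placeholder, not a real key
    "x-calais-DocumentTitle": "Ségolène Royal interview",
}

# The title contains accented characters, so it is reported as needing encoding.
print(non_ascii_headers(headers))
```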

 

 

Content-Type (Mandatory)

Indicates the input content type (mime type). Intelligent Tagging processes the input documents according to the value of this parameter for optimal metadata extraction.

Values:
  • text/html: Use this value when submitting web pages.
  • text/xml: Use this value when submitting XML content.
  • text/raw: Use this value when submitting clean, unformatted text.

  • application/pdf: Use this value when submitting PDF files as binary streams. This value is available to Premium Intelligent Tagging users. Please make sure that your PDF files contain text objects; Intelligent Tagging does not extract text from images in PDF files.

Default:  None
Remarks: 
  • This is a mandatory parameter.
  • We recommend that before submission, you remove from the input document any redundant or irrelevant text (such as ads, disclaimers, repeated generic text such as “contact customer support for further advice…,” trademarks, etc.).
  • Text content should be UTF-8 encoded; otherwise, specify charset, e.g. text/xml; charset=utf-8.
  • Note that if your text includes accented characters, for example, "Ségolène Royal," and you do not set the encoding to UTF-8, the Intelligent Tagging output strips these characters, corrupting the original text.
  • Intelligent Tagging expects the url-encoded arguments to be encoded using UTF-8. HttpClient defaults to another encoding, so you must instruct it to use UTF-8 for proper url-encoding of your arguments.
  • For binary documents (e.g. PDF) the http body should include the binary stream.
  • To optimize tagging of research reports (in PDF format only), make sure to also define the x-calais-contentClass header for best results.
  • To optimize tagging of text files, you can define the x-calais-DocumentTitle header for best results.
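
The encoding remarks above can be sketched as follows, assuming a Python client. The helper name prepare_text_submission is hypothetical, and attaching the charset parameter to text/raw is an assumption extrapolated from the text/xml example above. The sketch declares the charset in Content-Type and encodes the body as UTF-8 so that accented characters are preserved.

```python
TEXT_CONTENT_TYPES = {"text/html", "text/xml", "text/raw"}


def prepare_text_submission(text, content_type="text/raw"):
    """Return (headers, body) for a text submission encoded as UTF-8."""
    if content_type not in TEXT_CONTENT_TYPES:
        raise ValueError(f"not a text content type: {content_type}")
    # Declare the charset explicitly so it matches the bytes actually sent.
    headers = {"Content-Type": f"{content_type}; charset=utf-8"}
    body = text.encode("utf-8")
    return headers, body


headers, body = prepare_text_submission("Ségolène Royal spoke in Paris.")
```

Binary input such as PDF is deliberately rejected here, because it is sent as a raw byte stream rather than as encoded text.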

 

 

omitOutputtingOriginalText

Use this parameter to exclude the submitted text from the output, thus reducing the size of the response.

Values: true, false
Default: false (the original text is included in the output)
Remarks:
  • By default, Intelligent Tagging returns the submitted text. Set this parameter to true to exclude the original text from the output.
  • It is highly recommended to use this header for large input files.

 

 

outputFormat

Defines the output response format (mime type).

Values:
  • xml/rdf
  • application/json
  • text/n3
Default:      xml/rdf
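
A minimal sketch of setting the two output-related headers described above (outputFormat and omitOutputtingOriginalText), assuming a Python client. The helper name output_headers is hypothetical; the allowed values are the ones listed on this page.

```python
OUTPUT_FORMATS = ("xml/rdf", "application/json", "text/n3")


def output_headers(output_format="xml/rdf", omit_original_text=False):
    """Build the headers that control the response format and size."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported outputFormat: {output_format}")
    return {
        "outputFormat": output_format,
        "omitOutputtingOriginalText": "true" if omit_original_text else "false",
    }


# JSON output without the echoed input text, as recommended for large files.
large_file_headers = output_headers("application/json", omit_original_text=True)
```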

 

 

x-ag-access-token

This header is mandatory for hosted Intelligent Tagging and for Open Calais.

For Intelligent Tagging On Premise, this header is not supported and not relevant.

Use this header to pass the license key (token) which grants you access to Intelligent Tagging and defines your submission capacity rights.

Value:

Your license key.

For Open Calais, if you do not already have a license key, you can register for MyRefinitiv and then log in to https://PermID.org with your new credentials. An Open Calais API key is automatically e-mailed to you.

For premium Intelligent Tagging, if you do not already have a license key, contact us.

Default:  None
Remarks: 

This is a mandatory parameter for hosted Intelligent Tagging and for Open Calais.

You should note the allowed submission size and rate for the token and adhere to them in your code; if you exceed these limits, your submissions can be blocked automatically for a certain period of time.
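
One way to adhere to the limits in client code is to space out submissions and check the payload size before sending. The sketch below is an illustration only: the class and function names are hypothetical, and the limit values shown are invented placeholders, since the actual size and rate allowances depend on your token.

```python
import time


class SubmissionThrottle:
    """Space out submissions so they stay under a requests-per-second limit."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next submission may start

    def wait(self):
        """Block until the next submission is allowed, then reserve a slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval


def within_size_limit(body, max_bytes):
    """Return True if the request body does not exceed the allowed size."""
    return len(body) <= max_bytes


throttle = SubmissionThrottle(max_per_second=2)  # placeholder limit
```

Call throttle.wait() immediately before each submission, and skip or split any document for which within_size_limit returns False.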

 

 

x-calais-contentClass

Lets you specify the genre of the input files, to optimize extraction.

Values:
  • news – Define this value when input files are news stories.
  • research – This value (available to premium users) triggers optimal tagging when input files are research reports in PDF format.
  • transcripts – Enables the classification topics that are optimized for transcript documents in XML format. Relevant to premium users.
Default:  news
Remarks: 
  • For best quality output, it is highly recommended to use this header when the input files are research reports in PDF format (relevant to premium users), transcript documents in XML format, or news stories.
  • Currently, Intelligent Tagging applies contentClass news by default if the header is not defined. However, our best-practice recommendation is to define this header anyway, as the default may change.

 

 

x-calais-DocumentTitle

Use this header to specify the title of the document, to optimize tagging output for text files.

Value: The document title, taken from the input file.
Default: None
Remarks:
  • For best results, we recommend using this header when tagging text files.
  • This header is supported only for text files, and therefore must be used only when Content-Type is text/raw. However, when processing XML files, you can specify the title with the Title tag, and when processing PDF files, you can specify the title by adding a Title metadata field in the PDF document.
  • If you refer to this document after tagging it, it is important to refer to the contents of the c:document tag in the tagging output, which includes both the title and body of the document, and not to the input file which no longer includes the title. This is because the offsets that indicate the location of tagged entities are based on the c:document tag contents.

 

 

x-calais-EnableTickerExtraction (Available to Premium Users)

Extends company tagging by also extracting companies based on ticker mentions in the text. 

Please note that Intelligent Tagging identifies and extracts primary tickers mentioned in the text. For example, “Falabella,” the primary ticker of the company SACI Falabella, is identified as a company mention, whereas “FALABE-OSA,” a non-primary ticker, is not extracted.

Values: True, False
Default: False
Remarks:
  • Supported for English language input only.

  • Supported for all content types (text/html, text/xml, text/raw, application/pdf).

  • Supported for all content classes (research, news).
  • Company tagging must also be enabled. 

 

 

x-calais-EnableTickerRecallOriented (Available to Premium Users)

Expands the coverage of company tagging, so that mentions of both primary and non-primary tickers in the text are identified as companies.

Values: True, False (True activates recall-oriented ticker extraction)
Default: False
Remarks:
  • Supported for English language input only.

  • Supported for content types: text/xml, application/pdf.

  • Tested on research content such as analyst emails. (Not optimized for documents such as long research reports.)

  • The x-calais-EnableTickerExtraction header must be defined as well. Otherwise company tagging based on ticker mentions in the text is not triggered.
  • Company tagging must also be enabled.
  Important: To trigger recall-oriented ticker extraction optimized for research content such as analyst emails, you must ALSO define the x-calais-EnableTickerExtraction header. Both headers are mandatory for this workflow. If you define the x-calais-EnableTickerRecallOriented header but don’t define the x-calais-EnableTickerExtraction header, there will be no ticker extractions (company extractions based on ticker mentions in the text) at all.
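
The dependency between the two ticker headers can be checked on the client before a request is sent. The following Python sketch is one possible pre-flight check under the rules stated above; the function name ticker_header_problems is hypothetical. Header names and values are compared in lowercase because they are not case sensitive.

```python
def ticker_header_problems(headers):
    """Return a list of rule violations for the ticker-related headers."""
    h = {name.lower(): str(value).lower() for name, value in headers.items()}
    recall = h.get("x-calais-enabletickerrecalloriented") == "true"
    extraction = h.get("x-calais-enabletickerextraction") == "true"

    problems = []
    if recall and not extraction:
        problems.append("x-calais-EnableTickerExtraction must also be True")
    if recall:
        # Ignore any parameters such as "; charset=utf-8".
        content_type = h.get("content-type", "").split(";")[0].strip()
        if content_type not in ("text/xml", "application/pdf"):
            problems.append(
                "recall-oriented extraction supports text/xml and application/pdf only"
            )
    return problems


problems = ticker_header_problems({
    "Content-Type": "text/xml",
    "x-calais-EnableTickerRecallOriented": "True",
})
```

In this example, problems contains one entry, because x-calais-EnableTickerExtraction is missing.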

 

 

x-calais-language

Specifies the language of the input text. You can use this header to override the automatic language detection functionality if you know the language you are submitting.

It is highly recommended to use this header when tagging non-English language input, short texts, or texts with many non-letter symbols, all of which reduce the accuracy of automatic language identification.

Values:   The full name of the language, in English:
  • English

  • Chinese

  • French

  • German

  • Japanese

  • Spanish

Default: None

 

 

x-calais-pdftagzone (Available to Premium Users)

Use this header to extend tagging to tables in PDF documents. By default, the tagging mechanism does not parse tables.

Values: True, False
Default: False
Remarks:   

This header applies to PDF input files only. This header is supported only when x-calais-contentClass is “research.”

Support for PDF files is available to premium users, as is support for the content class, Research.
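
A minimal sketch of a PDF submission with table tagging enabled, using only the Python standard library. The endpoint URL is a hypothetical placeholder (substitute the URL of your deployment), and the helper name build_pdf_request is likewise invented for illustration. The request is built but not sent, so the sketch runs offline.

```python
import urllib.request

ENDPOINT = "https://tagging.example.com/calais"  # hypothetical placeholder URL


def build_pdf_request(pdf_bytes, token, tag_tables=True):
    """Build a POST request that submits a PDF as a binary stream."""
    headers = {
        "Content-Type": "application/pdf",
        "x-ag-access-token": token,
        "x-calais-contentClass": "research",  # required for x-calais-pdftagzone
        "x-calais-pdftagzone": "True" if tag_tables else "False",
    }
    return urllib.request.Request(
        ENDPOINT, data=pdf_bytes, headers=headers, method="POST"
    )


request = build_pdf_request(b"%PDF-1.7 ...", "YOUR-LICENSE-KEY")
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

Note that urllib normalizes the capitalization of header names it stores (for example, Content-type), which is harmless here because header names are not case sensitive.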

 

 

x-calais-selectiveTags

(MANDATORY for new users. Soon-to-be mandatory for all users.) Determines which of the Intelligent Tagging processes are triggered. Use this header to specify all of the metadata types that are relevant to your use case.

Read this document for details about implementing the x-calais-selectiveTags header, and for a list of valid values.

 

 

x-calais-socialTagsImportanceThreshold

The importance attribute of the SocialTag indicates how centric the topic is to the document as a whole (very centric, somewhat centric, less centric).

Use this header to exclude social tags with importance scores below a specified threshold from the output.

Values:
  • 1 – Only the social tags with importance level 1 (very centric) are included in the output.
  • 2 – Social tags assigned importance levels 1 (very centric) and 2 (somewhat centric) are included in the output.
  • 3 – All social tags are included in the output. (The same result as not using this header at all.)
Default: None
Remarks:          
  • If the number of SocialTags with importance levels above the selected threshold is greater than the maximum defined by the x-calais-socialTagsResultSize header, the number of SocialTags included in the output is limited to that maximum.
  • SocialTags tagging must also be enabled.

 

 

x-calais-socialTagsResultSize

Limits the total number of social tags in the output.

Intelligent Tagging assigns a relevance score to each SocialTag. The relevance score is a measure of how centric the topic is to the document as a whole and is more granular than the importance score. This relevance score is not an attribute of the SocialTag. It is invisible to the user, and is used by the application to rank all of the SocialTags from highest to lowest relevance.

When you use this header, the tags with the highest relevance scores are selected first for inclusion in the output.

Values: Any whole number from 0 to 500.
Default: None
Remarks:  

SocialTags tagging must also be enabled.

If you also define the x-calais-socialTagsImportanceThreshold header:

  • The output is limited to SocialTags with importance levels above the selected threshold.

  • The output is limited to the number of SocialTags defined by the x-calais-socialTagsResultSize header.

If the number of SocialTags above the selected threshold is greater than the limit set by the x-calais-socialTagsResultSize header, the internal relevance score is used to identify the highest ranking tags for inclusion.
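
The interaction of the two headers can be modeled as a filter followed by a ranked cut-off. The Python sketch below only illustrates the selection behavior described above; it is not the service's implementation. The function name, tag names, and relevance values are invented for the example, because the real relevance score is internal and never exposed in the output.

```python
def select_social_tags(tags, importance_threshold=None, result_size=None):
    """Model the selection: filter by importance, then keep the top N by relevance.

    tags is a list of (name, importance, relevance) tuples, where importance
    is 1 (very centric), 2 (somewhat centric), or 3 (less centric).
    """
    selected = [
        tag for tag in tags
        if importance_threshold is None or tag[1] <= importance_threshold
    ]
    # Highest relevance first, so truncation keeps the best-ranked tags.
    selected.sort(key=lambda tag: tag[2], reverse=True)
    if result_size is not None:
        selected = selected[:result_size]
    return [name for name, _importance, _relevance in selected]


example_tags = [
    ("Europe", 2, 0.60),
    ("Mergers", 1, 0.91),
    ("Weather", 3, 0.20),
    ("Banking", 1, 0.75),
]

top_two = select_social_tags(example_tags, importance_threshold=2, result_size=2)
```

Here three tags pass the threshold of 2, and the result size of 2 then keeps the two with the highest relevance.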

 

 

 

x-calais-source (Available to Premium Users)

This header is used to optimize extraction from news stories and research reports, from a number of different sources. A source may be a news provider (e.g. Reuters News) or an investment bank (e.g. Morgan Stanley).

If the input document is from a supported source, you can use this header to specify the source to trigger optimized extraction.

The following are affected:

  • Zoning – During zoning, generic sections of text that are not relevant to the story (such as headers, footers, disclaimers, etc.) are identified and excluded from processing. For some sources, zoning is optimized.

  • Extraction – Extraction is tailored to the source. For example, when the specified source is Dow Jones, the text “DJ Microsoft” is not recognized as a company. Without the header, “DJ Microsoft” might be extracted as a company.

  • Relevance – If the source company is extracted, the relevance score is usually affected (generally speaking, the source company is considered less relevant to the story). The relevance score assigned to related companies (such as subsidiaries of the company) may be affected as well.

Value:

A valid value is any source code from the list of supported sources. Internal users may view the list of supported sources here.

 

 

x-calais-SuppressSocialTagsFin

Excludes from the output specific generic social tags that do not add value when tagging financial documents.

Values: True, False
Default: False
Remarks:
  • When tagging a corpus of financial documents, we recommend using this header to reduce noise in the output.
  • SocialTags tagging must also be enabled.

This header excludes the following social tags from the output:

  • 9
  • Americas
  • Bank
  • Bond
  • Business
  • Business economics
  • Chicago Board Options Exchange
  • Computing
  • Corporate finance
  • Data
  • Data management
  • Document
  • Documents
  • Donald Trump
  • Dow Jones & Company
  • Economics
  • Economy
  • Economy of Germany
  • Economy of North America
  • Economy of the European Union
  • Equity securities
  • Finance
  • Financial crisis of 2007-2008
  • Financial economics
  • Financial markets
  • Financial services
  • Financial software
  • Funds
  • Geography of Asia
  • Geography of China
  • Geography of Europe
  • Government
  • Income
  • Index
  • Information
  • Investment
  • Knowledge
  • Management
  • Manhattan
  • Market economics
  • Marketing
  • Mathematical finance
  • Matter
  • Money
  • NASDAQ
  • NASDAQ-100
  • New York City
  • New York Stock Exchange
  • Password
  • Profit
  • Revenue
  • RTT
  • S&P 500 Index
  • S&P Dow Jones Indices
  • S&P/TSX 60 index
  • Science and technology in the United States
  • SEC filings
  • Shareholders
  • WWW Hall of Fame
  • Stock
  • Stock market
  • Stock market crashes
  • Stock Market index
  • Technology
  • Television in the United States
  • Transcript
  • United States corporate law
  • Wall Street
  • West
  • World oil market chronology from
  • World Wide Web
  • WWE Hall of Fame
  • Yoy

 

x-calais-UseDisclosureExtraction (Available to Premium Users)

Uses the information found in the disclosure section of research reports to enhance company tagging.
This capability is currently supported for English language input in PDF format.

Values: True, False
Default: False
Remarks:

The following headers and values must also be defined:

  • Content-Type: application/pdf
  • x-calais-contentClass: research
  • x-calais-pdftagzone: True

 

x-calais-useSlugline (soon to be deprecated)

This header is soon to be deprecated. Instead, use the x-calais-selectiveTags header to trigger slugline tagging.