We have all experienced this: we ask Google a question, and Google not only finds exactly what we were looking for, but even “predicts” what we are about to search for as we type. Amazing!
Sometimes I feel as if Google listened to the conversation I just had…
Even more, Google search results are presented in an intuitive and engaging way: an infobox, concise excerpts, and a list of typical questions and answers that you might have meant to ask.
Why don't we have this search experience in our Enterprise applications?
What is it that makes Google understand what we are looking for?
In a professional environment we use these types of search workflows:
Structured workflows (more common in Enterprise applications)
Passive: Pre-defined profile / alerts
- This is when we know what we care about (our Things of Interest) and would like to be alerted whenever content (news, research, or other internal content) is about one of them. For example: I would like to be alerted when a new blog post or story about a spicy Rice recipe is published. A simple keyword search might yield broad and not necessarily relevant results, including Rice fields and agriculture, or other kinds of spicy food.
In this case we would like to get documents that are about both topics: RICE (FOOD) AND SPICY FOOD. (Advanced search engines can also use operators such as NEAR.)
A TOPIC is a word or code that represents a theme that a document is related to or about. The topic is usually part of a taxonomy from a specific domain. For example, the topic Rice could relate to demand for Rice as a tradable commodity, but it could also relate to recipes for food made of Rice. A topic is identified by a supervised machine learning classifier.
Use more topics, fewer keywords. Using topics helps reduce noise (assuming the topics are trained for very specific verticals/domains). I.e. when you use RICE (FOOD) you are unlikely to get results about Rice as a tradable commodity.
This is a mostly passive workflow: you define what you're looking for, and you wait to receive relevant content when it becomes available.
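To make the difference between keyword alerts and topic alerts concrete, here is a minimal sketch. It assumes each document already carries topic codes assigned upstream by a classifier; the `Document` class, the function names, the RICE (COMMODITY) code and the sample documents are all hypothetical and only serve the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    text: str
    # Topic codes are assumed to be assigned upstream by a trained classifier.
    topics: set = field(default_factory=set)


def keyword_match(doc, keywords):
    """Naive keyword search: true if any keyword appears anywhere in the text."""
    text = doc.text.lower()
    return any(k.lower() in text for k in keywords)


def topic_alert_match(doc, required_topics):
    """The alert fires only if the document carries ALL topics in the profile."""
    return required_topics <= doc.topics


# Hypothetical sample content.
docs = [
    Document("Spicy rice bowl", "A spicy rice recipe with chili.",
             {"RICE (FOOD)", "SPICY FOOD"}),
    Document("Rice futures rally", "Rice prices rose on export demand.",
             {"RICE (COMMODITY)"}),
    Document("Hot wings", "Very spicy chicken wings.", {"SPICY FOOD"}),
]

# The pre-defined alert profile: RICE (FOOD) AND SPICY FOOD.
profile = {"RICE (FOOD)", "SPICY FOOD"}

keyword_hits = [d.title for d in docs if keyword_match(d, ["rice", "spicy"])]
topic_hits = [d.title for d in docs if topic_alert_match(d, profile)]

print("keyword hits:", keyword_hits)
print("topic hits:", topic_hits)
```

In this toy data set the keyword search returns all three documents, including the commodity story, while the topic profile returns only the spicy Rice recipe.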
Active framework-based search: Facet Search
- A more active workflow, based on pre-defined facets of topics or entities. As we search, we iteratively select and deselect various combinations of topics, entities or events that we want the resulting content to be about.
This is done through facets in the user interface.
For example: I can select the topic RICE (FOOD) from the RAW FOOD facet, combined with SOUTH AMERICA from the GEOGRAPHY facet, combined with SPICY FOOD from the FLAVORS facet, combined with Chef Gordon Ramsay from the CHEFS facet, etc.
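The select/deselect loop described above can be sketched as a simple filter. This is only an illustration under stated assumptions: documents are plain dictionaries, values within one facet are OR-ed, different facets are AND-ed, and the sample titles and the `facet_search` helper are hypothetical.

```python
def facet_search(docs, selections):
    """Return titles of docs that match every facet with a non-empty selection.

    selections maps a facet name to the set of values currently selected.
    Values inside one facet are OR-ed; different facets are AND-ed.
    """
    hits = []
    for doc in docs:
        matches_all = all(
            doc["facets"].get(facet, set()) & values
            for facet, values in selections.items()
            if values  # an empty selection does not constrain that facet
        )
        if matches_all:
            hits.append(doc["title"])
    return hits


# Hypothetical tagged documents.
docs = [
    {"title": "Peruvian spicy rice",
     "facets": {"RAW FOOD": {"RICE (FOOD)"},
                "GEOGRAPHY": {"SOUTH AMERICA"},
                "FLAVORS": {"SPICY FOOD"}}},
    {"title": "Thai sweet rice",
     "facets": {"RAW FOOD": {"RICE (FOOD)"},
                "GEOGRAPHY": {"ASIA"},
                "FLAVORS": {"SWEET FOOD"}}},
    {"title": "Brazilian corn stew",
     "facets": {"RAW FOOD": {"CORN (FOOD)"},
                "GEOGRAPHY": {"SOUTH AMERICA"},
                "FLAVORS": {"SPICY FOOD"}}},
]

# Step 1: select RICE (FOOD) in the RAW FOOD facet.
selections = {"RAW FOOD": {"RICE (FOOD)"}}
step1 = facet_search(docs, selections)

# Step 2: narrow further by selecting SOUTH AMERICA in the GEOGRAPHY facet.
selections["GEOGRAPHY"] = {"SOUTH AMERICA"}
step2 = facet_search(docs, selections)

# Step 3: deselect the RAW FOOD facet again to widen the result set.
selections["RAW FOOD"] = set()
step3 = facet_search(docs, selections)

print(step1, step2, step3, sep="\n")
```

Each step simply re-runs the filter with the current selections, which is what makes the workflow iterative.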
Less structured workflows are much more challenging.
Active Research Search: A search based partly on asking questions but mostly on typing anything of interest, exploring the results, and then refining the search. While Google can rely on our previous search queries, in Enterprise search we need a workflow that walks us from one potentially interesting result to the next. We need relevant and actionable results that can lead us to our next search. For example: if we search for a spicy Rice recipe, we might be offered to refine our search by specific chefs or specific vegetables used, or be shown similar recipes based on corn or wheat.
What would make the search engine smart enough to make such recommendations?
In this workflow you don’t want to be bound by a specific pre-defined topic taxonomy (which you probably don’t remember by heart anyway).
A search engine that leverages:
- Use of Synonyms
- Synonyms mapped to Topics
- Auto Suggest
- Knowledge Graph
Here are definitions of these terms and of the related types of search.
A SYNONYM is a similar or alternative word for a word you tend to use often. For example: synonyms for Rice could be grain, brown rice, white rice, bran, corn, wheat, rye, etc.
RELATED TERMS are topics or phrases that are related to a term you typically use. For example: related terms for Rice would be Buckwheat, Chinese food, gluten free food, basic food…
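One way to read "synonyms mapped to topics" is a lookup table that turns free-text words into taxonomy topics, so the user does not have to remember the taxonomy. The sketch below assumes such a hand-built table; the table contents and the `query_to_topics` helper are hypothetical and not taken from any particular product.

```python
import re

# Hypothetical synonym table: free-text phrase -> taxonomy topic.
SYNONYM_TO_TOPIC = {
    "rice": "RICE (FOOD)",
    "brown rice": "RICE (FOOD)",
    "white rice": "RICE (FOOD)",
    "spicy": "SPICY FOOD",
    "hot": "SPICY FOOD",
    "chili": "SPICY FOOD",
}


def query_to_topics(query, synonym_to_topic=SYNONYM_TO_TOPIC):
    """Map a free-text query to the set of topics whose synonyms it mentions."""
    text = query.lower()
    found = set()
    for phrase, topic in synonym_to_topic.items():
        # Whole-word match, so that "price" does not trigger "rice".
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            found.add(topic)
    return found


print(query_to_topics("hot brown rice dinner"))
print(query_to_topics("price list"))
```

The resulting topics can then be used in the search instead of, or alongside, the raw keywords.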
KEYWORD SEARCH: searches for mentions of words, but does not take context or relevance into account.
AUTO-SUGGEST – as you type, the search interface suggests options you might have meant to search for (predicting your intent based on your profile, on previous searches you made, on what most others searched for, and on the matches in the universe of documents available). Example: if you previously searched for vegan food, then as you start typing the word Rice, auto-suggest might include suggestions like vegan rice recipes.
This can also use stemming and fuzzy search. Google does this all the time… Here is a nice post on this.
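The ranking signals listed above (your own previous searches and what most others searched for) can be combined in a small sketch. It assumes a log of past queries with counts is available; the `suggest` function, the scoring order and the sample data are hypothetical, and stemming and fuzzy matching are left out to keep it short.

```python
def suggest(typed, query_counts, user_history, limit=3):
    """Suggest past queries containing a word that starts with what was typed.

    Ranking: queries sharing words with the user's own history come first,
    then more popular queries, then alphabetical order as a tie-breaker.
    """
    typed = typed.lower().strip()
    history_words = {w for q in user_history for w in q.lower().split()}
    scored = []
    for query, count in query_counts.items():
        words = query.lower().split()
        if not any(w.startswith(typed) for w in words):
            continue
        overlap = len(history_words & set(words))
        scored.append((overlap, count, query))
    scored.sort(key=lambda s: (-s[0], -s[1], s[2]))
    return [query for _, _, query in scored[:limit]]


# Hypothetical query log: query -> number of times anyone searched for it.
query_counts = {
    "rice cooker reviews": 900,
    "rice price forecast": 500,
    "vegan burger": 400,
    "vegan rice recipes": 300,
}

# The user previously searched for vegan food.
user_history = ["vegan food"]

suggestions = suggest("ric", query_counts, user_history)
print(suggestions)
```

Because the user's history contains "vegan", the less popular query "vegan rice recipes" is ranked ahead of the more popular Rice queries.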
GRAPH-SEARCH – helps to surface related information when we are in exploration mode and want things that are related.
For example: we are looking for an interesting vegan recipe for brown Rice.
Navigating through connections/relations between entities or events might surface information we were not aware of (recipes made by our favorite chefs, or connections through people who like similar food to ours, our cooking styles, favorite ingredients, or popular cooking TV shows: any of these might lead to an interesting recipe for us).
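Navigating relations like these amounts to walking a graph outward from a starting point. Here is a minimal breadth-first sketch, assuming the knowledge graph is a plain adjacency dictionary; the entities, relation names and the `explore` helper are all hypothetical.

```python
from collections import deque

# Hypothetical knowledge graph: node -> list of (relation, neighbor).
graph = {
    "me": [("follows", "Favorite Chef"), ("likes", "brown rice")],
    "Favorite Chef": [("created", "Vegan brown rice bowl")],
    "brown rice": [("ingredient_of", "Vegan brown rice bowl"),
                   ("ingredient_of", "Rice pudding")],
    "Vegan brown rice bowl": [("featured_on", "Cooking Show")],
}


def explore(graph, start, max_hops=2):
    """Breadth-first walk; returns {node: list of (source, relation, target)}.

    The list for each node is the first (shortest) path found from start.
    """
    paths = {start: []}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if len(paths[node]) >= max_hops:
            continue  # do not expand beyond the hop limit
        for relation, neighbor in graph.get(node, []):
            if neighbor not in paths:
                paths[neighbor] = paths[node] + [(node, relation, neighbor)]
                queue.append(neighbor)
    return paths


found = explore(graph, "me", max_hops=2)
for node, path in found.items():
    print(node, "<-", path)
```

The returned path explains why a result was surfaced (for example, a recipe created by a chef the user follows), and raising `max_hops` widens the exploration.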
Ask questions: Who can help me with …
This is part of the *Expert finder* capability, which deserves a dedicated post. One example of such a platform is Sinequa Insight.
Coleman Research is an expert network platform worth checking.
How can Refinitiv Intelligent Tagging help?
Refinitiv Intelligent Tagging applies topic codes / themes to documents, as well as additional types of topics: SocialTags (topics derived from the broad Wikipedia taxonomy), and a unique type of event topic, SlugLine (current world/news events taken from Reuters News editorial story-event coverage).
Current events or terms vs. Long lasting taxonomy:
Note the difference: Slug tags are events that are happening these days and are likely covered in the news (and will likely be less relevant a few months later). The Refinitiv Topics taxonomy is much smaller and refers to long-lasting topics that rarely change. For example: a Slug could be CHINA-TESLA/CUSTOMS, while a Topic would be Metals & Mining.
Use of metadata can significantly improve the relevancy of Enterprise Search workflows and reduce noise in their results. It is another small step toward the Google-like search experience we aspire to.
For Active Research Search – synonyms combined with auto-suggest might work better.
For structured search (i.e. alerts, facet search), topics from a defined taxonomy are more appropriate.
The other main parameter you will most likely need to override is the hostname for the server you are connecting to. EMA defaults this to localhost:14002 (where 14002 is the default port number on most servers for Consumer type connections).
OmmConsumerConfig & host (const EmaString &host="localhost:14002")
OmmConsumerConfig host(java.lang.String host)