Natural language processing (NLP) is a technology born of the need for machines to understand and communicate with humans in human language, not formal computer languages. The concept behind NLP is simple: when machines can understand and communicate with humans in natural (human) language, data science is democratized, enabling humans to access, analyze, and leverage data more intelligently and to become more efficient as they offload redundant, data-heavy tasks to machines. NLP is most commonly understood as a user interface (UI) technology, enabling two-way communication with computers via speech or text. However, NLP is also a critical technology for extracting insight and analysis from vast amounts of previously unindexed and unstructured data: mining video and audio files, emails, scanned documents, and more.
NLP adoption is accelerating, but not because of new NLP algorithms; the data science in that regard is mature. Rather, adoption has accelerated over the past 2 to 3 years because of scalable, affordable computational power, the increasing digitization of all data, and the melding of NLP with machine learning (ML) and deep learning (DL).
The Importance of Machine and Deep Learning to Natural Language Processing
NLP as a standalone technology has increasingly limited value. Legacy methods used to feed and teach an NLP engine, such as rules-based and statistical-model computer programming, are expensive, time-consuming, and, in many cases, not very good. The reason is that the complexity of language context and nuance creates a practically infinite number of variations in meaning, making hand-written programming impractical.
Combining NLP with ML or DL allows an NLP engine to learn from vast data sets, detecting patterns of context and nuance and understanding meaning and intent more accurately than legacy NLP programming methods.
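To make the contrast concrete, below is a minimal sketch of the learned approach: a tiny bag-of-words Naive Bayes classifier that picks up word-label patterns from labeled examples rather than from hand-written rules. The intent labels and training sentences are invented for illustration; real systems train on far larger corpora and richer features.

```python
from collections import Counter, defaultdict
import math

class NaiveBayesIntent:
    """Tiny bag-of-words Naive Bayes classifier: instead of hand-written
    rules, it learns word/label statistics from labeled examples."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter()            # label -> number of examples
        self.vocab = set()

    def train(self, examples):
        for text, label in examples:
            self.label_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def predict(self, text):
        words = text.lower().split()
        total = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label in self.label_counts:
            # log prior + log likelihoods with add-one (Laplace) smoothing
            score = math.log(self.label_counts[label] / total)
            label_total = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word] + 1
                score += math.log(count / (label_total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

clf = NaiveBayesIntent()
clf.train([
    ("what is my account balance", "banking"),
    ("transfer money to savings", "banking"),
    ("will it rain tomorrow", "weather"),
    ("forecast for this weekend", "weather"),
])
print(clf.predict("check the weather forecast"))  # "weather", given this training data
```

Note that the classifier generalizes from overlapping vocabulary ("forecast") without anyone writing a rule for it, which is the core advantage over legacy rules-based methods.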
Use Cases for Hybrid Natural Language Processing
The best-bet NLP use cases share some common characteristics:
- They seek operational or production efficiencies, automating redundant, data-heavy tasks or jobs so that humans can focus on work better suited to people.
- They seek insight, analyzing data from previously untapped sources to gain competitive advantage, become more attuned to customers, audiences, and stakeholders, or build predictive, prescriptive plans.
One of the more intriguing NLP use cases is sentiment analysis, which, in Tractica’s latest Natural Language Processing research report, is being pursued by a wide range of companies.
According to Wikipedia, sentiment analysis “refers to the use of NLP, text analysis, computational linguistics and biometrics to systematically identify, extract, quantify and study affective states and subjective information… it aims to determine the attitude of a speaker, writer or other subject with respect to some topic or the overall contextual polarity or emotional reaction to a document, interaction or event.” Sentiment analysis will be used to bolster business, market, and competitive intelligence for companies and will be important to overall business strategy, particularly in understanding public attitudes about brands, companies, governments, public policy, political candidates, and more. Sentiment analysis could reduce the dependency of entities on consumer surveys or opinion polling due to more accurate, less-biased market feedback. Below is a list of a few companies working in sentiment analysis.
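To illustrate the idea in its simplest form, the sketch below scores sentiment with a small hand-built lexicon and a basic negation rule. The word lists and scoring scheme are invented for illustration only; production systems use far larger lexicons and, increasingly, learned models of the kind described above.

```python
# Minimal lexicon-based sentiment scorer (illustrative word lists only).
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "angry"}
NEGATORS = {"not", "never", "no"}

def sentiment(text):
    """Return a score in [-1, 1]: +1 all positive, -1 all negative, 0 neutral."""
    words = text.lower().replace(",", " ").replace(".", " ").split()
    score, hits = 0, 0
    for i, word in enumerate(words):
        polarity = 1 if word in POSITIVE else -1 if word in NEGATIVE else 0
        if polarity:
            # a negator directly before flips polarity ("not good" -> negative)
            if i > 0 and words[i - 1] in NEGATORS:
                polarity = -polarity
            score += polarity
            hits += 1
    return score / hits if hits else 0.0

print(sentiment("I love this product, it is great"))        # 1.0
print(sentiment("not good, the battery life is terrible"))  # -1.0
```

Even this toy version shows why context is hard: a single negation rule handles "not good" but would miss sarcasm, idioms, and longer-range dependencies, which is exactly where the ML/DL approaches the companies below employ come in.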
Founded in 2007, Infegy analyzes social media to help market research companies and advertising agencies analyze marketing campaigns. Through social media monitoring, Infegy can produce insights about what people are saying about a product: whether the product has altered perceptions, feedback on which product features and benefits matter most, whether messaging should be altered, and so on. The company buys access to Twitter data feeds and scans blogs, Tumblr, and other social media to build its analysis.
Founder and CEO Justin Graves said that while NLP and other AI technology is shifting, digital data is becoming more useful and valuable. “Twitter has grown, as has Facebook,” said Graves. “The amount of data we can process and what we can tell from it is evolving.” Graves said the company has had early success with sentiment analysis. He said it is now extremely accurate and as good as human analysis.
When asked if he thought social media analysis would replace statistically proven market research and feedback methods like random-sampling surveys, Graves said he thinks its sample size is sufficient to give good feedback and he “feels confident in what we are hearing at a more aggregate level, as we understand demographics of the channels. Unsolicited commentary is truer than focus groups or polls.”
Lexalytics is a sentiment analysis and NLP company whose solutions help clients process over 3 billion documents per day. The company lists 77 customers on its website, mostly a broad mix of consumer market research-related companies, but also a few larger firms such as Accenture, Oracle, Hewlett Packard, and Microsoft. According to Chief Marketing Officer (CMO) Seth Redmar, Lexalytics generates under $10 million in revenue, but is one of the largest text analysis players in the market. Redmar states that the company has been cash positive “for some time now” and is not venture funded.
The company offers two solutions. The Salience platform provides an on-premise, state-of-the-art text analytics engine. The Semantria platform is a cloud-based application development environment and application programming interface (API). Specific features of the platforms include sentiment analysis, text mining, intentions analysis, category extraction, classification, topic modeling, entity extraction, text summarization, and multi-lingual support.
Redmar told Tractica in May 2017 that the company is beta testing new products. The products use hyperparameter optimization, which allows Lexalytics to combine many different algorithm models, such as Bayesian, Long Short-Term Memory (LSTM), Maximum Entropy, and others. “So, the system learns collectively with less data and compute power,” said Redmar. “We found our customers want business intelligence to drive company strategies, but it was hard for the business people to get the data from the data scientists, so this type of product grows the user base, you don’t have to be a data scientist to completely use it, though it won’t be that way at first.”
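Lexalytics has not published the internals of these products, but the general idea of combining several models' outputs can be sketched as a weighted ensemble. The model functions and weights below are stand-ins for illustration, not Lexalytics' actual components.

```python
# Illustrative only: combining sentiment scores from several different models
# (e.g., a lexicon scorer, a Naive Bayes model, an LSTM) by weighted voting.
# These stand-in models return fixed scores in [-1, 1] for demonstration.
def lexicon_model(text):  return 0.6
def bayes_model(text):    return 0.2
def lstm_model(text):     return 0.8

def ensemble(text, models_with_weights):
    """Weighted average of per-model sentiment scores."""
    total_weight = sum(w for _, w in models_with_weights)
    return sum(w * model(text) for model, w in models_with_weights) / total_weight

# Weight the (hypothetically stronger) LSTM model more heavily.
models = [(lexicon_model, 1.0), (bayes_model, 1.0), (lstm_model, 2.0)]
print(ensemble("sample review text", models))  # (0.6 + 0.2 + 2*0.8) / 4 = 0.6
```

In a real system, the weights themselves would be tuned on held-out data, which is one place hyperparameter optimization of the sort Redmar describes would apply.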
Protagonist uses NLP and ML to mine data and produce narratives about a company, government, or other entity. “Ten years ago, I was working with the White House to determine the beliefs and views of the U.S. around the world,” said Doug Randall, CEO and founder. “How would people judge the U.S.? What are the filters? We had political scientists, anthropologists, and social scientists looking into it, sort of in the way a journalist would look at the problem. The real learning we got from the exercise was we needed to learn what people want and then tell them about it, instead of just pushing out a message we thought was good. Today, we combine data, tech, and people to describe those narratives for marketing. We are pulling in billions of data points to understand an issue for a client. Banks for the notions of trust, doctors for patient trust, etc. These narratives are being expressed in all sorts of forms. Protagonist tech can filter out the paid content and noise; also, surface-level chatter – something where there isn’t much sentiment or emotion to it. So that creates a custom database for understanding the specific beliefs.”