Amazon Comprehend adds customized language lists to machine learning tool

Last year Amazon announced Comprehend, a natural language processing tool to help companies extract common words and phrases from a corpus of information. Today, a week ahead of its re:Invent customer conference, Amazon announced an enhancement to Comprehend that allows developers to build lists of specialized words and phrases without machine learning domain knowledge.

“Today we are excited to bring new customization features to Comprehend, which allow developers to extend Comprehend to identify natural language terms and classify text which is specialized to their team, business or industry,” Matt Wood, GM for deep learning and AI, wrote in a blog post announcing the enhancement.

The key point is that Amazon handles all of the complexity, so developers can add customized lists without a deep background in machine learning or natural language processing. “Under the hood, Comprehend will do the heavy lifting to build, train, and host the customized machine learning models, and make those models available through a private API,” Wood wrote.

This involves two pieces. First, developers define a list of custom entities. This could be something like legal language at a law firm or a list of part numbers at an automobile company. All the developer needs to do is supply a list of these entities. Amazon learns to identify the customized language and builds a private, customized model based on the list.
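To make the idea of an entity list concrete, here is a minimal Python sketch that writes one out as CSV, using the part-number example above. The part numbers, the `PART_NUMBER` type name and the `build_entity_list_csv` helper are all hypothetical, and the two-column `Text,Type` layout is an assumption about the file format the service expects, so check it against the Comprehend documentation before relying on it.

```python
import csv
import io

# Hypothetical part numbers at an automobile company (illustrative only).
PART_NUMBERS = ["BRK-10442", "ENG-20871", "TRN-30015"]


def build_entity_list_csv(terms, entity_type):
    """Return CSV text with a Text,Type header and one row per term.

    The Text,Type layout is an assumption about the expected format.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Text", "Type"])
    for term in terms:
        writer.writerow([term, entity_type])
    return buffer.getvalue()


entity_csv = build_entity_list_csv(PART_NUMBERS, "PART_NUMBER")
print(entity_csv)
```

In practice a file like this would presumably be uploaded to storage the service can read and then referenced when the custom model is created; that step is left out here because it requires an AWS account.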

The second piece involves custom classification. Once you have defined the language, you can begin sorting the documents where those terms appear into logical groups. “Through as few as 50 examples, Comprehend will automatically train a custom classification model that can be used to categorize all your documents. You could group support emails by department, social media posts by product, or analyst reports by business unit,” Wood wrote. Once these items have been extracted and categorized, they could be routed through a workflow to the appropriate personnel or handed to an application for further processing.
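As a rough illustration of what those labeled examples might look like, the following Python sketch assembles a small training set of support emails grouped by department. The email texts, the `BILLING` and `SHIPPING` labels and the `build_training_csv` helper are hypothetical. The one-row-per-example `label,text` layout is an assumption about the training format, and the 50-example threshold is taken from Wood's quote and treated here as a total count, which is also an assumption.

```python
import csv
import io
from collections import Counter

# Taken from the "as few as 50 examples" quote; treated as a total (assumption).
MIN_EXAMPLES = 50


def build_training_csv(examples):
    """Return CSV text with one 'label,text' row per (label, text) pair."""
    if len(examples) < MIN_EXAMPLES:
        raise ValueError(
            f"need at least {MIN_EXAMPLES} examples, got {len(examples)}"
        )
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for label, text in examples:
        writer.writerow([label, text])
    return buffer.getvalue()


# Hypothetical support emails labeled by department.
examples = (
    [("BILLING", f"Question about invoice {i}") for i in range(25)]
    + [("SHIPPING", f"Where is order {i}?") for i in range(25)]
)

training_csv = build_training_csv(examples)
label_counts = Counter(label for label, _ in examples)
print(label_counts)
```

Counting examples per label before training is a cheap sanity check, since a heavily lopsided set is likely to produce a classifier that favors the larger category.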

Amazon is providing a way to build customized machine learning models while it takes care of the details behind the scenes. At their best, cloud companies simplify the complex and provide access to services that many developers would otherwise find too difficult to build on their own. Comprehend is trying to offer a way to build customized models without any machine learning knowledge whatsoever.

The new Comprehend features are generally available starting today.
