The 360AI content preprocessing API consists of a tokenizer and a Part-of-Speech (POS) tagger that standardize text before it is analysed further by our AI engine.
The tokenizer identifies the start and end of each word and sentence. The Part-of-Speech tagger assigns each word its category (e.g. noun or verb) and resolves cases where a word could belong to more than one. Both steps are prerequisites for any computational analysis of text. Our content preprocessor is available at word level, sentence level and full-text level.
Content standardization, Part-of-Speech Tagging, Tokenizing.
Prepares your content for computational analysis and use of AI.
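To illustrate what "identifying the start and end of each word and sentence" means in practice, here is a minimal sketch in Python. It is not the 360AI tokenizer: the function names and the regular expressions are assumptions chosen for illustration, and the naive sentence rule will split on abbreviations such as "e.g.". POS tagging is left out because it needs a trained model.

```python
import re

# Hypothetical patterns for illustration only; a production tokenizer
# handles abbreviations, numbers, URLs and many other cases.
WORD_RE = re.compile(r"\w+(?:'\w+)?")
SENTENCE_RE = re.compile(r"[^.!?]+[.!?]*")


def word_spans(text):
    """Return (token, start, end) for each word-like token in text."""
    return [(m.group(), m.start(), m.end()) for m in WORD_RE.finditer(text)]


def sentence_spans(text):
    """Return (sentence, start, end) with surrounding whitespace trimmed."""
    spans = []
    for m in SENTENCE_RE.finditer(text):
        raw = m.group()
        stripped = raw.strip()
        if not stripped:
            continue
        # Shift the start past any leading whitespace in the raw match.
        start = m.start() + (len(raw) - len(raw.lstrip()))
        spans.append((stripped, start, start + len(stripped)))
    return spans


if __name__ == "__main__":
    sample = "The cat sat. It purred!"
    for token, start, end in word_spans(sample):
        print(f"word     {start:>2}-{end:<2} {token}")
    for sentence, start, end in sentence_spans(sample):
        print(f"sentence {start:>2}-{end:<2} {sentence}")
```

Each span uses character offsets into the original string, so text[start:end] always reproduces the token or sentence. This is what lets later steps work at word level, sentence level or full-text level on the same input.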
This API extracts valuable entities from the text. Entities are keywords such as dates, people, organizations and events, and they are derived from the content itself.
Keyword (entity) extraction, entity disambiguation.
Entity extraction pulls the correct, unambiguous keywords from a text. It also prepares content for the next step, which is topic classification.
The topic classification (also known as topic categorization) API assigns one or more topic categories based on the subject matter of the text. Our topic classifier is trained to predict topics using the entities extracted by our entity extraction API. It uses standard taxonomies such as IAB.
Other topic categorizations can be trained on request.
Topic categorization, Custom categorization.
Classifies your content into topics accurately, which makes it easier to search or filter by topic.
The 360AI readability analyser calculates the ‘readability’ (also known as text difficulty, content difficulty or subject matter level) of a given text. We support CEFR, Flesch Reading Ease, Flesch-Kincaid Grade Level, Dale-Chall and SMOG. Additionally, 360AI can train other readability scores based on customer requirements.
CEFR prediction using neural networks.
Saves the considerable effort of manually assigning content to the right readability score or grade level. The readability API processes content automatically and in real time, which leads to more consistent results.
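Two of the supported scores are simple published formulas based on words per sentence and syllables per word. The sketch below shows them in Python. It does not show how the 360AI readability API is implemented: the function names are hypothetical, and the syllable counter is a crude vowel-group heuristic (an assumption), so the scores are approximate.

```python
import re


def count_syllables(word):
    """Approximate syllables as runs of vowels (heuristic, not exact)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_scores(text):
    """Return (reading_ease, grade_level) for an English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    # Flesch Reading Ease: higher means easier to read.
    reading_ease = (206.835
                    - 1.015 * words_per_sentence
                    - 84.6 * syllables_per_word)
    # Flesch-Kincaid Grade Level: approximate US school grade.
    grade_level = (0.39 * words_per_sentence
                   + 11.8 * syllables_per_word
                   - 15.59)
    return reading_ease, grade_level


if __name__ == "__main__":
    ease, grade = flesch_scores("The cat sat on the mat.")
    print(f"reading ease: {ease:.1f}, grade level: {grade:.1f}")
```

Short sentences with short words give a high reading-ease score and a low grade level. Long sentences with polysyllabic words push both scores in the opposite direction. CEFR prediction has no closed-form formula of this kind, which is why it is handled by a trained model.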
The similarity analysis component determines the percentage of syntactic similarity between two texts. Syntactic similarity indicates how similar two texts are in their exact wording. Semantic similarity, by contrast, indicates how similar two texts are in meaning.
So if you take a text and replace all words with synonyms and restructure sentences, the semantic similarity would be high (the meaning is still the same) but the syntactic similarity would be low (the words are different).
Percentage of similarity, text clustering based on similarity.
Useful for finding exact duplicates and near duplicates, and for detecting plagiarism.
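A wording-based similarity percentage can be sketched with Python's standard difflib module, which compares two word sequences and reports how much of them matches in order. This is one possible measure, chosen here as an assumption. The 360AI component may compute its percentage differently, and the function names are hypothetical.

```python
import re
from difflib import SequenceMatcher


def tokens(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def syntactic_similarity(a, b):
    """Percentage (0-100) of in-order word overlap between two texts."""
    matcher = SequenceMatcher(None, tokens(a), tokens(b), autojunk=False)
    return 100.0 * matcher.ratio()


if __name__ == "__main__":
    original = "The children were happy about the gift."
    paraphrase = "The present delighted the kids."
    print(f"identical:  {syntactic_similarity(original, original):.1f}%")
    print(f"paraphrase: {syntactic_similarity(original, paraphrase):.1f}%")
```

An identical copy scores 100%, while the paraphrase scores low even though the meaning is nearly the same. A wording-based measure therefore suits duplicate and plagiarism detection, whereas measuring closeness in meaning requires a semantic model.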