CONTENT PRE-PROCESSING

The 360AI content preprocessing API consists of a tokenizer and a Part-of-Speech (POS) tagger that standardize the text before our AI engine analyses it further.

The tokenizer identifies the start and end of each word and sentence. The Part-of-Speech tagger assigns a word category (e.g. noun or verb) to each word, resolving the ambiguity where a word could belong to more than one category. This is a prerequisite for most computational analysis of text. Our content preprocessor is available at word level, sentence level and full-text level.
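To make the tokenizing step concrete, here is a minimal Python sketch that finds the start and end offsets of words and sentences with regular expressions. It is an illustration only, not the 360AI implementation: the function names `tokenize_words` and `split_sentences` are hypothetical, and the naive sentence rule (split after `.`, `!` or `?`) will break on abbreviations such as "e.g.". POS tagging needs a trained model and is not shown.

```python
import re

def tokenize_words(text):
    """Return (token, start, end) for each word or punctuation mark."""
    return [(m.group(), m.start(), m.end())
            for m in re.finditer(r"\w+|[^\w\s]", text)]

def split_sentences(text):
    """Return (sentence, start, end) using a naive end-of-sentence rule."""
    spans = []
    start = 0
    # A sentence ends at ., ! or ? followed by whitespace or end of text.
    for m in re.finditer(r"[.!?]+(?:\s+|$)", text):
        end = m.start() + len(m.group().rstrip())
        spans.append((text[start:end], start, end))
        start = m.end()
    # Keep any trailing text that has no closing punctuation.
    tail = text[start:].rstrip()
    if tail:
        spans.append((tail, start, start + len(tail)))
    return spans

print(split_sentences("Hello world. How are you?"))
# [('Hello world.', 0, 12), ('How are you?', 13, 25)]
```

Returning offsets rather than bare strings lets later steps point back to the exact position in the original text.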

FEATURES

Content standardization, Part-of-Speech Tagging, Tokenizing.

BENEFITS

Prepares your content for computational analysis and the use of AI.


ENTITY EXTRACTION

This API extracts valuable entities from the text. Entities are keywords such as dates, people, organizations and events, taken from the content itself.
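As a rough illustration of what entity extraction returns, the following Python sketch pulls ISO-style dates and capitalized multi-word names out of a text with regular expressions. This is a toy stand-in and an assumption on our part, not the 360AI method: the patterns `DATE_RE` and `NAME_RE` and the function `extract_entities` are hypothetical, and a pattern-based approach cannot disambiguate entities the way a trained extractor can.

```python
import re

# Hypothetical patterns for illustration only.
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")                 # e.g. 1833-06-05
NAME_RE = re.compile(r"\b[A-Z][a-z]+(?:\s[A-Z][a-z]+)+\b")     # e.g. Ada Lovelace

def extract_entities(text):
    """Return (label, surface form) pairs: dates first, then candidate names."""
    entities = [("DATE", m.group()) for m in DATE_RE.finditer(text)]
    entities += [("NAME", m.group()) for m in NAME_RE.finditer(text)]
    return entities

print(extract_entities("Ada Lovelace met Charles Babbage on 1833-06-05."))
# [('DATE', '1833-06-05'), ('NAME', 'Ada Lovelace'), ('NAME', 'Charles Babbage')]
```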

FEATURES

Keyword (entity) extraction, entity disambiguation.

BENEFITS

Entity extraction pulls the correct, unambiguous keywords from a text and prepares the content for the next step, topic classification.


TOPIC CLASSIFICATION

The topic classification (also known as topic categorization) API assigns one or more topic categories to a text based on its subject matter. Our topic classifier is trained to predict topics from the entities extracted by our entity extraction API. It uses standard taxonomies such as IAB.

Other topic categorizations can be trained on request.
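The idea of mapping extracted entities to one or more topics can be sketched in Python as follows. Note that this is a simple keyword-overlap rule, swapped in for illustration, whereas the 360AI classifier is a trained model. The `TAXONOMY` contents, the `classify` function and the `min_hits` parameter are all hypothetical and do not reflect the actual IAB taxonomy.

```python
# Hypothetical two-topic taxonomy; real taxonomies such as IAB are far larger.
TAXONOMY = {
    "Sports": {"football", "league", "goal"},
    "Technology": {"software", "ai", "chip"},
}

def classify(entities, taxonomy=TAXONOMY, min_hits=1):
    """Return every topic whose keyword set overlaps the entities, best match first."""
    normalized = {e.lower() for e in entities}
    scores = {topic: len(normalized & keywords)
              for topic, keywords in taxonomy.items()}
    matching = [t for t, s in scores.items() if s >= min_hits]
    # Sort by descending score, then alphabetically for a stable order.
    return sorted(matching, key=lambda t: (-scores[t], t))

print(classify(["AI", "software", "goal"]))
# ['Technology', 'Sports']
```

A custom categorization would, in this sketch, amount to passing a different `taxonomy` argument.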

FEATURES

Topic categorization, Custom categorization.

BENEFITS

Assigns accurate topics to your content, making it easier to search or filter by topic.


READABILITY ANALYSIS

The 360AI readability analyser calculates the ‘readability’ of a given text, also known as text difficulty, content difficulty or subject matter level. We support CEFR, Flesch Reading Ease, Flesch-Kincaid Grade Level, Dale-Chall and SMOG. Additionally, 360AI can train other readability scores based on customer requirements.
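Two of the supported scores are simple published formulas, so they can be sketched directly. The Python example below computes Flesch Reading Ease (206.835 - 1.015 × words per sentence - 84.6 × syllables per word) and Flesch-Kincaid Grade Level (0.39 × words per sentence + 11.8 × syllables per word - 15.59). The function names are hypothetical, and the syllable counter is a crude vowel-group heuristic that we assume for illustration; a production analyser would count syllables more carefully.

```python
import re

def count_syllables(word):
    """Rough heuristic: count groups of consecutive vowels, at least one per word."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_scores(text):
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level) for the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    reading_ease = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    grade_level = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return reading_ease, grade_level

ease, grade = flesch_scores("The cat sat. The dog ran.")
print(round(ease, 2), round(grade, 2))
# 119.19 -2.62
```

Higher Reading Ease means easier text, and very simple text can produce a grade level below zero, as in this example.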

FEATURES

CEFR prediction using neural networks.

BENEFITS

Saves the effort of manually assigning a readability score or grade level to each piece of content. The readability API processes content automatically and in real time, and leads to more consistent results.


SIMILARITY ANALYSIS

The similarity analysis component determines the percentage of syntactic similarity between two texts. Syntactic similarity measures how similar two texts are in their exact wording. Semantic similarity, by contrast, measures how similar two texts are in meaning. So if you take a text, replace all words with synonyms and restructure the sentences, the semantic similarity would be high (the meaning is still the same) but the syntactic similarity would be low (the words are different).
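One common way to express wording-based similarity as a percentage is the Jaccard overlap of word n-grams ("shingles"). The Python sketch below uses that technique as an assumption; the source does not say which measure 360AI uses, and the names `shingles` and `similarity_percent` are hypothetical.

```python
import re

def shingles(text, n=2):
    """Return the set of lowercase word n-grams in the text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < n:
        # Too short for a full n-gram: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def similarity_percent(a, b, n=2):
    """Jaccard similarity of the two shingle sets, as a percentage."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa and not sb:
        return 100.0  # two empty texts are treated as identical
    return 100.0 * len(sa & sb) / len(sa | sb)

print(similarity_percent("the quick brown fox", "the quick red fox"))   # 20.0
print(similarity_percent("the cat is happy", "a feline feels glad"))    # 0.0
```

The second call shows the point made above: a paraphrase built from synonyms keeps its meaning but scores zero on a wording-based measure.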

FEATURES

Percentage of similarity, text clustering based on similarity.

BENEFITS

Useful for finding exact duplicates and near duplicates, and for detecting plagiarism.


 
 
