Frequently Asked Questions

General

Automatic text summarization is a form of information management that can help you reduce information overload.

Using its AI and machine learning technologies, SummarizeBot can extract the most important information from weblinks, news articles, scientific articles, books, e-mails, lectures, patents, legal documents, etc.

Summarizing weblinks can make your web browsing more effective and efficient. Focus on what is important. Just summarize it!

SummarizeBot identifies the language and the context of the information. It then measures each word's importance in that context and selects the most important sentences.
Share a link or file with the bot and enjoy your result.
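
As a rough illustration of the idea described above (weigh words, then pick the most important sentences), here is a minimal frequency-based sketch. It is not SummarizeBot's actual algorithm: the stop-word list, the function names and the scoring rule are all assumptions made for this example, and language detection is left out entirely.

```python
import re
from collections import Counter

# Hypothetical, tiny stop-word list; a real system would use one per language.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in", "it", "on"}

def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"\w+", sentence.lower()) if w not in STOP_WORDS]

def summarize(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Word importance here is simply how often the word occurs in the whole text.
    word_weights = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        tokens = tokenize(sentence)
        # Average word weight, so long sentences are not favored automatically.
        return sum(word_weights[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:max_sentences])]
```

In this sketch the max_sentences parameter controls how long the summary is; calling summarize(text, max_sentences=3) would keep the three highest-scoring sentences.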

There are 3 ways to use SummarizeBot and its features:
1) by interacting (sharing links & files) directly with the bot;
2) by using the /summarize command (e.g. /summarize [url for processing]);
3) by inviting the bot to the channel (e.g. @SummarizeBot [url for processing]).
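
For readers curious how a bot might treat these three kinds of message alike, the sketch below pulls the first URL out of a chat message. It is only an assumption about how such handling could look; the extract_url helper and the URL pattern are hypothetical and are not part of SummarizeBot.

```python
import re

# Simplified URL pattern (an assumption): http(s) links up to the next
# whitespace or angle bracket, since some chat clients wrap links in <...>.
URL_PATTERN = re.compile(r"https?://[^\s<>]+")

def extract_url(message):
    """Return the first URL found in a chat message, or None if there is none.

    A bare link, "/summarize <url>" and "@SummarizeBot <url>" are all
    handled the same way: only the URL itself is kept.
    """
    match = URL_PATTERN.search(message)
    return match.group(0) if match else None
```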

Features

SummarizeBot supports almost every language, including English, Chinese, Russian, Japanese, Arabic, German, Spanish, French, Portuguese, etc. Please see the full list here.

To get and summarize breaking news from the last 48 hours from over 50,000 sources, just type "latest" to get the top 10 latest news items, or "news + subject" (in any language). Example: "news about Donald Trump" (FR: "nouvelles sur Donald Trump").

Yes, you can upload audio files and SummarizeBot will recognize and summarize the information in them. For audio recognition we support the following languages: English, Russian, Chinese, French, German, Italian, Spanish, Japanese, Swedish, Finnish, Arabic.

Yes, you can upload image files and SummarizeBot will recognize and summarize the information in them. For image recognition we support the following languages: English, Latvian, French, German, Russian, Italian, Dutch, Spanish, Portuguese, Swedish, Finnish.

SummarizeBot supports most text, image and audio formats: .html, .pdf, .doc, .docx, .csv, .eml, .epub, .gif, .jpg, .jpeg, .mp3, .msg, .odt, .ogg, .png, .pptx, .ps, .rtf, .tiff, .tif, .txt, .wav, .xlsx, .xls, .psv, .tsv, .tff, .aif, .aiff, .avr, .cdr, .wv, .au, .flac, .snd, .vox.

In both Facebook Messenger and Slack you can upload files of up to 10 MB.
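
If you want to check a file before sharing it, the format list and the size limit above can be combined into a small pre-upload check. This is a sketch only: the can_upload helper is hypothetical, and treating "10 MB" as 10 * 1024 * 1024 bytes is an assumption, since the exact byte limit is not stated.

```python
import os

# Extensions copied from the supported-formats list above.
SUPPORTED_EXTENSIONS = {
    ".html", ".pdf", ".doc", ".docx", ".csv", ".eml", ".epub", ".gif",
    ".jpg", ".jpeg", ".mp3", ".msg", ".odt", ".ogg", ".png", ".pptx",
    ".ps", ".rtf", ".tiff", ".tif", ".txt", ".wav", ".xlsx", ".xls",
    ".psv", ".tsv", ".tff", ".aif", ".aiff", ".avr", ".cdr", ".wv",
    ".au", ".flac", ".snd", ".vox",
}

# Assumption: the 10 MB limit is interpreted as 10 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024

def can_upload(filename, size_bytes):
    """Return True if the extension is supported and the file fits the size limit."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in SUPPORTED_EXTENSIONS and size_bytes <= MAX_UPLOAD_BYTES
```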

Yes, files shared from both Google Drive and Dropbox can be summarized.

Just press the "Save" button and the summary results will be saved to your device or copied to the clipboard (on mobile devices).

Just use the "Summary size" slider to vary your summary size.

Only public links can be summarized. Sometimes the text from the link can’t be extracted well. Please let us know in case some links can’t be processed well by SummarizeBot. We will be happy to improve our service.

For now SummarizeBot can be used in Facebook Messenger and Slack. Telegram, Work Chat, LINE, Skype and WeChat are coming soon.

When you get your summary results, we calculate how much time you saved by summarizing your information. The figure is based on reading speed, measured in words per minute, for different languages. For more information, see here.
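
One plausible way to arrive at such a figure is to divide the number of words you did not have to read by a reading speed. The sketch below assumes exactly that; the formula, the time_saved_minutes helper and the reading speed used in the usage note are illustrative assumptions, not the values SummarizeBot actually uses.

```python
def time_saved_minutes(original_words, summary_words, words_per_minute):
    """Estimate reading time saved, in minutes.

    Assumes time saved = (words skipped) / (reading speed). The reading
    speed depends on the language and must be supplied by the caller.
    """
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    skipped_words = max(original_words - summary_words, 0)
    return skipped_words / words_per_minute
```

For example, with an assumed reading speed of 200 words per minute, summarizing a 1,000-word article down to 200 words gives time_saved_minutes(1000, 200, 200), which is 4.0 minutes.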

Technologies

SummarizeBot combines different algorithms to provide you with the best summarization accuracy on the market. The core technologies are machine learning, natural language processing, artificial intelligence and blockchain.

Machine learning allows us to effectively preprocess incoming information for further analysis. With the help of machine learning algorithms we can understand document formats and encodings, extract the article text from a file, detect the document language, apply deep linguistic analysis techniques and extract relevant document features for our artificial intelligence classifier.

To create well-written summaries we use a custom Linguistic Processor that detects tokens and sentences, identifies part-of-speech (PoS) tags, lemmas and noun phrases, extracts semantic relations for each sentence and understands the meaning of unstructured information.

We develop a semi-supervised artificial intelligence classifier to detect the most important aspects and sentences of text documents.

Our custom classification model takes several text mining features into account, including statistical sentence weights, inter-sentence similarities, uniqueness and cohesiveness of text information, thematic word representations, named entity distribution, etc.
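
To give a feel for one of these features, the sketch below computes an inter-sentence similarity as the cosine similarity between bag-of-words vectors. This is a common textbook technique chosen for illustration; it is an assumption, not a description of the features SummarizeBot's classifier actually computes, and the function names are hypothetical.

```python
import math
import re
from collections import Counter

def bag_of_words(sentence):
    """Count lowercase word tokens in a sentence."""
    return Counter(re.findall(r"\w+", sentence.lower()))

def cosine_similarity(sentence_a, sentence_b):
    """Cosine similarity between two sentences' word-count vectors (0.0 to 1.0)."""
    counts_a, counts_b = bag_of_words(sentence_a), bag_of_words(sentence_b)
    # Only words present in both sentences contribute to the dot product.
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

A sentence that is highly similar to many others tends to be central to the text, while a sentence with low similarity to all others carries more unique information; a classifier could use both signals.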

We apply a decentralized architecture to train and test our artificial intelligence model. Using blockchain technology helps us collect more accurate data for training.

We used blockchain protocols to validate data label quality, ensuring the most accurate datasets possible. In addition, using blockchain, we have tested our classification model with different experts in this field (content creators, lawyers, librarians, students, teachers, etc.) to find the most accurate text features and weights for the classification algorithm.