STILUS Class

Automatic classification of texts

STILUS Class is one of the components of the STILUS family of language technology products. It automatically classifies texts using a previously trained model. The classification algorithm is a hybrid: a statistical model followed by a rule-based filter, applied in the following steps:

  • 1st step: selection of candidate categories with a statistical algorithm, kNN (k-nearest neighbours), which compares the text to be classified with each category

  • 2nd step: filtering (accepting or rejecting) of categories with a system of rules based on lists of obligatory terms, which must appear in the text and may be multiword, and eliminatory terms, which must not appear

  • 3rd step: ordering of the remaining categories by descending relevance
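The three steps above can be illustrated with a minimal Python sketch. It is not the STILUS Class implementation: the bag-of-words vectors, the cosine similarity, the summing of neighbour similarities per category, the rule format, and all names (`classify`, `TRAINING`, `RULES`) are assumptions chosen only to make the pipeline concrete.

```python
import math
from collections import Counter, defaultdict

def vectorize(text):
    """Bag-of-words term counts (lowercased, whitespace-tokenized)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def classify(text, training, rules, k=3):
    """Return (category, score) pairs, most relevant first."""
    vec = vectorize(text)

    # Step 1: kNN. Keep the k most similar training documents and
    # sum their similarities per category (an assumed scoring scheme).
    neighbours = sorted(
        ((cosine(vec, vectorize(doc)), cat) for doc, cat in training),
        reverse=True,
    )[:k]
    candidates = defaultdict(float)
    for sim, cat in neighbours:
        if sim > 0:
            candidates[cat] += sim

    # Step 2: rule filter. A category is kept only if every obligatory
    # term appears in the text and no eliminatory term does.
    padded = " " + " ".join(text.lower().split()) + " "

    def contains(term):
        return f" {term.lower()} " in padded  # also matches multiword terms

    accepted = {}
    for cat, score in candidates.items():
        obligatory, eliminatory = rules.get(cat, ((), ()))
        if all(contains(t) for t in obligatory) and not any(
            contains(t) for t in eliminatory
        ):
            accepted[cat] = score

    # Step 3: order the surviving categories by descending relevance.
    return sorted(accepted.items(), key=lambda item: item[1], reverse=True)

# Hypothetical training documents and rules, for illustration only.
TRAINING = [
    ("the team won the football match", "sport"),
    ("the striker scored a late goal", "sport"),
    ("the central bank raised interest rates", "economy"),
    ("stock markets fell after the rate decision", "economy"),
    ("parliament passed the budget law", "politics"),
]
RULES = {
    # category: (obligatory terms, eliminatory terms)
    "economy": (["interest rates"], ["football"]),
}

print(classify("the bank raised interest rates again", TRAINING, RULES))
```

In this sketch a category without an entry in `RULES` passes the filter unchanged, so the rules only narrow the statistical result and never add categories to it.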

Showroom, the Daedalus demo website, offers demos of the classification capabilities of STILUS Class using the IPTC standard and the Eurovoc thesaurus. The IPTC (International Press Telecommunications Council) is an international organization that brings together leading news agencies and communication companies and that develops and publishes technical standards to improve the exchange of news. Eurovoc is a multilingual thesaurus that covers all the fields of activity of the European Communities and is used to index documents in the documentation systems of the European institutions and their users.
