Contacts

Moscow, 105066, Staraya Basmannaya St, 21/4, office 518-528

Phone: (495) 772-95-90 *22699, *22803, *22687
 

Book
Goncharov in the Twenty-First Century

Zubkov K. Yu., Guskov S., Balakin A. Yu. et al.

Academic Studies Press, 2021.

Book chapter
Image

Poselyagin N.

In bk.: The Companion to Juri Lotman: A Semiotic Theory of Culture. L.; NY; Dublin: Bloomsbury, 2022. Ch. 16. P. 225-233.

Working paper
Language and Cultural Contacts in the Russian-Nordic Borderlands: Change and Continuity

Vlakhov A., Deresh A., Mironova E. et al.

Linguistics. WP BRP. HSE University, 2021. No. 108.

HSE Doctoral Student Develops E-thesaurus for the Russian Language

Daniil Alexeevsky, a doctoral student in Philology, presented the final part of his thesis on the development of a large electronic lexical database of the Russian language, similar to Princeton’s WordNet.

Resources similar to Princeton’s WordNet are widely used in automatic text processing, for tasks that involve determining the semantic similarity of words and for problems in automatic translation. Although these resources are in clear demand, there is currently no open-access Russian-language thesaurus that meets Princeton WordNet’s standards.

Daniil Alexeevsky has developed a pipeline of programs that process dictionaries in order to encode ‘super-subordinate’ relations between words (also called hypernymy, hyponymy, or the IS-A relation), which is also how WordNet is organized. The pipeline correctly identifies the general term (hypernym) in a word’s definition with an accuracy of 85%, significantly higher than in similar published work, although disambiguation (choosing the intended sense of the general term) requires further improvement.
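To make the idea of extracting IS-A relations from dictionary definitions more concrete, here is a minimal sketch. It is not the pipeline described in the thesis: the skip-word list, the function names, and the toy English dictionary are all assumptions made for illustration, and a real system would rely on proper morphological and syntactic analysis of Russian definitions.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical list of words to skip when looking for the general term;
# a real system would use part-of-speech tagging instead.
SKIP_WORDS = {"a", "an", "the", "any", "kind", "type", "sort", "of", "small", "large"}


def extract_hypernym(definition: str) -> Optional[str]:
    """Return the first word of the definition that is not a skip word."""
    for token in definition.lower().split():
        word = token.strip(".,;:()")
        if word and word not in SKIP_WORDS:
            return word
    return None


def build_isa_pairs(dictionary: Dict[str, str]) -> List[Tuple[str, str]]:
    """Turn headword -> definition entries into (headword, hypernym) pairs."""
    pairs = []
    for headword, definition in dictionary.items():
        hypernym = extract_hypernym(definition)
        if hypernym is not None:
            pairs.append((headword, hypernym))
    return pairs


# Toy entries invented for this example.
TOY_DICTIONARY = {
    "violin": "an instrument with four strings played with a bow",
    "hammer": "a tool with a heavy head used for driving nails",
    "spaniel": "a kind of dog with long ears",
}

ISA_PAIRS = build_isa_pairs(TOY_DICTIONARY)
for headword, hypernym in ISA_PAIRS:
    print(f"{headword} IS-A {hypernym}")
```

The sketch only finds the general term as a string; it does not decide which sense of ‘instrument’ or ‘tool’ is meant, which is the disambiguation step.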

However, disambiguation already works well for some noun classes: for example, the terms for musical instruments and for technical tools and devices are correctly extracted and separated in the dictionary.

Daniil plans to improve disambiguation using Word2Vec, and then analyze and compare the results of processing several dictionaries.
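The article does not say how Word2Vec would be applied, so the following is only one plausible sketch of vector-based disambiguation: each candidate sense of a general term is described by a few cue words, and the sense whose averaged cue vector is closest (by cosine similarity) to the headword's vector is selected. The three-dimensional vectors, sense labels, and function names are invented for illustration; real embeddings would come from a trained Word2Vec model.

```python
import math
from typing import Dict, List, Optional


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is empty or zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def average(vectors: List[List[float]]) -> List[float]:
    """Component-wise mean of equally sized vectors (empty list -> empty vector)."""
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def pick_sense(
    headword: str,
    senses: Dict[str, List[str]],
    vectors: Dict[str, List[float]],
) -> Optional[str]:
    """Choose the sense whose cue words are closest to the headword."""
    target = vectors[headword]
    best_sense, best_score = None, -2.0
    for sense_id, cue_words in senses.items():
        known = [vectors[w] for w in cue_words if w in vectors]
        score = cosine(target, average(known))
        if score > best_score:
            best_sense, best_score = sense_id, score
    return best_sense


# Toy vectors standing in for Word2Vec embeddings (assumed values).
TOY_VECTORS = {
    "violin": [0.9, 0.1, 0.0],
    "music": [1.0, 0.0, 0.0],
    "orchestra": [0.8, 0.2, 0.0],
    "tool": [0.0, 1.0, 0.1],
    "device": [0.1, 0.9, 0.2],
}

# Two hypothetical senses of the general term "instrument".
INSTRUMENT_SENSES = {
    "instrument#music": ["music", "orchestra"],
    "instrument#tool": ["tool", "device"],
}

CHOSEN = pick_sense("violin", INSTRUMENT_SENSES, TOY_VECTORS)
print(CHOSEN)
```

With these toy values the musical sense is chosen for ‘violin’, because its cue words point in nearly the same direction as the headword's vector.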