Applying latent Dirichlet allocation for analysis of publications in scientometric databases


  • A. S. Kolyada Odessa Polytechnic National University
  • B. A. Yakovenko Odessa Polytechnic National University
  • Victor Dmitriyevich Gogunsky Odessa Polytechnic National University



model, latent, semantic, Dirichlet, topic, publication


The aim of this work is to determine the most appropriate model for the thematic classification of scientific publications by authors who share the same surname. Probabilistic topic models are analyzed, and latent Dirichlet allocation is proposed as the leading probabilistic model, owing to its numerous generalizations and applications to the analysis of collections of text documents. The latent semantic analysis model is chosen for comparison. The model is used in a project for extracting publications from scientometric databases. In this project, topic modeling solves the problem of separating the publications of authors with the same surname, with the titles of publications serving as the collection of documents. The results show that latent Dirichlet allocation is inferior to latent semantic analysis when the documents contain only a small volume of text. Therefore, latent semantic analysis is preferable for collections of documents of small volume, and latent Dirichlet allocation for collections of large volume.
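The idea of assigning publication titles to topics with latent Dirichlet allocation can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a collapsed Gibbs sampler, and the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the number of topics, and the four example titles are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def lda_gibbs(docs, n_topics=2, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).

    docs is a list of token lists; returns one topic distribution per document.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample the topic from its conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed per-document topic proportions.
    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + n_topics * alpha
        theta.append([(doc_topic[d][k] + alpha) / denom for k in range(n_topics)])
    return theta


# Hypothetical titles by two different authors with the same surname.
titles = [
    "fuzzy logic adaptive teaching program",
    "adaptive teaching program quality control",
    "steel corrosion coating experiment",
    "corrosion resistance of steel coating",
]
docs = [t.lower().split() for t in titles]
theta = lda_gibbs(docs, n_topics=2)
for title, dist in zip(titles, theta):
    dominant = max(range(len(dist)), key=dist.__getitem__)
    print(dominant, [round(p, 2) for p in dist], title)
```

Each title receives a distribution over topics, and titles with the same dominant topic would be attributed to the same author. With titles this short each document contributes only a handful of tokens to the counts, which is the setting in which the abstract reports latent Dirichlet allocation performing worse than latent semantic analysis.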



Author Biographies

A. S. Kolyada, Odessa Polytechnic National University


B. A. Yakovenko, Odessa Polytechnic National University


Victor Dmitriyevich Gogunsky, Odessa Polytechnic National University

Victor Dmitriyevich Gogunsky has worked at the Odessa National Polytechnic University (ONPU) since 1969. He obtained his Ph.D. (engineering) in 1975 and his D.Sc. in 1991. Since 1991 he has been Head of the Chair of Life Activity Safety Systems Management at ONPU, and he has held the title of professor since 1993.

V. D. Gogunsky successfully combines teaching and research. He took part in several investigations that opened new scientific areas in the adoption of new information technologies in the national economy of Ukraine. He developed and implemented a resource Management Information System (MIS) at the Kremenchug knitting factory.

He also implemented a computer-driven system for teaching and training steelmakers at the Dniprodzerzhinsk steel works, and developed an MIS for the street traffic light system of Odessa. He took part in the substantiation study and implementation of the ecological rebalancing project for the ecological disaster area in the Louzanivs’kiy microdistrict of Odessa.

V. D. Gogunsky has published more than 400 scientific and methodological works. Among his students are 10 Doctors of Philosophy (eng.) and 4 Doctors of Science (eng.). He has been awarded the title of “Honoured Science and Engineering Worker of Ukraine” and the badge “High Achiever in Education of Ukraine” for his personal val


1. Gogunsky V. D., Vishnevskaya V.M., Bouslayev A.G. Implementation analysis for an adaptive teaching program based on the fuzzy logic // Labours of the Odessa Polytechnic University. – Iss. 2(28). – 2007. – P. 127 – 128 (in Russian).
2. Vishnevskaya V.M., Gogunsky V. D. Creating conception for an adaptive teaching program based on the fuzzy logic // Higher education of Ukraine. – № 1 (28). – 2008. – P. 91 – 95 (in Ukrainian).
3. Yakovenko V.D., Gogunsky V. D. State prediction for an educational organization quality control system // System investigations and information technologies. – 2009. – № 2. – P. 50 – 57 (in Ukrainian).
4. Tonkonogy V.M., Gogunsky V. D., Yakovenko V.D., Yakovenko A.E. Modeling information processes for an educational organization quality control // State-of-the-art technologies in machine building. – Iss. 3. – 2009. – P. 261 – 268 (in Ukrainian).
5. Gogunsky V. D., Teslenko P.A. Generation of a control action function for a project control system similar to the tack moving system // The East European journal of advanced technologies. – № 1/3 (43). – 2010. – P. 22 – 24 (in Russian).
6. Gogunsky V. D., Pletnev A.N. The migration coefficients and a method for buffer adjustment in cluster systems // Engineering problems. – № 1. – 2010. – P. 101 – 109 (in Russian).
7. Tkachuk S.V., Gogunsky V. D. A simulation model for the educational program quality control system at higher education institutes // Herald of the Kherson National Technical University. – № 2(38). – 2010. – P. 497 – 502 (in Ukrainian).
8. Teslenko P.A., Gogunsky V. D. Project control graph-analytic optimization on the Pontryagin maximum principle basis // Optimization of production processes. – Iss. 12. – 2010. – P. 88 – 92 (in Russian).
9. Yakovenko E.A., Gogunsky V. D., Nosov P.S. Estimation of the managerial knowledge level // State-of-the-art technologies in machine building. – Iss. 4. – 2010. – P. 303 – 308 (in Ukrainian).
10. Teslenko P.A., Gogunsky V. D. Transformation of the project processes’ qualitative properties model into a system status model // Projects control and production development. – 2010. – № 1(33). – P. 42 – 46 (in Russian).


Kolyada, A.S. Automation of information extraction from scientometric databases / A.S. Kolyada, V.D. Gogunsky // Управління розвитком складних систем. - 2013. - Iss. 16. - PP. 96 - 99 (in Russian).

Kolyada, A.S. Latent semantic approach for the analysis of information from scientometric databases / A.S. Kolyada // Управління розвитком складних систем. - 2014. - Iss. 17. - PP. 101 - 108 (in Russian).

Vorontsov, K.V. Probabilistic topic modeling [Electronic resource] / K.V. Vorontsov // - Available at: (accessed: 03.03.2014) (in Russian).

Daud, A. Knowledge discovery through directed probabilistic topic models: a survey / A. Daud, J. Li, L. Zhou, F. Muhammad // Frontiers of Computer Science in China. - 2010. - Vol. 4, Iss. 2. - PP. 280 - 301.

Blei, D.M. Latent Dirichlet Allocation / D.M. Blei, A.Y. Ng, M.I. Jordan // Journal of Machine Learning Research. - 2003. - Vol. 3. - PP. 993 - 1022.



How to Cite

Kolyada, A.S., Yakovenko, B.A. and Gogunsky, V.D. 2014. Applying latent Dirichlet allocation for analysis of publications in scientometric databases. Proceedings of Odessa Polytechnic University. 1(43) (Apr. 2014), 186–191. DOI:



Computer and information networks and systems. Manufacturing automation