Relation between Titles and Keywords in Japanese Academic Papers using Quantitative Analysis and Machine Learning

Masaki Murata, Natsumi Morimoto


In this study, we analyzed the keywords of academic papers using data from more than 300 papers. We applied quantitative surveys and machine learning to these keywords, with the expectation that the findings will support the automatic assignment of keywords to papers. The extent to which keywords are included in a paper is expressed quantitatively using the covering rate and the density of keywords. The results confirm that paper titles are likely to include keywords. We also used machine learning to predict which words can be used as keywords; the proposed method has an accuracy in the range 0.6–0.8. In addition, by analyzing the features used in machine learning, we can obtain the characteristics of the words that are given as keywords in papers.
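
The abstract does not define the covering rate or the density of keywords, so the following is only an illustrative sketch under assumed definitions: the covering rate is taken to be the share of a paper's keywords that occur in its title, and the density is taken to be the share of title words that are keywords. The function names, the pre-tokenized word lists, and the sample data are hypothetical; Japanese text would first need word segmentation, which is omitted here.

```python
def covering_rate(title_words, keywords):
    """Assumed definition: fraction of distinct keywords that appear in the title."""
    unique_keywords = set(keywords)
    if not unique_keywords:
        return 0.0
    title_set = set(title_words)
    hits = sum(1 for k in unique_keywords if k in title_set)
    return hits / len(unique_keywords)


def keyword_density(title_words, keywords):
    """Assumed definition: fraction of title words that are keywords."""
    if not title_words:
        return 0.0
    unique_keywords = set(keywords)
    hits = sum(1 for w in title_words if w in unique_keywords)
    return hits / len(title_words)


# Hypothetical, already tokenized example (not data from the study).
title = ["keyword", "extraction", "with", "machine", "learning"]
keywords = ["keyword", "learning", "title", "feature"]

rate = covering_rate(title, keywords)       # 2 of 4 keywords occur in the title
density = keyword_density(title, keywords)  # 2 of 5 title words are keywords
```

Under these assumed definitions, a high covering rate would indicate that most of a paper's keywords can be recovered from the title alone, which is the kind of tendency the abstract reports.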


Thesis, title, keyword, machine learning, feature analysis
