Using latent semantic analysis for automated keyword extraction from large document corpora

Süzek, Tuğba Önal

Date

2017

Author

Süzek, Tuğba Önal

Abstract

In this study, we describe a keyword extraction technique that uses latent semantic analysis (LSA) to identify semantically important single-topic words, or keywords. We compare our method against two other automated keyword extractors, tf-idf (term frequency-inverse document frequency) and MetaMap, using human-annotated keywords as a reference. Our results suggest that the LSA-based keyword extraction method performs comparably to the other two techniques. In an incremental update setting, the LSA-based method may therefore be preferable to existing keyword extraction methods for extracting keywords from text descriptions in big data.
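The abstract does not spell out the scoring procedure, so the following is only a minimal sketch of one common way to rank keywords with LSA, not the author's implementation. It builds a tf-idf weighted term-document matrix, approximates the top k latent dimensions, and scores each term by the length of its vector in that reduced space. The function names, the smoothed idf formula, whitespace tokenization, and the use of power iteration in place of a library SVD are all assumptions made to keep the example dependency-free.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Return (vocab, rows), where rows[t][d] is the tf-idf weight of term t in doc d.

    Assumption: whitespace tokenization and a smoothed idf,
    log((1 + N) / (1 + df)) + 1. Other tf-idf variants are equally valid.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    n_docs = len(docs)
    df = Counter(w for toks in tokenized for w in set(toks))
    counts = [Counter(toks) for toks in tokenized]
    rows = [
        [counts[d][w] * (math.log((1 + n_docs) / (1 + df[w])) + 1.0)
         for d in range(n_docs)]
        for w in vocab
    ]
    return vocab, rows


def top_eigenpairs(matrix, k, iters=200):
    """Approximate the top-k eigenpairs of a symmetric PSD matrix.

    Uses power iteration with deflation. This stands in for a truncated SVD;
    a real corpus would call for a sparse SVD routine from a numerical library.
    """
    n = len(matrix)
    m = [row[:] for row in matrix]  # work on a copy; deflation mutates it
    pairs = []
    for _ in range(k):
        # Deterministic, non-uniform positive start vector.
        v = [1.0 / (i + 1) for i in range(n)]
        for _ in range(iters):
            w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate.
        vnorm2 = sum(x * x for x in v)
        lam = sum(v[i] * sum(m[i][j] * v[j] for j in range(n))
                  for i in range(n)) / vnorm2
        v = [x / math.sqrt(vnorm2) for x in v]
        pairs.append((lam, v))
        # Deflate: remove the component just found.
        for i in range(n):
            for j in range(n):
                m[i][j] -= lam * v[i] * v[j]
    return pairs


def lsa_keywords(docs, k=2, top_n=3):
    """Rank terms by their vector length in a k-dimensional LSA space."""
    vocab, a = tfidf_matrix(docs)
    n_terms, n_docs = len(vocab), len(docs)
    # Eigenvectors of A A^T are the left singular vectors of A,
    # and its eigenvalues are the squared singular values.
    gram = [
        [sum(a[i][d] * a[j][d] for d in range(n_docs)) for j in range(n_terms)]
        for i in range(n_terms)
    ]
    pairs = top_eigenpairs(gram, k)
    scores = {
        w: math.sqrt(sum(max(lam, 0.0) * vec[i] ** 2 for lam, vec in pairs))
        for i, w in enumerate(vocab)
    }
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


if __name__ == "__main__":
    # Hypothetical toy corpus, for illustration only.
    sample = [
        "gene protein expression",
        "protein binding gene",
        "stock market price",
    ]
    print(lsa_keywords(sample, k=2, top_n=3))
```

Scoring by vector length in the reduced space favors terms that load heavily on the dominant latent topics, whereas a plain tf-idf ranking scores each term per document without considering co-occurrence. Evaluating either ranking against human-annotated keywords, as the study does, would require a separate precision and recall step that is not shown here.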

Source

Turkish Journal of Electrical Engineering and Computer Sciences

URI

https://doi.org/10.3906/elk-1511-203
https://hdl.handle.net/20.500.12809/2169

Collections

Computer Engineering Department Collection [103]
TR-Dizin Indexed Publications Collection [3005]
WoS Indexed Publications Collection [6466]