• Türkçe
    • English
  • Türkçe 
    • Türkçe
    • English
  • Giriş
Öğe Göster 
  •   DSpace@Muğla
  • Araştırma Çıktıları | TR-Dizin | WoS | Scopus | PubMed
  • WoS İndeksli Yayınlar Koleksiyonu
  • Öğe Göster
  •   DSpace@Muğla
  • Araştırma Çıktıları | TR-Dizin | WoS | Scopus | PubMed
  • WoS İndeksli Yayınlar Koleksiyonu
  • Öğe Göster
JavaScript is disabled for your browser. Some features of this site may not work without it.

On constituent chunking for Turkish

Thumbnail

Göster/Aç

Tam metin / Full text (486.9Kb)

Tarih

2018

Yazar

Aslan, Özkan
Günal, Serkan
Dinçer, Bekir Taner

Üst veri

Tüm öğe kaydını göster

Özet

Chunking is a task which divides a sentence into non-recursive structures. The primary aim is to specify chunk boundaries and classes. Although chunking generally refers to simple chunks, it is possible to customize the concept. A simple chunk is a small structure, such as a noun phrase, while constituent chunk is a structure that functions as a single unit in a sentence, such as a subject. For an agglutinative language with a rich morphology, constituent chunking is a significant problem in comparison to simple chunking. Most of Turkish studies on this issue use the IOB tagging schema to mark the boundaries. In this study, we proposed a new simpler tagging schema, namely OE, in constituent chunking for Turkish. "E" represents the rightmost token of a chunk, while "O" stands for all other items. In reference to OE, we also used a schema called OB, where "B" represents the leftmost token of a chunk. We aimed to identify both chunk boundaries and chunk classes using the conditional random fields (CRF) method. The initial motivation was to employ the fact that Turkish phrases are head-final for chunking. In this context, we assumed that marking the end of a chunk (OE) would be more advantageous than marking the beginning of a chunk (013). In support of the assumption, the test results reveal that OB has the worst performance and OE is significantly a more successful schema in many cases. Especially in long sentences, this contrast is more obvious. Indeed, using OE means simply marking the head of the phrase (chunk). Since the head and the distinctive label "E" are aligned, CRF finds the chunk class more easily by using the information contained in the head. OE also produced more successful results than the schemas available in the literature. In addition to comparing tagging schemas, we performed four analyses. Along with the examination of window size, which is a parameter of CRF, it is adequate to select and accept this value as 3. A comparison of the evaluation measures for chunking revealed that F-score was a more balanced measure in contrast to token accuracy and sentence accuracy. As a result of the feature analysis, syntactic features improves chunking performance significantly under all conditions. Yet when withdrawing these features, a pronounced difference between OB and OE is forthcoming. In addition, flexibility analysis shows that OE is more successful in different data.

Kaynak

Information Processing & Management

Cilt

54

Sayı

6

Bağlantı

https://doi.org/10.1016/j.ipm.2018.05.004
https://hdl.handle.net/20.500.12809/1306

Koleksiyonlar

  • Bilgisayar Mühendisliği Bölümü Koleksiyonu [103]
  • Scopus İndeksli Yayınlar Koleksiyonu [6219]
  • WoS İndeksli Yayınlar Koleksiyonu [6466]



DSpace software copyright © 2002-2015  DuraSpace
İletişim | Geri Bildirim
Theme by 
@mire NV
 

 




| Politika | Rehber | İletişim |

DSpace@Muğla

by OpenAIRE
Gelişmiş Arama

sherpa/romeo

Göz at

Tüm DSpaceBölümler & KoleksiyonlarTarihe GöreYazara GöreBaşlığa GöreKonuya GöreTüre GöreDile GöreBölüme GöreKategoriye GöreYayıncıya GöreErişim ŞekliKurum Yazarına GöreBu KoleksiyonTarihe GöreYazara GöreBaşlığa GöreKonuya GöreTüre GöreDile GöreBölüme GöreKategoriye GöreYayıncıya GöreErişim ŞekliKurum Yazarına Göre

Hesabım

GirişKayıt

DSpace software copyright © 2002-2015  DuraSpace
İletişim | Geri Bildirim
Theme by 
@mire NV
 

 


|| Politika || Rehber|| Yönerge || Kütüphane || Muğla Sıtkı Koçman Üniversitesi || OAI-PMH ||

Muğla Sıtkı Koçman Üniversitesi, Muğla, Türkiye
İçerikte herhangi bir hata görürseniz, lütfen bildiriniz:

Creative Commons License
Muğla Sıtkı Koçman Üniversitesi Institutional Repository is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 Unported License..

DSpace@Muğla:


DSpace 6.2

tarafından İdeal DSpace hizmetleri çerçevesinde özelleştirilerek kurulmuştur.