Show simple item record

dc.contributor.author: İbrahim, Ahmed Hassan
dc.contributor.author: Karabulut, Onur Can
dc.contributor.author: Karpuzcu, Betül Asiye
dc.contributor.author: Türk, Erdem
dc.contributor.author: Süzek, Barış Ethem
dc.date.accessioned: 2023-05-24T06:45:22Z
dc.date.available: 2023-05-24T06:45:22Z
dc.date.issued: 2023 (en_US)
dc.identifier.citation: Ibrahim AH, Karabulut OC, Karpuzcu BA, Türk E, Süzek BE. A correlation coefficient-based feature selection approach for virus-host protein-protein interaction prediction. PLoS One. 2023 May 2;18(5):e0285168. doi: 10.1371/journal.pone.0285168. PMID: 37130110; PMCID: PMC10153705. (en_US)
dc.identifier.uri: https://doi.org/10.1371/journal.pone.0285168
dc.identifier.uri: https://hdl.handle.net/20.500.12809/10698
dc.description.abstract: Prediction of virus-host protein-protein interactions (PPI) is a broad research area where various machine-learning-based classifiers are developed. Transforming biological data into machine-usable features is a preliminary step in constructing these virus-host PPI prediction tools. In this study, we have adopted a virus-host PPI dataset and a reduced amino acid alphabet to create tripeptide features and introduced a correlation coefficient-based feature selection. We applied feature selection across several correlation coefficient metrics and statistically tested their relevance in a structural context. We compared the performance of feature-selection models against that of the baseline virus-host PPI prediction models created using different classification algorithms without the feature selection. We also tested the performance of these baseline models against the previously available tools to ensure their predictive power is acceptable. Here, the Pearson coefficient provides the best performance with respect to the baseline model as measured by AUPR: a drop of 0.003 in AUPR while achieving a 73.3% (from 686 to 183) reduction in the number of tripeptide features for random forest. The results suggest our correlation coefficient-based feature selection approach, while decreasing the computation time and space complexity, has a limited impact on the prediction performance of virus-host PPI prediction tools. (en_US)
dc.item-language.iso: eng (en_US)
dc.publisher: Plos (en_US)
dc.relation.isversionof: 10.1371/journal.pone.0285168 (en_US)
dc.item-rights: info:eu-repo/semantics/openAccess (en_US)
dc.subject: Virus-host protein (en_US)
dc.title: A correlation coefficient-based feature selection approach for virus-host protein-protein interaction prediction (en_US)
dc.item-type: article (en_US)
dc.contributor.department: MÜ, Faculty of Engineering, Department of Computer Engineering (en_US)
dc.contributor.authorID: 0000-0002-1521-4306 (en_US)
dc.contributor.institutionauthor: İbrahim, Ahmed Hassan
dc.contributor.institutionauthor: Karabulut, Onur Can
dc.contributor.institutionauthor: Türk, Erdem
dc.contributor.institutionauthor: Süzek, Barış Ethem
dc.identifier.volume: 18 (en_US)
dc.identifier.issue: 5 (en_US)
dc.relation.journal: PLoS One (en_US)
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member (en_US)
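
As an illustration of the feature-selection idea in the abstract, the following Python sketch scores each feature column by its Pearson correlation with a binary interaction label and keeps the columns whose absolute score reaches a cutoff. This is only one plausible reading of "correlation coefficient-based feature selection": the function names `pearson` and `select_by_pearson`, the `threshold` value, and the toy data are assumptions made for illustration, and the published procedure may differ in detail.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column has no defined correlation; treat it as uninformative.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_by_pearson(rows, labels, threshold=0.5):
    """Return (kept column indices, per-column scores).

    rows:      list of feature vectors (hypothetical tripeptide frequencies)
    labels:    1 for an interacting pair, 0 otherwise
    threshold: assumed cutoff on |r|; not a value taken from the paper
    """
    columns = list(zip(*rows))  # transpose rows into feature columns
    scores = [pearson(col, labels) for col in columns]
    kept = [i for i, r in enumerate(scores) if abs(r) >= threshold]
    return kept, scores


# Toy data (made up): column 0 tracks the label, column 1 does not,
# and column 2 is constant.
rows = [
    [1, 0, 5],
    [2, 1, 5],
    [3, 0, 5],
    [4, 1, 5],
]
labels = [0, 0, 1, 1]

kept, scores = select_by_pearson(rows, labels, threshold=0.5)
print("kept columns:", kept)
```

Dropping low-scoring columns in this way shrinks the feature matrix before a classifier is trained, which is the kind of reduction (686 to 183 features) the abstract associates with lower computation time and space requirements.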

