Application of machine-learning algorithms for tephrochronology: a case study of Plio-Quaternary volcanic fields in the South Aegean Active Volcanic Arc
Citation
Uslular, G., Kıyıkçı, F., Karaarslan, E. et al. Application of machine-learning algorithms for tephrochronology: a case study of Plio-Quaternary volcanic fields in the South Aegean Active Volcanic Arc. Earth Sci Inform (2022). https://doi.org/10.1007/s12145-022-00797-5Abstract
We performed several machine-learning algorithms on a geochemical dataset including whole-rock (n = 1656) and glass (n = 1092) compositions of lavas and pyroclastics belonging to 8 volcanic fields along the South Aegean Active Volcanic Arc (SAAVA). We did not only test our trained model with the unknown distal tephras, but also controlled its performance using some known distal tephras (e.g., Nisyros-Kyra) from the easternmost part of the SAAVA. The different metrics and kappa values revealed that Naive Bayes, Linear Discriminant Analysis, Artificial Neural Network, and Support Vector Machine (both probabilistic and non-probabilistic models) were the least performing algorithms; while the Random Forest and the gradient boosting algorithms (e.g., CatBoost, LightGBM) together with their average ensemble (Voting Classifier) were the best for the volcanic-source predictions of tephras. This also indicates that the latter algorithms give better results for the machine-learning applications on an imbalanced geochemical dataset, which was the main artifact in our training model. Despite the accurate prediction and training models especially for those having larger datasets (i.e., Santorini and Nisyros volcanoes), we here would like to express that the machine-learning can be as yet a time-saving tool (not an automatized decision-maker) in the tephrochronology studies providing a more efficient and rapid way of finding the possible volcanic sources for unknown tephras. In this regard, our freely-available Python codes would be easily implemented in further "tephra-hunting" studies in and around the SAAVA. However, there is a need for increasing the available geochemical (e.g., mineral chemistry) and also other interrelated datasets (e.g., geochronology) that should be as yet evaluated manually by the tephrochronologists to be able to improve the performances of machine-learning algorithms in the volcanic-source predictions.