Abstract Details
(2020) Capturing the P-T Evolution of Magmas in Volcanic Plumbing Systems by Machine Learning
Petrelli M, Caricchi L & Perugini D
https://doi.org/10.46427/gold2020.2070
The author has not provided any additional details.
05d: Room 2, Friday 26th June 22:00 - 22:03
Maurizio Petrelli
Luca Caricchi
Diego Perugini
Listed below are questions submitted by the community that the author will try to cover in their presentation. Questions are approved by authors or session conveners before they are displayed here.
Submitted by Cora McKenna on Thursday 25th June 16:01
Thank you for sharing this interesting work. I have a couple of questions (referencing slide 5, the test data plots):
1. To my eye, the model seems to show a slight drop in performance at extreme values (high/low ranges), although in the temperature data at least there looks to be a good spread across values. Is the train/validation data also well stratified, or could a slight imbalance in the training data in those ranges be having an effect? (Tree-based algorithms are particularly sensitive to imbalance.)
2. Alternatively, is it a case of applying linear algorithms to a non-linear trend? Have you tried introducing higher-order versions of the input data (squares, cubes) to allow a linear model to produce a pseudo-polynomial fit, as sketched below? (Though that might be more relevant for the linear regression model than for the tree-based model presented on that slide.)
3. Finally, the pressure data plot is interesting. Am I right to assume the expected values come from models/experiments with limited output, so that only a few discrete values are represented? Are you treating pressure as continuous data or as discrete values/categories in the predictions? It looks as though you are predicting a continuous value and comparing it to an expected discrete value, in which case incorporating the error on the expected pressure might give the model more room in training.
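As a reader aid for point 2 above, here is a minimal sketch of the pseudo-polynomial idea using scikit-learn: expanding a single input with its square and cube lets an ordinary linear regressor follow a curved trend. The SiO2-like feature and the synthetic temperature relation are placeholder assumptions, not the inputs or calibration of the presented model.

```python
# Hedged sketch of polynomial feature expansion for a linear regressor.
# Feature and target are synthetic placeholders, not the study's dataset.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.uniform(45.0, 75.0, size=(500, 1))          # placeholder feature, e.g. SiO2 wt%
T = 1400.0 - 8.0 * (X[:, 0] - 45.0) + 0.05 * (X[:, 0] - 45.0) ** 2  # synthetic non-linear T trend

model = make_pipeline(
    PolynomialFeatures(degree=3, include_bias=False),  # appends x^2 and x^3 columns
    StandardScaler(),
    Ridge(alpha=1.0),
)
model.fit(X, T)
print(model.predict([[60.0]]))  # predicted T for a 60 wt% SiO2-like composition
```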
Hi Cora, thanks for the interesting questions. 1) We worked on the original data to provide a balanced training dataset. In the presentation, we reported a global calibration over a large spread of P-T-X, always characterized by lower errors than 'ordinary' methods. Working on a local, more focused dataset would probably improve the accuracy in specific P-T-X spaces, but that was not the aim of the study. 2) We tested both linear and non-linear algorithms, but we only reported the best-performing one. 3) The training dataset is made of experimental data from the literature, and pressure is treated as a continuous variable, as expected for natural systems. Thanks for the suggestions and feedback.
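To make the "pressure as a continuous variable" point concrete, the following is a minimal sketch of fitting a tree-based ensemble to a continuous pressure target and scoring it on a held-out split. The synthetic features, the kbar target, and the choice of ExtraTreesRegressor are illustrative assumptions, not the actual training data or algorithm reported in the presentation.

```python
# Hedged sketch: regression on a continuous pressure target with a tree ensemble.
# Data are synthetic stand-ins, not the experimental compilation used in the study.
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 10))             # stand-in for melt/mineral compositions
P = 2.0 + X[:, 0] ** 2 + 0.5 * X[:, 1]      # synthetic continuous pressure target (kbar)

X_train, X_test, P_train, P_test = train_test_split(X, P, test_size=0.2, random_state=0)

reg = ExtraTreesRegressor(n_estimators=300, random_state=0)
reg.fit(X_train, P_train)
print("test MAE (kbar):", mean_absolute_error(P_test, reg.predict(X_test)))
```

Even if the experimental pressures cluster at a few discrete levels, a regression framing like this keeps the predictions continuous and the errors directly comparable in pressure units.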