Abstract
Telehealth services are becoming increasingly popular, leading to a growing amount of data that health professionals must monitor. Machine learning can support them in managing these data, provided that the right algorithms are applied to the right data. We implemented and validated different algorithms for selecting optimal time instances from time series data derived from a diabetes telehealth service. Intrinsic, supervised, and unsupervised instance selection algorithms were analysed. Instance selection had a substantial impact on the accuracy of our random forest model for dropout prediction. The best results were achieved with a One-Class Support Vector Machine, which improved the area under the receiver operating characteristic curve of the original algorithm from 69.91 % to 75.88 %. We conclude that, although hardly mentioned in the telehealth literature so far, instance selection has the potential to significantly improve the accuracy of machine learning algorithms.
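The general idea of unsupervised instance selection ahead of a random forest can be illustrated with a minimal sketch. It is not the pipeline evaluated in the paper: it assumes scikit-learn, uses synthetic data from `make_classification` in place of the telehealth time series, and the helper names `select_instances` and `auc_of_forest` as well as the setting `nu=0.1` are hypothetical choices made for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import OneClassSVM


def select_instances(X, nu=0.1):
    """Return a boolean mask of the rows a One-Class SVM labels as inliers.

    Hypothetical helper: nu bounds the fraction of training rows treated
    as outliers; 0.1 is an arbitrary illustrative value.
    """
    detector = OneClassSVM(kernel="rbf", gamma="scale", nu=nu)
    # fit_predict returns +1 for inliers and -1 for outliers
    return detector.fit_predict(X) == 1


def auc_of_forest(X_train, y_train, X_test, y_test, seed=0):
    """Train a random forest and return its ROC AUC on the test split."""
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)
    scores = model.predict_proba(X_test)[:, 1]
    return roc_auc_score(y_test, scores)


# Synthetic stand-in data (assumption): noisy binary classification problem
X, y = make_classification(
    n_samples=600, n_features=10, flip_y=0.1, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Select training instances only; the test split is left untouched
mask = select_instances(X_train)

auc_all = auc_of_forest(X_train, y_train, X_test, y_test)
auc_selected = auc_of_forest(X_train[mask], y_train[mask], X_test, y_test)

print(f"training rows kept: {mask.sum()} of {len(mask)}")
print(f"AUC, all instances:      {auc_all:.4f}")
print(f"AUC, selected instances: {auc_selected:.4f}")
```

Whether selection helps or hurts depends on the data and on `nu`; on synthetic data the two AUC values may differ in either direction, so the sketch shows the mechanics rather than the reported effect.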
Original language | English |
---|---|
Pages (from-to) | 840-844 |
Number of pages | 5 |
Journal | Studies in Health Technology and Informatics |
Volume | 310 |
Publication status | Published - 25 Jan 2024 |
Event | MedInfo 2023: THE FUTURE IS ACCESSIBLE; International Convention Centre (ICC), Sydney, Australia; Duration: 8 Jul 2023 → 12 Jul 2023; Conference number: 19; https://medinfo2023.org/ |
Research Field
- Exploration of Digital Health
Keywords
- instance selection
- training data selection
- predictive modelling
- telehealth