BibID (field 001): BIBFORM123124
First author: Szeghalmy Szilvia (program designer mathematician)
Title: A comparative study on noise filtering of imbalanced data sets / Szeghalmy Szilvia, Fazekas Attila
Date: 2024
ISSN: 0950-7051
Notes: Class imbalance in data sets used to train classifiers can negatively affect the performance of the resulting models. A commonly used solution to address this issue is to oversample the data sets and supplement them with synthetic samples. However, oversampling can increase the level of noise in the data sets. Several sampling methods attempt to prevent this negative effect in a variety of ways, but there are still many open questions about which solutions might be appropriate in different cases. In our study, we compared the impact of different noise filtering methods on relatively small, imbalanced synthetic data sets and then examined how noise filters perform as a preprocessing step of sampling methods following different sampling strategies. The success of the noise filtering was gauged through the performance of k-Nearest Neighbours models built on the balanced data sets. Our results highlight the importance of cleaning the minority class and provide insight into which noise filtering approaches might be useful as a first step to oversampling on imbalanced noisy data sets. Among the noise filtering methods included in our study, the GMM-based ones perform well on highly noisy imbalanced data sets. It is also worth highlighting some versions of ENN, which are very effective when the noise level is moderate.
Subjects: Engineering sciences; Computer sciences; foreign-language journal publication in a foreign journal
journal article
Noise removal
Oversampling
Imbalanced learning
kNN
Published in: Knowledge-Based Systems. - 301 (2024), p. 1-17.
Additional authors: Fazekas Attila (1968-) (mathematician, computer scientist)
Internet address: URL provided by the author
DOI
Version stored in the institutional repository (DEA)
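
Illustrative note: the abstract in this record describes a pipeline in which a noise filter is applied before oversampling and the result is judged by a k-Nearest Neighbours model. A minimal sketch of that idea in plain Python follows. It is not the authors' implementation: the toy data, the function names (knn_label, enn_filter, oversample), the choice k=3, and the simplified SMOTE-like interpolation used for oversampling are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def knn_label(point, data, k=3):
    """Majority label among the k nearest neighbours of `point` in `data`."""
    nearest = sorted(data, key=lambda s: math.dist(point, s[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def enn_filter(data, k=3):
    """Edited Nearest Neighbours (basic form): drop every sample whose label
    disagrees with the majority label of its k nearest neighbours."""
    kept = []
    for i, (x, y) in enumerate(data):
        others = data[:i] + data[i + 1:]  # exclude the sample itself
        if knn_label(x, others, k) == y:
            kept.append((x, y))
    return kept


def oversample(data, minority, rng):
    """Simplified SMOTE-like step (an assumption, not a library call):
    interpolate between random pairs of minority samples until the two
    classes have the same size. Assumes a binary problem."""
    minority_pts = [x for x, y in data if y == minority]
    need = len(data) - 2 * len(minority_pts)  # majority size minus minority size
    synthetic = []
    for _ in range(max(need, 0)):
        a, b = rng.sample(minority_pts, 2)
        t = rng.random()
        new_point = tuple(ai + t * (bi - ai) for ai, bi in zip(a, b))
        synthetic.append((new_point, minority))
    return data + synthetic


# Hypothetical toy data: class 0 is the majority, class 1 the minority,
# plus one minority-labelled point placed inside the majority cluster.
majority = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (2, 0), (0, 2), (2, 1)]
minority = [(5, 5), (5, 6), (6, 5), (6, 6)]
noisy = [(0.5, 1.0)]
data = [(p, 0) for p in majority] + [(p, 1) for p in minority + noisy]

cleaned = enn_filter(data, k=3)                                   # 1. noise filter
balanced = oversample(cleaned, minority=1, rng=random.Random(0))  # 2. oversample
prediction = knn_label((0.5, 0.8), balanced, k=3)                 # 3. kNN on balanced set
```

On this toy set the filter drops the single minority-labelled point inside the majority cluster, so the synthetic samples are interpolated only between the remaining minority points. Filtering before oversampling is the ordering the abstract examines; the specific filter and sampler here are stand-ins.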