Far-Butterscotch-436 t1_ivuhdm4 wrote

Easy: use all the training data, but give the uncertain examples smaller weights. Keep in mind, though: if the data is uncertain, how far can you trust it? If you say a label is uncertain, is there a known probability that the label is incorrect? And how will you measure performance on the uncertain data versus the certain data? Boosting algorithms will almost certainly overfit the noisy labels, so this will be difficult.
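
For what it's worth, here is a rough sketch of the down-weighting idea, not a recommendation for any particular library. It fits a plain logistic regression by gradient descent on a per-sample weighted log loss. The toy data, the `certain` flags, the function names, and the weight of 0.2 for uncertain rows are all made-up assumptions for illustration.

```python
import math

def fit_weighted_logreg(X, y, w, lr=0.1, epochs=500):
    """Gradient descent on a per-sample weighted log loss (hypothetical helper)."""
    n_features = len(X[0])
    coef = [0.0] * n_features
    bias = 0.0
    total_w = sum(w)
    for _ in range(epochs):
        grad_c = [0.0] * n_features
        grad_b = 0.0
        for xi, yi, wi in zip(X, y, w):
            z = bias + sum(c * v for c, v in zip(coef, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            err = wi * (p - yi)  # the sample weight scales this row's gradient
            for j, v in enumerate(xi):
                grad_c[j] += err * v
            grad_b += err
        coef = [c - lr * g / total_w for c, g in zip(coef, grad_c)]
        bias -= lr * grad_b / total_w
    return coef, bias

def predict_proba(coef, bias, x):
    """Predicted probability of class 1 for a single feature row."""
    z = bias + sum(c * v for c, v in zip(coef, x))
    return 1.0 / (1.0 + math.exp(-z))

# Toy data (made up): one feature; the last two rows carry uncertain labels
# that contradict the trend in the certain rows.
X = [[-2.0], [-1.0], [1.0], [2.0], [1.5], [-1.5]]
y = [0, 0, 1, 1, 0, 1]
certain = [True, True, True, True, False, False]

# Assumption: full weight for certain rows, an arbitrary 0.2 for uncertain ones.
weights = [1.0 if c else 0.2 for c in certain]

coef, bias = fit_weighted_logreg(X, y, weights)
```

If you do have an estimate of how likely each label is to be wrong, that estimate is a more principled source for the weights than a fixed constant.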

Ulfgardleo t1_ivvfwzf wrote

It is not so easy. If we are talking about noise in the input patterns rather than in the labels, then the noisy inputs can be catastrophic for model performance. In that case the model needs to know which data source each input comes from.
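
One simple way to tell the model about the source, sketched here as an assumption rather than the only option, is to append a 0/1 indicator feature to every input row. The helper name `add_source_flag` and the toy rows are hypothetical.

```python
def add_source_flag(rows, is_noisy_source):
    """Append a 0/1 source indicator to every feature row (hypothetical helper)."""
    return [
        list(row) + [1.0 if noisy else 0.0]
        for row, noisy in zip(rows, is_noisy_source)
    ]

# Toy inputs (made up): two rows from a clean source, one from a noisy source.
clean_rows = [[0.1, 0.9], [0.4, 0.6]]
noisy_rows = [[0.3, 1.4]]

X_flagged = add_source_flag(clean_rows + noisy_rows, [False, False, True])
```

With the flag in place, a model that supports feature interactions can at least learn to treat the two sources differently.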
