
sanman t1_j0ynjfi wrote

How to Handle Lots of Missing/Null Values in Data?

I've been given a data set to analyze, and it has a lot of missing data. Typically, I would replace missing values with the mean, the mode, etc. But one particular column has nearly 70% null values. What is the threshold for rejecting a column as unsuitable for analysis, instead of trying to fill in its missing values? How large a proportion of missing values is acceptable before I have to discard the column altogether? Is there a rule of thumb for this?
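
To make the question concrete, here is a minimal sketch of how the share of missing values per column could be measured and compared against a cutoff. The function names, the toy data, and the 0.5 cutoff are all made up for illustration. The cutoff in particular is an arbitrary placeholder, not a recommended rule.

```python
from typing import Any, Dict, List


def missing_fractions(rows: List[Dict[str, Any]]) -> Dict[str, float]:
    """Return the fraction of missing values (None or absent key) per column."""
    if not rows:
        return {}
    # Collect every column name seen in any row, preserving first-seen order.
    columns: List[str] = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    total = len(rows)
    return {
        name: sum(1 for row in rows if row.get(name) is None) / total
        for name in columns
    }


def columns_over_threshold(fractions: Dict[str, float], threshold: float) -> List[str]:
    """List the columns whose missing fraction is strictly above the threshold."""
    return [name for name, frac in fractions.items() if frac > threshold]


# Hypothetical toy data: None marks a missing value.
rows = [
    {"age": 34, "income": None},
    {"age": None, "income": None},
    {"age": 29, "income": 52000},
    {"age": 41, "income": None},
]

fractions = missing_fractions(rows)
# 0.5 is an assumed placeholder cutoff, chosen only for this example.
flagged = columns_over_threshold(fractions, threshold=0.5)
print(fractions)
print(flagged)
```

With this toy data, "income" is missing in three of four rows, so it is the only column flagged. What to do with a flagged column (drop it, impute it, or model the missingness) is exactly the open question.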
