Data preprocessing in data science