ABSTRACT

Employee turnover (ET) is a significant problem that businesses in all industries must deal with. To avoid the cost of hiring and training replacements, businesses constantly seek ways to retain skilled employees. Being able to foresee an employee's resignation lets the business take preventive action.

Using historical employee records, artificial intelligence (AI) and machine learning (ML) prediction models can help estimate the likelihood that employees will quit voluntarily. However, the lack of transparency and interpretability in the outputs of these models makes it difficult for human resource (HR) managers to understand the reasoning behind the forecasts. Unless managers understand how and why a model generates its responses from the input data, they cannot use those responses to improve data-driven decision-making or deliver value to the business.

To address these limitations, we propose a novel data-centric predictive model based on the evolutionary gradient boosted random forest algorithm (EGB-RFA). A dataset from a company's HR department was used to predict employee turnover. As part of preprocessing, we normalized the data and removed duplicate records. Each record in the dataset indicates, alongside the employee's attributes, whether the employee is leaving or staying.

We then use 80% of the preprocessed data for training and 20% for testing to build the prediction model. The essential features are extracted using principal component analysis (PCA). The proposed framework is evaluated on the chosen dataset with various feature settings and dataset sizes. Results show that the final model produced by our approach performs better than expected.
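
The data preparation steps described above (duplicate removal, normalization, an 80/20 split, and PCA) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the min-max scaling choice, the number of components, and the synthetic data are all assumptions made for illustration, and the EGB-RFA classifier itself is not shown because the abstract does not specify it.

```python
import numpy as np

def preprocess(X, y):
    """Drop exact duplicate records, then min-max scale each feature to [0, 1]."""
    data = np.unique(np.column_stack([X, y]), axis=0)  # also sorts the rows
    X, y = data[:, :-1], data[:, -1].astype(int)
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # guard against constant features
    return (X - lo) / span, y

def split_80_20(X, y, seed=0):
    """Shuffle the records and keep 80% for training, 20% for testing."""
    idx = np.random.default_rng(seed).permutation(len(X))
    cut = int(0.8 * len(X))
    return X[idx[:cut]], X[idx[cut:]], y[idx[:cut]], y[idx[cut:]]

def fit_pca(X_train, n_components):
    """Return the training mean and the top principal directions (via SVD)."""
    mean = X_train.mean(axis=0)
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, vt[:n_components]

def apply_pca(X, mean, components):
    """Project records onto the principal directions learned from training data."""
    return (X - mean) @ components.T

# Synthetic stand-in for HR records (hypothetical): 100 employees, 5 numeric
# attributes, a binary leave/stay label, plus 10 deliberately duplicated rows.
rng = np.random.default_rng(42)
X_raw = rng.normal(size=(100, 5))
y_raw = (X_raw[:, 0] > 0).astype(int)
X_dup = np.vstack([X_raw, X_raw[:10]])
y_dup = np.concatenate([y_raw, y_raw[:10]])

X_clean, y_clean = preprocess(X_dup, y_dup)
X_train, X_test, y_train, y_test = split_80_20(X_clean, y_clean)
mean, components = fit_pca(X_train, n_components=2)
Z_train = apply_pca(X_train, mean, components)
Z_test = apply_pca(X_test, mean, components)
```

The sketch follows the order given in the abstract and scales the data before splitting. In practice, scaling statistics are often computed on the training portion only, so that no information from the test set leaks into training. The PCA step here is fitted on the training data alone, and the reduced features `Z_train` and `Z_test` would then be passed to the classifier.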