Enhancing Predictive Accuracy Using Ensemble Machine Learning Models: A Data-Centric Approach
Keywords:
Data-Centric Approach, Machine Learning, Artificial Intelligence, Big Data Analytics, Prediction Accuracy
Abstract
The abundance of data and processing capacity has driven the development of artificial intelligence (AI), which is now closely linked to deep learning. Academics have traditionally taken a model-centric approach, concentrating on creating innovative models and algorithms to improve performance without changing the underlying data. The Data-Centric Approach, also called the data-oriented model, emerged from notable AI expert Andrew Ng's recent emphasis on better-quality data rather than better methods. In the field of machine learning (ML), the shift from a model-oriented to a data-oriented paradigm has accelerated. Notwithstanding its potential, the Data-Centric Approach faces a number of obstacles, such as (a) producing high-quality data, (b) protecting data privacy, and (c) resolving biases to make datasets equitable. Little recent work has addressed the preparation of high-quality data. Our study seeks to close this gap by concentrating on producing high-quality data using techniques such as data augmentation, multistage hashing to remove duplicate instances, detection and correction of noisy labels, and confident learning. We present Enhancing Predictive Accuracy Using Ensemble Machine Learning Models: A Data-Centric Approach (EPAE-MLDCA). In a comparative performance study, EPAE-MLDCA consistently outperformed the model-centric approach, providing greater accuracy. This research shows how the EPAE-MLDCA approach may be further investigated and applied in a variety of fields, including entertainment, finance, healthcare, and education, where high-quality data could greatly improve performance.
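As an illustration of the duplicate-removal step named above, the following is a minimal sketch of hashing-based deduplication in two stages. The abstract does not specify the stages used in EPAE-MLDCA, so the two stages here (an exact-content hash followed by a hash of normalized text) and the function names `exact_key`, `normalized_key`, and `deduplicate` are assumptions made purely for illustration.

```python
import hashlib
import re


def exact_key(text: str) -> str:
    """Stage 1 (assumed): hash of the raw text, catches byte-identical duplicates."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def normalized_key(text: str) -> str:
    """Stage 2 (assumed): hash after lowercasing, dropping punctuation,
    and collapsing whitespace, catches trivially reformatted duplicates."""
    cleaned = re.sub(r"[^\w\s]", "", text.lower())
    cleaned = " ".join(cleaned.split())
    return hashlib.sha256(cleaned.encode("utf-8")).hexdigest()


def deduplicate(records):
    """Return records with duplicates removed, keeping first occurrences in order."""
    seen_exact = set()
    seen_normalized = set()
    kept = []
    for text in records:
        k1 = exact_key(text)
        if k1 in seen_exact:
            continue  # exact duplicate, skip without the costlier normalization
        seen_exact.add(k1)
        k2 = normalized_key(text)
        if k2 in seen_normalized:
            continue  # duplicate after normalization
        seen_normalized.add(k2)
        kept.append(text)
    return kept


if __name__ == "__main__":
    sample = ["Hello, world!", "Hello, world!", "hello   world", "Goodbye"]
    print(deduplicate(sample))
```

Running the cheap exact-match stage first means that only records surviving it pay for normalization. A real pipeline would likely replace the second stage with a similarity-preserving hash if near-duplicates beyond formatting differences need to be caught.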