ENSEMBLE LEARNING APPROACHES FOR CLASSIFICATION WITH HIGH-DIMENSIONAL DATA

Subject heading index

Research field

Document type

BB

Author

Tran Cao Truong

Title

ENSEMBLE LEARNING APPROACHES FOR CLASSIFICATION WITH HIGH-DIMENSIONAL DATA


Source

Journal of Science and Technique: Section on Information and Communication Technology

Year of publication

2023

Issue

01

Page

83

ISSN

Keywords

English keywords

Abstract

Classification with high-dimensional data is a significant challenge in machine learning: the abundance of features makes it difficult to identify meaningful patterns, which leads to overfitting and reduced classification performance. Moreover, processing high-dimensional data is often prohibitively expensive computationally and may require specialized hardware or optimized algorithms. Ensemble learning combines multiple models to improve classification accuracy. By aggregating the predictions of multiple models, it can reduce overfitting, increase robustness, and improve performance on a wide range of real-world classification problems. It is well suited to high-dimensional classification because combining models mitigates the effects of the curse of dimensionality and enhances generalization. Training the models with different learning algorithms or on different subsets of features increases their diversity, leading to better overall performance on high-dimensional data. This paper proposes two hybrid ensemble machine learning approaches that integrate the random subspace ensemble with bagging and boosting to enhance classification performance with high-dimensional data. Experimental results demonstrate that these methods significantly improve classification accuracy with high-dimensional data.
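
The abstract names the combination of a random subspace ensemble with bagging but this record does not give the algorithm itself. The following is a minimal sketch of that general idea, not the paper's method: each base model is trained on a bootstrap sample of the rows (bagging) and may only split on a random subset of the features (random subspace), and predictions are combined by majority vote. The decision-stump base learner, the toy data generator, and all function and parameter names (`fit_stump`, `fit_subspace_bagging`, `n_sub_features`, and so on) are assumptions chosen for illustration. The boosting variant is not shown.

```python
import random
from collections import Counter


def fit_stump(X, y, features):
    """Fit a one-split classifier, searching only the given feature indices."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [label for row, label in zip(X, y) if row[f] <= t]
            right = [label for row, label in zip(X, y) if row[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(l != left_label for l in left)
                      + sum(l != right_label for l in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    return best[1:]  # (feature, threshold, left_label, right_label)


def predict_stump(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label


def fit_subspace_bagging(X, y, n_models=25, n_sub_features=3, seed=0):
    """Train stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    models = []
    for _ in range(n_models):
        features = rng.sample(range(d), n_sub_features)  # random subspace
        idx = [rng.randrange(n) for _ in range(n)]       # bootstrap rows (bagging)
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], features))
    return models


def predict_ensemble(models, row):
    """Majority vote over the base models."""
    votes = Counter(predict_stump(m, row) for m in models)
    return votes.most_common(1)[0][0]


def make_toy_data(n=100, n_informative=10, n_noise=10, seed=1):
    """Hypothetical toy data: some features track the label, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        informative = [2.0 * label + rng.gauss(0, 0.5) for _ in range(n_informative)]
        noise = [rng.gauss(0, 1) for _ in range(n_noise)]
        X.append(informative + noise)
        y.append(label)
    return X, y


X, y = make_toy_data()
models = fit_subspace_bagging(X, y)
accuracy = sum(predict_ensemble(models, r) == l for r, l in zip(X, y)) / len(y)
print(f"training accuracy: {accuracy:.2f}")
```

Restricting each stump to a few features keeps individual models cheap and makes them disagree with one another, which is the diversity the abstract refers to. The vote then averages out stumps that happened to draw only uninformative features.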


Shelf mark
