EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
Disruptive technologies provide unparalleled opportunities to contribute to
many aspects of pervasive healthcare, from the adoption
of the Internet of Things through to Machine Learning (ML) techniques. As a
powerful tool, ML has been widely applied in patient-centric healthcare
solutions. To further improve the quality of patient care, Electronic Health
Records (EHRs) are commonly adopted in healthcare facilities for analysis.
Applying AI and ML to analyse those EHRs for prediction and diagnostics is a
crucial yet challenging task, owing to their highly unstructured, unbalanced,
incomplete, and high-dimensional nature. Dimensionality reduction is a common data
preprocessing technique to cope with high-dimensional EHR data, which aims to
reduce the number of features of EHR representation while improving the
performance of the subsequent data analysis, e.g. classification. In this work,
an efficient filter-based feature selection method, namely Curvature-based
Feature Selection (CFS), is presented. The proposed CFS applies the concept of
Menger Curvature to rank the weights of all features in the given data set. The
performance of the proposed CFS has been evaluated on four well-known EHR data
sets, including Cervical Cancer Risk Factors (CCRFDS), Breast Cancer Coimbra
(BCCDS), Breast Tissue (BTDS), and Diabetic Retinopathy Debrecen (DRDDS). The
experimental results show that the proposed CFS achieved state-of-the-art
performance on the above data sets compared with conventional PCA and other
recent approaches. The source code of the proposed approach is publicly
available at https://github.com/zhemingzuo/CFS.
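
For readers unfamiliar with the quantity named above, the Menger curvature of three points is the reciprocal of the radius of the circle passing through them, i.e. four times the triangle area divided by the product of the three side lengths. The following is a minimal Python sketch of that definition only. The function name `menger_curvature`, the restriction to 2D points, and the convention of returning 0 for coincident points are assumptions made for illustration; how CFS maps features to points and turns curvature values into feature weights is not described in this abstract and is not reproduced here.

```python
import math


def menger_curvature(p, q, r):
    """Menger curvature of three 2D points: 4 * area / (|pq| * |qr| * |rp|).

    Illustrative helper, not taken from the CFS source code.
    """
    (x1, y1), (x2, y2), (x3, y3) = p, q, r
    # The cross product of (q - p) and (r - p) equals twice the signed triangle area.
    cross = (x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1)
    a = math.dist(p, q)
    b = math.dist(q, r)
    c = math.dist(r, p)
    if a == 0 or b == 0 or c == 0:
        # Assumed convention: coincident points contribute zero curvature.
        return 0.0
    # 4 * area = 2 * |cross|
    return 2.0 * abs(cross) / (a * b * c)


if __name__ == "__main__":
    # Three points on the unit circle, so the curvature is 1 / radius.
    print(menger_curvature((1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)))
    # Collinear points have zero curvature.
    print(menger_curvature((0.0, 0.0), (1.0, 1.0), (2.0, 2.0)))
```

Three points on a circle of radius R give a curvature of 1/R, while collinear points give 0, which makes the quantity a natural measure of how sharply a sequence of points bends.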