Steffen Holter, Homer Gamil and Anthony Tzes
Dimensionality reduction (DR) is widely used in the field of Big Data because it facilitates the visualization, classification, communication, and storage of high-dimensional information. As such, even small improvements in the feature extraction method can yield significant benefits to the user. Common algorithms such as PCA and its variants are limited to linear orthogonal transformations, and considerably superior performance has been achieved through the use of deep learning and autoencoders. However, for these, as well as for most other DR methods, the evaluation and interpretability of the feature extraction remain limited. This paper introduces a 3D data visualization interface for the depiction and evaluation of dimensionality reduction. A data-driven scheme relying on a multi-layer autoencoder architecture is used to generate the underlying mapping from the high-dimensional space to the reduced space. The efficacy of the feature extraction process is examined by evaluating data and class property retention through various functions integrated into the tool. For example, a cluster analysis is used to explore the relative positions of similarly classed datapoints. In addition, the surrounding feature space is explored through the generation of custom points using an inverse mapping. Simulation studies on the MNIST dataset serve as a use case to demonstrate the functionality of the tool.
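The cluster analysis mentioned above can be illustrated with a minimal sketch. The abstract does not state which cluster measure the tool uses, so the functions `class_centroids` and `separation_ratio` below are hypothetical names introduced only for illustration, and the synthetic 3D points stand in for an autoencoder embedding (they are not MNIST results). The sketch computes one centroid per class in the reduced space and compares the mean distance between centroids to the mean distance of points from their own centroid; larger values would suggest that similarly classed datapoints lie closer together than differently classed ones.

```python
import math
import random
from collections import defaultdict


def class_centroids(points, labels):
    """Return a dict mapping each class label to the mean of its embedded points."""
    groups = defaultdict(list)
    for p, c in zip(points, labels):
        groups[c].append(p)
    return {
        c: tuple(sum(axis) / len(ps) for axis in zip(*ps))
        for c, ps in groups.items()
    }


def separation_ratio(points, labels):
    """Mean inter-centroid distance divided by mean intra-class spread.

    This is an illustrative measure (an assumption, not the tool's metric).
    """
    centroids = class_centroids(points, labels)
    # Intra-class spread: average distance of each point to its own centroid.
    intra = sum(
        math.dist(p, centroids[c]) for p, c in zip(points, labels)
    ) / len(points)
    # Inter-class distance: average over all distinct pairs of centroids.
    keys = sorted(centroids)
    pairs = [(a, b) for i, a in enumerate(keys) for b in keys[i + 1:]]
    inter = sum(math.dist(centroids[a], centroids[b]) for a, b in pairs) / len(pairs)
    return inter / intra


# Synthetic stand-in for a 3D embedding: three classes around assumed centers.
rng = random.Random(0)
centers = {0: (0.0, 0.0, 0.0), 1: (5.0, 0.0, 0.0), 2: (0.0, 5.0, 0.0)}
labels = [c for c in centers for _ in range(100)]
points = [tuple(m + rng.gauss(0.0, 0.3) for m in centers[c]) for c in labels]

centroids = class_centroids(points, labels)
ratio = separation_ratio(points, labels)
print(f"separation ratio: {ratio:.2f}")
```

In practice the `points` would be replaced by the encoder output for each sample and `labels` by the known class of that sample; the same two functions would then give a rough numerical companion to the visual inspection of clusters.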