Voxel2vec

Relationships in scientific data are intricate: they include the numerical and spatial distribution relations of features in univariate data, the relations among scalar-value combinations in multivariate data, and the association between volumes in time-varying and ensemble data. This paper presents voxel2vec, a novel unsupervised representation learning model that learns distributed representations of scalar values in a low-dimensional vector space. The basic assumption is that if two scalar values or scalar-value combinations have similar contexts, they usually have high similarity in terms of features. By representing scalar values as symbols, voxel2vec learns the similarity between scalar values from the context of their spatial distribution, and the overall association between volumes can then be explored by transfer prediction. We demonstrate the usefulness and effectiveness of voxel2vec by comparing it with the isosurface similarity map on univariate data, and by applying the learned distributed representations to feature classification for multivariate data and to association analysis for time-varying and ensemble data.
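The context assumption can be illustrated with a minimal sketch. This is not the voxel2vec model itself: it only turns scalar values into symbols and counts which symbols occur in neighboring voxels, which is the kind of (value, context) statistic an embedding model could be trained on. The function names `quantize` and `cooccurrence`, the uniform binning, and the face-adjacent neighborhood are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def quantize(volume, n_bins):
    """Map each scalar value to an integer symbol in [0, n_bins).

    Uniform binning is an assumption of this sketch.
    """
    lo, hi = float(volume.min()), float(volume.max())
    if hi == lo:
        return np.zeros(volume.shape, dtype=np.int64)
    scaled = (volume - lo) / (hi - lo)
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return np.minimum((scaled * n_bins).astype(np.int64), n_bins - 1)

def cooccurrence(symbols, n_bins):
    """Count symmetric symbol pairs over face-adjacent voxels.

    Row i of the result describes the spatial context of symbol i.
    """
    counts = np.zeros((n_bins, n_bins), dtype=np.int64)
    for axis in range(symbols.ndim):
        moved = np.moveaxis(symbols, axis, 0)
        a = moved[:-1].ravel()  # each voxel ...
        b = moved[1:].ravel()   # ... and its next neighbor along this axis
        np.add.at(counts, (a, b), 1)
        np.add.at(counts, (b, a), 1)
    return counts

# Toy volume whose value changes only along the first axis.
volume = np.repeat(np.arange(4.0), 16).reshape(4, 4, 4)
symbols = quantize(volume, n_bins=4)
counts = cooccurrence(symbols, n_bins=4)
```

Under this sketch, two symbols whose rows in `counts` look alike share a similar spatial context, which is the property the basic assumption relates to feature similarity.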