Indoor visual localization, i.e., 6 degree-of-freedom camera pose estimation for a query image with respect to a known scene, is gaining increased attention, driven by the rapid progress of applications such as robotics and augmented reality. However, drastic visual discrepancies between an onsite query image and prerecorded indoor images pose a significant challenge for visual localization. In this paper, based on the key observation that planar surfaces such as floors or walls are consistently present in indoor scenes, we propose a novel system that incorporates geometric information to address the issues that arise when only pixelated images are used. Through the system implementation, we contribute a hierarchical structure consisting of pre-scanned images and a point cloud, as well as a distilled representation of the planar-element layout extracted from the original dataset. A view synthesis procedure is designed to generate synthetic images that complement a sparsely sampled dataset. Moreover, a global image descriptor based on image statistics, called block mean, variance, and color (BMVC), is employed in combination with a traditional convolutional neural network (CNN) descriptor to speed up candidate pose identification. Experimental results on a popular benchmark demonstrate that the proposed method outperforms state-of-the-art approaches in terms of visual localization validity and accuracy.
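To give a rough intuition for a block-statistics descriptor of the kind the BMVC name suggests, the following is a minimal sketch, not the paper's implementation. It assumes the image is split into a fixed grid and that each block contributes its mean intensity, intensity variance, and mean color; the grid size, the feature ordering, the squared Euclidean distance, and the names `block_stats_descriptor` and `rank_candidates` are all illustrative assumptions.

```python
from statistics import fmean, pvariance


def block_stats_descriptor(image, grid=(2, 2)):
    """Build a per-block statistics descriptor (hypothetical layout).

    image: list of rows, each row a list of (r, g, b) tuples.
    grid:  (rows, cols) number of blocks; an assumed parameter.
    Each block contributes [mean gray, gray variance, mean r, mean g, mean b].
    """
    rows, cols = grid
    height, width = len(image), len(image[0])
    bh, bw = height // rows, width // cols
    if bh == 0 or bw == 0:
        raise ValueError("image is smaller than the requested grid")

    descriptor = []
    for r in range(rows):
        for c in range(cols):
            # Collect the pixels of this block; trailing rows/columns that do
            # not fill a whole block are ignored.
            pixels = [
                image[y][x]
                for y in range(r * bh, (r + 1) * bh)
                for x in range(c * bw, (c + 1) * bw)
            ]
            gray = [sum(p) / 3.0 for p in pixels]
            descriptor.append(fmean(gray))      # block mean
            descriptor.append(pvariance(gray))  # block variance
            descriptor.extend(fmean(p[ch] for p in pixels) for ch in range(3))  # block color
    return descriptor


def rank_candidates(query_desc, database_descs, top_k=5):
    """Return indices of the top_k database descriptors closest to the query
    under squared Euclidean distance (an assumed similarity measure)."""
    def sq_dist(d):
        return sum((a - b) ** 2 for a, b in zip(query_desc, d))

    order = sorted(range(len(database_descs)), key=lambda i: sq_dist(database_descs[i]))
    return order[:top_k]
```

A descriptor like this is cheap to compute and compare, so it could serve as a fast pre-filter that shortlists candidate poses before a more expensive CNN descriptor is applied to the shortlist.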