Near-infrared-visible (NIR-VIS) face recognition aims to match an NIR face image to a VIS face image. Its main challenges are the cross-modality gap and the lack of sufficient paired NIR-VIS face images for training models. This paper focuses on the generation of paired NIR-VIS face images and proposes a dual variational generator based on ResNeSt (RS-DVG). RS-DVG can generate a large number of paired NIR-VIS face images from noise, and these generated images can be used as training data together with real NIR-VIS face images. In addition, a triplet loss function is introduced, and a novel triplet selection method is proposed specifically for training the face recognition model, so that the inter-class distance is maximized and the intra-class distance is minimized for the input face images. The proposed method was evaluated on the CASIA NIR-VIS 2.0 and BUAA-VisNir datasets, and relatively good results were obtained.
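To make the role of the triplet loss concrete, the following is a minimal sketch of the standard triplet margin loss, not the paper's implementation. The function names, the margin value of 0.2, and the toy 2-D embeddings are all assumptions chosen for illustration, and the paper's novel triplet selection method is not reproduced here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin` (an assumed hyperparameter).
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical embeddings: an NIR anchor, a VIS image of the same identity,
# and a VIS image of a different identity.
nir_anchor = [0.0, 0.0]
vis_same = [0.3, 0.4]   # distance 0.5 from the anchor
vis_other = [0.6, 0.8]  # distance 1.0 from the anchor

# Well-separated triplet: the constraint is already satisfied.
loss_easy = triplet_loss(nir_anchor, vis_same, vis_other)

# Swapping positive and negative violates the constraint, so the loss is positive.
loss_hard = triplet_loss(nir_anchor, vis_other, vis_same)

print(loss_easy, loss_hard)
```

Minimizing this quantity over many triplets pulls same-identity NIR and VIS embeddings together and pushes different identities apart, which is the intra-class and inter-class behaviour described above. Which triplets are fed to the loss is decided by the selection method, which this sketch leaves out.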