An intelligent solution method is proposed to achieve real-time optimal control of continuous-time nonlinear systems using a novel identifier-actor-optimizer (IAO) policy learning architecture. In this IAO-based policy learning approach, a dynamical identifier is first developed to approximate the unknown part of the system dynamics using deep neural networks (DNNs). Then, an indirect-method-based optimizer is proposed to generate high-quality optimal actions for system control, accounting for both the constraints and the performance index. Furthermore, a DNN-based actor is developed to approximate the obtained optimal actions and to return good initial guesses to the optimizer. In this way, traditional optimal control methods and state-of-the-art DNN techniques are combined in the IAO-based optimal policy learning method. Compared with reinforcement learning algorithms built on actor-critic architectures, which suffer from difficult reward design and low computational efficiency, the IAO-based optimal policy learning algorithm requires fewer user-defined parameters and offers higher learning speed and more stable convergence in solving complex continuous-time optimal control problems (OCPs). Simulation results for three space flight control missions are given to substantiate the effectiveness of the IAO-based policy learning strategy and to illustrate the performance of the developed DNN-based optimal control method for continuous-time OCPs.
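To make the interplay of the three components concrete, the following is a minimal sketch of one IAO iteration in PyTorch. The network sizes, the synthetic training data, and the `indirect_optimizer` placeholder (which stands in for the paper's indirect-method stage, e.g. solving the associated boundary-value problem) are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of one identifier-actor-optimizer (IAO) iteration.
# ASSUMPTIONS: state/action dimensions, network widths, and the
# indirect_optimizer stub are hypothetical; the paper's indirect method
# would solve an optimal control problem here instead.
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Simple feedforward DNN used for both the identifier and the actor."""
    def __init__(self, in_dim, out_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        return self.net(x)

state_dim, action_dim = 4, 2
identifier = MLP(state_dim + action_dim, state_dim)  # approximates unknown dynamics
actor = MLP(state_dim, action_dim)                   # approximates optimal actions

id_opt = torch.optim.Adam(identifier.parameters(), lr=1e-3)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)

def indirect_optimizer(x0, initial_guess):
    """Placeholder for the indirect-method optimizer stage; here it only
    perturbs the actor's warm start so the loop runs end to end."""
    return initial_guess + 0.01 * torch.randn_like(initial_guess)

for step in range(1000):
    # 1. Identifier: fit measured state derivatives (synthetic data here).
    x = torch.randn(32, state_dim)
    u = torch.randn(32, action_dim)
    x_dot_measured = torch.randn(32, state_dim)  # stand-in for real trajectory data
    id_loss = nn.functional.mse_loss(
        identifier(torch.cat([x, u], dim=1)), x_dot_measured)
    id_opt.zero_grad(); id_loss.backward(); id_opt.step()

    # 2. Optimizer: refine the actor's warm start into a (near-)optimal action.
    with torch.no_grad():
        warm_start = actor(x)
    u_star = indirect_optimizer(x, warm_start)

    # 3. Actor: supervised regression onto the optimizer's actions, so it
    #    returns better initial guesses on the next pass.
    actor_loss = nn.functional.mse_loss(actor(x), u_star)
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
```

This illustrates the claimed advantage over actor-critic reinforcement learning: the actor is trained by direct regression on optimizer outputs rather than by a reward signal, so no reward function must be hand-designed.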