Cognitive radar is a radar system framework recently proposed by Simon Haykin. Adaptive waveform selection is a central problem for the intelligent transmitter in a cognitive radar. In this paper, adaptive waveform selection is modeled as a stochastic dynamic programming problem and solved with Q-learning, which does not require explicit knowledge of the state-transition probabilities. Simulation results demonstrate that this method approaches the optimal waveform selection scheme and yields lower state-estimation uncertainty than a fixed waveform. Finally, the paper is summarized.
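
To illustrate how Q-learning can select waveforms without a transition model, the following is a minimal sketch of tabular Q-learning on a toy problem. It is not the formulation used in this paper: the three uncertainty levels, the two-waveform library, the probabilities in `P_IMPROVE`, the reward, and all function names are hypothetical assumptions chosen only for illustration. The learner never reads `P_IMPROVE`; it observes only sampled transitions and rewards.

```python
import random

N_STATES = 3    # hypothetical uncertainty levels: 0 = low, 2 = high
N_ACTIONS = 2   # hypothetical waveform library with two waveforms
# Assumed probability that each waveform lowers the uncertainty level.
# Only the simulator uses this table; the learner never sees it.
P_IMPROVE = [0.9, 0.1]


def step(state, action, rng):
    """Toy simulator standing in for the radar environment (an assumption)."""
    if rng.random() < P_IMPROVE[action]:
        next_state = max(state - 1, 0)
    else:
        next_state = min(state + 1, N_STATES - 1)
    reward = -float(next_state)  # penalize high estimation uncertainty
    return next_state, reward


def train(episodes=5000, horizon=20, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        state = rng.randrange(N_STATES)  # random restart to cover all states
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.randrange(N_ACTIONS)  # explore
            else:
                action = max(range(N_ACTIONS), key=lambda a: q[state][a])  # exploit
            next_state, reward = step(state, action, rng)
            # Q-learning update: bootstrap from the best action in the next state.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the index of the highest-valued waveform for each state."""
    return [max(range(N_ACTIONS), key=lambda a: row[a]) for row in q]


q_table = train()
print(greedy_policy(q_table))
```

Under these assumptions, waveform 0 reduces uncertainty far more often than waveform 1, so the learned greedy policy should favor it in every state even though the update rule uses only observed transitions.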