Algorithms derived from this paper by Stanford's Jiwei Li et al.
Updated and improved from earlier versions, now supporting TensorFlow 1.3+ and Python 3.6+.
The model is based on a generative adversarial architecture, in which a Generative model learns to create examples that are evaluated by a Discriminator model.
Generative Model: Since this is an NLP task, we use a Seq2Seq setup with GRU cells and an attention mechanism.
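
A minimal sketch of such a generator in TensorFlow 1.x is shown below. It is not the repository's actual code; the placeholder names and hyper-parameters (`vocab_size`, `num_units`, etc.) are illustrative assumptions.

```python
import tensorflow as tf  # TensorFlow 1.x

# Illustrative hyper-parameters (not this repository's actual settings).
vocab_size, embed_dim, num_units, batch_size = 10000, 128, 256, 32

encoder_inputs = tf.placeholder(tf.int32, [batch_size, None])  # source token ids
decoder_inputs = tf.placeholder(tf.int32, [batch_size, None])  # shifted target ids
source_lengths = tf.placeholder(tf.int32, [batch_size])
target_lengths = tf.placeholder(tf.int32, [batch_size])

embedding = tf.get_variable("embedding", [vocab_size, embed_dim])
enc_emb = tf.nn.embedding_lookup(embedding, encoder_inputs)
dec_emb = tf.nn.embedding_lookup(embedding, decoder_inputs)

# Encoder: a single GRU layer over the source utterance.
encoder_cell = tf.nn.rnn_cell.GRUCell(num_units)
encoder_outputs, encoder_state = tf.nn.dynamic_rnn(
    encoder_cell, enc_emb, sequence_length=source_lengths, dtype=tf.float32)

# Decoder: a GRU cell wrapped with Bahdanau attention over the encoder outputs.
attention = tf.contrib.seq2seq.BahdanauAttention(
    num_units, memory=encoder_outputs, memory_sequence_length=source_lengths)
decoder_cell = tf.contrib.seq2seq.AttentionWrapper(
    tf.nn.rnn_cell.GRUCell(num_units), attention, attention_layer_size=num_units)

helper = tf.contrib.seq2seq.TrainingHelper(dec_emb, target_lengths)
decoder = tf.contrib.seq2seq.BasicDecoder(
    decoder_cell, helper,
    initial_state=decoder_cell.zero_state(batch_size, tf.float32)
        .clone(cell_state=encoder_state),
    output_layer=tf.layers.Dense(vocab_size))
outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder)
logits = outputs.rnn_output  # [batch, time, vocab] scores for each target token
```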
Discriminator Model: A hierarchical RNN, as used in Iulian V. Serban's paper.
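
A minimal sketch of a hierarchical discriminator follows: one GRU encodes each turn into an utterance vector, a second GRU runs over those vectors, and a sigmoid classifier scores the dialogue. It assumes padded dialogues with a fixed number of turns; all placeholder names and shapes are illustrative, not the repository's implementation.

```python
import tensorflow as tf  # TensorFlow 1.x

# Illustrative shapes: a dialogue is a fixed number of turns of padded token ids.
vocab_size, embed_dim, num_units = 10000, 128, 256
batch_size, num_turns = 32, 3  # e.g. two context turns plus the candidate response

dialogue = tf.placeholder(tf.int32, [batch_size, num_turns, None])  # token ids
turn_lengths = tf.placeholder(tf.int32, [batch_size, num_turns])    # per-turn lengths

embedding = tf.get_variable("disc_embedding", [vocab_size, embed_dim])
emb = tf.nn.embedding_lookup(embedding, dialogue)  # [batch, turns, time, embed]

# Utterance-level GRU: encode every turn independently by folding turns into the batch.
flat_emb = tf.reshape(emb, [batch_size * num_turns, -1, embed_dim])
flat_len = tf.reshape(turn_lengths, [batch_size * num_turns])
with tf.variable_scope("utterance_rnn"):
    _, utt_state = tf.nn.dynamic_rnn(
        tf.nn.rnn_cell.GRUCell(num_units), flat_emb,
        sequence_length=flat_len, dtype=tf.float32)
utt_vectors = tf.reshape(utt_state, [batch_size, num_turns, num_units])

# Dialogue-level GRU: run over the sequence of utterance vectors.
with tf.variable_scope("dialogue_rnn"):
    _, dialogue_state = tf.nn.dynamic_rnn(
        tf.nn.rnn_cell.GRUCell(num_units), utt_vectors, dtype=tf.float32)

# Binary classifier: probability that the candidate response is human-generated.
logit = tf.layers.dense(dialogue_state, 1)
p_human = tf.sigmoid(tf.squeeze(logit, axis=1))
```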
As discussed by Jiwei Li et al., discrete problems such as dialogue generation are difficult to train with a reinforcement strategy, since the discriminator only provides a reward for a fully decoded sequence. Li et al. therefore use Monte Carlo search over partially decoded sequences to obtain a reward signal for each step of the reinforcement update.
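
The sketch below illustrates that idea: each prefix of the generated response is completed several times by the generator, the completions are scored by the discriminator, and the average score becomes the per-step reward. The `rollout_fn` and `score_fn` helpers are hypothetical stand-ins, not functions from this code base.

```python
import numpy as np

def step_rewards(context, response, rollout_fn, score_fn, num_rollouts=5):
    """Monte Carlo reward for every decoding step (hypothetical helpers).

    rollout_fn(context, prefix) -> a full response sampled by the generator,
        conditioned on the partially decoded prefix.
    score_fn(context, response) -> discriminator probability that the
        response is human-generated, used as the reward.
    """
    rewards = []
    for t in range(1, len(response) + 1):
        prefix = response[:t]
        if t == len(response):
            # The sequence is complete: score it directly.
            rewards.append(score_fn(context, prefix))
        else:
            # Complete the prefix num_rollouts times and average the scores.
            scores = [score_fn(context, rollout_fn(context, prefix))
                      for _ in range(num_rollouts)]
            rewards.append(np.mean(scores))
    return np.array(rewards)  # one reward per generated token

# REINFORCE-style update: weight each token's log-probability by its reward,
# so tokens that lead to "human-like" completions become more likely, e.g.
# loss = -sum(rewards[t] * log_prob[t] for t in range(len(response)))
```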
Many thanks to @liuyuemaicha for providing the initial code base for TensorFlow < 1.x.