A Neural Conversational Model

Oriol Vinyals and Quoc V. Le

The paper uses neural networks to map sequences to sequences, where the sequences here are statements and their responses.

The advantage of using neural networks is that the system becomes simple and general: the same end-to-end system can be used for machine translation, conversations and question answering.

The only difference in conversations is that the input sequence is the concatenation of everything that has been said so far (the entire context) and the output sequence is the reply.
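To make the input/output split concrete, here is a minimal sketch of how a conversation could be turned into one training pair. The `build_example` helper and the `<eos>` turn marker are assumptions made for illustration, not details taken from the paper.

```python
# Hypothetical end-of-turn marker; the paper does not prescribe this exact token.
EOS = "<eos>"

def build_example(turns):
    """Split a conversation into (input sequence, output sequence).

    `turns` is a list of utterances, each a list of tokens. Every turn
    but the last is concatenated into the context; the last is the reply.
    """
    if len(turns) < 2:
        raise ValueError("need at least one context turn and one reply")
    context = []
    for turn in turns[:-1]:
        context.extend(turn)
        context.append(EOS)  # mark the boundary between turns
    reply = list(turns[-1]) + [EOS]
    return context, reply

# Toy conversation, invented for this example.
conversation = [
    ["hi"],
    ["hello", "how", "can", "i", "help"],
    ["my", "vpn", "is", "down"],
]
source, target = build_example(conversation)
print(source)
print(target)
```

Longer conversations simply produce a longer input sequence; nothing else about the setup changes.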

Compared to conversation, translation is a much easier task. One reason is that the objective function used for conversations is only a simplification of the real objective of human communication. Human conversations are long term and are based on an exchange of information rather than on step-by-step prediction. They also need world knowledge, which is absent in an unsupervised model.

Model

The model is a sequence-to-sequence (seq2seq) framework, which reads the input one token per time step and produces the output one token per time step. The framework is built from RNNs. During training the true output sequence is given to the model, so it can be learned through backpropagation.
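The token-by-token reading and writing can be sketched with a toy recurrent network. This is only an illustration under loud assumptions: the weights are random and untrained, the vocabulary is invented, a plain tanh RNN stands in for whatever recurrent cell is actually used, and greedy decoding is chosen for simplicity. The output is therefore meaningless; only the control flow matters.

```python
import math
import random

random.seed(0)  # fixed seed so the untrained toy weights are reproducible

VOCAB = ["<eos>", "hi", "hello", "how", "are", "you"]  # toy vocabulary (assumption)
V, H = len(VOCAB), 8  # vocabulary size and hidden size

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

W_in = rand_matrix(H, V)   # input token -> hidden
W_hh = rand_matrix(H, H)   # previous hidden -> hidden
W_out = rand_matrix(V, H)  # hidden -> vocabulary scores

def rnn_step(token_id, hidden):
    """Consume one token and return the next hidden state."""
    return [
        math.tanh(W_in[i][token_id] + sum(W_hh[i][j] * hidden[j] for j in range(H)))
        for i in range(H)
    ]

def respond(input_ids, max_len=10):
    """Read the input one token per step, then emit one token per step."""
    hidden = [0.0] * H
    for token_id in input_ids:            # one input token per time step
        hidden = rnn_step(token_id, hidden)
    output, prev = [], 0                  # decoding starts from <eos> (id 0)
    for _ in range(max_len):
        hidden = rnn_step(prev, hidden)   # feed the previous output back in
        scores = [sum(W_out[k][j] * hidden[j] for j in range(H)) for k in range(V)]
        prev = scores.index(max(scores))  # greedy choice of the next token
        if prev == 0:                     # stop when <eos> is produced
            break
        output.append(prev)
    return output

reply_ids = respond([VOCAB.index(w) for w in ["how", "are", "you"]])
print([VOCAB[i] for i in reply_ids])
```

In a trained model the three weight matrices would be learned by backpropagation against the true replies instead of being drawn at random.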

In language tasks the objective is to decrease the perplexity. The same holds here: the model is trained to minimize the cross entropy of the correct sequence given its context (equivalently, to maximize its log-likelihood).
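The link between the two quantities is direct: perplexity is the exponential of the average negative log-probability that the model assigns to the correct tokens, so lowering one lowers the other. A small sketch follows; the probability values are made up for illustration.

```python
import math

def cross_entropy(token_probs):
    """Average negative log-probability the model gave to the correct tokens."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

def perplexity(token_probs):
    """Perplexity is the exponential of the per-token cross entropy."""
    return math.exp(cross_entropy(token_probs))

# Hypothetical probabilities assigned to each correct reply token.
confident = [0.9, 0.8, 0.95]
uncertain = [0.25, 0.25, 0.25]
print(perplexity(confident), perplexity(uncertain))
```

A model that gives every correct token a probability of 0.25 has a perplexity of 4, which is as if it were choosing uniformly among four candidates at each step.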

Experiments

The experiments in the paper were carried out on two datasets:

A closed-domain IT helpdesk troubleshooting dataset
An open-domain movie transcript dataset

We also ran some experiments locally, and these show that the model can remember facts, understand context and perform common-sense reasoning, all in an end-to-end fashion. In agreement with the paper, however, the model has the following disadvantages and advantages.

Disadvantages

Answers are simple and can be unsatisfying
The system has no personality

Advantages

The model can generalize to new questions
It is not rule based
It does not look up answers in an existing database
It is able to extract knowledge from a noisy open-domain dataset

It is worth remembering that conversation modelling is AI-hard, and as such there is no well-defined metric for measuring the quality of a conversational model. Most systems are evaluated manually, typically using Amazon Mechanical Turk (AMT).


Published

24 May 2016