Note that this text is not a full tutorial, nor is it meant to be. Its purpose is merely to give guidance on how to use the deep RL policies that are part of the new PyDial release.
For a quick start with the deep RL algorithms, we suggest looking at the benchmarking paper:
https://arxiv.org/abs/1711.11023
and following the guide at:
http://www.camdial.org/pydial/benchmarks/
so that you can reproduce the results.
You will very likely have to change the model when you apply it to a new domain. Neural network models are highly sensitive to changes in architecture and training hyperparameters. For that reason, we also added a script that helps with finding a suitable parameter set. The script is at:
scripts/gridengine/paramsearch/runScript.py
runScript.py is a wrapper over the repo. It lets you quickly run all policy models with different parameter set-ups. Inside the script you can specify the ranges of the tested parameters, the number of training and testing dialogues, the exploration schedule, and many other model-specific hyperparameters.
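To make the idea of a parameter range concrete, here is a minimal sketch of how a grid of set-ups can be enumerated. It is not taken from runScript.py: the parameter names, the values, and the expand_grid helper are all hypothetical and only illustrate the general technique.

```python
import itertools

# Hypothetical search space: names and values are illustrative only
# and are NOT copied from runScript.py.
search_space = {
    "learning_rate": [0.001, 0.0001],
    "epsilon_start": [0.3, 0.5],
    "importance_sampling": [False, True],
}


def expand_grid(space):
    """Return one dict per combination of the listed parameter values."""
    names = sorted(space)  # fixed order so the runs are reproducible
    combos = itertools.product(*(space[name] for name in names))
    return [dict(zip(names, values)) for values in combos]


configs = expand_grid(search_space)
for run_id, params in enumerate(configs):
    # In a real search each set-up would be written to a config and trained.
    print(run_id, params)
```

The number of runs grows multiplicatively with every parameter added, so it usually pays to keep each range short.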
The most influential parameters that are shared across all architectures are discussed below:
An importance sampling mechanism for A2C and ENAC is implemented, but it is highly unstable. To turn it on, set the importance_sampling parameter to True.
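One common reason importance sampling is unstable in general is that the correction weight is a product of per-step probability ratios, which can grow very quickly along a dialogue. The sketch below is a generic illustration of that effect, not PyDial's implementation: the importance_weights function, the clip argument, and the probabilities are all assumptions made for the example.

```python
def importance_weights(target_probs, behaviour_probs, clip=None):
    """Cumulative importance weights for one trajectory.

    Hypothetical helper, not part of PyDial. If clip is given, the
    running product is capped at that value after every step.
    """
    weights = []
    running = 1.0
    for p_target, p_behaviour in zip(target_probs, behaviour_probs):
        running *= p_target / p_behaviour
        if clip is not None:
            running = min(running, clip)
        weights.append(running)
    return weights


# Made-up action probabilities under the current (target) policy and
# under the older (behaviour) policy that generated the data.
target = [0.9, 0.9, 0.9, 0.9]
behaviour = [0.3, 0.3, 0.3, 0.3]

raw = importance_weights(target, behaviour)
clipped = importance_weights(target, behaviour, clip=5.0)
print(raw)      # grows by a factor of about 3 at every step
print(clipped)  # never exceeds the cap of 5.0
```

Capping the weights reduces variance at the cost of some bias, which is one standard way to trade stability against correctness of the off-policy estimate.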
To run training just execute:
python runScript.py
and then to test:
python runScript.py --test
The results can then be quickly parsed with:
python parseResults_all.py gRun tra_ no_of_runs .log no_of_models