How to add your own module

One of the main aims of PyDial is to provide a common statistical dialogue system framework in which people can integrate and evaluate their own models. With this in mind, CUED-PyDial has been designed to offer clearly defined interfaces for the main modules. To understand these interfaces, we first have to look at PyDial's architecture.

The modular structure of PyDial

PyDial is based on a modular architecture as presented in the following figure. Multi-domain capability is achieved using a topic tracker which identifies the topic of the input. Based on that, domain-specific instances of all downstream modules are used.

PyDial Architecture

Creating new modules

To make it easy to integrate your own modules, PyDial provides simple interface classes for the following modules:

  • Topic Tracking: topictracking.RuleTopicTrackers.TopicTrackerInterface
  • Semantic Belief Tracking (from words to belief state): semanticbelieftracking.SemanticBeliefTrackingManager.SemanticBeliefTracker
  • Semantic Decoding (from words to semantics): semi.SemI.SemI
  • Belief Tracking (from semantics to belief state): belieftracking.BeliefTracker.BeliefTracker
  • Policy: policy.Policy.Policy
  • Language Generation: semo.SemOManager.SemO
  • Evaluation: EvaluationManager.Evaluator

Each of these modules is modelled in a similar way. As PyDial's main objective is to provide a multi-domain dialogue platform, a concept called manager is introduced. It handles the domain instances of each module using a dictionary-like structure.

The domain instances are organised using an abstract class that contains all general behaviour and defines the interface to the component, as shown in the following figure. This makes it straightforward to implement your own modules that are not already contained in PyDial.

PyDial Manager Example
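The manager idea can be illustrated with a minimal sketch. It is not PyDial's actual manager code: the names DomainModuleManager and EchoModule are hypothetical, and the sketch only shows how a dictionary can hold one lazily created module instance per domain.

```python
class DomainModuleManager(object):
    """Hypothetical manager keeping one module instance per domain."""

    def __init__(self, factory):
        # factory: callable that builds a module instance for a domain name
        self._factory = factory
        self._instances = {}

    def get(self, domain):
        # Create the domain instance lazily on first access, then reuse it.
        if domain not in self._instances:
            self._instances[domain] = self._factory(domain)
        return self._instances[domain]


class EchoModule(object):
    """Stand-in for a domain-specific module (hypothetical)."""

    def __init__(self, domain):
        self.domain = domain


manager = DomainModuleManager(EchoModule)
restaurants = manager.get("CamRestaurants")
```

Asking the manager for the same domain again returns the same instance, so any per-domain state is kept between turns.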

To implement your own module, have a look at the interface class and identify the methods you are supposed to override in a sub-class. To use your module, you have to specify the class in the config file. An example of modules which are loaded dynamically can be found in config/dynamically_load_modules.cfg.

Example: Creating a new parrot policy

In this example, a new policy class is created which simply takes the user action and returns it as a system action (just like a parrot always mirroring the user). To do this, the first step is to create a new class which inherits from policy.Policy.Policy:

from policy import Policy

class Parrot(Policy.Policy):
    pass

The most important method to override from Policy is nextAction(beliefstate), which takes the belief state as input and produces the new system action.

from policy import Policy

class Parrot(Policy.Policy):
    def nextAction(self, beliefstate):
        pass

To implement the parrot behaviour, we have to extract the last user input from the belief state, pick the first hypothesis and return it. At the beginning of the dialogue, when there is no user input yet, we default to the hello() system act. The final class definition is as follows:

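from policy import Policy

class Parrot(Policy.Policy):
    def nextAction(self, beliefstate):
        # 'userActs' holds (hypothesis, probability) tuples for the last user turn.
        userActs = beliefstate['userActs']
        if userActs:
            # Mirror the first hypothesis back as the system act.
            systemAct = userActs[0][0]
        else:
            # No user input yet (None or an empty list): open with a greeting.
            systemAct = "hello()"

        return systemAct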

To allow PyDial to use the new class, the module must reside in a package which is accessible via the Python path. For this example, the file (let's use Parrot.py) can simply be stored in PyDial's root folder. To use it, the config file must be altered accordingly. The policy is defined within the [policy_domain] section with the entry policytype. If we assume a dialogue in the CamRestaurants domain, the respective entry in the config file may look like this:

[policy_CamRestaurants]
belieftype = belieftracking.baseline.BaselineTracker
policytype = Parrot.Parrot

You can now test your policy using the chat. Note that even though the dialogue acts are mirrored, the actual language response is different due to the way the language generator works.

Object format definitions

CUED-PyDial uses the following structures to pass information among the modules:

User Acts

User acts are represented as strings consisting of an intent and a list of slots or slot-value pairs. Here is a list of user actions the system currently uses:

  • request(slot)
  • inform(slot=value)
  • confirm(slot=value)
  • confreq(slot=value,slot)
  • affirm()
  • hello()
  • negate()
  • repeat()
  • requalts()
  • bye()
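Since the acts listed above are plain strings, a small parser can be handy when writing your own module. The following is a minimal sketch, not PyDial's own act parser: parse_act is a hypothetical helper, the slot names in the usage note are illustrative, and the naive comma split assumes that values contain no commas or parentheses.

```python
import re

# Hypothetical pattern: an intent name followed by a parenthesised argument list.
_ACT_PATTERN = re.compile(r"^\s*(\w+)\((.*)\)\s*$")


def parse_act(act):
    """Split an act string into (intent, [(slot, value_or_None), ...])."""
    match = _ACT_PATTERN.match(act)
    if match is None:
        raise ValueError("not a dialogue act: %r" % act)
    intent, body = match.groups()
    items = []
    for part in body.split(","):
        part = part.strip()
        if not part:
            continue  # empty argument list, as in hello()
        slot, sep, value = part.partition("=")
        # A bare slot (no '=') is stored with value None.
        items.append((slot.strip(), value.strip() if sep else None))
    return intent, items
```

For example, parse_act("confreq(area=north,pricerange)") yields the intent confreq with one slot-value pair and one bare slot.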

Belief State

The belief state is encapsulated in a DialogueState object. Inside, the belief of each domain is modelled using a dictionary with the following structure:

{'beliefs': {u'informable_slot1': {'**NONE**': 1.0,
                       u'value1': 0.0,
                       u'value2': 0.0,
                       ...  },
             u'informable_slot2': {'**NONE**': 1.0,
                       u'value1': 0.0,
                       u'value2': 0.0,
                       ...  },
             ...
             'discourseAct': {u'ack': 0.0,
                              'bye': 0.0,
                              u'hello': 0.0,
                              u'none': 1.0,
                              u'repeat': 0.0,
                              u'silence': 0.0,
                              u'thankyou': 0.0},
             'method': {u'byalternatives': 0.0,
                        u'byconstraints': 0.5,
                        u'byname': 0.0,
                        u'finished': 0.0,
                        u'none': 0.5,
                        u'restart': 0.0},
             'requested': {u'informable_slot1': 0.0,
                           u'informable_slot2': 0.0,
                           ...
                           u'requestable_slot1': 0.0,
                           u'requestable_slot2': 0.0,
                           ...
                           u'name': 0.0}},
 'userActs': [('hypo1', 0.8),('hypo2',0.1),...]}

The actual names and values of the informable_slot, discourseAct, method and requested fields are extracted directly from the ontology file. While informable_slot, discourseAct and method each represent a probability distribution, i.e., the numerical values of each field sum to 1, this is different for the requested field. There, each numerical value may independently lie in [0,1].
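When writing your own belief tracker, it can help to check these constraints on the dictionary it produces. The following is a minimal sketch under the structure shown above: check_belief is a hypothetical helper that is not part of PyDial, and the slot and value names in the example are illustrative.

```python
def check_belief(belief, tolerance=1e-6):
    """Return the names of fields in belief['beliefs'] that violate the constraints."""
    problems = []
    for field, dist in belief["beliefs"].items():
        values = list(dist.values())
        if field == "requested":
            # Each requested slot carries an independent value in [0, 1].
            if any(v < 0.0 or v > 1.0 for v in values):
                problems.append(field)
        elif abs(sum(values) - 1.0) > tolerance:
            # Informable slots, discourseAct and method must sum to 1.
            problems.append(field)
    return problems


example = {
    "beliefs": {
        "food": {"**NONE**": 0.2, "chinese": 0.8},  # illustrative slot and value
        "method": {"byconstraints": 0.5, "none": 0.5},
        "requested": {"food": 0.0, "name": 0.9},
    },
    "userActs": [("inform(food=chinese)", 0.8)],
}
```

An empty result from check_belief(example) means that all fields satisfy the constraints.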

System Acts

System acts are represented as strings consisting of an intent and a list of slots or slot-value pairs. Here is a list of system actions the system currently uses:

  • request(slot)
  • inform(slot=value, ...)
  • confirm(slot=value)
  • affirm()
  • hello()
  • negate()
  • repeat()
  • requalts()
  • bye()