Evaluation

EvaluationManager.py - module for determining the reward

Copyright CUED Dialogue Systems Group 2015 - 2017

See also

CUED Imports/Dependencies:

import ontology.OntologyUtils
import utils.Settings
import utils.ContextLogger


class evaluation.EvaluationManager.EvaluationManager

The evaluation manager manages the evaluators for all domains. It supports two types of reward: a turn-level reward and a dialogue-level reward. The former is accessed using turnReward() and the latter using finalReward(). You can use either one or both methods to compute the reward.

An example where both are used is the traditional reward computation, in which each turn is penalised with a small negative reward (realised with turnReward()) and, at the end, the dialogue as a whole receives a large positive reward (realised with finalReward()).
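The traditional scheme described above can be sketched as follows. This is a minimal, self-contained illustration and not the PyDial implementation: the class ToyEvaluator, the values -1 and 20, and the "success" key in finalInfo are all assumptions chosen for the example.

```python
# Stand-in for illustration only; names and values are hypothetical.
class ToyEvaluator:
    """Penalises every turn and rewards a successful dialogue at the end."""

    def __init__(self, turn_penalty=-1, success_reward=20):
        self.turn_penalty = turn_penalty
        self.success_reward = success_reward
        self.total_reward = 0

    def turnReward(self, turnInfo):
        # Small negative reward for every turn.
        self.total_reward += self.turn_penalty
        return self.turn_penalty

    def finalReward(self, finalInfo):
        # Large positive reward only if the dialogue succeeded
        # ("success" is an assumed key, not a documented one).
        reward = self.success_reward if finalInfo.get("success") else 0
        self.total_reward += reward
        return reward


evaluator = ToyEvaluator()
for _ in range(5):
    evaluator.turnReward({})
evaluator.finalReward({"success": True})
print(evaluator.total_reward)
```

With five turns and a successful outcome, the accumulated reward is the success reward minus five turn penalties, so shorter successful dialogues score higher.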

_bootup_domain(dstring)

Ensures that the respective domain’s evaluator is booted up correctly and resets it.

Parameters

dstring (str) – the domain whose evaluator should be booted.

Returns

None

_load_domains_evaluator(domainString=None)

Loads and instantiates the respective evaluator as configured in the config file. The new object is added to the internal dictionary.

If no evaluator is configured, the default is ‘objective’.

Parameters

domainString (str) – the domain the evaluator will work on. Default is None.

Returns

None

finalReward(domainString, finalInfo)

Computes the final reward for the given domain using finalInfo by delegating to the domain evaluator.

Parameters
  • domainString (str) – the domain string unique identifier.

  • finalInfo (dict) – parameters necessary for computing the final reward, e.g., task description or subjective feedback.

Returns

int – the final reward for the given domain.

finalRewards(finalInfo=None)

Calls finalReward() for every domain where the final reward has not been computed yet.

Parameters

finalInfo (dict) – parameters necessary for computing the final rewards, e.g., task description or subjective feedback. Default is None.

Returns

dict – mapping of domain to final rewards
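The relationship between finalReward() and finalRewards() can be sketched as below. This is a hedged illustration of the documented behaviour (delegate to the domain evaluator, skip domains already computed), not the actual source: ManagerSketch, FixedRewardEvaluator, the final_rewards cache, and the domain names are hypothetical.

```python
# Illustrative stand-ins; none of these names come from the module itself.
class FixedRewardEvaluator:
    """Toy domain evaluator that always returns the same final reward."""

    def __init__(self, reward):
        self.reward = reward
        self.calls = 0

    def finalReward(self, finalInfo):
        self.calls += 1
        return self.reward


class ManagerSketch:
    def __init__(self, evaluators):
        self.evaluators = dict(evaluators)  # domain -> evaluator
        self.final_rewards = {}             # domain -> reward (assumed cache)

    def finalReward(self, domainString, finalInfo):
        # Delegate to the evaluator of the given domain.
        reward = self.evaluators[domainString].finalReward(finalInfo)
        self.final_rewards[domainString] = reward
        return reward

    def finalRewards(self, finalInfo=None):
        # Only compute the reward for domains that do not have one yet.
        for domain in self.evaluators:
            if domain not in self.final_rewards:
                self.finalReward(domain, finalInfo)
        return dict(self.final_rewards)


manager = ManagerSketch({
    "DomainA": FixedRewardEvaluator(20),
    "DomainB": FixedRewardEvaluator(0),
})
manager.finalReward("DomainA", {"note": "evaluated early"})
rewards = manager.finalRewards()
print(rewards)
```

In this sketch the evaluator for DomainA is called only once, because its reward was already recorded before finalRewards() ran.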

print_dialog_summary()

Prints the history of the just-completed dialog.

print_summary()

Prints the history over all dialogs run through simulate.

restart()

Restarts all domain evaluators.

turnReward(domainString, turnInfo)

Computes the turn reward for the given domain using turnInfo by delegating to the domain evaluator.

Parameters
  • domainString (str) – the domain string unique identifier.

  • turnInfo (dict) – parameters necessary for computing the turn reward, e.g., system act or model of the simulated user.

Returns

int – the turn reward for the given domain.

class evaluation.EvaluationManager.Evaluator(domainString)

Interface class for a single domain evaluation module. Responsible for recording and calculating the turns, dialogue outcome, and reward of a single dialog. To create your own reward model, derive from this class and, depending on your requirements, override the methods _getTurnReward() and _getFinalReward().
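A custom reward model following this pattern might look like the sketch below. To keep the example runnable without PyDial, it uses a stand-in base class, EvaluatorSketch, that mimics the documented behaviour (turnReward() and finalReward() call the underscore-prefixed hooks and update the totals). The attribute names total_reward, num_turns and outcome, the "task_completed" key, and the reward values are assumptions, not taken from the real class.

```python
class EvaluatorSketch:
    """Stand-in for the documented Evaluator interface (hypothetical attributes)."""

    def __init__(self, domainString):
        self.domainString = domainString
        self.restart()

    def restart(self):
        # Reset the per-dialogue bookkeeping.
        self.total_reward = 0
        self.num_turns = 0
        self.outcome = False

    def turnReward(self, turnInfo):
        reward = self._getTurnReward(turnInfo)
        self.total_reward += reward
        self.num_turns += 1
        return reward

    def finalReward(self, finalInfo):
        reward = self._getFinalReward(finalInfo)
        self.total_reward += reward
        return reward

    def _getTurnReward(self, turnInfo):
        return 0  # default, meant to be overridden

    def _getFinalReward(self, finalInfo):
        return 0  # default, meant to be overridden


class LengthPenaltyEvaluator(EvaluatorSketch):
    """Custom reward model: -1 per turn, +20 if the task was completed."""

    def _getTurnReward(self, turnInfo):
        return -1

    def _getFinalReward(self, finalInfo):
        # "task_completed" is an assumed key used only in this example.
        self.outcome = bool(finalInfo.get("task_completed"))
        return 20 if self.outcome else 0


custom = LengthPenaltyEvaluator("ExampleDomain")
custom.turnReward({})
custom.turnReward({})
custom.finalReward({"task_completed": True})
print(custom.total_reward, custom.num_turns, custom.outcome)
```

Only the two hooks are overridden; the public methods stay in the base class so that the bookkeeping is shared by every reward model.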

_getFinalReward(finalInfo)

Computes the final reward using finalInfo and sets the dialogue outcome.

Should be overridden by sub-class if values other than 0 should be returned.

Parameters

finalInfo (dict) – parameters necessary for computing the final reward, e.g., task description or subjective feedback.

Returns

int – the final reward, default 0.

_getTurnReward(turnInfo)

Computes the turn reward using turnInfo.

Should be overridden by sub-class if values other than 0 should be returned.

Parameters

turnInfo (dict) – parameters necessary for computing the turn reward, e.g., system act or model of the simulated user.

Returns

int – the turn reward, default 0.

doTraining()

Defines whether the currently evaluated dialogue should be used for training.

Should be overridden by sub-class if values other than True should be returned.

Returns

bool – whether the dialogue should be used for training

finalReward(finalInfo)

Computes the final reward using finalInfo by calling _getFinalReward(). Updates the total reward and the dialogue outcome.

Parameters

finalInfo (dict) – parameters necessary for computing the final reward, e.g., task description or subjective feedback.

Returns

int – the final reward.

print_dialog_summary()

Prints a summary of the current dialogue. Assumes dialogue outcome represents success. For other types, override methods in sub-class.

print_summary()

Prints the summary of a run, i.e. multiple dialogs. Assumes dialogue outcome represents success. For other types, override methods in sub-class.

restart()

Resets the domain evaluator’s internal variables. Takes no parameters.

Returns

None

turnReward(turnInfo)

Computes the turn reward using turnInfo by calling _getTurnReward(). Updates the total reward and the number of turns.

Parameters

turnInfo (dict) – parameters necessary for computing the turn reward, e.g., system act or model of the simulated user.

Returns

int – the turn reward.

class evaluation.SuccessEvaluator.ObjectiveSuccessEvaluator(domainString)

This class provides a reward model based on objective success. For simulated dialogues, the goal of the user simulator is compared with the information the system has provided. For dialogues with a task file, the task is compared to the information the system has provided.

class evaluation.SuccessEvaluator.SubjectiveSuccessEvaluator(domainString)

This class implements a reward model based on subjective success, which is only possible during voice interaction through the DialogueServer. The subjective feedback is collected and passed on to this class.