ClarifyDelphi

Reinforced Clarification Questions with Defeasibility Rewards for Social and Moral Situations

Valentina Pyatkin
AI2 Blog

--

Asking questions: Why is it important?

Context is everything, especially in commonsense moral reasoning. Offering someone a cup of coffee is generally considered appropriate. If offered to a work colleague, it may even be viewed as a courteous gesture. However, offering coffee to a toddler would be deemed morally irresponsible.

Delphi (Jiang et al. 2021), a recently proposed commonsense moral reasoning model, generates moral judgments for simple actions described in text. However, Delphi’s judgments are made in isolation, without any knowledge of surrounding context. How can moral reasoners elicit missing salient context? A natural way to do so is by asking clarification questions.

In our ACL 2023 paper, we work on generating consequential questions whose answers could provide salient context for moral reasoning. We define what makes a question consequential in terms of defeasibility: a way of reasoning that takes (new) evidence into consideration, evidence which can either support (i.e. strengthen) or cancel/weaken an initial inference. Prior research in cognitive science shows that human reasoning exhibits the flexibility not only to articulate where a certain moral rule should hold, but also to imagine valid exceptions where the rule can be bent or defeated based on the demands of the context (Kwon et al., 2022; Levine et al., 2020; Awad et al., 2022).

Learning to ask the right questions

We use Reinforcement Learning (PPO) to train our question generation system. Broadly, this works in the following steps (as illustrated in the image above):

Let’s assume the input situation is offering a cup of coffee.

1. We first generate a question, for example “Who did you offer it to?”.
2. Given that question and the situation, we train a model that is able to simulate hypothetical answers. We specifically train the model to predict both a strengthening and a weakening answer; in this example, these could be “to a work colleague” and “to a toddler”, respectively.
3. From these answers we create two updated situations: offering a cup of coffee to a work colleague and offering a cup of coffee to a toddler.
4. For each updated situation we obtain a distribution over possible moral judgments from Delphi, which allows us to calculate the Jensen–Shannon divergence between the two distributions.
5. This divergence functions as the reward, with the intuition that a question whose answers lead to diverging judgments is a consequential one. Had the question and answers been “When did you offer it?”, “in the morning”, and “in the afternoon”, the divergence of the judgments would have been smaller.
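To make the reward concrete, here is a minimal sketch of how such a defeasibility reward could be computed, assuming Delphi returns a probability distribution over judgment classes (e.g. good / okay / bad) for each updated situation. The helper name defeasibility_reward and the toy probabilities are illustrative, not taken from the paper’s implementation.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

def defeasibility_reward(p_strengthened, p_weakened):
    """Reward for a clarification question: the Jensen-Shannon divergence
    between Delphi's judgment distributions for the two updated situations
    (one fused with the strengthening answer, one with the weakening answer)."""
    p = np.asarray(p_strengthened, dtype=float)
    q = np.asarray(p_weakened, dtype=float)
    # scipy returns the JS *distance* (the square root of the divergence),
    # so square it to recover the divergence.
    return jensenshannon(p, q, base=2) ** 2

# Toy distributions over (good, okay, bad) -- made-up numbers for illustration:
# "offering a cup of coffee to a work colleague" vs. "... to a toddler"
reward_who = defeasibility_reward([0.7, 0.25, 0.05], [0.05, 0.15, 0.8])
# "offering a cup of coffee in the morning" vs. "... in the afternoon"
reward_when = defeasibility_reward([0.6, 0.3, 0.1], [0.55, 0.33, 0.12])

print(reward_who > reward_when)  # True: "Who did you offer it to?" is more consequential
```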

In order to train the subcomponents of our modeling approach, we had to crowdsource one dataset and create another synthetically. We collected a dataset of human-written clarification questions for social and moral situations, resulting in more than 30k questions. Additionally, for the answer generation model we created a synthetic defeasible QA dataset by making use of an existing defeasible inference dataset (𝛿-SocialChem, Rudinger et al. 2020) and enriching it with questions obtained from GPT-3.
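To illustrate the shape of this synthetic data, here is a minimal sketch of what one record could look like; the class and field names are our own illustrative choices rather than the released schema.

```python
from dataclasses import dataclass

@dataclass
class DefeasibleQAExample:
    """One illustrative record of the synthetic defeasible QA data:
    a situation from the defeasible inference data is enriched with a
    GPT-3 generated question, and the strengthener/weakener updates
    serve as the two hypothetical answers the answer model learns to produce."""
    situation: str             # e.g. "offering a cup of coffee"
    question: str              # e.g. "Who did you offer it to?"
    strengthening_answer: str  # supports the default judgment, e.g. "to a work colleague"
    weakening_answer: str      # defeats the default judgment, e.g. "to a toddler"

example = DefeasibleQAExample(
    situation="offering a cup of coffee",
    question="Who did you offer it to?",
    strengthening_answer="to a work colleague",
    weakening_answer="to a toddler",
)
```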

We evaluate the approach with human evaluation: compared to four baselines, ClarifyDelphi produces more informative and more relevant questions. Most importantly, ClarifyDelphi generates more questions that could lead to either weakening or strengthening answers.

Use cases, limitations, and outlook

We believe that our system can be useful in an interactive setting.

The figure above illustrates examples of such an interaction between a user, Delphi (as the moral reasoning system), and ClarifyDelphi.

After each turn, the situation is updated with the user-provided context, and Delphi produces a new decision for the updated situation. We limit the interaction to three turns, based on the observation that after the third turn the sentence fusion starts to deteriorate, resulting in less relevant and more repetitive questions. Additionally, we find that the first two questions can generally capture the missing context that is most central to making moral decisions.
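For readers who want to prototype such an interaction, below is a minimal sketch of the loop described above, assuming callable stand-ins for the actual models; clarify_delphi, delphi, fuse, and get_user_answer are hypothetical placeholders rather than a released API.

```python
MAX_TURNS = 3  # beyond this, sentence fusion degrades and questions become repetitive

def interactive_session(situation, clarify_delphi, delphi, fuse, get_user_answer):
    """Sketch of the interactive loop: at each turn ClarifyDelphi asks a
    question, the user answers, the answer is fused into the situation,
    and Delphi re-judges the updated situation."""
    judgment = delphi(situation)
    for _ in range(MAX_TURNS):
        question = clarify_delphi(situation)            # e.g. "Who did you offer it to?"
        answer = get_user_answer(question)              # e.g. "to a toddler"
        situation = fuse(situation, question, answer)   # e.g. "offering a cup of coffee to a toddler"
        judgment = delphi(situation)                    # updated moral judgment
    return situation, judgment

# Toy usage with stand-in functions (canned answers, no real models):
final_situation, final_judgment = interactive_session(
    "offering a cup of coffee",
    clarify_delphi=lambda s: "Who did you offer it to?",
    delphi=lambda s: "it's okay",
    fuse=lambda s, q, a: f"{s} {a}",
    get_user_answer=lambda q: "to a toddler",
)
```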

This project also comes with certain limitations. First, we rely on a trained model, Delphi, whose output probabilities are not perfectly calibrated; this could lead to some error propagation. Second, both Delphi and ClarifyDelphi are Western-centric and do not explicitly take value pluralism into account. A study on moral defeasibility across different cultures could improve our approach.

There are many other follow-up research questions that we believe are interesting to tackle. Our evaluation revealed that it is easier to generate strengthening answers than weakening ones, and it would be interesting to investigate why. We currently do not take the multi-turn interaction history into account when generating questions; it might be interesting to develop an RL system that both conditions on this history and incorporates it into its reward. Additionally, we currently hard-code the number of turns that can happen between a user and ClarifyDelphi. It would be more elegant to automatically predict when enough contextual information has been obtained and the system no longer needs to ask an additional question. We also envision that our questions could be used not only in an interactive setting, but also to retrieve relevant information from other types of knowledge sources.

In general, we believe that it is risky for AI to make judgments or predictions based on limited information and context, especially with the increasing popularity of chatbots. We therefore argue that it is in the community's general interest to build models that are able to inquire about contextual information and to incorporate such interaction into their decision-making process.

PDF: https://api.semanticscholar.org/CorpusID:258762844

Data + Code: https://github.com/allenai/clarifydelphi

References

Awad, Edmond, Sydney Levine, Andrea Loreggia, Nicholas Mattei, Iyad Rahwan, Francesca Rossi, Kartik Talamadupula, Joshua Tenenbaum, and Max Kleiman-Weiner. “When is it acceptable to break the rules? knowledge representation of moral judgment based on empirical data.” arXiv preprint arXiv:2201.07763 (2022).

Jiang, Liwei, Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Maxwell Forbes, Jon Borchardt, Jenny Liang, Oren Etzioni, Maarten Sap, and Yejin Choi. “Delphi: Towards machine ethics and norms.” arXiv preprint arXiv:2110.07574 (2021).

Kwon, Joseph, Josh Tenenbaum, and Sydney Levine. “Flexibility in Moral Cognition: When is it okay to break the rules?” Proceedings of the Annual Meeting of the Cognitive Science Society, Vol. 44, No. 44, 2022.

Levine, Sydney, Max Kleiman-Weiner, Laura Schulz, Joshua Tenenbaum, and Fiery Cushman. “The logic of universalization guides moral judgment.” Proceedings of the National Academy of Sciences 117, no. 42 (2020): 26158–26169.

Rudinger, Rachel, Vered Shwartz, Jena D. Hwang, Chandra Bhagavatula, Maxwell Forbes, Ronan Le Bras, Noah A. Smith, and Yejin Choi. “Thinking like a skeptic: Defeasible inference in natural language.” In Findings of the Association for Computational Linguistics: EMNLP 2020, pp. 4661–4675. 2020.

Check out our current openings, follow @allen_ai on Twitter, and subscribe to the AI2 Newsletter to stay current on news and research coming out of AI2.
