Response-Based and Counterfactual Learning for Sequence-to-Sequence Tasks in NLP: An Overview
This post summarizes my PhD thesis. I explored how to learn from feedback given to model outputs when the collection of direct supervision signals...