The Best Explanation: Beyond Right and Wrong in Question Answering

Anders Trærup Johannsen

    Publication: Book/Anthology/Thesis/Report › PhD thesis › Research


    Abstract

    There are right and wrong answers, but there are also ways of answering questions that are helpful and some that are not, even when they convey the same information. In this thesis we present a data-driven approach to automatically recognizing the good answers. A good answer lays out its information so that it is easy to read and understand. And answer structure matters: imagine assembling a piece of IKEA furniture if the order of the instructions is scrambled. In general, text is structured by discourse relations, and discourse markers (DMs, e.g. however, moreover, then) are the most apparent and reliable signs of this structure. In the thesis we use DMs to model aspects of answer structure. Unfortunately, standard discourse processing software makes two unrealistic assumptions about DMs, which make it hard to apply such software to community-generated data. The assumptions are a) that gold-standard annotations are available for feature generation; and b) that DMs form a closed class for which we have labeled examples of all members. We challenge both assumptions, showing that a) in the absence of gold annotations, state-of-the-art performance can be obtained with much simpler features; and b) sharing features between DMs based on similarity via word embeddings gives an error reduction of at least 20% on unknown DMs, compared to no sharing or sharing by part-of-speech.
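    To make the embedding-based sharing concrete, here is a minimal sketch of the back-off idea: an unknown DM borrows the features learned for the known DM whose word embedding is most similar. The embedding values, the DM inventory, and the helper names are illustrative assumptions, not the thesis's actual setup.

        import numpy as np

        # Hypothetical pre-trained embeddings for a few DMs (toy values).
        embeddings = {
            "however": np.array([0.9, 0.1, 0.0]),
            "moreover": np.array([0.1, 0.9, 0.0]),
            "then": np.array([0.0, 0.2, 0.9]),
        }
        known_dms = {"however", "moreover"}  # DMs with labeled training data

        def cosine(u, v):
            return float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

        def backoff_dm(dm):
            """Map an unknown DM to the most embedding-similar known DM,
            so the features learned for that DM can be reused."""
            if dm in known_dms:
                return dm
            return max(known_dms,
                       key=lambda known: cosine(embeddings[dm], embeddings[known]))

        print(backoff_dm("then"))  # -> 'moreover' (closest in this toy space)

    The alternative baselines mentioned above would replace the embedding similarity with no back-off at all, or with a back-off keyed on part-of-speech.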
    Structure-building expressions are often more complex than the simple DMs discussed above and can even be discontinuous (e.g. not only X but also Y). However, discovering such patterns automatically is a very hard search problem. As an alternative, we generate representations based on regular expressions, using data elicited from workers on Mechanical Turk. Using these complex expressions for answer ranking gives an error reduction of 24% compared to a bag-of-words model.
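    As an illustration of how such discontinuous expressions can be operationalized, the sketch below turns a few hand-picked regular expressions into binary answer features; the patterns and the feature function are hypothetical stand-ins, not the Turker-elicited expressions used in the thesis.

        import re

        # Hypothetical discontinuous patterns; .*? bridges the gap between parts.
        PATTERNS = [
            re.compile(r"\bnot only\b.*?\bbut also\b", re.IGNORECASE | re.DOTALL),
            re.compile(r"\bon the one hand\b.*?\bon the other hand\b",
                       re.IGNORECASE | re.DOTALL),
            re.compile(r"\bfirst\b.*?\bthen\b", re.IGNORECASE | re.DOTALL),
        ]

        def regex_features(answer):
            """One binary indicator per pattern; in a ranker these would
            complement or replace bag-of-words features."""
            return [1 if p.search(answer) else 0 for p in PATTERNS]

        print(regex_features("Not only is it cheap, but also easy to assemble."))
        # -> [1, 0, 0]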
    We introduce the task of ranking answers across domains, learning from questions and answers collected from community Q&A sites (cQA). In one experiment we show that importance sampling, where training data is sampled according to the similarity between questions, leads to significant improvements over an uninformed sampling strategy.
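    A minimal sketch of this sampling strategy follows, assuming TF-IDF cosine similarity between questions; the similarity measure and the toy data are illustrative assumptions, not necessarily what the thesis uses.

        import numpy as np
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        # Toy cQA training pool and a target question (illustrative data).
        pool = [
            "How do I reset my router?",
            "What is the best pasta recipe?",
            "Why does my wifi keep dropping?",
        ]
        target = "How can I fix my wireless connection?"

        vectorizer = TfidfVectorizer().fit(pool + [target])
        sims = cosine_similarity(vectorizer.transform([target]),
                                 vectorizer.transform(pool))[0]

        # Sample training questions with probability proportional to similarity,
        # so near-domain examples dominate the training set.
        probs = sims / sims.sum()
        rng = np.random.default_rng(0)
        chosen = rng.choice(len(pool), size=2, p=probs)
        print(probs.round(2), [pool[i] for i in chosen])

    The uninformed baseline corresponds to sampling with uniform probabilities instead of probs.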
    Original language: English
    Publisher: Det Humanistiske Fakultet, Københavns Universitet
    Number of pages: 187
    Status: Published - 2013

    Note regarding the thesis

    Defended 13 December 2013
