Diversifying the top-k answers to queries by example


Given a table T and a user query Q, the top-k answers are the k tuples of T that best match Q. Adding a diversity constraint aims to avoid returning redundant tuples, that is, tuples that are too similar to one another. This paper addresses diversification in the Query By Example setting, in particular for approaches that can handle representative examples that may differ widely from one another. It proposes a new, query-dependent definition of diversity, which measures whether the result set reflects the diversity of the representative examples provided by the user and thus covers all components of the query. The paper proposes a numerical measure to assess diversity in this sense, an algorithm that identifies such a diversified top-k set by optimising both query satisfaction and the diversity measure, as well as its integration into a flexible querying approach.
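To make the notion of query-dependent diversity concrete, the following is a minimal sketch and not the measure or algorithm defined in the paper. It assumes numeric tuples, an illustrative similarity 1 / (1 + Euclidean distance), a diversity score equal to the fraction of user examples that are the closest example of at least one selected tuple, and a greedy selection that mixes both criteria with a hypothetical parameter `weight`. All function names are chosen for illustration only.

```python
from math import dist  # Euclidean distance, Python 3.8+


def similarity(t, e):
    """Illustrative similarity in (0, 1]; an assumption, not the paper's definition."""
    return 1.0 / (1.0 + dist(t, e))


def satisfaction(t, examples):
    """A tuple matches the query as well as it matches its closest example."""
    return max(similarity(t, e) for e in examples)


def coverage(selected, examples):
    """Fraction of examples that are the closest example of some selected tuple."""
    covered = {
        max(range(len(examples)), key=lambda i: similarity(t, examples[i]))
        for t in selected
    }
    return len(covered) / len(examples)


def diversified_top_k(table, examples, k, weight=0.5):
    """Greedily pick k tuples, trading query satisfaction against example coverage.

    weight = 0 gives a plain top-k by satisfaction; larger values favour
    result sets that cover more of the user's examples.
    """
    selected = []
    candidates = list(table)
    while candidates and len(selected) < k:
        def score(t):
            return ((1 - weight) * satisfaction(t, examples)
                    + weight * coverage(selected + [t], examples))

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

For instance, with the table `[(1.0, 1.0), (1.1, 1.0), (0.9, 1.0), (5.0, 5.2)]` and the two examples `(1.0, 1.0)` and `(5.0, 5.0)`, a plain top-2 (`weight=0`) returns two tuples near the first example, whereas the default `weight=0.5` returns one tuple per example. A greedy strategy is used here only because it is simple to read; it does not guarantee an optimal set.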