Student teams from degree-granting institutions are invited to compete in the annual SIGMOD Programming Contest. This year, the subject of the contest is to construct a blocking system for Entity Resolution. Teams' submissions will be judged on their performance on a set of supplied datasets.
The winning team will be awarded a prize of $4,000 (USD), and there will be an additional prize of $2,000 (USD) for the runner-up.
Prize money is donated by Microsoft.
This year's contest is brought to you by the Chu Data Lab at Georgia Tech and the DBGroup at the University of Modena and Reggio Emilia. The organizing team is made up of Xu Chu (co-chair), Giovanni Simonini (co-chair), Renzhi Wu, Peng Li, and Luca Zecchini.
For this year's contest, the task is to build a blocking system for Entity Resolution. Entity Resolution (ER) is the problem of identifying and matching different tuples that refer to the same real-world entity in a dataset, such as the tuples "Georgia Tech" and "Georgia Institute of Technology". When performing ER on a table A, considering all pairs of tuples (A x A), whose number grows quadratically with the size of A, can be computationally expensive. Thus, a filtering step, referred to as the blocking step, is applied first to quickly discard obvious non-matches and obtain a much smaller candidate set of tuple pairs, to which a matching step is then applied to find the matches.
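To make the idea of a blocking step concrete, here is a minimal sketch in Python. It uses token blocking (two tuples become a candidate pair if they share at least one word), which is only one simple technique chosen for illustration and not a method prescribed by the contest. The function name `token_blocking` and the sample records are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def token_blocking(records):
    """Return candidate pairs of record ids that share at least one token.

    `records` maps a record id to its text (a hypothetical input format).
    Ids inside each pair are sorted, so (a, b) and (b, a) are counted once.
    """
    # Build one block per token: token -> ids of records containing it.
    blocks = defaultdict(set)
    for rid, text in records.items():
        for token in set(text.lower().split()):
            blocks[token].add(rid)

    # Every pair of records inside the same block is a candidate pair.
    candidates = set()
    for ids in blocks.values():
        for a, b in combinations(sorted(ids), 2):
            candidates.add((a, b))
    return candidates


# Illustrative records only; not taken from the contest datasets.
records = {
    1: "Georgia Tech",
    2: "Georgia Institute of Technology",
    3: "University of Modena and Reggio Emilia",
}
candidate_pairs = token_blocking(records)
```

In this toy input, records 1 and 2 share the token "georgia" and records 2 and 3 share the token "of", so the pair (1, 3) is never compared. The second pair also shows why frequent tokens such as "of" tend to produce uninformative candidates, which a real system would likely need to handle.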
For this task, you are asked to perform blocking on several datasets, each one containing a different type of object (e.g., people, products, movies) and different distributions of data and noise. The challenge is to design a blocking system for each dataset that can filter out non-matches within a limited time and generate a small candidate set with high recall (i.e., without losing too many true matches).
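The trade-off between a small candidate set and high recall can be sketched as follows. This is an illustration under assumptions, not the contest's evaluation procedure, which is described on the Task page. The helper names `blocking_recall` and `reduction_ratio` are hypothetical, and the reduction ratio is a measure commonly used for blocking that is added here only to quantify "small".

```python
def _normalize(pairs):
    """Sort the ids inside each pair so (a, b) and (b, a) compare equal."""
    return {tuple(sorted(pair)) for pair in pairs}


def blocking_recall(candidates, true_matches):
    """Fraction of true matching pairs that survive the blocking step."""
    truth = _normalize(true_matches)
    if not truth:
        return 1.0  # Nothing to lose when there are no true matches.
    return len(_normalize(candidates) & truth) / len(truth)


def reduction_ratio(num_candidates, num_records):
    """Share of all possible pairs that blocking removed from consideration."""
    total_pairs = num_records * (num_records - 1) // 2
    if total_pairs == 0:
        return 0.0
    return 1 - num_candidates / total_pairs


# Illustrative values only: 3 records, 2 candidate pairs, 1 true match.
candidates = {(1, 2), (2, 3)}
true_matches = {(2, 1)}
recall = blocking_recall(candidates, true_matches)
ratio = reduction_ratio(len(candidates), 3)
```

A blocker that returns every pair trivially reaches a recall of 1.0 but a reduction ratio of 0, so the two numbers are only meaningful when read together.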
More details about this year's problem can be found on the Task page.
|18 February 2022|New site up. Contest requirements specification and datasets available.
|30 April 2022|Final submission deadline.
|17 May 2022|
|12-17 June 2022|ACM SIGMOD/PODS 2022 Conference.