Annotation and Mapping Discovery among Data Sources
- Last Updated on Friday, 20 November 2015 18:31
Schema matching is the task of finding the semantic correspondences (mappings) between elements of two schemata
Approach: starting from the “hidden” meanings associated with schema labels (i.e. class and attribute names, also called terms), the MOMIS Data Integration system discovers lexical relationships among schema elements.
- Lexical annotation of schema labels is the explicit assignment of meanings with respect to a reference lexical thesaurus (such as WordNet).
- Manual annotation is tedious and does not scale, hence the need for automatic or semi-automatic annotation.
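To make the idea concrete, the following is a minimal sketch of how lexical relationships could be derived once labels carry explicit meanings. It is not the MOMIS implementation: the sense ids, the schema labels, the toy hypernym table standing in for WordNet, and the relation names `synonym` and `narrower-than` are all assumptions chosen for illustration.

```python
# Toy sense inventory standing in for WordNet: sense id -> set of hypernym sense ids.
# All ids and labels below are invented for illustration.
HYPERNYMS = {
    "person.n.01": set(),
    "employee.n.01": {"person.n.01"},
}

# Lexical annotation: each schema label is explicitly assigned one or more senses.
ANNOTATIONS = {
    "S1.Employee": {"employee.n.01"},
    "S2.Staff_member": {"employee.n.01"},
    "S2.Person": {"person.n.01"},
}


def lexical_relationships(annotations, hypernyms):
    """Return a sorted list of (label_a, relation, label_b) tuples."""
    found = set()
    labels = sorted(annotations)
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            senses_a, senses_b = annotations[a], annotations[b]
            # Two labels sharing a sense are treated as synonyms.
            if senses_a & senses_b:
                found.add((a, "synonym", b))
            # A label whose sense has a direct hypernym among the other
            # label's senses is treated as the narrower term.
            if any(hypernyms.get(s, set()) & senses_b for s in senses_a):
                found.add((a, "narrower-than", b))
            if any(hypernyms.get(s, set()) & senses_a for s in senses_b):
                found.add((b, "narrower-than", a))
    return sorted(found)


RELATIONSHIPS = lexical_relationships(ANNOTATIONS, HYPERNYMS)
for relationship in RELATIONSHIPS:
    print(relationship)
```

The sketch only looks at direct hypernyms; a real thesaurus would require walking the full hierarchy.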
WSD (Word Sense Disambiguation) is the task of identifying the meanings of words in context by computational techniques.
The semi-automatic CWSD (Combined Word Sense Disambiguation) method:
- associates one or more WordNet meanings with each label
- combines two WSD algorithms: SD (Structural Disambiguation), which exploits the relationships derived from the schema, and WND (WordNet Domains Disambiguation), which exploits WordNet Domains
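The combination step can be sketched as follows. This is an illustration under stated assumptions, not the CWSD algorithm itself: the page does not say how the two outputs are merged, so a plain set union is assumed here, and the two disambiguators are hypothetical lookup tables standing in for SD and WND.

```python
from typing import Callable, Dict, Iterable, Set

# A disambiguator maps a schema label to a (possibly empty) set of sense ids.
Disambiguator = Callable[[str], Set[str]]


def table_disambiguator(table: Dict[str, Set[str]]) -> Disambiguator:
    """Hypothetical stand-in for a WSD algorithm: looks senses up in a fixed table."""
    return lambda label: set(table.get(label, set()))


def combined_annotation(
    labels: Iterable[str], algorithms: Iterable[Disambiguator]
) -> Dict[str, Set[str]]:
    """Annotate each label with the union of the senses proposed by every algorithm.

    The union is an assumption made for this sketch.
    """
    algorithms = list(algorithms)
    result: Dict[str, Set[str]] = {}
    for label in labels:
        senses: Set[str] = set()
        for algorithm in algorithms:
            senses |= algorithm(label)
        result[label] = senses
    return result


# Invented sense ids; neither table reflects real SD or WND output.
structural = table_disambiguator({"order": {"order.n.commercial"}})
domain_based = table_disambiguator(
    {"order": {"order.n.commercial"}, "customer": {"customer.n.01"}}
)

ANNOTATED = combined_annotation(["order", "customer", "qty"], [structural, domain_based])
print(ANNOTATED)
```

Labels that end up with an empty set, such as `qty` here, are the ones a designer would still have to annotate by hand, which is where the "semi-automatic" part comes in.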
Schema label normalization is the reduction of each label to a standardized form that can be easily recognized. It consists of abbreviation expansion and CN (Compound Noun) annotation.
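As a rough illustration of the abbreviation expansion step, the sketch below splits a label into tokens and expands the ones found in a small dictionary. It does not reproduce the NORMS tool: the `ABBREVIATIONS` dictionary, the tokenization rules, and the function names are assumptions, and CN annotation is not covered.

```python
import re

# Hypothetical abbreviation dictionary; a real system would rely on a larger resource.
ABBREVIATIONS = {
    "qty": "quantity",
    "cust": "customer",
    "addr": "address",
}


def tokenize(label: str) -> list:
    """Split a label on underscores, hyphens, whitespace and camelCase boundaries."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", label)
    return [token.lower() for token in re.split(r"[\s_\-]+", spaced) if token]


def normalize(label: str) -> str:
    """Expand known abbreviations; unknown tokens are kept unchanged."""
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokenize(label))


print(normalize("custAddr"))       # customer address
print(normalize("ORDER_QTY"))      # order quantity
print(normalize("purchaseOrder"))  # purchase order
```

A result such as `purchase order` is a compound noun with no abbreviation left to expand; assigning it a meaning is the job of the CN annotation step.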
- For a detailed description, please see the PhD thesis of Serena Sorrentino and the PhD thesis of Laura Po.
- Techniques are implemented in NORMS, a tool of the MOMIS-Datariver Data Integrator, developed within the FIT STARTUP project.