| The topics the Bologna Unit
will deal with are part of research Themes 1 and 3, and can
be summarized as follows:
Theme 1:
- “Content summaries” creation for information sources
(Content summaries)
Theme 3:
- Distributed query execution in WISDOM (Execution)
- Usage and navigation, based on ontologies, of the query results
(Navigation)
In particular, the task concerning
“Content summaries” aims to deliver a characterization
(“profile”) of the data sources from the statistical
point of view in order to accurately evaluate their relevance
with respect to a given query and, consequently, to allow a
smart selection of the most relevant data sources. The “Execution”
task deals with aspects related to the distributed execution
of queries on different data sources and their coordination/synchronization,
so as to determine, with a minimal amount of resources,
the most relevant results. Finally, the “Navigation”
task, which takes place after query processing, aims to
define mechanisms for presenting the results in a concise
and flexible form by relying on the multiple abstraction
levels available in a specific domain ontology.
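To make the role of content summaries in source selection more concrete, the following minimal Python sketch ranks data sources by an estimated number of matching objects. It is an illustration only: the `ContentSummary` structure, the function names and the independence assumption between concepts are hypothetical and are not part of the WISDOM design.

```python
from dataclasses import dataclass


@dataclass
class ContentSummary:
    """Hypothetical profile of one data source (names are illustrative)."""
    source: str
    size: int                       # estimated number of objects in the source
    concept_counts: dict[str, int]  # ontology concept -> estimated object count


def estimated_matches(summary: ContentSummary, query_concepts: list[str]) -> float:
    """Estimate how many objects of the source match all query concepts.

    Assumption: concepts occur independently, so selectivities multiply.
    """
    if summary.size == 0:
        return 0.0
    estimate = float(summary.size)
    for concept in query_concepts:
        estimate *= summary.concept_counts.get(concept, 0) / summary.size
    return estimate


def select_sources(summaries: list[ContentSummary],
                   query_concepts: list[str], k: int) -> list[str]:
    """Return the names of at most k sources with a non-zero estimate, best first."""
    scored = [(estimated_matches(s, query_concepts), s.source) for s in summaries]
    scored = [(score, name) for score, name in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]
```

Sources whose summary gives a zero estimate are never contacted, which is the kind of saving that an accurate profile is meant to enable.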
Research on such topics, given the project structure, will be
organized as follows:
PHASE 1
The first phase will be devoted to accurately defining the requirements
for the 3 research topics; we will then analyze and critically review
the related literature in order to identify the limits of the
available solutions with respect to our goals. In detail:
(Content summaries) The analysis phase will survey state-of-the-art
techniques for building profiles, in order to determine how
they can be extended to the case where a data source is described
by a domain ontology. In particular, we will define the requirements
that must be satisfied by the content summaries so as to ensure
that they can be effectively exploited to determine the relevance
of a data source in answering a query.
(Execution) A thorough analysis of the different distributed
query processing techniques will be carried out, so as to highlight
the limits of such techniques with respect to the WISDOM architecture
(recall that in WISDOM a data source is externally perceived
only through its domain ontology). In particular, the different
aspects that may influence the relevance of a result will be
analyzed to see to what extent they are affected by the WISDOM
architecture.
(Navigation) We will analyze how query results can be processed
in order to be returned to the user in a compact, easy-to-use
form. Then, we will evaluate how navigation and aggregation
techniques used in business intelligence and data mining
can be combined in order to ensure the maximum flexibility in
choosing the level of granularity for presenting data. Furthermore,
we will study how the paradigms devised for visually querying
databases can be extended to queries that involve the use of
ontologies.
Finally, we will work, together with the other Units, on the
definition of the methodological and functional architecture
for the whole project (deliverable D0.R1).
PRODUCTS
The expected deliverables in this phase are technical reports
(R). The number after the letter D represents the theme (0 for
deliverables common to all the themes).
D0.R1 Report on the methodological and functional architecture
(in collaboration with Modena e Reggio Emilia - MO, Roma - RM,
Trento - TN)
D1.R1 Review of the languages and emerging standards for ontologies
(in collaboration with MO, RM, TN)
D3.R1 Review of the query languages and of the rewriting techniques
based on ontologies (in collaboration with MO, TN)
D3.R2 Review of query processing techniques in heterogeneous
environments
PHASE 2
During the second phase we will work on solutions for the 3
topics handled by the Unit:
(Content summaries) During the second phase we will define the
mechanisms for adding numeric information to the domain ontologies.
The basic idea is to extend the existing techniques for “probing”
the data sources by considering ontological information and
the derived constraints. The extension will be guided by principles
of economy, such as: 1) requiring as few “probes” as possible,
and 2) returning the most significant quantitative information
given a fixed amount of memory for storing content summaries.
According to the targets of Theme 1, we will specify the update
methods for the content summaries when a new data source is
added and when the corresponding domain ontology is extended.
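As an illustration of the two economy principles, the following minimal Python sketch probes a source top-down along the concept hierarchy. It relies on one possible ontological constraint, namely that a sub-concept cannot match more objects than its super-concept, so that a probe returning zero makes it unnecessary to probe the whole subtree. The function `build_summary`, the `probe` callback, and the choice of keeping the largest counts as the "most significant" information are assumptions made for the example, not the techniques to be defined in the project.

```python
from collections import deque
from typing import Callable


def build_summary(children: dict[str, list[str]], root: str,
                  probe: Callable[[str], int],
                  max_probes: int, max_entries: int) -> tuple[dict[str, int], int]:
    """Probe a source breadth-first along the ontology.

    children    : concept -> direct sub-concepts (hypothetical ontology encoding)
    probe       : returns the number of objects the source reports for a concept
    max_probes  : budget on the number of probes sent to the source
    max_entries : memory budget, i.e. number of counts kept in the summary
    Returns the summary and the number of probes actually used.
    """
    counts: dict[str, int] = {}
    frontier = deque([root])
    probes_used = 0
    while frontier and probes_used < max_probes:
        concept = frontier.popleft()
        matches = probe(concept)
        probes_used += 1
        if matches == 0:
            # Assumed constraint: sub-concepts cannot match more objects
            # than their super-concept, so the subtree is skipped.
            continue
        counts[concept] = matches
        frontier.extend(children.get(concept, []))
    # Assumption: with limited memory, the largest counts are kept.
    kept = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return dict(kept[:max_entries]), probes_used
```

In this sketch the update required when a new data source is added amounts to running the same procedure on that source; the update after an ontology extension is not covered.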
(Execution) The aim of this phase is the definition of a set
of techniques for the execution of distributed queries that,
considering the limits imposed by the WISDOM architecture, return
the most relevant results while minimizing the resources used.
Since the relevance of a given object depends on several factors
and on their relationships, the techniques to be developed
will be general enough to work properly and efficiently
even when the combination criterion is changed.
For this criterion, which initially may be implemented as a
weighted sum of the different factors, we will also consider
the more general and expressive “qualitative” case,
that is, one not based solely on numerical techniques.
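The idea of an interchangeable combination criterion can be sketched as follows in Python. The ranking function takes the criterion as a parameter, so that a weighted sum can be replaced without touching the rest of the code. Pareto dominance is shown as one possible example of a qualitative criterion; it is our assumption for the purpose of illustration, as are all function names and the factor names used below. The sketch does not address the minimization of resources, which is the actual research issue.

```python
from typing import Callable

Scores = dict[str, float]  # factor name -> score (higher is better)


def weighted_sum(weights: dict[str, float]) -> Callable[[Scores], float]:
    """Build a numeric combination criterion from per-factor weights."""
    def combine(scores: Scores) -> float:
        return sum(w * scores.get(factor, 0.0) for factor, w in weights.items())
    return combine


def top_k(results: dict[str, Scores],
          combine: Callable[[Scores], float], k: int) -> list[str]:
    """Return the ids of the k objects with the highest combined score."""
    ranked = sorted(results, key=lambda oid: combine(results[oid]), reverse=True)
    return ranked[:k]


def dominates(a: Scores, b: Scores) -> bool:
    """a dominates b if it is no worse on every factor and better on one.

    Assumes both objects are scored on the same set of factors.
    """
    return (all(a[f] >= b[f] for f in b)
            and any(a[f] > b[f] for f in b))


def pareto_best(results: dict[str, Scores]) -> list[str]:
    """Qualitative alternative: keep the objects that no other object dominates."""
    return [oid for oid, scores in results.items()
            if not any(dominates(other, scores)
                       for pid, other in results.items() if pid != oid)]
```

For instance, with factors such as textual similarity and source reliability, changing the weights changes the ranking returned by `top_k`, whereas `pareto_best` needs no weights at all.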
(Navigation) As concerns the exploitation of the query results,
we will identify the techniques necessary to precisely define
the desired granularity level. In particular, we will define
compact, semantically rich representations for information
available at different abstraction levels, and we will identify
the operators necessary for interactive navigation across the
different levels according to the domain ontology.
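A navigation operator of this kind can be illustrated, under simplifying assumptions, by a roll-up along the concept hierarchy, similar to the one used in business intelligence tools. In the minimal Python sketch below the ontology is reduced to a hypothetical child-to-parent mapping and each result is annotated with its most specific concept; the names `roll_up` and `summarize` are ours.

```python
from collections import Counter


def roll_up(concept: str, parent: dict[str, str], steps: int) -> str:
    """Move a concept up the hierarchy by at most `steps` levels."""
    for _ in range(steps):
        if concept not in parent:   # the top of the hierarchy has been reached
            break
        concept = parent[concept]
    return concept


def summarize(results: list[tuple[str, str]],
              parent: dict[str, str], steps: int) -> Counter:
    """Count the results per concept after rolling up by `steps` levels.

    results : (object id, most specific concept) pairs
    """
    return Counter(roll_up(concept, parent, steps) for _, concept in results)
```

Calling `summarize` with a larger value of `steps` yields a coarser view of the same results, while a smaller value corresponds to a drill-down.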
Finally, we will work, in collaboration with the other Units,
on the definition of the interfaces of the different components
of the integrated prototype (deliverable D0.R2).
PRODUCTS
D0.R2 Specification of the component interfaces of the integrated
prototype (in collaboration with MO, RM, TN)
D1.R2 Definition of the language for the specification of domain
ontologies (in collaboration with MO, TN)
D1.R3 Definition of the techniques for the creation of content
summaries
D3.R3 Definition of the query language and of the ontology-based
query rewriting techniques (in collaboration with MO, TN)
D3.R4 Definition of query execution techniques in the WISDOM
environment
PHASE 3
During the third phase we will develop 3 prototypes and, jointly
with the other Units, we will collaborate on the integration
of the prototypes developed in the project.
The first prototype, starting from a pre-existing domain ontology,
will implement “probing” techniques for the corresponding
data sources and it will define algorithms for building the
content summaries starting from the results obtained.
The second prototype (joint work with the MO Unit) will accept
and analyze user queries. It will also determine the sets of
relevant sources for the query at hand.
The third prototype will implement the query execution techniques
defined during phase 2. Further, it will include an interface
for an ontology-based interactive navigation at different abstraction
levels.
An extensive experimental activity will be carried out in order
to assess the performance of the prototypes.
PRODUCTS
Deliverables expected for this phase are software prototypes
(P).
D0.P1 Integrated system prototype (in collaboration with MO,
RM, TN)
D1.P2 Prototype for the creation of content summaries
D3.P1 Prototype for query specification (in collaboration with
MO)
D3.P2 Prototype of the query execution engine
|