Project acronym: QUESTION-HOW
Project Full Title: Quality Engineering Solutions via Tools,
Information and Outreach for the New Highly-enriched
Offerings from W3C: Evolving the Web in Europe
Project/Contract No.: IST-2000-28767
Workpackage 2, Deliverable D3 - XML/RDF Digital Libraries
Project Manager: Daniel Dardailler <danield@w3.org>
Author of this document: Oreste Signore
<oreste.signore@isti.cnr.it>
Created: 28 August 2003. Last updated: 04/09/2003 13.32
The main task has been to develop a user interface for
querying corpora of complex and specialized XML documents
(such as legal documents, cultural heritage cataloguing
cards, and user manuals).
At the lower level, we make use of the XCDE library,
developed at the Department of Computer Science, University
of Pisa. The core of the library (the indexing and
compression algorithms) remains the property of its authors.
The library is written in C and provides a set of efficient
algorithms and data structures for indexing and searching an
XML document collection.
The documents must be well-formed and may be heterogeneous
in that they may reflect different DTDs. The library
supports the storage and management of these XML files in
native form, operating directly at the File System level.
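The well-formedness requirement can be illustrated with a
minimal sketch. It uses Python's standard xml.etree module
rather than the XCDE library itself; the helper name
is_well_formed and the sample documents are assumptions made
for illustration only.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Two documents with unrelated vocabularies (heterogeneous), both well-formed.
law = "<law><article n='1'>Text of the article</article></law>"
card = "<card><object>Amphora</object><period>Roman</period></card>"
# Not well-formed: the <object> element is never closed.
broken = "<card><object>Amphora</card>"

# Hypothetical collection keyed by file name.
collection = {"law.xml": law, "card.xml": card, "broken.xml": broken}

# Only well-formed documents would be admitted to the collection.
accepted = sorted(name for name, text in collection.items() if is_well_formed(text))
print(accepted)
```

Note that the check is purely syntactic: the two accepted
documents share no DTD, which is the kind of heterogeneity
described above.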
The main features of the library are: state-of-the-art
algorithms and data structures for text indexing,
compressed space occupancy, and novel succinct data
structures for the management of the hierarchical structure
of the XML document.
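To give an idea of what a succinct representation of a
document hierarchy can look like, the sketch below encodes an
XML tree as a balanced-parentheses sequence, a textbook
technique that needs about two bits per element. This is an
illustrative assumption, not a description of the structures
actually used in XCDE; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def encode_tree(root):
    """Preorder walk: emit '(' on entering an element, ')' on leaving it."""
    parens, tags = [], []

    def visit(node):
        parens.append("(")
        tags.append(node.tag)  # tag names are kept in preorder, outside the bit sequence
        for child in node:
            visit(child)
        parens.append(")")

    visit(root)
    return "".join(parens), tags


def find_close(parens, open_pos):
    """Position of the ')' matching the '(' at open_pos (simple linear scan)."""
    depth = 0
    for i in range(open_pos, len(parens)):
        depth += 1 if parens[i] == "(" else -1
        if depth == 0:
            return i
    raise ValueError("unbalanced sequence")


def subtree_size(parens, open_pos):
    """Number of elements in the subtree whose '(' is at open_pos."""
    return (find_close(parens, open_pos) - open_pos + 1) // 2


doc = ET.fromstring("<card><object><name/><material/></object><period/></card>")
parens, tags = encode_tree(doc)
print(parens, tags)
print(subtree_size(parens, 1))  # subtree rooted at <object>
```

The linear scan in find_close keeps the sketch short;
practical succinct structures add small auxiliary indexes so
that such navigation queries take constant time.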
The library provides an API with a rich set of functions
to operate on its whole collection of data structures and
algorithms. This API may be used to implement most of the
basic functionality of XQuery at the higher level, and it
may also support more complex IR-like (information
retrieval) searches.
The user interface obtains the document structure from the
XML Schema and makes use of some RDF facilities to broaden
or narrow query terms, implementing graphical browsing of
thesauri in order to support semantic equivalences for more
effective searches.
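The broadening and narrowing of query terms can be sketched
as follows. The thesaurus fragment, its terms, and the
function names broaden, narrow and expand are all
hypothetical; in the project the relations would come from
RDF descriptions of the thesaurus, whereas here they are
plain Python dictionaries.

```python
# Hypothetical thesaurus fragment: each term maps to its directly narrower terms.
NARROWER = {
    "vessel": ["amphora", "jug"],
    "amphora": ["wine amphora"],
}

# The broader relation is the inverse of the narrower one.
BROADER = {
    child: parent
    for parent, children in NARROWER.items()
    for child in children
}


def broaden(term):
    """Return the directly broader term, or None at the top of the hierarchy."""
    return BROADER.get(term)


def narrow(term):
    """Return the directly narrower terms (empty list for a leaf term)."""
    return list(NARROWER.get(term, []))


def expand(term):
    """Return term followed by every transitively narrower term, in preorder."""
    result = [term]
    for child in NARROWER.get(term, []):
        result.extend(expand(child))
    return result


print(broaden("amphora"))
print(narrow("vessel"))
print(expand("vessel"))
```

With such a mapping, a query for "vessel" could also match
documents catalogued under "amphora" or "jug", which is one
way to read the semantic equivalences mentioned above.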