Semi-Structured Data (2004)
Organizers:
Georg Gottlob (Vienna)
Scientific Theme and Short description of programme:
Within the computer science research of the WPI, the topic "semi-structured data" has been particularly successful and promising.
Most existing databases are based on Codd's relational database
model, in which the data is organized in the form of tables, called relations.
Data on the Web, however, is organized in tree-like form (reflecting the
parse tree of a Web document) and requires different access and processing
mechanisms. Data organized this way is usually referred to as semi-structured
data. The efficient storage, retrieval, and manipulation of semi-structured
data in formats such as HTML and XML is currently one of the most important
topics of database research.
At the WPI, intensive research has been dedicated to the topics and
problems described below, and several papers have already been published
at leading international conferences and in leading journals. One of these papers
(by Gottlob and Koch), dealing with the expressive power of languages for
querying XML and for data extraction, received the Best Paper Award at the
ACM Symposium on Principles of Database Systems (ACM PODS 2002) in Madison,
Wisconsin, USA, and its full version was published in the Journal of the ACM.
In order to intensify the research activities on semi-structured data and
related topics, we want to invite international specialists and cooperate
with them. In particular, we plan to cooperate with researchers in the
following areas at the intersection of database theory, web information
management, and computational logic:
- Query languages for semi-structured data:
Several query and transformation languages for the XML data model have been
proposed and subsequently standardized, for example XSLT and XQuery. While the
syntax and the semantics of these languages are well defined, the complexity and
expressive power of queries formulated in these and related languages have
not been sufficiently analysed. The complexity of XPath, an important
fragment of both XSLT and XQuery, was recently investigated by WPI member
Georg Gottlob together with Christoph Koch and Reinhard Pichler. This research
showed that, while all commercially available XPath processors (e.g. the one
contained in Microsoft's Internet Explorer), as well as those available
under an open source license (e.g. Xalan, Saxon), have exponential combined
complexity and are thus not scalable, the combined complexity of processing
XPath queries is actually polynomial. Moreover, we have identified large and
important fragments of XPath whose execution requires only quadratic time
(the size of the database times the size of the query). We have designed a new
and scalable algorithm for XPath query processing. We now hope that, together
with peer researchers invited by the WPI, we will be able to determine the
complexity of larger fragments of XSLT and of other query languages.
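The gap between exponential and polynomial evaluation can be illustrated with a small sketch. It is not the algorithm designed by the authors, and it covers only a toy fragment (child and parent steps with a tag test). All function names are hypothetical. A naive evaluator that follows every navigation path separately is compared with one that keeps a duplicate-free set of context nodes per step, so that each step costs at most time proportional to the size of the document.

```python
import xml.etree.ElementTree as ET

def load(xml_text):
    """Parse a document and build a child -> parent map (ElementTree keeps no parent links)."""
    root = ET.fromstring(xml_text)
    parent = {child: node for node in root.iter() for child in node}
    return root, parent

def step(node, axis, tag, parent):
    """Nodes reached from `node` by one location step axis::tag (toy fragment only)."""
    if axis == "child":
        return [c for c in node if c.tag == tag]
    if axis == "parent":
        p = parent.get(node)
        return [p] if p is not None and p.tag == tag else []
    raise ValueError(f"unsupported axis: {axis}")

def eval_naive(node, steps, parent, calls):
    """Follow every navigation path on its own; duplicates are never merged."""
    calls[0] += 1
    if not steps:
        return [node]
    (axis, tag), rest = steps[0], steps[1:]
    result = []
    for nxt in step(node, axis, tag, parent):
        result.extend(eval_naive(nxt, rest, parent, calls))
    return result

def eval_sets(root, steps, parent, calls):
    """Keep one duplicate-free set of context nodes per step."""
    context = {root}
    for axis, tag in steps:
        reached = set()
        for node in context:
            calls[0] += 1
            reached.update(step(node, axis, tag, parent))
        context = reached
    return context

# A two-leaf document and a query that walks down and up again ten times.
doc_root, doc_parent = load("<a><b/><b/></a>")
query = [("child", "b"), ("parent", "a")] * 10 + [("child", "b")]

naive_calls, set_calls = [0], [0]
naive_result = eval_naive(doc_root, query, doc_parent, naive_calls)
set_result = eval_sets(doc_root, query, doc_parent, set_calls)

print("naive:", len(naive_result), "results,", naive_calls[0], "calls")
print("sets: ", len(set_result), "results,", set_calls[0], "node visits")
```

In this sketch the naive evaluator doubles its work with every child step, while the set-based one visits at most every document node once per step. Both return the same distinct nodes.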
- Web-Data extraction:
The overwhelming part of the Web consists of pages formatted in HTML. This
format, however, neither describes the content nor imposes data structures
on it, but just specifies a layout suited for a human reader. Therefore,
information on HTML Web pages, even though available in electronic form,
is not suited for being directly and automatically processed by standard EDP
applications. On the other hand, XML Web documents, where tags qualify data
items at various granularities, are well suited for automated data processing
tasks. Unfortunately, and contrary to previous predictions, most information
available on the Web is formatted in HTML, and the ratio of HTML to XML pages
does not seem to decrease. To achieve highly automated Web information
processing, it is thus necessary to extract data automatically from HTML pages,
structure the data, and enrich it with meaningful tags, in order to make it
accessible to EDP applications. The Institute of Information Systems of TU Wien,
in cooperation with WPI and EC3 (the Vienna Electronic Commerce Competence
Centre), has achieved several results in the area of automated data extraction.
New methods were defined, new algorithms were designed and tested, and a new
theory on the expressive power and complexity of data extraction formalisms
was developed. The results were published at top international conferences
and in top journals (e.g. VLDB, ACM PODS, JACM). The activities in this area led
to the foundation of the start-up company "Lixto Software GmbH" (www.lixto.com),
which was created as a spin-off of TU Wien and EC3. Through the invitation of
international experts, post docs and Ph.D. students, we would like to further
investigate the area of Web data extraction.
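As a rough illustration of the extraction task described above (extract data from an HTML page, structure it, and enrich it with meaningful tags), the following minimal sketch uses only the Python standard library. It is not the Lixto approach or any formalism referenced in the text. The class `RowExtractor`, the function `to_xml`, the sample page, and the field names `name` and `price` are all assumptions made for the example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

class RowExtractor(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td> element

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            self._in_cell = False
            self._row[-1] = self._row[-1].strip()
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Layout markup such as <b> is ignored; only the cell text is kept.
        if self._in_cell:
            self._row[-1] += data

def to_xml(rows, fields):
    """Wrap each extracted row in tags that name its data items."""
    root = ET.Element("items")
    for row in rows:
        item = ET.SubElement(root, "item")
        for name, value in zip(fields, row):
            ET.SubElement(item, name).text = value
    return ET.tostring(root, encoding="unicode")

# Hypothetical page: the markup describes layout only, not meaning.
HTML_PAGE = """
<table>
  <tr><td><b>Widget</b></td><td>9.50</td></tr>
  <tr><td><b>Gadget</b></td><td>12.00</td></tr>
</table>
"""

extractor = RowExtractor()
extractor.feed(HTML_PAGE)
xml_text = to_xml(extractor.rows, ["name", "price"])
print(xml_text)
```

The knowledge of which column holds which data item is supplied by hand here. Deriving such knowledge with as little manual effort as possible is the hard part of the problem.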
- Semantic Web and Intelligent Semantic Systems:
A Semantic Web Initiative (www.semanticweb.at:8282/) under the leadership of
Profs. Fensel, Gottlob, and Werthner has secured a commitment from the Austrian
Ministry of Infrastructure and Industry to fund projects with both industrial and
academic partners on applied, product-oriented aspects of Intelligent Semantic
Systems and the Semantic Web (FIT-IT-Programme). The first call of this
programme will be issued in May 2004 with a total funding of 4 million euros.
In order to foster research on the more theoretical aspects of the Semantic
Web and intelligent semantic systems, we would like to invite international
scientists who are experts either on the Semantic Web itself or in a related
area (e.g. automated reasoning, description logic, semantics). We also plan to
invite students and post docs in this area for a limited period of time.
Events
Location: Vienna
Time: 16. Dec 2004 (Thu) - 18. Dec 2004 (Sat); Opening: 9:00
Topics:
Graph Decompositions
Hypergraph Decompositions
Applications in databases
Applications in Constraint Satisfaction and other areas of interest
Remark: This workshop will bring together researchers working on graph and hypergraph decomposition
techniques to be used in different areas of Computer Science. Both advances in decomposition
techniques and applications in databases, Constraint Satisfaction and other areas are of interest.
Pauli Fellows
Visitors
Beer, Andreas | | 16. Dec 2004-18. Dec 2004
Benedikt, Michael | | 10. Dec 2004-23. Dec 2004
Benedikt, Michael | | 14. Oct 2004-23. Oct 2004
Berwanger, Dietmar | GAMES | 16. Dec 2004-18. Dec 2004
Bjorner, Dines | | 2. Jun 2004-3. Jun 2004
Brandstaedt, Andreas | | 16. Dec 2004-18. Dec 2004
Bretto, Alain | | 16. Dec 2004-18. Dec 2004
Buneman, Peter | Wittgenstein | 21. Jan 2004-25. Jan 2004
Cohen, Dave | | 16. Dec 2004-18. Dec 2004
Dechter, Rina | | 16. Dec 2004-18. Dec 2004
Dermaku, Artan | | 16. Dec 2004-18. Dec 2004
Durand, Arnaud | | 16. Dec 2004-18. Dec 2004
Grädel, Erich | GAMES | 16. Dec 2004-18. Dec 2004
Grandjean, Etienne | | 16. Dec 2004-18. Dec 2004
Grohe, Martin | | 16. Dec 2004-18. Dec 2004
Grueber, Magdalena | | 16. Dec 2004-18. Dec 2004
Gurevich, Yuri | Microsoft | 15. Feb 2004-21. Feb 2004
Harz, Patrick | | 13. Sep 2004-19. Sep 2004
Hassan, Tamir | | 11. Nov 2004-13. Nov 2004
Helmer, Sven | Vienna University of Technology | 23. May 2004-24. May 2004
Hlineny, Petr | | 16. Dec 2004-18. Dec 2004
Hnich, Brahim | | 16. Dec 2004-18. Dec 2004
Ianni, Goivambattista | | 16. Dec 2004-18. Dec 2004
Klein, Thomas | | 5. Aug 2004-9. Aug 2004
Kosa, Balazs | | 16. Dec 2004-18. Dec 2004
Kreutzer, Stephan | | 16. Dec 2004-18. Dec 2004
Lehmann, Peter | | 1. Jul 2004-3. Jul 2004
Leone, Nicola | Vienna University of Technology | 22. May 2004-25. May 2004
Lukasiewicz, Thomas | Vienna University of Technology | 23. May 2004-25. May 2004
Makino, Kazuhisa | | 1. May 2004-30. May 2004
Makowsky, Janos | | 16. Dec 2004-18. Dec 2004
Martens, Wim | | 9. Aug 2004-11. Aug 2004
Martens, Wim | | 1. Dec 2004-31. Mar 2005
McMahan, Ben | | 16. Dec 2004-18. Dec 2004
McMahan, Benjamin | GAMES | 1. Jun 2004-31. Jul 2004
Michnevych, Vadym | | 20. Apr 2004-8. May 2004
Miklos, Zoltan | | 16. Dec 2004-18. Dec 2004
Mykola, Rusnak | | 20. Apr 2004-8. May 2004
Niehren, Joachim | | 11. Feb 2004-18. Feb 2004
Oum, Sang-il | | 16. Dec 2004-18. Dec 2004
Pichler, Reinhard | | 16. Dec 2004-18. Dec 2004
Samer, Marko | | 16. Dec 2004-18. Dec 2004
Scarcello, Francesco | | 16. Dec 2004-18. Dec 2004
Schulz, Klaus | | 31. May 2004-6. Jun 2004
Schweikardt, Nicole | | 1. Oct 2004-7. Oct 2004
Schwentick, Thomas | Vienna University of Technology | 24. May 2004-24. May 2004
Seufert, Andreas | | 1. Jul 2004-6. Jul 2004
Shcherbina, Oleg A. | | 16. Dec 2004-18. Dec 2004
Simon, Kai | Wittgenstein | 10. Jul 2004-17. Jul 2004
Smitha, Akkina | | 12. Sep 2004-18. Sep 2004
Stegmaier, Bernhard | | 2. Oct 2004-7. Oct 2004
Szeider, Stefan | | 16. Dec 2004-18. Dec 2004
Toffeti Carughi, Giovanni | | 3. Oct 2004-15. Oct 2004
Turan, György | | 19. Oct 2004-21. Oct 2004
Vassalos, Vasilis | | 7. May 2004-7. May 2004
Wei, Fang | | 13. Jul 2004-14. Jul 2004
Weigel, Felix | | 9. Dec 2004-16. Dec 2004
Ziegler, Cai | Wittgenstein | 10. Jul 2004-17. Jul 2004
Long Term Visitors
PreDocs