KOBV-Portal


Online resource

XStruct (2006)

Hegewald, Jan ; Naumann, Felix ; Weis, Melanie

Berlin : Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät II


Details

UID: edochu_18452_9866

Content: XML is the de facto standard format for data exchange on the Web. While it is fairly simple to generate XML data, it is a complex task to design a schema and then guarantee that the generated data is valid according to that schema. As a consequence, much XML data does not have a schema or is not accompanied by its schema. To gain the benefits of having a schema (efficient querying and storage of XML data, semantic verification, data integration, etc.), the schema must be extracted. In this paper we present XStruct, an automatic technique for XML Schema extraction. Based on ideas of [5], XStruct extracts a schema for XML data by applying several heuristics to deduce regular expressions that are 1-unambiguous and that describe each element's contents correctly but generalized to a reasonable degree. Our approach has several advantages over known techniques: XStruct scales to very large documents (beyond 1 GB) in both time and memory consumption; it is able to extract a general, complete, correct, minimal, and understandable schema for multiple documents; and it detects data types and attributes. Experiments confirm these features and properties.
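
The abstract does not spell out XStruct's heuristics, so the following is only a rough sketch of the general idea of deducing a content model per element from sample documents. It is not XStruct's algorithm. The function names `infer_content_models` and `quantifier` are hypothetical, and the sketch assumes that child elements always appear in the same relative order, so it only counts how often each child occurs and attaches `?`, `*`, or `+` accordingly.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def quantifier(lo, hi):
    """Turn observed min/max occurrence counts into a regex quantifier."""
    if hi > 1:
        return "*" if lo == 0 else "+"
    return "?" if lo == 0 else ""


def infer_content_models(documents):
    """Map each element name to a crude, regex-like content model.

    Assumption (not from the paper): children keep a consistent order,
    so only per-child occurrence counts are tracked.
    """
    stats = {}        # parent tag -> {child tag: [min_count, max_count]}
    seen = Counter()  # parent tag -> occurrences processed so far
    for text in documents:
        for elem in ET.fromstring(text).iter():
            counts = Counter(child.tag for child in elem)
            children = stats.setdefault(elem.tag, {})
            for tag, n in counts.items():
                if tag not in children:
                    # A child first seen now was absent in earlier occurrences.
                    children[tag] = [0 if seen[elem.tag] else n, n]
            for tag, bounds in children.items():
                n = counts.get(tag, 0)
                bounds[0] = min(bounds[0], n)
                bounds[1] = max(bounds[1], n)
            seen[elem.tag] += 1
    return {
        parent: ", ".join(tag + quantifier(*b) for tag, b in children.items())
        or "(no child elements)"
        for parent, children in stats.items()
    }


if __name__ == "__main__":
    sample = (
        "<lib>"
        "<book><title>A</title><author>X</author><author>Y</author></book>"
        "<book><title>B</title><year>2006</year></book>"
        "</lib>"
    )
    models = infer_content_models([sample])
    print(models["book"])  # title, author*, year?
```

Each child name appears only once in the resulting expression, which keeps the model easy to read. A real extractor would also have to handle varying child order, choices between alternatives, attributes, and data types, all of which the abstract lists as things XStruct addresses.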

Content: Peer reviewed

Language: English

DOI: 10.1109/ICDEW.2006.166

DOI: 10.18452/9214

URN: urn:nbn:de:kobv:11-10065894

URL: Full text (free of charge)

