Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/16947
Full metadata record
DC Field | Value | Language
dc.contributor.author | Ciravegna, Fabio | -
dc.contributor.author | Dingli, Alexiei | -
dc.contributor.author | Wilks, Yorick | -
dc.contributor.author | Petrelli, Daniela | -
dc.date.accessioned | 2017-03-04T17:34:24Z | -
dc.date.available | 2017-03-04T17:34:24Z | -
dc.date.issued | 2002 | -
dc.identifier.citation | Ciravegna, F., Dingli, A., Wilks, Y., & Petrelli, D. (2002). Adaptive information extraction for document annotation in amilcare. 25th ACM/SIGIR International Conference on Research and Development in Information Retrieval, Tampere. 451. | en_GB
dc.identifier.issn | 01635840 | -
dc.identifier.uri | https://www.um.edu.mt/library/oar//handle/123456789/16947 | -
dc.description.abstract | Amilcare is a tool for Adaptive Information Extraction (IE) designed to support active annotation of documents for the Semantic Web (SW). It can be used either for unsupervised document annotation or as a support for human annotation. Amilcare is portable to new applications/domains without any knowledge of IE, as it just requires users to annotate a small training corpus with the information to be extracted. It is based on (LP)2, a supervised learning strategy for IE able to cope with different text types, from newspaper-like texts to rigidly formatted Web pages and even a mixture of them [1][5]. Adaptation starts with the definition of a tag set for annotation, possibly organized as an ontology. Then users have to manually annotate a small training corpus. Amilcare provides a default mouse-based interface called Melita, where annotations are inserted by first selecting a tag from the ontology and then identifying the text area to annotate with the mouse. Unlike similar annotation tools [4, 5], Melita actively supports training corpus annotation. While users annotate texts, Amilcare runs in the background, learning how to reproduce the inserted annotations. Induced rules are silently applied to new texts and their results are compared with the user annotation. When its rules reach a (user-defined) level of accuracy, Melita presents new texts with a preliminary annotation derived from the rule application. In this case users only have to correct mistakes and add missing annotations. User corrections are fed back to the learner for retraining. This technique focuses the slow and expensive user activity on uncovered cases and avoids requiring annotation of cases where satisfactory effectiveness has already been reached. Moreover, validating extracted information is a much simpler task than tagging bare texts (and also less error-prone), speeding up the process considerably. At the end of the corpus annotation process, the system is trained and the application can be delivered. MnM [6] and Ontomat annotizer [7] are two annotation tools adopting Amilcare's learner. In this demo we simulate the annotation of a small corpus and we show how and when Amilcare is able to support users in the annotation process, focusing on the way the user can control the tool's proactivity and intrusivity. We will also quantify such support with data derived from a number of experiments on corpora. We will focus on training corpus size and correctness of suggestions when the corpus is increased. | en_GB
dc.language.iso | en | en_GB
dc.publisher | The ACM Digital Library | en_GB
dc.rights | info:eu-repo/semantics/restrictedAccess | en_GB
dc.subject | Natural language processing (Computer science) | en_GB
dc.subject | Semantic Web | en_GB
dc.subject | Self-adaptive software | en_GB
dc.subject | Knowledge management | en_GB
dc.subject | Corpora (Linguistics) | en_GB
dc.title | Adaptive information extraction for document annotation in amilcare | en_GB
dc.type | conferenceObject | en_GB
dc.rights.holder | The copyright of this work belongs to the author(s)/publisher. The rights of this work are as defined by the appropriate Copyright Legislation or as modified by any successive legislation. Users may access this work and can make use of the information contained in accordance with the Copyright Legislation provided that the author must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the prior permission of the copyright holder. | en_GB
dc.bibliographicCitation.conferencename | 25th ACM/SIGIR International Conference on Research and Development in Information Retrieval | en_GB
dc.bibliographicCitation.conferenceplace | Tampere, Finland, 11-15/08/2002 | en_GB
dc.description.reviewed | peer-reviewed | en_GB
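
The annotation workflow described in the abstract (the learner observes silently, its output is compared with the user's annotation, and suggestions are switched on once a user-defined accuracy is reached) might be sketched roughly as follows. This is only an illustration under stated assumptions: `ToyLearner`, `accuracy`, `annotation_session`, and the sample corpus are hypothetical names and data invented here, and the learner simply memorises token-to-tag pairs. It is not (LP)2 and does not reflect Amilcare's or Melita's actual interfaces.

```python
from dataclasses import dataclass, field


# Hypothetical stand-in for the learner: it memorises token -> tag pairs.
# This is NOT (LP)2; it only makes the control flow below runnable.
@dataclass
class ToyLearner:
    rules: dict = field(default_factory=dict)

    def train(self, annotations):
        """Learn from a {token: tag} mapping supplied or corrected by the user."""
        self.rules.update(annotations)

    def annotate(self, tokens):
        """Apply the memorised rules to a token list; returns {token: tag}."""
        return {t: self.rules[t] for t in tokens if t in self.rules}


def accuracy(predicted, gold):
    """Fraction of the user's annotations that the learner reproduced."""
    if not gold:
        return 1.0
    hits = sum(1 for tok, tag in gold.items() if predicted.get(tok) == tag)
    return hits / len(gold)


def annotation_session(corpus, threshold):
    """Simulate a session over a list of (tokens, user_annotations) pairs.

    Returns one (mode, accuracy) pair per document, where mode is
    'silent' (learner only observes) or 'suggest' (user corrects proposals).
    """
    learner = ToyLearner()
    suggesting = False
    log = []
    for tokens, gold in corpus:
        proposed = learner.annotate(tokens)   # rules applied to the new text
        acc = accuracy(proposed, gold)        # compared with the user's annotation
        log.append(("suggest" if suggesting else "silent", acc))
        learner.train(gold)                   # user-approved result is fed back
        if acc >= threshold:                  # user-defined accuracy level reached
            suggesting = True
    return log


# Invented sample data, for illustration only.
corpus = [
    (["talk", "by", "Smith", "at", "3pm"], {"Smith": "speaker", "3pm": "time"}),
    (["Smith", "speaks", "at", "3pm"], {"Smith": "speaker", "3pm": "time"}),
    (["Jones", "speaks", "at", "3pm"], {"Jones": "speaker", "3pm": "time"}),
]
log = annotation_session(corpus, threshold=0.75)
print(log)
```

In this sketch the session stays in suggestion mode once the threshold has been reached, even if accuracy later drops on an unseen case. That is a simplification chosen here, not something the abstract specifies.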
Appears in Collections: Scholarly Works - FacICTAI

Files in This Item:
File | Description | Size | Format
Conference paper - Adaptive information extraction for document annotation in Amilcare.pdf (Restricted Access) | Adaptive information extraction for document annotation in amilcare | 131.37 kB | Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.