Siegfried Handschuh
Title: Prof. Dr.
Last Name: Handschuh
First Name: Siegfried
Email: siegfried.handschuh@unisg.ch
Phone: +41 71 224 3441
-
Publication: A Canonical Context-Preserving Representation for Open IE: Extracting Semantically Typed Relational Tuples from Complex Sentences (Elsevier, 2023-05-23)
;Freitas, André
Modern systems that deal with inference in texts need automatized methods to extract meaning representations (MRs) from texts at scale. Open Information Extraction (IE) is a prominent way of extracting all potential relations from a given text in a comprehensive manner. Previous work in this area has mainly focused on the extraction of isolated relational tuples. Ignoring the cohesive nature of texts where important contextual information is spread across clauses or sentences, state-of-the-art Open IE approaches are thus prone to generating a loose arrangement of tuples that lack the expressiveness needed to infer the true meaning of complex assertions. To overcome this limitation, we present a method that allows existing Open IE systems to enrich their output with additional meta information. By leveraging the semantic hierarchy of minimal propositions generated by the discourse-aware Text Simplification (TS) approach presented in Niklaus et al. (2019), we propose a mechanism to extract semantically typed relational tuples from complex source sentences. Based on this novel type of output, we introduce a lightweight semantic representation for Open IE in the form of normalized and context-preserving relational tuples. It extends the shallow semantic representation of state-of-the-art approaches in the form of predicate-argument structures by capturing intra-sentential rhetorical structures and hierarchical relationships between the relational tuples. In that way, the semantic context of the extracted tuples is preserved, resulting in more informative and coherent predicate-argument structures which are easier to interpret.
In addition, in a comparative analysis, we show that the semantic hierarchy of minimal propositions benefits Open IE approaches in a second dimension: the canonical structure of the simplified sentences is easier to process and analyze, and thus facilitates the extraction of relational tuples, resulting in an improved precision (up to 32%) and recall (up to 30%) of the extracted relations on a large benchmark corpus.
Type: journal article
Journal: Knowledge-Based Systems
Issue: 268
-
Publication: Classification of Composite Semantic Relations by a Distributional-Relational Model
Different semantic interpretation tasks such as text entailment and question answering require the classification of semantic relations between terms or entities within text. However, in most cases it is not possible to assign a direct semantic relation between entities/terms. This paper proposes an approach for composite semantic relation classification using one or more relations between entities/term mentions, extending the traditional semantic relation classification task. Different from existing approaches, which use machine learning models built over lexical and distributional word vector features, the proposed model uses the combination of a large commonsense knowledge base of binary relations, a distributional navigational algorithm and sequence classification to provide a solution for the composite semantic relation classification problem. The proposed approach outperformed existing baselines with regard to F1-score, Accuracy, Precision and Recall.
Type: journal article
Journal: Data & Knowledge Engineering
Volume: 117
-
Publication: Shallow Discourse Parsing for Open Information Extraction and Text Simplification (International Conference on Computational Linguistics, 2022-10)
;Freitas, André
Type: conference paper
-
Publication: DisSim: A Discourse-Aware Syntactic Text Simplification Framework for English and German (2019)
;Freitas, André
We introduce DisSim, a discourse-aware sentence splitting framework for English and German whose goal is to transform syntactically complex sentences into an intermediate representation that presents a simple and more regular structure which is easier to process for downstream semantic applications. For this purpose, we turn input sentences into a two-layered semantic hierarchy in the form of core facts and accompanying contexts, while identifying the rhetorical relations that hold between them. In that way, we preserve the coherence structure of the input and, hence, its interpretability for downstream tasks.
Type: conference paper
-
Publication: Transforming Complex Sentences into a Semantic Hierarchy (Association for Computational Linguistics, 2019-07)
;Freitas, André
We present an approach for recursively splitting and rephrasing complex English sentences into a novel semantic hierarchy of simplified sentences, with each of them presenting a more regular structure that may facilitate a wide variety of artificial intelligence tasks, such as machine translation (MT) or information extraction (IE). Using a set of hand-crafted transformation rules, input sentences are recursively transformed into a two-layered hierarchical representation in the form of core sentences and accompanying contexts that are linked via rhetorical relations. In this way, the semantic relationship of the decomposed constituents is preserved in the output, maintaining its interpretability for downstream applications. Both a thorough manual analysis and automatic evaluation across three datasets from two different domains demonstrate that the proposed syntactic simplification approach outperforms the state of the art in structural text simplification. Moreover, an extrinsic evaluation shows that when applying our framework as a preprocessing step the performance of state-of-the-art Open IE systems can be improved by up to 346% in precision and 52% in recall. To enable reproducible research, all code is provided online.
Type: conference paper
DOI: 10.18653/v1/P19-1333
-
Publication: MinWikiSplit: A Sentence Splitting Corpus with Minimal Propositions (2019)
;Freitas, André
We compiled a new sentence splitting corpus that is composed of 203K pairs of aligned complex source and simplified target sentences. Contrary to previously proposed text simplification corpora, which contain only a small number of split examples, we present a dataset where each input sentence is broken down into a set of minimal propositions, i.e. a sequence of sound, self-contained utterances with each of them presenting a minimal semantic unit that cannot be further decomposed into meaningful propositions. This corpus is useful for developing sentence splitting approaches that learn how to transform sentences with a complex linguistic structure into a fine-grained representation of short sentences that present a simple and more regular structure which is easier to process for downstream applications and thus facilitates and improves their performance.
Type: conference paper
-
Publication: Graphene: A Context-Preserving Open Information Extraction System (Association for Computational Linguistics, 2018-08)
;Freitas, André
We introduce Graphene, an Open IE system whose goal is to generate accurate, meaningful and complete propositions that may facilitate a variety of downstream semantic applications. For this purpose, we transform syntactically complex input sentences into clean, compact structures in the form of core facts and accompanying contexts, while identifying the rhetorical relations that hold between them in order to maintain their semantic relationship. In that way, we preserve the context of the relational tuples extracted from a source sentence, generating a novel lightweight semantic representation for Open IE that enhances the expressiveness of the extracted propositions.
Type: conference paper
-
Publication: A Survey on Open Information Extraction (Association for Computational Linguistics, 2018-08)
;Freitas, André
Type: conference paper
Journal: Proceedings of the 27th International Conference on Computational Linguistics
-
Publication: An Open Vocabulary Semantic Parser for End-User Programming using Natural Language
The ability to automatically interpret natural language commands and actions has the potential of freeing up end-users to interact with software artefacts without the syntactic, vocabulary and formal constraints of a programming language. As most semantic parsers for end-user programming have been operating under a restricted vocabulary setting, it is unclear how these approaches perform over conditions of high semantic heterogeneity (e.g. in an open vocabulary). As the generation of annotated data is costly and time-consuming, models that effectively address complex learning problems constrained under the assumption of small annotated data sets are highly relevant. In this paper, we propose a semantic parsing approach to map natural language commands to actions from a large and heterogeneous frame set trained under a small set of annotated data. The semantic parsing approach uses the combination of semantic role labelling, distributional semantics geometric features and semantic pivoting in order to address the semantic matching problem in an open vocabulary setting.
Type: conference paper
Scopus© Citations 8
-
Publication: Object Classification in Images of Neoclassical Furniture Using Deep Learning (Springer, 2016-05-05)
;Freitas, André ;Donig, Simon
Type: conference paper
Journal: Computational History and Data-Driven Humanities
Scopus© Citations 6