
MinWikiSplit: A Sentence Splitting Corpus with Minimal Propositions

Type
conference paper
Date Issued
2019
Author(s)
Niklaus, Christina  
Freitas, André
Handschuh, Siegfried  
Abstract
We compiled a new sentence splitting corpus composed of 203K pairs of aligned complex source and simplified target sentences. In contrast to previously proposed text simplification corpora, which contain only a small number of split examples, we present a dataset in which each input sentence is broken down into a set of minimal propositions, i.e., a sequence of sound, self-contained utterances, each presenting a minimal semantic unit that cannot be further decomposed into meaningful propositions. The corpus is useful for developing sentence splitting approaches that learn to transform sentences with a complex linguistic structure into a fine-grained representation of short sentences with a simple and more regular structure. Such sentences are easier for downstream applications to process, which facilitates and improves their performance.
Language
English
HSG Classification
contribution to scientific community
Publisher place
Tokyo, Japan
Event Title
12th International Conference on Natural Language Generation
Event Location
Tokyo, Japan
Event Date
29 October - 1 November 2019
URL
https://www.alexandria.unisg.ch/handle/20.500.14171/99476
Subject(s)

computer science

Division(s)

ICS - Institute of Co...

Eprints ID
258308
File(s)
Open access

Name

67_Paper.pdf

Size

108.32 KB

Format

Adobe PDF

Checksum (MD5)

20324d7545eba1cddae3d440b6b0880d
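
The MD5 checksum listed above can be used to confirm that a downloaded copy of the file is intact. The following is a minimal sketch, not part of the repository itself: it assumes the PDF has been saved locally under the name shown in this record (`67_Paper.pdf`), and the helper names `md5_of_file` and `matches_record` are hypothetical.

```python
import hashlib
from pathlib import Path

# Checksum copied from this record; the local file path below is an assumption.
EXPECTED_MD5 = "20324d7545eba1cddae3d440b6b0880d"


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Compare a file's MD5 digest with the expected value (case-insensitive)."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    pdf = Path("67_Paper.pdf")  # assumed download location
    if pdf.exists():
        print("checksum OK" if matches_record(pdf) else "checksum mismatch")
    else:
        print(f"{pdf} not found; download it first")
```

MD5 is adequate here for detecting accidental corruption during download; it is not a safeguard against deliberate tampering.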
