C-ORAL-ROM: Integrated Reference Corpora for Spoken Romance Languages
Emanuela Cresti, Massimo Moneglia
John Benjamins Publishing, 2005 - 303 pages
The C-ORAL-ROM book and DVD provide a unique set of comparable corpora of spontaneous speech for the main Romance languages: French, Italian, Portuguese and Spanish. The corpora are accompanied by comparative linguistic studies, models and standard linguistic measures of spoken language variability. Each corpus is built to the same design using identical sampling techniques, and each is presented in multimedia format, allowing simultaneous access to aligned acoustic and textual information. Texts are headed with information about provenance, participants, etc., and the transcriptions show changes of speaker. Speech acts are tagged according to prosodic criteria. Each corpus totals 300,000 words and presents formal and informal speech in a variety of contexts of use, dialogue structures, text genres, semantic domains and speech act typologies. The corpora have great statistical relevance for spoken language structures and can address key issues in human language technology such as speech recognition in unrestricted discourse, the suitability of speech synthesis in natural prosody, and multilingual applications of the spoken language interface. The work provides new data and innovative theoretical perspectives that are relevant for corpus linguistics, Romance linguistics, syntactic theory, speech and prosody research, and second language acquisition.
Contents
CHAPTER | 1 |
APPENDIX | 4 |
CHAPTER 2 | 71 |
CHAPTER 3 | 111 |
CHAPTER 4 | 135 |
CHAPTER 5 | 163 |
CHAPTER 6 | 209 |
Evaluation of consensus on the annotation of terminal and nonterminal | 257 |
Results | 267 |
Common terms and phrases
adjectives adverbs alignment analysis annotation automatic tagging Biber Blanche-Benveniste C-ORAL-ROM corpus clitic context coordination CORLEX corpora corpus design Corpus Linguistics Cresti dialogic turn dialogue disambiguation discourse markers discourse particles domain Dutch corpus elements errors European Portuguese evaluation example Figure DVD formal forms French functions grammar illocutionary act interjections Italian label Lemma Rank Lemma lemmatisation lexical lexicon Linguistique locutions Moneglia monologues morpho-syntactic multiword nodes non-standard non-terminal breaks nouns occur onomatopoeia orthographic paralinguistic Parlato phenomena phonetic Português Portuguese POS tagging position preposition pronouns prosodic break prosodic tagging Rank Lemma Rank recorded relevant resource retracting Romance languages sampling segment semantic Spanish speakers specific speech act speech corpus speech events speech flow speech recognition spoken corpora spoken corpus spoken language spontaneous speech strategy structure syntactic Table tagger tagset terminal breaks tone unit transcribed transcription types variation verbs Voghera WinPitch words