Abstract
The coherence of a text is established by various linguistic means, including discourse connectives (coordinating and subordinating conjunctions, adverbs, etc.). However, semantic relations between text segments can also be inferred without an explicit discourse connective (so-called implicit discourse relations, cf. He missed his train. 0 He had to take a taxi.). In this paper, we introduce a corpus of Czech annotated for implicit discourse relations (Enriched Discourse Annotation of Prague Discourse Treebank Subset 1.0) and analyze some of the factors influencing the explicitness or implicitness of discourse relations, such as text genre, the semantic type of the discourse relation, and the presence of negation in discourse arguments.
Notes
- 1.
- 2. For comparison, a measurement of inter-annotator agreement for implicit relations in the Turkish Discourse Treebank reports chance-corrected \(\kappa\) values of 0.52 at the class level, 0.43 at the type level and 0.34 at the subtype level [16]. The measurement at the subtype level corresponds to our measurement of agreement on discourse types.
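As an illustration of what a chance-corrected agreement value expresses, the following is a minimal sketch of Cohen's \(\kappa\) for two annotators. It assumes the two-annotator Cohen variant, which the note does not specify; the function name `cohen_kappa` and the toy labels are hypothetical and are not taken from either treebank.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label sequences")
    n = len(labels_a)
    # Observed agreement: share of items on which both labels are identical.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's own label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)


# Toy labels invented for illustration only (not corpus data).
annotator_1 = ["reason", "reason", "conjunction", "opposition", "conjunction", "reason"]
annotator_2 = ["reason", "conjunction", "conjunction", "opposition", "conjunction", "reason"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

In this toy case the annotators agree on five of six items, but part of that agreement would be expected by chance given how often each of them uses each label, so \(\kappa\) comes out lower than the raw agreement rate.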
References
Jínová, P., Poláková, L., Mírovský, J.: Sentence Structure and Discourse Structure (Possible Parallels), Linguistics Today, vol. 215, pp. 53–74. John Benjamins Publishing Company, Amsterdam (2014)
Mírovský, J., Hajičová, E.: What can linguists learn from some simple statistics on annotated treebanks. In: Henrich, V., Hinrichs, E., de Kok, D., Osenova, P., Przepiórkowski, A. (eds.) Proceedings of 13th International Workshop on Treebanks and Linguistic Theories (TLT13), pp. 279–284. University of Tübingen, Tübingen (2014)
Mírovský, J., Mladová, L., Žabokrtský, Z.: Annotation tool for discourse in PDT. In: Huang, C.R., Jurafsky, D. (eds.) Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), vol. 1, pp. 9–12. Chinese Information Processing Society of China, Tsinghua University Press, Beijing (2010)
Pajas, P., Štěpánek, J.: Recent advances in a feature-rich framework for treebank annotation. In: Scott, D., Uszkoreit, H. (eds.) The 22nd International Conference on Computational Linguistics - Proceedings of the Conference, vol. 2, pp. 673–680. The Coling 2008 Organizing Committee, Manchester (2008)
Pitler, E., Louis, A., Nenkova, A.: Automatic sense prediction for implicit discourse relations in text. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, vol. 2, pp. 683–691. Association for Computational Linguistics (2009)
Poláková, L.: K možnostem korpusového zpracování nadvětných jevů [On the possibilities of a corpus-based approach to discourse phenomena]. Naše řeč 4-5/2014, pp. 241–258 (2014)
Poláková, L., Mírovský, J., Nedoluzhko, A., Jínová, P., Zikánová, Š., Hajičová, E.: Introducing the Prague Discourse Treebank 1.0. In: Proceedings of the 6th International Joint Conference on Natural Language Processing, pp. 91–99. Asian Federation of Natural Language Processing, Nagoya (2013)
Prasad, R., et al.: Penn Discourse Treebank Version 2.0. Data/software (2008). LDC2008T05
Prasad, R., et al.: The Penn Discourse Treebank 2.0 Annotation Manual. Technical Report IRCS-08-01. Institute for Research in Cognitive Science, University of Pennsylvania (2007)
Prasad, R., Webber, B., Lee, A., Joshi, A.: Penn Discourse Treebank Version 3.0. Data/software (2019). LDC2019T05
Rysová, M., et al.: Prague Discourse Treebank 2.0. Data/Software. LINDAT/CLARIN digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University (2016). http://hdl.handle.net/11234/1-1905
Taboada, M., Brooke, J., Stede, M.: Genre-based paragraph classification for sentiment analysis. In: Healey, P., Pieraccini, R., Byron, D., Young, S., Purver, M. (eds.) Proceedings of the SIGDIAL 2009 Conference. The 10th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pp. 62–70. Association for Computational Linguistics, Stroudsburg (2009)
Webber, B.: Genre distinctions for discourse in the Penn TreeBank. In: Su, K.Y., Su, J., Wiebe, J., Li, H. (eds.) Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pp. 674–682. Association for Computational Linguistics, Suntec (2009)
Webber, B., Prasad, R., Lee, A., Joshi, A.: The Penn Discourse Treebank 3.0 Annotation Manual. Technical report, University of Edinburgh (2019)
Webber, B., Stone, M., Joshi, A., Knott, A.: Anaphora and discourse structure. Comput. Linguist. 29(4), 545–587 (2003)
Zeyrek, D., Demirşahin, I., Çallı, A.B.S., Kurfali, M.: Annotating implicit discourse relations in Turkish & the challenge of annotating corrective discourse relations. Oral presentation. In: IPrA Conference 2015, Antwerp, Belgium (2016)
Zikánová, Š., Synková, P., Mírovský, J.: Enriched Discourse Annotation of PDiT Subset 1.0 (PDiT-EDA 1.0). Data/Software. LINDAT/CLARIN digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University (2018). http://hdl.handle.net/11234/1-2906
Acknowledgments
This work has been supported by the project “Implicit relations in text coherence” (GA17-03461S) of the Czech Science Foundation. The research team has been using language resources and tools distributed by the LINDAT/CLARIN project of the Ministry of Education, Youth and Sports of the Czech Republic (projects LM2015071 and OP VVV VI CZ.02.1.01/0.0/0.0/16_013/0001781).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Zikánová, Š., Mírovský, J., Synková, P. (2019). Explicit and Implicit Discourse Relations in the Prague Discourse Treebank. In: Ekštein, K. (eds) Text, Speech, and Dialogue. TSD 2019. Lecture Notes in Computer Science, vol 11697. Springer, Cham. https://doi.org/10.1007/978-3-030-27947-9_20
DOI: https://doi.org/10.1007/978-3-030-27947-9_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27946-2
Online ISBN: 978-3-030-27947-9
eBook Packages: Computer Science (R0)