Parlameter – a Corpus of Contemporary Slovene Parliamentary Proceedings

  • Darja Fišer, Department of Translation, Faculty of Arts, University of Ljubljana
  • Nikola Ljubešić, Jožef Stefan Institute
  • Tomaž Erjavec, Department of Knowledge Technologies, Jožef Stefan Institute
Keywords: parliamentary proceedings, corpus construction, language technology, corpus analysis


The paper presents the Parlameter corpus of contemporary Slovene parliamentary proceedings, which covers the seventh mandate of the Slovene Parliament (2014–2018). The corpus offers rich speaker metadata (gender, age, education, party affiliation) and is linguistically annotated (lemmatization, tagging), which supports research in several digital humanities and social science disciplines. We demonstrate the potential of corpus analysis techniques for investigating political debates. The corpus architecture allows for regular extensions with additional Slovene data, as well as with data from other parliaments, starting with Croatian.


