Smart Big Data: Use of Slovenian Parliamentary Papers in Digital History

Authors

  • Andrej Pančur, Inštitut za novejšo zgodovino / Institute of Contemporary History
  • Mojca Šorn, Inštitut za novejšo zgodovino / Institute of Contemporary History

DOI:

https://doi.org/10.51663/pnz.56.3.09

Keywords:

digital humanities, digital history, Slovenia, parliament

Abstract

The paper calls attention to the massive amounts of digital historical sources that researchers of contemporary history will eventually face. Slovenian parliamentary papers are then presented in detail as an example of smart big data. The authors believe that historians will be unable to process such massive amounts of digital material using only standard historiographical methods and will have to adopt methods and tools developed by digital history, digital humanities and language technologies.

References

Benardou, Agiatis, Alastair Dunning, Martin Schaller and Nephelie Chatzi. Research Themes for Aggregating Digital Content: Parliamentary Papers in Europe. Europeana Cloud, 2015. Accessed September 28, 2016. http://pro.europeana.eu/files/Europeana_Professional/Projects/Project_list/Europeana_Cloud/WP1%20Research%20Needs/research-themes-for-aggregating-digital-content-parliamentary-papers.pdf.

Cvelfar, Bojan, Tatjana Hajtnik, Miroslav Novak, Nada Čibej, and Drago Trpin. Strategija in izvedbeni načrt razvoja slovenskega elektronskega arhiva 2016 – 2020. Ljubljana: Arhiv Republike Slovenije, 2016.

McCallum, Andrew Kachites. MALLET: A Machine Learning for Language Toolkit. 2002. Accessed July 19, 2016. http://mallet.cs.umass.edu/.

Collins, Sandra, Natalie Harrower, Dag Trygve Truslew Haug, Beat Immenhauser, Gerhard Lauer, Tito Orlandi, Laurent Romary and Eveline Wandl-Vogt. ALLEA E-Humanities Working Group Report: Going Digital: Creating Change in the Humanities. Berlin: All European Academies, 2015. Accessed July 19, 2016. http://www.allea.org/Content/ALLEA/WG%20E%20Humanities/Going%20Digital_digital%20version.pdf.

Deutsche Forschungsgemeinschaft. DFG Practical Guidelines on Digitisation. Bonn: Deutsche Forschungsgemeinschaft, 2013. Accessed July 19, 2016. http://www.dfg.de/formulare/12_151/12_151_en.pdf.

Elliott, Devon, Robert MacDougall and William J. Turkel. “New Old Things: Fabrication, Physical Computing, and Experiment in Historical Practice.” Canadian Journal of Communication 37 (2012): 122. Accessed July 19, 2016. http://www.cjc-online.ca/index.php/journal/article/view/2506.

Erjavec, Tomaž. “Korpusi in konkordančniki na strežniku nl.ijs.si.” Slovenščina 2.0 1, no. 1 (2013): 24-49. Accessed July 19, 2016. http://slovenscina2.0.trojina.si/arhiv/2013/1/Slo2.0_2013_1_03.pdf.

Erjavec, Tomaž, Jan Jona Javoršek and Simon Krek. “Raziskovalna infrastruktura CLARIN.SI.” In Proceedings of the 17th International Multiconference Information Society – IS 2014: Language Technologies, edited by Tomaž Erjavec and Jerneja Žganec Gros, 19-24. Ljubljana: IJS, 2014. Accessed September 30, 2016. http://nl.ijs.si/isjt14/proceedings/isjt2014_03.pdf.

Fokkens, Antske, Serge ter Braake, Niels Ockeloen, Piek Vossen, Susan Legêne and Guus Schreiber. “BiographyNet: Methodological issues when NLP supports historical research.” In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), edited by Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk and Stelios Piperidis, 3728-3735. Reykjavik: European Language Resources Association (ELRA), 2014. Accessed September 30, 2016. http://www.lrec-conf.org/proceedings/lrec2014/pdf/1103_Paper.pdf.

Gašparič, Jure. Državni zbor 1992 – 2012: O slovenskem parlamentarizmu. Ljubljana: Inštitut za novejšo zgodovino, 2012.

Gašparič, Jure. “Pisati politično zgodovino Republike Slovenije.” In Četrt stoletja Republike Slovenije – izzivi, dileme, pričakovanja, edited by Jure Gašparič and Mojca Šorn, 27-37. Ljubljana: Inštitut za novejšo zgodovino, 2016.

Gašparič, Jure. “Slovenian Socialist Parliament on the Eve of the Dissolution of the Yugoslav Federation: A feeble ‘ratification body’ or important political decision-maker?” Prispevki za novejšo zgodovino 55, no. 3 (2015): 41-59. Accessed July 19, 2016. http://ojs.inz.si/pnz/article/view/123.

Gašparič, Jure. Slovenski parlament: politično-zgodovinski pregled od začetka prvega do konca šestega mandata (1992-2014). Ljubljana: Inštitut za novejšo zgodovino, 2014. Accessed July 19, 2016. http://hdl.handle.net/11686/26950.

Graham, Shawn, Scott Weingart and Ian Milligan. “Getting Started with Topic Modeling and MALLET.” Programming Historian (2 September 2012). Accessed July 19, 2016. http://programminghistorian.org/lessons/topic-modeling-and-mallet.

Graham, Shawn, Ian Milligan and Scott Weingart. Exploring Big Historical Data: The Historian’s Macroscope. London: Imperial College Press, 2015. Accessed September 28, 2016. http://www.themacroscope.org.

Haber, Peter. “Zeitgeschichte und Digital Humanities.” In Zeitgeschichte – Konzepte und Methoden, edited by Frank Bösch and Jürgen Danyel, 47-66. Göttingen: Vandenhoeck & Ruprecht, 2012. Accessed July 19, 2016. http://dx.doi.org/10.14765/zzf.dok.2.269.v1.

Hajtnik, Tatjana. “Strategija razvoja slovenskega javnega elektronskega arhiva ‘e-ARH.si’.” Knjižnica 55, no. 1 (2011): 39-56. http://revija-knjiznica.zbds-zveza.si/Izvodi/K1101/Hajtnik.pdf.

Jakubíček, Miloš and Vojtěch Kovář. “CzechParl: Corpus of Stenographic Protocols from Czech Parliament.” In Proceedings of Recent Advances in Slavonic Natural Language Processing, RASLAN 2010, edited by Petr Sojka and Aleš Horák, 41-46. Tribun EU, 2010. Accessed July 19, 2016. http://www.muni.cz/research/publications/914313.

Ljubešić, Nikola, Marija Stupar, Tereza Jurić and Željko Agić. “Combining Available Datasets for Building Named Entity Recognition Models of Croatian and Slovene.” Slovenščina 2.0 1, no. 2 (2013): 35-57.

Martin-Dancausa, Carlos and Maarten Marx. “Parliamentary documents from Spain.” In Proceedings of the International Conference on Language Resources and Evaluation, edited by Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner and Daniel Tapias. Valletta: LREC, 2010. Accessed July 19, 2016. https://www.researchgate.net/publication/239585911.

Marx, Maarten. “Advanced Information Access to Parliamentary Debates.” Journal of Digital Information 10, no. 6 (2009): 1-11. Accessed July 19, 2016. https://journals.tdl.org/jodi/index.php/jodi/article/view/668.

Marx, Maarten and Anne Schuth. “DutchParl: The Parliamentary Documents in Dutch.” In Proceedings of the International Conference on Language Resources and Evaluation, edited by Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner and Daniel Tapias, 3670-3677. Valletta: LREC, 2010. Accessed July 19, 2016. http://www.lrec-conf.org/proceedings/lrec2010/pdf/263_Paper.pdf.

Nanni, Federico, Hiram Kümper and Simone Paolo Ponzetto. “Semi-Supervised Textual Analysis and Historical Research Helping Each Other: Some Thoughts and Observations.” International Journal of Humanities and Arts Computing 10, no. 1 (2016): 63-77. Accessed July 19, 2016. http://dx.doi.org/10.3366/ijhac.2016.0160.

Nicholson, Bob. “The Digital Turn: Exploring the methodological possibilities of digital newspaper archives.” Media History 19, no. 1 (2013): 59-73. Accessed September 30, 2016. http://dx.doi.org/10.1080/13688804.2012.752963.

Ogrodniczuk, Maciej. “The Polish Sejm Corpus.” In LREC 2012, Eighth International Conference on Language Resources and Evaluation, edited by Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk and Stelios Piperidis, 2219-2223. Istanbul, 2012. Accessed July 19, 2016. http://www.lrec-conf.org/proceedings/lrec2012/pdf/653_Paper.pdf.

Pančur, Andrej, Mojca Šorn and Tomaž Erjavec. Slovenian parliamentary corpus SlovParl 1.0 (2016). Distributed by Slovenian language resource repository CLARIN.SI. http://hdl.handle.net/11356/1075.

Pančur, Andrej. “Označevanje zbirke zapisnikov sej slovenskega parlamenta s smernicami TEI [Encoding the Slovenian Parliament Session Minutes in Line with the TEI Guidelines].” In Zbornik konference Jezikovne tehnologije in digitalna humanistika [Proceedings of the Conference on Language Technologies & Digital Humanities], edited by Tomaž Erjavec and Darja Fišer, 142-148. Ljubljana: Znanstvena založba Filozofske fakultete v Ljubljani, 2016. Accessed October 5, 2016. http://nl.ijs.si/isjt16/proceedings-en.html.

Pesek, Rosvita. Osamosvojitev Slovenije. Ljubljana: Nova revija, 2007.

Piersma, Hinke and Kees Ribbens. “Digital Historical Research: Context, Concept and the Need for Reflection.” BMGN – Low Countries Historical Review 128, no. 4 (2013): 78-102. Accessed September 30, 2016,

Piersma, Hinke, Ismee Tames, Lars Buitinck and Maarten Marx. “War in Parliament: What a Digital Approach Can Add to the Study of Parliamentary History.” DHQ: Digital Humanities Quarterly 8, no. 1 (2014). Accessed September 30, 2016. http://www.digitalhumanities.org/dhq/vol/8/1/000176/000176.html.

Robertson, Stephen. “The Differences between Digital Humanities and Digital History.” In Debates in Digital Humanities 2016, edited by Matthew K. Gold and Lauren F. Klein. Minneapolis, London: University of Minnesota Press, 2016. Accessed September 25, 2016. http://dhdebates.gc.cuny.edu/debates/text/76.

Rosenzweig, Roy. “Scarcity or Abundance? Preserving the Past in a Digital Era.” American Historical Review 108, no. 3 (2003): 735-762.

Schöch, Christof. “Big? Smart? Clean? Messy? Data in the Humanities.” Journal of Digital Humanities 2, no. 3 (2013). Accessed September 25, 2016. http://journalofdigitalhumanities.org/2-3/big-smart-clean-messy-data-in-the-humanities/.

Spiro, Lisa. “Access, Explore, Converse: The Impact (and Potential Impact) of the Digital Humanities on Scholarship.” In Keys for architectural history research in the digital era, edited by Juliette Hueber and Antonio Mendes da Silva. 2014. Accessed September 25, 2016. https://inha.revues.org/4925.

TEI Consortium. TEI P5: Guidelines for Electronic Text Encoding and Interchange, Text Encoding Initiative Consortium, 2016. Accessed July 19, 2016. http://www.tei-c.org/Guidelines/P5/.

Zaagsma, Gerben. “On Digital History.” BMGN – Low Countries Historical Review 128, no. 4 (2013): 3-29. http://www.bmgn-lchr.nl/articles/10.18352/bmgn-lchr.9344/.

Published

2016-12-05
