A Mixed-principle Rule-based Approach to the Automatic Syllabification of Serbian

  • Aniko Kovač, Faculty of Philosophy, University of Novi Sad
  • Maja Marković, Faculty of Philosophy, University of Novi Sad
Keywords: syllable, rule-based approach, sonority, computational linguistics, phonology

Abstract

In this paper, we present a mixed-principle rule-based approach to the automatic syllabification of Serbian that combines prescriptive rules from traditional grammar with the Sonority Sequencing Principle. We explore the problems and limitations of the existing rule set and of sonority-based approaches, introduce an algorithm that draws on both with the aim of producing a more accurate segmentation of words into syllables, one better aligned with native speakers' intuitions, and present statistical data on the distribution of syllables and their structure in Serbian.

Published
2019-06-08