Mathieu Avanzi1, 2, Anne Catherine Simon3, Jean-Philippe Goldman3, 4 & Antoine Auchlin4

1Neuchâtel, Switzerland; 2Paris Ouest, France; 3Louvain-la-Neuve, Belgium; 4Genève, Switzerland
C-PROM. An annotated Corpus for French Prominence Studies.

 

This paper presents C-PROM, an annotated corpus for French prominence studies. The corpus covers several regional varieties of French (Belgian, Swiss and metropolitan French) and various discourse genres (from oral reading to spontaneous conversation), for a total duration of 70 minutes, and was annotated by two phonetics experts. The two experts in charge of the coding followed a strict protocol, which takes into account both the problems encountered in prior research on prominence detection in French and elements of the methodology followed by scholars working on other languages. We conclude by discussing the average consistency between the two transcribers. The results are encouraging: the F-measure between the two annotators reaches 82.8% and the kappa score 0.86.
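For readers unfamiliar with the two agreement figures quoted above, the following is a minimal sketch of how an F-measure and Cohen's kappa can be computed from two binary annotations (1 = syllable marked prominent, 0 = not prominent). The function names, the toy label sequences and the choice of Cohen's kappa are assumptions made for illustration; the abstract does not specify the exact computation used for C-PROM, and the toy data are not taken from the corpus.

```python
from typing import Sequence


def f_measure(a: Sequence[int], b: Sequence[int]) -> float:
    """F1 score between two binary annotations of the same syllables.

    One annotator is treated as reference and the other as hypothesis;
    F1 is symmetric, so the choice does not change the result.
    """
    both_prominent = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    total_marked = sum(a) + sum(b)
    # If neither annotator marked anything, they agree trivially.
    return 2 * both_prominent / total_marked if total_marked else 1.0


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two annotators and two categories (0 and 1)."""
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    p_a = sum(a) / n  # proportion marked prominent by annotator A
    p_b = sum(b) / n  # proportion marked prominent by annotator B
    # Agreement expected by chance, given each annotator's marginals.
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected) if expected != 1 else 1.0


# Hypothetical toy annotations for ten syllables (not C-PROM data).
ANNOTATOR_A = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
ANNOTATOR_B = [1, 0, 0, 1, 0, 0, 0, 0, 1, 1]

if __name__ == "__main__":
    print(f"F-measure: {f_measure(ANNOTATOR_A, ANNOTATOR_B):.3f}")
    print(f"Kappa:     {cohen_kappa(ANNOTATOR_A, ANNOTATOR_B):.3f}")
```

The F-measure only counts syllables that at least one annotator marked as prominent, whereas kappa also credits agreement on non-prominent syllables and corrects for chance agreement, which is why the two figures are usually reported together.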