DASSA’2025

Call for Papers

Workshop

Digital Avenues for Low-Resource Languages of Sub-Saharan Africa (DASSA’2025)
27-28 May 2025

Organized by the AI4AfricanLanguages initiative (Leveraging AI for African Low-Resource Languages to Enhance Crises Monitoring) and supported by the French National Research Agency.

LOCATION

Conference room of the CNS-RIC-DSU of MINESUP, located between the Ecole Nationale Supérieure Polytechnique de Yaoundé (ENSPY) and the Centre International de Référence Chantal Biya (CRIRB) in Melen.

ZOOM LINK

https://zoom.us/j/99757156634?pwd=LrFgsKKjMDf5xJEMXibITvlDxUb2ou.1

Meeting ID: 997 5715 6634
Passcode: 702575

PROGRAM

27 May 2025

8:30 – Welcome of participants

9:00 – Introduction

9:15 – Keynote: Laurence NGOUMAMBA (Université de Yaoundé I, Cameroon), Mechanism of terminology development in African languages (in person)

10:15 – Coffee break

10:30 – Richard Bertrand ETABA ONANA (Ecole Supérieure des Sciences et Techniques de l’Information et de la Communication, Cameroon), Cameroonian digital medical lexicon, local languages (Fulfulde and/or Ewondo)/official languages: procedural study of data collection (in person)

11:00 – Ibrahima NDAO (Assane Seck University of Ziguinchor, Senegal), AbuseBERT-WoFr: refined BERT model for detecting abusive messages on tweets mixing Wolof-French codes (remote)

11:30 – Charnelle YANHAMO KAPAWOU (Université de Yaoundé I, Cameroon), Diversity Text Generation via Adversarial Network in low-resource languages (in person)

12:30 – Lunch

14:00 – Keynote: Francesca FAGANDINI (Cirad, Senegal), Linguistic issues and the translation of knowledge in public health (in person)

15:00 – Keynote: Jean Romain KOUESSO (Université de Dschang, Cameroon), Natural language processing for under-documented languages: observations toward a convergence approach (in person)

16:00 – Coffee break

16:30 – 17:45: Round table “AI and low-resource languages in Africa: which data, and for what purpose?”

  • MINEPIA, Animal health data reporting process

  • MINADER, Collecting agricultural data for early warning systems

28 May 2025

8:30 – Welcome of participants

9:15 – Keynote: Élodie GAUTIER (Orange, France), Speech recognition for African languages (remote)

10:30 – Coffee break

11:00 – Go Issa TRAORE (Université Nazi BONI, Burkina Faso), The hybrid CNN+LSTM+SVM based architecture for multilingual speech emotion recognition in low-resource African language using radio data (remote)

11:30 – Dimitri TCHAHEU TCHAHEU (Université de Yaoundé I, Cameroon), A Triphone Hidden Markov Model for Forced Alignment of Nda’ Nda’ Speech (in person)

12:30 – Lunch

14:30 – Saint Germes BENGONO OBIANG (Université de Yaoundé I, Cameroon), Pro-TeVA: Prototype-based Explainable Tone Recognition for Low-Resource Language (in person)

15:00 – Keynote: MINEPIA (Cameroon), Community-based surveillance in animal health (in person)

15:30 – Keynote: IRAD (Cameroon), Protocols for including farmers in plant disease surveillance (in person)

16:00 – Coffee break

16:30 – 17:45: Round table “AI and low-resource languages in Africa: what methodological challenges in the era of Big Data?”

17:45 – Closing remarks and outlook

ACCEPTED PAPERS

  1. Bengono Obiang Saint Germes Bienvenu, Paulin Melatagia Yonta, Norbert Tsopze, Tania Jimenez, Jean-Francois Bonastre and Farida Nchare. Pro-TeVA: Prototype-based Explainable Tone Recognition for Low-Resource Language
  2. Etaba Onana Richard Bertrand. Cameroonian digital medical lexicon, local languages (Fulfulde and/or Ewondo)/official languages: procedural study of data collection
  3. Go Issa Traore and Borlli Michel Jonas Some. The hybrid CNN+LSTM+SVM based architecture for multilingual speech emotion recognition in low-resource African language using radio data
  4. Ibrahima Ndao, Khadim Dramé, Gorgoumack Sambe and Gayo Diallo. AbuseBERT-WoFr: refined BERT model for detecting abusive messages on tweets mixing Wolof-French codes
  5. Charnelle Yanhamo Kapawou and Norbert Tsopze. Diversity Text Generation via Adversarial Network in low-resource languages
  6. Tchaheu Tchaheu Dimitri, Kana Azeuko Sherelle and Melatagia Yonta Paulin. A Triphone Hidden Markov Model for Forced Alignment of Nda’ Nda’ Speech

SCOPE 

Natural language processing has made significant progress thanks to artificial intelligence, particularly with the emergence of large language models (LLMs). However, these advances mainly benefit dominant languages such as English and French. For more than 6,000 languages around the world, the lack of usable data remains an obstacle to training models.

The performance of LLMs strongly depends on the volume, quality, and representativeness of training data, elements that are often insufficient for under-resourced languages such as the languages of sub-Saharan Africa, many of which are mainly oral. This data deficit poses challenges such as decreased model accuracy, overfitting to scarce resources, and difficulty integrating these languages into generic or specialized applications (e.g., media-based monitoring). The challenge is even greater for voice systems, as radio remains a primary medium in many sub-Saharan African regions.

This workshop aims to explore the intersections between linguistics, data, learning, and models to address the challenges of under-representation of sub-Saharan languages.

TOPICS OF INTEREST

The workshop will address the various stages involved in processing text and speech data (including collection, pre-processing, and learning) for under-resourced sub-Saharan African languages. Contributions are welcome from disciplines such as data mining, linguistics, machine learning, AI, and data science.

The main topics of interest include:

  • Data collection and curation in low-resource languages
  • Evaluation and impact of biases introduced during collection and annotation
  • Integration of multi-modal data (text and audio)
  • Learning in a context of noisy and low-resource data
  • Frugal learning (learning resources, computational resources)
  • Explainability of machine learning approaches
  • Contributions and limitations of LLMs
  • Speech recognition and synthesis
  • Model evaluation and benchmarking

All innovative research addressing these themes is welcomed, irrespective of the field of application (e.g., health, agriculture, social sciences and humanities, digital humanities, etc.).

SUBMISSIONS

Every accepted submission must have at least one author registered for the workshop. All submitted extended abstracts must be written in English and follow the LNCS format (guidelines) with a 6-page limit (including title page, figures, references, and optional appendix).

Submissions should be sent electronically in PDF format via EasyChair: submission link.
Oral presentations may be given in French or English.

IMPORTANT DATES

  • Submission deadline: April 11, 2025
  • Notification to authors: April 28, 2025
  • Workshop: May 27-28, 2025

PARTICIPATION

The workshop will be held in Yaoundé, Cameroon as a hybrid event, combining both in-person and online participation.

WORKSHOP CHAIRS

  • Paulin Melatagia (Université de Yaoundé I, Cameroon) – paulinyonta @ gmail.com
  • Sarah Valentin (UMR TETIS, Cirad, France) – sarah.valentin @ cirad.fr
  • Damien Nouvel (INALCO, France) – damien.nouvel @ inalco.fr