On this page you can test the first text-to-speech synthesis system for the Serbian language that incorporates elements of prosody. The system has the following features:

  • the solution is software-only and does not require any additional hardware
  • a version of the PSOLA algorithm is used for synthesis
  • the system is adaptable to other synthesis algorithms (HN model, HMM)
  • the system is fast and reliable, and can handle 20 lines in real time
  • the implemented prosodic elements (accentuation) contribute significantly to the intelligibility and naturalness of the synthesized speech, while natural sentence intonation is achieved using techniques based on classification and regression trees (CART)
  • the system has many additional features (reading the Cyrillic alphabet, numbers, words without diacritical marks, abbreviations, words of foreign origin, etc.)
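
As an illustration of the kind of text normalisation mentioned in the last item, the following is a minimal sketch of Cyrillic-to-Latin transliteration for Serbian. It is not the system's actual front end: the function name `transliterate` and the table `CYR_TO_LAT` are assumptions made for this example, and only the standard one-to-one mapping between the two Serbian alphabets is used.

```python
# Standard mapping from the 30 Serbian Cyrillic letters to Serbian Latin.
# The names below are illustrative, not part of the described system.
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "е": "e", "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k",
    "л": "l", "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}


def transliterate(text: str) -> str:
    """Replace Serbian Cyrillic letters with Latin ones, keeping other characters."""
    out = []
    for ch in text:
        lower = ch.lower()
        if lower in CYR_TO_LAT:
            lat = CYR_TO_LAT[lower]
            # An uppercase Cyrillic letter becomes a capitalised Latin letter
            # or digraph (Љ -> Lj). All-caps words would need extra handling.
            out.append(lat.capitalize() if ch != lower else lat)
        else:
            out.append(ch)
    return "".join(out)


print(transliterate("Нови Сад"))
```

A real front end would also have to expand numbers and abbreviations and restore missing diacritical marks, which this sketch does not attempt.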

WARNING!
This demonstration is for promotional purposes only. Any commercial use of the sound samples is forbidden.



The first continuous speech recognition system for the Serbian language is based on the recognition of phonemes in context, and it has the following features:

  • the system is speaker independent
  • the solution is software-only and does not require any additional hardware
  • accuracy exceeds 98% on a vocabulary of 500 words (telephone quality)
  • the system gives an estimate of recognition reliability, with a list of alternative recognition results sorted by their likelihood
  • there are wildcard models useful for word-spotting
  • the system is fast and reliable, and can handle up to 50 lines in real time, depending on the grammar
  • several interfaces are supported: C++ library, ActiveX, MS SAPI, IP server, MRCP
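
The reliability estimate and the list of alternatives mentioned above can be pictured with a small sketch. It assumes the recogniser produces a log-likelihood score for each hypothesis and turns those scores into normalised confidences with a softmax. Both the scoring scheme and the names `Hypothesis` and `n_best` are assumptions made for illustration; the actual system may estimate reliability differently.

```python
import math
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    log_likelihood: float  # hypothetical score; higher means more likely


def n_best(hyps, n=3):
    """Return the n most likely hypotheses as (text, confidence) pairs."""
    if not hyps:
        return []
    ranked = sorted(hyps, key=lambda h: h.log_likelihood, reverse=True)
    # Softmax over all hypotheses; subtracting the best score avoids underflow.
    best = ranked[0].log_likelihood
    weights = [math.exp(h.log_likelihood - best) for h in ranked]
    total = sum(weights)
    return [(h.text, w / total) for h, w in zip(ranked[:n], weights)]


results = n_best([
    Hypothesis("Novi Sad", -10.0),
    Hypothesis("Niš", -12.0),
    Hypothesis("Novi Pazar", -11.0),
])
for text, confidence in results:
    print(f"{text}: {confidence:.2f}")
```

An application could, for example, ask the caller to confirm the result whenever the confidence of the first hypothesis falls below a chosen threshold.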

If you wish to test this system, you can do so at the telephone number we have provided for that purpose. The entire demonstration is, however, in Serbian, and we strongly suggest that the testing be carried out by a Serbian-speaking person. The demo application can be activated by calling +381 21 4750204 (option 5). It is organised as an IVR system that lets callers choose options using their voice only.

At the beginning, the caller is asked to choose one of the following sets of words (so-called grammars): CIFRE (digits), DATUMI (dates), GRADOVI (cities), IMENA (names), IZNOSI (amounts), PIĆA (drinks) or CENTRALA (exchange). Within each grammar there are some simple rules the caller must follow:

  • CIFRE - the caller is expected to say a digit or a sequence of digits;
  • DATUMI - the caller is expected to say a day of the month and a month, with or without the day of the week at the beginning, e.g. "Ponedeljak, trinaesti avgust" (Monday, the 13th of August);
  • GRADOVI - the caller is expected to say the name of any city from the list;
  • IMENA - the caller is expected to say any combination of a first name and a last name from the list;
  • CENTRALA - this dialogue is somewhat more complex, but has great potential for practical use. The caller may begin with any of the introductory phrases, e.g. "Dobar dan, dajte mi" ("Good morning/afternoon, give me"), but does not have to. Then, within the same sentence, the caller says what he/she actually wants. This can be an extension, e.g. "Lokal 2 3 4" ("Extension 2 3 4"), a service, e.g. "Vesti" ("News"), or the first and last name of a person, e.g. "Vlado Delić". A complete sentence could look like this: "Dobro veče, trebao bi mi Milan Sečujski" ("Good evening, I'd like to speak to Milan Sečujski"), but could also be as short as "Račun" ("Account"). The introductory phrases, as well as the available services and persons, are given in the list.
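
The structure of the CENTRALA grammar (an optional introductory phrase followed by an extension, a service or a person) can be sketched as a small parser over the recognised text. This is only an illustration under stated assumptions: the word lists contain just the examples quoted above, not the full lists used by the demo, and the function name `parse_centrala` is hypothetical. The real system works on speech with its own grammar format rather than on text.

```python
import re

# Illustrative subsets taken from the examples in the text; the demo's
# real lists of phrases, services and persons are longer.
INTROS = ["dobar dan dajte mi", "dobro veče trebao bi mi"]
SERVICES = {"vesti", "račun"}
PERSONS = {"vlado delić", "milan sečujski"}


def parse_centrala(utterance: str):
    """Return (kind, value) for a recognised request, or None if nothing matches."""
    # Normalise: lower-case, drop commas, collapse whitespace.
    text = " ".join(utterance.lower().replace(",", " ").split())
    # The introductory phrase is optional.
    for intro in INTROS:
        if text.startswith(intro):
            text = text[len(intro):].strip()
            break
    # "lokal" followed by one or more single digits, e.g. "lokal 2 3 4".
    match = re.fullmatch(r"lokal((?: \d)+)", text)
    if match:
        return ("extension", match.group(1).replace(" ", ""))
    if text in SERVICES:
        return ("service", text)
    if text in PERSONS:
        return ("person", text)
    return None


print(parse_centrala("Dobro veče, trebao bi mi Milan Sečujski"))
print(parse_centrala("Lokal 2 3 4"))
```

Keeping the introductory phrase optional is what allows both the full sentence and the one-word request "Račun" to be accepted by the same rule.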

Each time, the system tells the caller what it has recognised and leads the caller through the dialogue. At any moment the caller can say POMOĆ ("Help") to get additional information, NAZAD ("Back") to return to the previous menu, or PROMENA GRAMATIKE ("Change of grammar") to return to the main menu and select another grammar (set of words). The caller is advised to speak as intelligibly as possible and to avoid calling from a noisy environment. A more detailed scheme of the dialogue, as well as the list of words in each grammar, can be found here; the caller is advised to study it before calling.
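
The three commands that are available at any point in the dialogue can be sketched as a small menu stack. This is a simplified model, not the demo's actual dialogue logic: the class name `Dialogue`, the `MAIN` menu label and the returned strings are assumptions made for this example.

```python
GRAMMARS = ["CIFRE", "DATUMI", "GRADOVI", "IMENA", "IZNOSI", "PIĆA", "CENTRALA"]


class Dialogue:
    """Tracks the current menu and reacts to the always-available commands."""

    def __init__(self):
        self.stack = ["MAIN"]  # hypothetical label for the main menu

    def handle(self, recognised: str) -> str:
        word = recognised.strip().upper()
        current = self.stack[-1]
        if word == "POMOĆ":  # help for the current menu
            return f"help for {current}"
        if word == "NAZAD":  # back to the previous menu, if there is one
            if len(self.stack) > 1:
                self.stack.pop()
            return f"menu {self.stack[-1]}"
        if word == "PROMENA GRAMATIKE":  # straight back to the main menu
            self.stack = ["MAIN"]
            return "menu MAIN"
        if current == "MAIN" and word in GRAMMARS:
            self.stack.append(word)
            return f"menu {word}"
        # Anything else is treated as an utterance within the current grammar.
        return f"recognised {word} in {current}"


dialogue = Dialogue()
for said in ["GRADOVI", "POMOĆ", "Novi Sad", "PROMENA GRAMATIKE"]:
    print(dialogue.handle(said))
```

Checking the global commands before the grammar-specific words mirrors the behaviour described above, where POMOĆ, NAZAD and PROMENA GRAMATIKE work regardless of the selected grammar.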

This is only a demo application; the sets of words, as well as the dialogue itself, can easily be modified and adapted to the needs of a specific application. We hope you enjoy testing our demos, and we invite you to send your suggestions and comments to the address given on the Contact page.