The name AlfaNum stands for a group of projects in the areas of automatic speech recognition (ASR), speaker identification and verification, and automatic text-to-speech conversion (TTS) in the Serbian language. All these problems are multidisciplinary and require knowledge of numerous areas such as acoustics, phonetics and linguistics, as well as mathematics, telecommunications, signal processing and programming, and their high language dependency makes them even more challenging. The AlfaNum project brings together a group of teachers and teaching assistants at the Telecommunications and Signal Processing Chair of the Faculty of Technical Sciences (also known as the Faculty of Engineering), University of Novi Sad, Serbia and Montenegro. The head of the team is professor Vlado Delić, PhD, and the team consists of about ten active and as many occasional associates. Not all of them could be employed at the Faculty as teaching assistants, so the company "AlfaNum Ltd." was founded in 2003 to employ those involved in the project who could not find a position at the Faculty. We invite you to visit the company's Web page for a demonstration of some of our achievements. The page is available only in Serbian at the moment. As a scientific research team with impressive results, AlfaNum is a member of the Centre for Vibroacoustics and Signal Processing (CEVAS), the first Centre of Excellence at the Faculty, which is currently being accredited by the Ministry of Education and Technological Development of the Republic of Serbia.

Speech recognition and synthesis no longer require additional hardware, as was the case some time ago. Personal computers and mobile phones have become sufficiently fast, reliable, and cheap at the same time. Nevertheless, our market is rather closed and small, and the particularities of the Serbian language are significant. That is why we set out to develop speech technologies on our own. More about the results of our efforts can be found here.

All computer telephony integration (CTI) applications that include ASR and TTS in the Serbian language are our accomplishment. Today there are several hundred telephone lines with PCs receiving calls in Serbia and other republics of the former Yugoslavia. They provide the required information and services to callers through spoken human-machine communication in the Serbian language.

In the course of research and software development for speech recognition and synthesis, we have developed appropriate audio-visual tools and a software environment for CTI development and application. We have recorded and classified various speech databases in the Serbian language, intended for both speech recognition and synthesis. We have also initiated several humanitarian projects aimed at promoting computer literacy among the visually impaired, which was not possible until a high-quality text-to-speech system for the Serbian language appeared.

Speech is the basic means of communication between humans. One can convey their thoughts and feelings to another in a way much more intricate than in any other animal species, and thus the human speech system is the most complicated one and comprises a number of organs - from lungs, trachea (windpipe), larynx and vocal folds, to oral cavity with tongue, teeth and lips, and nasal cavity.

Speech considered as a sound signal contains a multitude of information. Besides what has been said, it includes information about the speaker that reveals their emotional state, the identity of a known speaker, or the gender and age of an unknown one. We understand the meaning and perceive the speaker's dialect, education level and culture.

We understand what has been said by relying on our knowledge of the language and on context. Thus, segmentation of the sequence of sounds that we hear is possible only if we are familiar with the language. Speech perception is, therefore, not an inherited but a learned ability. Furthermore, one can focus on a particular speaker among many, estimate the position of the speech source, and often understand things that have not actually been said, but rather implied.

   

Automatic speech recognition (ASR) is considered one of the greatest technical challenges of today and has attracted the attention of many researchers worldwide for more than half a century. The aim of automatic speech recognition is to produce textual output from a sound recording of an utterance (a word or a sentence). Speech is thus converted to text, that is, the system "recognizes" what the speaker has said.

Automatic speech recognition is based on extracting particular features from the speech signal and comparing them to previously created reference feature values or models. A number of questions emerge at once: which features are relevant for speech recognition, how to extract them from speech signals, how to prepare reference feature values or models, and how to compare the extracted features with the reference ones. Speech variability complicates the matter further, since no two speakers pronounce a word in the same way, nor does a single speaker pronounce it in the same way twice. That is why speech recognition is not possible without models such as Hidden Markov Models (HMM) and/or artificial neural networks.
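The idea of comparing extracted features with reference ones can be illustrated with a minimal sketch. It assumes that every utterance has already been reduced to a sequence of feature vectors, and it uses dynamic time warping (DTW), a classic template-matching technique, instead of the HMMs or neural networks mentioned above. The function names and the toy feature values are hypothetical and serve only as an illustration.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Dynamic time warping cost between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(features, templates):
    """Return the vocabulary word whose reference template is closest."""
    return min(templates, key=lambda word: dtw_distance(features, templates[word]))


# Toy one-dimensional "features" (an assumption for illustration only);
# real systems use multi-dimensional spectral features.
TEMPLATES = {
    "da": [[1.0], [3.0], [1.0]],
    "ne": [[5.0], [5.0], [2.0]],
}
```

Because DTW stretches and compresses the time axis, a slowly spoken utterance such as `[[1.0], [1.0], [3.0], [3.0], [1.0]]` still matches the shorter template for "da". This addresses only the variability in speaking rate; the statistical models named above are needed to cope with variability between speakers.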

Automatic speech recognition is carried out in two phases: training (off-line) and recognition (on-line). Training is performed using appropriate speech datasets. If an ASR system has to be trained to recognize one particular person, it is a speaker-dependent system. If, on the other hand, it can successfully recognize a multitude of speakers whose voices have not been used for training, it is a speaker-independent system.

There are ASR systems that recognize only isolated words and those that can recognize connected words as well. In both cases the words must come from a predefined set of words called a vocabulary. Vocabulary size influences the recognition error rate - the larger the vocabulary, the higher the error rate. Another problem occurs when two words in the vocabulary have very similar pronunciations, but that can usually be avoided at the application level.
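One way an application designer might spot words with very similar pronunciations is to compare their phoneme sequences before the vocabulary is fixed. The sketch below is only an illustration under stated assumptions: pronunciations are given as lists of phoneme symbols, similarity is measured by plain edit distance, and the vocabulary and the threshold are hypothetical.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between two sequences of phoneme symbols."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution
        prev = curr
    return prev[-1]


def confusable_pairs(vocabulary, max_distance=1):
    """List word pairs whose pronunciations differ by at most max_distance phonemes."""
    return [
        (w1, w2)
        for (w1, p1), (w2, p2) in combinations(sorted(vocabulary.items()), 2)
        if edit_distance(p1, p2) <= max_distance
    ]


# Hypothetical command vocabulary: word -> phoneme sequence.
VOCABULARY = {
    "start": ["s", "t", "a", "r", "t"],
    "stop": ["s", "t", "o", "p"],
    "top": ["t", "o", "p"],
}
```

Here `confusable_pairs(VOCABULARY)` flags the pair "stop" and "top", so the designer could replace one of the two commands with a more distinct word.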

   
Text-to-Speech Synthesis (TTS) is the oldest speech technology, originating as early as the 18th century, when the first "speaking machines" appeared. Since then, this area has developed tremendously, mostly owing to advances in computer technology during the last decades. It is the speech technology with the highest language dependency, and solutions developed for one language cannot be used for others. An adaptation is possible (but still painstaking) only in the case of extremely similar languages.

The aim of speech synthesis is to generate intelligible speech based on textual input. Intelligibility implies a certain level of naturalness, achieved by manipulating lexical and sentence intonation as well as phonetic content, in much the same way humans do. Naturalness of synthesized speech is not only a matter of aesthetics; it is an important element that makes synthesized speech easier to understand, helping the listener to separate it into words.

In order to make synthesized speech sound natural, a TTS system has to know how to manipulate the intonation as well as which phonemes to pronounce. One of the biggest problems of speech synthesis is that none of this information is explicitly present in the text, so the system has to deduce it in one way or another. Speech synthesizers for languages with relatively free accentuation have to rely on massive accentuation and morphological dictionaries, as well as on modules for syntactic analysis of the sentence.

Speech signal generation can be carried out in various ways, but owing to the powerful processors and ample memory of modern computers, the most popular and by far the simplest way to do it is by concatenation of prerecorded speech segments. Much higher quality can be achieved if the segments are chosen at runtime. There are, again, some questions to answer: how to choose appropriate segments from a database that could contain tens of thousands of words, what type of segments to choose, and how to modify them in order to make their acoustic features resemble the desired ones and to make the transitions between segments as inaudible as possible. Numerous speech signal synthesis techniques answer these questions, the most important being the TD-PSOLA method in the time domain and the hybrid H/N speech synthesis model in the frequency domain.
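Choosing segments at runtime is often framed as a search for the sequence that minimizes two kinds of cost: a target cost (how far a candidate is from the desired acoustic features) and a join cost (how audible the transition between neighbouring segments is). The following is a minimal sketch of that search, not a description of the AlfaNum synthesizer. It assumes that each candidate segment is described only by its pitch at the start and at the end, and all class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str           # label of the recorded segment (hypothetical)
    start_pitch: float  # pitch in Hz at the segment start
    end_pitch: float    # pitch in Hz at the segment end


def target_cost(candidate, desired_pitch):
    """Mismatch between the candidate's mean pitch and the desired pitch."""
    mean = (candidate.start_pitch + candidate.end_pitch) / 2
    return abs(mean - desired_pitch)


def join_cost(left, right):
    """Pitch discontinuity at the boundary between two segments."""
    return abs(left.end_pitch - right.start_pitch)


def select_units(candidates_per_slot, desired_pitches):
    """Dynamic-programming search for the cheapest segment sequence."""
    # best maps each candidate of the current slot to (total cost, path so far)
    best = {c: (target_cost(c, desired_pitches[0]), [c]) for c in candidates_per_slot[0]}
    for slot, desired in zip(candidates_per_slot[1:], desired_pitches[1:]):
        new_best = {}
        for c in slot:
            prev_cost, prev_path = min(
                ((cost + join_cost(p, c), path) for p, (cost, path) in best.items()),
                key=lambda item: item[0],
            )
            new_best[c] = (prev_cost + target_cost(c, desired), prev_path + [c])
        best = new_best
    total, path = min(best.values(), key=lambda item: item[0])
    return [c.name for c in path], total
```

A real system would use richer features (duration, energy, spectral shape) and would still modify the chosen segments, for example with TD-PSOLA, but the structure of the search stays the same.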

There are several points that should be considered when discussing speech technology applications in the Serbian language:

• It will soon be quite common for us to talk to computers and various devices around us, and it would be a great problem if we could not do so in our native language.
• Speech communication between humans and machines is possible only in languages for which automatic speech recognition and speech synthesis have been developed. The European Union therefore has a need for multilingual products and services.
• The dependence of speech technologies on language is extreme, so solutions developed for one language cannot be applied to another. An adaptation is possible only in the case of very similar languages (easier said than done!).
• Our market is relatively small and closed, so it would be unrealistic to expect any of the world leaders in speech technologies to tackle all of the problems of the Serbian language. The AlfaNum project therefore also represents a stimulus for the preservation of the Serbian language.
• Pentium-class PCs are fast enough today, so no additional expensive hardware is required, and software-only solutions keep the investment at an acceptable level while enabling computers to handle several conversations at a time.
• Within the AlfaNum project, quality solutions for both automatic speech recognition and speech synthesis have been developed and have already been applied.

 

Helping the visually impaired to use PCs is one of the most important and most noble applications of speech synthesis. To enable a blind person to use a computer and a variety of software almost as efficiently as a sighted person, the computer must keep them informed of the contents of the screen. Two software components are required for that. One of them is a "screen reader", an application that converts every single user action into a piece of information such as "this has been activated" or "that has been closed". The other component is a speech synthesizer, which converts this piece of information into intelligible speech, thus enabling the user to work on a computer just like a sighted person, with minimum delay. This gives the visually impaired access to information and enables them to communicate, which in turn enables them to perform various jobs, raises their self-reliance and self-respect, and ultimately raises the quality of their lives.

Issuing voice commands is one of the most popular speech recognition applications. Speaker-dependent systems are generally used here, because they can achieve a somewhat lower error rate. However, the system must first be trained to recognize a particular speaker, which means that the user is required to read a training text aloud before the recognition phase. After that, the system is able to recognize commands issued by the user. This feature is often used to initiate mobile phone calls and to control technological processes, household appliances, educational software and games, and it is of great help to people with disabilities. Similar systems can also manage telephone call routing in public and private networks, but these necessarily rely on speaker-independent recognition.

The most popular commercial application of speech technologies is within Interactive Voice Response systems, which generally make use of both speech technologies - speech synthesis and recognition.

Interactive Voice Response (IVR) systems are computer telephony applications that enable users to access large quantities of data using a plain old telephone, as well as to initiate certain actions, such as making reservations, transaction management etc., without human operator intervention (unless they explicitly say they want it). Speech recognition enables callers to communicate with IVR systems intuitively and efficiently, and to say exactly what they need at once, without having to browse complex menu structures using a telephone keypad. Some ideas for IVR system applications are given below.

Call Centres
Voice Mail

Automated Attendant
Unified Messaging
Fax on Demand
Follow Me - One Number Service
Working Hours Monitoring
Security and Access Control
9x services
98x services
9xxx services
Call Collect Service
Yellow Pages
Catalogue Sale
Traffic Conditions
Technical Support
Advertising
Form filling
Public Opinion Research
Elections
Emergencies
Mobilization
Banks

Health Care
Power Companies
Water Companies, City Sanitation Dept.
Airports
Bus and Railway Stations
Wholesale Stores
Car Repair Services
Universities
Schools
Radio and TV Programme
Cable TV Providers
GSM Providers
Theatres, Cinemas, Concert Halls…
Tourist Agencies
Hotels
Locator service
Art Galleries and Private Dealers
Real Estate Agencies
Insurance Companies
Bookmakers
Horoscope
Entertainment
Voice Web


Call Centres represent an efficient means for companies that want to keep in touch with their clients. Today, multimedia, e-mail and Web communication, intelligent network routing technologies and modern CTI technologies are all incorporated in call centres. IVR systems included in call centres can manage a significant percentage of incoming calls on their own, thus relieving human operators of load, saving their time, and enabling them to manage all incoming calls even during peak hours. The first contact is always established through a call centre, where the first impression is created, and today, owing to the integration of business transactions into call centre functions, most operations can be completed through them as well. In that way a universal interface between the company and its environment is created, and the quality of internal communication is also improved.

Voice Mail provides automatic reception of spoken messages, remote listening, deletion, change of greeting messages etc. Every individual has their own mailbox, which can also be listened to from outside (if the system allows remote access). The advantage of such a system lies in a far lower price than that of a set of individual telephone answering machines, as well as in the availability of the service when the required number is busy. Besides, a voice mail service enables message access through computer monitors, voice message forwarding and multicast. Some systems are able to automatically call a programmed phone number and play back a newly received message. Some systems support an Auto Call Back service, which automatically returns calls to the number that initiated the original call.
Automated Attendant accepts calls and reroutes them to the appropriate extension. Such a system can completely replace several human operators or assist them in periods of increased traffic. It can equally function outside working hours, during the night or over the weekend, and can prompt the caller to leave a message if nobody answers the call. Such CTI systems can save big companies a lot of money.
Unified Messaging is an interesting service for users who receive many messages of different types: e-mail, voice and fax messages. When using different systems for accessing those messages, a user must check the computer for e-mails, the telephone answering machine for voice messages, and the fax machine or the fax department for fax messages. A unified messaging system combines these local services into one entity, offering a joint graphical user interface (GUI). The system also offers the possibility of remote access, since conversion from one type of message to another is provided. For example, a TTS system can convert e-mail messages into speech that can be listened to over the phone. Fax messages can be converted into text using an optical character recognition (OCR) system and transmitted as speech in the same way.
Fax on Demand is used for sending detailed information at the caller's request. The caller is offered a choice among all the information that can be sent. When the choice is made (by voice or using the telephone keypad), the IVR system asks for a fax/phone number. The chosen fax is then sent automatically. Many companies provide information about their products in this way.
Follow Me - One Number Service offers the possibility to follow a user when he/she is away from the workplace. A list of the user's other phone and pager numbers is composed in advance, and the system calls those numbers in a specified order, also offering the possibility of leaving a voice message. It is also possible to call all the numbers at once.
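The sequential variant of the Follow Me service can be sketched in a few lines. This is only an illustration: `place_call` stands for whatever telephony function actually dials a number and reports whether it was answered, and both that function and the phone numbers below are hypothetical.

```python
def follow_me(numbers, place_call):
    """Call the user's numbers in the given order.

    Returns a pair (answered_number, attempted), where answered_number is
    None if nobody picked up, in which case the caller would be offered
    the possibility of leaving a voice message.
    """
    attempted = []
    for number in numbers:
        attempted.append(number)
        if place_call(number):
            return number, attempted
    return None, attempted


# Hypothetical list of the user's numbers, in the order they should be tried.
USER_NUMBERS = ["021-555-0101", "063-555-0102", "064-555-0103"]
```

For example, if only the second number answers, `follow_me(USER_NUMBERS, lambda n: n == "063-555-0102")` stops after two attempts and never dials the third number.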

Working Hours Monitoring can be done automatically. This is interesting for a variety of companies. Every employee has a personal identification number that they use to report their arrival at the workplace (i.e. when they begin to work) and their departure. If needed, a voice check, i.e. speaker verification, could be incorporated as well. Based on this information, the system can report on the current activity of employees and summarize it. This is particularly interesting for distributed call centres.

Security and Access Control can be accomplished using audio-visual procedures: cameras, and speaker identification and verification through recognition of a person's voice.
9X services (police, emergency service, fire brigade) can be improved by having the system determine the caller's address based on automatic calling number identification. This is very important since people who call these numbers can be very agitated and frightened, and often unable to provide precise information or even specify their location - e.g. in the case of a traffic accident.
98X services could be automated for the most part. This would increase the efficiency of the employees in obtaining information as well as the number of calls that could be handled, thus reducing the period of unavailability. The system could accept the call, play the greeting message and put the call on hold. The operator could accept only the key information, type it in, and the system could then find the information and report it to the caller.
9XXX services play back recorded material related to certain information or to entertainment, usually without the possibility of choosing any option: the user can only call the number and listen to the contents, which are replayed continuously. The cost of such services is usually just a little above the regular call price. More sophisticated systems could enable users to choose the contents they want to listen to, or forward their calls to a human operator.
Call Collect Service can be provided without the mediation of a live operator. Users call a special number and the system asks them to state their name and to state or key in the phone number they want to reach. Before the connection is established, the called party hears the recorded voice of the person initiating the call and, when asked whether they accept the charges, answers YES or NO.
Yellow Pages, in their traditional printed form, represent a register of addresses and telephone numbers classified according to the kind of product or service being advertised. Yellow pages are generally issued by local telephone companies and distributed free of charge. However, an advertisement in the yellow pages has to be paid for. Every advertisement contains a telephone number and an address as well as some basic information about the products and services. Since these registers are updated only occasionally, yellow pages cannot contain frequently changing information. Yellow pages based on computer telephony can be updated more easily and can therefore offer such information as well. They can even go a step further and reroute a call to the chosen entity. Such a system can protect the privacy of advertisers by forwarding calls to them automatically without revealing their number.
Catalogue Sale is also a very interesting application for IVR systems. Telemarketing centres have IVR units that collect calls and request some basic information from the caller, e.g. what product or service he/she is interested in, and then provide the basic information before forwarding the call to the appropriate sales agent, who accepts specific orders and concentrates on specific sale details. Another approach, based on catalogue numbers of products, enables complete automation of the ordering process, as well as of payment collection. The callers enter the catalogue and product numbers they want to order. They listen to the recorded descriptions to get more information about available models, sizes, colors etc. More sophisticated systems enable users to place orders by entering their credit card number.
Traffic Conditions are usually broadcast by radio stations, but it is useful for a driver to be able to obtain information on traffic conditions on certain routes whenever needed. In the same way, in large cities, drivers need information on traffic jams in certain streets and on suggested detours.
Technical Support can be improved by a system that reroutes calls to the first available operator, in order to increase efficiency. Callers confirm their registered user status by entering the serial or registration number. Upon receiving answers to several questions, the system classifies the problem and forwards the call to the appropriate person. By means of fax on demand, users can obtain detailed information about the product, technical upgrades, diagrams and answers to frequently asked questions. In this way, some technical support can be offered outside working hours.
Advertising can be performed via fax broadcast. Ordering and order status checks can be automated. Every order is associated with a number, which the caller uses during the order status check. The system is connected to a corresponding status database, whose contents can be read aloud to the caller automatically.
Form Filling is a well-known problem. How many times have you been in a situation where you lacked some additional documents or photos needed to complete and submit a form? It is very important to have information on all the necessary items and taxes. A system giving this information would, for instance, be interesting for embassies issuing visas: besides the list of required documents, the system could offer some other useful information, such as the address, working hours and main phone numbers.
Public Opinion Research can be accomplished automatically and used for many purposes: market research, opinion polls etc. Pollsters select subjects to be polled from a database. A completely automated system calls the numbers on the list and asks a sequence of previously recorded questions. The answers can be obtained via speech recognition or keypad entry.
Elections could hardly be conducted using IVR systems, because nobody could ever be sure that the results had not been tampered with. However, many other activities linked to elections could be carried out very elegantly using IVR. For instance, an electoral roll check, i.e. determining every voter's constituency, could be accomplished by means of such a system accessing a single database. Besides, a similar system could carry out a regional opinion poll and give an up-to-date insight into election results.
Emergencies in large buildings and systems - occurrences such as fire, flood, earthquake, power loss, poisonous gas leak, convict escape, bombing - always lead to panic and to inadequate and destructive reactions. In such situations, help from a system that would automatically call certain telephone numbers and give the right instructions would be invaluable. Those instructions could be previously recorded, or broadcast by a competent person.
Mobilization would have to be carried out quickly and efficiently. The fastest way to do it is with a system that initiates a large number of telephone calls in a short period of time and plays back previously recorded directives. All the calls would be repeated until answered.
Banks give their clients permanent access to bank accounts. Clients enter their account number and personal identification number and get access to a whole variety of services: checking the current balance and the date of the last change, ordering checks or a payment report by fax, initiating a transfer from one account to another etc. Such services are attractive for both ordinary and credit card accounts, though the latter would use a direct connection to the appropriate credit company. For payments above a certain limit, operators would use special telephone numbers to check the account balance in a centralized database. Besides, banks using IVR systems could offer detailed information about themselves, exchange rates, credit arrangements and loans, with interest rates calculated depending on the amount and the repayment rate chosen by the caller.

Health Care institutions can significantly improve some aspects of their activities using IVR systems. Hospitals could use automatic call forwarding and a voice mail system, so that the staff could perform more important duties. Callers could get information about patients directly or through an IVR system. Callers could also leave a voice message for patients. Besides, such systems could give information about visiting hours and could automate appointment scheduling: a patient would select the doctor they wanted to be examined by and the desired date and time. An appointment could also be cancelled through such a system. The system would arrange these data and give the list of appointments to the medical centre personnel. It could also be programmed to call the patients and remind them of the visit a day before. A health department could permanently provide various information about medicines. Information from personal health cards could be stored in an appropriate database with controlled access. With appropriate authorization both doctors and patients could access the database with laboratory tests and analysis results.

Power Companies receive an extremely large number of calls when sudden power failures occur. This level of traffic totally paralyses the telephone system and prevents its use for taking adequate action. A CTI system can answer these calls automatically, playing a recorded message while at the same time locating the area the calls are coming from through automatic number identification and instantly notifying the responsible dispatchers. Information on groups for planned power cut-offs can be provided, as well as on cut-offs resulting from system malfunction.
Water Companies and City Sanitation Departments can inform the public on all planned activities both through public media and through IVR systems. Such systems could be accessed by citizens at any time in order to get the information they missed on radio or TV.
Airports use IVR systems to provide information on flights. In that way a large number of calls is handled efficiently and terminal personnel are relieved of load. The system offers information on both incoming and outgoing flights, and callers can select the flight number they are interested in.
Bus and Railway Stations, both regional and international, use IVR systems to give information on departures and arrivals. Callers are asked to listen to the list of existing routes and select the line they are interested in, or just to state the direction they are interested in. The IVR system then reports departure/arrival times from that moment on. More sophisticated systems can forward the call to the appropriate ticket sale operator, where the desired booking can be made.
Wholesale Stores can significantly increase their efficiency with the help of an IVR system. Inventory data can be kept up to date. Sellers can easily enter the inventory number or select an item by name and get information about the available quantity, price and location in the store, carry out the order, send order information directly to the printer, initiate the delivery, and update the quantity automatically. All the data on available stock quantities in stores are listed and delivered to the supply department. Buyers can be offered a service through which they check the status of their orders by entering their order or account number and, once the system has consulted the appropriate database, learn about a possible delay and the estimated delivery time.
Car Repair Services relying on IVR systems enable customers to call at any time, enter the number of the repair order, and get information on the repair status (price, completion date, need for further repairs, etc.). The IVR system can receive information about the payment method (e.g. credit card number), initiate payment and prepare an appropriate bill. It can automatically call the car owner to let them know that the car has been repaired, what the price is and when the car can be picked up.
Universities can offer a variety of useful information to students through an IVR system. DTMF dialing or speech recognition would enable students to register for an exam without having to wait in line. Students can access personal services as well, and their identity can be verified based on a personal identification number. Students can also order a schedule to be delivered to them via e-mail or fax. Information on daily and weekly menus in student restaurants can also be offered in this way.
Schools are interested in the possibility of sending messages to pupils' parents by e-mail. Those messages are otherwise delivered by the pupils themselves and usually get lost or altered. Parents are sometimes not sure whether their children remembered what they had for homework or whether they were absent from class. Such information can also be given via an IVR system.
Radio and TV Programmes such as talk shows can rely on IVR systems as well. For example, when the listeners/viewers are expected to vote, they are given a telephone number. A system accepts their calls and tallies the votes for, say, the music or movie hit to be broadcast. Establishing audio conferences is especially interesting for certain kinds of radio programmes such as quizzes. An IVR system can be of great assistance during and after commercial shows. The number of calls usually increases dramatically at that time, and the number of incoming lines is no longer sufficient. An IVR system able to talk to dozens of people at the same time can prove very useful.
Cable TV Providers offer Video on Demand service, enabling their viewers to order and watch chosen movies from the provider's video collection. The movie selection and payment process can be automated using an IVR system. A viewer calls the given number and selects a genre and movie title in an interactive conversation. Payment is accomplished automatically based on automatic number identification (ANI), since a telephone number is associated with a subscriber of the cable TV network. Furthermore, cable TV providers can offer another service to their subscribers - a certain channel that would be unscrambled and broadcast at a desired time. Such services can also be completely automated.
GSM Providers offer a variety of services via IVR systems. For example, they offer their subscribers the possibility of having voice messages recorded when they are unable or do not want to accept a call. The paging and IVR systems can be merged so that the information on an incoming message is transferred to the desired pager. For example, a user can demand that all calls between 7 and 8 AM, and between 2 and 3 PM, be transferred to his car phone, between 8 AM and 2 PM to his office, and after 3 PM to his pager or voice mail service. The use of IVR systems is becoming increasingly popular, as speech represents the most natural interface, which is of great significance for mobile phone users. Call initiation can also be activated by voice. A caller can pronounce a number, or say a certain word or phrase (e.g. "call office") that is recognized by the ASR system, and the call to the desired number is placed. Speaker-dependent ASR is mostly used for this purpose, as it achieves a somewhat lower error rate and is simple enough to be implemented in the mobile phone itself. However, more sophisticated applications such as speaker verification or speaker-independent speech recognition must be offered by the providers themselves.
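The time-dependent call transfer from the GSM example can be expressed as a small routing table. The sketch below is only an illustration: the table layout and the function name are hypothetical, voice mail is chosen over the pager for calls after 3 PM, and the same default is assumed for calls before 7 AM, a period the example does not mention.

```python
from datetime import time

# Routing table taken from the example in the text: (start, end, destination),
# with the start time inclusive and the end time exclusive.
ROUTING_RULES = [
    (time(7, 0), time(8, 0), "car phone"),
    (time(8, 0), time(14, 0), "office"),
    (time(14, 0), time(15, 0), "car phone"),
]

# Assumption: voice mail is used after 3 PM (as in the text) and also
# before 7 AM (not specified in the text).
DEFAULT_DESTINATION = "voice mail"


def route_call(call_time):
    """Return the destination for a call arriving at the given time of day."""
    for start, end, destination in ROUTING_RULES:
        if start <= call_time < end:
            return destination
    return DEFAULT_DESTINATION
```

With this table, a call arriving at 7:30 AM goes to the car phone, a call at noon goes to the office, and a call at 6 PM goes to voice mail.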
Theatres, Cinemas, Concert Halls… After selecting one of the current events and the desired seat(s), callers can book their tickets. They are asked to enter a credit card number and, after a card status check, the telephone number and the address to which the tickets will be delivered. These systems often inform callers about upcoming events such as concerts, so a caller can get more information about the desired event without having to listen to long and tedious lists. Concert sponsors can be assigned special phone numbers through which they offer the most important information on the concert, or even greeting messages from the performers themselves. Theatres and cinemas can offer information on current events as well, together with the addresses and phone numbers of their box offices. In addition, callers can order faxes with the complete schedule, the seating plan, directions to the venue, etc.
Tourist agencies can offer information of special interest to their clients. For those who wish to travel abroad, once a country is selected, the system would provide information on the climate and current weather conditions, journey duration, time zone, local currency and exchange rates, local traffic, health care advice, etc. The information can be offered in several languages.
Hotels can automate a wide range of services they offer, from booking and basic information about the hotel, its services, restaurants and entertainment programme, to information about cultural and entertainment venues or events in the city. It is nevertheless necessary to offer a connection to a live operator as an alternative, for guests who prefer it or want more information. The system can include an answering machine service (voice mail) for each hotel guest. Guests often need to be woken in the morning, and the system should be able to accept such requests. The system then initiates a wake-up call at the desired time. If the call is not answered or the line is busy, the call is repeated at programmed intervals.
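The wake-up call behaviour described above (repeat the call at programmed intervals until it is answered) could be sketched roughly as follows. This is a simulation under stated assumptions, not a telephony implementation: `dial` is a hypothetical callback standing in for whatever places the call, the outcome strings are invented for illustration, and the defaults of three attempts five minutes apart are arbitrary.

```python
from typing import Callable, List, Tuple


def place_wake_up_call(
    dial: Callable[[], str],
    max_attempts: int = 3,
    retry_minutes: int = 5,
) -> List[Tuple[int, str]]:
    """Try a wake-up call, retrying while it is busy or unanswered.

    `dial` is a hypothetical callback that places one call and returns
    "answered", "busy" or "no answer". The function returns a log of
    (minutes after the requested time, outcome) pairs. It does not wait
    between attempts; a real system would schedule the retries.
    """
    log = []
    for attempt in range(max_attempts):
        outcome = dial()
        log.append((attempt * retry_minutes, outcome))
        if outcome == "answered":
            break  # guest picked up, no further retries needed
    return log
```

Capping the number of attempts is a design choice that the text does not specify; after the last failed attempt a hotel would presumably want the request escalated to a live operator.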
Locator service can help users locate a desired facility in a city. The user calls a certain number, enters the zip code of the city and chooses the kind of facility (a tourist office, a car rental agency, a hotel, etc.). The system returns a list of the nearest facilities of the specified kind, with telephone numbers and addresses. The system should be able to forward the call to the chosen facility.
Art Galleries and Private Dealers are often faced with very demanding customers who can be positively impressed by a well-designed IVR system. The caller would choose a certain artist or work of art and listen to the recorded description. The call can be forwarded to a dealer or a gallery if the caller is interested in more detailed information or in a purchase. In this way the dealers would preserve their discretion as well.
Real Estate Agencies can apply IVR systems efficiently. Such a system can help potential customers through spoken dialogue, by providing detailed information such as approximate price, location, number of rooms, building style, etc. It can automatically select the properties that fit all of the customer's requirements. Callers can obtain the phone number of the agent in charge of a particular sale, or their call can be forwarded to that agent.
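Automatically selecting the properties that fit a caller's requirements amounts to filtering a list of records on the criteria collected during the dialogue. The sketch below assumes a hypothetical `Listing` record and made-up sample data; the field names and the `match_listings` function are illustrative only and do not describe any actual AlfaNum system.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    # Hypothetical record; fields follow the attributes named in the text.
    location: str
    price: int
    rooms: int
    agent_phone: str


def match_listings(listings, location=None, max_price=None, min_rooms=None):
    """Return the listings that satisfy every criterion the caller gave.

    A criterion left as None was not specified and is ignored.
    """
    result = []
    for item in listings:
        if location is not None and item.location != location:
            continue
        if max_price is not None and item.price > max_price:
            continue
        if min_rooms is not None and item.rooms < min_rooms:
            continue
        result.append(item)
    return result


# Made-up sample data for illustration only.
SAMPLE = [
    Listing("city centre", 90000, 2, "555-0101"),
    Listing("city centre", 150000, 4, "555-0102"),
    Listing("suburb", 70000, 3, "555-0103"),
]
matches = match_listings(SAMPLE, location="city centre", max_price=100000, min_rooms=2)
```

Each match carries the phone number of the agent in charge, so the system could either read it out or forward the call.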
Insurance Companies can automate the provision of information on the status of compensation claims. Each claim number can be associated with a report for the caller. The reports could include information on documents that need to be submitted, and the calls could be forwarded to the appropriate agents.
Bookmakers can offer up-to-date information on the results of various events such as football matches. A caller selects the number of the match of interest, and the system synthesizes spoken information about the current state or the final result of the match. Placing bets can be automated in a similar way.
Horoscope services can be organized in the classical way: a caller pronounces a horoscope sign, the computer recognizes it and gives the caller all the information related to that sign. The system can also calculate a personal horoscope, based on the date, time and place of the caller's birth, acquired through spoken dialogue with the caller. Callers can order a fax with a more detailed personal horoscope.
Entertainment can be offered in a very effective way through IVR systems. Different kinds of entertainment can be offered at the same number, and the caller would make the choice using the telephone keypad or by voice. The system could offer information on local cultural, artistic, sports and other events, TV and radio programmes, recipes, music hits...
Voice Web. New speech technologies (speech and speaker recognition and speech synthesis) provide us with a new concept of communication, the so-called Voice Web, which enables Internet access by telephone without the use of a computer. The Voice Web is expected to open a variety of new possibilities for generating income by offering new and attractive user services over the phone, as well as to introduce so-called v-commerce.