The name AlfaNum represents a group of projects in the area of automatic
speech recognition (ASR), speaker identification and verification, and
automatic text-to-speech conversion (TTS) for the Serbian language. All these
problems are multidisciplinary and require knowledge of numerous areas
such as acoustics, phonetics and linguistics, as well as mathematics,
telecommunications, signal processing and programming, and their high
language dependency makes them even more challenging. The AlfaNum project
assembles a group of teachers and teaching assistants at the Telecommunications
and Signal Processing Chair, at the Faculty
of Technical Sciences (aka Faculty of Engineering), University of
Novi Sad, Serbia and Montenegro. The head of the team is professor Vlado
Delić, PhD, and the team consists of about ten active and as many
occasional associates. Since not all of them could be employed at the
Faculty as teaching assistants, the company "AlfaNum Ltd." was founded
in 2003 to employ those involved in the project who could not find a
position at the Faculty. We invite you to visit the company's Web page
for a demonstration of some of our achievements. The page is available
only in Serbian at the moment. As a scientific research team with
impressive results, AlfaNum is a member of the Centre for Vibroacoustics
and Signal Processing (CEVAS), the first Centre of Excellence at the
Faculty, which is currently being accredited by the Ministry of Education
and Technological Development of the Republic of Serbia.
Speech recognition and synthesis no longer require any additional
hardware, as was the case some time ago. Personal computers and mobile
phones have become sufficiently fast and reliable, and cheap at the same
time. Nevertheless, our market is rather closed and small, and the
particularities of the Serbian language are significant. That is why we
set out to develop speech technologies on our own. More about the results
of our efforts can be found here.
CTI (computer telephony integration) applications that include ASR and
TTS in the Serbian language are our accomplishment. Today there are
several hundred telephone lines with PCs receiving telephone calls in
Serbia and other republics of the former Yugoslavia. They provide the
required information and services to callers through human-machine
spoken communication in the Serbian language.
As part of the research and software development for speech recognition
and synthesis, we have developed appropriate audio-visual tools and a
software environment for CTI development and application. We have
recorded and classified various speech databases in the Serbian language,
intended for both speech recognition and synthesis. We have also initiated
several humanitarian projects aimed at popularizing computer literacy
among the visually impaired, which had not been possible until a
high-quality text-to-speech system for the Serbian language appeared.
Speech is the basic means of communication between humans.
Humans can convey their thoughts and feelings to one another in a way
much more intricate than in any other animal species, and thus
the human speech system is the most complicated one and comprises
a number of organs - from lungs, trachea (windpipe), larynx and
vocal folds, to the oral cavity with tongue, teeth and lips, and the nasal
cavity.
Speech considered as a sound signal contains a multitude of information.
Besides what has been said, it includes information on the speaker
that reveals the emotional state, the identity of a known speaker,
or the gender and age of an unknown one. We understand the meaning and
perceive the speaker's dialect, education level and culture.
We understand what has been said by relying on our knowledge of the
language and on context. Thus, segmentation of the sequence of
sounds that we hear is possible only if we are familiar with the
language. Speech perception is, therefore, not an inherited but
a learned ability. Furthermore, one can focus on a particular
speaker among many, estimate the position of the speech source,
and often understand things that have not actually been said,
but rather implied.
Automatic speech recognition (ASR) is considered one of the greatest
technical challenges of today, and has attracted the attention of many
researchers worldwide for more than half a century. The aim of automatic
speech recognition is to produce a textual output based on a sound
recording of an utterance (a word or a sentence). Speech is thus converted
to text, that is, the system "recognizes" what the speaker has said.
Automatic speech recognition is performed by extracting particular
features from the speech signal and comparing them to reference
feature values or previously created models. A number of questions
emerge at once: which features are relevant for speech recognition,
how to extract them from speech signals, how to prepare reference
feature values or models, and how to compare extracted and reference
features or models. Speech variability complicates the matter
further, since two speakers do not pronounce a word in the same
way, nor does a single speaker pronounce it in the same way twice.
That is why speech recognition is not possible without tools
such as Hidden Markov Models (HMM) and/or artificial neural networks.
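As a rough illustration of comparing extracted features with reference values, the sketch below uses dynamic time warping (DTW), a classical template-matching technique that tolerates differences in speaking rate. It is deliberately simpler than the HMMs and neural networks named above and is not presented as the AlfaNum method; the function names, the one-dimensional "features" and the two-word vocabulary are all hypothetical.

```python
import math

def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i frames of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between frames
            cost[i][j] = d + min(cost[i - 1][j],      # frame of a repeated
                                 cost[i][j - 1],      # frame of b repeated
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def recognize(features, templates):
    """Return the vocabulary word whose reference template is closest."""
    return min(templates, key=lambda word: dtw_distance(features, templates[word]))

# Hypothetical reference templates (assumed, not real speech features).
templates = {
    "da": [(0.0,), (1.0,), (2.0,)],
    "ne": [(5.0,), (5.0,), (4.0,)],
}
# A slower, slightly different pronunciation of the first word.
utterance = [(0.0,), (0.1,), (1.1,), (2.0,), (2.0,)]
best = recognize(utterance, templates)
```

Because the alignment may repeat frames, the longer utterance can still match the shorter template, which is one simple way of coping with the variability described above.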
Automatic speech recognition is carried out in two phases: training
(off-line) and recognition (on-line). Training is performed using
appropriate speech datasets. If an ASR system has to be trained
to recognize one particular person, it is a speaker-dependent
system. If, on the other hand, it can successfully recognize a
multitude of speakers whose voices have not been used for training,
it is a speaker-independent system.
There are ASR systems that recognize only isolated words and those that
can recognize connected words as well. In both cases the words
must come from a predefined set of words called a vocabulary.
Vocabulary size influences recognition error rate - the larger
the vocabulary, the higher the error rate. Another problem occurs
in case there are two words in the vocabulary with very similar
pronunciations, but that can usually be avoided at the application level.
Speech synthesis (TTS) is the oldest speech technology, originating
from as early as the 18th century, when the first "speaking machines"
appeared. Meanwhile, this area has developed tremendously, due mostly
to advances in computer technology during the last decades. This
is the speech technology with language dependency at its highest,
and solutions developed for one language cannot be used for others.
An adaptation is possible (but still painstaking) only in case of
extremely similar languages.
The aim of speech synthesis is to generate intelligible speech based
on textual input. Intelligibility implies a certain level
of naturalness, achieved by manipulating lexical and sentence
intonation as well as phonetic content, in much the same way
humans do. Naturalness of synthesized speech is not only a
matter of aesthetics; it is an important element that makes synthesized
speech easier to understand, helping the listener to separate
it into words.
In order to make synthesized speech sound natural, a TTS system
has to know how to manipulate the intonation as well as
which phonemes to pronounce. One of the biggest problems of speech
synthesis is the fact that none of this information is explicitly
present in the text, so the system has to deduce it in one
way or another. Speech synthesizers for languages with relatively
free accentuation have to rely on massive accentuation and morphological
dictionaries, as well as modules for syntax analysis of the sentence.
Speech signal generation can be carried out in various ways, but
owing to the existence of powerful processors and sufficient memory
resources of modern computers, the most popular and by far the
simplest way to do it is by concatenation of prerecorded speech
segments. Much higher quality can be achieved if the segments
are chosen at runtime. There are, again, some questions to answer
- how to choose appropriate segments from a database that could
contain tens of thousands of words, what type of segments to choose,
how to modify them in order to make their acoustic features resemble
the desired ones and to make transition between segments as inaudible
as possible. Numerous speech signal synthesis techniques answer
these questions, and the most important ones are TD-PSOLA method
in time domain, as well as hybrid H/N speech synthesis model in
There are several points that should be considered when discussing speech
technology applications in the Serbian language:
• It will soon be quite common for us to talk to computers and various
devices in our midst, and if we could not do it in our native language,
it would be a great problem.
• Speech communication between humans and machines is possible only
in languages for which automatic speech recognition and speech synthesis
have been developed. The European Union therefore has a need for multilingual
products and services.
• The dependence of speech technologies on language is extreme, thus
solutions developed for one language cannot be applied to another.
An adaptation is possible only in case of very similar languages (easier
said than done!).
• Our market is relatively small and closed, so it would be unrealistic
to expect any of the world leaders in speech technologies to decide
to tackle all of the problems of the Serbian language. The AlfaNum project
therefore also represents a stimulus for the preservation of the Serbian language.
• The Pentium class PCs are fast enough today, so no additional expensive
hardware is required, and the software-only solutions offer acceptable
investment level, enabling computers to handle several conversations
at a time.
• Within the AlfaNum project, quality solutions for both automatic
speech recognition and speech synthesis have been developed, and have
already been applied.
Enabling the visually impaired to use PCs is one of the most important
and most noble applications of speech synthesis. To enable a blind
person to use a computer and a variety of software almost as efficiently
as a sighted person, it is necessary that the computer
keep them informed of the contents of the screen. Two software components
are required for that. One of them is a "screen reader",
an application that converts every single user action into a piece
of information such as "this has been activated" or "that
has been closed". The other component is a speech synthesizer,
which converts this piece of information into intelligible speech,
thus enabling the user to work
on a computer just like a person with normal sight, with minimum
delay. This gives the visually impaired access to information and
enables them to communicate, which in turn enables them to perform
various jobs, raises their self-reliance and self-respect, and ultimately
raises the quality of their lives.
Issuing voice commands is one of the most popular speech recognition
applications. Speaker-dependent systems are generally used here,
because they can achieve a somewhat lower error rate. However, the
system must first be trained to recognize
a particular speaker, which in turn means that the user is required
to read a training text aloud prior to the recognition phase.
After that, the system is able to recognize commands issued by the user.
This feature is often used to initiate mobile phone calls and
to manage technological processes, household appliances,
educational software and games, and is of great help to
differently abled people. Similar systems can also manage telephone
call routing in public and private networks, but these necessarily
rely on speaker-independent recognition.
The most popular commercial application of speech technologies is within
Interactive Voice Response systems, which generally make use of both
speech technologies - speech synthesis and recognition.
Interactive Voice Response (IVR) systems are computer telephony
applications that enable users to access large quantities of data
using a plain old telephone, as well as to initiate certain actions,
such as making reservations, transaction management etc., without human
operator intervention (unless they explicitly say they want it). Speech
recognition enables callers to communicate intuitively and efficiently
with IVR systems, and to say exactly what they need at
once, without having to browse complex menu structures using a telephone
keypad. Some ideas for IVR
system applications are given below.
Call Centres represent an efficient means for companies that want to keep in
touch with their clients. Today, multimedia, e-mail and Web communication,
intelligent network routing technologies and modern CTI technologies
are all incorporated in call centres. IVR systems included in
call centres can manage a significant percentage of incoming calls
on their own, thus relieving human operators of load, saving their
time, and enabling them to manage all incoming calls even during
peak hours. The first contact is always established through a
call centre - the first impression is created, and today, owing
to the integration of business transactions into call centre functions,
most operations can be completed through them as well.
In that way a universal interface of the company towards its environment
is created, and the quality of internal communication is also
Voice Mail provides automatic reception of spoken messages,
remote listening, deletion, change of greeting messages etc.
Every individual has their own mailbox, which can be listened to from
the outside as well (if the system allows remote access). The advantage
of such a system lies in a far lower price than the one of a set
of individual telephone answering systems, as well as in availability
of the service when the required number is busy. Besides, voice
mail service enables message access through computer monitors, voice
message forwarding and multicast. Some systems are able to initiate
calls automatically to the programmed phone number and to reproduce
a newly received message. Some systems support Auto Call Back service,
which automatically returns calls to the number which initiated
the original call.
An Automated Attendant accepts calls and reroutes them to the appropriate
extension. Such a system can completely replace several human operators
or assist them during periods of increased traffic. It can equally
function outside working hours, during the night or over the
weekend, and can prompt the caller to leave a message if nobody
answers the call. Such CTI systems can save a lot of money to big
Unified Messaging is an interesting service for users who receive
many messages of different types: e-mail, voice and fax messages.
When using different systems for accessing those messages, a user
must check his/her computer for e-mails, the telephone answering
machine for voice messages and the fax machine or the fax department
for fax messages. A unified messaging system combines these
local services into one entity, offering a joint graphical user interface
(GUI). This system offers the possibility of remote access, since
conversion from one type of message to another is provided. For
example, a TTS system can convert e-mail messages into voice that
could be listened to over the phone. Fax messages can be converted
into text using an optical character recognition (OCR) system and
transmitted as voice in the same way.
Fax on Demand is used for sending detailed information at
callers' request. The caller is offered a choice among
all the information that could be sent. When the choice is made
(by means of voice or the telephone keypad), the IVR system asks for a fax/phone
number. The chosen fax is automatically sent. Many companies provide
information about their products in this way.
Follow Me - One Number Service offers
the possibility of reaching a user when he/she is away from the
workplace. A list of the user's other phone and pager numbers is initially
composed, and the system calls those numbers in a specified order,
also offering the possibility of leaving a voice message. All the
numbers can also be called at once.
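The sequential variant of this service can be sketched in a few lines. This is only a minimal illustration under stated assumptions: `try_call` is a hypothetical callback standing in for the telephony layer (it returns True when the call is answered), and the numbers are made up.

```python
def follow_me(numbers, try_call):
    """Call the user's numbers in the specified order.

    Returns ("connected", number) for the first number that answers,
    or ("voice mail", None) if nobody picks up anywhere.
    """
    for number in numbers:
        if try_call(number):  # hypothetical telephony hook
            return ("connected", number)
    return ("voice mail", None)

# Simulated telephony layer: only the mobile phone answers.
answering = {"063-555-0102"}
user_numbers = ["021-555-0100", "063-555-0102", "021-555-0199"]
result = follow_me(user_numbers, lambda n: n in answering)
```

Calling all numbers at once would replace the loop with parallel call attempts and keep whichever leg answers first.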
Working Hours Monitoring can
be done automatically. This is interesting for a variety of companies.
Every employee has a personal identification number that they
use for reporting upon arrival at their workplace (i.e. when
they begin to work) and finally upon their departure. If needed,
a voice check, i.e. speaker verification, could be incorporated as
well. Based on this information, the system can report
the current activity of employees and summarize it. This is
particularly interesting for distributed call centres.
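The summarizing step might look like the sketch below. The event format (PIN, "in"/"out", timestamp), the PINs and the function name are assumptions made for illustration; a real system would also have to handle missing or duplicate reports.

```python
from datetime import datetime

def hours_worked(events):
    """Sum the hours between each arrival and the following departure, per PIN.

    events: iterable of (pin, kind, when), kind being "in" or "out",
    ordered by time.
    """
    open_since = {}  # pin -> time of the last unmatched arrival
    totals = {}      # pin -> hours worked so far
    for pin, kind, when in events:
        if kind == "in":
            open_since[pin] = when
        elif kind == "out" and pin in open_since:
            delta = when - open_since.pop(pin)
            totals[pin] = totals.get(pin, 0.0) + delta.total_seconds() / 3600
    return totals

# Hypothetical reports entered over the phone.
log = [
    ("1234", "in", datetime(2024, 3, 4, 8, 0)),
    ("5678", "in", datetime(2024, 3, 4, 9, 0)),
    ("1234", "out", datetime(2024, 3, 4, 16, 30)),
]
summary = hours_worked(log)
```

Employees who have reported arrival but not departure (here PIN 5678) are the ones currently at work.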
and Access Control can
be accomplished using audio-visual procedures: cameras and speaker
identification and verification through the recognition of the one's
services (police, emergency service, fire brigade) can be improved
in that the system determines the caller's address based on automatic
calling number identification. This is very important since people
who call these numbers can be very excited and frightened, and often
unable to provide precise information or even specify their location
- e.g. in the case of a traffic accident.
be automated for the most part. This would increase the efficiency
of the employees in obtaining information as well as the number
of calls that could be handled, thus reducing the period of unavailability.
The system could accept the call, play the greeting message and
put the call on hold. The operator could accept only the key information,
type it in, and the system could then find the information and report
it to the caller.
||9XXX services reproduce
recorded material related to certain information or to entertainment,
usually without the possibility of choosing any option: user can
just call the number and listen to the contents continuously replayed.
The cost of such services is usually just a little above the regular
call price. More sophisticated systems could enable users to choose
the contents they want to listen to, or forward their calls to a
Collect Service can
be achieved without any mediation on the part of a live operator.
Users call a special number and the system asks them to state their
name and to state or key in a phone number they want to connect
to. Before the connection is established, the called party hears
a recorded voice of the person initiating the call, and when asked
if they accept the call on their account, answers with YES or NO.
Yellow Pages, in a traditional printed form, represent a register of addresses and
telephone numbers classified according to the kind of product and
service being advertised. Yellow pages are generally issued by local
telephone companies and distributed without any charge. However,
an advertisement in the yellow pages is charged. Every advertisement
contains a telephone number and an address as well as some basic
information about the products and services. Since these registers
are updated only occasionally, yellow pages cannot contain frequently
changing information. Yellow pages based on computer telephony can
be updated more easily and therefore offer such information as well.
They can even go a step further and reroute a call to the chosen
entity. Such a system can protect the privacy of the advertiser
by auto-forwarding calls to the advertiser without revealing their
|| Catalogue Sale is
also very interesting for IVR systems application. Centers for telemarketing
services have IVR units that collect calls and request some basic
information from the caller, e.g. what product or service he/she
is interested in, and then provide the basic information before
forwarding the call to the appropriate sales agent who accepts specific
orders and concentrates on specific sale details. Another approach
- with given catalogue numbers of products - enables complete automation
of the ordering process, as well as the collection. The callers
enter the catalogue and product number they want to order. They listen
to the recorded descriptions to get more information about available
models, sizes, colors etc. More sophisticated systems enable users
to place orders by entering their credit card number.
|| Traffic conditions are
usually broadcast by radio stations, but it is useful for a driver
to be able to obtain information (when needed) on traffic conditions
on certain routes. In the same way, in large cities, drivers need
information on traffic jams in certain streets and on suggested
|| Technical Support can
be improved by a system that would reroute calls to the first available
operator, in order to increase efficiency. Callers confirm their
registered user status by entering the serial or registration number.
Upon receiving answers to several questions, the system classifies
the problem and forwards the call to the appropriate person. By
means of fax on demand, users can obtain detailed information about
the product, technical upgrades, diagrams and answers to FAQ. In
this way, some technical support can be offered beyond working hours.
|| Advertising can
be performed via fax broadcast. Ordering and order status checks
can be automated. Every order is associated with a number which
the caller uses during the order status check. The system is connected to
a corresponding status database, whose contents can be read aloud to
the caller automatically.
a well-known problem. How many times have you been in a situation
where you lacked some additional documents or photos needed for
completing the form filling and submission? It is very important
to have the information on all the necessary items and taxes. A
system giving this information would, for instance, be interesting
for embassies issuing visas: beside the list of required documents,
the system could offer some other useful information, such as the
address, working hours and main phone numbers.
|| Public Opinion
be accomplished automatically and used for many purposes: market
research, opinion polls etc. Polltakers select subjects to be polled
from a database. A completely automated system calls numbers on
the list and asks a sequence of previously
recorded questions. The answers can be obtained via speech recognition or
|| Elections could hardly be accomplished based on IVR systems, because nobody
could ever be sure if the results were tampered with. However, many
other activities linked to elections could be carried out very elegantly
based on IVR. For instance, an electoral roll check, i.e. determining every voter's constituency,
could be accomplished by means of such a system accessing a unique
database. Besides, a similar system could carry out a regional opinion poll
and give an up-to-date insight into election results.
large buildings and systems - occurrences such as fire, flood, earthquake,
power loss, poisonous gas leak, convict escape, bombing - always
lead to panic and inadequate and destructive reactions. In such
situations, help from a system that would automatically call
certain telephone numbers and give the right instructions would
be invaluable. Those instructions could be previously recorded,
or broadcast by a competent person.
|| Mobilization would
have to be enforced quickly and efficiently. The fastest way to
carry it out is by a system that would initiate a large number of
telephone calls in a short period of time, and reproduce previously
recorded directives. All the calls would be repeated until answered.
|| Banks give their clients permanent access to bank accounts. The clients
enter their account number and personal identification number, and
get access to a whole variety of services: checking the current account
balance and the date of the last change, ordering checks or a payment report
by fax, initiating a transfer from one account to another etc.
Such services are attractive both for common and credit card accounts,
though the latter would use a direct connection to an appropriate
credit company. In the case of a payment above a certain limit, operators
would use special telephone numbers to check the account balance in a
centralized database. Besides, banks using IVR systems could offer
detailed information on themselves, exchange rates, credit arrangements
and loans - interest rates calculated depending on the amount and
credit payback rate chosen by the caller.
Health Care can
significantly improve some aspects of their activities using IVR
systems. Hospitals could use automatic call forwarding and voice
mail system, so that the staff could perform more important duties.
Callers could get information about patients directly or by an
IVR system. Callers could also leave a voice message for patients.
Besides, such systems could give information about visiting hours,
and could automate appointment scheduling: a patient would
select the doctor they want to be examined by and the desired
date and time. An appointment could also be cancelled through such
a system. The system would arrange these data and give the list
of appointments to the medical center personnel. It could also be programmed
to call the patients and remind them of the visit a day before.
Health department could permanently give various information about
medicines. Information stored in the personal health cards could
be stored in an appropriate database with controlled access. With
appropriate authorization both doctors and patients could access
the database with laboratory tests and analysis results.
|| Power Companies receive
an extremely large number of calls when sudden power failures occur.
This level of traffic totally paralyses the telephone system and
prevents its use in taking adequate action. A CTI system can answer
these calls automatically, reproducing a recorded message, while at
the same time locating the area where the calls are coming from through
automatic number identification, and instantly notifying the responsible
dispatchers. Information on groups for planned power cut-offs can
be provided, as well as for cut-offs resulting from system malfunction.
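Locating the affected area from caller numbers can be sketched as a simple count per area. The mapping from number prefixes to areas, the threshold and all the numbers below are invented for illustration; a real deployment would rely on the operator's subscriber data.

```python
from collections import Counter

def affected_areas(caller_numbers, prefix_to_area, threshold=3):
    """Return areas with at least `threshold` incoming calls, busiest first."""
    counts = Counter()
    for number in caller_numbers:
        for prefix, area in prefix_to_area.items():
            if number.startswith(prefix):
                counts[area] += 1
                break  # one area per caller
    return [area for area, n in counts.most_common() if n >= threshold]

# Hypothetical prefix table and a burst of incoming calls.
prefixes = {"450": "Liman", "630": "Detelinara", "520": "Telep"}
calls = ["450111", "450222", "630333", "450444", "520555", "450666"]
areas_to_report = affected_areas(calls, prefixes)
```

The resulting list is what would be passed on to the dispatchers; callers with unknown prefixes are simply ignored in this sketch.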
|| Water Companies
and City Sanitation Departments can
inform the public on all planned activities both by public media
and IVR systems. Such systems could be accessed by citizens at any
moment in order to get the information they missed on radio or TV
|| Airports use
IVR systems to provide information on flights. In that way a large
number of calls is efficiently handled and terminal personnel is
relieved of load. The system offers information on both incoming
and outgoing flights, and callers can select the flight number they
are interested in.
and Railway Stations, both
regional and international, use IVR systems to give information
on departures and arrivals. Callers are asked to listen to the list
of existing routes and select the line they are interested in, or
just to state the direction they are interested in. The IVR system then
reports departure/arrival times from that moment on. More sophisticated
systems can forward the call to the appropriate ticket sale operator,
where the desired booking can be made.
|| Wholesale Stores can
significantly increase their efficiency with the help of an IVR
system. Inventory data can be kept up to date. Sellers can easily
enter the inventory number or select an object by name and get the
information about available quantity and price, location in the
store, carry out the order, send order information directly to the
printer, initiate the delivery, and update the quantity automatically.
All the data on available stock quantities in stores are listed
and delivered to the supply department. Buyers can be offered a
service through which they check the status of their orders by entering
their order or account number and, after the appropriate database
is consulted, inquire about a possible delay and estimated delivery time.
|| Car Repair Services relying
on IVR systems enable customers to call at any time, enter the number
of repair order, and get information on repair status (price, completion
date, need for further repairs, etc.). The IVR system can receive information
about the payment method (e.g. credit card number), initiate payment
and prepare an appropriate bill. It can automatically call the car
owner to let them know that the car has been repaired, what the
price is and when the car can be picked up.
|| Universities can
offer a variety of useful information to students through an IVR
system. DTMF dialing or speech recognition would enable students
to register for an exam without having to wait in line. Students
can access personal services as well, and their identity can be
verified based on personal identification number. Students can also
order that a schedule be delivered to them via e-mail or fax. Information
on daily and weekly menus in student restaurants can also be offered
in this way.
|| Schools are
interested in the possibility to send messages to pupils' parents
by e-mail. Those messages are otherwise delivered by the pupils
themselves and usually get lost or altered. Parents are sometimes
not sure if their children remembered what they had for homework
or if they were absent from the class. Such information can also
be given via an IVR system.
|| Radio and TV Programme in
talk shows can rely on IVR systems as well. For example, when the
listeners/viewers are expected to vote, they are given a telephone
number. A system accepts their calls and summarizes the vote related
to, say, the music or movie hit to be broadcast. Establishing audio
conferences is especially interesting for certain kinds of radio
programmes such as quizzes. An IVR system can be of great assistance
during and after commercial shows. The number of calls usually increases
dramatically at that time, and the number of incoming lines is no longer
sufficient. An IVR system able to talk to dozens of people
at the same time can be very useful.
|| Cable TV Providers offer
Video on Demand service, enabling their viewers to order and watch
chosen movies from the provider's video collection. Movie selection
and the payment process can be automated using an IVR system. A viewer
calls the given number and selects a genre and movie title in an interactive
conversation. Payment is accomplished automatically based on automatic
number identification (ANI), since a telephone number is associated
with a subscriber of the cable TV network. Furthermore, cable TV providers
can offer another service to their subscribers - a certain channel
that would be unscrambled and broadcast at a desired time. Such
services can also be completely automated.
Providers offer a variety of services via IVR systems. For example, they offer their
subscribers the possibility of having voice messages recorded
when they are unable or do not want to accept the call. The paging
and IVR systems can be merged so that the information on an incoming message
is transferred to the desired pager. For example, a user can request
that all calls between 7 and 8 AM, and 2 and 3 PM, be transferred
to his car phone, between 8 AM and 2 PM to his office, and after
3 PM to his pager or voice mail service. The use of IVR systems
is becoming increasingly popular, as speech represents the most natural
interface, which is of great significance for mobile phone users.
Call initiation can also be activated by voice. A caller can pronounce
a number, or say a certain word or a phrase (e.g. "call office")
that would be recognized by the ASR system and the call to the desired
number would be placed. Speaker-dependent ASR is used for this purpose
mostly, as it achieves somewhat lower error rate, and is simple
enough to be implemented in a mobile phone itself. However, more
sophisticated applications such as speaker verification or speaker-independent
speech recognition must be offered by the providers themselves.
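The time-based forwarding example given above (car phone, office, pager) amounts to a small rule table. The sketch below encodes it under two assumptions that the text does not state: calls arriving before 7 AM fall through to voice mail, and after 3 PM the pager is tried rather than voice mail. The names `RULES` and `route` are hypothetical.

```python
from datetime import time

# (start, end, destination); end=None means "until midnight".
RULES = [
    (time(7, 0), time(8, 0), "car phone"),
    (time(8, 0), time(14, 0), "office"),
    (time(14, 0), time(15, 0), "car phone"),
    (time(15, 0), None, "pager"),
]

def route(now, rules=RULES, default="voice mail"):
    """Return the destination for a call arriving at time-of-day `now`."""
    for start, end, destination in rules:
        if now >= start and (end is None or now < end):
            return destination
    return default  # assumption: no rule matched, so take a message

morning_commute = route(time(7, 30))
early_hours = route(time(6, 0))
```

Keeping the schedule as data rather than code is what would let a subscriber change it over the phone.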
|| Theatres, Cinemas,
Concert Halls… After
selecting one of the current events and the desired seat(s), callers
can book their tickets. They are asked to enter a credit card number,
and after a card status check, they are asked for the telephone
number and the address to which the tickets will be delivered. These
systems often inform callers about upcoming events such as concerts,
so that a caller can get more information about the desired event without
having to listen to long and tedious lists. Some concert sponsors
can be assigned special phone numbers through which they can offer
some of the most important information on the concert, or even
greeting messages from the performers themselves. Theatres and cinemas can
offer information on current events as well, together with the addresses
and phone numbers of the box offices. In addition, callers can order
faxes with the complete schedule, the seating plan, directions
to the venue, etc.
|| Tourist agencies can
offer information of special interest to their clients. For those
who wish to travel abroad, once a country has been selected, the
system would provide information on the climate and current weather
conditions, journey duration, time zone, local currency and exchange
rates, local traffic, health care advice, etc. The information
can be offered in several languages.
automate a wide range of services they offer, from booking and
basic information about themselves, their services, restaurants and
entertainment programmes, to information about cultural and
entertainment venues or events in the city. It is nevertheless necessary
to offer a connection to a live operator as an alternative, for guests
who prefer it or want more information. The system can include
an answering machine service (voice mail) for each hotel guest.
Guests often need to be woken in the morning, and the system should
be able to accept such requests. The system will then be able to
place a wake-up call at the desired time. If the call isn't
answered or the line is busy, the call is repeated in programmed
service can help users locate a desired facility in a
city. The user calls a certain number, enters the zip code of the
city and chooses the desired type of facility (a tourist bureau, a rent-a-car
agency, a hotel, etc.). The system returns a list of the nearest facilities of
the specified kind, with telephone numbers and addresses. The system
should also be able to forward the call to the chosen facility.
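The lookup described above (zip code plus type of facility, nearest results first) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a description of an actual system: the `Listing` record, the `find_nearest` function and all entries in `LISTINGS` are invented placeholders, and the distance of each facility from the caller's area is assumed to be known in advance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    name: str
    kind: str           # e.g. "hotel", "rent-a-car", "tourist bureau"
    zip_code: str
    phone: str
    address: str
    distance_km: float  # assumed: precomputed distance within the zip-code area


# Placeholder entries, invented purely for illustration.
LISTINGS = [
    Listing("Hotel A", "hotel", "21000", "000-0001", "Street 1", 2.4),
    Listing("Hotel B", "hotel", "21000", "000-0002", "Street 2", 0.8),
    Listing("Rent-a-car C", "rent-a-car", "21000", "000-0003", "Street 3", 1.1),
    Listing("Hotel D", "hotel", "11000", "000-0004", "Street 4", 0.5),
]


def find_nearest(listings, zip_code, kind, limit=3):
    """Return up to `limit` listings of `kind` in `zip_code`, nearest first."""
    matches = [item for item in listings
               if item.zip_code == zip_code and item.kind == kind]
    return sorted(matches, key=lambda item: item.distance_km)[:limit]


# The IVR system would read these out using speech synthesis.
for item in find_nearest(LISTINGS, "21000", "hotel"):
    print(f"{item.name}, {item.address}, tel. {item.phone}")
```

The `limit` parameter keeps the spoken list short, since callers cannot skim a long list over the phone.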
|| Art Galleries and
Private Dealers are
often faced with very demanding customers who can be favourably
impressed by a well-designed IVR system. The caller would choose
a certain artist or a work of art and listen to the recorded description.
The call can be forwarded to a dealer or a gallery if the caller
is interested in more detailed information or a purchase. This also
allows dealers to preserve discretion.
|| Real Estate Agencies can
efficiently apply IVR systems. Such a system can assist potential
customers in a spoken dialogue, by providing
more detailed information, such as approximate price, location,
number of rooms, building style, etc. It can automatically select the
properties that meet all of a customer's requirements. Callers can obtain
the phone number of the agent who is in charge of a particular
sale, or else their call can be forwarded to the agent.
|| Insurance Companies can
automate the provision of information on the status of compensation claims. Each
claim number can be associated with a report to the caller. The
reports could include information on documents that need to be submitted,
and the calls could be forwarded to the appropriate agents.
offer up-to-date information on results of various events such as
football matches. A caller selects the number of the match of interest,
and the system synthesizes spoken information about the current
state or the final result of the match.
Placing bets can also be automated in a similar way.
|| Horoscope can
be organized in the classical way: a caller says a zodiac
sign, the computer recognizes it and gives the caller all the information
related to that sign. The system can also calculate a personal horoscope,
based on the date, time and place of the caller's birth, obtained
in a spoken dialogue with the caller. Callers can order a fax
with a more detailed personal horoscope.
|| Entertainment can
be offered in a very effective way through IVR systems. Different
kinds of entertainment can be offered at the same number, and the
caller would make the choice by means of the phone keypad or by voice.
The system could offer information on local cultural, artistic,
sporting and other events, TV and radio programmes, recipes, music
speech technologies (speech and speaker recognition and speech
synthesis) provide us with a new concept of communication, the so-called
Voice Web, which enables Internet access by telephone without the
use of a computer. Voice Web is expected to open a variety of new
possibilities for generating income by offering new and attractive
user services over the phone, as well as to introduce so-called v-commerce.