PICList Thread
'Generating Digital Speech cheaply'
1998\04\20@065135 by Werner Terreblanche

I would like to hear some opinions from other people who have already done
the same thing I plan to do....

Basically I want to make a very cheap voice playback device capable of
playing back a fixed number of short pre-recorded voice messages.  I
know you get special ICs and codecs that do exactly this, but they are
still slightly too expensive for what I had in mind.   EPROM ICs are very
inexpensive these days, and I thought that maybe if I store my messages as raw
digitized speech on a large EPROM and play it back through a resistor ladder
network D/A and filter, I would be able to regenerate speech messages at a
relatively low cost.  If I then use a PIC to address the EPROM and add a
serial interface to it, this could become a very cheap speech messaging
system that is addressable via the serial bus from my target
microcontroller.

Now, my questions are these:

1.  Is it a feasible way of doing this?
2.  What sampling rate would I need?  (I was thinking 8 kHz @ 8 bits
resolution.) Is this good enough?
3.  Has anybody perhaps done something like this already? Was the sound
reasonably clear?

One final question....  Some time ago, someone on the Piclist mentioned a
stand-alone text-to-speech synthesizer available for as low a cost as
$49.95.   I cannot remember the details anymore.  Does anybody know where I can
buy one of these?  I made a search on the web, but all I could come up with
is a device from RC Systems  http://members.aol.com/rcsys  which costs about
$150, and is thus a bit too expensive for what I had in mind.  Has anybody
reading this ever used these text-to-speech modules and do they work well?

Rgds
Werner
--
Werner Terreblanche  <spam_OUTwterrebTakeThisOuTspamplessey.co.za>Tel :  (021) 7102251 (office
hours)

1998\04\20@083303 by Jason Wolfson

Werner,
I would look at an OKI chip (MSM5202)
that does 4-bit ADPCM encoded speech. At 8 kHz it sounds really good, and
don't forget that's only 4 bits, so each byte is 2 samples.
Record your message with a PC into a WAV file;
OKI has software that will translate that to
4-bit ADPCM. Stuff that file in an EPROM and it would
be easy for the PIC to feed it to the OKI chip at 8 kHz.
The OKI chip provides an interrupt to ask for another sample.
I did a similar design many years ago....
Good luck.
Jason Wolfson
Lipidex Corp
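
As a rough illustration of the "each byte is 2 samples" point above, here is a minimal Python sketch of how packed 4-bit codes could be pulled out of a ROM image before being handed to a decoder chip. The function name and the nibble order (high nibble first) are assumptions for illustration only; the actual order depends on the encoder software and the chip's data sheet.

```python
def split_nibbles(rom: bytes, high_first: bool = True):
    """Yield the two 4-bit ADPCM codes packed into each ROM byte.

    Assumption: the high nibble holds the earlier sample unless
    high_first is False. Check the real encoder's output format.
    """
    for byte in rom:
        hi, lo = byte >> 4, byte & 0x0F
        if high_first:
            yield hi
            yield lo
        else:
            yield lo
            yield hi


# At 8 kHz and 4 bits per sample, one second of speech needs 4000 bytes.
BYTES_PER_SECOND = 8000 * 4 // 8

codes = list(split_nibbles(bytes([0x12, 0xAB])))
```

On a PIC the same split would be done one nibble per interrupt from the decoder chip, rather than building a list.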


1998\04\20@084354 by Steve Baldwin

> Basically I want to make a very cheap voice playback device capable
> 1.  Is it a feasible way of doing this?
> 2. What sampling rate would I need?  (I was thinking 8KHz @ 8 bits
> resolution) Is this good enough?
> 3. Has anybody perhaps done something like this already? Was the sound
> reasonably clear?

I did this with an 8051 (not that it makes any difference) a while
ago and it worked well. One of the Windows WAV file utilities lets
you play with bit sizes and sample rates, so you can try it on your
PC and see if it sounds good enough. It will show you the data size
too, if you are planning on saving the data as a WAV file in your
ROM which  is about as easy as it gets.

Steve.


======================================================
Steve Baldwin                Electronic Product Design
TLA Microsystems Ltd         Microcontroller Specialists
PO Box 15-680, New Lynn      http://www.tla.co.nz
Auckland, New Zealand        ph  +64 9 820-2221
email: .....stevebKILLspamspam@spam@tla.co.nz      fax +64 9 820-1929
======================================================

1998\04\20@085148 by g.daniel.invent.design


Werner: hard to beat the ISD (Information Storage Devices) chips, e.g. the
ISD2560: a 60-second message recorder with addressable, standalone and
multi-cue play features, approx. $20.00 (NZ). It has mic input and 16 ohm
speaker drive on chip already, in 28 pins, and uses multi-level programming
of internal EEPROM for high density. Should save a lot of real estate.
Regards, Graham Daniel.

1998\04\20@091253 by Haile, Sam

You have a lot of choices for this project (ISD record/playback chip,
OKI speech IC, etc.).
One thing I am not sure about is the PIC... I have tried to interface a PIC
with an ISD chipcorder for my talking clock project but without success... if
you are familiar with Motorola uCs I would say go for Motorola.

By the way, if all you need is a short message to play back, the ISD
chipcorder is the best choice around.

S+









1998\04\20@111552 by Mike Keitz

On Mon, 20 Apr 1998 12:35:21 +0200 Werner Terreblanche
<.....wterrebKILLspamspam.....PLESSEY.COM> writes:
>I would like to hear some opinions from other people that have already
>done
>the same thing I plan to do....
>
>Basically I want to make a very cheap voice playback device capable of
>playing back a fixed set number of short pre-recorded voice messages.
>I
>know you get special IC's and CODECS that does exactly this, but they
>are
>still slightly too expensive for what I had in mind.   EPROM IC's are
>very
>inexpensive these days, and I thought that maybe if store my messages
>as raw
>digitized speech on a large EPROM

With any storage method, the #1 thing is to get a good original recording
of the speech to begin with.  Some geek speaking into a microphone built
into the front panel of a noisy computer just isn't going to sound good,
no matter what storage method is used.

If the total amount of speech is 60 seconds or less the analog-store
chips such as ISD's are not a bad choice.  They have enough internal
control logic that it may be possible to get away without the PIC.  In
any case, they won't need intervention from the PIC once they start
playing.  Stay away from the ones that sample at less than 6.4 kHz.

If you take "raw samples" they should be u-law or A-law for better
dynamic range.  Or compress the speech before recording; then a linear
converter will work for playback, although of course the result is still
compressed.  It can be hard to make a "resistor tree" DAC work for more
than 6 bits or so.  I'm still looking for where to buy the hybrid ones that
are occasionally found in consumer products.
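
To make the u-law remark concrete, here is a small Python sketch of logarithmic companding. It uses the continuous mu-law formula with mu = 255, not the bit-exact segmented encoding used in telephony, and the 8-bit mapping (0..255, mid-scale near 128) is an assumption chosen for illustration. The point it shows is that a quiet sample survives 8-bit quantisation with less error when companded than when stored linearly.

```python
import math

MU = 255  # usual value for 8-bit mu-law


def mu_law_compress(x: float) -> float:
    """Map x in [-1, 1] to [-1, 1] with logarithmic spacing."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def encode_8bit(x: float) -> int:
    """Compand, then quantise to an unsigned byte (assumed layout)."""
    return round((mu_law_compress(x) + 1.0) * 127.5)


def decode_8bit(code: int) -> float:
    """Undo the byte mapping, then expand back to a linear value."""
    return mu_law_expand(code / 127.5 - 1.0)


def linear_8bit_roundtrip(x: float) -> float:
    """Plain linear 8-bit quantisation, for comparison."""
    return round((x + 1.0) * 127.5) / 127.5 - 1.0


quiet = 0.01  # a sample at 1% of full scale
companded_error = abs(decode_8bit(encode_8bit(quiet)) - quiet)
linear_error = abs(linear_8bit_roundtrip(quiet) - quiet)
```

Playing such data through a plain linear DAC would need an expansion step first, typically a 256-entry lookup table.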

Note that raw 8-bit samples at 8 kHz need 64 kbit/sec to store, or 3.84
Mbit/minute.  A 4 Mbit EPROM will hold only about 64 seconds of such
speech.
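
The arithmetic above can be checked with a few lines of Python. The function names are made up for this sketch, and it assumes "4 Mbit" means 4 x 1024 x 1024 bits, which gives about 65.5 seconds, in line with the rough figure quoted.

```python
def bit_rate(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Bits of storage consumed per second of audio."""
    return sample_rate_hz * bits_per_sample


def playback_seconds(rom_bits: int, sample_rate_hz: int, bits_per_sample: int) -> float:
    """Seconds of audio that fit in a ROM of the given size."""
    return rom_bits / bit_rate(sample_rate_hz, bits_per_sample)


# Assumption: EPROM sizes are binary, so 4 Mbit = 4 * 1024 * 1024 bits.
FOUR_MBIT = 4 * 1024 * 1024

rate = bit_rate(8000, 8)                         # bits per second
per_minute = rate * 60                           # bits per minute
seconds = playback_seconds(FOUR_MBIT, 8000, 8)   # seconds in a 4 Mbit part
```

The same function gives the figure for other schemes, for example 4-bit ADPCM at 8 kHz doubles the playing time.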

I had good results using CVSD techniques.  Only one bit per sample is
taken, but the sample rate needs to be about 32 kHz before it starts to
sound good (at 32 kbit/sec, this is half the bit rate of raw 8-bit, 8 kHz
PCM samples).  Rates of 50
kHz or more sound really good.  The decoder can be built with a few lines
of PIC software, a few op-amps and an analog switch.  The decoder circuit
can also be re-used as the core of an encoder circuit.  No special ICs
are required.  I'm sure there's an easy way to convert PCM to CVSD on a
PC but I haven't tried yet.
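
For readers who have not met CVSD, the following Python sketch models the idea in software: one bit per sample says "go up" or "go down", and the step size grows when several identical bits arrive in a row and shrinks otherwise. This is a digital stand-in for the op-amp integrator described above, not that circuit, and every constant in it (step limits, run length of 3, growth and decay factors, leak) is an untuned assumption.

```python
import math


class CvsdIntegrator:
    """Adaptive-step integrator shared by the encoder and the decoder."""

    def __init__(self, min_step=0.005, max_step=0.25, run=3, leak=0.995):
        # All four constants are illustrative guesses, not tuned values.
        self.min_step = min_step
        self.max_step = max_step
        self.run = run
        self.leak = leak
        self.step = min_step
        self.estimate = 0.0
        self.recent = []

    def update(self, bit):
        """Feed one bit and return the new reconstructed sample."""
        self.recent = (self.recent + [bit])[-self.run:]
        same = len(self.recent) == self.run and len(set(self.recent)) == 1
        if same:
            # A run of identical bits means the estimate is lagging.
            self.step = min(self.step * 1.5, self.max_step)
        else:
            self.step = max(self.step * 0.9, self.min_step)
        delta = self.step if bit else -self.step
        self.estimate = self.leak * self.estimate + delta
        return self.estimate


def cvsd_encode(samples):
    """Turn samples in [-1, 1] into a list of 0/1 bits."""
    core = CvsdIntegrator()
    bits = []
    for x in samples:
        bit = 1 if x > core.estimate else 0
        bits.append(bit)
        core.update(bit)  # the encoder tracks what the decoder will produce
    return bits


def cvsd_decode(bits):
    """Rebuild an approximation of the waveform from the bit stream."""
    core = CvsdIntegrator()
    return [core.update(b) for b in bits]


RATE = 32000  # one bit per sample at 32 kHz
tone = [0.5 * math.sin(2 * math.pi * 300 * n / RATE) for n in range(3200)]
bits = cvsd_encode(tone)
decoded = cvsd_decode(bits)
rms_error = math.sqrt(sum((a - b) ** 2 for a, b in zip(tone, decoded)) / len(tone))
```

Because the encoder contains a copy of the decoder, this also illustrates the remark that the decoder can be re-used as the core of an encoder. It could serve as a starting point for converting PCM to a CVSD bit stream on a PC.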




_____________________________________________________________________
You don't need to buy Internet access to use free Internet e-mail.
Get completely free e-mail from Juno at http://www.juno.com
Or call Juno at (800) 654-JUNO [654-5866]

1998\04\21@023341 by Werner Terreblanche

Steve

 Thanks for your reply.  Inadvertently, you also answered another question
that I had... namely "Is it possible to store WAV files directly in ROM and
play them back?"    I have never been able to confirm how a WAV file is
encoded.   Are you saying it is possible to store the WAV file directly in
ROM and play it back just like that through a D->A converter?

Rgds
Werner

--
Werner Terreblanche  <EraseMEwterrebspam_OUTspamTakeThisOuTplessey.co.za>Tel :  (021) 7102251 (office
hours)


1998\04\21@031119 by Werner Terreblanche

Thank you for your advice Mike Keitz, Sam Haile, Graham Daniel, Steve
Baldwin and Jason Wolfson.  I suppose it makes a lot of sense to investigate
the ISD chips a little bit more closely.  The things that I need to find out
about these chips are whether their memories are non-volatile and whether it is
easy to make duplicates of pre-recorded chips for production purposes.  The
ones I've seen so far do not meet either criterion.

I once saw a Texas Instruments chip which meets both these criteria.
It's a one-time programmable chip which allows you to store a number of
speech messages (128 seconds worth).  The disadvantage is that it's one-time
programmable, but the advantage is that it's completely non-volatile.
Decisions... decisions hey?  :)

Rgds
Werner

1998\04\21@041019 by shaile

The ISD chipcorder is an analog EPROM, "NOT DIGITAL"; you can record
a message again and again.

It is nonvolatile.

sam

1998\04\21@115730 by Paul Dartanian

Do you know a website for this "ISD chipcorder analog EPROM"?


-----Original Message-----
From: Haile, Sam <KILLspamshaileKILLspamspamESSEX.AC.UK>
To: RemoveMEPICLISTTakeThisOuTspamMITVMA.MIT.EDU <spamBeGonePICLISTspamBeGonespamMITVMA.MIT.EDU>
Date: Tuesday, April 21, 1998 12:07 PM
Subject: Re: Generating Digital Speech cheaply


>ISD chipcorder is an analog EPROM  "NOT DIGITAL"   you can record
>message again and again.
>
> it is nonvolatile.
>
>sam
>

1998\04\21@120928 by Mike Keitz

On Tue, 21 Apr 1998 08:22:03 +0200 Werner Terreblanche
<TakeThisOuTwterrebEraseMEspamspam_OUTPLESSEY.COM> writes:
>Steve
>
>  Thanks for your reply.  Inadvertantly, you also answered another
>question
>that I had... namely "Is it possible to store WAV files directly in
>ROM and
>play it back?"    I have never been able to confirm how a WAV file is
>encoded.

It's fairly widely documented on the net.  The format is rather flexible.
Each "chunk" of data has a header indicating the encoding method, number
of bits, number of channels, sample rate, etc.  Then the rest of the
"chunk" is just the data packed into as many bytes per sample as
needed.  If the encoding is just mono 8-bit PCM, then each sample is one
byte, so you could just remove the header and store the rest directly.

In most cases, the entire file is one "chunk" so the only header is at
the beginning.  There is also provision for non-audio "chunks" like the
title, author, copyright, note, etc.  You'd need to remove those, or be
sure not to store them in the first place.
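
As a sketch of "remove the header and store the rest", the Python below walks the chunks of a WAV image and returns the format fields plus the raw sample bytes. It assumes the common RIFF layout, in which the format description lives in a separate `fmt ` chunk ahead of the `data` chunk; the function name `extract_pcm` is made up for this example, and the tiny test file is generated in memory with Python's standard `wave` module.

```python
import io
import struct
import wave


def extract_pcm(wav_bytes: bytes):
    """Return (format fields, raw sample bytes) from a RIFF/WAVE image."""
    riff, _size, wave_id = struct.unpack_from("<4sI4s", wav_bytes, 0)
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    fmt = None
    data = None
    pos = 12
    while pos + 8 <= len(wav_bytes):
        chunk_id, chunk_len = struct.unpack_from("<4sI", wav_bytes, pos)
        body = wav_bytes[pos + 8 : pos + 8 + chunk_len]
        if chunk_id == b"fmt ":
            tag, channels, rate, _byte_rate, _align, bits = struct.unpack_from(
                "<HHIIHH", body, 0
            )
            fmt = {"format_tag": tag, "channels": channels,
                   "sample_rate": rate, "bits": bits}
        elif chunk_id == b"data":
            data = body
        # Any other chunk (title, copyright, ...) is simply skipped.
        # Chunks are padded to an even number of bytes.
        pos += 8 + chunk_len + (chunk_len & 1)
    if fmt is None or data is None:
        raise ValueError("missing fmt or data chunk")
    return fmt, data


# Build a tiny mono 8-bit, 8 kHz file in memory to try it on.
samples = bytes([128, 160, 200, 160, 128, 96, 56, 96])
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(1)
    w.setframerate(8000)
    w.writeframes(samples)

fmt, rom_image = extract_pcm(buf.getvalue())
```

A format tag of 1 marks plain PCM. The returned `rom_image` is what would go into the EPROM.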

>Are you saying it is possibly to store the WAV file
>directly in
>ROM and play it back just like that through a D->A converter?

The 8-bit formats are nonlinear PCM (u-law) which probably won't sound
right played back through a linear converter.  Other than that it would
work.


1998\04\21@120931 by Mike Keitz

On Tue, 21 Apr 1998 08:58:44 +0200 Werner Terreblanche
<RemoveMEwterrebspamTakeThisOuTPLESSEY.COM> writes:
>I suppose it makes a lot of sense to
>investigate
>the ISD chips a little bit more closely.  The things that I need to
>find out
>about these chips are if their memories are non-volatile

Yes they are; they are EEPROM.  They claim 40 to 100 years data retention
and it seems that the chips (at least the ones with the parallel
interface) are well-protected against unintentional writing by tying the
R/W pin high.

>and whether
>it is
>easy to make duplicates of pre-recorded chips for production
>purpouses.

It's not hard, but the normal recording mode must be used (meaning it
will take 20 seconds to duplicate a 20 second chip, etc.).  ISD offers a
"programmer" that makes duplicates or one could be easily built.  If you
need tens of thousands of duplicates the duplication may be a problem.
If that is your case, write or call ISD and ask them if they can supply
pre-programmed chips.  They may be willing to work with you if you are
dealing in large quantities.


>I once saw a Texas Instruments chip once which meets both these
>criteria.
>It's a one-time programmable chip which allows you to store a number
>of
>speech messages (128 seconds worth).  The disadvantage is that its one
>time
>programmable, but the advantage is that it's completely non-volatile.
>Decisions... decisions hey?  :)

I think the TI chips use an LPC algorithm to store a lot of speech in a
small memory, but with a very heavy "synthesizer" effect on the sound.
If this is the case, you'll need special software which TI may not give
you in order to encode your speech for programming.  But a large library
of common standard words should be available.

I was just looking at ISSI (http://www.issiusa.com).  They have some chips which
store ADPCM digitally in an OTP ROM.  The memory capacity isn't very large
(512 Kbit; samples are 4 bits, so that's 16 seconds at 8 kHz sampling) but I
think some of them had interfaces for external ROM.  I don't know if the
quality would be very good either since they seem to be targeted to very
cheap "squawking" devices using piezo speakers.  Everything about the
ISSI chips (how to encode, how to program) seems to be proprietary.



1998\04\21@131134 by Haile, Sam

yes

http://www.isd.com





1998\04\21@132829 by Mike Keitz

On Tue, 21 Apr 1998 17:45:51 +0300 Paul Dartanian <RemoveMEpauligspam_OUTspamKILLspamcyberia.net.lb>
writes:
>Do you know a website for this "ISD chipcorder  analog EPROM

http://www.isd.com

The complete name of the company is "Information Storage Devices".  Their
name for the process used in the chipcorders is "DAST" (Direct Analog
Storage Technology).  Searching on these terms as well as the obvious
ones like "ISD", "Chipcorder", etc. may turn up some related third-party
web sites.  But the ISD site has a lot of information including data sheet
PDFs, theories of operation, and even I think some application notes.



1998\04\21@175514 by Steve Baldwin

>   Thanks for your reply.  Inadvertantly, you also answered another question
> that I had... namely "Is it possible to store WAV files directly in ROM and
> play it back?"    I have never been able to confirm how a WAV file is
> encoded.   Are you saying it is possibly to store the WAV file directly in
> ROM and play it back just like that through a D->A converter?

Pretty much so, yes. There are a few variations of the standard WAV
file and the type is encoded in a header section. In my case, I was
able to stipulate that the sound would be 8-bit, 11 kHz so I didn't
need the header. When I got the file, I loaded it into the Windows
thing, saved it with those settings and then chopped off the header
in a binary editor. The header even tells you how much to chop off.

The header is pretty straightforward (sample rate, mono/stereo, 8/16
bit, file length, etc.) and the data is the straight binary value.
Read it out of ROM, write it to DAC.
Someone else has already posted a link for a file format page. There
is a very long document (on IBM's web site I think) that has the full
RIFF specification. Don't bother with it unless you want to use ADPCM
(which is a lot more work but less ROM).

BTW. The ISD devices store raw digitised values. Some speech
recording chips compress the sound in a way that really only works
with speech. If you are storing sound effects, they sound awful. The
ISD ones are OK for that.

Steve.



1998\04\21@181605 by Steve Baldwin

> The 8-bit formats are nonlinear PCM (u-law) which probably won't sound
> right played back through a linear converter.  Other than that it would
> work.

No they aren't. Well, I should say that's not entirely correct. The
simplest 8 bit format is straight 8 bit data. There is the option to
save as u-Law, A-Law, ADPCM, and a host of other formats, but what
Microsoft call PCM is straight out of the ADC.
Calling it PCM is a bit of a misnomer anyway. While it's in the file,
it's 8 bit (parallel) data.

From memory, the reason I didn't use the ISD device is that with the
larger parts you couldn't concatenate messages without a small pause
between them. I needed a long, continuous recording.

Steve.


1998\04\22@065205 by Tom Handley

  Graham (and Werner), ISD recently announced a 16 min version of their
4000 series chips. I'm not sure of the price.

  - Tom


1998\04\22@100343 by WF AUTOMACAO


The AD558 (Analog Devices) is a very good D/A for playing voice!

Miguel.

1998\04\23@073800 by mast
> still slightly too expensive for what I had in mind.   EPROM IC's are very
> inexpensive these days, and I thought that maybe if store my messages as raw
> digitized speech on a large EPROM and play it back through a resistor ladder
> network A/D and filter, I would be able to regenerate speech messages at a

It definitely works. A 27256 (32 KB) will give you 4 seconds of speech at an
8192 Hz sampling rate. Use a CMOS binary counter for the address and a
decoder (like a '138) to select a chip. Hook the data bus in parallel to
the resistor ladder and off you go. For playback, use 3 PIC pins to
select a sample and then use a fourth pin to output an 8192 Hz square wave
to advance the address counters. You probably want a fifth pin for
counter reset, but if you are not using all decoder outputs then you
may use one of these for reset. Of course, if you want a peripheral
for a host system then it's probably better to do without the PIC and just
use the parallel port to drive the thing; all you need instead of the PIC is a
sample rate clock.
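
The scheme above can be modelled in a few lines of Python. This is only a software picture of the hardware: a 3-bit select value picks one EPROM, a counter steps through its addresses, and each byte drives an ideal 8-bit ladder. The 5 V reference and the function names are assumptions for illustration. Note that a 27256 holds 32 KB, which at 8192 Hz works out to 4 seconds per chip.

```python
EPROM_BYTES = 32 * 1024      # one 27256: 256 Kbit organised as 32K x 8
SAMPLE_RATE_HZ = 8192        # clock fed to the address counter
V_REF = 5.0                  # assumed ladder reference voltage


def seconds_per_chip() -> float:
    """Playing time of one full EPROM at one byte per sample."""
    return EPROM_BYTES / SAMPLE_RATE_HZ


def ladder_output(code: int, bits: int = 8, v_ref: float = V_REF) -> float:
    """Ideal unloaded R-2R ladder: output is proportional to the code."""
    return v_ref * code / (1 << bits)


def play(chips, message: int):
    """Yield DAC voltages for one message.

    `chips` is a list of bytes objects, one per EPROM; `message` is the
    3-bit value the PIC would put on the decoder's select inputs.
    """
    if not 0 <= message < 8:
        raise ValueError("a '138 decodes only 8 chip selects")
    rom = chips[message]
    for address in range(len(rom)):   # the counter steps once per clock
        yield ladder_output(rom[address])


voltages = list(play([bytes([0, 128, 255])], 0))
```

A real ladder built from ordinary resistors will deviate from this ideal transfer function, which is the difficulty with more than about 6 bits mentioned earlier in the thread.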

> 2. What sampling rate would I need?  (I was thinking 8KHz @ 8 bits
> resolution) Is this good enough?

For speech, absolutely. Just make sure you filter out the sampling
noise.
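
As a rough feel for the filtering involved, this Python sketch computes the corner frequency of a single-pole RC low-pass and how much of an 8 kHz component it lets through. The component values (4.7 k and 10 nF) are hypothetical, chosen only to put the corner near the top of the telephone speech band.

```python
import math


def rc_cutoff_hz(r_ohms: float, c_farads: float) -> float:
    """-3 dB frequency of a single-pole RC low-pass filter."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)


def rc_gain(freq_hz: float, cutoff_hz: float) -> float:
    """Magnitude response of that filter at freq_hz (1.0 = no loss)."""
    return 1.0 / math.sqrt(1.0 + (freq_hz / cutoff_hz) ** 2)


R, C = 4700.0, 10e-9                  # hypothetical values
cutoff = rc_cutoff_hz(R, C)           # roughly 3.4 kHz
gain_at_fs = rc_gain(8000.0, cutoff)  # fraction of an 8 kHz component passed
```

With these values a single pole still passes well over a third of the 8 kHz component, so a steeper multi-pole filter may be worth considering.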

> 3. Has anybody perhaps done something like this already? Was the sound
> reasonably clear?

About the same as a phone. It actually sounded better, because you usually
use a better device for playback.


--- Read on for todays random tagline...

   God is real, unless declared integer.
