Usenet IPA/ASCII transcription

In August of 1992, some of the readers of the Usenet newsgroups sci.lang and alt.usage.english got fed up with the common threads in which posters tried to describe how words were pronounced (by them or in dialects under discussion) by reference to how other words were pronounced (by the author). Since different people pronounce the same words differently, this tended to lead to long, fruitless threads that were only occasionally interesting.

There already was a scheme occasionally used for noting pronunciation, but it suffered from (among other things) the fact that it was highly skewed toward describing English. This made it less than useful for the denizens of sci.lang.

Since there already existed a notation (the International Phonetic Alphabet, or IPA) for precisely specifying phonemic and phonetic values, several of us decided that it couldn't be too hard to put together a reasonable transcription scheme of IPA into 7-bit ASCII characters. We naturally had to allow some of the IPA symbols to map onto multiple characters (since there are more IPA symbols than ASCII characters), but we finally settled on a scheme in which each segment is represented by a single character, potentially followed by some number of "diacritics", which can either be single characters or delimited tokens. [We also came up with a very narrow feature-based representation for use when precision is needed or when no symbol completely fits the bill.] Unlike some other such attempts, we took it as a given that this transcription had to be directly readable, so each character needed to be at least somewhat evocative of its IPA value.
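As a rough illustration of the "single character plus diacritics" structure described above, here is a minimal Python sketch that splits a transcription string into segments. The particular diacritic characters (":", "~", "-") and the angle-bracket delimiters for token diacritics are assumptions chosen for this example, and the function name split_segments is hypothetical; the specification in this archive is the authority on the actual symbols.

```python
# Illustrative sketch only. The diacritic characters and the <...> delimiters
# are assumptions for this example, not taken from the text above.
SINGLE_CHAR_DIACRITICS = set(":~-")   # assumed single-character diacritics
TOKEN_OPEN, TOKEN_CLOSE = "<", ">"    # assumed delimiters for token diacritics


def split_segments(text):
    """Split a transcription into (base, diacritics) pairs.

    Each segment is one base character followed by zero or more diacritics,
    where a diacritic is either a single character or a delimited token.
    """
    segments = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in SINGLE_CHAR_DIACRITICS and segments:
            # Single-character diacritic: attach to the previous segment.
            segments[-1][1].append(ch)
            i += 1
        elif ch == TOKEN_OPEN and segments:
            # Delimited diacritic: consume through the closing delimiter.
            end = text.find(TOKEN_CLOSE, i)
            if end == -1:
                raise ValueError(f"unterminated diacritic token at index {i}")
            segments[-1][1].append(text[i:end + 1])
            i = end + 1
        else:
            # Anything else (including a stray leading diacritic) starts
            # a new segment.
            segments.append((ch, []))
            i += 1
    return [(base, tuple(marks)) for base, marks in segments]


print(split_segments("t<h>i:"))
```

Under these assumptions, the string "t<h>i:" splits into two segments: "t" carrying the token diacritic "<h>", and "i" carrying the single-character diacritic ":".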

It is expected that when the Unicode/ISO 10646 character set becomes commonly used for mail, news, and web pages, this transcription will no longer be needed, since the IPA characters can then be used directly.

Included in this archive are the specification itself and the "Pronunciation Symbols" page of Merriam-Webster's New Collegiate Dictionary, done over in this transcription. The latter should be of use for American English speakers who are not used to the IPA symbols.

In the future I hope to add a version of the specification which includes images of the actual IPA characters as well as sound clips of each of the segments.




Evan Kirshenbaum <kirshenbaum@hpl.hp.com>