atomic_fungus (atomic_fungus) wrote,

#174: Language as a communications protocol

Reading Steven Den Beste's Chizumatic has gotten me to thinking about language. He makes the occasional foray into Japanese grammar and vocabulary, and he mused a bit about how Japanese works in songs.

Japanese is built on syllables, and each syllable is represented by a single character of their alphabet. There are some combination syllables which are represented by two characters, and some other characters have their pronunciation changed by the addition of voicing marks.

So, for example, if I wish to spell the name of my favorite voice actress, Megumi Hayashibara, it ends up being made of symbols which can be converted into roman letters thus:


The " mark is one of the voicing marks; ° is the other.

" turns K into G, SH into J, and H into B. ° turns H into P.

(I apologize for not having any Japanese font to do this properly...but at least you get the idea.)
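The two voicing rules are mechanical enough to write down as a lookup table. Here is a rough Python sketch using roman letters in place of kana. The function names and the token-by-token spelling of the name are my own illustration (hypothetical, not any official romanization scheme), and the table covers only the consonant changes mentioned above.

```python
# Consonant changes caused by each voicing mark (only those listed above).
VOICED = {"K": "G", "SH": "J", "H": "B"}   # the " mark
SEMI_VOICED = {"H": "P"}                    # the ° mark


def apply_mark(syllable, mark):
    """Return the romanized syllable after applying a voicing mark."""
    table = VOICED if mark == '"' else SEMI_VOICED
    # Try longer prefixes first so that SH is checked before shorter ones.
    for prefix in sorted(table, key=len, reverse=True):
        if syllable.startswith(prefix):
            return table[prefix] + syllable[len(prefix):]
    raise ValueError(f"{mark} cannot be applied to {syllable}")


def romanize(tokens):
    """Join syllable tokens, resolving any trailing voicing mark."""
    out = []
    for token in tokens:
        if token[-1] in ('"', "°"):
            out.append(apply_mark(token[:-1], token[-1]))
        else:
            out.append(token)
    return "".join(out)


# Hypothetical spelling of the name, one token per written character.
name = ["ME", 'KU"', "MI", "HA", "YA", "SHI", 'HA"', "RA"]
print(romanize(name))
```

The unmarked KU and HA tokens become GU and BA once the mark is applied, which is all the voicing marks do.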

If I want to make the word "Tokyo" it actually ends up something like this:


The compound character is the "KI" character with a small "YO" character next to it, and it forms the single-syllable sound KYO. (Saying "KI-YO" is incorrect.)

The Japanese learned to write from the Chinese. That's why the Japanese use Chinese characters (kanji) for the bulk of their writing. The phonetic alphabet hiragana was developed from the Chinese characters, as was the other phonetic alphabet, katakana.

An interesting cultural point: Katakana is more angular than hiragana, and is used primarily for words which have been borrowed from other languages ("loanwords"). In other words, the harsher-looking alphabet indicates a word with alien (foreign) origin.

Problems arise when foreigners try to learn Japanese because many words have double vowels, double consonants, or both. My favorite example comes from the Maison Ikkoku final movie, in which the translators made such a basic mistake even I caught it: in the subtitles, one young girl, who was being tutored by the main character for a time, refers to him as "Godai the devil".

The girl actually calls him "older brother Godai". The difference between the two translations is a single "i"--she said "Godai onii-san", but for some reason the translator heard "Godai oni-san".

"Onii-san" means "older brother"; "oni-san" means "Mr. Ogre". If you know what to listen for, the difference is actually fairly obvious.

As Den Beste points out, Japanese is a synchronous protocol: it requires a cadence, because the double vowels and double consonants won't show up in an asynchronous protocol.
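A toy way to see why the clock matters, sketched in Python. I'm assuming a crude model where each beat carries one sound and an unclocked listener only notices when the sound changes; the two function names are made up for this illustration, and real Japanese timing is more subtle than one letter per beat.

```python
from itertools import groupby


def clocked_receive(sounds):
    """Sample once per beat: a held sound shows up as a repeat."""
    return list(sounds)


def unclocked_receive(sounds):
    """Notice only changes: a run of identical sounds collapses to one."""
    return [sound for sound, _run in groupby(sounds)]


onii_san = ["o", "n", "i", "i", "s", "a", "n"]   # "older brother"
oni_san = ["o", "n", "i", "s", "a", "n"]         # "Mr. Ogre"

# With a clock the two words differ; without one they are identical.
print(clocked_receive(onii_san) == clocked_receive(oni_san))
print(unclocked_receive(onii_san) == unclocked_receive(oni_san))
```

Without the cadence, the doubled "i" is just a slightly longer "i", and the listener has no way to count it.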

Thinking about it in that fashion--as a clocked information transmission protocol--got me wondering: what is the actual bit rate of language?

A quick check of Google reveals that the data rate for English is about 1.5 bits per letter. That's not bad. Using a simple UART chip (Universal Asynchronous Receiver-Transmitter) I can transmit the word APPLE using 54-84 bits. I'm transmitting 9 bits of actual information, but a UART functions much faster than the human brain's language generating and recognizing systems do.
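The totals depend on how the UART is set up, so here is a rough sketch of the arithmetic rather than a definitive count. I'm assuming ordinary framing: one start bit, 7 or 8 data bits, an optional parity bit, and 1 or 2 stop bits per character. The function names are mine. Counting a trailing terminator as a sixth character, the leanest frame gives 54 bits on the wire and 9 bits of information at 1.5 bits per letter; for the bare five letters both numbers are smaller.

```python
def uart_bits(text, data_bits=8, parity=False, stop_bits=1):
    """Total bits on the wire: every character is sent as one frame."""
    frame = 1 + data_bits + (1 if parity else 0) + stop_bits  # 1 start bit
    return len(text) * frame


def information_bits(text, bits_per_letter=1.5):
    """Estimated information content at a fixed rate per letter."""
    return len(text) * bits_per_letter


word = "APPLE"
print(uart_bits(word, data_bits=7))                            # leanest frame
print(uart_bits(word, data_bits=8, parity=True, stop_bits=2))  # fattest frame
print(information_bits(word))
```

Either way, most of what goes down the wire is framing and redundancy, not information.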

But what information am I actually transmitting? If I say "APPLE" you have a definite idea in mind of what an apple is.

What I have done, in actuality, is transmitted a token. We agreed in advance what "apple" meant, and now by transmitting 9 bits I can convey a concept which is actually much more complex.

English is asynchronous, in that I can generally say "apple" at a variety of speeds, inflections, and pronunciations, and you will still understand what I am saying. This is why a Texan and a Yankee both understand "apple". This is why a Japanese man who knows English can understand someone from France who speaks English when he says "apple" even though their pronunciations are wildly different, and their understanding of the language is non-native.

All languages have a significant amount of redundancy in them. You can take a book written in English, cut off the last word on each line of a page, and still understand it, because of the redundancy.

If I introduce "noise" into a sample sentence, you can still understand it: "I a goin to ea an app." can rapidly be reconstructed from the redundant data to yield "I am going to eat an apple."

The mind takes these approximate steps:

"I a goin to ea an app."--message received, errors found.

"I aX goinX to eaX an appXX."--missing bits inserted with random hash.

"I am going to eat an apple."--reconstruction complete.

The mind knows that the phrase "I a" does not exist in English, so it adds a "space" for a letter and then pattern-matches something that fits. "I am" fits. The rest of the errors are similarly corrected.

It's normally done "on the fly" rather than with the entire sentence, but there is a kind of programming stack which stores all the elements of the sentence. A more complex sentence will require more internal reference to reconstruct.
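Those steps can be imitated, very crudely, in a few lines of Python. This is only a sketch under loud assumptions: the "mind" here knows a two-sentence corpus I made up, it treats every damaged word as a truncated prefix, and it picks the shortest known word that is allowed to follow the previous one. The names `build_model` and `reconstruct` are hypothetical.

```python
# A tiny made-up corpus standing in for everything the listener knows.
CORPUS = ["i am going to eat an apple", "i ate a pear"]


def build_model(corpus):
    """Collect known words and known adjacent word pairs."""
    vocab, pairs = set(), set()
    for sentence in corpus:
        words = sentence.split()
        vocab.update(words)
        pairs.update(zip(words, words[1:]))
    return vocab, pairs


def reconstruct(received, vocab, pairs):
    """Expand each truncated word, working left to right."""
    out = []
    for token in received.lower().rstrip(".").split():
        # Known words that begin with what was heard, shortest first.
        candidates = sorted(
            (w for w in vocab if w.startswith(token)),
            key=lambda w: (len(w), w),
        )
        prev = out[-1] if out else None
        # Keep only candidates that may follow the previous word.
        fitting = [w for w in candidates if prev is None or (prev, w) in pairs]
        if fitting:
            out.append(fitting[0])
        elif candidates:
            out.append(candidates[0])
        else:
            out.append(token)  # nothing matched; pass it through
    return " ".join(out)


vocab, pairs = build_model(CORPUS)
print(reconstruct("I a goin to ea an app.", vocab, pairs))
```

Note that "a" is a perfectly good word on its own; it is only the pairing "I a" that gets rejected, which is what pushes the match on to "am".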

This is why you can understand what Homestar Runner is saying when he says, "Hey, cwapfathe! Why don't you blow it out youw eaww?" His speech impediment introduces "noise" into the signal, but the redundancy is massive enough that your brain corrects the errors as it detects them.

Every once in a while the phenomenon is remarked on in chain e-mails. There is a demonstration of this effect with an utterly mangled sentence which is nonetheless perfectly understandable.

Here is a classic:


The first time you see it, the redundant "the" doesn't even register consciously; once it's pointed out, the redundancy becomes obvious. The brain detects the redundant information and ignores it, entirely automatically.

All natural languages have error-correction built into them, from the elimination of redundant information to the reconstruction of lost data. The end result is a robust communication system which requires a very narrow channel to convey information. This is good, because the human voice is not exactly "broadband"; the telephone specification in the US is for a 4kHz channel.

4 kHz is not all that much bandwidth. It's roughly 4,000 bits per second using the simplest modulation; electronics can squeeze about 54 kilobits per second out of a typical American telephone line, using complex modulation and coding schemes. The human voice is not nearly as efficient; but even if it were, it wouldn't matter, because the human ear is not designed to pick apart the amplitude and phase tricks that modems rely on.
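The usual yardstick for what a noisy channel can carry is the Shannon-Hartley formula, C = B log2(1 + S/N). Here is a quick sketch of it. The 40 dB signal-to-noise ratio is purely my assumption of what a good phone line might manage, not a measured figure, and the function name is made up.

```python
import math


def shannon_capacity(bandwidth_hz, snr_db):
    """Upper bound on error-free bits per second for a noisy channel."""
    snr_linear = 10 ** (snr_db / 10)  # convert decibels to a power ratio
    return bandwidth_hz * math.log2(1 + snr_linear)


# 4 kHz channel, assumed 40 dB signal-to-noise ratio.
print(round(shannon_capacity(4000, 40)))
```

Under that assumption the ceiling comes out a little over 53 kilobits per second, so the modem people were already working close to the limit.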

The human voice is a simple vibrating assembly with tone variability; all the phonemes we use come from the shape of the mouth, lips, throat, the position of the tongue, and so on. (This is why you can remove a cancerous larynx and give the poor guy a humming rod to jam against his throat, so he can still talk.) The basic sound is a hum, which cannot be further modulated without the rest of the works--and all the associated machinery does not change states that quickly. So while the ear can distinguish sounds between 20 Hz and 20,000 Hz (or a bit more) the actual phone connection need not be wider than about 4 kHz, centered somewhere in the lower end of the audio spectrum, between about 200 Hz and 3.4 kHz.

This means that a single fiber-optic cable--which has a bandwidth measured in gigabits per second--can carry literally millions of voice channels. And none of the users of that single line will ever misunderstand so much as a single syllable, because 4 kHz is wide enough to carry a conversation.

(In practice, fiber-optic cables actually carry less than "millions" of voice channels. Fiber is valuable stuff and it's used to carry all sorts of stuff, not just phone calls. There are also losses to transmission overhead and such. But fiber is still a big data pipe.)
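For a back-of-the-envelope count, a digitized phone channel is conventionally 64 kilobits per second: a 4 kHz channel sampled 8,000 times a second at 8 bits per sample. The sketch below divides a link speed by that figure. The two link speeds are hypothetical round numbers I picked for illustration, and the count ignores all framing and signaling overhead.

```python
SAMPLE_RATE_HZ = 8000    # twice the 4 kHz channel width
BITS_PER_SAMPLE = 8
VOICE_CHANNEL_BPS = SAMPLE_RATE_HZ * BITS_PER_SAMPLE  # 64,000 bits per second


def voice_channels(link_bps):
    """Whole voice channels that fit in a link, ignoring overhead."""
    return link_bps // VOICE_CHANNEL_BPS


# Hypothetical link speeds, chosen only as round numbers.
print(voice_channels(10_000_000_000))      # 10 gigabits per second
print(voice_channels(1_000_000_000_000))   # 1 terabit per second
```

At 10 gigabits per second that is a bit over 150,000 calls; it takes a link in the terabit range before "millions" is meant literally.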

Speech, therefore, is a relatively low-speed serial protocol; but its information density is high enough that the low transmission rate is not much of a problem. The fact that all symbols are pre-existing helps considerably; and the only time that data rate becomes a serious issue is when a symbol must be defined because one unit has never encountered it before.

It's also a half-duplex protocol; if you and I both talk at the same time, we won't understand each other; and if we both listen at the same time, no information is transmitted.

Writing is a bit different. Someone who is a proficient reader can scan a page of text fairly quickly, much faster than he could read it aloud. Although the symbols are read serially, they are interpreted quickly because the input system--the eye--is massively parallel. A proficient reader does not read individual letters, but sees words in their entirety--frequently he will read just enough letters to pattern-match the sequence against those in his memory, after which he will move on to the next word. The faster one reads, the larger the symbol groups he is interpreting with each "cycle".

This also speaks to my simple example of error correction, above. In "I a goin to ea an app" every damaged word has lost letters only from its end; none has lost letters from the middle. A proficient reader will error-correct that sentence quickly--and might not even notice the truncation at first--because the opening letters he pattern-matches on are all still there.

But "I m oin to et an apl" is also understandable. The error-correction process may take a bit longer, but the gist is still there and can be reconstructed.
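When letters go missing from anywhere in a word, matching on the opening letters no longer works, but a looser test still does: the surviving letters must appear, in order, somewhere inside the intended word. A minimal sketch, assuming a seven-word vocabulary I made up and a "shortest match wins" rule that is my own simplification:

```python
# Made-up vocabulary standing in for the reader's memory.
VOCAB = ["i", "am", "going", "to", "eat", "an", "apple"]


def is_subsequence(fragment, word):
    """True if the fragment's letters occur in order within the word."""
    letters = iter(word)
    return all(ch in letters for ch in fragment)


def repair(received, vocab=VOCAB):
    """Replace each fragment with the shortest word that contains it."""
    out = []
    for token in received.lower().split():
        matches = sorted(
            (w for w in vocab if is_subsequence(token, w)),
            key=lambda w: (len(w), w),
        )
        out.append(matches[0] if matches else token)
    return " ".join(out)


print(repair("I m oin to et an apl"))
```

There are more candidate words to sift through this way than with simple truncation, which fits the observation that this kind of damage takes a bit longer to repair.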

Written language allows one to both read and speak at the same time because it uses different systems for input and output. A person must listen to his own speech in order to maintain proper control, which is why he cannot also listen to the speech of another while he is speaking; his "audio system" is in use. But the visual system is doing nothing during speech; the data can be read visually and then spoken. And because the text output system is independent of eyes and ears, people can learn how to type from an original, take dictation, or transcribe conversations in real time.

A person can redact written data on the fly because his visual input is so much faster than his audio output. He can decide, "I am not going to read that next sentence because it's dirty!" while he is finishing the prior sentence.

So then I have to wonder: what about an artificial language, designed for humans, which maximizes the data rate? Could it have less redundancy than English and transmit more data? Or could it transmit the same amount but at a higher speed?

In his novella Gulf, Robert Heinlein described such a language, called Speedtalk, which is beyond the ability of most people. Speedtalk was designed for homo novis, new man, the race of geniuses which (the story posits) is evolving from homo sapiens.

It should be possible to design a language similar to Speedtalk but which is a bit less difficult to master; even a twofold improvement over standard English would be remarkable in its efficiency. Imagine how much more free time we'd all have if we didn't have conversations like this:

"What day is it? Tuesday?"
"No, it's Wednesday."
"Wednesday? I thought it was Tuesday."
"No, it's Wednesday."
"So it's the 24th, right?"
"No, Tuesday was the 24th. Wednesday is the 25th."
"Are you sure?"
"I'm looking at a calendar right now. Today is Wednesday, October 24th. Damn. 25th."
"Don't we have a big production meeting scheduled for this afternoon?"
"The boss canceled it."
"When did he cancel it?"
"No, last week. The 17th."

Of course we all have the option of just clubbing stupid people with baseball bats, too, but unfortunately I think that would get a lot of us arrested. I'm out of smart words. More later.
