Vocabulary conundrum

William T. Branch ("William T. Branch" <bill@...>) on April 12, 2006

Hello Glosa-pe

Below is a repeat of text I put at the bottom of a message to Robin on this board. However, I realized after posting it that it probably should have been addressed to the general board as its own subject, since many Glosa-pe may not have bothered to scan to the bottom of the previous message.

It is regarding the ideal size of a core vocabulary and why an IAL should have an extended vocabulary as well.

I suspect there is a good dividing line as well. I devised a thought experiment to address this after reading Kevin’s site on Glo.

Using a small vocabulary can be burdensome for the author, but as Kevin stated, that’s OK because most IAL users read more than they write. However, at some point as you shrink the lexicon, it also becomes burdensome to read, because trying to understand common ideas conveyed in the language becomes a constant game of twenty questions.

Where this magic point lies should be a subject of research for any interested IAL developer. It may be as low as Kevin suggests, at 500. Maybe less, maybe more.

Kevin mentions how you would address a situation where you’re talking about an elephant. You would simply say “very large grey animal with long tubular projection from face”. (These are my words, as I don’t have his website open.) If you’re going to talk about elephants often, you would say, once the first description has conveyed what you’re talking about, “I will from here on refer to these as tube-faces”.

This is where my thought experiment starts. Let’s imagine that a perfectly wonderful language is developed that can really be learned in a matter of a week or so because the vocabulary is at 500. Anything written in it can be understood by anyone who knows these 500 words. After much testing during the design of the language, 500 was the magic number where the twenty-questions game wasn’t overtaxing to the reader.

Every work written would average some number of grey elephant situations per page, whether a fraction or a number greater than one. This happens anyway with works in English that push into new territory. (Notice my use of the phrase “grey elephant situation” as an example.) Because a lot of these concepts that must be defined up front are relatively common, they will also show up in several other works. All works that mention elephants will face a similar up-front description and a simple reference word for subsequent use.

Authors may be tempted, upon seeing several previous works regarding elephants, just to use one of the references without a definition. But this would be using a word not in the vocabulary. For the IAL to keep its integrity, authors MUST always pre-define all references in their own work, because they can never assume the reader has read anything else.

No real problem so far. This happens in all natural languages anyway. In any work dealing with concepts unlikely to be in the reader’s lexicon, the author should be aware enough to define the words that go with those concepts.

Since several different authors would need to define elephants in various works, it is likely, even inevitable, that a different compound-like word would be developed for each work, such as “tube-nose”, “floppy-ear-giant”, “thunder-snout”, etc. The author and the reader both know that these words go out of scope at the end of the work.

I imagine a novel written in a language of 500 words might end up with a vocabulary of compound references exceeding the language’s own vocabulary. This does represent a tax on the reader’s memory. A good author would have ways to minimize this for long works with many references.

One way is to space definitions so they don’t clump together too tightly. Another would be to keep using the long description for a while with the corresponding word until the author feels sure the reader will


Regardless of the techniques the author uses to minimize memory strain, there will always be some.


A logical step for authors then, in the search for minimizing strain, is for there to be a standard word list for concepts that regularly pop up. This does not remove the need to pre-define all words in every work, however, because of the possibility that a reader, especially a new one, will not be familiar with this de facto word list or with previous works using the word. It does, however, make the memory strain on regular readers approach zero, rather than stay at a constant level of acceptable difficulty for everybody regardless of how experienced they are.

What this thought experiment shows is that an IAL with a small vocabulary - such as Glosa, but especially Glo and Tavo - SHOULD have both a core and an extended vocabulary. The core should be all that any reader has to memorize up front to read any text written in the language meant for auxiliary purposes. Obviously, those writing for themselves or for other writers may use the whole language without pre-defining the extended words.

This is why I think the decision in Glosa to have a core and extended vocabulary was a good insight. The real questions are: what is the ideal size of the core, and what words should go in it?

We know the core must be as small as possible to get people fluently reading the language ASAP, while not being so small that people are constantly solving word riddles to figure out what the author is talking about.


I, like Robin, don’t mind synonyms. There are a couple of definitions for Rabbit in Glosa, as well as for several others. I think many creative minds would not be attracted to a language without a large lexicon for expressiveness. I wouldn’t mind if Glosa had 80,000 words as long as the core was minimal.

I hope my argument shows that it does not have to be a choice between IAL and expressiveness. It’s really about writing in the “IAL style”. That is: use a small core that everyone must know, and define within the work all words that are used from outside the core.
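
For anyone who likes to see such a rule mechanically, here is a minimal sketch of the "IAL style" check in Python. Everything in it is an assumption made for illustration: the function name `undefined_words`, the way a work is represented as a list of `("define", ...)` and `("use", ...)` steps, and the tiny stand-in `CORE` set (a real core would be around 500 words). It only shows the idea that coined words are local to one work and must be defined there before use.

```python
def undefined_words(work, core):
    """Return words used in `work` that are neither core nor defined earlier in it."""
    local = set()      # coinages valid only inside this work; they "go out of scope" on return
    violations = []
    for kind, word, description in work:
        if kind == "define":
            # The description itself may only use words the reader already knows.
            for w in description:
                if w not in core and w not in local and w not in violations:
                    violations.append(w)
            local.add(word)
        elif word not in core and word not in local and word not in violations:
            violations.append(word)
    return violations


# Hypothetical miniature core vocabulary, for illustration only.
CORE = {"very", "large", "grey", "animal", "with", "long",
        "tube", "face", "the", "eat", "plant"}

work = [
    ("define", "tube-face",
     ["very", "large", "grey", "animal", "with", "long", "tube", "face"]),
    ("use", "the", None),
    ("use", "tube-face", None),
    ("use", "eat", None),
    ("use", "plant", None),
    ("use", "thunder-snout", None),  # coined in some other work, never defined here
]

print(undefined_words(work, CORE))  # ['thunder-snout']
```

Because `local` is created afresh on every call, a word such as "tube-face" defined in one work is unknown in the next, which is exactly why a borrowed "thunder-snout" gets flagged. A shared extended vocabulary would not change this check; it would only make the flagged words familiar to regular readers.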


