How would you organize a library or index in a language without an ordered alphabet, such as Chinese?





One way that some librarians or bookstores organise books “alphabetically” is by using the Chinese *pinyin* system.

Basically, pinyin is how you would “spell” a Chinese word using the alphabet based on the word’s pronunciation.

For example, the characters for Beijing are 北京, and the pinyin is indeed “bei jing”.

Using this modern method of organisation, 北京 (Beijing) would be in the “B” section based on its pinyin, as the pinyin is also organised A to Z.

**Edit:** A little further detail. Each Chinese character represents one “word” of pinyin, e.g. the above example 北京 is 北 (bei) and 京 (jing) together. Most pinyin “words” are made of two, three or four letters which represent the way you would pronounce them in spoken Mandarin.

When I’m booking a train ticket here in China, the city names are listed from A to Z according to the pinyin of the characters. So if I wanted to go to Shanghai (上海,shang + hai), it would be under “S”. My own city Xiamen (厦门,xia + men) would be under “X”.
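As a rough sketch of the pinyin approach described above, the snippet below sorts a few city names by their pinyin and files them under their first letter. The `PINYIN` lookup table and both function names are hand-written assumptions for illustration; a real application would need a full character-to-pinyin dictionary.

```python
from itertools import groupby

# Hypothetical hand-written lookup table: city name -> pinyin (tones omitted).
PINYIN = {
    "上海": "shanghai",
    "北京": "beijing",
    "厦门": "xiamen",
}

def sort_by_pinyin(names):
    """Return the names ordered A to Z by their pinyin spelling."""
    return sorted(names, key=lambda name: PINYIN[name])

def group_by_initial(names):
    """Map each initial letter (uppercase) to the names filed under it."""
    ordered = sort_by_pinyin(names)
    return {
        letter: list(group)
        for letter, group in groupby(ordered, key=lambda n: PINYIN[n][0].upper())
    }

sections = group_by_initial(["上海", "北京", "厦门"])
print(sections)  # {'B': ['北京'], 'S': ['上海'], 'X': ['厦门']}
```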

This is not the only way of organising Chinese characters. Another way is to order the characters by the number of brush strokes it takes to write them, e.g. the character 上 (shang; up) requires three strokes and would be somewhere near the top of the list.
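A minimal sketch of stroke-count ordering might look like the following. The `STROKES` table is entered by hand for just four characters, and breaking ties by code point is my own assumption, not something dictionaries necessarily do.

```python
# Stroke counts for a handful of characters, entered by hand for illustration.
STROKES = {"上": 3, "水": 4, "北": 5, "京": 8}

def sort_by_strokes(chars):
    """Order characters by stroke count; ties fall back to code point order
    (an arbitrary tie-break chosen for this sketch)."""
    return sorted(chars, key=lambda ch: (STROKES[ch], ord(ch)))

print(sort_by_strokes(["京", "水", "上", "北"]))  # ['上', '水', '北', '京']
```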

Likewise you can organise the characters by the “components” within them, called radicals. For example, the character 板 (ban; wood or plank) contains a radical called 木 (mu; wood or tree) on the left side. All characters containing that radical on the left side of the character may be categorised together.
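To illustrate the idea of filing characters under a shared radical, here is a small sketch. The `RADICAL` table is a hypothetical hand-written lookup covering five characters; real dictionaries rely on a complete radical index.

```python
from collections import defaultdict

# Hypothetical lookup table: character -> the radical it is filed under.
RADICAL = {"板": "木", "林": "木", "村": "木", "打": "手", "操": "手"}

def group_by_radical(chars):
    """Collect characters into lists keyed by their radical, keeping input order."""
    groups = defaultdict(list)
    for ch in chars:
        groups[RADICAL[ch]].append(ch)
    return dict(groups)

print(group_by_radical(["板", "打", "林", "操"]))
```

Within each group, entries could then be ordered by stroke count or by the remaining components.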

There are other methods too. The above two that I just mentioned are a little older and more complicated, but still fairly commonly used. The pinyin method is new but is gaining ground as the most popular method of organisation on computers, mobile apps and book stores because of its simplicity and the young generation’s stronger familiarity with the pinyin system.

Pinyin is a product of westernization.

First you need to understand what a Chinese word is. It can range from a single character, like 水 (water), to several characters, like 大使馆 (embassy), which uses three.

To actually find a word in a dictionary, you need to understand that a character is made up of components.

操 is made up of three parts: the 手 left-side radical, then the 品 top radical, and lastly the 木 radical at the bottom.

You would then organise all the characters with the same left-side radical together, then further organise them by the second (top) radical, and lastly by the bottom radical.

Here is where it gets funky. There are basic words in the Chinese vocabulary that are considered the “fundamentals”. Those are categorized by number of strokes, and are basically our ABCs, except they actually mean something.

So in a typical dictionary, the basics come first, ALWAYS followed by the left-hand radicals, which are then subdivided by the right-hand component. Once all left-hand radicals are logged, you move on to characters with no left-hand radical but only a top radical, and so on. And unlike English, where you can read a word out, you can gain some meaning from recognizing radicals, but it is nearly impossible to know the pronunciation and actual meaning of a character from them.

Isn’t this the purpose of the Dewey Decimal system?

My man Dewey came up with a whole system for this.

It’s called the Dewey Decimal system.

It organizes books by Subject with numbers.

It’s pretty cool.

Go Dewey

I studied Japanese in Japan for a few months, so I don’t really understand the details. But the way things like dictionaries were organized was by stroke order and radicals of the Kanji. Like I said, I don’t really understand it but the idea is that you order entries based on the way the symbols are written. In book stores, for example, authors are just ordered by their romanized names because that’s obviously a lot easier.

I also studied Akkadian for a few semesters, which is written in cuneiform. Most Akkadian dictionaries are simply transcribed and ordered alphabetically, with some extra entries for letters like š (after s). The reference works that are actually in cuneiform use a somewhat chaotic and arbitrary method that lists all symbols that start with one horizontal wedge, then those with two horizontal wedges, etc., then the same for slanted wedges, then angle wedges, then vertical wedges. I never used any cuneiform dictionary myself, so I’m not sure how convenient it is. Cuneiform is a fairly complicated writing system in general though. Also, keep in mind that this is how **modern** reference works are ordered, and even that’s not universal. There are ancient lists and dictionaries written in cuneiform, but I actually don’t know how those are ordered. I’m guessing there was a similar system, though, because assigning a fixed, random order to all of the hundreds of symbols would just be impractical.

Traditionally it is done by the number of lines in the first sign, or by the first “sub-sign”.

In Chinese: according to the order of characters in the Thousand Character Classic.

In Japanese: According to the order of characters in the famous poem Iroha.

In modern day, although often considered bad practice (depending on the context), you’ll sometimes see things on computers ordered by Unicode code point. Unicode is the standard that assigns a number (“code point”) to every character of every language in the world (well, every language they’ve gotten to so far; they update the standard periodically, and for most people the most obvious effect of this is having new emojis to play with).

This is useful for having an unambiguous, consistent ordering (well, *mostly* unambiguous. Languages are messy, but I won’t go into the exceptions). But it isn’t really an intuitive ordering system. Even in the simple case of English, all capitalized words come before lower case words: Aaron comes before Zimbabwe as expected, but both come before aardvark, and words starting with accented characters like Ångström come after even zebra. It gets much, much messier with languages like Chinese. So it’s not really useful to an average person trying to find out whether a particular word is in a list. Programmers order things this way just because it’s easy to program and extends naturally to mixed-language lists.
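The code point behaviour is easy to reproduce. This short sketch relies on the fact that Python's built-in `sorted` compares strings code point by code point; the word list is just the examples from above, with Å written as the single precomposed character U+00C5.

```python
# The accented letters are written as escapes so the exact code points are clear.
words = ["aardvark", "\u00c5ngstr\u00f6m", "zebra", "Zimbabwe", "Aaron"]

# Python compares strings by code point, so plain sorted() shows the effect.
by_code_point = sorted(words)
print(by_code_point)
# ['Aaron', 'Zimbabwe', 'aardvark', 'zebra', 'Ångström']

# The first letters' code points explain the order: A < Z < a < z < Å.
print([hex(ord(w[0])) for w in by_code_point])
# ['0x41', '0x5a', '0x61', '0x7a', '0xc5']
```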

But speaking of Unicode, they do have standards on the “right” way of how you’re *supposed* to order things. The proper term for this is “collation,” and they have a technical report on the topic.
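To give a feel for what collation tries to achieve, here is a deliberately crude approximation: it strips accents and ignores case before comparing. This is not the Unicode collation standard itself, only a simplified stand-in, and `rough_collation_key` is a name made up for this sketch.

```python
import unicodedata

def rough_collation_key(word):
    """Crude sort key: drop combining accents and ignore case.
    The original word is kept as a tie-breaker for a stable, total order."""
    decomposed = unicodedata.normalize("NFD", word)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base.casefold(), word)

words = ["aardvark", "\u00c5ngstr\u00f6m", "zebra", "Zimbabwe", "Aaron"]
collated = sorted(words, key=rough_collation_key)
print(collated)
# ['aardvark', 'Aaron', 'Ångström', 'zebra', 'Zimbabwe']
```

Real collation is language-dependent (Swedish, for instance, sorts Å after Z), which is one reason a proper implementation needs locale-specific rules rather than a single key function like this.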

You organize by subject, and then by things like starting word, author’s name, etc. Also, Chinese does have an alphabet in addition to its pictographic characters.