This paper is the result of two analyses conducted concurrently. On the one hand, we have studied the penetration of English into modern Japanese. On the other hand, we have looked for what remains of proto-Indo-European words in the Vietnamese language. Together, the two studies provide two snapshots of the history of linguistic relations and exchanges between Western Europe and Far Eastern Asia, taken at periods separated by thousands of years.
The practical results are:
- A thesaurus of thousands of English words that have been imported into Japanese and transcribed in Katakana, the Japanese syllabary dedicated to foreign words. It should help native (or non-native) English speakers who are learning Japanese.
- A thesaurus of hundreds of Proto-Indo-European radicals, called SKIE Radicals, in the Vietnamese language.
SK stands for Sanskrit and IE for Indo-European.
The main characteristic of Vietnamese radicals is the way they are used to form other words. As the thesaurus already had 250 radicals by the end of October 2010, thousands of words could be derived from its entries, which will help:
- Westerners who learn Vietnamese;
- The large Vietnamese diaspora, and especially its younger generations, by offering them easy mnemonics, even if they practice Vietglish (Vietnamese-English) or Friet (French-Vietnamese);
- And Vietnamese speakers learning a European language.
The thesauri resulting from this paper are at the following URL:
1 Short introduction to the European Influences in the Japanese & Vietnamese Languages
The Japanese and Vietnamese languages are very different. The only connections between them are the Chinese radicals, derived from ideograms, that both have adopted. Our study refers to periods of their respective histories separated by thousands of years. Yet our two dissertations echo each other.
The first part of this paper shows how Japanese borrows words from American English. This import process has been accelerating steadily over the last twenty-five years. Today, thousands of English words have been transcribed into Katakana, a script reserved for foreign words. Many domains are concerned, covering almost all aspects of modern life. This is why we consider the Katakana system a “de facto” codex, which is neither a Creole English nor a new Japlish slang. We continue to work on a Katakana Thesaurus, which should greatly help English speakers learning Japanese.
The second study takes us back thousands of years, into the ancient history of Vietnam. This country is part of the Indochinese peninsula; its Austro-Asiatic language has direct links with the Mon-Khmer group of languages. Our research shows that proto-Indo-European radicals exist in modern Vietnamese. We call them “Vietnamese SKIE roots”, SK standing for Sanskrit and IE for Indo-European. Likewise, a Thesaurus of Vietnamese SKIE Radicals is under construction.
2. Katakana as a "de facto" codex
2.1 Basic notions of the Japanese writing system
Modern Japanese uses four writing systems simultaneously:
-The Kanji system
The name means “the letters (‘ji’) of the Han (‘Kan’)”; Han is the name of the majority ethnic group of China. This system, adopted in the 3rd century AD, borrowed ideograms from Chinese on the basis of "one ideogram, one concept", with typically at least two ways to say each of them in Japanese. However, Kanji alone could not cover all the needs of Japanese, which is a polysyllabic, quasi-atonal, inflected language, whereas Chinese is tonal and monosyllabic, with no inflection.
-The Hiragana syllabary
The name means cursive (“Hira”) alphabet (“gana”, which is a variant of “kana”). It was developed in the 9th century because the Japanese needed a complementary system to write their link words, their complex grammar and their syntactic inflections.
Hiragana is also used to help children who are learning to read. They start by learning the syllabary before learning Kanji.
-The Katakana syllabary
In parallel, this system (“kata” means "broken" and “kana”, alphabet) was also developed in the 9th century AD; today it is used especially for the foreign words found in Japanese.
-The Romaji alphabet
With the coming of the first Christian missions in the 16th century, the Latin ("roma") alphabet ("ji") became the main vector of mutual understanding between Japan and the Western world. Today, Romaji is used:
-For international acronyms or words such as Fax, Web, etc.
-For educational purposes, and more precisely for the teaching of Japanese to foreigners.
In a word, Kanji words are the bricks of Japanese, while Hiragana is its mortar!
2.2 The Katakana scripting system
In the past, there were very few imported words. The main reason is the relative isolation of Japan, which is an archipelago.
(All examples are extracted from our thesaurus: The Katakana de Facto Codex)
Japan is not a natural habitat for crocodiles… the word for crocodile could only be imported. Curiously, the word for “ants” is also a non-Japanese word.
As in many other countries, the words used for fashion clothing are French.
As relations with the European powers increased, more words were borrowed from Portuguese, French, Spanish, etc. Since the 19th century, more and more English words have been used in Japanese. We will discuss the adoption process of these words, which has become a major phenomenon, accelerated by the development of relations between Japan and the Anglo-Saxon countries: the USA, Canada, Australia and the UK.
Another use of Katakana is to transcribe onomatopoeia. Manga, i.e. Japanese comics, rely heavily on Katakana to reproduce sounds, shouts and noises of all kinds for their readers. This literature is also full of fashionable English words or expressions that are common in the modern way of life of highly developed countries, such as:
A further purpose of Katakana is to emphasize key words, either for marketing reasons or in certain types of communication such as advertisements, leaflets, etc.
2.3 Japanese transcription
English speakers generally recognize neither the written nor the spoken form of the original words, although the transcription rules are quite simple.
The above examples might be surprising unless the reader applies a few very simple transliteration rules. Phonemes that cannot be pronounced in Japanese are transformed as follows:
-Final “er”/“or” → “a”
-As “l” doesn't exist in Japanese, it is replaced by the Japanese “r”, which sounds somewhere between the Spanish “r” and “l”.
In this example, the Japanese version inverts the English expression, in strict conformity with Japanese word order.
-Two successive consonants → add “u” between the two consonants
-As “v” does not exist in Japanese, it is replaced by “b”
-Final consonant → add “u”
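The rules listed above can be sketched in a few lines of Python. This is only an illustration under strong assumptions: it works on English spelling rather than on pronunciation, it produces a Romaji-like string rather than Katakana characters, and the function name `apply_transcription_rules` is hypothetical. Real transcriptions follow more rules than the five given here (digraphs such as “sh” or “th”, long vowels and the Japanese syllabic “n” are all ignored).

```python
VOWELS = set("aeiou")


def is_consonant(ch: str) -> bool:
    """True for an ASCII letter other than a, e, i, o, u."""
    return ch.isascii() and ch.isalpha() and ch not in VOWELS


def apply_transcription_rules(word: str) -> str:
    """Apply the five spelling-level rules of this section to one English word."""
    w = word.lower()
    # Rule: final "er"/"or" -> "a"
    if w.endswith(("er", "or")):
        w = w[:-2] + "a"
    # Rules: "l" -> "r" and "v" -> "b"
    w = w.replace("l", "r").replace("v", "b")
    out = []
    for i, ch in enumerate(w):
        out.append(ch)
        at_end = i == len(w) - 1
        # Rules: a consonant followed by another consonant,
        # or a final consonant, receives an extra "u"
        if is_consonant(ch) and (at_end or is_consonant(w[i + 1])):
            out.append("u")
    return "".join(out)


for english in ("film", "milk", "silver"):
    print(english, "->", apply_transcription_rules(english))
```

With these simplified rules, “film” gives “firumu” and “milk” gives “miruku”, while “silver” gives “siruba”, which only approximates the usual Japanese rendering.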
English words are not used exclusively in technical domains. In fact, in finance, manufacturing, electronics, science and accounting, the Japanese are used to creating neologisms from existing radicals, in both Kanji and Hiragana. Nonetheless, in information technology, the Internet, etc., the tendency is simply to adopt the existing American words.
This process is not limited to specific domains, and we have marked the extent of the use of Katakana in our thesaurus by category:
Gen: General items
2.5 Analysis of the adoption process
The import of American words has been accelerating over the last 25 years. Their number is constantly increasing and now reaches many thousands. However, some of them can be written in Kanji characters, for example:
-Baseball, which became the national sport, gave rise to a Japanese equivalent, 野球 (“yakyu”)
The expression "Lean Management", originally a management methodology from MIT, did not have the same fate:
However, two Japanese words used in the methodology are treated differently:
-Kaizen is written in Katakana, although it is Japanese: カイゼン
-Kanban remains in Hiragana: かんばん
Note that these last two words remained in Japanese when the methodology re-crossed the Pacific to be applied in both North America and Europe.
The case of the Japanese words for “fish” is very interesting. Two words already exist:
-Sakana (living fish): 魚
-Sashimi (sliced raw fish): 刺身
Yet the Japanese adopted:
-Furaido fisshu: フライドフィッシュ (Fried Fish)
… together with マクドナルド (Makudonarudo), when McDonald's fast-food restaurants arrived in Japan!
The tendency is to abandon words borrowed from other languages in favor of their American equivalents:
In this example, the word was imported from French before the adoption of its US version.
Words written in Katakana often keep a mark of the time they were imported:
Before WWII, a shirt could be only white…
Today, it’s a gasoline station!
2.6 The place of Katakana in Japanese texts
Massive use of words from another language is not exceptional in history. In particular, English has borrowed a large part of its present vocabulary from French and Old French. But that process took hundreds of years, as Thora Van Male describes in her book “Liaisons généreuses”. In the case of Japanese and English, systematic importation started only thirty years ago.
Now, more than 95% of the words written in Katakana are borrowed from American English.
It is very easy to measure or estimate the rate of use of English terms in a written text, simply by examining the space occupied by Katakana script. The ratio depends on the domain, varying from 0%, for example in haiku (short Japanese poems), to well above 50% in the technical leaflets of electronic devices, which are no longer manufactured in Japan.
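As an illustration of such a measurement, the following Python sketch classifies each character of a text by its Unicode block and reports the share of each writing system. It is a minimal sketch, not the procedure used for the figures in this paper: it counts characters rather than printed space, it treats every character of the Katakana block (including the long-vowel mark “ー”) as Katakana, it covers only the basic CJK ideograph block, and the names `script_of` and `script_ratios` are hypothetical.

```python
from collections import Counter
from typing import Dict, Optional

SCRIPTS = ("katakana", "hiragana", "kanji", "romaji")


def script_of(ch: str) -> Optional[str]:
    """Return the writing system of one character, or None for punctuation etc."""
    cp = ord(ch)
    if 0x30A0 <= cp <= 0x30FF:   # Unicode Katakana block
        return "katakana"
    if 0x3040 <= cp <= 0x309F:   # Unicode Hiragana block
        return "hiragana"
    if 0x4E00 <= cp <= 0x9FFF:   # CJK Unified Ideographs (basic block only)
        return "kanji"
    if ch.isascii() and ch.isalpha():
        return "romaji"
    return None


def script_ratios(text: str) -> Dict[str, float]:
    """Share of each writing system among the classified characters of `text`."""
    counts = Counter(s for s in map(script_of, text) if s is not None)
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in SCRIPTS}
    return {name: counts[name] / total for name in SCRIPTS}


# "Japanese fast food": 2 Kanji, 1 Hiragana, 7 Katakana characters
print(script_ratios("日本のファストフード"))
```

For the short sample phrase, 7 of the 10 characters are Katakana, so the Katakana ratio is 0.7.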
Here is an example of what a short modern conversation could look like, in which exactly 84% of the text is written in Katakana, all of it derived from American English:
Ken: Hero Toni!
Tony: Hai Ken!
Ken: Yu OK?
Tony: Hai, Nihon no fasuto fudo ya naitokurabu ya resutoran de wa kuru desu
(Yes! Japanese fast foods, night-clubs and restaurants are cool!)
- Ken is probably Japanese (Kenji), but he prefers to write his first name in Katakana.
- Tony should say, in good Japanese, “ohayō” (Good morning!), but he prefers to use the American “hi”.
- Ken's phrase is what a reader could find in a manga story.
- Tony makes a long sentence, using Hiragana characters for the link words and the syntax, and Kanji for “Japan”. The rest of the sentence consists exclusively of imported words.
However, this is neither Japlish nor a Creole language.
The other writing systems account for the rest of the text: Hiragana (10%, tagged in yellow), Kanji (4%, in green) and Romaji (2%, in blue).
2.7 The Katakana Thesaurus
The practical result of this study is the on-line Katakana de Facto Codex (English → Japanese). It is a “living” corpus, enriched with new words every day. It is also a useful tool for English speakers learning Japanese.
This research has been very useful for the subject that follows, as it gave us the opportunity to analyze linguistic phenomena happening in near real time.
3 SKIE Radicals in Vietnamese
Our main objective is to analyze traces of Sanskrit and Indo-European words remaining in the Vietnamese language, whatever their origin. They could be either the result of imports or an inheritance of Proto-Indo-European origin. In either case, they concern processes that took place thousands of years ago. We could compare this research to the finding of traces of DNA in archeological human remains…
3.1 Basic notions of Vietnamese
Vietnamese is a tonal, monosyllabic language with 6 tones: 5 inflexions and the neutral tone. For historical reasons, the imperial authorities forced the original language to reduce its words to monosyllables in order to “sinicize” it (i.e. to make it resemble Chinese as much as possible).
Like the Japanese, the Vietnamese borrowed Chinese characters, called the “Nho” or “Nom” writing. Unlike the Japanese, they did not need to invent a complementary syllabary for link words and syntactic variations.
But as early as 1625, long before the French colonial era, Alexander de Rhodes, a Jesuit Father born in Avignon (once part of the Papal States), introduced a variant of the Latin alphabet based on Portuguese pronunciation. The new system was welcomed by the emperor and was definitively adopted a century and a half later. The Nom system was slowly but irreversibly abandoned, becoming a dead writing codex.
The Vietnamese alphabet is composed of:
-17 written consonants: b, c, d, đ, g, h, k, l, m, n, p, q, r, s, t, v, x
-and 12 written vowels: a â ă e ê i o ô ơ u ư y
-[ch], [kh], [th], [nh], [ph] are also counted as consonants (total: 22)
-As there are 6 tones (a, á, à, ả, ã, ạ), the number of oral vowels is 12 × 6 = 72.
Moreover, if we consider all the possible diphthongs and triphthongs, the resulting monosyllabic phonemes can be counted in the hundreds.
3.2 The Vietnamese Heritages
Vietnam has a triple heritage:
-The Chinese heritage
Vietnam adopted not only Chinese ideograms but also Chinese words and radicals, which are used to build new words.
-The heritage of its ethnic minorities
“Viet” means “far” and “Nam” means “South”. The country's history was mainly the southward expansion of the Viet, an ethnic group originally from Southern China, fleeing the Han and conquering lands from the Cham and from hundreds of “montagnard” tribes.
-The Mon-Khmer Heritage, in particular during the historical contacts with the Cham
This last heritage is the most probable source of the Proto-Indo-European radicals in Vietnamese.
There are three main accents in Vietnamese (accents, not dialects!): the Northern “official” accent, the Central accent (considered the “classical” accent) and the Southern accent, which turns “z”, “gi” and “v” into “y”. The pronunciation adopted in this paper is the Southern one.
3.3 Adoption of foreign words
In spite of the French colonial episode, the Vietnamese language did not borrow many terms from French. Vietnamese speakers have always preferred to build neologisms from Vietnamese radicals.
Today, the tendency is to adopt American words and acronyms, especially in technical domains. However, there is nothing like what we have described for the Japanese Katakana script.
-a bike is "xe đạp" where “xe” means a vehicle and “đạp”, pedal.
Whenever possible, the few imported words are turned into monosyllabic words.
Example: stamp is "tem" (one syllable) although the original word is "timbre" in French (two syllables).
Most transcriptions give a pseudo poly-monosyllabic expression. Example: TV becomes "ti-vi". However, this transcription is not accepted in literary texts, because the official term is "truyền hình", where the radical “truyền” means transfer and “hình” stands for image. So the correct Vietnamese acronym for TV is TH.
When reducing a foreign word to one syllable, Vietnamese choose the syllable marked by the tonic accent, or the first syllable.
Example: kilogram is reduced to “kí”.
The major drawback of this reduction to monosyllabic terms is its mathematical limitation: the combinations of 22 consonants and 12 vowels in a one-syllable word cannot be infinite, even when combined with 6 tonal inflexions. Moreover, the habit of referring to Nom characters (Chinese ideograms) has limited the number of radicals to only a few thousand…
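The finiteness argument can be made concrete with a small count based on the alphabet given in section 3.1. The Python sketch below is only an illustration under a simplifying assumption: it counts syllables made of one initial consonant followed by one toned vowel, ignoring diphthongs, triphthongs, final consonants and any combination that Vietnamese phonology does not actually allow, so the result is a rough order of magnitude, not a count of real Vietnamese syllables.

```python
# Inventory taken from section 3.1 of this paper
CONSONANTS = [
    "b", "c", "d", "đ", "g", "h", "k", "l", "m", "n", "p", "q", "r",
    "s", "t", "v", "x",            # 17 written consonants
    "ch", "kh", "th", "nh", "ph",  # 5 digraphs counted as consonants
]
VOWELS = ["a", "â", "ă", "e", "ê", "i", "o", "ô", "ơ", "u", "ư", "y"]
TONES = 6  # 5 inflexions plus the neutral tone

# Each vowel can carry each tone: 12 x 6
toned_vowels = len(VOWELS) * TONES

# Simplified syllable shape (an assumption): consonant + toned vowel
consonant_vowel_syllables = len(CONSONANTS) * toned_vowels

print(len(CONSONANTS), "consonants,", len(VOWELS), "vowels")
print(toned_vowels, "toned vowels")
print(consonant_vowel_syllables, "consonant + toned vowel syllables")
```

Under this assumption there are 22 × 72 = 1,584 such syllables: a large but clearly bounded stock, which is the point made above.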
Today, imported terms are transcribed as “pseudo-poly-mono-syllables”, like “Mu-Vi” (movie), which is progressively replacing “phim”, the transcription of “film” (from French). But the correct term is still “cuộn hát bóng”, etymologically a “piece” (“cuộn”) of “theatre” (“hát”) of “shadows” (“bóng”).
We do not take the tonal inflection into consideration as the original proto-Indo-European languages were mostly not tonal.
Proto-Indo-European radicals exist in Vietnamese, even though the current attitude is a complete denial of any link to the “indo” heritage. Some words are obvious and directly understandable by present-day European speakers.
We have to dig more or less deeply to find the others.
There are also syntactic systems which are typically Indo-European.
3.5 Our methodology
Our methodology consists in:
-Making an inventory of modern Vietnamese words and comparing them with words of different European languages, in order to trace all similar words.
-Analyzing different syntactic systems in order to find constructions that are typically Indo-European.
-Referring to other studies (see bibliography) on other languages that have connections to Vietnamese and similar words.
We take into account what we have noticed in the adoption of the French or American words by modern Vietnamese.
3.6 Vietnamese SKIE Roots
3.6.1 Obvious Indo-European words
The following few examples are taken from a long list of the most obvious Indo-European words, which have remained unaltered:
(extracted from our thesaurus of Proto-Indo-European Radicals in Vietnamese )
= [ba] :IN [abba] :HE [pa]:SK [pater]:LT,
(Many languages use the same word for father or daddy… Another word is used: [cha]:VN)
(Vietnamese have many hundreds of varieties of “bánh” cooked in different ways, sweet or salted with or without red pepper. They also have adopted the French “baguette” called “bánh |mì|”.)
= cake, bread, ||=>[pan]:LT
(Although it sounds like [pain]:FR, this word is not an import from French; all kinds of traditional dishes based on dough are called “bánh” followed by the type: “bánh |bò|”, “bánh |cam|”, etc.)
(This radical is used for real panels as well as for virtual panel like “bảng nhạc” – a music sheet – or “bảng |thống kê|” – panel of statistics.)
||=>[banc]:FR, = [panel, table, board]:EN
= [black board]:EN
= baby brother or sister
(The word “bé”, traditionally used in Vietnam, is not at all an import from the French [bébé].)
= [|to| be] :EN, ||=> [subir] :FR
= [to be sick]: EN
= [to be fined]: EN
(Used only in the passive form, with the meaning of “undergoing”: to be sick, to be fined.)
= [bovis] :LT, [bo]:ON
(A cow, the link word |con| always precedes the noun.)
= [ja] :GE
(The traditional “dạ”, pronounced “ya”, is a mark of great politeness. It was used to answer the emperor; incidentally, it was forbidden to say “no” to him…)
= [mamma]:LT, [ma]:SK, [mater]:LT, [mum]:EN,[ima, אמא] :HE, [ma] :IN
(Many languages use analogous words for “mum”.)
= [mère] :FR, [mother] :EN
(The word “mẹ” is formal while “má” is used in family.)
3.6.2 Direct links to IE words
“To read” in Vietnamese is “đọc”. However, this word is really linked with the notions of knowledge and doctrine, so we relate it to “Δοξολογία”.
Many other words have direct links with Ancient Greek and Latin.
3.6.3 Obvious IE syntactic systems
We have studied five syntactic systems and all of them contain direct links to Proto-Indo-European words.
3.6.3.1 The pronominal system
Vietnamese speakers generally do not use pronouns; instead they use a polite form of “you”, addressing the person by his or her title. It is like the Spanish “usted” or the Italian “Lei”. However, in popular conversation and in some formal, official, high-ranking exchanges, the following pronouns are used:
I = Tôi (correct form)
= Tao (colloquial, popular)
= Mình (most formal and official)
You (singular) = Mày (popular)
= Ta (most formal and official, but “ta” means also he, she)
Compared with the Latin “mi” and “tu”, there is an inversion in the popular language and a parallelism in the formal language.
He, she, it = Nó (popular)
He, she = Ta (formal)
We = Mình
You (plural) = “Qui vi”/“Chung Ta”
They = Họ
“Mình” and “Họ” are close to “mi” (plural of mi) and He or Hou…
3.6.3.2 The ordinal numeration system
Traces of Indo-European numeration remain in the following terms:
sáu (pronounced: chao)
six, seks, chaha (Hindi)
3.6.3.3 Suffixes, postfixes
||=> [scire] :LT, <sci(ence)>:EN
= [doctor]:EN (“bác” = medicine)
=[singer]:EN (“ca” = song)
(This is a most important radical used to mark the knowledge – science – of a professional. It applies to different experts.)
||=> [scire] :LT, <sci(ence)>
(A variant of the radical “sĩ”).
3.6.3.4 Calling the parts of the body
Many parts of the body have a direct link with Indo-European terms, among them:
collum (Latin), cou (French)
Linked to mun (mouth in Swedish)
Linked to mun (mouth in Swedish)
genou (French), genou
Remark: while “lưỡi” means “tongue”, “tiếng” means “language”.
3.6.3.5 Animal names
Among animal names that have a direct connection with Indo-European terms, let’s mention:
hen, cock, rooster
animal (ancient Greek radical zoo)
- for the “cuckoo”, the common root is evidently an onomatopoeia (see § 3.6.7).
- the crocodile is referred to as a “fish” (“cá”, see §3.6.5.2)… This probably indicates that the only crocodiles were salties, or estuarine crocodiles. The people of the Mekong Delta called themselves “the sons of crocodiles”… but there are no crocodiles in this area today; moreover, some old people of the delta say that before WWII some crocodiles were caught in fishermen's nets.
Among the names of colors that have a direct connection with Indo-European terms, let’s mention:
vàng (pronounced: yang)
yellow, giallo (Italian)
Post position
This is an example from our thesaurus:
||=> [raus] :GE, =[out] :EN
= [to go out]:EN
= [to run out]
(In Vietnamese “ra” follows the verb, exactly like “out” in English)
About religion
amen (only for Buddhists)
god (of the Vietnamese Pantheon)
Remark: the IE-related words only apply to Vietnamese traditional religions. Christians use other terms, examples: “a-men” for “amen”, “nhà thờ” – etymologically “house of worship” for church.
3.6.4 To proto-IE Roots through the Thai Language
3.6.4.1 Thai and Vietnamese
Thai and Vietnamese use similar phonemes… but there is no possible mutual understanding between their respective speakers. However some terms
3.6.4.2 Thai relations with proto-Indo-European Languages
Arne Østmoe's paper, “A Germanic-Tai Linguistic Puzzle”, published by the Department of East Asian Languages and Civilizations, University of Pennsylvania, on the site of Sino-Platonic studies (http://www.sino-platonic.org), offers new grounds for investigation.
3.6.4.3 More SKIE roots
This gives us many more proto-IE radicals in Vietnamese, such as:
“gạo” (Vietnamese) → “rice” (English) → “kum” (“grain”, old Germanic) → “khau” (Thai)
3.6.5 Deeper in proto-IE sources
3.6.5.1 Chinese and the proto-IE
In the same way, the following paper, published in the same collection:
- "Indo-European Vocabulary in Old Chinese, A New Thesis on the Emergence of Chinese Language and Civilization in the Late Neolithic Age" by Tsung-tung Chang
shows that even Chinese has words related to proto-IE…
3.6.5.2 Consequences for Vietnamese radicals of Chinese origin
Some of them are also related to proto-IE terms!
“cá” (Vietnamese) → “ka” (Chinese) → “aqia” (“live in water”, proto-IE)
3.6.6 Possible cultural relations
Some cultural facts lead us to Indo-European countries. For example: “áo”. This word means “tunic”. The traditional tunic is white.
This recalls the fate of the Latin term “alba”, which became the French "aube" because it is always a white gown…
3.6.7 Of onomatopoeias
Consider the word “gnông”, which means “goose”.
Undoubtedly, both terms have an onomatopoeic origin.
Should we discard this example?
Our opinion is that we should not. Let’s have a look in our Katakana de-facto Codex:
Japanese has imported a foreign word based on the same radical “ga” as in Vietnamese, already mentioned in §3.6.3.5!
3.7 A new vision of Vietnamese
Vietnamese syntax is the following, without exception:
Phrase = [link or qualifier] [noun] [adjective] [auxiliary] [verb] [postposition] [adverb] [complement]
3.7.1 The classical view
The generally accepted classification of Vietnamese is that it is a language derived from Chinese… some even say a Chinese dialect! This is not true! We have found enough radicals to construct texts that could be quasi proto-Indo-European!
3.7.2 Example of an “IE” text
This is the example of what could be the introduction of a kids’ story:
Đọc chuyện thú
Mình tính một đôi bò sẽ kêu con trâu đi ra rừng, mà con gà-loi xanh mỏ vàng sẽ ca. Vì vậy, con sấu mở mắt ra.
“đọc” is a radical
Equivalent to the radical “zoo”
Ancient Greek ,Or “duo” (Latin)
(of) cows /bulls
“con” is a qualifier
“gà” is the radical used for all kind of “hen”, “cock”, “rooster”
“cyan” in French and in English
This text is 75% Indo-European!
3.8 The thesaurus of Vietnamese Proto-Indo-European Radicals
The status of our Vietnamese thesaurus of proto-Indo-European radicals is the following:
October 2010: 239 entries
Current Vietnamese dictionaries have about one thousand radicals (all the words are combinations of these radicals). The first 250 entries of our thesaurus are among the most common! The thesaurus will be enriched with further studies.
May 2012: I now have more than 500 definitely Indo-European radicals... The challenge for me is to find enough time to feed the thesaurus!
3.8.1 Presentation & structure
The on-line thesaurus is presented in alphabetic order, based on English.
A short presentation is linked to each term with:
- Use cases
- Cultural issues
- Historical backgrounds
- Other points
3.8.2 Two annexes:
The thesaurus has two annexes:
- one for French words adopted by Vietnamese before 1960
- one for imported English words
3.8.3 Content & versions
The versioning policy is:
Prototype → α versions 0.0x → β versions 0.xy → versions x.y
In conclusion, it would be very tempting to say that our results could be viewed as another set of complementary pieces for the Mater Lingua puzzle. However, we must point out that we have considered only two groups of people among thousands. Although they practiced linguistic exchanges from one end of the giant Eurasian continent to the other, nothing indicates whether or not their languages derive from a single matrix.
Moreover, the recent discoveries about Neanderthal DNA and the high probability that Neanderthals mixed with Sapiens Sapiens cast doubt on the uniqueness of a human Mater Lingua.
As for Japanese and Vietnamese, their writing systems, i.e. the Japanese Kanji and the now-abandoned Vietnamese Nom, are definitely grounds for a common parentage, although both consist of imported Chinese ideograms. The principle of "one sign, one concept, many terms" argues for a possible single "Pater Script", which would of course predate the pictograms and the Mesopotamian cuneiform codex. Could it be hidden in the prehistoric drawings of some caves?
In any case, the two of them, "Mater Lingua" and "Pater Script", gave birth to thousands of languages, as in Genesis: God created us Man and Woman…
- "The phonology of Japanese: a panchronic account" by Laurence Labrune, Université de Bordeaux, France
- "The Changing Role of Katakana in the Japanese Writing System: Processing and Pedagogical Dimensions for Native Speakers and Foreign Learners" by Yuko Igarashi, University of Victoria, Queensland, Australia
Published by the Department of East Asian Languages and Civilizations University of Pennsylvania, the site of Sino-Platonic studies (http://www.sino-platonic.org):
- "Indo-European Vocabulary in Old Chinese, A New Thesis on the Emergence of Chinese Language and Civilization in the Late Neolithic Age" by Tsung-tung Chang
- “A Germanic-Tai Linguistic Puzzle”, by Arne Østmoe
- "How the Earth’s Geology Determined Human History" by Donald F. Beaumont, Senior University, Georgetown, Texas