thai-language.com: Internet resource for the Thai language
Asian Script Converter

Vowel & consonant graphemes (letters), syllables, and orthography

Moderator: acloudmovingby

Asian Script Converter

Postby vinodhrajan » Sun May 22, 2011 7:39 pm

Hi Guys,

I have developed an Asian Script Converter that converts between all the major South Asian scripts, including all the mainland Indian scripts and Sinhala, as well as the Southeast Asian scripts Thai, Burmese and Khmer (and, as an offshoot, Urdu too).

The converter can be accessed at: http://www.virtualvinodh.com/aksharamukha

The character matrix for all these scripts can be found here: http://www.virtualvinodh.com/character-matrix

Thai-related notes and options are provided at: http://www.virtualvinodh.com/thai

Please try the converter. I would like to have your feedback and comments on it, especially with respect to Thai (and the other Southeast Asian scripts).

Thanks all.

V
vinodhrajan
 
Posts: 6
Joined: Sun May 22, 2011 7:30 pm

Re: Asian Script Converter

Postby Richard Wordingham » Sun May 22, 2011 11:43 pm

You've done enough research to know that Thai is hard.

One case you haven't (fully) covered is clusters plus preposed vowels. For example, HK (Harvard-Kyoto) draupadI should yield เทฺราปที, not ทฺเราปที / ทเราปะที. There's a relevant article in Thai at http://www.huso.buu.ac.th/thai/web/pers ... 2chap5.htm - I haven't studied it myself. One problem is that initial and intervocalic clusters behave differently, so that, for example, you correctly convert HK buddho to พุทฺโธ. I've seen some slightly surprising placements of the preposed vowels in Pali in Thai script - something like intervocalic -svo- becoming -โสฺว- if I remember correctly.
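To make the expected output concrete, here is a minimal Python sketch of how the first syllable of เทฺราปที is laid out in Unicode. Thai is encoded in visual order, so the preposed part of the vowel (sara e) is stored before both consonants of the cluster, and phinthu joins the two consonants. The function name `initial_cluster_au` is my own invention for illustration; it is not taken from the converter.

```python
import unicodedata

PHINTHU = "\u0e3a"   # dot below a consonant, marks a vowel-less consonant in Pali/Sanskrit
SARA_E = "\u0e40"    # preposed part of the vowel "au"
SARA_AA = "\u0e32"   # postposed part of the vowel "au"


def initial_cluster_au(c1: str, c2: str) -> str:
    """Write a word-initial cluster c1+c2 followed by the vowel 'au'.

    Hypothetical helper: the preposed vowel sign goes before the whole cluster.
    """
    return SARA_E + c1 + PHINTHU + c2 + SARA_AA


# "drau" as in draupadI: THO THAHAN + RO RUA
syllable = initial_cluster_au("\u0e17", "\u0e23")
for ch in syllable:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Printing the code point names is a quick way to check where a converter has put the preposed vowel, since the rendered text can be hard to tell apart by eye.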

Visarga isn't lost - it survives as sara a (ะ).

You'll have fun when you add the Tham script (Thai name tua mueang) - sometimes you have to use a consonant sign instead of choeng plus consonant.
Richard Wordingham
 
Posts: 1294
Joined: Mon Feb 14, 2005 12:00 am
Location: Stevenage, England

Re: Asian Script Converter

Postby vinodhrajan » Mon May 23, 2011 2:44 am

Hi Richard,

Thanks a lot for your response.

At http://tipitaka.org/thai/, Buddho is written as พุโทฺธ instead of พุทฺโธ.

Is it a standard rule for initial and intervocalic clusters to behave differently?

Or can we standardize on the unsplit form (โทฺธ) for clusters in all positions, as http://tipitaka.org/thai/ does?

Are there any other references for Pali transliterated into Thai? If we can compare them, we can come up with a rule of thumb for transliterating clusters into Thai.

V
vinodhrajan
 
Posts: 6
Joined: Sun May 22, 2011 7:30 pm

Re: Asian Script Converter

Postby vinodhrajan » Mon May 23, 2011 3:37 am

I have found quite a few Thai print editions here:

http://hall.worldtipitaka.org/node/240199

All of them spell Buddho as "พุทฺโธ" (Tena Samayena Buddho [...] in the first line).

http://www.flickr.com/photos/dhammasoci ... 959624647/
http://www.flickr.com/photos/dhammasoci ... 959624647/
http://www.flickr.com/photos/dhammasoci ... 959624647/
http://www.flickr.com/photos/dhammasoci ... 959624647/
http://www.flickr.com/photos/dhammasoci ... 959624647/

As you suggested, I suppose it suffices just to change the initial-cluster formation rule for the vowels <e, o, ai, au>.

V
vinodhrajan
 
Posts: 6
Joined: Sun May 22, 2011 7:30 pm

Re: Asian Script Converter

Postby Richard Wordingham » Mon May 23, 2011 10:59 pm

For the initial clusters, the vowel comes before the two consonants. This seems to be the rule even if the cluster results from elision. (The Thai script seems to have no avagraha.)

For medial clusters, the rules are more complicated. I haven't been able to find a statement of the rules, so I'm having to look at examples. Some of the rules are fairly clear:

1) Nasal plus oral stop is split by a vowel.
2) Nasal geminates are split.
3) มฺห is split. On the other hand, มฺย might not be split - but I only have one example so far.
4) Homorganic stop clusters are split by the vowel.
5) Oral stop plus semivowel (e.g. ตฺร, พฺย) is not split by a vowel (muta cum liquida).

An interesting example of the interaction of these rules is ภุญฺเชฺย.

ยฺย is a nasty case - sometimes the vowel comes before, sometimes afterwards.

I intend to do a corpus search on the texts from http://www.learntripitaka.com/ . Unfortunately, it's not free from typographical errors, and they used a non-standard coding to get round rendering problems that are now largely history.
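The five provisional rules above can be sketched as a small classifier. This is only an illustration under my own assumptions: the consonant table is the usual Pali varga arrangement in Thai letters, the names `VARGAS`, `STOP_ROW` and `medial_cluster` are hypothetical, "homorganic" is taken to mean "two oral stops from the same varga row", and anything the five rules do not mention is reported as undecided rather than guessed.

```python
# Pali stops and nasals in Thai script, one varga (place of articulation) per row.
# The fifth letter of each row is the nasal.
VARGAS = ["กขคฆง", "จฉชฌญ", "ฏฐฑฒณ", "ตถทธน", "ปผพภม"]
STOP_ROW = {c: i for i, row in enumerate(VARGAS) for c in row[:4]}
NASALS = {row[4] for row in VARGAS}
SEMIVOWELS = set("ยรลว")


def medial_cluster(c1: str, c2: str) -> str:
    """Return 'split', 'joined' or 'undecided' for a medial cluster c1+c2.

    'split' means the preposed vowel is written between c1 and c2;
    'joined' means it is written before the whole cluster.
    """
    if c1 in NASALS and c2 in STOP_ROW:          # rule 1
        return "split"
    if c1 in NASALS and c1 == c2:                # rule 2
        return "split"
    if (c1, c2) == ("ม", "ห"):                   # rule 3
        return "split"
    if c1 in STOP_ROW and c2 in STOP_ROW and STOP_ROW[c1] == STOP_ROW[c2]:
        return "split"                           # rule 4
    if c1 in STOP_ROW and c2 in SEMIVOWELS:      # rule 5
        return "joined"
    return "undecided"                           # not covered by rules 1-5


print(medial_cluster("ญ", "ช"), medial_cluster("ช", "ย"))
```

The two calls at the end correspond to ภุญฺเชฺย: the vowel is written after ญฺ (nasal plus stop is split) but before ชฺย (stop plus semivowel stays together).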
Richard Wordingham
 
Posts: 1294
Joined: Mon Feb 14, 2005 12:00 am
Location: Stevenage, England

Re: Asian Script Converter

Postby Rick Bradford » Tue May 24, 2011 10:41 am

Advertised in this week's Matichon is a book called อักษรไทย มาจากไหน? ("Where Did the Thai Script Come From?") by สุจิตต์ วงษ์เทศ (Sujit Wongthes).
Rick Bradford
 
Posts: 1164
Joined: Tue Sep 30, 2008 12:00 am
Location: Bangkok

Re: Asian Script Converter

Postby vinodhrajan » Tue May 24, 2011 5:19 pm

Richard Wordingham wrote:For the initial clusters, the vowel comes before the two consonants. This seems to be the rule even if the cluster results from elision. (The Thai script seems to have no avagraha.)

For medial clusters, the rules are more complicated. I haven't been able to find a statement of the rules, so I'm having to look at examples. Some of the rules are fairly clear:

1) Nasal plus oral stop is split by a vowel.
2) Nasal geminates are split.
3) มฺห is split. On the other hand, มฺย might not be split - but I only have one example so far.
4) Homorganic stop clusters are split by the vowel.
5) Oral stop plus semivowel (e.g. ตฺร, พฺย) is not split by a vowel (muta cum liquida).

An interesting example of the interaction of these rules is ภุญฺเชฺย.

ยฺย is a nasty case - sometimes the vowel comes before, sometimes afterwards.

I intend to do a corpus search on the texts from http://www.learntripitaka.com/ . Unfortunately, it's not free from typographical errors, and they used a non-standard coding to get round rendering problems that are now largely history.


To summarize:

Only the following are not split by the vowel:

1) Word-initial clusters, in every case: เทฺว

2) Clusters with the semivowels ra/ya: นิเทฺร, วินฺไทฺย. Does this include the other semivowels, la and va?

3) yya can be standardized as not split, based on the rule above.

In all other cases, the cluster is split.

If these are final, I will make the necessary changes to the converter so that it displays the Thai clusters correctly.
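As a rough sketch of what this summary would mean in code, the Python below places a preposed vowel either before the whole cluster or between its two consonants. It is not the converter's actual code: `write_cluster` and its parameters are hypothetical, only ra and ya are treated as "keep together" semivowels (la and va are still an open question), and the handling of yya follows point 3.

```python
PHINTHU = "\u0e3a"
# Second members that keep the cluster together: RO RUA and YO YAK.
# Assumption: la and va are left out until the question above is settled.
KEEP_TOGETHER_SECOND = {"\u0e23", "\u0e22"}


def write_cluster(c1: str, c2: str, pre: str, post: str = "",
                  word_initial: bool = False) -> str:
    """Write cluster c1+c2 with a preposed vowel sign `pre`.

    `post` is any vowel sign written after the cluster (empty if none).
    """
    if word_initial or c2 in KEEP_TOGETHER_SECOND:
        # Vowel sign precedes the whole cluster.
        return pre + c1 + PHINTHU + c2 + post
    # Otherwise the vowel sign splits the cluster.
    return c1 + PHINTHU + pre + c2 + post


# -ddho in buddho: THO THAHAN + THO THONG with SARA O, medial, so it is split
print(write_cluster("\u0e17", "\u0e18", "\u0e42"))
# dve: THO THAHAN + WO WAEN with SARA E, word-initial, so it is kept together
print(write_cluster("\u0e17", "\u0e27", "\u0e40", word_initial=True))
```

Keeping the decision in one small function should make it easy to swap in a longer rule list later if the corpus shows more cases.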

Thanks again for your suggestions.

V
vinodhrajan
 
Posts: 6
Joined: Sun May 22, 2011 7:30 pm

Re: Asian Script Converter

Postby Richard Wordingham » Tue May 24, 2011 6:36 pm

(This should have been posted this morning (GMT) - it doesn't answer this afternoon's questions. I'm still researching the rules.)

It's looking more and more complicated. It seems that practice is not uniform. In one section, one mostly gets ตุมฺเห and in another one mostly gets ตุเมฺห.
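A count of both spellings per cluster is easy to script. The sketch below is an assumption about how such a corpus search might be done, not a description of the search actually run: it uses two regular expressions over Unicode text, treats U+0E01 to U+0E2E as consonants (which also lets in ฤ and ฦ, harmless for Pali), and treats U+0E40 to U+0E44 as preposed vowels. The name `count_placements` is hypothetical, and a corpus in a non-standard encoding would need converting first.

```python
import re
from collections import Counter

PHINTHU = "\u0e3a"
CONS = "[\u0e01-\u0e2e]"   # Thai consonant letters
PRE = "[\u0e40-\u0e44]"    # preposed vowel signs

# consonant + phinthu + preposed vowel + consonant: the vowel splits the cluster
SPLIT = re.compile(f"({CONS}){PHINTHU}({PRE})({CONS})")
# preposed vowel + consonant + phinthu + consonant: the vowel precedes the cluster
JOINED = re.compile(f"({PRE})({CONS}){PHINTHU}({CONS})")


def count_placements(text: str) -> Counter:
    """Count (cluster, vowel, placement) triples found in `text`."""
    counts = Counter()
    for m in SPLIT.finditer(text):
        c1, v, c2 = m.groups()
        counts[(c1 + c2, v, "split")] += 1
    for m in JOINED.finditer(text):
        v, c1, c2 = m.groups()
        counts[(c1 + c2, v, "joined")] += 1
    return counts


# tumhe written three times, twice split and once joined
sample = ("\u0e15\u0e38\u0e21\u0e3a\u0e40\u0e2b "
          "\u0e15\u0e38\u0e40\u0e21\u0e3a\u0e2b "
          "\u0e15\u0e38\u0e21\u0e3a\u0e40\u0e2b")
for key, n in count_placements(sample).most_common():
    print(key, n)
```

Note that a sequence of three consonants around one vowel, as in ภุญฺเชฺย, is counted once under each pattern, which is what one would want when looking at how the rules interact.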
Richard Wordingham
 
Posts: 1294
Joined: Mon Feb 14, 2005 12:00 am
Location: Stevenage, England

Re: Asian Script Converter

Postby vinodhrajan » Sun May 29, 2011 11:32 am

I found a multi-script version of the Tripitaka here. It seems to be very authentic: they have even faithfully reproduced the Sinhala Pali conjuncts. Except for Thai, they use non-Unicode fonts for the other scripts.

http://budsir.mahidol.ac.th/

I did a search myself for some clusters.

It seems that all of the conventions described earlier are followed. There are some additional conventions: conjuncts ending in -h, such as mhe and nhe, are not split, and hme is also not split.

V
vinodhrajan
 
Posts: 6
Joined: Sun May 22, 2011 7:30 pm

Re: Asian Script Converter

Postby Richard Wordingham » Thu Jun 02, 2011 12:45 am

The rules seem to be, in order of priority:

1) Word initial clusters are not split.
2) Geminates are split.
3) Nasal or Indic semivowel (ย ร ล ว) or + is not split.
4) Consonant + v/y is not split.
5) / + consonant is not split.
6) Oral stop + / is not split.
7) Other combinations are split.

There are a lot of exceptions to Rules 3 to 6, but they seem to be random. Rule 2 isn't infallible either.

I had worried about the apparent sequence ชฺเย that I had seen, but it turned out to be due to misspellings of -ปจฺจเยน as -ปจฺเยน (four times!) and -ปจจฺเยน (once). Likewise, the apparent sequence นฺโห turned out to be a misspelling of เสฺนโห as เสนฺโห!
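For anyone who wants to try the prioritized list in code, here is a partial Python sketch. It is deliberately incomplete: several characters in Rules 3, 5 and 6 do not display in the post as it appears here, so those three rules are left out rather than guessed, and any cluster they would cover falls through to Rule 7 and may be misclassified. The names `Rule`, `RULES` and `decide` are hypothetical, and "geminate" is simplified to "the same letter twice".

```python
from typing import Callable, NamedTuple, Tuple


class Rule(NamedTuple):
    number: int                                # rule number from the list above
    keep_together: bool                        # True = cluster is not split
    applies: Callable[[str, str, bool], bool]  # (c1, c2, word_initial) -> bool


V_OR_Y = {"\u0e27", "\u0e22"}  # WO WAEN, YO YAK

# Listed in priority order; the first rule that applies wins.
RULES = [
    Rule(1, True, lambda c1, c2, initial: initial),
    Rule(2, False, lambda c1, c2, initial: c1 == c2),
    # Rules 3, 5 and 6 are omitted: their character classes are not
    # recoverable from the post as displayed.
    Rule(4, True, lambda c1, c2, initial: c2 in V_OR_Y),
    Rule(7, False, lambda c1, c2, initial: True),
]


def decide(c1: str, c2: str, word_initial: bool = False) -> Tuple[int, bool]:
    """Return (rule number, keep_together) for the cluster c1+c2."""
    for rule in RULES:
        if rule.applies(c1, c2, word_initial):
            return rule.number, rule.keep_together
    raise AssertionError("unreachable: Rule 7 matches everything")


print(decide("\u0e22", "\u0e22"))  # geminate yy: Rule 2 outranks Rule 4
```

Keeping the rules as an ordered table makes the priority explicit and leaves an obvious place to add the missing rules, and the known exceptions, once they are pinned down.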
Richard Wordingham
 
Posts: 1294
Joined: Mon Feb 14, 2005 12:00 am
Location: Stevenage, England


Copyright © 2024 thai-language.com. Portions copyright © by original authors, rights reserved, used by permission; Portions 17 USC §107.