AsmBB

EncodingTable koi8-r.tbl

ganuonglachanh (ツ)

Hi johnfound

The default Utf8ToAnsi function uses the EncodingTable koi8-r.tbl. How can I make another EncodingTable that replaces other characters, for example ế => e (and many more)?

Thank you!

johnfound (ツ)

Well, the only code tables implemented for now are WIN1251, CP866, KOI8R and KOI8U.

But if you are asking about slug/tag generation, you don't actually need this. I use the Utf8ToAnsi procedure there because in the Russian KOI8 table the Cyrillic letters have the same codes (apart from the high bit) as the ASCII Latin letters with a similar sound.

After the conversion, the string is still valid UTF-8, but all the Cyrillic letters are replaced with the corresponding Latin letters, so the result reads properly in Russian, Bulgarian, Serbian, etc.

In other words, the use of Utf8ToAnsi is simply a hack. To handle the special Latin characters you will need entirely different code.
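
The KOI8 trick described above can be illustrated with a minimal JavaScript sketch. This is not the AsmBB code (which is written in assembly); the function name `koi8StripHighBit` and the constant `KOI8R_LOWER` are hypothetical, and the sketch assumes that clearing the high bit of the KOI8-R byte is what yields the Latin letter. It covers only the 32 lowercase letters at 0xC0..0xDF and lowercases everything for simplicity.

```javascript
// Lowercase Cyrillic letters in KOI8-R order, starting at byte 0xC0.
// (Hypothetical helper table for illustration; not taken from koi8-r.tbl.)
const KOI8R_LOWER = "юабцдефгхийклмнопярстужвьызшэщчъ";

// Hypothetical sketch: map each Cyrillic letter to its KOI8-R byte,
// clear the high bit, and keep the resulting ASCII character.
function koi8StripHighBit(text) {
  let out = "";
  for (const ch of text.toLowerCase()) {
    const index = KOI8R_LOWER.indexOf(ch);
    if (index === -1) {
      out += ch; // not in the table: pass through unchanged
      continue;
    }
    const koi8Byte = 0xC0 + index;                   // KOI8-R code of the letter
    out += String.fromCharCode(koi8Byte & 0x7F);     // drop the high bit
  }
  // Lowercase KOI8-R letters land on uppercase ASCII, so normalize the case.
  return out.toLowerCase();
}

console.log(koi8StripHighBit("томат")); // prints "tomat"
console.log(koi8StripHighBit("код"));   // prints "kod"
```

Letters such as ш or ч land on ASCII punctuation rather than letters in this scheme, and accented Latin letters such as ế are not in the table at all, which is why the trick does not carry over to Vietnamese.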

ganuonglachanh (ツ)

Yes, I am asking about slug/tag generation, because I used to use this JS function to slugify URLs in Vietnamese:

slug = slug.replace(/á|à|ả|ạ|ã|ă|ắ|ằ|ẳ|ẵ|ặ|â|ấ|ầ|ẩ|ẫ|ậ/gi, 'a');
slug = slug.replace(/é|è|ẻ|ẽ|ẹ|ê|ế|ề|ể|ễ|ệ/gi, 'e');
slug = slug.replace(/i|í|ì|ỉ|ĩ|ị/gi, 'i');
slug = slug.replace(/ó|ò|ỏ|õ|ọ|ô|ố|ồ|ổ|ỗ|ộ|ơ|ớ|ờ|ở|ỡ|ợ/gi, 'o');
slug = slug.replace(/ú|ù|ủ|ũ|ụ|ư|ứ|ừ|ử|ữ|ự/gi, 'u');
slug = slug.replace(/ý|ỳ|ỷ|ỹ|ỵ/gi, 'y');
slug = slug.replace(/đ/gi, 'd');

My knowledge of UTF-8 encoding is limited, and I still can't find a solution :-(

johnfound (ツ)

I will see what I can do about it. In my opinion, we need a general solution that can process such symbols the same way in all languages...
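
One candidate for such a general approach is Unicode decomposition: normalize the text to NFD, which splits an accented letter into a base letter plus combining marks, then drop the marks. The sketch below shows the idea in JavaScript using the standard `String.prototype.normalize`; it is not what AsmBB does, the function name `slugify` is hypothetical, and the slug rules (lowercase, hyphens between words) are assumptions. The letter đ has no decomposition, so it still needs an explicit replacement.

```javascript
// Hypothetical sketch of accent stripping via Unicode NFD decomposition.
function slugify(text) {
  return text
    .normalize("NFD")                  // "ế" becomes "e" + two combining marks
    .replace(/[\u0300-\u036f]/g, "")   // remove combining diacritical marks
    .replace(/đ/g, "d")                // đ/Đ do not decompose, map them by hand
    .replace(/Đ/g, "D")
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")       // assumed slug rule: other runs become "-"
    .replace(/^-+|-+$/g, "");          // trim leading and trailing hyphens
}

console.log(slugify("Tiếng Việt")); // prints "tieng-viet"
console.log(slugify("Đường phố"));  // prints "duong-pho"
```

The same decomposition handles accented letters in French, Spanish, Czech and so on without a per-language table, although letters that are not built from a base letter plus marks (đ, ø, ß and similar) still need their own entries.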

©2016..2020 John Found; Licensed under EUPL; AsmBB v2.8 (check-in: 6d0d9d4bca1af5dd); SQLite v3.31.1 (check-in: 3bfa9cc97da10598); Powered by Assembly language; Created with Fresh IDE;