Hi johnfound,
The default Utf8ToAnsi function uses the EncodingTable koi8-r.tbl. How can I make another EncodingTable to replace other characters, like ế => e (and many more)?
Thank you!
Well, the only code tables implemented for now are WIN1251, CP866, KOI8R and KOI8U.
But if you are asking about slug/tag generation, you actually don't need this. I am using the Utf8ToAnsi procedure there because in the Russian KOI8 table the Cyrillic letters have the same codes as the UTF8 Latin letters with a similar sound.
After the conversion, the string is still valid UTF8, but all the Cyrillic letters are replaced with the corresponding Latin letters, so the result can still be read properly in Russian, Bulgarian, Serbian, etc.
In other words, the use of Utf8ToAnsi is simply a hack. To handle the special Latin characters you will need entirely different code.
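To illustrate the KOI8 property described above: in KOI8-R, the Cyrillic letters sit at bytes 0xC0..0xFF, and clearing the top bit of such a byte gives a Latin letter with a similar sound (with the case inverted). The sketch below is in JavaScript only because that is the language used elsewhere in this thread. The name `koi8Translit` is hypothetical, and whether the forum code clears the high bit in exactly this way is an assumption, not something confirmed by the source.

```javascript
// Lowercase Cyrillic letters in KOI8-R order, occupying bytes 0xC0..0xDF.
// The uppercase letters follow in the same order at 0xE0..0xFF.
const KOI8R_LOWER = "юабцдефгхийклмнопярстужвьызшэщчъ";
const KOI8R_UPPER = KOI8R_LOWER.toUpperCase();

// Hypothetical helper, not a function from the forum source.
function koi8Translit(text) {
  let out = "";
  for (const ch of text) {
    const lo = KOI8R_LOWER.indexOf(ch);
    const up = KOI8R_UPPER.indexOf(ch);
    let byte = -1;
    if (lo >= 0) byte = 0xC0 + lo;        // KOI8-R code of a lowercase letter
    else if (up >= 0) byte = 0xE0 + up;   // KOI8-R code of an uppercase letter
    // Clearing the top bit lands on an ASCII character; other input is kept as-is.
    out += byte >= 0 ? String.fromCharCode(byte & 0x7F) : ch;
  }
  // The case comes out inverted, so normalize it for slug use.
  return out.toLowerCase();
}

console.log(koi8Translit("привет")); // "priwet"
```

Note that a few letters (ю, ш, э, щ, ч, ъ) land on ASCII punctuation rather than on letters, so a slug generator would presumably still need to filter the result. None of this helps with accented Latin letters such as ế, which have no place in the KOI8 tables.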
Yes, I am asking about slug/tag generation, because I used to use this JS function to slugify URLs in Vietnamese:
slug = slug.replace(/á|à|ả|ạ|ã|ă|ắ|ằ|ẳ|ẵ|ặ|â|ấ|ầ|ẩ|ẫ|ậ/gi, 'a');
slug = slug.replace(/é|è|ẻ|ẽ|ẹ|ê|ế|ề|ể|ễ|ệ/gi, 'e');
slug = slug.replace(/i|í|ì|ỉ|ĩ|ị/gi, 'i');
slug = slug.replace(/ó|ò|ỏ|õ|ọ|ô|ố|ồ|ổ|ỗ|ộ|ơ|ớ|ờ|ở|ỡ|ợ/gi, 'o');
slug = slug.replace(/ú|ù|ủ|ũ|ụ|ư|ứ|ừ|ử|ữ|ự/gi, 'u');
slug = slug.replace(/ý|ỳ|ỷ|ỹ|ỵ/gi, 'y');
slug = slug.replace(/đ/gi, 'd');
My knowledge of UTF-8 encoding is limited, and I still can't find a solution.
I will see what I can do about it. In my opinion, we need some general solution able to process such symbols in all languages the same way...
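One language-independent approach, sketched here in JavaScript to match the snippet earlier in the thread, is Unicode normalization: decompose each character into a base letter plus combining marks (NFD), then drop the marks. This is not the forum's implementation, only an illustration of the idea. The function name `slugify` and the hyphen-based slug format are assumptions made for the example.

```javascript
// Hypothetical helper: strips diacritics via Unicode decomposition.
function slugify(text) {
  return text
    .normalize("NFD")            // e.g. "ế" becomes "e" + circumflex + acute
    .replace(/\p{M}/gu, "")      // remove all combining marks
    .replace(/đ/g, "d")          // đ/Đ have no canonical decomposition,
    .replace(/Đ/g, "D")          // so they are mapped explicitly
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse everything else into hyphens
    .replace(/^-+|-+$/g, "");    // trim leading and trailing hyphens
}

console.log(slugify("Tiếng Việt")); // "tieng-viet"
```

The same code handles accented letters from other Latin-based alphabets without a per-language list. Letters that do not decompose (like đ, or ø and ł in other languages) still need an explicit mapping, and non-Latin scripts would need a separate transliteration step.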