This is the mod we've all been waiting for: a clean method to render Unicode glyphs, based on GNU Unifont.
I think GNU Unifont with its 16×16 pixel glyphs is just perfect for Minetest; there aren't many other fonts with such huge coverage of the Unicode code space. And the pixel style is a good fit for many retro-style games.
There are nice features like kerning and color selection.
However, while Unifont may cover the entire Unicode BMP and more, that doesn't mean this mod can render them all properly. Some writing systems are very complex and require additional adjustments. So any users of this mod need to be aware of these limitations:
Bidirectional Algorithm is not correctly implemented
Many complex writing systems have additional writing rules that would need more code to make them work. For example, rendering Arabic not only requires the bidi algorithm but also you need to draw the glyphs differently depending on where they are in the word. (Note: This is a limitation of Unifont as well.)
Complex combining marks are not supported well. The code admits they are hacky
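As a rough illustration of the first point (a Python sketch, not code from the mod, which is written in Lua): every code point has a bidi class, so a renderer could at least detect text that needs the bidirectional algorithm. The names `RTL_CLASSES`, `bidi_classes` and `needs_bidi` are made up for this example, and the check ignores explicit directional formatting characters.

```python
import unicodedata

# Strong right-to-left bidi classes: R (e.g. Hebrew) and AL (Arabic letters).
RTL_CLASSES = {"R", "AL"}

def bidi_classes(text):
    """Return the Unicode bidi class of every code point in text."""
    return [unicodedata.bidirectional(ch) for ch in text]

def needs_bidi(text):
    """True if text contains at least one strongly right-to-left character."""
    return any(cls in RTL_CLASSES for cls in bidi_classes(text))

# Latin "a", Hebrew alef and Arabic beh, separated by spaces.
print(bidi_classes("a \u05d0 \u0628"))
print(needs_bidi("Hello"), needs_bidi("\u05e9\u05dc\u05d5\u05dd"))
```

Detection like this does not reorder anything; it only shows where a naive left-to-right renderer would go wrong.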
But still, this mod is a massive improvement over what we had before. I will include a modified version of this mod in Repixture.
Bidirectional Algorithm is not correctly implemented
Correct, I stopped doing it when I got no user interest. It is tedious to do so. Do you think it should be optional?
Many complex writing systems have additional writing rules that would need more code to make them work.
Correct. I do not know any of those, e.g. Arabic. Do you? I would appreciate help regarding that.
Complex combining marks are not supported well. The code admits they are hacky
I am pretty sure that the support is good enough for most Latin-script languages, but it is admittedly hacky.
I thought about making it optional, but then NFC- and NFD-encoded text would no longer render similarly.
Rendering them as U+FFFD REPLACEMENT CHARACTER might result in text like “Ame�lie”, unfortunately.
I will include a modified version of this mod in Repixture.
I have said it before, but just so that I have said it on CDB: Please change the mod name of the modified version to rp_unicode_text or something, as it is intentionally incompatible with unicode_text (IIRC you removed everything hacky that you disliked) and no mod that depends on unicode_text can work correctly with your fork.
Correct, I stopped doing it when I got no user interest. It is tedious to do so. Do you think it should be optional?
You know my philosophy. Either do it right, or remove the feature. That’s better than pretending the software has a feature when in reality it’s very broken. An alternative would be to clearly mark this feature as experimental and require users to enable it first.
Correct. I do not know any of those, e.g. Arabic. Do you? I would appreciate help regarding that.
No. I know at least enough to know that naive approaches to rendering Arabic will not work. I also know that using a text-shaping library like HarfBuzz would solve this far more elegantly than reinventing the wheel in a mod. One idea would be to add HarfBuzz (or similar) to Luanti; that would be nice.
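To make the positional-forms problem concrete, here is a small Python sketch (an illustration only, not taken from the mod or from any shaping library). Unicode encodes four presentation forms for the Arabic letter BEH, and a shaper has to pick one depending on the neighbouring letters. The names `BEH` and `FORMS` are made up for this example.

```python
import unicodedata

BEH = "\u0628"  # ARABIC LETTER BEH, as it appears in ordinary text

# Presentation forms of the same letter, one per position in a word.
FORMS = {
    "isolated": "\ufe8f",
    "final": "\ufe90",
    "initial": "\ufe91",
    "medial": "\ufe92",
}

for position, form in FORMS.items():
    # Compatibility normalization folds every form back to the base letter.
    folded = unicodedata.normalize("NFKC", form)
    print(position, f"U+{ord(form):04X}", unicodedata.name(form), folded == BEH)
```

Text normally contains only the base letter, so drawing one glyph per code point always yields the same shape; choosing the right form per position is the part a naive renderer skips.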
Rendering them as U+FFFD REPLACEMENT CHARACTER might result in text like “Ame�lie”, unfortunately.
Yeah, this is what I did in my fork but of course it would be better if combining marks just worked. At least the replacement character tells the user this character is unsupported. Thankfully, Unicode includes a lot of codepoints where diacriticals are "baked in", so to say.
One possibility would be partial support for combining marks: the cases that are implemented render, while the unimplemented cases produce replacement characters (until implemented). Admittedly, in my fork I have thrown combining marks away entirely. The Unicode rules are quite complex and overwhelming.
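One way such partial support could look, sketched in Python under simplifying assumptions (this is neither the mod's nor the fork's actual code): compose whatever NFC can fold into a precomposed code point, then replace any mark that is left over. `REPLACEMENT` and `compose_or_replace` are hypothetical names, and treating every code point in a general category starting with "M" as a combining mark is a simplification.

```python
import unicodedata

REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER

def compose_or_replace(text):
    """Compose base + mark pairs via NFC; replace marks that remain."""
    composed = unicodedata.normalize("NFC", text)
    return "".join(
        # Categories Mn, Mc and Me cover the combining marks.
        REPLACEMENT if unicodedata.category(ch).startswith("M") else ch
        for ch in composed
    )

# "e" + U+0301 has a precomposed form, "q" + U+0301 does not.
print(compose_or_replace("Ame\u0301lie"))
print(compose_or_replace("q\u0301"))
```

With this policy, decomposed input that has a precomposed equivalent renders normally, and only the remaining marks show up as replacement characters.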
Please change the mod name of the modified version to rp_unicode_text
Alright, alright. I’ll rename the mod in Repixture. Admittedly, I only had lame excuses. I never disagreed with your reasoning btw and I’m perfectly aware this was bad practice, I just made up ad-hoc excuses lol.
Yeah, this is what I did in my fork but of course it would be better if combining marks just worked.
I have already asked on IRC in which cases combining marks do not work – please provide some examples.
With those examples I could work on the rendering and hopefully improve it by a bit.
(Real words and phrases please, so no Zalgo text.)
At least the replacement character tells the user this character is unsupported.
A simple way to have “unsupported characters” is to not provide them in a font. Why are you not doing that instead?
If it is not enough to do that, please tell me what changes I have to make so that this would become usable for you.
Thankfully, Unicode includes a lot of codepoints where diacriticals are "baked in", so to say.
All of the glyphs with “baked in” diacritics that I have tested also render fine in their decomposed variant with my version.
If I am not mistaken, a user could input the string “Amélie” encoded as U+0041 U+006d U+00e9 U+006c U+0069 U+0065 (composed) or as U+0041 U+006d U+0065 U+0301 U+006c U+0069 U+0065 (decomposed). In your version, the latter looks broken. In my version, both look the same (maybe not pixel-exact, but basically the same). I fail to see how rendering one of them as “Ame�lie” is any kind of improvement.
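The two encodings can be checked with a few lines of Python (a sketch to illustrate the point, independent of either version of the mod). The helper name `code_points` is made up for this example.

```python
import unicodedata

composed = "\u0041\u006d\u00e9\u006c\u0069\u0065"           # 6 code points
decomposed = "\u0041\u006d\u0065\u0301\u006c\u0069\u0065"   # 7 code points

def code_points(text):
    """Format text as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

print(code_points(composed))
print(code_points(decomposed))

# The strings differ code point by code point but are canonically equivalent.
print(composed == decomposed)
print(unicodedata.normalize("NFC", decomposed) == composed)
print(unicodedata.normalize("NFD", composed) == decomposed)
```

Because both forms are canonically equivalent, a renderer that treats them differently shows different output for what users consider the same text.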
Alright, alright. I’ll rename the mod in Repixture. Admittedly, I only had lame excuses. I never disagreed with your reasoning btw and I’m perfectly aware this was bad practice, I just made up ad-hoc excuses lol.
Thanks!