Treatment of Unicode canonical decomposition among operating systems

Efstratios Rappos

arxiv: 1711.10481 · v1 · pith:YR73J4KTnew · submitted 2017-11-28 · 💻 cs.OH


classification 💻 cs.OH

keywords characters · unicode · operating · systems · multiple · popular · representations · treated


This article shows how text characters that have multiple representations under the Unicode standard are treated by popular operating systems. Whilst most characters have a unique representation in Unicode, some characters, such as accented European letters, can have multiple representations due to a feature of Unicode called normalization. Popular operating systems treat these characters differently, which creates additional challenges for the interoperability of computer programs.
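As a small illustration of the multiple representations mentioned in the abstract, the following Python sketch compares the precomposed and decomposed forms of the letter "é" using the standard library's `unicodedata` module. The helper name `same_text` is hypothetical and is not taken from the paper; the sketch only shows the general idea and makes no claim about how any particular operating system behaves.

```python
import unicodedata

# Two encodings of the same visible character "é".
composed = "\u00e9"      # single code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT


def same_text(a: str, b: str, form: str = "NFC") -> bool:
    """Hypothetical helper: compare two strings after Unicode normalization."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# A plain comparison sees two different code point sequences.
print(composed == decomposed)

# After normalizing both sides to the same form, they compare equal.
print(same_text(composed, decomposed))

# NFD decomposes the precomposed letter; NFC recomposes the sequence.
print(unicodedata.normalize("NFD", composed) == decomposed)
print(unicodedata.normalize("NFC", decomposed) == composed)
```

A program that exchanges file names or other text between systems could apply such a normalization step to both sides before comparing them, assuming it is free to choose one normalization form for the comparison.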
