“How I Learned To ❤️ Unicode and Stop Worrying”

  by Giuseppe D’Angelo

Abstract

First released in 1991, the Unicode Standard is today the de facto standard for text interchange. Or is it?

How many times have you experienced bröken encodings when visiting a webpage? Isn’t it annoying that one just wants to type some 𝓫𝓮𝓪𝓾𝓽𝓲𝓯𝓾𝓵 code, yet ends up with ṣ̋oṃ̋̋ẹ h̰o̽̽rr̰̽ḭ̰̽b̽l̽̽ḛ̰ g͓͝͝͝͝a͔͔͔͔r̖̖͋b͒͒͒a̔̔̔̔͟͟g͈͈͈͈͈̀̀̀̀̀e͇͇ͩͩͩͩͩ on the screen? 𝑾𝒉𝒚 𝒄𝒂𝒏’𝒕 𝒘𝒆 𝒋𝒖𝒔𝒕 𝒉𝒂𝒗𝒆 𝒔𝒊𝒎𝒑𝒍𝒆 𝒕𝒉𝒊𝒏𝒈𝒔?

The reason is: string handling is complicated.

But don’t worry! The amazing minds at the Unicode Consortium have it all figured out. We just have to 🔊 to their expert advice.

In this talk we will present the foundations of Unicode string handling when using Qt.
We will start with a quick explanation of what the Unicode Standard is all about, what it contains, and why its usage is necessary when building user interfaces. This will lead into discussing the low-level classes in standard C++ and Qt that store Unicode data: QChar, QString, and so on. We will see how they work, what their feature set is, and especially how they map to Unicode concepts.
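As a small taste of that mapping: a QString stores UTF-16 code units (one per QChar), so its size is not always the number of Unicode code points. The sketch below illustrates the same idea in plain standard C++ with std::u16string. The helper name countCodePoints is hypothetical, and the code assumes well-formed UTF-16 input; it is not meant as Qt's implementation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a UTF-16 string.
// Assumes the input is well-formed (every high surrogate is followed
// by a low surrogate), so it does no validation.
std::size_t countCodePoints(const std::u16string &s)
{
    std::size_t count = 0;
    for (char16_t unit : s) {
        // A low (trailing) surrogate, 0xDC00..0xDFFF, only completes a
        // pair started by the preceding high surrogate, so skip it.
        const bool isLowSurrogate = (unit >= 0xDC00 && unit <= 0xDFFF);
        if (!isLowSurrogate)
            ++count;
    }
    return count;
}
```

For example, the string u"a\U0001D4EB" (the letter a followed by 𝓫, U+1D4EB) holds three UTF-16 code units but only two code points. Counting what a user perceives as characters (grapheme clusters) takes yet another step.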

We will then move on to some higher level facilities provided by Qt, such as classes and functions for collation, iteration over grapheme clusters, and locale-aware formatting. All of these features are necessary to build proper user interfaces.

At the end, the audience will have gained a better understanding of the foundations of modern string handling, and of how Qt immensely helps the developer do the right thing.

About the speaker

Giuseppe is an Approver of the Qt Project and a Senior Software Engineer at KDAB. He is a long-time contributor to Qt, having used Qt and C++ since 2000. His contributions to Qt range from containers and regular expressions to GUI, Widgets and OpenGL. A free software enthusiast and UNIX specialist, Giuseppe organized conferences on open source around Italy before joining KDAB. He holds a BSc in Computer Science.