- Working with mixed-byte encodings requires complex programming methods.
- Every time a new language is added, new code pages need to be created.
- Information in a variety of languages needs to be mixed and shared across different systems.
- Unicode is especially well suited to the Internet, whose worldwide nature demands solutions that work in any language. The Internet Engineering Task Force (IETF), which publishes RFCs, expects new Internet protocols to support Unicode for text.
- Many other products and standards, including XML, HTML, Java, Perl, Microsoft C#, Microsoft JScript, and Microsoft Visual Basic (VB.NET), now require the use of Unicode.
The Background of Unicode
Unicode originated from a collaboration between Xerox and Apple and was rapidly joined by other leading Information Technology (IT) companies such as IBM and Microsoft. Unicode is now accepted by all major computer companies as the de facto character encoding standard, while ISO 10646 is the corresponding worldwide de jure standard approved by all ISO member countries. The two standards include identical character repertoires and binary representations. For more information on Unicode, please visit the Unicode Consortium’s website at www.unicode.org.
Migration to Unicode
Creating a new program based on Unicode is fairly easy. Converting an existing program that uses code-page encoding to one that uses Unicode or generic declarations is also straightforward. Here are the steps to follow:
1. Modify your code to use generic data types.
2. Modify your code to use generic function prototypes.
3. Surround any character or string literal with the TEXT macro.
4. Create generic versions of your data structures.
5. Change your build process.
6. Adjust pointer arithmetic.
7. Check for any code that assumes a character is always 1 byte long.
8. Add code to support special Unicode characters.
9. Debug your port by enabling your compiler’s type-checking.
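
Several of the steps above (generic data types, generic function names, the TEXT macro, generic data structures, and size calculations that do not assume 1-byte characters) can be illustrated with a minimal, portable sketch. It does not use the real Windows headers. DEMO_TCHAR, DEMO_TEXT, demo_tcslen, DEMO_COUNTOF, DemoRecord, demo_set_name, and demo_bytes_needed are hypothetical names invented for this illustration. They only mimic the generic-text pattern, so that the same source compiles as either a narrow or a wide build depending on whether DEMO_UNICODE is defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Assumption: DEMO_UNICODE plays the role of a build switch (step 5).
   Remove this line to compile the narrow (char) version instead. */
#define DEMO_UNICODE 1

/* Steps 1-3: hypothetical stand-ins for a generic character type,
   a generic string-length function, and a TEXT-style literal macro. */
#ifdef DEMO_UNICODE
typedef wchar_t DEMO_TCHAR;
#define DEMO_TEXT(s) L##s
#define demo_tcslen wcslen
#else
typedef char DEMO_TCHAR;
#define DEMO_TEXT(s) s
#define demo_tcslen strlen
#endif

/* Steps 6-7: count array elements, never bytes. */
#define DEMO_COUNTOF(a) (sizeof(a) / sizeof((a)[0]))

/* Step 4: a data structure declared with the generic character type. */
typedef struct {
    DEMO_TCHAR name[16];
} DemoRecord;

/* Copies src into rec->name, truncating if necessary.
   Returns the number of characters stored (excluding the terminator). */
size_t demo_set_name(DemoRecord *rec, const DEMO_TCHAR *src)
{
    assert(rec != NULL && src != NULL);

    size_t capacity = DEMO_COUNTOF(rec->name); /* 16 in either build */
    size_t len = demo_tcslen(src);
    if (len >= capacity) {
        len = capacity - 1;
    }
    /* The byte count is scaled by the character size, not assumed to be len. */
    memcpy(rec->name, src, len * sizeof(DEMO_TCHAR));
    rec->name[len] = 0;
    return len;
}

/* Bytes needed to store s including its terminator. */
size_t demo_bytes_needed(const DEMO_TCHAR *s)
{
    return (demo_tcslen(s) + 1) * sizeof(DEMO_TCHAR);
}
```

A call such as demo_set_name(&rec, DEMO_TEXT("Unicode")) compiles unchanged in both builds. Only the size of each character differs, which is why the sketch multiplies by sizeof(DEMO_TCHAR) wherever a byte count is required. Note that the width of wchar_t is platform dependent, so code that handles surrogate pairs or other special characters (step 8) needs additional care that this sketch does not cover.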