A Quick Way to Distinguish Simplified Chinese Characters or Counterfeit Traditional Chinese Characters in a Big5 File (II)

In my last post, I introduced an easy way to detect simplified Chinese characters or counterfeit traditional Chinese characters in a Big5 file. In this post, I will describe some of its limitations.

1) Failure to identify the improper form of characters that have variant forms

Many traditional Chinese (CHT) characters have variant forms that are used in different words, such as 布/佈 (棉布;佈告), 周/週 (周圍;週一), 席/蓆 (主席;草蓆), 占/佔 (占卜;佔領), 注/註 (注水;註解), 于/於 (單于;於是) and 几/幾 (茶几兒;幾個). In other words, both the first form (布, 周, 席, ...) and the second form (佈, 週, 蓆, ...) are valid in traditional Chinese, even though the first form is also used in simplified Chinese. As a result, if one of the first forms (布, 周, 席, ...) appears in a CHT translation where it is improper in context, the method described in my last post will still fail to pick it out.

Take “几” as an example.

If the text reads as follows:

每粒水果口味的喉糖皆可提供全部每日所需的維他命 C 攝取量,包括以下几種口味︰

After going through the procedure described in my last post, you still cannot find the improper character “几”, because this form exists in traditional Chinese.
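To see why, here is a minimal Python sketch. It assumes the method boils down to checking whether each character can be encoded in Big5; the function name `find_non_big5` is made up for illustration, and Python's built-in `big5` codec stands in for whatever tool you actually use.

```python
def find_non_big5(text):
    """Return (position, character) pairs that cannot be encoded in Big5."""
    problems = []
    for pos, ch in enumerate(text):
        try:
            ch.encode("big5")
        except UnicodeEncodeError:
            problems.append((pos, ch))
    return problems


# "几" is improper here (it should be "幾"), but it is a valid Big5
# character, so an encoding-based check has nothing to report.
print(find_non_big5("包括以下几種口味"))

# A character that exists only in simplified Chinese is reported.
print(find_non_big5("我们"))
```

The first call comes back empty, which is exactly the blind spot: the check only knows whether a character belongs to the character set, not whether it is the right form for the word.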


Here are more examples:

If your translation contains the following sentences:

本操作說明書載有關于本焗爐的安裝、安全使用及保養的重要資料。

打開焗爐門,焗爐內底部的數據牌上,注有焗爐的電壓及接駁負載。

維他命 D3 促進鈣的吸收。* 鈣幫助保持骨骼健康。* 維他命 K 有助於骨蛋白質的形成。* 抵抗骨質疏松症的產生。

Again, you will fail to find the improper characters “于”, “注” and “松” in your translation, for the same reason.
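One possible workaround, sketched below in Python, is to keep a watch list of such ambiguous characters and flag every occurrence for a human to review. This is only an illustration, not part of the original method: the `AMBIGUOUS` table and the `flag_for_review` function are hypothetical, the list is far from complete, and the 松/鬆 pair is my own addition based on the last example sentence.

```python
# Hypothetical watch list: forms that are valid in Big5 but are also used as
# simplified forms, mapped to the variant a reviewer should consider.
AMBIGUOUS = {
    "布": "佈", "周": "週", "席": "蓆", "占": "佔",
    "注": "註", "于": "於", "几": "幾", "松": "鬆",
}


def flag_for_review(text, width=3):
    """List each ambiguous character with a little context around it."""
    hits = []
    for pos, ch in enumerate(text):
        if ch in AMBIGUOUS:
            context = text[max(0, pos - width):pos + width + 1]
            hits.append((ch, AMBIGUOUS[ch], context))
    return hits


for ch, alt, context in flag_for_review("抵抗骨質疏松症的產生。"):
    print(f"{ch} (or {alt}?) in: {context}")
```

Note that this flags correct uses too (for example, 席 in 主席), so it can only point a reviewer at places to look; deciding which form is proper still takes a person who knows the language.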

2) Failure to identify miswritten CHT characters

If the miswritten characters are themselves valid CHT characters, the described method will fail to identify them.

For example:

這家古物店賣的不少都是膺品。

這份報告你得看過清楚,不要馬虎了事。

請問你要喝甚麼?蒸溜水還是礦泉水?

這次旅行一定要找個響導,否則只怕會費時失事。

Each of the above sentences contains one miswritten character (膺, 過, 溜 and 響, respectively), but this machine-based detection method cannot identify them, because they are all valid CHT characters. The method can only identify non-CHT characters.
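Errors like these can only be caught by something that knows words, not just characters. As a rough sketch of that idea in Python, you could search for a list of known miswritten words. Everything here is hypothetical: the `KNOWN_MISWRITINGS` table and `find_miswritings` function are invented for illustration, and the suggested corrections are the words I assume were intended in the examples above.

```python
# Hypothetical list of known miswritings, mapped to the assumed intended word.
KNOWN_MISWRITINGS = {
    "膺品": "贗品",
    "蒸溜水": "蒸餾水",
    "響導": "嚮導",
}


def find_miswritings(text):
    """Return (wrong, suggested, position) for each listed word found."""
    found = []
    for wrong, right in KNOWN_MISWRITINGS.items():
        start = text.find(wrong)
        while start != -1:
            found.append((wrong, right, start))
            start = text.find(wrong, start + 1)
    return found


print(find_miswritings("這家古物店賣的不少都是膺品。"))
```

A list like this only catches mistakes you have already seen, so it supplements proofreading rather than replacing it.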

Therefore, the method is not a cure-all. You cannot rely on it alone to produce a translation free of improper characters; doing so may hurt your translation quality or even your reputation. On the other hand, as a visual and effective checking method, it can help you ensure, as far as possible, that no CHS characters are included in your translation. So use it, but never as your sole solution. Keep its advantages and disadvantages in mind, and you can make proper use of it.
