The commonly used character encodings are UTF-8, GBK, and GB2312. Computers store text as numeric codes: in ASCII, each character corresponds to a unique code. ASCII has no room for Chinese characters, yet each Chinese character also needs a unique code, so China defined its own character encoding standards, such as GB2312 and GBK. GB stands for "national standard" (guobiao). GBK and GB2312 mainly encode Chinese characters, whereas UTF-8 is used worldwide. Chinese text stored in GBK or GB2312 takes less space than the same text stored in UTF-8. If your web page is encoded in GB2312 but is read by a computer without GB2312 support, all the Chinese characters in it appear garbled. UTF-8, by contrast, is supported on virtually every computer.
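The size difference and the garbling can both be observed directly. The following is a minimal sketch, assuming a JDK that ships the GB2312 charset (standard desktop and server JDKs do); the class and method names are hypothetical, and the sample text is written with Unicode escapes so the result does not depend on the source file's own encoding.

```java
import java.nio.charset.Charset;

public class EncodingSizeDemo {
    // Number of bytes the text occupies when encoded with the named charset.
    static int encodedLength(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName)).length;
    }

    // Encode with one charset and decode with another, simulating a reader
    // that assumes the wrong encoding.
    static String decodeWith(String text, String writeCharset, String readCharset) {
        byte[] bytes = text.getBytes(Charset.forName(writeCharset));
        return new String(bytes, Charset.forName(readCharset));
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u56fd"; // the two Chinese characters for "China"

        System.out.println(encodedLength(text, "GB2312")); // 4
        System.out.println(encodedLength(text, "UTF-8"));  // 6

        // Decoding with the charset used for writing restores the text.
        System.out.println(text.equals(decodeWith(text, "GB2312", "GB2312"))); // true
        // Decoding GB2312 bytes as UTF-8 yields garbled text.
        System.out.println(text.equals(decodeWith(text, "GB2312", "UTF-8")));  // false
    }
}
```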
UTF-8 (8-bit Unicode Transformation Format) is a variable-length character encoding for Unicode, sometimes called the universal code. The original design used 1 to 6 bytes per Unicode character; the current standard (RFC 3629) limits it to 1 to 4 bytes. A BOM is allowed but usually omitted. English letters take one byte each, and Chinese characters take three or four bytes.
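The byte counts can be checked with a short program. This is only a sketch: the class and helper names are hypothetical, and U+20BB7 is chosen here as one example of a Chinese character outside the Basic Multilingual Plane, which is the case that needs four bytes.

```java
import java.nio.charset.StandardCharsets;

public class Utf8LengthDemo {
    // Number of bytes the text occupies in UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the data begins with the UTF-8 BOM bytes EF BB BF.
    static boolean startsWithBom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) {
        // A character from CJK Extension B, built from its code point.
        String rareHan = new String(Character.toChars(0x20BB7));

        System.out.println(utf8Length("A"));      // 1
        System.out.println(utf8Length("\u4e2d")); // 3
        System.out.println(utf8Length(rareHan));  // 4

        // U+FEFF at the start of a string encodes to the BOM bytes.
        byte[] withBom = "\uFEFFA".getBytes(StandardCharsets.UTF_8);
        System.out.println(startsWithBom(withBom)); // true
        System.out.println(startsWithBom("A".getBytes(StandardCharsets.UTF_8))); // false
    }
}
```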
In GBK and GB2312, a Chinese character takes 2 bytes, while ASCII characters such as English letters still take a single byte. To distinguish the two, the highest bit of the first byte of every two-byte character is set to 1 (in GB2312, the highest bit of both bytes is set).
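A short check of the GBK byte layout may help here. This is a sketch, assuming the JDK provides the GBK charset; the class and helper names are hypothetical.

```java
import java.nio.charset.Charset;

public class GbkBytesDemo {
    static final Charset GBK = Charset.forName("GBK");

    // Number of bytes the text occupies in GBK.
    static int gbkLength(String text) {
        return text.getBytes(GBK).length;
    }

    // True if the highest bit of the byte is 1.
    static boolean highBitSet(byte b) {
        return (b & 0x80) != 0;
    }

    public static void main(String[] args) {
        byte[] ascii = "A".getBytes(GBK);      // 0x41
        byte[] han = "\u4e2d".getBytes(GBK);   // 0xD6 0xD0

        System.out.println(gbkLength("A"));      // 1
        System.out.println(gbkLength("\u4e2d")); // 2

        // The lead byte of a two-byte character has its highest bit set;
        // an ASCII byte does not.
        System.out.println(highBitSet(han[0]));   // true
        System.out.println(highBitSet(ascii[0])); // false
    }
}
```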
Converting between UTF-8 and GBK:
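String a = "Hello, 中国";                 // Java strings are UTF-16 internally
byte[] gbkBytes = a.getBytes("GBK");     // encode the text as GBK
String b = new String(gbkBytes, "GBK");  // decode with the same charset the bytes were written in
byte[] utf8Bytes = b.getBytes("UTF-8");  // re-encode the same text as UTF-8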
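Converting stored bytes from one encoding to the other means decoding them with the charset they were written in and then re-encoding the resulting string with the target charset. Decoding GBK bytes as UTF-8, or the reverse, does not convert anything and produces garbled text. A minimal sketch follows, with hypothetical class and method names and assuming the JDK provides the GBK charset:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class TranscodeDemo {
    static final Charset GBK = Charset.forName("GBK");

    // Decode GBK bytes into a string, then encode that string as UTF-8.
    static byte[] gbkToUtf8(byte[] gbkBytes) {
        return new String(gbkBytes, GBK).getBytes(StandardCharsets.UTF_8);
    }

    // Decode UTF-8 bytes into a string, then encode that string as GBK.
    static byte[] utf8ToGbk(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.UTF_8).getBytes(GBK);
    }

    public static void main(String[] args) {
        byte[] gbk = {(byte) 0xD6, (byte) 0xD0};               // U+4E2D in GBK
        byte[] utf8 = {(byte) 0xE4, (byte) 0xB8, (byte) 0xAD}; // U+4E2D in UTF-8

        System.out.println(Arrays.equals(gbkToUtf8(gbk), utf8)); // true
        System.out.println(Arrays.equals(utf8ToGbk(utf8), gbk)); // true
    }
}
```

Note that a character with no GBK representation is replaced during encoding, so the UTF-8 to GBK direction can lose information.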