2017. 7. 16.

[Tip] ISO-8859-1 and Unicode

ISO-8859-1 is the default character encoding in HTML 4.01. It is a superset of ASCII: it contains the digits, the upper- and lowercase English letters, and the special characters that make up the ASCII printable range, plus additional Western European characters. The number at the end of the name is the part of the standard, and each part covers a different set of scripts and languages. The details are as follows:

| Character set | Description | Covers |
|---|---|---|
| ISO-8859-1 | Latin alphabet part 1 | North America, Western Europe, Latin America, the Caribbean, Canada, Africa |
| ISO-8859-2 | Latin alphabet part 2 | Eastern Europe |
| ISO-8859-3 | Latin alphabet part 3 | SE Europe, Esperanto, miscellaneous others |
| ISO-8859-4 | Latin alphabet part 4 | Scandinavia/Baltics (and others not in ISO-8859-1) |
| ISO-8859-5 | Latin/Cyrillic part 5 | The languages that use a Cyrillic alphabet, such as Bulgarian, Belarusian, Russian and Macedonian |
| ISO-8859-6 | Latin/Arabic part 6 | The languages that use the Arabic alphabet |
| ISO-8859-7 | Latin/Greek part 7 | The modern Greek language as well as mathematical symbols derived from the Greek |
| ISO-8859-8 | Latin/Hebrew part 8 | The languages that use the Hebrew alphabet |
| ISO-8859-9 | Latin 5 part 9 | The Turkish language. Same as ISO-8859-1 except Turkish characters replace Icelandic ones |
| ISO-8859-10 | Latin 6 Lappish, Nordic, Eskimo | The Nordic languages |
| ISO-8859-15 | Latin 9 (aka Latin 0) | Similar to ISO-8859-1 but replaces some less common symbols with the euro sign and some other missing characters |
| ISO-2022-JP | Latin/Japanese part 1 | The Japanese language |
| ISO-2022-JP-2 | Latin/Japanese part 2 | The Japanese language |
| ISO-2022-KR | Latin/Korean part 1 | The Korean language |
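
To make the differences between the parts concrete, here is a minimal Python sketch. It only uses the standard library codecs; the variable names are made up for this example. It shows that ASCII bytes decode the same way in every part, while a byte above 0x7F can mean a different character depending on the part.

```python
# A few ISO-8859 parts from the table above, using Python's codec names.
parts = ["iso-8859-1", "iso-8859-2", "iso-8859-5", "iso-8859-9", "iso-8859-15"]

# Bytes below 0x80 are plain ASCII, so every part decodes them identically.
ascii_bytes = b"Hello, 123!"
ascii_decoded = {name: ascii_bytes.decode(name) for name in parts}

# Byte 0xA4 is the generic currency sign in ISO-8859-1,
# but the euro sign in ISO-8859-15.
currency_latin1 = b"\xa4".decode("iso-8859-1")
currency_latin9 = b"\xa4".decode("iso-8859-15")

# Byte 0xF0 is the Icelandic letter eth in ISO-8859-1,
# but the Turkish letter g with breve in ISO-8859-9.
eth_latin1 = b"\xf0".decode("iso-8859-1")
g_breve_latin5 = b"\xf0".decode("iso-8859-9")

for name, text in ascii_decoded.items():
    print(name, text)
print(currency_latin1, currency_latin9)
print(eth_latin1, g_breve_latin5)
```

The point is that a byte stream alone does not tell you which characters it holds; you also need to know which part was used to encode it.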


Because ISO-8859-1, a Western European character set, is the default HTML encoding, Korean parameters carried in a request get garbled during HTTP communication. For this reason, you have to convert them to a Unicode encoding such as UTF-8. Unicode was created to address this limitation of ASCII and its single-byte extensions: it is a computing industry standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems.
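
The garbling can be reproduced in a few lines. The sketch below assumes the client sent the Korean text as UTF-8 bytes and the receiving side decoded those bytes as ISO-8859-1; the function names `simulate_wrong_decoding` and `repair` are hypothetical, chosen only for this illustration. Because ISO-8859-1 maps every byte value to exactly one character, the original bytes can be recovered from the garbled string and decoded again as UTF-8.

```python
def simulate_wrong_decoding(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as ISO-8859-1 (the mistake)."""
    return text.encode("utf-8").decode("iso-8859-1")


def repair(garbled: str) -> str:
    """Undo the mistake: recover the raw bytes, then decode them as UTF-8."""
    return garbled.encode("iso-8859-1").decode("utf-8")


original = "한글"
garbled = simulate_wrong_decoding(original)
restored = repair(garbled)

# Korean characters have no code points in ISO-8859-1,
# so encoding them directly with that charset fails.
try:
    original.encode("iso-8859-1")
    fits_in_latin1 = True
except UnicodeEncodeError:
    fits_in_latin1 = False

print(repr(garbled))      # six unrelated Latin-1 characters, one per UTF-8 byte
print(restored)           # the original Korean text
print(fits_in_latin1)
```

In a real application it is better to make the receiving side decode the request as UTF-8 in the first place; repairing the string afterwards is only a workaround for when the decoding step cannot be changed.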
