Newsgroups : Borland : borland.public.delphi.internet.winsock : 2006 Oct : Re: problem with UTF-8 encoding
| Subject: | Re: problem with UTF-8 encoding |
| Posted by: | "Marek Weyda" (mar..@slim.cz) |
| Date: | Wed, 4 Oct 2006 09:53:26 |
Thank you for your answer. I now have a Base64 decoder and it works.
Marek W.
"Remy Lebeau (TeamB)" <no.spam@no.spam.com> wrote in message
news:45229c09$1@newsgroups.borland.com...
>
> "Marek Weyda" <marek@slim.cz> wrote in message
> news:45224f72$1@newsgroups.borland.com...
>
>> When some e-mail is UTF-8 encoded MailMessage.Subject returns
>> something like: '=?utf-8?B?w6HDqcOtw7PDusWvw ... '
>
> As it should be.
>
>> Where's the problem ?
>
> There is no problem. That is perfectly normal behavior. E-mail is
> ASCII-based. To send Unicode, it has to be encoded into an
> ASCII-compatible format, such as UTF-8. It is the receiver's
> responsibility to decode it in order to access the original Unicode data.
>
>> It means that Indy doesn't know how to decode UTF-8
>> encoded e-mail subject or something else ?
>
> That is correct. UTF-8 decoding is not supported at this time, mainly
> because the VCL itself is Ansi-based. Indy still uses AnsiString instead
> of WideString. Even if Indy were to decode the UTF-8 data internally, the
> higher characters would be lost when the decoded WideString is converted
> back to AnsiString by the VCL runtime.
>
> If you need access to the Unicode data, then you will have to decode it
> manually. The line above consists of three parts, formatted as follows:
>
> =?charset?encoding?data?=
>
> When generating the email, the original data is transformed using the
> charset, and then that encoded data is transformed using the encoding,
> resulting in the final data. In this case, the 'B' stands for base64. So
> the Unicode data was first encoded as UTF-8 bytes, then the UTF-8 bytes
> were encoded as ASCII text using base64, and then the base64 data was
> placed into the email. To decode, simply reverse the process. Decode the
> base64 into a UTF-8 string, and then decode the UTF-8 data into a Unicode
> string.
>
>
> Gambit
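
The two decoding steps described above (base64 first, then the charset) can be sketched as follows. This is a minimal illustration in Python, not Delphi or Indy code. The function name decode_encoded_word and the sample word are assumptions made for this example, and the sketch handles only the 'B' (base64) encoding of a single =?charset?encoding?data?= word.

```python
import base64
import re

# One encoded word in the form =?charset?encoding?data?=
ENCODED_WORD = re.compile(r"=\?([^?]+)\?([bBqQ])\?([^?]*)\?=")


def decode_encoded_word(word: str) -> str:
    """Decode a single base64 ('B') encoded word into a Unicode string.

    Hypothetical helper for illustration only; it is not part of Indy.
    """
    match = ENCODED_WORD.fullmatch(word)
    if match is None:
        raise ValueError("not in the form =?charset?encoding?data?=")
    charset, encoding, data = match.groups()
    if encoding.upper() != "B":
        # 'Q' (quoted-printable style) is deliberately left out of this sketch.
        raise ValueError("only the 'B' (base64) encoding is handled here")
    raw_bytes = base64.b64decode(data)  # step 1: base64 text -> charset bytes
    return raw_bytes.decode(charset)    # step 2: charset bytes -> Unicode text


if __name__ == "__main__":
    # Short sample word built for illustration (UTF-8, base64).
    print(decode_encoded_word("=?utf-8?B?w6HDqcOt?="))  # prints "áéí"
```

A real subject line may mix several encoded words with plain ASCII text, so production code would need to split the header before applying this step to each word.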