Newsgroups : Borland : borland.public.delphi.internet.winsock : 2006 Oct : Re: problem with UTF-8 encoding

www.cryer.info
Managed Newsgroup Archive

Re: problem with UTF-8 encoding

Subject: Re: problem with UTF-8 encoding
Posted by: "Marek Weyda" (mar..@slim.cz)
Date: Wed, 4 Oct 2006 09:53:26

Thank you for your answer. I now have a Base64 decoder and it works.

Marek W.

"Remy Lebeau (TeamB)" <no.spam@no.spam.com> wrote in message
news:45229c09$1@newsgroups.borland.com...
>
> "Marek Weyda" <marek@slim.cz> wrote in message
> news:45224f72$1@newsgroups.borland.com...
>
>> When some e-mail is UTF-8 encoded MailMessage.Subject returns
>> something like: '=?utf-8?B?w6HDqcOtw7PDusWvw ... '
>
> As it should be.
>
>> Where's the problem ?
>
> There is no problem.  That is perfectly normal behavior.  E-mail is
> ASCII-based.  To send Unicode, it has to be encoded into an
> ASCII-compatible format, such as UTF-8.  It is the receiver's
> responsibility to decode it in order to access the original Unicode data.
>
>> It means that Indy doesn't know how to decode UTF-8
>> encoded e-mail subject or something else ?
>
> That is correct.  UTF-8 decoding is not supported at this time, mainly
> because the VCL itself is Ansi-based.  Indy still uses AnsiString instead
> of WideString.  Even if Indy were to decode the UTF-8 data internally, any
> characters that the Ansi code page cannot represent would be lost when the
> decoded WideString is converted back to AnsiString by the VCL runtime.
>
> If you need access to the Unicode data, then you will have to decode it
> manually.  The line above consists of three parts, formatted as follows:
>
>    =?charset?encoding?data?=
>
> When generating the email, the original data is transformed using the
> charset, and then that encoded data is transformed using the encoding,
> resulting in the final data.  In this case, the 'B' stands for base64.
> So the Unicode data was converted to bytes using UTF-8, then those UTF-8
> bytes were encoded as ASCII text using base64, and then the base64 text
> was placed into the email.  To decode, simply reverse the process:
> decode the base64 into a UTF-8 string, and then decode the UTF-8 data
> into a Unicode string.
>
>
> Gambit
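
The two decoding steps described above can be sketched roughly as follows. This is only an illustration in Python, not Indy or VCL code. The function name decode_encoded_word and the sample subject are made up for this example; the sample is not the poster's full subject line, which is truncated in the thread. The sketch handles only the base64 ('B') form and assumes the charset name is one that Python recognizes.

```python
import base64


def decode_encoded_word(word: str) -> str:
    """Decode one '=?charset?encoding?data?=' token (hypothetical helper).

    Only the base64 ('B') encoding is handled in this sketch.
    """
    if not (word.startswith("=?") and word.endswith("?=")):
        raise ValueError("not an encoded word")

    # Strip the '=?' and '?=' delimiters, then split into the three parts.
    charset, encoding, data = word[2:-2].split("?", 2)
    if encoding.upper() != "B":
        raise ValueError("only the base64 'B' encoding is handled here")

    raw = base64.b64decode(data)   # step 1: base64 text -> raw bytes
    return raw.decode(charset)     # step 2: charset bytes -> Unicode string


# Illustrative sample only (an assumption, not taken from a real message).
sample = "=?utf-8?B?w6HDqcOtw7PDug==?="
print(decode_encoded_word(sample))
```

In Delphi the same order applies: first run the data part through a base64 decoder, then convert the resulting UTF-8 bytes to a WideString, for example with the RTL's UTF8Decode. Keep the result in a WideString, since assigning it back to an AnsiString would lose characters as described above.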

