Newsgroups : Microsoft : microsoft.public.inetsdk.programming.wininet : 2006 Mar : Bug in InternetCanonicalizeUrlW in converting Unicode characters ?

www.cryer.info
Managed Newsgroup Archive

Bug in InternetCanonicalizeUrlW in converting Unicode characters ?

Subject: Bug in InternetCanonicalizeUrlW in converting Unicode characters?
Posted by:"Ravik" (rav..@discussions.microsoft.com)
Date:Wed, 8 Mar 2006 16:59:29

There seems to be a problem in this function. I'd expect it to either escape
non-ASCII Unicode characters or convert them to UTF-8, but instead the result
depends on the system locale. For example, take the following sample:


// Test app that calls the wide-char versions of the URL functions,
// showing the bug in the Japanese locale.
#include <windows.h>
#include <wininet.h>
#include <stdio.h>

int main()
{
    wchar_t buf[1024];
    DWORD dsize = _countof(buf);
    // "C:\" followed by U+FF11 (FULLWIDTH DIGIT ONE)
    wchar_t wszName[] = {L'C', L':', L'\\', 0xff11, 0};
    BOOL b = ::InternetCanonicalizeUrlW(wszName, buf, &dsize, 0);
    if (b) {
        URL_COMPONENTSW uc;
        memset(&uc, 0, sizeof(uc));  // default all fields to null
        uc.dwStructSize = sizeof(uc);  // documented as required by InternetCrackUrl
        wchar_t wszUrlPath[200];
        uc.lpszUrlPath = wszUrlPath;
        uc.dwUrlPathLength = _countof(wszUrlPath);
        BOOL b2 = ::InternetCrackUrlW(buf, 0, 0, &uc);
        if (b2) {
            printf("UrlPath len = %lu: ", uc.dwUrlPathLength);  // DWORD is unsigned long
            for (DWORD i = 0; i < uc.dwUrlPathLength; i++) {
                printf(" %x", wszUrlPath[i]);
            }
            printf("\n");
        }
    }
}

If you run it under the US locale, you get this output (U+FF11 is passed
through unchanged):

UrlPath len = 4:  43 3a 5c ff11

And under the Japanese locale, you get this output (U+FF11 has been replaced
by 0x45):

UrlPath len = 4:  43 3a 5c 45
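
For reference, here is a minimal sketch of the kind of escaping I would have
expected for the path. It is not WinINet code: EscapeAsUtf8 is a hypothetical
helper name, and it takes a std::u16string rather than wchar_t so that it
compiles the same way off Windows (both are assumptions made for illustration).
It encodes each UTF-16 character as UTF-8 and percent-escapes every byte
outside printable ASCII.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of WinINet): UTF-8 encode a UTF-16 string
// and percent-escape spaces, control bytes and all non-ASCII bytes.
// Printable ASCII (including '%' and '\') is copied through unchanged.
std::string EscapeAsUtf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Combine a surrogate pair into a single code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // unpaired surrogate: use the replacement character
        }

        // Encode the code point as 1 to 4 UTF-8 bytes.
        unsigned char bytes[4];
        int n = 0;
        if (cp < 0x80) {
            bytes[n++] = static_cast<unsigned char>(cp);
        } else if (cp < 0x800) {
            bytes[n++] = static_cast<unsigned char>(0xC0 | (cp >> 6));
            bytes[n++] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            bytes[n++] = static_cast<unsigned char>(0xE0 | (cp >> 12));
            bytes[n++] = static_cast<unsigned char>(0x80 | ((cp >> 6) & 0x3F));
            bytes[n++] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
        } else {
            bytes[n++] = static_cast<unsigned char>(0xF0 | (cp >> 18));
            bytes[n++] = static_cast<unsigned char>(0x80 | ((cp >> 12) & 0x3F));
            bytes[n++] = static_cast<unsigned char>(0x80 | ((cp >> 6) & 0x3F));
            bytes[n++] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
        }

        // Copy printable ASCII; write everything else as %XX.
        for (int k = 0; k < n; ++k) {
            if (bytes[k] > 0x20 && bytes[k] < 0x7F) {
                out += static_cast<char>(bytes[k]);
            } else {
                char hex[4];
                std::snprintf(hex, sizeof hex, "%%%02X",
                              static_cast<unsigned>(bytes[k]));
                out += hex;
            }
        }
    }
    return out;
}

// Sanity check using the path from the sample above.
void CheckSamplePath()
{
    assert(EscapeAsUtf8(u"C:\\\uFF11") == "C:\\%EF%BC%91");
}
```

With this, the sample path (C:\ followed by U+FF11) comes out as
C:\%EF%BC%91 regardless of the locale.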


Thanks for any information.


--
Ravi K.
