Thread: Convert UTF8 to Windows 1251



Replies: 1 - Last Post: Jul 6, 2015 9:38 AM - Last Post By: Remy Lebeau (Te...
Alexander Son

Posts: 3
Registered: 12/4/13
Convert UTF8 to Windows 1251
  Posted: Jul 5, 2015 11:49 PM
Hi,

I am trying to convert text from UTF-8 to Windows-1251.

This is the source text: Ñàíêò-Ïåòåðáóðã
This is the target: Санкт-Петербург

I tested a lot of functions (Utf8ToAnsi, UTF8ToString, Utf8Encode and others) but didn't get the correct result.

Need help.
Remy Lebeau (Te...


Posts: 9,447
Registered: 12/23/01
Re: Convert UTF8 to Windows 1251
  Posted: Jul 6, 2015 9:38 AM   in response to: Alexander Son
Alexander wrote:

I am trying to convert text from UTF-8 to Windows-1251.

Which version of Delphi are you using? What code have you tried so far?

If you are using Delphi 2009+, the easiest solution is to use the UTF8String
and AnsiString(N) types and let the RTL handle the conversion for you:

type
  Win1251String = type AnsiString(1251);
var
  Utf8: UTF8String;
  Win1251: Win1251String;
begin
  Utf8 := ...;
  Win1251 := Win1251String(Utf8);
  // use Win1251 as needed...
end;
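
As a way to check the result of the cast, here is a minimal console sketch, assuming Delphi 2009 or later on Windows. The helper names Utf8ToWin1251 and BytesToHex are hypothetical and only exist for this illustration. The input is written as code points so the example does not depend on how the source file is saved.

```pascal
program Utf8ToWin1251Demo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  Win1251String = type AnsiString(1251);

// Hypothetical helper: the typecast asks the RTL to transcode
// from UTF-8 to code page 1251.
function Utf8ToWin1251(const S: UTF8String): Win1251String;
begin
  Result := Win1251String(S);
end;

// Hypothetical helper: renders the raw bytes of any 8-bit string as hex.
function BytesToHex(const S: RawByteString): string;
var
  I: Integer;
begin
  Result := '';
  for I := 1 to Length(S) do
    Result := Result + IntToHex(Ord(S[I]), 2);
end;

var
  Source: UnicodeString;
  Utf8: UTF8String;
  Win1251: Win1251String;
begin
  // 'Санкт-Петербург' spelled out as UTF-16 code points.
  Source := #$0421#$0430#$043D#$043A#$0442'-'#$041F#$0435#$0442#$0435#$0440#$0431#$0443#$0440#$0433;

  Utf8 := UTF8String(Source);
  Win1251 := Utf8ToWin1251(Utf8);

  Writeln('UTF-8 bytes:        ', BytesToHex(Utf8));
  Writeln('Windows-1251 bytes: ', BytesToHex(Win1251));
end.
```

The Windows-1251 encoding of that text is the 15 bytes D1 E0 ED EA F2 2D CF E5 F2 E5 F0 E1 F3 F0 E3, so the second line of output can be compared against that sequence.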


Alternatively, you can use the RTL's UnicodeFromLocaleChars() and LocaleCharsFromUnicode()
functions:

var
  Utf8: UTF8String;
  Utf16: UnicodeString;
  Win1251: AnsiString;
begin
  Utf8 := ...;
 
  SetLength(Utf16, UnicodeFromLocaleChars(65001, 0, PAnsiChar(Utf8), Length(Utf8),
    nil, 0));
  UnicodeFromLocaleChars(65001, 0, PAnsiChar(Utf8), Length(Utf8), PWideChar(Utf16),
    Length(Utf16));
 
  SetLength(Win1251, LocaleCharsFromUnicode(1251, 0, PWideChar(Utf16), Length(Utf16),
    nil, 0, nil, nil));
  LocaleCharsFromUnicode(1251, 0, PWideChar(Utf16), Length(Utf16), PAnsiChar(Win1251),
    Length(Win1251), nil, nil);
 
  // use Win1251 as needed...
end;


Or you can use the Win32 MultiByteToWideChar() and WideCharToMultiByte()
functions directly:

var
  Utf8: UTF8String;
  Utf16: UnicodeString; // or WideString in pre-2009 versions
  Win1251: AnsiString;
begin
  Utf8 := ...;
 
  SetLength(Utf16, MultiByteToWideChar(65001, 0, PAnsiChar(Utf8), Length(Utf8),
    nil, 0));
  MultiByteToWideChar(65001, 0, PAnsiChar(Utf8), Length(Utf8), PWideChar(Utf16),
    Length(Utf16));
 
  SetLength(Win1251, WideCharToMultiByte(1251, 0, PWideChar(Utf16), Length(Utf16),
    nil, 0, nil, nil));
  WideCharToMultiByte(1251, 0, PWideChar(Utf16), Length(Utf16), PAnsiChar(Win1251),
    Length(Win1251), nil, nil);
 
  // use Win1251 as needed...
end;
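
The Win32 calls above ignore failures. The following is a hedged sketch of how the same two-step conversion might be wrapped with basic checks, assuming Delphi 2009 or later. The function name Utf8ToCodePage and its Lossy parameter are hypothetical. MB_ERR_INVALID_CHARS and the lpUsedDefaultChar argument are standard Win32 features, but the lpUsedDefaultChar argument is only valid when the target is not UTF-7 or UTF-8, so this helper is meant for code pages such as 1251.

```pascal
program Utf8ToWin1251Checked;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

// Hypothetical helper, not part of the RTL or the Win32 API.
// Converts UTF-8 bytes to a non-UTF target code page and reports
// whether any character had to be replaced by the default character.
function Utf8ToCodePage(const Utf8: RawByteString; CodePage: Word;
  out Lossy: Boolean): RawByteString;
var
  Utf16: UnicodeString;
  Len: Integer;
  UsedDefault: BOOL;
begin
  Result := '';
  Lossy := False;
  if Length(Utf8) = 0 then
    Exit;

  // Step 1: UTF-8 -> UTF-16. The flag makes malformed input fail
  // instead of being silently replaced.
  Len := MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
    PAnsiChar(Utf8), Length(Utf8), nil, 0);
  if Len = 0 then
    RaiseLastOSError;
  SetLength(Utf16, Len);
  MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
    PAnsiChar(Utf8), Length(Utf8), PWideChar(Utf16), Len);

  // Step 2: UTF-16 -> target code page.
  UsedDefault := False;
  Len := WideCharToMultiByte(CodePage, 0, PWideChar(Utf16), Length(Utf16),
    nil, 0, nil, nil);
  if Len = 0 then
    RaiseLastOSError;
  SetLength(Result, Len);
  WideCharToMultiByte(CodePage, 0, PWideChar(Utf16), Length(Utf16),
    PAnsiChar(Result), Len, nil, @UsedDefault);
  Lossy := UsedDefault;

  // Label the bytes with their code page without converting them again.
  SetCodePage(Result, CodePage, False);
end;

var
  Utf8: UTF8String;
  Win1251: RawByteString;
  Lossy: Boolean;
begin
  // 'Санкт-Петербург' spelled out as UTF-16 code points.
  Utf8 := UTF8String(UnicodeString(
    #$0421#$0430#$043D#$043A#$0442'-'#$041F#$0435#$0442#$0435#$0440#$0431#$0443#$0440#$0433));
  Win1251 := Utf8ToCodePage(Utf8, 1251, Lossy);
  Writeln('Bytes: ', Length(Win1251), ', lossy: ', Lossy);
end.
```

Whether to raise an exception or return a status on failure is a design choice; RaiseLastOSError is used here only to keep the sketch short.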


This is the source text: Ñàíêò-Ïåòåðáóðã
This is the target: Санкт-Петербург

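'Ñàíêò-Ïåòåðáóðã' is not the UTF-8 encoded form of 'Санкт-Петербург'. It is what
the Windows-1251 bytes of 'Санкт-Петербург' look like when they are displayed as
Windows-1252. The UTF-8 bytes displayed that way would look like
'Ð¡Ð°Ð½ÐºÑ‚-ÐŸÐµÑ‚ÐµÑ€Ð±ÑƒÑ€Ð³' instead.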
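
One plausible explanation for the source text, which the sketch below tests, is that the bytes are already Windows-1251 and are only being displayed as Windows-1252. This is a minimal sketch assuming Delphi 2009 or later on Windows. The function names MisreadWin1251AsWin1252 and RepairMisread are hypothetical, and both strings are written as code points to avoid depending on the source file encoding.

```pascal
program MojibakeCheck;

{$APPTYPE CONSOLE}

type
  Win1251String = type AnsiString(1251);
  Win1252String = type AnsiString(1252);

// Hypothetical helper: encodes S as Windows-1251, then decodes the
// same bytes as if they were Windows-1252.
function MisreadWin1251AsWin1252(const S: UnicodeString): UnicodeString;
var
  Raw: RawByteString;
begin
  Raw := Win1251String(S);        // real conversion to Windows-1251 bytes
  SetCodePage(Raw, 1252, False);  // relabel only, bytes stay the same
  Result := UnicodeString(Raw);   // decode using the (wrong) label
end;

// Hypothetical helper: undoes the misreading above.
function RepairMisread(const S: UnicodeString): UnicodeString;
var
  Raw: RawByteString;
begin
  Raw := Win1252String(S);        // recover the original bytes
  SetCodePage(Raw, 1251, False);  // relabel them as Windows-1251
  Result := UnicodeString(Raw);   // decode using the correct label
end;

var
  Target, Garbled: UnicodeString;
begin
  // 'Санкт-Петербург'
  Target := #$0421#$0430#$043D#$043A#$0442'-'#$041F#$0435#$0442#$0435#$0440#$0431#$0443#$0440#$0433;
  // 'Ñàíêò-Ïåòåðáóðã' as posted in the question
  Garbled := #$00D1#$00E0#$00ED#$00EA#$00F2'-'#$00CF#$00E5#$00F2#$00E5#$00F0#$00E1#$00F3#$00F0#$00E3;

  Writeln('Misread matches posted text: ', MisreadWin1251AsWin1252(Target) = Garbled);
  Writeln('Repair matches target:       ', RepairMisread(Garbled) = Target);
end.
```

If both lines report TRUE, the data needs no UTF-8 conversion at all, only the correct code page label when it is read.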

I tested a lot of functions: Utf8ToAnsi

That does not allow you to specify the target charset.

UTF8ToString

That is for decoding a UTF-8 string to UTF-16.

Utf8Encode

That is for encoding a UTF-16 string to UTF-8.
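
To illustrate the direction each function works in, here is a small round-trip sketch, assuming Delphi 2009 or later. It only uses UTF8Encode and UTF8ToString from the System unit; the variable names are arbitrary. Neither call involves Windows-1251 at any point.

```pascal
program Utf8RoundTrip;

{$APPTYPE CONSOLE}

var
  Utf16, Decoded: UnicodeString;
  Utf8: RawByteString;
begin
  // 'Санкт-Петербург' spelled out as UTF-16 code points.
  Utf16 := #$0421#$0430#$043D#$043A#$0442'-'#$041F#$0435#$0442#$0435#$0440#$0431#$0443#$0440#$0433;

  Utf8 := UTF8Encode(Utf16);       // UTF-16 -> UTF-8 bytes
  Decoded := UTF8ToString(Utf8);   // UTF-8 bytes -> UTF-16

  Writeln('UTF-16 length (chars): ', Length(Utf16));
  Writeln('UTF-8 length (bytes):  ', Length(Utf8));
  Writeln('Round trip intact:     ', Decoded = Utf16);
end.
```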

--
Remy Lebeau (TeamB)
