Thread: TListView Extended ASCII Characters not showing up


This question is not answered. Helpful answers available: 2. Correct answers available: 1.


Replies: 7 - Last Post: Dec 10, 2015 11:48 AM by Kris G
Kris G

Posts: 5
Registered: 12/15/13
TListView Extended ASCII Characters not showing up  
  Posted: Dec 9, 2015 2:29 PM
Hi all,

I'm trying to fix a bug where a TListView subitem needs to display extended ASCII characters properly. At the moment it doesn't, and I'm stumped as to why. I know we have other applications that do, so I'm wondering whether something in the project settings needs to be changed for it to display. I'm using XE6.

Originally I thought it was an issue with the conversions, but I tried the following and it still didn't display correctly.

  TListItem * item = FindListItem(message_id_num);
  if (item == NULL)
  {
    item = m_lst_messages->Items->Add();
    for (int i=1; i<m_lst_messages->Columns->Count; i++)
    {
      item->SubItems->Add("FÑÙ");
    }
  }


Instead, it displays FÃ'Ù.

There are other forms in this project that correctly display it. It's just this one.
Remy Lebeau (Te...


Posts: 9,447
Registered: 12/23/01
Re: TListView Extended ASCII Characters not showing up  
  Posted: Dec 9, 2015 3:21 PM   in response to: Kris G
Kris wrote:

I'm trying to fix a bug where a TListView subitem needs to be able
to properly display extended ASCII characters.

You are passing an Ansi string literal where a System::String is expected.
To avoid any data loss, pass a Unicode string literal instead, e.g.:

item->SubItems->Add(L"FÑÙ"); // <-- notice the 'L' prefix


Or:

item->SubItems->Add(_D("FÑÙ"));


_D() is a C++Builder specific macro that tells the compiler to interpret
the literal using the same encoding that System::Char and System::String
use (in this case, UTF-16). It is a similar concept to C's _T() macro for
_TCHAR data, and Win32's TEXT() macro for TCHAR data.
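To see why the literal prefix matters, here is a minimal sketch in standard C++ rather than C++Builder code. It assumes u"..." (char16_t) as a stand-in for L"..." or _D("...") on Windows, where wchar_t holds UTF-16 code units. The function names are hypothetical, and escapes are used so the result does not depend on how this file is encoded.

```cpp
#include <string>

// Narrow literal: the bytes a UTF-8 encoded source file stores for "FÑÙ".
// Five bytes, because Ñ and Ù each take two bytes in UTF-8.
inline std::string narrow_literal() {
    return "F\xC3\x91\xC3\x99";
}

// UTF-16 literal: one 16-bit code unit per character for this text.
// Three code units: U+0046, U+00D1, U+00D9.
inline std::u16string utf16_literal() {
    return u"F\u00D1\u00D9";
}
```

The narrow form needs a codepage-aware conversion before it can become UTF-16, and that conversion is where a wrong codepage assumption can creep in. The UTF-16 form needs no conversion at all.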

It's not at the moment, and I'm stumped as to why.

Your source file is UTF-8 encoded (right-click on the code editor, File Format,
UTF8). The UTF-8 encoded form of "FÑÙ" (0x46 0xC3 0x91 0xC3 0x99) is being
stored in your EXE file. At runtime, you are passing a UTF-8 encoded char*
to Add(). Since Add() expects a UnicodeString as input, the RTL has to perform
a data conversion. But the RTL does not know that your char* data is UTF-8
encoded. Passing a char* to a UnicodeString converts the data using the
OS default Ansi codepage (or, more accurately, the codepage specified by
the global System::DefaultSystemCodePage variable), which is not UTF-8.
Thus, the conversion produces the wrong result.
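To make the failure concrete, here is a minimal standard C++ sketch (not RTL code) that reads bytes as Windows-1252, which is what an Ansi-to-Unicode conversion does when the default codepage is 1252. The names cp1252_to_unicode and decode_as_cp1252 are made up for illustration.

```cpp
#include <string>

// Map one Windows-1252 byte to its Unicode code point.
// Only 0x80..0x9F differ from Latin-1; unassigned slots map to the
// matching C1 control code, as Windows does.
inline char16_t cp1252_to_unicode(unsigned char b) {
    static const char16_t high[32] = {
        0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
        0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
        0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
        0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178};
    if (b >= 0x80 && b <= 0x9F) return high[b - 0x80];
    return b;  // ASCII and 0xA0..0xFF match Unicode directly
}

// Treat every byte as one Windows-1252 character, with no UTF-8 awareness.
inline std::u16string decode_as_cp1252(const std::string& bytes) {
    std::u16string out;
    for (unsigned char b : bytes) out.push_back(cp1252_to_unicode(b));
    return out;
}
```

Fed the UTF-8 bytes 0x46 0xC3 0x91 0xC3 0x99, this yields five characters instead of three: F, Ã, U+2018, Ã, U+2122. The garbled text quoted in this thread may look slightly different, since curly quotes and symbols are often normalized when text is copied around.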

Knowing that your source file is UTF-8 encoded, an alternative to the above
code would be to provide an extra runtime cast to ensure the correct conversion
from UTF-8 (instead of Ansi) to UTF-16:

item->SubItems->Add(UTF8String("FÑÙ")); // <-- notice no 'L' prefix


The RTL knows how to properly convert a UTF-8 encoded UTF8String to a UTF-16
encoded UnicodeString.
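For comparison, this is roughly what a correct UTF-8 to UTF-16 conversion has to do. It is a simplified standard C++ sketch, not how the RTL implements the UTF8String conversion; utf8_to_utf16 is a hypothetical name, and the sketch does not reject overlong forms or encoded surrogates.

```cpp
#include <cstddef>
#include <string>

// Decode UTF-8 into UTF-16 code units. Malformed bytes become U+FFFD.
inline std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (b0 < 0x80)                { cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; extra = 1; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; extra = 3; }
        else { out.push_back(u'\uFFFD'); ++i; continue; }  // invalid lead byte

        bool ok = (i + extra < in.size());  // sequence must not be truncated
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char bk = static_cast<unsigned char>(in[i + k]);
            if ((bk & 0xC0) != 0x80) ok = false;  // not a continuation byte
            else cp = (cp << 6) | (bk & 0x3F);
        }
        if (!ok) { out.push_back(u'\uFFFD'); ++i; continue; }
        i += extra + 1;

        if (cp >= 0x10000) {  // outside the BMP: emit a surrogate pair
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
    }
    return out;
}
```

With the UTF-8 bytes for "FÑÙ" as input, the result is the three code units U+0046 U+00D1 U+00D9, which is the data a UnicodeString should end up holding.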

Originally I thought it was an issue with the conversions, but I
tried the following and it still didn't display correctly.
<snip>
Instead, it displays FÃ'Ù.

"FÃ'Ù" is the UTF-8 encoded form of "FÑÙ" being interpreted as Windows-1252
instead of as UTF-8.

--
Remy Lebeau (TeamB)
Remy Lebeau (Te...


Posts: 9,447
Registered: 12/23/01
Re: TListView Extended ASCII Characters not showing up  
  Posted: Dec 9, 2015 3:23 PM   in response to: Remy Lebeau (Te...
Remy wrote:

You are passing an Ansi string literal where a System::String is
expected. To avoid any data loss, pass a Unicode string literal
instead

Remember, everything in the RTL/VCL from CB2009 onwards is now Unicode-based,
not Ansi-based anymore. Don't Ansi-fy your code.

--
Remy Lebeau (TeamB)
Kris G

Posts: 5
Registered: 12/15/13
Re: TListView Extended ASCII Characters not showing up  
  Posted: Dec 9, 2015 3:31 PM   in response to: Remy Lebeau (Te...
The code originally used L"...", and that wasn't working, so I had already tried that.

I tried the suggestion of

  TListItem * item = FindListItem(message_id_num);
  if (item == NULL)
  {
    item = m_lst_messages->Items->Add();
    for (int i=1; i<m_lst_messages->Columns->Count; i++)
    {
      item->SubItems->Add(UTF8String("FÑÙ"));
    }
  }


I got something even weirder displaying. I can't reproduce it here because I can't copy it, but it's pretty much an F and then what looks like upside-down question marks and small 1/2's.
Remy Lebeau (Te...


Posts: 9,447
Registered: 12/23/01
Re: TListView Extended ASCII Characters not showing up  
  Posted: Dec 9, 2015 4:02 PM   in response to: Kris G
Kris wrote:

The code originally had the L"...", and that wasn't working.

I assure you, it does. VCL is Windows-specific. On Windows, the L"..."
syntax produces a UTF-16 encoded wchar_t[] array for the literal data, and
UnicodeString has a constructor that accepts a UTF-16 encoded wchar_t* as
input (a wchar_t[] array can decay into a wchar_t* pointer). The data gets
copied as-is at runtime without any conversion. Works fine in every version
of C++Builder, from 2009 onwards using UnicodeString, and 2007 and earlier
using WideString.

I tried the suggestion of
<snip>
I got something even weirder displaying.

Then your string literal data is not actually UTF-8 encoded to begin with.

I can't reproduce it because I can't copy it

How about a screenshot?

but it's pretty much an F and then what looks like upside-down question
marks and small 1/2's.

In UTF-8:

- "FÑÙ" (U+0046 U+00D1 U+00D9) is encoded as 0x46 0xC3 0x91 0xC3 0x99.

- "¿" (U+00BF) is encoded as 0xC2 0xBF.

- "½" (U+00BD) is encoded as 0xC2 0xBD.
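The byte sequences listed above can be checked with a small encoder. This is a standard C++ sketch with a hypothetical function name (encode_utf8). It assumes the input is a valid Unicode scalar value and does no range checking.

```cpp
#include <string>

// Encode one Unicode code point as UTF-8 bytes.
inline std::string encode_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));                       // 1 byte
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));         // 2 bytes
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));        // 3 bytes
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));        // 4 bytes
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
    return out;
}
```

Every code point from U+0080 to U+07FF takes two bytes, which is why Ñ, Ù, ¿ and ½ all appear as a lead byte (0xC2 or 0xC3) followed by one continuation byte.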

I don't see any way you could get from "FÑÙ" to something like "F¿½" or "F½¿"
from those byte combinations. You would likely have to go through several
layers of incorrect codepage conversions to mess up the bytes that badly.
Even if "FÑÙ" were stored in your EXE in its Ansi form (0x46 0xD1 0xD9,
assuming Windows-1252), that would not produce "F¿½" or "F½¿" when interpreted
as-is as UTF-8 (0x46 0xD1 0xD9 is not a valid UTF-8 byte sequence).
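The validity claim can be checked mechanically. Below is a simplified standard C++ sketch of a UTF-8 well-formedness check. The name is_valid_utf8 is hypothetical, and the check covers lead and continuation byte structure only, not every overlong or surrogate case.

```cpp
#include <cstddef>
#include <string>

// Return true if the bytes form structurally valid UTF-8.
inline bool is_valid_utf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;  // continuation bytes required after b0
        if (b0 < 0x80)                      extra = 0;
        else if (b0 >= 0xC2 && b0 <= 0xDF)  extra = 1;
        else if ((b0 & 0xF0) == 0xE0)       extra = 2;
        else if (b0 >= 0xF0 && b0 <= 0xF4)  extra = 3;
        else return false;                  // stray continuation or bad lead

        if (i + extra >= s.size()) return false;  // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char bk = static_cast<unsigned char>(s[i + k]);
            if ((bk & 0xC0) != 0x80) return false;  // must look like 10xxxxxx
        }
        i += extra + 1;
    }
    return true;
}
```

The sequence 0x46 0xD1 0xD9 fails because 0xD1 announces a two-byte sequence but 0xD9 is not a continuation byte. A hedged aside that the thread itself does not raise: decoders that replace invalid input typically emit U+FFFD, whose UTF-8 form is 0xEF 0xBF 0xBD, and 0xBF and 0xBD are the Windows-1252 bytes for "¿" and "½". Whether that is what happened in this ListView is not established here.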

--
Remy Lebeau (TeamB)
Kris G

Posts: 5
Registered: 12/15/13
Re: TListView Extended ASCII Characters not showing up  
  Posted: Dec 10, 2015 6:56 AM   in response to: Remy Lebeau (Te...
Heh...trust me, I know this is weird and that's why I'm absolutely stumped. Everything you're saying makes sense, so something else must be going on and I'm not sure where to look.

https://img42.com/2KrLM. This is a small screenshot of what the output is. Again, it's only in this TListView. It displays fine in the other forms that can be accessed from there.

Any other ideas of what could possibly be causing this? If it's incorrect to begin with, how would I go about looking into that?
Remy Lebeau (Te...


Posts: 9,447
Registered: 12/23/01
Re: TListView Extended ASCII Characters not showing up  
  Posted: Dec 10, 2015 10:50 AM   in response to: Kris G
Kris wrote:

https://img42.com/2KrLM. This is a small screenshot of what the
output is.

The photo is locked and cannot be accessed. Please post the picture to the
Attachments forum on this server instead.

Any other ideas of what could possibly be causing this? If it's
incorrect to begin with, how would I go about looking into that?

I have already told you how to fix the code to handle the string data properly.
If the ListView is still not displaying it correctly then clearly the data
is being corrupted. You will have to step into the VCL source code with
the debugger (enable Debug DCUs in the project options) to see what is actually
happening with the Add() call at runtime.
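While stepping through, it helps to look at the raw code units rather than at rendered text, since the rendering is the part that cannot be trusted. Here is a small standard C++ sketch of a hex dump helper. It assumes the string data has been copied into a std::u16string, and hex_dump is a hypothetical name. In C++Builder the same loop could run over a UnicodeString's characters.

```cpp
#include <string>

// Format each UTF-16 code unit as four uppercase hex digits, space separated.
inline std::string hex_dump(const std::u16string& s) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (char16_t u : s) {
        if (!out.empty()) out.push_back(' ');
        out.push_back(digits[(u >> 12) & 0xF]);
        out.push_back(digits[(u >> 8) & 0xF]);
        out.push_back(digits[(u >> 4) & 0xF]);
        out.push_back(digits[u & 0xF]);
    }
    return out;
}
```

A correctly stored "FÑÙ" dumps as 0046 00D1 00D9. Anything else, such as 00C3 appearing twice, points at a conversion that happened before the ListView ever saw the string.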

--
Remy Lebeau (TeamB)
Kris G

Posts: 5
Registered: 12/15/13
Re: TListView Extended ASCII Characters not showing up  
  Posted: Dec 10, 2015 11:48 AM   in response to: Remy Lebeau (Te...
The picture has been posted in attachments with the same name as this thread + Attachment:

https://forums.embarcadero.com/thread.jspa?threadID=122027&tstart=0