Thread: Reading a text file that contains NULL characters into Delphi


This question is answered. Helpful answers available: 2. Correct answers available: 1.


Replies: 3 - Last Post: Jun 13, 2017 7:15 AM by Bill Coe
Bill Coe

Posts: 4
Registered: 5/10/98
Reading a text file that contains NULL characters into Delphi  
  Posted: Jun 12, 2017 8:59 AM
I have a series of small (~2k) text files (CP 1252) from an uncooperative third party that I need to read and process into a database. The files have NUL characters embedded in them. I have tried to read them using the normal Delphi (XE7) mechanisms (LoadFromFile, LoadFromStream, TFile reads), but they all stop reading at the first NUL character. I have also tried many of the methods written up on Stack Overflow, but none of them removes the NULs. I can use Notepad++ to read a file, convert it to UTF-8, and save it, which successfully removes them, but I have many of these files to process and doing it by hand is very time-consuming. Is there a way to successfully process these files in Delphi?
Lajos Juhasz

Posts: 801
Registered: 3/14/14
Re: Reading a text file that contains NULL characters into Delphi  
  Posted: Jun 12, 2017 9:58 AM   in response to: Bill Coe
Bill Coe wrote:

I have a series of small (~2k) text files (CP 1252) from an
uncooperative third party that I need to read and process into a
database. The file has NUL characters embedded in it. I have tried to
read it using the normal Delphi (XE7) mechanisms (LoadFromFile,
LoadFromStream, TFile reads) but they all stop reading at the first
NUL character. I have also tried many of the methods written up on
Stack Overflow but none will remove them. I can use notepad++ to read
the file, convert it to UTF-8, and save it successfully to remove
them but I have many of these to process and it is very
time-consuming. Is there a way to successfully process these files in
Delphi?

You can read the files using a TFileStream, remove the #0 bytes, and
then use the data. To remove the #0 bytes you can use a for loop,
for example:

{code}
procedure TForm2.LoadFile(PFileName: string);
var
  lSrcBytes: TBytes;
  lResBytes: TBytes;
  lPos: Integer;
  x: TFileStream;
  a: TEncoding;
  i: Integer;
  lSize: Integer;
begin
  // Read the whole file as raw bytes so that embedded #0 bytes survive.
  x := TFileStream.Create(PFileName, fmOpenRead or fmShareDenyWrite);
  try
    lSize := x.Size;
    SetLength(lSrcBytes, lSize);
    x.Read(lSrcBytes, lSize);

    // Copy every non-zero byte into the result buffer.
    SetLength(lResBytes, lSize);
    lPos := 0;

    for i := 0 to lSize - 1 do
    begin
      if lSrcBytes[i] <> 0 then
      begin
        lResBytes[lPos] := lSrcBytes[i];
        Inc(lPos);
      end;
    end;
    // Trim the result buffer to the number of bytes actually kept.
    SetLength(lResBytes, lPos);

    // GetEncoding returns a new instance that the caller must free.
    a := TEncoding.GetEncoding(1252);
    try
      Memo1.Lines.Text := a.GetString(lResBytes);
    finally
      a.Free;
    end;
  finally
    x.Free;
  end;
end;
{code}
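
Since the original question mentions many files, the byte-filtering loop can also be pulled out of the form and run over a whole folder. The following is only a minimal sketch, assuming the System.IOUtils unit that ships with XE7; the names `StripNulBytes` and `LoadTextWithoutNuls`, the `*.txt` mask, and passing the folder as the first command-line argument are all hypothetical choices for illustration, not something from the thread.

```pascal
program StripNulDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.IOUtils;

// Hypothetical helper: returns a copy of ASource with every #0 byte removed.
function StripNulBytes(const ASource: TBytes): TBytes;
var
  i, lPos: Integer;
begin
  SetLength(Result, Length(ASource));
  lPos := 0;
  for i := 0 to High(ASource) do
    if ASource[i] <> 0 then
    begin
      Result[lPos] := ASource[i];
      Inc(lPos);
    end;
  SetLength(Result, lPos);
end;

// Hypothetical helper: reads a file, drops NUL bytes, decodes the rest as CP1252.
function LoadTextWithoutNuls(const AFileName: string): string;
var
  lBytes: TBytes;
  lEncoding: TEncoding;
begin
  Result := '';
  lBytes := StripNulBytes(TFile.ReadAllBytes(AFileName));
  if Length(lBytes) = 0 then
    Exit; // nothing left to decode
  lEncoding := TEncoding.GetEncoding(1252); // new instance, must be freed
  try
    Result := lEncoding.GetString(lBytes);
  finally
    lEncoding.Free;
  end;
end;

var
  lFileName: string;
begin
  // Assumption: the folder to scan is the first command-line argument.
  if ParamCount < 1 then
    Exit;
  for lFileName in TDirectory.GetFiles(ParamStr(1), '*.txt') do
    Writeln(lFileName, ': ', Length(LoadTextWithoutNuls(lFileName)), ' characters');
end.
```

Keeping the filtering in a plain function that takes and returns TBytes makes it easy to test without a form or a file on disk.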

Remy Lebeau (Te...


Posts: 9,447
Registered: 12/23/01
Re: Reading a text file that contains NULL characters into Delphi  
  Posted: Jun 12, 2017 11:49 AM   in response to: Bill Coe
Bill Coe wrote:

I have a series of small (~2k) text files (CP 1252) from an
uncooperative third party that I need to read and process into a
database. The file has NUL characters embedded in it.

CP1252 does not have any null bytes in it, except for the actual NUL
character, which should never exist in a purely text file. Maybe the
files are not actually text files to begin with? Are they maybe binary
files that happen to contain null bytes in delimiters between textual
fields?

I can use notepad++ to read the file, convert it to UTF-8, and save it
successfully to remove them

By chance, are the files actually pure text files that are encoded in
UTF-16 instead of CP1252? If so, TStrings.LoadFromFile() would handle
that just fine if the files contain a UTF-16 BOM, or if you set the
AEncoding parameter of LoadFrom...() to TEncoding.Unicode.
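
The UTF-16 suggestion could be tried along these lines. This is a sketch only: `Utf16EncodingFromBom` is a hypothetical helper written for this example, the file name is assumed to come from the first command-line argument, and falling back to TEncoding.Unicode when there is no BOM is only correct if the file really is UTF-16 little-endian.

```pascal
program Utf16Check;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.IOUtils;

// Hypothetical helper: returns the UTF-16 encoding named by a leading BOM,
// or nil when the buffer does not start with a UTF-16 BOM.
function Utf16EncodingFromBom(const ABytes: TBytes): TEncoding;
begin
  Result := nil;
  if Length(ABytes) < 2 then
    Exit;
  if (ABytes[0] = $FF) and (ABytes[1] = $FE) then
    Result := TEncoding.Unicode            // UTF-16 little-endian
  else if (ABytes[0] = $FE) and (ABytes[1] = $FF) then
    Result := TEncoding.BigEndianUnicode;  // UTF-16 big-endian
end;

var
  lLines: TStringList;
  lEncoding: TEncoding;
begin
  // Assumption: the file to inspect is the first command-line argument.
  if ParamCount < 1 then
    Exit;
  lEncoding := Utf16EncodingFromBom(TFile.ReadAllBytes(ParamStr(1)));
  if lEncoding = nil then
    // Assumption: no BOM, but the file is believed to be UTF-16LE.
    lEncoding := TEncoding.Unicode;
  lLines := TStringList.Create;
  try
    lLines.LoadFromFile(ParamStr(1), lEncoding);
    Writeln(lLines.Count, ' lines');
  finally
    lLines.Free;
  end;
end.
```

TEncoding.Unicode and TEncoding.BigEndianUnicode are shared standard instances, so unlike the result of TEncoding.GetEncoding(1252) they are not freed here.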

--
Remy Lebeau (TeamB)
Bill Coe

Posts: 4
Registered: 5/10/98
Re: Reading a text file that contains NULL characters into Delphi  
  Posted: Jun 13, 2017 7:15 AM   in response to: Bill Coe

Thank you Mr. Juhasz. Your solution worked. I still have a lot of testing to do but it has worked in every case I have tried so far.

As far as whether it is a true CP1252 file or not, who knows? It is text, not binary. It has NUL characters in it, however they got there. They show up nicely in a hex editor. The file comes from a device over which I have no control.
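
One way to get past "who knows?" without a hex editor is to count where the NUL bytes fall. The sketch below is an illustration under stated assumptions: `CountNuls` is a hypothetical helper, and the file name is taken from the first command-line argument. In mostly-ASCII UTF-16 little-endian text the high byte of each character sits at an odd offset, so nearly every odd offset would be zero and about half the file would be NULs. A handful of NULs scattered across both even and odd offsets points instead to padding or delimiters, where stripping them is a reasonable fix.

```pascal
program NulStats;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.IOUtils;

// Hypothetical helper: counts #0 bytes at even and at odd zero-based offsets.
procedure CountNuls(const ABytes: TBytes; out AEven, AOdd: Integer);
var
  i: Integer;
begin
  AEven := 0;
  AOdd := 0;
  for i := 0 to High(ABytes) do
  begin
    if ABytes[i] = 0 then
    begin
      if Odd(i) then
        Inc(AOdd)
      else
        Inc(AEven);
    end;
  end;
end;

var
  lBytes: TBytes;
  lEven, lOdd: Integer;
begin
  // Assumption: the file to inspect is the first command-line argument.
  if ParamCount < 1 then
    Exit;
  lBytes := TFile.ReadAllBytes(ParamStr(1));
  CountNuls(lBytes, lEven, lOdd);
  Writeln('Size: ', Length(lBytes), ' bytes');
  Writeln('NULs at even offsets: ', lEven);
  Writeln('NULs at odd offsets:  ', lOdd);
end.
```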

I really appreciate the fast response from this board. You guys quickly solved a problem I have been struggling with for weeks. And I learned something new.