Thread: TXMLDocument: splitting long attributes on write


This question is answered. Helpful answers available: 1. Correct answers available: 1.


Replies: 4 - Last Post: Sep 25, 2015 4:48 AM - Last Post By: Thomas Grubb
Thomas Grubb

  Posted: Sep 24, 2015 5:13 PM
Hi,
I am using TXMLDocument to create an XML document. One of the IXMLNode attributes is a very long string, and this causes problems because the attribute can become too long to be read by other XML parsers. I need to be able to insert whitespace and NOT have it encoded.

For example:
<path d="Very long string but has spaces"
id="path3478" />

I would like to write out:

<path d="Very long string
but has spaces"
id="path3478" />

I can split the string easily (inserting LineFeeds) but the IXMLNode.Attributes[Name] := SplitLongString; encodes the linefeeds. Note that I have no control over the choice of CDATA or anything like that. The SVG spec requires the path data to be in the "d" attribute.

Any ideas?
Thanks,
Tom
Remy Lebeau (Te...


  Posted: Sep 24, 2015 6:59 PM   in response to: Thomas Grubb
Thomas wrote:

> I need to be able to insert whitespace and NOT have it encoded.
>
> I would like to write out:
>
> <path d="Very long string
> but has spaces"
> id="path3478" />
>
> I can split the string easily (inserting LineFeeds) but the
> IXMLNode.Attributes[Name] := SplitLongString; encodes the
> linefeeds.

The XML spec supports unencoded linebreaks inside of attribute values (see
sections 2.3 and 3.3.3). Maybe the DOM vendor you are using as TXMLDocument's
underlying XML engine doesn't implement that portion of the spec? IXMLNode
itself is not the one doing the encoding, the DOM vendor handles it.
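
To illustrate the parsing side of this, here is a minimal sketch. It is Python rather than Delphi, and it uses the standard library's xml.etree.ElementTree rather than MSXML, purely because it is easy to run; the element is the one from the original question. A literal line break inside an attribute value is accepted as well-formed, and the application receives it as a space:

```python
import xml.etree.ElementTree as ET

# An attribute value containing a literal (unencoded) line feed.
source = '<path d="Very long string\nbut has spaces" id="path3478" />'

# The document is well-formed. The parser applies attribute-value
# normalization (XML 1.0, section 3.3.3) before handing the value over.
d_value = ET.fromstring(source).get("d")
print(repr(d_value))
```

So reading such a file is not the problem; the open question is whether a given DOM vendor is willing to write one.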

> Note that I have no control over the choice of CDATA or anything
> like that. The SVG spec requires the path data to be in the "d" attribute.

True, but the XML spec (again, section 3.3.3) explains that an attribute
value can be parsed as if it were CDATA if it has not been declared in a
<!ATTLIST> declaration.

--
Remy Lebeau (TeamB)
Thomas Grubb

  Posted: Sep 24, 2015 7:44 PM   in response to: Remy Lebeau (Te...
Hi Remy,

> The XML spec supports unencoded linebreaks inside of attribute values (see
> sections 2.3 and 3.3.3). Maybe the DOM vendor you are using as TXMLDocument's
> underlying XML engine doesn't implement that portion of the spec? IXMLNode
> itself is not the one doing the encoding, the DOM vendor handles it.

I am using the MSXML engine. If it supports setting an attribute with linefeeds (without encoding), I don't know how to make it do it. I do
XMLNode.Attributes[Name] := 'string'+CarriageReturn+'string'; // also tried Linefeed
and it encodes the linefeed.

Thanks,
Tom


Remy Lebeau (Te...


Helpful
  Posted: Sep 25, 2015 1:18 AM   in response to: Thomas Grubb
Thomas wrote:

> I am using the MSXML engine. If it supports setting an attribute with
> linefeeds (without encoding), I don't know how to make it do it.

It apparently does not. Although there is a way to set MSXML-specific properties
for an MSXML document in TXMLDocument, none of the properties exposed have
any effect on how it encodes line breaks in attributes when creating XML
(there is a property to control whether it normalizes attribute values when
parsing XML, though).

> I do
>
> XMLNode.Attributes[Name] := 'string'+CarriageReturn+'string'; // also tried Linefeed
>
> and it encodes the linefeed.

I just found the following reference. This seems to be the same behavior
that MSXML exhibits, and a good explanation of why:

XSL Transformations (XSLT)
7.1.3 Creating Attributes with xsl:attribute

http://www.w3.org/TR/xslt#creating-attributes

When an xsl:attribute contains a text node with a newline, then the XML output
must contain a character reference. For example,

<xsl:attribute name="a">x
y</xsl:attribute>

will result in the output

a="x&#xA;y"

(or with any equivalent character reference). The XML output cannot be

a="x
y"

This is because **XML 1.0 requires newline characters in attribute values
to be normalized into spaces** but requires character references to newline
characters not to be normalized. The attribute values in the data model represent
the attribute value after normalization. **If a newline occurring in an attribute
value in the tree were output as a newline character rather than as character
reference, then the attribute value in the tree created by reparsing the
XML would contain a space not a newline, which would mean that the tree had
not been output correctly**.

Because newlines are normalized into spaces during parsing, an attribute
value can have a newline when output to XML, but that newline would not be
preserved when the XML is parsed later. Now, in your case, you WANT that
normalization to happen. But it seems MSXML will not allow you to output
XML with an unencoded newline in an attribute.
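
The round-trip argument can be checked with a small sketch. This is Python's standard xml.etree.ElementTree, not MSXML, and parsed_attr is a hypothetical helper name; it is only meant to show that another serializer makes the same choice for the same reason:

```python
import xml.etree.ElementTree as ET

def parsed_attr(xml_text, name="a"):
    """Parse a one-element document and return the named attribute value."""
    return ET.fromstring(xml_text).get(name)

# Literal newline in the source: normalized to a space on parsing.
literal = parsed_attr('<e a="x\ny"/>')

# Character reference in the source: the newline survives parsing.
referenced = parsed_attr('<e a="x&#xA;y"/>')

# Serializing a value that holds a real newline: this serializer, like
# the MSXML behavior described in the thread, writes a character
# reference so that a later parse gets the same value back.
elem = ET.Element("e", {"a": "x\ny"})
written = ET.tostring(elem, encoding="unicode")
round_tripped = parsed_attr(written)

print(repr(literal), repr(referenced), written, repr(round_tripped))
```

A serializer that cares about round-tripping cannot tell that the caller would be happy to get a space back, which is why there is usually no switch for it.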

So, you may have to resort to using a different DOM vendor, or even a different
XML library, that allows you to turn off attribute value processing. Or,
just build up your own XML strings manually instead of using TXMLDocument
or an XML library at all.
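
A minimal sketch of the last option, building the element text by hand. It is in Python rather than Delphi purely for illustration; wrapped_attr and path_element are hypothetical helper names, the sample path data is made up, and the width of 20 is arbitrary. The value is broken only at existing whitespace, so after normalization a parser sees the original string again (assuming single spaces between tokens; extra whitespace at a break point would be dropped):

```python
import textwrap
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

QUOTE = {'"': "&quot;"}  # escape() handles &, < and > by default

def wrapped_attr(value, width=60):
    """Break a value at whitespace, then escape each line for use in "..."."""
    lines = textwrap.wrap(value, width=width,
                          break_long_words=False,  # never split a token
                          break_on_hyphens=False)  # "-10" must stay whole
    # Join with literal line feeds; nothing here encodes them.
    return "\n".join(escape(line, QUOTE) for line in lines)

def path_element(d, path_id, width=60):
    """Return the text of a <path> element with a wrapped d attribute."""
    return '<path d="{}"\n      id="{}" />'.format(
        wrapped_attr(d, width), escape(path_id, QUOTE))

d = "M 10 10 L 20 20 L 30 10 L 40 20 L 50 10 L 60 20 Z"
xml_text = path_element(d, "path3478", width=20)
print(xml_text)

# Sanity check: a parser normalizes the line feeds back to spaces.
parsed = ET.fromstring(xml_text)
print(parsed.get("d") == d)
```

The same idea carries over to any language: escape the markup characters yourself, leave the line feeds alone, and write the string straight to the file.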

--
Remy Lebeau (TeamB)
Thomas Grubb

  Posted: Sep 25, 2015 4:48 AM   in response to: Remy Lebeau (Te...
> So, you may have to resort to using a different DOM vendor, or even a different
> XML library, that allows you to turn off attribute value processing. Or,
> just build up your own XML strings manually instead of using TXMLDocument
> or an XML library at all.

Hi Remy,
I was afraid of that. :-( Thank you for your research and your answer.
Regards,
Tom