#1 (permalink)
10-20-2008, 06:58 PM
SammyBar

How to obtain a utf-8 string from an XmlReader?



Hi all,

I'm trying to convert the XML obtained from an XmlReader object into a UTF-8
array. My general idea is to read from the XmlReader and write into a
MemoryStream, then convert the MemoryStream bytes into a UTF-8 string.

MemoryStream ms = new MemoryStream();
XmlTextWriter xmlWriter = new XmlTextWriter(ms, new UTF8Encoding(false));

xmlWriter.Formatting = Formatting.Indented;
xmlWriter.Namespaces = false;
xmlWriter.Indentation = 4;

while (xmlReader.Read())
{
    xmlWriter.Write(?);   // what should be written here?
}

xmlWriter.Flush();
xmlWriter.Close();

string xml_as_utf8 = Encoding.UTF8.GetString(ms.ToArray());

But I feel that XmlReader and XmlWriter are not made for this purpose:
xmlReader.Read() parses the XML stream, and xmlWriter is meant to create XML
element by element.
What is the correct strategy here?

Thanks in advance
Sammy



#2 (permalink)
10-20-2008, 06:59 PM
Martin Honnen

Re: How to obtain a utf-8 string from an XmlReader?

What exactly is it that you want to achieve?
Strings in the .NET framework are always UTF-16 encoded.
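
To illustrate the point about encodings, here is a minimal sketch (the class name, method name and sample text are made up for this example and do not come from the thread). A .NET string holds UTF-16 code units; UTF-8 only comes into play once the string is turned into a byte array.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Encode a .NET (UTF-16) string as UTF-8 bytes, without a byte order mark.
    public static byte[] ToUtf8(string text)
    {
        return new UTF8Encoding(false).GetBytes(text);
    }

    public static void Main()
    {
        string text = "<name>Jos\u00E9</name>";   // \u00E9 is 'é'

        byte[] utf8 = ToUtf8(text);
        byte[] utf16 = Encoding.Unicode.GetBytes(text);

        Console.WriteLine(text.Length);    // 17 UTF-16 code units
        Console.WriteLine(utf16.Length);   // 34 bytes, two per code unit
        Console.WriteLine(utf8.Length);    // 18 bytes, 'é' takes two

        // Decoding the UTF-8 bytes gives back an ordinary UTF-16 string.
        Console.WriteLine(Encoding.UTF8.GetString(utf8) == text);   // True
    }
}
```

So "a UTF-8 string" in .NET terms is really a byte[]; as soon as the bytes are decoded into a string again, the UTF-8 representation is gone.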

--

Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/
#3 (permalink)
10-20-2008, 06:59 PM
SammyBar

Re: How to obtain a utf-8 string from an XmlReader?

Thanks for your response, Martin.

I need to save the XML, UTF-8 encoded, at the client. But my client IDE
(Centura 1.5) does not have any helper function to convert UTF-16 to UTF-8.
So my plan is to do the conversion at the server (SQL Server 2005) by
writing a CLR user-defined function. SQL Server passes the XML in as an
object of type SqlXml. This object has a CreateReader() method that
returns an XmlReader. So I need to convert this input to UTF-8 encoded text
and return it to SQL Server, which then returns it to the Centura client
via ODBC so that the client can write the text to a file. That way the client
does not have to deal with the encoding. This is an old client-server
application where the only communication between the client and the server
is ODBC, without any XML support on the client side. In other words, I'm
trying to simulate the client making an HTTP connection to the SQL Server to
retrieve a dynamically created XML file. I have no HTTP client on the Centura
side.
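
The plan described above could be sketched roughly as follows. This is only a sketch under assumptions: the names XmlExport and XmlToUtf8 are hypothetical, and returning SqlBytes (binary data) instead of a string is a choice made for this example, not something stated in the thread. In a real SQL CLR assembly the method would also need the SqlFunction attribute, which is left out here so the code compiles as a plain console program.

```csharp
using System;
using System.Data.SqlTypes;
using System.IO;
using System.Text;
using System.Xml;

public static class XmlExport
{
    // Hypothetical function: copies the XML behind a SqlXml value into
    // UTF-8 encoded bytes (no byte order mark).
    public static SqlBytes XmlToUtf8(SqlXml input)
    {
        if (input == null || input.IsNull)
            return SqlBytes.Null;

        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Encoding = new UTF8Encoding(false);
        settings.Indent = true;
        // Assumption: the value may be a fragment rather than a full document.
        settings.ConformanceLevel = ConformanceLevel.Auto;

        using (MemoryStream ms = new MemoryStream())
        {
            using (XmlReader reader = input.CreateReader())
            using (XmlWriter writer = XmlWriter.Create(ms, settings))
            {
                // With the reader in its initial state, WriteNode copies
                // everything up to the end of the input.
                writer.WriteNode(reader, true);
            }
            return new SqlBytes(ms.ToArray());
        }
    }

    public static void Main()
    {
        // Stand-in for the value SQL Server would pass to the function.
        SqlXml sample = new SqlXml(XmlReader.Create(new StringReader("<a>\u00E9</a>")));
        byte[] utf8 = XmlToUtf8(sample).Value;
        Console.WriteLine(utf8.Length + " bytes of UTF-8");
    }
}
```

WriteNode also answers the question of what to write inside a Read() loop: the writer can pull the nodes from the reader itself. Returning bytes keeps the UTF-8 encoding intact, since a string return value would be UTF-16 again.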




#4 (permalink)
10-20-2008, 08:08 PM
SammyBar

Re: How to obtain a utf-8 string from an XmlReader?

I found the following solution: load the XmlReader into an
XmlDocument object, and then serialize it into the MemoryStream.


"SammyBar" <sammybar@gmail.com> wrote in message
news:Ogux9tsMJHA.1304@TK2MSFTNGP02.phx.gbl...
> [...]
> string xml_as_utf8 = Encoding.UTF8.GetString(ms.ToArray());

This is an error: the memory stream is already UTF-8 encoded. The quoted
line above then decodes those bytes into a .NET string (UTF-16), so the UTF-8
encoding is lost in the conversion.
The solution I found looks like a hack: deceive .NET by reading
the UTF-8 encoded stream using the default encoding:
string xml_data = Encoding.Default.GetString(ms.ToArray());

The code fragment looks like this (note that my input is the SqlXml variable
sqlXml):

// load the SqlXml value into an XmlDocument
XmlReader xmlReader = sqlXml.CreateReader();
XmlDocument xDoc = new XmlDocument();
xDoc.Load(xmlReader);
xmlReader.Close();

// save the document into a MemoryStream as UTF-8 without a byte order mark
MemoryStream ms = new MemoryStream();
XmlTextWriter writer = new XmlTextWriter(ms, new UTF8Encoding(false));
writer.Formatting = Formatting.Indented;
writer.Indentation = 4;
xDoc.Save(writer);
writer.Close();
ms.Close();

// reinterpret the UTF-8 bytes with the default encoding
// instead of decoding them as UTF-8
//string xml_data = Encoding.UTF8.GetString(ms.ToArray());
string xml_data = Encoding.Default.GetString(ms.ToArray());
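
As a hedged illustration of what the two decoding calls do with the same bytes, here is a small sketch. The class and method names are made up, and ISO-8859-1 is used as a stand-in for a single-byte default code page so that the example behaves the same everywhere; on the .NET Framework, Encoding.Default is the system ANSI code page, while on .NET Core and later it is UTF-8, so the trick above depends on the runtime and on the machine's settings.

```csharp
using System;
using System.Text;

public static class DecodeDemo
{
    // Decode the given bytes with the given encoding.
    public static string DecodeAs(Encoding encoding, byte[] bytes)
    {
        return encoding.GetString(bytes);
    }

    public static void Main()
    {
        string original = "<a>\u00E9</a>";   // 8 characters
        byte[] utf8Bytes = new UTF8Encoding(false).GetBytes(original);

        // Stand-in for a single-byte default code page (an assumption).
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        string viaUtf8 = DecodeAs(Encoding.UTF8, utf8Bytes);
        string viaLatin1 = DecodeAs(latin1, utf8Bytes);

        Console.WriteLine(utf8Bytes.Length);      // 9
        Console.WriteLine(viaUtf8 == original);   // True: the text is restored
        Console.WriteLine(viaLatin1.Length);      // 9: one character per byte

        // Encoding the one-char-per-byte string with the same single-byte
        // encoding yields the original UTF-8 bytes again.
        byte[] back = latin1.GetBytes(viaLatin1);
        Console.WriteLine(Convert.ToBase64String(back) ==
                          Convert.ToBase64String(utf8Bytes));   // True
    }
}
```

The string produced by the single-byte decoding is not the real text; it only carries the UTF-8 bytes one character at a time, presumably surviving the trip as long as every later step narrows it with the same code page. A sturdier alternative, if the client can accept binary data, might be to return the byte array itself rather than a string.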



