Calculating data size without storing in database

Anthony Taboada-2
Hi,

I am in the process of integrating the Microsoft Azure Storage REST API with eXist-db.  When uploading files (both XML and binary) to Azure, the authentication requires Content-Length information in the string-to-sign.

I want to upload data that is not stored in my database, so I am currently saving a temporary copy of the data, retrieving the file size using xmldb:size(), and then deleting the temporary file.

Is there another method of obtaining data size without having to store it in the database first?  

Thanks,

Anthony Taboada
Software Developer & Digital Operations
NueMeta LLC
Digital Media & Technology


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
_______________________________________________
Exist-open mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/exist-open

Re: Calculating data size without storing in database

Dannes Wessels-3
Hi,

On 20 Feb 2017, at 19:52, Anthony Taboada wrote:

 xmldb:size()

Note that for XML documents this gives only an approximate size of the document! The actual size depends on the serialisation parameters.

regards

Dannes

Re: Calculating data size without storing in database

Anthony Taboada
Hi,

Let me clarify what I am trying to accomplish.

I have an XML file stored in my database, for example:

<result>
<test-a>A</test-a>
<test-b>B</test-b>
<test-c>C</test-c>
</result>

I want to upload this to Microsoft Azure storage using the REST API. I am sending a PUT request with the EXPath HTTP module.
In order to authenticate this request to Azure, an HMAC-SHA-256 signature must be calculated from a secret key and a string-to-sign, and the string-to-sign must include the Content-Length.
Using the xmldb:size function on this example XML returns a file size of 4096.
The problem is that I am not uploading the XML from the database directly.  I am serializing it before I upload it:

let $parameters := 
       <output:serialization-parameters xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization">
           <output:omit-xml-declaration value="no"/>
           <output:method value="xml"/>
           <output:encoding value="UTF-8"/>
           <output:indent value="yes"/>
       </output:serialization-parameters>

let $unserialized-xml := doc($xml-file-path)
let $serialized-xml := serialize($unserialized-xml, $parameters)

Serializing the XML example will return an xmldb:size value of 126, resulting in a bad request message from Azure.
I needed to find a quick solution, so I am currently storing a temporary copy of the serialized file, retrieving the size, and then deleting the file:

    let $store-temp-xml := xmldb:store('/db/tmp', 'temp.txt', $serialized-xml)
    let $xml-file-size := xmldb:size('/db/tmp', 'temp.txt')
    let $delete-temp-xml := xmldb:remove('/db/tmp', 'temp.txt')

I wanted to know if there is a more elegant solution to this problem that does not require storing and deleting the data.

Thanks,
Anthony

Re: Calculating data size without storing in database

Alister Pillow-3
Hi Anthony,

You get the same result without storing anything by using string-length($serialized-xml) - but it’s still 126.
If you turn off indent, it’s 110.

Regards, Alister

Re: Calculating data size without storing in database

Alister Pillow-3
In reply to this post by Anthony Taboada
Please note that the size you’re calculating is not the Content-Length - that can only be provided by the HTTP client. When POSTing to AWS S3 I had similar issues - IIRC, setting the version on the request was critical:
<http:request method="post" http-version="1.0" >….

Re: Calculating data size without storing in database

Michael Westbay-2
In reply to this post by Alister Pillow-3, who wrote:

 The same result is achieved without storing it by using string-length($serialized-xml) - but it’s still 126.
 If you turn off indent, it’s 110.

The only potential problem with this is if you are dealing with non-ASCII characters. string-length returns the number of characters, not bytes.

--

Michael Westbay

Re: Calculating data size without storing in database

wshager
Indeed, the standard counts every Unicode character as length 1, however many bytes it takes to encode. On many occasions XPath chooses to deviate from most "real" programming languages, so we don't have to learn one...

Anyway, don't use string-length for byte size.
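
For anyone landing on this thread later, here is a minimal sketch of one way to count bytes instead of characters without storing anything. It assumes the payload is serialized as UTF-8 (as in Anthony's serialization parameters) and that the HTTP client sends exactly that string. The function name local:utf8-byte-length is made up for illustration; it is not part of eXist-db or the EXPath HTTP module. It uses only the standard string-to-codepoints and sum functions and has not been checked against Azure.

```xquery
xquery version "3.0";

(: Hypothetical helper, not an eXist-db built-in.
   Sums the number of bytes each code point needs in UTF-8. :)
declare function local:utf8-byte-length($text as xs:string) as xs:integer {
    sum(
        for $cp in string-to-codepoints($text)
        return
            if ($cp lt 128) then 1          (: U+0000..U+007F   : 1 byte  :)
            else if ($cp lt 2048) then 2    (: U+0080..U+07FF   : 2 bytes :)
            else if ($cp lt 65536) then 3   (: U+0800..U+FFFF   : 3 bytes :)
            else 4                          (: U+10000 and above: 4 bytes :)
    , 0)
};

(: "é" (U+00E9) followed by "€" (U+20AC): 2 characters, 2 + 3 = 5 bytes :)
let $sample := codepoints-to-string((233, 8364))
return (string-length($sample), local:utf8-byte-length($sample))
```

For pure ASCII content, such as the 126-character example above, both numbers are the same, so the difference only shows up once the document contains non-ASCII characters. As Alister pointed out, the Content-Length header itself is still set by the HTTP client, so the computed value is only useful if it matches what the client actually sends.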
