regex behavior of fn:contains()

Peter Stadler
Dear all,

I’m facing some strange behavior of fn:contains() when searching for question marks ("?"). When evaluating $someCollection//tei:persName[fn:contains(., '?')] I get a different number of hits depending on whether a range index is defined on the element tei:persName. With the index, fn:contains() behaves regex-style, i.e. I get a gazillion hits with a plain fn:contains(., '?') but only the correct hits when escaping the question mark, as in fn:contains(., '\?').

Furthermore, without the index on tei:persName, fn:contains() misses hits when a range index is defined on child nodes, e.g.
<persName xmlns="http://www.tei-c.org/ns/1.0" type="reg"><surname>Hering</surname>, <forename>S(amuel?)</forename></persName>
is not found when a range index is defined on tei:forename.
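
For comparison, here is a minimal sketch of the behavior one would expect without any index involved. It assumes that nodes constructed in memory are not covered by the collection's range index. The variable $names and the second entry ("Weber") are hypothetical sample data, not taken from the real collection.

```xquery
xquery version "3.1";

(: In-memory sample data; $names is a hypothetical stand-in for $someCollection.
   The second persName is invented for illustration only. :)
let $names := (
    <persName xmlns="http://www.tei-c.org/ns/1.0" type="reg"><surname>Hering</surname>, <forename>S(amuel?)</forename></persName>,
    <persName xmlns="http://www.tei-c.org/ns/1.0" type="reg"><surname>Weber</surname>, <forename>Carl Maria</forename></persName>
)
return
    (: fn:contains() is specified as a literal substring test, so per the
       specification only the first entry should be counted here :)
    count($names[fn:contains(., '?')])
```

Running the same predicate against the stored collection, with and without the range index, and comparing the counts against this baseline should make the discrepancy easy to reproduce.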

The index configuration looks like this (trimmed down; I omitted the Lucene parts and some further range index definitions):
<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index xmlns:tei="http://www.tei-c.org/ns/1.0">
        <range>
            <create qname="tei:surname" type="xs:string"/>
            <create qname="tei:forename" type="xs:string"/>
            <create qname="tei:persName" type="xs:string"/>
        </range>
    </index>
</collection>

I’m aware that there are several issues around fn:matches(), fn:contains() etc. (see e.g. https://github.com/eXist-db/exist/issues/59), but this behavior seems very inconsistent and more like a bug than a feature to me.

All the best
and many thanks for all the hard work
Peter

Running on
Git commit : develop-c50dbba
Operating System : Mac OS X 10.11.6 x86_64
Java 1.8.0_112



------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Exist-open mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/exist-open

Re: regex behavior of fn:contains()

Peter Stadler
Dear all,

thanks for the new release!
I just updated, but my issue with the regex behavior of fn:contains() persists, so I raised an issue at https://github.com/eXist-db/exist/issues/1379

Many thanks again!
Best
Peter

--
Peter Stadler
Carl-Maria-von-Weber-Gesamtausgabe
Arbeitsstelle Detmold
Hornsche Str. 39
D-32756 Detmold
Tel. +49 5231 975-676
Fax: +49 5231 975-668
stadler at weber-gesamtausgabe.de
www.weber-gesamtausgabe.de

