most longest prefix value exist in the column value but it doesn't intersect with other columns values in the query that we need part of query retrieval #300

fouadazem · 2022-07-10T01:17:09Z

Suppose part of our query is:

select R from Table where Longest(A=112233) and (B>5 and D<10)

and the table contains:

A      B  D  R
11223  1  3  15
1122   6  9  12

The longest prefix match for A is on the first row, but that row's B and D values do not satisfy the rest of the query.

In this case we expect to receive the second row, even though it is not the longest prefix match overall.
Instead, the query returns no records.

npgall · 2022-07-10T07:52:43Z

Should that be an or() query in the middle instead?

fouadazem · 2022-07-10T08:35:18Z

It must be and() between the longest prefix match and the logical queries. We expect the longest prefix match to be applied after the logical queries have been applied. Does that make sense? As far as I can see, CQEngine doesn't support that for now. Can we add such functionality? Regards, Fouad


npgall · 2022-07-10T11:31:12Z

I'm not sure that I understand what you need.

Is it that you effectively have 2 queries? By default you want results to be returned for the first query only. However if there are no results for the first query then you want instead the results for the second query to be returned?

If that's what you need, could you just do 2 separate queries, with that logic in your application?

If you're looking for something else, could you provide a test case which shows what you're looking for (e.g. assertions on the results)? Or else maybe you explain how it would be achieved in SQL?

fouadazem · 2022-07-10T13:44:05Z

Please see below. Let's assume we have the following data table, called mytable:

A      B   C   D
32221  4   7   A1
3222   9   16  C1
322    17  21  C7
32     9   16  C9

My query is expected to be:

Select D from mytable where longestPrefix(A=322211) and (C greater than or equal to 10 and B less than 10)

With the current CQEngine implementation, this returns an empty result set. Why? Because the internal CQEngine code treats the longest prefix match as a comparative query and tries to return line #1. On the other hand, CQEngine treats the conditions on columns B and C as logical queries, and with the threshold of 10 those are expected to return line #2 and line #4 from my table.

As a customer using CQEngine, we expect line #2 from my table, because it satisfies our requirement: the longest prefix match combined with the other conditions (on columns B and C).

I have built a unit test class and will send it later on.

Regards, Fouad Azem
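To make the two readings of this query concrete, here is a minimal plain-Java sketch over the same four rows. It does not use CQEngine at all; the class and method names (`LongestPrefixSemantics`, `intersectSemantics`, `filterFirstSemantics`) are hypothetical and exist only for illustration. The first method models the behaviour reported above (longest prefix over the whole table, then intersected with the B and C conditions), and the second models the behaviour being requested (B and C conditions first, longest prefix among what remains).

```java
import java.util.List;
import java.util.stream.Collectors;

final class LongestPrefixSemantics {

    /** One row of mytable; A is kept as a String so it can be prefix-matched. */
    static final class Row {
        final String a; final int b; final int c; final String d;
        Row(String a, int b, int c, String d) { this.a = a; this.b = b; this.c = c; this.d = d; }
    }

    static final List<Row> MY_TABLE = List.of(
            new Row("32221", 4, 7, "A1"),
            new Row("3222", 9, 16, "C1"),
            new Row("322", 17, 21, "C7"),
            new Row("32", 9, 16, "C9"));

    /** The conditions on B and C from the query: C >= 10 and B < 10. */
    static boolean matchesBAndC(Row r) {
        return r.c >= 10 && r.b < 10;
    }

    /** Rows whose A is the longest prefix of the input among the given candidates. */
    static List<Row> longestPrefix(List<Row> candidates, String input) {
        int best = candidates.stream()
                .filter(r -> input.startsWith(r.a))
                .mapToInt(r -> r.a.length())
                .max()
                .orElse(-1);
        return candidates.stream()
                .filter(r -> input.startsWith(r.a) && r.a.length() == best)
                .collect(Collectors.toList());
    }

    /** Reported behaviour: longest prefix over the whole table, then intersect with B and C. */
    static List<String> intersectSemantics(String input) {
        return longestPrefix(MY_TABLE, input).stream()
                .filter(LongestPrefixSemantics::matchesBAndC)
                .map(r -> r.d)
                .collect(Collectors.toList());
    }

    /** Requested behaviour: apply B and C first, then take the longest prefix among survivors. */
    static List<String> filterFirstSemantics(String input) {
        List<Row> survivors = MY_TABLE.stream()
                .filter(LongestPrefixSemantics::matchesBAndC)
                .collect(Collectors.toList());
        return longestPrefix(survivors, input).stream()
                .map(r -> r.d)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(intersectSemantics("322211"));   // prints []
        System.out.println(filterFirstSemantics("322211")); // prints [C1]
    }
}
```

The two methods differ only in the order in which the prefix match and the B/C filter are applied, which is the whole point of the request.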


fouadazem · 2022-07-11T00:02:23Z

Hi, please see the test class I prepared. To run it, execute the test class LongestPrefixMatchExampleCodeTest, test method longestPrefix_match_and_validFrom_ValidTo_Expected_NotEmptyResultSet.

Currently the test method passes because the result set is empty, but that is not what we expect. As I wrote in the method comments, the query contains the longest prefix match as well as the valid-from and valid-to conditions, so I expect the LongestPrefixMatchExampleCode at index 1 of longestPrefixMatchExampleCodes, which satisfies the and() of (valid from, valid to) and (longest prefix match).

Does the above make sense based on our expectations? Does it make sense to you to raise a bug on the product? I am working on the fix. Any suggestions on how to fix it so that it meets our requirements?

Regards, Fouad Azem


npgall · 2022-07-11T11:23:11Z

I think I understand what you're looking for now - that was a good example, thank you.

You are right that the longest prefix query will only match the longest one.

I think there are a few ways to solve your problem:

Option 1
Execute a query that only refers to your constraints on the B and C columns/attributes, and then order the results in descending order of the length of the match for column/attribute A.

In this case you can do the ordering/sorting of the results inside or outside of CQEngine.

To do it inside CQEngine, you might need to implement a special/custom attribute MATCH_LENGTH (or similar) which will return the length of the match. And, since the attribute will need to know the string that it is comparing the stored value with, you can pass that string to it as a query option. You can invoke it by supplying query option orderBy(descending(MATCH_LENGTH)) (and supply the input string as another query option as well).

This will leverage any indexes you have on B and C, but the sorting/ordering will not leverage indexes at all. (So this option is not a panacea, hence why I've mentioned other options below. See my discussion of tradeoffs below too.)

I do think CQEngine could better support you if you want to go with this option. If you get that working and you can contribute the code for the attribute you will write, I (or you via pull requests 😉) could use it as a basis for better supporting this use case in CQEngine.
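As a rough illustration of Option 1, the plain-Java sketch below filters on B and C and then orders the survivors by descending match length, using the table from the earlier example. It deliberately avoids the CQEngine API: `matchLength` is only a hypothetical stand-in for the MATCH_LENGTH attribute suggested above, and the input string is passed as a method argument rather than as a query option. Dropping rows whose A is not a prefix of the input at all is an assumption of this sketch.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class OrderByMatchLength {

    /** One row of mytable from the example in this thread. */
    static final class Row {
        final String a; final int b; final int c; final String d;
        Row(String a, int b, int c, String d) { this.a = a; this.b = b; this.c = c; this.d = d; }
    }

    static final List<Row> MY_TABLE = List.of(
            new Row("32221", 4, 7, "A1"),
            new Row("3222", 9, 16, "C1"),
            new Row("322", 17, 21, "C7"),
            new Row("32", 9, 16, "C9"));

    /** Hypothetical stand-in for MATCH_LENGTH: length of A if A is a prefix of input, else 0. */
    static int matchLength(Row r, String input) {
        return input.startsWith(r.a) ? r.a.length() : 0;
    }

    /** Filter on B and C only, then order by descending match length on A. */
    static List<String> query(String input) {
        return MY_TABLE.stream()
                .filter(r -> r.c >= 10 && r.b < 10)          // constraints on B and C
                .filter(r -> matchLength(r, input) > 0)      // assumption: drop rows that are not prefixes
                .sorted(Comparator.comparingInt((Row r) -> matchLength(r, input)).reversed())
                .map(r -> r.d)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The first element is the longest prefix match that also satisfies B and C.
        System.out.println(query("322211")); // prints [C1, C9]
    }
}
```

As with the real option, the sort here touches every row that survives the B and C filter, so its cost grows with the size of that filtered set.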

Option 2
Don't use CQEngine for this, use the ConcurrentRadixTree directly instead.

This will allow you to use the radix tree as an index on A to retrieve the matches for your input string. You can then filter those matches to ensure they match B and C.

This will leverage an index on A, but not on B or C.
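The shape of Option 2 can be sketched without the radix tree library. In the plain-Java example below, a `HashMap` keyed on A stands in for the radix tree (it is a substitute for illustration, not the ConcurrentRadixTree API): the lookup walks the prefixes of the input from longest to shortest and returns the first indexed row that also satisfies the B and C conditions. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class PrefixIndexLookup {

    /** One row of mytable from the example in this thread. */
    static final class Row {
        final String a; final int b; final int c; final String d;
        Row(String a, int b, int c, String d) { this.a = a; this.b = b; this.c = c; this.d = d; }
    }

    static final List<Row> MY_TABLE = List.of(
            new Row("32221", 4, 7, "A1"),
            new Row("3222", 9, 16, "C1"),
            new Row("322", 17, 21, "C7"),
            new Row("32", 9, 16, "C9"));

    /** Stand-in for a radix tree index on A: exact value of A -> rows with that value. */
    static final Map<String, List<Row>> INDEX_ON_A = buildIndex(MY_TABLE);

    static Map<String, List<Row>> buildIndex(List<Row> rows) {
        Map<String, List<Row>> index = new HashMap<>();
        for (Row r : rows) {
            index.computeIfAbsent(r.a, k -> new ArrayList<>()).add(r);
        }
        return index;
    }

    /** Try each prefix of input, longest first; return D of the first row passing B and C. */
    static Optional<String> lookup(String input) {
        for (int len = input.length(); len > 0; len--) {
            List<Row> candidates = INDEX_ON_A.getOrDefault(input.substring(0, len), List.of());
            for (Row r : candidates) {
                if (r.c >= 10 && r.b < 10) {   // B and C are checked by filtering, not by an index
                    return Optional.of(r.d);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(lookup("322211")); // prints Optional[C1]
    }
}
```

Here the index on A drives the search and the B and C conditions are applied as a plain filter on each candidate, which mirrors the tradeoff stated for this option.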

Option 3
Improve support in CQEngine to better leverage the ConcurrentRadixTree for queries like this on A, when an "index" ordering strategy is selected. This would basically merge support into CQEngine for Option 2.

Tradeoffs
Unfortunately, what you really want to do is not well supported by CQEngine currently. The main challenge is that this use case is unlike most others, because an index on A which could support your query would actually match every object in the collection. That's because it seems to be a kind of "soft" constraint that is used for ordering more than for filtering. It would therefore not integrate well with the core of the query engine, which is based on set theory. However, it could be integrated better with CQEngine's index ordering strategy.

Which approach will work best for you? It depends on the X% of the collection that is matched by your queries on B and C. If X is small, something like Option 1 should work well. If X is large, something like Option 2 or 3 would be better.

Hope that helps. And I hope I've understood your use case properly and that the above makes sense!

fouadazem · 2022-07-18T00:06:55Z

Hi Niall @npgall ,

Thanks for the explanation and details, and for your quick response. I appreciate that!

We are using CQEngine version 3.5.0.

First, I would like to inform you that we have chosen Option 3:
Improve support in CQEngine to better leverage the ConcurrentRadixTree for queries like this on A, when an "index" ordering strategy is selected. This would basically merge support into CQEngine for Option 2.

How can I get the values of a specific field, by field name, from the ResultSet? I expect to pass in a field name and get back the values of that field from the result set. I am planning to do this in CollectionQueryEngine : retrieveWithIndexOrdering.

Best Regards Fouad
