3.0.29: Incorrect data on select with IN condition on a map property

Running OrientDB version 3.0.29, a select statement with an IN statement against a property of type EMBEDDEDMAP such as “select from testmap where groups in [‘c’]” does not return the expected result. To reproduce the issue:

(1) Create a class “testmap” and add properties “name” (string), “id” (long) and “groups” (embeddedmap).

(2) Insert data into the class:
insert into testmap (id, names, groups) values (1, ‘abc’, {“a”:“b”,“c”:“d”})
insert into testmap (id, names, groups) values (2, ‘def’, {“c”:“d”,“e”:“f”})

(3) Run the query: “select from testmap where groups.keys() in [‘a’,‘e’]”. Expected result is to see both data sets in the class, actual result is no qualifying data at all.

(4) Update the query: “select from testmap where groups containsKey ‘a’ or groups containsKey ‘e’”. This returns the expected result. However, in SQL, the IN statement is intended to avoid just such lengthy OR statements.

(5) Add another entry to the class:
insert into testmap (id, names, groups) values (3, ‘ghi’, {“g”:“h”})

(6) Run the query: “select from testmap where groups.keys() in [‘g’]”. This actually works and returns the last entered data set.

All this worked fine in our previous OrientDB version 2.2.16. There, the statement “select from testmap where groups.keys() in [‘a’,‘e’]” did return the expected result.

Looks like an error. Unless there was a change in the implementation on how this is supposed to work.

Thanks

Thomas

One additional note: executing a SQL statement with all key values in the IN statement (e.g. select from testmap where groups.key() in [‘a’,‘c’,‘e’,‘g’]) does return one - seemingly random - data set from the class.

Hi @tok

The IN operator was quite ambiguous in previous releases, eg. is [a, b] IN [a, b, c] valid? Or is [a, b] IN [[a, b], c] valid? Or both?

Even worse, in some cases CONTAINS and IN were completely interchangeable, that is just wrong.

In v 3.0 we added a few more specific operators, ie:

  • CONTAINSANY check if two collections have at least one element in common
  • CONTAINSALL check if all the elements in the second collection are also contained in the first one

In addition you have

  • CONTAINS that, in case of two collections, checks if one is contained (itself, not its items!) in the other one, ie. [[a, b], c] CONTAINS [a, b]
  • IN the same as CONTAINS, just in the inverted order

I hope it clarifies a bit the situation

Thanks

Luigi

Hi Luigi,

Thanks for the quick update.

In general, I considered the SQL definition of the IN statement well defined. But then, OrientDB is a NoSQL database and hence some aspects (may) require different interpretation. That is acceptable. I would, however, have liked to see a release note about this change as it impacts queries, none of the searches had shown any reference to that change.

All resolved now, thanks.

Thomas