Problems with very large result sets with OrientJS on ODB 2.2.x


#1

We’re still trying to set up an isolated demo to test and characterize this, but I wanted to first check if there are any known limits on the size of a query result (in terms of records returned) with OrientJS on ODB 2.2.29? Internally, is the result set a “stream” or must the entire set exist in memory on the server and/or the client?


#2

Hi Eric,

In v 2.2 the result set is not streamed; it is kept in RAM on the server until it is sent to the client.
One of the big improvements in v 3.0 is streamed result sets.

Thanks

Luigi


#3

Ah so. That would explain what we’ve been seeing. Planning to migrate to 3.0 very soon.


#4

A follow-up question on 3.0 with the latest 3.0 OrientJS: When using the “mode” with the all() function (which collects the stream into a single array), I’m assuming that the server is still streaming the result set? So the “large memory” requirements for a large result set are thus client-side only?


#5

Hi @eric24

That is correct. The server still sends results in pages, but with all() the client collects the entire result set into an array.
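
To make the memory difference concrete, here is a minimal sketch that uses only Node's built-in stream module, not OrientJS itself. `fakeResultStream`, `collectAll` and `countStreaming` are hypothetical names, and the stand-in stream is assumed to behave like a query result that emits 'data' events. Collecting keeps every record on the client, while handling records as they arrive keeps only what the handler retains.

```javascript
const { Readable } = require("stream");

// Hypothetical stand-in for a query result stream; a real one would come from the driver.
function fakeResultStream(count) {
  function* records() {
    for (let i = 0; i < count; i++) yield { id: i };
  }
  return Readable.from(records());
}

// Collects every record into one array, so client memory grows with the result size.
function collectAll(stream) {
  return new Promise((resolve, reject) => {
    const rows = [];
    stream.on("data", (row) => rows.push(row));
    stream.on("error", reject);
    stream.on("end", () => resolve(rows));
  });
}

// Handles each record as it arrives and keeps only a running count.
function countStreaming(stream) {
  return new Promise((resolve, reject) => {
    let seen = 0;
    stream.on("data", () => {
      seen += 1;
    });
    stream.on("error", reject);
    stream.on("end", () => resolve(seen));
  });
}
```

For a very large result set, the second style is the one that keeps client memory flat.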


#6

Yes, I just noticed the other change of pagination by default. I assume the recommended approach is to use RID-based pagination? If so, it might make sense to show, or at least mention, this in the OrientJS example.


#7

What do you mean by RID-based pagination?

What I mean is that, for example, if you do

select from V

you will get paginated records. By default the page size is 100, but you can increase it if you want to reduce latency. It is the client that asks for the next page.


#8

Understood that paged results are now the default (and I agree with this decision).

By RID-based pagination, I mean this: http://orientdb.com/docs/3.0.x/sql/Pagination.html (RID-LIMIT method)
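
For anyone finding this later, a rough sketch of what I mean by the RID-LIMIT method. It assumes a hypothetical async `runQuery(sql)` helper that returns an array of records, each with an "@rid" value that stringifies to the "#cluster:position" form; `pageQuery` and `paginateByRid` are names I made up, and the starting bound "#-1:-1" is assumed to sort below every real RID.

```javascript
// Builds the query for the page that starts just after `lastRid`.
function pageQuery(className, lastRid, pageSize) {
  return `SELECT FROM ${className} WHERE @rid > ${lastRid} LIMIT ${pageSize}`;
}

// Yields one page (array of records) at a time.
// `runQuery` is a hypothetical async function: (sql) => array of records.
async function* paginateByRid(runQuery, className, pageSize) {
  let lastRid = "#-1:-1"; // assumed lower bound that precedes all real RIDs
  while (true) {
    const page = await runQuery(pageQuery(className, lastRid, pageSize));
    if (page.length === 0) return;
    yield page;
    // Remember where this page ended so the next query starts after it.
    lastRid = page[page.length - 1]["@rid"];
    // A short page means there is nothing left to fetch.
    if (page.length < pageSize) return;
  }
}
```

Each page is a separate query, so the application decides when (or whether) to fetch the next one.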


#9

Oh, OK. If you want to handle pages manually, then yes :)


#10

Oh, I think I understand what’s happening now. Let me see if this is correct: If I just do a normal query, the page size will be 100, and each “set” of 100 records will trigger a ‘data’ event, with as many ‘data’ events as needed to complete the full result set. And if I use the await/all() method, this pagination will still occur between the server and the client, but the final array will contain the full result set. So the pagination that’s happening is essentially transparent to the client. Right?

And if I want to do manual pagination, I would need to add a LIMIT to the SQL query, right?


#11

Yes, that is correct. The pagination is transparent to the client.
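
As an illustration of what "transparent" means here, the following is a sketch only and not the driver's actual implementation. `fetchPage(offset, pageSize)` is a hypothetical async function standing in for the client asking the server for the next page, and `recordsFrom` and `demo` are made-up names. The consumer sees one flat sequence of records, while pages are requested behind the scenes.

```javascript
// Flattens paged fetches into a single sequence of records.
// `fetchPage` is a hypothetical async function: (offset, pageSize) => array.
async function* recordsFrom(fetchPage, pageSize = 100) {
  let offset = 0;
  while (true) {
    const page = await fetchPage(offset, pageSize);
    for (const record of page) yield record;
    // A short page means the result set is exhausted.
    if (page.length < pageSize) return;
    offset += page.length;
  }
}

// Consumes the flat sequence and reports how many page requests it took.
async function demo(totalRecords, pageSize) {
  const source = Array.from({ length: totalRecords }, (_, i) => ({ id: i }));
  let pageRequests = 0;
  const fetchPage = async (offset, size) => {
    pageRequests += 1;
    return source.slice(offset, offset + size);
  };
  const records = [];
  for await (const record of recordsFrom(fetchPage, pageSize)) {
    records.push(record);
  }
  return { recordCount: records.length, pageRequests };
}
```

With 250 records and a page size of 100, the consumer receives all 250 records from three page requests without ever handling a page itself.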