Server-side Cursors

I’ve seen mentions of “server-side cursors” in github and in the documentation Roadmap, but no details. Is there any information on this?

My objective is better pagination. This is such a common scenario that I think it deserves some direct attention (and the current RID-based or SKIP/LIMIT solutions are very poor for all but the simplest queries and smallest result sets).

Since I’m not sure of the implementation details of the “server-side cursors” that are mentioned, I don’t want to guess. But my thinking is that a “cursor” would be an ideal way to handle pagination. In my definition, a cursor would allow you to run a query and essentially save the state of that query at any point (maybe based on LIMIT), where part of the result set includes an opaque blob that can be used to continue to query later. Conceptually, the server should be able to pick up where it left off, just as if it hadn’t been interrupted. Of course the simplest version of this would be a forward-only cursor, which actually covers the majority of pagination use-cases, but potentially a bidirectional cursor could also be implemented in this way.

Hi @eric24

in OrientDB 3.0.x the pagination of the result set has been implemented and also supported with OrientJS 3.0.x.

@wolf4ood - I’m aware of the result set paging in the OrientJS 3.0 driver, but I don’t think it’s the same thing. What I specifically need is a way to start a query, return some number of records, then “pause” the query (receiving some kind of “token” or “bookmark”), which I can then pass back to the driver at some later time to allow the query to continue where it left off and return another set of records, etc.

As I understand the paging that’s implemented in 3.0.x, it is primarily concerned with reducing server memory load for large result sets, but does not provide a way to pause and restart a query. If that’s not right, or if the mechanism that’s currently in place could potentially be used for the “pause and continue” functionality I’m looking for, please let me know.

Hi @eric24

actually the OrientDB result set protocol is implemented in that way. When a query it’s spawned to the server it returns the first page and a query id (cursor) that can be used for fetching the next page.

In OrientJS it’s implemented via NodeJS streams. I’m not sure it’s implemented right now, but if not we could support stream.pause/resume, which can be what you are looking for.

WDYT?

@eric24

actually the pause resume is supported by default the main problem today that still OrientJS will fetch the next pages even if the stream is paused. That means that the records will be cached inside the buffer of the stream, until it’s resumed, which is not ideal for memory footprint.

So now it misses this step, of not fetching the next page when the stream is paused

@eric24

i’ve raised an issue about this

@wolf4ood - It sounds very promising, but I want to dig a bit deeper into this functionality.

My ultimate goal is to be able to support “incremental loading” of record sets being sent to a web client. In a sense, this is “forward-only” pagination (bidirectional pagination would be even better, but forward-only satisfies most use-cases–where the issue isn’t so much about memory use on the client, but about minimizing the time and bandwidth used by a query, for example a query is run to populate a table with users, there are actually 10,000 users, but the table only shows 100 rows at a time, so there’s no need to actually transfer more than 100 records until the user scrolls down; of course there are various optimizations that might benefit from transferring maybe 200 records, but that’s an implementation detail).

In any case, what that means is that for the stream method you describe to work, it must be possible to pause the stream and restart it at some time in the future, possibly from a different server. As such, the “token” must contain enough information that it can be passed to the web client and later passed back in a future transaction, and then onto the OBD server, so it can continue the query from where it left off.

From what you describe, I’m not sure this is possible. It seems like it may be possible to allow the stream to be paused and restarted, but only within the context of the connection/session that started it.

My thinking is that as ODB is running a query, it’s keeping certain meta information in memory about the query. Given that, it should be possible for that meta information to be returned to the client (as an opaque object) and then later sent back to ODB, letting it “continue” the query at the point the meta information was returned to the client, even if a long period of time elapsed between these two events.

@wolf4ood - I just happened to notice OSQLSynchQuery (https://orientdb.com/docs/last/Pagination.html#automatic-management). It looks like it does what I was describing (assuming the query object can be serialized, and kept between queries with potentially long delays between them). Is this supported by OrientJS?

Hi @eric24

OSQLSynchQuery is supported in OrientJS, but not the automatic pagination.

But that is the old OrientDB query protocol, which is deprecated in OrientDB 3.0, since it dose not support streaming with server side cursor as 3.0.x protocol does.

@wolf4ood - Too bad, it seemed promising. So what do you think is most viable solution for what I’m trying to accomplish?