Strategy of the ODB 3.0 Query Optimizer

I’m curious about what the ODB 3.0 query optimizer would do in a case like this:

  • The target class has multiple non-unique indexes defined that cover various combinations of fields: index1 includes fields a, b, c; index2 includes fields d, e, f, and index3 includes fields g, h, i
  • The query includes fields in the MATCH or WHERE clause that include these fields: WHERE (a=1 AND b=2) AND (d=11 AND e=22 AND F=33) AND (g=111 AND H=222 and I>333) (parenthesis shown only for clarity)
  1. Does the optimizer:
  • a) Choose one of the three indices, use it to filter down the list of possible records, and then scan that list of records, comparing the remaining values (i.e. ignoring the other indices)?
  • b) Run queries on each of the three indices, then “map/reduce” the RIDs in each of the three result sets to get the final set of records?
  • c) Something entirely different?
  1. What if additional fields were included in the MATCH or WHERE clause that were not covered by any index?

  2. I’m assuming that if index2 was a unique index (the only index that is complete, with all-equals operators), the optimizer would choose it and go find the record, then check to make sure it also satisfied the other fields in the query, but otherwise ignore the other indices?

The purpose of my question is to better understand how to optimize query performance by building a useful index strategy. If ODB 3.0 will ever only use one particular index in a given query, that would lead me in a different direction than if it will make use of multiple indices “at once”.

Hi @eric24

When conditions are in AND, like in your case, then only one condition is used for index search and all the other conditions are evaluated later (same if you add further conditions). If the conditions are in OR, then multiple indexes can be used at once.

The query optimizer takes its decisions based on statistics. During query execuiton (and not during query planning), every time an index is used for a condition (ie. for a set of fields), the query executor records number of returned entries/records and other stats; the execution planner uses these stats to choose the best index.
This means that when the stats are “cold” (ie. at startup), the query execution planner has to do some guessing, based on class size (smaller classes first), index type (unique indexes first) and so on.

Thanks

Luigi