Q: I know that "many" indexes per table caused a performance drawback (slow append and so on) in the "old" days, but how is it in Nexus? Is there any "max-count" of indexes per table that I should try not to get above?

First some concepts:

  • record - a sequence of bytes, logically separated into different fields
  • record engine - a piece of code that's responsible for storing records, handing out refnrs, later being able to return a record given its refnr, and able to "delete" (mark for reuse) a record given its refnr
  • refnr - a 64-bit value representing a specific record that was stored using a record engine
  • key - a sequence of bytes
  • composite key - a key which is composed of subkeys, each derived from the record (usually from one field)
  • key engine - a piece of code that is able to a) create a key given a record, b) compare 2 keys; the comparison must always be consistent: if A > B and B > C then A > C; if A < B and B < C then A < C; if A = B and B = C then A = C
  • index - an ordered list of key + refnr pairs, the order of the keys being determined by the key comparison function
  • index engine - a piece of code that is able to maintain an index; the following functionality is required: a) given a key and a refnr, insert that information into the index; b) given a key + refnr, delete that key from the index; c) given a key, find and return the key + refnr that fulfils a specific condition relative to the given key (smaller, smaller-or-equal, first equal, last equal, larger-or-equal, larger); d) given a key + refnr, return the next/prior key + refnr in sort order
  • b*tree - an algorithm designed to implement an index using a page-based approach
  • key path - for a b*tree this is an array with 1 entry per level of the tree containing page number and offset in page
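The key engine contract above can be sketched in a few lines of Python. This is only an illustration of the concept; the class name, the field layout, and the fixed-width big-endian encoding are assumptions for the example, not NexusDB's actual implementation.

```python
class KeyEngine:
    """Sketch of a key engine: builds a composite key from a record
    and compares two keys with a consistent total order."""

    def __init__(self, fields):
        # names of the record fields the composite key is derived from
        self.fields = fields

    def create_key(self, record):
        # One subkey per field; a fixed-width big-endian encoding makes
        # bytewise order match numeric order once the subkeys are joined.
        return b"".join(record[f].to_bytes(8, "big") for f in self.fields)

    @staticmethod
    def compare(a, b):
        # Bytewise comparison is a total order, so the consistency rules
        # hold automatically: A > B and B > C implies A > C, and so on.
        return (a > b) - (a < b)  # -1, 0, or +1


engine = KeyEngine(["dept", "emp_id"])
k1 = engine.create_key({"dept": 1, "emp_id": 42})
k2 = engine.create_key({"dept": 1, "emp_id": 43})
k3 = engine.create_key({"dept": 2, "emp_id": 0})
```

Because the comparison is derived from a total order on byte strings, the three consistency conditions come for free.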

B*Tree Index (which is what the default index implementation of NexusDB uses):

There are 2 kinds of pages: internal nodes and leaf nodes. Actual keys are only stored in the leaf nodes; the internal nodes store values derived from those actual keys to guide the search for a specific key starting from the root node. The depth of the tree (number of pages from the root node to the leaf node containing the key) is always constant for all keys and increases as the tree grows. Each page contains a sorted list of keys (or key-derived values for internal nodes). Searching for a specific key starts at the root page with a standard binary search on that page (requiring about log2 of the number of keys in the page key comparisons), then follows the reference to the next page one level deeper into the tree, searches again, and so on until the key is found (or not found) in a leaf node.
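As a sketch of that search, here is a toy in-memory version, assuming a made-up page layout (dicts with `keys` plus `children` or `refnrs` lists); the real on-disk layout is of course different:

```python
from bisect import bisect_left, bisect_right

def find(root, key):
    """Walk from the root page to a leaf, binary-searching each page.

    One page is visited per tree level; each binary search costs about
    log2(number of keys in the page) comparisons.
    """
    page = root
    while not page["leaf"]:
        # internal node: keys are separator values guiding the descent
        page = page["children"][bisect_right(page["keys"], key)]
    # leaf node: the actual keys and their refnrs live here
    i = bisect_left(page["keys"], key)
    if i < len(page["keys"]) and page["keys"][i] == key:
        return page["refnrs"][i]
    return None  # key not present

# a 2-level toy tree: one root (internal) page, two leaf pages
leaf1 = {"leaf": True, "keys": [1, 3], "refnrs": [10, 30]}
leaf2 = {"leaf": True, "keys": [5, 7], "refnrs": [50, 70]}
root = {"leaf": False, "keys": [5], "children": [leaf1, leaf2]}
```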

The maximum number of keys per page depends on the size of the key (plus some constant overhead) and the size of a page (minus some constant overhead). The overheads are different for internal and leaf nodes. But for larger (>16 bytes) key sizes this isn't very relevant.

Each page is filled between 50% and 100%; the avg. fill factor in a regularly modified b*tree is 66.6%. This means:

AvgNumberOfKeysPerPage = MaxKeysPerPage * 0.666

The avg. number of leaf nodes is Ceiling(TotalNumberOfKeys / AvgNumberOfKeysPerPage).

The avg. depth of the tree is Ceiling(LogX(LeafNodeCount)), i.e. the logarithm of LeafNodeCount to base x, where x = AvgNumberOfKeysPerPage.

The total number of key comparisons is about AvgTreeDepth * Log2(AvgNumberOfKeysPerPage).

Given all this information you can put a diagram together that shows how the number of required key comparisons needed to find a key in the index changes with the number of keys.
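Those formulas are easy to tabulate. A small sketch, assuming pages that hold up to 100 keys (an arbitrary figure for illustration):

```python
import math

def comparisons_per_find(total_keys, max_keys_per_page, fill=0.666):
    """Estimated key comparisons for one find, per the formulas above."""
    avg_keys = max_keys_per_page * fill
    leaf_count = math.ceil(total_keys / avg_keys)
    # tree depth: Ceiling(log base avg_keys of leaf_count), at least 1
    depth = 1 if leaf_count <= 1 else max(1, math.ceil(math.log(leaf_count, avg_keys)))
    # per level: one binary search of ~log2(avg_keys) comparisons
    return depth * math.log2(avg_keys)

# comparisons vs. number of keys
for n in (10**3, 10**5, 10**7, 10**9):
    print(f"{n:>13,} keys -> ~{comparisons_per_find(n, 100):.0f} comparisons")
```

The growth is logarithmic: every extra tree level adds only a handful of comparisons, even as the key count grows by orders of magnitude.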

Inserting a record requires inserting one key per index (which requires a find operation to find the insertion point).

Deleting a record requires deleting one key per index (which requires a find operation to find the key and generate a key path on the way from the root to the leaf).

Updating a record requires one delete and one insert per index which had its key changed (2 find operations per such index).
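Counting the find operations implied by the three rules above, as a tiny helper (the function name and the `op` strings are illustrative only):

```python
def find_operations(op, num_indices, indices_with_changed_key=0):
    """Find operations triggered by one record operation, per the rules above."""
    if op in ("insert", "delete"):
        # one key insert/delete (each needing a find) per index
        return num_indices
    if op == "update":
        # one delete + one insert per index whose key actually changed
        return 2 * indices_with_changed_key
    raise ValueError(f"unknown operation: {op}")
```

So an update that touches only non-indexed fields costs no index work at all, while one that changes every indexed field costs twice as many finds as an insert.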

In addition to the pure time it takes to compare the keys when walking the tree it's especially important to consider the IO costs:

Inserting/Deleting a record (without any indices):

read/write - 2 pages (table header page + data page)

The header contains the total count of records in the table and needs to be updated.

If this was the last used slot in the data page (for a delete), or if there were no data pages with empty slots available (for an insert), between 1 and 2 additional pages need to be read/written to either add the page back into the main list of reusable pages, get a new page from there, or add an additional page to the table.

Updating a record (without indices):

read/write - 1 page

Inserting/Deleting a key into an index:

read - 1 page per level of the tree + indices header page (containing the root node page number)

write - 1 leaf node page + indices header page (containing the total number of keys for the index)

If there are multiple indices, the "indices header page" will be read/written only once, as all indices use the same page; all modifications take place in an implicit transaction and the buffer manager will cache the page.

In addition, if the leaf node that the find operation determines as the insertion/deletion point is either full (insert) or would be less than 50% filled after the operation (delete), it's required to either balance the page with its siblings (adding 1 or 2 more page reads and 1 page write) or, if the siblings are also full/half-empty, execute a page split (adding a new page and splitting the keys into 2 pages) or a page merge (removing one page and merging the keys with the sibling), which costs a few more page reads/writes. In case of a split or merge the parent page needs to be updated as well, and the operation will cascade upward through the tree as long as the parent page is full/half-empty and cannot be balanced with siblings. If it reaches the root node of the tree, either a new root node is created (and the depth of the tree increases by one level; the new root node will have 2 keys) or the current root node is removed (and the depth of the tree decreases by 1 level; this happens when the key count of the root node reaches 1).
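Putting the per-record and per-index page counts together gives a rough common-case estimate. This sketch assumes the simple path only (a free slot is available, no page balance/split/merge, and the shared indices header page is counted once):

```python
def io_for_insert(num_indices, tree_depth):
    """Rough page reads/writes for inserting one record (common case:
    free slot available, no page split or balance needed)."""
    reads = 2                       # table header page + data page
    writes = 2
    if num_indices:
        reads += 1                  # indices header page, shared by all indices
        writes += 1
        reads += num_indices * tree_depth  # root-to-leaf walk per index
        writes += num_indices              # one leaf page write per index
    return reads, writes
```

For example, a table with 5 indices, each 3 levels deep, costs roughly 18 page reads and 8 page writes per inserted record, versus 2 and 2 with no indices at all, which is exactly why "how many indexes is too many" depends on your write load.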

Last thing we have to look at is page size. For each page read/write there is a (more or less static) time (drive seek time) + a variable time depending on the page size. Larger pages result in a higher fan-out factor (avg. keys per internal node) of the tree, reducing the depth of the tree and thus the number of pages read/written. At the same time, larger pages take longer to read/write and contain more information that is possibly not relevant. The "negative" impact of larger pages is highest if you do single record operations (insert/update/delete) and gets smaller if you use larger transactions (because multiple operations that write to the same page will only result in a single read/write).
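The fan-out side of that trade-off can be made concrete with the earlier formulas. A sketch, assuming 32-byte keys and a made-up per-page overhead of 64 bytes (the real overhead differs for internal and leaf nodes):

```python
import math

def tree_depth(total_keys, page_size, key_size, page_overhead=64, fill=0.666):
    """Avg. tree depth for a given page size, per the formulas above.

    `page_overhead` is an assumed constant, not NexusDB's actual figure.
    """
    max_keys = (page_size - page_overhead) // key_size
    avg_keys = max_keys * fill
    leaf_count = math.ceil(total_keys / avg_keys)
    return 1 if leaf_count <= 1 else max(1, math.ceil(math.log(leaf_count, avg_keys)))

# larger pages -> higher fan-out -> shallower tree (10M keys, 32-byte keys)
for size in (4096, 8192, 16384):
    print(f"{size:>6}-byte pages -> depth {tree_depth(10**7, size, 32)}")
```

Doubling the page size does not halve the depth; it only shaves off a level now and then, while every single-record operation pays the full larger-page transfer cost, which is the trade-off described above.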

In summary... the answer to your question is "that depends".
