GiST internals
GiST internal interface
GiST interface to database is conformed to PostgreSQL index interface agreements. Source files are located under src/backend/access/gist and header files - under src/include/access in PostgreSQL source tree. Basic algorithms implemented in GiST are described here. Application level GiST interface is described in introduction and GiST Programming Tutorial (in Russian).
- gistbuild
- Creates index.
Parameters: table and index descriptors - gistinsert
- Insert new value to the index.
Parameters: index descriptor, value, pointer to table record - gistbulkdelete
- Dead tuples cleanup. Tree-walking and calling callback function to recognize does tuple is dead or live.
Parameters: index descriptor, callback function, callback state - gistvacuumcleanup
- Removes empty pages and restore broken (after crash) tuples. Also rebuilds internal keys if needed (index is locked when rebuilding).
Parameters: index descriptor. - gistbeginscan, gistendscan, gistrescan
- These are base methods for search in index - create scan, finish scan, repeat scan.
Parameters:
- gistbeginscan - index descriptor, search keys (may be omitted)
- gistendscan, gistrescan - scan
- gistmarkpos
- Save scan state.
Parameters: scan - gistrestrpos
- Restore scan state.
Parameters: scan - gistgettuple, gistgetmulti
- Returns one or several pointers to records in table, according search key, specified in gistbeginscan.
Parameters: scan - gist_redo
- Restore one WAL record.
Parameters: LSN (LogSequenceNumber), WAL record - gist_desc
- Describes WAL record (debug only)
- gist_xlog_startup
- Prepare to recover using WAL
- gist_xlog_cleanup
- Finalize recovering.
NOTES:
- Functions gistmarkpos, gistrestrpos and gistrescan are required by PostgreSQL optimizer and executor and may be not needed for other database engine.
- GiST has no direct access to tables, only via pointers ( the number of page and the number of record on the page) to records. So, callback function in gistbulkdelete uses pointer to record and this function should recognize if pointer is still valid (points to the record) or not.
GiST requirements to memory manager
Memory manager should provide:
- malloc/realloc/free functions
- create,switch, clean and destroy memory context, used by alloc. functions. At the same time there are could be several memory contexts.
GiST requirements to lock manager
- share/exclusive locks at the page level. Deadlock detector NOT needed.
- share/exclusive locks at the index level. Deadlock detector NOT needed.
GiST requirements to page manager
- Ability to easy find page by its number/index
- When creating new pages it's very desirable to use pages marked as free by gistvacuumcleanup
GiST requirements to WAL
- LSN (Log Sequence Number) - monotonically increased (on the whole period of database life) sequence number (PostgreSQL uses 64-bit integer), that identifies log record.
- Function which writes WAL record and returns its LSN
NOTE: It's possible to implement GiST concurrency without WAL. In that case we need a sequencer which generates monotonically increased integers.
What we need to know
- Agreements on passing keys (by value, by pointers, length, memory usage,…)
- It'd be nice to know specification (names, descriptions) of WAL, lock and page manager interfaces.
- Specifications of functions (if any ) for converting value → tuple, tuple → page.