Документ взят из кэша поисковой машины. Адрес оригинального документа : http://www.sai.msu.su/~megera/postgres/gist/gevel/README.gevel
Дата изменения: Fri Apr 4 00:04:31 2008
Дата индексирования: Sun Apr 13 07:45:12 2008
Кодировка:
Gevel contrib module provides several functions useful for analyzing
GIN and GiST indexes.

Online version of this document is available
[23]http://www.sai.msu.su/~megera/wiki/Gevel

Caution: This module was designed for developers of GiST/GIN based
indices !

Authors

* Oleg Bartunov , Moscow, Moscow University, Russia
* Teodor Sigaev , Moscow, Moscow University,Russia

License

Stable version, included into PostgreSQL distribution, released under
BSD license. Development version, available from this site, released
under the GNU General Public License, version 2 (June 1991)

Downloads

Old version of gevel available from
[24]http://www.sai.msu.su/~megera/postgres/gist/gevel/

Development version is available from CVS:
cvs -d :pserver:anoncvs@sigaev.ru:/root login
(just enter an empty password)
cvs -d :pserver:anoncvs@sigaev.ru:/root co gevel

Installation

% cd PGSQLSRC/contrib
% tar xzvf gevel.tar.gz
% make
% make install
% make installcheck
% psql regression < gevel.sql

Examples

All examples below require installing contrib/rtree_gist module and
already installed gevel module.
cd contrib/rtree_gist
make
make install
make installcheck
psql regression < gevel.sql

Now you have regression database with two tables and indices:
regression=# \pset tuples_only 1
Showing only tuples.
regression=# \d
List of relations
Schema | Name | Type | Owner
--------+---------+-------+----------
public | boxtmp | table | postgres
public | polytmp | table | postgres
(2 rows)
regression=# \di
List of relations
Schema | Name | Type | Owner | Table
--------+------+-------+----------+---------
public | bix | index | postgres | boxtmp
public | pix | index | postgres | polytmp
(2 rows)


* gist_stat(INDEXNAME) - show some statistics about GiST tree

regression=# select gist_stat('pix');
Number of levels: 2
Number of pages: 30
Number of leaf pages: 29
Number of tuples: 3129
Number of leaf tuples: 3100
Total size of tuples: 137676 bytes
Total size of leaf tuples: 136400 bytes
Total size of index: 245760 bytes

* gist_tree(INDEXNAME,MAXLEVEL) - show GiST tree up to MAXLEVEL

regression=# select gist_tree('pix',0);
0(l:0) blk: 0 numTuple: 29 free: 6888b(15.63%)

Designation (from left to right):

* 0 - page number
* (l:0) - tree level
* blk: 0 - block number
* numTuple: 29 - the number of tuples
* free: 6888b - free space in bytes
* (15.63%) - occupied space in percents

gist_tree(INDEXNAME) - show full GiST tree
regression=# select gist_tree('pix');
0(l:0) blk: 0 numTuple: 29 free: 6888b(15.63%)
1(l:1) blk: 13 numTuple: 180 free: 244b(97.01%)
2(l:1) blk: 11 numTuple: 175 free: 464b(94.32%)
3(l:1) blk: 2 numTuple: 101 free: 3720b(54.43%)
4(l:1) blk: 17 numTuple: 111 free: 3280b(59.82%)
5(l:1) blk: 18 numTuple: 101 free: 3720b(54.43%)
6(l:1) blk: 10 numTuple: 98 free: 3852b(52.82%)
7(l:1) blk: 19 numTuple: 111 free: 3280b(59.82%)
8(l:1) blk: 9 numTuple: 97 free: 3896b(52.28%)
9(l:1) blk: 20 numTuple: 104 free: 3588b(56.05%)
10(l:1) blk: 14 numTuple: 96 free: 3940b(51.74%)
11(l:1) blk: 21 numTuple: 106 free: 3500b(57.13%)
12(l:1) blk: 7 numTuple: 103 free: 3632b(55.51%)
13(l:1) blk: 1 numTuple: 101 free: 3720b(54.43%)
14(l:1) blk: 16 numTuple: 97 free: 3896b(52.28%)
15(l:1) blk: 24 numTuple: 103 free: 3632b(55.51%)
16(l:1) blk: 4 numTuple: 98 free: 3852b(52.82%)
17(l:1) blk: 25 numTuple: 98 free: 3852b(52.82%)
18(l:1) blk: 3 numTuple: 97 free: 3896b(52.28%)
19(l:1) blk: 26 numTuple: 96 free: 3940b(51.74%)
20(l:1) blk: 6 numTuple: 103 free: 3632b(55.51%)
21(l:1) blk: 8 numTuple: 162 free: 1036b(87.31%)
22(l:1) blk: 23 numTuple: 94 free: 4028b(50.66%)
23(l:1) blk: 12 numTuple: 82 free: 4556b(44.19%)
24(l:1) blk: 27 numTuple: 105 free: 3544b(56.59%)
25(l:1) blk: 5 numTuple: 90 free: 4204b(48.51%)
26(l:1) blk: 28 numTuple: 100 free: 3764b(53.90%)
27(l:1) blk: 22 numTuple: 101 free: 3720b(54.43%)
28(l:1) blk: 15 numTuple: 95 free: 3984b(51.20%)
29(l:1) blk: 29 numTuple: 95 free: 3984b(51.20%)

* gist_print(INDEXNAME) - prints objects stored in GiST tree, works
only if objects in index have textual representation (type_out
functions should be implemented for given object type). It's known
to work with R-tree GiST based index (contrib/rtree_gist). Note, in
example below, objects are of type box.

# \di
List of relations
Schema | Name | Type | Owner | Table
--------+------+-------+----------+---------
public | bix | index | postgres | boxtmp
public | pix | index | postgres | polytmp
(2 rows)
# \d pix
Index "public.pix"
Column | Type
--------+------
p | box
gist, for table "public.polytmp"

# select * from gist_print('pix') as t(level int, a box) where level =1;

for 8.1 and later this query should looks like
# select * from gist_print('pix') as t(level int, valid bool, a box) where level =1;

level | valid | a
-------+-------+-----------------------------
1 | t | (37357,50073),(34242,357)
1 | t | (43499,49770),(40358,43)
1 | t | (31193,24679),(25047,12410)
1 | t | (31018,12142),(25083,6)
1 | t | (49944,25174),(43471,12802)
1 | t | (12577,49757),(6302,37534)
1 | t | (12528,37333),(6171,24861)
1 | t | (50027,49751),(46817,25462)
1 | t | (46870,49912),(43664,25722)
1 | t | (24855,25574),(12447,19263)
1 | t | (25054,19126),(12403,12796)
1 | t | (32737,49923),(31178,1038)
1 | t | (3184,24465),(15,81)
1 | t | (24951,49983),(12740,44000)
1 | t | (24919,43956),(12617,37901)
1 | t | (40387,49852),(37338,25217)
1 | t | (40325,24963),(37375,491)
1 | t | (24919,12698),(12654,6518)
1 | t | (25002,6338),(12350,51)
1 | t | (49985,12554),(43447,222)
1 | t | (25003,37769),(12552,25573)
1 | t | (34270,49382),(32763,594)
1 | t | (6205,50012),(3,37527)
1 | t | (6163,37358),(120,25034)
1 | t | (12343,24542),(9295,294)
1 | t | (9308,24151),(6234,620)
1 | t | (6230,24629),(3169,108)
1 | t | (31179,50040),(28113,25556)
1 | t | (28048,49694),(25000,25000)
(29 rows)


* Using Gevel module it's possible to visualize index tree, see for
example [25]Rtree Index.
* gin_stat(INDEXNAME) prints statistics about each index entry in GIN
index (PostgreSQL 8.2+)

For each index entry 'key' it returns set of records (key, nrows),
where 'nrows' is the estimated number of rows from tables, returned by
select. For example, in text search 'nrows' is the number of documents,
which contains word 'key'. This statistics may be used ONLY as
estimation of true counts, since it doesn't takes into account the
visibility information (from heap) and uses approximate counts in case
of long posting lists.

Vacuum table after table modification to obtain more accurate
statistics.

Top-5 most frequent words (from 463873) :
=# SELECT * FROM gin_stat('gin_idx') as t(word text, ndoc int) order by ndoc desc limit 5;
word | ndoc
--------+--------
page | 340858
figur | 240366
use | 148022
model | 134442
result | 129010
(5 rows)

Time: 520.714 ms

Compare exact stats, obtained using ts_stat() function, with estimated
counts:
=# select a.word, b.ndoc as exact, a.estimation as estimation,
round ( (a.estimation-b.ndoc)*100.0/a.estimation,2)||'%' as error from
(SELECT * FROM gin_stat('gin_x_idx') as t(word text, estimation int)
order by estimation desc limit 5 ) as a, stat b where a.word = b.word;
word | exact | estimation | error
--------+--------+------------+-------
page | 340430 | 340858 | 0.13%
figur | 240104 | 240366 | 0.11%
use | 147132 | 148022 | 0.60%
model | 133444 | 134442 | 0.74%
result | 128977 | 129010 | 0.03%
(5 rows)

Time: 550.562 ms

Example for integer arrays:
# SELECT * FROM gin_stat('gin_idx') as t(value int, nrow int) where nrow > 250;
value | nrow
-------+------
31 | 254
47 | 251
52 | 257
59 | 259
(4 rows)