|
Документ взят из кэша поисковой машины. Адрес
оригинального документа
: http://www.sai.msu.su/~megera/wiki/GinTestUpdate
Дата изменения: Unknown Дата индексирования: Sun Apr 10 17:49:13 2016 Кодировка: UTF-8 Поисковые слова: asteroid |
For integer arrays GIN is 10x slower than GiST on update. It's about 10x faster than GiST on select. Updating of 100 rows in 100,000 takes < 0.5s for GIN index, which is quite reasonable speed.
Linux 2.6.19.1, Slackware, x86, Intel(R) Pentium(R) 4 CPU 2.40GHz, 2Gb RAM.
PostgreSQL 8.2.4
Test table tt(id integer, a int[]) with GIN index created
test=# \d tt
Table "public.tt"
Column | Type | Modifiers
--------+-----------+-----------
id | integer |
a | integer[] |
Indexes:
"gin_idx" gin (a)
For creating GiST index
create index gist_idx on tt using gist(a gist__intbig_ops);
Example of populating a table - 10000 rows, with integer arrays cardinality 500 and maximal array length of 100.
sh ./test-id.sh -l 100 -n 10000 -v 500 | psql test -c "copy tt from stdin with delimiter as '|'"
Example of update SQL - update first 300 rows
update tt set a=array( select(1000*random())::integer+generate_series( 0,200+id) ) where id < 300;
Update array column by random array of length 200, vary #rows ( All timings in ms !)
-l 100 -n 10000 -v 500
Create index:
-l 100 -n 100000 -v 500
Create index:
-l 100 -n 10000 -v 5000
Create index:
-l 100 -n 100000 -v 5000
Create index (real):
Simulate typical news archive - 100,000 documents with length < 500 words and dictionary size 100,000 words.
-l 500 -n 100000 -v 100000
Create index:
Search performance:
explain analyze select * from tt where a @> '{2}';
GiST:
Bitmap Heap Scan on tt (cost=29.27..405.87 rows=100 width=979) (actual time=68.714..6311.757 rows=678 loops=1)
Filter: (a @> '{2}'::integer[])
-> Bitmap Index Scan on gist_idx (cost=0.00..29.24 rows=100 width=0) (actual time=43.430..43.430 rows=10774 loops=1)
Index Cond: (a @> '{2}'::integer[])
Total runtime: 6312.807 ms
GIN:
Bitmap Heap Scan on tt (cost=150.50..527.11 rows=100 width=982) (actual time=0.610..11.277 rows=678 loops=1)
Recheck Cond: (a @> '{2}'::integer[])
-> Bitmap Index Scan on gin_idx (cost=0.00..150.48 rows=100 width=0) (actual time=0.433..0.433 rows=678 loops=1)
Index Cond: (a @> '{2}'::integer[])
Total runtime: 11.892 ms
GiST ia a way slower than GIN index, because GiST found 10774 documents where only 678 documents are right answer and remaining are false drops !
Скрипт test-id.sh for generating test load. It's not optimized and could be horribly slow !
#!/bin/sh
# default values
# array length
MAXLEN=4
# number of rows
MAXNUM=10
# cardinality
MAXVAL=100
while getopts l:n:v: opt
do
case "$opt" in
l) MAXLEN="$OPTARG";;
n) MAXNUM="$OPTARG";;
v) MAXVAL="$OPTARG";;
esac
done
for ((i=0;i<$MAXNUM;i++)) do
len=$(( $RANDOM % $MAXLEN))+1
for ((j=0;j<$len;j++)) do
if [ $j == '0' ]; then
val=$(( $RANDOM % $MAXVAL))
else
val=$val,$(( $RANDOM % $MAXVAL))
fi
done
echo $i\|{$val}
done