Zen: SkyPixelization

Indexing the sky with PostgreSQL

The Challenge

Scalable indexing of very big sky catalogues in open-source relational database - PostgreSQL. We advocate the adoption of PostgreSQL by astronomical community.

The method

There is a common opinion that good pixelization method should satisfy 3 conditions ( see Desiderata):

Hierarchical structure
Equal areas
Iso-Latitude distribution

Do we really need them for database indexing ? Not at all. We need simple numbering scheme, so that for any given point we could easily calculate the number of "pixel", this is what HealPix provides and lacks HTM. Also, we need simple pixel shape for easy geometrical calculation, like HTM's triangles, while HealPix pixel has curved shape which complicates geometrical calculation.

Our method is a hybrid of HTM and HealPix and in a nutshell is *"The quadrilaterlized spherical cube"*, which was used in COBE ( see details in SphereCube). Special numbering scheme keeps spatial closiness, so we could use standard Btree index provided by PostgreSQL for searching pixels.
Q3C segmentation
Quad-tree pixelization of the circle of cone search ((42,0), radius 1arc second) on cube face. Notice, regular shapes of pixels and circle projected to ellipse.

The main feature of astronomical databases, the cone search was implemented by following way:

The cone search bound the small circle on the sphere
This small cirle is projected as ellipse on the cube face
The quad-tree structure on the cube face allow to OPTIMALLY decompose the ellipse on the two sets: the set of quad-tree squares fully covered by the ellipse, and the set of quad-tree squares partly covered by the ellipse (see the picture above)
Due to the presence of the quadtree structure these two sets of squares transforms to the set of the ranges on pixel number allowing the direct using of the BTREE index.

Also, we could use CLUSTER feature to store database rows on disk according index, which greatly increases performance of fetching tuples.

Finally we want to underline two important facts about our scheme:

Due to the optimal segmentation of the circle of the query we fetch from the disk only a few percents of points not lying within the circle of the query. So our scheme is nearly I/O optimal.
Due to to not recursive(as in HTM) and very simple computations (in contrast to HEALPIX with it non-geodesic pixels) our scheme is not CPU intensive and we feel that it is significantly less CPU intensive then HEALPIX and HTM.
Due to the simplicity of computations and high performance, the scheme can be used for catalogues with the very high coordinate precision (currently several milliarcseconds), and can be easily extended even to higher precision.

Preliminary tests

Test run consists of desorted (to exclude caching) cone queries over USNO-A1 catalogue (>500,000,000 rows), which cover all sky for Q3C (our method), two-column (ra,dec) btree index and Rtree (box). We run queries for set of radii (0.1, 0.2, 0.4, 0.8, 1.6, 3.2) degrees, total 1800 queries for each run.

Performance comparison

It's always difficult to provide fair comparison because of influence of many factors, such as system caching, system load, etc. So, we present some timings and mostly concentrate on blocks IO statistics, provided by PostgreSQL statistic collector

We use spare hardware kindly provided by Scientific Network project, which is DUAL Xeon 2.4 GHz, 1Gb RAM, 160Gb IDE disk ( IBM / Hitachi HDS722516VLAT80, 8.5 ms average seek time, 4.7 ms latency average time, see details ), Linux 2.6.7 optimized for Web applications. We configured separate pg superuser and installed 8.01 version in different location (there is primary "production" 7.4.6 PostgreSQL server). Because of tight conditions we couldn't fully optimized our backend, below some parameters from postgresql.conf:

shared_buffers = 24576
work_mem = 51200 #1024
checkpoint_segments = 12

Database structure

wsdb=# \d usno
     Table "public.usno"
 Column |  Type  | Modifiers 
--------+--------+-----------
 ra     | real   | 
 dec    | real   | 
 bmag   | real   | 
 rmag   | real   | 
 ipix   | bigint | 
 errbox | box    | 
Indexes:
    "box_ind" rtree (errbox)
    "ipix_ind" btree (ipix) CLUSTER
    "radec_idx1" btree (ra,dec)

Database size:

usno table - 48Gb
box_ind (Rtree) - 40Gb
radec_idx1 (Btree) - 11Gb
ipix_ind (Btree) - 11Gb

We were able to load database, create indices, run vacuum in about 20 hours, which is rather good timing for our resources.

We plot time (ms) vs the number of fetched tuples in logarithmic axes.

Input/Output comparison

Database performance for very large data sets (much more than RAM available) is often characterized in terms of input/output operations (tuples, index block reads and usage of cache in database buffers).

The figures below show the differences in disk I/O for different sky indexing schemes.

The number of index blocks read during the queries versus number of fetched tuples

The number blocks read from the heap during the queries versus number of returned tuples

The main conclusion from the first image is that Q3C is significantly less I/O intensive in index usage. During Rtree queries the number of index blocks read is several times greater than during Q3C queries. The RADEC index usage is even more I/O intensive. The second image demonstrate again the advantage of the Q3C scheme. The actual data read from the database in Q3C scheme is less than the data read in Rtree and RADEC scheme. This is mainly caused by the fact that both Rtree and RADEC scheme are bounding box schemes (the data from the box encompassing the circle of search is extracted from the database and only on the last stage checked whether it lies in the circle). But in Q3C scheme, as we showed before, the circle is segmented, so we extract from the database only a few percents more data than occur in the circle.