Документ взят из кэша поисковой машины. Адрес оригинального документа : http://www.sai.msu.su/~megera/wiki/sncat_design
Дата изменения: Unknown
Дата индексирования: Sun Apr 10 17:10:12 2016
Кодировка:

Поисковые слова: п п п п п п п п п п п р п р п р п р п
Zen: sncat design

sncat design

SAI SN catalogue design

Convertations Issues

  • How to distinguish unknown optdisc and default (ccd) ? See 1885A example.
  • type of sn ??? Ideally, we want to encode information about common type, subtype, pecularity and uncertainty of each part.
  • date ??? We have incomplete date (only month is specified), which is no way to use in db, but we want to preserve somehow this info and use date data type.
  • cross-identification with leda catalogue. This is not difficult using q3c package. See ledasn table in leda database for results. See comment page for details.

Comments

Well, I finally decided to abandon galaxy table from our catalogue and to use hyperleda as a reference data for galaxies. This will greatly simplify maintenance of our catalog and make it more transparent. But, I see a problem with support of several reference tables for galaxies. For example, hyperleda is ok for close galaxies, but modern sn surveys look much further and we certainly need other catalogs with faint galaxies. I need to figure out how to modify SQL scheme. Ideally, we want be able to:

  • add (online) new reference catalogs
  • specify priority of catalogs in case of overlapping data (for example, we could have our local hand-made table for galaxies and assign it high priority)

Questions

1. How to implement history of changes on catalog ?
Ideally, we need separate history table which stores information about who made the change, when it was occured, what was changed and old value.
CREATE TABLE history (
  id bigint PRIMARY KEY,
  who text,
  when timestamp without time zone,
  what text,
  old  ????
);

For now, I use triggers to store modification datetime in mod_date column.

Maintain compiled catalogs

The problem:
It's necessary to maintain catalog with data collected from different (primary) catalogs. We want to find out the best database schema which supports synchronization of secondary catalog with primary one.

  • t0.sql, diagram - sql schema and diagram for example database (galaxy has 3 properties - ra,dec,z)

1. Fully normalized schema

It's possible to store every property of a galaxy as a separate row, so we could maintain separate table with references to the source catalog and record id in source catalog.

CREATE TABLE cats (
	id integer PRIMARY KEY,
	name text
);

-- a list of properties of galaxy
CREATE TABLE properties (
	id integer PRIMARY KEY, 
	name text
);

CREATE TABLE galaxy (
	id integer PRIMARY KEY,
	name text
);

CREATE TABLE galaxy_property (
	gid integer REFERENCES galaxy(id),
	pid integer REFERENCES properties(id),
	value	float,
	UNIQUE(gid,pid)
);

CREATE TABLE map (
	gid integer REFERENCES galaxy(id),	-- galaxy id
	pid integer REFERENCES properties(id),  -- property id
	cid integer REFERENCES cats(id),	-- catalog id
	cidid integer,  			-- id in catalog cid
	UNIQUE(gid,pid,cid)
);

http://mira.sai.msu.su/~megera/sn/sncat/design/test/t0.png

t=# select * from galaxy;
 id | name 
----+------
  1 | N224
t=# select * from galaxy_property where gid=1;
 gid | pid | value 
-----+-----+-------
   1 |   1 |  0.74
   1 |   2 |  41.3
   1 |   3 |     0
t=# select * from properties;
 id | name 
----+------
  1 | ra
  2 | dec
  3 | z
t=# select * from cats;      
 id | name  
----+-------
  1 | LEDA
  2 | NED
  3 | SDSS
  4 | OTHER
(4 rows)

Benefits:

  • Very flexible
    • catalogs could be added online,
    • only known properties could be stored for a galaxy

Drawbacks:

  • Each galaxy requires many rows in galaxy_property and map tables ( as much as the number of properties)
  • select queries are very cumbersome
  • There is a problem with storing values of different types in galaxy_property (we use float in this example)

2. Partially normalized schema

Instead of maintaining separate row for each attribute, we use custom composite type, which encapsulates knowledge about value and source catalog where this value comes from. See table galaxy_properties instead of galaxy_property'.

CREATE TYPE val_cid as ( value float, cid integer);
CREATE TABLE galaxy_properties (
        gid integer REFERENCES galaxy(id),
        ra  val_cid,
        dec val_cid,
        z   val_cid
);

http://mira.sai.msu.su/~megera/sn/sncat/design/test/t1.png
Notice, composite type designated as BLOB on picture!

t=# select * from galaxy_properties;
 gid |    ra    |   dec    |   z   
-----+----------+----------+-------
   1 | (0.74,1) | (41.3,1) | (0,3)
(1 row)

For composite type we should use special syntax to access specific component.

t=# select * from galaxy_properties where (z).cid = 3;
 gid |    ra    |   dec    |   z   
-----+----------+----------+-------
   1 | (0.74,1) | (41.3,1) | (0,3)
(1 row)

Benefits:

  • It's flexible and catalogs could be added online
  • One row per galaxy
  • No problem with data type, we could create many composite types

Drawbacks:

  • Require creation of custom type
  • select queries are very cumbersome

Example query - identify source (source catalog, record id in source catalog) of redshift of galaxy with gid = 1.

t=# select g.id, g.name as galaxy, c.name as catalog,m.cidid 
from galaxy_properties gp, cats c, map m,galaxy g,properties p
where  (gp.z).cid = c.id and gp.gid = 1 and m.gid = gp.gid and m.cid = c.id 
and g.id = gp.gid and m.pid=p.id and p.name='z';
 id | galaxy | catalog | cidid 
----+--------+---------+-------
  1 | N224   | SDSS    |   178
(1 row)

3. Basic schema

What if we combine tables galaxy and galaxy_properties ? Then, if assume, that (gid,pid) uniquely connected with (cid,cidid), we could a bit easier identified the source of a specific column.

CREATE TABLE galaxies (
        gid integer PRIMARY KEY, 
        name text,
        ra  float,
        dec float,
        z   float
);

Uniqueness means, that for given property of given galaxy we have onle one record in source catalog !

http://mira.sai.msu.su/~megera/sn/sncat/design/test/t2.png

t=# select g.gid, g.name as galaxy, c.name as catalog,m.cidid
from galaxies g, cats c, map m, properties p 
where g.gid = 1 and m.gid = g.gid and m.cid = c.id  and m.pid=p.id and p.name='z';
 gid | galaxy | catalog | cidid 
-----+--------+---------+-------
   1 | N224   | SDSS    |   178
(1 row)

Example of update query:
Update coordinates (ra,dec) in table galaxies using information from Leda catalogue (cid=1).

t=# update galaxies set ra=foo.ra,dec=foo.dec from 
(select l.al2000 as ra, l.de2000 as dec, g.id  from galaxies g,  
map m, leda.m000 l 
where g.id = m.gid and m.pid=2 and m.cid=1 and m.cidid = l.pgc 
) as foo where foo.id = galaxies.id;