Our very own Bob Hanisch gave a great presentation on the VAO and the status of things. Please bear with me: what follows is mostly notes and main points. If you have questions about this very condensed summary, please ask me, or refer to Bob directly for VAO-specific questions. I mostly captured "highlights" of his talk.

Data discovery, access, and management with the Virtual Observatory. Data come in different sizes, with different dimensionality and different inputs. How do you deal with such a heterogeneous environment? Metadata can help. Need to homogenize. Different tools. There are 50 major data centers and observatories with substantial holdings. The VO has 10,000 resources (catalogs, surveys, archives). Data centers host approximately 100 TB each, currently 1+ PB in total.

There has been a change in culture in how archives are viewed. Archives are now used for research (2/3 of papers published from HST data are by scientists who were neither PI nor collaborator on the original HST proposal the data were taken for). Reference to diagrams from astro2010 and a paper by White et al. VAO was not reviewed or prioritized in astro2010 as it was already being implemented. Data preservation and curation are very much needed, and new software needs to be created (to handle large data sets). NSF should be on board. Similar recommendations were already made by CODMAC in 1982.

Data mgmt.: need database technology capable of managing 10^9-10^12 rows; need network bandwidth; need metadata mgmt.; support long term projects. So we need to budget for comprehensive archiving, long term curation by VO. Need to engage theoretical community.

VO is mostly a data discovery, access and integration facility. Intl. collaboration on metadata standards, data models and protocols. Various standards, from the data to software to transferring data and so forth. Intl. VO was established in 2001. US VO efforts: NVO: 2001-2008. VAO from 2010 to 2015 (VAO is the operational component; unfortunately there was a gap after the end of NVO). VAO was allocated $5.5M/yr over five years. VAO is subject to an annual performance review. Over sixty people are involved, but all at a low level of effort. VAO has a board of directors. Bob Hanisch is the VAO Director, Bruce Berriman is P.M., De Young is Project Scientist and Alex Szalay is Technology Advisor. New web presence: http://www.usvao.org

Scope and functions: at STScI, User Support by Maria Santisteban-Nieto, EPO by Brandon Lawton.

Challenges: restarting a distributed team; working in an atmosphere of intense fiscal oversight; changing the mindset from R&D to facility operations; right-sizing processes: structure vs. straitjacket; managing expectations, timing releases of new capabilities; user community take-up, building trust.

Seven major science initiatives: new portal; scalable; SEDs.....other.....in portal trying to create a context
sensitive interpreter, so that you type one word in one box and the tool knows what you are typing in (object M31, galaxy, coords, etc.). (It will be in JavaScript.)

Some notes below on these new initiatives within the VAO (please be aware, notes are quite sketchy here):

SED tool: sits between Specview and Sherpa, linking these tools together and then accessing the NED collection. It can fit a model, etc.

Cross-matching: going beyond the previous tool in NVO. It will be a large-scale positional catalog cross-comparison web service.

Time-series integration tools: looking at the Harvard tool and the IPAC tool. Merge them?

VAO-IRAF integration: it's about the number of people registered to IRAF. It's about buy-in. Largely a strategic approach. >700 IRAF tasks will become VO-aware.

Science studies: a study period during year one. Time domain astronomy (transients); data linking and semantic astronomy (project ongoing at Smithsonian); desktop tool integration (phase 2): how can we operate with different desktop tools in addition to tools?; data mining and stats analysis (what are the best tools to use in a VO environment; make sure users understand how to use them).

The research record and data: aim to systematically preserve the data in published papers. Not only the paper, but also the data points that were used to make a plot/table/graph. Establish regulations and methods to include data in papers. This can help ensure that no clerical errors are made, and also help reproducibility and curation
and preservation. VAO is collaborating with AAS and with an NSF OCI-funded project, the Data Conservancy (DataNet program). NSF now has a policy that a project data management justification is needed when applying for an NSF grant. A CyberSKA-like tool can be good for sharing comments and input on papers. The aim of the VAO is not to police publications.

Science collaborations: help make sure that decisions are made with input from the scientists doing big projects: CANDELS. Also SMC, with Madore as PI.

Summary: unprecedented data volumes, complex data; data mgmt. practices need to be implemented in the facility; distributed data and distributed services.

Q/As: VAO does very little storage itself: VAO does not accept data over the wall. Data remain at other institutions. VAO is a service integrator. Steps when a user submits a request for data: "User's query" ---> "VAO provides pointer" ---> "Provider to consumer directly". VAO does not get the data; it only provides the link. Can you do an expert mode query? Probably a variety of approaches is desirable. For now it's one box. About IRAF: there will likely be an IRAF VAO distribution in the Fall.
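
To make the three-step flow from the Q/A concrete ("User's query" ---> "VAO provides pointer" ---> "Provider to consumer directly"), here is a minimal Python sketch. It is only an illustration of the service-integrator idea, not how the VAO is implemented: the class names `Integrator`, `Provider` and `Pointer`, the `retrieve` helper, and the `archive-a.example` URL are all made up for this example, and the "network" is simulated with in-memory dictionaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pointer:
    """A link to where the data live; it carries no data itself."""
    provider: str
    access_url: str


class Integrator:
    """Hypothetical stand-in for the integrator role: keyword -> pointers."""

    def __init__(self):
        self._index = {}  # keyword -> list of Pointer

    def register(self, keyword, pointer):
        self._index.setdefault(keyword.lower(), []).append(pointer)

    def query(self, keyword):
        # Step 2: return pointers only; no data pass through the integrator.
        return list(self._index.get(keyword.lower(), []))


class Provider:
    """Hypothetical data center that keeps its own holdings."""

    def __init__(self, name):
        self.name = name
        self._holdings = {}  # access_url -> data

    def publish(self, integrator, keyword, path, data):
        # The data stay here; only the link is handed to the integrator.
        url = f"https://{self.name}.example/{path}"
        self._holdings[url] = data
        integrator.register(keyword, Pointer(self.name, url))

    def fetch(self, access_url):
        # Step 3: the consumer retrieves the data directly from the provider.
        return self._holdings[access_url]


def retrieve(integrator, providers, keyword):
    """Run the three steps for one keyword and return the data found."""
    results = []
    for pointer in integrator.query(keyword):  # steps 1 and 2
        provider = providers[pointer.provider]
        results.append(provider.fetch(pointer.access_url))  # step 3
    return results


integrator = Integrator()
archive = Provider("archive-a")
archive.publish(integrator, "M31", "images/m31.fits", b"fake image bytes")
providers = {archive.name: archive}

data = retrieve(integrator, providers, "m31")
```

The point of the sketch is that `Integrator` only ever stores `Pointer` objects, so the holdings never leave the provider until the consumer asks the provider for them.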