Scot's Page for Public SDSS Data Access

The earlier DR1 version of this page.

The earlier DR2 version of this page.

The earlier DR3 version of this page.

DR4 was released in mid-2005.

What is DR4? From the DR4 Web Site:

The DR4 imaging data cover 6670 square degrees, and include information on roughly 180 million objects. The DR4 spectroscopic data include data from 1052 main survey plates of 640 spectra each, and cover 4681 square degrees. In addition, DR4 for the first time contains 276 "extra" and "special" plates:

There is a separate page describing the special plates in DR4.

The DR4 footprint is defined by all non-repeating survey-quality imaging runs within the a priori defined elliptical survey area in the Northern Galactic Cap, and three stripes in the Southern Galactic Cap obtained prior to 1 July 2004, and the spectroscopy associated with that area as well as the extra and special plates obtained before that date. In fact, 34 square degrees of imaging data in the Northern Galactic Cap lie outside this ellipse. While the DR4 scans do not repeat a given area of sky, they do overlap to some extent, and the data in the overlaps are included in earlier releases as well. The sky coverage of the imaging and spectroscopic data that make up DR4 is given on the coverage page. The natural unit of imaging data is a run; the DR4 contains data from (about) 200 runs in the best database, and (about) 202 runs in the target database.

A total of 183 square degrees of sky are covered by different runs in target and best, the majority along the Equatorial Stripe in the Fall sky.

Except for the sky coverage, the pipelines and databases are identical in DR4, DR3 and DR2. Thus, DR4 is (very nearly) a proper superset of DR3, which is a superset of DR2. The DR2 included reprocessing of all data included in DR1, and those data in EDR that pass our data-quality criteria for the official survey. For details about what changed from DR2 to DR3, please refer to About DR3 on the DR3 web site.

Photometric calibrations are good to 2% rms and coordinates are accurate to 100 milli-arcsec rms per coordinate. As in every release since DR2, the spectra have better spectrophotometric calibrations than in DR1, but are no longer corrected for Galactic reddening. Model magnitudes have also been improved so that they now serve as a better proxy for PSF magnitudes for point sources and Petrosian magnitudes for extended sources. The DR4 data pipelines are at least functionally identical to the DR3 pipelines.

The SDSS spectra cover a wavelength range of 3850 - 9200 Angstroms in two channels with a wavelength scale of 1.14 Angstroms/pixel for a resolution of ~1800.
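As a rough illustration of what those numbers imply, the sketch below takes the quoted pixel scale and resolution at face value and works out the width of one resolution element (wavelength divided by resolution) and how many pixels sample it. Treating the pixel scale as constant across the whole spectrum is a simplifying assumption, and the function names are made up for this example.

```python
# Values quoted in the text above; the constant pixel scale is a simplification.
DISPERSION = 1.14    # Angstroms per pixel
RESOLUTION = 1800.0  # approximate R = lambda / delta_lambda


def resolution_element(wavelength, resolution=RESOLUTION):
    """Return the width in Angstroms of one resolution element."""
    return wavelength / resolution


def pixels_per_element(wavelength, dispersion=DISPERSION, resolution=RESOLUTION):
    """Return how many pixels sample one resolution element."""
    return resolution_element(wavelength, resolution) / dispersion


if __name__ == "__main__":
    # Blue end, middle, and red end of the quoted wavelength range.
    for wave in (3850.0, 6000.0, 9200.0):
        print("%6.0f A: %.2f A per element, %.2f pixels" % (
            wave, resolution_element(wave), pixels_per_element(wave)))
```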

The official project DR4 WWW page is found at http://www.sdss.org/dr4 while the project's WWW site is located at http://www.sdss.org. The official DR4 publication (Adelman-McCarthy et al., 2005) can be found in astro-ph. The DR3 publication (Abazajian et al., 2005) is available at AJ, 129, 1755. The official DR2 publication (Abazajian et al., 2004) can be found in preprint form at: astro-ph/0403325. A lot of useful information is also in the DR1 and EDR publications at: AJ, 126, 2081 and AJ, 123, 485, respectively.

The data access tool you need

The Data Archive Server

There are several ways to access DR4 data. If you know what spectra/images you want already, you can use the SDSS DAS (Data Archive Server). The standard things to get there are the reduced images, or fpC (corrected frames) files, the postage-stamp object images, or fpAtlas files, and the reduced spectra, or spSpec files. (If you want to do photometry on the fpC files yourself, you can find the details here.) All imaging files are indexed by some set of 6 parameters:

Everything on the DAS is organized at least by field; the returned FITS files therefore have several entries, one for each object (indexed by id) in that field. The SDSS imaging filters are discussed on the Imager information page on the DR4 WWW site.
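To make the indexing concrete, here is a minimal sketch of turning a set of imaging indices into a corrected-frame file name. It assumes the indices are named run, camcol (camera column, 1 to 6), field, and filter (one of u, g, r, i, z), and it assumes the file name pattern fpC-RRRRRR-FC-NNNN.fit; the helper name fpc_filename is hypothetical, and the pattern should be checked against the DAS documentation before you rely on it.

```python
FILTERS = ("u", "g", "r", "i", "z")  # the five SDSS imaging filters


def fpc_filename(run, camcol, field, filt):
    """Build an fpC file name from imaging indices (assumed naming pattern)."""
    if filt not in FILTERS:
        raise ValueError("filter must be one of %s" % ", ".join(FILTERS))
    if not 1 <= camcol <= 6:
        raise ValueError("camcol must be between 1 and 6")
    # Assumed layout: 6-digit run, filter letter plus camcol, 4-digit field.
    return "fpC-%06d-%s%d-%04d.fit" % (run, filt, camcol, field)


if __name__ == "__main__":
    print(fpc_filename(752, 1, 100, "r"))
```

The values passed in the usage line are placeholders, not a recommendation of a particular field.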

Similarly, SDSS spectra are indexed by 4 parameters: plate, MJD, fiber, and spRerun.

For DR4, spRerun is always 23, since only one reduction is used.
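The spectroscopic indices map onto file names in much the same way. The sketch below assumes the indices are plate, MJD, fiber, and spRerun, that the file name follows the pattern spSpec-MJD-PLATE-FIBER.fit, and that files sit in a directory layout of the form 1d_SPRERUN/PLATE/1d/. The helper name spspec_path is hypothetical and the layout is an assumption to verify on the DAS itself.

```python
def spspec_path(plate, mjd, fiber, sp_rerun=23):
    """Build a relative path to an spSpec file (assumed naming and layout)."""
    # Each plate carries 640 spectra, so fibers run from 1 to 640.
    if not 1 <= fiber <= 640:
        raise ValueError("fiber must be between 1 and 640")
    # Assumed layout: 5-digit MJD, 4-digit plate, 3-digit fiber.
    name = "spSpec-%05d-%04d-%03d.fit" % (mjd, plate, fiber)
    return "1d_%d/%04d/1d/%s" % (sp_rerun, plate, name)


if __name__ == "__main__":
    print(spspec_path(266, 51630, 1))
```

The default of 23 for sp_rerun reflects the single DR4 reduction mentioned above; the plate and MJD in the usage line are placeholders.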

You can also get individual object images (or see if a given RA/Dec has been released in DR4) and finder charts based on object coordinates. (One of the DR1 finder chart tools has gone away; this is the only available tool, now.)

The Imaging and Spectroscopic Query Servers

The DAS is only good if you know what images or spectra you want, but since DR4 contains so many objects and spectra, you will probably need to sort things to get only the relevant objects for your project. There are several ways to do this (the most versatile is described later), but one is through the Imaging and Spectroscopic Query Servers, the IQS and SQS. These tools let you enter position, SDSS magnitude, and QA-flag constraints and can return a variety of photometric and spectroscopic outputs. If you choose the "minimal" set of parameters to be returned, you will get the required magic parameters mentioned above to retrieve your objects' images or spectra from the DAS.

A note here on SDSS "sky versions" is probably warranted. The photometric catalogs always have at least two versions. One is the "target" version, which is the observation and the reduction that were used to produce the spectroscopic tiling information and assign fibers to objects for observation. This sky version is useful if you want to analyze why an object was targeted and/or you are investigating completeness of a given sample. Otherwise, you will most often want the "best" version, which may be the same observation as the "target" version or a later one, but is reduced with the latest, best version of the reduction pipeline.

The Tutorials Section of the SDSS DR4 WWW site has some nice examples to help you work through common tasks with the IQS, SQS, and DAS.

The Catalog Archive Server

There is another database of SDSS DR4 data called the skyserver. (Actually, unlike in DR1, the DR4 versions of the IQS and SQS referenced above are now part of the skyserver instead of being separate products and databases.) It contains many other ways of accessing the data, but basically allows you to enter SQL queries to find the information you need from the database. Typically, these queries return two types of parameters (but of course you are not limited to these): reduced photometric or spectroscopic quantities, or coordinates and the indexing parameters needed to retrieve images and spectra from the DAS. These tools are located in the CAS (catalog archive server) section of the skyserver WWW site.

You can enter your SQL queries directly from a WWW page, via a downloaded Java applet called sdssQA, via a custom emacs interface, or via a custom python interface. For beginners, I recommend either the WWW page or sdssQA. The sdssQA product contains lots of sample queries which can be used to help learn the SQL syntax and the SDSS schema via any of these interfaces.

The online skyserver schema browser will allow you to figure out what quantities you need to query on or return to get the information you need. There are also two crossID tools. The imaging tool is improved over the DR2 version in that it no longer limits you to a fixed set of possible return parameters for the catalog objects matching a set of user-entered coordinates. You can now also cross-reference via run, rerun, etc. and write arbitrary SQL code for your return information. The spectroscopic tool allows you to enter free-form SQL code and return information for all objects indexed by user-entered plate, MJD, and fiber.

If you do not know SQL, but need more sophisticated data-sorting tools than the IQS and SQS provide, you will simply need to learn SQL. For most tasks, however, it is fairly straightforward to form a satisfactory SQL query. The examples that come with sdssQA, even if you don't use sdssQA to input them, are a great place to start. There is also a good set of sample SQL queries on the Introduction to SQL page.
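To give a flavor of what such a query looks like, here is a small sketch that only builds the SQL text for a positional box search; you would paste the result into the WWW form or sdssQA. The table name SpecObj and the column names plate, mjd, fiberID, ra, dec, and z are assumptions to confirm in the schema browser, and the helper name spectra_in_box is made up for this example.

```python
def spectra_in_box(ra_min, ra_max, dec_min, dec_max, max_rows=100):
    """Return SQL text selecting spectra inside an RA/Dec box (degrees)."""
    if ra_min > ra_max or dec_min > dec_max:
        raise ValueError("minimum must not exceed maximum")
    # The selected columns are the indices needed to fetch spectra from the
    # DAS, plus position and redshift. Names are assumptions; check the schema.
    return (
        "SELECT TOP {n} plate, mjd, fiberID, ra, dec, z\n"
        "FROM SpecObj\n"
        "WHERE ra BETWEEN {ra0:.5f} AND {ra1:.5f}\n"
        "  AND dec BETWEEN {dec0:.5f} AND {dec1:.5f}"
    ).format(n=int(max_rows), ra0=ra_min, ra1=ra_max,
             dec0=dec_min, dec1=dec_max)


if __name__ == "__main__":
    print(spectra_in_box(180.0, 181.0, -1.0, 1.0, max_rows=10))
```

Limiting the number of rows with TOP is a sensible habit while you are still experimenting, since it keeps a mistyped constraint from returning an enormous result.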


Last update: 24Oct05.

Questions/Comments
