Mapping Versions to Data Releases
Paths to data products often contain a version number either in the directory path or filename itself. These
version numbers may be tagged pipeline reduction versions, or software post-processing versions. Eventually these
files are frozen and released in a public Data Release (DR) or in a survey-specific internal product launch, e.g.
IPLs or MaNGA’s MPLs. It may be necessary to map these specific version numbers to release numbers so that tools
like sdss_access can understand where to look for files given a specific overall data release. sdss_brain
provides a method for doing so.
Version Metadata
The metadata describing the various software version names and tags associated with each SDSS data release is
handled by the SDSS datamodel product. sdss_brain contains convenience methods for accessing version
metadata either from a local copy of the datamodel product, or remotely via the SDSS valis API. To
retrieve all the version metadata, use the get_versions function.
>>> from sdss_brain.datamodel import get_versions
>>> vers = get_versions()
>>> vers
{'IPL1': {'apred_vers': '1.0', 'v_astra': '0.2.6', 'run2d': 'v6_0_9', 'run1d': 'v6_0_9'},
 'DR18': {'run2d': 'v6_0_4', 'run1d': 'v6_0_4', 'v_speccomp': 'v1.4.3', 'v_targ': '1.0.1'},
 'DR17': {'run2d': 'v5_13_2', 'apred_vers': 'dr17', 'apstar_vers': 'stars', 'aspcap_vers': 'synspec_rev1',
          'results_vers': 'synspec_rev1', 'run1d': 'v5_13_2', 'drpver': 'v3_1_1', 'dapver': '3.1.0'},
 ...}
Accessing a Mapping
To access a mapping you can use the get_mapped_version helper function.
It accepts as input the version reference name and a release to look up. If no version name is specified,
it returns the entire mapping for the given release.
>>> # access the entire version mapping for DR16
>>> from sdss_brain.datamodel import get_mapped_version
>>> get_mapped_version(release='DR16')
{'run2d': 'v5_13_0', 'apred_vers': 'r12', 'apstar_vers': 'stars', 'aspcap_vers': 'l33',
'results_vers': 'l33', 'run1d': 'v5_13_0', 'drpver': 'v2_4_3', 'dapver': '2.2.1'}
Specify a version reference name to extract only that individual item from the dictionary.
>>> # access only the manga DRP version mapping for DR16
>>> from sdss_brain.datamodel import get_mapped_version
>>> get_mapped_version('drpver', release='DR16')
'v2_4_3'
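The lookup behaviour described above can be mimicked with a plain dictionary. The following is a minimal sketch, not the sdss_brain implementation: VERSIONS is a hand-copied subset of the outputs shown above, and lookup_version is a hypothetical stand-in for get_mapped_version. Returning None for an unknown release or name is an assumption made for this sketch only.

```python
from typing import Optional, Union

# Hand-copied subset of the version metadata shown above; illustration only.
VERSIONS = {
    'DR17': {'run2d': 'v5_13_2', 'run1d': 'v5_13_2', 'drpver': 'v3_1_1', 'dapver': '3.1.0'},
    'DR16': {'run2d': 'v5_13_0', 'run1d': 'v5_13_0', 'drpver': 'v2_4_3', 'dapver': '2.2.1'},
}


def lookup_version(name: Optional[str] = None, release: str = 'DR17') -> Union[dict, str, None]:
    """Hypothetical stand-in for get_mapped_version (not the library's code)."""
    # Assumption: an unknown release yields an empty mapping rather than an error.
    mapping = VERSIONS.get(release, {})
    if name is None:
        # No version name given: return a copy of the whole mapping for the release.
        return dict(mapping)
    # Assumption: an unknown version name yields None.
    return mapping.get(name)


print(lookup_version(release='DR16'))
print(lookup_version('drpver', release='DR16'))
```

Keeping the per-release mapping as a dictionary makes both access patterns (entire mapping or single item) a one-line lookup.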
Using a Mapping
As mentioned, the utility of this is to extract version numbers needed by sdss_access given a single release.
Consider a defined class for MaNGA datacubes:
import re

from sdss_brain.core import Brain
from sdssdb.sqlalchemy.mangadb import database
from sdss_brain.datamodel import get_mapped_version


class Cube(Brain):
    _db = database
    path_name = 'mangacube'

    def _parse_input(self, value):
        plateifu_pattern = re.compile(r'([0-9]{4,5})-([0-9]{4,9})')
        plateifu_match = re.match(plateifu_pattern, value)
        data = {'filename': None, 'objectid': None}
        # match on plate-ifu or else assume a filename
        if plateifu_match is not None:
            data['objectid'] = value
            # extract and set additional parameters
            self.plateifu = plateifu_match.group(0)
            self.plate, self.ifu = plateifu_match.groups()
        else:
            data['filename'] = value
        return data

    def _set_access_path_params(self):
        # look up the DRP version tag that corresponds to this object's release
        drpver = get_mapped_version("drpver", release=self.release)
        self.path_params = {'plate': self.plate, 'ifu': self.ifu, 'drpver': drpver}
Inside the _set_access_path_params method we use get_mapped_version to access the
DRP version for the given release of the Cube. Now as we load different cubes of different
releases, the correct versions and paths are updated.
>>> # load a cube for DR16
>>> cube = Cube('8485-1901', release='DR16')
>>> cube.get_full_path()
'/Users/Brian/Work/sdss/sas/dr16/manga/spectro/redux/v2_4_3/8485/stack/manga-8485-1901-LOGCUBE.fits.gz'
>>> # load a cube for DR13
>>> cube = Cube('8485-1901', release='DR13')
>>> cube.get_full_path()
'/Users/Brian/Work/sdss/sas/dr13/manga/spectro/redux/v1_5_4/8485/stack/manga-8485-1901-LOGCUBE.fits.gz'
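To see why the mapped version changes the resolved path, the substitution step can be sketched with ordinary string formatting. This is an illustration under stated assumptions, not how sdss_access is implemented: TEMPLATE is a hypothetical template loosely modelled on the paths printed above, build_path is a made-up helper, and the DRPVER values are the ones visible in the two example outputs.

```python
# Hypothetical template, modelled on the example paths above; the real
# template is defined by sdss_access, not here.
TEMPLATE = ('{release}/manga/spectro/redux/{drpver}/{plate}/stack/'
            'manga-{plate}-{ifu}-LOGCUBE.fits.gz')

# DRP versions per release, taken from the example outputs above.
DRPVER = {'DR16': 'v2_4_3', 'DR13': 'v1_5_4'}


def build_path(plateifu: str, release: str) -> str:
    """Fill the hypothetical template for a plate-ifu and a release."""
    plate, ifu = plateifu.split('-')
    return TEMPLATE.format(release=release.lower(), drpver=DRPVER[release],
                           plate=plate, ifu=ifu)


for release in ('DR16', 'DR13'):
    print(build_path('8485-1901', release))
```

Only the release changes between the two calls; the version segment of the path follows from the mapping.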
Version Name Differences
Sometimes the version name specified in an sdss_access path template and in a datamodel can be different.
This can happen due to the longevity of SDSS, a lack of standards around naming conventions, and multiple
people contributing to the same project/code. For example, a version name could be referenced as
drpver in the datamodel, but as drp_ver or ver_drp or drpvers in various sdss_access
path templates describing data products. A real example is the APOGEE pipeline version, which is
often referenced as apred in older path templates, but as apred_vers in the datamodel and in
newer path templates.
To accommodate these differences, aliases can be defined using the version_aliases parameter in the
sdss_brain.yml configuration file, which is a dictionary of parameters read in by sdss_brain. Each
key in version_aliases is a version alias, and its value is the true datamodel version name. Using
the above examples, the entry in the config file would look like this:
version_aliases:
  drp_ver: drpver
  ver_drp: drpver
  drpvers: drpver
  apred: apred_vers
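One way to picture how such aliases might be applied is sketched here. This is an assumption about the general idea, not sdss_brain's actual code: resolve_version_name and fill_path_versions are hypothetical helpers, VERSION_ALIASES mirrors the configuration entry above, and the release values are copied from the DR16 mapping shown earlier.

```python
# Aliases copied from the configuration example above.
VERSION_ALIASES = {
    'drp_ver': 'drpver',
    'ver_drp': 'drpver',
    'drpvers': 'drpver',
    'apred': 'apred_vers',
}

# Subset of the DR16 mapping shown earlier on this page.
DR16_VERSIONS = {'drpver': 'v2_4_3', 'apred_vers': 'r12'}


def resolve_version_name(name, aliases=VERSION_ALIASES):
    """Return the datamodel version name for an alias (hypothetical helper)."""
    # Names that are not aliases are assumed to already be datamodel names.
    return aliases.get(name, name)


def fill_path_versions(template_keys, release_versions, aliases=VERSION_ALIASES):
    """Map each path-template key to a version value, going through aliases."""
    filled = {}
    for key in template_keys:
        true_name = resolve_version_name(key, aliases)
        if true_name in release_versions:
            # Keep the template's own key so the path template can consume it.
            filled[key] = release_versions[true_name]
    return filled


print(fill_path_versions(['drp_ver', 'apred'], DR16_VERSIONS))
```

The result is keyed by the names the path template uses, while the values come from the datamodel names.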