spacekit.extractor.radio
Querying and downloading .fits files from a MAST s3 bucket on AWS. Unlike spacekit.extractor.scrape, which can access data in private s3 buckets, this module is specifically for collecting data from the publicly available MAST website and/or MAST data hosted on s3. Instead of scraping a closed collection, you’re receiving data from an open channel - like a radio.
- class spacekit.extractor.radio.Radio(config='disable', name='Radio', **log_kws)[source]
Bases: object
Class for querying and downloading .fits files from a MAST s3 bucket on AWS.
TODO: overhaul for multi-mission (HST, JWST)
TODO: generalize mast_download() for other missions and options (put mission specific methods into subclasses)
TODO: change config attr to cloud
Instantiates a spacekit.extractor.Radio object.
- Parameters:
config (str, optional) – enable or disable aws cloud access (disable uses MAST only), by default “disable”
- cone_search(radius, collection, exptime, subgroup)[source]
Sets parameters for a cone search as object attributes: radius, collection, exptime, subgroup.
- get_object_uris()[source]
Run observation query via cone search and return list of product uris.
- Returns:
class object with attributes updated
- Return type:
self
- prop_search(proposal_id, filters, obsid, subgroup)[source]
Sets parameters for prop search as object attributes: proposal ID, filters, obsid and subgroup.
- Parameters:
proposal_id (string) – match proposal id, e.g. ‘13926’
filters (string) – match filters, e.g. ‘F657N’
obsid (string) – match obsid or regex pattern, e.g. ‘ICK90[5678]*’
subgroup (list) – data file types, e.g. [‘FLC’, ‘SPT’]
- Returns:
class object with attributes updated
- Return type:
self
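The obsid parameter is described as accepting an "obsid or regex pattern", but this section does not say how a pattern such as ‘ICK90[5678]*’ is evaluated. The sketch below uses only the Python standard library and a list of made-up observation IDs (assumptions for illustration, not real query results) to show that the same string selects different IDs when read as a regular expression and when read as a shell-style wildcard, so it may be worth checking the returned products against what you expect.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical observation IDs, invented for this illustration only.
obsids = ["ICK905010", "ICK906020", "ICK909030", "ICK805010"]

pattern = "ICK90[5678]*"

# Read as a regular expression: "[5678]*" means zero or more of those digits,
# so any ID that starts with "ICK90" matches.
regex_hits = [o for o in obsids if re.match(pattern, o)]

# Read as a shell-style wildcard: "[5678]" requires exactly one of those digits,
# and "*" then matches whatever follows.
glob_hits = [o for o in obsids if fnmatchcase(o, pattern)]

print(regex_hits)
print(glob_hits)
```

Under the regex reading, "ICK909030" is selected because the character class may match zero times; under the wildcard reading it is excluded.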
- s3_download()[source]
Download datasets in list of uris from AWS s3 bucket (public access via STScI)
- Returns:
class object with attributes updated
- Return type:
self
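The methods documented above suggest a three-step workflow: set the search parameters, resolve them to product URIs, then download. The following is a minimal sketch of that sequence, not a verified recipe. The call order, the "enable" value for config, and the helper name download_proposal_products are assumptions drawn from the descriptions in this section; the example argument values are the ones given in the prop_search parameter list.

```python
def download_proposal_products(proposal_id, filters, obsid, subgroup):
    """Hypothetical helper: run a proposal search and download matches from s3."""
    # Imported lazily so the sketch can be loaded without spacekit installed.
    from spacekit.extractor.radio import Radio

    # Assumption: "enable" is the counterpart of the default "disable"
    # and turns on AWS cloud access.
    radio = Radio(config="enable")
    # Store the search parameters as attributes on the object.
    radio.prop_search(proposal_id, filters, obsid, subgroup)
    # Run the observation query and collect product URIs.
    radio.get_object_uris()
    # Download the datasets listed in those URIs from the public s3 bucket.
    radio.s3_download()
    return radio


# Example arguments taken from the prop_search parameter descriptions.
EXAMPLE_QUERY = {
    "proposal_id": "13926",
    "filters": "F657N",
    "obsid": "ICK90[5678]*",
    "subgroup": ["FLC", "SPT"],
}

# Usage (requires spacekit and network access):
# radio = download_proposal_products(**EXAMPLE_QUERY)
```

Each documented method returns the object itself with its attributes updated, so results would be read from the returned Radio instance rather than from a separate return value.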
- search_by_radec(data, propid='proposal_id', ra='ra_targ', dec='dec_targ', datacol='target_classification')[source]
Scrapes MAST for remaining target classifications that could not be identified using target name. This method instead uses a broader set of query parameters: the ra_targ and dec_targ coordinates along with the dataset’s proposal ID. If multiple datasets are found to match, the first of these containing a target_classification value will be used.
- Returns:
secondary set of remaining key-value pairs (target names and scraped categories)
- Return type:
- search_by_targname(targets, datacol='target_classification')[source]
Scrapes the “target_classification” for each observation (dataframe rows) from MAST using astroquery and the target name. For observations where the target classification is not found (or is blank), the scrape_other_targets method will be called using a broader set of search parameters (ra_targ and dec_targ).
- Returns:
target name and category key-value pairs
- Return type:
dictionary
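The two search methods describe a two-pass lookup: try the target name first, then fall back to coordinates plus proposal ID for anything still unclassified. The sketch below illustrates that control flow only. It does not call MAST or astroquery; the lookup functions, the classify_targets helper, and the sample rows are all stand-ins invented for this example.

```python
def classify_targets(rows, by_name, by_radec):
    """Return {target name: category} using a name lookup with a coordinate fallback."""
    categories = {}
    remaining = []
    # First pass: look up each target by name.
    for row in rows:
        category = by_name(row["targname"])
        if category:
            categories[row["targname"]] = category
        else:
            # Not found or blank: defer to the broader search.
            remaining.append(row)
    # Second pass: retry the leftovers using coordinates and proposal ID.
    for row in remaining:
        category = by_radec(row["ra_targ"], row["dec_targ"], row["proposal_id"])
        if category:
            categories[row["targname"]] = category
    return categories


# Stand-in lookup tables (made-up values, not archive data).
NAME_TABLE = {"NGC-1234": "GALAXY", "ANY": ""}
RADEC_TABLE = {(20.5, 30.25, "13926"): "STAR"}

rows = [
    {"targname": "NGC-1234", "ra_targ": 10.0, "dec_targ": -5.0, "proposal_id": "13926"},
    {"targname": "ANY", "ra_targ": 20.5, "dec_targ": 30.25, "proposal_id": "13926"},
    {"targname": "UNKNOWN-1", "ra_targ": 1.0, "dec_targ": 2.0, "proposal_id": "13926"},
]

result = classify_targets(
    rows,
    by_name=lambda name: NAME_TABLE.get(name, ""),
    by_radec=lambda ra, dec, pid: RADEC_TABLE.get((ra, dec, pid), ""),
)
print(result)
```

In this sketch a target that neither lookup can classify is simply left out of the result; how the library handles that case is not stated in this section.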
- class spacekit.extractor.radio.HstSvmRadio(df, trg_col='targname', ra_col='ra_targ', dec_col='dec_targ', **log_kws)[source]
Class for scraping metadata from MAST (Mikulski Archive for Space Telescopes) via astroquery. Current functionality for this class is limited to extracting the target_classification values of HAP targets from the archive. An example of a target classification is “GALAXY” - an alphanumeric categorization of an image product/.fits file. Note - the files themselves are not downloaded, just this specific metadata listed in the online archive database. For downloading MAST science files, use the spacekit.extractor.radio module. The search parameter values needed for locating a HAP product on MAST can be extracted from the fits science extension headers using the astropy library. See the spacekit.preprocessor.scrub api for an example (or the astropy documentation).
Instantiates a spacekit.extractor.radio.HstSvmRadio object.
- Parameters:
df (dataframe) – dataset containing the requisite search parameter values (kwargs for this class)
trg_col (str, optional) – name of the column containing the image target names, by default “targname”
ra_col (str, optional) – name of the column containing the target’s right ascension values, by default “ra_targ”
dec_col (str, optional) – name of the column containing the target’s declination values, by default “dec_targ”