spacekit.extractor.radio

Querying and downloading .fits files from a MAST s3 bucket on AWS. Unlike spacekit.extractor.scrape, which can access data in private s3 buckets, this module is specifically for collecting data from the publicly available MAST website and/or MAST data hosted on s3. Instead of scraping a closed collection, you’re receiving data from an open channel - like a radio.

class spacekit.extractor.radio.Radio(config='disable', name='Radio', **log_kws)[source]

Bases: object

Class for querying and downloading .fits files from a MAST s3 bucket on AWS.

TODO: overhaul for multi-mission (HST, JWST)

TODO: generalize mast_download() for other missions and options (put mission-specific methods into subclasses)

TODO: change config attr to cloud

Instantiates a spacekit.extractor.Radio object.

Parameters:

config (str, optional) – “enable” or “disable” AWS cloud access (“disable” uses MAST only), by default “disable”

Sets parameters for a cone search as object attributes: radius, collection, exptime, subgroup.

Parameters:
  • radius (string) – radius for the cone search e.g. 0s

  • collection (string) – observatory collection name e.g. “K2”

  • exptime (float) – exposure time e.g. 1800.0

  • subgroup (list) – data file type, e.g. [“LLC”]

Returns:

class object with attributes updated

Return type:

self

configure_aws()[source]

Sets cloud (AWS) configuration On or Off.

get_object_uris()[source]

Run observation query via cone search and return list of product uris.

Returns:

class object with attributes updated

Return type:

self

mast_download()[source]

Download datasets from MAST

Sets parameters for prop search as object attributes: proposal ID, filters, obsid and subgroup.

Parameters:
  • proposal_id (string) – match proposal id, e.g. ‘13926’

  • filters (string) – match filters, e.g. ‘F657N’

  • obsid (string) – match obsid or regex pattern, e.g. ‘ICK90[5678]*’

  • subgroup (list) – data file types, e.g. [‘FLC’, ‘SPT’]

Returns:

class object with attributes updated

Return type:

self
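The obsid example ‘ICK90[5678]*’ reads differently depending on whether it is treated as a shell-style wildcard or as a regular expression, and this page does not say which interpretation mast_download() applies. The sketch below uses only the Python standard library (not spacekit or astroquery) and made-up observation IDs to show the difference, so a pattern can be checked locally before relying on it.

```python
import fnmatch
import re

pattern = "ICK90[5678]*"
# Hypothetical observation IDs, used only for illustration.
obs_ids = ["ICK905010", "ICK908ABQ", "ICK901010", "IDXY05010"]

# Glob reading: "ICK90", then one character from 5-8, then anything.
glob_hits = [o for o in obs_ids if fnmatch.fnmatchcase(o, pattern)]

# Regex reading with re.match: "[5678]*" may match zero characters,
# so every ID that merely starts with "ICK90" is accepted.
regex_hits = [o for o in obs_ids if re.match(pattern, o)]

# A regex that is equivalent to the glob reading.
strict = re.compile(r"ICK90[5678].*")
strict_hits = [o for o in obs_ids if strict.fullmatch(o)]

print(glob_hits)
print(regex_hits)
print(strict_hits)
```

Under the glob reading only IDs whose sixth character is 5, 6, 7 or 8 are kept, while the unanchored regex reading also lets "ICK901010" through.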

s3_download()[source]

Download datasets in list of uris from AWS s3 bucket (public access via STScI)

Returns:

class object with attributes updated

Return type:

self

search_by_radec(data, propid='proposal_id', ra='ra_targ', dec='dec_targ', datacol='target_classification')[source]

Scrapes MAST for remaining target classifications that could not be identified using target name. This method instead uses a broader set of query parameters: the ra_targ and dec_targ coordinates along with the dataset’s proposal ID. If multiple datasets are found to match, the first of these containing a target_classification value will be used.

Returns:

secondary set of remaining key-value pairs (target names and scraped categories)

Return type:

dict
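The tie-breaking rule described for search_by_radec (use the first matching dataset that carries a target_classification value) can be sketched as follows. This is a minimal illustration, not the library's code: the helper name first_classification is hypothetical, and plain dicts stand in for the rows that a MAST query would return.

```python
def first_classification(matches, datacol="target_classification"):
    """Return the first non-blank value of `datacol` among `matches`, or None."""
    for row in matches:
        value = row.get(datacol)
        if value is not None and str(value).strip():
            return value
    return None


# Hypothetical results for one (ra_targ, dec_targ, proposal ID) search.
matches = [
    {"obs_id": "a", "target_classification": ""},
    {"obs_id": "b", "target_classification": None},
    {"obs_id": "c", "target_classification": "GALAXY"},
    {"obs_id": "d", "target_classification": "STAR"},
]

print(first_classification(matches))
```

Blank strings and missing values are skipped, so the later "STAR" row is never consulted once "GALAXY" has been found.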

search_by_targname(targets, datacol='target_classification')[source]

Scrapes the “target_classification” for each observation (dataframe rows) from MAST using astroquery and the target name. For observations where the target classification is not found (or is blank), the search_by_radec method will be called using a broader set of search parameters (ra_targ and dec_targ).

Returns:

target name and category key-value pairs

Return type:

dict
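The two-pass flow described here (look up by target name first, then fall back to a broader coordinate-based search for whatever is still unclassified) can be sketched without any network access. Everything in this block is an assumption made for illustration: classify_targets is a hypothetical helper, and the two lookup callables stand in for the actual MAST queries.

```python
def classify_targets(targets, lookup_by_name, lookup_by_radec):
    """Split targets into those classified by name and those needing a fallback."""
    found, remaining = {}, []
    for name in targets:
        category = lookup_by_name(name)
        if category:  # a blank or missing value counts as "not found"
            found[name] = category
        else:
            remaining.append(name)
    # Second pass: only the leftovers go through the broader search.
    other = {name: lookup_by_radec(name) for name in remaining}
    return found, other


# Stand-in lookup tables; real lookups would query MAST.
by_name = {"NGC-1234": "GALAXY", "ANY": ""}
by_radec = {"ANY": "STAR"}

target_categories, other_cat = classify_targets(
    ["NGC-1234", "ANY"], by_name.get, by_radec.get
)
print(target_categories, other_cat)
```

Keeping the two result dicts separate mirrors the page's description of a primary set of key-value pairs and a secondary set for the remaining targets.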

search_targets_by_obs_id(obs_id, prop_id)[source]

set_product_params(obs, obsid=None)[source]

Keyword arguments can be any valid MAST data product params, e.g. obs_collection, t_exptime, target_classification.

set_query_params(**kwargs)[source]

Keyword arguments can be any valid MAST search params, e.g. proposal_id, filters, obsid, target, radius, s_ra, s_dec.
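One common way to handle open-ended keyword arguments like these is to keep only the parameters that were actually supplied before passing them on to a query. The sketch below illustrates that idea only. build_query_params is a hypothetical helper, not part of spacekit, and how set_query_params() stores or forwards its arguments is not described on this page.

```python
def build_query_params(**kwargs):
    """Keep only the search parameters that were actually supplied."""
    return {key: value for key, value in kwargs.items() if value is not None}


# Example values taken from the parameter descriptions on this page.
params = build_query_params(
    proposal_id="13926",
    filters="F657N",
    obsid=None,  # not supplied, so it is dropped
    radius="0s",
)
print(params)
```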

class spacekit.extractor.radio.HstSvmRadio(df, trg_col='targname', ra_col='ra_targ', dec_col='dec_targ', **log_kws)[source]

Class for scraping metadata from MAST (Mikulski Archive for Space Telescopes) via astroquery. Current functionality for this class is limited to extracting the target_classification values of HAP targets from the archive. An example of a target classification is “GALAXY” - an alphanumeric categorization of an image product/.fits file. Note that the files themselves are not downloaded, only this specific metadata listed in the online archive database. For downloading MAST science files, use the spacekit.extractor.radio module. The search parameter values needed for locating a HAP product on MAST can be extracted from the fits science extension headers using the astropy library. See the spacekit.preprocessor.scrub api for an example (or the astropy documentation).

Instantiates a spacekit.extractor.radio.HstSvmRadio object.

Parameters:
  • df (dataframe) – dataset containing the requisite search parameter values (kwargs for this class)

  • trg_col (str, optional) – name of the column containing the image target names, by default “targname”

  • ra_col (str, optional) – name of the column containing the target’s right ascension values, by default “ra_targ”

  • dec_col (str, optional) – name of the column containing the target’s declination values, by default “dec_targ”

combine_categories()[source]

Combines the two dictionaries (target_categories and other_cat) and inserts back into the original dataframe as a new column named category.

Returns:

copy of original dataset with new “category” column data appended

Return type:

dataframe
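The merge step described for combine_categories can be illustrated with plain Python containers. This is a sketch under stated assumptions rather than the method's implementation: a list of dicts stands in for the dataframe, both dictionaries are assumed to be keyed by target name, and the sample target names are invented.

```python
# Results of the name-based pass and the coordinate-based fallback pass.
target_categories = {"NGC-1234": "GALAXY"}
other_cat = {"ANY": "STAR"}

# Stand-in for the original dataset, one record per observation.
rows = [
    {"targname": "NGC-1234"},
    {"targname": "ANY"},
    {"targname": "UNKNOWN-1"},
]

# Merge the two dictionaries into one lookup table.
combined = {**target_categories, **other_cat}

# Build new records so the original rows are left untouched.
labeled = [{**row, "category": combined.get(row["targname"])} for row in rows]

for record in labeled:
    print(record)
```

Targets that appear in neither dictionary end up with a category of None here. How the actual method treats such rows is not stated on this page.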

scrape_mast()[source]

Main calling function to scrape MAST

Returns:

updated dataset with target classification categorical data added for each observation.

Return type:

dataframe