spacekit.skopes.hst.svm.prep

Spacekit HST Single Visit Mosaic Data/Image Preprocessing

Step 1: SCRAPE JSON FILES and make dataframe

Step 2: Scrape FITS Headers and SCRUB DATAFRAME

Step 3: DRAW Mosaic images

Examples:

df = run_preprocessing("home/singlevisits")

df = run_preprocessing("home/syntheticdata", fname="synth2", crpt=1, draw=0)

spacekit.skopes.hst.svm.prep.run_preprocessing(input_path, h5=None, fname='svm_data', output_path=None, json_pattern='*_total*_svm_*.json', visit=None, crpt=0, draw=1, subset_name=None)

Scrapes SVM data from raw files, preprocesses the dataframe for the MLP classifier, and generates PNG images for the image CNN.

TODO: if no JSON files are found, look for a results_*.csv file instead and preprocess via an alternative method.

Parameters:
  • input_path (str) – path to SVM dataset directory

  • h5 (str, optional) – load from existing hdf5 file, by default None

  • fname (str, optional) – base filename to give the output files, by default “svm_data”

  • output_path (str, optional) – where to save output files, by default None (the current working directory)

  • json_pattern (str, optional) – glob-based search pattern, by default “*_total*_svm_*.json”

  • visit (str, optional) – single visit name (e.g. “id8f34”) matching subdirectory of input_path; will search and preprocess this visit only (rather than all visits contained in the input_path), by default None

  • crpt (int, optional) – set to 1 if using synthetic corruption data, by default 0

  • draw (int, optional) – generate png images from dataset, by default 1
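To make the default json_pattern concrete, the sketch below checks a few filenames against it using the standard library's fnmatch module, which applies the same shell-style wildcard rules as glob. The filenames are hypothetical and chosen only for illustration; this is not necessarily how spacekit performs the search internally.

```python
from fnmatch import fnmatchcase

# Default value of json_pattern, taken from the function signature.
JSON_PATTERN = "*_total*_svm_*.json"

# Hypothetical filenames, for illustration only.
candidates = [
    "hst_12345_01_wfc3_uvis_total_id8f34_svm_point.json",  # has "_total" and "_svm_"
    "hst_12345_01_wfc3_uvis_f606w_id8f34_svm_point.json",  # no "_total"
    "hst_12345_01_wfc3_uvis_total_id8f34_svm_point.txt",   # wrong extension
]

# Keep only the names that satisfy the glob-style pattern (case-sensitive).
matches = [name for name in candidates if fnmatchcase(name, JSON_PATTERN)]
print(matches)
```

Only the first filename matches, because it is the only one that contains both "_total" and "_svm_" and ends in ".json".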

Returns:

preprocessed Pandas dataframe

Return type:

pandas.DataFrame
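As a minimal sketch of how the visit, output_path and draw parameters might be combined, the helper below preprocesses a single visit and skips image generation. The helper name preprocess_one_visit, the "svm_output" directory and the "svm_<visit>" filename scheme are assumptions made for illustration; only the run_preprocessing keyword arguments come from the signature above.

```python
from pathlib import Path


def preprocess_one_visit(input_path, visit, output_path="svm_output"):
    """Hypothetical helper: preprocess one visit without drawing images."""
    # Imported here so the sketch can be loaded even if spacekit is not installed.
    from spacekit.skopes.hst.svm.prep import run_preprocessing

    # Assumption: creating the output directory up front is harmless.
    Path(output_path).mkdir(parents=True, exist_ok=True)

    return run_preprocessing(
        input_path,
        fname=f"svm_{visit}",     # hypothetical naming scheme for the output files
        output_path=output_path,
        visit=visit,              # restrict the search to this visit's subdirectory
        draw=0,                   # skip PNG generation
    )
```

A call such as df = preprocess_one_visit("home/singlevisits", "id8f34") would then be expected to return the preprocessed dataframe for that visit only.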