lipd package¶
Module contents¶
-
lipd.
addEnsemble
(D, dsn, ensemble)¶ Create ensemble entry and then add it to the specified LiPD dataset.
Parameters: - D (dict) – LiPD data
- dsn (str) – Dataset name
- ensemble (list) – Nested numpy array of ensemble column data.
Return dict D: LiPD data
-
lipd.
collapseTs
(ts=None)¶ Collapse a time series back into LiPD record form.
Example
1. D = lipd.readLipd()
2. ts = lipd.extractTs(D)
3. New_D = lipd.collapseTs(ts)
Parameters: ts (list) – Time series
Return dict: Metadata
-
lipd.
doi
()¶ Update publication information using data DOIs. Updates LiPD files on disk, not in memory.
Example
1: lipd.readLipd()
2: lipd.doi()
Return none:
-
lipd.
ensToDf
(ensemble)¶ Create an ensemble data frame from some given nested numpy arrays
Parameters: ensemble (list) – Ensemble data Return obj df: Pandas dataframe
-
lipd.
excel
()¶ Convert Excel files to LiPD files. LiPD data is returned directly from this function.
Example
1: lipd.readExcel()
2: D = lipd.excel()
Return dict _d: Metadata
-
lipd.
extractTs
(d, chron=False)¶ Create a time series using LiPD data (uses paleoData by default)
Example: paleoData
1. D = lipd.readLipd()
2. ts = lipd.extractTs(D)
Example: chronData
1. D = lipd.readLipd()
2. ts = lipd.extractTs(D, chron=True)
Parameters: - d (dict) – Metadata
- chron (bool) – Create a chronData time series
Return list l: Time series
-
lipd.
filterTs
(ts, expression)¶ Create a new time series that only contains entries that match the given expression.
Example:
D = lipd.readLipd()
ts = lipd.extractTs(D)
new_ts = lipd.filterTs(ts, "archiveType == marine sediment")
new_ts = lipd.filterTs(ts, "paleoData_variableName == sst")
Parameters: - expression (str) – Expression
- ts (list) – Time series
Return list new_ts: Filtered time series that matches the expression
-
lipd.
getCsv
(L=None)¶ Get CSV from LiPD metadata
Example
c = lipd.getCsv(D["Africa-ColdAirCave.Sundqvist.2013"])
Parameters: L (dict) – One LiPD record
Return dict d: CSV data
-
lipd.
getLipdNames
(D=None)¶ Get a list of all LiPD names in the library
Example
names = lipd.getLipdNames(D)
Return list f_list: File list
-
lipd.
getMetadata
(L)¶ Get metadata from LiPD data in memory
Example
m = lipd.getMetadata(D["Africa-ColdAirCave.Sundqvist.2013"])
Parameters: L (dict) – One LiPD record
Return dict d: LiPD record (metadata only)
-
lipd.
noaa
(d=None)¶ Convert between NOAA and LiPD files
Example: LiPD to NOAA converter
1: D = lipd.readLipd()
2: lipd.noaa(D)
Example: NOAA to LiPD converter
1: lipd.readNoaa()
2: lipd.noaa()
Return none:
-
lipd.
queryTs
(ts, expression)¶ Find the indices of the time series entries that match the given expression.
Example:
D = lipd.readLipd()
ts = lipd.extractTs(D)
matches = lipd.queryTs(ts, "archiveType == marine sediment")
matches = lipd.queryTs(ts, "geo_meanElev <= 2000")
Parameters: - expression (str) – Expression
- ts (list) – Time series
Return list _idx: Indices of entries that match the criteria
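The expression matching described for filterTs and queryTs can be pictured with a small stand-alone sketch. It assumes that an expression has the form "key operator value" and that each time series entry is a flat dict; the names parse_expression, query_entries and filter_entries are hypothetical and are not part of the lipd package, whose own parsing rules may differ.

```python
import operator
import re

# Hypothetical operator table (an assumption, not taken from lipd).
_OPS = {"==": operator.eq, "!=": operator.ne, "<=": operator.le,
        ">=": operator.ge, "<": operator.lt, ">": operator.gt}


def parse_expression(expression):
    """Split 'key op value' into its three string parts."""
    m = re.match(r"^\s*(\w+)\s*(==|!=|<=|>=|<|>)\s*(.+?)\s*$", expression)
    if m is None:
        raise ValueError("Unrecognised expression: " + expression)
    return m.group(1), m.group(2), m.group(3)


def _coerce(x):
    """Compare numerically when possible, otherwise as strings."""
    try:
        return float(x)
    except (TypeError, ValueError):
        return str(x)


def query_entries(ts, expression):
    """Return the indices of the entries that satisfy the expression."""
    key, op, value = parse_expression(expression)
    target = _coerce(value)
    idxs = []
    for i, entry in enumerate(ts):
        if key not in entry:
            continue
        actual = _coerce(entry[key])
        # Only compare like with like (number vs number, string vs string).
        if type(actual) is type(target) and _OPS[op](actual, target):
            idxs.append(i)
    return idxs


def filter_entries(ts, expression):
    """Return the matching entries themselves rather than their indices."""
    return [ts[i] for i in query_entries(ts, expression)]
```

The only difference between the two helpers is the return value: indices in one case, the entries themselves in the other, which mirrors the queryTs / filterTs split documented above.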
-
lipd.
readAll
(usr_path='')¶ Read all approved file types at once. Enter a file path, directory path, or leave args blank to trigger gui.
Parameters: usr_path (str) – Path to file / directory (optional) Return str cwd: Current working directory
-
lipd.
readExcel
(usr_path='')¶ Read Excel file(s). Enter a file path, directory path, or leave args blank to trigger gui.
Parameters: usr_path (str) – Path to file / directory (optional) Return str cwd: Current working directory
-
lipd.
readLipd
(usr_path='')¶ Read LiPD file(s). Enter a file path, directory path, or leave args blank to trigger gui.
Parameters: usr_path (str) – Path to file / directory (optional) Return dict _d: Metadata
-
lipd.
readNoaa
(usr_path='')¶ Read NOAA file(s). Enter a file path, directory path, or leave args blank to trigger gui.
Parameters: usr_path (str) – Path to file / directory (optional) Return str cwd: Current working directory
-
lipd.
run
()¶ Initialize and start objects. This is called automatically when importing the package.
Return none:
-
lipd.
showDfs
(d)¶ Display the available data frame names in a given data frame collection
Parameters: d (dict) – Dataframe collection Return none:
-
lipd.
showLipds
(D=None)¶ Display the dataset names of a given LiPD data
Example
lipd.showLipds(D)
Parameters: D (dict) – LiPD data
Return none:
-
lipd.
showMetadata
(dat)¶ Display the metadata of the specified LiPD record, pretty-printed
Example
lipd.showMetadata(D["Africa-ColdAirCave.Sundqvist.2013"])
Parameters: dat (dict) – Metadata
Return none:
-
lipd.
tsToDf
(tso)¶ Create Pandas DataFrame from TimeSeries object. Usage: first call extractTs to get a time series, then pick one entry from that time series and pass it to this function.
Parameters: tso (dict) – Time series entry Return dict dfs: Pandas dataframes
-
lipd.
viewTs
(ts)¶ View the contents of one time series entry in a nicely formatted way
Example
1. D = lipd.readLipd()
2. ts = lipd.extractTs(D)
3. lipd.viewTs(ts[0])
Parameters: ts (dict) – One time series entry
Return none:
-
lipd.
writeLipd
(dat, usr_path='', filename='')¶ Write LiPD data to file(s)
Parameters: - dat (dict) – Metadata
- usr_path (str) – Destination (optional)
- filename (str) – LiPD filename, for writing one specific file (optional)
Return none:
Submodules¶
alternates¶
List of alternate and synonym keys
bag¶
-
lipd.bag.
create_bag
(dir_bag)¶ Create a Bag out of given files. :param str dir_bag: Directory that contains csv, jsonld, and changelog files. :return obj: Bag
-
lipd.bag.
finish_bag
(dir_bag)¶ Closing steps for creating a bag :param obj dir_bag: :return None:
-
lipd.bag.
open_bag
(dir_bag)¶ Open Bag at the given path :param str dir_bag: Path to Bag :return obj: Bag
-
lipd.bag.
resolved_flag
(bag)¶ Check DOI flag in bag.info to see if doi_resolver has been previously run :param obj bag: Bag :return bool: Flag
-
lipd.bag.
validate_md5
(bag)¶ Check if Bag is valid :param obj bag: Bag :return None:
blanks¶
List of empty and ignored keys
csvs¶
-
lipd.csvs.
get_csv_from_metadata
(name, metadata)¶ Two goals. Get all csv from metadata, and return new metadata with generated filenames to match files. :param str name: LiPD dataset name :param dict metadata: Metadata :return dict: Csv Data
-
lipd.csvs.
merge_csv_metadata
(d)¶ Using the given metadata dictionary, retrieve CSV data from CSV files, and insert the CSV values into their respective metadata columns. Checks for both paleoData and chronData tables. :param dict d: Metadata :return dict: Modified metadata dictionary
-
lipd.csvs.
read_csv_from_file
(filename)¶ Opens the target CSV file and creates a dictionary with one list for each CSV column. :param str filename: :return list of lists: column values
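As an illustration of the "one list per CSV column" layout, here is a minimal sketch built on the standard csv module. It assumes columns are keyed by their integer position, as the write_csv_to_file description suggests; the function name read_csv_columns is hypothetical and this is not the package's own implementation.

```python
import csv


def read_csv_columns(filename):
    """Read a CSV file into {column_index: [values...]} (values kept as strings)."""
    columns = {}
    with open(filename, newline="") as f:
        for row in csv.reader(f):
            for idx, cell in enumerate(row):
                # Create the column list on first sight, then append row by row.
                columns.setdefault(idx, []).append(cell)
    return columns
```

Values stay as strings here; casting to numbers would be a separate step (compare cast_values_csvs in the misc module).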
-
lipd.csvs.
write_csv_to_file
(d)¶ Writes columns of data to a target CSV file. :param dict d: A dictionary containing one list for every data column. Keys: int, Values: list :return None:
dataframes¶
-
lipd.dataframes.
create_dataframe
(ensemble)¶ Create a data frame from given nested lists of ensemble data :param list ensemble: Ensemble data :return obj: Dataframe
-
lipd.dataframes.
get_filtered_dfs
(lib, expr)¶ Main: Get all data frames that match the given expression :return dict: Filenames and data frames (filtered)
-
lipd.dataframes.
lipd_to_df
(metadata, csvs)¶ Create an organized collection of data frames from LiPD data :param dict metadata: LiPD data :param dict csvs: Csv data :return dict: One data frame per table, organized in a dictionary by name
-
lipd.dataframes.
ts_to_df
(metadata)¶ Create a data frame from one TimeSeries object :param dict metadata: Time Series dictionary :return dict: One data frame per table, organized in a dictionary by name
directory¶
-
lipd.directory.
browse_dialog_dir
()¶ Open up a GUI browse dialog window and let the user pick a target directory. :return str: Target directory path
-
lipd.directory.
browse_dialog_file
()¶ Open up a GUI browse dialog window and let the user select one or more files. :return str _path: Target directory path :return list _files: List of selected files
-
lipd.directory.
check_file_age
(filename, days)¶ Check if the target file has an older creation date than X amount of time. i.e. One day: 60*60*24 :param str filename: Target filename :param int days: Limit in number of days :return bool: True - older than X time, False - not older than X time
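The age check amounts to comparing the file's timestamp against a limit expressed in seconds (one day being 60*60*24). A minimal sketch, assuming the limit is given in days and that os.path.getctime is an acceptable stand-in for "creation date"; the name is_older_than is hypothetical.

```python
import os
import time

SECONDS_PER_DAY = 60 * 60 * 24


def is_older_than(filename, days):
    """Return True if the file's ctime is more than `days` days in the past."""
    # Note: getctime is the creation time on Windows, but the time of the
    # last metadata change on most Unix systems.
    age_seconds = time.time() - os.path.getctime(filename)
    return age_seconds > days * SECONDS_PER_DAY
```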
-
lipd.directory.
collect_metadata_file
(full_path)¶ Create the file metadata and add it to the appropriate section by file-type :param str full_path: :param dict existing_files: :return dict existing files:
-
lipd.directory.
collect_metadata_files
(cwd, new_files, existing_files)¶ Collect all files from a given path. Separate by file type, and return one list for each type If ‘files’ contains specific :param str cwd: Directory w/ target files :param list new_files: Specific new files to load :param dict existing_files: Files currently loaded, separated by type :return list: All files separated by type
-
lipd.directory.
create_tmp_dir
()¶ Creates tmp directory in OS temp space. :return str: Path to tmp directory
-
lipd.directory.
dir_cleanup
(dir_bag, dir_data)¶ Moves JSON and csv files to bag root, then deletes all the metadata bag files. We’ll be creating a new bag with the data files, so we don’t need the other text files and such. :param str dir_bag: Path to root of Bag :param str dir_data: Path to Bag /data subdirectory :return None:
-
lipd.directory.
filename_from_path
(path)¶ Extract the file name from a given file path. :param str path: File path :return str: File name with extension
-
lipd.directory.
find_files
()¶ Search for the directory containing jsonld and csv files. chdir and then quit. :return none:
-
lipd.directory.
get_filenames_generated
(d, name='', csvs='')¶ Get the filenames that the LiPD utilities has generated (per naming standard), as opposed to the filenames that originated in the LiPD file (that possibly don’t follow the naming standard) :param dict d: Data :param str name: LiPD dataset name to prefix :param list csvs: Filenames list to merge with :return list: Filenames
-
lipd.directory.
get_filenames_in_lipd
(path, name='')¶ List all the files contained in the LiPD archive. Bagit, JSON, and CSV :param str path: Directory to be listed :param str name: LiPD dataset name, if you want to prefix it to show file hierarchy :return list: Filenames found
-
lipd.directory.
get_src_or_dst
(mode, path_type)¶ User sets the path to a LiPD source location :param str mode: “read” or “write” mode :param str path_type: “directory” or “file” :return str path: dir path to files :return list files: files chosen
-
lipd.directory.
get_src_or_dst_path
(prompt, count)¶ Let the user choose a path, and store the value. :return str _path: Target directory :return str count: Counter for attempted prompts
-
lipd.directory.
get_src_or_dst_prompt
(mode)¶ String together the proper prompt based on the mode :param str mode: “read” or “write” :return str prompt: The prompt needed
-
lipd.directory.
list_files
(x, path='')¶ Lists file(s) in given path of the X type. :param str x: File extension that we are interested in. :param str path: Path, if user would like to check a specific directory outside of the CWD :return list of str: File name(s) to be worked on
-
lipd.directory.
rm_file_if_exists
(path, filename)¶ Remove a file if it exists. Useful for when we want to write a file, but it already exists in that location. :param str filename: Filename :param str path: Directory :return none:
-
lipd.directory.
rm_files_in_dir
(path)¶ Removes all files within a directory, but does not delete the directory :param str path: Target directory :return none:
doi_main¶
-
lipd.doi_main.
doi_main
(files)¶ Main function that controls the script. Take in directory containing the .lpd file(s). Loop for each file. :return None:
-
lipd.doi_main.
process_lpd
(name, dir_tmp)¶ Opens up json file, invokes doi_resolver, closes file, updates changelog, cleans directory, and makes new bag. :param str name: Name of current .lpd file :param str dir_tmp: Path to tmp directory :return none:
-
lipd.doi_main.
prompt_force
()¶ Ask the user if they want to force update files that were previously resolved :return bool: response
doi_resolver¶
-
class
lipd.doi_resolver.
DOIResolver
(dir_root, name, root_dict)¶ Bases:
object
Use DOI id(s) to pull updated publication info from doi.org and overwrite file data.
Input: Original publication dictionary Output: Updated publication dictionary (success), original publication dictionary (fail)
-
static
compare_replace
(pub_dict, fetch_dict)¶ Take in our Original Pub, and Fetched Pub. For each Fetched entry that has data, overwrite the Original entry :param pub_dict: (dict) Original pub dictionary :param fetch_dict: (dict) Fetched pub dictionary from doi.org :return: (dict) Updated pub dictionary, with fetched data taking precedence
Compiles authors “Last, First” into a single list :param list authors: Raw author data retrieved from doi.org :return list: Author objects
-
static
compile_date
(date_parts)¶ Compiles date only using the year :param list date_parts: List of date parts retrieved from doi.org :return str: Date string or NaN
-
compile_fetch
(raw, doi_id)¶ Loop over Raw and add selected items to Fetch with proper formatting :param dict raw: JSON data from doi.org :param str doi_id: :return dict:
-
find_doi
(curr_dict)¶ Recursively search the file for the DOI id. More taxing, but more flexible when dictionary structuring isn’t absolute :param dict curr_dict: Current dictionary being searched :return dict bool: Recursive - Current dictionary, False flag that DOI was not found :return str bool: Final - DOI id, True flag that DOI was found
-
get_data
(doi_id, idx)¶ Resolve DOI and compile all attributes into one dictionary :param str doi_id: :param int idx: Publication index :return dict: Updated publication dictionary
-
illegal_doi
(doi_string)¶ DOI string did not match the regex. Determine what the data is. :param doi_string: (str) Malformed DOI string :return: None
-
main
()¶ Main function that gets file(s), creates outputs, and runs all operations. :return dict: Updated or original data for jsonld file
-
noaa_citation
(doi_string)¶ Special instructions for moving noaa data to the correct fields :param doi_string: (str) NOAA url :return: None
-
remove_empties
(pub)¶
-
static
ensembles¶
-
lipd.ensembles.
create_ensemble
(ensemble)¶ Add ensemble data to a LiPD object :param list ensemble: Ensemble data nested lists :return dict: Structured Ensemble data
-
lipd.ensembles.
insert_ensemble
(d, ens)¶ Insert the ensemble table dictionary into the LiPD metadata :param dict d: LiPD metadata :param dict ens: Ensemble data to insert :return dict:
excel¶
-
lipd.excel.
cells_dn_meta
(workbook, sheet, row, col, final_dict)¶ Traverse all cells in a column moving downward. Primarily created for the metadata sheet, but may use elsewhere. Check the cell title, and switch it to. :param obj workbook: :param str sheet: :param int row: :param int col: :param dict final_dict: :return: none
-
lipd.excel.
cells_rt_meta
(workbook, sheet, row, col)¶ Traverse all cells in a row. If you find new data in a cell, add it to the list. :param obj workbook: :param str sheet: :param int row: :param int col: :return list: Cell data for a specific row
-
lipd.excel.
cells_rt_meta_pub
(workbook, sheet, row, col, pub_qty)¶ Publication section is special. It’s possible there’s more than one publication. :param obj workbook: :param str sheet: :param int row: :param int col: :param int pub_qty: Number of distinct publication sections in this file :return list: Cell data for a specific row
Split the string of author names into the BibJSON format. :param str cell: Data from author cell :return: (list of dicts) Author names
-
lipd.excel.
compile_fund
(workbook, sheet, row, col)¶ Compile funding entries. Iter both rows at the same time. Keep adding entries until both cells are empty. :param obj workbook: :param str sheet: :param int row: :param int col: :return list of dict: l
-
lipd.excel.
compile_geo
(d)¶ Compile top-level Geography dictionary. :param d: :return:
-
lipd.excel.
compile_geometry
(lat, lon, elev)¶ Take in lists of lat and lon coordinates, and determine what geometry to create :param list lat: Latitude values :param list lon: Longitude values :param float elev: Elevation value :return dict:
-
lipd.excel.
compile_temp
(d, key, value)¶ Compiles temporary dictionaries for metadata. Adds a new entry to an existing dictionary. :param dict d: :param str key: :param any value: :return dict:
-
lipd.excel.
count_chron_variables
(temp_sheet)¶ Count the number of chron variables :param obj temp_sheet: :return int: variable count
-
lipd.excel.
excel_main
(file)¶ Parse data from Excel spreadsheets into LiPD files. :return list: Filenames of LiPD files created
-
lipd.excel.
extract_short
(string_in)¶ Extract the short name from a string that also has units. :param str string_in: :return str:
-
lipd.excel.
extract_units
(string_in)¶ Extract units from parenthesis in a string. i.e. “elevation (meters)” :param str string_in: :return str:
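For a header such as "elevation (meters)", the short name and the units can be separated with one regular expression. The sketch below is an assumption about the expected input shape (a name followed by a single parenthesised unit) and uses a hypothetical helper name; it is not the code used by extract_short or extract_units.

```python
import re


def split_name_and_units(text):
    """Return (short_name, units); units is '' when no parentheses are present."""
    m = re.match(r"^\s*(.*?)\s*\(([^)]*)\)\s*$", text)
    if m is None:
        return text.strip(), ""
    return m.group(1), m.group(2).strip()
```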
-
lipd.excel.
geometry_linestring
(lat, lon, elev)¶ GeoJSON Linestring. Latitude and Longitude have 2 values each. :param list lat: Latitude values :param list lon: Longitude values :return dict:
-
lipd.excel.
geometry_point
(lat, lon, elev)¶ GeoJSON point. Latitude and Longitude only have one value each :param list lat: Latitude values :param list lon: Longitude values :param float elev: Elevation value :return dict:
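The choice between a Point and a Linestring can be sketched from the number of coordinate values alone. The following is a hedged illustration, not the package's implementation: the function name geometry_sketch is hypothetical, and how lipd pairs up latitude and longitude values is an assumption. The one fixed rule is from GeoJSON itself, which orders each position as longitude, latitude, then optional elevation.

```python
def geometry_sketch(lat, lon, elev=None):
    """Pick a GeoJSON geometry from the number of coordinate values given."""

    def position(i):
        # GeoJSON positions are ordered longitude, latitude, then elevation.
        pos = [lon[i], lat[i]]
        if elev is not None:
            pos.append(elev)
        return pos

    if len(lat) == 1 and len(lon) == 1:
        return {"type": "Point", "coordinates": position(0)}
    if len(lat) == 2 and len(lon) == 2:
        return {"type": "LineString",
                "coordinates": [position(0), position(1)]}
    # Mixed counts (e.g. 2 latitudes, 0 longitudes) are left to a
    # range-style handler such as geometry_range.
    raise ValueError("unsupported coordinate counts")
```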
-
lipd.excel.
geometry_range
(crd_range, elev, crd_type)¶ Range of coordinates. (e.g. 2 latitude coordinates, and 0 longitude coordinates) :param crd_range: Latitude or Longitude values :param elev: Elevation value :param crd_type: Coordinate type, lat or lon :return dict:
-
lipd.excel.
get_chron_data
(temp_sheet, row, total_vars)¶ Capture all data for a specific chron data row (for csv output) :param obj temp_sheet: :param int row: :param int total_vars: :return list: data_row
-
lipd.excel.
get_chron_var
(temp_sheet, start_row)¶ Capture all the vars in the chron sheet (for json-ld output) :param obj temp_sheet: :param int start_row: :return: (list of dict) column data
-
lipd.excel.
instance_str
(cell)¶ Match data type and return string :param any cell: :return str:
-
lipd.excel.
logger_excel
= <logging.Logger object>¶ VERSION: LiPD v1.2
-
lipd.excel.
name_to_jsonld
(title_in)¶ Convert formal titles to camelcase json_ld text that matches our context file Keep a growing list of all titles that are being used in the json_ld context :param str title_in: :return str:
-
lipd.excel.
traverse_to_chron_data
(temp_sheet)¶ Traverse down to the first row that has chron data :param obj temp_sheet: :return int: traverse_row
-
lipd.excel.
traverse_to_chron_var
(temp_sheet)¶ Traverse down to the row that has the first variable :param obj temp_sheet: :return int:
inferred_data¶
-
lipd.inferred_data.
get_inferred_data_table
(pc, table)¶ Table level: Dive down, calculate data, then return the new table with the inferred data. :param str pc: Paleo or Chron table type :param dict table: Table data :return dict table: Table with new data
io¶
-
lipd.io.
lipd_read
(path)¶ Loads a LiPD file from local path. Unzip, read, and process data Steps: create tmp, unzip lipd, read files into memory, manipulate data, move to original dir, delete tmp. :param str path: Source path :return none:
-
lipd.io.
lipd_write
(_json, path, name)¶ Saves current state of LiPD object data. Outputs to a LiPD file. Steps: create tmp, create bag dir, get dsn, splice csv from json, write csv, clean json, write json, create bagit, zip up bag folder, place lipd in target dst, move to original dir, delete tmp.
Parameters: - _json (dict) – Metadata
- path (str) – Destination path
- name (str) – Filename w/o extension
Return none:
jsons¶
-
lipd.jsons.
get_csv_from_json
(d)¶ Get CSV values when mixed into json data. Pull out the CSV data and put it into a dictionary. :param dict d: JSON with CSV values :return dict: CSV values. (i.e. { CSVFilename1: { Column1: [Values], Column2: [Values] }, CSVFilename2: … }
-
lipd.jsons.
idx_name_to_num
(d)¶ Switch from index-by-name to index-by-number. :param dict d: Metadata :return dict: Modified metadata
-
lipd.jsons.
idx_num_to_name
(d)¶ Switch from index-by-number to index-by-name. :param dict d: Metadata :return dict: Modified Metadata
-
lipd.jsons.
read_json_from_file
(filename)¶ Import the JSON data from target file. :param str filename: Target File :return dict: JSON data
-
lipd.jsons.
read_jsonld
()¶ Find jsonld file in the cwd (or within 2 levels below cwd), and load it in. :return dict: Jsonld data
-
lipd.jsons.
remove_csv_from_json
(d)¶ Remove all CSV data ‘values’ entries from paleoData table in the JSON structure. :param dict d: JSON data - old structure :return dict: Metadata dictionary without CSV values
-
lipd.jsons.
write_json_to_file
(json_data, filename='metadata')¶ Write all JSON in python dictionary to a new json file. :param dict json_data: JSON data :param str filename: Target filename (defaults to ‘metadata.jsonld’) :return None:
loggers¶
-
lipd.loggers.
create_benchmark
(name, log_file, level=20)¶ Creates a logger for function benchmark times :param str name: Name of the logger :param str log_file: Filename :return obj: Logger
-
lipd.loggers.
create_logger
(name)¶ Creates a logger with the below attributes. :param str name: Name of the logger :return obj: Logger
-
lipd.loggers.
log_benchmark
(fn, start, end)¶ Log a given function and how long the function takes in seconds :param str fn: Function name :param float start: Function start time :param float end: Function end time :return none:
-
lipd.loggers.
update_changelog
()¶ Create or update the changelog txt file. Prompt for update description. :return None:
lpd_noaa¶
-
class
lipd.lpd_noaa.
LPD_NOAA
(dir_root, name, lipd_dict)¶ Bases:
object
Creates a NOAA object that contains all the functions needed to write out a LiPD file as a NOAA text file. Supports LiPD Version: v1.2 NOAA txt template: v3.0
Return none: Writes NOAA text to file in local storage
-
get_master
()¶ Get the master json that has been modified :return dict: self.lipd_data
-
get_wdc_paleo_url
()¶ When a NOAA file is created, it creates a URL link to where the dataset will be hosted in NOAA’s archive Retrieve and add this link to the original LiPD file, so we can trace the dataset to NOAA. :return str:
-
main
()¶ Load in the template file, and run through the parser :return none:
-
misc¶
-
lipd.misc.
cast_float
(x)¶ Attempt to cleanup string or convert to number value. :param any x: :return float:
-
lipd.misc.
cast_int
(x)¶ Cast unknown type into integer :param any x: :return int:
-
lipd.misc.
cast_values_csvs
(d, idx, x)¶ Attempt to cast string to float. If error, keep as a string. :param dict d: Data :param int idx: Index number :param str x: Data :return any:
-
lipd.misc.
check_dsn
(name, _json)¶ Get a dataSetName. If one is not provided, then insert the filename as the dataSetName. :param str name: Filename w/o extension :param dict _json: Metadata :return dict _json: Metadata
-
lipd.misc.
clean_doi
(doi_string)¶ Use regex to extract all DOI ids from string (i.e. 10.1029/2005pa001215) :param str doi_string: Raw DOI string value from input file. Often not properly formatted. :return list: DOI ids. May contain 0, 1, or multiple ids.
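Extracting DOI ids from a loosely formatted string is typically a single findall call. The pattern below is an assumption (a "10." prefix, a 4 to 9 digit registrant code, a slash, then any run of characters up to whitespace or a common delimiter); the regex actually used by clean_doi is not shown on this page and may differ.

```python
import re

# Assumed pattern, not taken from the lipd source.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s,;\"'<>]+")


def find_dois(text):
    """Return every DOI-like id found in the text (possibly an empty list)."""
    return DOI_PATTERN.findall(text)
```

As in the documented return value, the result may hold zero, one, or several ids.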
-
lipd.misc.
fix_coordinate_decimal
(d)¶ Coordinate decimal degrees calculated by an excel formula are often too long as a repeating decimal. Round them down to 5 decimals :param dict d: Metadata :return dict d: Metadata
-
lipd.misc.
generate_timestamp
(fmt=None)¶ Generate a timestamp to mark when this file was last modified. :param str fmt: Special format instructions :return str: YYYY-MM-DD format, or specified format
-
lipd.misc.
generate_tsid
(size=8)¶ Generate a TSid string. Use the “PYT” prefix for traceability, and 8 trailing generated characters ex: PYT9AG234GS :return:
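A TSid of the shape shown (the "PYT" prefix followed by 8 generated characters) could be produced as follows. The character set (uppercase letters and digits) is inferred from the example id above and is an assumption, as is the helper name make_tsid.

```python
import random
import string


def make_tsid(size=8):
    """Return 'PYT' followed by `size` random uppercase letters or digits."""
    chars = string.ascii_uppercase + string.digits
    return "PYT" + "".join(random.choice(chars) for _ in range(size))
```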
-
lipd.misc.
get_appended_name
(name, columns)¶ Append numbers to a name until it no longer conflicts with the other names in a column. Necessary to avoid overwriting columns and losing data. Loop a preset amount of times to avoid an infinite loop. There shouldn’t ever be more than two or three identical variable names in a table. :param str name: Variable name in question :param dict columns: Columns listed by variable name :return str: Appended variable name
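The "append numbers until the name is free, but only try a preset number of times" idea can be sketched like this. The "-N" suffix format, the retry limit and the function name are all assumptions made for illustration; the package may format appended names differently.

```python
def append_until_unique(name, columns, max_tries=20):
    """Return `name`, or `name-N` for the first N that is not already a column."""
    if name not in columns:
        return name
    # Bounded loop: give up rather than spin forever on a pathological table.
    for n in range(1, max_tries + 1):
        candidate = "{}-{}".format(name, n)
        if candidate not in columns:
            return candidate
    raise ValueError("no free name found for " + name)
```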
Take author or investigator data, and convert it to a concatenated string of names. Author data structure has a few variations, so account for all. :param any x: Author data :return str: Author string
-
lipd.misc.
get_dsn
(d)¶ Get the dataset name from a record :param dict d: Metadata :return str: Dataset name
-
lipd.misc.
get_ensemble_counts
(d)¶ Determine if this is a 1 or 2 column ensemble. Then determine how many columns and rows it has. :param d: :return:
-
lipd.misc.
get_missing_value_key
(d)¶ Get the Missing Value entry from a table of data. If none is found, try the columns. If still none found, prompt user. :param dict d: Table of data :return str: Missing Value
-
lipd.misc.
get_table_key
(key, d, fallback='')¶ Try to get a table name from a data table :param str key: Key to try first :param dict d: Data table :param str fallback: (optional) If we don’t find a table name, use this as a generic name fallback. :return str: Data table name
-
lipd.misc.
get_variable_name_col
(d)¶ Get the variable name from a table or column :param dict d: Metadata :return str:
-
lipd.misc.
is_ensemble
(d)¶ Check if a table of data is an ensemble table. Is the first values index a list? ensemble. Int/float? not ensemble. :param dict d: Table data :return bool: Ensemble or not ensemble
-
lipd.misc.
load_fn_matches_ext
(file_path, file_type)¶ Check that the file extension matches the target extension given. :param str file_path: Path to be checked :param str file_type: Target extension :return bool:
-
lipd.misc.
match_arr_lengths
(l)¶ Check that all the array lengths match so that a DataFrame can be created successfully. :param list l: Nested arrays :return bool: Valid or invalid
-
lipd.misc.
match_operators
(inp, relate, cut)¶ Compare two items. Match a string operator to an operator function :param str inp: Comparison item :param str relate: Comparison operator :param any cut: Comparison item :return bool: Comparison truth
-
lipd.misc.
mv_files
(src, dst)¶ Move all files from one directory to another :param str src: Source directory :param str dst: Destination directory :return none:
-
lipd.misc.
normalize_name
(s)¶ Remove foreign accents and characters to normalize the string. Prevents encoding errors. :param str s: :return str:
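A common way to strip accents with the standard library is to decompose the string and drop everything that is not ASCII. This is a sketch of that general technique under a hypothetical name, not necessarily what normalize_name does internally.

```python
import unicodedata


def strip_accents(s):
    """Decompose accented characters, then discard the non-ASCII marks."""
    decomposed = unicodedata.normalize("NFKD", s)
    return decomposed.encode("ascii", "ignore").decode("ascii")
```

Characters with no ASCII decomposition are dropped entirely by this approach, which is worth keeping in mind for names written in non-Latin scripts.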
-
lipd.misc.
path_type
(path, target)¶ Determine if given path is file, directory, or other. Compare with target to see if it’s the type we wanted. :param str path: Path :param str target: Target type wanted :return bool:
-
lipd.misc.
prompt_protocol
()¶ Prompt user if they would like to save pickle file as a dictionary or an object. :return str: Answer
-
lipd.misc.
put_tsids
(x)¶ Recursively add in TSids into any columns that do not have them. Look for “columns” keys, and then start looping and adding generated TSids to each column :param any x: Recursive, so could be any data type. :return any x: Recursive, so could be any data type.
-
lipd.misc.
rm_empty_doi
(d)¶ If an “identifier” dictionary has no doi ID, then it has no use. Delete it. :param dict d: JSON Metadata :return dict: JSON Metadata
-
lipd.misc.
rm_empty_fields
(x)¶ Go through N number of nested data types and remove all empty entries. Recursion :param any x: Dictionary, List, or String of data :return any: Returns the same data type as the original, but without empties.
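Recursive removal of empty entries can be sketched as below. What counts as "empty" here (empty string, None, empty list, empty dict) is an assumption, as is the helper name; zero and False are deliberately kept because they are real values.

```python
_EMPTY = ("", None, [], {})


def remove_empties(x):
    """Return a copy of x with empty entries removed at every nesting level."""
    if isinstance(x, dict):
        cleaned = {k: remove_empties(v) for k, v in x.items()}
        return {k: v for k, v in cleaned.items() if v not in _EMPTY}
    if isinstance(x, list):
        cleaned = [remove_empties(v) for v in x]
        return [v for v in cleaned if v not in _EMPTY]
    if isinstance(x, str):
        return x.strip()
    return x
```

Children are cleaned before the parent is filtered, so a dict that only contained empties disappears as well.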
-
lipd.misc.
rm_files
(path, extension)¶ Remove all files in the given directory with the given extension :param str path: Directory :param str extension: File type to remove :return none:
-
lipd.misc.
rm_keys_from_dict
(d, keys)¶ Given a dictionary and a key list, remove any data in the dictionary with the given keys. :param dict d: Data :param list keys: List of key data to remove :return dict d: Data (with keys + data removed)
-
lipd.misc.
rm_missing_values_table
(d)¶ Loop for each table column and remove the missingValue key & data :param dict d: Table data :return dict d: Table data
-
lipd.misc.
rm_values_fields
(x)¶ (recursive) Remove all “values” fields from the metadata :param any x: Any data type :return dict: metadata without “values”
-
lipd.misc.
split_path_and_file
(s)¶ Given a full path to a file, split and return a path and filename :param str s: Full path :return str str: Path, filename
-
lipd.misc.
unwrap_arrays
(l)¶ Unwrap nested lists to be one “flat” list of lists. Mainly for prepping ensemble data for DataFrame() creation :param list l: Nested lists :return list: Flattened lists
noaa¶
-
lipd.noaa.
lpd_to_noaa
(obj)¶ Convert a LiPD format to NOAA format :param obj obj: LiPD object :return obj: LiPD object (modified)
-
lipd.noaa.
noaa_prompt
()¶ Convert between NOAA and LiPD file formats. :return:
-
lipd.noaa.
noaa_to_lpd
(files)¶ Convert NOAA format to LiPD format :param dict files: Files metadata :return None:
noaa_lpd¶
regexes¶
timeseries¶
-
lipd.timeseries.
collapse
(l)¶ LiPD Version 1.3 Main function to initiate time series to LiPD conversion :param list l: Time series :return dict _master: LiPD data, sorted by dataset name
-
lipd.timeseries.
extract
(d, chron)¶ LiPD Version 1.3 Main function to initiate LiPD to TSOs conversion. :param dict d: Metadata for one LiPD file :param bool chron: Paleo mode (default) or Chron mode :return list _ts: Time series
-
lipd.timeseries.
get_matches
(expr_lst, ts)¶ Get a list of TimeSeries objects that match the given expression. :param list expr_lst: Expression :param list ts: TimeSeries :return list new_ts: Matched time series objects :return list idxs: Indices of matched objects
-
lipd.timeseries.
mode_ts
(ec, ts=None, b=None)¶ Get string for the mode :param bool b: Chron boolean (for extract) :param str ec: extract or collapse :param list ts: Time series (for collapse) :return str phrase: Phrase
-
lipd.timeseries.
translate_expression
(expression)¶ Check if the expression is valid, then turn it into an expression that can be used for filtering. :return list of lists: One or more matches. Each list has 3 strings.
validator_api¶
-
lipd.validator_api.
create_detailed_results
(data)¶
-
lipd.validator_api.
display_results
(data, detailed=False)¶ Display the results from the validator in a brief or detailed way. :param dict data: Results, sorted by dataset name :param bool detailed: Detailed results on or off :return none:
-
lipd.validator_api.
get_validator_format
(data_json, data_csv, filenames)¶ Format the LiPD data in the layout that the Lipd.net validator accepts. Example of one _file metadata entry (_file_list will contain one or more of these):
_file = {"type": "bagit/json/csv", "filenameFull": /path/to/filename.txt, "filenameShort": filename.txt, "data": "", "pretty": ""}
Parameters: - data_json (dict) – Metadata
- data_csv (dict) – CSV data
- filenames (list) – All files found in LiPD archive
Return list: Validator-formatted data
-
lipd.validator_api.
get_validator_results
(data)¶ Send LiPD data to the Lipd.net validator and get the results back. :param data: :return:
versions¶
-
lipd.versions.
get_lipd_version
(d)¶ Check what version of LiPD this file is using. If none is found, assume it’s using version 1.0 :param dict d: Metadata :return float:
-
lipd.versions.
update_lipd_v1_1
(d)¶ Update LiPD v1.0 to v1.1
- chronData entry is a list that allows multiple tables
- paleoData entry is a list that allows multiple tables
- chronData now allows measurement, model, summary, modelTable, ensemble, calibratedAges tables
- Added ‘lipdVersion’ key
Parameters: d (dict) – Metadata v1.0
Return dict d: Metadata v1.1
-
lipd.versions.
update_lipd_v1_2
(d)¶ Update LiPD v1.1 to v1.2
- Added NOAA compatible keys: maxYear, minYear, originalDataURL, WDCPaleoURL, etc.
- ‘calibratedAges’ key is now ‘distribution’
- paleoData structure mirrors chronData. Allows measurement, model, summary, modelTable, ensemble, distribution tables
Parameters: d (dict) – Metadata v1.1
Return dict d: Metadata v1.2
-
lipd.versions.
update_lipd_v1_3
(d)¶ Update LiPD v1.2 to v1.3
- Added ‘createdBy’ key
- Top-level folder inside LiPD archives is named “bag” (no longer <datasetname>)
- .jsonld file is now generically named ‘metadata.jsonld’ (no longer <datasetname>.lpd)
- All “paleo” and “chron” prefixes are removed from “paleoMeasurementTable”, “paleoModel”, etc.
- Merge isotopeInterpretation and climateInterpretation into “interpretation” block
- ensemble table entry is a list that allows multiple tables
- summary table entry is a list that allows multiple tables
:param dict d: Metadata v1.2 :return dict d: Metadata v1.3
-
lipd.versions.
update_lipd_v1_3_names
(d)¶ Update the key names and merge interpretation data :param dict d: Metadata :return dict d: Metadata
-
lipd.versions.
update_lipd_v1_3_structure
(d)¶ Update the structure for summary and ensemble tables :param dict d: Metadata :return dict d: Metadata
-
lipd.versions.
update_lipd_version
(d)¶ Metadata is indexed by number at this step.
Use the current version number to determine where to start updating from. Use “chain versioning” to make it modular. If a file is a few versions behind, convert to EACH version until reaching current. If a file is one version behind, it will only convert once to the newest. :param dict d: Metadata :return dict d: Metadata
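"Chain versioning" can be pictured as an ordered list of single-step updaters, each applied only if the data is still behind that step. The sketch below uses placeholder updaters that merely record which step ran; the real update_lipd_v1_1, update_lipd_v1_2 and update_lipd_v1_3 restructure the metadata. The `_applied` key and all function names are hypothetical.

```python
def _step(version):
    """Build a placeholder updater that stamps the new version on the data."""
    def updater(d):
        d = dict(d)
        d["lipdVersion"] = version
        # Hypothetical bookkeeping key, used only to make the chain visible.
        d["_applied"] = d.get("_applied", []) + [str(version)]
        return d
    return updater


# Ordered oldest to newest; every file walks the chain from where it stands.
_UPDATERS = [(1.1, _step(1.1)), (1.2, _step(1.2)), (1.3, _step(1.3))]


def update_to_current(d):
    """Apply each updater whose target version is newer than the data's."""
    # A missing version key is treated as 1.0, as get_lipd_version describes.
    version = float(d.get("lipdVersion", 1.0))
    for target, updater in _UPDATERS:
        if version < target:
            d = updater(d)
            version = target
    return d
```

A v1.0 file passes through all three steps, while a v1.2 file is converted once. Adding a future version means appending one more entry to the list.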
zips¶
-
lipd.zips.
unzipper
(filename, dir_tmp)¶ Unzip .lpd file contents to tmp directory. :param str filename: filename.lpd :param str dir_tmp: Tmp folder to extract contents to :return None:
-
lipd.zips.
zipper
(root_dir='', name='', path_name_ext='')¶ Zips up directory back to the original location :param str root_dir: Root directory of the archive :param str name: <datasetname>.lpd :param str path_name_ext: /path/to/filename.lpd
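Since a .lpd file is handled here as a zip archive, both directions can be sketched with the standard zipfile module. The helper names and the decision to store paths relative to the root directory are assumptions for illustration; they are not the signatures of unzipper and zipper.

```python
import os
import zipfile


def unzip_to(filename, dir_tmp):
    """Extract every member of the archive into dir_tmp."""
    with zipfile.ZipFile(filename) as zf:
        zf.extractall(dir_tmp)


def zip_dir(root_dir, path_name_ext):
    """Write all files under root_dir into a new archive at path_name_ext."""
    with zipfile.ZipFile(path_name_ext, "w", zipfile.ZIP_DEFLATED) as zf:
        for folder, _, files in os.walk(root_dir):
            for fn in files:
                full = os.path.join(folder, fn)
                # Store paths relative to root_dir so the archive is portable.
                zf.write(full, os.path.relpath(full, root_dir))
```

The target archive should live outside root_dir; otherwise the walk would pick up the archive that is being written.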