lipd package¶

Module contents¶

lipd.addEnsemble(D, dsn, ensemble)¶

Create ensemble entry and then add it to the specified LiPD dataset.

Parameters:	D (dict) – LiPD data
	dsn (str) – Dataset name
	ensemble (list) – Nested numpy array of ensemble column data
Return dict D:	LiPD data

lipd.collapseTs(ts=None)¶

Collapse a time series back into LiPD record form.

Example
D = lipd.readLipd()
ts = lipd.extractTs(D)
New_D = lipd.collapseTs(ts)

Parameters:	ts (list) – Time series
Return dict:	Metadata

lipd.doi()¶

Update publication information using data DOIs. Updates LiPD files on disk, not in memory.

Example
1: lipd.readLipd()
2: lipd.doi()

Return none:

lipd.ensToDf(ensemble)¶

Create an ensemble data frame from some given nested numpy arrays

Parameters:	ensemble (list) – Ensemble data
Return obj df:	Pandas dataframe

lipd.excel()¶

Convert Excel files to LiPD files. LiPD data is returned directly from this function.

Example
1: lipd.readExcel()
2: D = lipd.excel()

Return dict _d:	Metadata

lipd.extractTs(d, chron=False)¶

Create a time series using LiPD data (uses paleoData by default)

Example : paleoData
1. D = lipd.readLipd()
2. ts = lipd.extractTs(D)

Example : chronData
1. D = lipd.readLipd()
2. ts = lipd.extractTs(D, chron=True)

Parameters:	d (dict) – Metadata
	chron (bool) – Create a chronData time series
Return list l:	Time series

lipd.filterTs(ts, expression)¶

Create a new time series that only contains entries that match the given expression.

Example:
D = lipd.readLipd()
ts = lipd.extractTs(D)
new_ts = lipd.filterTs(ts, "archiveType == marine sediment")
new_ts = lipd.filterTs(ts, "paleoData_variableName == sst")

Parameters:	expression (str) – Expression
	ts (list) – Time series
Return list new_ts:	Filtered time series that matches the expression

lipd.getCsv(L=None)¶

Get CSV from LiPD metadata

Example

c = lipd.getCsv(D["Africa-ColdAirCave.Sundqvist.2013"])

Parameters:	L (dict) – One LiPD record
Return dict d:	CSV data

lipd.getLipdNames(D=None)¶

Get a list of all LiPD names in the library

Example

names = lipd.getLipdNames(D)

Return list f_list:	File list

lipd.getMetadata(L)¶

Get the metadata from a LiPD record in memory

Example

m = lipd.getMetadata(D["Africa-ColdAirCave.Sundqvist.2013"])

Parameters:	L (dict) – One LiPD record
Return dict d:	LiPD record (metadata only)

lipd.noaa(d=None)¶

Convert between NOAA and LiPD files

Example: LiPD to NOAA converter
1: D = lipd.readLipd()
2: lipd.noaa(D)

Example: NOAA to LiPD converter
1: lipd.readNoaa()
2: lipd.noaa()

Return none:

lipd.queryTs(ts, expression)¶

Find the indices of the time series entries that match the given expression.

Example:
D = lipd.readLipd()
ts = lipd.extractTs(D)
matches = lipd.queryTs(ts, "archiveType == marine sediment")
matches = lipd.queryTs(ts, "geo_meanElev <= 2000")

Parameters:	expression (str) – Expression
	ts (list) – Time series
Return list _idx:	Indices of entries that match the criteria
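
The expression matching described above can be illustrated with a small standalone sketch. It does not call lipd and is not the package's implementation: the names `OPS`, `parse_expression` and `query_entries` are hypothetical, and the set of operators lipd actually accepts is an assumption here.

```python
import operator
import re

# Hypothetical operator table; lipd may support a different set.
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<=": operator.le,
    ">=": operator.ge,
    "<": operator.lt,
    ">": operator.gt,
}


def parse_expression(expression):
    """Split 'key op value' into its three parts."""
    m = re.match(r"^\s*(\w+)\s*(==|!=|<=|>=|<|>)\s*(.+?)\s*$", expression)
    if m is None:
        raise ValueError("Unrecognized expression: " + expression)
    return m.group(1), m.group(2), m.group(3)


def query_entries(ts, expression):
    """Return the indices of entries whose field satisfies the expression."""
    key, op, raw = parse_expression(expression)
    try:
        target = float(raw)  # compare numerically when the value is a number
    except ValueError:
        target = raw  # otherwise compare as text
    idx = []
    for i, entry in enumerate(ts):
        if key not in entry:
            continue
        value = entry[key]
        if isinstance(target, float):
            try:
                value = float(value)
            except (TypeError, ValueError):
                continue  # field is not numeric, so it cannot match
        else:
            value = str(value)
        if OPS[op](value, target):
            idx.append(i)
    return idx


# Made-up time series entries, for illustration only.
ts = [
    {"archiveType": "marine sediment", "geo_meanElev": -1250},
    {"archiveType": "speleothem", "geo_meanElev": 1420},
    {"archiveType": "ice core", "geo_meanElev": 3200},
]
matches = query_entries(ts, "geo_meanElev <= 2000")
filtered = [ts[i] for i in matches]
```

Keeping the indices (as queryTs does) lets the caller refer back to the original time series; building the filtered list from those indices gives the filterTs-style result.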

lipd.readAll(usr_path='')¶

Read all approved file types at once. Enter a file path, directory path, or leave args blank to trigger gui.

Parameters:	usr_path (str) – Path to file / directory (optional)
Return str cwd:	Current working directory

lipd.readExcel(usr_path='')¶

Read Excel file(s). Enter a file path, directory path, or leave args blank to trigger gui.

Parameters:	usr_path (str) – Path to file / directory (optional)
Return str cwd:	Current working directory

lipd.readLipd(usr_path='')¶

Read LiPD file(s). Enter a file path, directory path, or leave args blank to trigger gui.

Parameters:	usr_path (str) – Path to file / directory (optional)
Return dict _d:	Metadata

lipd.readNoaa(usr_path='')¶

Read NOAA file(s). Enter a file path, directory path, or leave args blank to trigger gui.

Parameters:	usr_path (str) – Path to file / directory (optional)
Return str cwd:	Current working directory

lipd.run()¶

Initialize and start objects. This is called automatically when importing the package.

Return none:

lipd.showDfs(d)¶

Display the available data frame names in a given data frame collection

Parameters:	d (dict) – Dataframe collection
Return none:

lipd.showLipds(D=None)¶

Display the dataset names of a given LiPD data

Example

lipd.showLipds(D)

Parameters:	D (dict) – LiPD data
Return none:

lipd.showMetadata(dat)¶

Display the metadata of the specified LiPD record in pretty print

Example

lipd.showMetadata(D["Africa-ColdAirCave.Sundqvist.2013"])

Parameters:	dat (dict) – Metadata
Return none:

lipd.tsToDf(tso)¶

Create Pandas DataFrames from one time series entry. Usage: first call extractTs to get a time series, then pick one entry from it and pass that entry to this function.

Parameters:	tso (dict) – Time series entry
Return dict dfs:	Pandas dataframes

lipd.viewTs(ts)¶

View the contents of one time series entry in a nicely formatted way

Example
D = lipd.readLipd()
ts = lipd.extractTs(D)
lipd.viewTs(ts[0])

Parameters:	ts (dict) – One time series entry
Return none:

lipd.writeLipd(dat, usr_path='', filename='')¶

Write LiPD data to file(s)

Parameters:	dat (dict) – Metadata
	usr_path (str) – Destination (optional)
	filename (str) – LiPD filename, for writing one specific file (optional)
Return none:

Submodules¶

alternates¶

List of alternate and synonym keys

bag¶

lipd.bag.create_bag(dir_bag)¶: Create a Bag out of given files. :param str dir_bag: Directory that contains csv, jsonld, and changelog files. :return obj: Bag

lipd.bag.finish_bag(dir_bag)¶: Closing steps for creating a bag :param obj dir_bag: :return None:

lipd.bag.open_bag(dir_bag)¶: Open Bag at the given path :param str dir_bag: Path to Bag :return obj: Bag

lipd.bag.resolved_flag(bag)¶: Check DOI flag in bag.info to see if doi_resolver has been previously run :param obj bag: Bag :return bool: Flag

lipd.bag.validate_md5(bag)¶: Check if Bag is valid :param obj bag: Bag :return None:

blanks¶

List of empty and ignored keys

csvs¶

lipd.csvs.get_csv_from_metadata(name, metadata)¶: Two goals. Get all csv from metadata, and return new metadata with generated filenames to match files. :param str name: LiPD dataset name :param dict metadata: Metadata :return dict: Csv Data

lipd.csvs.merge_csv_metadata(d)¶: Using the given metadata dictionary, retrieve CSV data from CSV files, and insert the CSV values into their respective metadata columns. Checks for both paleoData and chronData tables. :param dict d: Metadata :return dict: Modified metadata dictionary

lipd.csvs.read_csv_from_file(filename)¶: Opens the target CSV file and creates a dictionary with one list for each CSV column. :param str filename: :return list of lists: column values

lipd.csvs.write_csv_to_file(d)¶: Writes columns of data to a target CSV file. :param dict d: A dictionary containing one list for every data column. Keys: int, Values: list :return None:

dataframes¶

lipd.dataframes.create_dataframe(ensemble)¶: Create a data frame from given nested lists of ensemble data :param list ensemble: Ensemble data :return obj: Dataframe

lipd.dataframes.get_filtered_dfs(lib, expr)¶: Main: Get all data frames that match the given expression :return dict: Filenames and data frames (filtered)

lipd.dataframes.lipd_to_df(metadata, csvs)¶: Create an organized collection of data frames from LiPD data :param dict metadata: LiPD data :param dict csvs: Csv data :return dict: One data frame per table, organized in a dictionary by name

lipd.dataframes.ts_to_df(metadata)¶: Create a data frame from one TimeSeries object :param dict metadata: Time Series dictionary :return dict: One data frame per table, organized in a dictionary by name

directory¶

lipd.directory.browse_dialog_dir()¶: Open up a GUI browse dialog window and let the user pick a target directory. :return str: Target directory path

lipd.directory.browse_dialog_file()¶: Open up a GUI browse dialog window and let the user select one or more files. :return str _path: Target directory path :return list _files: List of selected files

lipd.directory.check_file_age(filename, days)¶: Check whether the target file was created more than the given number of days ago. One day is 60*60*24 seconds. :param str filename: Target filename :param int days: Limit in number of days :return bool: True if the file is older than the limit, False otherwise
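
A minimal sketch of this kind of age check, using only the standard library. It is not lipd's code: `is_older_than` and `SECONDS_PER_DAY` are hypothetical names, and the sketch assumes the modification time is an acceptable stand-in for the creation date, which is not available portably.

```python
import os
import time

SECONDS_PER_DAY = 60 * 60 * 24


def is_older_than(filename, days):
    """Return True if the file's timestamp is more than `days` days old."""
    # Assumption: modification time is used in place of creation time.
    age_seconds = time.time() - os.path.getmtime(filename)
    return age_seconds > days * SECONDS_PER_DAY
```

A caller might use such a check to decide whether a cached file should be refreshed.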

lipd.directory.collect_metadata_file(full_path)¶: Create the file metadata and add it to the appropriate section by file-type :param str full_path: :param dict existing_files: :return dict existing files:

lipd.directory.collect_metadata_files(cwd, new_files, existing_files)¶: Collect all files from a given path. Separate by file type, and return one list for each type If ‘files’ contains specific :param str cwd: Directory w/ target files :param list new_files: Specific new files to load :param dict existing_files: Files currently loaded, separated by type :return list: All files separated by type

lipd.directory.create_tmp_dir()¶: Creates tmp directory in OS temp space. :return str: Path to tmp directory

lipd.directory.dir_cleanup(dir_bag, dir_data)¶: Moves JSON and csv files to bag root, then deletes all the metadata bag files. We’ll be creating a new bag with the data files, so we don’t need the other text files and such. :param str dir_bag: Path to root of Bag :param str dir_data: Path to Bag /data subdirectory :return None:

lipd.directory.filename_from_path(path)¶: Extract the file name from a given file path. :param str path: File path :return str: File name with extension

lipd.directory.find_files()¶: Search for the directory containing jsonld and csv files. chdir and then quit. :return none:

lipd.directory.get_filenames_generated(d, name='', csvs='')¶: Get the filenames that the LiPD utilities has generated (per naming standard), as opposed to the filenames that originated in the LiPD file (that possibly don’t follow the naming standard) :param dict d: Data :param str name: LiPD dataset name to prefix :param list csvs: Filenames list to merge with :return list: Filenames

lipd.directory.get_filenames_in_lipd(path, name='')¶: List all the files contained in the LiPD archive. Bagit, JSON, and CSV :param str path: Directory to be listed :param str name: LiPD dataset name, if you want to prefix it to show file hierarchy :return list: Filenames found

lipd.directory.get_src_or_dst(mode, path_type)¶: User sets the path to a LiPD source location :param str mode: “read” or “write” mode :param str path_type: “directory” or “file” :return str path: dir path to files :return list files: files chosen

lipd.directory.get_src_or_dst_path(prompt, count)¶: Let the user choose a path, and store the value. :return str _path: Target directory :return str count: Counter for attempted prompts

lipd.directory.get_src_or_dst_prompt(mode)¶: String together the proper prompt based on the mode :param str mode: “read” or “write” :return str prompt: The prompt needed

lipd.directory.list_files(x, path='')¶: Lists file(s) in given path of the X type. :param str x: File extension that we are interested in. :param str path: Path, if user would like to check a specific directory outside of the CWD :return list of str: File name(s) to be worked on

lipd.directory.rm_file_if_exists(path, filename)¶: Remove a file if it exists. Useful for when we want to write a file, but it already exists in that location. :param str filename: Filename :param str path: Directory :return none:

lipd.directory.rm_files_in_dir(path)¶: Removes all files within a directory, but does not delete the directory :param str path: Target directory :return none:

doi_main¶

lipd.doi_main.doi_main(files)¶: Main function that controls the script. Take in directory containing the .lpd file(s). Loop for each file. :return None:

lipd.doi_main.process_lpd(name, dir_tmp)¶: Opens up json file, invokes doi_resolver, closes file, updates changelog, cleans directory, and makes new bag. :param str name: Name of current .lpd file :param str dir_tmp: Path to tmp directory :return none:

lipd.doi_main.prompt_force()¶: Ask the user if they want to force update files that were previously resolved :return bool: response

doi_resolver¶

class lipd.doi_resolver.DOIResolver(dir_root, name, root_dict)¶

Bases: object

Use DOI id(s) to pull updated publication info from doi.org and overwrite file data.

Input: Original publication dictionary Output: Updated publication dictionary (success), original publication dictionary (fail)

static compare_replace(pub_dict, fetch_dict)¶: Take in our Original Pub, and Fetched Pub. For each Fetched entry that has data, overwrite the Original entry :param pub_dict: (dict) Original pub dictionary :param fetch_dict: (dict) Fetched pub dictionary from doi.org :return: (dict) Updated pub dictionary, with fetched data taking precedence

static compile_authors(authors)¶: Compiles authors “Last, First” into a single list :param list authors: Raw author data retrieved from doi.org :return list: Author objects

static compile_date(date_parts)¶: Compiles date only using the year :param list date_parts: List of date parts retrieved from doi.org :return str: Date string or NaN

compile_fetch(raw, doi_id)¶: Loop over Raw and add selected items to Fetch with proper formatting :param dict raw: JSON data from doi.org :param str doi_id: :return dict:

find_doi(curr_dict)¶: Recursively search the file for the DOI id. More taxing, but more flexible when dictionary structuring isn’t absolute :param dict curr_dict: Current dictionary being searched :return dict bool: Recursive - Current dictionary, False flag that DOI was not found :return str bool: Final - DOI id, True flag that DOI was found

get_data(doi_id, idx)¶: Resolve DOI and compile all attributes into one dictionary :param str doi_id: :param int idx: Publication index :return dict: Updated publication dictionary

illegal_doi(doi_string)¶: DOI string did not match the regex. Determine what the data is. :param doi_string: (str) Malformed DOI string :return: None

main()¶: Main function that gets file(s), creates outputs, and runs all operations. :return dict: Updated or original data for jsonld file

noaa_citation(doi_string)¶: Special instructions for moving noaa data to the correct fields :param doi_string: (str) NOAA url :return: None

remove_empties(pub)¶

ensembles¶

lipd.ensembles.create_ensemble(ensemble)¶: Add ensemble data to a LiPD object :param list ensemble: Ensemble data nested lists :return dict: Structured Ensemble data

lipd.ensembles.insert_ensemble(d, ens)¶: Insert the ensemble table dictionary into the LiPD metadata :param dict d: LiPD metadata :param dict ens: Ensemble data to insert :return dict:

excel¶

lipd.excel.cells_dn_meta(workbook, sheet, row, col, final_dict)¶: Traverse all cells in a column moving downward. Primarily created for the metadata sheet, but may use elsewhere. Check the cell title, and switch it to. :param obj workbook: :param str sheet: :param int row: :param int col: :param dict final_dict: :return: none

lipd.excel.cells_rt_meta(workbook, sheet, row, col)¶: Traverse all cells in a row. If you find new data in a cell, add it to the list. :param obj workbook: :param str sheet: :param int row: :param int col: :return list: Cell data for a specific row

lipd.excel.cells_rt_meta_pub(workbook, sheet, row, col, pub_qty)¶: Publication section is special. It’s possible there’s more than one publication. :param obj workbook: :param str sheet: :param int row: :param int col: :param int pub_qty: Number of distinct publication sections in this file :return list: Cell data for a specific row

lipd.excel.compile_authors(cell)¶: Split the string of author names into the BibJSON format. :param str cell: Data from author cell :return: (list of dicts) Author names

lipd.excel.compile_fund(workbook, sheet, row, col)¶: Compile funding entries. Iter both rows at the same time. Keep adding entries until both cells are empty. :param obj workbook: :param str sheet: :param int row: :param int col: :return list of dict: l

lipd.excel.compile_geo(d)¶: Compile top-level Geography dictionary. :param d: :return:

lipd.excel.compile_geometry(lat, lon, elev)¶: Take in lists of lat and lon coordinates, and determine what geometry to create :param list lat: Latitude values :param list lon: Longitude values :param float elev: Elevation value :return dict:

lipd.excel.compile_temp(d, key, value)¶: Compiles temporary dictionaries for metadata. Adds a new entry to an existing dictionary. :param dict d: :param str key: :param any value: :return dict:

lipd.excel.count_chron_variables(temp_sheet)¶: Count the number of chron variables :param obj temp_sheet: :return int: variable count

lipd.excel.excel_main(file)¶: Parse data from Excel spreadsheets into LiPD files. :return list: Filenames of LiPD files created

lipd.excel.extract_short(string_in)¶: Extract the short name from a string that also has units. :param str string_in: :return str:

lipd.excel.extract_units(string_in)¶: Extract units from parenthesis in a string. i.e. “elevation (meters)” :param str string_in: :return str:
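
The short-name and units split described by extract_short and extract_units can be sketched with one regular expression. This is an illustration under stated assumptions, not the package's code: `split_name_and_units` is a hypothetical helper, and it assumes the units sit in a single trailing pair of parentheses.

```python
import re


def split_name_and_units(text):
    """Split 'name (units)' into (name, units); units is '' when absent."""
    m = re.match(r"^\s*(.*?)\s*\(([^()]*)\)\s*$", text)
    if m is None:
        return text.strip(), ""
    return m.group(1), m.group(2).strip()


name, units = split_name_and_units("elevation (meters)")
```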

lipd.excel.geometry_linestring(lat, lon, elev)¶: GeoJSON Linestring. Latitude and Longitude have 2 values each. :param list lat: Latitude values :param list lon: Longitude values :return dict:

lipd.excel.geometry_point(lat, lon, elev)¶: GeoJSON point. Latitude and Longitude only have one value each :param list lat: Latitude values :param list lon: Longitude values :param float elev: Elevation value :return dict:
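
For reference, a GeoJSON Point stores its position as longitude, latitude, and an optional elevation, in that order. The sketch below builds such a dictionary; `point_geometry` is a hypothetical name, and whether lipd emits exactly this layout is an assumption.

```python
def point_geometry(lat, lon, elev=None):
    """Build a GeoJSON Point from one-element lat/lon lists."""
    # GeoJSON positions are ordered longitude first, then latitude.
    coordinates = [lon[0], lat[0]]
    if elev is not None:
        coordinates.append(elev)
    return {"type": "Point", "coordinates": coordinates}


# Made-up coordinates, for illustration only.
geometry = point_geometry([-24.0], [29.2], 1420.0)
```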

lipd.excel.geometry_range(crd_range, elev, crd_type)¶: Range of coordinates. (e.g. 2 latitude coordinates, and 0 longitude coordinates) :param crd_range: Latitude or Longitude values :param elev: Elevation value :param crd_type: Coordinate type, lat or lon :return dict:

lipd.excel.get_chron_data(temp_sheet, row, total_vars)¶: Capture all data for a specific chron data row (for csv output) :param obj temp_sheet: :param int row: :param int total_vars: :return list: data_row

lipd.excel.get_chron_var(temp_sheet, start_row)¶: Capture all the vars in the chron sheet (for json-ld output) :param obj temp_sheet: :param int start_row: :return: (list of dict) column data

lipd.excel.instance_str(cell)¶: Match data type and return string :param any cell: :return str:

lipd.excel.logger_excel = <logging.Logger object>¶: VERSION: LiPD v1.2

lipd.excel.name_to_jsonld(title_in)¶: Convert formal titles to camelcase json_ld text that matches our context file. Keep a growing list of all titles that are being used in the json_ld context :param str title_in: :return str:

lipd.excel.traverse_to_chron_data(temp_sheet)¶: Traverse down to the first row that has chron data :param obj temp_sheet: :return int: traverse_row

lipd.excel.traverse_to_chron_var(temp_sheet)¶: Traverse down to the row that has the first variable :param obj temp_sheet: :return int:

inferred_data¶

lipd.inferred_data.get_inferred_data_table(pc, table)¶: Table level: Dive down, calculate data, then return the new table with the inferred data. :param str pc: Paleo or Chron table type :param dict table: Table data :return dict table: Table with new data

io¶

lipd.io.lipd_read(path)¶: Loads a LiPD file from local path. Unzip, read, and process data. Steps: create tmp, unzip lipd, read files into memory, manipulate data, move to original dir, delete tmp. :param str path: Source path :return none:

lipd.io.lipd_write(_json, path, name)¶

Saves current state of LiPD object data. Outputs to a LiPD file. Steps: create tmp, create bag dir, get dsn, splice csv from json, write csv, clean json, write json, create bagit, zip up bag folder, place lipd in target dst, move to original dir, delete tmp.

Parameters:	_json (dict) – Metadata
	path (str) – Destination path
	name (str) – Filename w/o extension
Return none:

jsons¶

lipd.jsons.get_csv_from_json(d)¶: Get CSV values when mixed into json data. Pull out the CSV data and put it into a dictionary. :param dict d: JSON with CSV values :return dict: CSV values. (i.e. { CSVFilename1: { Column1: [Values], Column2: [Values] }, CSVFilename2: … }
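
The return shape described above ({ CSVFilename: { Column: [Values] } }) can be sketched as follows. The input layout is a simplified, hypothetical one (a flat list of tables, each with a "filename" and a list of "columns"); real LiPD metadata nests tables more deeply, and `collect_csv` is not a lipd function.

```python
def collect_csv(tables):
    """Gather column values per CSV filename from a simplified table list."""
    csv_data = {}
    for table in tables:
        columns = {}
        for col in table.get("columns", []):
            if "values" in col:
                columns[col["variableName"]] = col["values"]
        csv_data[table["filename"]] = columns
    return csv_data


# Made-up table, for illustration only.
tables = [
    {
        "filename": "example.paleo1measurement1.csv",
        "columns": [
            {"variableName": "depth", "values": [1, 2, 3]},
            {"variableName": "d18O", "values": [-4.1, -4.3, -3.9]},
        ],
    }
]
csv_data = collect_csv(tables)
```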

lipd.jsons.idx_name_to_num(d)¶: Switch from index-by-name to index-by-number. :param dict d: Metadata :return dict: Modified metadata

lipd.jsons.idx_num_to_name(d)¶: Switch from index-by-number to index-by-name. :param dict d: Metadata :return dict: Modified Metadata

lipd.jsons.read_json_from_file(filename)¶: Import the JSON data from target file. :param str filename: Target File :return dict: JSON data

lipd.jsons.read_jsonld()¶: Find the jsonld file in the cwd (or within 2 levels below the cwd), and load it in. :return dict: Jsonld data

lipd.jsons.remove_csv_from_json(d)¶: Remove all CSV data ‘values’ entries from paleoData table in the JSON structure. :param dict d: JSON data - old structure :return dict: Metadata dictionary without CSV values

lipd.jsons.write_json_to_file(json_data, filename='metadata')¶: Write all JSON in python dictionary to a new json file. :param dict json_data: JSON data :param str filename: Target filename (defaults to ‘metadata.jsonld’) :return None:

loggers¶

lipd.loggers.create_benchmark(name, log_file, level=20)¶: Creates a logger for function benchmark times :param str name: Name of the logger :param str log_file: Filename :return obj: Logger

lipd.loggers.create_logger(name)¶: Creates a logger with the below attributes. :param str name: Name of the logger :return obj: Logger

lipd.loggers.log_benchmark(fn, start, end)¶: Log a given function and how long the function takes in seconds :param str fn: Function name :param float start: Function start time :param float end: Function end time :return none:

lipd.loggers.update_changelog()¶: Create or update the changelog txt file. Prompt for update description. :return None:

lpd_noaa¶

class lipd.lpd_noaa.LPD_NOAA(dir_root, name, lipd_dict)¶

Bases: object

Creates a NOAA object that contains all the functions needed to write out a LiPD file as a NOAA text file. Supports LiPD Version: v1.2 NOAA txt template: v3.0

Return none:	Writes NOAA text to file in local storage

get_master()¶: Get the master json that has been modified :return dict: self.lipd_data

get_wdc_paleo_url()¶: When a NOAA file is created, it creates a URL link to where the dataset will be hosted in NOAA’s archive. Retrieve and add this link to the original LiPD file, so we can trace the dataset to NOAA. :return str:

main()¶: Load in the template file, and run through the parser :return none:

misc¶

lipd.misc.cast_float(x)¶: Attempt to cleanup string or convert to number value. :param any x: :return float:

lipd.misc.cast_int(x)¶: Cast unknown type into integer :param any x: :return int:

lipd.misc.cast_values_csvs(d, idx, x)¶: Attempt to cast string to float. If error, keep as a string. :param dict d: Data :param int idx: Index number :param str x: Data :return any:

lipd.misc.check_dsn(name, _json)¶: Get a dataSetName. If one is not provided, then insert the filename as the dataSetName. :param str name: Filename w/o extension :param dict _json: Metadata :return dict _json: Metadata

lipd.misc.clean_doi(doi_string)¶: Use regex to extract all DOI ids from string (i.e. 10.1029/2005pa001215) :param str doi_string: Raw DOI string value from input file. Often not properly formatted. :return list: DOI ids. May contain 0, 1, or multiple ids.
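
A rough sketch of regex-based DOI extraction, in the spirit of the description above. The pattern is an assumption chosen for illustration and is not the one lipd uses; `DOI_PATTERN` and `find_dois` are hypothetical names.

```python
import re

# Assumed pattern: "10.", a 4-9 digit registrant code, "/", then a suffix
# that runs until whitespace or a common delimiter.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"'<>,;]+")


def find_dois(text):
    """Return every DOI-like id found in the text (possibly none)."""
    # Strip a trailing period, which usually belongs to the sentence.
    return [match.rstrip(".") for match in DOI_PATTERN.findall(text)]


ids = find_dois("doi:10.1029/2005pa001215")
```

As in the description, the result may hold zero, one, or several ids, so callers should handle an empty list.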

lipd.misc.fix_coordinate_decimal(d)¶: Coordinate decimal degrees calculated by an excel formula are often too long as a repeating decimal. Round them down to 5 decimals :param dict d: Metadata :return dict d: Metadata

lipd.misc.generate_timestamp(fmt=None)¶: Generate a timestamp to mark when this file was last modified. :param str fmt: Special format instructions :return str: YYYY-MM-DD format, or specified format

lipd.misc.generate_tsid(size=8)¶: Generate a TSid string. Use the “PYT” prefix for traceability, followed by 8 trailing generated characters (e.g. PYT9AG234GS). :return:
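
An id of this shape could be produced as in the sketch below. The character set (uppercase letters and digits) is inferred from the single example in the description and is an assumption; `make_tsid` is a hypothetical name.

```python
import random
import string


def make_tsid(size=8, prefix="PYT"):
    """Return the prefix followed by `size` random characters."""
    # Assumption: ids draw from uppercase letters and digits only.
    alphabet = string.ascii_uppercase + string.digits
    return prefix + "".join(random.choice(alphabet) for _ in range(size))


tsid = make_tsid()
```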

lipd.misc.get_appended_name(name, columns)¶: Append numbers to a name until it no longer conflicts with the other names in a column. Necessary to avoid overwriting columns and losing data. Loop a preset amount of times to avoid an infinite loop. There shouldn’t ever be more than two or three identical variable names in a table. :param str name: Variable name in question :param dict columns: Columns listed by variable name :return str: Appended variable name
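
The bounded renaming loop described above might look like the following sketch. The suffix format ("-1", "-2", ...) and the retry limit are assumptions, and `appended_name` is a hypothetical stand-in rather than the library function.

```python
def appended_name(name, columns, max_tries=15):
    """Return `name`, or `name` plus a numeric suffix that is not in use."""
    if name not in columns:
        return name
    # Bounded loop so that a bug can never cause an infinite search.
    for n in range(1, max_tries + 1):
        candidate = "{}-{}".format(name, n)
        if candidate not in columns:
            return candidate
    raise ValueError("No free name found for " + name)


columns = {"depth": {}, "depth-1": {}}
new_name = appended_name("depth", columns)
```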

lipd.misc.get_authors_as_str(x)¶: Take author or investigator data, and convert it to a concatenated string of names. Author data structure has a few variations, so account for all. :param any x: Author data :return str: Author string

lipd.misc.get_dsn(d)¶: Get the dataset name from a record :param dict d: Metadata :return str: Dataset name

lipd.misc.get_ensemble_counts(d)¶: Determine if this is a 1 or 2 column ensemble. Then determine how many columns and rows it has. :param d: :return:

lipd.misc.get_missing_value_key(d)¶: Get the Missing Value entry from a table of data. If none is found, try the columns. If still none found, prompt user. :param dict d: Table of data :return str: Missing Value

lipd.misc.get_table_key(key, d, fallback='')¶: Try to get a table name from a data table :param str key: Key to try first :param dict d: Data table :param str fallback: (optional) If we don’t find a table name, use this as a generic name fallback. :return str: Data table name

lipd.misc.get_variable_name_col(d)¶: Get the variable name from a table or column :param dict d: Metadata :return str:

lipd.misc.is_ensemble(d)¶: Check if a table of data is an ensemble table. Is the first values index a list? ensemble. Int/float? not ensemble. :param dict d: Table data :return bool: Ensemble or not ensemble

lipd.misc.load_fn_matches_ext(file_path, file_type)¶: Check that the file extension matches the target extension given. :param str file_path: Path to be checked :param str file_type: Target extension :return bool:

lipd.misc.match_arr_lengths(l)¶: Check that all the array lengths match so that a DataFrame can be created successfully. :param list l: Nested arrays :return bool: Valid or invalid
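
A length check of this kind can be written in one line. This is a sketch only; `lengths_match` is a hypothetical name, and treating an empty input as valid is an assumption.

```python
def lengths_match(arrays):
    """Return True when every nested array has the same length."""
    return len({len(a) for a in arrays}) <= 1
```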

lipd.misc.match_operators(inp, relate, cut)¶: Compare two items. Match a string operator to an operator function :param str inp: Comparison item :param str relate: Comparison operator :param any cut: Comparison item :return bool: Comparison truth

lipd.misc.mv_files(src, dst)¶: Move all files from one directory to another :param str src: Source directory :param str dst: Destination directory :return none:

lipd.misc.normalize_name(s)¶: Remove foreign accents and characters to normalize the string. Prevents encoding errors. :param str s: :return str:

lipd.misc.path_type(path, target)¶: Determine if given path is file, directory, or other. Compare with target to see if it’s the type we wanted. :param str path: Path :param str target: Target type wanted :return bool:

lipd.misc.prompt_protocol()¶: Prompt user if they would like to save pickle file as a dictionary or an object. :return str: Answer

lipd.misc.put_tsids(x)¶: Recursively add in TSids into any columns that do not have them. Look for “columns” keys, and then start looping and adding generated TSids to each column :param any x: Recursive, so could be any data type. :return any x: Recursive, so could be any data type.

lipd.misc.rm_empty_doi(d)¶: If an “identifier” dictionary has no doi ID, then it has no use. Delete it. :param dict d: JSON Metadata :return dict: JSON Metadata

lipd.misc.rm_empty_fields(x)¶: Go through N number of nested data types and remove all empty entries. Recursion :param any x: Dictionary, List, or String of data :return any: Returns a same data type as original, but without empties.
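
The recursive clean-up described above can be sketched as follows. What counts as "empty" is an assumption here (None, empty strings, empty lists, empty dictionaries; zero and False are kept), and `remove_empties` and `is_empty` are hypothetical helpers rather than lipd's code.

```python
def is_empty(value):
    """Treat None and zero-length strings, lists and dicts as empty."""
    return value is None or (
        isinstance(value, (str, list, dict)) and len(value) == 0
    )


def remove_empties(x):
    """Return a copy of x with empty entries removed at every depth."""
    if isinstance(x, dict):
        cleaned = {k: remove_empties(v) for k, v in x.items()}
        return {k: v for k, v in cleaned.items() if not is_empty(v)}
    if isinstance(x, list):
        cleaned = [remove_empties(v) for v in x]
        return [v for v in cleaned if not is_empty(v)]
    if isinstance(x, str):
        return x.strip()  # whitespace-only strings become empty
    return x


result = remove_empties({"a": "", "b": {"c": [], "d": 0}, "e": [" ", "x", {}]})
```

Children are cleaned before the parent is tested, so a container that only held empty values is removed as well.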

lipd.misc.rm_files(path, extension)¶: Remove all files in the given directory with the given extension :param str path: Directory :param str extension: File type to remove :return none:

lipd.misc.rm_keys_from_dict(d, keys)¶: Given a dictionary and a key list, remove any data in the dictionary with the given keys. :param dict d: Data :param list keys: List of key data to remove :return dict d: Data (with keys + data removed)

lipd.misc.rm_missing_values_table(d)¶: Loop for each table column and remove the missingValue key & data :param dict d: Table data :return dict d: Table data

lipd.misc.rm_values_fields(x)¶: (recursive) Remove all “values” fields from the metadata :param any x: Any data type :return dict: metadata without “values”

lipd.misc.split_path_and_file(s)¶: Given a full path to a file, split and return a path and filename :param str s: Full path :return str str: Path, filename

lipd.misc.unwrap_arrays(l)¶: Unwrap nested lists to be one “flat” list of lists. Mainly for prepping ensemble data for DataFrame() creation :param list l: Nested lists :return list: Flattened lists

noaa¶

lipd.noaa.lpd_to_noaa(obj)¶: Convert a LiPD format to NOAA format :param obj obj: LiPD object :return obj: LiPD object (modified)

lipd.noaa.noaa_prompt()¶: Convert between NOAA and LiPD file formats. :return:

lipd.noaa.noaa_to_lpd(files)¶: Convert NOAA format to LiPD format :param dict files: Files metadata :return None:

noaa_lpd¶

class lipd.noaa_lpd.NOAA_LPD(dir_root, dir_tmp, name)¶

Bases: object

main()¶: Convert a NOAA text file into a LiPD file. CSV files will be created if chronology or data sections are available. :return dict: Metadata Dictionary

regexes¶

timeseries¶

lipd.timeseries.collapse(l)¶: LiPD Version 1.3 Main function to initiate time series to LiPD conversion :param list l: Time series :return dict _master: LiPD data, sorted by dataset name

lipd.timeseries.extract(d, chron)¶: LiPD Version 1.3 Main function to initiate LiPD to TSOs conversion. :param dict d: Metadata for one LiPD file :param bool chron: Paleo mode (default) or Chron mode :return list _ts: Time series

lipd.timeseries.get_matches(expr_lst, ts)¶: Get a list of TimeSeries objects that match the given expression. :param list expr_lst: Expression :param list ts: TimeSeries :return list new_ts: Matched time series objects :return list idxs: Indices of matched objects

lipd.timeseries.mode_ts(ec, ts=None, b=None)¶: Get string for the mode :param bool b: Chron boolean (for extract) :param str ec: extract or collapse :param list ts: Time series (for collapse) :return str phrase: Phrase

lipd.timeseries.translate_expression(expression)¶: Check if the expression is valid, then turn it into an expression that can be used for filtering. :return list of lists: One or more matches. Each list has 3 strings.

validator_api¶

lipd.validator_api.create_detailed_results(data)¶

lipd.validator_api.display_results(data, detailed=False)¶: Display the results from the validator in a brief or detailed way. :param dict data: Results, sorted by dataset name :param bool detailed: Detailed results on or off :return none:

lipd.validator_api.get_validator_format(data_json, data_csv, filenames)¶

Format the LiPD data in the layout that the Lipd.net validator accepts. Example of one _file metadata entry; _file_list will contain one or more _file entries.

_file = {
    "type": "bagit/json/csv",
    "filenameFull": "/path/to/filename.txt",
    "filenameShort": "filename.txt",
    "data": "",
    "pretty": ""
}

Parameters:	data_json (dict) – Metadata
	data_csv (dict) – CSV data
	filenames (list) – All files found in LiPD archive
Return list:	Validator-formatted data

lipd.validator_api.get_validator_results(data)¶: Send LiPD data to the Lipd.net validator and get the results back. :param data: :return:

versions¶

lipd.versions.get_lipd_version(d)¶: Check what version of LiPD this file is using. If none is found, assume it’s using version 1.0 :param dict d: Metadata :return float:

lipd.versions.update_lipd_v1_1(d)¶

Update LiPD v1.0 to v1.1
- chronData entry is a list that allows multiple tables
- paleoData entry is a list that allows multiple tables
- chronData now allows measurement, model, summary, modelTable, ensemble, calibratedAges tables
- Added ‘lipdVersion’ key

Parameters:	d (dict) – Metadata v1.0
Return dict d:	Metadata v1.1

lipd.versions.update_lipd_v1_2(d)¶

Update LiPD v1.1 to v1.2
- Added NOAA compatible keys : maxYear, minYear, originalDataURL, WDCPaleoURL, etc
- ‘calibratedAges’ key is now ‘distribution’
- paleoData structure mirrors chronData. Allows measurement, model, summary, modelTable, ensemble, distribution tables

Parameters:	d (dict) – Metadata v1.1
Return dict d:	Metadata v1.2

lipd.versions.update_lipd_v1_3(d)¶: Update LiPD v1.2 to v1.3
- Added ‘createdBy’ key
- Top-level folder inside LiPD archives are named “bag”. (No longer <datasetname>)
- .jsonld file is now generically named ‘metadata.jsonld’ (No longer <datasetname>.lpd )
- All “paleo” and “chron” prefixes are removed from “paleoMeasurementTable”, “paleoModel”, etc.
- Merge isotopeInterpretation and climateInterpretation into “interpretation” block
- ensemble table entry is a list that allows multiple tables
- summary table entry is a list that allows multiple tables
:param dict d: Metadata v1.2 :return dict d: Metadata v1.3

lipd.versions.update_lipd_v1_3_names(d)¶: Update the key names and merge interpretation data :param dict d: Metadata :return dict d: Metadata

lipd.versions.update_lipd_v1_3_structure(d)¶: Update the structure for summary and ensemble tables :param dict d: Metadata :return dict d: Metadata

lipd.versions.update_lipd_version(d)¶

Metadata is indexed by number at this step.

Use the current version number to determine where to start updating from. Use “chain versioning” to make it modular. If a file is a few versions behind, convert to EACH version until reaching current. If a file is one version behind, it will only convert once to the newest. :param dict d: Metadata :return dict d: Metadata
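
The "chain versioning" idea can be sketched without lipd. The step functions below are placeholders that only bump the version number; the real updaters restructure the metadata, and every name in this block is hypothetical.

```python
def to_v1_1(d):
    d = dict(d)  # work on a copy; real structural changes are omitted
    d["lipdVersion"] = 1.1
    return d


def to_v1_2(d):
    d = dict(d)
    d["lipdVersion"] = 1.2
    return d


def to_v1_3(d):
    d = dict(d)
    d["lipdVersion"] = 1.3
    return d


# Each step is paired with the version it upgrades from, oldest first.
STEPS = [(1.0, to_v1_1), (1.1, to_v1_2), (1.2, to_v1_3)]


def update_to_latest(d):
    """Apply every step at or above the record's version, in order."""
    # A record with no version key is assumed to be v1.0.
    version = float(d.get("lipdVersion", 1.0))
    for from_version, step in STEPS:
        if version <= from_version:
            d = step(d)
    return d


latest = update_to_latest({"dataSetName": "example"})
```

Because each step only knows how to move up one version, supporting a future version means appending one more entry to the list.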

zips¶

lipd.zips.unzipper(filename, dir_tmp)¶: Unzip .lpd file contents to tmp directory. :param str filename: filename.lpd :param str dir_tmp: Tmp folder to extract contents to :return None:

lipd.zips.zipper(root_dir='', name='', path_name_ext='')¶: Zips up directory back to the original location :param str root_dir: Root directory of the archive :param str name: <datasetname>.lpd :param str path_name_ext: /path/to/filename.lpd
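
Since a .lpd file is a zipped archive, the zip and unzip steps can be sketched with the standard zipfile module. This is not the lipd implementation: `zip_dir` and `unzip_to` are hypothetical names, and the sketch assumes the output archive is written outside the directory being zipped.

```python
import os
import zipfile


def zip_dir(root_dir, path_name_ext):
    """Zip everything under root_dir into the archive at path_name_ext."""
    with zipfile.ZipFile(path_name_ext, "w", zipfile.ZIP_DEFLATED) as zf:
        for folder, _subdirs, files in os.walk(root_dir):
            for fname in files:
                full = os.path.join(folder, fname)
                # Store paths relative to root_dir so the archive is portable.
                zf.write(full, os.path.relpath(full, root_dir))


def unzip_to(filename, dir_tmp):
    """Extract the archive's contents into dir_tmp."""
    with zipfile.ZipFile(filename) as zf:
        zf.extractall(dir_tmp)
```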