Summary.Rmd
LiPD datasets can contain a lot of data and metadata, and it’s often useful to get a quick summary of one or more datasets that you’ve loaded into R. Let’s load in a file from lipdverse.org and see how this works.
Let’s first grab a single LiPD dataset: the Moose Lake dataset from Clegg et al. (2010), which is part of the PAGES2k Temperature compilation.
library(lipdR)
L <- readLipd("https://lipdverse.org/data/MD6jkgwSxsq0oilgYUjM/1_0_0//Arc-MooseLake.Clegg.2010-ensemble.lpd")
First, let’s get a quick look using the print() function, or by just returning the object:
L
#> ################################################################################################
#> Arc-MooseLake.Clegg.2010
#> MD6jkgwSxsq0oilgYUjM
#> v.1.0.0
#> ################################################################################################
#>
#> ### Archive Type ###
#> lake sediment
#>
#> ### Geographic Metadata ###
#> Moose Lake (61.35N, -143.6E), 437 masl
#>
#> ### Publications (printing first 2 citations of 2 total) ###
#>
#> Clegg, Benjamin F.;Clarke, Gina H.;Chipman, Melissa L.;Chou, Michael;Walker, Ian R.;Tinner, Willy;Hu, Feng Sheng (2010). Six millennia of summer temperature variation based on midge analysis of lake sediments from Alaska. Quaternary Science Reviews. 10.1016/j.quascirev.2010.08.001.
#>
#> Clegg, B.F. World Data Center for Paleoclimatology.
#>
#> ### Paleo Data ###
#>
#> Summary data for object 1, Measurement Table 1 of 1:
#> Measurement table contains 63 observations of 2 variables
#>
#>
#> ### Chron Data ###
#>
#> Summary data for object 1, Measurement Table 1 of 1:
#> Measurement table contains 11 observations of 6 variables
#>
#> 1 model(s) found
#>
#> Model 1 contains: ensembleTable method
#>
#> Model algorithm: Bacon2.2
#> Model contains 1 ensemble table(s)
#> dimensions of ensemble table 1 of 1 : 85 x 6297
This is handy, as we can quickly see…
To get more detail about the dataset, let’s use summary():
summary(L)
#> ################################################################################################
#> Arc-MooseLake.Clegg.2010
#> MD6jkgwSxsq0oilgYUjM
#> v.1.0.0
#> ################################################################################################
#>
#> ### Archive Type ###
#> lake sediment
#>
#> ### Geographic Metadata ###
#> Moose Lake (61.35N, -143.6E), 437 masl
#>
#> ### Publications (printing first 2 citations of 2 total) ###
#>
#> Clegg, Benjamin F.;Clarke, Gina H.;Chipman, Melissa L.;Chou, Michael;Walker, Ian R.;Tinner, Willy;Hu, Feng Sheng (2010). Six millennia of summer temperature variation based on midge analysis of lake sediments from Alaska. Quaternary Science Reviews. 10.1016/j.quascirev.2010.08.001.
#>
#> Clegg, B.F. World Data Center for Paleoclimatology.
#>
#> ### Paleo Data ###
#>
#> Summary data for object 1, Measurement Table 1 of 1:
#> Measurement table contains 63 observations of 2 variables
#>
#> # A tibble: 5 × 3
#> ` ` temperature year
#> <chr> <chr> <chr>
#> 1 units degC AD
#> 2 description NA Year AD
#> 3 min 12.82 -718.15
#> 4 median 13.7 1297.9
#> 5 max 14.29 1963
#>
#> ### Chron Data ###
#>
#> Summary data for object 1, Measurement Table 1 of 1:
#> Measurement table contains 11 observations of 6 variables
#>
#> # A tibble: 4 × 7
#> ` ` labID age ageUncertainty depth age14C age14CUncertainty
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 units unitless cal yr cal yr cm 14C yr 14C yr
#> 2 min 131673a -50 1 0 320 35
#> 3 median 80799a 7 3 51.5 1730 40
#> 4 max NaN 64 5 168.5 5250 50
#> 1 model(s) found
#>
#> Model 1 contains: ensembleTable method
#>
#> Model algorithm: Bacon2.2
#> Model contains 1 ensemble table(s)
#> dimensions of ensemble table 1 of 1 : 85 x 6297
In addition to the information printed by print(), summary() includes tables with basic metadata about each of the columns in the measurement tables. In the Moose Lake dataset’s paleoData measurement table, we see two variables, temperature and year, and the table shows the units, description, and some stats about the contents. Similarly, we get information about the radiocarbon dates included in the chronData measurement table.
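To make the min, median, and max rows of these tables concrete, the sketch below recomputes them in base R. It uses a hypothetical three-row toy table built from the values printed above, not the real 63-observation table, and it is only an illustration of the statistics, not a description of how summary() is implemented in lipdR.

```r
# Toy stand-in for the paleoData measurement table (hypothetical: three values
# per column, copied from the min/median/max rows printed by summary(L))
toyTable <- data.frame(
  temperature = c(12.82, 13.7, 14.29),
  year = c(-718.15, 1297.9, 1963)
)

# Compute the same three statistics for every column
colStats <- sapply(toyTable, function(x) {
  c(min = min(x), median = median(x), max = max(x))
})

colStats
```

The result is a matrix with one column per variable and one row per statistic, which mirrors the layout of the tibble printed by summary().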
LiPD datasets are often most powerful when many datasets are analyzed together, and it’s common to load multiple datasets into a “multi_lipd” object. It’s nice to get useful summaries for these data too. Let’s explore this functionality using a diverse collection of LiPD files pulled at random from around the LiPDverse.
D <- readLipd("http://lipdverse.org/testData/testData.zip")
Once again, we can print() for an overview:
D
#> Multi LiPD contains 47 LiPD files.
#>
#> ### Archive Types ###
#> 7 MarineSediment records
#> 4 Lake records
#> 4 GlacierIce records
#> 2 glacier ice records
#> 3 tree records
#> 1 lake sediment records
#> 13 LakeSediment records
#> 5 coral records
#> 2 Coral records
#> 2 Speleothem records
#> 4 Peat records
#>
#> ### Geographic Bounds ###
#> All sites located between -75.0025N to 80.7N and -162.13E to 146.8749E
This shows the number of datasets included, their archive types, and their geographic bounds.
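The archive type counts above list some types under more than one spelling (for example, "LakeSediment" and "lake sediment"). As a rough sketch, the counts can be merged in base R by ignoring case and spaces. The vector below is typed in by hand from the printed output, and the normalization rule is an assumption made for illustration, not something lipdR applies.

```r
# Counts copied by hand from the "Archive Types" section of the printed output
archiveCounts <- c(
  "MarineSediment" = 7, "Lake" = 4, "GlacierIce" = 4, "glacier ice" = 2,
  "tree" = 3, "lake sediment" = 1, "LakeSediment" = 13, "coral" = 5,
  "Coral" = 2, "Speleothem" = 2, "Peat" = 4
)

# Assumed normalization: drop spaces and lower-case the names
normalized <- tolower(gsub(" ", "", names(archiveCounts), fixed = TRUE))

# Sum the counts that share a normalized name
merged <- tapply(archiveCounts, normalized, sum)

sort(merged, decreasing = TRUE)
```

Note that this rule keeps "lake" and "lakesediment" as separate groups; whether they should be combined is a curation decision.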
We can also have a look at any individual dataset in the collection by subsetting the multi-lipd:
D$`Arc-Agassiz.Vinther.2008`
#> ################################################################################################
#> Arc-Agassiz.Vinther.2008
#> TlyrPuoYdCjvokqD1Jzn
#> v.1.0.0
#> ################################################################################################
#>
#> ### Archive Type ###
#> glacier ice
#>
#> ### Geographic Metadata ###
#> Agassiz (80.7N, -73.1E), 1700 masl
#>
#> ### Publications (printing first 2 citations of 2 total) ###
#>
#> Vinther, B. M.;Clausen, H. B.;Fisher, D. A.;Koerner, R. M.;Johnsen, S. J.;Andersen, K. K.;Dahl-Jensen, D.;Rasmussen, S. O.;Steffensen, J. P.;Svensson, A. M. (2008). Synchronizing ice cores from the Renland and Agassiz ice caps to the Greenland ice core chronology. Journal of Geophysical Research. 10.1029/2007JD009143.
#>
#> Vinther, B.M. World Data Center for Paleoclimatology.
#>
#> ### Paleo Data ###
#>
#> Summary data for object 1, Measurement Table 1 of 1:
#> Measurement table contains 1973 observations of 2 variables
#>
#>
#> ### Chron Data ###
#> Chron Data does not include measurement table
#>
#> 1 model(s) found
#>
#> Model 1 contains: ensembleTable method
#>
#> Model algorithm: BAM
#> Model contains 1 ensemble table(s)
#> dimensions of ensemble table 1 of 1 : 1973 x 1000
For more information on the multi-lipd, we can use summary():
summary(D)
#> Multi LiPD contains 47 LiPD files.
#>
#> ### Archive Types ###
#> 7 MarineSediment records
#> 4 Lake records
#> 4 GlacierIce records
#> 2 glacier ice records
#> 3 tree records
#> 1 lake sediment records
#> 13 LakeSediment records
#> 5 coral records
#> 2 Coral records
#> 2 Speleothem records
#> 4 Peat records
#>
#> ### Geographic Bounds ###
#> All sites located between -75.0025N to 80.7N and -162.13E to 146.8749E
#>
#> ### Measurement tables and models ###
#> *Age values gathered from PaleoData Object 1, Measurement Table 1
#> # A tibble: 47 × 8
#> Dataset `Archive Type` NumPalTabs NumChrTabs NumEns AgeMin AgeMax `Paleo Vars`
#> <chr> <chr> <dbl> <dbl> <dbl> <chr> <chr> <chr>
#> 1 165_1002C.Herbert.2000 MarineSediment 2 1 0 690 BP 164970 BP depth, age, temperature, C37.concentration, temperature.1, label, Uk37, depth, age, planktic.d18O, label
#> 2 AbantGolu.Bottema.1993 Lake 1 1 0 200 yr BP 13200 yr BP ageMin, ageMax, age, ageMedian, depth, temperature, temperatureUncertainty, temperature.1, temperatureUncertainty.1, temperature.2, temperatureUncertainty.2, temperature.3, temperatureUncertainty.3, precipitation, precipitationUncertainty, precipitation.1, precipitationUncertainty.1, temperature.4, temperatureUncertainty.4, precipitation.2, precipitationUncertainty.2, temperature.5, …
#> 3 Agassiz.Lecavalier.2017 GlacierIce 2 0 0 16 Calibrated 11366 Calibrated age, temperature, uncertaintyHigh, uncertaintyLow, ReliabIeYN1, Commentregardingreliability1, melt, meltUncertaintyHigh, meltUncertaintyLow, age, d18O, temperature, uncertaintyHigh, uncertaintyHigh.1
#> 4 Ant-SiteDML05.Graf.2002 glacier ice 1 0 0 1996 AD 166 AD d18O, year
#> 5 Arc-Agassiz.Vinther.2008 glacier ice 1 0 1 1972 AD 0 AD d18O, year
#> 6 Asi-MONG033.Jacoby.2010 tree 1 0 0 1997 AD 1550 AD year, trsgi
#> 7 Aus-DuckholeLake.Saunders.2013 lake sediment 1 0 0 2001.32 AD 1140.35 AD year, age, depth, waterContent, N, C, BSi, R650_700, R570_630, R660_670, RABD660_670, C_N
#> 8 big_round.Thomas.2010 LakeSediment 1 1 0 -56 BP 10185.66 BP depth, age, MS
#> 9 BlueMoundsCreek.Davis.1977Legacy Lake 1 1 0 -50 yr BP 12160 yr BP ageMin, ageMax, age, ageMedian, depth, temperature, temperatureUncertainty, temperature.1, temperatureUncertainty.1, temperature.2, temperatureUncertainty.2, temperature.3, temperatureUncertainty.3, precipitation, precipitationUncertainty, precipitation.1, precipitationUncertainty.1, temperature.4, temperatureUncertainty.4, precipitation.2, precipitationUncertainty.2, temperature.5, …
#> 10 BO14HTI01 coral 3 0 0 2005.04 AD 1977.37 AD d18O, year, SrCa, year, SrCaUncertainty, year
#> 11 CahabaPond.Delcourt.1983PaleoDIVER LakeSediment 1 0 0 -29 yr BP 13949 yr BP age, ageOld, ageYoung, lakeLevelMax, lakeLevelMin, lakeLevel, lakeLevelRelative
#> 12 CO03COPM Coral 12 0 2 1998.37 AD 928.13 AD d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, depth
#> 13 CO04LIRA Coral 3 1 2 1996.91 AD 1726.78 AD d18O, year, d18O, year, d18O, year
#> 14 CrowsnestLake.author.1111 Lake 1 1 0 60 yr BP 12840 yr BP ageMin, ageMax, age, ageMedian, depth, temperature, temperatureUncertainty, temperature.1, temperatureUncertainty.1, temperature.2, temperatureUncertainty.2, temperature.3, temperatureUncertainty.3, precipitation, precipitationUncertainty, precipitation.1, precipitationUncertainty.1, temperature.4, temperatureUncertainty.4, precipitation.2, precipitationUncertainty.2, temperature.5, …
#> 15 CuevaDiablo.Bernal.2011 Speleothem 1 1 0 672 AD -8920 AD d18O, year, age
#> 16 DE14DTO04 coral 1 0 0 2008.63 AD 1896.54 AD SrCa, year
#> 17 EDML.Stenni.2010 GlacierIce 1 0 0 1193 BP 12992 BP bagDepth, age, bagd18O, bagdD, bagDexcess, temperature, tempSource, tempNoElevCorrection, tempNoSourceCorrection
#> 18 Emerald.Shuman.2014 LakeSediment 1 1 0 1700 AD -9950 AD lakeLevel, year, age, lakeLevelHi, lakeLevelLo
#> 19 GeoB5901_2.Kim.2004 MarineSediment 1 1 0 1130 BP 9920 BP depth, ageDuplicate, ageOriginal, Uk37, temperature, ageMedianBacon, age
#> 20 Glattalp.Wirth.2013 LakeSediment 1 0 0 1960 AD -8099 AD floods, year, age
#> # ℹ 27 more rows
We also get a table that gives some basic information on each of the datasets included. A few options customize the output of summary(); here we choose the preferred time units and limit the number of datasets summarized to 20:
summary(D, print.length=20, time.units="AD")
#> Multi LiPD contains 47 LiPD files.
#>
#> ### Archive Types ###
#> 7 MarineSediment records
#> 4 Lake records
#> 4 GlacierIce records
#> 2 glacier ice records
#> 3 tree records
#> 1 lake sediment records
#> 13 LakeSediment records
#> 5 coral records
#> 2 Coral records
#> 2 Speleothem records
#> 4 Peat records
#>
#> ### Geographic Bounds ###
#> All sites located between -75.0025N to 80.7N and -162.13E to 146.8749E
#>
#> ### Measurement tables and models ###
#> *Age values gathered from PaleoData Object 1, Measurement Table 1
#> # A tibble: 47 × 8
#> Dataset `Archive Type` NumPalTabs NumChrTabs NumEns AgeMin AgeMax `Paleo Vars`
#> <chr> <chr> <dbl> <dbl> <dbl> <chr> <chr> <chr>
#> 1 165_1002C.Herbert.2000 MarineSediment 2 1 0 690 BP 164970 BP depth, age, temperature, C37.concentration, temperature.1, label, Uk37, depth, age, planktic.d18O, label
#> 2 AbantGolu.Bottema.1993 Lake 1 1 0 200 yr BP 13200 yr BP ageMin, ageMax, age, ageMedian, depth, temperature, temperatureUncertainty, temperature.1, temperatureUncertainty.1, temperature.2, temperatureUncertainty.2, temperature.3, temperatureUncertainty.3, precipitation, precipitationUncertainty, precipitation.1, precipitationUncertainty.1, temperature.4, temperatureUncertainty.4, precipitation.2, precipitationUncertainty.2, temperature.5, …
#> 3 Agassiz.Lecavalier.2017 GlacierIce 2 0 0 16 Calibrated 11366 Calibrated age, temperature, uncertaintyHigh, uncertaintyLow, ReliabIeYN1, Commentregardingreliability1, melt, meltUncertaintyHigh, meltUncertaintyLow, age, d18O, temperature, uncertaintyHigh, uncertaintyHigh.1
#> 4 Ant-SiteDML05.Graf.2002 glacier ice 1 0 0 1996 AD 166 AD d18O, year
#> 5 Arc-Agassiz.Vinther.2008 glacier ice 1 0 1 1972 AD 0 AD d18O, year
#> 6 Asi-MONG033.Jacoby.2010 tree 1 0 0 1997 AD 1550 AD year, trsgi
#> 7 Aus-DuckholeLake.Saunders.2013 lake sediment 1 0 0 2001.32 AD 1140.35 AD year, age, depth, waterContent, N, C, BSi, R650_700, R570_630, R660_670, RABD660_670, C_N
#> 8 big_round.Thomas.2010 LakeSediment 1 1 0 -56 BP 10185.66 BP depth, age, MS
#> 9 BlueMoundsCreek.Davis.1977Legacy Lake 1 1 0 -50 yr BP 12160 yr BP ageMin, ageMax, age, ageMedian, depth, temperature, temperatureUncertainty, temperature.1, temperatureUncertainty.1, temperature.2, temperatureUncertainty.2, temperature.3, temperatureUncertainty.3, precipitation, precipitationUncertainty, precipitation.1, precipitationUncertainty.1, temperature.4, temperatureUncertainty.4, precipitation.2, precipitationUncertainty.2, temperature.5, …
#> 10 BO14HTI01 coral 3 0 0 2005.04 AD 1977.37 AD d18O, year, SrCa, year, SrCaUncertainty, year
#> 11 CahabaPond.Delcourt.1983PaleoDIVER LakeSediment 1 0 0 -29 yr BP 13949 yr BP age, ageOld, ageYoung, lakeLevelMax, lakeLevelMin, lakeLevel, lakeLevelRelative
#> 12 CO03COPM Coral 12 0 2 1998.37 AD 928.13 AD d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, depth
#> 13 CO04LIRA Coral 3 1 2 1996.91 AD 1726.78 AD d18O, year, d18O, year, d18O, year
#> 14 CrowsnestLake.author.1111 Lake 1 1 0 60 yr BP 12840 yr BP ageMin, ageMax, age, ageMedian, depth, temperature, temperatureUncertainty, temperature.1, temperatureUncertainty.1, temperature.2, temperatureUncertainty.2, temperature.3, temperatureUncertainty.3, precipitation, precipitationUncertainty, precipitation.1, precipitationUncertainty.1, temperature.4, temperatureUncertainty.4, precipitation.2, precipitationUncertainty.2, temperature.5, …
#> 15 CuevaDiablo.Bernal.2011 Speleothem 1 1 0 672 AD -8920 AD d18O, year, age
#> 16 DE14DTO04 coral 1 0 0 2008.63 AD 1896.54 AD SrCa, year
#> 17 EDML.Stenni.2010 GlacierIce 1 0 0 1193 BP 12992 BP bagDepth, age, bagd18O, bagdD, bagDexcess, temperature, tempSource, tempNoElevCorrection, tempNoSourceCorrection
#> 18 Emerald.Shuman.2014 LakeSediment 1 1 0 1700 AD -9950 AD lakeLevel, year, age, lakeLevelHi, lakeLevelLo
#> 19 GeoB5901_2.Kim.2004 MarineSediment 1 1 0 1130 BP 9920 BP depth, ageDuplicate, ageOriginal, Uk37, temperature, ageMedianBacon, age
#> 20 Glattalp.Wirth.2013 LakeSediment 1 0 0 1960 AD -8099 AD floods, year, age
#> # ℹ 27 more rows
These options limit the printed table to 20 rows and give us time summary information in AD units where possible.
Sometimes it is useful to save the table so it can be sorted and investigated further. Simply assign the output to a variable:
MLSummary <- summary(D)
We can now explore this table as a standard data frame. Here we use dplyr tools to sort the datasets by their number of age ensembles:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
arrange(MLSummary,desc(NumEns))
#> # A tibble: 47 × 8
#> Dataset `Archive Type` NumPalTabs NumChrTabs NumEns AgeMin AgeMax `Paleo Vars`
#> <chr> <chr> <dbl> <dbl> <dbl> <chr> <chr> <chr>
#> 1 CO03COPM Coral 12 0 2 1998.37 AD 928.13 AD d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, d18O, year, depth
#> 2 CO04LIRA Coral 3 1 2 1996.91 AD 1726.78 AD d18O, year, d18O, year, d18O, year
#> 3 Arc-Agassiz.Vinther.2008 glacier ice 1 0 1 1972 AD 0 AD d18O, year
#> 4 IC09VICC GlacierIce 1 0 1 1967 AD 1242 AD d18O, year
#> 5 MS14MOLS MarineSediment 1 1 1 1186.64 AD 35.35 AD d18O, year, depth, d18O.1, d18O.2, d18O.3, Mg.Ca, temperature, d18O.4, Mg.Ca.1, temperature.1
#> 6 165_1002C.Herbert.2000 MarineSediment 2 1 0 690 BP 164970 BP depth, age, temperature, C37.concentration, temperature.1, label, Uk37, depth, age, planktic.d18O, label
#> 7 AbantGolu.Bottema.1993 Lake 1 1 0 200 yr BP 13200 yr BP ageMin, ageMax, age, ageMedian, depth, temperature, temperatureUncertainty, temperature.1, temperatureUncertainty.1, temperature.2, temperatureUncertainty.2, temperature.3, temperatureUncertainty.3, precipitation, precipitationUncertainty, precipitation.1, precipitationUncertainty.1, temperature.4, temperatureUncertainty.4, precipitation.2, precipitationUncertainty.2, temperature.5, temperatur…
#> 8 Agassiz.Lecavalier.2017 GlacierIce 2 0 0 16 Calibrated 11366 Calibrated age, temperature, uncertaintyHigh, uncertaintyLow, ReliabIeYN1, Commentregardingreliability1, melt, meltUncertaintyHigh, meltUncertaintyLow, age, d18O, temperature, uncertaintyHigh, uncertaintyHigh.1
#> 9 Ant-SiteDML05.Graf.2002 glacier ice 1 0 0 1996 AD 166 AD d18O, year
#> 10 Asi-MONG033.Jacoby.2010 tree 1 0 0 1997 AD 1550 AD year, trsgi
#> # ℹ 37 more rows
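Other dplyr verbs work on the saved table in the same way. The sketch below keeps only the datasets with at least one ensemble table. To stay self-contained it uses a small hypothetical stand-in, toySummary, with five rows copied from the output above; with the real object you would pass MLSummary instead.

```r
library(dplyr)

# Hypothetical stand-in for MLSummary: five rows copied from the printed output
toySummary <- tibble::tibble(
  Dataset = c("CO03COPM", "CO04LIRA", "Arc-Agassiz.Vinther.2008",
              "165_1002C.Herbert.2000", "Asi-MONG033.Jacoby.2010"),
  `Archive Type` = c("Coral", "Coral", "glacier ice", "MarineSediment", "tree"),
  NumPalTabs = c(12, 3, 1, 2, 1),
  NumChrTabs = c(0, 1, 0, 1, 0),
  NumEns = c(2, 2, 1, 0, 0)
)

# Keep datasets with at least one ensemble table, most ensembles first
withEns <- toySummary %>%
  filter(NumEns > 0) %>%
  arrange(desc(NumEns)) %>%
  select(Dataset, `Archive Type`, NumEns)

withEns
```

Column names that contain spaces, such as `Archive Type`, need backticks when referenced in dplyr calls.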
With a well-curated dataset (this example doesn’t quite fit those criteria), a few commands can get you pretty close to a publishable table.
Dataset | Archive Type | AgeMin | AgeMax |
---|---|---|---|
165_1002C.Herbert.2000 | MarineSediment | 690 BP | 164970 BP |
AbantGolu.Bottema.1993 | Lake | 200 yr BP | 13200 yr BP |
Agassiz.Lecavalier.2017 | GlacierIce | 16 Calibrated | 11366 Calibrated |
Ant-SiteDML05.Graf.2002 | glacier ice | 1996 AD | 166 AD |
Arc-Agassiz.Vinther.2008 | glacier ice | 1972 AD | 0 AD |
Asi-MONG033.Jacoby.2010 | tree | 1997 AD | 1550 AD |
Aus-DuckholeLake.Saunders.2013 | lake sediment | 2001.32 AD | 1140.35 AD |
big_round.Thomas.2010 | LakeSediment | -56 BP | 10185.66 BP |
BlueMoundsCreek.Davis.1977Legacy | Lake | -50 yr BP | 12160 yr BP |
BO14HTI01 | coral | 2005.04 AD | 1977.37 AD |
CahabaPond.Delcourt.1983PaleoDIVER | LakeSediment | -29 yr BP | 13949 yr BP |
CO03COPM | Coral | 1998.37 AD | 928.13 AD |
CO04LIRA | Coral | 1996.91 AD | 1726.78 AD |
CrowsnestLake.author.1111 | Lake | 60 yr BP | 12840 yr BP |
CuevaDiablo.Bernal.2011 | Speleothem | 672 AD | -8920 AD |
DE14DTO04 | coral | 2008.63 AD | 1896.54 AD |
EDML.Stenni.2010 | GlacierIce | 1193 BP | 12992 BP |
Emerald.Shuman.2014 | LakeSediment | 1700 AD | -9950 AD |
GeoB5901_2.Kim.2004 | MarineSediment | 1130 BP | 9920 BP |
Glattalp.Wirth.2013 | LakeSediment | 1960 AD | -8099 AD |
GRIP_accumulation.Vinther.2006 | GlacierIce | 1989 AD | -246813 AD |
Hoqcave.VanRampelbergh.2013 | Speleothem | -45.63 BP | 9249.4 BP |
IC09VICC | GlacierIce | 1967 AD | 1242 AD |
IndianPrairieFen.Whitlock.1995 | LakeSediment | 362.8 BP | 16023.9 BP |
Karujarv.EPD | LakeSediment | 484.76 BP | 8373.49 BP |
KI14PAR01 | coral | 1669 AD | 1469 AD |
LeVernydesBrulons.JouffroyBapicot.2010 | Peat | -90 yr BP | 13840 yr BP |
LS05ANJE | LakeSediment | 2002 unitless | -5606 unitless |
LS13LUBA | LakeSediment | 1553 AD | -6744 AD |
MD95_2039 | MarineSediment | 0 BP | 50568 BP |
MottaNaluns.vanderKnaap.1997 | Peat | 330 yr BP | 11350 yr BP |
MS14MOLS | MarineSediment | 1186.64 AD | 35.35 AD |
NAm-SnakeCreek.Kenigsberg.2013 | tree | 1999 AD | 1645 AD |
NikolayLake.Andreev.2004 | LakeSediment | 95 Calibrated | 10210 Calibrated |
Ocn-TurrumoteReefPuertoRico.Kilbourne.2008 | coral | 2004 AD | 1751 AD |
ParkPond2.Lynch.1998 | LakeSediment | 0 yr 14C BP | 12131 yr 14C BP |
POS362_2_33 | MarineSediment | 630 BP | 9560 BP |
RanViken.EPD | LakeSediment | 265.22 BP | 11599.91 BP |
Rugozero.Elina.1981 | Peat | 390 yr BP | 10910 yr BP |
SAm-CentralAndes6.Villalba.2014 | tree | 2006 AD | 1435 AD |
SellediCarnino.deBeaulieu.1977 | Peat | 310 yr BP | 16570 yr BP |
SS49.Perren.2012 | LakeSediment | 1780.25 AD | -7583.77 AD |
StellaLake.Reinemann.2009 | LakeSediment | 0 BP | 6678.09 BP |
SV_04 | MarineSediment | 726 BP | 10486 BP |
TR163_22.Lea.2006 | MarineSediment | 1137.5 BP | 135100 BP |
TriangleLakeBog.Jensen.2021 | Lake | 8880 yr BP | 17160 yr BP |
ZI04IFR01 | coral | 1995.62 AD | 1659.62 AD |
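The commands that produced the table above are not shown, so the following is only a sketch of one way to build a similar table. It assumes knitr::kable() as the formatter (the original may have used a different package) and a hypothetical three-row stand-in, toySummary, copied from the summary output; with the real data you would start from MLSummary.

```r
library(dplyr)

# Hypothetical stand-in for MLSummary: three rows copied from the summary output
toySummary <- tibble::tibble(
  Dataset = c("165_1002C.Herbert.2000", "AbantGolu.Bottema.1993",
              "Ant-SiteDML05.Graf.2002"),
  `Archive Type` = c("MarineSediment", "Lake", "glacier ice"),
  NumEns = c(0, 0, 0),
  AgeMin = c("690 BP", "200 yr BP", "1996 AD"),
  AgeMax = c("164970 BP", "13200 yr BP", "166 AD")
)

# Select the columns of interest and render them as a pipe-delimited table
pubTable <- toySummary %>%
  select(Dataset, `Archive Type`, AgeMin, AgeMax) %>%
  knitr::kable(format = "pipe")

pubTable
```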
Typically, when we load multiple LiPD datasets, it’s convenient to convert them into timeseries objects, as this greatly simplifies analyzing the data. There are print and summary methods for these objects too. There are a couple of different representations of these objects available in R: as a list (lipd_ts) or as a nested tibble (lipd_ts_tibble). If you’re used to working with tibbles and the tidyverse you’ll probably prefer the latter, so we’ll do that here:
tibTS <- as.lipdTsTibble(D)
Let’s quickly check the size of this tibble:
dim(tibTS)
#> [1] 583 382
Wow - that’s pretty big. That means that there are 583 variables across all our datasets, and 382 data or metadata fields present in one or more of the datasets.
Again, let’s print out a quick overview:
tibTS
#> LiPD TS object containing 47 datasets
#> 583 variables and 382 data and metadata fields.
#>
#> TS object contains a mixture of age (BP) and year (AD) units.
#> 139 variables contain year (AD), while 486 contain age (BP).
#>
#> Can not provide interval of time overlap, units not standardized.
Here we see the number of datasets, variables, and data/metadata fields. We also get the time units of the variables and, where available, the time interval common to all of them. These data have not been curated, so no single time unit or time interval is common to all variables.
Beyond this quick overview, we can get a lot more information using summary(). This works the same for lipd_ts and lipd_ts_tibble objects. Let’s limit the table to show details of the first 10 rows only:
summary(tibTS, print.length = 10)
#> LiPD TS object containing 47 datasets
#> 583 variables and 382 data and metadata fields.
#>
#> TS object contains a mixture of age (BP) and year (AD) units.
#> 139 variables contain year (AD), while 486 contain age (BP).
#>
#> Can not provide interval of time overlap, units not standardized.
#>
#> # A tibble: 69 × 2
#> paleoData_variableName varName_freq
#> <chr> <int>
#> 1 temperature 90
#> 2 temperatureUncertainty 64
#> 3 precipitation 44
#> 4 precipitationUncertainty 40
#> 5 d18O 35
#> 6 (Other) 21
#> 7 trsgi 7
#> 8 d13C 4
#> 9 SrCa 4
#> 10 uncertaintyHigh 4
#> # ℹ 59 more rows
In addition to the overview provided by print(), summary() gives us a table that shows which variables are most common in the TS object. We can explore this further by specifying which metadata fields we’d like to summarize using the add.variable parameter.
summary(tibTS,
print.length = 10,
add.variable = c("archiveType", "paleoData_meanValue12k","geo_continent", "geo_elevation", "agesPerKyr", "dataSetName"))
#> LiPD TS object containing 47 datasets
#> 583 variables and 382 data and metadata fields.
#>
#> TS object contains a mixture of age (BP) and year (AD) units.
#> 139 variables contain year (AD), while 486 contain age (BP).
#>
#> Can not provide interval of time overlap, units not standardized.
#>
#> paleoData_meanValue12k min: -352.1623 25%: 1.679 median: 7.9232 75%: 168.836 max: 349906
#>
#> geo_elevation min: -18.3 25%: -7 median: -3.6 75%: 1350 max: 3486
#>
#> agesPerKyr min: 0 25%: 0 median: 0 75%: 0.3933 max: 2.1459
#>
#> # A tibble: 69 × 8
#> paleoData_variableName varName_freq archiveType archiveType_freq geo_continent geo_continent_freq dataSetName dataSetName_freq
#> <chr> <int> <chr> <int> <chr> <int> <chr> <int>
#> 1 temperature 90 Lake 124 NA's 335 Hoqcave.VanRampelbergh.2013 78
#> 2 temperatureUncertainty 64 Peat 124 Europe 155 AbantGolu.Bottema.1993 31
#> 3 precipitation 44 Speleothem 81 Eastern North America 62 BlueMoundsCreek.Davis.1977Legacy 31
#> 4 precipitationUncertainty 40 LakeSediment 71 Western North America 31 CrowsnestLake.author.1111 31
#> 5 d18O 35 MarineSediment 57 NA NA LeVernydesBrulons.JouffroyBapicot.2010 31
#> 6 (Other) 21 Coral 31 NA NA MottaNaluns.vanderKnaap.1997 31
#> 7 trsgi 7 GlacierIce 30 NA NA Rugozero.Elina.1981 31
#> 8 d13C 4 coral 28 NA NA SellediCarnino.deBeaulieu.1977 31
#> 9 SrCa 4 tree 21 NA NA TriangleLakeBog.Jensen.2021 31
#> 10 uncertaintyHigh 4 lake sediment 12 NA NA CO03COPM 25
#> # ℹ 59 more rows
Now we get the same summary information for those six additional fields. The character or categorical data (archiveType, geo_continent, and dataSetName) are added to the count table. The numeric data (paleoData_meanValue12k, geo_elevation, and agesPerKyr) are summarized as quantiles above the table.
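This summary information is often a great first step towards informing how you will filter or subset your TS object for future analysis. For example, let’s grab only the variables that have a geo_latitude between 30 and 50 and a geo_longitude between -125 and -70, a maxYear greater than 11700 and a minYear less than 8000, and a non-missing interpretation1_variable: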
library(magrittr) #We'll use the pipe for clarity
subsetTs <- tibTS %>%
filter(between(geo_latitude,30,50),
between(geo_longitude,-125,-70),
maxYear > 11700,
minYear < 8000,
!is.na(interpretation1_variable))
Great - we found 29 timeseries that meet these criteria. Let’s explore their interpretations in a bit more detail by saving the table output.
sumTable <- summary(subsetTs,
add.variable = c("interpretation1_variable",
"interpretation1_variableDetail",
"interpretation1_seasonality"))
#> LiPD TS object containing 3 datasets
#> 29 variables and 382 data and metadata fields.
#>
#> All variables contain age (BP) data, 1 variables contain year (AD) data.
#>
#> All variables contain data in the interval 11900 (BP) to 250 (BP).
#>
#> # A tibble: 6 × 8
#> paleoData_variableName varName_freq interpretation1_variable interpretation1_variable_freq interpretation1_variableDetail interpretation1_variableDetail_freq interpretation1_seasonality interpretation1_seasonality_freq
#> <chr> <int> <chr> <int> <chr> <int> <chr> <int>
#> 1 temperature 8 T 16 air@surface 16 Annual 20
#> 2 temperatureUncertainty 8 P 10 precipitation@surface 10 July 8
#> 3 precipitation 5 P-E 3 LakeLevel@surface 3 1,2,3,4,5,6,7,8,9,10,11,12 1
#> 4 precipitationUncertainty 5 NA NA NA NA NA NA
#> 5 lakeLevel 2 NA NA NA NA NA NA
#> 6 lakeLevelRelative 1 NA NA NA NA NA NA