tabula's Introduction

tesselle

Project Status: Active – The project has reached a stable, usable state and is being actively developed.

Overview

The tesselle suite is a collection of packages for research and teaching in archaeology. These packages focus on quantitative analysis methods developed for archaeology. The tesselle packages are designed to work seamlessly together and to complement general-purpose and other specialized statistical packages. These packages can be used to explore and analyze common data types in archaeology: count data, compositional data and chronological data.

The tesselle package is designed to make it easy to install and load key packages from the tesselle suite in a single step.

To cite tesselle in publications use:

  Frerebeau N (2024). _tesselle: Easily Install and Load 'tesselle'
  Packages_. Université Bordeaux Montaigne, Pessac, France.
  doi:10.5281/zenodo.6500491 <https://doi.org/10.5281/zenodo.6500491>,
  R package version 1.5.0, <https://packages.tesselle.org/tesselle/>.

This package is a part of the tesselle project
<https://www.tesselle.org>.

Installation

You can install the released version of tesselle from CRAN with:

install.packages("tesselle")

And the development version from GitHub with:

# install.packages("remotes")
remotes::install_github("tesselle/tesselle")

Usage

library(tesselle) will load the core packages:

  • tabula: analysis and visualization of archaeological count data;
  • kairos: analysis of chronological patterns from archaeological count data;
  • nexus: analysis of compositional data;

And two companion packages:

  • dimensio: multivariate data analysis;
  • isopleuros: ternary plots.

library(tesselle)
#> --- Attaching packages -------------------------------------- tesselle 1.5.0 ---
#> * dimensio    0.6.0
#> * isopleuros  1.2.0
#> * kairos      2.1.0
#> * nexus       0.2.0
#> * tabula      3.0.1

Contributing

Please note that the tesselle project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

tabula's People

Contributors

benmarwick, nfrerebeau, soodoku

tabula's Issues

JOSS

Hey @nfrerebeau: I have created a few issues. When you get a chance, take a look. The key thing that is missing in the paper:

"State of the field: Do the authors describe how this software compares to other commonly-used packages?" Specifically, can you speak to all the other packages tailored toward ecologists etc. like
https://github.com/vegandevs/vegan (I don't know as much about the field)

For the package, the key concern is the re-implementation of classic but simple algorithms, like Shannon diversity. It may be useful to import robust implementations instead. I don't know if they exist, but I suspect that they do.

The key thing missing from the package: whenever statistical tests are done, there is insufficient detail in the man pages about the assumptions and what precisely happened. There are some citations, but it would be useful to describe in more detail what was done and to return an object that has, for instance, all the estimates from each of the bootstrapped samples.

Please disagree with whatever doesn't make sense. I don't want you to do anything that is not adding value. My aim is just to help make the package better.

R version 3.6.3 (2020-02-29)

I commonly used Tabula on "R version 3.4.4" to get plot_ford(). Now when applying on "R version 3.6.3 (2020-02-29)" I get this:

compiegne %>%
  as_count() %>%
  plot_ford()

Result:
Error in `$<-.data.frame`(`*tmp*`, "data", value = numeric(0)) :
  replacement has 0 rows, data has 160

Am I missing something?
Thanks for your effort!

FrequencyMatrix

> ?FrequencyMatrix

No documentation for 'FrequencyMatrix' in specified packages and libraries:
you could try '??FrequencyMatrix'

??FrequencyMatrix works, but perhaps supporting ?FrequencyMatrix would be useful?

Assemblage diversity size comparison

I've written an R function to calculate diversity of simulated assemblages and compare the diversity of actual assemblages to the distribution of diversities of simulations. Essentially, it implements the methods in Kintigh, 1984 and Kintigh, 1989 in R. It looks like it wouldn't take too much effort for me to rewrite the code to do the simulations from the matrix classes provided by tabula and use tabula's diversity functions for comparison. Would you have any interest in my adapting the code for this and submitting a PR?

Kintigh, K. (1984). Measuring Archaeological Diversity by Comparison with Simulated Assemblages. American Antiquity, 49(1), 44-54. doi:10.2307/280511

Kintigh, K. (1989). Sample size, significance, and measures of diversity. In R. D. Leonard, & G. T. Jones (Eds.), Quantifying diversity in archaeology (pp. 25-36). Cambridge University Press.
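
The simulation approach described above can be sketched in base R. This is only an illustration in the spirit of Kintigh (1984), not the function mentioned in this issue and not part of tabula: the count matrix and the helper `simulate_richness()` are hypothetical, and richness (the number of types present) stands in for whichever diversity measure is being compared.

```r
set.seed(1984)

# Hypothetical count matrix: 3 assemblages (rows) by 5 types (columns)
counts <- matrix(
  c(12,  5, 0, 3, 0,
    40, 22, 9, 6, 3,
     7,  0, 0, 1, 0),
  nrow = 3, byrow = TRUE
)

# Pooled type proportions serve as the underlying population
pooled <- colSums(counts) / sum(counts)

# Simulate n_sim assemblages of size n and return the richness of each
simulate_richness <- function(n, prob, n_sim = 1000) {
  sims <- stats::rmultinom(n_sim, size = n, prob = prob)
  colSums(sims > 0)
}

sizes <- rowSums(counts)
observed <- rowSums(counts > 0)

# 5%, 50% and 95% quantiles of simulated richness for each sample size
envelope <- t(sapply(
  sizes,
  function(n) stats::quantile(simulate_richness(n, pooled), c(0.05, 0.5, 0.95))
))

cbind(size = sizes, observed = observed, envelope)
```

An assemblage whose observed richness falls outside the simulated interval for its sample size would be a candidate for being more or less diverse than expected from size alone.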

Move dating methods to a separate package

Functions for chronological modeling and dating of archaeological assemblages need their own package. This will make tabula easier to maintain.

  • Create a new package
  • Move code to the new package:
    • Mean ceramic date
    • Event/accumulation date
    • Frequency Increment Test
  • Remove code from tabula

Error(s) in re-building vignettes

See the problems shown on https://cran.r-project.org/web/checks/check_results_tabula.html.

These can be reproduced by checking with --as-cran using a very current r-devel (r77865 or later), which makes data.frame() and read.table() use a stringsAsFactors = FALSE default, which is planned to become the new default for the upcoming R 4.0.0.

See https://developer.r-project.org/Blog/public/2020/02/16/stringsasfactors/index.html for more information about this change.

Fix the package to work with both the old and new default. In principle, this can easily be achieved by adding stringsAsFactors = TRUE to the relevant calls to data.frame() or read.table() [or other read.* function calling read.table()], but only do this if the sort order used in the string to factor conversion really does not matter (see the blog post about the locale dependence of the conversion). Otherwise, change to create the factors with explicitly given levels.

Correct before 2020-03-20 to safely retain the package on CRAN.
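
The two fixes suggested in the message above might look as follows. This is a minimal sketch with made-up data (the `types` vector is hypothetical), not the change actually made in tabula.

```r
types <- c("jar", "amphora", "bowl")

# Option 1: request the old behaviour explicitly. Levels are sorted,
# and the sort order can depend on the locale.
df_sorted <- data.frame(type = types, stringsAsFactors = TRUE)

# Option 2: build the factor with explicitly given levels, so the result
# is the same under both defaults and in every locale.
df_explicit <- data.frame(
  type = factor(types, levels = c("jar", "amphora", "bowl"))
)

levels(df_sorted$type)
levels(df_explicit$type)
```

Option 2 is the safer choice whenever later code relies on the order of the levels, for example when plotting.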

FrequencyMatrix

```
## Create a count matrix
A1 <- CountMatrix(data = sample(0:10, 100, TRUE),
                  nrow = 10, ncol = 10)

## Coerce counts to frequencies
B <- as_frequency(A1)
```
So far so good.

```
B[1] <- 2
```
And that goes through without error or warning, even though 2 is not a valid relative frequency. Perhaps a validity check would be useful?
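
To illustrate the kind of check meant here, the sketch below uses a plain base R matrix rather than tabula's classes. The helper `is_valid_frequency()` is hypothetical and assumes that each row of a frequency matrix holds proportions in [0, 1] that sum to 1.

```r
# Hypothetical counts: 2 assemblages (rows) by 3 types (columns)
counts <- matrix(c(3, 1, 0,
                   6, 2, 8), nrow = 2, byrow = TRUE)

# Row-wise relative frequencies
freq <- counts / rowSums(counts)

# Hypothetical validity check for a matrix of relative frequencies
is_valid_frequency <- function(x, tolerance = 1e-8) {
  all(x >= 0 & x <= 1) && all(abs(rowSums(x) - 1) <= tolerance)
}

is_valid_frequency(freq)  # valid before the assignment

freq[1] <- 2
is_valid_frequency(freq)  # no longer valid after the assignment
```

A replacement method could run such a check after assignment and raise an error or a warning when it fails.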

Cannot install pkg from CRAN or GitHub

Describe the bug
Not available from CRAN, cannot install from GitHub

To Reproduce

> install.packages("tabula")
Warning in install.packages :
  package ‘tabula’ is not available for this version of R

A version of this package for your version of R might be available elsewhere,
see the ideas at
https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages
> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7
> remotes::install_github("tesselle/tabula")
Using github PAT from envvar GITHUB_PAT
Downloading GitHub repo tesselle/tabula@HEAD
Error: Failed to install 'tabula' from GitHub:
  Missing commas separating Remotes: 'tesselle/arkhe
    tesselle/dimensio'
> 

Expected behavior
I expect the pkg will install.

Session Info

> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] forcats_0.5.1   stringr_1.4.0   dplyr_1.0.4     purrr_0.3.4     readr_1.4.0     tidyr_1.1.2    
[7] tibble_3.0.6    ggplot2_3.3.3   tidyverse_1.3.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6        cellranger_1.1.0  pillar_1.5.0      compiler_4.0.3    dbplyr_2.0.0     
 [6] tools_4.0.3       jsonlite_1.7.2    lubridate_1.7.9.2 lifecycle_1.0.0   gtable_0.3.0     
[11] pkgconfig_2.0.3   rlang_0.4.10      reprex_1.0.0      rstudioapi_0.13   DBI_1.1.1        
[16] cli_2.3.1         haven_2.3.1       withr_2.4.1       xml2_1.3.2        httr_1.4.2       
[21] fs_1.5.0          generics_0.1.0    vctrs_0.3.6       hms_1.0.0         grid_4.0.3       
[26] tidyselect_1.1.0  glue_1.4.2        R6_2.5.0          fansi_0.4.2       readxl_1.3.1     
[31] modelr_0.1.8      magrittr_2.0.1    backports_1.2.1   scales_1.1.1      ellipsis_0.3.1   
[36] rvest_0.3.6       assertthat_0.2.1  colorspace_2.0-0  utf8_1.1.4        stringi_1.5.3    
[41] munsell_0.5.0     broom_0.7.4       crayon_1.4.1  

CountMatrix

Describe the bug
In the man page for the function, you write: "Numeric values are rounded to zero decimal places and then coerced to integer as by as.integer."

To Reproduce
CountMatrix(data = 1.1, nrow = 1)
Error: "CountMatrix" object initialization:
*  'data' must contain whole numbers.
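
The mismatch can be illustrated in base R. The first call below shows what the man page describes; the second shows a strict check of the kind the error message suggests. `is_whole()` here is a hypothetical helper written for this example, not the package's internal function.

```r
x <- c(1.1, 2.7, 3)

# Behaviour described in the man page: round, then coerce to integer
as.integer(round(x, digits = 0))

# Behaviour suggested by the error message: reject non-whole numbers.
# `is_whole()` is a hypothetical helper, not the package's internal check.
is_whole <- function(x, tolerance = sqrt(.Machine$double.eps)) {
  abs(x - round(x)) <= tolerance
}
is_whole(x)
```

Either behaviour may be reasonable; the point is that the documentation and the validation should agree.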

Bug: Mishandling of missing values

Hi, thanks for a great package!

I encountered a bug with the way it handles missing values:

Code to replicate the error:

a <- c(1, 1, NA, 1)
index_simpson(a)
index_simpson(a, na.rm = TRUE)

Output (on both calls to index_simpson):

Error in if (!all(is_whole(x, ...))) { : 
  missing value where TRUE/FALSE needed
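
A likely cause, judging from the error message alone, is that `all()` returns NA when the input contains a missing value, and `if (NA)` is an error. The sketch below reproduces this in base R. Both `is_whole()` and `check_whole()` are hypothetical stand-ins written for this example, not tabula's code.

```r
# Hypothetical stand-in for the whole-number check named in the error
is_whole <- function(x, tolerance = sqrt(.Machine$double.eps)) {
  abs(x - round(x)) <= tolerance
}

a <- c(1, 1, NA, 1)

# all() yields NA here, so `if (!all(is_whole(a)))` cannot be evaluated
all(is_whole(a))

# One possible fix: drop missing values before validating when na.rm = TRUE,
# and report NA instead of failing when missing values remain
check_whole <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  if (anyNA(x)) return(NA)
  all(is_whole(x))
}

check_whole(a, na.rm = TRUE)
check_whole(a)
```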

Installation problems

ERROR: dependency ‘car’ is not available for package ‘FactoMineR’
* removing ‘/home/dk/R/x86_64-pc-linux-gnu-library/3.4/FactoMineR’
ERROR: dependency ‘FactoMineR’ is not available for package ‘tabula’
* removing ‘/home/dk/R/x86_64-pc-linux-gnu-library/3.4/tabula’

The downloaded source packages are in
‘/tmp/Rtmp5iMKEO/downloaded_packages’
Warning messages:
1: In install.packages(c("tabula", "khroma")) :
installation of package ‘openxlsx’ had non-zero exit status
2: In install.packages(c("tabula", "khroma")) :
installation of package ‘rio’ had non-zero exit status
3: In install.packages(c("tabula", "khroma")) :
installation of package ‘car’ had non-zero exit status
4: In install.packages(c("tabula", "khroma")) :
installation of package ‘FactoMineR’ had non-zero exit status
5: In install.packages(c("tabula", "khroma")) :
installation of package ‘tabula’ had non-zero exit status

CRAN Package Check NOTE

The CRAN package check produces a NOTE on the r-devel-linux-x86_64-fedora-clang, r-devel-linux-x86_64-fedora-gcc, r-patched-solaris-x86, r-release-osx-x86_64 and r-oldrel-osx-x86_64 check flavors.

Enhance Description

CRAN team suggestion for the future: add some reference about the method in the Description field in the form Authors (year) doi:.....

FrequencyMatrix

It took me a bit to realize that FrequencyMatrix doesn't carry frequencies. It is actually CountMatrix that carries the frequencies. This is a small semantic point: frequency = number of times X occurs. I know you define 'Frequency' as 'RelativeFrequency', but there is arguably a case for relabeling Count as Frequency and Frequency as RelFrequency.

Your call. Not a blocker.
