checkmate's Introduction

checkmate

Fast and versatile argument checks for R.

Ever used an R function that produced a not-very-helpful error message, just to discover after minutes of debugging that you simply passed a wrong argument?

Blaming the laziness of the package author for not doing such standard checks (in a dynamically typed language such as R) is at least partially unfair, as R makes these types of checks cumbersome and annoying. Well, that's how it was in the past.

Enter checkmate.

Virtually every standard type of user error when passing arguments into a function can be caught with a simple, readable line which produces an informative error message if the check fails. A substantial part of the package was written in C to minimize any worries about execution time overhead. Furthermore, the package provides over 30 expectations to extend the popular testthat package for unit tests.
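As a minimal sketch of what such a one-line check looks like, assuming a recent checkmate release: the function `scale_values` and its arguments are hypothetical and only serve as an illustration.

```r
library(checkmate)

# Hypothetical function, used only for illustration.
scale_values <- function(x, multiplier = 1, verbose = FALSE) {
  assertNumeric(x, any.missing = FALSE, min.len = 1)  # non-empty numeric vector without NAs
  assertNumber(multiplier, lower = 0)                 # single non-missing number >= 0
  assertFlag(verbose)                                 # single TRUE or FALSE
  x * multiplier
}

scale_values(c(1, 2, 3), multiplier = 2)

# A wrong argument is reported by name instead of failing deeper in the code.
msg <- tryCatch(scale_values("a"), error = function(e) conditionMessage(e))
print(msg)
```

Each assertion is a single line at the top of the function, so the checks also act as documentation of what the function expects.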

Installation

For the stable release, just install the latest version from CRAN:

install.packages("checkmate")

For the development version, use devtools:

devtools::install_github("mllg/checkmate")

checkmate's People

Contributors

berndbischl, jhossepaul, justinmshea, maelle, mllg, petterhopp, reedcourty, rorynolan, rtaph, salim-b, sebffischer, tdeenes, wibeasley

checkmate's Issues

new release with fix, IMPORTANT

I would require a CRAN update SOON.

I had to fix a bug in assertList with len = ...

Which now triggers bugs in other packages.

assert bug (order seems to be important)

> desc = "CV"
> assert(assertCharacter(desc), assertClass(desc, "ResampleDesc"))
> assert(assertClass(desc, "ResampleDesc"), assertCharacter(desc))
Error in assert(assertClass(desc, "ResampleDesc"), assertCharacter(desc)) : 
  Assertion on 'desc' failed: Must have class 'ResampleDesc'
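In the calls above, assert* functions are nested inside assert(), so a failing inner assertion may throw before assert() gets to look at the alternative. A minimal sketch of the combination pattern, assuming a recent checkmate release in which assert() is documented to combine check* functions:

```r
library(checkmate)

desc <- "CV"

# With check* functions, assert() passes as soon as one alternative holds,
# so the order of the alternatives does not matter here.
res1 <- assert(checkCharacter(desc), checkClass(desc, "ResampleDesc"))
res2 <- assert(checkClass(desc, "ResampleDesc"), checkCharacter(desc))
```

Both calls return TRUE invisibly, because the check* functions return a message string on failure instead of raising an error.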

check / assert

Isn't the semantics of check* that it always returns either TRUE or an error message [string]?

At least in checkFile you call qassert, which throws an exception?
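For reference, a small sketch of how the three function families behave in a recent checkmate release (the variable names are only illustrative):

```r
library(checkmate)

x <- 1  # deliberately not a string

res_check  <- checkString(x)  # TRUE on success, otherwise the error message as a string
res_test   <- testString(x)   # always a single TRUE or FALSE
res_assert <- tryCatch(assertString(x), error = function(e) "error thrown")

print(res_check)
print(res_test)
print(res_assert)
```

Only the assert* variant raises an error. check* and test* report the failure through their return value.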

check_count not perfect

a) undocumented what a count means exactly

b) sometimes one wants 0 as lower, sometimes 1

c) should be possible to give upper value
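For comparison, a short sketch of how these three points can be handled, assuming a recent checkmate release with the camelCase names (`testCount`, `testInt`):

```r
library(checkmate)

count_zero     <- testCount(0)                   # zero is accepted by default
count_positive <- testCount(0, positive = TRUE)  # require a count of at least 1
count_negative <- testCount(-1)                  # negative values are never counts
count_na       <- testCount(NA)                  # NA is rejected unless na.ok = TRUE

# For an upper bound, the scalar integerish check accepts both limits.
int_in_range  <- testInt(2, lower = 0, upper = 3)
int_too_large <- testInt(5, lower = 0, upper = 3)
```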

check single integer

How do I check a single integer value? Currently I use assertInteger(x, any.missing=FALSE, len=1). Am I overlooking a function? I expected one with a shorter call, e.g. checkInt.
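A sketch of two shorter spellings, assuming a recent checkmate release that provides `testInt` / `assertInt` alongside the quick-check function `qtest`:

```r
library(checkmate)

a <- testInt(3L)       # single integer value
b <- testInt(3)        # integerish doubles are accepted as well
c <- testInt(3.5)      # not integerish
d <- testInt(1:2)      # wrong length

# The quick-check rule "I1" is stricter: type integer, no NA, length 1.
e <- qtest(3L, "I1")
f <- qtest(3, "I1")    # a double does not have type integer
```

Note the difference: `testInt` checks for an integerish value, while the rule `"I1"` checks the storage type.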

parameter null.ok

Hi,

quite often we check some argument only if it is not NULL, but NULL is allowed too. This leads to a bunch of checks of the following type, which bloat the code and make it less readable:

if (!is.null(param)) {
  assertNumber(param, na.ok = FALSE, lower = 1)
}

I think the introduction of another parameter null.ok with default value FALSE would be great!

  assertNumber(param, na.ok = FALSE, null.ok = TRUE, lower = 1)

What is your opinion?
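A sketch of the requested call in use, assuming a recent checkmate release in which `null.ok` is available. The wrapper function `f` is hypothetical.

```r
library(checkmate)

# Hypothetical function with an optional argument.
f <- function(param = NULL) {
  assertNumber(param, na.ok = FALSE, lower = 1, null.ok = TRUE)
  invisible(param)
}

f()      # NULL is accepted
f(2)     # a valid number is accepted
res <- tryCatch(f(0), error = function(e) conditionMessage(e))  # below the lower bound
print(res)
```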

checkNumber(TRUE) returns TRUE

Is this the intended behavior?
In addition,
qassert(TRUE,"i") and qassert(TRUE, "I") throw an error.
qassert(TRUE, "n") throws no error, but qassert(TRUE, "N") does.

enhancement to checkArray

Hi!

Thank you for this wonderful package, it helps me a lot. However, I miss one specific check which is quite useful if you work with matrices and arrays. Specifically, I have to check if the object is atomic, has a 'dim' attribute and the length of the 'dim' attribute is at least 2 (for example before calling rowMeans or functions from the matrixStats package). Would you consider adding two new arguments ('strict' and 'min.d') to checkArray (and testArray, assertArray)?

Note that a list can have a 'dim' attribute, in which case is.array() or checkArray() returns TRUE.

mylist <- structure(list(1:4, letters[1:3]), dim = c(2, 1))
str(mylist)
is.array(mylist)
checkArray(mylist)

Anyway, there seems to be a bug in checkAtomic:

mylist <- structure(list(1:4, letters[1:3]), dim = c(2, 1))
checkAtomic(mylist)
checkAtomic(matrix(1:10, 2, 5))

Here is an example of what an enhanced version of checkArray could look like (of course it should be done at the C level):

checkAtomicArray <- function(x, mode = NULL, any.missing = TRUE, d = NULL,
                             strict = FALSE, min.d = NULL) {
    if (strict && is.list(x)) {
        return("Must be atomic, not list")
    }
    res <- checkArray(x, mode = mode, any.missing = any.missing, d = d)
    if (is.character(res)) return(res)
    res <- 
        if (is.null(d) && !is.null(min.d)) {
            if (length(dim(x)) >= min.d) {
                TRUE
            } else {
                sprintf("Must have at least %d dimensions, but has only %d", min.d, length(dim(x)))
            }
        } else {
            TRUE
        }
    if (is.character(res)) return(res)
    TRUE
} 
# check it
mylist <- structure(list(1:4, letters[1:3]), dim = c(2, 1))
checkAtomicArray(mylist, strict = TRUE)
checkAtomicArray(array(1:10), min.d = 2L)
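For comparison, a sketch of the same two checks with built-in arguments, assuming a recent checkmate release in which checkArray accepts `mode = "atomic"` and `min.d`:

```r
library(checkmate)

mylist <- structure(list(1:4, letters[1:3]), dim = c(2, 1))

list_ok   <- testArray(mylist, mode = "atomic")              # lists are rejected
one_dim   <- testArray(array(1:10), min.d = 2L)              # only one dimension
matrix_ok <- testArray(matrix(1:10, 2, 5), mode = "atomic", min.d = 2L)
```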

Naming scheme of functions

I would prefer to be consistent with our style guide and not to use underscores for function names.

Your call.

error message if argument was missing in assertion

Here is one of the most common errors in a check

f = function(myarg) {
  assertCount(myarg)
}

g = function(myarg) {
  checkArg(myarg, "integer", len = 1L)
}

f(nope)
g(nope)

See here how the checkArg error is more informative.

Error in isTRUE(msg) : object 'nope' not found

Error in checkArg(myarg, "integer", len = 1L) : object 'nope' not found

The 2nd is much more informative as I can see where the check failed and - important - I get the name of the arg in the signature. I would say that saves doing a traceback in 95% of cases. The 1st is much more cryptic.

Is there a simple way to make this more readable?

checkFile: How to check for path, where a file is to be created?

Maybe we need checkPath for that?

Here is the use-case:

A user enters a path into a function; some time later in the code a file / dir is created there.
So we need write access to the path, but the target does not exist yet.
(I think this is why checkFile currently fails here, as it assumes the file exists.)

Very common; it actually came up in my first example for the README.
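A sketch of this use case, assuming a recent checkmate release that provides `testPathForOutput` (with an `overwrite` argument) and `testFileExists`. The temporary file is only for illustration.

```r
library(checkmate)

target <- tempfile(fileext = ".csv")  # a path in a writable directory; file not created yet

exists_before <- testFileExists(target)     # the file does not exist
writable      <- testPathForOutput(target)  # but a file could be created there

file.create(target)
blocked   <- testPathForOutput(target)                    # existing files are rejected by default
overwrite <- testPathForOutput(target, overwrite = TRUE)  # unless overwriting is allowed
unlink(target)
```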

provide headers to allow linking from C/C++ source

Really useful package, but AFAIK the header files need to be in inst/include in the source so I can link to your package from the C/C++ code in my R packages. This would be a helpful addition, and would allow people to do good parameter checking in their compiled code without having to write wrapper functions for everything.

checkDataFrame in tutorial?

In the example for checkDataFrame we have

testDataFrame(iris, "data.frame")
testDataFrame(iris, "data.frame", min.rows = 1, col.names = "named")

but this should actually be

testDataFrame(iris)
testDataFrame(iris, min.rows = 1, col.names = "named")

as the first ones give me FALSE back?
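A sketch of why the first form fails, assuming a recent checkmate release in which the second positional argument of testDataFrame is `types`, the allowed column types:

```r
library(checkmate)

plain     <- testDataFrame(iris)
named     <- testDataFrame(iris, min.rows = 1, col.names = "named")

# `types` refers to the columns: iris has numeric and factor columns ...
col_types <- testDataFrame(iris, types = c("numeric", "factor"))
# ... but no column is itself a data.frame.
wrong     <- testDataFrame(iris, types = "data.frame")
```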

Quick checks for common situations

One of the most common situations is checking a vector or a scalar of a certain type, most of the time with no allowance for NAs.

This is already handled very nicely by a couple of functions:

check_numeric
check_string
qcheck

I like the flexibility that qcheck allows for, and I would definitely keep it. I also think it is good that one can "build up" the check string programmatically.
The big disadvantage is that

  • you get no command completion
  • typos in the format string are only detected at runtime

What I would like to have is a little syntactic sugar function like check_string.
For numeric scalars such a helper seems to be missing?
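To make the trade-off concrete, here is a sketch of both styles side by side, assuming a recent checkmate release where the quick-check function is called `qtest` and a numeric scalar helper `testNumber` exists:

```r
library(checkmate)

# Quick-check rules: type letter (uppercase forbids NA), length, optional interval.
q_scalar   <- qtest(0.5, "N1")        # numeric, no NA, length 1
q_na_upper <- qtest(c(1, NA), "N")    # uppercase: missing values are rejected
q_na_lower <- qtest(c(1, NA), "n")    # lowercase: missing values are allowed
q_string   <- qtest("a", "S1")        # character, no NA, length 1
q_range    <- qtest(0.5, "N1[0,1]")   # scalar in the closed interval [0, 1]

# The same range check with a named helper and named arguments.
h_range <- testNumber(0.5, lower = 0, upper = 1)
```

The rule string is compact, while the named helper gets argument completion and fails early on a misspelled argument name.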

The next thing is the naming scheme.

I am not sure whether
check_string
check_count

is easier to remember than the scheme already used in qcheck.
This is a very subtle, but important point, as it depends on what people remember best.

I CURRENTLY think that it might be better to have many helpers that use the same style as in qcheck.

Reasons:

  • consistency with qcheck
  • short
  • because they are short, I can combine them nicely without producing long names for standard situations

Counterargument:
check_flag is very intuitive to read.

type check functions in BBmisc

BBmisc contains a couple of type check functions which COULD be moved to checkmate.

IMHO they probably belong here, they already have a large overlap.

I would mainly use them in the test* version, although assertions might be useful too.

Here is a first list

  • isScalar
  • isScalarNA
  • isProperlyNamed
  • isValidName
  • I currently need to check a numeric; in addition to what checkmate can do, I would also like to exclude NaNs and Infs. I thought about adding a helper to BBmisc.
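For the last item, a sketch of how NaNs and Infs can be excluded, assuming a recent checkmate release in which the numeric checks accept a `finite` argument and treat NaN as a missing value:

```r
library(checkmate)

ok       <- testNumeric(c(1, 2), any.missing = FALSE, finite = TRUE)
with_inf <- testNumeric(c(1, Inf), finite = TRUE)         # infinite values are rejected
with_nan <- testNumeric(c(1, NaN), any.missing = FALSE)   # NaN counts as missing
scalar   <- testNumber(Inf, finite = TRUE)                # the same for a scalar
```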

NB: I am NOT suggesting mindlessly copying them without change and discussion.

NB2: Not so important for version 1.0 but nonetheless IMHO now a clear overlap between the 2 packages.

Error messages question

Is it by choice that no message ends with a "." or a "!"?

Do you sometimes concatenate them or insert them somewhere? Or is it just laziness?

checkMatrix: docs

#' @param row.names [\code{logical(1)}]\cr
#'  Check for row names. Default is \dQuote{any} (no check).
#'  See \code{\link{checkNamed}} for possible values.
#' @param col.names [\code{logical(1)}]\cr
#'  Check for column names. Default is \dQuote{any} (no check).
#'  See \code{\link{checkNamed}} for possible values.

row.names / col.names are documented to be boolean, but the default seems to be "any".

Correct? Adapt the docs?

I don't have time now, otherwise I would fix it at once.

Maybe check other places for the same issue.

Functionality for returning all errors together

Would you consider adding some functionality that would allow all errors and warnings to be returned simultaneously, rather than stopping the function any time any error is produced? The model I'm trying to follow is from a SAS Global Forum presentation.

The basic concept is that any time an error or warning is produced, you record a note but continue on with the parameter checking. Then, when the parameter checks are complete, return all of the errors and warnings. That way, if the user has provided two (or more) flawed inputs, he or she can receive a message about all of them and fix them all before trying to repeat the function.

I have drafted a small package that handles these, but wondered if it might be a better fit within a package that already specializes in parameter checking (rather than adding yet another package to CRAN). I'll include a sample function in a subsequent comment that illustrates what I'm trying to do.

If you're interested, great! If not, I'll publish to CRAN separately.
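A sketch of the collect-then-report pattern described above, assuming a recent checkmate release that provides `makeAssertCollection`, the `add` argument of the assert* functions, and `reportAssertions`. The function `fit` and its arguments are hypothetical.

```r
library(checkmate)

# Hypothetical function that validates all arguments before reporting.
fit <- function(n, label) {
  coll <- makeAssertCollection()
  assertCount(n, positive = TRUE, add = coll)  # failure is recorded, not thrown
  assertString(label, add = coll)              # checking continues with the next argument
  reportAssertions(coll)                       # one error listing every recorded failure
  invisible(TRUE)
}

msg <- tryCatch(fit(-1, 42), error = function(e) conditionMessage(e))
cat(msg, "\n")
```

With two flawed inputs, the single error message mentions both `n` and `label`, so the caller can fix them in one pass.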

assertClass should output wrong class

We currently see something like this:

Assertion on 'learner' failed: Must have class 'Learner'

It would be more informative for the user if the wrong class is shown as well.

lower in assertNumeric checks greater or equal

assertNumeric(0:3, lower = 0) does not throw an error, although the documentation says values must be "greater than" the value of 'lower'. Is it possible to add an option to check that values are strictly greater or strictly lower?
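A sketch of the difference between inclusive and strict bounds, assuming a recent checkmate release. It relies on the interval notation of the quick-check rules, where round brackets denote open bounds.

```r
library(checkmate)

inclusive   <- testNumeric(0:3, lower = 0)  # `lower` is inclusive, so 0 passes

strict_fail <- qtest(0:3, "n(0,)")  # open lower bound: 0 is rejected
strict_ok   <- qtest(1:3, "n(0,)")  # all values strictly greater than 0
closed_ok   <- qtest(0:3, "n[0,)")  # closed lower bound: 0 is allowed
```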

assertDataFrame has min.rows bug


d = iris[,-(1:5)]
# d = data frame with 0 columns and 150 rows
nrow(d)
#  150
assertDataFrame(d, min.rows=1)
Error in assertDataFrame(d, min.rows = 1) : 
  Assertion on 'd' failed: Must have at least 1 rows, but has 0 rows

Design Question: Scalars

It seems to me the design is somewhat inconsistent with respect to scalars.

IMHO these should allow more flexibility.

  • At least bounds should be checkable for numbers / ints. This is extremely common.
  • We need a scalar integerish, which is not a count.
  • complex scalar is missing. This is mainly for completeness. I have never needed that until now.

The point is: if one very often has to resort to the check*Vector operations for these types of checks (such as bounds), one will simply remember those.

Note that I am NOT arguing for exposing everything of the vector interface. But the most common stuff should be exposed for the scalars. If this has reasonable defaults and the same names as for the vector, nobody gets hurt and it's easy to remember.

It is actually probably LESS intuitive that some of those args are missing.

checkNamed should allow explicit names as option

In quite a few cases I know the names I would like to check for.
I should be allowed to pass them.

Note that one might want to check for them to be in the order that they are passed, or not in that order (so one check would use == , the other setequal).

IMHO you only need to check that EXACTLY these names occur, not subsets or whatever.
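A sketch of both variants, assuming a recent checkmate release in which checkNames accepts `identical.to` (same names in the same order) and `permutation.of` (same names in any order):

```r
library(checkmate)

nms <- c("b", "a")

same_order <- testNames(nms, identical.to = c("a", "b"))    # order differs
any_order  <- testNames(nms, permutation.of = c("a", "b"))  # same set of names
extra_name <- testNames(c("a", "b", "c"), permutation.of = c("a", "b"))

# Typical use on an object: pass its names, not the object itself.
x <- list(a = 1, b = 2)
on_object <- testNames(names(x), identical.to = c("a", "b"))
```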

Use NULL instead of "missing"

"missing" for "not given" should be used as sparingly as possible.

I run into all kinds of problems because of this. We should use NULL instead.

Argument checks must be made easy to allow this.

*_flag name

I am unsure whether "flag" is really a good name ....?

Maybe "bool"?

checkNames

The documentation does not say what the function is for; add a sentence explaining the use case.

Also document what x is. A character vector, right?
