Giter VIP home page Giter VIP logo

intrval's Introduction

intrval: Relational Operators for Intervals

CRAN version CRAN RStudio mirror downloads License: GPL v2 Github Stars

Evaluating if values of vectors are within different open/closed intervals (x %[]% c(a, b)), or if two closed intervals overlap (c(a1, b1) %[]o[]% c(a2, b2)). Operators for negation and directional relations also implemented.

Install

Install from CRAN:

install.packages("intrval")

Install development version from GitHub:

if (!requireNamespace("remotes")) install.packages("remotes")
remotes::install_github("psolymos/intrval")

User visible changes are listed in the NEWS file.

Use the issue tracker to report a problem.

Value-to-interval relations

Values of x are compared to interval endpoints a and b (a <= b). Endpoints can be defined as a vector with two values (c(a, b)): these values will be compared as a single interval with each value in x. If endpoints are stored in a matrix-like object or a list, comparisons are made element-wise.

x <- rep(4, 5)
a <- 1:5
b <- 3:7
cbind(x=x, a=a, b=b)
x %[]% cbind(a, b) # matrix
x %[]% data.frame(a=a, b=b) # data.frame
x %[]% list(a, b) # list

If lengths do not match, shorter objects are recycled. Return values are logicals. Note: interval endpoints are sorted internally thus ensuring the condition a <= b is not necessary.

These value-to-interval operators work for numeric (integer, real) and ordered vectors, and object types which are measured at least on ordinal scale (e.g. dates).

Closed and open intervals

The following special operators are used to indicate closed ([, ]) or open ((, )) interval endpoints:

Operator Expression Condition
%[]% x %[]% c(a, b) x >= a & x <= b
%[)% x %[)% c(a, b) x >= a & x < b
%(]% x %(]% c(a, b) x > a & x <= b
%()% x %()% c(a, b) x > a & x < b

Negation and directional relations

Equal Not equal Less than Greater than
%[]% %)(% %[<]% %[>]%
%[)% %)[% %[<)% %[>)%
%(]% %](% %(<]% %(>]%
%()% %][% %(<)% %(>)%

Dividing a range into 3 intervals

The functions %[c]%, %[c)%, %(c]%, and %(c)% return an integer vector taking values (the c within the brackets refer to 'cut'):

  • -1L when the value is less than or equal to a (a <= b), depending on the interval type,
  • 0L when the value is inside the interval, or
  • 1L when the value is greater than or equal to b (a <= b), depending on the interval type.
Expression Evaluates to -1 Evaluates to 0 Evaluates to 1
x %[c]% c(a, b) x < a x >= a & x <= b x > b
x %[c)% c(a, b) x < a x >= a & x < b x >= b
x %(c]% c(a, b) x <= a x > a & x <= b x > b
x %(c)% c(a, b) x <= a x > a & x < b x >= b

Interval-to-interval relations

The operators define the open/closed nature of the lower/upper limits of the intervals on the left and right hand side of the o in the middle.

Intervals Int. 2: [] Int. 2: [) Int. 2: (] Int. 2: ()
Int. 1: [] %[]o[]% %[]o[)% %[]o(]% %[]o()%
Int. 1: [) %[)o[]% %[)o[)% %[)o(]% %[)o()%
Int. 1: (] %(]o[]% %(]o[)% %(]o(]% %(]o()%
Int. 1: () %()o[]% %()o[)% %()o(]% %()o()%

The overlap of two closed intervals, [a1, b1] and [a2, b2], is evaluated by the %[o]% (alias for %[]o[]%) operator (a1 <= b1, a2 <= b2). Endpoints can be defined as a vector with two values (c(a1, b1))or can be stored in matrix-like objects or a lists in which case comparisons are made element-wise. If lengths do not match, shorter objects are recycled. These value-to-interval operators work for numeric (integer, real) and ordered vectors, and object types which are measured at least on ordinal scale (e.g. dates), see Examples. Note: interval endpoints are sorted internally thus ensuring the conditions a1 <= b1 and a2 <= b2 is not necessary.

c(2, 3) %[]o[]% c(0, 1)
list(0:4, 1:5) %[]o[]% c(2, 3)
cbind(0:4, 1:5) %[]o[]% c(2, 3)
data.frame(a=0:4, b=1:5) %[]o[]% c(2, 3)

If lengths do not match, shorter objects are recycled. These value-to-interval operators work for numeric (integer, real) and ordered vectors, and object types which are measured at least on ordinal scale (e.g. dates).

%)o(% is used for the negation of two closed interval overlap, directional evaluation is done via the operators %[<o]% and %[o>]%. The overlap of two open intervals is evaluated by the %(o)% (alias for %()o()%). %]o[% is used for the negation of two open interval overlap, directional evaluation is done via the operators %(<o)% and %(o>)%.

Equal Not equal Less than Greater than
%[o]% %)o(% %[<o]% %[o>]%
%(o)% %]o[% %(<o)% %(o>)%

Overlap operators with mixed endpoint do not have negation and directional counterparts.

Operators for discrete variables

The previous operators will return NA for unordered factors. Set overlap can be evaluated by the base %in% operator and its negation %ni% (as in not in, the opposite of in). %nin% and %notin% are aliases for better code readability (%in% can look very much like %ni%).

Examples

Bounding box

set.seed(1)
n <- 10^4
x <- runif(n, -2, 2)
y <- runif(n, -2, 2)
d <- sqrt(x^2 + y^2)
iv1 <- x %[]% c(-0.25, 0.25) & y %[]% c(-1.5, 1.5)
iv2 <- x %[]% c(-1.5, 1.5) & y %[]% c(-0.25, 0.25)
iv3 <- d %()% c(1, 1.5)
plot(x, y, pch = 19, cex = 0.25, col = iv1 + iv2 + 1,
    main = "Intersecting bounding boxes")
plot(x, y, pch = 19, cex = 0.25, col = iv3 + 1,
     main = "Deck the halls:\ndistance range from center")

Time series filtering

x <- seq(0, 4*24*60*60, 60*60)
dt <- as.POSIXct(x, origin="2000-01-01 00:00:00")
f <- as.POSIXlt(dt)$hour %[]% c(0, 11)
plot(sin(x) ~ dt, type="l", col="grey",
    main = "Filtering date/time objects")
points(sin(x) ~ dt, pch = 19, col = f + 1)

Quality control chart (QCC)

library(qcc)
data(pistonrings)
mu <- mean(pistonrings$diameter[pistonrings$trial])
SD <- sd(pistonrings$diameter[pistonrings$trial])
x <- pistonrings$diameter[!pistonrings$trial]
iv <- mu + 3 * c(-SD, SD)
plot(x, pch = 19, col = x %)(% iv +1, type = "b", ylim = mu + 5 * c(-SD, SD),
    main = "Shewhart quality control chart\ndiameter of piston rings")
abline(h = mu)
abline(h = iv, lty = 2)

Confidence intervals and hypothesis testing

## Annette Dobson (1990) "An Introduction to Generalized Linear Models".
## Page 9: Plant Weight Data.
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)

lm.D9 <- lm(weight ~ group)
## compare 95% confidence intervals with 0
(CI.D9 <- confint(lm.D9))
#                2.5 %    97.5 %
# (Intercept)  4.56934 5.4946602
# groupTrt    -1.02530 0.2833003
0 %[]% CI.D9
# (Intercept)    groupTrt
#       FALSE        TRUE

lm.D90 <- lm(weight ~ group - 1) # omitting intercept
## compare 95% confidence of the 2 groups to each other
(CI.D90 <- confint(lm.D90))
#            2.5 %  97.5 %
# groupCtl 4.56934 5.49466
# groupTrt 4.19834 5.12366
CI.D90[1,] %[o]% CI.D90[2,]
# 2.5 %
#  TRUE

Dates

DATE <- as.Date(c("2000-01-01","2000-02-01", "2000-03-31"))
DATE %[<]% as.Date(c("2000-01-15", "2000-03-15"))
# [1]  TRUE FALSE FALSE
DATE %[]% as.Date(c("2000-01-15", "2000-03-15"))
# [1] FALSE  TRUE FALSE
DATE %[>]% as.Date(c("2000-01-15", "2000-03-15"))
# [1] FALSE FALSE  TRUE

dt1 <- as.Date(c("2000-01-01", "2000-03-15"))
dt2 <- as.Date(c("2000-03-15", "2000-06-07"))
dt1 %[]o[]% dt2
# [1] TRUE
dt1 %[]o[)% dt2
# [1] TRUE
dt1 %[]o(]% dt2
# [1] FALSE
dt1 %[]o()% dt2
# [1] FALSE

Watch precedence!

(2 * 1:5) %[]% (c(2, 3) * 2)
# [1] FALSE  TRUE  TRUE FALSE FALSE
2 * 1:5 %[]% (c(2, 3) * 2)
# [1] 0 0 0 2 2
(2 * 1:5) %[]% c(2, 3) * 2
# [1] 2 0 0 0 0
2 * 1:5 %[]% c(2, 3) * 2
# [1] 0 4 4 0 0

Truncated distributions

Find the math here, as implemented in the package truncdist.

dtrunc <- function(x, ..., distr, lwr=-Inf, upr=Inf) {
    f <- get(paste0("d", distr), mode = "function")
    F <- get(paste0("p", distr), mode = "function")
    Fx_lwr <- F(lwr, ..., log=FALSE)
    Fx_upr <- F(upr, ..., log=FALSE)
    fx     <- f(x,   ..., log=FALSE)
    fx / (Fx_upr - Fx_lwr) * (x %[]% c(lwr, upr))
}
n <- 10^4
curve(dtrunc(x, distr="norm"), -2.5, 2.5, ylim=c(0, 2), ylab="f(x)")
curve(dtrunc(x, distr="norm", lwr=-0.5, upr=0.1), add=TRUE, col=4, n=n)
curve(dtrunc(x, distr="norm", lwr=-0.75, upr=0.25), add=TRUE, col=3, n=n)
curve(dtrunc(x, distr="norm", lwr=-1, upr=1), add=TRUE, col=2, n=n)

Shiny example 1: regular slider

library(shiny)
library(intrval)
library(qcc)

data(pistonrings)
mu <- mean(pistonrings$diameter[pistonrings$trial])
SD <- sd(pistonrings$diameter[pistonrings$trial])
x <- pistonrings$diameter[!pistonrings$trial]

## UI function
ui <- fluidPage(
  plotOutput("plot"),
  sliderInput("x", "x SD:",
    min=0, max=5, value=0, step=0.1,
    animate=animationOptions(100)
  )
)

# Server logic
server <- function(input, output) {
  output$plot <- renderPlot({
    Main <- paste("Shewhart quality control chart", 
        "diameter of piston rings", sprintf("+/- %.1f SD", input$x),
        sep="\n")
    iv <- mu + input$x * c(-SD, SD)
    plot(x, pch = 19, col = x %)(% iv +1, type = "b", 
        ylim = mu + 5 * c(-SD, SD), main = Main)
    abline(h = mu)
    abline(h = iv, lty = 2)
  })
}

## Run shiny app
if (interactive()) shinyApp(ui, server)

Shiny example 2: range slider

library(shiny)
library(intrval)

set.seed(1)
n <- 10^4
x <- round(runif(n, -2, 2), 2)
y <- round(runif(n, -2, 2), 2)
d <- round(sqrt(x^2 + y^2), 2)

## UI function
ui <- fluidPage(
  titlePanel("intrval example with shiny"),
  sidebarLayout(
    sidebarPanel(
      sliderInput("bb_x", "x value:",
        min=min(x), max=max(x), value=range(x), 
        step=round(diff(range(x))/20, 1), animate=TRUE
      ),
      sliderInput("bb_y", "y value:",
        min = min(y), max = max(y), value = range(y),
        step=round(diff(range(y))/20, 1), animate=TRUE
      ),
      sliderInput("bb_d", "radial distance:",
        min = 0, max = max(d), value = c(0, max(d)/2),
        step=round(max(d)/20, 1), animate=TRUE
      )
    ),
    mainPanel(
      plotOutput("plot")
    )
  )
)

# Server logic
server <- function(input, output) {
  output$plot <- renderPlot({
    iv1 <- x %[]% input$bb_x & y %[]% input$bb_y
    iv2 <- x %[]% input$bb_y & y %[]% input$bb_x
    iv3 <- d %()% input$bb_d
    op <- par(mfrow=c(1,2))
    plot(x, y, pch = 19, cex = 0.25, col = iv1 + iv2 + 3,
        main = "Intersecting bounding boxes")
    plot(x, y, pch = 19, cex = 0.25, col = iv3 + 1,
         main = "Deck the halls:\ndistance range from center")
    par(op)
  })
}

## Run shiny app
if (interactive()) shinyApp(ui, server)

intrval's People

Contributors

mgacc0 avatar psolymos avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

intrval's Issues

Roadmap for CRAN release

  • set of functions to finalize (naming etc)
  • support functions to finalize: use a single function for x %[]% c(a,b) types only, print & plot.
  • separate help files for %[]% and %[o]% operators
  • improve testing suite: unit tests + random interval testing
  • updated README file
  • tweet/blog when done

How to match elements of a vector to a data frame of intervals.

This looks like a really useful set of functions. I often need to match elements of a vector to a data frame of intervals. Currently I use the IRanges package, but I wonder if your package might be simpler to use. Can you tell me if there is a more efficient method with your package? This is what I've got so far:

# Find which interval that each element of the vector belongs in

library(tidyverse)
# here are my elements
elements <- c(0.1, 0.2, 0.5, 0.9, 1.1, 1.9, 2.1)

# here are my intervals
intervals <- 
  frame_data(  ~phase, ~start, ~end,
               "a",     0,      0.5,
               "b",     1,      1.9,
               "c",     2,      2.5
  )

# For each element, I want to know what interval does it belong in
library(intrval)
map(elements, ~.x %[]% data.frame(intervals[, c('start', 'end')])) %>% 
  map(., ~unlist(intervals[.x, 'phase'])) 

The output is like this, which I'm happy with:

[[1]]
phase 
  "a" 

[[2]]
phase 
  "a" 

[[3]]
phase 
  "a" 

[[4]]
character(0)

[[5]]
phase 
  "b" 

[[6]]
phase 
  "b" 

[[7]]
phase 
  "c" 

But I'm wondering if there's a simpler way to use your functions so I don't need the two map functions.

Thanks!

better operators for interval-to-interval relations

It would be really great to implement some new operators for interval-to-interval overlapping.

Instead of having the %[o]% operator for checking the overlapping between the closed intervals [a, b] and [x, y]
expressed as c(a, b) %[o]% c(x, y),
it would be great having the %[]o[]% operator
expressed as c(a, b) %[]o[]% c(x, y).

And then, for example:

c(1, 2) %[]o[]% c(2, 3)
# TRUE

c(2, 3) %[]o[]% c(1, 2)
# TRUE

c(1, 2) %()o()% c(2, 3)
# FALSE

c(2, 3) %()o()% c(1, 2)
# FALSE

Consequently, that would imply creating 32 operators (for all the combinations of open/closed endpoints):

interval-to-interval operator
%[]o[]%
%[]o[)%
%[]o(]%
%[]o()%
%[)o[]%
%[)o[)%
%[)o(]%
%[)o()%
... (continues)

specify and check how NAs are handled

Commit ee0c4c1 fixed an undesirable NA-handling behaviour.

  • The NA-handling bahaviour needs to be explicitly stated in the documentation.
  • Need to add unit-tests to ensure this stated behaviour.

A sketch of desirable behaviour:

  • When is.na(x), there is not much one can do, and the results should be NA.
  • When the interval specification contains an NA, e.g. c(1, NA) the na.rm=TRUE reduces the interval to a degenerate interval of a single value when closed [1,1], or nothing when open (1,1). Also there is the issue of soring: it cannot be assessed of the non NA value is the lower or the upper endpoint. This should return an NA. When both endpoints are NAs, return NA.
  • In other words: if any of x, a, b is NA return an NA.

reverse type if the interval limits are swapped

When an interval is expressed in reverse order,
it would be coherent to swap the order of parentheses and brackets:

expect_false(.intrval(4, c(5, 4), "[)"))
# Error: .intrval(4, c(5, 4), "[)") isn't false.

expect_false(.intrval(4, c(4, 5), "(]"))

expect_true(.intrval(5, c(5, 4), "[)"))
# Error: .intrval(5, c(5, 4), "[)") isn't true.

reverse_type <- function (type) {
  ifelse(type=="(]", "[)",
         ifelse(type=="[)", "(]", type))
}
reverse_type("(]")
# [1] "[)"
reverse_type("[)")
# [1] "(]"

Or maybe a warning should be shown when a swapped interval is found?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.