intrval: Relational Operators for Intervals

Evaluates whether values of vectors fall within different open/closed intervals (x %[]% c(a, b)), or whether two closed intervals overlap (c(a1, b1) %[]o[]% c(a2, b2)). Operators for negation and directional relations are also implemented.

Install

Install from CRAN:

install.packages("intrval")

Install development version from GitHub:

if (!requireNamespace("remotes")) install.packages("remotes")
remotes::install_github("psolymos/intrval")

User visible changes are listed in the NEWS file.

Use the issue tracker to report a problem.

Value-to-interval relations

Values of x are compared to interval endpoints a and b (a <= b). Endpoints can be defined as a vector with two values (c(a, b)): these values will be compared as a single interval with each value in x. If endpoints are stored in a matrix-like object or a list, comparisons are made element-wise.

x <- rep(4, 5)
a <- 1:5
b <- 3:7
cbind(x=x, a=a, b=b)
x %[]% cbind(a, b) # matrix
x %[]% data.frame(a=a, b=b) # data.frame
x %[]% list(a, b) # list

If lengths do not match, shorter objects are recycled. Return values are logicals. Note: interval endpoints are sorted internally, so ensuring the condition a <= b is not necessary.

These value-to-interval operators work for numeric (integer, real) and ordered vectors, and for object types measured on at least an ordinal scale (e.g. dates).
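
As a minimal sketch of the two rules above (assuming the package is installed and attached; the variable names are illustrative only), a two-element vector is treated as one interval for every value of x, and reversed endpoints give the same answer because they are sorted internally:

```r
library(intrval)

x <- 1:5

# c(2, 4) is a single interval, compared against each value of x
in_24 <- x %[]% c(2, 4)
in_24
# [1] FALSE  TRUE  TRUE  TRUE FALSE

# reversed endpoints are sorted internally, so the result is unchanged
in_42 <- x %[]% c(4, 2)
all(in_24 == in_42)
# [1] TRUE
```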

Closed and open intervals

The following special operators are used to indicate closed ([, ]) or open ((, )) interval endpoints:

Operator	Expression	Condition
`%[]%`	`x %[]% c(a, b)`	`x >= a & x <= b`
`%[)%`	`x %[)% c(a, b)`	`x >= a & x < b`
`%(]%`	`x %(]% c(a, b)`	`x > a & x <= b`
`%()%`	`x %()% c(a, b)`	`x > a & x < b`
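
The four operators differ only at the endpoints. The following sketch (assuming the package is attached; variable names are illustrative) places two values exactly on the endpoints so the difference is visible:

```r
library(intrval)

x <- c(1, 2, 3)   # 1 and 3 sit exactly on the endpoints

res_closed     <- x %[]% c(1, 3)  #  TRUE TRUE  TRUE
res_right_open <- x %[)% c(1, 3)  #  TRUE TRUE FALSE
res_left_open  <- x %(]% c(1, 3)  # FALSE TRUE  TRUE
res_open       <- x %()% c(1, 3)  # FALSE TRUE FALSE
```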

Negation and directional relations

Equal	Not equal	Less than	Greater than
`%[]%`	`%)(%`	`%[<]%`	`%[>]%`
`%[)%`	`%)[%`	`%[<)%`	`%[>)%`
`%(]%`	`%](%`	`%(<]%`	`%(>]%`
`%()%`	`%][%`	`%(<)%`	`%(>)%`
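
A short sketch of the first row of the table, assuming the package is attached: negation is TRUE exactly where %[]% is FALSE, "less than" flags values below the interval, and "greater than" flags values above it. Variable names are illustrative only.

```r
library(intrval)

x <- 1:5

inside  <- x %[]%  c(2, 4)  # FALSE  TRUE  TRUE  TRUE FALSE
outside <- x %)(%  c(2, 4)  #  TRUE FALSE FALSE FALSE  TRUE
below   <- x %[<]% c(2, 4)  #  TRUE FALSE FALSE FALSE FALSE
above   <- x %[>]% c(2, 4)  # FALSE FALSE FALSE FALSE  TRUE

# for a closed interval, the three relations partition the values
all(inside + below + above == 1)
# [1] TRUE
```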

Dividing a range into 3 intervals

The functions %[c]%, %[c)%, %(c]%, and %(c)% return an integer vector taking one of three values (the c within the brackets refers to 'cut'):

-1L when the value is less than or equal to a (a <= b), depending on the interval type,
0L when the value is inside the interval, or
1L when the value is greater than or equal to b (a <= b), depending on the interval type.

Expression	Evaluates to -1	Evaluates to 0	Evaluates to 1
`x %[c]% c(a, b)`	`x < a`	`x >= a & x <= b`	`x > b`
`x %[c)% c(a, b)`	`x < a`	`x >= a & x < b`	`x >= b`
`x %(c]% c(a, b)`	`x <= a`	`x > a & x <= b`	`x > b`
`x %(c)% c(a, b)`	`x <= a`	`x > a & x < b`	`x >= b`
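
The table can be read off a small example. This sketch assumes the package is attached; the labels vector is an illustrative addition, not part of the package:

```r
library(intrval)

x <- 1:5

# closed interval: the endpoints 2 and 4 count as inside
pos_closed <- x %[c]% c(2, 4)
pos_closed
# [1] -1  0  0  0  1

# open interval: the endpoints fall outside
pos_open <- x %(c)% c(2, 4)
pos_open
# [1] -1 -1  0  1  1

# shift the codes from -1..1 to 1..3 to index a vector of labels
labels <- c("below", "inside", "above")
labelled <- labels[pos_closed + 2L]
labelled
# [1] "below"  "inside" "inside" "inside" "above"
```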

Interval-to-interval relations

The operators define the open/closed nature of the lower/upper limits of the intervals on the left and right hand side of the o in the middle.

Intervals	Int. 2: `[]`	Int. 2: `[)`	Int. 2: `(]`	Int. 2: `()`
Int. 1: `[]`	`%[]o[]%`	`%[]o[)%`	`%[]o(]%`	`%[]o()%`
Int. 1: `[)`	`%[)o[]%`	`%[)o[)%`	`%[)o(]%`	`%[)o()%`
Int. 1: `(]`	`%(]o[]%`	`%(]o[)%`	`%(]o(]%`	`%(]o()%`
Int. 1: `()`	`%()o[]%`	`%()o[)%`	`%()o(]%`	`%()o()%`

The overlap of two closed intervals, [a1, b1] and [a2, b2], is evaluated by the %[o]% operator (alias for %[]o[]%), with a1 <= b1 and a2 <= b2. Endpoints can be defined as a vector with two values (c(a1, b1)) or can be stored in matrix-like objects or lists, in which case comparisons are made element-wise. If lengths do not match, shorter objects are recycled. These interval-to-interval operators work for numeric (integer, real) and ordered vectors, and for object types measured on at least an ordinal scale (e.g. dates); see Examples. Note: interval endpoints are sorted internally, so ensuring the conditions a1 <= b1 and a2 <= b2 is not necessary.

c(2, 3) %[]o[]% c(0, 1)
list(0:4, 1:5) %[]o[]% c(2, 3)
cbind(0:4, 1:5) %[]o[]% c(2, 3)
data.frame(a=0:4, b=1:5) %[]o[]% c(2, 3)

%)o(% negates the overlap of two closed intervals; directional evaluation is done via the operators %[<o]% and %[o>]%. The overlap of two open intervals is evaluated by %(o)% (alias for %()o()%). %]o[% negates the overlap of two open intervals; directional evaluation is done via the operators %(<o)% and %(o>)%.

Equal	Not equal	Less than	Greater than
`%[o]%`	`%)o(%`	`%[<o]%`	`%[o>]%`
`%(o)%`	`%]o[%`	`%(<o)%`	`%(o>)%`

Overlap operators with mixed endpoints do not have negation and directional counterparts.
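
The endpoint type matters most when two intervals only touch. The sketch below assumes the package is attached and reads the directional operators as "interval 1 lies entirely below (or above) interval 2"; that reading is an assumption based on the column headings above. Variable names are illustrative.

```r
library(intrval)

# touching intervals share the single point 2
touch_closed <- c(1, 2) %[]o[]% c(2, 3)  # TRUE: 2 belongs to both
touch_open   <- c(1, 2) %()o()% c(2, 3)  # FALSE: 2 belongs to neither
touch_mixed  <- c(1, 2) %[]o(]% c(2, 3)  # FALSE: 2 is excluded from (2, 3]

# short alias and negation for closed intervals
alias_closed <- c(1, 2) %[o]% c(2, 3)    # TRUE
no_overlap   <- c(1, 2) %)o(% c(2, 3)    # FALSE

# directional relations on clearly separated intervals
first_below  <- c(1, 2) %[<o]% c(3, 4)   # TRUE
first_above  <- c(1, 2) %[o>]% c(3, 4)   # FALSE
```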

Operators for discrete variables

The previous operators will return NA for unordered factors. Set overlap can be evaluated by the base %in% operator and its negation %ni% (as in not in, the opposite of in). %nin% and %notin% are aliases for better code readability (%in% can look very much like %ni%).
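
A brief sketch of set membership and its negation, assuming the package is attached (the example vectors are illustrative):

```r
library(intrval)

x   <- c("a", "b", "z")
set <- c("a", "b", "c")

is_in  <- x %in% set   #  TRUE  TRUE FALSE
not_in <- x %ni% set   # FALSE FALSE  TRUE

# the aliases described above give the same result as %ni%
same_nin   <- all(not_in == (x %nin% set))
same_notin <- all(not_in == (x %notin% set))
```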

Examples

Bounding box

set.seed(1)
n <- 10^4
x <- runif(n, -2, 2)
y <- runif(n, -2, 2)
d <- sqrt(x^2 + y^2)
iv1 <- x %[]% c(-0.25, 0.25) & y %[]% c(-1.5, 1.5)
iv2 <- x %[]% c(-1.5, 1.5) & y %[]% c(-0.25, 0.25)
iv3 <- d %()% c(1, 1.5)
plot(x, y, pch = 19, cex = 0.25, col = iv1 + iv2 + 1,
    main = "Intersecting bounding boxes")
plot(x, y, pch = 19, cex = 0.25, col = iv3 + 1,
     main = "Deck the halls:\ndistance range from center")

Time series filtering

x <- seq(0, 4*24*60*60, 60*60)
dt <- as.POSIXct(x, origin="2000-01-01 00:00:00")
f <- as.POSIXlt(dt)$hour %[]% c(0, 11)
plot(sin(x) ~ dt, type="l", col="grey",
    main = "Filtering date/time objects")
points(sin(x) ~ dt, pch = 19, col = f + 1)

Quality control chart (QCC)

library(qcc)
data(pistonrings)
mu <- mean(pistonrings$diameter[pistonrings$trial])
SD <- sd(pistonrings$diameter[pistonrings$trial])
x <- pistonrings$diameter[!pistonrings$trial]
iv <- mu + 3 * c(-SD, SD)
plot(x, pch = 19, col = x %)(% iv +1, type = "b", ylim = mu + 5 * c(-SD, SD),
    main = "Shewhart quality control chart\ndiameter of piston rings")
abline(h = mu)
abline(h = iv, lty = 2)

Confidence intervals and hypothesis testing

## Annette Dobson (1990) "An Introduction to Generalized Linear Models".
## Page 9: Plant Weight Data.
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)

lm.D9 <- lm(weight ~ group)
## compare 95% confidence intervals with 0
(CI.D9 <- confint(lm.D9))
#                2.5 %    97.5 %
# (Intercept)  4.56934 5.4946602
# groupTrt    -1.02530 0.2833003
0 %[]% CI.D9
# (Intercept)    groupTrt
#       FALSE        TRUE

lm.D90 <- lm(weight ~ group - 1) # omitting intercept
## compare 95% confidence of the 2 groups to each other
(CI.D90 <- confint(lm.D90))
#            2.5 %  97.5 %
# groupCtl 4.56934 5.49466
# groupTrt 4.19834 5.12366
CI.D90[1,] %[o]% CI.D90[2,]
# 2.5 %
#  TRUE

Dates

DATE <- as.Date(c("2000-01-01","2000-02-01", "2000-03-31"))
DATE %[<]% as.Date(c("2000-01-15", "2000-03-15"))
# [1]  TRUE FALSE FALSE
DATE %[]% as.Date(c("2000-01-15", "2000-03-15"))
# [1] FALSE  TRUE FALSE
DATE %[>]% as.Date(c("2000-01-15", "2000-03-15"))
# [1] FALSE FALSE  TRUE

dt1 <- as.Date(c("2000-01-01", "2000-03-15"))
dt2 <- as.Date(c("2000-03-15", "2000-06-07"))
dt1 %[]o[]% dt2
# [1] TRUE
dt1 %[]o[)% dt2
# [1] TRUE
dt1 %[]o(]% dt2
# [1] FALSE
dt1 %[]o()% dt2
# [1] FALSE

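Watch precedence!

Special operators such as %[]% have higher precedence than * and /, so arithmetic on either side must be wrapped in parentheses to be evaluated first:
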
(2 * 1:5) %[]% (c(2, 3) * 2)
# [1] FALSE  TRUE  TRUE FALSE FALSE
2 * 1:5 %[]% (c(2, 3) * 2)
# [1] 0 0 0 2 2
(2 * 1:5) %[]% c(2, 3) * 2
# [1] 2 0 0 0 0
2 * 1:5 %[]% c(2, 3) * 2
# [1] 0 4 4 0 0

Truncated distributions

The math follows the implementation in the package truncdist.

dtrunc <- function(x, ..., distr, lwr=-Inf, upr=Inf) {
    f <- get(paste0("d", distr), mode = "function")
    F <- get(paste0("p", distr), mode = "function")
    Fx_lwr <- F(lwr, ..., log=FALSE)
    Fx_upr <- F(upr, ..., log=FALSE)
    fx     <- f(x,   ..., log=FALSE)
    fx / (Fx_upr - Fx_lwr) * (x %[]% c(lwr, upr))
}
n <- 10^4
curve(dtrunc(x, distr="norm"), -2.5, 2.5, ylim=c(0, 2), ylab="f(x)")
curve(dtrunc(x, distr="norm", lwr=-0.5, upr=0.1), add=TRUE, col=4, n=n)
curve(dtrunc(x, distr="norm", lwr=-0.75, upr=0.25), add=TRUE, col=3, n=n)
curve(dtrunc(x, distr="norm", lwr=-1, upr=1), add=TRUE, col=2, n=n)

Shiny example 1: regular slider

library(shiny)
library(intrval)
library(qcc)

data(pistonrings)
mu <- mean(pistonrings$diameter[pistonrings$trial])
SD <- sd(pistonrings$diameter[pistonrings$trial])
x <- pistonrings$diameter[!pistonrings$trial]

## UI function
ui <- fluidPage(
  plotOutput("plot"),
  sliderInput("x", "x SD:",
    min=0, max=5, value=0, step=0.1,
    animate=animationOptions(100)
  )
)

# Server logic
server <- function(input, output) {
  output$plot <- renderPlot({
    Main <- paste("Shewhart quality control chart", 
        "diameter of piston rings", sprintf("+/- %.1f SD", input$x),
        sep="\n")
    iv <- mu + input$x * c(-SD, SD)
    plot(x, pch = 19, col = x %)(% iv +1, type = "b", 
        ylim = mu + 5 * c(-SD, SD), main = Main)
    abline(h = mu)
    abline(h = iv, lty = 2)
  })
}

## Run shiny app
if (interactive()) shinyApp(ui, server)

Shiny example 2: range slider

library(shiny)
library(intrval)

set.seed(1)
n <- 10^4
x <- round(runif(n, -2, 2), 2)
y <- round(runif(n, -2, 2), 2)
d <- round(sqrt(x^2 + y^2), 2)

## UI function
ui <- fluidPage(
  titlePanel("intrval example with shiny"),
  sidebarLayout(
    sidebarPanel(
      sliderInput("bb_x", "x value:",
        min=min(x), max=max(x), value=range(x), 
        step=round(diff(range(x))/20, 1), animate=TRUE
      ),
      sliderInput("bb_y", "y value:",
        min = min(y), max = max(y), value = range(y),
        step=round(diff(range(y))/20, 1), animate=TRUE
      ),
      sliderInput("bb_d", "radial distance:",
        min = 0, max = max(d), value = c(0, max(d)/2),
        step=round(max(d)/20, 1), animate=TRUE
      )
    ),
    mainPanel(
      plotOutput("plot")
    )
  )
)

# Server logic
server <- function(input, output) {
  output$plot <- renderPlot({
    iv1 <- x %[]% input$bb_x & y %[]% input$bb_y
    iv2 <- x %[]% input$bb_y & y %[]% input$bb_x
    iv3 <- d %()% input$bb_d
    op <- par(mfrow=c(1,2))
    plot(x, y, pch = 19, cex = 0.25, col = iv1 + iv2 + 3,
        main = "Intersecting bounding boxes")
    plot(x, y, pch = 19, cex = 0.25, col = iv3 + 1,
         main = "Deck the halls:\ndistance range from center")
    par(op)
  })
}

## Run shiny app
if (interactive()) shinyApp(ui, server)

intrval's Issues

Interval-to-interval relations: some of them are not implemented?

Some interval-to-interval relations

Equal	Not equal	Less than	Greater than
`%[0]%`	`%)0(%`	`%[<0]%`	`%[0>]%`

are mentioned on the README but they are not implemented in ../blob/master/R/specials.R ?

(Great package! And congratulations...
I'm sure I will use it a lot. 👍 )

Roadmap for CRAN release

set of functions to finalize (naming etc)
support functions to finalize: use a single function for x %[]% c(a,b) types only, print & plot.
separate help files for %[]% and %[o]% operators
improve testing suite: unit tests + random interval testing
updated README file
tweet/blog when done

How to match elements of a vector to a data frame of intervals.

This looks like a really useful set of functions. I often need to match elements of a vector to a data frame of intervals. Currently I use the IRanges package, but I wonder if your package might be simpler to use. Can you tell me if there is a more efficient method with your package? This is what I've got so far:

# Find which interval that each element of the vector belongs in

library(tidyverse)
# here are my elements
elements <- c(0.1, 0.2, 0.5, 0.9, 1.1, 1.9, 2.1)

# here are my intervals
intervals <- 
  frame_data(  ~phase, ~start, ~end,
               "a",     0,      0.5,
               "b",     1,      1.9,
               "c",     2,      2.5
  )

# For each element, I want to know what interval does it belong in
library(intrval)
map(elements, ~.x %[]% data.frame(intervals[, c('start', 'end')])) %>% 
  map(., ~unlist(intervals[.x, 'phase']))

The output is like this, which I'm happy with:

[[1]]
phase 
  "a" 

[[2]]
phase 
  "a" 

[[3]]
phase 
  "a" 

[[4]]
character(0)

[[5]]
phase 
  "b" 

[[6]]
phase 
  "b" 

[[7]]
phase 
  "c"

But I'm wondering if there's a simpler way to use your functions so I don't need the two map functions.

Thanks!
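
One possible simplification, offered only as a sketch and not as a reply from the package author: a single vapply() call can compare each element with all intervals at once, because a two-column data frame on the right-hand side is compared element-wise. The object phase_of and the choice to return NA for unmatched elements (instead of character(0)) are assumptions made for this illustration.

```r
library(intrval)

elements <- c(0.1, 0.2, 0.5, 0.9, 1.1, 1.9, 2.1)

intervals <- data.frame(
  phase = c("a", "b", "c"),
  start = c(0, 1, 2),
  end   = c(0.5, 1.9, 2.5),
  stringsAsFactors = FALSE
)

# for each element, test it against every row of the interval table
phase_of <- vapply(elements, function(e) {
  hit <- e %[]% intervals[, c("start", "end")]
  # take the first matching interval, or NA when none matches
  if (any(hit)) intervals$phase[which(hit)[1]] else NA_character_
}, character(1))

phase_of
# [1] "a" "a" "a" NA  "b" "b" "c"
```

This returns a plain character vector of the same length as elements, which is usually easier to attach to a data frame than a list.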

better operators for interval-to-interval relations

It would be really great to implement some new operators for interval-to-interval overlapping.

Instead of having the %[o]% operator for checking the overlapping between the closed intervals [a, b] and [x, y]
expressed as c(a, b) %[o]% c(x, y),
it would be great having the %[]o[]% operator
expressed as c(a, b) %[]o[]% c(x, y).

And then, for example:

c(1, 2) %[]o[]% c(2, 3)
# TRUE

c(2, 3) %[]o[]% c(1, 2)
# TRUE

c(1, 2) %()o()% c(2, 3)
# FALSE

c(2, 3) %()o()% c(1, 2)
# FALSE

Consequently, that would imply creating 32 operators (for all the combinations of open/closed endpoints):

interval-to-interval operator
`%[]o[]%`
`%[]o[)%`
`%[]o(]%`
`%[]o()%`
`%[)o[]%`
`%[)o[)%`
`%[)o(]%`
`%[)o()%`
... (continues)

enhancement: use a modern testing environment

To solve the CI error: just add "Suggests: testthat" to your DESCRIPTION.
http://r-pkgs.had.co.nz/tests.html

specify and check how NAs are handled

Commit ee0c4c1 fixed an undesirable NA-handling behaviour.

The NA-handling behaviour needs to be explicitly stated in the documentation.
Need to add unit-tests to ensure this stated behaviour.

A sketch of desirable behaviour:

When is.na(x), there is not much one can do, and the result should be NA.
When the interval specification contains an NA, e.g. c(1, NA), using na.rm=TRUE reduces the interval to a degenerate interval of a single value when closed ([1, 1]), or to nothing when open ((1, 1)). There is also the issue of sorting: it cannot be determined whether the non-NA value is the lower or the upper endpoint. This should return NA. When both endpoints are NA, return NA.
In other words: if any of x, a, b is NA, return NA.
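
The rule above can be sketched in base R. The helper in_closed_na is hypothetical and is not part of intrval; it covers only the closed-interval case and is meant to illustrate the proposed behaviour, not the package's implementation.

```r
# hypothetical helper: closed-interval test that returns NA
# whenever x or either endpoint is NA
in_closed_na <- function(x, a, b) {
  lo <- pmin(a, b)   # NA if either endpoint is NA
  hi <- pmax(a, b)
  res <- x >= lo & x <= hi
  # make the NA rule explicit instead of relying on propagation
  res[is.na(x) | is.na(lo) | is.na(hi)] <- NA
  res
}

in_closed_na(c(0, 2, NA), 1, 3)
# [1] FALSE  TRUE    NA
in_closed_na(2, 1, NA)
# [1] NA
in_closed_na(2, NA, NA)
# [1] NA
```

A unit test for the stated behaviour could assert exactly these three results.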

reverse type if the interval limits are swapped

When an interval is expressed in reverse order,
it would be coherent to swap the order of parentheses and brackets:

expect_false(.intrval(4, c(5, 4), "[)"))
# Error: .intrval(4, c(5, 4), "[)") isn't false.

expect_false(.intrval(4, c(4, 5), "(]"))

expect_true(.intrval(5, c(5, 4), "[)"))
# Error: .intrval(5, c(5, 4), "[)") isn't true.

reverse_type <- function (type) {
  ifelse(type=="(]", "[)",
         ifelse(type=="[)", "(]", type))
}
reverse_type("(]")
# [1] "[)"
reverse_type("[)")
# [1] "(]"

Or maybe a warning should be shown when a swapped interval is found?
