bignum's People

Contributors

davidchall, lionel-

bignum's Issues

Release bignum 0.3.0

Prepare for release:

  • Check current CRAN check results
  • Polish NEWS
  • devtools::build_readme()
  • urlchecker::url_check()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • rhub::check(platform = 'solaris-x86-patched')
  • revdepcheck::revdep_check(num_workers = 4)
  • Update cran-comments.md
  • Review pkgdown reference index for, e.g., missing topics

Submit to CRAN:

  • usethis::use_version('minor')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()

Use a power of two for calling cpp11::check_user_interrupt()

I noticed several checks of the form x % 10000 == 0 in the code. This check is slower than, e.g., x % 16384 == 0, x % 8192 == 0, or even (x & 0x1fff) == 0, because a power-of-two modulus needs only a bitwise AND, whereas the former needs an integer division.

I suspect these tight loops will run a bit faster with one of the alternative variants. I haven't found a good source that demonstrates the actual impact; we're probably talking microseconds at best for a typical operation.
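A minimal C++ sketch of the mask-based check is shown here. The names kInterruptInterval, should_check_interrupt and count_interrupt_checks are hypothetical, and a counter stands in for the call to cpp11::check_user_interrupt() so that the snippet compiles without cpp11.

```cpp
#include <cstddef>

// Hypothetical interval; the mask trick only works for powers of two.
constexpr std::size_t kInterruptInterval = 8192;                // 2^13
constexpr std::size_t kInterruptMask = kInterruptInterval - 1;  // 0x1fff

static_assert((kInterruptInterval & kInterruptMask) == 0,
              "interval must be a power of two");

// True on iterations 0, 8192, 16384, ...
inline bool should_check_interrupt(std::size_t i) {
  return (i & kInterruptMask) == 0;
}

// Counts how often the check would fire in a loop of n iterations.
// In package code, the body of the `if` would call
// cpp11::check_user_interrupt() instead of incrementing a counter.
inline std::size_t count_interrupt_checks(std::size_t n) {
  std::size_t checks = 0;
  for (std::size_t i = 0; i < n; ++i) {
    if (should_check_interrupt(i)) {
      ++checks;
    }
  }
  return checks;
}
```

For unsigned loop counters, many compilers already turn i % 8192 into the same AND instruction, so the explicit mask mainly matters when the counter is signed or the intent should be obvious to the reader.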

Support `round()` and `signif()`

  • Wait for response to r-lib/vctrs#1389.
  • Implement using format() method.
  • Handle digits argument (e.g. named vs unnamed).
  • Handle negative digits argument (round to powers of 10) - requires changes to C++ code.

Use vendored Boost headers

Remove BH dependency by following approach outlined in r-dbi/RSQLite#362.

This should not change functional behavior, but significantly reduce the download size needed to use the bignum package (since the BH package includes all Boost headers, which is more than we need for bignum).

Release bignum 0.1.0

First release:

Prepare for release:

  • devtools::build_readme()
  • urlchecker::url_check()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • rhub::check(platform = 'solaris-x86-patched')
  • rhub::check(platform = 'ubuntu-rchk')
  • rhub::check_with_sanitizers()
  • Review pkgdown reference index for, e.g., missing topics

Submit to CRAN:

  • usethis::use_version('minor')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_news_md()
  • usethis::use_dev_version()
  • Update install instructions in README

Constructors shouldn't accept ellipsis?

I might need to make biginteger() have the same signature as as_biginteger() (i.e. take a single argument). Currently it accepts multiple arguments, which get cast to a character vector before constructing the biginteger. The problem is that this loses the special handling of lossy casts.

library(bignum)

as_biginteger(2.5)
#> Warning: Can't convert from <double> to <biginteger> due to loss of precision.
#> * Locations: 1
#> <biginteger[1]>
#> [1] <NA>

biginteger(2.5)
#> <biginteger[1]>
#> [1] <NA>

Created on 2021-05-09 by the reprex package (v2.0.0)

Release bignum 0.2.0

Prepare for release:

  • Check current CRAN check results
  • Polish NEWS
  • devtools::build_readme()
  • urlchecker::url_check()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • rhub::check(platform = 'ubuntu-rchk')
  • rhub::check_with_sanitizers()
  • rhub::check(platform = 'solaris-x86-patched')
  • Update cran-comments.md
  • Review pkgdown reference index for, e.g., missing topics

Submit to CRAN:

  • usethis::use_version('minor')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()

Use ALTREP

I'm looking for a vector class that supports integers with arbitrary precision and that works in a data frame/tibble. This package fulfills the need, but storage is inefficient (R strings), and every arithmetic operation requires an expensive round trip from strings to the binary format and back.

With ALTREP we could make use of the underlying library's storage format and convert to strings or numerics as necessary. NA values are a challenge but we could use a vctrs record with a raw vector on the R side or a bit array on the C++ side. Would you support this? Should we also support fixed-width integers (int8, int16, int32, int64, int128, int256)?
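To illustrate the "bit array on the C++ side" option for tracking NA values, here is a minimal sketch. The class NaBitmap and its methods are hypothetical and not part of bignum or ALTREP; it only shows one bit per element stored alongside the native values.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: one bit per element, set when the element is NA.
class NaBitmap {
 public:
  explicit NaBitmap(std::size_t n) : size_(n), bits_((n + 7) / 8, 0) {}

  std::size_t size() const { return size_; }

  // Marks element i as NA (na = true) or as a regular value (na = false).
  void set_na(std::size_t i, bool na) {
    const std::uint8_t mask = static_cast<std::uint8_t>(1u << (i % 8));
    if (na) {
      bits_[i / 8] |= mask;
    } else {
      bits_[i / 8] &= static_cast<std::uint8_t>(~mask);
    }
  }

  bool is_na(std::size_t i) const {
    return ((bits_[i / 8] >> (i % 8)) & 1u) != 0;
  }

 private:
  std::size_t size_;
  std::vector<std::uint8_t> bits_;
};
```

Compared with a full logical vector, this uses one eighth of a byte per element, at the cost of a shift and a mask on every lookup.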

I'm also wondering if you'd be open to inlining/vendoring the required Boost headers, as done e.g. in r-dbi/RSQLite#362. For DBI and related packages, we'd rather avoid using {BH} because it installs so many files.

Allow leading zeros

It would be nice if biginteger() supported leading zeros. Currently (under version 0.3.0) it returns NA:

biginteger("08")

gives

<biginteger[1]>
[1] <NA>
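One possible fix is to normalize the string before it reaches the parser. The sketch below is an assumption about how that could look, not the package's implementation: the helper strip_leading_zeros is hypothetical, and it deliberately leaves inputs such as "0x1f" alone in case the parser treats them as prefixed literals.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical pre-parsing step: drop redundant leading zeros from a decimal
// integer string, keeping an optional sign and at least one digit.
inline std::string strip_leading_zeros(const std::string& input) {
  std::size_t pos = 0;
  std::string sign;
  if (!input.empty() && (input[0] == '+' || input[0] == '-')) {
    sign = input.substr(0, 1);
    pos = 1;
  }
  // Only skip a zero when another digit follows, so "0" stays "0" and
  // "0x1f" is returned unchanged.
  while (pos + 1 < input.size() && input[pos] == '0' &&
         std::isdigit(static_cast<unsigned char>(input[pos + 1]))) {
    ++pos;
  }
  return sign + input.substr(pos);
}
```

With this helper, "08" would be passed on as "8" and "-007" as "-7", while "42" and "0" are unchanged.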

Pretty printing (pillar)

As a follow-up to #6, it'd be nice to apply the same styling that pillar applies to atomic numeric vectors, e.g.:

  • Align mantissa around decimal point
  • Color negative mantissa
  • Align exponent to the right
  • Color negative exponent
  • Show exponent sign only if there is a negative exponent
  • Subtle style for exponent "e"
  • Update width calculations to support all this

Edge cases disagree with base R functions

As raised in #44, boost::math functions throw an exception under certain edge cases (e.g., domain error, pole error). In this scenario, bignum catches the exception and inserts NA in the result vector:

bignum/src/operations.h

Lines 16 to 20 in aeb311c

try {
  output.data[i] = UnaryOperation(x.data[i]);
} catch (...) {
  output.is_na[i] = true; // # nocov
}

In contrast, the base R version of the math function might not return NA. For example, it might return NaN or infinity. We should align the bignum outputs with base R and add these edge cases to the unit tests. I think the correct approach is to pass an error handling policy as an argument to the boost::math function.

Here's a list of known edge cases that cause disagreement between bignum and R:

  • digamma(-1)
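As a rough illustration of the difference between NA and NaN results, the sketch below separates the two cases in the catch block. This is a different technique from the policy-based approach suggested above, and it rests on assumptions: the failing function is assumed to throw std::domain_error, and DoubleColumn, apply_unary and checked_log are hypothetical stand-ins that use plain doubles rather than Boost types.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the package's vector type.
struct DoubleColumn {
  std::vector<double> data;
  std::vector<bool> is_na;
};

// Stand-in for a math function that throws on a domain error.
inline double checked_log(double v) {
  if (v < 0) {
    throw std::domain_error("log of a negative number");
  }
  return std::log(v);
}

template <typename UnaryOperation>
DoubleColumn apply_unary(const DoubleColumn& x, UnaryOperation op) {
  DoubleColumn output;
  output.data.assign(x.data.size(), 0.0);
  output.is_na.assign(x.data.size(), false);
  for (std::size_t i = 0; i < x.data.size(); ++i) {
    if (x.is_na[i]) {
      output.is_na[i] = true;  // NA in, NA out
      continue;
    }
    try {
      output.data[i] = op(x.data[i]);
    } catch (const std::domain_error&) {
      // Domain error: report NaN rather than NA, as base R does for digamma(-1).
      output.data[i] = std::numeric_limits<double>::quiet_NaN();
    } catch (...) {
      output.is_na[i] = true;  // any other failure is still reported as NA
    }
  }
  return output;
}
```

Calling apply_unary on a column holding 1, -1 and an NA entry with checked_log would give 0, NaN and NA respectively, which is the kind of distinction the unit tests would need to assert.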

format not working well with data.frame when being printed

Bug description

format.bignum_bigfloat and format.bignum_biginteger do not work well with data.frame when being printed.

To Reproduce

dt <- data.frame(
  id = 1,
  x = bignum::bigfloat(1 / 3)
)

dt
#> Error: `...` is not empty.
#> 
#> We detected these problematic arguments:
#> * `na.encode`
#> * `justify`
#> 
#> These dots only exist to allow future extensions and should be empty.
#> Did you misspecify an argument?

Stack trace:

1: stop(fallback)
2: signal_abort(cnd)
3: action(message, .subclass = c(.subclass, "rlib_error_dots"), 
4: action_dots(action = action, message = "`...` is not empty.", 
5: ellipsis::check_dots_empty()
6: format.bignum_biginteger(x[[i]], ..., justify = justify)
7: format(x[[i]], ..., justify = justify)
8: format.data.frame(if (omit) x[seq_len(n0), , drop = FALSE] else x, 
9: as.matrix(format.data.frame(if (omit) x[seq_len(n0), , drop = FALSE] else x, 
10: print.data.frame(x)
11: (function (x, ...)

Additional context

─ Session info ───────────────────────────────────────────────────────────────
 setting  value                       
 version  R version 4.1.0 (2021-05-18)
 os       macOS Big Sur 11.4          
 system   x86_64, darwin17.0          
 ui       X11                         
 language (EN)                        
 collate  en_US.UTF-8                 
 ctype    en_US.UTF-8                 
 tz       Asia/Shanghai               
 date     2021-06-14

─ Packages ───────────────────────────────────────────────────────────────────
 package     * version date       lib source        
 bignum        0.2.0   2021-06-13 [1] CRAN (R 4.1.0)
 cli           2.5.0   2021-04-26 [1] CRAN (R 4.1.0)
 crayon        1.4.1   2021-02-08 [1] CRAN (R 4.1.0)
 ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.1.0)
 jsonlite      1.7.2   2020-12-09 [1] CRAN (R 4.1.0)
 rlang         0.4.11  2021-04-30 [1] CRAN (R 4.1.0)
 rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.1.0)
 sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.1.0)
 vctrs         0.3.8   2021-04-29 [1] CRAN (R 4.1.0)
 withr         2.4.2   2021-04-18 [1] CRAN (R 4.1.0)

[1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library

Failure with dev waldo

waldo now (correctly) differentiates NA and NaN so two bignum tests fail:

Failure (test-vctrs-math.R:136:3): math returning float works
suppressWarnings(as.double(fun(biginteger(x), ...))) (`actual`) not equal to suppressWarnings(fun(x, ...)) (`expected`).

  `actual`: 0.4227843350984671 0.9227843350984671 NA  NA
`expected`: 0.4227843350984675 0.9227843350984675 NA NaN
Backtrace:
 1. bignum (local) check_math(x, digamma)
      at test-vctrs-math.R:136:2
 2. testthat::expect_equal(...)
      at test-vctrs-math.R:99:4

Failure (test-vctrs-math.R:136:3): math returning float works
suppressWarnings(as.double(fun(bigfloat(x), ...))) (`actual`) not equal to suppressWarnings(fun(x, ...)) (`expected`).

  `actual`: 0.4227843350984671 0.9227843350984671 NA  NA
`expected`: 0.4227843350984675 0.9227843350984675 NA NaN
Backtrace:
 1. bignum (local) check_math(x, digamma)
      at test-vctrs-math.R:136:2
 2. testthat::expect_equal(...)
      at test-vctrs-math.R:103:4

I didn't see an obvious way to fix this in just the tests, suggesting that this is revealing a real difference in behaviour that you might need to look into.

Release bignum 0.3.1

Prepare for release:

  • git pull
  • Check current CRAN check results
  • Polish NEWS
  • devtools::build_readme()
  • urlchecker::url_check()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • revdepcheck::revdep_check(num_workers = 4)
  • Update cran-comments.md
  • git push

Submit to CRAN:

  • usethis::use_version('patch')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • git push
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • git push

Factorial function

The base R factorial() is not a generic, so perhaps it can be called bigfactorial()?
