arrow-cast-guess-precision

Cast integers to timestamps, with options for guessing the precision.

Just replace arrow::compute::cast with arrow_cast_guess_precision::cast and you are done.

use arrow::{
    array::{Int64Array, TimestampNanosecondArray},
    datatypes::{DataType, TimeUnit}
};

let data = vec![1701325744956, 1701325744956];
let array = Int64Array::from(data);
let array = arrow_cast_guess_precision::cast(
    &array,
    &DataType::Timestamp(TimeUnit::Nanosecond, None),
)
.unwrap();
let nanos = array
    .as_any()
    .downcast_ref::<TimestampNanosecondArray>()
    .unwrap();
// The input is guessed to be in milliseconds, so it is scaled by 10^6 to nanoseconds.
assert_eq!(nanos.value(0), 1701325744956 * 1000 * 1000);

The differences from the official arrow::compute::cast are:

  • arrow v49 casts integers directly to timestamps, whereas this crate (arrow-cast-guess-precision = "0.3.0") tries to guess the precision from the value.
  • arrow v48 does not support casting from integers to timestamps at all (arrow-cast-guess-precision = "0.2.0").

The guessing method is:

use arrow::datatypes::TimeUnit;

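// Guessing bound in years (each year counted as 365 days).
const GUESSING_BOUND_YEARS: i64 = 10000;
// Number of seconds in GUESSING_BOUND_YEARS years. Larger absolute values
// are no longer treated as seconds; each further threshold is 1000x the previous one.
const LOWER_BOUND_MILLIS: i64 = 86400 * 365 * GUESSING_BOUND_YEARS;
const LOWER_BOUND_MICROS: i64 = 1000 * 86400 * 365 * GUESSING_BOUND_YEARS;
const LOWER_BOUND_NANOS: i64 = 1000 * 1000 * 86400 * 365 * GUESSING_BOUND_YEARS;

/// Guesses the unit of an integer timestamp from its magnitude,
/// checking the largest threshold first.
#[inline]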
const fn guess_precision(timestamp: i64) -> TimeUnit {
    let timestamp = timestamp.abs();
    if timestamp > LOWER_BOUND_NANOS {
        return TimeUnit::Nanosecond;
    }
    if timestamp > LOWER_BOUND_MICROS {
        return TimeUnit::Microsecond;
    }
    if timestamp > LOWER_BOUND_MILLIS {
        return TimeUnit::Millisecond;
    }
    TimeUnit::Second
}
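
To see how the thresholds behave, here is a minimal, dependency-free sketch. It mirrors the function above but swaps arrow's TimeUnit for a local stand-in enum named Unit, and it adds a hypothetical helper, to_nanos, which is not part of the crate. The helper only illustrates the rescaling seen in the first example; the crate's own cast may work differently.

```rust
/// Local stand-in for arrow's `TimeUnit` (an assumption, so that the
/// sketch compiles without the arrow crate).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Unit {
    Second,
    Millisecond,
    Microsecond,
    Nanosecond,
}

// Same values as the constants in the listing above.
const GUESSING_BOUND_YEARS: i64 = 10000;
const LOWER_BOUND_MILLIS: i64 = 86400 * 365 * GUESSING_BOUND_YEARS;
const LOWER_BOUND_MICROS: i64 = 1000 * LOWER_BOUND_MILLIS;
const LOWER_BOUND_NANOS: i64 = 1000 * LOWER_BOUND_MICROS;

/// Guesses the unit from the magnitude of the value, largest threshold first.
pub const fn guess_precision(timestamp: i64) -> Unit {
    let timestamp = timestamp.abs();
    if timestamp > LOWER_BOUND_NANOS {
        return Unit::Nanosecond;
    }
    if timestamp > LOWER_BOUND_MICROS {
        return Unit::Microsecond;
    }
    if timestamp > LOWER_BOUND_MILLIS {
        return Unit::Millisecond;
    }
    Unit::Second
}

/// Hypothetical helper (not part of the crate): rescales a value to
/// nanoseconds according to the guessed unit. Returns `None` on overflow.
pub fn to_nanos(timestamp: i64) -> Option<i64> {
    let factor: i64 = match guess_precision(timestamp) {
        Unit::Second => 1_000_000_000,
        Unit::Millisecond => 1_000_000,
        Unit::Microsecond => 1_000,
        Unit::Nanosecond => 1,
    };
    timestamp.checked_mul(factor)
}
```

With these constants, guess_precision(1701325744) is Unit::Second and guess_precision(1701325744956) is Unit::Millisecond, so to_nanos(1701325744956) yields Some(1701325744956000000).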

Users can set the ARROW_CAST_GUESSING_BOUND_YEARS environment variable at build time to control the guessing bound. Here are the resulting bounds for some sample values:

Value   Lower bound           Upper bound
100     1970-02-06T12:00:00   2069-12-07T00:00:00
200     1970-03-15T00:00:00   2169-11-13T00:00:00
500     1970-07-02T12:00:00   2469-09-01T00:00:00
1000    1971-01-01T00:00:00   2969-05-03T00:00:00
2000    1972-01-01T00:00:00   3968-09-03T00:00:00
5000    1974-12-31T00:00:00   6966-09-06T00:00:00
10000   1979-12-30T00:00:00   +11963-05-13T00:00:00
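
The rows above appear to follow from the thresholds in guess_precision: the seconds/milliseconds threshold is 86400 * 365 * years, which gives the lower bound when read as milliseconds and the upper bound when read as seconds. The sketch below is one way to check this. The helper names civil_from_days and bound_dates are chosen for illustration and are not part of the crate, and only the calendar date is computed (the time of day is dropped).

```rust
/// Converts a day count since 1970-01-01 into (year, month, day) in the
/// proleptic Gregorian calendar.
pub fn civil_from_days(days: i64) -> (i64, i64, i64) {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = (if z >= 0 { z } else { z - 146_096 }) / 146_097; // 400-year cycle
    let doe = z - era * 146_097; // day of era, 0..=146_096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // year of era
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, counted from 1 March
    let mp = (5 * doy + 2) / 153; // month index, 0 = March
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    // January and February belong to the following civil year.
    let year = yoe + era * 400 + i64::from(month <= 2);
    (year, month, day)
}

/// Returns the (lower, upper) bound dates for a given bound in years.
pub fn bound_dates(years: i64) -> ((i64, i64, i64), (i64, i64, i64)) {
    let threshold = 86_400 * 365 * years; // seconds/milliseconds threshold
    let lower_days = threshold / 1000 / 86_400; // threshold read as milliseconds
    let upper_days = threshold / 86_400; // threshold read as seconds
    (civil_from_days(lower_days), civil_from_days(upper_days))
}
```

For example, bound_dates(1000) returns ((1971, 1, 1), (2969, 5, 3)), matching the row for 1000.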

The default is ARROW_CAST_GUESSING_BOUND_YEARS=1000. Because 1000 milliseconds make one second, this puts the lower bound at 1971-01-01T00:00:00, exactly one year after Unix timestamp zero, and the upper bound is more than enough (even 100 years would suffice).
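
How the crate reads the variable is not shown here. As a sketch of one possible mechanism, Rust's option_env! macro exposes a build-time environment variable as an Option<&'static str>, which a const fn can then parse. The names DEFAULT_BOUND_YEARS, parse_years and years_or_default are hypothetical.

```rust
// Assumed default, taken from the text above.
const DEFAULT_BOUND_YEARS: i64 = 1000;

/// Parses a non-empty string of ASCII digits; usable at compile time.
pub const fn parse_years(s: &str) -> i64 {
    let bytes = s.as_bytes();
    assert!(!bytes.is_empty(), "bound must not be empty");
    let mut value: i64 = 0;
    let mut i = 0;
    while i < bytes.len() {
        assert!(bytes[i].is_ascii_digit(), "bound must contain only digits");
        value = value * 10 + (bytes[i] - b'0') as i64;
        i += 1;
    }
    value
}

/// Falls back to the default when the variable was not set at build time.
pub const fn years_or_default(raw: Option<&str>) -> i64 {
    match raw {
        Some(s) => parse_years(s),
        None => DEFAULT_BOUND_YEARS,
    }
}

/// Resolved once, when the crate is compiled.
pub const GUESSING_BOUND_YEARS: i64 =
    years_or_default(option_env!("ARROW_CAST_GUESSING_BOUND_YEARS"));
```

Under this sketch, building with ARROW_CAST_GUESSING_BOUND_YEARS=100 in the environment makes GUESSING_BOUND_YEARS equal to 100, and an unset variable falls back to 1000. A value that is not a plain integer fails the build instead of being silently ignored.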

Like arrow::compute::cast, this crate also supports casting with specific options; check out CastOptions.

License: MIT
