Giter VIP home page Giter VIP logo

dragonbox's People

Contributors

alexhuszagh avatar friendlyanon avatar jk-jeon avatar johnmcfarlane avatar jwakely avatar krnowak avatar leni536 avatar rodericktaylor avatar seanmiddleditch avatar striezel avatar tobybell avatar yotann avatar yumeyao avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

dragonbox's Issues

How to get a fixed string value format

For example,

constexpr int buffer_length = 1 + // for '\0'
                  jkj::dragonbox::max_output_string_length<jkj::dragonbox::ieee754_binary64>;
char buffer[buffer_length];
double d{980528.1478392933};
char* end_ptr = jkj::dragonbox::to_chars(price, buffer);
std::cout << buffer << std::endl;

I got "9.805281478392933E5", but I want "980528.1478392933";

x86(Win32) on MSVC 2019 reporting roundtrip errors

I used https://github.com/jk-jeon/dtoa-benchmark to build a win32 exe straight forward from the vs 2019 sln file, and find out dragonbox is corrupted on win32 at the moment. I also tried overwriting dragonbox.h in benchmark project with the latest file from this repository, still no help.

To reproduce:

  • clone dtoa-benchmark, and open it in vs 2019
  • fix win32 include path (a pull request is there in the benchmark repo)
  • comment out milo_only_grisu2.cpp REGISTER_TEST(...) line (it's very slow or maybe dead-loop?)
  • build & run

Uninitialized result structure can return incorrect value in may_have_trailing_zeros

It seems that the result structure is never initialized, which means that the may_have_trailing_zeros can return whatever was on the stack if none of on_trailing_zeros or no_trailing_zeros functions are called.

I have this issue in a test program I made to compare my C port of the reference code. It happens with the following values, which returns true (in fact it's not 'true' but '5') for may_have_trailing_zeros in my test code (which means that you might not get the same result with a simple call with those values). That value comes from the compute_nearest_normal function on line 1799 where ret_value is declared but not initialized. The field is never written to after that.

using namespace jkj::dragonbox;

uint32_t i = 1283457033;
float f = *( float* )( &i );
    
auto v = to_decimal( f, policy::decimal_to_binary_rounding::nearest_to_even, policy::binary_to_decimal_rounding::do_not_care, policy::trailing_zero::report, policy::sign::return_sign, policy::cache::full );

Typos in the paper

It seems that there should be no "= \varnothing" at the beginning of the second column on page 10.
typo1
And no "-s" in "ingredients", likewise no "-o" in "computationo".
typo2

As we are already talking about the article, how is this fact at the very end of the first column on page 9 being used later in the proof?
image

The output is of the shortest length is a question mark.

DragonDecimal.zip

I generated decimal normal number by concatenate to integers together to this effect:

dbl = operand_fractpart / std::pow(10, n); // operand_fractpart is an integer
dbl += operand_intpart; // operand_intpart is also an integer

Then passes a double to DragonDecimal.cpp as attached. I passed 10000 doubles and toStringRaw() output is also attached.
fp_by_dragonbox_decimal_raw8.txt

My code makes sure operand_fractpart is only 8 digits. However we can see -7.018380128700001E2. By right it should be -7.0183801287E2?

However if I tried to verify whether -701.83801287 is a normal number by:

double d = -701.83801287;
memset(buffer, 0x0, sizeof(buffer));
jkj::dragonbox::to_chars(d, buffer);
std::cout << buffer << std::endl;

Then it's ok. I feel a literal double may be different from a dynamic memory double?

Every my 10000 double, there are always a couple of non-shortest values.

The multipllier $10^{\kappa}$ does not need to be a power of 10

The only place where it is a power of 10 is used is that powers of 10 are stored in the precomputed cache. By allowing more general integers other than $10^{\kappa}$, we may further reduce the chance of having to investigate the fractional part, thus boosting performance. It may also opens up a possibility of eliminating the needs for treating the shorter interval case specially.

MSVC: warning C4702: unreachable code

The following error arises when compiling with MSVC (17.5.0 Preview 2.0) with /W4:

...\dragonbox-src\include\dragonbox\dragonbox.h(662): warning C4702: unreachable code

The line in for the warning is:

constexpr auto divisor = compute_power<N>(UInt(10));

However, that isn't the real problem. The warning is actually triggered by the whole second if constexpr block at:

if constexpr (std::is_same_v<UInt, std::uint64_t> && N == 3 &&

Changing that second if constexpr to else if constexpr resolves the warning.

Optimize decimal_length_minus_1

As discussed in dtolnay/dragonbox#1, the algorithm for calculating the decimal length is rather inefficient, since the number of digits can be computed with minimal branching using a shr, add, and a table lookup instruction as described here.

The current implementation is:

JKJ_FORCEINLINE static constexpr std::uint32_t decimal_length_minus_1(std::uint32_t const v) {
    assert(v < 1000000000);
    if (v >= 100000000) { return 8; }
    if (v >= 10000000) { return 7; }
    if (v >= 1000000) { return 6; }
    if (v >= 100000) { return 5; }
    if (v >= 10000) { return 4; }
    if (v >= 1000) { return 3; }
    if (v >= 100) { return 2; }
    if (v >= 10) { return 1; }
    return 0;
}

Which optimizes quite poorly (for example, on x86):

decimal_length_minus_1(unsigned int):
        mov     eax, 8
        cmp     edi, 99999999
        ja      .L1
        mov     eax, 7
        cmp     edi, 9999999
        ja      .L1
        mov     eax, 6
        cmp     edi, 999999
        ja      .L1
        mov     eax, 5
        cmp     edi, 99999
        ja      .L1
        mov     eax, 4
        cmp     edi, 9999
        ja      .L1
        mov     eax, 3
        cmp     edi, 999
        ja      .L1
        mov     eax, 2
        cmp     edi, 99
        ja      .L1
        xor     eax, eax
        cmp     edi, 9
        seta    al
.L1:
        ret

Using a built-in ctlz function, we can make this much more efficient:

#ifdef _MSC_VER
#include <intrin.h>
#endif

JKJ_FORCEINLINE static std::uint32_t ctlz(std::uint32_t v) {
    // Optimize for GCC/Clang when possible.
#ifdef __has_builtin
  #if __has_builtin(__builtin_clz)
    return __builtin_clz(v);
  #endif
#endif

    // Optimize for MSVC
#ifdef _MSC_VER
  #if defined(_M_X64) || defined(_M_ARM64)
    unsigned long leading_zero = 0;
    _BitScanReverse(&leading_zero, v);
    return static_cast<std::uint32_t>(31 - leading_zero);
  #endif
#endif

    // Fallback algorithm
    int count = 0;
    if (v & std::uint32_t(0xffff0000)) {
        v >>= 16;
        count += 16;
    }
    if (v & std::uint32_t(0xff00)) {
        v >>= 8;
        count += 8;
    }
    if (v & std::uint32_t(0xf0)) {
        v >>= 4;
        count += 4;
    }
    if (v & std::uint32_t(0xc)) {
        v >>= 2;
        count += 2;
    }
    if (v & std::uint32_t(0x2)) {
        v >>= 1;
        count += 1;
    }
    return 31 - count;
}

JKJ_FORCEINLINE static std::uint32_t fast_log2(std::uint32_t const v) {
    return 31 - ctlz(v | 1);
}

JKJ_FORCEINLINE static std::uint32_t decimal_length_minus_1(std::uint32_t const v) {
    static constexpr std::uint32_t table[] = {9, 99, 999, 9999, 99999, 999999, 9999999, 99999999, 999999999};
    std::uint32_t y = (9 * fast_log2(v)) >> 5;
    y += v > table[y];
    return y;
}

Most of the bloat is just to ensure we have a portable way to count leading zeros. And this optimizes quite nicely with the built-in optimizations:

decimal_length_minus_1(unsigned int):
        mov     eax, edi
        mov     edx, 31
        or      eax, 1
        bsr     eax, eax
        xor     eax, 31
        sub     edx, eax
        lea     eax, [rdx+rdx*8]
        shr     eax, 5
        mov     edx, eax
        cmp     DWORD PTR decimal_length_minus_1(unsigned int)::table[0+rdx*4], edi
        adc     eax, 0
        ret

For MSVC on x86_64, we have:

unsigned int decimal_length_minus_1(unsigned int) PROC                    ; decimal_length_minus_1, COMDAT
        xor     eax, eax
        mov     edx, ecx
        or      edx, 1
        bsr     r8d, edx
        lea     r9d, DWORD PTR [r8+r8*8]
        shr     r9d, 5
        lea     r8, OFFSET FLAT:unsigned int const * const `unsigned int decimal_length_minus_1(unsigned int)'::`2'::table
        cmp     ecx, DWORD PTR [r8+r9*4]
        seta    al
        add     eax, r9d
        ret     0
unsigned int decimal_length_minus_1(unsigned int) ENDP  

And for MSVC on ARM64, we have:

|unsigned int decimal_length_minus_1(unsigned int)| PROC              ; decimal_length_minus_1
        orr         w8,w0,#1
        clz         w9,w8
        mov         w10,#0x1F
        sub         w10,w10,w9
        add         w8,w10,w10,lsl #3
        adrp        x9,|unsigned int const * const `unsigned int decimal_length_minus_1(unsigned int)'::`2'::table|
        add         x9,x9,|unsigned int const * const `unsigned int decimal_length_minus_1(unsigned int)'::`2'::table|
        lsr         w11,w8,#5
        ldr         w8,[x9,w11 uxtw #2]
        cmp         w0,w8
        csethi      w10
        add         w0,w10,w11
        ret

        ENDP  ; |unsigned int decimal_length_minus_1(unsigned int)|, decimal_length_minus_1

Since the fallback ctlz algorithm isn't very good, we could do a simple optimization for MSVC and GCC/Clang, and for another other compiler keep the old algorithm.

dragonbox_to_chars.h should have a `static const int` with minimum buffer size guaranteed to be enough

essentially, the same as DtoaMinBufferLength from https://github.com/abolz/Drachennest/blob/master/src/dragonbox.h (although the value given there seems a bit large but i haven't looked into it)

this would be much more reliable and usable than relying on what the README says:

char buffer[31];   // Should be long enough

that way, users can write:

char buffer[dragonboxBufferLen];

and it'll guarantee no buffer overflow, etc

Simpler "reference implementation"

The current implementation has grown too much to a point where it is nearly impossible to understand for people with not so much exposure to template black magics and other C++-specific nonsenses. The complication of the implementation can act as a significant barrier for people wanting to port the algorithm into languages other than C++, or even into C++, even though the algorithm itself is not that overcomplicated.

Those templates and other nonsenses are done for reasons, to make this library general and fast enough, but at the same time since this is the reference implementation, it should serve the role as such, i.e., to provide guidance to other people wanting to write their own implementations. Hence, I think it makes sense to provide the core algorithm written in simpler, non-esoteric C++ (or maybe even C99?) that everybody can understand.

To be clear, I do not intend to replace the current implementation with a simpler one, which I consider impossible without losing many utilities and performance. As pointed out earlier, those template hackeries are done for reasons. Instead, what I want to have is a standalone implementation of the algorithm that only serves the expository role, which does not need to be optimized-to-hell, rather just enough to demonstrate the algorithm's potential. There already is an implementation which can serve such a role, done by Alexander Bolz, but (1) there is no implementation for IEEE-754 binary32, (2) his implementation is quite outdated at this moment as the algorithm has been improved quite a lot since then, and (3) I think it's nicer to have one in this official repository for the reference implementation, kept in sync with whatever further improvements we bring in the future.

I may work on this someday if no one helps me, but PR is very welcome if anyone is interested!

Paper/implementation mismatch?

In practice, computing the parity of x(i) is still a heavy
operation compared to others. Hence, it can be beneficial to
delay it until it is absolutely necessary. In Algorithm 5.2, we
need to compute the parity of x(i) when r and δ(i) turn out
to be the same. However, in fact if either the interval I does
not include the left endpoint or x is not an integer, we always
conclude I ∩ 10^(−k0+1)Z is empty regardless of the parity of
x(i). Therefore, if the given rounding mode indicates that the
left endpoint is not in I, we can skip computing the parity of
x(i).

(section 5.1.6, page 21)

If I understand this correctly, the condition of the left endpoint being excluded or x not being an integer is enough to say that the intersection in question is empty, and the parity doesn't need to be checked.

However, the algorithm implementation seems to disagree:

                    if (!interval_type.include_left_endpoint() ||
                        exponent < case_fc_pm_half_lower_threshold ||
                        exponent > divisibility_check_by_5_threshold) {
                        // If the left endpoint is not included, the condition for
                        // success is z^(f) < delta^(f) (odd parity).
                        // Otherwise, the inequalities on exponent ensure that
                        // x is not an integer, so if z^(f) >= delta^(f) (even parity), we in fact
                        // have strict inequality.
                        if (!compute_mul_parity(two_fl, cache, beta).parity) {
                            goto small_divisor_case_label;
                        }
                    }
                    else {
                        auto const [xi_parity, x_is_integer] =
                            compute_mul_parity(two_fl, cache, beta);
                        if (!xi_parity && !x_is_integer) {
                            goto small_divisor_case_label;
                        }
                    }

In fact, the comments explicitly state that checking parity is required. From quick testing it seems that the code is right here.

Change the way policy parameters are categorized

Currently a policy parameter belongs to a policy kind if its type is deriving from the specified base type.
At this point it seems checking for the existence of a specific member type sounds like a better idea in general.

Improve paper

I recently implemented the main algorithm in Factor with the paper as my main source.

My two main hurdles were these:

  • There is a typo Algorithm 5.6, steps 6–8: where there is 10^(k_0), it should be 10^(-k_0). I could only find this by looking at the reference implementation.
  • In Algorithm 5.4, step 5, or in the text, very little is said about how to break the tie. The code is a little obfuscated, the significand is 10s̃ + t, not f_c, as I first thought. Also helpful would be the information that you choose y^(rd) if the significand is odd, for the nearest-to-even case.

Another, smaller grievance is in section 5.2.3:

  • The information that d_1, d_2 are always zero is hidden away very well.
  • The upper bound for e in the integer check of x^(i) and z^(i) is a constant. It evaluates to 3 in the case of binary32 and binary64.

P.S.: In section 5.1.7 for completeness, you could mention that you compute r as r = z^(i) - 10^(κ+1)*s. This might seem like baby steps, but I had to extract the obvious from another implementation.

pkg-config file

Hello,

Would it be possible to offer a .pc (pkg-config) file that'd be installed with the build system, for use with non-CMake projects?

Thank you!

Do you have an algorithm to deal with ratio and percentage for integers?

I find situations like percentages very common, but using floating pointers to print them out would be very costly.

Do you have an algorithm to deal with that?

One without precision:

For example, giving input (3,10)
Output:
30%
or
0.3
For example, giving input (3,9)
Output:
33.(3)%
or
0.(3)
For example, giving input (2,14)
Output:
14.(285714)%
or
0.(142857)

Also, with precision version (which is very common in statistics, like keeping two digits decimals):

For example, giving input (2,14), precision 2
Output:
14.26%
or
0.14

C++11?

I need a port of this to C++11

Update the paper reflecting the recent change in integer checks

Perhaps it would be possible to apply the rounding trick that Schubfach uses to eliminate some of those ugly integer checks, which will greatly improve the overall performance.

However, it requires a major overhaul of the entire algorithm so it will take time.

Wrong constant?

In line 832 of dragonbox.h there is a right shift of 7 bits, whereas in the paper it is a right shift of 71 bits (section 5.1.7). It seems that 71 is indeed correct.

Is there a decimal_fp<Float> -> Float conversion?

Unless I am missing something there appears to be no reverse conversion in this library.

I am trying to convert Ryu based code that operates on moral equivalent of decimal_fp (doing more than just printing) to Dragonbox. Lack of reverse conversion is a big stumbling block because I would need to essentially carry around big chunks of Ryu data and code to do it.

As an aside it also looks like to_chars only does exponential notation. Lack of fixed format would be another stumbling block.

Would be great if decimal_fp were exposed as a first class thing with conversion and formatting functions. 😃

enhancement proposal: pimp up the readme, a little, for dummies ...

[ edit ] solved, the below discussion - besides lengthy - helped me
to get the door open, thus if you don't find better solution take
the time and read. [ /edit ]

I followed the readme, downloading and building had meaningful
responses, but I'm stuck to get the door open.
Would like to compile and run the test programs provided in the
readme, but don't know where to put and how to compile.
I'm not an experienced C++ coder, neither can tell the difference
between Cmake and make.

TIA for any help :-)

Consider adding a section on README elaborating on what roundtrip precisely means

Hi @jk-jeon, and thank you for your amazing work on this!
This isn't exactly an issue, but rather a question or a request for clarification of the documentation.
The documentation says the algorithm is round-trip.
To check this my understand of this, I have written a simple loop whose index "i" iterates all uint64_t values and:

  • obtains a double "orig" by bit_cast-ing the index
  • creates decimal "dec" from "orig" via dragonbox;
  • obtains another double called "back" via decimal.significand * pow(10.0, decimal.exponent), and then inverting it if negative;
  • compares the two doubles, and prints an error message if they're different;
  • the error message also includes a bit_cast-ed uint64_t from the "back" double, called "back_i" in the output below.

Here's a sample of its output; formatting is done with an old version of fmtlib (5.3.0):

i=4716101740380815687 orig=1.98801e+07 dec=(1988012400000122, -8, +) back=1.98801e+07 back_i=4716101740380815688
i=4716101740380815688 orig=1.98801e+07 dec=(19880124000001222, -9, +) back=1.98801e+07 back_i=4716101740380815689
i=4716101740380815699 orig=1.98801e+07 dec=(19880124000001263, -9, +) back=1.98801e+07 back_i=4716101740380815700
i=4716101740380815700 orig=1.98801e+07 dec=(19880124000001267, -9, +) back=1.98801e+07 back_i=4716101740380815701
i=4716101740380815703 orig=1.98801e+07 dec=(19880124000001278, -9, +) back=1.98801e+07 back_i=4716101740380815704
i=4716101740380815713 orig=1.98801e+07 dec=(19880124000001315, -9, +) back=1.98801e+07 back_i=4716101740380815714

I haven't verified this programatically but I notice that the back integer representation is one off the original integer from which the double was created. Note that the exact code computing "back" from "dec" is:

double back = static_cast<double>(dec.significand) * pow(10.0, static_cast<double>(decimal.exponent));

My questions are:

  • what am I doing wrong, if anything, in converting from the decimal to a double? I have a feeling that the answer is "everything"...
  • would the rounding policy have anything to do with off-by-one observation above?

Thank you kindly in advance for your time.

constexpr

I recently contributed to https://github.com/fastfloat/fast_float to make the from_chars implementation there constexpr from C++20.

I think making to_chars constexpr is equally possible. Would efforts to do this be welcome here?

I see that to_chars is currently defined in a .cpp file, which could make this challenging without making the library header-only.

JSON compliant to_chars?

I'd like to use dragonbox for serializing floats in Glaze. What I'm particularly looking for is a JSON compliant to_chars implementation that has the shortest string result. It seems like dragonbox's to_chars will print out ...E0 when it is not necessary.

The other requirement for JSON compliance is that infinity and nan should be printed as "null".

Could options to support these features be added to dragonbox or should I fork the code and edit it myself?

I'd also like the to_chars to be header only, but I don't mind doing this conversion myself and keeping it up to date with dragonbox.

Thanks for all the hard work you've put into this great library!

nvcc (Version 201402L) throws "error: namespace "std" has no member "is_same_v"

Hi,
I am trying to test dragonbox with the nvcc compiler (Version 201402L).
But it throws;

  1. dragonbox/dragonbox_to_chars.h(98): error: namespace "std" has no member "is_same_v" std::is_same_v<FloatFormat, ieee754_binary32>
  2. dragonbox/dragonbox_to_chars.h(98): error: type name is not allowed std::is_same_v<FloatFormat, ieee754_binary32>
  3. dragonbox/dragonbox_to_chars.h(98): error: expected a ";" std::is_same_v<FloatFormat, ieee754_binary32>

I only include dragonbox_to_chars.h in my test code (I have to write millions of float numbers to a csv file with high performance. std::to_string and fprintf are far to slow.).

In my understanding and looking into the dragonbox code it should work with this nvcc version.

Where is the knot in the head?

Thank you!

Finish moving the project to CMake

Related: #3 (comment)

  • Describe the purpose of benchmark
  • Make test .cpp sources independent to register each for its own test
  • Remove VS specific files (VS supports CMake)
  • Group various non-code assets in one folder and/or remove from version control (would be nice, root is messy)

Add Workflows to Ensure Portability?

Currently, there seems to be no cross-compilation to ensure support for ARM, as well as OSes like Windows and macOS. It may be wise to add tests for ARM 32-bit and 64-bit, as well as a big-endian target (PowerPC64 could be a good example). Windows and macOS support are native to workflows, so it's a simple matter of adding configurations. There's a few available options to ensure quick cross-compilation for different:

  • Dockcross Supports many different ARM targets, but lacks support for PowerPC64 and requires Docker.
  • zig cc A drop-in replacement for GCC/Clang, but supports extensive cross-compilation, including all our desired targets.

Use Dockcross would be a drop-in replacement, while zig cc would only require a toolchain file for the configurations.

These would make sure the intrinsics, such as those in #11, are supported properly. If desired, I'd gladly submit a PR with the additional workflows. Would any other targets, such as MIPS64, or RISC-V be desired?

Looking for advice - how to convert float to double, keeping to_string(double) same as to_string(float)

Hi,
Can you give some advice, how to convert float (4 bytes) to double (8 bytes).

I am looking to convert float -> double in a way that to_string(double) to produce same string as original to_string(float).

Any advice would be really appreciated.

My current idea is to you dragonbox to convert float to chars, and then use some fast chars->double converter with round-trip guarantee.

Would be very nice if you will tell me that my idea is stupid, and describe a way that is much better.

Thanks in advance!

Better support for ARM and other architectures

Thanks to Tim Kientzle (@tbkka), it turned out that the current code is not working correctly with ARM processors. I believe this is due to errors in fallback implementations of intrinsics. Also the current #define switches are not able to correctly detect availabilities of intrinsics for ARM processors, resulting in suboptimal code generation.

See: swiftlang/swift#35299.

When the mantissa of double is 16 digits, seems to_decimal() not 100% correct

__int64_t intpart = getInteger();
__int64_t fractpart = getInteger();
double dbl = fractpart + 0.000000000000000000000005;
dbl =  dbl / std::pow(10, countDigit(fractpart));  // countDigit return 5 when fractpart is 12345
dbl += intpart;

I want to make sure my double binary representation like 123.12345000000000000000000000005. So when I run jkj::dragonbox::to_decimal(dbl), I want to make sure I get 123.12345 (of course by (significand, exponent, and is_negative)), so I can verify. I used above code to generate 10000 values. However when significand/mantissa is 16 digits, quite a lot of the decimal are not accurate although I tried most policies.

if significand/mantissa < 16, so far my test looks ok.

`to_decimal` regressed since last November

double x = 1.234;
auto v = jkj::dragonbox::to_decimal(x);
std::cout << v.significand << std::endl;

outputs 1234000000000000 instead of 1234. I am not sure if this is intended or not, but it is an example from readme, so either readme should be fixed or something is wrong with the code.

I noticed this when tried to update Dragonbox from 33a9e02 to the current master in the stress of SerenityOS's Ryu implementation.

Fastest way of trimming trailing zeros

Currently we use the method proposed in the paper by Granlund-Montgomery: using modular inverse and std::rotr.

However, there are at least two competitors in this league. Assuming $q$-bit integers, and let $d$ be the divisor, then:

  • Lemire's method: multiply $\left\lceil\frac{2^{q}}{d}\right\rceil$, and compare the lower $q$-bits with $\left\lceil\frac{2^{q}}{d}\right\rceil$. Regardless of the comparison result, the upper $q$-bits is the quotient of the division. Thus it needs $2q$-bits widening multiplication, upper $q$-bits for the quotient and lower $q$-bits for the divisibility test, but there is no shift, and only one constant needs to be loaded into the register.
  • A different generalization of Granlund-Montgomery for non-odd divisors: Applying Theorem 4.2 from our paper, it is possible to generalize Granlund-Montgomery style modular inverse method into non-odd divisors. In this new method, bit-rotation is replaced by regular bit-shift, and the shift needs to be done only when the comparison succeeds so the input is determined to be a multiple of $d$, while Granlund-Montgomery unconditionally does the bit-rotation. One should note that this method does not work for all $q$-bit unsigned integers, but the range of valid inputs is enough for our application.

For future reference for myself, I write the code for the last method here:

int remove_trailing_zeros(std::uint32_t& n) noexcept {
    int s = 0;
    while (true) {
        auto const nm_mod = std::uint32_t(n * UINT32_C(42949673));
        if (nm_mod < UINT32_C(42949673)) {
            s += 2;
            n = std::uint32_t(nm_mod >> 2);
        }
        else {
            break;
        }
    }
    auto const nm_mod = std::uint32_t(n * UINT32_C(1288490189));
    if (nm_mod < UINT32_C(429496731)) {
        s += 1;
        n = std::uint32_t(nm_mod >> 1);
    }
    return s;
}

int remove_trailing_zeros(std::uint64_t& n) noexcept {
    int s = 0;
    while (true) {
        auto const nm_mod = std::uint64_t(n * UINT64_C(14941862699704736809));
        if (nm_mod < UINT64_C(184467440737095517)) {
            s += 2;
            n = std::uint64_t(nm_mod >> 2);
        }
        else {
            break;
        }
    }
    auto const nm_mod = std::uint64_t(n * UINT64_C(5534023222112865485));
    if (nm_mod < UINT64_C(1844674407370955163)) {
        s += 1;
        n = std::uint64_t(nm_mod >> 1);
    }
    return s;
}

According to a computation, the $32$-bit version is valid up to n = 1'073'741'899, and the $64$-bit version is valid up to n = 4'611'686'018'427'387'999. In Dragonbox, we need these routines for n's up to 16'777'215 for the $32$-bit version, and n's up to 9'007'199'254'740'991 for the $64$-bit version, so the input range is more than enough.

More elegant removal of trailing zeros

When spitting out decimal digits via jeaiii-style integer formatting, it is probably possible to do integer checks to figure out that all the remaining digits are zero. Then maybe we can spare yet more multiplications therefore get a better performance.

Invalid template parameters when using toward_minus_infinity and toward_plus_infinity

Trying to compile the following code, I get an error on line 2058, where the second parameter passed to divide_by_pow10 is significand_bits + kappa + 2 instead of carrier_uint or something similar. I also assume the 3rd parameter isn't the correct value.

#include "dragonbox.h"

int main( int argc, char** argv ) {
    
    float x = 1.234f;   
    auto v = jkj::dragonbox::to_decimal( x, jkj::dragonbox::policy::decimal_to_binary_rounding::toward_plus_infinity );
    return 0;
}

Compiling with MSVC: cl -nologo main.cpp -Fe:dragonbox.exe -Zi -Od -std:c++17

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.