thethirdone / jsoftfloat
An implementation of the IEEE 754-2008 standard
License: MIT License
Division, square root, and logarithm are more difficult than addition and multiplication because their exact results can have infinitely many digits. Long division gives a simple way to find the quotient to `x` digits, though. Square root can also be found exactly to `x` digits by using the bisection method (which works particularly well for binary floats). But for logarithm it is slightly harder to get the precise answer to `x` digits because there isn't an exact exponential function to check against either. This means that in order to get the correct rounding, more sophisticated error bounding is needed.
Typically this is done with Newton-Raphson iteration and precomputed tables, iterating until the error bound fits between floating-point values. I would like to keep the code no more complex than what I can understand directly from reading it, though. One of the goals of this project is to have a readable, correct floating-point implementation; using more complex algorithms runs counter to that goal.
I'm not sure how to proceed. I will probably just work on other things until I get an idea (it worked for sqrt).
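The digit-by-digit approach described above for division and square root can be sketched as follows. This is a minimal illustration on positive `BigInteger` values, not jsoftfloat's actual code: the class and method names (`DigitByDigitSketch`, `longDivide`, `sqrtTruncated`) are hypothetical, and results are returned as integers scaled by `2^fracBits`.

```java
import java.math.BigInteger;

// Hypothetical sketch; names are illustrative and not part of jsoftfloat.
final class DigitByDigitSketch {

    /**
     * Binary long division of positive a by positive b.
     * Returns {quotient, remainder}, where quotient = floor(a * 2^fracBits / b).
     * A nonzero remainder means the truncated quotient is inexact,
     * which is the information a rounding step would need.
     */
    static BigInteger[] longDivide(BigInteger a, BigInteger b, int fracBits) {
        BigInteger quotient = a.divide(b); // integer part
        BigInteger remainder = a.mod(b);
        for (int i = 0; i < fracBits; i++) {
            // Bring down one more binary digit.
            remainder = remainder.shiftLeft(1);
            quotient = quotient.shiftLeft(1);
            if (remainder.compareTo(b) >= 0) {
                remainder = remainder.subtract(b);
                quotient = quotient.setBit(0);
            }
        }
        return new BigInteger[] {quotient, remainder};
    }

    /**
     * Bisection for floor(sqrt(n) * 2^fracBits), n >= 0.
     * Squaring is exact, so every comparison in the search is exact.
     */
    static BigInteger sqrtTruncated(BigInteger n, int fracBits) {
        BigInteger target = n.shiftLeft(2 * fracBits);
        BigInteger lo = BigInteger.ZERO;
        // hi is chosen so that hi^2 > target.
        BigInteger hi = BigInteger.ONE.shiftLeft(target.bitLength() / 2 + 1);
        // Invariant: lo^2 <= target < hi^2.
        while (hi.subtract(lo).compareTo(BigInteger.ONE) > 0) {
            BigInteger mid = lo.add(hi).shiftRight(1);
            if (mid.multiply(mid).compareTo(target) <= 0) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return lo;
    }
}
```

For example, `longDivide(1, 3, 4)` yields quotient 5 (binary 0.0101) with remainder 1, and `sqrtTruncated(2, 4)` yields 22, i.e. 22/16 = 1.375. Logarithm has no comparable exact check, which is why this approach does not carry over directly.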
I'm trying to install the RARS assembler on an offline machine. I get `jsoftfloat package does not exist` when I run build-jar.sh. So I extracted the jsoftfloat source to my machine, but I still get this error.
`small` should round to 0 or `Float32(1)` depending on the rounding mode, but it always goes to 0.
ExactFloat small = new Float32(1).toExactFloat();
small = small.divide(new ExactFloat(BigInteger.valueOf(4)),4);
I also believe, when divided by 2, this doesn't trigger underflow.
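To make the expected behaviour concrete, here is a small sketch that uses only `java.math` rather than jsoftfloat's own API. It assumes `Float32(1)` denotes the bit pattern 1, the smallest positive subnormal (2^-149). For results in the subnormal range the bit pattern is simply the value measured in units of 2^-149, so rounding 1/4 of a unit to an integer shows what each rounding mode should produce. The class and method names are hypothetical, and the mapping from `RoundingMode` to IEEE 754 rounding directions is given in the comments.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical sketch; uses java.math only, not jsoftfloat's API.
final class SubnormalRoundingSketch {

    /**
     * Rounds a non-negative value expressed in units of the smallest
     * subnormal (2^-149) to an integer number of units. For results that
     * stay in the subnormal range, that integer is the Float32 bit pattern.
     */
    static int roundToSubnormalBits(BigDecimal unitsOfMinSubnormal, RoundingMode mode) {
        return unitsOfMinSubnormal.setScale(0, mode).intValueExact();
    }

    public static void main(String[] args) {
        // Smallest subnormal divided by 4 is 0.25 units.
        BigDecimal quarter = BigDecimal.ONE.divide(BigDecimal.valueOf(4));
        // CEILING ~ roundTowardPositive, FLOOR ~ roundTowardNegative,
        // DOWN ~ roundTowardZero, HALF_EVEN ~ roundTiesToEven.
        for (RoundingMode mode : new RoundingMode[] {
                RoundingMode.CEILING, RoundingMode.FLOOR,
                RoundingMode.DOWN, RoundingMode.HALF_EVEN}) {
            System.out.println(mode + " -> bits " + roundToSubnormalBits(quarter, mode));
        }
    }
}
```

Only the round-toward-positive case gives bit pattern 1; the other three modes give 0, which matches the "0 or `Float32(1)` depending on rounding mode" expectation in the report.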
The simplest case I could create which shows this is `F32(0x43c0a000) + F32(0x0000476e)`, which should equal `F32(0x006B7E4A)` but instead equals `F32(0x00EB7E4A)`.
The key seems to be that the lowest exponent bit is set erroneously. The extra details in both numbers seem to be important, but it's not clear yet how.
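The observation about the lowest exponent bit can be checked by splitting the two result patterns into their binary32 fields. The sketch below uses plain integer operations; the class and method names are hypothetical and not part of jsoftfloat.

```java
// Hypothetical helper for inspecting binary32 bit patterns.
final class Float32FieldsSketch {

    /** The 8-bit biased exponent field (bits 23..30). */
    static int exponentField(int bits) {
        return (bits >>> 23) & 0xFF;
    }

    /** The 23-bit trailing significand field (bits 0..22). */
    static int mantissaField(int bits) {
        return bits & 0x7FFFFF;
    }

    public static void main(String[] args) {
        int expected = 0x006B7E4A;
        int actual = 0x00EB7E4A;
        // The XOR isolates the bits in which the two patterns differ.
        System.out.printf("diff     = 0x%08X%n", expected ^ actual);
        System.out.printf("expected: exponent=%d mantissa=0x%06X%n",
                exponentField(expected), mantissaField(expected));
        System.out.printf("actual:   exponent=%d mantissa=0x%06X%n",
                exponentField(actual), mantissaField(actual));
    }
}
```

The two patterns differ only in bit 23 (`0x00800000`): the expected value has exponent field 0 (subnormal), the reported value has exponent field 1, and the significand fields are identical.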
There are several issues related to addition near the subnormal boundary. The answer currently given is clearly wrong, so there has to be an off-by-one or something similar in either the exact conversion (it would have to be a weird issue, as conversion to exact and back has been completely tested) or the addition code (not as well tested).
For example, `F32(0x00F00000) + F32(0x00100000)` should equal `F32(0x01000000)`, but instead equals `F32(0x00F80000)`.
This shouldn't be a super hard bug to track down, but I'm not going to be able to fix it immediately.
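One way to confirm the expected result for the example above is to compare against Java's built-in `float` arithmetic, which follows IEEE 754 binary32 with round-to-nearest-even and supports subnormals. The sketch below is only a reference check for that one rounding mode, not jsoftfloat code; the class and method names are hypothetical.

```java
// Hypothetical reference check using Java's own float arithmetic.
final class HardwareReferenceSketch {

    /** Adds two binary32 values given as bit patterns and returns the result's bits. */
    static int addBits(int aBits, int bBits) {
        float sum = Float.intBitsToFloat(aBits) + Float.intBitsToFloat(bBits);
        return Float.floatToRawIntBits(sum);
    }

    public static void main(String[] args) {
        // 1.875 * 2^-126 (normal) + 0.125 * 2^-126 (subnormal) = 2^-125 exactly.
        System.out.printf("0x%08X%n", addBits(0x00F00000, 0x00100000)); // prints 0x01000000
    }
}
```

Because the sum is exactly representable here, the result does not depend on the rounding mode, so any correct implementation should return `0x01000000`.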