1brc-objectpascal's People

Contributors

benibela, bytebitespas, corneliusdavid, dtpfl, eagleaglow, gcarreno, georges-hatem, hg747, ikelaiah, laksen, lawson89, mobius1qwe, moonentity, ottocoddo, paweld, synopse

1brc-objectpascal's Issues

Clarify average rounding

result values per station in the format <min>/<mean>/<max>, rounded to one fractional digit

ROUNDED = truncated, I guess?
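
To make the question concrete, here is a minimal Python sketch (Python only for brevity; the entries themselves are Object Pascal) of three candidate readings of "rounded to one fractional digit". It assumes the running sum is kept as an integer number of tenths of a degree. All function names are hypothetical, and which rule the reference output actually follows is exactly what this issue asks.

```python
def mean_truncated(sum_tenths: int, count: int) -> int:
    """Mean in tenths with the fractional part discarded (toward zero)."""
    q = abs(sum_tenths) // count
    return -q if sum_tenths < 0 else q

def mean_half_up(sum_tenths: int, count: int) -> int:
    """Mean in tenths, rounded to nearest, ties going toward +infinity."""
    return (2 * sum_tenths + count) // (2 * count)

def mean_ceiling(sum_tenths: int, count: int) -> int:
    """Mean in tenths, always rounded toward +infinity."""
    return -((-sum_tenths) // count)

def format_tenths(tenths: int) -> str:
    """Render integer tenths as text with one fractional digit."""
    sign = "-" if tenths < 0 else ""
    return f"{sign}{abs(tenths) // 10}.{abs(tenths) % 10}"

# Ten readings whose sum is -27 tenths: the exact mean is -0.27 degrees.
candidates = {
    "truncated": format_tenths(mean_truncated(-27, 10)),
    "half_up": format_tenths(mean_half_up(-27, 10)),
    "ceiling": format_tenths(mean_ceiling(-27, 10)),
}
```

The three rules do not agree on this input, so two entries that differ only in the rounding rule would produce different output text and therefore different hashes.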

Using "single" for the total value is unsafe

In the implementation, using a "single" type to add up a lot of values is known to be inaccurate.
With millions of values, adding a new number may simply give a wrong total, because the running sum exhausts the precision of the 32-bit single type (a 24-bit mantissa).
This is why there are algorithms like https://en.wikipedia.org/wiki/Kahan_summation_algorithm

The implementation would better use Int64 values and work in fixed point, i.e. store each temperature multiplied by 10 as a 64-bit integer.
And it will be faster.

BTW for a few thousand numbers, single is not faster than double, because CPU L1 cache misses will probably be the bottleneck.
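
A small sketch of the precision problem and of the fixed-point alternative, in Python for brevity. It emulates a 32-bit single by round-tripping through `struct`; the helper names `to_single` and `parse_tenths` are hypothetical, and `parse_tenths` assumes every reading has exactly one fractional digit.

```python
import struct

def to_single(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def parse_tenths(text: str) -> int:
    """Parse a reading such as '-12.3' into integer tenths (-123)."""
    negative = text.startswith("-")
    whole, _, frac = text.lstrip("-").partition(".")
    value = int(whole) * 10 + int(frac or "0")
    return -value if negative else value

# Once a single-precision total reaches 2**24, adding 1.0 no longer changes it.
big_total = to_single(16777216.0)
after_add = to_single(big_total + 1.0)
single_is_stuck = after_add == big_total

# The same quantities as integer tenths stay exact.
int_total = 167772160 + 10

# Summing readings as integer tenths involves no rounding at all.
total_tenths = sum(parse_tenths(r) for r in ["12.3", "-0.5", "7.0"])
```

With integer tenths, the only rounding left is the single division that produces the mean at the end.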

Generator seems to fail with 400 stations option

Describe the bug
Generator seems to fail with 400 stations option

To Reproduce
Steps to reproduce the behavior:
./generator -i ../data/weather_stations.csv -o data400.csv -n 1000000000 -4

Expected behavior
Should generate the file.

Screenshots

$ ./generator -i ../data/weather_stations.csv -o data400.csv -n 1000000000 -4
ERROR: Option at position 7 needs an argument : 4

Additional context
Same issue if --400stations is used.
No problem without the -4 option: the csv file is correctly generated.
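
The error message reads as if the switch were declared as an option that takes a value. That is only a guess, since the generator's option parsing is not shown here. The sketch below reproduces the same class of failure with Python's standard `getopt` module, purely as an illustration: the option strings are assumptions and not the generator's real definitions.

```python
import getopt

ARGS = ["-i", "../data/weather_stations.csv", "-o", "data400.csv",
        "-n", "1000000000", "-4"]

def parse(args, short_opts, long_opts):
    """Return the parsed (option, value) pairs, or None if parsing fails."""
    try:
        opts, _rest = getopt.getopt(args, short_opts, long_opts)
        return opts
    except getopt.GetoptError:
        return None

# "-4" declared as taking a value (trailing ":" and "="): parsing fails,
# because nothing follows "-4" on the command line.
broken = parse(ARGS, "i:o:n:4:", ["400stations="])

# "-4" declared as a plain switch: parsing succeeds.
fixed = parse(ARGS, "i:o:n:4", ["400stations"])
```

If the cause is similar, declaring the 400-stations option as a switch without an argument would fix both the short and the long form.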

Full Station Name Hash Requirement?

We need to discuss the "full station name hash" requirement.

In (most of) my entries I use the "perfect hash" trick, i.e. I only compare the 32 bits of the hash to identify a given station name. With a good enough hash function (e.g. crc32c), it works perfectly fine with our current dataset of 10K stations, and gives the correct output. BUT someone could add a line to the dataset with a forged name that triggers a hash collision. Then the results would be inaccurate...

In the original 1BRC challenge, this trick was disallowed, and they rejected any solution not explicitly comparing the station names char by char.
gunnarmorling/1brc#495 (reply in thread)

So in my entry, I made this process flow available, and we can compare plain ./abouchez and ./abouchez -f, the latter making a full name comparison, but being slower (1.96s vs 1.10s on my Intel PC).

To be fair to the original challenge, I would recommend requiring a full station name comparison.
It makes the timings slower, but is IMHO closer to what we expect from real-world work.
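
A minimal sketch of the difference between the two modes, in Python for brevity and not taken from any entry. It uses `zlib.crc32` as a stand-in for crc32c, plus a deliberately weak byte-sum hash (hypothetical, only there to make a collision easy to produce). The `aggregate` function and its parameters are assumptions for illustration.

```python
import zlib

def weak_hash(name: bytes) -> int:
    """Deliberately poor hash (byte sum) so that a collision is easy to show."""
    return sum(name) & 0xFFFFFFFF

def aggregate(rows, hash_fn, compare_names):
    """Count readings per station in a table keyed by the 32-bit hash.

    Each slot holds [name, count] entries. With compare_names=False the
    first entry of a slot is reused whenever the hash matches, which is
    the hash-only shortcut; with compare_names=True the names must match.
    """
    table = {}
    for name, _temperature in rows:
        slot = table.setdefault(hash_fn(name), [])
        for entry in slot:
            if not compare_names or entry[0] == name:
                entry[1] += 1
                break
        else:
            slot.append([name, 1])
    return {entry[0]: entry[1] for slot in table.values() for entry in slot}

# b"ab" and b"ba" collide under weak_hash, playing the role of a forged name.
rows = [(b"ab", 1.0), (b"ba", 2.0), (b"ab", 3.0)]

hash_only = aggregate(rows, weak_hash, compare_names=False)
full_compare = aggregate(rows, weak_hash, compare_names=True)
crc_hash_only = aggregate(rows, zlib.crc32, compare_names=False)
```

With the colliding hash, the hash-only table silently merges the two stations, while the full comparison keeps them apart. With a good hash and no collision in the dataset, both modes agree, which is why the shortcut passes on the current 10K stations.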

ReadMe should have partial example of the accepted result

There seems to be an accepted result, with a published SHA256 value. It would be helpful to get sample values for the text that is provided to the hash routine. For example, my last run (to a file) resulted in the wrong hash, with the beginning of the output like:

{‘Abasān al Kabīrah=-18.2/-59.6/22.8, ‘Adrā=62.2/30.7/93.6, ‘Afrīn=28.7/0.7/56.6, ‘Ajab Shīr=-29.6/-70.7/11.3, ‘Ajlūn=33.4/4.3/62.3, ‘Ajmān=22.7/-9.2/54.5, ‘Akko=38.8/-3.6/80.6, ‘Alavīcheh=-61.7/-79.6/-43.8, ‘Alem T’ēna=6.2/-19.6/31.7, ‘Ālī Shahr=46.6/16.1/77.1, ‘Alīābād-e Katūl=-56.9/-84.1/-29.8, ‘Amrān=-15.4/-45.6/14.7, ‘Āmūdā=-58.1/-85.2/-30.9, ‘Anadān=11.4/-20.2/43.0, ‘Anbarābād=-39.1/-66.3/-12.2, ‘Aqrah=-35.9/-64.2/-7.8, ‘Ayn al ‘Arab=-15.3/-39.8/9.1, ‘Aynkāwah=38.6/5.4/72.2, ‘Ibrī=-50.3/-75.7/-24.9, ‘Izbat al Burj=23.6/-17.5/64.5, ‘Unayzah=56.2/14.0/98.5, ‘Utaybah=72.2/53.9/90.4, ’Aïn Abessa=0.8/-27.4/29.2, ’Aïn Abid=17.9/-3.1/38.9, ’Aïn Arnat=-25.2/-66.0/15.9, ’Aïn Azel=70.1/45.0/95.0, ’Aïn el Hammam=-44.0/-81.4/-7.4, ’Aïn Leuh=-71.3/-91.4/-51.6, ’Aïn Roua=-31.2/-54.3/-8.3, ’Ali Ben Sliman=66.1/40.4/91.9, ’Ayn Bni Mathar=-55.9/-86.9/-25.2,

I suspect I have not correctly implemented rounding, but I may be missing a BOM at the beginning, or an EOL at the end. If possible, provide an excerpt from the beginning of the correct result in the ReadMe file. Thank You!
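
For what it is worth, any of the three suspects changes the hash on its own. The sketch below (Python for brevity) hashes one made-up output string in four byte-level variants. The sample text and station values are hypothetical and not the accepted result.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of the exact bytes given."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical sample output, not the accepted result.
text = "{Abha=5.0/18.0/27.4, Abidjan=15.7/26.0/34.1}"
body = text.encode("utf-8")

digests = {
    "plain": sha256_hex(body),
    "with_bom": sha256_hex(b"\xef\xbb\xbf" + body),
    "with_lf": sha256_hex(body + b"\n"),
    "with_crlf": sha256_hex(body + b"\r\n"),
}
```

All four digests differ, so a published excerpt alone would not settle BOM or EOL questions; stating the exact encoding and line ending next to the SHA256 value would.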

Get rid of cross-OS and cross-IDE requirement

You can only use pure Object Pascal with no calls to any operating system's API

This requirement did not exist in the original Java challenge, and is pointless IMHO.

I would like to focus on FPC Linux x86_64.
Or at least be able to use mORMot 2 as a cross-platform and cross-compiler layer.
