Contributors: bbkr

geoip2's Issues

Reduce amount of read syscalls.

The number of file handle syscalls required to perform a simple lookup is overwhelming.
My idea is to introduce a sequential read cache.

For example, read(1) at position 1000 will create this cache:

{
    1000 => Buf( 1024 chars )
}

One byte will be shifted from the Buf and the cache will be remapped to the new position key:

{
    1001 => Buf( 1023 chars)
}

This way all sequences (maps, pointer bytes) will use already fetched data.
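
The idea above can be sketched roughly as follows. This is a minimal illustration in Python, not code from the module: the class name SequentialReadCache, the raw_reads counter and the 1024-byte chunk size are assumptions made for the example.

```python
import io


class SequentialReadCache:
    """Serve small sequential reads from one larger read of the handle.

    Hypothetical helper for illustration; not part of GeoIP2.
    """

    def __init__(self, handle, chunk_size=1024):
        self.handle = handle
        self.chunk_size = chunk_size
        self.start = 0       # file position of the first cached byte
        self.buffer = b""    # bytes fetched but not yet consumed
        self.raw_reads = 0   # how many reads actually hit the handle

    def read(self, position, length):
        end = self.start + len(self.buffer)
        if position < self.start or position + length > end:
            # Cache miss: fetch a whole chunk starting at the requested position.
            self.handle.seek(position)
            self.buffer = self.handle.read(max(self.chunk_size, length))
            self.start = position
            self.raw_reads += 1
        offset = position - self.start
        data = self.buffer[offset:offset + length]
        # Shift the consumed bytes out and remap the cache to the new position.
        self.buffer = self.buffer[offset + length:]
        self.start = position + length
        return data
```

With a 1024-byte chunk, read(1000, 1) leaves 1023 bytes cached under position 1001, so a following read(1001, 4) is served without touching the handle.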

Warnings at zef install GeoIP2

When installing via zef with $ zef install GeoIP2, I got this message:

WARNINGS for /home/kostas/.zef/store/GeoIP2.git/40c1b984344b2925bfa768cab247c230d5d4f0c2/lib/GeoIP2.pm (GeoIP2):
Useless use of LOOP_BLOCK_1 symbol in sink context (line 133)
===> Testing [OK] for GeoIP2:ver<1.0.0>
===> Installing: GeoIP2:ver<1.0.0>
WARNINGS for /home/kostas/apache_root/perl6.pheix.org/git/pheix-pool/home#sources/B14446447BBA55BFD32F41138C6E7B843FF09D07 (GeoIP2):
Useless use of LOOP_BLOCK_1 symbol in sink context (line 133)

Separate metadata from derived values.

The read-metadata method should return decoded metadata. It should be possible to call it at any time and multiple times.

The %metadata attribute, on the other hand, may contain some precalculated derived values such as the IPv4 start.

The split is already in place, but the tests should reflect it.

Cache nodes and pointer values

An experimental branch that caches binary tree nodes and the data referenced by pointers is ready:

https://github.com/bbkr/GeoIP2/tree/node_cache
(no docs or tests yet)

It gives an excellent boost (up to 400%). However, the random replacement retention policy it uses suffers from the low performance of Hash.pick reported in: rakudo/rakudo#2586

So when retention takes place on a large cache, it can have a very negative impact on overall performance.

I am waiting for the Rakudo issue to be addressed, then I'll decide whether this is suitable for merging.
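
For reference, a random replacement policy does not have to pick from the hash itself. The sketch below is an assumption about one possible approach, not what the node_cache branch does: it keeps a dense list of keys next to the hash so that a random victim can be chosen and removed in constant time. All names are hypothetical.

```python
import random


class RandomReplacementCache:
    """Bounded cache that evicts a random entry when full (illustrative only)."""

    def __init__(self, capacity, rng=None):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.values = {}   # key -> cached value
        self.keys = []     # dense list of keys, allows O(1) random choice
        self.index = {}    # key -> its slot in self.keys
        self.rng = rng or random.Random()

    def get(self, key, default=None):
        return self.values.get(key, default)

    def put(self, key, value):
        if key not in self.values:
            if len(self.keys) >= self.capacity:
                self._evict_random()
            self.index[key] = len(self.keys)
            self.keys.append(key)
        self.values[key] = value

    def _evict_random(self):
        slot = self.rng.randrange(len(self.keys))
        victim = self.keys[slot]
        last = self.keys.pop()
        if last != victim:
            # Move the last key into the freed slot to keep the list dense.
            self.keys[slot] = last
            self.index[last] = slot
        del self.index[victim]
        del self.values[victim]
```

The trade-off is extra memory for the key list and the slot index in exchange for eviction cost that does not grow with cache size.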

Benchmark of real traffic baseline.

Various optimization ideas require a better benchmark, for example 1M real, international web traffic IPs, with duplicates, with IPv4 and IPv6 mixed together, etc. This should be resolved against the pro version of the city database because it has a huge number of search nodes.

Optimize types recognition.

The string representation of decoded types is useful for debugging, but it adds 2 additional steps to the decoding process.

Once all the pieces are in place, it can be replaced by closures that point directly at the decoding methods.
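
A rough Python sketch of the difference, under the assumption that the two extra steps are a number-to-name lookup followed by a name-to-method lookup. The function and table names are made up for the example and only a few types are covered.

```python
import struct


def decode_uint(payload):
    return int.from_bytes(payload, "big")


def decode_utf8_string(payload):
    return payload.decode("utf-8")


def decode_double(payload):
    return struct.unpack(">d", payload)[0]


# Debug-friendly path: type number -> name -> decoder (two lookups).
TYPE_NAMES = {2: "utf8_string", 3: "double", 5: "uint16", 6: "uint32"}
DECODERS_BY_NAME = {
    "utf8_string": decode_utf8_string,
    "double": decode_double,
    "uint16": decode_uint,
    "uint32": decode_uint,
}


def decode_via_name(type_number, payload):
    name = TYPE_NAMES[type_number]
    return DECODERS_BY_NAME[name](payload)


# Direct path: type number -> decoder (one lookup, no intermediate string).
DECODERS = {number: DECODERS_BY_NAME[name] for number, name in TYPE_NAMES.items()}


def decode_direct(type_number, payload):
    return DECODERS[type_number](payload)
```

The name table can stay around for debugging output without being on the hot path.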

Optimize uints

All uint bytes can be fetched at once, which should be faster than reading them byte by byte from the handle.

A size of 0 will need special treatment in this case.
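
A minimal Python sketch of the single-read approach. It assumes big-endian byte order and that a zero-size uint decodes to 0; the function name read_uint is hypothetical.

```python
import io


def read_uint(handle, size):
    """Decode an unsigned integer of `size` bytes with a single read."""
    if size == 0:
        # Special case: nothing to read, value assumed to be 0.
        return 0
    data = handle.read(size)
    if len(data) != size:
        raise EOFError("expected %d bytes, got %d" % (size, len(data)))
    return int.from_bytes(data, "big")
```

The zero-size branch also avoids issuing a pointless read against the handle.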

Add IPv6 support.

Support the full form and the short form (0000 -> 0).

The compact form with '::' is to be considered.
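
To make the three notations concrete, here is a small Python sketch using the standard ipaddress module. The helper name ipv6_forms and the dictionary keys are made up for the example; the "bits" entry is the 128-bit string a tree walk would consume.

```python
import ipaddress


def ipv6_forms(text):
    """Return the same IPv6 address in full, short and compact notation."""
    address = ipaddress.IPv6Address(text)
    groups = address.exploded.split(":")  # eight 4-digit hex groups
    return {
        "full": ":".join(groups),
        # Short form: leading zeros dropped per group, no '::' compression.
        "short": ":".join(format(int(group, 16), "x") for group in groups),
        "compact": address.compressed,
        "bits": format(int(address), "0128b"),
    }
```

All three notations describe the same 128 bits, so whichever forms the parser accepts should lead to the same lookup path.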

Fix \d matching all unicode digits

perl6 -e 'say "۳.۳.۳.۳" ~~  / ^ (\d ** 1..3) ** 4 % "." $ /'
「۳.۳.۳.۳」
 0 => 「۳」
 0 => 「۳」
 0 => 「۳」
 0 => 「۳」
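
The same pitfall can be reproduced outside Raku. The Python sketch below is only an illustration of the problem and of the usual fix, an explicit ASCII digit range (in Raku that would be a character class such as <[0..9]> instead of \d). The pattern names are made up.

```python
import re

# \d on str patterns matches any Unicode decimal digit, not just 0-9.
LOOSE_IPV4 = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")

# An explicit ASCII range only accepts the digits 0-9.
STRICT_IPV4 = re.compile(r"[0-9]{1,3}(?:\.[0-9]{1,3}){3}")

# U+06F3 is EXTENDED ARABIC-INDIC DIGIT THREE, the digit used in the report above.
unicode_address = "\u06f3.\u06f3.\u06f3.\u06f3"

loose_accepts = LOOSE_IPV4.fullmatch(unicode_address) is not None
strict_accepts = STRICT_IPV4.fullmatch(unicode_address) is not None
```

Neither pattern checks that each octet is in the 0..255 range; that validation is a separate step.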

Translation hook

Add a translation hook that allows skipping the included translations (usually there is no point in decoding them all) and getting a specific language by geoname id.

Reader interface design.

Assume that there are people who will use the bare GeoIP2 reader, not wrapped by a specific class such as GeoIP2::City. So the interface must be well described and safe to use.

There are methods that are safe to call at any time: reading metadata, reading a node pointer, reading location info. Those methods position the file cursor on their own.

And there are unsafe methods that can cause unexpected results when called directly: mostly, decoding values at the current cursor position can go out of range.
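
One way to express that split, sketched in Python with hypothetical names: the public method validates the range and positions the cursor itself, while the cursor-relative decoder is marked private and is never meant to be called directly.

```python
import io


class Reader:
    """Illustrative reader with a safe public method and an unsafe private one."""

    def __init__(self, handle, size):
        self._handle = handle
        self._size = size  # total number of bytes available in the handle

    def read_string_at(self, position, length):
        """Safe: checks the range and positions the cursor on its own."""
        if position < 0 or length < 0 or position + length > self._size:
            raise ValueError("requested range is outside of the file")
        self._handle.seek(position)
        return self._decode_string(length)

    def _decode_string(self, length):
        """Unsafe: decodes at whatever position the cursor currently has."""
        return self._handle.read(length).decode("utf-8")
```

Because the safe method seeks every time, it returns the same result no matter what was read before it.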

Add missing data types

  • 0 => 'extended',
  • 1 => 'pointer',
  • 2 => 'utf8_string',
  • 3 => 'double', ***
  • 4 => 'bytes',
  • 5 => 'uint16',
  • 6 => 'uint32',
  • 7 => 'map',
  • 8 => 'int32', ****
  • 9 => 'uint64',
  • 10 => 'uint128',
  • 11 => 'array',
  • (not needed) 12 => 'container', *
  • (not needed) 13 => 'end_marker', **
  • 14 => 'boolean',
  • 15 => 'float' ***
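
For context, the type numbers above come from the control byte that precedes each data field. The Python sketch below follows my reading of the MaxMind DB format: the top 3 bits hold the type, type 0 means the next byte holds the type number minus 7, and size values 29, 30 and 31 pull in 1, 2 or 3 extra bytes. Treat those details as assumptions to verify against the specification. The function name decode_control is hypothetical, and pointers are returned with their raw 5 bits because they use them differently.

```python
TYPE_NAMES = {
    0: "extended", 1: "pointer", 2: "utf8_string", 3: "double",
    4: "bytes", 5: "uint16", 6: "uint32", 7: "map",
    8: "int32", 9: "uint64", 10: "uint128", 11: "array",
    12: "container", 13: "end_marker", 14: "boolean", 15: "float",
}


def decode_control(data, offset=0):
    """Return (type name, size, offset of the first byte after the control data)."""
    control = data[offset]
    offset += 1
    type_number = control >> 5
    if type_number == 0:
        # Extended type: the real type number is 7 plus the next byte.
        type_number = 7 + data[offset]
        offset += 1
    size = control & 0x1F
    if type_number != 1:  # pointers interpret these 5 bits differently
        if size == 29:
            size = 29 + data[offset]
            offset += 1
        elif size == 30:
            size = 285 + int.from_bytes(data[offset:offset + 2], "big")
            offset += 2
        elif size == 31:
            size = 65821 + int.from_bytes(data[offset:offset + 3], "big")
            offset += 3
    return TYPE_NAMES[type_number], size, offset
```

For example, the byte 0x42 followed by "PL" describes a utf8_string of size 2 whose payload starts at offset 1.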
