matriv commented on May 25, 2024

I ran one more test with the following table (2 partitions, 1 shard each):

CREATE TABLE uservisits (
    "sourceIP" STRING,
    "destinationURL" STRING,
    "visitDate" TIMESTAMP,
    "adRevenue" FLOAT,
    "UserAgent" STRING INDEX USING FULLTEXT,
    "cCode" STRING,
    "lCode" STRING,
    "searchWord" STRING,
    "duration" INTEGER,
    "y" GENERATED ALWAYS AS date_trunc('year', "visitDate") / 1000000000000,
    -- plain index on "UserAgent", used for regex matching
    INDEX uagent_plain USING PLAIN("UserAgent")
) CLUSTERED INTO 1 shards PARTITIONED BY("y") WITH (
    number_of_replicas = 0
);
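
As a rough illustration of why this table ends up with only a couple of partitions, the sketch below mimics the generated column "y" in Python. It assumes that timestamps are epoch milliseconds and that the division is an integer division; both are assumptions about the database behaviour, and `partition_value` is a hypothetical helper, not part of CrateDB.

```python
from datetime import datetime, timezone

def partition_value(visit_date: datetime) -> int:
    """Approximate the generated column "y" (assumed semantics, see lead-in)."""
    # date_trunc('year', ...): truncate to the first instant of the year (UTC).
    year_start = datetime(visit_date.year, 1, 1, tzinfo=timezone.utc)
    # Assumption: the timestamp is stored as milliseconds since the epoch.
    epoch_ms = int(year_start.timestamp() * 1000)
    # Assumption: dividing by 10^12 is an integer division.
    return epoch_ms // 1_000_000_000_000

# Visits from different years can share one partition value.
for year in (1995, 2001, 2002, 2010):
    print(year, partition_value(datetime(year, 6, 15, tzinfo=timezone.utc)))
```

Under these assumptions every year before 2002 maps to one value and the following years map to the next, which would be consistent with the two partitions reported in the comment.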

I first loaded data files 0-8 and then benchmarked inserting the 9th.
The preloaded data files 0-8 yield the following shard info:

cr> select size / 1024 / 1024, num_docs, table_name, partition_ident from sys.shards;
+----------------------------------------+----------+------------+-----------------+
| ((size / 1024::bigint) / 1024::bigint) | num_docs | table_name | partition_ident |
+----------------------------------------+----------+------------+-----------------+
|                                    971 |  5259365 | uservisits |           04130 |
|                                    373 |  1699738 | uservisits |           04132 |
+----------------------------------------+----------+------------+-----------------+

The benchmark results are:

Q: insert into uservisits ("adRevenue", "destinationURL", "searchWord", "UserAgent", "duration", "visitDate", "sourceIP", "lCode", "cCode") values ($1, $2, $3, $4, $5, $6, $7, $8, $9)
C: 20
| Version |         Mean ±    Stdev |        Min |     Median |         Q3 |        Max |
|   V1    |        6.815 ±   14.030 |      1.956 |      4.119 |      4.939 |    182.224 |
|   V2    |        4.156 ±   17.014 |      0.692 |      1.945 |      2.600 |    291.667 |
├---------┴-------------------------┴------------┴------------┴------------┴------------┘
|               -  48.47%                           -  71.69%   

This shows that as the shards grow larger, the optimized insert path provides a more significant performance improvement.
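
For reference, the two percentages under the table appear to be consistent with a symmetric percent difference, i.e. the difference between V2 and V1 divided by the mean of the two values. This is inferred from the numbers and is not stated in the comment; `symmetric_percent_diff` is a hypothetical helper name. A minimal sketch of the arithmetic:

```python
def symmetric_percent_diff(v1: float, v2: float) -> float:
    """Difference of v2 relative to the mean of v1 and v2, in percent."""
    return (v2 - v1) / ((v1 + v2) / 2) * 100

# Values taken from the Mean and Median columns of the table above.
mean_diff = symmetric_percent_diff(6.815, 4.156)
median_diff = symmetric_percent_diff(4.119, 1.945)

print(f"mean:   {mean_diff:.2f}%")
print(f"median: {median_diff:.2f}%")
```

With the rounded table values this gives roughly -48.5% for the mean and -71.7% for the median, which agrees with the reported figures up to rounding of the inputs.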
