Comments (2)
Benchmark Round-1: Many rows
Generate Data
INSERT INTO log
SELECT
(FLOOR(RAND() * 1000000) % 10000)::INT,
JSON_OBJECT(
'user_ip', CONCAT_WS(
'.',
FLOOR(RAND() * 255)::STRING,
FLOOR(RAND() * 255)::STRING,
FLOOR(RAND() * 255)::STRING,
FLOOR(RAND() * 255)::STRING
),
'request_method', IF(RAND() > 0.5, 'GET', 'POST'),
'request_url', CONCAT(
'http://example.com/',
CASE
WHEN RAND() < 0.1 THEN 'login'
WHEN RAND() < 0.2 THEN 'logout'
WHEN RAND() < 0.3 THEN 'profile'
WHEN RAND() < 0.4 THEN 'settings'
WHEN RAND() < 0.5 THEN 'dashboard'
WHEN RAND() < 0.6 THEN 'search?q=example'
WHEN RAND() < 0.7 THEN 'help'
WHEN RAND() < 0.8 THEN 'contact'
WHEN RAND() < 0.9 THEN 'about'
ELSE 'home'
END
),
'response_status', IF(RAND() > 0.5, '200', '404')
),
NOW()
FROM numbers(100000000);
This will take 9min in small warehouse.
Inverted index
cold-run: 19s
hot-run: 5.4s
SELECT count(*)
FROM log
WHERE MATCH(url, 'logout');
![image](https://private-user-images.githubusercontent.com/172204/317581840-5928c25b-cfac-46de-9db7-0a71175c6ae8.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTE5OTA3NzIsIm5iZiI6MTcxMTk5MDQ3MiwicGF0aCI6Ii8xNzIyMDQvMzE3NTgxODQwLTU5MjhjMjViLWNmYWMtNDZkZS05ZGI3LTBhNzExNzVjNmFlOC5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNDAxJTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDQwMVQxNjU0MzJaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT1hZWQ0ODFkNmJjYzRlOTliNDY4NzgxYWZhOTRjNjgwMDUxMGZkNTc4MzNhZGZhYWE4YmE1OWE3ZmI3MzA5NWQxJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.VOliW0urCkErXxaV9ttcswVm_HTP_xFauG_wqiQu9a8)
Like
cold-run: 1.3s
hot-run: 0.5s
SELECT count(*)
FROM log
WHERE url like '%logout%';
![image](https://private-user-images.githubusercontent.com/172204/317581744-f6d0fe9c-9333-4c2c-a57f-dae4b784475d.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTE5OTA3NzIsIm5iZiI6MTcxMTk5MDQ3MiwicGF0aCI6Ii8xNzIyMDQvMzE3NTgxNzQ0LWY2ZDBmZTljLTkzMzMtNGMyYy1hNTdmLWRhZTRiNzg0NDc1ZC5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNDAxJTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDQwMVQxNjU0MzJaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT1hMTUzMDdiMGY0ZWUwYjAwMGZlMTYxODc0OGIzZWViODdlZTJiMzdjOTFmN2RkMWU0MTJjYmQ2ODQ3YTEwM2I2JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.6uC0lPxzyszXKyjTebfy84CkbLKaJIgsiuoE0OZ9mpA)
from databend.
Benchmark Round-2: A few rows(152 rows)
Generate Data
INSERT INTO log
SELECT
(FLOOR(RAND() * 1000000) % 10000)::INT,
JSON_OBJECT(
'user_ip', CONCAT_WS(
'.',
FLOOR(RAND() * 255)::STRING,
FLOOR(RAND() * 255)::STRING,
FLOOR(RAND() * 255)::STRING,
FLOOR(RAND() * 255)::STRING
),
'request_method', IF(RAND() > 0.5, 'GET', 'POST'),
'request_url', CONCAT(
'http://example.com/',
LEFT(MD5(RAND()::STRING), 5)
),
'response_status', IF(RAND() > 0.5, '200', '404')
),
NOW()
FROM numbers(100000000);
Inverted index
cold-run: 19.4s
hot-run: 0.075s
SELECT count(*)
FROM log
WHERE MATCH(url, '78114');
Like
cold-run: 1.3s
hot-run: 0.6s
SELECT count(*)
FROM log
WHERE url like '%78114%';
from databend.
Related Issues (20)
- bug: stream get wrong result if change tracking is changed
- bug: query detail lost qualify with window function HOT 1
- bug: leave a record with empty `query_text` when planning fail.
- Feature: parquet metadata cache in ParquetRsTable
- consider: remove the query result cache code HOT 2
- bug: copy into table source can't contain window|aggregate|udf|join functions HOT 1
- copy into use specified file list is very slow
- Tracking: refactor and optimize `merge into`
- increase `default_wait_time_secs` to be 10 secs HOT 1
- Feature: Sort aggregation support
- bug: Error when create table have column name have upper character HOT 3
- feat: support warehouse-level settings
- bug: use purge=true in copy into with failed ,but file was delete HOT 1
- Feature: copy into table support `on_error = skip_file`
- bug: invalid unicode code point when loading files dump with format ndjson.
- bug: tuple field with special char will cause session panic HOT 1
- feat: query forward HOT 5
- bug: create user with DEFAULT_ROLE must uppercase HOT 1
- Tracking: Multi-table insert statement
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from databend.