Comments (2)
I benchmarked with populated tables vs empty tables when adding reference columns on Postgres 15. tested on a Macbook M1 Pro (10 cores 16gb RAM).
with an empty table (the one adding the column), the difference is negligible; go forth and have simple migrations.
with 1 million records in both tables, the difference is up to 45ms which is hardly noticeable to anyone. Go forth and have simple migrations.
with 100 million records in both tables, the difference is up to 6 seconds. Still not a lot of time, but depending on your use cases, this could be noticeable.
In depth look when benchmarking 1 million rows:
- When adding a column to an empty table that references a populated table's column
- with validate true, total time is
4.550 ms
. blocks writes to both tables. - with validate false and separate validation, total time is
1.793 ms
(2.5x faster). blocks writes to both tables during column creation (1.045 ms
), blocks almost nothing on empty table during validation (0.748 ms
).
- with validate true, total time is
- When adding a column to a populated table that references a populated table's column
- with validate true, total time is
60.929 ms
blocks writes to both tables. - with validate false, total time is
45.17 ms
(1.3x faster). blocks writes to both tables during column creation (0.507 ms
), and doesn't block much on constraint table during validation (44.663 ms
).
- with validate true, total time is
clearly, it doesn't really matter when the referenced column is added to an empty table. You can get away with this :)
However, in both tests, separating the constraint validation from column creation is noticeably faster, and there is another noticeable difference: when there is populated data in both tables and you don't separate validation from column creation, you have longer write-blocking locks on both tables.
- Creating the referenced column obtains a
ShareRowExclusiveLock
on both tables which blocks writes. - Validating the constraint only obtains a
ShareUpdateExclusiveLock
on the table w/ the constraint which does not block writes, and obtains aRowShareLock
lock on the referenced table which blocks almost nothing.
Caution: this is one benchmark.
-- SETUP
\timing;
drop table if exists testing_one;
drop table if exists testing_two;
create table testing_one (
id bigint not null primary key,
foo varchar(255),
bar varchar(255)
);
create table testing_two (
id bigint not null primary key,
foo varchar(255),
bar varchar(255)
);
INSERT INTO testing_one(id, foo, bar)
SELECT x, 'foo ' || cast(x as varchar(20)), 'bar ' || cast(x as varchar(20)) from generate_series(1,1000000) x;
-- TESTING WITH AN EMPTY TABLE
TRUNCATE TABLE testing_two;
-- TEST WITH SIMPLE REFERENCE
ALTER TABLE testing_two
ADD COLUMN testing_one_id bigint,
ADD CONSTRAINT testing_two_testing_one_id_fkey FOREIGN KEY (testing_one_id)
REFERENCES testing_one(id) ON DELETE CASCADE;
-- Time: 4.550 ms
CREATE INDEX CONCURRENTLY ON testing_two(testing_one_id);
-- Time: 4.091 ms
ALTER TABLE testing_two DROP IF EXISTS testing_one_id;
-- TEST WITH VALIDATE FALSE
ALTER TABLE testing_two
ADD COLUMN testing_one_id bigint,
ADD CONSTRAINT testing_two_testing_one_id_fkey FOREIGN KEY (testing_one_id)
REFERENCES testing_one(id) ON DELETE CASCADE NOT VALID;
-- Time: 1.045 ms
CREATE INDEX CONCURRENTLY on testing_two(testing_one_id);
-- Time: 2.033 ms
ALTER TABLE testing_two VALIDATE CONSTRAINT testing_two_testing_one_id_fkey;
-- Time: 0.748 ms
ALTER TABLE testing_two DROP IF EXISTS testing_one_id;
-- TEST WITH POPULATED DATA
TRUNCATE TABLE testing_two;
INSERT INTO testing_two(id, foo, bar)
SELECT x, 'foo ' || cast(x as varchar(20)), 'bar ' || cast(x as varchar(20)) from generate_series(1,1000000) x;
-- TEST WITH SIMPLE REFERENCE
ALTER TABLE testing_two
ADD COLUMN testing_one_id bigint,
ADD CONSTRAINT testing_two_testing_one_id_fkey FOREIGN KEY (testing_one_id)
REFERENCES testing_one(id) ON DELETE CASCADE;
-- 60.929 ms
CREATE INDEX CONCURRENTLY ON testing_two(testing_one_id);
-- Time: 240.110 ms
ALTER TABLE testing_two DROP IF EXISTS testing_one_id;
-- TEST WITH VALIDATE FALSE
ALTER TABLE testing_two
ADD COLUMN testing_one_id bigint,
ADD CONSTRAINT testing_two_testing_one_id_fkey FOREIGN KEY (testing_one_id)
REFERENCES testing_one(id) ON DELETE CASCADE NOT VALID;
-- Time: 0.507 ms
CREATE INDEX CONCURRENTLY on testing_two(testing_one_id);
-- Time: 207.736 ms
ALTER TABLE testing_two VALIDATE CONSTRAINT testing_two_testing_one_id_fkey;
-- Time: 44.663 ms
ALTER TABLE testing_two DROP IF EXISTS testing_one_id;
from safe-ecto-migrations.
Resolved with a2f2441
from safe-ecto-migrations.
Related Issues (13)
- Advisory lock for migration transactions HOT 10
- Add hint to tune down `:migration_advisory_lock_retry_interval_ms` in `dev` environment when using `:pg_advisory_lock` HOT 1
- Verify concurrent index table locks
- Docs on setting NON NULL safely on PG12+ may not be safe due to ecto-generated inclusion of `ALTER COLUMN ... TYPE ...` HOT 2
- Recommend `:text` instead of `:string`
- Backfill data page - question on "next_page = results |> Enum.reverse() |> List.first()" HOT 5
- Adding non-immutable default value to existing table is not safe
- CREATE INDEX does not block table reads HOT 4
- adding generated stored columns on large tables in Postgres
- For each section there should be a test
- For each section there should be a citation
- New strategy for changing type of a column
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from safe-ecto-migrations.