fly-apps / safe-ecto-migrations
Guide to Safe Ecto Migrations
License: Other
In the "Adding an index" section it is advised to configure the Repo to use advisory locks with :pg_advisory_lock. This can cause long Phoenix request delays (5s 😮) in the Phoenix dev environment. This is due to Phoenix.Ecto.CheckRepoStatus calling Ecto.Migrator.migrations/1, which runs under the migration lock. When Ecto cannot obtain the lock because another instance is holding it, Ecto will wait for 5 seconds. In the dev environment, with the CheckRepoStatus plug, you just need two web requests to hit your Phoenix server at the very same time to get into that situation. We stumbled into it with a JS app that, after load, fired two simultaneous GraphQL requests; one of them would take 5s.
I already asked on the Elixir Forum whether it is necessary that this convenience plug runs under a migration lock.
We worked around the problem by tuning the :migration_advisory_lock_retry_interval_ms setting down from its default value of 5000 ms to 10 ms (in dev only). Unless there are better options, maybe you could add that hint to the guide?
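The workaround above can be sketched as a dev-only Repo config (a minimal sketch; :my_app and MyApp.Repo are placeholders, and the option names are the ones documented for the ecto_sql Postgres adapter):

```elixir
# config/dev.exs
config :my_app, MyApp.Repo,
  migration_lock: :pg_advisory_lock,
  # Default is 5000 ms; lowering it means a blocked CheckRepoStatus call
  # retries quickly instead of stalling a dev request for 5 seconds.
  migration_advisory_lock_retry_interval_ms: 10
```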
I know this, but I need to find the right sources to cite and run my own benchmarks.
Here's a wonderful start: https://www.depesz.com/2010/03/02/charx-vs-varcharx-vs-varchar-vs-text/
It's not so much a caution of "don't do this" but more of a "steer in a better direction with no trade-offs".
There's another way to safely change the type of a column: essentially, use an updatable view to help transition between the old and new columns and swap them instantaneously.
# imagine amounts were stored as strings in the `amount` column
alter table("transactions") do
  # add :amount, :string # this column already exists and is what's being replaced
  add :amount_new, :decimal, precision: 12, scale: 2
end
Backfill to that new column. Wait for the backfill to complete and verify 100% parity. You may need to add database constraints at this point if that's part of the business need.
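The backfill step might look roughly like this in a migration (an illustrative sketch, not from the guide; the batch size, cast, and column names are assumptions):

```elixir
# Run with the DDL transaction disabled so each batch commits on its own
# and locks stay short. Repeat until no rows remain to update.
execute """
UPDATE transactions
SET amount_new = amount::decimal(12,2)
WHERE id IN (
  SELECT id FROM transactions
  WHERE amount_new IS NULL AND amount IS NOT NULL
  LIMIT 1000
)
"""
```

In practice the loop over batches usually lives in a release task or data-migration script rather than a single execute/1 call.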
Switch application reads to new column. Deploy.
field :amount, :decimal, source: :amount_new
field :amount_old, :string, source: :amount
Remove the old column from the schema. Deploy. At this point, the old column is no longer read.
Drop the old column, rename the table to a temporary name, and, while deploying, create an updatable view with the original name that contains both columns (which hold the same values in the database). This ensures old code still works while it's being cycled out. Remove :source on the schema field. Deploy.
If you have other users using the database, such as business intelligence or other extractors, ensure they're aware of this change.
repo().transaction(fn repo ->
  repo.query!("ALTER TABLE transactions DROP COLUMN amount", [], [])
  repo.query!("ALTER TABLE transactions RENAME TO transactions_new", [], [])
  # This creates an updatable view
  repo.query!(
    "CREATE VIEW transactions AS SELECT *, amount_new AS amount FROM transactions_new", [], [])
end)

repo().transaction(fn repo ->
  repo.query!("DROP VIEW transactions", [], [])
  repo.query!("ALTER TABLE transactions_new RENAME amount_new TO amount", [], [])
  repo.query!("ALTER TABLE transactions_new RENAME TO transactions", [], [])
end)
https://www.postgresql.org/docs/current/sql-altertable.html
Adding a column with a volatile DEFAULT or changing the type of an existing column will require the entire table and its indexes to be rewritten. As an exception, when changing the type of an existing column, if the USING clause does not change the column contents and the old type is either binary coercible to the new type or an unconstrained domain over the new type, a table rewrite is not needed. However, indexes must always be rebuilt unless the system can verify that the new index would be logically equivalent to the existing one. For example, if the collation for a column has been changed, an index rebuild is always required because the new sort order might be different. However, in the absence of a collation change, a column can be changed from text to varchar (or vice versa) without rebuilding the indexes because these data types sort identically. Table and/or index rebuilds may take a significant amount of time for a large table; and will temporarily require as much as double the disk space.
(emphasis mine)
The Safe Ecto Migrations note about adding a new column with a default implies that it becomes totally safe in Postgres 11+, but that's not entirely true because of the highlighted caveat above.
We should add a note to call this out.
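To illustrate the caveat with a quick sketch (the products table and column names here are hypothetical; gen_random_uuid() is built in from PostgreSQL 13 and available via pgcrypto before that):

```sql
-- Metadata-only change on PostgreSQL 11+: the default is a constant
ALTER TABLE products ADD COLUMN flag boolean DEFAULT false;

-- Forces a full table rewrite: the default is volatile
ALTER TABLE products ADD COLUMN token uuid DEFAULT gen_random_uuid();
```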
For the recipe to set NOT NULL on an existing column, I think there's a bug in one of the statements that will cause locking for extended periods of time on PG 12. I have not tried to reproduce this on any newer version of Postgres.
When it gets to the following step in the docs:
alter table("products") do
  modify :active, :boolean, null: false
end
Ecto generates the following SQL under the hood:
ALTER TABLE "products" ALTER COLUMN "active" TYPE boolean, ALTER COLUMN "active" SET NOT NULL;
Even though the type is the same as the existing column type, which should in itself be a no-op or metadata-only change, I believe it causes Postgres to conclude it cannot optimize the ALTER: it sees that additional work might need to be done at the same time and declines to take the constraint-check shortcut, triggering a full table scan while locked. At least that was my experience on PG 12.14.
Until Ecto is smart enough to know that the column type is unchanged and strips out the ALTER COLUMN ... TYPE ..., it might be dangerous to run the Ecto-generated ALTER TABLE for this step rather than handcrafted SQL. Even then, you'd have to be on a certain version of Ecto.
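A sketch of the handcrafted alternative in a migration (assuming a CHECK constraint enforcing NOT NULL has already been added and validated, per the guide's recipe; the active_not_null constraint name is illustrative):

```elixir
def up do
  # SET NOT NULL alone takes the constraint-check shortcut on PG 12+:
  # Postgres sees the validated CHECK constraint and skips the table scan.
  execute "ALTER TABLE products ALTER COLUMN active SET NOT NULL"
  # The CHECK constraint is now redundant and can be dropped.
  execute "ALTER TABLE products DROP CONSTRAINT active_not_null"
end
```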
Assume a table called "foo" with a lot of data and a column called "a" of type text with a UNIQUE constraint.
(I'm using 50 million rows, but locally on a fast NVMe drive with enough RAM to fully cache the data set, so these single-digit-second timings would be larger in an active production environment.)
I'm running PG 12.2 for these timings, but we also experienced the issue on PG 12.14.
Creating and validating the constraint. Note the 3.3 seconds to do a full table scan:
testdb=# ALTER TABLE "foo" ADD CONSTRAINT "a_not_null" CHECK (a IS NOT NULL) NOT VALID;
ALTER TABLE
Time: 6.648 ms
testdb=# ALTER TABLE foo VALIDATE CONSTRAINT a_not_null;
ALTER TABLE
Time: 3365.122 ms (00:03.365)
Converting the column to non-null using the SQL produced by mix ecto.migrate --log-migrations-sql. Note the inclusion of ALTER COLUMN "a" TYPE text and that the time it takes to run is similar to the full table scan:
testdb=# ALTER TABLE "foo" ALTER COLUMN "a" TYPE text, ALTER COLUMN "a" SET NOT NULL;
ALTER TABLE
Time: 3425.920 ms (00:03.426)
testdb=# ALTER TABLE "foo" ALTER COLUMN "a" DROP NOT NULL;
ALTER TABLE
Time: 6.708 ms
testdb=# ALTER TABLE "foo" ALTER COLUMN "a" TYPE text, ALTER COLUMN "a" SET NOT NULL;
ALTER TABLE
Time: 3370.575 ms (00:03.371)
testdb=# ALTER TABLE "foo" ALTER COLUMN "a" DROP NOT NULL;
ALTER TABLE
Time: 6.634 ms
testdb=# ALTER TABLE "foo" ALTER COLUMN "a" TYPE text, ALTER COLUMN "a" SET NOT NULL;
ALTER TABLE
Time: 3418.631 ms (00:03.419)
testdb=# ALTER TABLE "foo" ALTER COLUMN "a" DROP NOT NULL;
ALTER TABLE
Time: 6.586 ms
Setting NOT NULL without the type change. Note response times in the single-digit milliseconds:
testdb=# ALTER TABLE "foo" ALTER COLUMN "a" SET NOT NULL;
ALTER TABLE
Time: 1.475 ms
testdb=# ALTER TABLE "foo" ALTER COLUMN "a" DROP NOT NULL;
ALTER TABLE
Time: 6.428 ms
testdb=# ALTER TABLE "foo" ALTER COLUMN "a" SET NOT NULL;
ALTER TABLE
Time: 6.562 ms
testdb=# ALTER TABLE "foo" ALTER COLUMN "a" DROP NOT NULL;
ALTER TABLE
Time: 6.685 ms
testdb=# ALTER TABLE "foo" ALTER COLUMN "a" SET NOT NULL;
ALTER TABLE
Time: 6.757 ms
testdb=# ALTER TABLE "foo" ALTER COLUMN "a" DROP NOT NULL;
ALTER TABLE
Time: 6.445 ms
The following table setup was used to reproduce the issue:
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

CREATE TABLE foo (
  id bigint PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
  a text,
  b text,
  c text,
  d text,
  e text,
  f text,
  g text,
  h text,
  i text,
  j text,
  k text,
  l text,
  m text,
  n text,
  o text,
  p text,
  q text,
  r text,
  s text,
  t text,
  u text,
  v text,
  w text,
  x text,
  y text,
  z text,
  CONSTRAINT a_uniq_idx UNIQUE(a)
);

INSERT INTO foo (a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z)
SELECT
  uuid_generate_v4(),
  concat('b-', i::text),
  concat('c-', i::text),
  concat('d-', i::text),
  concat('e-', i::text),
  concat('f-', i::text),
  concat('g-', i::text),
  concat('h-', i::text),
  concat('i-', i::text),
  concat('j-', i::text),
  concat('k-', i::text),
  concat('l-', i::text),
  concat('m-', i::text),
  concat('n-', i::text),
  concat('o-', i::text),
  concat('p-', i::text),
  concat('q-', i::text),
  concat('r-', i::text),
  concat('s-', i::text),
  concat('t-', i::text),
  concat('u-', i::text),
  concat('v-', i::text),
  concat('w-', i::text),
  concat('x-', i::text),
  concat('y-', i::text),
  concat('z-', i::text)
FROM generate_series(1, 5000000) AS gs(i);
For each section, there should be a docker-compose test that illustrates the scenario.
One fantastic example of this is here: https://gitlab.com/ongresinc/blog-posts-src/-/tree/master/202005-generate_column_vs_trigger
Hello,
Not sure if this is the best place to ask, but I'll shoot anyway ;).
I'm a Rails developer learning Elixir and I have a question about safe Ecto migrations. Is it possible to use an advisory lock instead of SHARE UPDATE EXCLUSIVE on the schema_migrations table (like Rails does) when creating a migration record?
I'm asking because (as far as I understand) an advisory lock would allow disabling the DDL transaction for the migration itself (i.e. adding an index concurrently) while still making sure that only one node can execute it (in a multi-node setup).
Some of the information is stated without supporting citations. This should be fixed: each scenario should be verified and, once verified, given a citation pointing either to official documentation or release notes, or to a repeatable test scenario.
It looks like the lock obtained in Postgres when creating an index concurrently (SHARE UPDATE EXCLUSIVE) should allow writes (ROW EXCLUSIVE), but the guide currently implies that writes are blocked.
With Postgres, instead create the index concurrently which does not block reads. There are two options:
SHARE UPDATE EXCLUSIVE (ShareUpdateExclusiveLock)
Conflicts with the SHARE UPDATE EXCLUSIVE, SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, and ACCESS EXCLUSIVE lock modes.
Acquired by VACUUM (without FULL), ANALYZE, CREATE INDEX CONCURRENTLY, CREATE STATISTICS, COMMENT ON, REINDEX CONCURRENTLY, and certain ALTER INDEX and ALTER TABLE variants (for full details see the documentation of these commands).
ROW EXCLUSIVE (RowExclusiveLock)
Conflicts with the SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, and ACCESS EXCLUSIVE lock modes.
The commands UPDATE, DELETE, INSERT, and MERGE acquire this lock mode on the target table (in addition to ACCESS SHARE locks on any other referenced tables). In general, this lock mode will be acquired by any command that modifies data in a table.
Hi, in https://fly.io/phoenix-files/backfilling-data/#batching-deterministic-data, the code getting the next_page id seems to imply that Repo.update_all would return the results as an ordered list, in reverse order. Would that be true? I wonder if you'd need to do some sorting on the result to get the tail.
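A defensive sketch of what I mean (my assumption, not from the article: the query selects the ids, so update_all returns {count, list_of_ids}, and PostgreSQL does not guarantee the order of rows returned by UPDATE ... RETURNING):

```elixir
{_count, ids} = Repo.update_all(batch_query, set: [updated_at: DateTime.utc_now()])

# Take the maximum explicitly instead of relying on the return order.
next_page = if ids == [], do: nil, else: Enum.max(ids)
```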
tl;dr: when adding a generated stored column to an existing large table, Postgres will lock the table while it calculates the value for each row.
There seems to be a workaround: a BEFORE INSERT OR UPDATE trigger with a function equivalent to what you would put in the generated expression. However, there seems to be a trade-off in that INSERT times are slower with a trigger function. Source (4 years old, for Postgres v12): https://www.ongres.com/blog/generate_columns_vs_triggers/
CREATE OR REPLACE FUNCTION generate_foo_immutable()
RETURNS trigger
AS $$
BEGIN
  NEW.foo = NEW.bar * 2;
  RETURN NEW;
END;
$$
LANGUAGE plpgsql IMMUTABLE;

CREATE TRIGGER generate_foo_immutable_trigger
BEFORE INSERT OR UPDATE [OF other_column1, other_column2, ...] ON foo_table
FOR EACH ROW
EXECUTE PROCEDURE public.generate_foo_immutable();
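One caveat worth noting: the trigger only fires on new writes, so existing rows still need a one-time backfill (an illustrative sketch matching the foo_table example above; in production this should be batched to keep locks short):

```sql
-- Touch only rows the trigger hasn't populated yet.
UPDATE foo_table
SET foo = bar * 2
WHERE foo IS NULL;
```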
https://github.com/fly-apps/safe-ecto-migrations#adding-a-reference-or-foreign-key
If the reference is on a brand-new, empty table, will the referenced table with millions of records be locked for a meaningful amount of time?
Folks at my company have been passing this repo and the associated article around like it's gospel, but I'm pretty sure you're wrong about creating Postgres indexes blocking table reads:
Creating an index will block both reads and writes in Postgres.
CREATE INDEX will block table writes but not reads in Postgres.
With Postgres, instead create the index concurrently which does not block reads.
And creating the index concurrently will not block writes, either.
From the PostgreSQL docs regarding CONCURRENTLY:
When this option is used, PostgreSQL will build the index without taking any locks that prevent concurrent inserts, updates, or deletes on the table; whereas a standard index build locks out writes (but not reads) on the table until it's done. There are several caveats to be aware of when using this option — see Building Indexes Concurrently.
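For reference, the concurrent approach in an Ecto migration looks roughly like this (a sketch following the pattern the guide describes; the module, table, and column names are placeholders):

```elixir
defmodule MyApp.Repo.Migrations.AddFooAIndex do
  use Ecto.Migration

  # CREATE INDEX CONCURRENTLY cannot run inside a transaction,
  # so both the DDL transaction and the migration lock are disabled.
  @disable_ddl_transaction true
  @disable_migration_lock true

  def change do
    create index("foo", [:a], concurrently: true)
  end
end
```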