Giter VIP home page Giter VIP logo

Comments (10)

autarch avatar autarch commented on September 6, 2024 1

I realized I can give you something to reproduce it with. I attached a pg_dump of the musicbrainz schema without any data:
mb.sql.txt

from sea-schema.

autarch avatar autarch commented on September 6, 2024 1

It's not really my schema, it's from the musicbrainz project - https://musicbrainz.org/doc/MusicBrainz_Database/Schema

As for getting foreign keys, there's some good discussion here, I think - https://dba.stackexchange.com/questions/36979/retrieving-all-pk-and-fk

from sea-schema.

autarch avatar autarch commented on September 6, 2024 1

I dug into this some more and I think I found what's causing it. The artist_release table has two partitions named artist_release_nonva and artist_release_va. Here's the SQL to create all of these:

CREATE TABLE artist_release (
    -- `is_track_artist` is TRUE only if the artist came from a track
    -- AC and does not also appear in the release AC. Track artists
    -- that appear in the release AC are not stored.
    is_track_artist                     BOOLEAN NOT NULL,
    artist                              INTEGER NOT NULL, -- references artist.id, CASCADE
    first_release_date                  INTEGER,
    catalog_numbers                     TEXT[],
    country_code                        CHAR(2),
    barcode                             BIGINT,
    -- Prior to adding these materialized tables, we'd order releases
    -- by name only if all other attributes where equal. It's not too
    -- common that an artist will have tons of releases with no dates,
    -- catalog numbers, countries, or barcodes (though it can be seen
    -- on some big composers). As a compromise between dropping the
    -- name sorting and having to store the entire name here (which,
    -- as a reminder, is duplicated for every artist on the release),
    -- we only store the first character of the name for sorting.
    sort_character                      CHAR(1) COLLATE musicbrainz NOT NULL,
    release                             INTEGER NOT NULL -- references release.id, CASCADE
) PARTITION BY LIST (is_track_artist);

CREATE TABLE artist_release_nonva
    PARTITION OF artist_release FOR VALUES IN (FALSE);

CREATE TABLE artist_release_va
    PARTITION OF artist_release FOR VALUES IN (TRUE);

So I think we end up' seeing the same FK to artist 3 times because of this. There's also an identical partitioning scheme for the artist_release_group table.

I'm not sure exactly what the right fix is.

The simple solution would be to simply ignore the partitions and only generate entities for the main table. Then any FKs on those partitions could also be ignored.

But that's really not great. It'd probably be better to generate entities for both the parent table and its child partitions. I think the children should reuse the parent's FKs exactly, rather than having them retrieved separately? I'm not really sure since I haven't used Pg's partitioning features much. Is there a case where a child table would have different FKs than the parent?

from sea-schema.

billy1624 avatar billy1624 commented on September 6, 2024

Hey @autarch, thank you so much for the report! We will investigate it.

from sea-schema.

billy1624 avatar billy1624 commented on September 6, 2024

I realized I can give you something to reproduce it with. I attached a pg_dump of the musicbrainz schema without any data: mb.sql.txt

Oh my, your schema is huge!
And after doing some research online. Seems there is not a easy and standard way of getting a list of foreign keys from Postgres.

from sea-schema.

billy1624 avatar billy1624 commented on September 6, 2024

Hey @autarch, you could try #30 by running

cargo run --manifest-path tests/writer/postgres/Cargo.toml

You might need to update tests/writer/postgres/src/main.rs accordingly

use sea_schema::postgres::discovery::SchemaDiscovery;
use sea_schema::sea_query::PostgresQueryBuilder;
use sqlx::PgPool;

#[async_std::main]
async fn main() {
    let connection = PgPool::connect("postgres://sea:sea@localhost/sea_schema_issues_28")
        .await
        .unwrap();

    let schema_discovery = SchemaDiscovery::new(connection, "musicbrainz");

    let schema = schema_discovery.discover().await;

    for table in schema.tables.iter() {
        println!("{};", table.write().to_string(PostgresQueryBuilder));
        println!();
    }
}

from sea-schema.

tyt2y3 avatar tyt2y3 commented on September 6, 2024

Do we have a minimal reproducible schema?

from sea-schema.

billy1624 avatar billy1624 commented on September 6, 2024

Do we have a minimal reproducible schema?

Nope. I still don't understand what happen here. Inside the "information_schema"."referential_constraints".

select * from "information_schema"."referential_constraints"

will get... two set of identical rows... for some reason

image

from sea-schema.

billy1624 avatar billy1624 commented on September 6, 2024

Do we have a minimal reproducible schema?

Just added, 3ee593d, but it's not that minimal loll

from sea-schema.

billy1624 avatar billy1624 commented on September 6, 2024

So I think we end up' seeing the same FK to artist 3 times because of this. There's also an identical partitioning scheme for the artist_release_group table.

Yes... I think so.

But that's really not great. It'd probably be better to generate entities for both the parent table and its child partitions.

I think it did discover the parent and both partitions

from sea-schema.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.