pepsico-ecommerce / snowflex

elixir snowflake client

License: Apache License 2.0
Hi!
I'm making some queries with Snowflex and noticing this in my Snowflake query console:
The query I'm making includes the event name: [OBD-2] Finalizou tutorial do prontuário
I am guessing that there's a missing UTF-8 conversion in the query somewhere?
I have also noticed that making certain queries (try one with the first value from this document) causes the connection to drop entirely and never be recreated.
From then on, any query will return {:error, [:connection_closed, "No SQL-driver information available"]}
I noticed this today while I was working on a new query to improve some stuff on an internal application.
A simple version of my query is like so:
WITH padded_upcs AS (
SELECT
item.TITLE,
LPAD(guide_to_things.CODE, 13, '0') AS UPC
FROM
NAMESPACE.SUBNAMESPACE.ITEM item
INNER JOIN
NAMESPACE.SUBNAMESPACE.GUIDE guide_to_things
ON
item.FOO = guide_to_things.FOO
WHERE
UPPER(guide_to_things.CODE) = 'UPC'
QUALIFY
ROW_NUMBER()
OVER
(PARTITION BY item.ITEM ORDER BY item.LOAD_DTS DESC) = 1
),
gtins AS (
SELECT
item.TITLE,
LPAD(guide_to_things.CODE, 13, '0') AS GTIN
FROM
NAMESPACE.SUBNAMESPACE.ITEM item
INNER JOIN
NAMESPACE.SUBNAMESPACE.GUIDE guide_to_things
ON
item.FOO = guide_to_things.FOO
WHERE
UPPER(guide_to_things.CODE) = 'GTIN'
QUALIFY
ROW_NUMBER()
OVER
(PARTITION BY item.ITEM ORDER BY item.LOAD_DTS DESC) = 1
)
SELECT
p.TITLE,
g.GTIN
FROM
padded_upcs p
JOIN
gtins g
ON
charindex(p.upc, g.gtin) > 0
In this case, the two CTEs being joined gather roughly 200,000 and 90,000 items respectively.
In the snowflake UI, this query takes about 8 minutes to run. If you have a sufficiently slow-ish query that is similar, anything in that ballpark should suffice. If you are a PepsiCo e-comm engineering employee and want an easy, reusable example, I can provide the full, real version of this query to use for debugging.
In short, to demonstrate the problem: start six `iex -S mix` shells and run the slow query from each. After about 10-15 minutes (accounting for some extra serialization time through ODBC), a couple of results will come in, but if you check the Snowflake activity panel, you can see the queued/running queries.
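Alternatively, the reproduction can be driven from a single shell. This is a minimal sketch; `MyApp.SnowflakeConnection` and `slow_sql` are placeholders for your own connection module and a query slow enough to exceed the timeout:

```elixir
# Hypothetical reproduction: kick off 6 slow queries concurrently from one iex shell
slow_sql = "SELECT ..."  # placeholder: any query that takes several minutes

tasks =
  for _ <- 1..6 do
    Task.async(fn -> MyApp.SnowflakeConnection.execute(slow_sql) end)
  end

# Wait longer than the configured worker timeout to observe the failure mode
Task.await_many(tasks, :timer.minutes(25))
```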
Our work configuration allows 2 concurrent queries per developer account, so with 6 sessions the queries begin to stack. After about 20 minutes (our configured timeout; use whatever timeout your application has configured to hit this checkpoint), the stacking queue makes it infeasible for some queries to complete in under 20 minutes.
Now, we will see something like this:
Note the bottom right: my connection, which timed out. The other 5 are complete or have had `iex` exit and their process die.
However, if we go and look at snowflake:
We still see 2 queries running. Both for over 20 minutes, that were certainly sent from Elixir-land.
This leads me to a couple of theories:
- `:odbc` is not correctly handling process checkout from poolboy and sending the appropriate kill signal

I'm not sure how to go about fixing this, but I wanted to document it while it was fresh and I had screenshots verifying that I am not silly and that I could in fact replicate it after it bit me earlier in the day. I initially thought there was some serious serialization overhead causing the timeout, but once I realized I had ~20 queries queued with only 1 open terminal process, it was pretty clear something is off in the library itself.
The repo name is 🔥 and @zbarnes757 is 🧑🚒
Hi there, I was trying to follow your docs on executing queries with external data via `?` placeholders in DELETE queries, but I'm getting a weird error whose cause I can't identify. Maybe you are aware of it, or is there a limitation on the lib side?
This one works as expected:
iex> JobApi.SnowflakeConnection.execute("DELETE FROM all_events_v3 WHERE (account_id='15094')")
{:updated, 1}
but when using placeholders it returns an error:
iex> JobApi.SnowflakeConnection.execute("DELETE FROM all_events_v3 WHERE (account_id=?)", [Snowflex.string_param("15094")])
{:error, 'No SQL-driver information available.'}
I'm using the latest version.
I am not 100% sure if this is a Snowflex issue or if it's isolated enough that it's just a me-problem.
An illustration:
Let's say I have a query like this:
SELECT
UPPER(product_manifest.PRODUCT_CODE_TYPE) AS PRODUCT_CODE_TYPE,
product_manifest.PRODUCT_CODE,
product_manifest.PRODUCT_CODE AS GTIN,
catalog_product.*
FROM
STUFF catalog_product
INNER JOIN
ITEM product_manifest
ON
catalog_product.OUR_KEY = product_manifest.OUR_KEY
WHERE
UPPER(product_manifest.PRODUCT_CODE_TYPE) = 'GTIN'
QUALIFY
ROW_NUMBER() OVER (PARTITION BY catalog_product.OUR_KEY ORDER BY catalog_product.LOADED_AT DESC) = 1
Let's assume that both `STUFF` and `ITEM` have similar fields, `GTIN` and `PRODUCT_CODE`, that are meant to hold the same value, but `GTIN` is empty in the `STUFF` table. We want to override it while still selecting everything from that table, so we use our `AS` clause.
If we run this in snowflake via a console, we get back a result that looks (roughly) like this:
| PRODUCT_CODE_TYPE | PRODUCT_CODE | GTIN | WHATEVER | GTIN_2 |
|---|---|---|---|---|
| GTIN | 1234 | 1234 | thing | |
| GTIN | 1235 | 1235 | stuff | |
However, when we examine the results coming out of Snowflex, we will see a setup like this:
[
{"product_code_type", 0},
{"product_code", 1},
{"gtin", 2},
{"whatever", 3},
{"gtin", 4}
]
as the value for `bin_headers`. Thus, when we do our reduction, the empty GTIN value from `catalog_product.*` overrides the value of the one we set using our `AS` clause.
Now, if we simply moved the `AS` clause to after `catalog_product.*`, this would work, but it's similarly misleading as a whole and just further obfuscates the underlying issue of receiving the truth from the query. It treats a symptom rather than a cause: it lets us bubble the fix we're using internally in an ETL process from our app up to the library, but it is debatably more deceptive when it comes to figuring out why something broke later down the line (which, these things being computers programmed by people, will for sure happen).
It seems the solution here may be to add an option to prefer a not-null-or-empty value for like keys in query results when deriving this map, but that would probably have surprising behavior if enabled by default.
I'm not 100% sure this is something that should be handled, but I wanted to document it in case someone else runs into a similar issue, or in case we deem it an API worth adding. I don't think making this the default behavior is the answer, and since maps can't have like keys, the path to the best answer seems murky.
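As a sketch of such an option, a reduction over zipped headers and row values that keeps the first non-empty value for duplicate keys might look like this (the shapes and variable names here are illustrative, not the library's internals):

```elixir
# Hypothetical: merge row values into a map, preferring non-empty values
# when the same header name appears more than once.
headers = ["gtin", "whatever", "gtin"]
row = ["1234", "thing", ""]

merged =
  headers
  |> Enum.zip(row)
  |> Enum.reduce(%{}, fn {key, value}, acc ->
    Map.update(acc, key, value, fn existing ->
      if value in [nil, ""], do: existing, else: value
    end)
  end)

# merged => %{"gtin" => "1234", "whatever" => "thing"}
```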
I left a process running overnight, and in the morning, received the following error:
2022-04-15 07:25:11.625 [warning] Unable to execute query: Authentication token has expired. The user must authenticate again. SQLSTATE IS: 08001
(erl_level=warning application=snowflex domain=elixir file=lib/snowflex/worker.ex function=do_param_query/3 line=143 mfa=Snowflex.Worker.do_param_query/3 module=Snowflex.Worker pid=<0.2242.0> )
2022-04-15 07:25:11.625 [error] Error connecting to Snowflake: 'Authentication token has expired. The user must authenticate again. SQLSTATE IS: 08001'
(erl_level=error application=api_services domain=elixir file=lib/api_services/analytic/inventory_forecast/other_fill_quantity.ex function=accumulate_into_map/1 line=47 mfa=Optimizer.ApiServices.Analytic.InventoryForecast.OtherFillQuantity.accumulate_into_map/1 module=Optimizer.ApiServices.Analytic.InventoryForecast.OtherFillQuantity pid=<0.2249.0> )
I am using `authenticator: "SNOWFLAKE_JWT"` and a private key file.
If a query fails due to expired authentication, Snowflex should attempt to automatically renew the authentication token.
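One possible shape for this, sketched outside the library (the module and function names are placeholders, and this assumes the pool re-establishes the connection on the next checkout):

```elixir
defmodule MyApp.SnowflakeRetry do
  # Hypothetical wrapper: retry once when the JWT has expired,
  # assuming a fresh checkout re-authenticates the connection.
  def query(sql, params \\ [], retries \\ 1) do
    case MyApp.SnowflakeConnection.execute(sql, params) do
      {:error, reason} = err ->
        if retries > 0 and auth_expired?(reason) do
          query(sql, params, retries - 1)
        else
          err
        end

      ok ->
        ok
    end
  end

  defp auth_expired?(reason) do
    # inspect/1 handles both charlist and tuple error reasons
    reason |> inspect() |> String.contains?("Authentication token has expired")
  end
end
```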
`:odbc.connect` returns `{:error, Reason}`, but `Reason` isn't guaranteed to be something that is stringable. A very common example is hitting the file/process limit on macOS, which produces `{:emfile, [...]}`. This leads to the case statement raising `(Protocol.UndefinedError) protocol String.Chars not implemented for ...`, which is a bit of a red herring.
Tossing an inspect on the log line here would make it easier to debug the underlying issue, especially the first time you come across it.
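A sketch of the suggested change (`connection_string` and `odbc_opts` are illustrative placeholders):

```elixir
require Logger

case :odbc.connect(connection_string, odbc_opts) do
  {:ok, pid} ->
    {:ok, pid}

  {:error, reason} ->
    # `reason` may be a tuple like {:emfile, [...]} rather than a charlist,
    # so interpolate with inspect/1 instead of relying on String.Chars
    Logger.error("Error connecting to Snowflake: #{inspect(reason)}")
    {:error, reason}
end
```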
Snowflake supports some special values in numeric columns (for example, `NaN`, `inf`, and `-inf` in FLOAT columns).
When the adapter sees one of these values, it explodes, because we end up treating binary data as a numeric type.
Timeouts are important to maintain the health of an application and ensure that certain events happen in a logical period of time. However, holding everything to the same expectation can be dangerous. For example, if an expensive query can take up to 15 minutes to execute, requiring a global timeout of 15 minutes can delay finding issues with the performance of critical path queries that should take no longer than 1 minute.
For situations where a query is expected to exceed the default timeout for the connection, we should allow the execution function to take a timeout different from the default timeout.
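For example, a per-call option along these lines (the `:timeout` option shown here is an assumption about the proposed API, not current behavior):

```elixir
# Critical-path query: fail fast so regressions surface quickly
MyApp.SnowflakeConnection.execute(critical_sql, [], timeout: :timer.minutes(1))

# Known-expensive query: allow a longer window than the pool default
MyApp.SnowflakeConnection.execute(expensive_sql, [], timeout: :timer.minutes(15))
```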
Has there been any consideration of updating this to implement the DBConnection behaviour, as official adapters like Postgrex do?
https://github.com/elixir-ecto/db_connection
Is there a reason NOT to implement the DBConnection behavior?
Would you be open to a pull request with this major refactor?
Hi there, sorry to ask here, but have you been able to run the project on an Apple M1? I have been trying to, but with no success so far; I always get the following errors. I tried the same on an Intel Mac and it worked, so I may be doing something wrong on the M1, and that's why I'm asking. Thank you.
[warning] Unable to connect to snowflake: [unixODBC][Driver Manager]Can't open lib '/opt/snowflake/snowflakeodbc/lib/universal/libSnowflake.dylib' : file not found SQLSTATE IS: 01000 Connection to database failed.
snowflake Driver's SQLAllocHandle on SQL_HANDLE_HENV
# ~/Library/ODBC/odbc.ini
[ODBC Data Sources]
SnowflakeDSII = Snowflake
[SnowflakeDSII]
Server = xxx.snowflakecomputing.com
UID = xxx
Schema = PUBLIC
Warehouse = xxx
Role = xxx
Database = xxx
Driver = /opt/snowflake/snowflakeodbc/lib/universal/libSnowflake.dylib
Description = Snowflake DSII
Locale = en-US
Tracing = 0
# ~/Library/ODBC/odbcinst.ini
[ODBC]
Trace=no
TraceFile=
[ODBC Drivers]
Snowflake = Installed
[Snowflake]
Driver = /opt/snowflake/snowflakeodbc/lib/universal/libSnowflake.dylib
# simba.snowflake.ini
ODBCInstLib=/opt/snowflake/snowflakeodbc/lib/universal/libSnowflake.dylib
# config.exs
config :snowflex, driver: "/opt/snowflake/snowflakeodbc/lib/universal/libSnowflake.dylib"
config :job_api, JobApi.SnowflakeConnection,
connection: [
role: "xxx",
warehouse: "xxx",
uid: "xxx",
pwd: "xxx",
server: "xxx.snowflakecomputing.com",
schema: "PUBLIC",
database: "xxx"
]
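With the configuration above in place, one way to check for an architecture mismatch on Apple Silicon (a common cause of a "file not found" error for an x86_64-only dylib) is:

```shell
# Which architecture is the current shell (and anything it launches) running as?
CUR_ARCH="$(uname -m)"
echo "current arch: ${CUR_ARCH}"

# Inspect the driver binary, if present, to see which architectures it contains
DRIVER=/opt/snowflake/snowflakeodbc/lib/universal/libSnowflake.dylib
if [ -f "$DRIVER" ]; then
  file "$DRIVER"
fi

# On arm64, an x86_64-only driver can still be used by running the whole
# Erlang/Elixir toolchain under Rosetta, e.g.:
#   arch -x86_64 /bin/zsh
#   iex -S mix
```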
Snowflex does not properly account for Ecto passing in an order for distinct queries.
The following code:
MySchema
|> distinct([q], q.my_field)
|> Repo.all()
Produces this compiled Ecto query:
from a0 in MySchema,
distinct: [asc: a0.my_field],
select: a0
Snowflex crashes due to this line.
Snowflake does not natively support DISTINCT ON, which is where the order by becomes important. As such, we should probably just ignore the order values?
WORKAROUND: if you pass in a `select` and only select one field, this will work as expected.
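A sketch of the workaround, using `MySchema` and `my_field` from the example above:

```elixir
import Ecto.Query

# Selecting only the distinct field sidesteps the order values Ecto attaches
MySchema
|> distinct([q], q.my_field)
|> select([q], q.my_field)
|> Repo.all()
```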
The worker could use more unit tests. To accomplish this, we should cover:
- `Worker.sql_query/3`
- `Worker.param_query/4`
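A starting point for those tests might look like this. Everything here is a sketch: the `connection_args` passed to `start_supervised!/1` and the assumption that the worker can run against a stubbed ODBC layer are hypothetical, not the library's current test setup:

```elixir
defmodule Snowflex.WorkerTest do
  use ExUnit.Case, async: true

  # Hypothetical setup: a worker started against a stubbed ODBC layer
  setup do
    worker = start_supervised!({Snowflex.Worker, connection_args: []})
    %{worker: worker}
  end

  test "sql_query/3 returns a result tuple", %{worker: worker} do
    assert {:selected, _cols, _rows} =
             Snowflex.Worker.sql_query(worker, "SELECT 1", 5_000)
  end

  test "param_query/4 binds parameters", %{worker: worker} do
    assert {:selected, _cols, _rows} =
             Snowflex.Worker.param_query(
               worker,
               "SELECT ?",
               [Snowflex.string_param("x")],
               5_000
             )
  end
end
```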