lmc-eu / luft Goto Github PK
View Code? Open in Web Editor NEW๐จ Luft is standard operators replacement for Airflow with declarative DAGs via Yaml file.
License: MIT License
๐จ Luft is standard operators replacement for Airflow with declarative DAGs via Yaml file.
License: MIT License
I am trying to execute bq load with a table that has a timestamp. I have following configuration settings (luft.cfg):
timestamp = timestamp, format: '%%Y-%%m-%%d %%H:%%M:%%S'
And I get this error:
"Could not parse '1997-09-01 00:00:00.000000 +0200' as a timestamp. Required format is YYYY-MM-DD HH:MM[:SS[.SSSSSS]]; Could not parse '1997-09-01 00:00:00.000000 +0200' as datetime for field add_date (position 10) starting at location 1261"
The schema of the table (saved in yml format) is following:
name: tblCompanies
source_system: zmena111
columns:
It seems it adds timezone %%z every time (even though we have deleted it from there).
LUFT_CONFIG=/luft/luftlz.cfg JDBC_CONFIG=/luft/jdbclz.cfg luft jdbc load -y profesialz/web/tblCVRegions.yml -s '0001-01-01'
2019-09-27 08:32:00,928 [INFO] common:__call__ - Embulk cmd: ['java', '-jar', '/opt/embulk/embulk.jar']
2019-09-27 08:32:01,554 [INFO] common:_read_output - 2019-09-27 08:32:01.551 +0000: Embulk v0.9.18
2019-09-27 08:32:02,746 [INFO] common:_read_output - 2019-09-27 08:32:02.743 +0000 [WARN] (main): DEPRECATION: JRuby org.jruby.embed.ScriptingContainer is directly injected.
2019-09-27 08:32:09,051 [INFO] common:_read_output - 2019-09-27 08:32:09.050 +0000 [INFO] (main): Gem's home and path are set by default: "/root/.embulk/lib/gems"
2019-09-27 08:32:15,464 [INFO] common:_read_output - 2019-09-27 08:32:15.463 +0000 [INFO] (main): Started Embulk v0.9.18
2019-09-27 08:32:15,649 [INFO] common:_read_output - 2019-09-27 08:32:15.649 +0000 [INFO] (0001:transaction): Loaded plugin embulk-input-mysql (0.10.0)
2019-09-27 08:32:15,752 [INFO] common:_read_output - 2019-09-27 08:32:15.752 +0000 [INFO] (0001:transaction): Loaded plugin embulk-output-gcs (0.4.4)
2019-09-27 08:32:15,860 [INFO] common:_read_output - 2019-09-27 08:32:15.860 +0000 [INFO] (0001:transaction): JDBC Driver = /root/.embulk/lib/gems/gems/embulk-input-mysql-0.10.0/default_jdbc_driver/mysql-connector-java-5.1.44.jar
2019-09-27 08:32:15,933 [INFO] common:_read_output - 2019-09-27 08:32:15.867 +0000 [INFO] (0001:transaction): Fetch size is 10000. Using server-side prepared statement.
2019-09-27 08:32:15,934 [INFO] common:_read_output - 2019-09-27 08:32:15.933 +0000 [INFO] (0001:transaction): Connecting to jdbc:mysql://10.20.100.11:3306/profesia options {useCompression=true, socketTimeout=1800000, useSSL=false, user=devel, useTimezone=true, useLegacyDatetimeCode=false, tcpKeepAlive=true, serverTimezone=Europe/Prague, useCursorFetch=true, connectTimeout=1800000, password=***, zeroDateTimeBehavior=convertToNull}
2019-09-27 08:32:16,639 [INFO] common:_read_output - 2019-09-27 08:32:16.639 +0000 [INFO] (0001:transaction): Using JDBC Driver mysql-connector-java-5.1.44 ( Revision: b3cda4f864902ffdde495b9df93937c3e20009be )
2019-09-27 08:32:16,639 [INFO] common:_read_output - 2019-09-27 08:32:16.639 +0000 [WARN] (0001:transaction): embulk-input-mysql 0.9.0 upgraded the bundled MySQL Connector/J version from 5.1.34 to 5.1.44 .
2019-09-27 08:32:16,639 [INFO] common:_read_output - 2019-09-27 08:32:16.639 +0000 [WARN] (0001:transaction): And set useLegacyDatetimeCode=false by default in order to get correct datetime value when the server timezone and the client timezone are different.
2019-09-27 08:32:16,640 [INFO] common:_read_output - 2019-09-27 08:32:16.639 +0000 [WARN] (0001:transaction): Set useLegacyDatetimeCode=true if you need to get datetime value same as older embulk-input-mysql.
2019-09-27 08:32:16,841 [INFO] common:_read_output - 2019-09-27 08:32:16.840 +0000 [INFO] (0001:transaction): Using local thread executor with max_threads=2 / tasks=1
2019-09-27 08:32:17,040 [INFO] common:_read_output - 2019-09-27 08:32:17.039 +0000 [INFO] (0001:transaction): {done: 0 / 1, running: 0}
2019-09-27 08:32:18,969 [INFO] common:_read_output - 2019-09-27 08:32:18.969 +0000 [INFO] (0015:task-0000): Fetch size is 10000. Using server-side prepared statement.
2019-09-27 08:32:18,969 [INFO] common:_read_output - 2019-09-27 08:32:18.969 +0000 [INFO] (0015:task-0000): Connecting to jdbc:mysql://10.20.100.11:3306/profesia options {useCompression=true, socketTimeout=1800000, useSSL=false, user=devel, useTimezone=true, useLegacyDatetimeCode=false, tcpKeepAlive=true, serverTimezone=Europe/Prague, useCursorFetch=true, connectTimeout=1800000, password=***, zeroDateTimeBehavior=convertToNull}
2019-09-27 08:32:19,175 [INFO] common:_read_output - 2019-09-27 08:32:19.174 +0000 [INFO] (0015:task-0000): SQL: SELECT CV_ID AS cv_id, REGION_ID AS region_id, DWH_UPDATE_TS AS dwh_update_ts FROM `tblCVRegions` WHERE dwh_update_ts >= '2001-01-01'
same with bq load
LUFT_CONFIG=/luft/luftlz.cfg JDBC_CONFIG=/luft/jdbclz.cfg luft bq load -y profesialz/web/tblCVRegions.yml -s '0001-01-01'
2019-09-27 08:33:27,735 [INFO] common:_dataset_exists - Dataset stage_profesialz already exists.
2019-09-27 08:33:28,098 [INFO] common:_run_bq_command - ######################################################
2019-09-27 08:33:28,098 [INFO] common:_run_bq_command - Starting job luft-246cc035-64c1-4b0b-9b9d-6211b9622a82
2019-09-27 08:33:28,098 [INFO] common:_run_bq_command - ------------------------------------------------------
2019-09-27 08:33:28,099 [INFO] common:_run_bq_command - -- Create stage table
2019-09-27 08:33:28,099 [INFO] common:_run_bq_command - CREATE OR REPLACE TABLE stage_profesialz.tblCVRegions (
2019-09-27 08:33:28,099 [INFO] common:_run_bq_command - -------------------------------- PKs -------------------------------------
2019-09-27 08:33:28,099 [INFO] common:_run_bq_command -
2019-09-27 08:33:28,099 [INFO] common:_run_bq_command - ------------------------------ Columns -----------------------------------
2019-09-27 08:33:28,099 [INFO] common:_run_bq_command - cv_id NUMERIC,
2019-09-27 08:33:28,099 [INFO] common:_run_bq_command - region_id NUMERIC,
2019-09-27 08:33:28,099 [INFO] common:_run_bq_command - dwh_update_ts TIMESTAMP
2019-09-27 08:33:28,099 [INFO] common:_run_bq_command - --------------------------------------------------------------------------
2019-09-27 08:33:28,099 [INFO] common:_run_bq_command - )
2019-09-27 08:33:28,380 [INFO] common:_run_bq_command - ------------------------------------------------------
2019-09-27 08:33:28,380 [INFO] common:_run_bq_command - Job luft-246cc035-64c1-4b0b-9b9d-6211b9622a82 finished.
2019-09-27 08:33:28,380 [INFO] common:_run_bq_command - DONE. It took 0.106 sec.
2019-09-27 08:33:28,380 [INFO] common:_run_bq_command - ######################################################
2019-09-27 08:33:28,380 [INFO] common:load_csv - Loading CSV data from `gs://nada-test-profesia-data/DEV/profesialz/web/tblCVRegions/2001-01-01/000000*`.
Today morning i was trying to load data from yesterday jdbc load
$LUFT_CONFIG=/luft/luft2.cfg JDBC_CONFIG=/luft/jdbc.cfg luft bq load -y profesia3 --start-date 2019-09-11
2019-09-13 05:31:31,025 [INFO]
common:_dataset_exists - Dataset stage_tasks already exists.
2019-09-13 05:31:31,230 [INFO] common:_run_bq_command - ######################################################
...
2019-09-13 05:31:31,741 [ERROR] common:load_csv - [{'reason': 'notFound', 'message': 'Not found: Uris gs://nada-test-profesia-data/DEV/tasks/profesia3/tblZip/2019-09-12/000000*'}]
Changed:
$LUFT_CONFIG=/luft/luft2.cfg JDBC_CONFIG=/luft/jdbc.cfg luft bq load -y profesia3 --start-date 2019-09-12
2019-09-13 05:31:48,326 [INFO] common:_dataset_exists - Dataset stage_tasks already exists.
2019-09-13 05:31:48,528 [INFO] common:_run_bq_command - ######################################################
....
2019-09-13 05:31:49,306 [ERROR] common:load_csv - [{'reason': 'notFound', 'message': 'Not found: Uris gs://nada-test-profesia-data/DEV/tasks/profesia3/tblZip/2019-09-12/000000*'}]
It seems it tries to hit same file each time
Table has only 2 columns , both are primary keys
2019-09-20 12:57:26,325 [INFO] common:_run_bq_command - ------------------------------------------------------
2019-09-20 12:57:26,325 [INFO] common:_run_bq_command - -- Create stage table
2019-09-20 12:57:26,325 [INFO] common:_run_bq_command - CREATE OR REPLACE TABLE stage_profesialz.tblAdminDepartment (
2019-09-20 12:57:26,325 [INFO] common:_run_bq_command - -------------------------------- PKs -------------------------------------
2019-09-20 12:57:26,325 [INFO] common:_run_bq_command - admin_id NUMERIC,
2019-09-20 12:57:26,325 [INFO] common:_run_bq_command - department_id NUMERIC,
2019-09-20 12:57:26,325 [INFO] common:_run_bq_command - ------------------------------ Columns -----------------------------------
2019-09-20 12:57:26,325 [INFO] common:_run_bq_command -
2019-09-20 12:57:26,325 [INFO] common:_run_bq_command - --------------------------------------------------------------------------
2019-09-20 12:57:26,325 [INFO] common:_run_bq_command - )
Traceback (most recent call last):
File "/usr/local/bin/luft", line 11, in <module>
load_entry_point('luft', 'console_scripts', 'luft')()
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/click/decorators.py", line 17, in new_func
return f(get_current_context(), *args, **kwargs)
File "/work/cli/luft.py", line 150, in load
_loop_tasks(task_list)
File "/work/cli/luft.py", line 64, in _loop_tasks
task(ts=date_valid, **kwargs)
File "/work/luft/tasks/bq_load_task.py", line 81, in __call__
env_vars)
File "/work/luft/tasks/bq_exec_task.py", line 167, in _run_bq_command
query_job.result()
File "/usr/local/lib/python3.6/dist-packages/google/cloud/bigquery/job.py", line 2908, in result
super(QueryJob, self).result(timeout=timeout)
File "/usr/local/lib/python3.6/dist-packages/google/cloud/bigquery/job.py", line 733, in result
return super(_AsyncJob, self).result(timeout=timeout)
File "/usr/local/lib/python3.6/dist-packages/google/api_core/future/polling.py", line 127, in result
raise self._exception
google.api_core.exceptions.BadRequest: 400 Syntax error: Unexpected ")" at [9:1]
(job ID: luft-b0eb7eb7-4a36-45b5-963f-d267af2a087e)
CREATE OR REPLACE TABLE stage_profesialz.tblAdminDepartment (
admin_id NUMERIC,
department_id NUMERIC
)
instead of:
CREATE OR REPLACE TABLE stage_profesialz.tblAdminDepartment (
admin_id NUMERIC,
department_id NUMERIC,
)
When I try to execute jdbc load task for a specific single yaml file (instead of folder with yaml files), I get this error:
UnboundLocalError: local variable 'yml_file' referenced before assignment
It can be found in this part of the code:
/work/luft/common/task_list.py
https://github.com/lmc-eu/luft/blob/master/luft/common/task_list.py#L105
We have table with primary key on 2 columns (composite key)
CREATE TABLE `tblAdmin` (
`admin_id` INT(11) NOT NULL AUTO_INCREMENT,
`channel_id` MEDIUMINT(9) NOT NULL DEFAULT '1',
`firstname` VARCHAR(50) NOT NULL DEFAULT '' COLLATE 'utf8_slovak_ci',
PRIMARY KEY (`admin_id`, `channel_id`)
)
in tblAdmin.yml we have for this 2 columns added pk: true, Without defining pk it fails too... :
Output:
2019-09-16 10:13:41,223 [INFO] common:_dataset_exists - Dataset stage_tasks already exists.
2019-09-16 10:13:41,460 [INFO] common:_run_bq_command - ######################################################
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - Starting job luft-ce23ee38-528f-4e60-9972-949347e6a93c
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - ------------------------------------------------------
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - -- Create stage table
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - CREATE OR REPLACE TABLE stage_tasks.tblAdmin (
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - -------------------------------- PKs -------------------------------------
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - admin_id NUMERIC,
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - channel_id NUMERIC,
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - ------------------------------ Columns -----------------------------------
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - firstname STRING,
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - surname STRING,
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - e_mail STRING,
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - signature STRING,
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - title_name STRING,
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - phone STRING,
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - celular STRING,
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - celular_private STRING,
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - fax STRING,
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - post_address STRING,
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - msn STRING,
2019-09-16 10:13:41,461 [INFO] common:_run_bq_command - skype STRING,
2019-09-16 10:13:41,462 [INFO] common:_run_bq_command - holiday STRING,
2019-09-16 10:13:41,462 [INFO] common:_run_bq_command - position_title STRING,
2019-09-16 10:13:41,462 [INFO] common:_run_bq_command - responsibilities STRING,
2019-09-16 10:13:41,462 [INFO] common:_run_bq_command - photo STRING,
2019-09-16 10:13:41,462 [INFO] common:_run_bq_command - birth_date DATE,
2019-09-16 10:13:41,462 [INFO] common:_run_bq_command - reporter_send_email NUMERIC,
2019-09-16 10:13:41,462 [INFO] common:_run_bq_command - maternity_leave NUMERIC,
2019-09-16 10:13:41,462 [INFO] common:_run_bq_command - noticeboard_visit DATETIME
2019-09-16 10:13:41,462 [INFO] common:_run_bq_command - --------------------------------------------------------------------------
2019-09-16 10:13:41,462 [INFO] common:_run_bq_command - )
2019-09-16 10:13:41,925 [INFO] common:_run_bq_command - ------------------------------------------------------
2019-09-16 10:13:41,925 [INFO] common:_run_bq_command - Job luft-ce23ee38-528f-4e60-9972-949347e6a93c finished.
2019-09-16 10:13:41,925 [INFO] common:_run_bq_command - DONE. It took 0.143 sec.
2019-09-16 10:13:41,926 [INFO] common:_run_bq_command - ######################################################
2019-09-16 10:13:41,926 [INFO] common:load_csv - Loading CSV data from `gs://nada-test-profesia-data/DEV/tasks/profesia4/tblAdmin/2019-09-15/000000*`.
2019-09-16 10:13:44,533 [INFO] common:load_csv - Loaded 686 rows into tblAdmin.
2019-09-16 10:13:44,825 [INFO] common:_dataset_exists - Dataset tasks already exists.
2019-09-16 10:13:45,010 [INFO] common:_run_bq_command - ######################################################
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - Starting job luft-2c8be889-7903-4397-bea7-e4383dc490bb
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - ------------------------------------------------------
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - -- Create history table
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - CREATE TABLE IF NOT EXISTS tasks.tblAdmin (
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - -------------------------------- PKs -------------------------------------
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - admin_id NUMERIC,
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - channel_id NUMERIC,
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - ------------------------------- Tech -------------------------------------
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - dw_load_date TIMESTAMP NOT NULL,
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - dw_valid_from TIMESTAMP NOT NULL,
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - dw_valid_to TIMESTAMP NOT NULL,
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - dw_current_flag BOOLEAN NOT NULL,
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - dw_gdpr_flag STRING NOT NULL,
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - dw_source STRING NOT NULL,
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - dw_hash_diff NUMERIC NOT NULL,
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - ------------------------------ Columns -----------------------------------
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - firstname STRING,
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - surname STRING,
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - e_mail STRING,
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - signature STRING,
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - title_name STRING,
2019-09-16 10:13:45,011 [INFO] common:_run_bq_command - phone STRING,
2019-09-16 10:13:45,012 [INFO] common:_run_bq_command - celular STRING,
2019-09-16 10:13:45,012 [INFO] common:_run_bq_command - celular_private STRING,
2019-09-16 10:13:45,012 [INFO] common:_run_bq_command - fax STRING,
2019-09-16 10:13:45,012 [INFO] common:_run_bq_command - post_address STRING,
2019-09-16 10:13:45,012 [INFO] common:_run_bq_command - msn STRING,
2019-09-16 10:13:45,012 [INFO] common:_run_bq_command - skype STRING,
2019-09-16 10:13:45,012 [INFO] common:_run_bq_command - holiday STRING,
2019-09-16 10:13:45,012 [INFO] common:_run_bq_command - position_title STRING,
2019-09-16 10:13:45,012 [INFO] common:_run_bq_command - responsibilities STRING,
2019-09-16 10:13:45,012 [INFO] common:_run_bq_command - photo STRING,
2019-09-16 10:13:45,012 [INFO] common:_run_bq_command - birth_date DATE,
2019-09-16 10:13:45,012 [INFO] common:_run_bq_command - reporter_send_email NUMERIC,
2019-09-16 10:13:45,012 [INFO] common:_run_bq_command - maternity_leave NUMERIC,
2019-09-16 10:13:45,012 [INFO] common:_run_bq_command - noticeboard_visit DATETIME
2019-09-16 10:13:45,012 [INFO] common:_run_bq_command - --------------------------------------------------------------------------
2019-09-16 10:13:45,012 [INFO] common:_run_bq_command - )
2019-09-16 10:13:45,211 [INFO] common:_run_bq_command - ------------------------------------------------------
2019-09-16 10:13:45,212 [INFO] common:_run_bq_command - Job luft-2c8be889-7903-4397-bea7-e4383dc490bb finished.
2019-09-16 10:13:45,212 [INFO] common:_run_bq_command - DONE. It took 0.076 sec.
2019-09-16 10:13:45,212 [INFO] common:_run_bq_command - ######################################################
2019-09-16 10:13:45,472 [INFO] common:_run_bq_command - ######################################################
2019-09-16 10:13:45,473 [INFO] common:_run_bq_command - Starting job luft-00ac8ee3-7efe-44c3-874f-2c4407cde9fd
2019-09-16 10:13:45,473 [INFO] common:_run_bq_command - ------------------------------------------------------
2019-09-16 10:13:45,473 [INFO] common:_run_bq_command - -- When this script runs multiple times a day we need to delete this day
2019-09-16 10:13:45,473 [INFO] common:_run_bq_command - DELETE FROM tasks.tblAdmin WHERE dw_valid_from >= timestamp ('2019-09-15')
2019-09-16 10:13:47,010 [INFO] common:_run_bq_command - ------------------------------------------------------
2019-09-16 10:13:47,010 [INFO] common:_run_bq_command - Job luft-00ac8ee3-7efe-44c3-874f-2c4407cde9fd finished.
2019-09-16 10:13:47,010 [INFO] common:_run_bq_command - DONE. It took 1.149 sec.
2019-09-16 10:13:47,010 [INFO] common:_run_bq_command - ######################################################
2019-09-16 10:13:47,237 [INFO] common:_run_bq_command - ######################################################
2019-09-16 10:13:47,238 [INFO] common:_run_bq_command - Starting job luft-c58f12e3-65c8-482e-b3ec-ba8efc8ae741
2019-09-16 10:13:47,238 [INFO] common:_run_bq_command - ------------------------------------------------------
2019-09-16 10:13:47,238 [INFO] common:_run_bq_command - -- We need to set current flag to False otherwise we will get multiple trues for one PK
2019-09-16 10:13:47,238 [INFO] common:_run_bq_command - UPDATE tasks.tblAdmin SET dw_current_flag = False WHERE dw_current_flag = True
2019-09-16 10:13:48,766 [INFO] common:_run_bq_command - ------------------------------------------------------
2019-09-16 10:13:48,766 [INFO] common:_run_bq_command - Job luft-c58f12e3-65c8-482e-b3ec-ba8efc8ae741 finished.
2019-09-16 10:13:48,766 [INFO] common:_run_bq_command - DONE. It took 1.178 sec.
2019-09-16 10:13:48,766 [INFO] common:_run_bq_command - ######################################################
2019-09-16 10:13:48,998 [INFO] common:_run_bq_command - ######################################################
2019-09-16 10:13:48,998 [INFO] common:_run_bq_command - Starting job luft-f15d4ce4-ae20-401b-b45b-6f83f87b8f1e
2019-09-16 10:13:48,998 [INFO] common:_run_bq_command - ------------------------------------------------------
2019-09-16 10:13:48,998 [INFO] common:_run_bq_command - -- Merge new data into historic table
2019-09-16 10:13:48,998 [INFO] common:_run_bq_command - MERGE INTO tasks.tblAdmin t
2019-09-16 10:13:48,998 [INFO] common:_run_bq_command - USING (
2019-09-16 10:13:48,998 [INFO] common:_run_bq_command - SELECT
2019-09-16 10:13:48,998 [INFO] common:_run_bq_command - -- make sure to add column after Not empty PK
2019-09-16 10:13:48,998 [INFO] common:_run_bq_command - admin_id,
2019-09-16 10:13:48,998 [INFO] common:_run_bq_command - channel_id,
2019-09-16 10:13:48,998 [INFO] common:_run_bq_command - -- DW columns
2019-09-16 10:13:48,999 [INFO] common:_run_bq_command - current_timestamp AS dw_load_date,
2019-09-16 10:13:48,999 [INFO] common:_run_bq_command - -- New row has alwas same DW_VALID_FROM and DW_VALID_TO
2019-09-16 10:13:48,999 [INFO] common:_run_bq_command - timestamp('2019-09-15') AS dw_valid_from,
2019-09-16 10:13:48,999 [INFO] common:_run_bq_command - timestamp('2019-09-15') AS dw_valid_to,
2019-09-16 10:13:48,999 [INFO] common:_run_bq_command - -- New row is always current
2019-09-16 10:13:48,999 [INFO] common:_run_bq_command - True AS dw_current_flag,
2019-09-16 10:13:48,999 [INFO] common:_run_bq_command - -- Not implemented
2019-09-16 10:13:48,999 [INFO] common:_run_bq_command - 'N' AS dw_gdpr_flag,
2019-09-16 10:13:48,999 [INFO] common:_run_bq_command - 'tasks.profesia4.tblAdmin' AS dw_source,
2019-09-16 10:13:48,999 [INFO] common:_run_bq_command - -- For finding changes
2019-09-16 10:13:48,999 [INFO] common:_run_bq_command - FARM_FINGERPRINT(CONCAT(IFNULL(CAST(admin_id AS STRING), ''), IFNULL(CAST(channel_id AS STRING), ''), IFNULL(CAST(firstname AS STRING), ''), IFNULL(CAST(surname AS STRING), ''), IFNULL(CAST(e_mail AS STRING), ''), IFNULL(CAST(signature AS STRING), ''), IFNULL(CAST(title_name AS STRING), ''), IFNULL(CAST(phone AS STRING), ''), IFNULL(CAST(celular AS STRING), ''), IFNULL(CAST(celular_private AS STRING), ''), IFNULL(CAST(fax AS STRING), ''), IFNULL(CAST(post_address AS STRING), ''), IFNULL(CAST(msn AS STRING), ''), IFNULL(CAST(skype AS STRING), ''), IFNULL(CAST(holiday AS STRING), ''), IFNULL(CAST(position_title AS STRING), ''), IFNULL(CAST(responsibilities AS STRING), ''), IFNULL(CAST(photo AS STRING), ''), IFNULL(CAST(birth_date AS STRING), ''), IFNULL(CAST(reporter_send_email AS STRING), ''), IFNULL(CAST(maternity_leave AS STRING), ''), IFNULL(CAST(noticeboard_visit AS STRING), ''))) AS dw_hash_diff
2019-09-16 10:13:48,999 [INFO] common:_run_bq_command - ,firstname,
2019-09-16 10:13:48,999 [INFO] common:_run_bq_command - surname,
2019-09-16 10:13:48,999 [INFO] common:_run_bq_command - e_mail,
2019-09-16 10:13:48,999 [INFO] common:_run_bq_command - signature,
2019-09-16 10:13:48,999 [INFO] common:_run_bq_command - title_name,
2019-09-16 10:13:48,999 [INFO] common:_run_bq_command - phone,
2019-09-16 10:13:49,000 [INFO] common:_run_bq_command - celular,
2019-09-16 10:13:49,000 [INFO] common:_run_bq_command - celular_private,
2019-09-16 10:13:49,000 [INFO] common:_run_bq_command - fax,
2019-09-16 10:13:49,000 [INFO] common:_run_bq_command - post_address,
2019-09-16 10:13:49,000 [INFO] common:_run_bq_command - msn,
2019-09-16 10:13:49,000 [INFO] common:_run_bq_command - skype,
2019-09-16 10:13:49,000 [INFO] common:_run_bq_command - holiday,
2019-09-16 10:13:49,000 [INFO] common:_run_bq_command - position_title,
2019-09-16 10:13:49,000 [INFO] common:_run_bq_command - responsibilities,
2019-09-16 10:13:49,000 [INFO] common:_run_bq_command - photo,
2019-09-16 10:13:49,000 [INFO] common:_run_bq_command - birth_date,
2019-09-16 10:13:49,000 [INFO] common:_run_bq_command - reporter_send_email,
2019-09-16 10:13:49,000 [INFO] common:_run_bq_command - maternity_leave,
2019-09-16 10:13:49,000 [INFO] common:_run_bq_command - noticeboard_visit
2019-09-16 10:13:49,000 [INFO] common:_run_bq_command - FROM (
2019-09-16 10:13:49,000 [INFO] common:_run_bq_command - -- We need unique columns
2019-09-16 10:13:49,001 [INFO] common:_run_bq_command - SELECT DISTINCT
2019-09-16 10:13:49,001 [INFO] common:_run_bq_command - admin_id,
2019-09-16 10:13:49,001 [INFO] common:_run_bq_command - channel_id,
2019-09-16 10:13:49,001 [INFO] common:_run_bq_command - firstname,
2019-09-16 10:13:49,001 [INFO] common:_run_bq_command - surname,
2019-09-16 10:13:49,001 [INFO] common:_run_bq_command - e_mail,
2019-09-16 10:13:49,001 [INFO] common:_run_bq_command - signature,
2019-09-16 10:13:49,001 [INFO] common:_run_bq_command - title_name,
2019-09-16 10:13:49,001 [INFO] common:_run_bq_command - phone,
2019-09-16 10:13:49,001 [INFO] common:_run_bq_command - celular,
2019-09-16 10:13:49,001 [INFO] common:_run_bq_command - celular_private,
2019-09-16 10:13:49,001 [INFO] common:_run_bq_command - fax,
2019-09-16 10:13:49,001 [INFO] common:_run_bq_command - post_address,
2019-09-16 10:13:49,001 [INFO] common:_run_bq_command - msn,
2019-09-16 10:13:49,001 [INFO] common:_run_bq_command - skype,
2019-09-16 10:13:49,001 [INFO] common:_run_bq_command - holiday,
2019-09-16 10:13:49,002 [INFO] common:_run_bq_command - position_title,
2019-09-16 10:13:49,002 [INFO] common:_run_bq_command - responsibilities,
2019-09-16 10:13:49,002 [INFO] common:_run_bq_command - photo,
2019-09-16 10:13:49,002 [INFO] common:_run_bq_command - birth_date,
2019-09-16 10:13:49,002 [INFO] common:_run_bq_command - reporter_send_email,
2019-09-16 10:13:49,002 [INFO] common:_run_bq_command - maternity_leave,
2019-09-16 10:13:49,002 [INFO] common:_run_bq_command - noticeboard_visit
2019-09-16 10:13:49,002 [INFO] common:_run_bq_command - FROM stage_tasks.tblAdmin
2019-09-16 10:13:49,002 [INFO] common:_run_bq_command - ) i
2019-09-16 10:13:49,002 [INFO] common:_run_bq_command - ) s
2019-09-16 10:13:49,002 [INFO] common:_run_bq_command - -- first we need same PKs
2019-09-16 10:13:49,002 [INFO] common:_run_bq_command - ON (s.admin_id = t.admin_id
2019-09-16 10:13:49,002 [INFO] common:_run_bq_command - AND s.channel_id = t.channel_id AND
2019-09-16 10:13:49,002 [INFO] common:_run_bq_command - -- and same hashes because we want to compare only same rows
2019-09-16 10:13:49,002 [INFO] common:_run_bq_command - t.DW_HASH_DIFF = s.DW_HASH_DIFF
2019-09-16 10:13:49,002 [INFO] common:_run_bq_command - -- and also need compare only last instance (day) with new data
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - AND t.DW_VALID_TO = TIMESTAMP_SUB(s.DW_VALID_TO, INTERVAL 1 DAY))
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - -- if new data is not same as last increment then insert data
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - WHEN NOT MATCHED THEN
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - INSERT (
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - admin_id,
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - channel_id,
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - DW_LOAD_DATE,
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - DW_VALID_FROM,
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - DW_VALID_TO,
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - DW_CURRENT_FLAG,
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - DW_GDPR_FLAG,
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - DW_SOURCE,
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - DW_HASH_DIFF
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - ,firstname,
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - surname,
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - e_mail,
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - signature,
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - title_name,
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - phone,
2019-09-16 10:13:49,003 [INFO] common:_run_bq_command - celular,
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - celular_private,
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - fax,
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - post_address,
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - msn,
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - skype,
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - holiday,
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - position_title,
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - responsibilities,
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - photo,
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - birth_date,
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - reporter_send_email,
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - maternity_leave,
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - noticeboard_visit
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - )
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - VALUES (
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - admin_id,
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - channel_id,
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - DW_LOAD_DATE,
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - DW_VALID_FROM,
2019-09-16 10:13:49,004 [INFO] common:_run_bq_command - DW_VALID_TO,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - DW_CURRENT_FLAG,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - DW_GDPR_FLAG,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - DW_SOURCE,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - DW_HASH_DIFF
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - ,firstname,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - surname,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - e_mail,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - signature,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - title_name,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - phone,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - celular,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - celular_private,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - fax,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - post_address,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - msn,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - skype,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - holiday,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - position_title,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - responsibilities,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - photo,
2019-09-16 10:13:49,005 [INFO] common:_run_bq_command - birth_date,
2019-09-16 10:13:49,006 [INFO] common:_run_bq_command - reporter_send_email,
2019-09-16 10:13:49,006 [INFO] common:_run_bq_command - maternity_leave,
2019-09-16 10:13:49,006 [INFO] common:_run_bq_command - noticeboard_visit
2019-09-16 10:13:49,006 [INFO] common:_run_bq_command - )
2019-09-16 10:13:49,006 [INFO] common:_run_bq_command - -- else update DW_VALID_TO to last
2019-09-16 10:13:49,006 [INFO] common:_run_bq_command - WHEN MATCHED THEN
2019-09-16 10:13:49,006 [INFO] common:_run_bq_command - UPDATE SET t.DW_VALID_TO = s.DW_VALID_TO,
2019-09-16 10:13:49,006 [INFO] common:_run_bq_command - t.DW_CURRENT_FLAG = True
Traceback (most recent call last):
File "/usr/local/bin/luft", line 11, in <module>
load_entry_point('luft', 'console_scripts', 'luft')()
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/click/decorators.py", line 17, in new_func
return f(get_current_context(), *args, **kwargs)
File "/work/cli/luft.py", line 150, in load
_loop_tasks(task_list)
File "/work/cli/luft.py", line 64, in _loop_tasks
task(ts=date_valid, **kwargs)
File "/work/luft/tasks/bq_load_task.py", line 85, in __call__
env_vars)
File "/work/luft/tasks/bq_exec_task.py", line 167, in _run_bq_command
query_job.result()
File "/usr/local/lib/python3.6/dist-packages/google/cloud/bigquery/job.py", line 2908, in result
super(QueryJob, self).result(timeout=timeout)
File "/usr/local/lib/python3.6/dist-packages/google/cloud/bigquery/job.py", line 733, in result
return super(_AsyncJob, self).result(timeout=timeout)
File "/usr/local/lib/python3.6/dist-packages/google/api_core/future/polling.py", line 127, in result
raise self._exception
google.api_core.exceptions.BadRequest: 400 Name channel_id not found inside t at [70:22]
(job ID: luft-f15d4ce4-ae20-401b-b45b-6f83f87b8f1e)
-----Query Job SQL Follows-----
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
1:-- Merge new data into historic table
2:MERGE INTO tasks.tblAdmin t
3: USING (
4: SELECT
5: -- make sure to add column after Not empty PK
6: admin_id,
7:channel_id,
8: -- DW columns
9: current_timestamp AS dw_load_date,
10: -- New row has alwas same DW_VALID_FROM and DW_VALID_TO
11: timestamp('2019-09-15') AS dw_valid_from,
12: timestamp('2019-09-15') AS dw_valid_to,
13: -- New row is always current
14: True AS dw_current_flag,
15: -- Not implemented
16: 'N' AS dw_gdpr_flag,
17: 'tasks.profesia4.tblAdmin' AS dw_source,
18: -- For finding changes
19: FARM_FINGERPRINT(CONCAT(IFNULL(CAST(admin_id AS STRING), ''), IFNULL(CAST(channel_id AS STRING), ''), IFNULL(CAST(firstname AS STRING), ''), IFNULL(CAST(surname AS STRING), ''), IFNULL(CAST(e_mail AS STRING), ''), IFNULL(CAST(signature AS STRING), ''), IFNULL(CAST(title_name AS STRING), ''), IFNULL(CAST(phone AS STRING), ''), IFNULL(CAST(celular AS STRING), ''), IFNULL(CAST(celular_private AS STRING), ''), IFNULL(CAST(fax AS STRING), ''), IFNULL(CAST(post_address AS STRING), ''), IFNULL(CAST(msn AS STRING), ''), IFNULL(CAST(skype AS STRING), ''), IFNULL(CAST(holiday AS STRING), ''), IFNULL(CAST(position_title AS STRING), ''), IFNULL(CAST(responsibilities AS STRING), ''), IFNULL(CAST(photo AS STRING), ''), IFNULL(CAST(birth_date AS STRING), ''), IFNULL(CAST(reporter_send_email AS STRING), ''), IFNULL(CAST(maternity_leave AS STRING), ''), IFNULL(CAST(noticeboard_visit AS STRING), ''))) AS dw_hash_diff
20: ,firstname,
21:surname,
22:e_mail,
23:signature,
24:title_name,
25:phone,
26:celular,
27:celular_private,
28:fax,
29:post_address,
30:msn,
31:skype,
32:holiday,
33:position_title,
34:responsibilities,
35:photo,
36:birth_date,
37:reporter_send_email,
38:maternity_leave,
39:noticeboard_visit
40: FROM (
41: -- We need unique columns
42: SELECT DISTINCT
43: admin_id,
44:channel_id,
45: firstname,
46:surname,
47:e_mail,
48:signature,
49:title_name,
50:phone,
51:celular,
52:celular_private,
53:fax,
54:post_address,
55:msn,
56:skype,
57:holiday,
58:position_title,
59:responsibilities,
60:photo,
61:birth_date,
62:reporter_send_email,
63:maternity_leave,
64:noticeboard_visit
65: FROM stage_tasks.tblAdmin
66: ) i
67: ) s
68: -- first we need same PKs
69: ON (s.admin_id = t.admin_id
70:AND s.channel_id = t.channel_id AND
71: -- and same hashes because we want to compare only same rows
72: t.DW_HASH_DIFF = s.DW_HASH_DIFF
73: -- and also need compare only last instance (day) with new data
74: AND t.DW_VALID_TO = TIMESTAMP_SUB(s.DW_VALID_TO, INTERVAL 1 DAY))
75: -- if new data is not same as last increment then insert data
76: WHEN NOT MATCHED THEN
77: INSERT (
78: admin_id,
79:channel_id,
80: DW_LOAD_DATE,
81: DW_VALID_FROM,
82: DW_VALID_TO,
83: DW_CURRENT_FLAG,
84: DW_GDPR_FLAG,
85: DW_SOURCE,
86: DW_HASH_DIFF
87: ,firstname,
88:surname,
89:e_mail,
90:signature,
91:title_name,
92:phone,
93:celular,
94:celular_private,
95:fax,
96:post_address,
97:msn,
98:skype,
99:holiday,
100:position_title,
101:responsibilities,
102:photo,
103:birth_date,
104:reporter_send_email,
105:maternity_leave,
106:noticeboard_visit
107: )
108: VALUES (
109: admin_id,
110:channel_id,
111: DW_LOAD_DATE,
112: DW_VALID_FROM,
113: DW_VALID_TO,
114: DW_CURRENT_FLAG,
115: DW_GDPR_FLAG,
116: DW_SOURCE,
117: DW_HASH_DIFF
118: ,firstname,
119:surname,
120:e_mail,
121:signature,
122:title_name,
123:phone,
124:celular,
125:celular_private,
126:fax,
127:post_address,
128:msn,
129:skype,
130:holiday,
131:position_title,
132:responsibilities,
133:photo,
134:birth_date,
135:reporter_send_email,
136:maternity_leave,
137:noticeboard_visit
138: )
139: -- else update DW_VALID_TO to last
140: WHEN MATCHED THEN
141: UPDATE SET t.DW_VALID_TO = s.DW_VALID_TO,
142: t.DW_CURRENT_FLAG = True
| . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
[ERROR] common:load_csv - 400 Error while reading data, error message: CSV table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the errors[] collection for more details.
2019-09-04 13:00:55,072 [ERROR] common:load_csv - [{'reason': 'invalid', 'message': 'Error while reading data, error message: CSV table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the errors[] collection for more details.'}, {'reason': 'invalid', 'message': 'Error while reading data, error message: CSV table references column position 72, but line starting at position:1261 contains only 6 columns.'}]
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.