treasure-data / digdag

Workload Automation System
Home Page: https://www.digdag.io/
License: Apache License 2.0
`INSERT INTO OVERWRITE xxxx AS (query)` is not supported yet in TD.
$ digdag run -d
...
2016-03-11 21:27:04 +0900 [ERROR] (0021@+main+tfidf): Task failed
java.lang.RuntimeException: Failed to process task config templates
at io.digdag.core.agent.OperatorManager.runWithArchive(OperatorManager.java:156)
at io.digdag.core.agent.OperatorManager.lambda$runWithHeartbeat$1(OperatorManager.java:128)
at io.digdag.core.agent.CurrentDirectoryArchiveManager.withExtractedArchive(CurrentDirectoryArchiveManager.java:20)
at io.digdag.core.agent.OperatorManager.runWithHeartbeat(OperatorManager.java:127)
at io.digdag.core.agent.OperatorManager.run(OperatorManager.java:106)
at io.digdag.cli.Run$OperatorManagerWithSkip.run(Run.java:567)
at io.digdag.core.agent.LocalAgent.lambda$run$0(LocalAgent.java:61)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: io.digdag.spi.TemplateException: Failed to evaluate JavaScript code: ${td.last_results.job_count}
Queries with a magic comment like this:

```sql
-- @TD reducers: 4
SELECT ...
```

will be rewritten as:

```sql
INSERT OVERWRITE TABLE ...
-- @TD reducers: 4
SELECT ...
```

The magic comment then becomes ineffective in Hive, since it is no longer at the top of the query. This can also be problematic if a customer embeds a meaningful message in the first comment line.

A possible solution would be to skip the leading comment lines and then embed the INSERT OVERWRITE statement, as sketched below.
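A minimal sketch of that helper (free-standing, not the actual digdag implementation): leading `--` comment lines are copied through first, then the INSERT statement, then the rest of the query.

```java
public class InsertPlacement
{
    // Keep leading "-- ..." comment lines (e.g. Hive magic comments) in
    // place and embed the INSERT OVERWRITE header after them.
    static String insertAfterLeadingComments(String insertStatement, String query)
    {
        String[] lines = query.split("\n", -1);
        StringBuilder sb = new StringBuilder();
        int i = 0;
        // copy leading comment lines (and blank lines) unchanged
        while (i < lines.length
                && (lines[i].trim().isEmpty() || lines[i].trim().startsWith("--"))) {
            sb.append(lines[i]).append("\n");
            i++;
        }
        sb.append(insertStatement).append("\n");  // e.g. "INSERT OVERWRITE TABLE t"
        // then the remainder of the original query
        for (; i < lines.length; i++) {
            sb.append(lines[i]);
            if (i < lines.length - 1) {
                sb.append("\n");
            }
        }
        return sb.toString();
    }
}
```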
Right now, a user of digdag could be confused by creating & pushing a workflow to the server that is no longer in spec. The server mode should check whether the client and server versions match and, when they don't, do the following:
We would like to avoid embedding the td-api-key in the digdag yml file.

How about making the apikey parameter optional for the td executor? Even if an API key is not given, td-client-java can read the TD_API_KEY environment variable or the ~/.td/td.conf file.
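A rough sketch of that resolution order (`resolveApikey` is a hypothetical helper, not digdag's or td-client-java's API):

```java
import java.util.Map;
import java.util.Optional;

public class ApikeyResolution
{
    // Explicit "apikey" parameter wins; otherwise fall back to the
    // TD_API_KEY environment variable. Reading ~/.td/td.conf can stay
    // inside td-client-java itself.
    static Optional<String> resolveApikey(Map<String, String> taskParams)
    {
        String explicit = taskParams.get("apikey");
        if (explicit != null && !explicit.isEmpty()) {
            return Optional.of(explicit);
        }
        return Optional.ofNullable(System.getenv("TD_API_KEY"));
    }
}
```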
There are three singular REST APIs:

```
GET /api/project?name=<name>
GET /api/projects/{id}/workflow?name=<name>[&revision=<name>]
GET /api/workflow?project=<name>&name=<name>[&revision=<name>]
```

Replace them by adding the following REST APIs:

```
GET /api/projects?name=<name>
GET /api/projects/{id}/workflows?name=<name>[&revision=<name>]
```

(`/api/workflow` goes away; use `/api/projects?name=<name>` and `/api/projects/{id}/workflows` instead.) If a resource exists, these new REST APIs return an array with one element in it. If a resource doesn't exist, they return an empty array rather than 404 Not Found. This way, a client can tell that the project itself doesn't exist when `GET /api/projects/{id}/workflows?name=<name>[&revision=<name>]` returns 404 Not Found, because that endpoint returns an empty array when the project exists but the workflow doesn't.
```sql
INSERT OVERWRITE TABLE (table) WITH ...
```

is not supported in Hive.
Observed the following error:
leo@weaver:~/work/td/2016-03-09> digdag new query-analysis [9:13:26 Mar 07 2016]
2016-03-09 09:13:36 +0900: Digdag v0.3.4
Creating query-analysis/digdag
Creating query-analysis/.digdag-wrapper/digdag.jar
Creating query-analysis/.gitignore
Creating query-analysis/tasks/shell_sample.sh
Creating query-analysis/tasks/repeat_hello.sh
Creating query-analysis/tasks/__init__.py
Creating query-analysis/digdag.yml
Done. Type `cd query-analysis` and `./digdag r` to run the workflow. Enjoy!
leo@weaver:~/work/td/2016-03-09> cd query-analysis [9:13:36 Mar 07 2016]
leo@weaver:~/work/td/2016-03-09/query-analysis> ls [9:13:38 Mar 07 2016]
digdag digdag.yml tasks
leo@weaver:~/work/td/2016-03-09/query-analysis> digdag run [9:13:38 Mar 07 2016]
2016-03-09 09:13:49 +0900: Digdag v0.3.4
2016-03-09 09:13:50 +0900 [WARN] (main): --session-time argument, --hour argument, or _schedule in yaml file is not set. Using today's 00:00:00 as ${session_time}.
2016-03-09 09:13:50 +0900 [INFO] (main): Using state files at digdag.status/20160309T000000+0900.
2016-03-09 09:13:50 +0900 [INFO] (main): Starting a new session repository id=1 workflow name=+main session_time=2016-03-09T00:00:00+09:00
2016-03-09 09:13:50 +0900 [ERROR] (0021@+main+step1): Task failed
java.lang.RuntimeException: Failed to process task config templates
at io.digdag.core.agent.OperatorManager.runWithArchive(OperatorManager.java:154)
at io.digdag.core.agent.OperatorManager.lambda$runWithHeartbeat$1(OperatorManager.java:128)
at io.digdag.core.agent.OperatorManager$$Lambda$114/2102246737.run(Unknown Source)
at io.digdag.core.agent.CurrentDirectoryArchiveManager.withExtractedArchive(CurrentDirectoryArchiveManager.java:20)
at io.digdag.core.agent.OperatorManager.runWithHeartbeat(OperatorManager.java:127)
at io.digdag.core.agent.OperatorManager.run(OperatorManager.java:106)
at io.digdag.cli.Run$OperatorManagerWithSkip.run(Run.java:567)
at io.digdag.core.agent.LocalAgent.lambda$run$0(LocalAgent.java:61)
at io.digdag.core.agent.LocalAgent$$Lambda$112/1884150000.run(Unknown Source)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: io.digdag.spi.TemplateException: Failed to evaluate JavaScript code: tasks/shell_sample.sh
at io.digdag.core.agent.ConfigEvalEngine.invokeTemplate(ConfigEvalEngine.java:91)
at io.digdag.core.agent.ConfigEvalEngine.access$200(ConfigEvalEngine.java:31)
at io.digdag.core.agent.ConfigEvalEngine$Context.evalValue(ConfigEvalEngine.java:170)
at io.digdag.core.agent.ConfigEvalEngine$Context.evalObjectRecursive(ConfigEvalEngine.java:128)
at io.digdag.core.agent.ConfigEvalEngine$Context.access$000(ConfigEvalEngine.java:95)
at io.digdag.core.agent.ConfigEvalEngine.eval(ConfigEvalEngine.java:62)
at io.digdag.core.agent.OperatorManager.runWithArchive(OperatorManager.java:151)
... 13 common frames omitted
Caused by: javax.script.ScriptException: String index out of range: 72
at jdk.nashorn.api.scripting.NashornScriptEngine.throwAsScriptException(NashornScriptEngine.java:455)
at jdk.nashorn.api.scripting.NashornScriptEngine.invokeImpl(NashornScriptEngine.java:387)
at jdk.nashorn.api.scripting.NashornScriptEngine.invokeFunction(NashornScriptEngine.java:187)
at io.digdag.core.agent.ConfigEvalEngine.invokeTemplate(ConfigEvalEngine.java:88)
... 19 common frames omitted
Caused by: jdk.nashorn.internal.runtime.ParserException: String index out of range: 72
at jdk.nashorn.internal.runtime.Context$ThrowErrorManager.error(Context.java:419)
at jdk.nashorn.internal.parser.Parser.recover(Parser.java:413)
at jdk.nashorn.internal.parser.Parser.sourceElements(Parser.java:831)
at jdk.nashorn.internal.parser.Parser.program(Parser.java:711)
at jdk.nashorn.internal.parser.Parser.parse(Parser.java:284)
at jdk.nashorn.internal.runtime.RecompilableScriptFunctionData.reparse(RecompilableScriptFunctionData.java:386)
at jdk.nashorn.internal.runtime.RecompilableScriptFunctionData.compileTypeSpecialization(RecompilableScriptFunctionData.java:511)
at jdk.nashorn.internal.runtime.RecompilableScriptFunctionData.getBest(RecompilableScriptFunctionData.java:730)
at jdk.nashorn.internal.runtime.ScriptFunctionData.getBestInvoker(ScriptFunctionData.java:232)
at jdk.nashorn.internal.runtime.ScriptFunction.findCallMethod(ScriptFunction.java:586)
at jdk.nashorn.internal.runtime.ScriptObject.lookup(ScriptObject.java:1872)
at jdk.nashorn.internal.runtime.linker.NashornLinker.getGuardedInvocation(NashornLinker.java:100)
at jdk.nashorn.internal.runtime.linker.NashornLinker.getGuardedInvocation(NashornLinker.java:94)
at jdk.internal.dynalink.support.CompositeTypeBasedGuardingDynamicLinker.getGuardedInvocation(CompositeTypeBasedGuardingDynamicLinker.java:176)
at jdk.internal.dynalink.support.CompositeGuardingDynamicLinker.getGuardedInvocation(CompositeGuardingDynamicLinker.java:124)
at jdk.internal.dynalink.support.LinkerServicesImpl.getGuardedInvocation(LinkerServicesImpl.java:149)
at jdk.internal.dynalink.DynamicLinker.relink(DynamicLinker.java:233)
at jdk.nashorn.internal.objects.NativeRegExp.callReplaceValue(NativeRegExp.java:819)
at jdk.nashorn.internal.objects.NativeRegExp.replace(NativeRegExp.java:696)
at jdk.nashorn.internal.objects.NativeString.replace(NativeString.java:809)
at jdk.nashorn.internal.scripts.Script$Recompilation$1$62AA$\^eval\_.template(<eval>:26)
at jdk.nashorn.internal.runtime.ScriptFunctionData.invoke(ScriptFunctionData.java:640)
at jdk.nashorn.internal.runtime.ScriptFunction.invoke(ScriptFunction.java:229)
at jdk.nashorn.internal.runtime.ScriptRuntime.apply(ScriptRuntime.java:387)
at jdk.nashorn.api.scripting.ScriptObjectMirror.callMember(ScriptObjectMirror.java:192)
at jdk.nashorn.api.scripting.NashornScriptEngine.invokeImpl(NashornScriptEngine.java:381)
... 21 common frames omitted
error:
* +main+step1:
Failed to process task config templates
Task state is saved at digdag.status/20160309T000000+0900 directory.
Run command with --session-time '2016-03-09 00:00:00' argument to retry failed tasks.
digdag tasks are defined in a key-value map, and digdag executes tasks in the literal order of the text in the yaml file. This can be fragile, as key-value maps in YAML do not have a well-defined semantic order: http://yaml.org/spec/1.2/spec.html#id2765608
Some examples of how this can become painful:
Suggested solution alternatives:

Use a new file suffix such as `.digdag` or `.dd`. We can still say that the syntax is YAML, and people can configure editors to do YAML syntax highlighting etc. of these files.

Define tasks as a list instead of a map:

```yaml
tasks:
  - name: task1
    sh>: echo first task
  - name: task2
    td>: queries/second_task.sql
    tasks:
      - name: subtask1
        sh>: echo sub
      - name: subtask2
        sh>: echo tasks
```
Some benefits of this approach are that workflow files would no longer carry the generic `.yml` suffix, etc.

`for_each>` parameters can be parameterized like below by utilizing YAML anchors:
```yaml
run: +parameterized_for_each

_export:
  foos: &FOOS
    - 1
    - 2

+parameterized_for_each:
  for_each>:
    foo: *FOOS
  _do:
    sh>: "echo hello ${foo}"
```
But it might be useful to allow for parameterizing `for_each>` using actual digdag parameters. E.g.:
```yaml
run: +parameterized_for_each

_export:
  foos:
    - 1
    - 2

+parameterized_for_each:
  for_each>:
    foo: @foos
  _do:
    sh>: "echo hello ${foo}"
```
And using parameters explicitly set by e.g. a `py>` task:
```yaml
run: +main

+main:
  +export_foos:
    py>: tasks.export_foos
  +parameterized_for_each:
    for_each>:
      foo: ${foos}
    _do:
      sh>: "echo hello ${foo}"
```

```python
import digdag

def export_foos():
    digdag.env.store({"foos": [1, 2]})
```
Attempting to parameterize `for_each>` like this currently fails with the below error:
2016-03-30 10:18:17 +0900: Digdag v0.4.2
2016-03-30 10:18:18 +0900 [WARN] (main): Reusing the last session time 2016-03-29T00:00:00+09:00.
2016-03-30 10:18:18 +0900 [INFO] (main): Using session digdag.status/20160329T000000+0900.
2016-03-30 10:18:18 +0900 [INFO] (main): Starting a new session repository id=1 workflow name=+main session_time=2016-03-29T00:00:00+09:00
2016-03-30 10:18:19 +0900 [INFO] (0020@+main+export_foos): py>: tasks.export_foos
2016-03-30 10:18:19 +0900 [INFO] (0020@+main+parameterized_for_each): for_each>: {foo=[1,2]}
2016-03-30 10:18:19 +0900 [ERROR] (0020@+main+parameterized_for_each): Task failed
io.digdag.client.config.ConfigException: Expected array type for key 'foo' but got "[1,2]" (string)
at io.digdag.client.config.Config.propagateConvertException(Config.java:400)
at io.digdag.client.config.Config.readObject(Config.java:391)
at io.digdag.client.config.Config.get(Config.java:235)
at io.digdag.client.config.Config.getList(Config.java:277)
at io.digdag.standards.operator.ForEachOperatorFactory$ForEachOperator.runTask(ForEachOperatorFactory.java:67)
at io.digdag.standards.operator.BaseOperator.run(BaseOperator.java:49)
at io.digdag.core.agent.OperatorManager.callExecutor(OperatorManager.java:238)
at io.digdag.cli.Run$OperatorManagerWithSkip.callExecutor(Run.java:653)
at io.digdag.core.agent.OperatorManager.runWithArchive(OperatorManager.java:193)
at io.digdag.core.agent.OperatorManager.lambda$runWithHeartbeat$1(OperatorManager.java:130)
at io.digdag.core.agent.CurrentDirectoryArchiveManager.withExtractedArchive(CurrentDirectoryArchiveManager.java:20)
at io.digdag.core.agent.OperatorManager.runWithHeartbeat(OperatorManager.java:129)
at io.digdag.core.agent.OperatorManager.run(OperatorManager.java:107)
at io.digdag.cli.Run$OperatorManagerWithSkip.run(Run.java:635)
at io.digdag.core.agent.LocalAgent.lambda$run$0(LocalAgent.java:61)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Can not deserialize instance of java.util.ArrayList out of VALUE_STRING token
at [Source: N/A; line: -1, column: -1]
at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148)
at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:854)
at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:850)
at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.handleNonArray(CollectionDeserializer.java:292)
at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:227)
at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:217)
at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:25)
at com.fasterxml.jackson.databind.ObjectMapper._readValue(ObjectMapper.java:3703)
at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2072)
at io.digdag.client.config.Config.readObject(Config.java:388)
... 18 common frames omitted
error:
* +main+parameterized_for_each:
Expected array type for key 'foo' but got "[1,2]" (string)
Task state is saved at digdag.status/20160329T000000+0900 directory.
Run command with --session '2016-03-29 00:00:00' argument to retry failed tasks.
I'm uncertain whether the best approach would be to make `${...}` expansion more intelligent about parameter types, not coercing everything to a string, or whether it would make more sense to introduce another syntax.
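As a rough sketch of the first option, assuming a Jackson-based evaluator (the `interpolate` helper is hypothetical): when a template is exactly one `${...}` reference, the parameter's native JSON value could be returned instead of a coerced string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.node.ObjectNode;
import com.fasterxml.jackson.databind.node.TextNode;

public class TypedTemplates
{
    private static final Pattern SINGLE_REF = Pattern.compile("^\\$\\{(\\w+)\\}$");

    // A template that is a single ${name} reference keeps its native JSON
    // type (array, number, object); anything else interpolates to a string.
    static JsonNode eval(String template, ObjectNode params)
    {
        Matcher m = SINGLE_REF.matcher(template.trim());
        if (m.matches() && params.has(m.group(1))) {
            return params.get(m.group(1));
        }
        return TextNode.valueOf(interpolate(template, params));
    }

    // hypothetical: ordinary string interpolation of ${...} references
    static String interpolate(String template, ObjectNode params)
    {
        return template;  // stub for the sketch
    }
}
```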
Currently `~/.digdag/config` is a Java properties file. It would be convenient to use the HOCON format and the Typesafe Config library to allow for things like:

```
email {
  host: smtp.foo.bar
  user: ...
  ...
}
```

and to gain useful functionality like system property and environment variable substitution for free.
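A minimal sketch of loading that file with the Typesafe Config library (the file path and keys are illustrative):

```java
import java.io.File;
import com.typesafe.config.Config;
import com.typesafe.config.ConfigFactory;

public class HoconConfigExample
{
    public static void main(String[] args)
    {
        File file = new File(System.getProperty("user.home"), ".digdag/config");
        // parse HOCON, then resolve ${...} substitutions against system
        // properties and environment variables
        Config config = ConfigFactory.parseFile(file)
                .withFallback(ConfigFactory.systemProperties())
                .withFallback(ConfigFactory.systemEnvironment())
                .resolve();
        System.out.println(config.getString("email.host"));
    }
}
```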
Do people use local timezones in their data pipelines, applications, servers, etc.? In my experience, local timezone application happens in the presentation layer (UI etc.).

AFAIK, EC2 and GCE default to UTC.
Forget to push?
Design:

`$ digdag init <name>` creates `digdag.yml` and `<name>.yml` files in the `<name>` directory as follows:

```yaml
# digdag.yml
name: <name>
workflows:
  - <name>.yml
```

```yaml
# <name>.yml
+step1:
  sh>: some example here...
+step2:
  sh>: some example here...
```

All digdag commands (run, push, etc.) use `digdag.yml` as the entry point. The `name` section in `digdag.yml` is the name of the project. The name of a workflow matches its file name (so workflow yml files don't need a `name` section). `digdag push` doesn't require the `-f` option; it reads `./digdag.yml`.
When I ran a workflow on a Digdag server running on localhost, this error occurred:
2016-04-05 11:21:09 +0900 [INFO] (0037@+main+step1): sh>: ./tasks/bin/enc-tool -e development-ec2 -a 1 -c ENCRYPT -t 200
/bin/sh: ./tasks/bin/enc-tool: Permission denied
2016-04-05 11:21:09 +0900 [ERROR] (0037@+main+step1): Task failed
java.lang.RuntimeException: Command failed with code 126
at io.digdag.standards.operator.ShOperatorFactory$ShOperator.runTask(ShOperatorFactory.java:115)
at io.digdag.standards.operator.BaseOperator.run(BaseOperator.java:49)
at io.digdag.core.agent.OperatorManager.callExecutor(OperatorManager.java:241)
at io.digdag.core.agent.OperatorManager.runWithWorkspace(OperatorManager.java:196)
at io.digdag.core.agent.OperatorManager.lambda$runWithHeartbeat$1(OperatorManager.java:133)
at io.digdag.core.agent.LocalWorkspaceManager.withExtractedArchive(LocalWorkspaceManager.java:63)
at io.digdag.core.agent.OperatorManager.runWithHeartbeat(OperatorManager.java:132)
at io.digdag.core.agent.OperatorManager.run(OperatorManager.java:109)
at io.digdag.core.agent.LocalAgent.lambda$run$0(LocalAgent.java:61)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
It works in local mode.
Rename the `digdag.yml` project file; workflow files change from `.yml` to `.dig`:

- `.dig` files in the project directory are picked up automatically by digdag.
- A `--project` option specifies the project directory.
- The `digdag run` `-f` option is removed.
- `digdag init` creates a `.digdag` directory in the project directory and a `.digdag/config` file.
- digdag looks for the `.digdag` directory, starting in the current directory and then recursing into parent directories (like git).
- `digdag push` pushes the entire project (as identified by `.digdag`) in some parent dir, even if executed in a subdir.
- Paths in a `.dig` file are resolved relative to the path of the `.dig` file (see the example after this list).
- The `digdag run <workflow name>` parameter can include the `.dig` suffix, so that `digdag run foo` and `digdag run foo.dig` are identical invocations.

Given a project with:

```
.digdag/config
subdir/workflow.dig
queries/foo.sql
```

a user can both be in the project dir and do:

```
digdag run subdir/workflow
```

or:

```
cd subdir/
digdag run workflow
```

with the same result. The `subdir/workflow.dig` file must reference the query file as `../queries/foo.sql`.
When a user tries using a parameter that requires a '_' as a prefix but doesn't include it, it should fail and mention that the '_' is required.

For example, if a user doesn't include the '_' prefix in '_parallel', the workflow is executed sequentially, but it's not necessarily obvious that it's not operating as the user expected.
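A sketch of one possible validation pass (the directive list and helper are illustrative, not digdag's actual internals):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class UnderscoreCheck
{
    private static final Set<String> DIRECTIVES = new HashSet<>(
            Arrays.asList("_parallel", "_export", "_do", "_error", "_retry"));

    // Fail fast when a key matches a built-in directive except for the
    // leading underscore, instead of silently ignoring it.
    static void check(Set<String> taskConfigKeys)
    {
        for (String key : taskConfigKeys) {
            if (!key.startsWith("_") && DIRECTIVES.contains("_" + key)) {
                throw new IllegalArgumentException(
                        "Unknown parameter '" + key + "'; did you mean '_" + key
                        + "'? The leading '_' is required.");
            }
        }
    }
}
```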
The `next session time` and `next runs at` timezones are different, which is a bit confusing.
$ digdag schedules
2016-04-11 14:46:04 -0700: Digdag v0.5.9
Schedules:
id: 2
repository: dano-churn-prediction-poc
workflow: +main
next session time: 2016-04-13 00:00:00 +0900
next runs at: 2016-04-12 15:00:00 -0700 (24h 13m 55s later)
1 entries.
Use `digdag workflows +NAME` to show workflow details.
```yaml
td>: sample.sql
create_table: (db_name).(table_name)
```

throws an exception in td-client-java, since an invalid table name `db_name.table_name` is passed as an argument of createTable.
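One possible fix, sketched under the assumption that the operator splits the qualified name itself before calling td-client-java (treat the exact TDClient method signature as an assumption):

```java
import com.treasuredata.client.TDClient;

public class CreateTableFix
{
    // Split "(db_name).(table_name)" so only a bare table name reaches the
    // API; fall back to the session's default database when unqualified.
    static void createTable(TDClient client, String defaultDatabase, String name)
    {
        String database = defaultDatabase;
        String table = name;
        int dot = name.indexOf('.');
        if (dot >= 0) {
            database = name.substring(0, dot);
            table = name.substring(dot + 1);
        }
        client.createTableIfNotExists(database, table);
    }
}
```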
digdag schedules
2016-04-11 15:04:37 -0700: Digdag v0.5.9
Schedules:
id: 2
repository: dano-churn-prediction-poc
workflow: +main
next session time: 2016-04-13 00:00:00 +0900
next runs at: 2016-04-12 15:00:00 -0700 (23h 55m 21s later)
1 entries.
Use `digdag workflows +NAME` to show workflow details.
and
digdag check
2016-04-11 15:04:54 -0700: Digdag v0.5.9
System default timezone: America/Los_Angeles
Definitions (1 workflows):
+main (1 tasks)
Parameters:
timezone: "America/Los_Angeles"
Schedules (0 entries):
As a user I'm really confused about how scheduling works =)
Which is preferred? init or new?
Currently the only way to disable a scheduled workflow (so that it doesn't get run) is to push another revision to the same repo.
Though the standard syntax is great for understandability, utilizing 'after' to generate a DAG explicitly is very powerful for optimizing the overall processing time of a workflow. We should emphasize it in Digdag's docs in the future.
so that `digdag push` excludes those files.
E.g.:

```yaml
_schedule:
  weekly>: 6, 07:00:00
```
Note the line saying `Archiving digdag.archive.tar.gz` below:
digdag push nasdaq_analysis -r 3
2016-05-11 16:54:44 +0900: Digdag v0.6.1
Creating digdag.archive.tar.gz...
Archiving digdag.archive.tar.gz
Archiving digdag.yml
Archiving nasdaq_analysis.yml
Archiving queries/daily_open.sql
Archiving queries/monthly_open.sql
Workflows:
nasdaq_analysis
...
It would be useful if we could use pre-defined variables within `_export`:

```yaml
_export:
  query_start: 2016-03-01
  query_end: ${session_date}
```
`empty` is an adjective while `create` and `drop_table` are verbs.
The push command requires the -r option. It should be OK to make it optional as long as idempotency of the operation is considered.
Now, the client needs to include a .digdag.yml file in a project archive file (tar.gz), but this makes it difficult to create client libraries such as Ruby clients.

The idea here is that the server reads digdag.yml (not .digdag.yml) as described at #45. Clients just create a tar.gz including digdag.yml and send it to the server. Archiving looks like this command:

```
tar -C $(dirname path/to/digdag.yml) --exclude=".*" -czvf - .
```

The client must include digdag.yml in the archive.
I.e., the below workflow definition should fail instead of silently just executing `echo bar`:

```yaml
run: +foo

+foo:
  sh>: echo bar
  py>: baz
```
Also, it might be worth considering whether multiple identical keys should fail. In the below workflow definition, the second `sh>` operator implicitly overrides the first when the YAML is parsed, because the YAML parser keeps only the last value for a duplicated key.

```yaml
run: +foo

+foo:
  sh>: echo bar
  sh>: echo baz
```
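For what it's worth, Jackson's YAML parser can be asked to reject duplicate keys at parse time; a sketch, assuming the workflow loader goes through Jackson:

```java
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.yaml.YAMLFactory;

public class StrictYamlLoad
{
    public static void main(String[] args) throws Exception
    {
        // STRICT_DUPLICATE_DETECTION turns the silent "last key wins"
        // behavior into a parse-time error.
        ObjectMapper yaml = new ObjectMapper(new YAMLFactory()
                .enable(JsonParser.Feature.STRICT_DUPLICATE_DETECTION));
        // this now throws instead of quietly keeping only "echo baz"
        yaml.readTree("+foo:\n  sh>: echo bar\n  sh>: echo baz\n");
    }
}
```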
E.g.:

```yaml
+foo:
  td>: queries/foo.sql
  google_sheet: domain.com/foo/bar
```

would perform the equivalent of:

```
td query \
  --result 'gspreadsheet://domain.com/foo/bar' \
  -w -d ${td.database} \
  -q queries/foo.sql
```
It would be useful for users to be able to filter logs from workflows executed on the server.

Currently all output from a workflow and its tasks, both stdout/stderr and Java log messages, is concatenated into a single log file. Thus it's not easy to reliably filter out e.g. only stdout, or to filter on Java logging level.

I propose changing the digdag log file format to a structured stream of messages (e.g. msgpack) that combines the raw log message with metadata: a timestamp, whether the message is from stdout/stderr or Java logging, and, in the latter case, the logging level. I.e. something like:
```
// ...
{
  "ts": 1463387968,
  "stream": "OUT",
  "message": "hello world"
},
{
  "ts": 1463388012,
  "stream": "LOG",
  "level": "WARN",
  "message": "the foo failed to bar the baz because quux"
},
{
  "ts": 1463388153,
  "stream": "ERR",
  "message": "The quick brown fox jumps over the lazy dog"
},
{
  "ts": 1463388265,
  "stream": "LOG",
  "level": "DEBUG",
  "message": "..."
},
// ...
```
We could then have the server expose a REST API endpoint for querying this stream, or have the client fetch all log entries and perform the filtering client-side.
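A minimal sketch of the writer side using Jackson (plain JSON here; a MessagePack-backed ObjectMapper could serialize the same nodes to msgpack):

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

public class StructuredLogWriter
{
    private final ObjectMapper mapper = new ObjectMapper();

    // One structured entry per line: timestamp, stream ("OUT", "ERR", or
    // "LOG"), optional level for "LOG" entries, and the raw message.
    String entry(long ts, String stream, String level, String message)
            throws Exception
    {
        ObjectNode node = mapper.createObjectNode()
                .put("ts", ts)
                .put("stream", stream)
                .put("message", message);
        if (level != null) {
            node.put("level", level);
        }
        return mapper.writeValueAsString(node);
    }
}
```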
digdag start test foobar --session "2016-04-19T17:22:58Z"
2016-04-19 10:23:11 -0700: Digdag v0.6.0
error: --session must be hourly, daily, now, "yyyy-MM-dd", or "yyyy-MM-dd HH:mm:SS" format: 2016-04-19T17:22:58Z
Currently the error shown when a parameter is not created, or does not have a value, is not obvious. In such instances the error log should say "parameter or template named _____ is not found", or something similar.

This is how the error currently shows up for a `td>` operator query with an undefined parameter embedded within the query:

2016-05-16 22:58:05 -0700 ERROR: Task failed
io.digdag.client.config.ConfigException: Failed to load a template file
at io.digdag.spi.TemplateEngine.templateCommand(TemplateEngine.java:30)
at io.digdag.standards.operator.td.TdOperatorFactory$TdOperator.runTask(TdOperatorFactory.java:87)
at io.digdag.standards.operator.BaseOperator.run(BaseOperator.java:49)
at io.digdag.core.agent.OperatorManager.callExecutor(OperatorManager.java:255)
at io.digdag.cli.Run$OperatorManagerWithSkip.callExecutor(Run.java:661)
at io.digdag.core.agent.OperatorManager.runWithWorkspace(OperatorManager.java:200)
at io.digdag.core.agent.OperatorManager.lambda$runWithHeartbeat$1(OperatorManager.java:129)
at io.digdag.core.agent.NoopWorkspaceManager.withExtractedArchive(NoopWorkspaceManager.java:20)
at io.digdag.core.agent.OperatorManager.runWithHeartbeat(OperatorManager.java:128)
at io.digdag.core.agent.OperatorManager.run(OperatorManager.java:105)
at io.digdag.cli.Run$OperatorManagerWithSkip.run(Run.java:643)
at io.digdag.core.agent.LocalAgent.lambda$run$0(LocalAgent.java:61)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: io.digdag.spi.TemplateException: Failed to evaluate JavaScript code: SELECT
count(*)
FROM
${table_name}
at io.digdag.core.agent.ConfigEvalEngine.invokeTemplate(ConfigEvalEngine.java:91)
at io.digdag.core.agent.ConfigEvalEngine.template(ConfigEvalEngine.java:185)
at io.digdag.core.agent.ConfigEvalEngine.templateFile(ConfigEvalEngine.java:206)
at io.digdag.spi.TemplateEngine.templateCommand(TemplateEngine.java:27)
... 16 common frames omitted
Caused by: javax.script.ScriptException: ReferenceError: "table_name" is not defined in at line number 4
at jdk.nashorn.api.scripting.NashornScriptEngine.throwAsScriptException(NashornScriptEngine.java:467)
at jdk.nashorn.api.scripting.NashornScriptEngine.invokeImpl(NashornScriptEngine.java:389)
at jdk.nashorn.api.scripting.NashornScriptEngine.invokeFunction(NashornScriptEngine.java:190)
at io.digdag.core.agent.ConfigEvalEngine.invokeTemplate(ConfigEvalEngine.java:88)
... 19 common frames omitted
Caused by: jdk.nashorn.internal.runtime.ECMAException: ReferenceError: "table_name" is not defined
at jdk.nashorn.internal.runtime.ECMAErrors.error(ECMAErrors.java:57)
at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:319)
at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:291)
at jdk.nashorn.internal.objects.Global.noSuchProperty(Global.java:1428)
at jdk.nashorn.internal.scripts.Script$Recompilation$32$13$^function_.L:1(:4)
at jdk.nashorn.internal.scripts.Script$Recompilation$28$62AA$^eval_.template(:56)
at jdk.nashorn.internal.runtime.ScriptFunctionData.invoke(ScriptFunctionData.java:627)
at jdk.nashorn.internal.runtime.ScriptFunction.invoke(ScriptFunction.java:494)
at jdk.nashorn.internal.runtime.ScriptRuntime.apply(ScriptRuntime.java:393)
at jdk.nashorn.api.scripting.ScriptObjectMirror.callMember(ScriptObjectMirror.java:199)
at jdk.nashorn.api.scripting.NashornScriptEngine.invokeImpl(NashornScriptEngine.java:383)
... 21 common frames omitted
error:
The query:

```sql
SELECT count(*) FROM ${table_name}
```
./pkg/digdag-0.4.2-SNAPSHOT.jar run -f examples/require.yml --all
2016-03-29 10:26:43 +0900: Digdag v0.4.2
2016-03-29 10:26:44 +0900 [WARN] (main): Using a new session time 2016-03-29T00:00:00+09:00.
2016-03-29 10:26:44 +0900 [INFO] (main): Using session digdag.status/20160329T000000+0900.
2016-03-29 10:26:44 +0900 [INFO] (main): Starting a new session repository id=1 workflow name=+require session_time=2016-03-29T00:00:00+09:00
2016-03-29 10:26:45 +0900 [INFO] (0020@+require+task1+require_data): require>: +make_data
2016-03-29 10:26:45 +0900 [INFO] (0020@+require+task1+require_data): Starting a new session repository id=1 workflow name=+make_data session_time=2016-03-29T00:00:00+09:00
2016-03-29 10:26:45 +0900 [ERROR] (0020@+require+task1+require_data): Task failed, retrying
io.digdag.spi.TaskExecutionException: Retrying this task after 1 seconds
at io.digdag.core.agent.RequireOperatorFactory$RequireOperator.run(RequireOperatorFactory.java:91)
at io.digdag.core.agent.OperatorManager.callExecutor(OperatorManager.java:238)
at io.digdag.cli.Run$OperatorManagerWithSkip.callExecutor(Run.java:653)
at io.digdag.core.agent.OperatorManager.runWithArchive(OperatorManager.java:193)
at io.digdag.core.agent.OperatorManager.lambda$runWithHeartbeat$1(OperatorManager.java:130)
at io.digdag.core.agent.CurrentDirectoryArchiveManager.withExtractedArchive(CurrentDirectoryArchiveManager.java:20)
at io.digdag.core.agent.OperatorManager.runWithHeartbeat(OperatorManager.java:129)
at io.digdag.core.agent.OperatorManager.run(OperatorManager.java:107)
at io.digdag.cli.Run$OperatorManagerWithSkip.run(Run.java:635)
at io.digdag.core.agent.LocalAgent.lambda$run$0(LocalAgent.java:61)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2016-03-29 10:26:45 +0900 [ERROR] (0020@+require+task1+require_data): Task failed
org.skife.jdbi.v2.exceptions.UnableToCreateStatementException: org.h2.jdbc.JdbcSQLException: Syntax error in SQL statement "UPDATE TASKS SET UPDATED_AT = NOW(), STATE = ?, RETRY_AT = TIMESTAMPADD('SECOND', STATE_PARAMS = ?, ?,[*] NOW()) WHERE ID = ? AND STATE = ? "; expected "[, ::, *, /, %, +, -, ||, ~, !~, NOT, LIKE, REGEXP, IS, IN, BETWEEN, AND, OR, )"; SQL statement:
update tasks set updated_at = now(), state = ?, retry_at = TIMESTAMPADD('SECOND', state_params = ?, ?, now()) where id = ? and state = ? [42001-191] [statement:"update tasks set updated_at = now(), state = :newState, retry_at = TIMESTAMPADD('SECOND', state_params = :stateParams, :retryInterval, now()) where id = :id and state = :oldState", located:"update tasks set updated_at = now(), state = :newState, retry_at = TIMESTAMPADD('SECOND', state_params = :stateParams, :retryInterval, now()) where id = :id and state = :oldState", rewritten:"update tasks set updated_at = now(), state = ?, retry_at = TIMESTAMPADD('SECOND', state_params = ?, ?, now()) where id = ? and state = ?", arguments:{ positional:{}, named:{}, finder:[]}]
at org.skife.jdbi.v2.SQLStatement.internalExecute(SQLStatement.java:1306)
at org.skife.jdbi.v2.Update.execute(Update.java:56)
at io.digdag.core.database.DatabaseSessionStoreManager$DatabaseTaskControlStore.setRetryWaitingState(DatabaseSessionStoreManager.java:695)
at io.digdag.core.workflow.TaskControl.setRunningToRetryWaiting(TaskControl.java:272)
at io.digdag.core.workflow.WorkflowExecutor.retryTask(WorkflowExecutor.java:937)
at io.digdag.core.workflow.WorkflowExecutor.lambda$retryTask$21(WorkflowExecutor.java:849)
at io.digdag.core.database.DatabaseSessionStoreManager.lambda$lockTaskIfExists$14(DatabaseSessionStoreManager.java:278)
at io.digdag.core.database.BasicDatabaseStoreManager.lambda$transaction$0(BasicDatabaseStoreManager.java:192)
at org.skife.jdbi.v2.tweak.transactions.LocalTransactionHandler.inTransaction(LocalTransactionHandler.java:183)
at org.skife.jdbi.v2.BasicHandle.inTransaction(BasicHandle.java:330)
at io.digdag.core.database.BasicDatabaseStoreManager.transaction(BasicDatabaseStoreManager.java:192)
at io.digdag.core.database.DatabaseSessionStoreManager.lockTaskIfExists(DatabaseSessionStoreManager.java:272)
at io.digdag.core.workflow.WorkflowExecutor.retryTask(WorkflowExecutor.java:848)
at io.digdag.core.agent.InProcessTaskCallbackApi.retryTask(InProcessTaskCallbackApi.java:113)
at io.digdag.core.agent.OperatorManager.runWithArchive(OperatorManager.java:209)
at io.digdag.core.agent.OperatorManager.lambda$runWithHeartbeat$1(OperatorManager.java:130)
at io.digdag.core.agent.CurrentDirectoryArchiveManager.withExtractedArchive(CurrentDirectoryArchiveManager.java:20)
at io.digdag.core.agent.OperatorManager.runWithHeartbeat(OperatorManager.java:129)
at io.digdag.core.agent.OperatorManager.run(OperatorManager.java:107)
at io.digdag.cli.Run$OperatorManagerWithSkip.run(Run.java:635)
at io.digdag.core.agent.LocalAgent.lambda$run$0(LocalAgent.java:61)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.h2.jdbc.JdbcSQLException: Syntax error in SQL statement "UPDATE TASKS SET UPDATED_AT = NOW(), STATE = ?, RETRY_AT = TIMESTAMPADD('SECOND', STATE_PARAMS = ?, ?,[*] NOW()) WHERE ID = ? AND STATE = ? "; expected "[, ::, *, /, %, +, -, ||, ~, !~, NOT, LIKE, REGEXP, IS, IN, BETWEEN, AND, OR, )"; SQL statement:
update tasks set updated_at = now(), state = ?, retry_at = TIMESTAMPADD('SECOND', state_params = ?, ?, now()) where id = ? and state = ? [42001-191]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:345)
at org.h2.message.DbException.getSyntaxError(DbException.java:205)
at org.h2.command.Parser.getSyntaxError(Parser.java:535)
at org.h2.command.Parser.read(Parser.java:3170)
at org.h2.command.Parser.readFunction(Parser.java:2506)
at org.h2.command.Parser.readTerm(Parser.java:2791)
at org.h2.command.Parser.readFactor(Parser.java:2308)
at org.h2.command.Parser.readSum(Parser.java:2295)
at org.h2.command.Parser.readConcat(Parser.java:2265)
at org.h2.command.Parser.readCondition(Parser.java:2115)
at org.h2.command.Parser.readAnd(Parser.java:2087)
at org.h2.command.Parser.readExpression(Parser.java:2079)
at org.h2.command.Parser.parseUpdate(Parser.java:751)
at org.h2.command.Parser.parsePrepared(Parser.java:465)
at org.h2.command.Parser.parse(Parser.java:315)
at org.h2.command.Parser.parse(Parser.java:291)
at org.h2.command.Parser.prepareCommand(Parser.java:252)
at org.h2.engine.Session.prepareLocal(Session.java:560)
at org.h2.engine.Session.prepareCommand(Session.java:501)
at org.h2.jdbc.JdbcConnection.prepareCommand(JdbcConnection.java:1188)
at org.h2.jdbc.JdbcPreparedStatement.<init>(JdbcPreparedStatement.java:73)
at org.h2.jdbc.JdbcConnection.prepareStatement(JdbcConnection.java:276)
at com.zaxxer.hikari.pool.ProxyConnection.prepareStatement(ProxyConnection.java:308)
at com.zaxxer.hikari.pool.HikariProxyConnection.prepareStatement(HikariProxyConnection.java)
at org.skife.jdbi.v2.DefaultStatementBuilder.create(DefaultStatementBuilder.java:54)
at org.skife.jdbi.v2.SQLStatement.internalExecute(SQLStatement.java:1302)
... 25 common frames omitted
2016-03-29 10:26:45 +0900 [INFO] (0020@+make_data+step1): sh>: echo "creating data..."
creating data...
error:
* +require+task1+require_data:
org.h2.jdbc.JdbcSQLException: Syntax error in SQL statement "UPDATE TASKS SET UPDATED_AT = NOW(), STATE = ?, RETRY_AT = TIMESTAMPADD('SECOND', STATE_PARAMS = ?, ?,[*] NOW()) WHERE ID = ? AND STATE = ? "; expected "[, ::, *, /, %, +, -, ||, ~, !~, NOT, LIKE, REGEXP, IS, IN, BETWEEN, AND, OR, )"; SQL statement:
update tasks set updated_at = now(), state = ?, retry_at = TIMESTAMPADD('SECOND', state_params = ?, ?, now()) where id = ? and state = ? [42001-191] [statement:"update tasks set updated_at = now(), state = :newState, retry_at = TIMESTAMPADD('SECOND', state_params = :stateParams, :retryInterval, now()) where id = :id and state = :oldState", located:"update tasks set updated_at = now(), state = :newState, retry_at = TIMESTAMPADD('SECOND', state_params = :stateParams, :retryInterval, now()) where id = :id and state = :oldState", rewritten:"update tasks set updated_at = now(), state = ?, retry_at = TIMESTAMPADD('SECOND', state_params = ?, ?, now()) where id = ? and state = ?", arguments:{ positional:{}, named:{}, finder:[]}]
Task state is saved at digdag.status/20160329T000000+0900 directory.
Run command with --session '2016-03-29 00:00:00' argument to retry failed tasks.
Running `digdag workflows <project_name> +<workflow_name>` should show the original file, not a slightly deviated version of it, including any comments or other notations.
First iteration; just a list of workflow attempts with results.

| Project | Workflow | Start Time | Elapsed time | Status | ID |
|---|---|---|---|---|---|
| nasdaq | analysis | 2016-04-18 12:45:07 | 1 min 3 sec | Running | 712934839855 |
| nasdaq | ingest | 2016-04-18 12:34:56 | 2 min 17 sec | Succeeded | 712934871234 |
| ... | | | | | |
Setting the `-r` option is not always useful for users; most initial users want an auto-generated sequence.

The idea here is to add something to the server (and client) so that they generate an id for each revision in a project. (The id is not globally unique; it's scoped to a project.)
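A sketch of one simple scheme (a hypothetical helper, not a committed design): treat existing numeric revision names within the project as a sequence and increment the maximum.

```java
import java.util.List;

public class RevisionNames
{
    // Per-project sequence: max numeric revision name + 1, starting at "1".
    static String nextRevisionName(List<String> existingNames)
    {
        int max = 0;
        for (String name : existingNames) {
            try {
                max = Math.max(max, Integer.parseInt(name));
            }
            catch (NumberFormatException ignored) {
                // user-chosen names (e.g. git hashes) don't join the sequence
            }
        }
        return Integer.toString(max + 1);
    }
}
```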
```
$ cat digdag.yml
run: +main
timezone: "America/Los_Angeles"

+main:
  _schedule:
    cron>: 42 4 1 * *
  sh>: echo hello world
```
$ digdag push dano-cron-test -r 1
2016-04-11 15:46:26 -0700: Digdag v0.5.9
Creating digdag.archive.tar.gz...
Archiving digdag.yml
Workflows:
+main
error: Status code 400: {"message":"Parameter 'cron' is required but not set","status":400}
Add a systemConfig to store access log files to digdag-server, maybe like `server.access-log.path = <path>`.
Many operators need parameters such as hostname, password, etc. This is connection information, and we don't want to put it in workflow yml files because those files could be uploaded to GitHub.

The idea here is to put this information in a file in the home directory. For example, the ~/.digdag/config file has the following configuration:

```
client.server-endpoint = ...
client.http-header.authorization = ...
params.td.apikey = ...
params.mysql.hostname = ...
params.mysql.username = ...
params.mysql.password = ...
```

`digdag run` and `digdag schedule` take those parameters.

In this idea, `digdag push` also takes those parameters and sets them as the defaultParams of a new revision. This is not always the expected behavior, especially when a project could be pushed by multiple people, but it's acceptable for now.
Deleting a project should do this:
Implementation limitations are:
I ran into this error when running a workflow in local mode.
2016-04-01 14:25:30 +0900 [ERROR] (local-agent-0): Uncaught exception
org.skife.jdbi.v2.exceptions.UnableToCreateStatementException: org.h2.jdbc.JdbcSQLException: Table "QUEUED_SHARED_TASK_LOCKS" not found; SQL statement:
with recursive t (queue_id) as ((select queue_id from queued_shared_task_locks where hold_expire_time is null order by queue_id limit 1) union all select (select queue_id from queued_shared_task_locks where hold_expire_time is null and queue_id > t.queue_id order by queue_id limit 1) from t where t.queue_id is not null) select queue_id as id from t where queue_id is not null [42102-191] [statement:"with recursive t (queue_id) as ((select queue_id from queued_shared_task_locks where hold_expire_time is null order by queue_id limit 1) union all select (select queue_id from queued_shared_task_locks where hold_expire_time is null and queue_id > t.queue_id order by queue_id limit 1) from t where t.queue_id is not null) select queue_id as id from t where queue_id is not null", located:"with recursive t (queue_id) as ((select queue_id from queued_shared_task_locks where hold_expire_time is null order by queue_id limit 1) union all select (select queue_id from queued_shared_task_locks where hold_expire_time is null and queue_id > t.queue_id order by queue_id limit 1) from t where t.queue_id is not null) select queue_id as id from t where queue_id is not null", rewritten:"with recursive t (queue_id) as ((select queue_id from queued_shared_task_locks where hold_expire_time is null order by queue_id limit 1) union all select (select queue_id from queued_shared_task_locks where hold_expire_time is null and queue_id > t.queue_id order by queue_id limit 1) from t where t.queue_id is not null) select queue_id as id from t where queue_id is not null", arguments:{ positional:{}, named:{}, finder:[]}]
at org.skife.jdbi.v2.SQLStatement.internalExecute(SQLStatement.java:1306)
at org.skife.jdbi.v2.Query.fold(Query.java:173)
at org.skife.jdbi.v2.Query.list(Query.java:82)
at org.skife.jdbi.v2.Query.list(Query.java:75)
at io.digdag.core.database.DatabaseTaskQueueStore.lambda$lockSharedTasks$147(DatabaseTaskQueueStore.java:195)
at io.digdag.core.database.BasicDatabaseStoreManager.lambda$transaction$54(BasicDatabaseStoreManager.java:192)
at org.skife.jdbi.v2.tweak.transactions.LocalTransactionHandler.inTransaction(LocalTransactionHandler.java:183)
at org.skife.jdbi.v2.BasicHandle.inTransaction(BasicHandle.java:330)
at io.digdag.core.database.BasicDatabaseStoreManager.transaction(BasicDatabaseStoreManager.java:192)
at io.digdag.core.database.DatabaseTaskQueueStore.lockSharedTasks(DatabaseTaskQueueStore.java:176)
at io.digdag.core.database.DatabaseTaskQueueFactory$DatabaseTaskQueue.lockSharedTasks(DatabaseTaskQueueFactory.java:87)
at io.digdag.core.agent.LocalAgent.run(LocalAgent.java:57)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.h2.jdbc.JdbcSQLException: Table "QUEUED_SHARED_TASK_LOCKS" not found; SQL statement:
with recursive t (queue_id) as ((select queue_id from queued_shared_task_locks where hold_expire_time is null order by queue_id limit 1) union all select (select queue_id from queued_shared_task_locks where hold_expire_time is null and queue_id > t.queue_id order by queue_id limit 1) from t where t.queue_id is not null) select queue_id as id from t where queue_id is not null [42102-191]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:345)
at org.h2.message.DbException.get(DbException.java:179)
at org.h2.message.DbException.get(DbException.java:155)
at org.h2.command.Parser.readTableOrView(Parser.java:5349)
at org.h2.command.Parser.readTableFilter(Parser.java:1245)
at org.h2.command.Parser.parseSelectSimpleFromPart(Parser.java:1884)
at org.h2.command.Parser.parseSelectSimple(Parser.java:2032)
at org.h2.command.Parser.parseSelectSub(Parser.java:1878)
at org.h2.command.Parser.parseSelectUnion(Parser.java:1699)
at org.h2.command.Parser.parseSelectSub(Parser.java:1874)
at org.h2.command.Parser.parseSelectUnion(Parser.java:1699)
at org.h2.command.Parser.parseSelect(Parser.java:1687)
at org.h2.command.Parser.parseWith(Parser.java:4745)
at org.h2.command.Parser.parsePrepared(Parser.java:479)
at org.h2.command.Parser.parse(Parser.java:315)
at org.h2.command.Parser.parse(Parser.java:287)
at org.h2.command.Parser.prepareCommand(Parser.java:252)
at org.h2.engine.Session.prepareLocal(Session.java:560)
at org.h2.engine.Session.prepareCommand(Session.java:501)
at org.h2.jdbc.JdbcConnection.prepareCommand(JdbcConnection.java:1188)
at org.h2.jdbc.JdbcPreparedStatement.<init>(JdbcPreparedStatement.java:73)
at org.h2.jdbc.JdbcConnection.prepareStatement(JdbcConnection.java:276)
at com.zaxxer.hikari.pool.ProxyConnection.prepareStatement(ProxyConnection.java:308)
at com.zaxxer.hikari.pool.HikariProxyConnection.prepareStatement(HikariProxyConnection.java)
at org.skife.jdbi.v2.DefaultStatementBuilder.create(DefaultStatementBuilder.java:54)
at org.skife.jdbi.v2.SQLStatement.internalExecute(SQLStatement.java:1302)
... 16 common frames omitted
This is the workflow:

```yaml
run: +main

+main:
  +step1:
    sh>: ./tasks/bin/enc-tool -e development-ec2 -a 1 -c ENCRYPT -t 200
```
$ digdag --version
0.5.9
$ digdag init foo
2016-04-11 14:42:38 -0700: Digdag v0.5.9
Creating foo/digdag
Creating foo/.digdag-wrapper/digdag.jar
Creating foo/.gitignore
Creating foo/tasks/shell_sample.sh
Creating foo/tasks/repeat_hello.sh
Creating foo/tasks/__init__.py
Creating foo/digdag.yml
Done. Type `cd foo` and `./digdag r` to run the workflow. Enjoy!
$ cd foo
$ digdag push foo -r test
2016-04-11 14:42:52 -0700: Digdag v0.5.9
Creating digdag.archive.tar.gz...
Exception in thread "main" io.digdag.client.config.ConfigException: timezone: parameter is required but not set at digdag.yml. Example is 'timezone: America/Los_Angeles'.
at io.digdag.cli.client.Archive.runArchive(Archive.java:152)
at io.digdag.cli.client.Archive.archive(Archive.java:105)
at io.digdag.cli.client.Push.push(Push.java:66)
at io.digdag.cli.client.Push.mainWithClientException(Push.java:41)
at io.digdag.cli.client.ClientCommand.main(ClientCommand.java:55)
at io.digdag.cli.Main.main(Main.java:176)
These functions are useful when task content needs to include a custom timestamp format. For example:

```yaml
create_tables: [mytable_${session_time_format("%Y%m%d_%H")}]
create_tables: [mytable_${last_session_time_format("%Y%m")}]
```

Here is an strftime library for JavaScript which could be used internally: https://github.com/samsonjs/strftime
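As a rough illustration of the intended semantics, using java.time instead of the JavaScript strftime library the implementation would actually embed (the directive mapping here is deliberately partial):

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class SessionTimeFormat
{
    // session_time_format("%Y%m%d_%H") renders the session time with an
    // strftime-style pattern; only the directives used above are mapped.
    static String sessionTimeFormat(ZonedDateTime sessionTime, String pattern)
    {
        String javaPattern = pattern
                .replace("%Y", "yyyy")
                .replace("%m", "MM")
                .replace("%d", "dd")
                .replace("%H", "HH");
        return sessionTime.format(DateTimeFormatter.ofPattern(javaPattern));
    }
}
```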
Since `create_table` doesn't indicate that the query result will be stored to a table.
digdag should complain about incorrectly placing `_schedule` outside the workflow, like this:

```yaml
run: +main

_schedule:
  hourly>: 12:34

+main:
  td>: query/important.sql
```
To know who is running/submitting the workflow.