treasure-data / digdag

Workload Automation System
Home Page: https://www.digdag.io/
License: Apache License 2.0
`INSERT INTO OVERWRITE xxxx AS (query)` is not supported yet in TD.
$ digdag run -d
...
2016-03-11 21:27:04 +0900 [ERROR] (0021@+main+tfidf): Task failed
java.lang.RuntimeException: Failed to process task config templates
at io.digdag.core.agent.OperatorManager.runWithArchive(OperatorManager.java:156)
at io.digdag.core.agent.OperatorManager.lambda$runWithHeartbeat$1(OperatorManager.java:128)
at io.digdag.core.agent.CurrentDirectoryArchiveManager.withExtractedArchive(CurrentDirectoryArchiveManager.java:20)
at io.digdag.core.agent.OperatorManager.runWithHeartbeat(OperatorManager.java:127)
at io.digdag.core.agent.OperatorManager.run(OperatorManager.java:106)
at io.digdag.cli.Run$OperatorManagerWithSkip.run(Run.java:567)
at io.digdag.core.agent.LocalAgent.lambda$run$0(LocalAgent.java:61)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: io.digdag.spi.TemplateException: Failed to evaluate JavaScript code: ${td.last_results.job_count}
Queries with a magic comment like this:

```sql
-- @TD reducers: 4
SELECT ...
```

will be rewritten as:

```sql
INSERT OVERWRITE TABLE ...
-- @TD reducers: 4
SELECT ...
```

The magic comment then becomes ineffective in Hive, since it is no longer at the top of the query. This can also be problematic if a customer embeds a meaningful message in the first comment line.

A possible solution would be to skip the leading comment lines and then embed the INSERT OVERWRITE statement, as sketched below.
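A minimal sketch of that helper (free-standing, not the actual digdag implementation): leading `--` comment lines are copied through first, then the INSERT statement, then the rest of the query.

```java
public class InsertPlacement
{
    // Keep leading "-- ..." comment lines (e.g. Hive magic comments) in
    // place and embed the INSERT OVERWRITE header after them.
    static String insertAfterLeadingComments(String insertStatement, String query)
    {
        String[] lines = query.split("\n", -1);
        StringBuilder sb = new StringBuilder();
        int i = 0;
        // copy leading comment lines (and blank lines) unchanged
        while (i < lines.length
                && (lines[i].trim().isEmpty() || lines[i].trim().startsWith("--"))) {
            sb.append(lines[i]).append("\n");
            i++;
        }
        sb.append(insertStatement).append("\n");  // e.g. "INSERT OVERWRITE TABLE t"
        // then the remainder of the original query
        for (; i < lines.length; i++) {
            sb.append(lines[i]);
            if (i < lines.length - 1) {
                sb.append("\n");
            }
        }
        return sb.toString();
    }
}
```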
Right now, a user of digdag could be confused by creating & pushing a workflow to the server that is no longer in spec. The server mode should check whether the client and server versions match and, when they don't, do the following:
We would like to avoid embedding the td-api-key in the digdag yml file.

How about making the apikey parameter optional for the td executor? Even if an API key is not given, td-client-java can read the TD_API_KEY environment variable or the ~/.td/td.conf file.
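A rough sketch of that resolution order (`resolveApikey` is a hypothetical helper, not digdag's or td-client-java's API):

```java
import java.util.Map;
import java.util.Optional;

public class ApikeyResolution
{
    // Explicit "apikey" parameter wins; otherwise fall back to the
    // TD_API_KEY environment variable. Reading ~/.td/td.conf can stay
    // inside td-client-java itself.
    static Optional<String> resolveApikey(Map<String, String> taskParams)
    {
        String explicit = taskParams.get("apikey");
        if (explicit != null && !explicit.isEmpty()) {
            return Optional.of(explicit);
        }
        return Optional.ofNullable(System.getenv("TD_API_KEY"));
    }
}
```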
There are three singular REST APIs:

```
GET /api/project?name=<name>
GET /api/projects/{id}/workflow?name=<name>[&revision=<name>]
GET /api/workflow?project=<name>&name=<name>[&revision=<name>]
```

Replace them by adding the following REST APIs:

```
GET /api/projects?name=<name>
GET /api/projects/{id}/workflows?name=<name>[&revision=<name>]
```

(`/api/workflow` goes away; use `/api/projects?name=<name>` and `/api/projects/{id}/workflows` instead.) If a resource exists, these new REST APIs return an array with one element in it. If a resource doesn't exist, they return an empty array rather than 404 Not Found. This way, a client can tell that the project itself doesn't exist when `GET /api/projects/{id}/workflows?name=<name>[&revision=<name>]` returns 404 Not Found, because that endpoint returns an empty array when the project exists but the workflow doesn't.
```sql
INSERT OVERWRITE TABLE (table) WITH ...
```

is not supported in Hive.
Observed the following error:
leo@weaver:~/work/td/2016-03-09> digdag new query-analysis [9:13:26 Mar 07 2016]
2016-03-09 09:13:36 +0900: Digdag v0.3.4
Creating query-analysis/digdag
Creating query-analysis/.digdag-wrapper/digdag.jar
Creating query-analysis/.gitignore
Creating query-analysis/tasks/shell_sample.sh
Creating query-analysis/tasks/repeat_hello.sh
Creating query-analysis/tasks/__init__.py
Creating query-analysis/digdag.yml
Done. Type `cd query-analysis` and `./digdag r` to run the workflow. Enjoy!
leo@weaver:~/work/td/2016-03-09> cd query-analysis [9:13:36 Mar 07 2016]
leo@weaver:~/work/td/2016-03-09/query-analysis> ls [9:13:38 Mar 07 2016]
digdag digdag.yml tasks
leo@weaver:~/work/td/2016-03-09/query-analysis> digdag run [9:13:38 Mar 07 2016]
2016-03-09 09:13:49 +0900: Digdag v0.3.4
2016-03-09 09:13:50 +0900 [WARN] (main): --session-time argument, --hour argument, or _schedule in yaml file is not set. Using today's 00:00:00 as ${session_time}.
2016-03-09 09:13:50 +0900 [INFO] (main): Using state files at digdag.status/20160309T000000+0900.
2016-03-09 09:13:50 +0900 [INFO] (main): Starting a new session repository id=1 workflow name=+main session_time=2016-03-09T00:00:00+09:00
2016-03-09 09:13:50 +0900 [ERROR] (0021@+main+step1): Task failed
java.lang.RuntimeException: Failed to process task config templates
at io.digdag.core.agent.OperatorManager.runWithArchive(OperatorManager.java:154)
at io.digdag.core.agent.OperatorManager.lambda$runWithHeartbeat$1(OperatorManager.java:128)
at io.digdag.core.agent.OperatorManager$$Lambda$114/2102246737.run(Unknown Source)
at io.digdag.core.agent.CurrentDirectoryArchiveManager.withExtractedArchive(CurrentDirectoryArchiveManager.java:20)
at io.digdag.core.agent.OperatorManager.runWithHeartbeat(OperatorManager.java:127)
at io.digdag.core.agent.OperatorManager.run(OperatorManager.java:106)
at io.digdag.cli.Run$OperatorManagerWithSkip.run(Run.java:567)
at io.digdag.core.agent.LocalAgent.lambda$run$0(LocalAgent.java:61)
at io.digdag.core.agent.LocalAgent$$Lambda$112/1884150000.run(Unknown Source)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: io.digdag.spi.TemplateException: Failed to evaluate JavaScript code: tasks/shell_sample.sh
at io.digdag.core.agent.ConfigEvalEngine.invokeTemplate(ConfigEvalEngine.java:91)
at io.digdag.core.agent.ConfigEvalEngine.access$200(ConfigEvalEngine.java:31)
at io.digdag.core.agent.ConfigEvalEngine$Context.evalValue(ConfigEvalEngine.java:170)
at io.digdag.core.agent.ConfigEvalEngine$Context.evalObjectRecursive(ConfigEvalEngine.java:128)
at io.digdag.core.agent.ConfigEvalEngine$Context.access$000(ConfigEvalEngine.java:95)
at io.digdag.core.agent.ConfigEvalEngine.eval(ConfigEvalEngine.java:62)
at io.digdag.core.agent.OperatorManager.runWithArchive(OperatorManager.java:151)
... 13 common frames omitted
Caused by: javax.script.ScriptException: String index out of range: 72
at jdk.nashorn.api.scripting.NashornScriptEngine.throwAsScriptException(NashornScriptEngine.java:455)
at jdk.nashorn.api.scripting.NashornScriptEngine.invokeImpl(NashornScriptEngine.java:387)
at jdk.nashorn.api.scripting.NashornScriptEngine.invokeFunction(NashornScriptEngine.java:187)
at io.digdag.core.agent.ConfigEvalEngine.invokeTemplate(ConfigEvalEngine.java:88)
... 19 common frames omitted
Caused by: jdk.nashorn.internal.runtime.ParserException: String index out of range: 72
at jdk.nashorn.internal.runtime.Context$ThrowErrorManager.error(Context.java:419)
at jdk.nashorn.internal.parser.Parser.recover(Parser.java:413)
at jdk.nashorn.internal.parser.Parser.sourceElements(Parser.java:831)
at jdk.nashorn.internal.parser.Parser.program(Parser.java:711)
at jdk.nashorn.internal.parser.Parser.parse(Parser.java:284)
at jdk.nashorn.internal.runtime.RecompilableScriptFunctionData.reparse(RecompilableScriptFunctionData.java:386)
at jdk.nashorn.internal.runtime.RecompilableScriptFunctionData.compileTypeSpecialization(RecompilableScriptFunctionData.java:511)
at jdk.nashorn.internal.runtime.RecompilableScriptFunctionData.getBest(RecompilableScriptFunctionData.java:730)
at jdk.nashorn.internal.runtime.ScriptFunctionData.getBestInvoker(ScriptFunctionData.java:232)
at jdk.nashorn.internal.runtime.ScriptFunction.findCallMethod(ScriptFunction.java:586)
at jdk.nashorn.internal.runtime.ScriptObject.lookup(ScriptObject.java:1872)
at jdk.nashorn.internal.runtime.linker.NashornLinker.getGuardedInvocation(NashornLinker.java:100)
at jdk.nashorn.internal.runtime.linker.NashornLinker.getGuardedInvocation(NashornLinker.java:94)
at jdk.internal.dynalink.support.CompositeTypeBasedGuardingDynamicLinker.getGuardedInvocation(CompositeTypeBasedGuardingDynamicLinker.java:176)
at jdk.internal.dynalink.support.CompositeGuardingDynamicLinker.getGuardedInvocation(CompositeGuardingDynamicLinker.java:124)
at jdk.internal.dynalink.support.LinkerServicesImpl.getGuardedInvocation(LinkerServicesImpl.java:149)
at jdk.internal.dynalink.DynamicLinker.relink(DynamicLinker.java:233)
at jdk.nashorn.internal.objects.NativeRegExp.callReplaceValue(NativeRegExp.java:819)
at jdk.nashorn.internal.objects.NativeRegExp.replace(NativeRegExp.java:696)
at jdk.nashorn.internal.objects.NativeString.replace(NativeString.java:809)
at jdk.nashorn.internal.scripts.Script$Recompilation$1$62AA$\^eval\_.template(<eval>:26)
at jdk.nashorn.internal.runtime.ScriptFunctionData.invoke(ScriptFunctionData.java:640)
at jdk.nashorn.internal.runtime.ScriptFunction.invoke(ScriptFunction.java:229)
at jdk.nashorn.internal.runtime.ScriptRuntime.apply(ScriptRuntime.java:387)
at jdk.nashorn.api.scripting.ScriptObjectMirror.callMember(ScriptObjectMirror.java:192)
at jdk.nashorn.api.scripting.NashornScriptEngine.invokeImpl(NashornScriptEngine.java:381)
... 21 common frames omitted
error:
* +main+step1:
Failed to process task config templates
Task state is saved at digdag.status/20160309T000000+0900 directory.
Run command with --session-time '2016-03-09 00:00:00' argument to retry failed tasks.
digdag tasks are defined in a key-value map, and digdag executes tasks in the literal order of the text in the yaml file. This can be fragile, as key-value maps in YAML do not have a well-defined semantic order: http://yaml.org/spec/1.2/spec.html#id2765608
Some examples of how this can become painful:
Suggested solution alternatives:

Use a new file suffix such as `.digdag` or `.dd`. We can still say that the syntax is YAML, and people can configure editors to do YAML syntax highlighting etc. of these files.

Define tasks as a list instead of a map:

```yaml
tasks:
  - name: task1
    sh>: echo first task
  - name: task2
    td>: queries/second_task.sql
    tasks:
      - name: subtask1
        sh>: echo sub
      - name: subtask2
        sh>: echo tasks
```
Some benefits of this approach are that workflow files would no longer carry the generic `.yml` suffix, etc.

`for_each>` parameters can be parameterized like below by utilizing YAML anchors:
```yaml
run: +parameterized_for_each

_export:
  foos: &FOOS
    - 1
    - 2

+parameterized_for_each:
  for_each>:
    foo: *FOOS
  _do:
    sh>: "echo hello ${foo}"
```
But it might be useful to allow for parameterizing `for_each>` using actual digdag parameters. E.g.:
```yaml
run: +parameterized_for_each

_export:
  foos:
    - 1
    - 2

+parameterized_for_each:
  for_each>:
    foo: @foos
  _do:
    sh>: "echo hello ${foo}"
```
And using parameters explicitly set by e.g. a `py>` task:
```yaml
run: +main

+main:
  +export_foos:
    py>: tasks.export_foos
  +parameterized_for_each:
    for_each>:
      foo: ${foos}
    _do:
      sh>: "echo hello ${foo}"
```

```python
import digdag

def export_foos():
    digdag.env.store({"foos": [1, 2]})
```
Attempting to parameterize `for_each>` like this currently fails with the below error:
2016-03-30 10:18:17 +0900: Digdag v0.4.2
2016-03-30 10:18:18 +0900 [WARN] (main): Reusing the last session time 2016-03-29T00:00:00+09:00.
2016-03-30 10:18:18 +0900 [INFO] (main): Using session digdag.status/20160329T000000+0900.
2016-03-30 10:18:18 +0900 [INFO] (main): Starting a new session repository id=1 workflow name=+main session_time=2016-03-29T00:00:00+09:00
2016-03-30 10:18:19 +0900 [INFO] (0020@+main+export_foos): py>: tasks.export_foos
2016-03-30 10:18:19 +0900 [INFO] (0020@+main+parameterized_for_each): for_each>: {foo=[1,2]}
2016-03-30 10:18:19 +0900 [ERROR] (0020@+main+parameterized_for_each): Task failed
io.digdag.client.config.ConfigException: Expected array type for key 'foo' but got "[1,2]" (string)
at io.digdag.client.config.Config.propagateConvertException(Config.java:400)
at io.digdag.client.config.Config.readObject(Config.java:391)
at io.digdag.client.config.Config.get(Config.java:235)
at io.digdag.client.config.Config.getList(Config.java:277)
at io.digdag.standards.operator.ForEachOperatorFactory$ForEachOperator.runTask(ForEachOperatorFactory.java:67)
at io.digdag.standards.operator.BaseOperator.run(BaseOperator.java:49)
at io.digdag.core.agent.OperatorManager.callExecutor(OperatorManager.java:238)
at io.digdag.cli.Run$OperatorManagerWithSkip.callExecutor(Run.java:653)
at io.digdag.core.agent.OperatorManager.runWithArchive(OperatorManager.java:193)
at io.digdag.core.agent.OperatorManager.lambda$runWithHeartbeat$1(OperatorManager.java:130)
at io.digdag.core.agent.CurrentDirectoryArchiveManager.withExtractedArchive(CurrentDirectoryArchiveManager.java:20)
at io.digdag.core.agent.OperatorManager.runWithHeartbeat(OperatorManager.java:129)
at io.digdag.core.agent.OperatorManager.run(OperatorManager.java:107)
at io.digdag.cli.Run$OperatorManagerWithSkip.run(Run.java:635)
at io.digdag.core.agent.LocalAgent.lambda$run$0(LocalAgent.java:61)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Can not deserialize instance of java.util.ArrayList out of VALUE_STRING token
at [Source: N/A; line: -1, column: -1]
at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148)
at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:854)
at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:850)
at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.handleNonArray(CollectionDeserializer.java:292)
at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:227)
at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:217)
at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:25)
at com.fasterxml.jackson.databind.ObjectMapper._readValue(ObjectMapper.java:3703)
at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2072)
at io.digdag.client.config.Config.readObject(Config.java:388)
... 18 common frames omitted
error:
* +main+parameterized_for_each:
Expected array type for key 'foo' but got "[1,2]" (string)
Task state is saved at digdag.status/20160329T000000+0900 directory.
Run command with --session '2016-03-29 00:00:00' argument to retry failed tasks.
I'm uncertain whether the best approach would be to make `${...}` expansion more intelligent about parameter types, not coercing everything to a string, or whether it would make more sense to introduce another syntax.
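As a rough sketch of the first option, assuming a Jackson-based evaluator (the `interpolate` helper is hypothetical): when a template is exactly one `${...}` reference, the parameter's native JSON value could be returned instead of a coerced string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.node.ObjectNode;
import com.fasterxml.jackson.databind.node.TextNode;

public class TypedTemplates
{
    private static final Pattern SINGLE_REF = Pattern.compile("^\\$\\{(\\w+)\\}$");

    // A template that is a single ${name} reference keeps its native JSON
    // type (array, number, object); anything else interpolates to a string.
    static JsonNode eval(String template, ObjectNode params)
    {
        Matcher m = SINGLE_REF.matcher(template.trim());
        if (m.matches() && params.has(m.group(1))) {
            return params.get(m.group(1));
        }
        return TextNode.valueOf(interpolate(template, params));
    }

    // hypothetical: ordinary string interpolation of ${...} references
    static String interpolate(String template, ObjectNode params)
    {
        return template;  // stub for the sketch
    }
}
```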
Currently `~/.digdag/config` is a Java properties file. It would be convenient to use the HOCON format and the Typesafe Config library to allow for things like:

```
email {
  host: smtp.foo.bar
  user: ...
  ...
}
```

and to gain useful functionality like system property and environment variable substitution for free.
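A minimal sketch of loading that file with the Typesafe Config library (the file path and keys are illustrative):

```java
import java.io.File;
import com.typesafe.config.Config;
import com.typesafe.config.ConfigFactory;

public class HoconConfigExample
{
    public static void main(String[] args)
    {
        File file = new File(System.getProperty("user.home"), ".digdag/config");
        // parse HOCON, then resolve ${...} substitutions against system
        // properties and environment variables
        Config config = ConfigFactory.parseFile(file)
                .withFallback(ConfigFactory.systemProperties())
                .withFallback(ConfigFactory.systemEnvironment())
                .resolve();
        System.out.println(config.getString("email.host"));
    }
}
```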
Do people use local timezones in their data pipelines, applications, servers, etc.? In my experience, local timezone application happens in the presentation layer (UI etc.).

AFAIK, EC2 and GCE default to UTC.
Forget to push?
Design:

`$ digdag init <name>` creates `digdag.yml` and `<name>.yml` files in the `<name>` directory as follows:

```yaml
# digdag.yml
name: <name>
workflows:
  - <name>.yml
```

```yaml
# <name>.yml
+step1:
  sh>: some example here...
+step2:
  sh>: some example here...
```

All digdag commands (run, push, etc.) use `digdag.yml` as the entry point. The `name` section in `digdag.yml` is the name of the project. The name of a workflow matches its file name (so workflow yml files don't need a `name` section). `digdag push` doesn't require the `-f` option; it reads `./digdag.yml`.
When I ran a workflow on a Digdag server running on localhost, this error occurred:
2016-04-05 11:21:09 +0900 [INFO] (0037@+main+step1): sh>: ./tasks/bin/enc-tool -e development-ec2 -a 1 -c ENCRYPT -t 200
/bin/sh: ./tasks/bin/enc-tool: Permission denied
2016-04-05 11:21:09 +0900 [ERROR] (0037@+main+step1): Task failed
java.lang.RuntimeException: Command failed with code 126
at io.digdag.standards.operator.ShOperatorFactory$ShOperator.runTask(ShOperatorFactory.java:115)
at io.digdag.standards.operator.BaseOperator.run(BaseOperator.java:49)
at io.digdag.core.agent.OperatorManager.callExecutor(OperatorManager.java:241)
at io.digdag.core.agent.OperatorManager.runWithWorkspace(OperatorManager.java:196)
at io.digdag.core.agent.OperatorManager.lambda$runWithHeartbeat$1(OperatorManager.java:133)
at io.digdag.core.agent.LocalWorkspaceManager.withExtractedArchive(LocalWorkspaceManager.java:63)
at io.digdag.core.agent.OperatorManager.runWithHeartbeat(OperatorManager.java:132)
at io.digdag.core.agent.OperatorManager.run(OperatorManager.java:109)
at io.digdag.core.agent.LocalAgent.lambda$run$0(LocalAgent.java:61)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
It works in local mode.
Rename the `digdag.yml` project file; workflow files change from `.yml` to `.dig`:

- `.dig` files in the project directory are picked up automatically by digdag.
- A `--project` option specifies the project directory.
- The `digdag run` `-f` option is removed.
- `digdag init` creates a `.digdag` directory in the project directory and a `.digdag/config` file.
- digdag looks for the `.digdag` directory, starting in the current directory and then recursing into parent directories (like git).
- `digdag push` pushes the entire project (as identified by `.digdag`) in some parent dir, even if executed in a subdir.
- Paths in a `.dig` file are resolved relative to the path of the `.dig` file (see the example after this list).
- The `digdag run <workflow name>` parameter can include the `.dig` suffix, so that `digdag run foo` and `digdag run foo.dig` are identical invocations.

Given a project with:

```
.digdag/config
subdir/workflow.dig
queries/foo.sql
```

a user can both be in the project dir and do:

```
digdag run subdir/workflow
```

or:

```
cd subdir/
digdag run workflow
```

with the same result. The `subdir/workflow.dig` file must reference the query file as `../queries/foo.sql`.
When a user tries using a parameter that requires a '_' as a prefix but doesn't include it, it should fail and mention that the '_' is required.

For example, if a user doesn't include the '_' prefix in '_parallel', the workflow is executed sequentially, but it's not necessarily obvious that it's not operating as the user expected.
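A sketch of one possible validation pass (the directive list and helper are illustrative, not digdag's actual internals):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class UnderscoreCheck
{
    private static final Set<String> DIRECTIVES = new HashSet<>(
            Arrays.asList("_parallel", "_export", "_do", "_error", "_retry"));

    // Fail fast when a key matches a built-in directive except for the
    // leading underscore, instead of silently ignoring it.
    static void check(Set<String> taskConfigKeys)
    {
        for (String key : taskConfigKeys) {
            if (!key.startsWith("_") && DIRECTIVES.contains("_" + key)) {
                throw new IllegalArgumentException(
                        "Unknown parameter '" + key + "'; did you mean '_" + key
                        + "'? The leading '_' is required.");
            }
        }
    }
}
```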
The `next session time` and `next runs at` timezones are different, which is a bit confusing.
$ digdag schedules
2016-04-11 14:46:04 -0700: Digdag v0.5.9
Schedules:
id: 2
repository: dano-churn-prediction-poc
workflow: +main
next session time: 2016-04-13 00:00:00 +0900
next runs at: 2016-04-12 15:00:00 -0700 (24h 13m 55s later)
1 entries.
Use `digdag workflows +NAME` to show workflow details.
```yaml
td>: sample.sql
create_table: (db_name).(table_name)
```

throws an exception in td-client-java, since an invalid table name `db_name.table_name` is passed as an argument of createTable.
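One possible fix, sketched under the assumption that the operator splits the qualified name itself before calling td-client-java (treat the exact TDClient method signature as an assumption):

```java
import com.treasuredata.client.TDClient;

public class CreateTableFix
{
    // Split "(db_name).(table_name)" so only a bare table name reaches the
    // API; fall back to the session's default database when unqualified.
    static void createTable(TDClient client, String defaultDatabase, String name)
    {
        String database = defaultDatabase;
        String table = name;
        int dot = name.indexOf('.');
        if (dot >= 0) {
            database = name.substring(0, dot);
            table = name.substring(dot + 1);
        }
        client.createTableIfNotExists(database, table);
    }
}
```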
digdag schedules
2016-04-11 15:04:37 -0700: Digdag v0.5.9
Schedules:
id: 2
repository: dano-churn-prediction-poc
workflow: +main
next session time: 2016-04-13 00:00:00 +0900
next runs at: 2016-04-12 15:00:00 -0700 (23h 55m 21s later)
1 entries.
Use `digdag workflows +NAME` to show workflow details.
and
digdag check
2016-04-11 15:04:54 -0700: Digdag v0.5.9
System default timezone: America/Los_Angeles
Definitions (1 workflows):
+main (1 tasks)
Parameters:
timezone: "America/Los_Angeles"
Schedules (0 entries):
As a user I'm really confused about how scheduling works =)
Which is preferred? init or new?
Currently the only way to disable a scheduled workflow (so that it doesn't get run) is to push another revision to the same repo.
Though the standard syntax is great for understandability, utilizing 'after' to generate a DAG explicitly is very powerful for optimizing the overall processing time of a workflow. We should emphasize it in Digdag's docs in the future.
so that `digdag push` excludes those files.
E.g.:

```yaml
_schedule:
  weekly>: 6, 07:00:00
```
Note the line saying `Archiving digdag.archive.tar.gz` below:
digdag push nasdaq_analysis -r 3
2016-05-11 16:54:44 +0900: Digdag v0.6.1
Creating digdag.archive.tar.gz...
Archiving digdag.archive.tar.gz
Archiving digdag.yml
Archiving nasdaq_analysis.yml
Archiving queries/daily_open.sql
Archiving queries/monthly_open.sql
Workflows:
nasdaq_analysis
...
It would be useful if we could use pre-defined variables within `_export`:

```yaml
_export:
  query_start: 2016-03-01
  query_end: ${session_date}
```
`empty` is an adjective while `create` and `drop_table` are verbs.
The push command requires the -r option. It should be OK to make it optional as long as idempotency of the operation is considered.
Now, the client needs to include a .digdag.yml file in a project archive file (tar.gz), but this makes it difficult to create client libraries such as Ruby clients.

The idea here is that the server reads digdag.yml (not .digdag.yml) as described at #45. Clients just create a tar.gz including digdag.yml and send it to the server. Archiving looks like this command:

```
tar -C $(dirname path/to/digdag.yml) --exclude=".*" -czvf - .
```

The client must include digdag.yml in the archive.
I.e., the below workflow definition should fail instead of silently just executing `echo bar`:

```yaml
run: +foo

+foo:
  sh>: echo bar
  py>: baz
```
Also, it might be worth considering whether multiple identical keys should fail. In the below workflow definition, the second `sh>` operator implicitly overrides the first when the YAML is parsed, because the YAML parser keeps only the last value for a duplicated key.

```yaml
run: +foo

+foo:
  sh>: echo bar
  sh>: echo baz
```
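For what it's worth, Jackson's YAML parser can be asked to reject duplicate keys at parse time; a sketch, assuming the workflow loader goes through Jackson:

```java
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.yaml.YAMLFactory;

public class StrictYamlLoad
{
    public static void main(String[] args) throws Exception
    {
        // STRICT_DUPLICATE_DETECTION turns the silent "last key wins"
        // behavior into a parse-time error.
        ObjectMapper yaml = new ObjectMapper(new YAMLFactory()
                .enable(JsonParser.Feature.STRICT_DUPLICATE_DETECTION));
        // this now throws instead of quietly keeping only "echo baz"
        yaml.readTree("+foo:\n  sh>: echo bar\n  sh>: echo baz\n");
    }
}
```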
E.g.:

```yaml
+foo:
  td>: queries/foo.sql
  google_sheet: domain.com/foo/bar
```

would perform the equivalent of:

```
td query \
  --result 'gspreadsheet://domain.com/foo/bar' \
  -w -d ${td.database} \
  -q queries/foo.sql
```
It would be useful for users to be able to filter logs from workflows executed on the server.

Currently all output from a workflow and its tasks, both stdout/stderr and Java log messages, is concatenated into a single log file. Thus it's not easy to reliably filter out e.g. only stdout, or to filter on Java logging level.

I propose changing the digdag log file format to a structured stream of messages (e.g. msgpack) that combines the raw log message with metadata: a timestamp, whether the message is from stdout/stderr or Java logging, and, in the latter case, the logging level. I.e. something like:
```
// ...
{
  "ts": 1463387968,
  "stream": "OUT",
  "message": "hello world"
},
{
  "ts": 1463388012,
  "stream": "LOG",
  "level": "WARN",
  "message": "the foo failed to bar the baz because quux"
},
{
  "ts": 1463388153,
  "stream": "ERR",
  "message": "The quick brown fox jumps over the lazy dog"
},
{
  "ts": 1463388265,
  "stream": "LOG",
  "level": "DEBUG",
  "message": "..."
},
// ...
```
We could then have the server expose a REST API endpoint for querying this stream, or have the client fetch all log entries and perform the filtering client-side.
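A minimal sketch of the writer side using Jackson (plain JSON here; a MessagePack-backed ObjectMapper could serialize the same nodes to msgpack):

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

public class StructuredLogWriter
{
    private final ObjectMapper mapper = new ObjectMapper();

    // One structured entry per line: timestamp, stream ("OUT", "ERR", or
    // "LOG"), optional level for "LOG" entries, and the raw message.
    String entry(long ts, String stream, String level, String message)
            throws Exception
    {
        ObjectNode node = mapper.createObjectNode()
                .put("ts", ts)
                .put("stream", stream)
                .put("message", message);
        if (level != null) {
            node.put("level", level);
        }
        return mapper.writeValueAsString(node);
    }
}
```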
digdag start test foobar --session "2016-04-19T17:22:58Z"
2016-04-19 10:23:11 -0700: Digdag v0.6.0
error: --session must be hourly, daily, now, "yyyy-MM-dd", or "yyyy-MM-dd HH:mm:SS" format: 2016-04-19T17:22:58Z
Currently the error shown when a parameter is not created, or does not have a value, is not obvious. In such instances the error log should say "parameter or template named _____ is not found", or something similar.

This is how the error currently shows up for a `td>` operator query with an undefined parameter embedded within the query:

2016-05-16 22:58:05 -0700 ERROR: Task failed
io.digdag.client.config.ConfigException: Failed to load a template file
at io.digdag.spi.TemplateEngine.templateCommand(TemplateEngine.java:30)
at io.digdag.standards.operator.td.TdOperatorFactory$TdOperator.runTask(TdOperatorFactory.java:87)
at io.digdag.standards.operator.BaseOperator.run(BaseOperator.java:49)
at io.digdag.core.agent.OperatorManager.callExecutor(OperatorManager.java:255)
at io.digdag.cli.Run$OperatorManagerWithSkip.callExecutor(Run.java:661)
at io.digdag.core.agent.OperatorManager.runWithWorkspace(OperatorManager.java:200)
at io.digdag.core.agent.OperatorManager.lambda$runWithHeartbeat$1(OperatorManager.java:129)
at io.digdag.core.agent.NoopWorkspaceManager.withExtractedArchive(NoopWorkspaceManager.java:20)
at io.digdag.core.agent.OperatorManager.runWithHeartbeat(OperatorManager.java:128)
at io.digdag.core.agent.OperatorManager.run(OperatorManager.java:105)
at io.digdag.cli.Run$OperatorManagerWithSkip.run(Run.java:643)
at io.digdag.core.agent.LocalAgent.lambda$run$0(LocalAgent.java:61)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: io.digdag.spi.TemplateException: Failed to evaluate JavaScript code: SELECT
count(*)
FROM
${table_name}
at io.digdag.core.agent.ConfigEvalEngine.invokeTemplate(ConfigEvalEngine.java:91)
at io.digdag.core.agent.ConfigEvalEngine.template(ConfigEvalEngine.java:185)
at io.digdag.core.agent.ConfigEvalEngine.templateFile(ConfigEvalEngine.java:206)
at io.digdag.spi.TemplateEngine.templateCommand(TemplateEngine.java:27)
... 16 common frames omitted
Caused by: javax.script.ScriptException: ReferenceError: "table_name" is not defined in at line number 4
at jdk.nashorn.api.scripting.NashornScriptEngine.throwAsScriptException(NashornScriptEngine.java:467)
at jdk.nashorn.api.scripting.NashornScriptEngine.invokeImpl(NashornScriptEngine.java:389)
at jdk.nashorn.api.scripting.NashornScriptEngine.invokeFunction(NashornScriptEngine.java:190)
at io.digdag.core.agent.ConfigEvalEngine.invokeTemplate(ConfigEvalEngine.java:88)
... 19 common frames omitted
Caused by: jdk.nashorn.internal.runtime.ECMAException: ReferenceError: "table_name" is not defined
at jdk.nashorn.internal.runtime.ECMAErrors.error(ECMAErrors.java:57)
at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:319)
at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:291)
at jdk.nashorn.internal.objects.Global.noSuchProperty(Global.java:1428)
at jdk.nashorn.internal.scripts.Script$Recompilation$32$13$^function_.L:1(:4)
at jdk.nashorn.internal.scripts.Script$Recompilation$28$62AA$^eval_.template(:56)
at jdk.nashorn.internal.runtime.ScriptFunctionData.invoke(ScriptFunctionData.java:627)
at jdk.nashorn.internal.runtime.ScriptFunction.invoke(ScriptFunction.java:494)
at jdk.nashorn.internal.runtime.ScriptRuntime.apply(ScriptRuntime.java:393)
at jdk.nashorn.api.scripting.ScriptObjectMirror.callMember(ScriptObjectMirror.java:199)
at jdk.nashorn.api.scripting.NashornScriptEngine.invokeImpl(NashornScriptEngine.java:383)
... 21 common frames omitted
error:
The query:

```sql
SELECT count(*) FROM ${table_name}
```
./pkg/digdag-0.4.2-SNAPSHOT.jar run -f examples/require.yml --all
2016-03-29 10:26:43 +0900: Digdag v0.4.2
2016-03-29 10:26:44 +0900 [WARN] (main): Using a new session time 2016-03-29T00:00:00+09:00.
2016-03-29 10:26:44 +0900 [INFO] (main): Using session digdag.status/20160329T000000+0900.
2016-03-29 10:26:44 +0900 [INFO] (main): Starting a new session repository id=1 workflow name=+require session_time=2016-03-29T00:00:00+09:00
2016-03-29 10:26:45 +0900 [INFO] (0020@+require+task1+require_data): require>: +make_data
2016-03-29 10:26:45 +0900 [INFO] (0020@+require+task1+require_data): Starting a new session repository id=1 workflow name=+make_data session_time=2016-03-29T00:00:00+09:00
2016-03-29 10:26:45 +0900 [ERROR] (0020@+require+task1+require_data): Task failed, retrying
io.digdag.spi.TaskExecutionException: Retrying this task after 1 seconds
at io.digdag.core.agent.RequireOperatorFactory$RequireOperator.run(RequireOperatorFactory.java:91)
at io.digdag.core.agent.OperatorManager.callExecutor(OperatorManager.java:238)
at io.digdag.cli.Run$OperatorManagerWithSkip.callExecutor(Run.java:653)
at io.digdag.core.agent.OperatorManager.runWithArchive(OperatorManager.java:193)
at io.digdag.core.agent.OperatorManager.lambda$runWithHeartbeat$1(OperatorManager.java:130)
at io.digdag.core.agent.CurrentDirectoryArchiveManager.withExtractedArchive(CurrentDirectoryArchiveManager.java:20)
at io.digdag.core.agent.OperatorManager.runWithHeartbeat(OperatorManager.java:129)
at io.digdag.core.agent.OperatorManager.run(OperatorManager.java:107)
at io.digdag.cli.Run$OperatorManagerWithSkip.run(Run.java:635)
at io.digdag.core.agent.LocalAgent.lambda$run$0(LocalAgent.java:61)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2016-03-29 10:26:45 +0900 [ERROR] (0020@+require+task1+require_data): Task failed
org.skife.jdbi.v2.exceptions.UnableToCreateStatementException: org.h2.jdbc.JdbcSQLException: Syntax error in SQL statement "UPDATE TASKS SET UPDATED_AT = NOW(), STATE = ?, RETRY_AT = TIMESTAMPADD('SECOND', STATE_PARAMS = ?, ?,[*] NOW()) WHERE ID = ? AND STATE = ? "; expected "[, ::, *, /, %, +, -, ||, ~, !~, NOT, LIKE, REGEXP, IS, IN, BETWEEN, AND, OR, )"; SQL statement:
update tasks set updated_at = now(), state = ?, retry_at = TIMESTAMPADD('SECOND', state_params = ?, ?, now()) where id = ? and state = ? [42001-191] [statement:"update tasks set updated_at = now(), state = :newState, retry_at = TIMESTAMPADD('SECOND', state_params = :stateParams, :retryInterval, now()) where id = :id and state = :oldState", located:"update tasks set updated_at = now(), state = :newState, retry_at = TIMESTAMPADD('SECOND', state_params = :stateParams, :retryInterval, now()) where id = :id and state = :oldState", rewritten:"update tasks set updated_at = now(), state = ?, retry_at = TIMESTAMPADD('SECOND', state_params = ?, ?, now()) where id = ? and state = ?", arguments:{ positional:{}, named:{}, finder:[]}]
at org.skife.jdbi.v2.SQLStatement.internalExecute(SQLStatement.java:1306)
at org.skife.jdbi.v2.Update.execute(Update.java:56)
at io.digdag.core.database.DatabaseSessionStoreManager$DatabaseTaskControlStore.setRetryWaitingState(DatabaseSessionStoreManager.java:695)
at io.digdag.core.workflow.TaskControl.setRunningToRetryWaiting(TaskControl.java:272)
at io.digdag.core.workflow.WorkflowExecutor.retryTask(WorkflowExecutor.java:937)
at io.digdag.core.workflow.WorkflowExecutor.lambda$retryTask$21(WorkflowExecutor.java:849)
at io.digdag.core.database.DatabaseSessionStoreManager.lambda$lockTaskIfExists$14(DatabaseSessionStoreManager.java:278)
at io.digdag.core.database.BasicDatabaseStoreManager.lambda$transaction$0(BasicDatabaseStoreManager.java:192)
at org.skife.jdbi.v2.tweak.transactions.LocalTransactionHandler.inTransaction(LocalTransactionHandler.java:183)
at org.skife.jdbi.v2.BasicHandle.inTransaction(BasicHandle.java:330)
at io.digdag.core.database.BasicDatabaseStoreManager.transaction(BasicDatabaseStoreManager.java:192)
at io.digdag.core.database.DatabaseSessionStoreManager.lockTaskIfExists(DatabaseSessionStoreManager.java:272)
at io.digdag.core.workflow.WorkflowExecutor.retryTask(WorkflowExecutor.java:848)
at io.digdag.core.agent.InProcessTaskCallbackApi.retryTask(InProcessTaskCallbackApi.java:113)
at io.digdag.core.agent.OperatorManager.runWithArchive(OperatorManager.java:209)
at io.digdag.core.agent.OperatorManager.lambda$runWithHeartbeat$1(OperatorManager.java:130)
at io.digdag.core.agent.CurrentDirectoryArchiveManager.withExtractedArchive(CurrentDirectoryArchiveManager.java:20)
at io.digdag.core.agent.OperatorManager.runWithHeartbeat(OperatorManager.java:129)
at io.digdag.core.agent.OperatorManager.run(OperatorManager.java:107)
at io.digdag.cli.Run$OperatorManagerWithSkip.run(Run.java:635)
at io.digdag.core.agent.LocalAgent.lambda$run$0(LocalAgent.java:61)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.h2.jdbc.JdbcSQLException: Syntax error in SQL statement "UPDATE TASKS SET UPDATED_AT = NOW(), STATE = ?, RETRY_AT = TIMESTAMPADD('SECOND', STATE_PARAMS = ?, ?,[*] NOW()) WHERE ID = ? AND STATE = ? "; expected "[, ::, *, /, %, +, -, ||, ~, !~, NOT, LIKE, REGEXP, IS, IN, BETWEEN, AND, OR, )"; SQL statement:
update tasks set updated_at = now(), state = ?, retry_at = TIMESTAMPADD('SECOND', state_params = ?, ?, now()) where id = ? and state = ? [42001-191]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:345)
at org.h2.message.DbException.getSyntaxError(DbException.java:205)
at org.h2.command.Parser.getSyntaxError(Parser.java:535)
at org.h2.command.Parser.read(Parser.java:3170)
at org.h2.command.Parser.readFunction(Parser.java:2506)
at org.h2.command.Parser.readTerm(Parser.java:2791)
at org.h2.command.Parser.readFactor(Parser.java:2308)
at org.h2.command.Parser.readSum(Parser.java:2295)
at org.h2.command.Parser.readConcat(Parser.java:2265)
at org.h2.command.Parser.readCondition(Parser.java:2115)
at org.h2.command.Parser.readAnd(Parser.java:2087)
at org.h2.command.Parser.readExpression(Parser.java:2079)
at org.h2.command.Parser.parseUpdate(Parser.java:751)
at org.h2.command.Parser.parsePrepared(Parser.java:465)
at org.h2.command.Parser.parse(Parser.java:315)
at org.h2.command.Parser.parse(Parser.java:291)
at org.h2.command.Parser.prepareCommand(Parser.java:252)
at org.h2.engine.Session.prepareLocal(Session.java:560)
at org.h2.engine.Session.prepareCommand(Session.java:501)
at org.h2.jdbc.JdbcConnection.prepareCommand(JdbcConnection.java:1188)
at org.h2.jdbc.JdbcPreparedStatement.<init>(JdbcPreparedStatement.java:73)
at org.h2.jdbc.JdbcConnection.prepareStatement(JdbcConnection.java:276)
at com.zaxxer.hikari.pool.ProxyConnection.prepareStatement(ProxyConnection.java:308)
at com.zaxxer.hikari.pool.HikariProxyConnection.prepareStatement(HikariProxyConnection.java)
at org.skife.jdbi.v2.DefaultStatementBuilder.create(DefaultStatementBuilder.java:54)
at org.skife.jdbi.v2.SQLStatement.internalExecute(SQLStatement.java:1302)
... 25 common frames omitted
2016-03-29 10:26:45 +0900 [INFO] (0020@+make_data+step1): sh>: echo "creating data..."
creating data...
error:
* +require+task1+require_data:
org.h2.jdbc.JdbcSQLException: Syntax error in SQL statement "UPDATE TASKS SET UPDATED_AT = NOW(), STATE = ?, RETRY_AT = TIMESTAMPADD('SECOND', STATE_PARAMS = ?, ?,[*] NOW()) WHERE ID = ? AND STATE = ? "; expected "[, ::, *, /, %, +, -, ||, ~, !~, NOT, LIKE, REGEXP, IS, IN, BETWEEN, AND, OR, )"; SQL statement:
update tasks set updated_at = now(), state = ?, retry_at = TIMESTAMPADD('SECOND', state_params = ?, ?, now()) where id = ? and state = ? [42001-191] [statement:"update tasks set updated_at = now(), state = :newState, retry_at = TIMESTAMPADD('SECOND', state_params = :stateParams, :retryInterval, now()) where id = :id and state = :oldState", located:"update tasks set updated_at = now(), state = :newState, retry_at = TIMESTAMPADD('SECOND', state_params = :stateParams, :retryInterval, now()) where id = :id and state = :oldState", rewritten:"update tasks set updated_at = now(), state = ?, retry_at = TIMESTAMPADD('SECOND', state_params = ?, ?, now()) where id = ? and state = ?", arguments:{ positional:{}, named:{}, finder:[]}]
Task state is saved at digdag.status/20160329T000000+0900 directory.
Run command with --session '2016-03-29 00:00:00' argument to retry failed tasks.
Running `digdag workflows <project_name> +<workflow_name>` should show the original file, not a slightly deviated version of it, including any comments or other notations.
First iteration; just a list of workflow attempts with results.

| Project | Workflow | Start Time | Elapsed time | Status | ID |
|---|---|---|---|---|---|
| nasdaq | analysis | 2016-04-18 12:45:07 | 1 min 3 sec | Running | 712934839855 |
| nasdaq | ingest | 2016-04-18 12:34:56 | 2 min 17 sec | Succeeded | 712934871234 |
| ... | | | | | |
Setting the `-r` option is not always useful for users; most initial users want an auto-generated sequence.

The idea here is to add something to the server (and client) so that they generate an id for each revision in a project. (The id is not globally unique; it's scoped to a project.)
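A sketch of one simple scheme (a hypothetical helper, not a committed design): treat existing numeric revision names within the project as a sequence and increment the maximum.

```java
import java.util.List;

public class RevisionNames
{
    // Per-project sequence: max numeric revision name + 1, starting at "1".
    static String nextRevisionName(List<String> existingNames)
    {
        int max = 0;
        for (String name : existingNames) {
            try {
                max = Math.max(max, Integer.parseInt(name));
            }
            catch (NumberFormatException ignored) {
                // user-chosen names (e.g. git hashes) don't join the sequence
            }
        }
        return Integer.toString(max + 1);
    }
}
```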
```
$ cat digdag.yml
run: +main
timezone: "America/Los_Angeles"

+main:
  _schedule:
    cron>: 42 4 1 * *
  sh>: echo hello world
```
$ digdag push dano-cron-test -r 1
2016-04-11 15:46:26 -0700: Digdag v0.5.9
Creating digdag.archive.tar.gz...
Archiving digdag.yml
Workflows:
+main
error: Status code 400: {"message":"Parameter 'cron' is required but not set","status":400}
Add a systemConfig to store access log files to digdag-server, maybe like `server.access-log.path = <path>`.
Many operators need parameters such as hostname, password, etc. This is connection information, and we don't want to put it in workflow yml files because those files could be uploaded to GitHub.

The idea here is to put this information in a file in the home directory. For example, the ~/.digdag/config file has the following configuration:

```
client.server-endpoint = ...
client.http-header.authorization = ...
params.td.apikey = ...
params.mysql.hostname = ...
params.mysql.username = ...
params.mysql.password = ...
```

`digdag run` and `digdag schedule` take those parameters.

In this idea, `digdag push` also takes those parameters and sets them as the defaultParams of a new revision. This is not always the expected behavior, especially when a project could be pushed by multiple people, but it's acceptable for now.
Deleting a project should do this:
Implementation limitations are:
I ran into this error when running a workflow in local mode.
2016-04-01 14:25:30 +0900 [ERROR] (local-agent-0): Uncaught exception
org.skife.jdbi.v2.exceptions.UnableToCreateStatementException: org.h2.jdbc.JdbcSQLException: Table "QUEUED_SHARED_TASK_LOCKS" not found; SQL statement:
with recursive t (queue_id) as ((select queue_id from queued_shared_task_locks where hold_expire_time is null order by queue_id limit 1) union all select (select queue_id from queued_shared_task_locks where hold_expire_time is null and queue_id > t.queue_id order by queue_id limit 1) from t where t.queue_id is not null) select queue_id as id from t where queue_id is not null [42102-191] [statement:"with recursive t (queue_id) as ((select queue_id from queued_shared_task_locks where hold_expire_time is null order by queue_id limit 1) union all select (select queue_id from queued_shared_task_locks where hold_expire_time is null and queue_id > t.queue_id order by queue_id limit 1) from t where t.queue_id is not null) select queue_id as id from t where queue_id is not null", located:"with recursive t (queue_id) as ((select queue_id from queued_shared_task_locks where hold_expire_time is null order by queue_id limit 1) union all select (select queue_id from queued_shared_task_locks where hold_expire_time is null and queue_id > t.queue_id order by queue_id limit 1) from t where t.queue_id is not null) select queue_id as id from t where queue_id is not null", rewritten:"with recursive t (queue_id) as ((select queue_id from queued_shared_task_locks where hold_expire_time is null order by queue_id limit 1) union all select (select queue_id from queued_shared_task_locks where hold_expire_time is null and queue_id > t.queue_id order by queue_id limit 1) from t where t.queue_id is not null) select queue_id as id from t where queue_id is not null", arguments:{ positional:{}, named:{}, finder:[]}]
at org.skife.jdbi.v2.SQLStatement.internalExecute(SQLStatement.java:1306)
at org.skife.jdbi.v2.Query.fold(Query.java:173)
at org.skife.jdbi.v2.Query.list(Query.java:82)
at org.skife.jdbi.v2.Query.list(Query.java:75)
at io.digdag.core.database.DatabaseTaskQueueStore.lambda$lockSharedTasks$147(DatabaseTaskQueueStore.java:195)
at io.digdag.core.database.BasicDatabaseStoreManager.lambda$transaction$54(BasicDatabaseStoreManager.java:192)
at org.skife.jdbi.v2.tweak.transactions.LocalTransactionHandler.inTransaction(LocalTransactionHandler.java:183)
at org.skife.jdbi.v2.BasicHandle.inTransaction(BasicHandle.java:330)
at io.digdag.core.database.BasicDatabaseStoreManager.transaction(BasicDatabaseStoreManager.java:192)
at io.digdag.core.database.DatabaseTaskQueueStore.lockSharedTasks(DatabaseTaskQueueStore.java:176)
at io.digdag.core.database.DatabaseTaskQueueFactory$DatabaseTaskQueue.lockSharedTasks(DatabaseTaskQueueFactory.java:87)
at io.digdag.core.agent.LocalAgent.run(LocalAgent.java:57)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.h2.jdbc.JdbcSQLException: Table "QUEUED_SHARED_TASK_LOCKS" not found; SQL statement:
with recursive t (queue_id) as ((select queue_id from queued_shared_task_locks where hold_expire_time is null order by queue_id limit 1) union all select (select queue_id from queued_shared_task_locks where hold_expire_time is null and queue_id > t.queue_id order by queue_id limit 1) from t where t.queue_id is not null) select queue_id as id from t where queue_id is not null [42102-191]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:345)
at org.h2.message.DbException.get(DbException.java:179)
at org.h2.message.DbException.get(DbException.java:155)
at org.h2.command.Parser.readTableOrView(Parser.java:5349)
at org.h2.command.Parser.readTableFilter(Parser.java:1245)
at org.h2.command.Parser.parseSelectSimpleFromPart(Parser.java:1884)
at org.h2.command.Parser.parseSelectSimple(Parser.java:2032)
at org.h2.command.Parser.parseSelectSub(Parser.java:1878)
at org.h2.command.Parser.parseSelectUnion(Parser.java:1699)
at org.h2.command.Parser.parseSelectSub(Parser.java:1874)
at org.h2.command.Parser.parseSelectUnion(Parser.java:1699)
at org.h2.command.Parser.parseSelect(Parser.java:1687)
at org.h2.command.Parser.parseWith(Parser.java:4745)
at org.h2.command.Parser.parsePrepared(Parser.java:479)
at org.h2.command.Parser.parse(Parser.java:315)
at org.h2.command.Parser.parse(Parser.java:287)
at org.h2.command.Parser.prepareCommand(Parser.java:252)
at org.h2.engine.Session.prepareLocal(Session.java:560)
at org.h2.engine.Session.prepareCommand(Session.java:501)
at org.h2.jdbc.JdbcConnection.prepareCommand(JdbcConnection.java:1188)
at org.h2.jdbc.JdbcPreparedStatement.<init>(JdbcPreparedStatement.java:73)
at org.h2.jdbc.JdbcConnection.prepareStatement(JdbcConnection.java:276)
at com.zaxxer.hikari.pool.ProxyConnection.prepareStatement(ProxyConnection.java:308)
at com.zaxxer.hikari.pool.HikariProxyConnection.prepareStatement(HikariProxyConnection.java)
at org.skife.jdbi.v2.DefaultStatementBuilder.create(DefaultStatementBuilder.java:54)
at org.skife.jdbi.v2.SQLStatement.internalExecute(SQLStatement.java:1302)
... 16 common frames omitted
This is the workflow:

```yaml
run: +main

+main:
  +step1:
    sh>: ./tasks/bin/enc-tool -e development-ec2 -a 1 -c ENCRYPT -t 200
```
$ digdag --version
0.5.9
$ digdag init foo
2016-04-11 14:42:38 -0700: Digdag v0.5.9
Creating foo/digdag
Creating foo/.digdag-wrapper/digdag.jar
Creating foo/.gitignore
Creating foo/tasks/shell_sample.sh
Creating foo/tasks/repeat_hello.sh
Creating foo/tasks/__init__.py
Creating foo/digdag.yml
Done. Type `cd foo` and `./digdag r` to run the workflow. Enjoy!
$ cd foo
$ digdag push foo -r test
2016-04-11 14:42:52 -0700: Digdag v0.5.9
Creating digdag.archive.tar.gz...
Exception in thread "main" io.digdag.client.config.ConfigException: timezone: parameter is required but not set at digdag.yml. Example is 'timezone: America/Los_Angeles'.
at io.digdag.cli.client.Archive.runArchive(Archive.java:152)
at io.digdag.cli.client.Archive.archive(Archive.java:105)
at io.digdag.cli.client.Push.push(Push.java:66)
at io.digdag.cli.client.Push.mainWithClientException(Push.java:41)
at io.digdag.cli.client.ClientCommand.main(ClientCommand.java:55)
at io.digdag.cli.Main.main(Main.java:176)
These functions are useful when task content needs to include a custom timestamp format. For example:

```yaml
create_tables: [mytable_${session_time_format("%Y%m%d_%H")}]
create_tables: [mytable_${last_session_time_format("%Y%m")}]
```

Here is an strftime library for JavaScript which could be used internally: https://github.com/samsonjs/strftime
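As a rough illustration of the intended semantics, using java.time instead of the JavaScript strftime library the implementation would actually embed (the directive mapping here is deliberately partial):

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class SessionTimeFormat
{
    // session_time_format("%Y%m%d_%H") renders the session time with an
    // strftime-style pattern; only the directives used above are mapped.
    static String sessionTimeFormat(ZonedDateTime sessionTime, String pattern)
    {
        String javaPattern = pattern
                .replace("%Y", "yyyy")
                .replace("%m", "MM")
                .replace("%d", "dd")
                .replace("%H", "HH");
        return sessionTime.format(DateTimeFormatter.ofPattern(javaPattern));
    }
}
```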
Since `create_table` doesn't indicate that the query result will be stored to a table.
digdag should complain about incorrectly placing `_schedule` outside the workflow, like this:

```yaml
run: +main

_schedule:
  hourly>: 12:34

+main:
  td>: query/important.sql
```
To know who is running/submitting the workflow.