dagu-dev / dagu

A developer-friendly, minimalist Cron alternative with many more capabilities. It aims to solve greater problems.

Home Page: https://dagu.readthedocs.io

License: GNU General Public License v3.0

Languages: Go 78.04%, TypeScript 18.94%, Makefile 1.46%, Shell 0.74%, JavaScript 0.40%, Dockerfile 0.22%, CSS 0.13%, HTML 0.08%
Topics: workflow, cron, automation, scheduler, golang, directed-acyclic-graph, task-runner, task-scheduler, continuous-delivery, devops-pipeline, workflow-automation, workflow-engine, workflow-management, workflow-tool, dag-scheduling, workflow-scheduler

dagu's Introduction

Dagu

Dagu is a powerful Cron alternative that comes with a Web UI. It allows you to define dependencies between commands as a Directed Acyclic Graph (DAG) in a declarative YAML format. Dagu simplifies the management and execution of complex workflows. It natively supports running Docker containers, making HTTP requests, and executing commands over SSH.

Highlights

  • Single binary file installation
  • Declarative YAML format for defining DAGs
  • Web UI for visually managing, rerunning, and monitoring pipelines
  • Use existing programs without any modification
  • Self-contained, with no need for a DBMS

Features

  • Web User Interface
  • Command Line Interface (CLI) with several commands for running and managing DAGs
  • YAML format for defining DAGs, with support for various features including:
    • Execution of custom code snippets
    • Parameters
    • Command substitution
    • Conditional logic
    • Redirection of stdout and stderr
    • Lifecycle hooks
    • Repeating tasks
    • Automatic retry
  • Executors for running different types of tasks:
    • Running arbitrary Docker containers
    • Making HTTP requests
    • Sending emails
    • Running jq commands
    • Executing remote commands via SSH
  • Email notification
  • Scheduling with Cron expressions
  • REST API Interface
  • Basic Authentication over HTTPS
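
As a rough illustration of the YAML features listed above, the shell snippet below writes a small DAG file that combines parameters, automatic retry, conditional logic, stdout redirection, and a lifecycle hook. Treat it as a sketch under assumptions: the field names `params`, `retryPolicy`, `preconditions`, and `stdout` should be checked against the Dagu documentation for your version, and the file name and temporary directory are purely illustrative.

```shell
#!/bin/sh
# Write a sample DAG that exercises several YAML features.
# Field names are assumptions; verify them for your Dagu version.
dag_dir="${DAG_DIR:-$(mktemp -d)}"
dag_file="$dag_dir/features.yaml"

# The quoted 'EOF' keeps $1 and the backticks literal inside the file.
cat > "$dag_file" <<'EOF'
params: "input.csv"              # positional parameter, referenced as $1
steps:
    - name: fetch
      command: echo fetching $1
      retryPolicy:               # automatic retry
          limit: 2
          intervalSec: 5
    - name: monthly_only
      command: echo month start
      preconditions:             # conditional logic
          - condition: "`date '+%d'`"
            expected: "01"
      depends:
          - fetch
    - name: save
      command: echo saved
      stdout: /tmp/features.out  # redirect standard output to a file
      depends:
          - fetch
handlerOn:                       # lifecycle hook
    exit:
        command: echo finished
EOF

echo "wrote $dag_file"
# Dry-run the result with: dagu dry "$dag_file"
```

Running `dagu dry` on the generated file before scheduling it is a cheap way to find out whether your version accepts these fields.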

Use Cases

  • Data Pipeline Automation: Schedule ETL tasks for data processing and centralization.
  • Infrastructure Monitoring: Periodically check infrastructure components with HTTP requests or SSH commands.
  • Automated Reporting: Generate and send periodic reports via email.
  • Batch Processing: Schedule batch jobs for tasks like data cleansing or model training.
  • Task Dependency Management: Manage complex workflows with interdependent tasks.
  • Microservices Orchestration: Define and manage dependencies between microservices.
  • CI/CD Integration: Automate code deployment, testing, and environment updates.
  • Alerting System: Create notifications based on specific triggers or conditions.
  • Custom Task Automation: Define and schedule custom tasks using code snippets.

Web UI

DAG Details

It shows the real-time status, logs, and DAG configurations. You can edit DAG configurations in the browser.

You can switch to the vertical graph with the button in the top-right corner.

DAGs

It shows all DAGs and their real-time status.

Search

It greps the given text across all DAG definitions.

Execution History

It shows past execution results and logs.

Log Viewer

It shows the detailed log and standard output of each execution and step.

Installation

You can install Dagu quickly using the installer script, Homebrew, or Docker, or by downloading the latest binary from the Releases page on GitHub.

Via Bash script

curl -L https://raw.githubusercontent.com/daguflow/dagu/main/scripts/installer.sh | bash

Via GitHub Releases Page

Download the latest binary from the Releases page and place it in your $PATH (e.g. /usr/local/bin).

Via Homebrew (macOS)

brew install daguflow/brew/dagu

Upgrade to the latest version:

brew upgrade daguflow/brew/dagu

Via Docker

docker run \
--rm \
-p 8080:8080 \
-v $HOME/.config/dagu/dags:/home/dagu/.config/dagu/dags \
-v $HOME/.local/share/dagu:/home/dagu/.local/share/dagu \
ghcr.io/daguflow/dagu:latest dagu start-all

See Environment variables to configure those default directories.

Quick Start Guide

1. Launch the Web UI

Start the server and scheduler with the command dagu start-all and browse to http://127.0.0.1:8080 to explore the Web UI.

2. Create a New DAG

Navigate to the DAG List page by clicking the menu in the left panel of the Web UI. Then create a DAG by clicking the NEW button at the top of the page. Enter example in the dialog.

Note: DAG (YAML) files will be placed in ~/.config/dagu/dags by default. See Configuration Options for more details.

3. Edit the DAG

Go to the SPEC tab and click the Edit button. Copy and paste the following example, then click the Save button.

Example:

schedule: "* * * * *" # Run the DAG every minute
steps:
    - name: s1
      command: echo Hello Dagu
    - name: s2
      command: echo done!
      depends:
          - s1

4. Execute the DAG

You can execute the example by pressing the Start button. You can see "Hello Dagu" in the log page in the Web UI.

CLI

# Runs the DAG
dagu start [--params=<params>] <file>

# Displays the current status of the DAG
dagu status <file>

# Re-runs the specified DAG run
dagu retry --req=<request-id> <file>

# Stops the DAG execution
dagu stop <file>

# Restarts the currently running DAG
dagu restart <file>

# Dry-runs the DAG
dagu dry [--params=<params>] <file>

# Launches both the web UI server and scheduler process
dagu start-all [--host=<host>] [--port=<port>] [--dags=<path to directory>]

# Launches the Dagu web UI server
dagu server [--host=<host>] [--port=<port>] [--dags=<path to directory>]

# Starts the scheduler process
dagu scheduler [--dags=<path to directory>]

# Shows the current binary version
dagu version
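
The commands above can be combined into a short session. The sketch below writes the quick-start DAG to a file and, if the binary is installed, dry-runs it, starts it, and prints its status. The `DAGU` variable, the `run_example` helper, and the temporary file location are illustrative names, not part of Dagu; the `schedule` field is left out so the DAG only runs when started by hand.

```shell
#!/bin/sh
# Sketch of a CLI session for the quick-start DAG.
# DAGU, run_example, and the temporary location are illustrative assumptions.
DAGU="${DAGU:-dagu}"
dag_file="$(mktemp -d)/example.yaml"

cat > "$dag_file" <<'EOF'
steps:
    - name: s1
      command: echo Hello Dagu
    - name: s2
      command: echo done!
      depends:
          - s1
EOF

run_example() {
    "$DAGU" dry "$dag_file" &&      # dry-run first
    "$DAGU" start "$dag_file" &&    # then execute the DAG
    "$DAGU" status "$dag_file"      # and print its current status
}

# Only call the binary when it is actually installed.
if command -v "$DAGU" >/dev/null 2>&1; then
    run_example
else
    echo "dagu not found; wrote $dag_file only"
fi
```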

Running as a daemon

The easiest way to keep the process running on your system is to save the script below and run it every minute with cron. This approach does not require a root account:

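#!/bin/bash
# Start Dagu unless it is already running.
process="dagu start-all"
command="/usr/bin/dagu start-all"

# pgrep -f matches the pattern against each process's full command line.
if pgrep -f "$process" > /dev/null
then
    exit 0
fi

# $command is left unquoted on purpose so it splits into program and argument.
$command &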
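
To run the script above every minute, cron needs a matching crontab entry. The sketch below only prints such an entry; the script path and the `cron_line` helper are hypothetical, so adjust the path to wherever you saved the script.

```shell
#!/bin/sh
# Print a crontab line that runs the watchdog script every minute.
# The default path is a hypothetical location, not something Dagu installs.
watchdog="${WATCHDOG:-$HOME/bin/dagu-watchdog.sh}"

cron_line() {
    # Five "*" fields mean "every minute"; output is discarded so cron sends no mail.
    printf '* * * * * %s >/dev/null 2>&1\n' "$1"
}

cron_line "$watchdog"

# To install it for the current user, append it to the existing crontab:
#   (crontab -l 2>/dev/null; cron_line "$watchdog") | crontab -
```

Printing the line instead of installing it lets you review the entry before it touches your crontab.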

Example DAG

This example DAG showcases a data pipeline typically implemented in DevOps and Data Engineering scenarios. It demonstrates an end-to-end data processing cycle starting from data acquisition and cleansing to transformation, loading, analysis, reporting, and ultimately, cleanup.

The YAML code below represents this DAG:

# Environment variables used throughout the pipeline
env:
    - DATA_DIR: /data
    - SCRIPT_DIR: /scripts
    - LOG_DIR: /log
    # ... other variables can be added here

# Handlers to manage errors and cleanup after execution
handlerOn:
    failure:
        command: "echo error"
    exit:
        command: "echo clean up"

# The schedule for the DAG execution in cron format
# This schedule runs the DAG daily at 12:00 AM
schedule: "0 0 * * *"

steps:
    # Step 1: Pull the latest data from a data source
    - name: pull_data
      command: "sh"
      script: |
          echo `date '+%Y-%m-%d'`
      output: DATE

    # Step 2: Cleanse and prepare the data
    - name: cleanse_data
      command: echo cleansing ${DATA_DIR}/${DATE}.csv
      depends:
          - pull_data

    # Step 3: Transform the data
    - name: transform_data
      command: echo transforming ${DATA_DIR}/${DATE}_clean.csv
      depends:
          - cleanse_data

    # Parallel Step 1: Load the data into a database
    - name: load_data
      command: echo loading ${DATA_DIR}/${DATE}_transformed.csv
      depends:
          - transform_data

    # Parallel Step 2: Generate a statistical report
    - name: generate_report
      command: echo generating report ${DATA_DIR}/${DATE}_transformed.csv
      depends:
          - transform_data

    # Step 4: Run some analytics
    - name: run_analytics
      command: echo running analytics ${DATA_DIR}/${DATE}_transformed.csv
      depends:
          - load_data

    # Step 5: Send an email report
    - name: send_report
      command: echo sending email ${DATA_DIR}/${DATE}_analytics.csv
      depends:
          - run_analytics
          - generate_report

    # Step 6: Cleanup temporary files
    - name: cleanup
      command: echo removing ${DATE}*.csv
      depends:
          - send_report

Motivation

Legacy systems often have complex and implicit dependencies between jobs. When there are hundreds of cron jobs on a server, it can be difficult to keep track of these dependencies and to determine which job to rerun if one fails. It can also be a hassle to SSH into a server to view logs and manually rerun shell scripts one by one.

Dagu addresses these pain points by letting you define and visualize pipeline dependencies explicitly as a DAG. Its web UI lets you check dependencies, monitor execution status, view logs, and rerun or stop jobs with a few clicks.

Why Not Use an Existing DAG Scheduler Like Airflow?

There are many existing tools such as Airflow, but many of them require you to write code in a programming language like Python to define your DAGs. Systems that have been in operation for a long time may already contain complex jobs with hundreds of thousands of lines of code written in languages like Perl or shell script. Adding another layer of complexity on top of this code can reduce maintainability. Dagu was designed to be easy to use, self-contained, and to require no coding, making it ideal for small projects.

How It Works

Dagu is a single command line tool that uses the local file system to store data, so no database management system or cloud service is required. DAGs are defined in a declarative YAML format, and existing programs can be used without modification.


Contributing

Feel free to contribute in any way you want! Share ideas, ask questions, submit issues, and create pull requests. Check out our Contribution Guide for help getting started.

We welcome any and all contributions!

License

This project is licensed under the GNU GPLv3.

Support and Community

Join our Discord community to ask questions, request features, and share your ideas.

dagu's People

Contributors

arseniysavin, arvintian, ddddddo, fishnux, fruworg, garunitule, halalala222, hirosassa, htcorange, jadrho, kiyo510, kriyanshii, lat0z, lucaslah, rafiramadhana, ramonespinosa, rocwang, sapankhandwala-tomtom, simonwaldherr, smekuria1, stefaan1o, tahiraii, triole, vkill, x2ocoder, x4204, yarikoptic, yohamta, zph, zwzjut

dagu's Issues

TODO

TODO

  • refactor: Separate Web UI codes from go template
  • refactor: Migrate Web UI codes to typescript
  • feat: Add tags field to enable a workflow to have arbitrary tags (ex: - daily)
  • feat: Improve Web UI design
  • feat: Add the ability to update task status with a mouse click on visual graph on Web UI
  • fix: Do not send mail notifications when a workflow is canceled rather than failed
  • fix: Do not expandEnv on config page on Web UI
  • feat: Add named parameters (ex: params: "p1=x p2=y")
  • feat: Add stdout field that can write standard output to any file in a step definition
  • feat: Add search function for workflows using free text keyword
  • feat: Add sorting function to HTML tables on Web UI using react-table
  • feat: start/stop buttons on Workflow list page
  • feat: sorting function for dashboard timechart
  • feat: filtering workflows by clicking a tag on table on Web UI
  • feat: Change name field to an optional field by using the file name as a default value
  • feat: Add version subcommand that shows the version (#127)
  • feat: Add function to pass variables from one step to another using output field in a step definition
  • feat: Add function to embed arbitrary script in a YAML file using the script field
  • fix: Fix the formatting issue of scheduler's log on Web UI
  • fix: Fix goreleaser config and Makefile to embed the correct version value (ex: 1.0.0) using -ldflags
  • fix: Fix release GitHub Action and remove web assets from the repository (bundle.js and font files)
  • docs: REST API interface

Not sure

  • feat: Add No-Code GUI for workflow edit
  • feat: i18n for Web UI
  • test: Add tests for frontend Web UI codes in admin directory
  • test: Improve the test coverage to 90%
  • test: Add --race option to test command
  • test: Add tests for Goroutine leaks
  • feat: Add --repeat-until option to start subcommand
  • refactor: Context handling logics in cmd, agent, scheduler, and unix package.
  • feat: Add Until option to RepeatPolicy field to allow specifying the end time of a repetitive task
  • feat: Ability to manage multiple instances from a single instance on Web UI using Rest API interface
  • feat: Make the agent an importable package from 3rd party software
  • feat: Scheduler daemon
  • feat: RBAC user authentication functionalities
  • feat: Workflow layering mechanism such as JobNet
  • feat: Windows support
  • feat: Docker container image

Need a better way to hierarchize multiple workflows

It is still possible to call other workflows from within a workflow without problems, but it becomes confusing. Also, when a workflow is renamed, it is inconvenient if the workflows that call it are not updated automatically.

Desirable Functions to handle multi-level workflows:

  • Ability to group and hierarchize workflows
  • When the name (or location) of a dependent workflow is changed, the definition of the parent workflow is also automatically modified.

TODOs

  • feat: #66
  • docs: build website on github.io
  • fix: tests with --race
  • fix: tests with Goroutine leaks
  • refactor: separate the web codes from gotemplate

Not sure

  • feat: scheduler function using existing cron-like go library

Execution difference between web ui and cli

I am currently testing the execution of a workflow created from the web ui, with the examples/minimal.yaml content.
From the start button I get an error "exec: no command".
From ./dagu start /home/minimal.yaml there are no errors.
Am I missing any extra configuration?
This is the content of my admin.yaml

port: 8080
dags: /home/
isBasicAuth: true
basicAuthUsername: admin
basicAuthPassword: admin

Bug: Cannot read properties of undefined (reading 'errors') on HistTable in Web UI

Invalid history data causes a JavaScript crash on the Web UI history page.

react-dom.min.js?v=220516130708:141 TypeError: Cannot read properties of undefined (reading 'errors')
    at new Error (<anonymous>:136:22)
    at Function.createFromInputFallback (moment.min.js?v=220516130708:1)
    at moment.min.js?v=220516130708:1
    at bt (moment.min.js?v=220516130708:1)
    at xt (moment.min.js?v=220516130708:1)
    at Tt (moment.min.js?v=220516130708:1)
    at f (moment.min.js?v=220516130708:1)
    at <anonymous>:289:18
    at Array.map (<anonymous>)
    at HistTable (<anonymous>:288:12)

Feature request: dashboard page on web UI

When workflows exist in multiple levels, the current status is not displayed on the top page. It would be good to have a dashboard page where the status can be checked at a glance.

Consider renaming the software

I would like to change the name dagu because, when spoken, it is easily confused with the concept of a DAG (Directed Acyclic Graph). I would also like to change the mascot, which is a degu, because I sometimes confuse the two words and mistype dagu as degu.

List of candidates (WIP):

  • Autoflow
  • NocodeFlow
  • Flowsy
  • AutoFluir
  • ProsperFlow
  • SimplexFlow

Why is dagu reading irrelevant YAML files?

I started dagu with dagu server in my homedir [on Linux], and now the webui asks me to check three “errors below”. Those “errors” are, according to dagu, incompatible YAML errors, but why on Earth is dagu reading YAML files that have nothing to do with dagu? Is it simply scanning the homedir and reading each and every YAML file it can find?
