
conductor-oss / conductor

Conductor is an event driven orchestration platform

Home Page: https://conductor-oss.org

License: Apache License 2.0

Java 77.58% Groovy 15.28% Dockerfile 0.06% Shell 0.02% JavaScript 6.66% HTML 0.04% TypeScript 0.15% SCSS 0.07% CSS 0.02% PLpgSQL 0.06% Python 0.05%
distributed-systems grpc java javascript microservice-orchestration orchestration-engine orchestrator reactjs spring-boot workflow-automation workflow-engine workflow-management workflows durable-execution

conductor's Introduction

Conductor


Conductor is a platform originally created at Netflix to orchestrate microservices and events. Conductor OSS is maintained by the team of developers at Orkes along with the members of the open source community.

Conductor OSS

This is the new home for the Conductor open source project going forward (previously hosted at Netflix/Conductor).

Important

Going forward, all bug fixes, feature requests, and security patches will be applied and released from this repository.

The last published version of Netflix Conductor will be 3.15.0, which we will continue to support.

If you would like to participate in the roadmap and development, please reach out.

⭐ This repository

Show your support for Conductor OSS and help spread awareness by starring this repo.


Update your local forks/clones

Please update your forks to point to this repo. This ensures your commits and PRs can be sent against this repository:

git remote set-url origin https://github.com/conductor-oss/conductor

Important

Follow the steps below if you have an active PR against the Netflix/Conductor repository:

  1. Fork this repository
  2. Update your local repository to change the remote to this repository
  3. Send a PR against the main branch

Releases

The latest version is listed on the GitHub releases page.

Resources

We have an active community of Conductor users and contributors on the community Slack channel.

Documentation and tutorials on how to use Conductor

Discussion Forum: Please use the forum for questions and discussing ideas and join the community.

Conductor SDKs

Conductor supports creating workflows using JSON and Code.
SDK support for creating workflows using code is available in multiple languages and can be found at https://github.com/conductor-sdk
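For the JSON route, a minimal workflow definition might look like the following. The workflow and task names here are illustrative examples, not taken from the repository; the field names (name, version, schemaVersion, taskReferenceName, inputParameters) follow the standard Conductor workflow-definition schema.

```json
{
  "name": "greetings_workflow",
  "description": "Greets a user by name",
  "version": 1,
  "schemaVersion": 2,
  "ownerEmail": "dev@example.com",
  "tasks": [
    {
      "name": "greet",
      "taskReferenceName": "greet_ref",
      "type": "SIMPLE",
      "inputParameters": {
        "name": "${workflow.input.name}"
      }
    }
  ]
}
```

The `${workflow.input.name}` expression wires the workflow's input into the task's input at execution time.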

Getting Started - Building & Running Conductor

From Source:

If you wish to build your own distribution, you can run ./gradlew build from this project, which produces the runtime artifacts. The runnable server is in the server/ module.

Using Docker (Recommended)

Follow the steps below to launch the docker container:

docker compose -f docker/docker-compose.yaml up

Database Requirements

  • The default persistence used is Redis
  • The indexing backend is Elasticsearch (6.x)

Other Requirements

  • JDK 17+
  • UI requires Node 14 to build. Earlier Node versions may work but are untested.

Get Support

There are several ways to get in touch with us:


conductor's People

Contributors

alexmay48, apanicker-nflx, aravindanr, bjpirt, c4lm, clari-akhilesh, cyzhao, dependabot[bot], falu2010-netflix, gorzell, huangyiminghappy, hunterford, ismaley, james-deee, josedab, jvemugunta, jxu-nflx, kishorebanala, lordbender, manan164, mdepak, mstier-nflx, naveenchlsn, pctreddy, peterlau, picaron, s50600822, v1r3n, verstraetebert, vmg


conductor's Issues

[FEATURE]: Support LLM Orchestration

Please read our contributor guide before creating an issue.
Also consider discussing your idea on the discussion forum first.

Describe the Feature Request

A clear and concise description of what the feature request is.

Describe Preferred Solution

A clear and concise description of what you want to happen.

Describe Alternatives

A clear and concise description of any alternative solutions or features you've considered.

Incomplete Constructor Switch(WorkflowTask) Implementation Breaking WorkflowDef-to-ConductorWorkflow Conversion

Describe the bug
The current java-sdk implementation of the Switch class constructor Switch(WorkflowTask) is incomplete, breaking WorkflowDef-to-ConductorWorkflow conversion.

The shortcoming can be corrected by adding the following statements to the constructor Switch(WorkflowTask):
this.useJavascript = workflowTask.getEvaluatorType().equals(JAVASCRIPT_NAME);
this.caseExpression = (String)this.getInput().get("switchCaseValue");
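The proposed fix can be sketched in isolation with hypothetical stand-in classes (the real Switch and WorkflowTask live in the java-sdk; the "javascript" constant value is assumed). Note that writing JAVASCRIPT_NAME.equals(...) rather than the reporter's getEvaluatorType().equals(JAVASCRIPT_NAME) also guards against a null evaluatorType:

```java
import java.util.HashMap;
import java.util.Map;

public class SwitchCopyDemo {
    // Mirrors the SDK's JAVASCRIPT_NAME constant (assumed value)
    static final String JAVASCRIPT_NAME = "javascript";

    // Hypothetical, minimal stand-in for the SDK's WorkflowTask
    public static class WorkflowTask {
        public String evaluatorType;
        public Map<String, Object> inputParameters = new HashMap<>();
    }

    // Null-safe version of: useJavascript = workflowTask.getEvaluatorType().equals(JAVASCRIPT_NAME)
    public static boolean copyUseJavascript(WorkflowTask t) {
        return JAVASCRIPT_NAME.equals(t.evaluatorType);
    }

    // Mirrors: caseExpression = (String) getInput().get("switchCaseValue")
    public static String copyCaseExpression(WorkflowTask t) {
        return (String) t.inputParameters.get("switchCaseValue");
    }

    public static void main(String[] args) {
        WorkflowTask task = new WorkflowTask();
        task.evaluatorType = "javascript";
        task.inputParameters.put("switchCaseValue", "$.price > 1000 ? 'approval' : ''");
        System.out.println(copyUseJavascript(task));   // true
        System.out.println(copyCaseExpression(task));
    }
}
```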


Details
Conductor version: 3.17.0
Persistence implementation: Cassandra, Postgres
Queue implementation: Postgres
Lock: Redis
Workflow definition:
Task definition: Switch
Event handler definition:

To Reproduce
The WorkflowDef-to-ConductorWorkflow conversion fault is reproduced by modifying the example app java-sdk-examples to function as a test program.
Statements were added to Main.java starting at line 54 for the ConductorWorkflow -> WorkflowDef -> ConductorWorkflow conversion.

//ConductorWorkflow -> WorkflowDef conversion
WorkflowDef workflowDef = simpleWorkflow.toWorkflowDef();

//WorkflowDef -> ConductorWorkflow conversion
simpleWorkflow = ConductorWorkflow.fromWorkflowDef(workflowDef);

//executeDynamic
workflowExecution = utils.getWorkflowExecutor().executeWorkflow(simpleWorkflow, input);
workflowRun = workflowExecution.get(10, TimeUnit.SECONDS);


Expected behavior
If the class constructor Switch(WorkflowTask) is functioning correctly, a workflow that has undergone ConductorWorkflow -> WorkflowDef -> ConductorWorkflow conversion should behave exactly like the original workflow.

[FEATURE]: Human task missing in this repo?

Please read our contributor guide before creating an issue.
Also consider discussing your idea on the discussion forum first.

Describe the Feature Request

Is the human task available in this repo? I could not find it in Swagger.

Describe Preferred Solution

A clear and concise description of what you want to happen.

Describe Alternatives

A clear and concise description of any alternative solutions or features you've considered.

Workflow can be removed from the system while in RUNNING status

Describe the bug
Workflow with RUNNING status can be removed from the system

Details
Conductor version: 3.13.8
Persistence implementation: MySQL
Queue implementation: MySQL
Lock: Redis
Workflow definition:

{
  "createTime": 1703154343929,
  "accessPolicy": {},
  "name": "mhTestWf",
  "description": "tltr",
  "version": 1,
  "tasks": [
    {
      "name": "dummy",
      "taskReferenceName": "dummy",
      "inputParameters": {},
      "type": "HUMAN",
      "startDelay": 0,
      "optional": false,
      "asyncComplete": false
    }
  ],
  "inputParameters": [],
  "outputParameters": {},
  "schemaVersion": 2,
  "restartable": true,
  "workflowStatusListenerEnabled": false,
  "ownerEmail": "[email protected]",
  "timeoutPolicy": "ALERT_ONLY",
  "timeoutSeconds": 0,
  "variables": {},
  "inputTemplate": {}
}

To Reproduce
Steps to reproduce the behaviour:

  1. Create Workflow Definition
  2. Run workflow from this definition
curl -X 'POST' \
  'http://conductor-server:8080/api/workflow/mhTestWf?priority=0' \
  -H 'accept: text/plain' \
  -H 'Content-Type: application/json' \
  -d '{}'
  3. Check workflow status:
    3.1 In the Conductor UI

3.2 in db workflow table

MariaDB [conductor_db]> select * from workflow where workflow_id = 'a38e1ebc-a6ec-4a48-8b26-0864e278242d'\G
*************************** 1. row ***************************
          created_on: 2023-12-21 10:29:14
         modified_on: 2023-12-21 10:29:14
         workflow_id: a38e1ebc-a6ec-4a48-8b26-0864e278242d
      correlation_id: NULL
           json_data: {"status":"RUNNING","endTime":0,"workflowId":"a38e1ebc-a6ec-4a48-8b26-0864e278242d","tasks":[],"workflowDefinition":{"createTime":1703154343929,"accessPolicy":{},"name":"mhTestWf","description":"tltr","version":1,"tasks":[{"name":"dummy","taskReferenceName":"dummy","inputParameters":{},"type":"HUMAN","startDelay":0,"optional":false,"asyncComplete":false}],"inputParameters":[],"outputParameters":{},"schemaVersion":2,"restartable":true,"workflowStatusListenerEnabled":false,"ownerEmail":"[email protected]","timeoutPolicy":"ALERT_ONLY","timeoutSeconds":0,"variables":{},"inputTemplate":{}},"priority":0,"variables":{},"lastRetriedTime":0,"ownerApp":"","createTime":1703154554647,"workflowName":"mhTestWf","workflowVersion":1,"input":{},"output":{}}
    parentWorkflowId: NULL
parentWorkflowTaskId: NULL
1 row in set (0.000 sec)

3.3 in db task table:

MariaDB [conductor_db]> select * from task where task_id='dde40ce9-3134-45bf-bf7d-90e060a1f5b9' \G
*************************** 1. row ***************************
   created_on: 2023-12-21 10:29:14
  modified_on: 2023-12-21 10:29:14
      task_id: dde40ce9-3134-45bf-bf7d-90e060a1f5b9
    json_data: {"taskType":"HUMAN","status":"IN_PROGRESS","referenceTaskName":"dummy","retryCount":0,"seq":1,"pollCount":0,"taskDefName":"dummy","scheduledTime":1703154554676,"startTime":1703154554675,"endTime":0,"updateTime":1703154554682,"startDelayInSeconds":0,"retried":false,"executed":false,"callbackFromWorker":true,"responseTimeoutSeconds":0,"workflowInstanceId":"a38e1ebc-a6ec-4a48-8b26-0864e278242d","workflowType":"mhTestWf","taskId":"dde40ce9-3134-45bf-bf7d-90e060a1f5b9","callbackAfterSeconds":0,"workflowTask":{"name":"dummy","taskReferenceName":"dummy","inputParameters":{},"type":"HUMAN","startDelay":0,"optional":false,"asyncComplete":false},"rateLimitPerFrequency":0,"rateLimitFrequencyInSeconds":0,"workflowPriority":0,"iteration":0,"waitTimeout":0,"subworkflowChanged":false,"taskDefinition":null,"loopOverTask":false,"queueWaitTime":-1,"inputData":{},"outputData":{}}
     taskType: HUMAN
       status: IN_PROGRESS
subWorkflowId: NULL
1 row in set (0.000 sec)

As you can see, there is a workflow with status running, execution of the workflow is in db, a task also exists in db.

  4. Use the REST API to remove the workflow from the system:
curl -X 'DELETE' \
  'http://up-be-conductor-server:8080/api/workflow/a38e1ebc-a6ec-4a48-8b26-0864e278242d/remove?archiveWorkflow=true' \
  -H 'accept: */*'

Response body:

{
  "status": 400,
  "message": "Cannot archive workflow: a38e1ebc-a6ec-4a48-8b26-0864e278242d with status: RUNNING",
  "instance": "conductor-server",
  "retryable": false
}

According to the server's response the operation is not permitted since the workflow is in RUNNING status, which is correct, but in fact all related data is deleted from the database.

  5. Check the database:

workflow table:

MariaDB [conductor_db]> select * from workflow where workflow_id = 'a38e1ebc-a6ec-4a48-8b26-0864e278242d'\G
Empty set (0.000 sec)

task table:

MariaDB [conductor_db]> select * from task where task_id='dde40ce9-3134-45bf-bf7d-90e060a1f5b9' \G
Empty set (0.000 sec)

Expected behavior
The Conductor server shouldn't delete a workflow and its related tasks from storage while the workflow is in RUNNING status.

Thanks.
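The expected behavior can be sketched as a guard that runs before any storage is touched, so a 400 response is never paired with data actually being removed. This is a hypothetical, simplified canRemove helper, not the actual service code:

```java
public class RemoveWorkflowGuard {
    public enum Status { RUNNING, COMPLETED, FAILED, TERMINATED }

    // Validate up front: refuse removal of a RUNNING workflow
    // before deleting anything from the workflow or task tables.
    public static boolean canRemove(Status status) {
        return status != Status.RUNNING;
    }

    public static void main(String[] args) {
        System.out.println(canRemove(Status.RUNNING));   // false -> return 400, touch nothing
        System.out.println(canRemove(Status.COMPLETED)); // true  -> safe to archive/remove
    }
}
```

The key design point is ordering: the status check must complete (and reject) before any delete statements execute, rather than after.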

CVE updates for PostgreSQL and Elasticsearch, 2024-02

Describe the bug
Critical vulnerability in the PostgreSQL JDBC driver: https://www.cve.org/CVERecord?id=CVE-2024-1597
Additional vulnerabilities in test-utils due to an older version of the Elasticsearch packages.

Details
Conductor version:
Persistence implementation: Postgres

To Reproduce
Scan the postgres-persistence module.

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.

Additional context
Add any other context about the problem here.

Basename feature in #3722 doesn't work after "yarn build"

Describe the bug
The feature in #3722 breaks "NavLink" with "newTab": the package's homepage (aka basename) doesn't work after "yarn build".

This works on the v3.15.0 tag; shall we revert?

@haricane8133

Details
Conductor UI on the latest main.

To Reproduce
Steps to reproduce the behavior:

  1. Build and run the Conductor UI
  2. Go to Workflow Definitions
  3. Click the "Executions" query link
  4. Observe that the link doesn't work

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.

Additional context
Add any other context about the problem here.

conductor oss docker-compose up throws ClassNotFoundException

Describe the bug
Conductor OSS, when built and started with docker-compose, throws a ClassNotFoundException.
Details
Conductor version: latest
Persistence implementation: Postgres
Queue implementation: Postgres
Lock: Redis or Zookeeper?
Workflow definition:
Task definition:
Event handler definition:

To Reproduce
Steps to reproduce the behavior:

  1. git clone https://github.com/conductor-oss/conductor.git
  2. go to docker folder
  3. docker-compose -f docker-compose-postgres-es7.yaml build
  4. docker-compose -f docker-compose-postgres-es7.yaml up
  5. When the conductor server starts up, it reports the error below:

conductor-server | at org.springframework.beans.factory.support.ConstructorResolver.resolveAutowiredArgument(ConstructorResolver.java:910) ~[conductor-es7-persistence-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
conductor-server | at org.springframework.beans.factory.support.ConstructorResolver.createArgumentArray(ConstructorResolver.java:788) ~[conductor-es7-persistence-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
conductor-server | ... 27 more
conductor-server | Caused by: java.lang.ClassNotFoundException: javax.validation.Validator
conductor-server | at java.net.URLClassLoader.findClass(URLClassLoader.java:445) ~[?:?]
conductor-server | at java.lang.ClassLoader.loadClass(ClassLoader.java:592) ~[?:?]
conductor-server | at org.springframework.boot.loader.LaunchedURLClassLoader.loadClass(LaunchedURLClassLoader.java:149) ~[conductor-server.jar:0.1.0-SNAPSHOT]
conductor-server | at java.lang.ClassLoader.loadClass(ClassLoader.java:525) ~[?:?]
conductor-server | at java.lang.Class.getDeclaredConstructors0(Native Method) ~[?:?]
conductor-server | at java.lang.Class.privateGetDeclaredConstructors(Class.java:3373) ~[?:?]
conductor-server | at java.lang.Class.getDeclaredConstructors(Class.java:2555) ~[?:?]

Expected behavior
docker-compose up should start the server without reporting any error

Screenshots
If applicable, add screenshots to help explain your problem.

Additional context
Add any other context about the problem here.

Workflow output in webhooks is in string format

Describe the bug
When I receive a workflow status update, the workflow output is in string format instead of JSON format.

Details
Conductor version: 3.19.0

To Reproduce
Steps to reproduce the behavior:

  1. Configure conductor with webhook feature
  2. Create workflow
  3. Start workflow
  4. Observe that the request sent has the output in string format instead of JSON

Expected behavior
Output should be JSON everywhere.
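For illustration only, the difference the reporter describes resembles what you get from a Java map's default toString() versus proper JSON serialization. This is a hypothetical, self-contained sketch with a toy JSON writer for flat maps, not Conductor's actual serialization code (which would use Jackson):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OutputFormatDemo {

    // Java's default Map.toString() — the "string format" shape
    public static String asToString(Map<String, Object> out) {
        return out.toString();
    }

    // Toy JSON writer for flat maps of strings and numbers
    public static String asJson(Map<String, Object> out) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : out.entrySet()) {
            if (!first) sb.append(",");
            first = false;
            sb.append('"').append(e.getKey()).append("\":");
            Object v = e.getValue();
            if (v instanceof String) {
                sb.append('"').append(v).append('"');
            } else {
                sb.append(v);
            }
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        Map<String, Object> out = new LinkedHashMap<>();
        out.put("price", 100);
        out.put("status", "ok");
        System.out.println(asToString(out)); // {price=100, status=ok} — not parseable as JSON
        System.out.println(asJson(out));     // {"price":100,"status":"ok"}
    }
}
```

A webhook consumer can json-parse the second form but not the first, which is the substance of the report.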

Docker image build fails due to conductor-community repo pull in Dockerfile

Describe the bug
The Dockerfile for the server build in docker/server/Dockerfile clones https://github.com/conductor-oss/conductor-community.git, which no longer has Gradle files.
Details
docker build -t conductor:server -f server/Dockerfile ../
[+] Building 183.0s (38/38) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 2.61kB 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 34B 0.0s
=> [internal] load metadata for docker.io/library/alpine:3.18 0.8s
=> [internal] load metadata for docker.io/library/eclipse-temurin:17-jdk 1.4s
=> [server-builder 1/14] FROM docker.io/library/eclipse-temurin:17-jdk@sha256:9b7aa20276a61013d7f8ce6f5e7b7e7c51a9ec20a9c96f8bd0c677aee2a289fa 0.0s
=> [builder 1/8] FROM docker.io/library/alpine:3.18@sha256:11e21d7b981a59554b3f822c49f6e9f57b6068bb74f49c4cd5cc4c663c7e5160 0.0s
=> [internal] load build context 0.1s
=> => transferring context: 193.85kB 0.1s
=> CACHED [server-builder 2/14] RUN mkdir server-build 0.0s
=> CACHED [server-builder 3/14] WORKDIR server-build 0.0s
=> CACHED [server-builder 4/14] RUN ls -ltr 0.0s
=> CACHED [server-builder 5/14] RUN apt-get update -y 0.0s
=> CACHED [server-builder 6/14] RUN apt-get install git -y 0.0s
=> [server-builder 7/14] RUN git clone --branch v3.15.0 https://github.com/conductor-oss/conductor-community.git 3.2s
=> CACHED [builder 2/8] RUN apk add git 0.0s
=> CACHED [builder 3/8] RUN apk add --update nodejs npm yarn 0.0s
=> [builder 4/8] COPY . /conductor 0.5s
=> CACHED [stage-2 2/12] RUN apk add openjdk17 0.0s
=> CACHED [stage-2 3/12] RUN apk add nginx 0.0s
=> CACHED [stage-2 4/12] RUN mkdir -p /app/config /app/logs /app/libs 0.0s
=> CACHED [stage-2 5/12] COPY docker/server/bin /app 0.0s
=> CACHED [stage-2 6/12] COPY docker/server/config /app/config 0.0s
=> [builder 5/8] WORKDIR /conductor/ui 0.0s
=> [builder 6/8] RUN yarn install && cp -r node_modules/monaco-editor public/ && yarn build 177.0s
=> [server-builder 8/14] WORKDIR conductor-community 0.0s
=> [server-builder 9/14] RUN ls -ltr 0.2s
=> [server-builder 10/14] RUN ./gradlew build -x test --no-daemon --stacktrace 160.8s
=> [server-builder 11/14] WORKDIR /server-build 0.0s
=> [server-builder 12/14] RUN ls -ltr 0.3s
=> [server-builder 13/14] RUN pwd 0.4s
=> [stage-2 7/12] COPY --from=server-builder /server-build/conductor-community/community-server/build/libs/boot.jar /app/libs/conductor-server.jar 0.3s
=> [stage-2 8/12] WORKDIR /usr/share/nginx/html 0.0s
=> [stage-2 9/12] RUN rm -rf ./* 0.2s
=> [builder 7/8] RUN ls -ltr 0.2s
=> [builder 8/8] RUN echo "Done building UI" 0.3s
=> [stage-2 10/12] COPY --from=builder /conductor/ui/build . 0.4s
=> [stage-2 11/12] COPY --from=builder /conductor/docker/server/nginx/nginx.conf /etc/nginx/http.d/default.conf 0.0s
=> [stage-2 12/12] RUN chmod +x /app/startup.sh 0.2s
=> exporting to image 0.9s
=> => exporting layers 0.9s
=> => writing image sha256:b75213f621f3e3caf6ee7725ffba0ecdb8fc291237a08d04a0bbad5981b4471a 0.0s
=> => naming to docker.io/library/conductor:server 0.0s

docker build -t conductor:server -f server/Dockerfile ../
[+] Building 2.6s (21/39)
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 2.59kB 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 34B 0.0s
=> [internal] load metadata for docker.io/library/alpine:3.18 1.8s
=> [internal] load metadata for docker.io/library/eclipse-temurin:17-jdk 1.8s
=> [auth] library/alpine:pull token for registry-1.docker.io 0.0s
=> [auth] library/eclipse-temurin:pull token for registry-1.docker.io 0.0s
=> [internal] load build context 0.2s
=> => transferring context: 198.41kB 0.2s
=> [builder 1/8] FROM docker.io/library/alpine:3.18@sha256:11e21d7b981a59554b3f822c49f6e9f57b6068bb74f49c4cd5cc4c663c7e5160 0.0s
=> [server-builder 1/14] FROM docker.io/library/eclipse-temurin:17-jdk@sha256:9b7aa20276a61013d7f8ce6f5e7b7e7c51a9ec20a9c96f8bd0c677aee2a289fa 0.0s
=> CACHED [server-builder 2/14] RUN mkdir server-build 0.0s
=> CACHED [server-builder 3/14] WORKDIR server-build 0.0s
=> CACHED [server-builder 4/14] RUN ls -ltr 0.0s
=> CACHED [server-builder 5/14] RUN apt-get update -y 0.0s
=> CACHED [server-builder 6/14] RUN apt-get install git -y 0.0s
=> CACHED [server-builder 7/14] RUN git clone https://github.com/conductor-oss/conductor-community.git 0.0s
=> CACHED [server-builder 8/14] WORKDIR conductor-community 0.0s
=> CACHED [server-builder 9/14] RUN ls -ltr 0.0s
=> ERROR [server-builder 10/14] RUN ./gradlew build -x test --no-daemon --stacktrace 0.4s
=> CACHED [builder 2/8] RUN apk add git 0.0s
=> CACHED [builder 3/8] RUN apk add --update nodejs npm yarn 0.0s
=> CANCELED [builder 4/8] COPY . /conductor 0.5s


[server-builder 10/14] RUN ./gradlew build -x test --no-daemon --stacktrace:
#23 0.305 /bin/sh: 1: ./gradlew: not found


executor failed running [/bin/sh -c ./gradlew build -x test --no-daemon --stacktrace]: exit code: 127

To Reproduce
Follow the standalone server image build process here: https://docs.conductor-oss.org/devguide/running/docker.html
Expected behavior
Successful Docker server image build
Additional context
https://github.com/conductor-oss/conductor-community does not have Gradle files anymore

[DOC]: Add missing task type documentation

What are you missing in the docs

I see that the following types of tasks are documented in the Open API

  • SIMPLE
  • DYNAMIC
  • FORK_JOIN
  • FORK_JOIN_DYNAMIC
  • DECISION
  • SWITCH
  • JOIN
  • DO_WHILE
  • SUB_WORKFLOW
  • START_WORKFLOW
  • EVENT
  • WAIT
  • HUMAN
  • USER_DEFINED
  • HTTP
  • LAMBDA
  • INLINE
  • EXCLUSIVE_JOIN
  • TERMINATE
  • KAFKA_PUBLISH
  • JSON_JQ_TRANSFORM
  • SET_VARIABLE
  • NOOP

The following types are missing from the conductor oss documentation and orkes documentation:

  • DECISION
  • USER_DEFINED
  • LAMBDA
  • EXCLUSIVE_JOIN
  • NOOP

Problems using the conductoross/conductor:3.15.0 image

Describe the bug
I want to use the conductoross/conductor:3.15.0 image locally to try Conductor. But when I set conductor.db.type=redis_standalone and run it, an error message appears.

Details
Conductor version: docker-image: conductoross/conductor:3.15.0
Persistence implementation: Redis , es
Queue implementation: Redis
Lock: Redis

This is the configuration:

# Database persistence type.
# Below are the properties for redis
conductor.db.type=redis_standalone
conductor.queue.type=redis_standalone

conductor.redis.hosts=10.200.0.69:6379
conductor.redis-lock.serverAddress=redis://10.200.0.69:6379
conductor.redis.taskDefCacheRefreshInterval=1
conductor.redis.workflowNamespacePrefix=conductor
conductor.redis.queueNamespacePrefix=conductor_queues

# Elastic search instance indexing is enabled.
conductor.indexing.enabled=true
conductor.elasticsearch.url=http://10.200.0.69:9200
conductor.elasticsearch.indexName=conductor
conductor.elasticsearch.version=7
conductor.elasticsearch.clusterHealthColor=yellow

# Additional modules for metrics collection exposed to Prometheus (optional)
conductor.metrics-prometheus.enabled=true
management.endpoints.web.exposure.include=prometheus

# Load sample kitchen sink workflow
loadSample=true

This is the run command:

docker run -v /home/config-redis.properties:/app/config/config.properties --init -p 8080:8080 -p 1234:5000   conductoross/conductor:3.15.0

This is the error message (screenshot not included).

I want to know what is wrong with the configuration. Thank you.

[FEATURE]: Improve the OpenAPI documentation

Please read our contributor guide before creating an issue.
Also consider discussing your idea on the discussion forum first.

Describe the Feature Request

Improve the OpenAPI documentation in these aspects:

  • Possible responses (currently, according to the OpenAPI spec, all are 200)
  • Improve, where possible, responses or use cases that give unexpected results

Describe Preferred Solution

Modify the Conductor REST layer to add better OpenAPI definitions

Why doesn't the JDBC task work?

Running Conductor using Docker; when I start the workflow, the status of the JDBC task stays SCHEDULED.

Workflow definition:
{
  "ownerApp": null,
  "createTime": 1707025941567,
  "updateTime": 1707124459517,
  "createdBy": null,
  "updatedBy": null,
  "accessPolicy": {},
  "name": "procure_approval",
  "description": "procure_approval_test",
  "version": 2,
  "tasks": [
    {
      "name": "queryAmount",
      "taskReferenceName": "query",
      "description": null,
      "inputParameters": {
        "connectionId": "jdbc:mysql://[email protected]:32306/cnbmdb_train?user=root&password=Cnbm123456!",
        "statement": "SELECT price from purchase_order where code = ? and status = 0",
        "parameters": [
          {
            "code": "${workflow.input.code}"
          }
        ],
        "type": "SELECT"
      },
      "type": "JDBC",
      "dynamicTaskNameParam": null,
      "caseValueParam": null,
      "caseExpression": null,
      "scriptExpression": null,
      "dynamicForkJoinTasksParam": null,
      "dynamicForkTasksParam": null,
      "dynamicForkTasksInputParamName": null,
      "startDelay": 0,
      "subWorkflowParam": null,
      "sink": null,
      "optional": false,
      "taskDefinition": null,
      "rateLimited": null,
      "asyncComplete": false,
      "loopCondition": null,
      "retryCount": null,
      "evaluatorType": null,
      "expression": null
    },
    {
      "name": "switch_task",
      "taskReferenceName": "is_warning",
      "description": null,
      "inputParameters": {
        "price": "${query.output.price}"
      },
      "type": "SWITCH",
      "dynamicTaskNameParam": null,
      "caseValueParam": null,
      "caseExpression": null,
      "scriptExpression": null,
      "decisionCases": {
        "approval": [
          {
            "name": "sendMessage",
            "taskReferenceName": "message",
            "description": null,
            "inputParameters": {
              "receiver": "${workflow.input.receiver}",
              "content": "${workflow.input.code}is warning!"
            },
            "type": "SIMPLE",
            "dynamicTaskNameParam": null,
            "caseValueParam": null,
            "caseExpression": null,
            "scriptExpression": null,
            "dynamicForkJoinTasksParam": null,
            "dynamicForkTasksParam": null,
            "dynamicForkTasksInputParamName": null,
            "startDelay": 0,
            "subWorkflowParam": null,
            "sink": null,
            "optional": false,
            "taskDefinition": null,
            "rateLimited": null,
            "asyncComplete": false,
            "loopCondition": null,
            "retryCount": null,
            "evaluatorType": null,
            "expression": null
          }
        ]
      },
      "dynamicForkJoinTasksParam": null,
      "dynamicForkTasksParam": null,
      "dynamicForkTasksInputParamName": null,
      "startDelay": 0,
      "subWorkflowParam": null,
      "sink": null,
      "optional": false,
      "taskDefinition": null,
      "rateLimited": null,
      "asyncComplete": false,
      "loopCondition": null,
      "retryCount": null,
      "evaluatorType": "javascript",
      "expression": "$.price> 1000 ? 'approval' : ''"
    }
  ],
  "inputParameters": [],
  "outputParameters": {},
  "failureWorkflow": null,
  "schemaVersion": 2,
  "restartable": true,
  "workflowStatusListenerEnabled": false,
  "ownerEmail": "[email protected]",
  "timeoutPolicy": "ALERT_ONLY",
  "timeoutSeconds": 0,
  "variables": {},
  "inputTemplate": {}
}

Tasks get stuck and cause running executions to queue up

Describe the bug
Tasks get stuck in “RUNNING” status, and this causes other currently running executions to queue up. Once the stuck task is manually terminated, we see a spike in the number of completed tasks and the service returns to normal. In the logs we observe a high volume of error entries (1-3k every 1-2 minutes).

Details
Conductor version: 3.14.0
Persistence implementation: Postgres
Queue implementation: Dynoqueues
Lock: Redis
It occurred in several types of workflows and tasks.

To Reproduce
We were unable to replicate the bug so far.

Expected behavior
Task to finish and continue with the next task in the workflow.

Screenshots
This is an HTTP task that shouldn't have lasted more than a couple of minutes; it ran for 26 minutes before we manually terminated it:

Additional context
When this issue happens, we’ve observed an influx of 1-3k+ error messages every 1-2 minutes. Below is a sample of these errors:

Jan 22 18:26:47.960 i-0abadd3b65ed5a1a2 rp-meli-orkes (the same timestamp and host prefixed every line below)
at com.netflix.conductor.core.execution.WorkflowExecutor.decide(WorkflowExecutor.java:1082)
at com.netflix.conductor.core.execution.WorkflowExecutor.decide(WorkflowExecutor.java:1082)
at com.netflix.conductor.core.execution.WorkflowExecutor.decide(WorkflowExecutor.java:1082)
at com.netflix.conductor.core.execution.WorkflowExecutor.decide(WorkflowExecutor.java:1078)
at com.netflix.conductor.core.dal.ExecutionDAOFacade.updateTasks(ExecutionDAOFacade.java:530)
at java.base/java.lang.Iterable.forEach(Iterable.java:75)
at com.netflix.conductor.core.dal.ExecutionDAOFacade.updateTask(ExecutionDAOFacade.java:505)
at io.orkes.conductor.dao.archive.ArchivedExecutionDAO.updateTask(ArchivedExecutionDAO.java:86)
at io.micrometer.core.instrument.composite.CompositeTimer.record(CompositeTimer.java:79)
at io.orkes.conductor.dao.archive.ArchivedExecutionDAO.lambda$updateTask$1(ArchivedExecutionDAO.java:88)
at com.netflix.conductor.redis.dao.RedisExecutionDAO.updateTask(RedisExecutionDAO.java:254)
at com.netflix.conductor.redis.dao.BaseDynoDAO.toJson(BaseDynoDAO.java:70)
at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:3964)
at com.fasterxml.jackson.databind.ObjectMapper._writeValueAndClose(ObjectMapper.java:4719)
at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:318)
at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider._serialize(DefaultSerializerProvider.java:479)
at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:178)
at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:790)
at com.fasterxml.jackson.databind.JsonMappingException.prependPath(JsonMappingException.java:455)
Exception in thread "sweeper-thread-17" java.lang.StackOverflowError

[FEATURE]: Kafka as event queue

Please read our contributor guide before creating an issue.
Also consider discussing your idea on the discussion forum first.

Describe the Feature Request

Conductor doesn't support using Kafka as an event queue.

Describe Preferred Solution

A clear and concise description of what you want to happen.

Describe Alternatives

A clear and concise description of any alternative solutions or features you've considered.

Error in webhook for workflow and task statuses

Describe the bug
When conductor is configured without these 2 properties:

  • conductor.status-notifier.notification.headerPrefer
  • conductor.status-notifier.notification.headerPreferValue

The resulting request is not valid HTTP.

Details
Conductor version: 3.19.0

To Reproduce
Steps to reproduce the behavior:

  1. Configure Conductor without conductor.status-notifier.notification.headerPrefer and conductor.status-notifier.notification.headerPreferValue
  2. Create workflow
  3. Start workflow
  4. Observe that the request sent by the webhook is invalid HTTP

Expected behavior
The code should check that these properties have values before executing:

headers.put(config.getHeaderPrefer(), config.getHeaderPreferValue());
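The requested guard can be sketched as follows. This is a minimal, hypothetical sketch: `buildHeaders` stands in for the Conductor status-notifier code that calls `config.getHeaderPrefer()` and `config.getHeaderPreferValue()`, it is not the actual method.

```java
import java.util.HashMap;
import java.util.Map;

public class HeaderGuard {

    // Hypothetical stand-in for the status-notifier code path: only add the
    // header when both the name and the value are actually configured, so a
    // missing property can never produce a null or empty header name.
    public static Map<String, String> buildHeaders(String headerPrefer, String headerPreferValue) {
        Map<String, String> headers = new HashMap<>();
        if (headerPrefer != null && !headerPrefer.isBlank()
                && headerPreferValue != null && !headerPreferValue.isBlank()) {
            headers.put(headerPrefer, headerPreferValue);
        }
        return headers;
    }

    public static void main(String[] args) {
        // With the properties unset, no broken header is emitted.
        System.out.println(buildHeaders(null, null));
        // With both set, the header is added as before.
        System.out.println(buildHeaders("Prefer", "respond-async"));
    }
}
```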

Error when creating a workflow with ES7

Describe the bug
When I try to create a workflow I get this error:

2024-01-03 11:34:05.349 ERROR 1 --- [led-task-pool-3] c.n.c.c.r.WorkflowReconciler : Error when polling for workflows
java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: org/elasticsearch/common/xcontent/XContentType
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396) ~[?:?]
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073) ~[?:?]
at com.netflix.conductor.core.reconciliation.WorkflowReconciler.pollAndSweep(WorkflowReconciler.java:77) ~[conductor-core-3.15.0.jar!/:3.15.0]
at jdk.internal.reflect.GeneratedMethodAccessor47.invoke(Unknown Source) ~[?:?]
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
at java.lang.reflect.Method.invoke(Method.java:568) ~[?:?]
at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84) ~[conductor-es7-persistence-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54) ~[conductor-es7-persistence-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) ~[?:?]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
at java.lang.Thread.run(Thread.java:840) ~[?:?]

Details
Conductor version: 3.15.0
Persistence implementation: MySQL
Lock: None

Expected behavior
A fix so that Conductor 3.15.0 can be used with ES7.

[BUG] flyway.postgresql.transactional.lock not recognized by Flyway

Describe the bug
After upgrading to a new version of Conductor, I get an error while the Spring Boot app is starting.

As far as I can tell, Flyway does not recognize the property flyway.postgresql.transactional.lock.

Details
Conductor version:
implementation "org.conductoross:conductor-common:3.18.0"
implementation "org.conductoross:conductor-postgres-persistence:3.18.0"
implementation "org.conductoross:conductor-core:3.18.0"
implementation "org.conductoross:conductor-rest:3.18.0"

To Reproduce
Steps to reproduce the behavior:

  1. Start the Spring Boot app
  2. See error in logs:
    Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'flywayForPrimaryDb' defined in class path resource [com/netflix/conductor/postgres/config/PostgresConfiguration.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [org.flywaydb.core.Flyway]: Factory method 'flywayForPrimaryDb' threw exception; nested exception is org.flywaydb.core.api.FlywayException: Unknown configuration property: flyway.postgresql.transactional.lock

Expected behavior
The Spring Boot app with Conductor starts as expected.

[FEATURE] add support es6 in nashorn

We have hit an issue with the default Nashorn script engine: it does not support ECMAScript 6 features, although ES6 support can be enabled by replacing

ScriptEngine engine = new ScriptEngineManager().getEngineByName("nashorn");

With

NashornScriptEngineFactory factory = new NashornScriptEngineFactory();
ScriptEngine engine = factory.getScriptEngine("--language=es6");

Our request is to make this configurable, so that users can supply engine flags or inject a custom script engine. Thanks in advance!

Some methods return 2xx codes instead of 404 when the task does not exist.

Describe the bug
Some methods return 2xx codes instead of 404 when the task does not exist.
For example
POST /api/tasks/{task_id}/log -> returns 200 when the task does not exist
GET /api/tasks/{task_id} -> returns 204 when the task does not exist

Details
Conductor version: v3.14.0
Persistence implementation: MySQL

Expected behavior
When a request is made for a resource (e.g. a task) that does not exist, the API should return 404.

Changing the Conductor root path breaks the Swagger UI

Describe the bug
After changing the Conductor root path, the Swagger UI HTML page does not work.

Details
Conductor version:
Persistence implementation: Postgres

As part of Netflix/conductor#3656, Conductor supports customizing the web app root by changing homepage; however, that change appears to have missed the Swagger docs.

To Reproduce
Steps to reproduce the behavior:

  1. Go to ui/package.json and add one line: "homepage": "https://somedomain/conductor/"
  2. Build the image and deploy
  3. Access "https://somedomain/conductor/swagger-ui/index.html"; the page shows "failed to load remote configuration"
  4. After investigating: in swagger-ui/swagger-initializer.js, "configUrl": "/api-docs/swagger-config" uses an absolute address
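A sketch of the fix suggested in step 4 (an assumption, not a committed change): make configUrl relative in swagger-ui/swagger-initializer.js so it resolves against the deployed root path rather than the server root.

```js
// swagger-initializer.js (sketch): the page lives under <root>/swagger-ui/,
// so "../api-docs/swagger-config" resolves to <root>/api-docs/swagger-config
// for any custom root, instead of the hard-coded "/api-docs/swagger-config".
window.ui = SwaggerUIBundle({
  configUrl: "../api-docs/swagger-config",
  dom_id: "#swagger-ui",
});
```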

Expected behavior
Swagger UI should work when a custom root is configured. Could the same change as in issue 3656 be applied here?

Some workflows are deleted automatically

Describe the bug
Some workflows are deleted automatically.

Details
Conductor version: 3.2.0
Persistence implementation: Dynomite
Queue implementation: Redis queues
Lock: Redis

Expected behavior
The workflow should not be deleted.

Screenshots
14:35:33.798 [http-nio-8080-exec-5] ERROR com.netflix.conductor.rest.controllers.ApplicationExceptionMapper - Error ApplicationException url: '/api/workflow/provision_sim_wf'
com.netflix.conductor.core.exception.ApplicationException: No such workflow found by name: provision_sim_wf, version: null
at com.netflix.conductor.service.MetadataServiceImpl.lambda$getWorkflowDef$0(MetadataServiceImpl.java:133) ~[conductor-core-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at java.util.Optional.orElseThrow(Unknown Source) ~[?:?]
at com.netflix.conductor.service.MetadataServiceImpl.getWorkflowDef(MetadataServiceImpl.java:132) ~[conductor-core-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at com.netflix.conductor.service.MetadataServiceImpl$$FastClassBySpringCGLIB$$726d989d.invoke() ~[conductor-core-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218) ~[conductor-es7-persistence-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:779) ~[conductor-es7-persistence-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) ~[conductor-es7-persistence-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:750) ~[conductor-es7-persistence-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at org.springframework.validation.beanvalidation.MethodValidationInterceptor.invoke(MethodValidationInterceptor.java:119) ~[conductor-es7-persistence-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) ~[conductor-es7-persistence-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:750) ~[conductor-es7-persistence-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:692) ~[conductor-es7-persistence-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at com.netflix.conductor.service.MetadataServiceImpl$$EnhancerBySpringCGLIB$$9234a04b.getWorkflowDef() ~[conductor-core-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at com.netflix.conductor.service.WorkflowServiceImpl.startWorkflow(WorkflowServiceImpl.java:161) ~[conductor-core-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at com.netflix.conductor.service.WorkflowServiceImpl$$FastClassBySpringCGLIB$$c01ac20d.invoke() ~[conductor-core-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218) ~[conductor-es7-persistence-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:779) ~[conductor-es7-persistence-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) ~[conductor-es7-persistence-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:750) ~[conductor-es7-persistence-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at org.springframework.validation.beanvalidation.MethodValidationInterceptor.invoke(MethodValidationInterceptor.java:119) ~[conductor-es7-persistence-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186) ~[conductor-es7-persistence-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:750) ~[conductor-es7-persistence-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:692) ~[conductor-es7-persistence-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at com.netflix.conductor.service.WorkflowServiceImpl$$EnhancerBySpringCGLIB$$6c49829b.startWorkflow() ~[conductor-core-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at com.netflix.conductor.rest.controllers.WorkflowResource.startWorkflow(WorkflowResource.java:65) ~[conductor-rest-0.1.0-SNAPSHOT.jar!/:0.1.0-SNAPSHOT]
at jdk.internal.reflect.GeneratedMethodAccessor411.invoke(Unknown Source) ~[?:?]
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) ~[?:?]
at java.lang.reflect.Method.invoke(Unknown Source) ~[?:?]

Human task is scheduled twice even if I set redis lock

Describe the bug
I set up a workflow with several simple tasks and one human task.
I also have several workers polling and updating the simple tasks, and one worker updating the human task.
However, I found that the human task is scheduled twice after the preceding simple task finishes.

Details
Conductor version: 3.16.0
Persistence implementation: Postgres
Queue implementation: Redis
Lock: Redis
Workflow definition: see attachment workflow_definition.json
workflow_definition.json

Task definition: see attachment task_definition.json
tasks_definition.json

Below is the result returned by api/workflow/8f4d7300-5dd8-42dd-a58b-aadbc68db157?includeTasks=true

workflow_result.json

Below is the properties we use:
conductor-config.properties.log

Below is the env setup:
1) Redis (1 primary + 1 replica): AWS ElastiCache cache.t4g.micro
2) PostgreSQL (1 writer + 1 read-only): AWS Aurora RDS db.t4g.medium
3) Conductor service (2 pods/replicas): AWS EC2 m7a.xlarge

To Reproduce
Steps to reproduce the behavior:

  1. Create workflow definition

  2. Create task definitions

  3. Start 50 workers for each of the simple tasks with poll time 200ms

  4. Start 1 process that updates the human task whenever there is a human task in_progress in the current instance.
    Once a worker updates the task "saveDbWithWorkflowDummy_0" with a finished status, the workflow instance is inserted into a local in-memory table.
    A separate process checks the table every 200ms, takes the instance id, and checks whether any human task is scheduled. If yes, it updates the human task with a completed status and deletes the instance from the table; otherwise it waits for the next polling cycle.

  5. Start 2000 workflow instances (if the issue does not appear, start another 2000 after they finish; usually 4 rounds reproduce it)

  6. See duplicate human tasks in screenshot

Expected behavior
The human task should be scheduled only once.

Screenshots
duplicate_scheduled_human_task
duplicate_scheduled_human_task2


AWS SDK not able to assume IAM role for service accounts in Kubernetes

Describe the bug
The Conductor components which make use of the AWS SDK (i.e. conductor-awssqs-event-queue etc.) will currently not assume an IAM role which is associated with a Kubernetes service account. Enabling the AWS SDK debug logs reveals that the WebIdentityTokenCredentialsProvider credentials provider is not being included as part of the DefaultAWSCredentialsProviderChain:

c.a.a.AWSCredentialsProviderChain : Unable to load credentials from EnvironmentVariableCredentialsProvider: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
c.a.a.AWSCredentialsProviderChain : Unable to load credentials from SystemPropertiesCredentialsProvider: Unable to load AWS credentials from Java system properties (aws.accessKeyId and aws.secretKey)
c.a.a.AWSCredentialsProviderChain : Unable to load credentials from com.amazonaws.auth.profile.ProfileCredentialsProvider@109f8c7e: profile file cannot be null
c.a.a.AWSCredentialsProviderChain : Loading credentials from com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper@75156240

Details
Conductor version: Snapshot (main at fec3116)
Persistence implementation: Postgres
Queue implementation: Postgres, Dynoqueues, SQS

To Reproduce
Steps to reproduce the behaviour:

  1. Build server component from main at fec3116 using the Dockerfile from docker/server
  2. Run the server using Docker image as a service in Kubernetes with the pods using a service account with an IAM role associated

Expected behavior
The WebIdentityTokenCredentialsProvider should be used to assume the IAM role and these credentials should be used for all AWS SDK requests.

Additional context
In our AWS EKS cluster we use IAM roles for service accounts, which relies on the WebIdentityTokenCredentialsProvider credentials provider. The version of the AWS SDK currently used by Conductor is 1.11.86, but the minimum version that supports WebIdentityTokenCredentialsProvider is 1.11.704. I've upgraded our fork to use the latest 1.11 version (currently 1.11.1034) and this seems to resolve the issue. Note that as part of this change you also need to make sure com.amazonaws:aws-java-sdk-sts is on the class path at runtime.
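The dependency change described above can be sketched in Gradle as follows. Versions are taken from this report; treat the exact coordinates as an assumption to verify against your own build.

```groovy
dependencies {
    // Bump the AWS SDK past 1.11.704, the first version that ships
    // WebIdentityTokenCredentialsProvider in the default credentials chain.
    implementation platform('com.amazonaws:aws-java-sdk-bom:1.11.1034')
    implementation 'com.amazonaws:aws-java-sdk-sqs'
    // Required at runtime for the provider to assume the IAM role.
    runtimeOnly 'com.amazonaws:aws-java-sdk-sts'
}
```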

Dynamic Fork Join tasks jumps to join task without completion of forked tasks

Describe the bug
I am facing issues when scheduling a dynamic fork-join task with 6000+ forked workflows. The JOIN task finishes before the forked tasks complete whenever there are too many forks. The JOIN task also shows an empty value for joinOn in the task input on the workflow execution page.

Details
Conductor version: orkesio/orkes-conductor-community:1.1.11
Persistence implementation: Postgres
Queue implementation: Orkes queue
Lock: Redis or Zookeeper? Redis


[FEATURE]: Elasticsearch 8 Compatibility

Dear Conductor Team,

I am writing to inquire about the current status and potential plans for making Conductor compatible with Elasticsearch 8. As you might be aware, Elasticsearch 8 introduces several significant improvements and features that are beneficial for performance and scalability.

However, after upgrading to Elasticsearch 8, we have encountered compatibility issues with the current version of Conductor.

We would greatly appreciate it if the team could consider adding support for Elasticsearch 8 in an upcoming release. This upgrade would not only enhance the functionality and efficiency of Conductor but also enable users like us to leverage the latest advancements in Elasticsearch.

Could you please provide any information regarding your plans for this compatibility update?

Thank you for your continuous efforts in improving Conductor!
Looking forward to your response.

Dynamic fork does not support array input as in Orkes version

Describe the bug
The Orkes Conductor docs (currently the only docs available for Conductor unless you dig in the archived Netflix repo) describe a simple usage pattern for a Dynamic Fork: given a single task (or sub-workflow) name and an array of inputs, the task is executed once per input.

This behaviour is not supported in the OSS version of Conductor. We are instead forced to add a preceding inline/worker task that generates the dynamic task JSON definitions and task input mappings. This is overly complex and would result in a huge amount of boilerplate JSON for a 100+ element array whose items all need the same task or sub-workflow executed.
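The workaround described above (a preceding task that generates the dynamic task definitions) can be sketched as follows. The class name, the reference-name scheme, and the use of the "Number" input key are illustrative assumptions based on the definitions below, not Conductor API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DynamicForkInput {

    // Hypothetical worker/inline task body: expand an input array into the
    // dynamicTasks list and dynamicTasksInput map that FORK_JOIN_DYNAMIC
    // expects, repeating the same task name ("add_one") for every element.
    public static Map<String, Object> build(List<Integer> items) {
        List<Map<String, Object>> dynamicTasks = new ArrayList<>();
        Map<String, Object> dynamicTasksInput = new HashMap<>();
        for (int i = 0; i < items.size(); i++) {
            String ref = "add_one_" + i; // each fork needs a unique taskReferenceName
            Map<String, Object> task = new HashMap<>();
            task.put("name", "add_one");
            task.put("taskReferenceName", ref);
            task.put("type", "SIMPLE");
            dynamicTasks.add(task);
            dynamicTasksInput.put(ref, Map.of("Number", items.get(i)));
        }
        Map<String, Object> out = new HashMap<>();
        out.put("dynamicTasks", dynamicTasks);
        out.put("dynamicTasksInput", dynamicTasksInput);
        return out;
    }
}
```

The two output keys match the dynamicForkTasksParam and dynamicForkTasksInputParamName values in the workflow definition below.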

Details
Using default conductor standalone docker image, taken from OSS Conductor ReadMe.
Conductor version: OSS 3.15
Persistence implementation: default (Redis)
Queue implementation: default (Postgres)
Lock: default (Redis?)
Workflow definition:

{
  "createTime": 1702551738787,
  "updateTime": 1703000317219,
  "name": "DynamicForkTest",
  "description": "Very simple dynamic fork",
  "version": 1,
  "tasks": [
    {
      "name": "dynamic_task",
      "taskReferenceName": "dynamic_task_ref",
      "inputParameters": {
        "forkTaskName": "add_one",
        "forkTaskInputs": "${workflow.input.Items}"
      },
      "type": "FORK_JOIN_DYNAMIC",
      "decisionCases": {},
      "dynamicForkTasksParam": "dynamicTasks",
      "dynamicForkTasksInputParamName": "dynamicTasksInput",
      "defaultCase": [],
      "forkTasks": [],
      "startDelay": 0,
      "joinOn": [],
      "optional": false,
      "defaultExclusiveJoinTask": [],
      "asyncComplete": false,
      "loopOver": [],
      "onStateChange": {}
    },
    {
      "name": "join_task",
      "taskReferenceName": "join_task_ref",
      "inputParameters": {},
      "type": "JOIN",
      "decisionCases": {},
      "defaultCase": [],
      "forkTasks": [],
      "startDelay": 0,
      "joinOn": [],
      "optional": false,
      "defaultExclusiveJoinTask": [],
      "asyncComplete": false,
      "loopOver": [],
      "onStateChange": {}
    }
  ],
  "inputParameters": [
    "Items"
  ],
  "outputParameters": {
    "Output": "${join_task_ref.output}"
  },
  "failureWorkflow": "",
  "schemaVersion": 2,
  "restartable": true,
  "workflowStatusListenerEnabled": false,
  "ownerEmail": "***",
  "timeoutPolicy": "ALERT_ONLY",
  "timeoutSeconds": 0,
  "variables": {},
  "inputTemplate": {},
  "onStateChange": {}
}

Task definition:

{
  "createTime": 1702551592911,
  "createdBy": "***",
  "name": "add_one",
  "description": "Super simple, adds 1 to input",
  "retryCount": 3,
  "timeoutSeconds": 3600,
  "inputKeys": [],
  "outputKeys": [],
  "timeoutPolicy": "TIME_OUT_WF",
  "retryLogic": "FIXED",
  "retryDelaySeconds": 60,
  "responseTimeoutSeconds": 600,
  "concurrentExecLimit": 0,
  "inputTemplate": {
    "Number": 1
  },
  "rateLimitPerFrequency": 0,
  "rateLimitFrequencyInSeconds": 1,
  "ownerEmail": "***",
  "pollTimeoutSeconds": 3600,
  "backoffScaleFactor": 1
}

Event handler definition: None

To Reproduce
Steps to reproduce the behaviour:

  1. Load up local standalone OSS Conductor via Docker:
docker volume create postgres
docker volume create redis

docker run --init -p 8080:8080 -p 1234:5000 --mount source=redis,target=/redis \
--mount source=postgres,target=/pgdata conductoross/conductor-standalone:3.15.0
  2. Add definitions above (no need to implement the worker to reproduce the bug, we won't make it that far)
  3. Execute the workflow with some array input, e.g. {"numbers":[1,2,3,4]}
  4. See error: Input to the dynamically forked tasks is not a map -> expecting a map of K,V but found null

Expected behaviour
The input array should be split into 4 separate tasks (where 4 is the array length) and processed concurrently in the dynamic fork, as in an Orkes playground test run.

Screenshots
Expected:
image

Observed in OSS Conductor:
image

conductor server hikari pool deadlock

Details
Conductor version: 3.13.6
Persistence implementation: Mysql
Queue implementation: Mysql
Lock: Zookeeper

We are using the MySQL implementation for all DAO interfaces except IndexDAO. Each worker sends a polling request to the Conductor server every second. Currently there are one Conductor server and seven or eight Conductor workers in use.
However, the following error occurs even when no workflows or tasks are running. We suspect the cause is the executionDAOFacade.updateTaskLastPoll(taskType, domain, workerId) call inside the poll method.

Any suggestions other than increasing the number of Conductor servers? We are considering updating poll data only when a task was actually polled.
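The idea in the last sentence can be sketched like this. It is a hypothetical wrapper, not the actual ExecutionService code; PollDataDao mimics the updateTaskLastPoll call seen in the trace.

```java
import java.util.Collections;
import java.util.List;

public class PollDataThrottle {

    /** Stand-in for the DAO call from the trace (updateTaskLastPoll). */
    public interface PollDataDao {
        void updateTaskLastPoll(String taskType, String domain, String workerId);
    }

    // Only persist poll data when the poll actually returned tasks, so the
    // frequent empty polls from idle workers stop contending for row locks.
    public static <T> List<T> pollAndRecord(List<T> polled, PollDataDao dao,
                                            String taskType, String domain, String workerId) {
        if (polled != null && !polled.isEmpty()) {
            dao.updateTaskLastPoll(taskType, domain, workerId);
        }
        return polled == null ? Collections.emptyList() : polled;
    }
}
```

Note the trade-off: last-poll timestamps for idle workers go stale, which could affect anything that relies on poll-data freshness.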

09:05:11.982 [http-nio-8080-exec-8] ERROR com.netflix.conductor.core.dal.ExecutionDAOFacade - Error updating PollData for task: sink-worker in domain: null from worker: demo-ndap-ndap-conductor-worker-6c695bd677-ccfz8
com.netflix.conductor.core.exception.NonTransientException: (conn=20936) Lock wait timeout exceeded; try restarting transaction
	at com.netflix.conductor.mysql.dao.MySQLBaseDAO.getWithRetriedTransactions(MySQLBaseDAO.java:147) ~[conductor-mysql-persistence-3.13.3.jar!/:3.13.3]
	at com.netflix.conductor.mysql.dao.MySQLBaseDAO.withTransaction(MySQLBaseDAO.java:192) ~[conductor-mysql-persistence-3.13.3.jar!/:3.13.3]
	at com.netflix.conductor.mysql.dao.MySQLExecutionDAO.updateLastPollData(MySQLExecutionDAO.java:527) ~[conductor-mysql-persistence-3.13.3.jar!/:3.13.3]
	at com.netflix.conductor.core.dal.ExecutionDAOFacade.updateTaskLastPoll(ExecutionDAOFacade.java:585) ~[conductor-core-7.2.0-SNAPSHOT.jar!/:7.2.0-SNAPSHOT]
	at com.netflix.conductor.service.ExecutionService.poll(ExecutionService.java:182) ~[conductor-core-7.2.0-SNAPSHOT.jar!/:7.2.0-SNAPSHOT]
	at com.netflix.conductor.service.TaskServiceImpl.batchPoll(TaskServiceImpl.java:89) ~[conductor-core-7.2.0-SNAPSHOT.jar!/:7.2.0-SNAPSHOT]
	at com.netflix.conductor.service.TaskServiceImpl$$FastClassBySpringCGLIB$$d78c3007.invoke(<generated>) ~[conductor-core-7.2.0-SNAPSHOT.jar!/:7.2.0-SNAPSHOT]
	at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218) ~[spring-core-5.3.22.jar!/:5.3.22]
...
Caused by: com.netflix.conductor.core.exception.NonTransientException: (conn=20936) Lock wait timeout exceeded; try restarting transaction
	at com.netflix.conductor.mysql.util.Query.executeUpdate(Query.java:284) ~[conductor-mysql-persistence-3.13.3.jar!/:3.13.3]
	at com.netflix.conductor.mysql.dao.MySQLExecutionDAO.lambda$insertOrUpdatePollData$55(MySQLExecutionDAO.java:1008) ~[conductor-mysql-persistence-3.13.3.jar!/:3.13.3]
	at com.netflix.conductor.mysql.dao.MySQLBaseDAO.query(MySQLBaseDAO.java:224) ~[conductor-mysql-persistence-3.13.3.jar!/:3.13.3]
	at com.netflix.conductor.mysql.dao.MySQLExecutionDAO.insertOrUpdatePollData(MySQLExecutionDAO.java:1001) ~[conductor-mysql-persistence-3.13.3.jar!/:3.13.3]
	at com.netflix.conductor.mysql.dao.MySQLExecutionDAO.lambda$updateLastPollData$24(MySQLExecutionDAO.java:527) ~[conductor-mysql-persistence-3.13.3.jar!/:3.13.3]
	at com.netflix.conductor.mysql.dao.MySQLBaseDAO.lambda$withTransaction$4(MySQLBaseDAO.java:194) ~[conductor-mysql-persistence-3.13.3.jar!/:3.13.3]
	at com.netflix.conductor.mysql.dao.MySQLBaseDAO.getWithTransaction(MySQLBaseDAO.java:121) ~[conductor-mysql-persistence-3.13.3.jar!/:3.13.3]

S3 external storage cannot be used due to missing java dependencies

Description:
When one tries to use external storage for payloads (AWS S3) together with payload size thresholds as described here, the task fails and the workflow is terminated with the following error:

conductor-server                  | 894377 [http-nio-8080-exec-10] ERROR com.netflix.conductor.rest.controllers.ApplicationExceptionMapper [] - Error ServletException url: '/api/tasks'
conductor-server                  | jakarta.servlet.ServletException: Handler dispatch failed: java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter
conductor-server                  | 	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1096) [spring-webmvc-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:974) [spring-webmvc-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1011) [spring-webmvc-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:914) [spring-webmvc-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:590) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:885) [spring-webmvc-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:658) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:205) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:51) [tomcat-embed-websocket-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100) [spring-web-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) [spring-web-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93) [spring-web-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) [spring-web-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.springframework.web.filter.ServerHttpObservationFilter.doFilterInternal(ServerHttpObservationFilter.java:109) [spring-web-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) [spring-web-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201) [spring-web-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) [spring-web-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:167) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:90) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:482) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:115) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:93) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:341) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:391) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:63) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:894) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1740) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:52) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) [tomcat-embed-core-10.1.13.jar!/:?]
conductor-server                  | 	at java.lang.Thread.run(Thread.java:840) [?:?]
conductor-server                  | Caused by: java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter
conductor-server                  | 	at com.amazonaws.util.Base64.encodeAsString(Base64.java:39) ~[aws-java-sdk-core-1.11.86.jar!/:?]
conductor-server                  | 	at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1722) ~[aws-java-sdk-s3-1.11.86.jar!/:?]
conductor-server                  | 	at com.netflix.conductor.s3.storage.S3PayloadStorage.upload(S3PayloadStorage.java:131) ~[conductor-awss3-storage-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at com.netflix.conductor.core.utils.ExternalPayloadStorageUtils.uploadHelper(ExternalPayloadStorageUtils.java:209) ~[conductor-core-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at com.netflix.conductor.core.utils.ExternalPayloadStorageUtils.verifyAndUpload(ExternalPayloadStorageUtils.java:164) ~[conductor-core-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at com.netflix.conductor.core.dal.ExecutionDAOFacade.externalizeTaskData(ExecutionDAOFacade.java:270) ~[conductor-core-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at com.netflix.conductor.core.dal.ExecutionDAOFacade.updateTask(ExecutionDAOFacade.java:507) ~[conductor-core-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at com.netflix.conductor.core.execution.WorkflowExecutor.updateTask(WorkflowExecutor.java:852) ~[conductor-core-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at com.netflix.conductor.service.ExecutionService.updateTask(ExecutionService.java:245) ~[conductor-core-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at com.netflix.conductor.service.TaskServiceImpl.updateTask(TaskServiceImpl.java:134) ~[conductor-core-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at jdk.internal.reflect.GeneratedMethodAccessor124.invoke(Unknown Source) ~[?:?]
conductor-server                  | 	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
conductor-server                  | 	at java.lang.reflect.Method.invoke(Method.java:568) ~[?:?]
conductor-server                  | 	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:343) ~[conductor-es7-persistence-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:196) ~[conductor-es7-persistence-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) ~[conductor-es7-persistence-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:751) ~[conductor-es7-persistence-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at org.springframework.validation.beanvalidation.MethodValidationInterceptor.invoke(MethodValidationInterceptor.java:141) ~[conductor-es7-persistence-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:184) ~[conductor-es7-persistence-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:751) ~[conductor-es7-persistence-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:703) ~[conductor-es7-persistence-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at com.netflix.conductor.service.TaskServiceImpl$$SpringCGLIB$$0.updateTask(<generated>) ~[conductor-core-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at com.netflix.conductor.rest.controllers.TaskResource.updateTask(TaskResource.java:83) ~[conductor-rest-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at jdk.internal.reflect.GeneratedMethodAccessor123.invoke(Unknown Source) ~[?:?]
conductor-server                  | 	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
conductor-server                  | 	at java.lang.reflect.Method.invoke(Method.java:568) ~[?:?]
conductor-server                  | 	at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205) ~[spring-web-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:150) ~[spring-web-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:118) ~[spring-webmvc-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:884) ~[spring-webmvc-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:797) ~[spring-webmvc-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87) ~[spring-webmvc-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1081) [spring-webmvc-6.0.12.jar!/:6.0.12]
conductor-server                  | 	... 43 more
conductor-server                  | Caused by: java.lang.ClassNotFoundException: javax.xml.bind.DatatypeConverter
conductor-server                  | 	at com.amazonaws.util.Base64.encodeAsString(Base64.java:39) ~[aws-java-sdk-core-1.11.86.jar!/:?]
conductor-server                  | 	at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1722) ~[aws-java-sdk-s3-1.11.86.jar!/:?]
conductor-server                  | 	at com.netflix.conductor.s3.storage.S3PayloadStorage.upload(S3PayloadStorage.java:131) ~[conductor-awss3-storage-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at com.netflix.conductor.core.utils.ExternalPayloadStorageUtils.uploadHelper(ExternalPayloadStorageUtils.java:209) ~[conductor-core-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at com.netflix.conductor.core.utils.ExternalPayloadStorageUtils.verifyAndUpload(ExternalPayloadStorageUtils.java:164) ~[conductor-core-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at com.netflix.conductor.core.dal.ExecutionDAOFacade.externalizeTaskData(ExecutionDAOFacade.java:270) ~[conductor-core-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at com.netflix.conductor.core.dal.ExecutionDAOFacade.updateTask(ExecutionDAOFacade.java:507) ~[conductor-core-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at com.netflix.conductor.core.execution.WorkflowExecutor.updateTask(WorkflowExecutor.java:852) ~[conductor-core-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at com.netflix.conductor.service.ExecutionService.updateTask(ExecutionService.java:245) ~[conductor-core-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at com.netflix.conductor.service.TaskServiceImpl.updateTask(TaskServiceImpl.java:134) ~[conductor-core-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at jdk.internal.reflect.GeneratedMethodAccessor124.invoke(Unknown Source) ~[?:?]
conductor-server                  | 	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
conductor-server                  | 	at java.lang.reflect.Method.invoke(Method.java:568) ~[?:?]
conductor-server                  | 	at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:343) ~[conductor-es7-persistence-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:196) ~[conductor-es7-persistence-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) ~[conductor-es7-persistence-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:751) ~[conductor-es7-persistence-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at org.springframework.validation.beanvalidation.MethodValidationInterceptor.invoke(MethodValidationInterceptor.java:141) ~[conductor-es7-persistence-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:184) ~[conductor-es7-persistence-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:751) ~[conductor-es7-persistence-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:703) ~[conductor-es7-persistence-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at com.netflix.conductor.service.TaskServiceImpl$$SpringCGLIB$$0.updateTask(<generated>) ~[conductor-core-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at com.netflix.conductor.rest.controllers.TaskResource.updateTask(TaskResource.java:83) ~[conductor-rest-3.16.0-SNAPSHOT.jar!/:3.16.0-SNAPSHOT]
conductor-server                  | 	at jdk.internal.reflect.GeneratedMethodAccessor123.invoke(Unknown Source) ~[?:?]
conductor-server                  | 	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
conductor-server                  | 	at java.lang.reflect.Method.invoke(Method.java:568) ~[?:?]
conductor-server                  | 	at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205) ~[spring-web-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:150) ~[spring-web-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:118) ~[spring-webmvc-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:884) ~[spring-webmvc-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:797) ~[spring-webmvc-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87) ~[spring-webmvc-6.0.12.jar!/:6.0.12]
conductor-server                  | 	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1081) [spring-webmvc-6.0.12.jar!/:6.0.12]
conductor-server                  | 	... 43 more

Solution

The issue can be fixed by adding the following dependencies to build.gradle:

    dependencies {
        ...
        implementation('javax.xml.bind:jaxb-api:2.3.1')
        implementation('com.sun.xml.bind:jaxb-core:2.3.0.1')
        implementation('com.sun.xml.bind:jaxb-impl:2.3.3')
        ...
    }

The issue is related to the Java version: the javax.xml.bind (JAXB) classes were removed from the JDK in Java 11, so they are no longer on the default classpath. There is a related topic on SO.

After adding the lines above and restarting Conductor, S3 as external storage works normally.
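Note that the failing call is inside the AWS SDK itself (aws-java-sdk-core 1.11.86), which still relied on the JAXB classes removed from the JDK in Java 11; re-adding them as dependencies, as above, restores them. For illustration only, the Base64 encoding the SDK needs has had a standard-library equivalent since Java 8 (this is a standalone sketch, not Conductor or SDK code):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64Check {
    // java.util.Base64 (since Java 8) produces the same encoding that
    // javax.xml.bind.DatatypeConverter.printBase64Binary() did before
    // JAXB was removed from the JDK in Java 11.
    public static String encode(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        System.out.println(encode("conductor".getBytes(StandardCharsets.UTF_8)));
        // prints Y29uZHVjdG9y
    }
}
```

A longer-term fix may be upgrading the AWS SDK to a version that no longer depends on JAXB, rather than pinning the JAXB artifacts.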

Why didn't I submit a PR when I have the solution? Because I am not a Java developer, and I would very much appreciate it if someone more experienced took a look at this and implemented a proper solution. Mine works, but it may not be entirely correct; I just don't know.

Details
Conductor version: v3.15.0
Persistence implementation: Redis
Queue implementation: Redis
Lock: Redis
Workflow definition: -
Task definition: -
Event handler definition: -

To Reproduce
Steps to reproduce the behavior:

  1. Clone the repo from the main branch at the latest commit
  2. Deploy conductor locally using docker compose build. Note that some changes must be made to docker-compose.yaml and config-redis.properties in order for this to work.

Docker-compose.yaml changes - add the following env vars to conductor-server:

  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • AWS_REGION

config-redis.properties changes - set up S3 as external storage by adding these parameters:

  • conductor.external-payload-storage.type=s3
  • conductor.external-payload-storage.s3.bucketName=YOUR BUCKET NAME HERE
  • conductor.external-payload-storage.s3.region=YOUR AWS REGION HERE
  • conductor.app.workflowInputPayloadSizeThreshold=5120
  • conductor.app.maxWorkflowInputPayloadSizeThreshold=10240
  • conductor.app.workflowOutputPayloadSizeThreshold=100
  • conductor.app.maxWorkflowOutputPayloadSizeThreshold=1024
  • conductor.app.taskInputPayloadSizeThreshold=3072
  • conductor.app.maxTaskInputPayloadSizeThreshold=10240
  • conductor.app.taskOutputPayloadSizeThreshold=100
  • conductor.app.maxTaskOutputPayloadSizeThreshold=1024
  3. Create a dummy worker that produces an output larger than conductor.app.taskOutputPayloadSizeThreshold (the value is in KB).
  4. Create a workflow that uses the worker from step 3.
  5. Execute the workflow.
  6. See the error.

Expected behavior
The payload should be uploaded to S3 when it is bigger than the threshold.

Thank you in advance!

Postgres indexing problem with task_type length

Describe the bug
When switching the indexing database from Elasticsearch to Postgres, an error occurs related to the length of the task name.

Caused by: com.netflix.conductor.core.exception.NonTransientException: ERROR: value too long for type character varying(32)
	at com.netflix.conductor.postgres.dao.PostgresBaseDAO.getWithRetriedTransactions(PostgresBaseDAO.java:148) ~[conductor-postgres-persistence.jar!/:?]
	at com.netflix.conductor.postgres.dao.PostgresBaseDAO.queryWithTransaction(PostgresBaseDAO.java:210) ~[conductor-postgres-persistence.jar!/:?]
	at com.netflix.conductor.postgres.dao.PostgresIndexDAO.indexTask(PostgresIndexDAO.java:151) ~[conductor-postgres-persistence.jar!/:?]
	at com.netflix.conductor.core.dal.ExecutionDAOFacade.updateTask(ExecutionDAOFacade.java:517) ~[conductor-core.jar!/:?]
	... 18 more

It turns out there is a problem with the length of the task_type field. According to the database schema, it should be at most 32 characters long. However, for SIMPLE tasks the task_name value is written there instead of the task_type.
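The overflow is easy to confirm outside the database. A standalone sketch (illustrative names; the 32-character limit is taken from the error above) comparing the two candidate values against the column limit:

```java
public class TaskTypeLengthCheck {
    // varchar(32) limit of the task_type column in the Postgres index schema.
    static final int TASK_TYPE_COLUMN_LIMIT = 32;

    public static void main(String[] args) {
        String taskName = "TaskWithLongNameInTestWorkflowWithLongTaskName";
        String taskType = "SIMPLE";
        // The task name (46 chars) overflows the column; the literal type fits.
        System.out.println(taskName.length() + " > " + TASK_TYPE_COLUMN_LIMIT
                + ": " + (taskName.length() > TASK_TYPE_COLUMN_LIMIT));
        System.out.println(taskType.length() + " <= " + TASK_TYPE_COLUMN_LIMIT
                + ": " + (taskType.length() <= TASK_TYPE_COLUMN_LIMIT));
    }
}
```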

Details
Conductor version: 3.18
Persistence implementation: Postgres
Queue implementation: Redis
Lock: Redis
Workflow definition:

{
  "accessPolicy": {},
  "name": "TestWorkflowWithLongTaskName",
  "description": "Test Workflow to Index long task name in postgres",
  "version": 1,
  "tasks": [
    {
      "name": "TaskWithLongNameInTestWorkflowWithLongTaskName",
      "taskReferenceName": "TaskWithLongNameInTestWorkflowWithLongTaskName",
      "inputParameters": {},
      "type": "SIMPLE",
      "startDelay": 0,
      "optional": false,
      "asyncComplete": false,
      "permissive": false
    }
  ],
  "inputParameters": [],
  "outputParameters": {},
  "schemaVersion": 2,
  "restartable": true,
  "workflowStatusListenerEnabled": false,
  "ownerEmail": "[email protected]",
  "timeoutPolicy": "ALERT_ONLY",
  "timeoutSeconds": 0,
  "variables": {},
  "inputTemplate": {}
}

To Reproduce
Steps to reproduce the behavior:

  1. Use Postgres Index
  2. Create a workflow with a task whose name is longer than 32 characters
  3. Run it and then terminate it (since there is no worker)

Expected behavior
For SIMPLE tasks the value 'SIMPLE' should be inserted into the task_type column, or the task_type column should be extended to length 255 (like task_name).

Additional context
bug initially reported at: Netflix/conductor-community#252

Unit Test Error

I just wanted to document the unit-test problems I encountered here, so that other people who run into them can find this.

I ran the source code on my Windows system and executed the test cases, which reported errors.

Source code version: v3.16.0

This is the full error message:

Multiple Failures (2 failures)
	org.flywaydb.core.api.FlywayException: Unable to obtain inputstream for resource: db/migration_external_postgres/R__initial_schema.sql
	java.lang.NullPointerException: Cannot invoke "com.netflix.conductor.postgres.storage.PostgresPayloadTestUtil.getDataSource()" because "this.testPostgres" is null
org.gradle.internal.exceptions.DefaultMultiCauseException: Multiple Failures (2 failures)
	org.flywaydb.core.api.FlywayException: Unable to obtain inputstream for resource: db/migration_external_postgres/R__initial_schema.sql
	java.lang.NullPointerException: Cannot invoke "com.netflix.conductor.postgres.storage.PostgresPayloadTestUtil.getDataSource()" because "this.testPostgres" is null
	at app//org.junit.vintage.engine.execution.TestRun.getStoredResultOrSuccessful(TestRun.java:200)
	at app//org.junit.vintage.engine.execution.RunListenerAdapter.fireExecutionFinished(RunListenerAdapter.java:248)
	at app//org.junit.vintage.engine.execution.RunListenerAdapter.testFinished(RunListenerAdapter.java:214)
	at app//org.junit.vintage.engine.execution.RunListenerAdapter.testFinished(RunListenerAdapter.java:88)
	at app//org.junit.runner.notification.SynchronizedRunListener.testFinished(SynchronizedRunListener.java:87)
	at app//org.junit.runner.notification.RunNotifier$9.notifyListener(RunNotifier.java:225)
	at app//org.junit.runner.notification.RunNotifier$SafeNotifier.run(RunNotifier.java:72)
	at app//org.junit.runner.notification.RunNotifier.fireTestFinished(RunNotifier.java:222)
	at app//org.junit.internal.runners.model.EachTestNotifier.fireTestFinished(EachTestNotifier.java:38)
	at app//org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:372)
	at app//org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:252)
	at app//org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:97)
	at app//org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
	at app//org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
	at app//org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
	at app//org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
	at app//org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
	at app//org.springframework.test.context.junit4.statements.RunBeforeTestClassCallbacks.evaluate(RunBeforeTestClassCallbacks.java:61)
	at app//org.springframework.test.context.junit4.statements.RunAfterTestClassCallbacks.evaluate(RunAfterTestClassCallbacks.java:70)
	at app//org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
	at app//org.junit.runners.ParentRunner.run(ParentRunner.java:413)
	at app//org.springframework.test.context.junit4.SpringJUnit4ClassRunner.run(SpringJUnit4ClassRunner.java:191)
	at app//org.junit.runner.JUnitCore.run(JUnitCore.java:137)
	at app//org.junit.runner.JUnitCore.run(JUnitCore.java:115)
	at app//org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42)
	at app//org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80)
	at app//org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:107)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52)
	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:114)
	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:86)
	at org.junit.platform.launcher.core.DefaultLauncherSession$DelegatingLauncher.execute(DefaultLauncherSession.java:86)
	at org.junit.platform.launcher.core.SessionPerRequestLauncher.execute(SessionPerRequestLauncher.java:53)
	at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.processAllTestClasses(JUnitPlatformTestClassProcessor.java:99)
	at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.access$000(JUnitPlatformTestClassProcessor.java:79)
	at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor.stop(JUnitPlatformTestClassProcessor.java:75)
	at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.stop(SuiteTestClassProcessor.java:61)
	at [email protected]/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at [email protected]/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at [email protected]/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at [email protected]/java.lang.reflect.Method.invoke(Method.java:568)
	at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
	at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
	at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
	at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
	at jdk.proxy1/jdk.proxy1.$Proxy2.stop(Unknown Source)
	at org.gradle.api.internal.tasks.testing.worker.TestWorker$3.run(TestWorker.java:193)
	at org.gradle.api.internal.tasks.testing.worker.TestWorker.executeAndMaintainThreadName(TestWorker.java:129)
	at org.gradle.api.internal.tasks.testing.worker.TestWorker.execute(TestWorker.java:100)
	at org.gradle.api.internal.tasks.testing.worker.TestWorker.execute(TestWorker.java:60)
	at org.gradle.process.internal.worker.child.ActionExecutionWorker.execute(ActionExecutionWorker.java:56)
	at org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:133)
	at org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:71)
	at app//worker.org.gradle.process.internal.worker.GradleWorkerMain.run(GradleWorkerMain.java:69)
	at app//worker.org.gradle.process.internal.worker.GradleWorkerMain.main(GradleWorkerMain.java:74)
	Suppressed: org.flywaydb.core.api.FlywayException: Unable to obtain inputstream for resource: db/migration_external_postgres/R__initial_schema.sql
		at app//org.flywaydb.core.internal.resource.classpath.ClassPathResource.read(ClassPathResource.java:118)
		at app//org.flywaydb.core.internal.resolver.sql.SqlMigrationResolver$1.read(SqlMigrationResolver.java:85)
		at app//org.flywaydb.core.internal.resolver.ChecksumCalculator.calculateChecksumForResource(ChecksumCalculator.java:64)
		at app//org.flywaydb.core.internal.resolver.ChecksumCalculator.calculate(ChecksumCalculator.java:43)
		at app//org.flywaydb.core.internal.resolver.sql.SqlMigrationResolver.getChecksumForLoadableResource(SqlMigrationResolver.java:110)
		at app//org.flywaydb.core.internal.resolver.sql.SqlMigrationResolver.addMigrations(SqlMigrationResolver.java:150)
		at app//org.flywaydb.core.internal.resolver.sql.SqlMigrationResolver.resolveMigrations(SqlMigrationResolver.java:72)
		at app//org.flywaydb.core.internal.resolver.sql.SqlMigrationResolver.resolveMigrations(SqlMigrationResolver.java:49)
		at app//org.flywaydb.core.internal.resolver.CompositeMigrationResolver.lambda$collectMigrations$0(CompositeMigrationResolver.java:127)
		at [email protected]/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:273)
		at [email protected]/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625)
		at [email protected]/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
		at [email protected]/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
		at [email protected]/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
		at [email protected]/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
		at [email protected]/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682)
		at app//org.flywaydb.core.internal.resolver.CompositeMigrationResolver.collectMigrations(CompositeMigrationResolver.java:128)
		at app//org.flywaydb.core.internal.resolver.CompositeMigrationResolver.doFindAvailableMigrations(CompositeMigrationResolver.java:117)
		at app//org.flywaydb.core.internal.resolver.CompositeMigrationResolver.resolveMigrations(CompositeMigrationResolver.java:107)
		at app//org.flywaydb.core.internal.resolver.CompositeMigrationResolver.resolveMigrations(CompositeMigrationResolver.java:113)
		at app//org.flywaydb.core.internal.command.DbValidate.validate(DbValidate.java:86)
		at app//org.flywaydb.core.Flyway.doValidate(Flyway.java:402)
		at app//org.flywaydb.core.Flyway.lambda$migrate$0(Flyway.java:144)
		at app//org.flywaydb.core.FlywayExecutor.execute(FlywayExecutor.java:196)
		at app//org.flywaydb.core.Flyway.migrate(Flyway.java:140)
		at app//com.netflix.conductor.postgres.storage.PostgresPayloadTestUtil.flywayMigrate(PostgresPayloadTestUtil.java:64)
		at app//com.netflix.conductor.postgres.storage.PostgresPayloadTestUtil.<init>(PostgresPayloadTestUtil.java:40)
		at app//com.netflix.conductor.postgres.storage.PostgresPayloadStorageTest.setup(PostgresPayloadStorageTest.java:73)
		at [email protected]/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
		at [email protected]/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
		at [email protected]/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
		at [email protected]/java.lang.reflect.Method.invoke(Method.java:568)
		at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
		at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
		at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
		at app//org.junit.internal.runners.statements.RunBefores.invokeMethod(RunBefores.java:33)
		at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
		at app//org.springframework.test.context.junit4.statements.RunBeforeTestMethodCallbacks.evaluate(RunBeforeTestMethodCallbacks.java:75)
		at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
		at app//org.springframework.test.context.junit4.statements.RunAfterTestMethodCallbacks.evaluate(RunAfterTestMethodCallbacks.java:86)
		at app//org.springframework.test.context.junit4.statements.SpringRepeat.evaluate(SpringRepeat.java:84)
		at app//org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
		... 48 more
	Suppressed: java.lang.NullPointerException: Cannot invoke "com.netflix.conductor.postgres.storage.PostgresPayloadTestUtil.getDataSource()" because "this.testPostgres" is null
		at com.netflix.conductor.postgres.storage.PostgresPayloadStorageTest.teardown(PostgresPayloadStorageTest.java:298)
		at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
		at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
		at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
		at java.base/java.lang.reflect.Method.invoke(Method.java:568)
		at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
		at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
		at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
		at org.junit.internal.runners.statements.RunAfters.invokeMethod(RunAfters.java:46)
		at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33)
		at org.springframework.test.context.junit4.statements.RunAfterTestMethodCallbacks.evaluate(RunAfterTestMethodCallbacks.java:86)
		at org.springframework.test.context.junit4.statements.SpringRepeat.evaluate(SpringRepeat.java:84)
		at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
		... 48 more


Error when using the image conductoross/conductor:3.15.0

Describe the bug
With the following docker-compose:

services:
  conductor:
    container_name: conductor
    image: reg.1u1.it/dockerhub/conductoross/conductor:3.15.0
    ports:
      - 8080:8080
      - 5000:5000
    networks:
      - conductor
    volumes:
      - ./server/config.properties:/app/config/config.properties
    healthcheck:
      test: [ "CMD", "curl","-I" ,"-XGET", "http://localhost:8080/health" ]
      interval: 30s
      timeout: 10s
      retries: 12
    depends_on:
      elastic:
        condition: service_healthy

  elastic:
    image: reg.1u1.it/dockerhub/elasticsearch:7.17.16
    container_name: elastic
    environment:
      - cluster.name=conductor-elastic-cluster
      - node.name=conductor-elastic-node-1
      - xpack.security.enabled=false
      - discovery.type=single-node
      - transport.host=0.0.0.0
      - "ES_JAVA_OPTS=-Xms512m -Xmx1024m"
      - bootstrap.memory_lock=true
    networks:
      - conductor
    ports:
      - 9200:9200
      - 9300:9300
    healthcheck:
      test: [ "CMD", "curl", "-I", "-XGET", "http://localhost:9200"]
      interval: 5s
      timeout: 5s
      retries: 12
    logging:
      driver: "json-file"
      options:
        max-size: "1k"
        max-file: "3"


networks:
  conductor:

And the following configuration file:

conductor.grpc-server.enabled=false

conductor.indexing.enabled=true
conductor.elasticsearch.version=7
conductor.elasticsearch.url=http://elastic:9200
conductor.elasticsearch.indexName=conductor
conductor.elasticsearch.indexReplicasCount=0

conductor.app.ownerEmailMandatory=false

I get the following error:

conductor  | 2024-01-04 10:44:23,338 ERROR [main] com.netflix.conductor.redis.dao.RedisMetadataDAO: refresh TaskDefs failed

This works with my own image built from conductor-community; am I missing something?

Expected behavior
I would like to be able to use this Docker image instead of having to build one myself.

Support for keepLastN in DO_WHILE Task

Discussed in https://github.com/orgs/conductor-oss/discussions/134

Originally posted by sriharimalagi April 16, 2024
Problem statement:

  • We are running DO_WHILE tasks for a large number of iterations (>1000). When each DO_WHILE iteration completes, its output is stored in Task.outputData.
  • Since the result of each iteration is appended, the Task Output Max Threshold limit is breached, causing the workflow to be terminated.
  • We would like a solution such that the output of previously executed iterations can be discarded.

Analysis:

  • Upon reviewing the DO_WHILE task documentation, I have identified a property named keepLastN which is currently not supported in the open source code.
  • Leveraging this logic for our DO_WHILE task would be beneficial. I am prepared to initiate a pull request for its implementation.
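For reference, a hypothetical DO_WHILE task definition using keepLastN as described in the referenced documentation might look like the sketch below. The keepLastN input parameter is the proposed feature — it is not supported by the OSS server at the time of this report — and the task names and loop condition are illustrative:

```json
{
  "name": "batch_loop",
  "taskReferenceName": "batch_loop_ref",
  "type": "DO_WHILE",
  "loopCondition": "$.batch_loop_ref['iteration'] < 1000",
  "inputParameters": {
    "keepLastN": 2
  },
  "loopOver": [
    {
      "name": "process_item",
      "taskReferenceName": "process_item_ref",
      "type": "SIMPLE"
    }
  ]
}
```

The intent is that only the last two iterations' outputs are retained in Task.outputData, keeping it below the output size threshold.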

Validation errors in tasks with SWITCH type containing JS expressions after upgrading to 3.16.0

Describe the bug
When loading a workflow definition with SWITCH tasks containing JS expressions, I get a validation error.
Also, when the workflow definition is loaded, the task has no inputParameters, which can lead to erroneous execution of eval():
Object returnValue = ScriptEvaluator.eval(expression, inputParameters);

Details
Conductor version: 3.16.0
Persistence implementation: Redis
Queue implementation: Dynoqueues
Lock: Redis
Task definition: SWITCH

To Reproduce
Create a workflow definition with a SWITCH type task with an expression:

function check(){
    var inputResult = $.inputResult;
    var result = "OK";
    if (!inputResult || !inputResult.serviceResult || inputResult.serviceResult.resultType === "ERROR") {
        result = "ERROR";
    }
    if (inputResult.serviceResult.resultCode === "-1001") {
        result = "NOT_FOUND";
    }
    return result;
}

check()

Get the logs:
ConstraintViolationImpl{interpolatedMessage='Expression is not well formatted: TypeError: Cannot read property "resultCode" from undefined in at line number 10, taskType: SWITCH taskName SWITCH_TASK', propertyPath=tasks[12], rootBeanClass=class com.netflix.conductor.common.metadata.workflow.WorkflowDef, messageTemplate='Expression is not well formatted: TypeError: Cannot read property "resultCode" from undefined in at line number 10, taskType: SWITCH taskName SWITCH_TASK'}

Expected behavior
Before the upgrade, these tasks with expressions worked properly and eval() returned the expected result

Additional context
changes in validation with version upgrade:
Netflix/conductor#3805
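As a workaround (not a fix for the validator change itself), the expression can be made null-safe so that it also evaluates cleanly against the empty input the 3.16.0 validator appears to use. A hedged sketch, with the expression wrapped as a plain function so it can be exercised outside Conductor — the `input` argument stands in for Conductor's `$` binding:

```javascript
// Null-safe rewrite of the reporter's expression (a sketch, not the
// validator fix). `input` stands in for Conductor's `$` binding, which
// the validator appears to evaluate with no inputParameters.
function check(input) {
    var inputResult = input && input.inputResult;
    var serviceResult = inputResult && inputResult.serviceResult;
    var result = "OK";
    if (!serviceResult || serviceResult.resultType === "ERROR") {
        result = "ERROR";
    }
    if (serviceResult && serviceResult.resultCode === "-1001") {
        result = "NOT_FOUND";
    }
    return result;
}
```

Because every property access is guarded, evaluating the expression with an empty or missing input yields "ERROR" instead of throwing the TypeError seen in the log.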

Server running in container failing with exception encountered during context initialization

Describe the bug
We're building everything using the Dockerfile from docker/server; however, the built app fails on start-up with Exception encountered during context initialization. The errors relate to not being able to resolve certain dependencies:

 Unsatisfied dependency expressed through constructor parameter 0: Error creating bean with name 'workflowExecutor' defined in URL [jar:file:/app/libs/conductor-server.jar!/BOOT-INF/lib/conductor-core-3.10.7.jar!/com/netflix/conductor/core/execution/WorkflowExecutor.class]:
 Unsatisfied dependency expressed through constructor parameter 0: Error creating bean with name 'deciderService' defined in URL [jar:file:/app/libs/conductor-server.jar!/BOOT-INF/lib/conductor-core-3.10.7.jar!/com/netflix/conductor/core/execution/DeciderService.class]:
 Unsatisfied dependency expressed through constructor parameter 4: Error creating bean with name 'systemTaskRegistry' defined in URL [jar:file:/app/libs/conductor-server.jar!/BOOT-INF/lib/conductor-core-3.10.7.jar!/com/netflix/conductor/core/execution/tasks/SystemTaskRegistry.class]:
 Unsatisfied dependency expressed through constructor parameter 0: Error creating bean with name 'START_WORKFLOW': Resolution of declared constructors on bean Class [com.netflix.conductor.core.execution.tasks.StartWorkflow] from ClassLoader [org.springframework.boot.loader.LaunchedURLClassLoader@b81eda8] failed

I've attached a full debug log for reference as well: error.txt

Details
Conductor version: Snapshot (main at fec3116)
Persistence implementation: Postgres
Queue implementation: Postgres, Dynoqueues, SQS

To Reproduce
Steps to reproduce the behaviour:

  1. Build server component from main at fec3116 using the Dockerfile from docker/server
  2. Run the server using Docker image

Expected behaviour
Server starts without erroring.

Additional context
I think I've identified the issue. The exception stack trace shows that conductor-server is referencing version 3.10.7 of conductor-core, even though the server's build.gradle references core as a project dependency. Narrowing this down further, the reference issue appears to come from the introduction of the orkes-conductor-queues package in 5a0dcc1; reverting that commit resolves the problem and makes the references correctly use the core project and produce a snapshot JAR. I've reverted this commit in our fork and it works as expected.
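A possible workaround short of reverting the commit would be a Gradle dependency substitution in the server module, forcing the in-repo project over the transitive artifact. This is an illustrative sketch only — the module path ':conductor-core' is assumed and it is untested against this build:

```groovy
// Hypothetical workaround in server/build.gradle: substitute the
// transitive com.netflix.conductor:conductor-core (resolved at 3.10.7
// via orkes-conductor-queues) with the in-repo core project.
configurations.all {
    resolutionStrategy.dependencySubstitution {
        substitute module('com.netflix.conductor:conductor-core') using project(':conductor-core')
    }
}
```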

Failure workflow is not triggered when the workflow is terminated (with or without a reason)

Describe the bug
I have a workflow qn_test1 with 2 tasks and failureWorkflow set to ANY_WORKFLOW (any name works for the test).
If one of the tasks in qn_test1 fails, ANY_WORKFLOW is triggered successfully.
But when I terminate the workflow with a curl request to the Conductor server (DELETE api/workflow/{workflowId}), ANY_WORKFLOW is not triggered.

UPD:
I also tried the Python Conductor SDK
https://github.com/conductor-sdk/conductor-python/blob/main/src/conductor/client/workflow/executor/workflow_executor.py#L155
with trigger_failure_workflow=true, but it looks like the Conductor server does not support this parameter.

Details
Conductor version: 3.13.8
Persistence implementation: MySQL
Queue implementation: MySQL
Lock: Redis
Workflow definition:

"workflowDefinition": {
"createTime": 1703003690022,
"updateTime": 1703003859294,
"accessPolicy": {},
"name": "qn_test1",
"description": "Edit or extend this sample workflow. Set the workflow name to get started",
"version": 4,
"tasks": [
{
"name": "get_population_data",
"taskReferenceName": "get_population_data",
"inputParameters": {
    "http_request": {
    "uri": "[https://datausa.io/api/data?drilldowns=Nation&measures=Population"](https://datausa.io/api/data?drilldowns=Nation&measures=Population%22),
    "method": "GET"
    },
    "asyncComplete": false
},
"type": "HTTP",
"startDelay": 110,
"optional": false,
"taskDefinition": {
    "accessPolicy": {},
    "retryCount": 10,
    "timeoutSeconds": 10,
    "inputKeys": [],
    "outputKeys": [],
    "timeoutPolicy": "TIME_OUT_WF",
    "retryLogic": "FIXED",
    "retryDelaySeconds": 60,
    "responseTimeoutSeconds": 3600,
    "inputTemplate": {},
    "rateLimitPerFrequency": 0,
    "rateLimitFrequencyInSeconds": 1,
    "backoffScaleFactor": 1
},
"asyncComplete": false
},
{
"name": "wrong_get_population_data",
"taskReferenceName": "wrong_get_population_data",
"inputParameters": {
    "http_request": {
    "uri": "https://WRONG",
    "method": "GET"
    }
},
"type": "HTTP",
"startDelay": 0,
"optional": false,
"asyncComplete": false
}
],
"inputParameters": [],
"outputParameters": {
"data": "${wrong_get_population_data.output.response.body.data}",
"source": "${wrong_get_population_data.output.response.body.source}"
},
"failureWorkflow": "ANY_WORKFLOW",
"schemaVersion": 2,
"restartable": true,
"workflowStatusListenerEnabled": false,
"ownerEmail": "[email protected]",
"timeoutPolicy": "ALERT_ONLY",
"timeoutSeconds": 0,
"variables": {},
"inputTemplate": {}
},
"priority": 0,
"variables": {},
"lastRetriedTime": 0,
"failedTaskNames": [],
"startTime": 1703003876792,
"workflowName": "qn_test1",
"workflowVersion": 4
}

Http Task is scheduled but never run

Describe the bug
I created a simple workflow with only one HTTP task; when the workflow is started, the HTTP task is scheduled but never runs until it times out.

Details
Conductor version: the latest release 3.18.
Persistence implementation: Postgres and Redis (both tested, same issue)
Queue implementation: Postgres, MySQL, Dynoqueues etc
Lock: Redis


[FEATURE]: Revert OSS default to dyno queues to avoid Apache 2.0 license conflict with Orkes queues.

This has been discussed on https://github.com/orgs/conductor-oss/discussions/108 and directly with Orkes. After the discussion we have concluded that commercial use of the default Orkes queues in Conductor is not allowed without a license agreement with Orkes, which conflicts with Conductor's overall Apache 2.0 license.

The issue is similar to what was brought up earlier with licensing on Netflix/conductor#2091

Describe the Feature Request

The current use of Orkes queues requires licensing for commercial use and isn't Apache 2.0 compliant.
Prior to Orkes queues, the default implementation was Netflix dyno-queues, which is Apache 2.0 licensed.

Describe Preferred Solution

Revert the default queue implementation to dyno-queues, as it existed prior to Orkes, in the OSS version of Conductor, so that all of Conductor is Apache 2.0 licensed.

Describe Alternatives

Update license of Orkes queues to Apache 2.0
