Comments (30)

leepc12 commented on June 27, 2024

Thanks for the tar ball. This is very helpful.

You already posted the following error line, but I need to look at the full log. Please post it here.

[2018-09-26 13:51:45,54] [error] WorkflowManagerActor Workflow 59fb6fa8-c5bc-4928-9d8a-6a8fea701b24 failed (during ExecutingWorkflowState): Job atac.trim_adapter:0:1 exited with return code 1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details.
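
For reference, Cromwell's shared-filesystem backends write each call's exit status and error output next to the generated script, so the failing shard can be inspected directly (paths follow the workflow ID in the error above):

$ cd cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-trim_adapter/shard-0/execution
$ cat rc       # the return code Cromwell compared against continueOnReturnCode
$ cat stderr   # the task's error output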

shanmukhasampath commented on June 27, 2024

Hi Jin,

I am pasting the full log here. I have replaced the pipeline directory with $ATAC throughout the log.

$ java -jar -Dconfig.file=backends/backend.conf -Dbackend.default=sge_singularity cromwell-34.jar run atac.wdl -i ${INPUT} -o workflow_opts/sge.json
[2018-09-26 13:48:23,74] [info] Running with database db.url = jdbc:hsqldb:mem:22e778ed-f8a1-48f3-83b2-f669ecc7562f;shutdown=false;hsqldb.tx=mvcc
[2018-09-26 13:49:07,44] [info] Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
[2018-09-26 13:49:07,47] [info] [RenameWorkflowOptionsInMetadata] 100%
[2018-09-26 13:49:08,02] [info] Running with database db.url = jdbc:hsqldb:mem:286d3165-ca84-45c7-99f6-49253455e369;shutdown=false;hsqldb.tx=mvcc
[2018-09-26 13:49:09,72] [warn] This actor factory is deprecated. Please use cromwell.backend.google.pipelines.v1alpha2.PipelinesApiLifecycleActorFactory for PAPI v1 or cromwell.backend.google.pipelines.v2alpha1.PipelinesApiLifecycleActorFactory for PAPI v2
[2018-09-26 13:49:09,73] [warn] Couldn't find a suitable DSN, defaulting to a Noop one.
[2018-09-26 13:49:09,74] [info] Using noop to send events.
[2018-09-26 13:49:11,56] [info] Slf4jLogger started
[2018-09-26 13:49:13,71] [info] Workflow heartbeat configuration:
{
  "cromwellId" : "cromid-f62a6b0",
  "heartbeatInterval" : "2 minutes",
  "ttl" : "10 minutes",
  "writeBatchSize" : 10000,
  "writeThreshold" : 10000
}
[2018-09-26 13:49:14,23] [info] Metadata summary refreshing every 2 seconds.
[2018-09-26 13:49:14,69] [info] WriteMetadataActor configured to flush with batch size 200 and process rate 5 seconds.
[2018-09-26 13:49:15,04] [info] KvWriteActor configured to flush with batch size 200 and process rate 5 seconds.
[2018-09-26 13:49:16,42] [info] CallCacheWriteActor configured to flush with batch size 100 and process rate 3 seconds.
[2018-09-26 13:49:21,38] [info] JobExecutionTokenDispenser - Distribution rate: 50 per 1 seconds.
[2018-09-26 13:49:21,53] [info] SingleWorkflowRunnerActor: Version 34
[2018-09-26 13:49:21,64] [info] SingleWorkflowRunnerActor: Submitting workflow
[2018-09-26 13:49:21,91] [info] PAPIQueryManager Running with 3 workers
[2018-09-26 13:49:21,91] [info] JES batch polling interval is 33333 milliseconds
[2018-09-26 13:49:21,94] [info] JES batch polling interval is 33333 milliseconds
[2018-09-26 13:49:21,97] [info] JES batch polling interval is 33333 milliseconds
[2018-09-26 13:49:22,07] [info] Unspecified type (Unspecified version) workflow 59fb6fa8-c5bc-4928-9d8a-6a8fea701b24 submitted
[2018-09-26 13:49:22,45] [info] SingleWorkflowRunnerActor: Workflow submitted 59fb6fa8-c5bc-4928-9d8a-6a8fea701b24
[2018-09-26 13:49:22,47] [warn] SingleWorkflowRunnerActor: received unexpected message: Done in state RunningSwraData
[2018-09-26 13:49:22,55] [info] 1 new workflows fetched
[2018-09-26 13:49:22,56] [info] WorkflowManagerActor Starting workflow 59fb6fa8-c5bc-4928-9d8a-6a8fea701b24
[2018-09-26 13:49:22,56] [info] WorkflowManagerActor Successfully started WorkflowActor-59fb6fa8-c5bc-4928-9d8a-6a8fea701b24
[2018-09-26 13:49:22,56] [info] Retrieved 1 workflows from the WorkflowStoreActor
[2018-09-26 13:49:22,94] [info] WorkflowStoreHeartbeatWriteActor configured to flush with batch size 10000 and process rate 2 minutes.
[2018-09-26 13:49:23,35] [info] MaterializeWorkflowDescriptorActor [59fb6fa8]: Parsing workflow as WDL draft-2
[2018-09-26 13:49:30,39] [info] Message [cromwell.docker.DockerHashActor$DockerHashSuccessResponse] from Actor[akka://cromwell-system/user/HealthMonitorDockerHashActor#187183141] to Actor[akka://cromwell-system/deadLetters] was not delivered. [1] dead letters encountered, no more dead letters will be logged. If this is not an expected behavior, then [Actor[akka://cromwell-system/deadLetters]] may have terminated unexpectedly, This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2018-09-26 13:50:31,28] [info] MaterializeWorkflowDescriptorActor [59fb6fa8]: Call-to-Backend assignments: atac.idr -> sge_singularity, atac.pool_ta_pr2 -> sge_singularity, atac.macs2_pr2 -> sge_singularity, atac.idr_pr -> sge_singularity, atac.xcor -> sge_singularity, atac.overlap_ppr -> sge_singularity, atac.ataqc -> sge_singularity, atac.idr_ppr -> sge_singularity, atac.read_genome_tsv -> sge_singularity, atac.macs2_ppr2 -> sge_singularity, atac.macs2_ppr1 -> sge_singularity, atac.bowtie2 -> sge_singularity, atac.overlap -> sge_singularity, atac.overlap_pr -> sge_singularity, atac.macs2 -> sge_singularity, atac.macs2_pr1 -> sge_singularity, atac.trim_adapter -> sge_singularity, atac.bam2ta -> sge_singularity, atac.qc_report -> sge_singularity, atac.macs2_pooled -> sge_singularity, atac.filter -> sge_singularity, atac.spr -> sge_singularity, atac.reproducibility_idr -> sge_singularity, atac.reproducibility_overlap -> sge_singularity, atac.pool_ta_pr1 -> sge_singularity, atac.pool_ta -> sge_singularity
[2018-09-26 13:50:32,85] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:32,91] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:32,94] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:32,96] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:32,99] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,00] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,02] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,03] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,04] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,05] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,06] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,07] [warn] sge_singularity [59fb6fa8]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,08] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,10] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,11] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,12] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,15] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,20] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,23] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,24] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,25] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,26] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,28] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,29] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,31] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:33,34] [warn] sge_singularity [59fb6fa8]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-09-26 13:50:35,83] [info] WorkflowExecutionActor-59fb6fa8-c5bc-4928-9d8a-6a8fea701b24 [59fb6fa8]: Starting atac.read_genome_tsv
[2018-09-26 13:50:35,83] [info] WorkflowExecutionActor-59fb6fa8-c5bc-4928-9d8a-6a8fea701b24 [59fb6fa8]: Condition met: 'enable_idr'. Running conditional section
[2018-09-26 13:50:35,83] [info] WorkflowExecutionActor-59fb6fa8-c5bc-4928-9d8a-6a8fea701b24 [59fb6fa8]: Condition met: '!align_only && !true_rep_only'. Running conditional section
[2018-09-26 13:50:35,83] [info] WorkflowExecutionActor-59fb6fa8-c5bc-4928-9d8a-6a8fea701b24 [59fb6fa8]: Condition met: '!align_only && !true_rep_only && enable_idr'. Running conditional section
[2018-09-26 13:50:35,83] [info] WorkflowExecutionActor-59fb6fa8-c5bc-4928-9d8a-6a8fea701b24 [59fb6fa8]: Condition met: '!disable_xcor'. Running conditional section
[2018-09-26 13:50:35,85] [info] WorkflowExecutionActor-59fb6fa8-c5bc-4928-9d8a-6a8fea701b24 [59fb6fa8]: Condition met: 'enable_idr'. Running conditional section
[2018-09-26 13:50:35,85] [info] WorkflowExecutionActor-59fb6fa8-c5bc-4928-9d8a-6a8fea701b24 [59fb6fa8]: Condition met: '!true_rep_only'. Running conditional section
[2018-09-26 13:50:37,83] [warn] DispatchedConfigAsyncJobExecutionActor [59fb6fa8atac.read_genome_tsv:NA:1]: Unrecognized runtime attribute keys: disks
[2018-09-26 13:50:38,99] [info] DispatchedConfigAsyncJobExecutionActor [59fb6fa8atac.read_genome_tsv:NA:1]: cat $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-read_genome_tsv/inputs/-1470820443/hg38_local.tsv
[2018-09-26 13:50:39,21] [info] DispatchedConfigAsyncJobExecutionActor [59fb6fa8atac.read_genome_tsv:NA:1]: executing: echo "chmod u+x $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-read_genome_tsv/execution/script && SINGULARITY_BINDPATH=$(echo $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-read_genome_tsv | sed 's/cromwell-executions/\n/g' | head -n1) singularity  exec     ~/.singularity/atac-seq-pipeline-v1.1.simg $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-read_genome_tsv/execution/script" | qsub \
-terse \
-b n \
-N cromwell_59fb6fa8_read_genome_tsv \
-wd $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-read_genome_tsv \
-o $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-read_genome_tsv/execution/stdout \
-e $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-read_genome_tsv/execution/stderr \
  \
-l h_vmem=4000m \
-l s_vmem=4000m \
-l h_rt=3600 \
-l s_rt=3600 \
-q all.q \
 \
 \
-V
[2018-09-26 13:50:40,27] [info] DispatchedConfigAsyncJobExecutionActor [59fb6fa8atac.read_genome_tsv:NA:1]: job id: 3119987
[2018-09-26 13:50:40,29] [info] DispatchedConfigAsyncJobExecutionActor [59fb6fa8atac.read_genome_tsv:NA:1]: Status change from - to WaitingForReturnCodeFile
[2018-09-26 13:50:40,94] [info] WorkflowExecutionActor-59fb6fa8-c5bc-4928-9d8a-6a8fea701b24 [59fb6fa8]: Starting atac.trim_adapter (2 shards)
[2018-09-26 13:50:41,48] [warn] DispatchedConfigAsyncJobExecutionActor [59fb6fa8atac.trim_adapter:1:1]: Unrecognized runtime attribute keys: disks
[2018-09-26 13:50:41,53] [warn] DispatchedConfigAsyncJobExecutionActor [59fb6fa8atac.trim_adapter:0:1]: Unrecognized runtime attribute keys: disks
[2018-09-26 13:50:42,39] [info] DispatchedConfigAsyncJobExecutionActor [59fb6fa8atac.trim_adapter:0:1]: python $(which encode_trim_adapter.py) \
        $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-trim_adapter/shard-0/execution/write_tsv_132e1c068cf02fac1db5910a0b5624d2.tmp \
        --adapters $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-trim_adapter/shard-0/execution/write_tsv_d41d8cd98f00b204e9800998ecf8427e.tmp \
        --paired-end \
        --auto-detect-adapter \
        --min-trim-len 5 \
        --err-rate 0.1 \
        --nth 1
[2018-09-26 13:50:42,48] [info] DispatchedConfigAsyncJobExecutionActor [59fb6fa8atac.trim_adapter:0:1]: executing: echo "chmod u+x $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-trim_adapter/shard-0/execution/script && SINGULARITY_BINDPATH=$(echo $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-trim_adapter/shard-0 | sed 's/cromwell-executions/\n/g' | head -n1) singularity  exec     ~/.singularity/atac-seq-pipeline-v1.1.simg $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-trim_adapter/shard-0/execution/script" | qsub \
-terse \
-b n \
-N cromwell_59fb6fa8_trim_adapter \
-wd $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-trim_adapter/shard-0 \
-o $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-trim_adapter/shard-0/execution/stdout \
-e $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-trim_adapter/shard-0/execution/stderr \
  \
-l h_vmem=12000m \
-l s_vmem=12000m \
-l h_rt=86400 \
-l s_rt=86400 \
-q all.q \
 \
 \
-V
[2018-09-26 13:50:42,67] [info] DispatchedConfigAsyncJobExecutionActor [59fb6fa8atac.trim_adapter:1:1]: python $(which encode_trim_adapter.py) \
        $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-trim_adapter/shard-1/execution/write_tsv_3b2c67e7474625f086fb2cfb4d2ab98e.tmp \
        --adapters $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-trim_adapter/shard-1/execution/write_tsv_d41d8cd98f00b204e9800998ecf8427e.tmp \
        --paired-end \
        --auto-detect-adapter \
        --min-trim-len 5 \
        --err-rate 0.1 \
        --nth 1
[2018-09-26 13:50:42,73] [info] DispatchedConfigAsyncJobExecutionActor [59fb6fa8atac.trim_adapter:1:1]: executing: echo "chmod u+x $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-trim_adapter/shard-1/execution/script && SINGULARITY_BINDPATH=$(echo $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-trim_adapter/shard-1 | sed 's/cromwell-executions/\n/g' | head -n1) singularity  exec     ~/.singularity/atac-seq-pipeline-v1.1.simg $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-trim_adapter/shard-1/execution/script" | qsub \
-terse \
-b n \
-N cromwell_59fb6fa8_trim_adapter \
-wd $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-trim_adapter/shard-1 \
-o $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-trim_adapter/shard-1/execution/stdout \
-e $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-trim_adapter/shard-1/execution/stderr \
  \
-l h_vmem=12000m \
-l s_vmem=12000m \
-l h_rt=86400 \
-l s_rt=86400 \
-q all.q \
 \
 \
-V
[2018-09-26 13:50:45,30] [info] DispatchedConfigAsyncJobExecutionActor [59fb6fa8atac.trim_adapter:0:1]: job id: 3119988
[2018-09-26 13:50:45,33] [info] DispatchedConfigAsyncJobExecutionActor [59fb6fa8atac.trim_adapter:1:1]: job id: 3119989
[2018-09-26 13:50:45,36] [info] DispatchedConfigAsyncJobExecutionActor [59fb6fa8atac.trim_adapter:1:1]: Status change from - to WaitingForReturnCodeFile
[2018-09-26 13:50:45,37] [info] DispatchedConfigAsyncJobExecutionActor [59fb6fa8atac.trim_adapter:0:1]: Status change from - to WaitingForReturnCodeFile
[2018-09-26 13:51:18,39] [info] DispatchedConfigAsyncJobExecutionActor [59fb6fa8atac.trim_adapter:0:1]: Status change from WaitingForReturnCodeFile to Done
[2018-09-26 13:51:23,02] [info] DispatchedConfigAsyncJobExecutionActor [59fb6fa8atac.trim_adapter:1:1]: Status change from WaitingForReturnCodeFile to Done
[2018-09-26 13:51:44,58] [info] DispatchedConfigAsyncJobExecutionActor [59fb6fa8atac.read_genome_tsv:NA:1]: Status change from WaitingForReturnCodeFile to Done
[2018-09-26 13:51:45,54] [error] WorkflowManagerActor Workflow 59fb6fa8-c5bc-4928-9d8a-6a8fea701b24 failed (during ExecutingWorkflowState): Job atac.trim_adapter:0:1 exited with return code 1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details.
Check the content of stderr for potential additional information: $ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-trim_adapter/shard-0/execution/stderr.
 Traceback (most recent call last):
  File "/software/atac-seq-pipeline/src/encode_trim_adapter.py", line 269, in <module>
    main()
  File "/software/atac-seq-pipeline/src/encode_trim_adapter.py", line 233, in main
    fastqs = ret_val.get(BIG_INT)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
OSError: [Errno 3] No such process
ln: failed to access 'merge_fastqs_R?_*.fastq.gz': No such file or directory

Job atac.trim_adapter:1:1 exited with return code 1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details.
Check the content of stderr for potential additional information: $HOME/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-trim_adapter/shard-1/execution/stderr.
 Traceback (most recent call last):
  File "/software/atac-seq-pipeline/src/encode_trim_adapter.py", line 269, in <module>
    main()
  File "/software/atac-seq-pipeline/src/encode_trim_adapter.py", line 233, in main
    fastqs = ret_val.get(BIG_INT)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
OSError: [Errno 3] No such process
ln: failed to access 'merge_fastqs_R?_*.fastq.gz': No such file or directory

[2018-09-26 13:51:45,56] [info] WorkflowManagerActor WorkflowActor-59fb6fa8-c5bc-4928-9d8a-6a8fea701b24 is in a terminal state: WorkflowFailedState
[2018-09-26 13:52:11,48] [info] SingleWorkflowRunnerActor workflow finished with status 'Failed'.
[2018-09-26 13:52:14,82] [info] Workflow polling stopped
[2018-09-26 13:52:14,85] [info] Shutting down WorkflowStoreActor - Timeout = 5 seconds
[2018-09-26 13:52:14,85] [info] Shutting down WorkflowLogCopyRouter - Timeout = 5 seconds
[2018-09-26 13:52:14,89] [info] Shutting down JobExecutionTokenDispenser - Timeout = 5 seconds
[2018-09-26 13:52:14,93] [info] Aborting all running workflows.
[2018-09-26 13:52:14,94] [info] WorkflowLogCopyRouter stopped
[2018-09-26 13:52:14,95] [info] JobExecutionTokenDispenser stopped
[2018-09-26 13:52:14,95] [info] Shutting down WorkflowManagerActor - Timeout = 3600 seconds
[2018-09-26 13:52:14,95] [info] WorkflowStoreActor stopped
[2018-09-26 13:52:14,95] [info] WorkflowManagerActor All workflows finished
[2018-09-26 13:52:14,95] [info] WorkflowManagerActor stopped
[2018-09-26 13:52:14,95] [info] Connection pools shut down
[2018-09-26 13:52:14,95] [info] Shutting down SubWorkflowStoreActor - Timeout = 1800 seconds
[2018-09-26 13:52:14,95] [info] Shutting down JobStoreActor - Timeout = 1800 seconds
[2018-09-26 13:52:14,95] [info] Shutting down CallCacheWriteActor - Timeout = 1800 seconds
[2018-09-26 13:52:14,95] [info] Shutting down ServiceRegistryActor - Timeout = 1800 seconds
[2018-09-26 13:52:14,95] [info] Shutting down DockerHashActor - Timeout = 1800 seconds
[2018-09-26 13:52:14,95] [info] Shutting down IoProxy - Timeout = 1800 seconds
[2018-09-26 13:52:14,95] [info] DockerHashActor stopped
[2018-09-26 13:52:14,95] [info] IoProxy stopped
[2018-09-26 13:52:14,95] [info] SubWorkflowStoreActor stopped
[2018-09-26 13:52:14,98] [info] CallCacheWriteActor Shutting down: 0 queued messages to process
[2018-09-26 13:52:14,98] [info] CallCacheWriteActor stopped
[2018-09-26 13:52:14,99] [info] JobStoreActor stopped
[2018-09-26 13:52:15,01] [info] WriteMetadataActor Shutting down: 0 queued messages to process
[2018-09-26 13:52:15,01] [info] KvWriteActor Shutting down: 0 queued messages to process
[2018-09-26 13:52:15,02] [info] ServiceRegistryActor stopped
[2018-09-26 13:52:15,25] [info] Database closed
[2018-09-26 13:52:15,25] [info] Stream materializer shut down
Workflow 59fb6fa8-c5bc-4928-9d8a-6a8fea701b24 transitioned to state Failed
[2018-09-26 13:52:15,96] [info] Automatic shutdown of the async connection
[2018-09-26 13:52:15,96] [info] Gracefully shutdown sentry threads.
[2018-09-26 13:52:16,35] [info] Shutdown finished.
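
As an aside on the submit lines above: the backend derives SINGULARITY_BINDPATH by cutting the call directory at "cromwell-executions", so everything above that directory is bind-mounted into the container. A standalone sketch of that derivation (JOB_DIR is a hypothetical variable for illustration):

$ JOB_DIR=$ATAC/cromwell-executions/atac/59fb6fa8-c5bc-4928-9d8a-6a8fea701b24/call-read_genome_tsv
$ echo "$JOB_DIR" | sed 's/cromwell-executions/\n/g' | head -n1
# prints the prefix before "cromwell-executions", i.e. what singularity exec
# receives as SINGULARITY_BINDPATH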

leepc12 commented on June 27, 2024

Please run this command and post its output.

singularity exec ~/.singularity/atac-seq-pipeline-v1.1.simg python -c "from cutadapt.scripts import cutadapt"
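
If the import fails, it can also help to see which file Python resolves the package from (the container's own site-packages versus something bind-mounted from $HOME); a hypothetical follow-up check:

$ singularity exec ~/.singularity/atac-seq-pipeline-v1.1.simg \
    python -c "import cutadapt; print(cutadapt.__file__)"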

shanmukhasampath commented on June 27, 2024

Hi Jin,

Here is the output of the command:

$ singularity exec ~/.singularity/atac-seq-pipeline-v1.1.simg python -c "from cutadapt.scripts import cutadapt"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named scripts

leepc12 commented on June 27, 2024

Can you remove ~/.singularity/atac-seq-pipeline-v1.1.simg, rebuild it, and then try again?

rm ~/.singularity/atac-seq-pipeline-v1.1.simg

SINGULARITY_PULLFOLDER=~/.singularity singularity pull docker://quay.io/encode-dcc/atac-seq-pipeline:v1.1

singularity exec ~/.singularity/atac-seq-pipeline-v1.1.simg python -c "from cutadapt.scripts import cutadapt"

shanmukhasampath commented on June 27, 2024

Hi Jin,

I have run the commands and pasted the output here.

$ rm ~/.singularity/atac-seq-pipeline-v1.1.simg

The Singularity image was removed and pulled again:

$ SINGULARITY_PULLFOLDER=~/.singularity singularity pull docker://quay.io/encode-dcc/atac-seq-pipeline:v1.1
WARNING: pull for Docker Hub is not guaranteed to produce the
WARNING: same image on repeated pull. Use Singularity Registry
WARNING: (shub://) to pull exactly equivalent images.
Docker image path: quay.io/encode-dcc/atac-seq-pipeline:v1.1
Cache folder set to /home/padmanabs1/.singularity/docker
[2/2] |===================================| 100.0%
Importing: base Singularity environment
Exploding layer: sha256:4f1bb8b6572003ca3526ab23c80c385c5eadd1748e3996e444b66d47ce0ee7af.tar.gz
Exploding layer: sha256:4791a9f808607abc64b0398fb4a88b7e79d44a770ca0c7206cd8b0d12e307e02.tar.gz
Exploding layer: sha256:c7bccbb1d18371fe9f6514c1d93f9c8007b4653a0d22aadbc3ed56c8faf840a6.tar.gz
Exploding layer: sha256:94925a7a8f895b9bfae70aa0e767634353bb2514863af8c6d299749b70102f8d.tar.gz
Exploding layer: sha256:1a776d5f8f21442c36036f32af57c3bca06450718bc89623a42efd63442ab809.tar.gz
Exploding layer: sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4.tar.gz
Exploding layer: sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4.tar.gz
Exploding layer: sha256:5d3cc52509843bdc69ab5a70bdc45949856007e1685b83341682e9a40a36eef8.tar.gz
Exploding layer: sha256:6ebb2849c0a021e608f162c1f36809bb7ac77759ab73b4176fd6749262e4079e.tar.gz
Exploding layer: sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4.tar.gz
Exploding layer: sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4.tar.gz
Exploding layer: sha256:fe2a882c6bdfe04e0ac31f8c1dbd2c4b1006a6d9a3faf60d00ef25ee2a7729b4.tar.gz
Exploding layer: sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4.tar.gz
Exploding layer: sha256:ba0bcb371fcd753931e22fc9ba15557ae74a2d3f24f25e38ed3d17a63e26ec20.tar.gz
Exploding layer: sha256:cfdd02a577fba02ce49dfc4058264d88b818849cd4a73e3a3009ca5b019e15bb.tar.gz
Exploding layer: sha256:1a5898730b3be2f7457662ea511c8edd0f6a2df5e46243fcae4225a664033301.tar.gz
Exploding layer: sha256:73f9b243df164aa6f90a048b32d603b8c94d3a3a95c210b648a8a2f5a634a1c3.tar.gz
Exploding layer: sha256:072ad4b9e63098fc2de3f5a16b79b484f4af1ba7653e5676705b9aac310dab4f.tar.gz
Exploding layer: sha256:b2db447d8cfa1d7046edd914022407d911acae27c92caa83f518c05c20503b53.tar.gz
Exploding layer: sha256:324b13500e02df2ef0ac4ebc56d6c421335ed70a0a17cfd378367f7262395043.tar.gz
Exploding layer: sha256:5b542b9ed3d2b186d787a71d941929d1ecd5fa617b48752f00a741b506964d04.tar.gz
Exploding layer: sha256:89076249beacc6c91b61e4ae0ff62a626334391f8db1fd2e1faddeb7dfdb27c5.tar.gz
Exploding layer: sha256:0171e45c00b1e21b9d00c3af81732ab067d852e00487e96a5b1af788f2cc2d47.tar.gz
Exploding layer: sha256:349f1b635721ac892ce50e869ea65e8054b4e2160533fb168f1752f0cda7ac12.tar.gz
Exploding layer: sha256:7508e9fd821c3320406e41534bd9596fa503813ed5fc084b78b8fb87ec884436.tar.gz
Exploding layer: sha256:aeead9f818e2c4d93d682fff4fcfe94afe9856233e57a67525a171e124ee0a66.tar.gz
Exploding layer: sha256:93e01131d755b153287c6ba670980b94be112768c2fb0f17faed4d1d2bf9fe79.tar.gz
Exploding layer: sha256:cab37989f58a8148de6bfa54a09d539609227628ab89ac8b3515f079dbf1d614.tar.gz
Exploding layer: sha256:3fc67ffe6db9a620c05284da18e1e66d7b70c58524f6618d64379e90ad415c30.tar.gz
Exploding layer: sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4.tar.gz
Exploding layer: sha256:b545a3ffe15dca067d6106883bbfae3b42ffc71be318008a1ea65dfb5ff6c283.tar.gz
Exploding layer: sha256:b277b5c44c466fc4664766431c9b9e051dfd28e0225ee50722d4cd61dbab0eb2.tar.gz
Exploding layer: sha256:c4f737f68b4b16d71b8c7d242aa324ee63063f47e5d6d834b836e4ce2805fa36.tar.gz
Exploding layer: sha256:0607c613a5e161b9c2ac7608fcabe159309df8fdfdad4abe5e60af0311bac88c.tar.gz
Exploding layer: sha256:f3f9d42c4aa277cbc09afaea7e96c81180154c9093aaf0e9b438694353eaccfa.tar.gz
Exploding layer: sha256:950d0e5df6c8164df48d9b29eb35613aca07681cc6366111ff4e4108ed921be6.tar.gz
Exploding layer: sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4.tar.gz
Exploding layer: sha256:3e9ca7e22259d7a92726955cafde059968ef6fb9a1ce01eeabe615dbd8306c65.tar.gz
Exploding layer: sha256:d0897642c5c0c2477db1b363f3d68bd790c527751e0ba107fde5d55b5e3b1408.tar.gz
Exploding layer: sha256:1f733bf9967e004ce7937564ce5c5bdf737db29c12272d4f25dcfa7b0f4d0144.tar.gz
Exploding layer: sha256:582f033e3063bd4a2069d789b9484773a3cb9200c265ee2da9ae9800a07a2fbb.tar.gz
Exploding layer: sha256:1cb4ddd81f6dea8c7f81cd68830d1bfc8dd7cf066cd7a475971d6173174e93bd.tar.gz
Exploding layer: sha256:1f546d71522f57f1170ddb36115d6967f6bbfaa3d0729758b67f22328480b6f6.tar.gz
Exploding layer: sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4.tar.gz
Exploding layer: sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4.tar.gz
Exploding layer: sha256:93e126dd7652973fd5e540dcd1650210c5fc1030d91cb947766f4c389ada9411.tar.gz
WARNING: Building container as an unprivileged user. If you run this container as root
WARNING: it may be missing some functionality.
Building Singularity image...
Singularity container built: /home/padmanabs1/.singularity/atac-seq-pipeline-v1.1.simg
Cleaning up...
Done. Container is at: /home/padmanabs1/.singularity/atac-seq-pipeline-v1.1.simg

After the Singularity container was built, I ran this command:

$ singularity exec ~/.singularity/atac-seq-pipeline-v1.1.simg python -c "from cutadapt.scripts import cutadapt"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named scripts

leepc12 commented on June 27, 2024

Can you run the following too?

python -c "from cutadapt.scripts import cutadapt"

which python

singularity exec ~/.singularity/atac-seq-pipeline-v1.1.simg which python

which cutadapt

singularity exec ~/.singularity/atac-seq-pipeline-v1.1.simg which cutadapt
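
The same host-versus-container contrast can be made for Python's import path, where user-site entries under ~/.local are a common source of shadowing; a minimal sketch:

$ python -c "import sys; print('\n'.join(sys.path))"
$ singularity exec ~/.singularity/atac-seq-pipeline-v1.1.simg \
    python -c "import sys; print('\n'.join(sys.path))"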

shanmukhasampath commented on June 27, 2024

Hi Jin,

Here are the outputs of the commands:

$ python -c "from cutadapt.scripts import cutadapt"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named scripts

I have replaced the folder name with $SOFTWARE in the output below.

$ which python
/SOFTWARE/anaconda/anaconda2/bin/python
$ singularity exec ~/.singularity/atac-seq-pipeline-v1.1.simg which python
/usr/bin/python
$ which cutadapt
~/.local/bin/cutadapt
$ singularity exec ~/.singularity/atac-seq-pipeline-v1.1.simg which cutadapt
/usr/local/bin/cutadapt

leepc12 commented on June 27, 2024

I am very sorry to bother you, but please run the following too:

$ ls -l $HOME/.local/lib/python2.7/site-packages/

$ singularity exec ~/.singularity/atac-seq-pipeline-v1.1.simg ls -l $HOME/.local/lib/python2.7/site-packages/

Can you also remove your locally installed cutadapt (~/.local/bin/cutadapt) and try again?
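
A hedged sketch of that removal, assuming the local copy was installed with pip install --user:

$ pip uninstall cutadapt        # removes the user-site package, if pip-installed
$ rm -f ~/.local/bin/cutadapt   # or remove just the console script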

shanmukhasampath commented on June 27, 2024

Hi Jin,

I am pasting the outputs of the commands here.

$ ls -l $HOME/.local/lib/python2.7/site-packages/
total 736
drwxr-xr-x  2 padmanabs1 reslnusers   131 Sep 24 14:19 bz2file-0.98.dist-info
-rw-r--r--  1 padmanabs1 reslnusers 18340 Sep 24 14:19 bz2file.py
-rw-r--r--  1 padmanabs1 reslnusers 15849 Sep 24 14:19 bz2file.pyc
drwxr-xr-x  5 padmanabs1 reslnusers   528 Aug 30 11:29 dateutil
drwxr-xr-x  3 padmanabs1 reslnusers  1354 Sep 21 16:31 MACS2
drwxr-xr-x  2 padmanabs1 reslnusers   131 Sep 21 16:31 MACS2-2.1.1.20160309.dist-info
drwxr-xr-x 17 padmanabs1 reslnusers  1109 Aug 30 11:29 numpy
drwxr-xr-x  2 padmanabs1 reslnusers   131 Aug 30 11:29 numpy-1.15.1.dist-info
drwxr-xr-x 16 padmanabs1 reslnusers   772 Aug 30 11:29 pandas
drwxr-xr-x  2 padmanabs1 reslnusers   131 Aug 30 11:29 pandas-0.23.4.dist-info
drwxr-xr-x  4 padmanabs1 reslnusers   170 Aug 30 11:25 pip
drwxr-xr-x  2 padmanabs1 reslnusers   194 Aug 30 11:26 pip-18.0.dist-info
drwxr-xr-x  2 padmanabs1 reslnusers   186 Aug 30 11:29 python_dateutil-2.7.3.dist-info
drwxr-xr-x  4 padmanabs1 reslnusers   760 Aug 30 11:26 wheel
drwxr-xr-x  2 padmanabs1 reslnusers   194 Aug 30 11:26 wheel-0.31.1.dist-info
drwxr-xr-x  2 padmanabs1 reslnusers   195 Sep 24 14:19 xopen-0.3.5.dist-info
-rw-r--r--  1 padmanabs1 reslnusers  7158 Sep 24 14:19 xopen.py
-rw-r--r--  1 padmanabs1 reslnusers  8709 Sep 24 14:19 xopen.pyc
$ singularity exec ~/.singularity/atac-seq-pipeline-v1.1.simg ls -l $HOME/.local/lib/python2.7/site-packages/
total 736
drwxr-xr-x  3 padmanabs1 reslnusers  1354 Sep 21 20:31 MACS2
drwxr-xr-x  2 padmanabs1 reslnusers   131 Sep 21 20:31 MACS2-2.1.1.20160309.dist-info
drwxr-xr-x  2 padmanabs1 reslnusers   131 Sep 24 18:19 bz2file-0.98.dist-info
-rw-r--r--  1 padmanabs1 reslnusers 18340 Sep 24 18:19 bz2file.py
-rw-r--r--  1 padmanabs1 reslnusers 15849 Sep 24 18:19 bz2file.pyc
drwxr-xr-x  5 padmanabs1 reslnusers   528 Aug 30 15:29 dateutil
drwxr-xr-x 17 padmanabs1 reslnusers  1109 Aug 30 15:29 numpy
drwxr-xr-x  2 padmanabs1 reslnusers   131 Aug 30 15:29 numpy-1.15.1.dist-info
drwxr-xr-x 16 padmanabs1 reslnusers   772 Aug 30 15:29 pandas
drwxr-xr-x  2 padmanabs1 reslnusers   131 Aug 30 15:29 pandas-0.23.4.dist-info
drwxr-xr-x  4 padmanabs1 reslnusers   170 Aug 30 15:25 pip
drwxr-xr-x  2 padmanabs1 reslnusers   194 Aug 30 15:26 pip-18.0.dist-info
drwxr-xr-x  2 padmanabs1 reslnusers   186 Aug 30 15:29 python_dateutil-2.7.3.dist-info
drwxr-xr-x  4 padmanabs1 reslnusers   760 Aug 30 15:26 wheel
drwxr-xr-x  2 padmanabs1 reslnusers   194 Aug 30 15:26 wheel-0.31.1.dist-info
drwxr-xr-x  2 padmanabs1 reslnusers   195 Sep 24 18:19 xopen-0.3.5.dist-info
-rw-r--r--  1 padmanabs1 reslnusers  7158 Sep 24 18:19 xopen.py
-rw-r--r--  1 padmanabs1 reslnusers  8709 Sep 24 18:19 xopen.pyc

Also, I have removed cutadapt:

$ which cutadapt
/usr/bin/which: no cutadapt in ($ENV)

It is working now after removing cutadapt. Installing cutadapt locally seems to mess up the pipeline.
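
This is consistent with Singularity bind-mounting $HOME by default: the container's Python can then pick up packages from ~/.local/lib/python2.7/site-packages ahead of (or mixed with) its own. If the local install is needed elsewhere, one hedged workaround is to disable user-site lookup for the container; whether the variable propagates into the container depends on your Singularity configuration, so treat this as an assumption:

$ PYTHONNOUSERSITE=1 singularity exec ~/.singularity/atac-seq-pipeline-v1.1.simg \
    python -c "from cutadapt.scripts import cutadapt"
# PYTHONNOUSERSITE=1 is standard Python behavior: the interpreter ignores
# the user site-packages directory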

shanmukhasampath commented on June 27, 2024

Hi Jin,

One more issue I am facing: the ATAC-seq pipeline does not finish properly. After the QC step it gets stuck for a long time.

[2018-09-27 09:52:01,82] [info] DispatchedConfigAsyncJobExecutionActor [095444fcatac.ataqc:1:1]: job id: 3121761
[2018-09-27 09:52:01,83] [info] DispatchedConfigAsyncJobExecutionActor [095444fcatac.qc_report:NA:1]: job id: 3121763
[2018-09-27 09:52:01,83] [info] DispatchedConfigAsyncJobExecutionActor [095444fcatac.ataqc:0:1]: job id: 3121762
[2018-09-27 09:52:01,83] [info] DispatchedConfigAsyncJobExecutionActor [095444fcatac.ataqc:1:1]: Status change from - to WaitingForReturnCodeFile
[2018-09-27 09:52:01,83] [info] DispatchedConfigAsyncJobExecutionActor [095444fcatac.ataqc:0:1]: Status change from - to WaitingForReturnCodeFile
[2018-09-27 09:52:01,83] [info] DispatchedConfigAsyncJobExecutionActor [095444fcatac.qc_report:NA:1]: Status change from - to WaitingForReturnCodeFile
[2018-09-27 09:52:45,15] [info] DispatchedConfigAsyncJobExecutionActor [095444fcatac.qc_report:NA:1]: Status change from WaitingForReturnCodeFile to Done

And it is still running. Is there a way to fix this? I can get it to finish if I press Ctrl+C.

leepc12 commented on June 27, 2024

Thanks for reporting this. I also figured it out and will make a hotfix. Please keep this issue open until then.

ataqc takes a long time even for the test data set. Does it take longer than 2-3 hours? Please check the CPU usage of the pipeline processes with top or htop.

The qc_report task should take only several minutes.
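
A minimal version of that check on an SGE cluster (job IDs as reported in the log above):

$ qstat -u "$USER"   # is the ataqc job (e.g. 3121761) still queued or running?
$ top -u "$USER"     # on the execution host: are its processes using any CPU?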

shanmukhasampath commented on June 27, 2024

Hi Jin,

It was still running, so I cancelled it; it had been running for close to 7 hours 30 minutes before I cancelled it.

leepc12 commented on June 27, 2024

Okay, please upload your log and tar ball. I suspect that something is wrong with the ataqc task.

shanmukhasampath commented on June 27, 2024

Hi Jin,

I am attaching the tar ball for the ATAC pipeline run on the test data:
debug_10.tar.gz

leepc12 commented on June 27, 2024

Please run the following:

$ which picard.jar
$ singularity exec ~/.singularity/atac-seq-pipeline-v1.1.simg which picard.jar

shanmukhasampath commented on June 27, 2024

Hi Jin,

I am pasting the output of the commands here:

$ which picard.jar
/usr/bin/which: no picard.jar in ($SOFTWARE/anaconda/anaconda2/bin:$APP/singularity/2.5.2/bin:$APP/rsem/1.3.0/bin:$APP/bcl2fastq/2.20.0.422/bin:$APP/vcftools/0.1.15/bin:$APP/sam-bcf-tools/1.6/bin:$APP/bedtools/2.25.0/bin:$APP/R/3.5.0/bin:$APP/jdk/1.8.0_144/bin:$APP/gcc/6.3.0/bin:$APP/uge/8.5.3/bin/lx-amd64:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/sbin:$APP/environment-modules/3.2.10/bin:/opt/dell/srvadmin/bin:/home/padmanabs1/.local/bin:/home/padmanabs1/bin)
$ singularity exec ~/.singularity/atac-seq-pipeline-v1.1.simg which picard.jar
/software/picard.jar

leepc12 commented on June 27, 2024

@shanmukhasampath: It's very hard to debug the second problem (hanging at which picard.jar in the ataqc step).

Can you simply re-run the pipeline and see if it works for ataqc?

BTW, the cutadapt problem will be fixed in the next release.

shanmukhasampath commented on June 27, 2024

Hi Jin,

I am now getting an error for the test data:

$ java -jar -Dconfig.file=backends/backend.conf -Dbackend.default=sge_singularity cromwell-34.jar run atac.wdl -i ${INPUT} -o workflow_opts/sge.json
[2018-10-03 17:30:44,41] [info] Running with database db.url = jdbc:hsqldb:mem:6221cf7d-92f1-40e1-b59f-ebf17764f2cb;shutdown=false;hsqldb.tx=mvcc
[2018-10-03 17:30:54,81] [info] Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
[2018-10-03 17:30:54,82] [info] [RenameWorkflowOptionsInMetadata] 100%
[2018-10-03 17:30:54,99] [info] Running with database db.url = jdbc:hsqldb:mem:2b8155f1-0737-44f5-826f-72c9464d7786;shutdown=false;hsqldb.tx=mvcc
[2018-10-03 17:30:55,95] [warn] This actor factory is deprecated. Please use cromwell.backend.google.pipelines.v1alpha2.PipelinesApiLifecycleActorFactory for PAPI v1 or cromwell.backend.google.pipelines.v2alpha1.PipelinesApiLifecycleActorFactory for PAPI v2
[2018-10-03 17:30:55,96] [warn] Couldn't find a suitable DSN, defaulting to a Noop one.
[2018-10-03 17:30:55,96] [info] Using noop to send events.
[2018-10-03 17:30:56,26] [info] Slf4jLogger started
[2018-10-03 17:30:56,52] [info] Workflow heartbeat configuration:
{
  "cromwellId" : "cromid-5663e1d",
  "heartbeatInterval" : "2 minutes",
  "ttl" : "10 minutes",
  "writeBatchSize" : 10000,
  "writeThreshold" : 10000
}
[2018-10-03 17:30:56,99] [info] Metadata summary refreshing every 2 seconds.
[2018-10-03 17:30:57,26] [info] WriteMetadataActor configured to flush with batch size 200 and process rate 5 seconds.
[2018-10-03 17:30:57,26] [info] KvWriteActor configured to flush with batch size 200 and process rate 5 seconds.
[2018-10-03 17:30:57,26] [info] CallCacheWriteActor configured to flush with batch size 100 and process rate 3 seconds.
[2018-10-03 17:30:58,73] [info] JobExecutionTokenDispenser - Distribution rate: 50 per 1 seconds.
[2018-10-03 17:30:58,76] [info] JES batch polling interval is 33333 milliseconds
[2018-10-03 17:30:58,76] [info] JES batch polling interval is 33333 milliseconds
[2018-10-03 17:30:58,76] [info] SingleWorkflowRunnerActor: Version 34
[2018-10-03 17:30:58,76] [info] JES batch polling interval is 33333 milliseconds
[2018-10-03 17:30:58,76] [info] PAPIQueryManager Running with 3 workers
[2018-10-03 17:30:58,77] [info] SingleWorkflowRunnerActor: Submitting workflow
[2018-10-03 17:30:58,86] [info] Unspecified type (Unspecified version) workflow 7b1f8046-bee2-4ed7-bf62-5cd034f7460e submitted
[2018-10-03 17:30:58,94] [info] SingleWorkflowRunnerActor: Workflow submitted 7b1f8046-bee2-4ed7-bf62-5cd034f7460e
[2018-10-03 17:30:58,95] [info] 1 new workflows fetched
[2018-10-03 17:30:58,95] [info] WorkflowManagerActor Starting workflow 7b1f8046-bee2-4ed7-bf62-5cd034f7460e
[2018-10-03 17:30:58,95] [warn] SingleWorkflowRunnerActor: received unexpected message: Done in state RunningSwraData
[2018-10-03 17:30:58,96] [info] WorkflowManagerActor Successfully started WorkflowActor-7b1f8046-bee2-4ed7-bf62-5cd034f7460e
[2018-10-03 17:30:58,96] [info] Retrieved 1 workflows from the WorkflowStoreActor
[2018-10-03 17:30:58,97] [info] WorkflowStoreHeartbeatWriteActor configured to flush with batch size 10000 and process rate 2 minutes.
[2018-10-03 17:30:59,07] [info] MaterializeWorkflowDescriptorActor [7b1f8046]: Parsing workflow as WDL draft-2
[2018-10-03 17:31:19,63] [info] MaterializeWorkflowDescriptorActor [7b1f8046]: Call-to-Backend assignments: atac.macs2_pr1 -> sge_singularity, atac.reproducibility_idr -> sge_singularity, atac.ataqc -> sge_singularity, atac.trim_adapter -> sge_singularity, atac.idr_ppr -> sge_singularity, atac.overlap -> sge_singularity, atac.macs2_ppr1 -> sge_singularity, atac.filter -> sge_singularity, atac.spr -> sge_singularity, atac.read_genome_tsv -> sge_singularity, atac.overlap_ppr -> sge_singularity, atac.pool_ta -> sge_singularity, atac.overlap_pr -> sge_singularity, atac.macs2_pr2 -> sge_singularity, atac.macs2_ppr2 -> sge_singularity, atac.bowtie2 -> sge_singularity, atac.reproducibility_overlap -> sge_singularity, atac.xcor -> sge_singularity, atac.idr_pr -> sge_singularity, atac.idr -> sge_singularity, atac.macs2_pooled -> sge_singularity, atac.pool_ta_pr1 -> sge_singularity, atac.bam2ta -> sge_singularity, atac.macs2 -> sge_singularity, atac.qc_report -> sge_singularity, atac.pool_ta_pr2 -> sge_singularity
[2018-10-03 17:31:19,87] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,87] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,87] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,87] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,87] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,87] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,87] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,88] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,88] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,88] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,88] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,88] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,88] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,88] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,88] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,88] [warn] sge_singularity [7b1f8046]: Key/s [preemptible, disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,88] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,88] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,88] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,89] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,89] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,89] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,89] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,89] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,89] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:19,89] [warn] sge_singularity [7b1f8046]: Key/s [disks] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-10-03 17:31:22,33] [info] WorkflowExecutionActor-7b1f8046-bee2-4ed7-bf62-5cd034f7460e [7b1f8046]: Starting atac.read_genome_tsv
[2018-10-03 17:31:22,33] [info] WorkflowExecutionActor-7b1f8046-bee2-4ed7-bf62-5cd034f7460e [7b1f8046]: Condition met: '!true_rep_only'. Running conditional section
[2018-10-03 17:31:22,33] [info] WorkflowExecutionActor-7b1f8046-bee2-4ed7-bf62-5cd034f7460e [7b1f8046]: Condition met: 'enable_idr'. Running conditional section
[2018-10-03 17:31:22,33] [info] WorkflowExecutionActor-7b1f8046-bee2-4ed7-bf62-5cd034f7460e [7b1f8046]: Condition met: '!align_only && !true_rep_only && enable_idr'. Running conditional section
[2018-10-03 17:31:22,34] [info] WorkflowExecutionActor-7b1f8046-bee2-4ed7-bf62-5cd034f7460e [7b1f8046]: Condition met: 'enable_idr'. Running conditional section
[2018-10-03 17:31:22,35] [info] WorkflowExecutionActor-7b1f8046-bee2-4ed7-bf62-5cd034f7460e [7b1f8046]: Condition met: '!disable_xcor'. Running conditional section
[2018-10-03 17:31:22,36] [info] WorkflowExecutionActor-7b1f8046-bee2-4ed7-bf62-5cd034f7460e [7b1f8046]: Condition met: '!align_only && !true_rep_only'. Running conditional section
[2018-10-03 17:31:23,54] [warn] DispatchedConfigAsyncJobExecutionActor [7b1f8046atac.read_genome_tsv:NA:1]: Unrecognized runtime attribute keys: disks
[2018-10-03 17:31:23,99] [info] DispatchedConfigAsyncJobExecutionActor [7b1f8046atac.read_genome_tsv:NA:1]: cat $ATAC/cromwell-executions/atac/7b1f8046-bee2-4ed7-bf62-5cd034f7460e/call-read_genome_tsv/inputs/-1470820443/hg38_local.tsv
[2018-10-03 17:31:24,09] [info] DispatchedConfigAsyncJobExecutionActor [7b1f8046atac.read_genome_tsv:NA:1]: executing: echo "chmod u+x $ATAC/cromwell-executions/atac/7b1f8046-bee2-4ed7-bf62-5cd034f7460e/call-read_genome_tsv/execution/script && SINGULARITY_BINDPATH=$(echo $ATAC/cromwell-executions/atac/7b1f8046-bee2-4ed7-bf62-5cd034f7460e/call-read_genome_tsv | sed 's/cromwell-executions/\n/g' | head -n1) singularity  exec     ~/.singularity/atac-seq-pipeline-v1.1.simg $ATAC/cromwell-executions/atac/7b1f8046-bee2-4ed7-bf62-5cd034f7460e/call-read_genome_tsv/execution/script" | qsub \
-terse \
-b n \
-N cromwell_7b1f8046_read_genome_tsv \
-wd $ATAC/cromwell-executions/atac/7b1f8046-bee2-4ed7-bf62-5cd034f7460e/call-read_genome_tsv \
-o $ATAC/cromwell-executions/atac/7b1f8046-bee2-4ed7-bf62-5cd034f7460e/call-read_genome_tsv/execution/stdout \
-e $ATAC/cromwell-executions/atac/7b1f8046-bee2-4ed7-bf62-5cd034f7460e/call-read_genome_tsv/execution/stderr \
  \
-l h_vmem=4000m \
-l s_vmem=4000m \
-l h_rt=3600 \
-l s_rt=3600 \
-q all.q \
 \
 \
-V
[2018-10-03 17:31:26,01] [error] DispatchedConfigAsyncJobExecutionActor [7b1f8046atac.read_genome_tsv:NA:1]: Error attempting to Execute
java.lang.RuntimeException: Could not find job ID from stdout file. Check the stderr file for possible errors: $ATAC/cromwell-executions/atac/7b1f8046-bee2-4ed7-bf62-5cd034f7460e/call-read_genome_tsv/execution/stderr.submit
        at cromwell.backend.impl.sfs.config.DispatchedConfigAsyncJobExecutionActor.getJob(ConfigAsyncJobExecutionActor.scala:226)
        at cromwell.backend.sfs.SharedFileSystemAsyncJobExecutionActor.$anonfun$execute$2(SharedFileSystemAsyncJobExecutionActor.scala:133)
        at scala.util.Either.fold(Either.scala:188)
        at cromwell.backend.sfs.SharedFileSystemAsyncJobExecutionActor.execute(SharedFileSystemAsyncJobExecutionActor.scala:126)
        at cromwell.backend.sfs.SharedFileSystemAsyncJobExecutionActor.execute$(SharedFileSystemAsyncJobExecutionActor.scala:121)
        at cromwell.backend.impl.sfs.config.DispatchedConfigAsyncJobExecutionActor.execute(ConfigAsyncJobExecutionActor.scala:208)
        at cromwell.backend.standard.StandardAsyncExecutionActor.$anonfun$executeAsync$1(StandardAsyncExecutionActor.scala:600)
        at scala.util.Try$.apply(Try.scala:209)
        at cromwell.backend.standard.StandardAsyncExecutionActor.executeAsync(StandardAsyncExecutionActor.scala:600)
        at cromwell.backend.standard.StandardAsyncExecutionActor.executeAsync$(StandardAsyncExecutionActor.scala:600)
        at cromwell.backend.impl.sfs.config.DispatchedConfigAsyncJobExecutionActor.executeAsync(ConfigAsyncJobExecutionActor.scala:208)
        at cromwell.backend.standard.StandardAsyncExecutionActor.executeOrRecover(StandardAsyncExecutionActor.scala:915)
        at cromwell.backend.standard.StandardAsyncExecutionActor.executeOrRecover$(StandardAsyncExecutionActor.scala:907)
        at cromwell.backend.impl.sfs.config.DispatchedConfigAsyncJobExecutionActor.executeOrRecover(ConfigAsyncJobExecutionActor.scala:208)
        at cromwell.backend.async.AsyncBackendJobExecutionActor.$anonfun$robustExecuteOrRecover$1(AsyncBackendJobExecutionActor.scala:65)
        at cromwell.core.retry.Retry$.withRetry(Retry.scala:37)
        at cromwell.backend.async.AsyncBackendJobExecutionActor.withRetry(AsyncBackendJobExecutionActor.scala:61)
        at cromwell.backend.async.AsyncBackendJobExecutionActor.cromwell$backend$async$AsyncBackendJobExecutionActor$$robustExecuteOrRecover(AsyncBackendJobExecutionActor.scala:65)
        at cromwell.backend.async.AsyncBackendJobExecutionActor$$anonfun$receive$1.applyOrElse(AsyncBackendJobExecutionActor.scala:88)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
        at akka.actor.Actor.aroundReceive(Actor.scala:517)
        at akka.actor.Actor.aroundReceive$(Actor.scala:515)
        at cromwell.backend.impl.sfs.config.DispatchedConfigAsyncJobExecutionActor.aroundReceive(ConfigAsyncJobExecutionActor.scala:208)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:588)
        at akka.actor.ActorCell.invoke(ActorCell.scala:557)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
        at akka.dispatch.Mailbox.run(Mailbox.scala:225)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
        at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
[2018-10-03 17:31:26,57] [error] WorkflowManagerActor Workflow 7b1f8046-bee2-4ed7-bf62-5cd034f7460e failed (during ExecutingWorkflowState): cromwell.core.CromwellFatalException: java.lang.RuntimeException: Could not find job ID from stdout file. Check the stderr file for possible errors: $ATAC/cromwell-executions/atac/7b1f8046-bee2-4ed7-bf62-5cd034f7460e/call-read_genome_tsv/execution/stderr.submit
        at cromwell.core.CromwellFatalException$.apply(core.scala:18)
        at cromwell.core.retry.Retry$$anonfun$withRetry$1.applyOrElse(Retry.scala:38)
        at cromwell.core.retry.Retry$$anonfun$withRetry$1.applyOrElse(Retry.scala:37)
        at scala.concurrent.Future.$anonfun$recoverWith$1(Future.scala:413)
        at scala.concurrent.impl.Promise.$anonfun$transformWith$1(Promise.scala:37)
        at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
        at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
        at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:91)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
        at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:81)
        at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:91)
        at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
        at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.RuntimeException: Could not find job ID from stdout file. Check the stderr file for possible errors: $ATAC/cromwell-executions/atac/7b1f8046-bee2-4ed7-bf62-5cd034f7460e/call-read_genome_tsv/execution/stderr.submit
        at cromwell.backend.impl.sfs.config.DispatchedConfigAsyncJobExecutionActor.getJob(ConfigAsyncJobExecutionActor.scala:226)
        at cromwell.backend.sfs.SharedFileSystemAsyncJobExecutionActor.$anonfun$execute$2(SharedFileSystemAsyncJobExecutionActor.scala:133)
        at scala.util.Either.fold(Either.scala:188)
        at cromwell.backend.sfs.SharedFileSystemAsyncJobExecutionActor.execute(SharedFileSystemAsyncJobExecutionActor.scala:126)
        at cromwell.backend.sfs.SharedFileSystemAsyncJobExecutionActor.execute$(SharedFileSystemAsyncJobExecutionActor.scala:121)
        at cromwell.backend.impl.sfs.config.DispatchedConfigAsyncJobExecutionActor.execute(ConfigAsyncJobExecutionActor.scala:208)
        at cromwell.backend.standard.StandardAsyncExecutionActor.$anonfun$executeAsync$1(StandardAsyncExecutionActor.scala:600)
        at scala.util.Try$.apply(Try.scala:209)
        at cromwell.backend.standard.StandardAsyncExecutionActor.executeAsync(StandardAsyncExecutionActor.scala:600)
        at cromwell.backend.standard.StandardAsyncExecutionActor.executeAsync$(StandardAsyncExecutionActor.scala:600)
        at cromwell.backend.impl.sfs.config.DispatchedConfigAsyncJobExecutionActor.executeAsync(ConfigAsyncJobExecutionActor.scala:208)
        at cromwell.backend.standard.StandardAsyncExecutionActor.executeOrRecover(StandardAsyncExecutionActor.scala:915)
        at cromwell.backend.standard.StandardAsyncExecutionActor.executeOrRecover$(StandardAsyncExecutionActor.scala:907)
        at cromwell.backend.impl.sfs.config.DispatchedConfigAsyncJobExecutionActor.executeOrRecover(ConfigAsyncJobExecutionActor.scala:208)
        at cromwell.backend.async.AsyncBackendJobExecutionActor.$anonfun$robustExecuteOrRecover$1(AsyncBackendJobExecutionActor.scala:65)
        at cromwell.core.retry.Retry$.withRetry(Retry.scala:37)
        at cromwell.backend.async.AsyncBackendJobExecutionActor.withRetry(AsyncBackendJobExecutionActor.scala:61)
        at cromwell.backend.async.AsyncBackendJobExecutionActor.cromwell$backend$async$AsyncBackendJobExecutionActor$$robustExecuteOrRecover(AsyncBackendJobExecutionActor.scala:65)
        at cromwell.backend.async.AsyncBackendJobExecutionActor$$anonfun$receive$1.applyOrElse(AsyncBackendJobExecutionActor.scala:88)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
        at akka.actor.Actor.aroundReceive(Actor.scala:517)
        at akka.actor.Actor.aroundReceive$(Actor.scala:515)
        at cromwell.backend.impl.sfs.config.DispatchedConfigAsyncJobExecutionActor.aroundReceive(ConfigAsyncJobExecutionActor.scala:208)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:588)
        at akka.actor.ActorCell.invoke(ActorCell.scala:557)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
        at akka.dispatch.Mailbox.run(Mailbox.scala:225)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
        ... 4 more

[2018-10-03 17:31:26,58] [info] WorkflowManagerActor WorkflowActor-7b1f8046-bee2-4ed7-bf62-5cd034f7460e is in a terminal state: WorkflowFailedState
[2018-10-03 17:31:29,44] [info] SingleWorkflowRunnerActor workflow finished with status 'Failed'.
[2018-10-03 17:31:32,28] [info] Workflow polling stopped
[2018-10-03 17:31:32,29] [info] Shutting down WorkflowStoreActor - Timeout = 5 seconds
[2018-10-03 17:31:32,30] [info] Shutting down WorkflowLogCopyRouter - Timeout = 5 seconds
[2018-10-03 17:31:32,30] [info] Shutting down JobExecutionTokenDispenser - Timeout = 5 seconds
[2018-10-03 17:31:32,30] [info] Aborting all running workflows.
[2018-10-03 17:31:32,31] [info] JobExecutionTokenDispenser stopped
[2018-10-03 17:31:32,31] [info] WorkflowStoreActor stopped
[2018-10-03 17:31:32,31] [info] WorkflowLogCopyRouter stopped
[2018-10-03 17:31:32,31] [info] Shutting down WorkflowManagerActor - Timeout = 3600 seconds
[2018-10-03 17:31:32,31] [info] WorkflowManagerActor All workflows finished
[2018-10-03 17:31:32,31] [info] WorkflowManagerActor stopped
[2018-10-03 17:31:32,31] [info] Connection pools shut down
[2018-10-03 17:31:32,31] [info] Shutting down SubWorkflowStoreActor - Timeout = 1800 seconds
[2018-10-03 17:31:32,31] [info] Shutting down JobStoreActor - Timeout = 1800 seconds
[2018-10-03 17:31:32,31] [info] Shutting down CallCacheWriteActor - Timeout = 1800 seconds
[2018-10-03 17:31:32,31] [info] SubWorkflowStoreActor stopped
[2018-10-03 17:31:32,31] [info] Shutting down ServiceRegistryActor - Timeout = 1800 seconds
[2018-10-03 17:31:32,31] [info] Shutting down DockerHashActor - Timeout = 1800 seconds
[2018-10-03 17:31:32,32] [info] CallCacheWriteActor Shutting down: 0 queued messages to process
[2018-10-03 17:31:32,32] [info] JobStoreActor stopped
[2018-10-03 17:31:32,32] [info] Shutting down IoProxy - Timeout = 1800 seconds
[2018-10-03 17:31:32,32] [info] KvWriteActor Shutting down: 0 queued messages to process
[2018-10-03 17:31:32,32] [info] WriteMetadataActor Shutting down: 0 queued messages to process
[2018-10-03 17:31:32,32] [info] CallCacheWriteActor stopped
[2018-10-03 17:31:32,32] [info] DockerHashActor stopped
[2018-10-03 17:31:32,32] [info] IoProxy stopped
[2018-10-03 17:31:32,32] [info] ServiceRegistryActor stopped
[2018-10-03 17:31:32,35] [info] Database closed
[2018-10-03 17:31:32,35] [info] Stream materializer shut down
Workflow 7b1f8046-bee2-4ed7-bf62-5cd034f7460e transitioned to state Failed
[2018-10-03 17:31:32,39] [info] Automatic shutdown of the async connection
[2018-10-03 17:31:32,39] [info] Gracefully shutdown sentry threads.
[2018-10-03 17:31:32,40] [info] Shutdown finished.

It was working a few days ago and now I am getting this error. I am attaching the log file here:
debug_37.tar.gz
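
A note for anyone hitting the same "Could not find job ID from stdout file" failure: Cromwell's shared-filesystem backend reads the SGE job ID from the stdout of the qsub submission, so an empty submit log usually means qsub itself never ran cleanly. A minimal check, using the paths from the log above (the stdout.submit file name is an assumption based on the stderr.submit path in the error):

$ cat $ATAC/cromwell-executions/atac/7b1f8046-bee2-4ed7-bf62-5cd034f7460e/call-read_genome_tsv/execution/stderr.submit
$ cat $ATAC/cromwell-executions/atac/7b1f8046-bee2-4ed7-bf62-5cd034f7460e/call-read_genome_tsv/execution/stdout.submit
# qsub normally prints a line like "Your job <id> (...) has been submitted";
# Cromwell matches it against the job-id-regex configured in backends/backend.conf.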


leepc12 commented on June 27, 2024

Please git clone our dev branch including the above fix commit and try again. I hope this fixes the problem.

git clone --branch PIP-388_singularity_bind_home_dir https://github.com/ENCODE-DCC/atac-seq-pipeline


shanmukhasampath commented on June 27, 2024

Hi Jin,

I have git cloned the repository and fixed the backend configuration file with -S /bin/sh. I am getting the following error when I try to use the SGE Singularity backend:

$ java -jar -Dconfig.file=backends/backend.conf -Dbackend.default=sge_singularity cromwell-34.jar run atac.wdl -i ${INPUT} -o workflow_opts/sge.json
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 54256 bytes for Chunk::new
# An error report file with more information is saved as:
# $ATAC/hs_err_pid139236.log
[thread 46914692994816 also had an error]
[thread 46914697205504 also had an error]
#
# Can't open file to dump replay data. Error: Cannot allocate memory
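
The JVM crash report named in the message records which allocation failed and the memory state at the time, so its header is worth inspecting first (path taken from the message above):

$ head -n 20 $ATAC/hs_err_pid139236.log
$ free -h          # physical memory actually available on the node
$ ulimit -v -u     # per-process virtual memory and process limits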


shanmukhasampath commented on June 27, 2024

Hi Jin,

I have contacted the server administrator regarding this issue, and the admin mentioned that the command

$ java -jar -Dconfig.file=backends/backend.conf -Dbackend.default=sge_singularity cromwell-34.jar run atac.wdl -i ${INPUT} -o workflow_opts/sge.json

is asking for 44.59 GB, which is why I am getting insufficient memory errors like this:

$ java -jar -Dconfig.file=backends/backend.conf -Dbackend.default=sge_singularity cromwell-34.jar run atac.wdl -i ${INPUT} -o workflow_opts/sge.json
[2018-10-10 14:17:45,83] [info] Running with database db.url = jdbc:hsqldb:mem:1c64cc62-4de9-4f51-97e9-02cab156b321;shutdown=false;hsqldb.tx=mvcc
[2018-10-10 14:17:59,04] [info] Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
[2018-10-10 14:17:59,06] [info] [RenameWorkflowOptionsInMetadata] 100%
[2018-10-10 14:17:59,26] [info] Running with database db.url = jdbc:hsqldb:mem:48a4f47a-aaea-465d-92d7-0b383f75e40a;shutdown=false;hsqldb.tx=mvcc
[2018-10-10 14:18:00,28] [warn] This actor factory is deprecated. Please use cromwell.backend.google.pipelines.v1alpha2.PipelinesApiLifecycleActorFactory for PAPI v1 or cromwell.backend.google.pipelines.v2alpha1.PipelinesApiLifecycleActorFactory for PAPI v2
[2018-10-10 14:18:00,29] [warn] Couldn't find a suitable DSN, defaulting to a Noop one.
[2018-10-10 14:18:00,30] [info] Using noop to send events.
[2018-10-10 14:18:01,29] [info] Slf4jLogger started
[2018-10-10 14:18:01,63] [info] Workflow heartbeat configuration:
{
  "cromwellId" : "cromid-b049fd4",
  "heartbeatInterval" : "2 minutes",
  "ttl" : "10 minutes",
  "writeBatchSize" : 10000,
  "writeThreshold" : 10000
}
[2018-10-10 14:18:01,68] [info] Metadata summary refreshing every 2 seconds.
[2018-10-10 14:18:01,76] [info] WriteMetadataActor configured to flush with batch size 200 and process rate 5 seconds.
[2018-10-10 14:18:01,76] [info] KvWriteActor configured to flush with batch size 200 and process rate 5 seconds.
[2018-10-10 14:18:01,76] [info] CallCacheWriteActor configured to flush with batch size 100 and process rate 3 seconds.
Uncaught error from thread [cromwell-system-akka.dispatchers.service-dispatcher-10]: unable to create new native thread, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[cromwell-system]
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:717)
        at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1367)
        at slick.util.AsyncExecutor$$anon$2$$anon$3.execute(AsyncExecutor.scala:161)
        at slick.basic.BasicBackend$DatabaseDef.runSynchronousDatabaseAction(BasicBackend.scala:264)
        at slick.basic.BasicBackend$DatabaseDef.runSynchronousDatabaseAction$(BasicBackend.scala:262)
        at slick.jdbc.JdbcBackend$DatabaseDef.runSynchronousDatabaseAction(JdbcBackend.scala:37)
        at slick.basic.BasicBackend$DatabaseDef.slick$basic$BasicBackend$DatabaseDef$$runInContextInline(BasicBackend.scala:241)
        at slick.basic.BasicBackend$DatabaseDef.runInContextSafe(BasicBackend.scala:147)
        at slick.basic.BasicBackend$DatabaseDef.slick$basic$BasicBackend$DatabaseDef$$runInContextInline(BasicBackend.scala:171)
        at slick.basic.BasicBackend$DatabaseDef.runInContextSafe(BasicBackend.scala:147)
        at slick.basic.BasicBackend$DatabaseDef.slick$basic$BasicBackend$DatabaseDef$$runInContextInline(BasicBackend.scala:171)
        at slick.basic.BasicBackend$DatabaseDef.runInContextSafe(BasicBackend.scala:147)
        at slick.basic.BasicBackend$DatabaseDef.runInContext(BasicBackend.scala:141)
        at slick.basic.BasicBackend$DatabaseDef.runInContext$(BasicBackend.scala:140)
        at slick.jdbc.JdbcBackend$DatabaseDef.runInContext(JdbcBackend.scala:37)
        at slick.basic.BasicBackend$DatabaseDef.$anonfun$runInContextInline$1(BasicBackend.scala:171)
        at scala.concurrent.Future.$anonfun$flatMap$1(Future.scala:303)
        at scala.concurrent.impl.Promise.$anonfun$transformWith$1(Promise.scala:37)
        at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
        at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
        at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:91)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
        at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:81)
        at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:91)
        at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
        at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

To my understanding it is strange that the command asks for that much memory, since it only submits jobs using qsub.
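
One plausible explanation (an assumption; the log alone cannot confirm it): the 44.59 GB the admin sees is the JVM's reserved virtual address space, not resident memory. Without an explicit -Xmx, the JVM sizes its maximum heap from the machine's physical RAM and reserves that address space up front, and every thread adds a stack reservation on top. The effective defaults can be checked with:

$ java -XX:+PrintFlagsFinal -version | grep -iE 'MaxHeapSize|ThreadStackSize'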


leepc12 commented on June 27, 2024

Please try limiting your Java heap size. Add the following to your ~/.bashrc and re-login.

export _JAVA_OPTIONS="-Xms256M -Xmx1024M -XX:ParallelGCThreads=1"
export MAX_JAVA_MEM="8G"
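
After re-login, any java invocation should echo the picked-up options, which is a quick way to confirm the setting took effect:

$ java -version
Picked up _JAVA_OPTIONS: -Xms256M -Xmx1024M -XX:ParallelGCThreads=1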


shanmukhasampath commented on June 27, 2024

Hi Jin,

I have added the export commands to my ~/.bashrc.

I still got the memory allocation problem:

$ java -jar -Dconfig.file=backends/backend.conf -Dbackend.default=sge_singularity cromwell-34.jar run atac.wdl -i ${INPUT} -o workflow_opts/sge.json
Picked up _JAVA_OPTIONS: -Xms256M -Xmx1024M -XX:ParallelGCThreads=1
[2018-10-11 17:14:32,59] [info] Running with database db.url = jdbc:hsqldb:mem:05d6b9ed-0028-4829-b28f-561cda775f2e;shutdown=false;hsqldb.tx=mvcc
[2018-10-11 17:14:42,34] [info] Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
[2018-10-11 17:14:42,35] [info] [RenameWorkflowOptionsInMetadata] 100%
[2018-10-11 17:14:42,45] [info] Running with database db.url = jdbc:hsqldb:mem:9351cd51-c193-486c-bfa4-ddc144e61d9c;shutdown=false;hsqldb.tx=mvcc
[2018-10-11 17:14:43,05] [warn] This actor factory is deprecated. Please use cromwell.backend.google.pipelines.v1alpha2.PipelinesApiLifecycleActorFactory for PAPI v1 or cromwell.backend.google.pipelines.v2alpha1.PipelinesApiLifecycleActorFactory for PAPI v2
[2018-10-11 17:14:43,05] [warn] Couldn't find a suitable DSN, defaulting to a Noop one.
[2018-10-11 17:14:43,06] [info] Using noop to send events.
[2018-10-11 17:14:43,53] [info] Slf4jLogger started
[2018-10-11 17:14:43,84] [info] Workflow heartbeat configuration:
{
  "cromwellId" : "cromid-119a9e8",
  "heartbeatInterval" : "2 minutes",
  "ttl" : "10 minutes",
  "writeBatchSize" : 10000,
  "writeThreshold" : 10000
}
[2018-10-11 17:14:43,90] [info] Metadata summary refreshing every 2 seconds.
[2018-10-11 17:14:43,97] [info] KvWriteActor configured to flush with batch size 200 and process rate 5 seconds.
[2018-10-11 17:14:43,97] [info] WriteMetadataActor configured to flush with batch size 200 and process rate 5 seconds.
[2018-10-11 17:14:43,98] [info] CallCacheWriteActor configured to flush with batch size 100 and process rate 3 seconds.
[2018-10-11 17:14:45,46] [info] JobExecutionTokenDispenser - Distribution rate: 50 per 1 seconds.
[2018-10-11 17:14:45,51] [info] SingleWorkflowRunnerActor: Version 34
[2018-10-11 17:14:45,52] [info] SingleWorkflowRunnerActor: Submitting workflow
[2018-10-11 17:14:45,54] [info] JES batch polling interval is 33333 milliseconds
[2018-10-11 17:14:45,55] [info] JES batch polling interval is 33333 milliseconds
[2018-10-11 17:14:45,55] [info] JES batch polling interval is 33333 milliseconds
[2018-10-11 17:14:45,55] [info] PAPIQueryManager Running with 3 workers
[thread 46917120141056 also had an error]
#
[thread 46917124339456 also had an error]# There is insufficient memory for the Java Runtime Environment to continue.

# Native memory allocation (malloc) failed to allocate 88 bytes[thread 46917128189696 also had an error] for
AllocateHeap
[thread 46917130422016 also had an error]
# An error report file with more information is saved as:
# /mnt/isilon/sfgi/programs/atac_dnase_pipelines_noBDS_new/hs_err_pid29691.log


leepc12 commented on June 27, 2024

https://plumbr.io/outofmemoryerror/unable-to-create-new-native-thread
Please check max user processes.

$ ulimit -a
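
"unable to create new native thread" normally points at the per-user process/thread limit or the virtual memory limit rather than the heap. A rough way to see how close you are to the cap (a sketch):

$ ulimit -u                  # per-user process/thread limit
$ ps -u $USER -L | wc -l     # approximate count of your current threads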


shanmukhasampath commented on June 27, 2024

Hi Jin,

I have checked the max user processes; here is the output of the command:

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 515210
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 131072
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4096
virtual memory          (kbytes, -v) 8000000
file locks                      (-x) unlimited
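
Two of these limits stand out: virtual memory is capped at 8,000,000 kB (about 8 GB) and max user processes at 4096. Every JVM thread reserves stack space against the virtual memory budget, so a busy Cromwell run can exhaust either ceiling. If site policy permits, raising the caps for the session or shrinking the per-thread stack may help (a sketch, not a verified fix):

$ ulimit -u 8192             # raise the process/thread cap, if the hard limit allows
$ ulimit -v unlimited        # lift the virtual memory cap, if the hard limit allows
$ export _JAVA_OPTIONS="-Xms256M -Xmx1024M -Xss512k -XX:ParallelGCThreads=1"

The -Xss512k flag halves the usual per-thread stack reservation; it is an addition to the options suggested earlier in this thread, not part of them.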


leepc12 commented on June 27, 2024
  1. Can you remove -S /bin/sh and try again? I would like to check if -S /bin/sh caused the memory problem.

  2. Also, can you try without using SGE? You can get an interactive node with qlogin and run the pipeline with the singularity backend (see the sketch below).
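
Suggestion 2 would look roughly like this; any qlogin resource flags are site-specific and omitted here, and the run command is the same one used throughout this thread with the backend switched to singularity:

$ qlogin
$ java -jar -Dconfig.file=backends/backend.conf -Dbackend.default=singularity cromwell-34.jar run atac.wdl -i ${INPUT} -o workflow_opts/sge.json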


shanmukhasampath commented on June 27, 2024

Hi Jin,

I have commented out the shell option -S /bin/sh and re-ran the pipeline, but I still got the OutOfMemory error:

$ grep -n '\-S' backends/backend.conf
69:          # -S /bin/sh \
145:        # -S /bin/sh \
$ singularity --version
2.5.2-dist
$ java -jar -Dconfig.file=backends/backend.conf -Dbackend.default=sge_singularity cromwell-34.jar run atac.wdl -i ${INPUT} -o workflow_opts/sge.json
Picked up _JAVA_OPTIONS: -Xms256M -Xmx1024M -XX:ParallelGCThreads=1
[2018-10-17 14:34:49,14] [info] Running with database db.url = jdbc:hsqldb:mem:00c91e2c-b4dd-47a0-b62d-c8c6839ae9e8;shutdown=false;hsqldb.tx=mvcc
[2018-10-17 14:35:03,18] [info] Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
[2018-10-17 14:35:03,21] [info] [RenameWorkflowOptionsInMetadata] 100%
[2018-10-17 14:35:03,40] [info] Running with database db.url = jdbc:hsqldb:mem:911febb1-c51a-45f0-983a-2970f1ce9fbe;shutdown=false;hsqldb.tx=mvcc
[2018-10-17 14:35:04,04] [warn] This actor factory is deprecated. Please use cromwell.backend.google.pipelines.v1alpha2.PipelinesApiLifecycleActorFactory for PAPI v1 or cromwell.backend.google.pipelines.v2alpha1.PipelinesApiLifecycleActorFactory for PAPI v2
[2018-10-17 14:35:04,06] [warn] Couldn't find a suitable DSN, defaulting to a Noop one.
[2018-10-17 14:35:04,07] [info] Using noop to send events.
[2018-10-17 14:35:04,53] [info] Slf4jLogger started
[2018-10-17 14:35:04,86] [info] Workflow heartbeat configuration:
{
  "cromwellId" : "cromid-2332272",
  "heartbeatInterval" : "2 minutes",
  "ttl" : "10 minutes",
  "writeBatchSize" : 10000,
  "writeThreshold" : 10000
}
[2018-10-17 14:35:04,91] [info] Metadata summary refreshing every 2 seconds.
[2018-10-17 14:35:04,98] [info] WriteMetadataActor configured to flush with batch size 200 and process rate 5 seconds.
[2018-10-17 14:35:04,99] [info] KvWriteActor configured to flush with batch size 200 and process rate 5 seconds.
[2018-10-17 14:35:04,99] [info] CallCacheWriteActor configured to flush with batch size 100 and process rate 3 seconds.
[2018-10-17 14:35:06,94] [info] JobExecutionTokenDispenser - Distribution rate: 50 per 1 seconds.
[2018-10-17 14:35:07,00] [info] SingleWorkflowRunnerActor: Version 34
[2018-10-17 14:35:07,02] [info] SingleWorkflowRunnerActor: Submitting workflow
[2018-10-17 14:35:07,06] [info] JES batch polling interval is 33333 milliseconds
Uncaught error from thread [cromwell-system-akka.dispatchers.api-dispatcher-33]: Uncaught error from thread [cromwell-system-akka.dispatchers.backend-dispatcher-64]: unable to create new native thread, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[cromwell-systemunable to create new native thread, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[cromwell-system]
]
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:717)
        at akka.dispatch.forkjoin.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1672)
        at akka.dispatch.forkjoin.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1795)
        at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:117)
Uncaught error from thread [cromwell-system-akka.dispatchers.backend-dispatcher-62]: unable to create new native thread, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[cromwell-system]
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:717)
        at akka.dispatch.forkjoin.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1672)
        at akka.dispatch.forkjoin.ForkJoinPool.signalWork(ForkJoinPool.java:1966)
        at akka.dispatch.forkjoin.ForkJoinPool.fullExternalPush(ForkJoinPool.java:1905)
        at akka.dispatch.forkjoin.ForkJoinPool.externalPush(ForkJoinPool.java:1834)
        at akka.dispatch.forkjoin.ForkJoinPool.execute(ForkJoinPool.java:2955)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinPool.execute(ForkJoinExecutorConfigurator.scala:30)
        at akka.dispatch.ExecutorServiceDelegate.execute(ThreadPoolBuilder.scala:211)
        at akka.dispatch.ExecutorServiceDelegate.execute$(ThreadPoolBuilder.scala:211)
        at akka.dispatch.Dispatcher$LazyExecutorServiceDelegate.execute(Dispatcher.scala:39)
        at akka.dispatch.Dispatcher.executeTask(Dispatcher.scala:72)
        at akka.dispatch.MessageDispatcher.unbatchedExecute(AbstractDispatcher.scala:146)
        at akka.dispatch.BatchingExecutor.execute(BatchingExecutor.scala:120)
        at akka.dispatch.BatchingExecutor.execute$(BatchingExecutor.scala:114)
        at akka.dispatch.MessageDispatcher.execute(AbstractDispatcher.scala:86)
        at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)
        at scala.concurrent.impl.Promise$KeptPromise$Kept.onComplete(Promise.scala:368)
        at scala.concurrent.impl.Promise$KeptPromise$Kept.onComplete$(Promise.scala:367)
        at scala.concurrent.impl.Promise$KeptPromise$Successful.onComplete(Promise.scala:375)
        at scala.concurrent.impl.Promise.transform(Promise.scala:29)
        at scala.concurrent.impl.Promise.transform$(Promise.scala:27)
        at scala.concurrent.impl.Promise$KeptPromise$Successful.transform(Promise.scala:375)
        at scala.concurrent.Future.map(Future.scala:288)
        at scala.concurrent.Future.map$(Future.scala:288)
        at scala.concurrent.impl.Promise$KeptPromise$Successful.map(Promise.scala:375)
        at scala.concurrent.Future$.apply(Future.scala:654)
        at cromwell.engine.workflow.workflowstore.WorkflowStoreSubmitActor.processSource(WorkflowStoreSubmitActor.scala:137)
        at cromwell.engine.workflow.workflowstore.WorkflowStoreSubmitActor.$anonfun$processSources$1(WorkflowStoreSubmitActor.scala:102)
        at cats.data.NonEmptyList.map(NonEmptyList.scala:76)
        at cromwell.engine.workflow.workflowstore.WorkflowStoreSubmitActor.processSources(WorkflowStoreSubmitActor.scala:102)
        at cromwell.engine.workflow.workflowstore.WorkflowStoreSubmitActor.cromwell$engine$workflow$workflowstore$WorkflowStoreSubmitActor$$storeWorkflowSources(WorkflowStoreSubmitActor.scala:95)
        at cromwell.engine.workflow.workflowstore.WorkflowStoreSubmitActor$$anonfun$1.applyOrElse(WorkflowStoreSubmitActor.scala:39)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
        at akka.actor.Actor.aroundReceive(Actor.scala:517)
        at akka.actor.Actor.aroundReceive$(Actor.scala:515)
        at cromwell.engine.workflow.workflowstore.WorkflowStoreSubmitActor.aroundReceive(WorkflowStoreSubmitActor.scala:29)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:588)
        at akka.actor.ActorCell.invoke(ActorCell.scala:557)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
        at akka.dispatch.Mailbox.run(Mailbox.scala:225)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
        at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:717)
        at akka.dispatch.forkjoin.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1672)
        at akka.dispatch.forkjoin.ForkJoinPool.signalWork(ForkJoinPool.java:1966)
        at akka.dispatch.forkjoin.ForkJoinPool.externalPush(ForkJoinPool.java:1829)
        at akka.dispatch.forkjoin.ForkJoinPool.execute(ForkJoinPool.java:2955)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinPool.execute(ForkJoinExecutorConfigurator.scala:30)
        at akka.dispatch.ExecutorServiceDelegate.execute(ThreadPoolBuilder.scala:211)
        at akka.dispatch.ExecutorServiceDelegate.execute$(ThreadPoolBuilder.scala:211)
        at akka.dispatch.Dispatcher$LazyExecutorServiceDelegate.execute(Dispatcher.scala:39)
        at akka.dispatch.Dispatcher.registerForExecution(Dispatcher.scala:115)
        at akka.dispatch.Mailbox.run(Mailbox.scala:229)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
        at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Uncaught error from thread [cromwell-system-akka.dispatchers.backend-dispatcher-61]: unable to create new native thread, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[cromwell-system]
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:717)
        at akka.dispatch.forkjoin.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1672)
        at akka.dispatch.forkjoin.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1795)
        at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:117)

I ran the pipeline with the singularity backend on a qlogin node and it worked:

[2018-10-17 16:12:44,61] [info] WorkflowManagerActor WorkflowActor-d2ebeb8f-b13e-4a26-b096-83a6d825b8ea is in a terminal state: WorkflowSucceededState
[2018-10-17 16:13:18,11] [info] SingleWorkflowRunnerActor workflow finished with status 'Succeeded'.

This time the pipeline did not get stuck at the qc report step.


leepc12 commented on June 27, 2024

@shanmukhasampath: Thanks for this, but you should not comment it out; please delete the line and try again. (The qsub invocation in backend.conf is a single multi-line shell command joined by trailing backslashes, so a # there also swallows the backslash and cuts the remaining options off from the command.)
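
A one-liner that deletes the line instead of commenting it (a sketch; it edits backends/backend.conf in place, keeping a .bak copy, and matches the two occurrences found by the grep above):

$ sed -i.bak '/-S \/bin\/sh/d' backends/backend.conf
$ grep -n '\-S' backends/backend.conf    # should now print nothing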


shanmukhasampath commented on June 27, 2024

Hi Jin,

I have tried deleting the shell option from the qsub command, but I am still getting the OutOfMemory error:

$ grep -n '\-S' backends/backend.conf
$ java -jar -Dconfig.file=backends/backend.conf -Dbackend.default=sge_singularity cromwell-34.jar run atac.wdl -i ${INPUT} -o workflow_opts/sge.json
Picked up _JAVA_OPTIONS: -Xms256M -Xmx1024M -XX:ParallelGCThreads=1
[2018-10-18 08:36:57,41] [info] Running with database db.url = jdbc:hsqldb:mem:da82cf9d-1b27-49a5-b0f1-73167d1a5733;shutdown=false;hsqldb.tx=mvcc
[2018-10-18 08:37:08,89] [info] Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
[2018-10-18 08:37:08,90] [info] [RenameWorkflowOptionsInMetadata] 100%
[2018-10-18 08:37:09,06] [info] Running with database db.url = jdbc:hsqldb:mem:a32d87a5-5b10-431d-96a6-0a777de39055;shutdown=false;hsqldb.tx=mvcc
[2018-10-18 08:37:09,66] [warn] This actor factory is deprecated. Please use cromwell.backend.google.pipelines.v1alpha2.PipelinesApiLifecycleActorFactory for PAPI v1 or cromwell.backend.google.pipelines.v2alpha1.PipelinesApiLifecycleActorFactory for PAPI v2
[2018-10-18 08:37:09,69] [warn] Couldn't find a suitable DSN, defaulting to a Noop one.
[2018-10-18 08:37:09,70] [info] Using noop to send events.
[2018-10-18 08:37:10,23] [info] Slf4jLogger started
[2018-10-18 08:37:10,65] [info] Workflow heartbeat configuration:
{
  "cromwellId" : "cromid-7a7f620",
  "heartbeatInterval" : "2 minutes",
  "ttl" : "10 minutes",
  "writeBatchSize" : 10000,
  "writeThreshold" : 10000
}
[2018-10-18 08:37:10,71] [info] Metadata summary refreshing every 2 seconds.
[2018-10-18 08:37:10,85] [info] WriteMetadataActor configured to flush with batch size 200 and process rate 5 seconds.
[2018-10-18 08:37:10,85] [info] KvWriteActor configured to flush with batch size 200 and process rate 5 seconds.
[2018-10-18 08:37:10,85] [info] CallCacheWriteActor configured to flush with batch size 100 and process rate 3 seconds.
[2018-10-18 08:37:12,55] [info] JobExecutionTokenDispenser - Distribution rate: 50 per 1 seconds.
[2018-10-18 08:37:12,59] [info] SingleWorkflowRunnerActor: Version 34
[2018-10-18 08:37:12,61] [info] SingleWorkflowRunnerActor: Submitting workflow
[2018-10-18 08:37:12,62] [info] JES batch polling interval is 33333 milliseconds
Uncaught error from thread [Uncaught error from thread [cromwell-system-akka.dispatchers.backend-dispatcher-64cromwell-system-akka.dispatchers.api-dispatcher-36]: ]: unable to create new native threadunable to create new native thread, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[ ActorSystem[cromwell-system]
cromwell-system]
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:717)
        at akka.dispatch.forkjoin.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1672)
        at akka.dispatch.forkjoin.ForkJoinPool.signalWork(ForkJoinPool.java:1966)
        at akka.dispatch.forkjoin.ForkJoinPool.externalPush(ForkJoinPool.java:1829)
        at akka.dispatch.forkjoin.ForkJoinPool.execute(ForkJoinPool.java:2955)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinPool.execute(ForkJoinExecutorConfigurator.scala:30)
        at akka.dispatch.ExecutorServiceDelegate.execute(ThreadPoolBuilder.scala:211)
        at akka.dispatch.ExecutorServiceDelegate.execute$(ThreadPoolBuilder.scala:211)
        at akka.dispatch.Dispatcher$LazyExecutorServiceDelegate.execute(Dispatcher.scala:39)
        at akka.dispatch.Dispatcher.registerForExecution(Dispatcher.scala:115)
        at akka.dispatch.Mailbox.run(Mailbox.scala:229)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
        at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:717)
        at akka.dispatch.forkjoin.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1672)
        at akka.dispatch.forkjoin.ForkJoinPool.signalWork(ForkJoinPool.java:1966)
        at akka.dispatch.forkjoin.ForkJoinPool.fullExternalPush(ForkJoinPool.java:1905)
        at akka.dispatch.forkjoin.ForkJoinPool.externalPush(ForkJoinPool.java:1834)
        at akka.dispatch.forkjoin.ForkJoinPool.execute(ForkJoinPool.java:2955)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinPool.execute(ForkJoinExecutorConfigurator.scala:30)
        at akka.dispatch.ExecutorServiceDelegate.execute(ThreadPoolBuilder.scala:211)
        at akka.dispatch.ExecutorServiceDelegate.execute$(ThreadPoolBuilder.scala:211)
        at akka.dispatch.Dispatcher$LazyExecutorServiceDelegate.execute(Dispatcher.scala:39)
        at akka.dispatch.Dispatcher.executeTask(Dispatcher.scala:72)
        at akka.dispatch.MessageDispatcher.unbatchedExecute(AbstractDispatcher.scala:146)
        at akka.dispatch.BatchingExecutor.execute(BatchingExecutor.scala:120)
        at akka.dispatch.BatchingExecutor.execute$(BatchingExecutor.scala:114)
        at akka.dispatch.MessageDispatcher.execute(AbstractDispatcher.scala:86)
        at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)
        at scala.concurrent.impl.Promise$KeptPromise$Kept.onComplete(Promise.scala:368)
        at scala.concurrent.impl.Promise$KeptPromise$Kept.onComplete$(Promise.scala:367)
        at scala.concurrent.impl.Promise$KeptPromise$Successful.onComplete(Promise.scala:375)
        at scala.concurrent.impl.Promise.transform(Promise.scala:29)
        at scala.concurrent.impl.Promise.transform$(Promise.scala:27)
        at scala.concurrent.impl.Promise$KeptPromise$Successful.transform(Promise.scala:375)
        at scala.concurrent.Future.map(Future.scala:288)
        at scala.concurrent.Future.map$(Future.scala:288)
        at scala.concurrent.impl.Promise$KeptPromise$Successful.map(Promise.scala:375)
        at scala.concurrent.Future$.apply(Future.scala:654)
        at cromwell.engine.workflow.workflowstore.WorkflowStoreSubmitActor.processSource(WorkflowStoreSubmitActor.scala:137)
        at cromwell.engine.workflow.workflowstore.WorkflowStoreSubmitActor.$anonfun$processSources$1(WorkflowStoreSubmitActor.scala:102)
        at cats.data.NonEmptyList.map(NonEmptyList.scala:76)
        at cromwell.engine.workflow.workflowstore.WorkflowStoreSubmitActor.processSources(WorkflowStoreSubmitActor.scala:102)
        at cromwell.engine.workflow.workflowstore.WorkflowStoreSubmitActor.cromwell$engine$workflow$workflowstore$WorkflowStoreSubmitActor$$storeWorkflowSources(WorkflowStoreSubmitActor.scala:95)
        at cromwell.engine.workflow.workflowstore.WorkflowStoreSubmitActor$$anonfun$1.applyOrElse(WorkflowStoreSubmitActor.scala:39)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
        at akka.actor.Actor.aroundReceive(Actor.scala:517)
        at akka.actor.Actor.aroundReceive$(Actor.scala:515)
        at cromwell.engine.workflow.workflowstore.WorkflowStoreSubmitActor.aroundReceive(WorkflowStoreSubmitActor.scala:29)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:588)
        at akka.actor.ActorCell.invoke(ActorCell.scala:557)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
        at akka.dispatch.Mailbox.run(Mailbox.scala:225)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
        at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Uncaught error from thread [cromwell-system-akka.dispatchers.backend-dispatcher-63]: unable to create new native thread, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[cromwell-system]
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:717)
        at akka.dispatch.forkjoin.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1672)
        at akka.dispatch.forkjoin.ForkJoinPool.deregisterWorker(ForkJoinPool.java:1795)
        at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:117)

Also, the singularity backend only works under qlogin, not when the run is launched directly:

$ java -jar -Dconfig.file=backends/backend.conf -Dbackend.default=singularity cromwell-34.jar run atac.wdl -i ${INPUT} -o workflow_opts/sge.json
Picked up _JAVA_OPTIONS: -Xms256M -Xmx1024M -XX:ParallelGCThreads=1
[2018-10-18 11:22:24,25] [info] Running with database db.url = jdbc:hsqldb:mem:c93f72c4-808d-4f83-906b-f2446c5b3305;shutdown=false;hsqldb.tx=mvcc
[2018-10-18 11:22:37,18] [info] Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
[2018-10-18 11:22:37,19] [info] [RenameWorkflowOptionsInMetadata] 100%
[2018-10-18 11:22:37,32] [info] Running with database db.url = jdbc:hsqldb:mem:d1602c93-2148-4a61-be1a-c01291141205;shutdown=false;hsqldb.tx=mvcc
[2018-10-18 11:22:37,96] [warn] This actor factory is deprecated. Please use cromwell.backend.google.pipelines.v1alpha2.PipelinesApiLifecycleActorFactory for PAPI v1 or cromwell.backend.google.pipelines.v2alpha1.PipelinesApiLifecycleActorFactory for PAPI v2
[2018-10-18 11:22:37,97] [warn] Couldn't find a suitable DSN, defaulting to a Noop one.
[2018-10-18 11:22:37,98] [info] Using noop to send events.
[2018-10-18 11:22:38,60] [info] Slf4jLogger started
[2018-10-18 11:22:38,96] [info] Workflow heartbeat configuration:
{
  "cromwellId" : "cromid-39f9c5f",
  "heartbeatInterval" : "2 minutes",
  "ttl" : "10 minutes",
  "writeBatchSize" : 10000,
  "writeThreshold" : 10000
}
[2018-10-18 11:22:39,01] [info] Metadata summary refreshing every 2 seconds.
[2018-10-18 11:22:39,06] [info] WriteMetadataActor configured to flush with batch size 200 and process rate 5 seconds.
[2018-10-18 11:22:39,06] [info] CallCacheWriteActor configured to flush with batch size 100 and process rate 3 seconds.
[2018-10-18 11:22:39,06] [info] KvWriteActor configured to flush with batch size 200 and process rate 5 seconds.
[2018-10-18 11:22:41,28] [info] JobExecutionTokenDispenser - Distribution rate: 50 per 1 seconds.
Uncaught error from thread [cromwell-system-akka.actor.default-dispatcher-27]: unable to create new native thread, shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[cromwell-system]
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:717)
        at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1367)
        at akka.dispatch.ExecutorServiceDelegate.execute(ThreadPoolBuilder.scala:211)
        at akka.dispatch.ExecutorServiceDelegate.execute$(ThreadPoolBuilder.scala:211)
        at akka.dispatch.Dispatcher$LazyExecutorServiceDelegate.execute(Dispatcher.scala:39)
        at akka.dispatch.Dispatcher.registerForExecution(Dispatcher.scala:115)
        at akka.dispatch.Dispatcher.dispatch(Dispatcher.scala:55)
        at akka.actor.dungeon.Dispatch.sendMessage(Dispatch.scala:142)
        at akka.actor.dungeon.Dispatch.sendMessage$(Dispatch.scala:136)
[thread 46918402819840 also had an error]
        at akka.actor.ActorCell.sendMessage(ActorCell.scala:431)#
# There is insufficient memory for the Java Runtime Environment to continue.

# Native memory allocation (malloc) failed to allocate 104 bytes for AllocateHeap
        at akka.actor.Cell.sendMessage(ActorCell.scala:352)
        at akka.actor.Cell.sendMessage$(ActorCell.scala:351)
        at akka.actor.ActorCell.sendMessage(ActorCell.scala:431)
        at akka.actor.LocalActorRef.$bang(ActorRef.scala:400)
        at cromwell.services.healthmonitor.HealthMonitorServiceActor.$anonfun$checkSubsystem$1(HealthMonitorServiceActor.scala:87)
        at cromwell.services.healthmonitor.HealthMonitorServiceActor.$anonfun$checkSubsystem$1$adapted(HealthMonitorServiceActor.scala:85)
        at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
        at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
        at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:91)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
        at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:81)
        at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:91)
        at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
        at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
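
Since the same command succeeds under qlogin but dies when launched directly, comparing the limits in the two environments is the obvious next check (a sketch; run the ulimit line in both places):

$ ulimit -u -v     # on the host where the run fails
$ qlogin
$ ulimit -u -v     # on the interactive node, for comparison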
