Comments (14)

delgod avatar delgod commented on July 24, 2024

This may be related to #37.

from ec2-fleet-plugin.

terma avatar terma commented on July 24, 2024

Hi @delgod, the termination fix was released in version 1.1.8. Did it fix this issue?

delgod avatar delgod commented on July 24, 2024

Sorry, because of this bug I decided to use the https://plugins.jenkins.io/ec2 plugin instead (it also has bugs).

terma avatar terma commented on July 24, 2024

@delgod no worries, I will try to test it and close the issue if it was fixed.

delgod avatar delgod commented on July 24, 2024

Steps to reproduce:
set the minimum number of workers to 0
choose an AMI with a very long startup time (like CentOS)
wait until you have 0 workers
run a job which requires plenty of workers (like 50 workers)

Result:
the plugin started ~55 workers (because instance startup took a long time)
after 50 workers were in use, the plugin started to scale down
and killed workers under load

terma avatar terma commented on July 24, 2024

@delgod thanks for the steps, I have reopened the issue to test and try to fix it.

terma avatar terma commented on July 24, 2024

Note: https://jenkins.io/doc/developer/testing/ can be used to emulate the steps as an integration test.

terma avatar terma commented on July 24, 2024

@delgod I'm trying to reproduce this. Question: "under load" means the executors on those workers are still running jobs, correct?

delgod avatar delgod commented on July 24, 2024

yes

terma avatar terma commented on July 24, 2024

@delgod what do you have for the configuration item "Max Idle Minutes Before Scaledown"? Empty, 0?

terma avatar terma commented on July 24, 2024

@delgod I think I found the problem. According to the plugin logic, a node can be removed if the last execution on it was too long ago (per the settings); however, we don't check whether the node has any build in progress. Unfortunately, the Jenkins logic around that is really complicated, so I cannot be 100% sure; I made a fix based on Jenkins examples.

The fix was released in version 1.4.1.

I will keep the issue open; please ping me if you can try it. Thanks a lot.
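
For illustration, the check described above can be sketched as follows. This is a minimal sketch and not the plugin's actual code: `AgentState`, `IdleCheck`, and `shouldTerminate` are hypothetical names, and the Jenkins API is replaced by plain fields so the example stays self-contained.

```java
import java.time.Duration;

/** Hypothetical snapshot of an agent; stands in for the real Jenkins types. */
final class AgentState {
    final long lastActivityMillis; // time of the last recorded activity on the agent
    final int busyExecutors;       // executors currently running a build

    AgentState(long lastActivityMillis, int busyExecutors) {
        this.lastActivityMillis = lastActivityMillis;
        this.busyExecutors = busyExecutors;
    }
}

final class IdleCheck {
    private IdleCheck() {}

    /**
     * Returns true only if the agent runs no build AND its last activity
     * is older than maxIdle. The first condition is the guard that a
     * time-only check lacks.
     */
    static boolean shouldTerminate(AgentState agent, long nowMillis, Duration maxIdle) {
        if (agent.busyExecutors > 0) {
            return false; // a build is in progress, never scale this agent down
        }
        return nowMillis - agent.lastActivityMillis > maxIdle.toMillis();
    }
}
```

With a 5 minute idle limit and a build that has been running for 40 minutes, the guard on busy executors keeps the agent alive, whereas a check based on elapsed time alone would treat it as idle.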

delgod avatar delgod commented on July 24, 2024

The job duration is 30-60 minutes and "Max Idle Minutes Before Scaledown" was 5 minutes.
Sorry, another person now maintains the CI/CD environment, so I cannot recheck my case.

thank you for your work!

terma avatar terma commented on July 24, 2024

No problem, thanks for reporting.

I think the fix applies to your case (idle time less than job duration): the plugin now also checks whether any job is in progress.

piotrplenik avatar piotrplenik commented on July 24, 2024

Hi all,
I discovered this issue today.
I have the "EC2 Fleet" plugin, version "1.12.0".
Found in the logs:

2019-10-17 02:24:45.027+0000 [id=227367]        INFO    h.r.SynchronousCommandTransport$ReaderThread#run: I/O error in channel i-09140f54c5e375ce9
java.io.EOFException
        at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2680)
        at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3155)
        at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:861)
        at java.io.ObjectInputStream.<init>(ObjectInputStream.java:357)
        at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
        at hudson.remoting.Command.readFrom(Command.java:140)
        at hudson.remoting.Command.readFrom(Command.java:126)
        at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Caused: java.io.IOException: Unexpected termination of the channel
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
2019-10-17 02:24:45.028+0000 [id=227367]        INFO    c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: Unexpected EC2 Fleet Spot CompilationServer VS 2010 i-09140f54c5e375ce9 termination,  resubmit
2019-10-17 02:24:45.028+0000 [id=227367]        INFO    c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: Unexpected EC2 Fleet Spot CompilationServer VS 2010 i-09140f54c5e375ce9 termination, resubmit finished
2019-10-17 02:24:53.158+0000 [id=36]    INFO    c.a.j.ec2fleet.EC2FleetCloud#info: EC2 Fleet Spot CompilationServer VS 2010 [CompilationServer_vs2010 CompilationServer_vs2019 Windows] start
2019-10-17 02:24:53.249+0000 [id=36]    INFO    c.a.j.ec2fleet.EC2FleetCloud#info: EC2 Fleet Spot CompilationServer VS 2010 [CompilationServer_vs2010 CompilationServer_vs2019 Windows] fleet instances: [i-09140f54c5e375ce9, i-0e678fdaedd376e0b, i-0954d8250872ce6c0]
2019-10-17 02:24:53.334+0000 [id=36]    INFO    c.a.j.ec2fleet.EC2FleetCloud#info: EC2 Fleet Spot CompilationServer VS 2010 [CompilationServer_vs2010 CompilationServer_vs2019 Windows] described instances: []
2019-10-17 02:24:53.334+0000 [id=36]    INFO    c.a.j.ec2fleet.EC2FleetCloud#info: EC2 Fleet Spot CompilationServer VS 2010 [CompilationServer_vs2010 CompilationServer_vs2019 Windows] jenkins nodes [i-09140f54c5e375ce9, i-0e678fdaedd376e0b, i-0954d8250872ce6c0]
2019-10-17 02:24:53.334+0000 [id=36]    INFO    c.a.j.ec2fleet.EC2FleetCloud#info: EC2 Fleet Spot CompilationServer VS 2010 [CompilationServer_vs2010 CompilationServer_vs2019 Windows] jenkins nodes without instance []
2019-10-17 02:24:53.334+0000 [id=36]    INFO    c.a.j.ec2fleet.EC2FleetCloud#info: EC2 Fleet Spot CompilationServer VS 2010 [CompilationServer_vs2010 CompilationServer_vs2019 Windows] terminated instances [i-09140f54c5e375ce9, i-0e678fdaedd376e0b, i-0954d8250872ce6c0]
2019-10-17 02:24:53.334+0000 [id=36]    INFO    c.a.j.ec2fleet.EC2FleetCloud#info: EC2 Fleet Spot CompilationServer VS 2010 [CompilationServer_vs2010 CompilationServer_vs2019 Windows] new instances []
2019-10-17 02:24:53.334+0000 [id=36]    INFO    c.a.j.ec2fleet.EC2FleetCloud#info: EC2 Fleet Spot CompilationServer VS 2010 [CompilationServer_vs2010 CompilationServer_vs2019 Windows] Fleet (CompilationServer_vs2010 CompilationServer_vs2019 Windows) no longer has the instance i-09140f54c5e375ce9, removing from Jenkins.
2019-10-17 02:24:53.334+0000 [id=36]    INFO    c.a.j.e.IdleRetentionStrategy#check: Check if node idle i-009b2681111979488
2019-10-17 02:24:53.334+0000 [id=227980]        INFO    c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: Unexpected EC2 Fleet Spot CompilationServer VS 2010 i-09140f54c5e375ce9 termination but resubmit disabled, no actions, disableTaskResubmit: false, offline: true, offlineCause: class hudson.slaves.OfflineCause$SimpleOfflineCause
2019-10-17 02:24:53.334+0000 [id=36]    INFO    c.a.j.e.IdleRetentionStrategy#isIdleForTooLong: Instance: nonprod-fusion-packer-compilationserver-full-2019-09-19T06-37-29Z; AMI: ami-08352d341139d1d2a i-009b2681111979488 Age: 150572 Max Age:6000000
2019-10-17 02:24:53.334+0000 [id=36]    INFO    c.a.j.e.IdleRetentionStrategy#check: Check if node idle i-0954d8250872ce6c0
2019-10-17 02:24:53.334+0000 [id=36]    INFO    c.a.j.e.IdleRetentionStrategy#isIdleForTooLong: Instance: EC2 Fleet Spot CompilationServer VS 2010 i-0954d8250872ce6c0 Age: 58878001 Max Age:3600000
2019-10-17 02:24:53.334+0000 [id=36]    INFO    c.a.j.ec2fleet.EC2FleetCloud#info: EC2 Fleet Spot CompilationServer VS 2010 [CompilationServer_vs2010 CompilationServer_vs2019 Windows] Attempting to terminate instance: i-0954d8250872ce6c0

As a result, the node instance has state "suspended" and is not shut down.

Let me know if I can give you more useful information.
