Comments (14)
This may be related to #37.
from ec2-fleet-plugin.
Hi @delgod, the termination fix was released in version 1.1.8. Did it fix this issue?
Sorry, because of this bug I decided to switch to the https://plugins.jenkins.io/ec2 plugin (it also has bugs).
@delgod no worries, I will test it and close the issue if it is fixed.
Steps to Reproduce:
set the minimum number of workers to 0
choose an AMI with a very long startup time (like CentOS)
wait until you have 0 workers
run a job which requires plenty of workers (like 50 workers)
The result was:
the plugin started ~55 workers (because instance startup took a long time)
after 50 workers were in use, the plugin started to scale down
and killed workers under load
@delgod thx for the steps, I have reopened the issue to test and try to fix it.
Note: https://jenkins.io/doc/developer/testing/ can be used to emulate these steps as an integration test.
@delgod I'm trying to reproduce this. Question: "under load" means the executors on those workers were still running jobs, correct?
yes
@delgod what do you have for the configuration item "Max Idle Minutes Before Scaledown"? Empty? 0?
@delgod I think I found the problem. According to the plugin logic, a node can be removed if the last execution on it finished longer ago than the configured idle limit; however, we don't check whether the node has any build in progress. Unfortunately, the Jenkins logic around this is really complicated, so I cannot be 100% sure, but I made a fix based on Jenkins examples.
The fix was released in version 1.4.1.
I will keep the issue open, please ping me if you get a chance to try it, thx a lot.
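For readers following along, the check described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the plugin's actual code: `AgentSnapshot`, `ScaleDownPolicy`, and their fields are hypothetical names invented for this example, and a real implementation would presumably read the same information from Jenkins itself.

```java
/** Hypothetical stand-in for a Jenkins agent; not a real plugin or Jenkins type. */
final class AgentSnapshot {
    final String instanceId;
    final int busyExecutors;     // executors currently running a build
    final long idleStartMillis;  // time the last build on this agent finished

    AgentSnapshot(String instanceId, int busyExecutors, long idleStartMillis) {
        this.instanceId = instanceId;
        this.busyExecutors = busyExecutors;
        this.idleStartMillis = idleStartMillis;
    }
}

/** Hypothetical scale-down policy contrasting the two checks discussed in the thread. */
final class ScaleDownPolicy {
    private final long maxIdleMillis;

    ScaleDownPolicy(long maxIdleMinutes) {
        this.maxIdleMillis = maxIdleMinutes * 60_000L;
    }

    /** Timer-only check: looks only at how long ago the last build finished. */
    boolean idleTimerExpired(AgentSnapshot agent, long nowMillis) {
        return nowMillis - agent.idleStartMillis > maxIdleMillis;
    }

    /** Timer check plus the missing guard: never terminate while a build is running. */
    boolean mayTerminate(AgentSnapshot agent, long nowMillis) {
        if (agent.busyExecutors > 0) {
            return false; // a build is in progress, so the idle timer is irrelevant
        }
        return idleTimerExpired(agent, nowMillis);
    }
}
```

With a 5 minute limit and a job that runs for 30 minutes or more, the timer-only check would flag the agent for removal partway through the job, while `mayTerminate` keeps it alive until its executors are free.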
The job duration is 30-60 minutes and "Max Idle Minutes Before Scaledown" was 5 minutes.
Sorry, another person now maintains the CI/CD environment, so I cannot recheck my case.
Thank you for your work!
np, thx for reporting.
I think the fix applies to your case (idle time shorter than the job duration): the plugin now also checks whether any job is in progress.
Hi all,
I ran into this issue today.
I am running the "EC2 Fleet" plugin, version 1.12.0.
Found in the logs:
2019-10-17 02:24:45.027+0000 [id=227367] INFO h.r.SynchronousCommandTransport$ReaderThread#run: I/O error in channel i-09140f54c5e375ce9
java.io.EOFException
at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2680)
at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3155)
at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:861)
at java.io.ObjectInputStream.<init>(ObjectInputStream.java:357)
at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49)
at hudson.remoting.Command.readFrom(Command.java:140)
at hudson.remoting.Command.readFrom(Command.java:126)
at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63)
Caused: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77)
2019-10-17 02:24:45.028+0000 [id=227367] INFO c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: Unexpected EC2 Fleet Spot CompilationServer VS 2010 i-09140f54c5e375ce9 termination, resubmit
2019-10-17 02:24:45.028+0000 [id=227367] INFO c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: Unexpected EC2 Fleet Spot CompilationServer VS 2010 i-09140f54c5e375ce9 termination, resubmit finished
2019-10-17 02:24:53.158+0000 [id=36] INFO c.a.j.ec2fleet.EC2FleetCloud#info: EC2 Fleet Spot CompilationServer VS 2010 [CompilationServer_vs2010 CompilationServer_vs2019 Windows] start
2019-10-17 02:24:53.249+0000 [id=36] INFO c.a.j.ec2fleet.EC2FleetCloud#info: EC2 Fleet Spot CompilationServer VS 2010 [CompilationServer_vs2010 CompilationServer_vs2019 Windows] fleet instances: [i-09140f54c5e375ce9, i-0e678fdaedd376e0b, i-0954d8250872ce6c0]
2019-10-17 02:24:53.334+0000 [id=36] INFO c.a.j.ec2fleet.EC2FleetCloud#info: EC2 Fleet Spot CompilationServer VS 2010 [CompilationServer_vs2010 CompilationServer_vs2019 Windows] described instances: []
2019-10-17 02:24:53.334+0000 [id=36] INFO c.a.j.ec2fleet.EC2FleetCloud#info: EC2 Fleet Spot CompilationServer VS 2010 [CompilationServer_vs2010 CompilationServer_vs2019 Windows] jenkins nodes [i-09140f54c5e375ce9, i-0e678fdaedd376e0b, i-0954d8250872ce6c0]
2019-10-17 02:24:53.334+0000 [id=36] INFO c.a.j.ec2fleet.EC2FleetCloud#info: EC2 Fleet Spot CompilationServer VS 2010 [CompilationServer_vs2010 CompilationServer_vs2019 Windows] jenkins nodes without instance []
2019-10-17 02:24:53.334+0000 [id=36] INFO c.a.j.ec2fleet.EC2FleetCloud#info: EC2 Fleet Spot CompilationServer VS 2010 [CompilationServer_vs2010 CompilationServer_vs2019 Windows] terminated instances [i-09140f54c5e375ce9, i-0e678fdaedd376e0b, i-0954d8250872ce6c0]
2019-10-17 02:24:53.334+0000 [id=36] INFO c.a.j.ec2fleet.EC2FleetCloud#info: EC2 Fleet Spot CompilationServer VS 2010 [CompilationServer_vs2010 CompilationServer_vs2019 Windows] new instances []
2019-10-17 02:24:53.334+0000 [id=36] INFO c.a.j.ec2fleet.EC2FleetCloud#info: EC2 Fleet Spot CompilationServer VS 2010 [CompilationServer_vs2010 CompilationServer_vs2019 Windows] Fleet (CompilationServer_vs2010 CompilationServer_vs2019 Windows) no longer has the instance i-09140f54c5e375ce9, removing from Jenkins.
2019-10-17 02:24:53.334+0000 [id=36] INFO c.a.j.e.IdleRetentionStrategy#check: Check if node idle i-009b2681111979488
2019-10-17 02:24:53.334+0000 [id=227980] INFO c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: Unexpected EC2 Fleet Spot CompilationServer VS 2010 i-09140f54c5e375ce9 termination but resubmit disabled, no actions, disableTaskResubmit: false, offline: true, offlineCause: class hudson.slaves.OfflineCause$SimpleOfflineCause
2019-10-17 02:24:53.334+0000 [id=36] INFO c.a.j.e.IdleRetentionStrategy#isIdleForTooLong: Instance: nonprod-fusion-packer-compilationserver-full-2019-09-19T06-37-29Z; AMI: ami-08352d341139d1d2a i-009b2681111979488 Age: 150572 Max Age:6000000
2019-10-17 02:24:53.334+0000 [id=36] INFO c.a.j.e.IdleRetentionStrategy#check: Check if node idle i-0954d8250872ce6c0
2019-10-17 02:24:53.334+0000 [id=36] INFO c.a.j.e.IdleRetentionStrategy#isIdleForTooLong: Instance: EC2 Fleet Spot CompilationServer VS 2010 i-0954d8250872ce6c0 Age: 58878001 Max Age:3600000
2019-10-17 02:24:53.334+0000 [id=36] INFO c.a.j.ec2fleet.EC2FleetCloud#info: EC2 Fleet Spot CompilationServer VS 2010 [CompilationServer_vs2010 CompilationServer_vs2019 Windows] Attempting to terminate instance: i-0954d8250872ce6c0
As a result, the node instance ends up in the "suspended" state and is not shut down.
Let me know if I can give you any more useful information.
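When going through logs like the ones above, it can help to pull the two numbers out of the `isIdleForTooLong` lines and compare them directly. The snippet below is only a sketch: `IdleLogLine` is a hypothetical helper written for this example, and it assumes that `Age` and `Max Age` are expressed in the same unit (they look like milliseconds, but the log itself does not say so).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for reading "Age: N Max Age:M" out of a log line. */
final class IdleLogLine {
    // Matches the "Age: <digits> Max Age:<digits>" fragment seen in the logs above.
    private static final Pattern AGE = Pattern.compile("Age: (\\d+) Max Age:\\s*(\\d+)");

    /** Returns {age, maxAge}, or null if the line has no such fragment. */
    static long[] parse(String line) {
        Matcher m = AGE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new long[] { Long.parseLong(m.group(1)), Long.parseLong(m.group(2)) };
    }

    /** True when the logged age is strictly greater than the logged maximum. */
    static boolean exceeded(String line) {
        long[] values = parse(line);
        return values != null && values[0] > values[1];
    }
}
```

Applied to the two lines in the log, the entry for i-009b2681111979488 (150572 against 6000000) is under its limit, while the entry for i-0954d8250872ce6c0 (58878001 against 3600000) is over it, which matches the "Attempting to terminate instance" line that follows.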