Comments (3)
Hi Klaas,
I just double-checked and I can't find anywhere in our code that's making that change. I also checked on a vanilla Slurm cluster deployed with CycleCloud 7.9.3 and don't see the stack size change that you're seeing. When we do modify limits, we put those changes in /etc/security/limits.d/cyclecloud.conf, but the only modifications there are increasing the number of open files. Nothing to do with stack size. Could another package you're installing either via a cluster-init project or via a custom image be adding that?
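To check whether any limits file sets the stack size, something like the following rough sketch could be run on a node. It assumes the standard pam_limits layout (`/etc/security/limits.conf` plus `/etc/security/limits.d/*.conf`); the function names `find_stack_limits` and `default_limit_files` are made up for illustration. It will not catch limits set elsewhere, for example in init scripts or service unit files.

```python
import glob
import os


def find_stack_limits(conf_paths):
    """Return (path, line_no, domain, type, value) for every 'stack' entry."""
    hits = []
    for path in conf_paths:
        with open(path) as handle:
            for line_no, raw in enumerate(handle, start=1):
                # Drop trailing comments, then split on whitespace.
                fields = raw.split("#", 1)[0].split()
                # limits.conf syntax: <domain> <type> <item> <value>
                if len(fields) == 4 and fields[2] == "stack":
                    hits.append((path, line_no, fields[0], fields[1], fields[3]))
    return hits


def default_limit_files(root="/etc/security"):
    """List the pam_limits files that exist under root (assumed layout)."""
    files = [os.path.join(root, "limits.conf")]
    files += sorted(glob.glob(os.path.join(root, "limits.d", "*.conf")))
    return [f for f in files if os.path.isfile(f)]
```

Calling `find_stack_limits(default_limit_files())` returns an empty list when none of those files touch the stack limit, which would point toward a package script or service definition instead.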
Thanks,
-Andy
from cyclecloud-pbspro.
Hi Klaas,
I just realized after my last comment that this is the PBSpro repo, not Slurm (which is where I've spent most of my time lately). Sure enough, I can reproduce this with a fresh CycleCloud PBSpro cluster. I'll look through our recipes more closely, but I'm not aware of any changes we made to limits recently.
When you say the behavior "recently changed", do you know which version you upgraded from? If your previous version shipped an older PBSpro installation, the newer PBSpro packages may have raised the stack limit. The other possibility is that one of the dependency packages was updated and now makes this change.
As a workaround, you could set the stack limit explicitly in your job script. Running ulimit -s <int> with a value below the hard limit lowers the stack size. That may get your Abaqus jobs working again.
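If the job is launched from a Python wrapper rather than a shell script, the same workaround can be sketched with the standard `resource` module. This is a minimal sketch, not something from the CycleCloud recipes: the function name `cap_stack_soft_limit` and the 8192 KiB default are assumptions chosen for illustration. It only lowers the soft limit and leaves the hard limit alone.

```python
import resource


def cap_stack_soft_limit(max_kib=8192):
    """Lower the soft stack limit if it is unlimited or above max_kib.

    Returns the (soft, hard) pair in bytes after the change.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    target = max_kib * 1024  # setrlimit works in bytes, ulimit -s in KiB
    if hard != resource.RLIM_INFINITY:
        # The soft limit may never exceed the hard limit.
        target = min(target, hard)
    if soft == resource.RLIM_INFINITY or soft > target:
        resource.setrlimit(resource.RLIMIT_STACK, (target, hard))
    return resource.getrlimit(resource.RLIMIT_STACK)
```

Child processes started afterwards (for example via `subprocess.run`) inherit the lowered soft limit, so the solver would no longer see an unlimited stack.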
@anhoward I have not updated the CycleCloud version in the last ~2 months, which is why I think this comes from content that is downloaded on the fly.
My last cyclecloud update:
Name : cyclecloud
Version : 7.9.2
Install Date: Thu 23 Jan 2020 10:25:08 AM UTC
I know how to work around the problem. The issue is more that this change seems to be a silent one. I am fairly sure my master install worked after the 7.9.2 update and stopped working a couple of days ago, when I tried out the HB120v2 machines. This first led me to believe the issue was related to the machine type, until I figured out that Abaqus is so stupid it can't deal with unlimited stack size soft limits.
In general, I would be interested in where the modification is coming from. I could not find it in the installation here or in the OS image, which would be my first candidates. Are your 'common' Chef modules also located on GitHub?
Greetings
Klaas