Comments (11)
It should be the broker that rejects the tickets from the SM. I was thinking of the case of multiple controllers: a controller won't know the resource availability, but the broker would.
Yufeng
-------- Original message --------
From: Ilya Baldin [email protected]
Date: 12/21/2015 2:14 PM (GMT-05:00)
To: RENCI-NRIG/orca5 [email protected]
Cc: Yufeng Xin [email protected]
Subject: [orca5] Resource exhaustion behavior (#36)
This may or may not be a bug. I was testing modify by asking for more resources than available.
With nodes, the controller properly rejects if the number of nodes exceeds available. With link bandwidth, I asked for a 100G intra-domain link. The controller accepted it, but then the AM properly rejected it. I think it would be better if the controller rejected it at the start?
This behavior is the same for both modify and create new slice (I tried a new slice with a 100G link with same result). Just wondering why that is.
from orca5.
You are right, it is the broker rejecting, not the AM.
I guess the question then is: why does the controller reject on the number of units? From the user's perspective the behavior is different. With the number of nodes, the controller barfs and no slice is created. With bandwidth, a slice may be created with a failed reservation.
-ilya
Ilya Baldin
Director, Networking Research and Infrastructure
RENCI/UNC Chapel Hill
http://www.renci.org
ping @YufengXin @paul-ruth
This talks about modify and create - I think this is the same issue we just discussed in the atrium. Feel free to leave comments about how else this behavior manifests.
@anriban reports issues with accounting for XXL VMs and receiving misleading info from the broker. This should be looked at after SC.
Note that while the ticket initially describes the problem with the MODIFY operation, the bug manifests itself even with simple slice creation. I suggest looking at slice creation first, before testing modify scenarios.
I don't think I know the best way to fix this yet, but I think the problem is rooted in CloudHandler:runEmbedding(), and the fact that this function only looks at request.getTypeTotalUnits(). If it looked instead/additionally at request.getElements(), it would have access to the number of CPU cores requested.
Looking at this problem a little bit deeper, here are a few of the different layers of code where this issue could be addressed:
- CloudHandler could look at both request.getTypeTotalUnits and request.getElements in order to calculate whether there were enough resources available. Similar changes or refactoring of common code would probably also have to be made in UnboundRequestHandler.
- RequestReservation (request) could be modified to have typeTotalUnits reflect the CPU count, and not just the VM count. typeTotalUnits currently gets the unit count from the DomainResourceType inside NetworkElement.
- NetworkElement could be modified to have its resourceType (DomainResourceType) reflect not the VM count but the CPU count (from the bandwidth map of DomainResource, also referred to as Constraints).
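To make the difference between the two kinds of check concrete, here is a minimal standalone sketch. It does not use the real ORCA classes: VmRequest, fitsByUnits and fitsByCores are hypothetical names invented for illustration, and it assumes (without having confirmed it in the code) that the site's available capacity is expressed in core-sized units.

```java
import java.util.Arrays;
import java.util.List;

public class CapacityCheckSketch {
    /** Hypothetical stand-in for one requested VM; NOT the real NetworkElement. */
    public static final class VmRequest {
        final String name;
        final int cpuCores;

        public VmRequest(String name, int cpuCores) {
            this.name = name;
            this.cpuCores = cpuCores;
        }
    }

    /** Unit-only check: counts each requested VM as one unit, ignoring its size. */
    public static boolean fitsByUnits(List<VmRequest> elements, int availableUnits) {
        return elements.size() <= availableUnits;
    }

    /** Element-aware check: sums the cores requested by every element. */
    public static boolean fitsByCores(List<VmRequest> elements, int availableCores) {
        int requested = 0;
        for (VmRequest e : elements) {
            requested += e.cpuCores;
        }
        return requested <= availableCores;
    }

    public static void main(String[] args) {
        // Two 4-core VMs against a site assumed to have 4 core-sized units free.
        List<VmRequest> request = Arrays.asList(
                new VmRequest("node0", 4),
                new VmRequest("node1", 4));
        System.out.println(fitsByUnits(request, 4)); // true: 2 VMs <= 4 units
        System.out.println(fitsByCores(request, 4)); // false: 8 cores > 4 cores
    }
}
```

In this sketch the unit-only check accepts a request that the element-aware check rejects, which is the kind of mismatch described above for large VMs.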
I'm still trying to figure out which parts of the code are expecting to find a count of the number of Nodes/VMs, compared to expecting to find a count of the number of CPU cores needed.
It seems like a NetworkElement could/would represent a node/VM. So you wouldn't expect to get a number for how many Nodes/VMs it represents -- it would always represent exactly one Node/VM. But there could be a count of CPU cores attached. Should that CPU count be reflected in resourceType?
Should the typeTotalUnits in RequestReservation be a count of Nodes/VMs, or of CPUs?
Do any of these assumptions change if NetworkElement is representing a VLAN instead of a VM? Would it ever represent more than one VLAN?
SimpleVMControl is another likely place to look. The ticket has Properties for the number of CPU Cores / amount of RAM, but the call to getVMs() is not provided with that information.
The one place where this is calculated correctly is in BrokerSimplerUnitsPolicy. The properties are sent to NDLVMInventory, where the constraints are checked against available resources.
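For comparison, here is a rough sketch of what checking requested properties against available resources can look like. This is not the NDLVMInventory implementation: the class InventorySketch, its methods, and the "cores" key are made up for illustration ("memoryCapacity" is borrowed from the broker log only as a label), and the all-or-nothing behavior is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class InventorySketch {
    // Remaining capacity per constraint name (hypothetical keys).
    private final Map<String, Integer> available = new LinkedHashMap<>();

    public InventorySketch(int cores, int memoryMb) {
        available.put("cores", cores);
        available.put("memoryCapacity", memoryMb);
    }

    /** Checks every requested constraint first, then deducts; rejects the whole request on any shortfall. */
    public void allocate(Map<String, Integer> requested) {
        for (Map.Entry<String, Integer> e : requested.entrySet()) {
            int have = available.getOrDefault(e.getKey(), 0);
            if (have < e.getValue()) {
                throw new IllegalStateException("Insufficient " + e.getKey()
                        + ": have " + have + ", requested " + e.getValue());
            }
        }
        // Only reached when every constraint fits, so nothing is partially deducted.
        for (Map.Entry<String, Integer> e : requested.entrySet()) {
            available.merge(e.getKey(), -e.getValue(), Integer::sum);
        }
    }

    public int remaining(String key) {
        return available.getOrDefault(key, 0);
    }

    public static void main(String[] args) {
        InventorySketch inventory = new InventorySketch(8, 16000);
        Map<String, Integer> vm = new HashMap<>();
        vm.put("cores", 4);
        vm.put("memoryCapacity", 12000);

        inventory.allocate(vm); // fits
        try {
            inventory.allocate(vm); // cores still fit, memory does not
        } catch (IllegalStateException ex) {
            System.out.println("rejected: " + ex.getMessage());
        }
    }
}
```

The point of the sketch is only that a per-constraint check needs the per-VM properties (cores, RAM) to be passed along with the request; a check that sees just a unit count cannot make this decision.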
So, CloudHandler is in the Controller? And SimpleVMControl is in the AM? While BrokerSimplerUnitsPolicy is in the Broker? Where would be the best place to fix this?
This is the exception I got
INFO | jvm 1 | 2016/12/12 16:54:13 | 2016-12-12 16:54:13,394-[test-ndl-broker]-{ERROR}-orca.test-ndl-broker-(BrokerSimplerUnitsPolicy.java:281)-Failed to ticket: ncsuvmsite.vm for slice: hincha.orca-issue-36.50
INFO | jvm 1 | 2016/12/12 16:54:13 | java.lang.RuntimeException: Insufficient <memoryCapacity,0> to meet request:12000
INFO | jvm 1 | 2016/12/12 16:54:13 | at orca.plugins.ben.broker.NDLVMInventory.allocate(NDLVMInventory.java:111)
INFO | jvm 1 | 2016/12/12 16:54:13 | at orca.plugins.ben.broker.NDLVMInventory.allocate(NDLVMInventory.java:150)
INFO | jvm 1 | 2016/12/12 16:54:13 | at orca.policy.core.BrokerSimplerUnitsPolicy.ticket(BrokerSimplerUnitsPolicy.java:271)
INFO | jvm 1 | 2016/12/12 16:54:13 | at orca.policy.core.BrokerSimplerUnitsPolicy.ticket(BrokerSimplerUnitsPolicy.java:238)
INFO | jvm 1 | 2016/12/12 16:54:13 | at orca.policy.core.BrokerSimplerUnitsPolicy.allocateTicketing(BrokerSimplerUnitsPolicy.java:153)
INFO | jvm 1 | 2016/12/12 16:54:13 | at orca.policy.core.BrokerSimplerUnitsPolicy.allocate(BrokerSimplerUnitsPolicy.java:77)
INFO | jvm 1 | 2016/12/12 16:54:13 | at orca.shirako.core.Broker.tickHandler(Broker.java:487)
INFO | jvm 1 | 2016/12/12 16:54:13 | at orca.shirako.core.Actor.actorTick(Actor.java:428)
INFO | jvm 1 | 2016/12/12 16:54:13 | at orca.shirako.core.Actor.access$000(Actor.java:60)
INFO | jvm 1 | 2016/12/12 16:54:13 | at orca.shirako.core.Actor$1.process(Actor.java:333)
INFO | jvm 1 | 2016/12/12 16:54:13 | at orca.shirako.core.Actor.actorMain(Actor.java:377)
INFO | jvm 1 | 2016/12/12 16:54:13 | at orca.shirako.core.Actor$4.run(Actor.java:1018)
INFO | jvm 1 | 2016/12/12 16:54:13 | at java.lang.Thread.run(Thread.java:745)
Anything not specifically about intra-domain bandwidth resources has been moved to a new ticket: #88
My understanding is that for the issue of intra-domain bandwidth resource behavior, the Broker is behaving correctly in issuing tickets in the circumstances described at the top of this issue. The Broker is not provided with any information about the internal networking, and so cannot deny any requests on the basis of those constraints.