Comments (13)
Hi everybody,
@GianniLunardi and I are trying to reduce the computation time of DDP (or FDDP) for an MPC application. Thanks to the parallelization of the backward pass, now the bottleneck is the forward pass, which takes up to 75% of the time in some tests. Looking at the implementation of the forward pass I was therefore wondering about whether we could improve a few things.
Do you refer to the parallelisation of computing the derivatives? (This is not the same to parallelise the backward pass). If you are talking about the backward pass, then it might be good to arrange a meeting. In the last 5 months, we have done some progress in paralleling both backward and forward passes. This is still an ongoing research activity.
First, since in case the expected improvement is negative (i.e.
dVexp_ < 0
) the step is not accepted (regardless of the value ofdV
), I don't understand why in those cases we should computedV
with thetryStep
function. Isn't that a waste of time? We could first computedVexp
, check whether it's positive, and only in those cases computedV
.
Yes, we could run tryStep depending on the result of dVexp. I guess we haven't paid so much attention to improving DDP as everyone is using FDDP or Box-FDDP. Feel free to make a PR about it.
Second, the fact that
dVexp_ < 0
means that the local approximation of the Q function is concave. However, this doesn't necessarily mean the step is not gonna improve the cost. So I wonder whether there could be something smarter we could do with the computed step in casedVexp_ < 0
. Maybe simply checking whetherdV > 0
?
I guess you are still talking about the DDP and not FDDP, right? In any case, I am not aware of a smarter strategy. I am happy to read any paper that proposes something better. But, this criteria is already good. I personally spent some decent time evaluating them, including in cases with constraints.
Third, the current API of the solvers does not allow us to change the values of alpha used in the line search. Is there a reason for that? How did people who did MPC with crocoddyl handle this? Testing 10 values of alpha can be computationally demanding and blow up the computation time of your MPC iteration. Ideally we should have an "adaptive line search" that tests as many values of alpha as possible in the remaining time of the current MPC iteration.
The devel branch allows you to change alpha, I don't remember about the master one. Are you sure you cannot change alpha?
On the other hand, we have implemented a simple backtracking procedure. It is indeed worth investing time in finding a better strategy. Do you have any paper that suggests a better alternative that is still cheat to compute?
Finally, can't we parallelize the forward pass computing the
tryStep
method for multiple values of alpha in parallel?
Yes, we can, but as I said below we have been working on a potentially better idea. I could tell you more in a face-to-face meeting, if you're interested in the topic.
from crocoddyl.
Hi @cmastalli
thanks for the fast reply!
Do you refer to the parallelisation of computing the derivatives?
Yes, sorry I took that for granted.
Yes, we could run tryStep depending on the result of dVexp. I guess we haven't paid so much attention to improving DDP as everyone is using FDDP or Box-FDDP. Feel free to make a PR about it.
Got it
Second, the fact that
dVexp_ < 0
means that the local approximation of the Q function is concave. However, this doesn't necessarily mean the step is not gonna improve the cost. So I wonder whether there could be something smarter we could do with the computed step in casedVexp_ < 0
. Maybe simply checking whetherdV > 0
?I guess you are still talking about the DDP and not FDDP, right? In any case, I am not aware of a smarter strategy. I am happy to read any paper that proposes something better. But, this criteria is already good. I personally spent some decent time evaluating them, including in cases with constraints.
Yes, on this the FDDP does something different. I am not aware of any state-of-the-art alternative, I was just wondering whether somebody knew one.
The devel branch allows you to change alpha, I don't remember about the master one. Are you sure you cannot change alpha?
You're right, my bad, you can actually set the list of alpha values even on master.
On the other hand, we have implemented a simple backtracking procedure. It is indeed worth investing time in finding a better strategy. Do you have any paper that suggests a better alternative that is still cheat to compute?
No, but I have this simple idea that seems better than nothing. We should have an "adaptive line search" that tests as many values of alpha as possible in the remaining time of the current MPC iteration. I guess we could do that easily by dynamically changing the list of alpha values after measuring how much time the backward pass has taken.
Finally, can't we parallelize the forward pass computing the
tryStep
method for multiple values of alpha in parallel?Yes, we can, but as I said below we have been working on a potentially better idea. I could tell you more in a face-to-face meeting, if you're interested in the topic.
Sure, I'll contact you by email then.
from crocoddyl.
I know there are some ideas in the numerical-optimisation literature that design the next step length based on performance of previous ones. The whole idea is to get the most 'descending" result, as a bigger step is not necessary the optimal one.
Another possibility is to investigate other conditions, instead of the naive Armijo one.
from crocoddyl.
Coming back to the forward pass of FDDP, I don't understand the logic of the code for the case dVexp<0. The comment says "reducing the gaps by allowing a small increment in the cost value", which makes sense, but if that's the objective, then a cost increment should only be allowed if is_feasible==false
. Or am I missing something? Currently we are accepting a cost increase even if the gaps are already zero.
from crocoddyl.
Coming back to the forward pass of FDDP, I don't understand the logic of the code for the case dVexp<0. The comment says "reducing the gaps by allowing a small increment in the cost value", which makes sense, but if that's the objective, then a cost increment should only be allowed if
is_feasible==false
. Or am I missing something? Currently we are accepting a cost increase even if the gaps are already zero.
(if my understanding is correct) dVexp
is always positive (i.e., descend direction) when the gaps are zero; it boils down to the DDP approach. And this statement comes from the fact that we are using a Cholesky decomposition equipped with a regularisation scheme. Therefore, it won't happen the following sentence "we are accepting a cost increase even if the gaps are already zero". Perhaps, we could double-check this empirically.
However, I agree that we might want to include another test in that if-loop (i.e., is_feasible==false).
Let me do a couple of extra notes that could be useful here:
- We evaluate
dVexp > 0
in DDP as we could have an infeasible warm start. dVext
can have a negative value when there is a gap, and it depends on the rollout itself (see Eq. 19 in https://arxiv.org/pdf/1909.04947.pdf)
from crocoddyl.
Coming back to the forward pass of FDDP, I don't understand the logic of the code for the case dVexp<0. The comment says "reducing the gaps by allowing a small increment in the cost value", which makes sense, but if that's the objective, then a cost increment should only be allowed if
is_feasible==false
. Or am I missing something? Currently we are accepting a cost increase even if the gaps are already zero.(if my understanding is correct)
dVexp
is always positive (i.e., descend direction) when the gaps are zero; it boils down to the DDP approach. And this statement comes from the fact that we are using a Cholesky decomposition equipped with a regularisation scheme. Therefore, it won't happen the following sentence "we are accepting a cost increase even if the gaps are already zero".
Ok, so what you are saying is that, in case Quu
is not positive-definite, even after adding the regularization, then the Cholesky decomposition would fail and the backward pass would be repeated with a larger regularization. Therefore, once we get to the forward pass we can be sure that Quu_inverse
is positive-definite, and therefore dVexp>0
. Is my understanding correct?
1. We evaluate `dVexp > 0` in DDP as we could have an infeasible warm start.
Good to know, I didn't know this.
from crocoddyl.
Ok, so what you are saying is that, in case
Quu
is not positive-definite, even after adding the regularization, then the Cholesky decomposition would fail and the backward pass would be repeated with a larger regularization. Therefore, once we get to the forward pass we can be sure thatQuu_inverse
is positive-definite, and thereforedVexp>0
. Is my understanding correct?
YES!
from crocoddyl.
Ok, so to summarize, the possible actions to take now are the following:
- Improve the forward pass of the DDP solver to avoid the computation of dV in case dVexp<0
- Implement a new forward pass for MPC application, where rather than testing values of alpha until a given condition is satisfied, we test all the user-specified values of alpha and take the one who gave the best result. The number of values of alpha could be automatically adjusted by the solver itself based on how much time is left in the current MPC iteration after computing the backward pass.
Would you agree that the second point is interesting for MPC applications? This could actually be done for all solvers (DDP, FDDP, boxDDP, ...).
from crocoddyl.
Ok, so to summarize, the possible actions to take now are the following:
- Improve the forward pass of the DDP solver to avoid the computation of dV in case dVexp<0
- Implement a new forward pass for MPC application, where rather than testing values of alpha until a given condition is satisfied, we test all the user-specified values of alpha and take the one who gave the best result. The number of values of alpha could be automatically adjusted by the solver itself based on how much time is left in the current MPC iteration after computing the backward pass.
Would you agree that the second point is interesting for MPC applications? This could actually be done for all solvers (DDP, FDDP, boxDDP, ...).
I agree! I am happy if you (or someone in your team) want to take the lead. I would suggest creating a repo with the Python code of the solver. I am also happy to share our equality-constrained DDP solver written in Python. I think this last point would help to try these developments in a wider range of problems, e.g., with inverse dynamics.
Note that there is still a number of modifications to be merged, so I would suggest validating them in Python and writing in c++ when it is convenient. If this sounds good to you, then we could arrange a meeting to discuss more it.
from crocoddyl.
Actually this seems a rather straightforward extension of the current solvers, so I was thinking of implementing it directly in C++. My idea would be to create a child class for each solver (e.g., DDP -> DDP_MPC, FDDP -> FDDP_MPC), where we just reimplement the forward pass according to the logic described above. Could that work, or are you afraid it would clash with the ongoing modifications that haven't been merged yet?
from crocoddyl.
Actually this seems a rather straightforward extension of the current solvers, so I was thinking of implementing it directly in C++. My idea would be to create a child class for each solver (e.g., DDP -> DDP_MPC, FDDP -> FDDP_MPC), where we just reimplement the forward pass according to the logic described above. Could that work, or are you afraid it would clash with the ongoing modifications that haven't been merged yet?
Yes, it is but it is not often straightforward to analyse the behaviour in c++. But I will let you work as you prefer.
It could create conflicts, but if you modify only a few lines (as expected) then it is manageable. The only thing is that I would like to see results in our unmerged equality-constrained solver. But, you could cherry-pick your commit into this branch: https://github.com/cmastalli/crocoddyl/tree/topic/preview-2-rebasing for further tests.
from crocoddyl.
Btw, please do not create a child class for each solver. You can start working from this branch: #1107. If you finish before merging this PR, then you have two options: (1) wait or (2) create a PR to my branch.
from crocoddyl.
Ok, so to summarize, the possible actions to take now are the following:
- Improve the forward pass of the DDP solver to avoid the computation of dV in case dVexp<0
- Implement a new forward pass for MPC application, where rather than testing values of alpha until a given condition is satisfied, we test all the user-specified values of alpha and take the one who gave the best result. The number of values of alpha could be automatically adjusted by the solver itself based on how much time is left in the current MPC iteration after computing the backward pass.
Would you agree that the second point is interesting for MPC applications? This could actually be done for all solvers (DDP, FDDP, boxDDP, ...).
I have implemented many improvements in the step acceptance, regularization and rollout in a private branch, which will be released later. In any case, point 1 is not applicable anymore. In short, the idea of not computing dV
when dVexp < 0
is wrong. When dealing with constraints, we need to balance between cost improvement (optimality) and constraint satisfaction (feasibility). The suggested approach would interfere with this.
The second point still remains valid despite that I DID implement a new approach for the forward pass. But, @andreadelprete you shouldn't work on this as I am working on a major improvement of solvers. There are many features we're pushing forward, and this will take us some time. I'll handle this.
Please accept closing this issue, as there is no need for further discussions or input from others. Thanks!
from crocoddyl.
Related Issues (20)
- Trajectory optimization of a wheeled robot HOT 3
- `ActivationModelQuadraticBarrier` does not respect upper and lower bounds when one of the bounds is `np.inf` HOT 6
- Some errors happen when importing crocoddyl HOT 11
- SIGABRT in Custom DifferentialActionModel HOT 1
- SIGABRT in ActionModelFull HOT 2
- Crocoddyl with pinocchio3-preview branch tests fail HOT 2
- Obstacle avoidance utilizing pairCollision HOT 3
- Time optimal trajectories HOT 1
- Undefined Symbols for architecture arm64 on MacM1 when compiling custom action model HOT 3
- About setting up uneven terrain for bipedal walking
- About setting up uneven terrain for bipedal walking HOT 4
- Wrong dimension for ResidualModelState.set_reference() HOT 1
- Problems trying to switch from default datatypes to floats HOT 10
- Exiting solver based on timeout HOT 1
- Enabling vectorization HOT 2
- Two actuators but solved us are always the same pairwise HOT 2
- Using constraints in custom action models (Python) HOT 4
- Add instructions for building crocoddyl with multi-threading support
- Loss of performance in Crocoddyl parallelization HOT 4
- Possible error in the documentation of DifferentialActionModelAbstractTpl HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from crocoddyl.