A repository tracking the planning for implementation and evaluation of RFC 2229.
See also:
A repository tracking the planning for implementation and evaluation of RFC 2229.
A repository tracking the planning for implementation and evaluation of RFC 2229.
See also:
Updated by nikomatsakis with current status:
In rust-lang/rust#69968, @eddyb landed the "first step" here of introducing a tuple into the upvar types. However, in current rustc, we create that tuple "eagerly", as soon as we create the closure type:
The next step is to move that tuple creation "late", so that it occurs during the "upvar analysis" code, so somewhere around here:
The problem with this is that the current accessor, ClosureSubsts::upvar_tys
, assumes that the upvar types are a tuple with known arity. Thus any calls to upvar_tys
that occur before the upvar analysis code runs will ICE.
Last time, we added an accessor upvar_tuple_ty
that returned the tuple itself (or an unbound type variable), since it turned out that many of the calls to upvar_tys
could be replaced by a call that operated on the tuple. Most notably, in the trait system, for example here:
We changed this to yield the tuple always. The problem with that was that it was effecting the error messages, because the tuple was being introduced into the "backtrace" and we somehow never got things quiite right to remove it.
I still think that is perhaps the right approach, but there is an alternative we could try: we could get the tuple ty and check whether it's been "resolved" yet (via a call to self.infcx.shallow_resolve
). If so, we can put each of its components as we do today, but if not, we can return Ambiguous
, as we do here.
The borrow checker just prints the root variable when printing captures from within the context of the closure, we should update this to be print the captured path precisely.
Ideally, we do this once rust-lang/rust#80635 lands because it introduces a function to prints paths precisely.
As part of refactoring to the new HIR place, we want to introduce a PlaceWithHirId
that separates out the hir_id
field from the rest of Place
.
We can do this by:
PlaceWithHirId
, get that to buildPlaceWithHirId
into a new struct, Place
, get that to buildSince upvars_mentioned
might not be same as paths that are captured, once type check is done, we know the list of captured path and we should be using that instead.
Ensure that field/member are marked borrowed/moved instead of variable hir_id.
This would look something like https://github.com/rust-lang/project-rfc-2229/blob/master/closure_captures.md
Consider the following closure:
let x;
let c = || {
let _ = x;
};
let f: fn() = c;
When capture_disjoint_fields
is enabled, this closure doesn't capture x
, implying c
can be coerced to an FnPtr
.
However, the code that checks if coercion can happen depends on upvars_mentioned
which are computed before the capture analysis and would result in an error.
Check #17
Currently within hir::Place
type information is only stored for the final value/memory being read.
Type information at each intermediate access can be useful later and computing this information again at each Projection could be complicated and costly.
The final structure is explained here:
https://github.com/rust-lang/project-rfc-2229/blob/master/hir-place-target.md
Projection is defined here: https://github.com/rust-lang/rust/blob/master/src/librustc_typeck/mem_categorization.rs#L77
Based on discussion on Zulip we also want the following to close this issue
Over the last year of iterations for edge cases, the capture analysis has gotten pretty complicated, and code even more so.
It would be useful to create a formal specification for the analysis and validate all the cases with the current implementation. It might not be the worst idea to refactor the code to mimic the formal analysis more closely.
One question that remains a bit unclear from the hackmds alone (don't have time to see the recording), is the difference between the suggested let x = x;, which makes the closure not be more than FnOnce if x is not Copy, vs. if false { let _x = x; loop {} }, which just makes the closure capture x without necessarily consuming it when called (same as currently doing let _ = x; in a move closure).
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=90049afee37fc9377ca13c89523d8d4b
Discussion happened here:
Capture analysis happens in two passes:
If we take an example
let s: String; // hir_id_s
let mut p: Point; // his_id_p
let c = || {
println!("{}, s"); // L1
p.x += 10; // L2
println!("{}" , p.y) // L3
println!("{}, p) // L4
drop(s); // L5
};
After 1. the structure we will have is:
{
Place(base: hir_id_s, projections: [], ....) -> (L5, ByValue),
Place(base: hir_id_p, projections: [Field(0, 0)], ...) -> (L2, ByRef(MutBorrow))
Place(base: hir_id_p, projections: [Field(1, 0)], ...) -> (L3, ByRef(ImmutBorrow))
Place(base: hir_id_p, projections: [], ...) -> (L4, ByRef(ImmutBorrow))
After 2
{
hir_id_s -> [
Place(base: hir_id_s, projections: [], ....) -> (L4, ByValue)
],
hir_id_p -> [
Place(base: hir_id_p, projections: [], ...) -> (L2, ByRef(MutBorrow)),
]
The structures in reality look a bit more complex. We want to print them out to stderr in a nicer fashion.
The Place can be printed as <variable_name>[<projecitons>]
, where projections
is a comma-separated list of pretty printed projection kinds.
Deref -> Deref
Field(u32, VariantIndex) -> (x.y); where x corresponds to u32, y corresponds to VariantIndex
Index -> Index
Subsclice -> Subsclice
For each capture, we want to throw an error at the expression that resulted in the capture along with the Place and the CaptureKind.
let s: String;
let mut p: Point;
let c = || {
println!("{}, s");
p.x += 10;
// ^ ERROR: Capturing p[(0.0)] -> MutBorrow
println!("{}" , p.y);
// ^ ERROR: Capturing p[(1.0)] -> ImmBorrow
println!("{}", p);
// ^ ERROR: Capturing p[] -> ImmBorrow
drop(s);
// ^ ERROR: Capturing s[] -> ByValue
};
When this gets extended to min capture
let c = || {
println!("{}, s");
p.x += 10;
// ^ ERROR: Capturing p[(0.0)] -> MutBorrow
// ^ ERROR: MinCapture p[] -> MutBorrow
println!("{}" , p.y);
// ^ ERROR: Capturing p[(1.0)] -> ImmBorrow
println!("{}", p);
// ^ ERROR: Capturing p[] -> ImmBorrow
drop(s);
// ^ ERROR: Capturing s[] -> ByValue
// ^ ERROR: MinCapture s[] -> ByValue
};
Currently, we store and report back only one expression for why a particular path is captured and how the path is captured. We prioritize how the path is captured (i.e on the CaptureKind -- ByValue/ByRef). This works today since the root variable is captured in entirety, however won't result in good diagnostics once the feature is complete.
Consider the following example
let p = Point { x: 10, y: 10 };
let c = || {
p.x += 10;
// ^ E1 ^
println!("{:?}", p);
// ^ E2 ^
};
Here the result of the capture analysis is p
is borrowed by a MutBorrow
. This is because E1
force the MutBorrow
and E2
forces the path p
.
By storing only one piece of information currently E1
(since we prioritize the capture kind) in diagnostics can't express all the information a user might need to fix their code.
We need to update CaptureInformation
(rustc_middle/src/ty/mod.rs
) to contain two expression ids (capture_kind_expr_id
, path_expr_id
). We then use this to emit two notes.
let mut p = Point { x: 10, y: 10 };
let c = || {
p.x += 10;
// ^ mutable borrow of `p` happens here
println!("{:?}", p);
// ^ `p` is captured since it's used here.
};
In case both the capture_kind_expr_id
and path_expr_id
are the same we only emit one-note similar to how we do it today.
let mut p = Point { x: 10, y: 10 };
let c = || {
p = Point { ... } ;
// ^ mutable borrow of `p` happens here
};
Consider the following codesample:
#![feature(capture_disjoint_fields)]
fn main() {
let mut t = (10, 10);
let t1 = (&mut t, 10);
let mut c = || {
t1.0.0 += 10;
};
c();
}
The code sample is valid because t1.0
is a mutable reference to t
. However today this results in an error saying t1.0.0
is not mutable.
This is because the two things that are looked at are t1
or t1.0.0
which are &(&mut, i32)
and i32
respectively which are both immutable.
We need to add extra information in ty::CapturePlace
storing if the place can be mutated.
In the following example (playground):
let foo = [1, 2, 3];
|| let _ = foo[0];
foo[0]
as an overloaded index operationfoo
to be usedfoo
However, it would be plausible:
foo[0]
is not overloaded, and 0
is constant, ecfoo[0]
as a place expression_
foo
Details over here: https://hackmd.io/71qq-IOpTNqzMkPpAI1dVg?view#Migration-Plan
The current test suite doesn't test the feature works properly in case of async closures/blocks.
One example: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=11a567d3091c70dab92effac43a62819
The current HIR Place data structure includes only two kinds of projections:
If you compare that to the MIR Place, you can see that the MIR version has a lot more kinds of projections. I'm making a list of them below. We may not need to add them all to HIR Place, but we probably want most of them. For each one, there is a check-box, and in some cases I may link to additional issues that give more details about how to make a change.
Within the HIR we don't need to be as precise as the MIR.
Field(u32, VariantIdx)
where the u32
is for the field within a struct.Index
.ConstIndex
is used for some special cases where we know the index is a fixed constant -- mostly pattern matching like let [a, b, _] = some_array
. The compiler is (sometimes) smart enough to see that these moves are distinct from (say) let [_, _, c] = some_array
, but to be that smart, it needs to know the indices as constants. We are just going to capture the entire array for now, so we don't care for now.struct X {
a : String,
b : String,
c : String,
}
impl X {
fn from_other_X(o: Self) -> Self {
let c = || {
X { a: "".into(), ..o }
};
c()
}
}
The closure captures o.b
, and o.c
. But in MIR we expect to see *o
which causes issues
If we need to capture a.b.c
and a.b
, then we only capture a.b
With/After #7
Will RFC#2229 technically be a breaking change because of that? It seems possible, if uncommon, for a move closure to currently be Clone because it captured the whole instance, but that capturing subfields could capture things that are not themselves Clone, and thus make the closure no longer Clone. (Maybe one of the fields cannot be safely copied, but the captured things wraps it up into something that can always do so soundly.)
(That can't be a problem with Copy, since all subfields have to be Copy anyway. So it'll just make more closures Copy than used to be.)
Consider the following example
let x : DropType;
let y : DropType;
let p = P { x, y };
let c = || {
move(p.y);
move(p.x);
};
Even though the closure captures all paths starting from p
, without the feature the fields are dropped in order, so p.x
and then p.y
. However, with the feature enabled, we would drop p.y
, and then p.x
because that's the order (use within the closure) in which we would place them in the desugared closure structure.
This pretty printing code could run at different times, it should probably be changed, but not entirely clear to what yet.
Currently the code that does checks for trait bounds that won't be met in 2021 doesn't include the capture kind in the type of the captured path.
For computing the capture kind as per 2018:
We don't take any benefit from ty::RootVariableMinCaptureList
being an index map. Given the nature of how the data is stored, we had to write our function to assign an index to each capture. So instead we should just use a FxHashMap
.
This will probably result in some memory and/or perf savings.
Aside: Maybe rename it? The name suggests that the data structure itself is a list but it returns lists, and isn't itself a list.
As discussed today, we should create a custom lint that checks how big a closure is when using this feature (this can be done by looking at the results of min-capture analysis, even if the feature is disabled) and compares it to the size of a closure using only upvars-mentioned. The lint could trigger if the size increases by more than N% and we can do a crater run to get a sense of how often that occurs. Something like that.
When capture_disjoint_fields
is enabled we see that the compiler ICEs when closure contains something like
let x = 10;
let tup = (1, 2);
let p = Point { x: 10, y: 20 };
let c = || {
let _ = x;
let Point { x, y } = p; // 1
let (_, _) = tup; // 2
};
The problem here is that in case 1 p[0]
and p[1]
are captured, but when we build MIR we need to get information about the initializer which is p
in this case which is missing since p
itself isn't captured.
The issue with 2 is that since nothing from tup
is used, nothing is captured. Nothing will be read when MIR is built either, but to build MIR we need access to tup
, which we don't have.
Ensure ui/closures/2229_closure_analysis/wild_patterns
passes
We should create a capture!(x)
macro that expands to drop(&x)
. This will be used with the migrations.
Today we use the #![feature]
to decide whether to use new-style capture, but we should also consider the 2021 edition.
this should be more descriptive? i.e. CapturePlace instead of CapturedVar?
closure_kind_origin
uses a (Span, Symbol)
to store which Variable and which span resulted in us selected a particular closure kind(Fn
, FnMut
, FnOnce
).
We want to update this to now use (Span, Place)
to point to precise access. This would require us to modify the definition in rustc_middle/src/ty/context.rs
, adjust_closure-kind
, and uses of this map.
Currently upvars don't generate projections: https://github.com/rust-lang/rust/blob/28946b3486d507418b8a4acb92d5e2baae193d65/src/librustc_typeck/mem_categorization.rs#L4860
The Delegate trait currently use PlaceWithHirId
which is composed of Hir Place
and the corresponding expression id.
Even though this is an accurate way of expressing how a Place is used, it can cause confusion during diagnostics.
Eg:
let arr : [String; 5];
let [a, ...] = arr;
^^^ E1 ^^^ = ^^E2^^
Here arr
is moved because of the binding created E1. However, when we point to E1 in diagnostics with the message arr
was moved, it can be confusing. Rather we would like to report E2 to the user.
We want to modify the Delegate methods to provide an expression that should be used for diagnostics.
We have tried to previously achieve something similar before https://github.com/rust-lang/rust/pull/75933/files#diff-1f18144176a6e44f905262991bed987098348cdbd688ac76f8c915ec91b7fd58R316-R319
One way to use AssertUnwindSafe is to do something like test/mir_calls_to_shim.
fn assert_panics<F>(f: F) where F: FnOnce() {
let f = panic::AssertUnwindSafe(f);
let result = panic::catch_unwind(move || {
f.0()
});
if let Ok(..) = result {
panic!("diverging function returned");
}
}
This would not work once capture_disjoint_fields
is enabled since f.0
is captured, essentially rendering the wrapper redundant.
Currently, min capture analysis results are written to MaybeTypeckResults, which is a transient data structure within rustc_typeck.
For the results to be available post-typeck they need to be written out to TypeckResults.
The writeback stage in rustc_typeck/src/check/writeback.rs
needs to be updated to add the analysis results to the final results table.
In this example playground:
let foo = [1, 2, 3];
let c = || match foo { _ => () };
the match foo
winds up registering a read of foo
in the ExprUseVisitor
. Note that the behavior here is different from let _ = foo
. This is also somewhat inconsistent with the borrow checker, which considers match foo { _ => () }
to not be a use (i.e., this code compiles, playground):
let mut x = 1;
let y = &x;
match x { _ => () }
drop(y);
As part of #1, one of the first steps would be to extract out a "Field" projection for HIR Places.
To make this change, we would start by extending the definition of Projection
with a new variant, Field
, based on the MIR variant.
XXX add more details
Compiling tte following code using the old capture logic
#![feature(rustc_attrs)]
struct S(i32);
fn foo() {
let b = Box::new(S(10));
let c = #[rustc_capture_analysis] || {
let _b = b.0;
};
c();
}
prints the following diagnostic when compiled(view on playground):
note: Min Capture b[] -> ImmBorrow
--> src/lib.rs:9:39
|
9 | let c = #[rustc_capture_analysis] || {
| _______________________________________^
10 | | let _b = b.0;
| | ^^^ b[] captured as ImmBorrow here
11 | | };
| |_____^ b[] used here
It should instead look something like:
note: Min Capture b[] -> ImmBorrow
--> /users/cpardy/example2.rs:10:18
|
10 | let _b = b.0;
| ^^^
struct S {
x: String,
y: String
}
fn x(s: &mut S) {
let mut c = move || {
s.x.truncate(0);
};
c();
}
fn main() {
let mut s = S { x: "".into(), y: "".into() };
x(&mut s);
}
Without the feature gate s
which is a &mut S
is moved. However with the feature gate *s.x
is moved which is a String
which behind a &mut S
and can't be moved out.
Initial thoughts:
If the root variable is a shared ref then any path starting at that root variable will be captured by ref or copied (in case the path is Copy and the closure is a move closure)
We will never have a borrow conflict if we reduce the precision
In the case where the path would've been captured by ref, truncating to the root variable and capturing the root variable will result in capture of the same size or maybe cheaper.
f: &F
works is we caputre &*f
which ends up capturing a fat pointer which is two words. Before precise capture we'd capture &f
which is a thin pointer and a single word.In the case where the path would've been copied, truncating to the root variable can possibly result increase in size compared to precise capture.
x: &T
, x.y: u8
then it's cheaper to copy x.y
is cheaper than copying x
We lower paths that start from an Upvar
here.
Essentially what this code does is convert upvars to refer to a field within the desugared ClosureSubsts
. We want to refactor this code as follows:
Thir::Expr::UpvarRef(var_hir_id)
in case the var_hir_id is mentioned within the closure. (This will allow us to deal with the let _ = x
problem in MIR).convert_var
to rustc_mir_build/as_place
Thir::Expr::SelfRef
, it's most likely only for Closures.#[repr(packed)]
struct Foo { x: u8, y: u8 }
fn main(foo: Foo) {
let c = || {
let z: &u8 = unsafe { &foo.x };
};
// cannot capture `foo.x` at the point of closure *creation*, must just capture `foo`
// In your code:
// You'll see a place like `[foo, x]`
// if you have a field projection where the field is defined in a `#[repr(packed)]` struct, you need to capture the struct itself
}
Both these examples currently truncate the capture to the root variable
fn box_mut() {
let mut s = Foo { x: 0 } ;
let px = &mut s;
let bx = Box::new(px);
let c = #[rustc_capture_analysis] move || bx.x += 10;
}
fn mut_box() {
let s = String::new();
let bx = Box::new(s);
let px = &mut bx;
let c = #[rustc_capture_analysis] move || px.truncate(0);
}
#![feature(capture_disjoint_fields)]
#![feature(rustc_attrs)]
struct S(i32, String);
fn main() {
let x = Box::new(S(0, String::new()));
let c = move || {
println!("{}, x.1); // currently capture x.1 by Move
};
}
This should capture x
by move.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.