rust-lang / project-rfc-2229 Goto Github PK

View Code? Open in Web Editor NEW

5.0 27.0 7.0 54 KB

A repository tracking the planning for implementation and evaluation of RFC 2229.

project-rfc-2229's Introduction

project-rfc-2229

A repository tracking the planning for implementation and evaluation of RFC 2229.

project-rfc-2229's People

Contributors

Stargazers

Watchers

Forkers

global-localhost global19 global19-atlassian-net isabella232 tryweirder seanpm2001

project-rfc-2229's Issues

Write blog post encouraging public testing

Always make the "upvar types" a tuple of the actual upvar types (Parent issue#53488)

Updated by nikomatsakis with current status:

In rust-lang/rust#69968, @eddyb landed the "first step" here of introducing a tuple into the upvar types. However, in current rustc, we create that tuple "eagerly", as soon as we create the closure type:

https://github.com/rust-lang/rust/blob/1a87c49e33d87955e28fa92a8d59a17264ac6044/src/librustc_typeck/check/closure.rs#L90-L103

The next step is to move that tuple creation "late", so that it occurs during the "upvar analysis" code, so somewhere around here:

https://github.com/rust-lang/rust/blob/1a87c49e33d87955e28fa92a8d59a17264ac6044/src/librustc_typeck/check/upvar.rs#L194-L202

The problem with this is that the current accessor, ClosureSubsts::upvar_tys, assumes that the upvar types are a tuple with known arity. Thus any calls to upvar_tys that occur before the upvar analysis code runs will ICE.

Last time, we added an accessor upvar_tuple_ty that returned the tuple itself (or an unbound type variable), since it turned out that many of the calls to upvar_tys could be replaced by a call that operated on the tuple. Most notably, in the trait system, for example here:

https://github.com/rust-lang/rust/blob/1a87c49e33d87955e28fa92a8d59a17264ac6044/src/librustc_trait_selection/traits/select.rs#L2239-L2242

We changed this to yield the tuple always. The problem with that was that it was effecting the error messages, because the tuple was being introduced into the "backtrace" and we somehow never got things quiite right to remove it.

I still think that is perhaps the right approach, but there is an alternative we could try: we could get the tuple ty and check whether it's been "resolved" yet (via a call to self.infcx.shallow_resolve). If so, we can put each of its components as we do today, but if not, we can return Ambiguous, as we do here.

Print captures precisely in borrow checker

The borrow checker just prints the root variable when printing captures from within the context of the closure, we should update this to be print the captured path precisely.

Ideally, we do this once rust-lang/rust#80635 lands because it introduces a function to prints paths precisely.

Distinguish `Place` and `PlaceWithHirId`

As part of refactoring to the new HIR place, we want to introduce a PlaceWithHirId that separates out the hir_id field from the rest of Place.

We can do this by:

rename the existing struct to PlaceWithHirId, get that to build
refactor to extract a few fields out of PlaceWithHirId into a new struct, Place, get that to build
Split projections into another structure, adding type information will be done as part of #5

After typechk use closure_captures

Since upvars_mentioned might not be same as paths that are captured, once type check is done, we know the list of captured path and we should be using that instead.

librustc_middle/mir
librustc_mir
librustc_mir_build
librustc_passes/liveness.rs

Use hir::Place in closure_captures

Ensure that field/member are marked borrowed/moved instead of variable hir_id.

This would look something like https://github.com/rust-lang/project-rfc-2229/blob/master/closure_captures.md

Handle `let _ = x` in coercing closures to FnPtr

Consider the following closure:

let x;

let c = || {
    let _ = x;
};

let f: fn() = c;

When capture_disjoint_fields is enabled, this closure doesn't capture x, implying c can be coerced to an FnPtr.

However, the code that checks if coercion can happen depends on upvars_mentioned which are computed before the capture analysis and would result in an error.

Update closure printing

Check #17

Rewrite HIR->MIR lowering to use Places

Replace uses of upvars with captured places

Store Type Information within projections

Currently within hir::Place type information is only stored for the final value/memory being read.

Type information at each intermediate access can be useful later and computing this information again at each Projection could be complicated and costly.

The final structure is explained here:
https://github.com/rust-lang/project-rfc-2229/blob/master/hir-place-target.md

Projection is defined here: https://github.com/rust-lang/rust/blob/master/src/librustc_typeck/mem_categorization.rs#L77

Based on discussion on Zulip we also want the following to close this issue

Remove current type from Place
Add Place::base_ty -- Type information of PlaceBase
Extending Projection with type information. Provide interface to read ty before and after the projection is applied.
Removing ProjectionKind::Deref 's type
Associating PlaceBase::Rvalue with expression hir_id ? (This one is optional for now and can be refactored later, if required.)

Write specification for Capture Analysis

Over the last year of iterations for edge cases, the capture analysis has gotten pretty complicated, and code even more so.

It would be useful to create a formal specification for the analysis and validate all the cases with the current implementation. It might not be the worst idea to refactor the code to mimic the formal analysis more closely.

Migrations might result in closures not being able to be more than FnOnce

One question that remains a bit unclear from the hackmds alone (don't have time to see the recording), is the difference between the suggested let x = x;, which makes the closure not be more than FnOnce if x is not Copy, vs. if false { let _x = x; loop {} }, which just makes the closure capture x without necessarily consuming it when called (same as currently doing let _ = x; in a move closure).
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=90049afee37fc9377ca13c89523d8d4b

Improve logging of capture analysis

Discussion happened here:

Capture analysis happens in two passes:

Collecting information about each place, i.e. the CaptureKind (ByValue/ByRef) and the expression that led to such capture.
Minimum capture analysis, where we find the set of the smallest set of Places (and associated CatureKind) that need to be captured to meet the same requirements as collected by 1.

If we take an example

let s: String;  // hir_id_s
let mut p: Point; // his_id_p  
let c = || {
       println!("{}, s");  // L1
       p.x += 10;  // L2
       println!("{}" , p.y) // L3
       println!("{}, p) // L4
       drop(s);   // L5
};

After 1. the structure we will have is:

{
      Place(base: hir_id_s, projections: [], ....) -> (L5, ByValue),
      Place(base: hir_id_p, projections: [Field(0, 0)], ...) -> (L2, ByRef(MutBorrow))
      Place(base: hir_id_p, projections: [Field(1, 0)], ...) -> (L3, ByRef(ImmutBorrow))
      Place(base: hir_id_p, projections: [], ...) -> (L4, ByRef(ImmutBorrow))

After 2

{
      hir_id_s -> [
           Place(base: hir_id_s, projections: [], ....) -> (L4, ByValue)
      ],
      hir_id_p -> [
           Place(base: hir_id_p, projections: [], ...) -> (L2, ByRef(MutBorrow)),
       ]

The structures in reality look a bit more complex. We want to print them out to stderr in a nicer fashion.

What should the output look like

The Place can be printed as <variable_name>[<projecitons>], where projections is a comma-separated list of pretty printed projection kinds.

Deref -> Deref
Field(u32, VariantIndex) -> (x.y); where x corresponds to u32, y corresponds to VariantIndex
Index -> Index
Subsclice -> Subsclice

For each capture, we want to throw an error at the expression that resulted in the capture along with the Place and the CaptureKind.

let s: String;
let mut p: Point;
let c = || {
       println!("{}, s");
       p.x += 10;
       // ^ ERROR: Capturing p[(0.0)] -> MutBorrow
       println!("{}" , p.y);
       // ^ ERROR: Capturing p[(1.0)] -> ImmBorrow
       println!("{}", p);
       // ^ ERROR: Capturing p[] -> ImmBorrow
       drop(s);
       // ^ ERROR: Capturing s[] -> ByValue
};

When this gets extended to min capture

let c = || {
       println!("{}, s");
       p.x += 10;
       // ^ ERROR: Capturing p[(0.0)] -> MutBorrow
       // ^ ERROR: MinCapture p[] -> MutBorrow
       println!("{}" , p.y);
       // ^ ERROR: Capturing p[(1.0)] -> ImmBorrow
       println!("{}", p);
       // ^ ERROR: Capturing p[] -> ImmBorrow
       drop(s);
       // ^ ERROR: Capturing s[] -> ByValue
       // ^ ERROR: MinCapture s[] -> ByValue
};

Improve diagnostics for Precise Capture

Currently, we store and report back only one expression for why a particular path is captured and how the path is captured. We prioritize how the path is captured (i.e on the CaptureKind -- ByValue/ByRef). This works today since the root variable is captured in entirety, however won't result in good diagnostics once the feature is complete.

Consider the following example

let p = Point { x: 10, y: 10 };

let c = || {
    p.x      += 10;
//  ^ E1 ^
    println!("{:?}",  p);
//                  ^ E2 ^
};

Here the result of the capture analysis is p is borrowed by a MutBorrow. This is because E1 force the MutBorrow and E2 forces the path p.

By storing only one piece of information currently E1 (since we prioritize the capture kind) in diagnostics can't express all the information a user might need to fix their code.

We need to update CaptureInformation (rustc_middle/src/ty/mod.rs) to contain two expression ids (capture_kind_expr_id, path_expr_id). We then use this to emit two notes.

let mut p = Point { x: 10, y: 10 };

let c = || {
    p.x      += 10;
//  ^ mutable borrow of `p` happens here
    println!("{:?}",  p);
//                    ^ `p` is captured since it's used here.
};

In case both the capture_kind_expr_id and path_expr_id are the same we only emit one-note similar to how we do it today.

let mut p = Point { x: 10, y: 10 };

let c = || {
    p  = Point { ... } ;
//  ^ mutable borrow of `p` happens here
};

Handle mutablity because of a mut reference

Consider the following codesample:

#![feature(capture_disjoint_fields)]

fn main() {
    let mut t = (10, 10);

    let t1 = (&mut t, 10);

    let mut c = || {
        t1.0.0 += 10;
    };

    c();
}

The code sample is valid because t1.0 is a mutable reference to t. However today this results in an error saying t1.0.0 is not mutable.

This is because the two things that are looked at are t1 or t1.0.0 which are &(&mut, i32) and i32 respectively which are both immutable.

We need to add extra information in ty::CapturePlace storing if the place can be mutated.

base of an index projection is always captured

In the following example (playground):

let foo = [1, 2, 3];
|| let _ = foo[0];

ExprUseVisior sees foo[0] as an overloaded index operation
It therefore considers foo to be used
So the closure will capture foo

However, it would be plausible:

because foo[0] is not overloaded, and 0 is constant, ec
we could consider foo[0] as a place expression
and thus ignore it because it is assigned to _
and thus not capure foo

Do a crater run to test migration

Implement migrations for RFC 2229

Details over here: https://hackmd.io/71qq-IOpTNqzMkPpAI1dVg?view#Migration-Plan

Basic first pass (rust-lang/rust#80629)
Insignificant destructors
Precise migrations (rust-lang/rust#81912)

Add tests for async closures

The current test suite doesn't test the feature works properly in case of async closures/blocks.

One example: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=11a567d3091c70dab92effac43a62819

Make hir::Place projections more precise

The current HIR Place data structure includes only two kinds of projections:

Deref
Other

If you compare that to the MIR Place, you can see that the MIR version has a lot more kinds of projections. I'm making a list of them below. We may not need to add them all to HIR Place, but we probably want most of them. For each one, there is a check-box, and in some cases I may link to additional issues that give more details about how to make a change.

Within the HIR we don't need to be as precise as the MIR.

Downcast is use for enum variants and in Hir we can combine it with field Field(u32, VariantIdx) where the u32 is for the field within a struct.
Since we are capturing the complete array even if a single element is captured we don't need to preserve index information and store that information as just Index.
ConstIndex is used for some special cases where we know the index is a fixed constant -- mostly pattern matching like let [a, b, _] = some_array. The compiler is (sometimes) smart enough to see that these moves are distinct from (say) let [_, _, c] = some_array, but to be that smart, it needs to know the indices as constants. We are just going to capture the entire array for now, so we don't care for now.

Struct Update/ Functional record update syntax error

struct X {
    a : String,
    b : String,
    c : String,
}

impl X {
    fn from_other_X(o: Self) -> Self {
        let c = || {
            X { a: "".into(), ..o }
        };
        c()
    }
}

The closure captures o.b, and o.c. But in MIR we expect to see *o which causes issues

Ensure minimal places are captured

If we need to capture a.b.c and a.b, then we only capture a.b

With/After #7

Possible closure not being Clone anymore

Will RFC#2229 technically be a breaking change because of that? It seems possible, if uncommon, for a move closure to currently be Clone because it captured the whole instance, but that capturing subfields could capture things that are not themselves Clone, and thus make the closure no longer Clone. (Maybe one of the fields cannot be safely copied, but the captured things wraps it up into something that can always do so soundly.)

(That can't be a problem with Copy, since all subfields have to be Copy anyway. So it'll just make more closures Copy than used to be.)

https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Capture-disjoint-fields.20and.20Clone/near/216698416

Drop order affected because of use order

Consider the following example

let x : DropType;
let y : DropType;

let p = P { x, y };
let c = || {
   move(p.y);
   move(p.x);
};

Even though the closure captures all paths starting from p, without the feature the fields are dropped in order, so p.x and then p.y. However, with the feature enabled, we would drop p.y, and then p.x because that's the order (use within the closure) in which we would place them in the desugared closure structure.

Update pretty.rs

This pretty printing code could run at different times, it should probably be changed, but not entirely clear to what yet.

Investigate what this should look like and define next steps.
#15

Account for capture kind in trait migration

Currently the code that does checks for trait bounds that won't be met in 2021 doesn't include the capture kind in the type of the captured path.

For computing the capture kind as per 2018:

If move closure, then capture kind = ByValue
else, Max capture kind according to Imm < UniqueImm < Mut < ByValue

Convert `ty::RootVariableMinCaptureList` to be a FxHashMap

We don't take any benefit from ty::RootVariableMinCaptureList being an index map. Given the nature of how the data is stored, we had to write our function to assign an index to each capture. So instead we should just use a FxHashMap.

This will probably result in some memory and/or perf savings.

Aside: Maybe rename it? The name suggests that the data structure itself is a list but it returns lists, and isn't itself a list.

assessing size overhead from this feature

As discussed today, we should create a custom lint that checks how big a closure is when using this feature (this can be done by looking at the results of min-capture analysis, even if the feature is disabled) and compares it to the size of a closure using only upvars-mentioned. The lint could trigger if the size increases by more than N% and we can do a crater run to get a sense of how often that occurs. Something like that.

Handle patterns within closures with the feature gate enabled

When capture_disjoint_fields is enabled we see that the compiler ICEs when closure contains something like

let x = 10;
let tup = (1, 2);
let p = Point { x: 10, y: 20 };

let c = || {
    let _ = x;
    let Point { x, y } = p; // 1
    let (_, _) = tup; // 2
};

The problem here is that in case 1 p[0] and p[1] are captured, but when we build MIR we need to get information about the initializer which is p in this case which is missing since p itself isn't captured.

The issue with 2 is that since nothing from tup is used, nothing is captured. Nothing will be read when MIR is built either, but to build MIR we need access to tup, which we don't have.

Ensure ui/closures/2229_closure_analysis/wild_patterns passes

`capture!` macro

We should create a capture!(x) macro that expands to drop(&x). This will be used with the migrations.

Tie use of feature to the current edition

Today we use the #![feature] to decide whether to use new-style capture, but we should also consider the 2021 edition.

make validity more descriptive

https://github.com/sexxi-goose/rust/blob/8cb92072c8827de431c3cd932389acc67d410107/compiler/rustc_mir/src/interpret/validity.rs#L244

this should be more descriptive? i.e. CapturePlace instead of CapturedVar?

Update TypeckResuts::closure_kind_origin to use Place

closure_kind_origin uses a (Span, Symbol) to store which Variable and which span resulted in us selected a particular closure kind(Fn, FnMut, FnOnce).

We want to update this to now use (Span, Place) to point to precise access. This would require us to modify the definition in rustc_middle/src/ty/context.rs, adjust_closure-kind, and uses of this map.

Improve explanations in the migration lint

See https://zulip-archive.rust-lang.org/189812tcompilerwgrfc2229/79230improvingthemigrationlint.html

Generate projections for upvars

Currently upvars don't generate projections: https://github.com/rust-lang/rust/blob/28946b3486d507418b8a4acb92d5e2baae193d65/src/librustc_typeck/mem_categorization.rs#L4860

Provide diagnostic suggestion in ExprUseVisitor Delegate

The Delegate trait currently use PlaceWithHirId which is composed of Hir Place and the corresponding expression id.

Even though this is an accurate way of expressing how a Place is used, it can cause confusion during diagnostics.

Eg:

let arr : [String; 5];

let [a, ...]     =   arr;
 ^^^ E1 ^^^      =  ^^E2^^

Here arr is moved because of the binding created E1. However, when we point to E1 in diagnostics with the message arr was moved, it can be confusing. Rather we would like to report E2 to the user.

We want to modify the Delegate methods to provide an expression that should be used for diagnostics.

We have tried to previously achieve something similar before https://github.com/rust-lang/rust/pull/75933/files#diff-1f18144176a6e44f905262991bed987098348cdbd688ac76f8c915ec91b7fd58R316-R319

Auto traits with closures

One way to use AssertUnwindSafe is to do something like test/mir_calls_to_shim.

fn assert_panics<F>(f: F) where F: FnOnce() {
    let f = panic::AssertUnwindSafe(f);
    let result = panic::catch_unwind(move || {
        f.0()
    });
    if let Ok(..) = result {
        panic!("diverging function returned");
    }
}

This would not work once capture_disjoint_fields is enabled since f.0 is captured, essentially rendering the wrapper redundant.

Use min capture results everywhere

Currently, min capture analysis results are written to MaybeTypeckResults, which is a transient data structure within rustc_typeck.

For the results to be available post-typeck they need to be written out to TypeckResults.

The writeback stage in rustc_typeck/src/check/writeback.rs needs to be updated to add the analysis results to the final results table.

Mark list of stdlib types as insignificant_dtor

matching is always considered a use of the place, even with `_` patterns

In this example playground:

    let foo = [1, 2, 3];
    let c = || match foo { _ => () };

the match foo winds up registering a read of foo in the ExprUseVisitor. Note that the behavior here is different from let _ = foo. This is also somewhat inconsistent with the borrow checker, which considers match foo { _ => () } to not be a use (i.e., this code compiles, playground):

let mut x = 1;
let y = &x;
match x { _ => () }
drop(y);

Update error logging in borrow code

Depends on #22 and #10

Refactoring HIR Place: Fields

As part of #1, one of the first steps would be to extract out a "Field" projection for HIR Places.

To make this change, we would start by extending the definition of Projection with a new variant, Field, based on the MIR variant.

XXX add more details

Incorrect diagnostic when the feature isn't enabled

Compiling tte following code using the old capture logic

#![feature(rustc_attrs)]

struct S(i32);

fn foo() {
    let b = Box::new(S(10));
    
    let c = #[rustc_capture_analysis] || {
        let _b = b.0;
    };
    
    c();
}

prints the following diagnostic when compiled(view on playground):

note: Min Capture b[] -> ImmBorrow
  --> src/lib.rs:9:39
   |
9  |       let c = #[rustc_capture_analysis] || {
   |  _______________________________________^
10 | |         let _b = b.0;
   | |                  ^^^ b[] captured as ImmBorrow here
11 | |     };
   | |_____^ b[] used here

It should instead look something like:

note: Min Capture b[] -> ImmBorrow
  --> /users/cpardy/example2.rs:10:18
   |
10 |         let _b = b.0;
   |                  ^^^

Move out of reference in move closures

struct S {
    x: String,
    y: String
}

fn x(s: &mut S) {
    let mut c = move || {
        s.x.truncate(0);
    };
    c();
}


fn main() {
    let mut s = S { x: "".into(), y: "".into() };
    x(&mut s);
}

Without the feature gate s which is a &mut S is moved. However with the feature gate *s.x is moved which is a String which behind a &mut S and can't be moved out.

Size Optimization: Capture root varaiable if it's a sharef ref

Initial thoughts:

If the root variable is a shared ref then any path starting at that root variable will be captured by ref or copied (in case the path is Copy and the closure is a move closure)
We will never have a borrow conflict if we reduce the precision
In the case where the path would've been captured by ref, truncating to the root variable and capturing the root variable will result in capture of the same size or maybe cheaper.
- It's cheaper in case of function pointers. Right now the way preicse capture analysis would capture f: &F works is we caputre &*f which ends up capturing a fat pointer which is two words. Before precise capture we'd capture &f which is a thin pointer and a single word.
In the case where the path would've been copied, truncating to the root variable can possibly result increase in size compared to precise capture.
- eg: Consider x: &T, x.y: u8 then it's cheaper to copy x.y is cheaper than copying x

Move Upvar lowering from THIR to MIR

We lower paths that start from an Upvar here.

Essentially what this code does is convert upvars to refer to a field within the desugared ClosureSubsts. We want to refactor this code as follows:

Instead of doing the above desugaring we want to return Thir::Expr::UpvarRef(var_hir_id) in case the var_hir_id is mentioned within the closure. (This will allow us to deal with the let _ = x problem in MIR).
Move the logic currently in convert_var to rustc_mir_build/as_place
Maybe remove Thir::Expr::SelfRef, it's most likely only for Closures.

Handle `#[repr(packed)]` with feature gate

#[repr(packed)]
struct Foo { x: u8, y: u8 }

fn main(foo: Foo) {
    let c = || {
        let z: &u8 = unsafe { &foo.x };
    
    };
    // cannot capture `foo.x` at the point of closure *creation*, must just capture `foo`
    // In your code:
    // You'll see a place like `[foo, x]`
    // if you have a field projection where the field is defined in a `#[repr(packed)]` struct, you need to capture the struct itself
}

Don't truncate Derefs if the data is not moved into the closure

Both these examples currently truncate the capture to the root variable

fn box_mut() {
    let mut s = Foo { x: 0 } ;
    
    let px = &mut s;
    let bx = Box::new(px);
    
    
    let c = #[rustc_capture_analysis] move || bx.x += 10;
}

fn mut_box() {
    let s = String::new();
    
    let bx = Box::new(s);
    let px = &mut bx;
    
    
    
    let c = #[rustc_capture_analysis]  move || px.truncate(0);
}

Truncate box deref in move closures

#![feature(capture_disjoint_fields)]
#![feature(rustc_attrs)]

struct S(i32, String);

fn main() {
    let x = Box::new(S(0, String::new()));
    
    let c = move || {
        println!("{}, x.1); // currently capture x.1 by Move
    };
}

This should capture x by move.