Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implicit calls to deref and deref_mut are a footgun with raw pointers #131847

Open
AngelicosPhosphoros opened this issue Oct 17, 2024 · 12 comments
Open
Labels
C-discussion Category: Discussion or questions that doesn't represent real issues.

Comments

@AngelicosPhosphoros
Copy link
Contributor

AngelicosPhosphoros commented Oct 17, 2024

I tried this code (which models library code with invariants for all valid instances of A but without all the details):

use std::ops::Deref;
use std::mem::MaybeUninit;

struct A{
    b: MaybeUninit<B>,
}

struct B{
    field: String,
}

impl Deref for A {
    type Target = B;
    fn deref(&self)->&B{
        println!("Called deref!");
        // SAFETY
        // This deref is expected to be called only outside of the current module and field b is private
        // so as long as all instances of A have it initialized before being accessible outside,
        // this code is sound.
        unsafe {
            self.b.assume_init_ref()
        }
    }
}

// Note: while this is `main`, it is intended to show code that used for
// initialization of new instance of A. I omitted wrapping of A in a module
// and using separate factory method for brewity.
fn main() {
    let mut a: MaybeUninit<A> = MaybeUninit::uninit();
    unsafe {
        let p = a.as_ptr();
        // The only way to get pointer to field of pointee.
        let b_ptr = &raw const (*p).b;
        // However, it may cause unintended calls to a deref if we are not vigilant
        // and mistakenly type name of a field of a type our pointee dereferences to.
        let field_ptr = &raw const (*p).field;
    }
}

I expected to see this happen: this code should be rejected because it causes implicit calls to Deref::deref which assumes *p to be initialized.

Instead, this happened: code compiles and prints Called deref! when executed.

Meta

rustc --version --verbose:

rustc 1.84.0-nightly (9322d183f 2024-10-14)
binary: rustc
commit-hash: 9322d183f45e0fd5a509820874cc5ff27744a479
commit-date: 2024-10-14
host: x86_64-unknown-linux-gnu
release: 1.84.0-nightly
LLVM version: 19.1.1
@AngelicosPhosphoros AngelicosPhosphoros added the C-bug Category: This is a bug. label Oct 17, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Oct 17, 2024
@AngelicosPhosphoros
Copy link
Contributor Author

I think, it should call derefs only this way:

(**p) <-- Explicitly dereference object again.
(*).deref() <-- Explicitly call deref method.

@tgross35 tgross35 added the I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness label Oct 17, 2024
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Oct 17, 2024
@carbotaniuman
Copy link
Contributor

While an unfortunate footgun, this is not a soundness issue. Removing the unsafe shows that only the first line is (correctly) exempt from needing an unsafe block, while the second line does require one.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=7545201409c69d43e6b2d9700b04aef3

@VorpalBlade
Copy link

It would make sense to have a lint for this still (clippy or rustc).

@fmease
Copy link
Member

fmease commented Oct 17, 2024

Might get caught by #123239, haven't checked.

@compiler-errors
Copy link
Member

This is not unsound, so I'm gonna untag it as such.

@compiler-errors compiler-errors removed I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Oct 17, 2024
@traviscross
Copy link
Contributor

traviscross commented Oct 17, 2024

@rustbot labels -C-bug +C-discussion

@rustbot rustbot added C-discussion Category: Discussion or questions that doesn't represent real issues. and removed C-bug Category: This is a bug. labels Oct 17, 2024
@jieyouxu jieyouxu removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Oct 18, 2024
@workingjubilee workingjubilee changed the title Implicit calls to deref and deref_mut must be forbidden in raw pointer operations. Implicit calls to deref and deref_mut are a footgun with raw pointers Oct 18, 2024
@RalfJung
Copy link
Member

I would argue you asked for trouble by writing that Deref instance. That instance is unsound, it can be used to cause UB in safe code:

fn main() {
    let a = A { b: MaybeUninit::uninit() };
    let val = &a.field;
    println!("{val}");
}

So, the fix is to not write such Deref instances. I am not convinced a lint is justified. (Well, ideally we'd have a lint against unsound code, but alas, that's not possible. ;)

@RalfJung
Copy link
Member

because it causes implicit calls to Deref::deref which assumes *p to be initialized.

deref is not allowed to make such assumptions. It is a safe function, and as such must not make any assumptions beyond being given a well-typed argument.

Cc @rust-lang/opsem

@ChayimFriedman2
Copy link
Contributor

@RalfJung The problem (AFAIK) is not that Rust is unsound. Rust is behaving correctly. The problem is that it's too easy to make mistakes.

Making a safe Deref is fine if the type has an invariant that it's always initialized, but private code may break this invariant temporarily, and it is too easy to invoke Deref by mistake then. Ultimately, we should be careful when writing unsafe code, and the language could assist with that.

@AngelicosPhosphoros
Copy link
Contributor Author

AngelicosPhosphoros commented Oct 18, 2024

@RalfJung

I would argue you asked for trouble by writing that Deref instance. That instance is unsound, it can be used to cause UB in safe code:

Your example is possible only inside module that declares A because field b is private. If all ways to acquire instance of A outside of module guarantee that b is initialized, the code is sound.

By your logic, Vec::deref (which calls Vec::as_slice) is unsound because I can write code like this inside alloc::vec module:

let my_vec: Vec<i32> = Vec { len: 200, buf: RawVec::new() };
let slice: &[i32] = &*my_vec;

However, we actually know that deref implementation of Vec is sound because we know that fields len and buf cannot be modified outside of alloc::vec module and that we write only valid values inside the module.

Similarly, A::deref in my example expects to be called only with valid instances. I didn't include all the code for modules and making everything private in the example because we use minimal examples for bug reports.

@ChayimFriedman2 Argues same thing as me, but more eloquently.

@RalfJung
Copy link
Member

RalfJung commented Oct 18, 2024

Yes we use minimal examples, but this was too minimal. ;) The safety invariant and abstraction barrier are a key part of the argument here. Given the example has neither and there were no comments to indicate anything like that, it is reasonable to assume there is no such invariant. (Well, there is a SAFETY comment, but it didn't explain why this invariant is supposed to hold, given that the code clearly breaks it. Vec is inside a carefully designed mod with carefully chosen pub functions. You reduced all that away which then makes it not the same situation as Vec at all.)

Thanks for clarifying!

Would be good to add a comment in main explaining that this is meant to model code inside the library and hence is allowed to temporarily break the invariant.

@AngelicosPhosphoros
Copy link
Contributor Author

I added comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-discussion Category: Discussion or questions that doesn't represent real issues.
Projects
None yet
Development

No branches or pull requests