Recently I found some cursed Rust code and decided to make a little joke/question on Twitter. In the tweet I presented some unusual code and asked “how could it compile?”. The solution was found quite fast by @Veykril (kudos to them!), and in this post I want to explain it in detail.

The joke/question

So in the tweet I presented the following code:

fn main() {
    for x in lib::iter::<u32> {
        let _: u32 = x; // assert that `x` has type u32
    }
    
    for y in lib::iter::<String>() {
        let _: String = y;
    }
    
    for z in lib::iter { // infer type
        let _: &'static str = z;
    }
    
    for k in lib::iter() {
        let _: f64 = k;
    }
}

That compiles with a stable compiler, if you have the right lib! The interesting bit here is of course that the function lib::iter can be called with or without parentheses. This is normally not possible, so what is going on?

First of all, some low-hanging fruit: for loops in Rust desugar to something like this:

// Original
for x in i { f(x); }
// Desugaring
{
    let mut iter = IntoIterator::into_iter(i);
    while let Some(x) = iter.next() { f(x); }
    // while let itself desugars to loop+match,
    // but that's not the point
}

So before iteration starts, into_iter is called, which allows you to pass any type implementing IntoIterator, not just Iterators. That’s just to say that we need iter and iter() to somehow evaluate to type(s) that implement IntoIterator.
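To make the desugaring a bit more tangible, here is a minimal sketch of a type that is not an iterator itself but still works in a for loop. The names Countdown and collect_countdown are made up for this illustration and have nothing to do with lib.

```rust
// Hypothetical example type: it is NOT an `Iterator`, it only knows how to
// turn itself into one.
struct Countdown {
    from: u32,
}

impl IntoIterator for Countdown {
    type Item = u32;
    type IntoIter = std::iter::Rev<std::ops::RangeInclusive<u32>>;

    fn into_iter(self) -> Self::IntoIter {
        (1..=self.from).rev()
    }
}

// Hypothetical helper that records what the loop sees.
fn collect_countdown(from: u32) -> Vec<u32> {
    let mut seen = Vec::new();
    // `for` first calls `IntoIterator::into_iter(Countdown { from })`,
    // then repeatedly calls `next()` on the result.
    // (The parentheses are needed because a struct literal is not allowed
    // directly in the loop head.)
    for x in (Countdown { from }) {
        seen.push(x);
    }
    seen
}

fn main() {
    println!("{:?}", collect_countdown(3)); // prints "[3, 2, 1]"
}
```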

As a side note: on nightly there is a trait IntoFuture that is just like IntoIterator, but for futures and is used for .await desugaring. You can do all the same stuff with it, so async functions with optional () are possible too:

#![feature(into_future)]
async fn _f() {
    let _: String = lib2::fut::<String>.await;
    let _: [u8; 2] = lib2::fut::<[u8; 2]>().await;
    let _: &str = lib2::fut.await;
    let _: u128 = lib2::fut().await;
}

BTW I think it should be stabilized soon, so keep an eye on the tracking issue (or don’t (I’m just very excited for this feature)).

But this all still leaves us with a question: how can functions be called without parentheses?

Tempting idea that doesn’t work

🐸

The title of this section is concerning, but why can’t we just make a const+impl IntoIterator for fn()?

So the idea is to write something like this:

const iter: fn() -> Iter = || Iter;
impl IntoIterator for fn() -> Iter { /* not important */ }

struct Iter;
impl Iterator for Iter {}

Then for _ in iter {} would work because of the IntoIterator impl and for _ in iter() {} would work because iter’s type is a function pointer that returns a type that implements Iterator.

But… this doesn’t work for the following two reasons:

  1. You can’t implement a foreign trait (IntoIterator) for a foreign type (fn() -> _), and the standard library doesn’t (yet?) implement IntoIterator for function pointers. (You could patch std, but then this won’t work with the stable compiler.)
  2. Constants can’t have generic parameters! So iter::<T> won’t work.

So, we need to find something else to (ab)use.

Hack #1

🐸

Uh? There are multiple hacks at play here already?

Yes! There is a hack for iter::<T> and for iter::<T>(), we’ll start with the former.

iter::<T> looks a lot like a unit structure with a generic parameter.

🐸

Maybe it is a unit structure with a generic parameter?

If only things were that simple… In Rust you need to use all generic parameters in the type, or else your code won’t compile:

struct S<T>(u8);
//~^ error: parameter `T` is never used
//~| help: consider removing `T`, referring to it in a field, or using a marker such as `PhantomData`
//~| help: if you intended `T` to be a const parameter, use `const T: usize` instead

This is because the compiler wants to infer the variance of all parameters. Since a unit structure, by definition, doesn’t have any fields, you can’t use generic parameters in it!

🐸

Wait, compiler mentioned PhantomData, isn’t that a unit structure with a generic parameter?…

It is!
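A quick sketch of what that means in practice: PhantomData can be written both as a type and as a value, and it is the usual fix for the “parameter T is never used” error. The struct S mirrors the one above; the make helper is made up for this illustration.

```rust
use std::marker::PhantomData;

// `T` now counts as "used" thanks to the zero-sized marker field.
struct S<T>(u8, PhantomData<T>);

// Hypothetical constructor, only here for the demo.
fn make<T>(byte: u8) -> S<T> {
    // Here `PhantomData` is a value expression, its type is inferred.
    S(byte, PhantomData)
}

fn main() {
    // The same name works in type position and in value position,
    // just like a unit struct with a generic parameter would.
    let marker: PhantomData<String> = PhantomData::<String>;
    let s: S<String> = make(7);
    println!("{} {}", s.0, std::mem::size_of_val(&marker)); // prints "7 0"
}
```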

🐸

So I assume we can’t use it because we can’t impl IntoIterator for it either. But why can’t we copy its definition into our code?

Well… just look at its definition:

#[lang = "phantom_data"] // <-- *compiler magic*
pub struct PhantomData<T: ?Sized>;

🐸

Oh.

Yeah…

So, there is no way around this: to define a PhantomData-like type, we need to do something hacky… I first saw a hack for this implemented by dtolnay (not surprising, is it?) in their crate ghost.

The hack basically looks like this:

mod imp {
    pub enum Void {}

    pub enum Type<T> {
        Type,
        __Phantom(T, Void),
    }

    pub mod reexport_hack {
        pub use super::Type::Type;
    }
}

#[doc(hidden)]
pub use imp::reexport_hack::*;
pub type Type<T> = imp::Type<T>;

🐸

Wha-

It may seem convoluted at first, but it’s actually quite simple!

So, let’s unpack this item-by-item:

  1. pub enum Void {} defines a type with no values, also known as an uninhabited type. The nice property for us is that values of this type can’t be created. It is basically a stable replacement for the ! type.
  2. Type<T> has two variants: a unit variant Type and a tuple variant __Phantom(T, Void). The latter uses T, solving the “parameter T is never used” error, while being impossible to construct because of the Void field. Since the __Phantom variant is uninhabited, Type<_> effectively has only a single usable variant.
  3. reexport_hack reexports Type::Type (the variant Type of the type Type).
  4. pub use imp::reexport_hack::*; is a glob reexport that reexports the Type::Type that was reexported by reexport_hack. I’m not entirely sure why, but using a glob is important.
  5. pub type Type<T> = imp::Type<T>; basically reexports the type Type itself. It’s just rendered in docs in a nicer way than if it were reexported by pub use.

And now the magic: Type now refers both to the type and to the variant. This works because Rust has different namespaces for types and values. The glob reexport somehow suppresses the error about clashing names that arises when importing directly. Idk why it’s this way 🤷

Ah, and let _ = Type::<u8> works because you can apply generic parameters to variants. It’s the same way as None::<Fish> is an expression of type Option<Fish> or Ok::<_, E>(()) is an expression of type Result<(), E>.
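Here is a small, self-contained sketch of the hack in use, so you can see the same name working as a type and as a value. The enum and reexports are the ones from above; size_of_param is a hypothetical helper I added only to have something to call.

```rust
mod imp {
    pub enum Void {}

    pub enum Type<T> {
        Type,
        // Never constructed: `Void` has no values.
        #[allow(dead_code)]
        __Phantom(T, Void),
    }

    pub mod reexport_hack {
        pub use super::Type::Type;
    }
}

// The glob brings the *variant* `Type` into scope as a value...
pub use imp::reexport_hack::*;
// ...and the alias brings the *enum* `Type` into scope as a type.
pub type Type<T> = imp::Type<T>;

// Hypothetical helper: takes the "unit struct" by value and reports
// the size of its type parameter.
fn size_of_param<T>(_marker: Type<T>) -> usize {
    std::mem::size_of::<T>()
}

fn main() {
    let a: Type<u8> = Type::<u8>; // explicit parameter on the variant
    let b: Type<u64> = Type; // parameter inferred from the annotation
    println!("{} {}", size_of_param(a), size_of_param(b)); // prints "1 8"

    // The same syntax on well-known std variants:
    let _none = None::<u8>; // Option<u8>
    let _ok = Ok::<_, String>(()); // Result<(), String>
}
```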

🐸

That’s a lot… But I think I’ve grasped the concept

With this hack, you can define types that are indistinguishable from PhantomData! And this time we can use it to define an iter<_> “unit struct”:

pub type iter<T> = imp::iter<T>;
pub use imp::reexport_hack::*;

impl<T> IntoIterator for iter<T> {
    type Item = T;
    type IntoIter = Iter<T>; 
    // capitalized -^
}

struct Iter<T>(...);
impl<T> Iterator for Iter<T> { /* not that important */ }

mod imp { /* basically the same as before */}

This already allows us to do cool stuff like this:

for x in lib::iter::<u32> {
    let _: u32 = x; // assert that `x` has type u32
}

for z in lib::iter { // infer type
    let _: &'static str = z;
}

let iter: lib::iter<()> = lib::iter::<()>;
//     type ---^^^^^^^^        ^^^^^^^^^^--- constant

Now on to the next hack, which will allow us to call iter too, instead of only using it as a constant!

Hack #2

🐸

I want to make a guess of what we’ll do!!

Uh ok, go ahead!

🐸

So I’ve heard that in Rust all functions and closures have unique types. And the ability to call these types with () is controlled via Fn, FnMut and FnOnce traits. Can we just implement these traits for iter<T>?

We could! But we can’t. Implementing these traits by hand is unstable, and I’m in the stable-compiler jail today.

🐸

Oh… Okay then… Do you have another “impossible to guess if you haven’t seen it before” kind of thing?

Kind of! We can (ab)use Deref trait:

impl<T> Deref for iter<T> {
    type Target = fn() -> Iter<T>;
    
    fn deref(&self) -> &Self::Target {
        &((|| Iter([])) as _)
    }
}

Normally Deref is used for smart pointers like Box or Arc so that

  • You can use the dereference operator on them (*my_beloved_arc)
  • You can call methods of the inner type (my_beloved_arc.nice())

This makes a lot of sense because smart pointers still just point to values and it’s nice to be able to just call methods.

But!

There is nothing stopping you from implementing Deref for non-smart pointer types (besides, what is a smart pointer?). And so abnormally Deref is used to forward methods.

🐸

Is this considered a bad practice?

Uh well mmm yemmm aaa phhhh mmm… mh.. yes? But everyone uses it anyway. It’s even used this way in the compiler itself, so who cares?

Ok, so what was I- Ah, right. What came as a surprise to me is that when you write f(), the compiler can auto-dereference f too! So f() can become something like (&*f)(), or in other words f.deref()(). This means that by implementing Deref to a function pointer for our iter<T>, we can make it callable!
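Putting both hacks together, a minimal sketch of such a lib could look like the following. This is not necessarily the exact playground code: I am assuming that Iter<T> is an always-empty iterator wrapping a [T; 0], and the empty helper function is a name I made up (the snippet above uses a closure instead).

```rust
use std::ops::Deref;

mod imp {
    pub enum Void {}

    #[allow(non_camel_case_types)]
    pub enum iter<T> {
        iter,
        #[allow(dead_code)]
        __Phantom(T, Void),
    }

    pub mod reexport_hack {
        pub use super::iter::iter;
    }
}

// Hack #1: `iter` is both a type (the alias) and a value (the variant).
pub use imp::reexport_hack::*;
#[allow(non_camel_case_types)]
pub type iter<T> = imp::iter<T>;

// Assumption: an iterator that never yields anything.
pub struct Iter<T>([T; 0]);

impl<T> Iterator for Iter<T> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        None
    }
}

// Hypothetical helper that both impls below share.
fn empty<T>() -> Iter<T> {
    Iter([])
}

// Makes `for x in iter::<T>` and `for x in iter` work.
impl<T> IntoIterator for iter<T> {
    type Item = T;
    type IntoIter = Iter<T>;

    fn into_iter(self) -> Iter<T> {
        empty()
    }
}

// Hack #2: deref to a function pointer, so that `iter()` is accepted.
impl<T> Deref for iter<T> {
    type Target = fn() -> Iter<T>;

    fn deref(&self) -> &Self::Target {
        // The cast result is a constant, so a reference to it can be returned.
        &(empty::<T> as fn() -> Iter<T>)
    }
}

fn main() {
    for x in iter::<u32> {
        let _: u32 = x;
    }
    for y in iter::<String>() {
        let _: String = y;
    }
    for z in iter {
        let _: &'static str = z;
    }
    for k in iter() {
        let _: f64 = k;
    }
    println!("all four loops compiled");
}
```

Since the iterator is empty, none of the loop bodies ever run; the interesting part is only that all four spellings type-check.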

Full code is on the playground if you want to play with it.

That’s all I have for today: two hacks that I saw used “in the wild” (right, this one is also by dtolnay) and found quite surprising and fun.

bye.