Thursday, July 31, 2014

What do expressions look like underneath the language?

Section 1.4 of the C# language specification deals with expressions, covering everything from the dot for member access to lambda expressions.  Here are some little programs that use a few of the more interesting ones, so that we can see how they are expressed as IL.  First, let's see what x++ looks like as IL.  From:

        static void Main(string[] args)
        {
            int x = 0;
            x++;
        }

we get:

    .method private hidebysig static void Main (
            string[] args
        ) cil managed
    {
        .entrypoint
        .locals init (
            [0] int32 x
        )
        IL_0000: nop
        IL_0001: ldc.i4.0
        IL_0002: stloc.0
        IL_0003: ldloc.0
        IL_0004: ldc.i4.1
        IL_0005: add
        IL_0006: stloc.0
        IL_0007: ret
    }

Digging into it, IL_0000 (henceforth line 0) is a no-op instruction, that is, it does nothing.  Line 1 pushes a 0 onto the stack.  Line 2 pops the top value of the stack (the 0 just pushed) into the variable [0].  Line 3 pushes the value of variable [0] onto the stack.  Line 4 pushes a 1 onto the stack.  Line 5 pops the top two values off the stack, adds them, and pushes the result.  Line 6 pops that result into variable [0].  Line 7 returns from the method.
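To make the walkthrough above concrete, here is a minimal sketch that replays the same instructions by hand.  It is only an illustration: the class name StackWalkthrough, the Stack<int> standing in for the evaluation stack, and the locals array are my own assumptions, not how the runtime actually executes IL.

```csharp
using System;
using System.Collections.Generic;

static class StackWalkthrough
{
    // Replays the x++ listing step by step and returns the final value of local [0].
    public static int Run()
    {
        var stack = new Stack<int>();   // hypothetical stand-in for the IL evaluation stack
        var locals = new int[1];        // locals[0] plays the role of x

        // IL_0000: nop (nothing to do)
        stack.Push(0);                          // IL_0001: ldc.i4.0
        locals[0] = stack.Pop();                // IL_0002: stloc.0
        stack.Push(locals[0]);                  // IL_0003: ldloc.0
        stack.Push(1);                          // IL_0004: ldc.i4.1
        stack.Push(stack.Pop() + stack.Pop());  // IL_0005: add
        locals[0] = stack.Pop();                // IL_0006: stloc.0
        return locals[0];                       // IL_0007: ret (we return x for inspection)
    }

    static void Main()
    {
        Console.WriteLine(Run());   // prints 1
    }
}
```

The stack is empty again by the time we reach ret, which is what you would expect for a statement whose value is never used.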

Not very interesting in and of itself, but what if we change from post-increment to pre-increment?  What change would we see?  Well, none: the value of the expression is thrown away, so the compiler knows the difference doesn't matter here and emits the same IL for both.  To see a difference, we need to get a little bit more complex.  How about something ridiculous like:

        static void Main(string[] args)
        {
            int x = 0;
            int y = 0;
            int z = 0;
            z = x++ + ++y;
        }

which becomes:

    .method private hidebysig static void Main (
            string[] args
        ) cil managed
    {
        .entrypoint
        .locals init (
            [0] int32 x,
            [1] int32 y,
            [2] int32 z
        )

        IL_0000: nop
        IL_0001: ldc.i4.0
        IL_0002: stloc.0
        IL_0003: ldc.i4.0
        IL_0004: stloc.1
        IL_0005: ldc.i4.0
        IL_0006: stloc.2
        IL_0007: ldloc.0
        IL_0008: dup
        IL_0009: ldc.i4.1
        IL_000a: add
        IL_000b: stloc.0
        IL_000c: ldloc.1
        IL_000d: ldc.i4.1
        IL_000e: add
        IL_000f: dup
        IL_0010: stloc.1
        IL_0011: add
        IL_0012: stloc.2
        IL_0013: ret
    }

A bit more complex, but Telerik JustDecompile does a great job turning it into readable C#:

        private static void Main(string[] args)
        {
            int x = 0;
            int y = 0;
            int z = 0;
            int num = x;
            x = num + 1;
            int num1 = y + 1;
            y = num1;
            z = num + num1;
        }

Notice how the temporary variables num and num1 stand in for the values that the IL kept on the stack with dup.
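The same point can be seen by replaying the second IL listing by hand.  As before, this is just a sketch under my own assumptions (the class name DupWalkthrough, a Stack<int> for the evaluation stack, and an array for the three locals); dup is modelled as pushing a copy of the top of the stack.

```csharp
using System;
using System.Collections.Generic;

static class DupWalkthrough
{
    // Replays z = x++ + ++y and returns the locals as { x, y, z }.
    public static int[] Run()
    {
        var stack = new Stack<int>();   // hypothetical stand-in for the IL evaluation stack
        var locals = new int[3];        // [0] = x, [1] = y, [2] = z, all starting at 0

        // x++ : keep the old value on the stack, store the incremented one
        stack.Push(locals[0]);                  // IL_0007: ldloc.0
        stack.Push(stack.Peek());               // IL_0008: dup
        stack.Push(1);                          // IL_0009: ldc.i4.1
        stack.Push(stack.Pop() + stack.Pop());  // IL_000a: add
        locals[0] = stack.Pop();                // IL_000b: stloc.0

        // ++y : increment first, then keep a copy of the new value on the stack
        stack.Push(locals[1]);                  // IL_000c: ldloc.1
        stack.Push(1);                          // IL_000d: ldc.i4.1
        stack.Push(stack.Pop() + stack.Pop());  // IL_000e: add
        stack.Push(stack.Peek());               // IL_000f: dup
        locals[1] = stack.Pop();                // IL_0010: stloc.1

        // old x + new y
        stack.Push(stack.Pop() + stack.Pop());  // IL_0011: add
        locals[2] = stack.Pop();                // IL_0012: stloc.2
        return locals;
    }

    static void Main()
    {
        int[] r = Run();
        Console.WriteLine($"x={r[0]} y={r[1]} z={r[2]}");   // prints x=1 y=1 z=1
    }
}
```

The only difference between the two halves is where dup sits: before the add for the post-increment (so the old value survives) and after it for the pre-increment (so the new value survives).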

So, that is it for right now.  I want to look at some of the more complex expressions, so we will stay with 1.4 next time.
