Reviewing C# foreach, Part 1
If you’re a C# developer, you probably use the foreach language keyword all the time. I love this keyword, because it drastically simplifies a lot of the boilerplate we’d otherwise have to write ourselves. It makes consuming Iterators very easy, and aligns the programming language very closely to natural language constructs. With that said, have you ever looked at the code that gets generated by the C# compiler? That’s the topic of this post, so let’s dig in!
I love digging into details about programming languages, and frequently look at the Intermediate Language (IL) the Roslyn compiler generates for my code. A lot of my interest in this area came from Joe Duffy, Eric Lippert, and Mads Torgersen. Joe talks a lot about safety mechanisms in programming languages and concurrency on his blog, and Eric has discussed all sorts of interesting programming language topics on his. A lot of this also gave me a deep appreciation for code contracts (not to be confused with the Code Contracts technology).
So how does the foreach keyword work, anyway? You could read the part of the language specification regarding foreach, and you would save yourself quite a bit of time by doing so. The docs are also great, and I highly recommend reading them if you haven’t. With that said, it’s more satisfying to dig in yourself, which is what we’re going to do here!
We’re going to use an extraordinarily simple method as our sample for looking at foreach:
static void PrintStuff(IEnumerable<string> strings) {
    foreach (var str in strings)
        Console.WriteLine(str);
}
This one will be easy enough to deconstruct. So, let’s get this thing compiled. I compiled the method above using Visual Studio with the default Debug configuration. Here’s the resulting IL:
.method private hidebysig static void
PrintStuff(class [System.Runtime]System.Collections.Generic.IEnumerable`1<string> strings)
cil managed
{
// Code size 47 (0x2f)
.maxstack 1
.locals init (
class [System.Runtime]System.Collections.Generic.IEnumerator`1<string> V_0,
string V_1)
IL_0000: nop
IL_0001: nop
IL_0002: ldarg.0
IL_0003: callvirt instance class [System.Runtime]System.Collections.Generic.IEnumerator`1<!0> class [System.Runtime]System.Collections.Generic.IEnumerable`1<string>::GetEnumerator()
IL_0008: stloc.0
.try
{
IL_0009: br.s IL_0019
IL_000b: ldloc.0
IL_000c: callvirt instance !0 class [System.Runtime]System.Collections.Generic.IEnumerator`1<string>::get_Current()
IL_0011: stloc.1
IL_0012: ldloc.1
IL_0013: call void [System.Console]System.Console::WriteLine(string)
IL_0018: nop
IL_0019: ldloc.0
IL_001a: callvirt instance bool [System.Runtime]System.Collections.IEnumerator::MoveNext()
IL_001f: brtrue.s IL_000b
IL_0021: leave.s IL_002e
} // end .try
finally
{
IL_0023: ldloc.0
IL_0024: brfalse.s IL_002d
IL_0026: ldloc.0
IL_0027: callvirt instance void [System.Runtime]System.IDisposable::Dispose()
IL_002c: nop
IL_002d: endfinally
} // end handler
IL_002e: ret
} // end of method Program::PrintStuff
This first bit, .method private hidebysig static void, is just header information. It’s so similar to C# that I don’t think you’ll need an explanation for it. The .maxstack directive tells the CLR how deep the evaluation stack can get while the method executes. Specifically, it’s saying “the evaluation stack will hold at most n values at any given time.” Here, n is 1.
The .locals init portion declares the method’s local variables: here, the enumerator (V_0) and the current string (V_1). There is sometimes some disparity between the number of variables we declare in our code and the number emitted to support our application. This isn’t surprising: many of C#’s language features are syntactic sugar for much more complex code, and that extra code needs somewhere to keep its state.
The next two items are nops:
IL_0000: nop
IL_0001: nop
These are only there because I compiled the application in Debug mode. They give the debugger places to attach breakpoints, such as the start of the method (at the opening brace). Release builds usually don’t include these extra nops, as we’ll see at the end of this post.
The next part gives us real functionality:
IL_0003: callvirt instance class [System.Runtime]System.Collections.Generic.IEnumerator`1<!0> class [System.Runtime]System.Collections.Generic.IEnumerable`1<string>::GetEnumerator()
IL_0008: stloc.0
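Just before this, ldarg.0 (at IL_0002) pushes the strings argument onto the evaluation stack. callvirt is used here because GetEnumerator() is an interface method, so the implementation to run is only known at run time. You’ll also see callvirt frequently on non-virtual instance methods, because it performs a null check on the receiver. stloc.0 then, predictably, stores the result of calling GetEnumerator() in the first slot of the local variable list (V_0).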
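The null check that callvirt performs is easy to observe from C#. Here is a minimal sketch, assuming a hypothetical Greeter class that is not part of the sample above. The C# compiler normally emits callvirt for instance method calls on reference types, even non-virtual ones, so calling through a null reference should fail at the call site even though the method body never touches this:

```csharp
using System;

// Hypothetical type for illustration: Hello() is non-virtual and never reads 'this'.
public sealed class Greeter
{
    public string Hello() => "hello";
}

public static class CallvirtDemo
{
    // Returns true when calling an instance method through 'greeter'
    // throws a NullReferenceException.
    public static bool ThrowsOnCall(Greeter greeter)
    {
        try
        {
            greeter.Hello(); // compiled to callvirt, which checks the receiver for null
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }
}
```

Passing null to ThrowsOnCall should take the catch path, while passing a real Greeter should not.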
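The next part, br.s IL_0019, is an unconditional branch. It isn’t a debugger aid (the Release build later in this post has the same branch). It jumps straight to the MoveNext() call at the bottom of the loop, so the loop condition is checked before the body runs for the first time. This next part is where the really interesting stuff happens: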
IL_000b: ldloc.0
IL_000c: callvirt instance !0 class [System.Runtime]System.Collections.Generic.IEnumerator`1<string>::get_Current()
IL_0011: stloc.1
IL_0012: ldloc.1
IL_0013: call void [System.Console]System.Console::WriteLine(string)
IL_0018: nop
IL_0019: ldloc.0
IL_001a: callvirt instance bool [System.Runtime]System.Collections.IEnumerator::MoveNext()
IL_001f: brtrue.s IL_000b
IL_0021: leave.s IL_002e
ldloc.0 pushes our local enumerator onto the evaluation stack, and callvirt then reads the IEnumerator<string>.Current property. In the C# language, properties are just syntactic sugar for equivalent <property-type> get_<property_name>() and void set_<property_name>(<property-type> value) methods, which is why the snippet above calls get_Current(). After that, stloc.1 stores the value in a local variable (V_1, our str), and ldloc.1 loads it again so we can do something with it. In this case, we’re just calling a method: Console.WriteLine(string). Calls that take more arguments would need more values pushed onto the stack first.
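You can confirm the get_ naming without reading any IL. This is a small sketch using reflection; the class and method names (PropertyAccessorDemo, CurrentGetterName) are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class PropertyAccessorDemo
{
    // Returns the name of the accessor method behind IEnumerator<string>.Current.
    public static string CurrentGetterName()
    {
        PropertyInfo current = typeof(IEnumerator<string>).GetProperty("Current");
        return current.GetMethod.Name;
    }
}
```

The returned name should match the get_Current method that appears in the callvirt instruction above.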
The next IL instruction, IL_0018: nop, is also there to support the debugger. After that, we reload the enumerator, call its bool MoveNext() method, and branch based on the result:
- If MoveNext() returned true, brtrue.s IL_000b jumps back and continues executing the loop
- Otherwise, leave.s IL_002e exits the try block, which runs the finally block before reaching the end of the method (IL_002e: ret)
Finally, we enter the finally block (pun intended):
finally
{
IL_0023: ldloc.0
IL_0024: brfalse.s IL_002d
IL_0026: ldloc.0
IL_0027: callvirt instance void [System.Runtime]System.IDisposable::Dispose()
IL_002c: nop
IL_002d: endfinally
} // end handler
This all looks straightforward, but there’s actually a hidden gem in here: IL_0024: brfalse.s IL_002d. The documentation for this instruction says:
Transfers control to a target instruction if value is false, a null reference, or zero.
Thus, after loading the local variable, a null check is issued before any other work is performed. We then reload the local and call the IDisposable.Dispose() implementation. Interestingly, as described above, the callvirt opcode will issue its own null check. The obvious reasons why the compiler would issue its own null check first are:
- The enumerator is not proven to be non-null first
- A null enumerator should simply skip Dispose(), rather than throw a NullReferenceException from inside the finally block
Note that this check isn’t specific to “Debug” builds (which pass the -optimize- compiler flag): the Release listing below keeps the same brfalse.s.
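To see that the generated finally block really does call Dispose(), here is a small sketch. It assumes a hypothetical iterator method, Items, whose own finally block runs when its enumerator is disposed; none of these names come from the sample above:

```csharp
using System;
using System.Collections.Generic;

public static class DisposeDemo
{
    // Hypothetical iterator: its finally block runs when the enumerator is disposed.
    private static IEnumerable<string> Items(List<string> log)
    {
        try
        {
            yield return "a";
            yield return "b";
        }
        finally
        {
            log.Add("disposed");
        }
    }

    // Leaves the loop after the first item and returns what was logged.
    public static List<string> BreakEarly()
    {
        var log = new List<string>();
        foreach (var item in Items(log))
        {
            log.Add(item);
            break; // exits the try region, so the generated finally calls Dispose()
        }
        return log;
    }
}
```

Even though the loop never reaches "b", the log should end with the "disposed" entry, because leaving the loop early still passes through the compiler-generated finally block.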
The remainder of the method is fairly self-explanatory… Just finish the method and leave.
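Putting the walkthrough together, the IL corresponds roughly to the hand-written C# below. Treat it as an approximation of the compiler's output rather than the exact expansion; the ForeachLowering class and the Action<string> parameter are my own additions so the loop body can be swapped out (it stands in for Console.WriteLine):

```csharp
using System;
using System.Collections.Generic;

public static class ForeachLowering
{
    // Hand-written approximation of:
    //     foreach (var str in strings) action(str);
    public static void ForEachByHand(IEnumerable<string> strings, Action<string> action)
    {
        // callvirt GetEnumerator(), stloc.0
        IEnumerator<string> enumerator = strings.GetEnumerator();
        try
        {
            // callvirt MoveNext(), brtrue.s back to the loop body
            while (enumerator.MoveNext())
            {
                // callvirt get_Current(), stloc.1
                string str = enumerator.Current;
                // ldloc.1, then the call in the loop body
                action(str);
            }
        }
        finally
        {
            // brfalse.s: skip Dispose() when the enumerator is null
            if (enumerator != null)
            {
                // callvirt IDisposable::Dispose()
                enumerator.Dispose();
            }
        }
    }
}
```

Calling ForeachLowering.ForEachByHand(strings, Console.WriteLine) should behave like the original PrintStuff method.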
Now, with all this review of a Debug target, let’s have a look at a Release target:
.method private hidebysig static void PrintStuff(class [System.Runtime]System.Collections.Generic.IEnumerable`1<string> strings) cil managed
{
// Code size 41 (0x29)
.maxstack 1
.locals init (class [System.Runtime]System.Collections.Generic.IEnumerator`1<string> V_0)
IL_0000: ldarg.0
IL_0001: callvirt instance class [System.Runtime]System.Collections.Generic.IEnumerator`1<!0> class [System.Runtime]System.Collections.Generic.IEnumerable`1<string>::GetEnumerator()
IL_0006: stloc.0
.try
{
IL_0007: br.s IL_0014
IL_0009: ldloc.0
IL_000a: callvirt instance !0 class [System.Runtime]System.Collections.Generic.IEnumerator`1<string>::get_Current()
IL_000f: call void [System.Console]System.Console::WriteLine(string)
IL_0014: ldloc.0
IL_0015: callvirt instance bool [System.Runtime]System.Collections.IEnumerator::MoveNext()
IL_001a: brtrue.s IL_0009
IL_001c: leave.s IL_0028
} // end .try
finally
{
IL_001e: ldloc.0
IL_001f: brfalse.s IL_0027
IL_0021: ldloc.0
IL_0022: callvirt instance void [System.Runtime]System.IDisposable::Dispose()
IL_0027: endfinally
} // end handler
IL_0028: ret
} // end of method Program::PrintStuff
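As you can see, the Release build is a tiny bit smaller than the Debug build (0x29 versus 0x2f bytes). Four of those six bytes come from eliminating the nop instructions. The other two come from dropping the V_1 local, so the result of get_Current() is passed straight to Console.WriteLine without the stloc.1/ldloc.1 round trip.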
This concludes Part 1 of our evaluation of C# Iterators. In Part 2, we’ll look at how the compiler optimizes the foreach loop when it can prove the type being iterated is an array. Thanks for reading!