Reviewing C# foreach, Part 2
In the previous post, Reviewing C# foreach, Part 1, we looked at how the C# compiler transforms our foreach loops over the IEnumerable<T> interface into something the runtime understands. In this post, we’ll look at what code the compiler emits when an object is statically known to be an array.
It’s been a while since I’ve spent any time on this blog. At this stage, I don’t expect I have very many readers, but I apologize to you, the future reader, nonetheless. Ignoring my blog for weeks at a time is not something I aspire to do. The last six weeks have included a lot of ups and downs for me, including:
- A family vacation to Nashville, Tennessee
- A very deep dive into updating ClickOnce deployment builds
- A myriad of other activities
Anyway, given the content of my blog, I expect you’re probably not here to read about my personal life, so let’s dig in on today’s topic: how the C# compiler transforms a foreach loop over an array into the .NET Intermediate Language (covered in the .NET Framework Platform Architecture) that is consumed by the runtime!
Last time, we evaluated a very simple method, which uses the C# foreach keyword to iterate over a set of string objects. We also established that the IEnumerable<T> interface, coupled with the IEnumerator<T> interface, provides a framework-native contract for the Iterator pattern. The C# code we reviewed last time was very simple:
static void PrintStuff(IEnumerable<string> strings)
{
    foreach (var str in strings)
        Console.WriteLine(str);
}
Today, we’re going to review a small variant on the previous example. Below is the output of a simplified Git diff to illustrate the changes:
diff --git a/Program.cs b/Program.cs
index acf9aeb..a66d665 100644
--- a/Program.cs
+++ b/Program.cs
@@ -10,7 +10,7 @@ namespace ForEach
         }
-        static void PrintStuff(IEnumerable<string> strings)
+        static void PrintStuff(string[] strings)
         {
             foreach (var str in strings)
                 Console.WriteLine(str);
As you can see, the only change is to the type of the strings parameter. Previously, this was an IEnumerable<string>; it is now an array of strings (string[]). When we compile this new method, we get the following output:
.method private hidebysig static void PrintStuff(string[] strings) cil managed
{
// Code size 32 (0x20)
.maxstack 2
.locals init (string[] V_0,
int32 V_1,
string V_2)
IL_0000: nop
IL_0001: nop
IL_0002: ldarg.0
IL_0003: stloc.0
IL_0004: ldc.i4.0
IL_0005: stloc.1
IL_0006: br.s IL_0019
IL_0008: ldloc.0
IL_0009: ldloc.1
IL_000a: ldelem.ref
IL_000b: stloc.2
IL_000c: nop
IL_000d: ldloc.2
IL_000e: call void [System.Console]System.Console::WriteLine(string)
IL_0013: nop
IL_0014: nop
IL_0015: ldloc.1
IL_0016: ldc.i4.1
IL_0017: add
IL_0018: stloc.1
IL_0019: ldloc.1
IL_001a: ldloc.0
IL_001b: ldlen
IL_001c: conv.i4
IL_001d: blt.s IL_0008
IL_001f: ret
} // end of method Program::PrintStuff
Once again, we have our standard stack initialization. We have many of the same local variables as last time, but with the addition of a new int32 V_1 local this time. This is because the compiler knows our method is iterating over an array, and uses a different strategy for iterating the array’s members.
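As a rough mental model, the array version behaves like the hand-written loop below. This is only a sketch under my own assumptions: the class LoweredForEach, the method PrintStuffLowered, and the local names array, index, and str are hypothetical stand-ins for V_0, V_1, and V_2, not names the compiler emits.

```csharp
using System;

static class LoweredForEach
{
    // Hypothetical hand-written equivalent of foreach over a string[].
    // The local names are illustrative stand-ins for the IL locals.
    public static void PrintStuffLowered(string[] strings)
    {
        string[] array = strings;          // V_0: copy of the array reference
        for (int index = 0;                // V_1: the loop counter
             index < array.Length;
             index++)
        {
            string str = array[index];     // V_2: the current element
            Console.WriteLine(str);
        }
    }

    static void Main()
    {
        PrintStuffLowered(new[] { "a", "b", "c" });
    }
}
```

One consequence of copying the reference into its own local first: the loop keeps walking the array it started with, even if the strings parameter were reassigned inside the body.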
Similar to the previous implementation, the whole method starts with a couple of nops. These are there, as noted last time, to support the debugger. Next, ldarg.0 and stloc.0 copy the strings argument into the V_0 local, and then we have the initialization of the counter local:
IL_0004: ldc.i4.0
IL_0005: stloc.1
IL_0006: br.s IL_0019
The ldc.i4.0 opcode pushes the constant value 0 onto the evaluation stack. The next part, stloc.1, pops that value and stores it in the int32 V_1 local. The br.s opcode, as detailed last time, is an unconditional jump to the IL_0019 label.
Before we examine the loop body, let’s look at what’s happening at the IL_0019 label:
IL_0019: ldloc.1
IL_001a: ldloc.0
IL_001b: ldlen
IL_001c: conv.i4
IL_001d: blt.s IL_0008
There are a few things happening here:
- Load our local variable for iterating the loop (int32 V_1)
- Load the local copy of the strings parameter (V_0)
- Get the length of the strings array
- Convert the length to an int32 (ldlen pushes a native unsigned int)
- “Branch if less than”: jump to IL_0008 if V_1 is less than the length; otherwise, fall through to IL_001f
Basically, read this as the i < strings.Length; part of the following for loop:
for (var i = 0; i < strings.Length; i++) {
// Do work
}
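To make the jump structure easier to see, here is a sketch that mirrors the IL’s control flow using goto: jump straight to the condition, and only enter the body when the condition holds. The class LoopShape and the method Collect are hypothetical names of my own, and the method gathers the items into a list instead of printing them so the result is easy to inspect.

```csharp
using System;
using System.Collections.Generic;

static class LoopShape
{
    // Hypothetical helper that imitates the IL's branch layout with goto.
    public static List<string> Collect(string[] strings)
    {
        var seen = new List<string>();
        string[] array = strings;   // V_0
        int index = 0;              // V_1
        string current;             // V_2

        goto Check;                 // br.s IL_0019

    Body:                           // IL_0008
        current = array[index];     // ldelem.ref, stloc.2
        seen.Add(current);          // stands in for Console.WriteLine
        index = index + 1;          // ldloc.1, ldc.i4.1, add, stloc.1

    Check:                          // IL_0019
        if (index < array.Length)   // ldlen, conv.i4, blt.s IL_0008
            goto Body;

        return seen;                // IL_001f
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", Collect(new[] { "a", "b" })));
    }
}
```

Because the condition is tested before the body ever runs, an empty array falls straight through to the return without touching any element.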
When we jump back to IL_0008, we have the following IL:
IL_0008: ldloc.0
IL_0009: ldloc.1
IL_000a: ldelem.ref
IL_000b: stloc.2
IL_000c: nop
IL_000d: ldloc.2
IL_000e: call void [System.Console]System.Console::WriteLine(string)
IL_0013: nop
IL_0014: nop
IL_0015: ldloc.1
IL_0016: ldc.i4.1
IL_0017: add
IL_0018: stloc.1
The first few parts are really straightforward:
- ldloc.0 loads the local copy of the strings parameter (V_0)
- ldloc.1 loads our induction variable (i in most loops)
- ldelem.ref loads an object reference from the strings array at index i
- stloc.2 assigns the reference loaded by ldelem.ref to the local V_2 variable
- The nop supports the debugger
- ldloc.2 pushes the V_2 reference back onto the evaluation stack
- call void [System.Console]System.Console::WriteLine(string) calls the actual console API (shown in the original method)
- More debugger-supporting nops
- Reload our induction variable (IL_0015: ldloc.1), load the constant 1 (IL_0016: ldc.i4.1), add the two values on the stack (IL_0017: add), and re-assign the result to V_1 (our induction variable)
Thereafter, we re-visit the loop condition starting at IL_0019, as described above.
The last opcode is IL_001f: ret. This one returns from the method body, quite predictably. If you read Part 1, then you’ll note that this method does not emit a try/finally block.
This is because the compiler is generating a basic for loop, which is known to not produce any disposable resources. The core Iterator interface in the .NET Framework, IEnumerator<T>, extends IDisposable, which is why the IEnumerable<T> version from Part 1 needed one.
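For contrast, here is a sketch of the general try/finally shape that a foreach over IEnumerable<string> takes. It is a simplified illustration, not the exact code from Part 1: the class EnumeratorLowering and the method PrintStuffViaEnumerator are hypothetical names of my own.

```csharp
using System;
using System.Collections.Generic;

static class EnumeratorLowering
{
    // Hypothetical sketch of an enumerator-based loop written out by hand.
    public static void PrintStuffViaEnumerator(IEnumerable<string> strings)
    {
        IEnumerator<string> enumerator = strings.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                string str = enumerator.Current;
                Console.WriteLine(str);
            }
        }
        finally
        {
            // IEnumerator<T> inherits IDisposable, so cleanup has to run
            // even if the loop body throws.
            if (enumerator != null)
            {
                enumerator.Dispose();
            }
        }
    }

    static void Main()
    {
        PrintStuffViaEnumerator(new List<string> { "a", "b" });
    }
}
```

The array version has nothing comparable to dispose: its only loop state is an array reference and an int32 counter, so a plain counted loop is enough.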
In a future post, I’ll create a performance benchmark that compares this method to the one shared in the first post. In the meantime, I hope you enjoyed this particular dose of esoteric information about what the C# compiler does with your code. If you’re into this sort of thing, I’ll be providing similar information in the future.
Thanks for reading!
- Brian