.NET – What is an Assembly?

I often need to write code in either PowerShell or .NET. To me the two feel similar, since PowerShell is built on top of the .NET Framework and you can call .NET classes directly from your PowerShell code.

I believe a solid foundation in .NET is becoming more and more important for any IT professional. When I started learning the .NET Framework, I had difficulty understanding some key concepts. For example, we always hear the word “Assembly”, but for me it is not enough to define it as a DLL you reference or an EXE you execute. What I really want to know is what an assembly actually contains and how the magic happens under the hood.

So I decided to share my thoughts about a couple of .NET Framework concepts, and I hope they make sense to most of you.

Define an Assembly

When I think about an assembly and read the definitions on the web, I find it hard to pin down what an assembly really is:

“An assembly is the logical boundary of functionality”

To understand what an assembly is, we have to look at how .NET code gets executed:

  • Step 1: You start by writing your code in any of the available .NET languages. For example, you open Visual Studio, type some C# code, and end up with a file with the .cs extension. This .cs file is called the source code file, because it contains the original source code.
  • Step 2: Using Visual Studio, you compile your code, and the output is a DLL or EXE file. This output file is a Portable Executable (PE) file, also called an assembly file, and it contains code expressed in Microsoft Intermediate Language (MSIL, or IL for short).
  • Step 3: You run the code, and the .NET Common Language Runtime (CLR) inspects the output assembly and converts its IL to machine language (called native code) using a Just-In-Time (JIT) compiler. Native code is the low-level language that the CPU can understand and execute.

[Figure: Assembly Generation]

So the assembly is the output of compiling your source code, and its code is expressed in MSIL (IL for short). IL is a CPU-independent instruction set created by Microsoft. It is stored in the assembly in binary form, but it is much higher level than native code, and a disassembler such as ILDASM can display it as readable text.

It is important to know that two compilations take place. When you write your code in Visual Studio and build it, the language compiler only turns your source code into IL, packaged as a DLL or EXE (the assembly). Later, when you run that assembly (at run time), the CLR's JIT compiler inspects the IL in your assembly and turns it into native code that the CPU can understand and execute.
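To make the two stages more concrete, here is a toy model in Python. It is only a sketch under loud assumptions: the function names `compile_to_il` and `run_il` are made up, the "IL" is a tiny list of tuples that borrows a few real IL opcode names (`ldc.i4`, `add`, `sub`, `mul`), and the second stage is a simple interpreter standing in for the JIT compiler, which in the real CLR emits native code instead of interpreting.

```python
import ast


def compile_to_il(source: str) -> list:
    """Stage 1 (build time): lower an arithmetic expression to stack-based instructions."""
    ops = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul"}

    def emit(node):
        # An integer literal becomes a "load constant" instruction.
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return [("ldc.i4", node.value)]
        # A binary operation pushes both operands, then applies the operator.
        if isinstance(node, ast.BinOp) and type(node.op) in ops:
            return emit(node.left) + emit(node.right) + [(ops[type(node.op)],)]
        raise ValueError("unsupported expression in this toy compiler")

    return emit(ast.parse(source, mode="eval").body)


def run_il(il: list) -> int:
    """Stage 2 (run time): consume the instructions; a stand-in for the JIT step."""
    stack = []
    for instruction in il:
        if instruction[0] == "ldc.i4":
            stack.append(instruction[1])
            continue
        right = stack.pop()
        left = stack.pop()
        if instruction[0] == "add":
            stack.append(left + right)
        elif instruction[0] == "sub":
            stack.append(left - right)
        else:  # "mul"
            stack.append(left * right)
    return stack.pop()


il = compile_to_il("2 + 3 * 4")   # happens once, at build time
result = run_il(il)               # happens later, every time the code runs
```

The point of the split is that `il` can be stored and shipped without knowing which machine will eventually run it, which is roughly the role the assembly file plays between the two compilations.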

Regardless of which .NET language you write your code in, the compiler produces the same kind of output: an assembly file (DLL or EXE) containing IL and metadata, which the CLR treats the same way whatever the source language was.

To really appreciate what an assembly is, I started by answering this question: “what are the benefits of an assembly?”. For me, answering the “why” helps me understand the “what”. So let us talk about some of the purposes an assembly serves:

  • Security Boundary: When you want to sign code with a strong name to give it a unique identity, or use a digital certificate to identify the publisher, the smallest unit you can sign is the assembly.
  • Type Boundary: A type definition cannot span multiple assemblies, while two types with the same name can coexist as long as they are located in different assemblies.
  • Reference Info: Each assembly describes the types and resources inside it, and also lists the other assemblies it references. From this perspective, you can think of an assembly as a unit of functionality.
  • Versioning: An assembly is the smallest unit of versioning.
  • Deployment: An assembly represents a unit of deployment.
  • Language Boundary: Each managed module is produced by a single language compiler, so a typical single-module assembly contains code from only one language. If you want to use several programming languages in your project, you usually have to split your code into more than one assembly (or, less commonly, into separate modules of a multi-module assembly).

[Figure: Assembly Purpose]

Managed Module

There is another concept to understand here: the managed module. Managed modules and assemblies are related like this:

An assembly contains one or more managed modules and possibly some resource files (JPEG, GIF, HTML, etc.).

Every assembly also carries manifest data. When the assembly consists of more than one file, the manifest describes the set of files that make up the assembly.

A managed module is like the internal structure of an assembly. You can have one or more of these modules inside an assembly, but in most cases the assembly file contains just one module. In fact, as far as I know, Visual Studio does not offer a way to produce an assembly with more than one module. If you want that, you need to use the command-line tools.

Why are we interested in learning about managed modules? Because sometimes, in order to understand how .NET works, you have to inspect the IL code itself, and that is easier when you know that a module exists inside an assembly. Each module contains a lot of data, but for me the most important pieces that make up a managed module are the IL code for your types and the metadata.

[Figure: Assembly vs Modules]

So let us see what the Managed Module contains:

  • PE32 or PE32+ header:
    • Time Stamp
    • CPU Support information
    • Type of file (GUI, CUI, or DLL)
  • CLR header:
    • Version of CLR required
    • Flags
    • Info about the entry point method (Main Method)
    • Strong name
  • Metadata: contains two main groups of tables
    • Tables describing the types and members defined in this module.
    • Tables describing the types and members referenced by this module.*
  • IL code of your types.

*Each module records the assemblies it references and their version numbers. This information is very important because it lets the CLR determine the assembly's immediate dependencies, and it is what makes the assembly self-describing.
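The header fields listed above can be read straight out of the file bytes. The following Python sketch is one possible way to do it, not an official tool: the function name `describe_pe_header` and the keys of the returned dictionary are invented for illustration, while the offsets and constants follow the published PE/COFF layout (the CLR header is located through data directory entry 14).

```python
import struct

# Constants from the PE/COFF layout; all names below are made up for this sketch.
PE32_MAGIC = 0x10B
PE32_PLUS_MAGIC = 0x20B
IMAGE_FILE_DLL = 0x2000            # "this file is a DLL" flag in Characteristics
SUBSYSTEMS = {2: "GUI", 3: "CUI"}  # Windows GUI / console subsystem values
CLR_DIRECTORY_INDEX = 14           # data directory slot pointing at the CLR header


def describe_pe_header(data: bytes) -> dict:
    """Extract a few PE header fields and check whether a CLR header is present."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")

    # COFF file header: 20 bytes right after the 4-byte PE signature.
    coff = pe_offset + 4
    machine, _, timestamp, _, _, _, characteristics = struct.unpack_from(
        "<HHIIIHH", data, coff
    )

    # Optional header: the magic number tells PE32 from PE32+.
    optional = coff + 20
    (magic,) = struct.unpack_from("<H", data, optional)
    if magic not in (PE32_MAGIC, PE32_PLUS_MAGIC):
        raise ValueError("unknown optional header magic")
    (subsystem,) = struct.unpack_from("<H", data, optional + 68)

    # Data directories start at offset 96 (PE32) or 112 (PE32+);
    # the 4 bytes just before them hold the number of directory entries.
    directories = optional + (96 if magic == PE32_MAGIC else 112)
    (directory_count,) = struct.unpack_from("<I", data, directories - 4)
    has_clr = False
    if directory_count > CLR_DIRECTORY_INDEX:
        clr_rva, clr_size = struct.unpack_from(
            "<II", data, directories + CLR_DIRECTORY_INDEX * 8
        )
        has_clr = clr_rva != 0 and clr_size != 0

    is_dll = bool(characteristics & IMAGE_FILE_DLL)
    return {
        "format": "PE32" if magic == PE32_MAGIC else "PE32+",
        "machine": machine,        # CPU support information, e.g. 0x14C for x86
        "timestamp": timestamp,    # seconds since 1970, as stored in the header
        "file_type": "DLL" if is_dll else SUBSYSTEMS.get(subsystem, "other"),
        "has_clr_header": has_clr,
    }
```

Passing in the bytes of a compiled DLL or EXE (for example, read with `open(path, "rb").read()`, where `path` is whatever file you want to inspect) should report `has_clr_header` as true for a managed module and false for an ordinary native binary.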

[Figure: Managed Module]

Metadata 

The most interesting of the assembly file's components is the metadata. Every compiler targeting the CLR is required to emit metadata. Metadata provides the following benefits:

  • Microsoft Visual Studio uses metadata. The IntelliSense feature parses metadata to show you which methods, properties, events, and fields a type offers.
  • The CLR uses metadata to ensure that your code performs only “type-safe” operations.
  • Metadata enables object serialization (writing an object's fields into a memory block, moving it across the wire, and recreating the object at the destination).
  • Metadata allows the garbage collector to track the lifetime of objects.

Metadata is the most important piece of the .NET story. The whole .NET Framework is built around the idea of metadata. The more you know about metadata, the closer you get to understanding the secrets of the .NET Framework.

By Ammar Hasayen. Posted in .NET.
