The Common Type System (CTS)
Before we start diving into more technical articles and lots
of code, I wanted to make sure that all the buzzwords of the .NET Framework are
understood by all readers, and maybe answer a couple of questions regarding where all
the pieces fit.
We defined the CLR (Common Language Runtime) in the
last article. Now we will attack the CTS (Common Type System).
The Common Type System (CTS)
Probably one of the most
attractive features of the .NET CLR is that it is based on the CTS, which
provides a very rich and standard set of data types. The CTS is object oriented
by design but also supports procedural and functional languages.
The CTS is what allows .NET to
provide a unified programming model and to support multiple languages. The CTS
supports two general categories of types, each of which has a number
of subcategories.
A Type System
A type describes a value and
specifies a contract that all values of that type must support. Because the CTS
is Object Oriented and also supports functional and procedural languages, it
supports two different kinds of entities:
- Reference Types
- Value Types.
Value types directly contain their data, and instances
of value types are either allocated on the stack or allocated inline in a structure.
Value types can be built-in
(implemented by the runtime), user-defined, or enumerations.
Reference types store a reference to the value's
memory address and are allocated on the heap. Reference types can be self-describing
types, pointer types, or interface types. The type of a reference type can be
determined from values of self-describing types.
Self-describing types are further split into arrays
and class types. The class types are user-defined classes, boxed value types,
and delegates. (A lot more on that in later articles)
| Name in CIL assembler | CLS Type? | Name in class library |
|-----------------------|-----------|-----------------------|
| bool                  | Yes       | System.Boolean        |
| char                  | Yes       | System.Char           |
| object                | Yes       | System.Object         |
| string                | Yes       | System.String         |
| float32               | Yes       | System.Single         |
| float64               | Yes       | System.Double         |
| int8                  | No        | System.SByte          |
| int16                 | Yes       | System.Int16          |
| int32                 | Yes       | System.Int32          |
| int64                 | Yes       | System.Int64          |
| native int            | Yes       | System.IntPtr         |
| native unsigned int   | No        | System.UIntPtr        |
| typedref              | No        | System.TypedReference |
| unsigned int8         | Yes       | System.Byte           |
| unsigned int16        | No        | System.UInt16         |
| unsigned int32        | No        | System.UInt32         |
| unsigned int64        | No        | System.UInt64         |
Value Types
Value types describe values that can be represented as a sequence of bits. They
are allocated on the stack or inline within the structure that contains them. They
cannot be null and must always contain a value. When value types are passed into a function,
they are passed by value by default, meaning that a copy of the value is made prior to the
function execution. This also means that the caller's value can't change, no matter what
goes on in the function. Value types also include intrinsic types. Since intrinsic
types are small in size and don't consume much memory, the cost of
making a copy is negligible, and it is smaller than the overhead of object
management and garbage collection that a heap allocation would bring. Examples of value types include:
- Primitives
- Structures
- Enumerations.
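All value types derive, directly or indirectly, from System.ValueType. In C#, for example, you create your own value type with the struct keyword rather than by inheriting from System.ValueType explicitly.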
Reference Types
Reference types describe values that contain references to
heap-based objects and they can be null. There are four kinds of Reference
Types:
- An object type is a reference type of a self-describing
value. Some object types (e.g. abstract classes) are only a partial
description of a value.
- An interface type is always a partial description
of a value, potentially supported by many object types.
- A pointer type is a compile-time description of a
value whose representation is a machine address of a location.
- Built-in reference types.
These types are passed by reference, meaning that the address of a memory
location, rather than a copy of the object, is passed when you pass one into a function.
Reference types in .NET are allocated on the managed heap, which means that they are
managed and garbage-collected by the CLR.
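The difference between the two passing conventions can be sketched outside .NET. The following Python analogy is only an illustration, not .NET code: Python has no CTS-style value types, so the hypothetical helper `call_by_value` imitates value semantics by handing the function a copy, while a plain call shares one object the way a reference type would. The names `Point`, `move_right`, and `call_by_value` are invented for this example.

```python
import copy
from dataclasses import dataclass


@dataclass
class Point:
    """Hypothetical two-field record used only for this illustration."""
    x: int
    y: int


def move_right(p: Point) -> None:
    """Mutate whatever object the caller handed in."""
    p.x += 10


def call_by_value(func, arg) -> None:
    """Imitate value-type passing: the callee works on a copy."""
    func(copy.copy(arg))


# Value semantics: the function changes its own copy,
# so the caller's data is untouched.
as_value = Point(1, 2)
call_by_value(move_right, as_value)

# Reference semantics: caller and callee share one object,
# so the change is visible to the caller.
as_reference = Point(1, 2)
move_right(as_reference)

print(as_value.x)      # 1
print(as_reference.x)  # 11
```

The same function body behaves differently depending only on whether it receives a copy or the shared object, which is the distinction the CTS draws between value types and reference types.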
The Common Language Specification (CLS)
The CLS is simply a
specification that defines the rules to support language integration in such a
way that programs written in any compliant language can interoperate with one
another, taking full advantage of inheritance, polymorphism, exceptions, and
other features. These rules and the specification are documented in the ECMA
proposed standard document, "Partition I Architecture".
The great news for Delphi and other languages
is that there is essentially a level playing field, in that all CLS compliant
languages can access all the features of the CLR, use the BCL (Borland Component Library,
Oops! I mean Base Class Library), and be managed by the CLR.
Assemblies
Thanks to Assemblies we no longer have DLL Hell,
welcome to Assembly Hell. :-)
An Assembly is the unit of
deployment, of security, of versioning, and of scope for the types contained
within it, pretty close to what a Package is in Delphi and C++Builder.
It is self-describing. It is
typically one physical DLL or EXE, in the Windows PE file format, but could be
made up of multiple files.
An assembly must have a manifest
(See below) that describes its contents and it usually contains MSIL,
resources, and metadata describing the types contained within the assembly.
An assembly is made up of four elements:
- Manifest
- Metadata describing the types
- Module(s)
- Resources
Metadata
Data that describes data.
In the .NET sense, metadata is the
information that is stored in the Assembly to make it self-describing.
So, in .NET, metadata
essentially describes the elements of the Common Type
System that you use in your application, as well as the information that
the runtime needs to do its stuff in the areas of type safety and security.
Manifest
The Manifest is what describes
the assembly itself.
The Manifest contains:
- A simple name
- A four-part version number of the form Major.Minor.Build.Revision
- Publisher's public key
- Culture (Locale)
- List of files that make up the assembly
- List of dependent assemblies
- Permission requests
- Exported types
- Resources
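To make the four-part version number concrete, here is a small Python sketch (an illustration only, not how the CLR itself stores or compares versions). It parses a Major.Minor.Build.Revision string into numbers so that versions compare part by part instead of as text. The names `AssemblyVersion` and `parse_version`, and the sample version strings, are assumptions made up for this example.

```python
from typing import NamedTuple


class AssemblyVersion(NamedTuple):
    """Hypothetical holder for the four parts of a version number."""
    major: int
    minor: int
    build: int
    revision: int


def parse_version(text: str) -> AssemblyVersion:
    """Split 'Major.Minor.Build.Revision' into four integers."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected Major.Minor.Build.Revision, got {text!r}")
    return AssemblyVersion(*(int(p) for p in parts))


# Named tuples compare field by field, from major down to revision.
older = parse_version("1.0.9.0")
newer = parse_version("1.0.10.0")

print(newer > older)          # True: 10 > 9 when compared as numbers
print("1.0.10.0" > "1.0.9.0")  # False: plain string comparison gets it wrong
```

Comparing the parts numerically matters as soon as any part reaches two digits, which is why the sketch converts each part to an integer first.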
Strong Names
An assembly can be given a
unique name when the publisher includes a public cryptographic key.
The key goes into the Manifest.
If the assembly has a public key, it is said to have a Strong Name or shared
name. In addition, the publisher can digitally sign the assembly, using a
public/private key pair. This prevents tampering with the Assembly.
Digitally signing an Assembly
records a cryptographic hash of the contents of each file in the Assembly's
manifest. This is verified at run-time to ensure that the Assembly has not been
corrupted or tampered with.
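The idea of recording a hash per file and checking it later can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the CLR's actual mechanism: the file names and contents are invented, SHA-256 is simply a convenient choice for the sketch, and the public/private key signature that protects the recorded hashes is left out entirely.

```python
import hashlib


def file_digest(data: bytes) -> str:
    """Hash a file's contents (SHA-256 chosen only for this sketch)."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical file contents standing in for the files of an assembly.
files = {"A.dll": b"compiled code", "strings.resources": b"hello"}

# At build time: record one hash per file (the role the manifest plays).
recorded = {name: file_digest(data) for name, data in files.items()}


def verify(current_files: dict, recorded_hashes: dict) -> list:
    """Return the names of files whose contents no longer match."""
    return [
        name
        for name, data in current_files.items()
        if file_digest(data) != recorded_hashes.get(name)
    ]


# Untouched files verify cleanly.
print(verify(files, recorded))  # []

# Changing a single file is detected.
tampered = {**files, "A.dll": b"patched code"}
print(verify(tampered, recorded))  # ['A.dll']
```

Without the signature step an attacker could replace the recorded hashes as well, which is why the real scheme pairs the hashes with the publisher's key pair.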
Signing takes place with the
Strong Name tool (sn.exe) included in the .NET Framework SDK. We will discuss
this in more detail in a separate article with an example.
Managed Code
Managed Code is MSIL code that runs under the control
of the Runtime (the CLR's Virtual Execution System, or VES), which JIT-compiles it
before execution, as opposed to native x86 code that runs directly on the processor.
In .NET, application logic is
encoded in IL when emitted from a .NET language compiler and then
"managed" by the runtime. This means that the CLR can provide
services to the managed code and types, such as debugging, exception handling,
serialization, security and garbage collection.
Unmanaged Code
Unmanaged code is the x86 code that you have been
writing for ages: COM, COM+, C++ libraries, Win32 code, and Windows API OS
functionality. The term unmanaged indicates that this code cannot make use of
the .NET Runtime Services.
COM is like smoking: if you have not started,
you should not start now; if you are already doing it, it is time to stop. :-)
Managed Data
Managed data is data that is under the
control of the CLR's Garbage Collector. All dynamically allocated data is allocated
from the managed heap and is thus managed data. This data has metadata associated
with it and is therefore self-describing. The CLR performs memory layout of the
data and can also operate on it at runtime to discover type data through a process
known as Reflection. (Many articles on Reflection will follow.)
Global Assembly Cache and Shared Assemblies
The GAC is a shared repository of assemblies that
is machine-wide. This allows multiple applications to share an assembly. All
assemblies that are to be placed in the GAC must have a Strong Name. The GAC
also supports versioning. You may have multiple versions of the same assembly
(each with a different version number) executing simultaneously.
Suppose my application ABC uses an Assembly called
A.dll, and another application XYZ uses the same Assembly A.dll. If a new version
of ABC is released with a revised A.dll Assembly, there is no need for XYZ to
link with the new A.dll Assembly. Both A.dll Assemblies can co-exist in different
directories, and the GAC supplies each running application with the version it
needs.
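The ABC/XYZ scenario above can be modelled as a lookup keyed by name and version. The Python sketch below is a toy model only: the dictionary `cache`, the paths, and the `resolve` helper are invented for illustration and do not reflect the real on-disk layout of the GAC, and real assembly binding also takes the culture and public key into account.

```python
# Hypothetical stand-in for a machine-wide cache:
# (assembly name, version) -> install location.
cache = {
    ("A", "1.0.0.0"): "cache/A/1.0.0.0/A.dll",
    ("A", "2.0.0.0"): "cache/A/2.0.0.0/A.dll",
}

# Each application records the exact version it was built against.
app_references = {
    "ABC": ("A", "2.0.0.0"),  # the new release of ABC ships the revised A.dll
    "XYZ": ("A", "1.0.0.0"),  # XYZ keeps using the version it was built with
}


def resolve(app: str) -> str:
    """Return the location of the A.dll version this application needs."""
    return cache[app_references[app]]


print(resolve("ABC"))  # cache/A/2.0.0.0/A.dll
print(resolve("XYZ"))  # cache/A/1.0.0.0/A.dll
```

Because the version is part of the key, installing version 2.0.0.0 never overwrites 1.0.0.0, so XYZ keeps working unchanged.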
Write to you later. Till then, have fun!
Copyright © 2003
Alain Tadros, Falafel Software Inc.
ALL RIGHTS RESERVED. NO PART OF THIS DOCUMENT CAN BE COPIED IN ANY FORM WITHOUT
THE EXPRESS, WRITTEN CONSENT OF THE AUTHOR.