Code You Should Know

OOP Pillars – Polymorphism in C#

Reading Time: 3 minutes

Polymorphism is a concept where objects of different types are accessible via the same interface. In this post I talk about the third of the OOP Pillars – Polymorphism in C#. Please see my previous posts for the other OOP Pillars Inheritance and Encapsulation.

Overview

Polymorphism comes from the words Poly, meaning many, and Morph, meaning shapes. The classic example of this concept in software development is a Shape. A shape can take many forms, for example a circle, square, triangle or rectangle, and the list goes on. All shapes share common attributes, for example all shapes have an area, but the way to calculate the area varies from shape to shape.

With polymorphism, base classes can define and implement virtual methods, and derived classes can override them to provide their own implementation. In the area calculation scenario, the base Shape class can implement a virtual method that calculates the area of a rectangle as the default. Any class that derives from Shape and is not a rectangle overrides that method to provide the implementation appropriate to its shape.

Virtual Methods

When a hierarchy of objects is created, any derived class gains the methods, fields, properties and events of the base classes. A derived class can override any base methods marked Virtual. Let’s look at an example.

The Shape class below defines its Area() and Draw() methods as virtual.

public class Shape
{
    public int X { get; set; }
    public int Y { get; set; }
    public int Height { get; set; }
    public int Width { get; set; }

    public virtual double Area()
    {
        return Height * Width;
    }

    public virtual void Draw()
    {
        Console.WriteLine("Drawing the base Shape.");
    }
}

public class Square : Shape
{
    public override void Draw()
    {
        Console.WriteLine("Drawing a Square.");
    }
}

public class Circle : Shape
{
    public int Radius { get; set; }

    public override void Draw()
    {
        Console.WriteLine("Drawing a Circle");
    }

    public override double Area()
    {
        return Math.PI * Math.Pow(Radius, 2);
    }
}

In the example above, the Square class overrides the Draw method. The Circle class overrides both the Draw and Area methods.
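To see why this matters, here is a minimal sketch of calling overridden methods through a base-class reference. It is a trimmed-down version of the classes above; the Describe() method and the ShapeDemo class are hypothetical names added for illustration, so that the result can be returned rather than written to the console.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public class Shape
{
    public int Height { get; set; }
    public int Width { get; set; }

    // Default implementation: area of a rectangle
    public virtual double Area() => Height * Width;

    // Hypothetical stand-in for Draw() that returns text
    public virtual string Describe() => "Shape";
}

public class Square : Shape
{
    public override string Describe() => "Square";
}

public class Circle : Shape
{
    public int Radius { get; set; }

    public override double Area() => Math.PI * Math.Pow(Radius, 2);

    public override string Describe() => "Circle";
}

public static class ShapeDemo
{
    public static List<string> DescribeAll()
    {
        // Every element is typed as Shape, yet each call is
        // dispatched to the override of the actual runtime type.
        var shapes = new List<Shape>
        {
            new Square { Height = 2, Width = 2 },
            new Circle { Radius = 1 }
        };

        var lines = new List<string>();
        foreach (Shape shape in shapes)
        {
            string area = shape.Area().ToString("F2", CultureInfo.InvariantCulture);
            lines.Add($"{shape.Describe()}: {area}");
        }
        return lines;
    }
}
```

Calling ShapeDemo.DescribeAll() yields one line per shape. The Square falls back to the base Area() because it does not override it, while the Circle uses its own.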

Hiding Base Members

Hiding, also known as shadowing, is when the derived class declares a member with the same name as a base class member, using the new keyword. This hides the base implementation; in other words, the base member is inaccessible from a variable of the derived type.

It is possible to access the base method by casting the instance to a base type. Therefore it is not completely inaccessible.

public class Circle : Shape
{
    public int Radius { get; set; }

    public new int Height => 0;

    public new int Width => 0;

    public override void Draw()
    {
        Console.WriteLine("Drawing a Circle");
    }

    public override double Area()
    {
        return Math.PI * Math.Pow(Radius, 2);
    }
}

If you attempt to assign a value to a circle’s Height or Width properties, you will get a compile-time error. This is because the Circle class defines new, read-only versions of the Height and Width properties.
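Here is a small sketch of hiding in action, assuming a reduced Shape and Circle with only the relevant properties. The HidingDemo class is a hypothetical helper added for illustration.

```csharp
using System;

public class Shape
{
    public int Height { get; set; }
    public int Width { get; set; }
}

public class Circle : Shape
{
    public int Radius { get; set; }

    // Hide the base properties with read-only versions
    public new int Height => 0;
    public new int Width => 0;
}

public static class HidingDemo
{
    public static (int ViaCircle, int ViaShape) Run()
    {
        var circle = new Circle { Radius = 2 };

        // circle.Height = 5;   // would not compile: Circle.Height is read-only

        // Viewing the same object as a Shape exposes the hidden base property
        Shape asShape = circle;
        asShape.Height = 5;

        // The variable's compile-time type decides which property is used
        return (circle.Height, asShape.Height);
    }
}
```

Both variables point at the same object, but circle.Height reads the hiding property while asShape.Height reads the base one. This is the key difference from overriding, where the runtime type decides.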

Sealing Methods to Prevent Overriding

An overriding method can be marked as sealed, at any point of the inheritance chain, to prevent further overriding. This works the same way as sealing classes.

Once sealed, a method can no longer be overridden. However, a deriving class can declare a new version of the sealed method. In other words, if a class inherits a base class containing a sealed method called Draw(), it is not able to override Draw(), but it can create a new implementation of Draw() using the new keyword.

public class Square : Shape
{
    public sealed override void Draw()
    {
        Console.WriteLine("Drawing a Square.");
    }
}

public class Rectangle : Square
{
    public new void Draw()
    {
        Console.WriteLine("Drawing a rectangle.");
    }
}
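The difference between overriding and declaring a new method shows up when the object is used through a base reference. The sketch below assumes Draw() returns a string instead of writing to the console, and SealedDemo is a hypothetical helper class.

```csharp
using System;

public class Shape
{
    public virtual string Draw() => "Drawing the base Shape.";
}

public class Square : Shape
{
    // Last override in the chain: nothing below can override Draw()
    public sealed override string Draw() => "Drawing a Square.";
}

public class Rectangle : Square
{
    // Not an override: a separate method that hides Square.Draw()
    public new string Draw() => "Drawing a rectangle.";
}

public static class SealedDemo
{
    public static string ViaBase()
    {
        Shape shape = new Rectangle();
        return shape.Draw();      // virtual dispatch stops at Square
    }

    public static string ViaDerived()
    {
        var rectangle = new Rectangle();
        return rectangle.Draw();  // the hiding method is used
    }
}
```

Because Rectangle.Draw() is not part of the virtual chain, code that only knows about Shape never reaches it.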

Accessing Virtual Members of the Base Class

When using the instance of a class which overrides a method in a base class, it will call the version of the method in the current instance. It is possible to access the method of the base class as well.

To do this, within the deriving class definition, qualify the method call with the base keyword. For example, a method called Draw() is called as base.Draw(), which executes the method defined by the base class.

The same keyword is used in constructors, when a deriving class calls the base class constructor.
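As a minimal sketch, assuming Draw() returns a string so the combined result is easy to inspect, an override can extend the base behavior instead of replacing it:

```csharp
using System;

public class Shape
{
    public virtual string Draw() => "Drawing the base Shape.";
}

public class Circle : Shape
{
    public override string Draw()
    {
        // Run the base implementation first, then add to it
        string baseText = base.Draw();
        return baseText + " Then drawing a Circle.";
    }
}
```

Calling Draw() on a Circle therefore runs both implementations, base first.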

I hope you enjoyed my brief introduction into OOP. Please see Microsoft’s documentation on Polymorphism or OOP for a more in-depth look into these topics.

OOP Pillars – Inheritance in C#

Reading Time: 3 minutes

One of the most important principles of OOP is the DRY principle – Don’t Repeat Yourself. Inheritance is an aspect of OOP that facilitates code reuse and eliminates duplication. In this post I talk about the next of the OOP Pillars – Inheritance in C#. If you missed my previous post on Encapsulation, you can see it here.

The Basics

Inheritance comes in two ways: an “is-a” relationship and a “has-a” relationship. In this article I discuss the “is-a” type of inheritance. In an “is-a” type of inheritance, new classes build upon existing (base) classes as a starting point. The role of the base class is to house all of the common functionality for the classes that extend it. For example, a Person class houses the shared details about a person. From a Person you extend the code to create a Student (is-a person), a Teacher (is-a person) and so on.

The class that serves as a basis for a new class is called a base class, super class or parent class. The class extending the base class is called the child or derived class. To define a new class that inherits from a base class in C#, you use a colon between the child and base class names, like so:

// a Student is-a Person
public class Student : Person
{
}

Multiple Inheritance

In C#, a given class can only have one direct base class. In other words, a class can only extend one class. It can, however, implement multiple interfaces. This allows for building more sophisticated interface hierarchies to model complex behaviors.
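Here is a short sketch of one base class combined with several interfaces. The IEnrollable and IPayable interfaces, their methods and the fee of 100 per credit are hypothetical, chosen only to illustrate the syntax.

```csharp
using System;

// Hypothetical interfaces, for illustration only
public interface IEnrollable
{
    string Enroll(string course);
}

public interface IPayable
{
    decimal CalculateFees(int credits);
}

public class Person
{
    public string FirstName { get; set; }
}

// One base class (listed first), followed by any number of interfaces
public class Student : Person, IEnrollable, IPayable
{
    public string Enroll(string course) => $"{FirstName} enrolled in {course}";

    public decimal CalculateFees(int credits) => credits * 100m;
}
```

A Student can then be passed to any code that expects a Person, an IEnrollable or an IPayable.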

Next, we will see some sample code.

public class Person
{
    private string _firstName;
    private string _lastName;

    public string FirstName
    {
        get => _firstName;
        set
        {
            if (value.Length > 50)
            {
                throw new ArgumentException("First Name is longer than the allowed 50 characters",
                    nameof(FirstName));
            }

            _firstName = value;
        }
    }
    public string LastName
    {
        get => _lastName;
        set
        {
            if (value.Length > 50)
            {
                throw new ArgumentException("Last Name is longer than the allowed 50 characters",
                    nameof(LastName));
            }

            _lastName = value;
        }
    }
    public DateTime DateOfBirth { get; set; }
    public string Gender { get; set; }
}

Notice that FirstName and LastName are enforcing encapsulation and business rules. A child class cannot access private members of a parent class. In other words, any class extending Person cannot access Person._firstName and Person._lastName directly.

An alternative is to change the private members to protected. Protected members are directly accessible to all descendants of a class. The downside is that descendants can then bypass the business rules, which violates encapsulation. Next we add a Student class that extends Person:

// a Student is-a Person
public class Student : Person
{
    public int StudentNumber { get; set; }
    public string StudentType { get; set; }
}

By extending Person, the Student class now has the same public members as the Person class. The following code sets the inherited members, alongside its own, on a Student instance:

    static void Main(string[] args)
    {
        Student student = new Student();

        student.FirstName = "Frank";
        student.LastName = "Jones";
        student.DateOfBirth = new DateTime(1992, 12, 1);
        student.StudentType = "Full Time";
    }

Constructors

Classes never inherit the constructors of a parent class. A constructor only initializes the class in which it is defined. A child class can call a constructor of its parent class, but that constructor does not initialize the child class’s own members.

If we add a constructor to the Person class that initializes its properties, then we can call that constructor from the Student class, like so:

public class Person
{

    public Person(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }
    private string _firstName;
    private string _lastName;
    ...
}

The child class can call the parent class’s constructor and pass the appropriate parameters by using chaining. In the example below, the constructor for the Student class takes the parameters for first and last names. It chains to the constructor of the base class by using : base(firstName, lastName). This allows the child class to initialize the parent class’s properties via its own constructor.

public class Student : Person
{
    public int StudentNumber { get; set; }
    public string StudentType { get; set; }

    public Student(string firstName, string lastName, string studentType) 
        : base(firstName, lastName)
    {
        StudentType = studentType;
    }
}

Capping the Chain of Inheritance

Some classes should not be extended. When you create a class and decide that it should not be extended, you can mark the class with the sealed keyword. Once a class is marked sealed, any attempt to extend it produces a compile-time error.
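As a minimal sketch, assuming empty Person and Student classes, sealing looks like this. The GraduateStudent class is hypothetical and is left commented out because it would not compile.

```csharp
using System;

public class Person
{
}

// Student can extend Person, but nothing can extend Student
public sealed class Student : Person
{
}

// Uncommenting the next line produces a compile-time error,
// because Student is sealed:
// public class GraduateStudent : Student { }
```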

Implicit Inheritance

In .NET, all types implicitly inherit from System.Object. This provides some basic functionality for all types, such as Equals(Object), GetType() and ToString().
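Because these members come from System.Object, any class can override the virtual ones. Here is a short sketch, assuming a simplified Person with auto-properties, that overrides ToString():

```csharp
using System;

// No base class is listed, so Person derives from System.Object
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }

    // ToString() is declared virtual on System.Object
    public override string ToString() => $"{FirstName} {LastName}";
}
```

Even when the instance is held in a variable of type object, ToString() resolves to the Person override and GetType() reports Person.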

OOP Pillars – Encapsulation in C#

Reading Time: 3 minutes

In this article I talk about the first of the OOP Pillars – Encapsulation in C#. To fully understand encapsulation, you must have a basic knowledge of Classes in C#. Therefore, if you’re new to C# or need a refresher on Classes, please see my previous post on classes.

Encapsulation is the concept where an object’s data is not directly accessible via an object’s instance. Rather, an object’s data is declared private and access to it is done via public properties.

To illustrate this concept, I will use a Car class.

public class Car
{
    public int NumberOfDoors;
}

The problem with public data members is that the member is unable to validate values assigned to it. The assigned value can be a valid integer, but that does not mean that it is a valid number of doors for a car.

To resolve this problem, Encapsulation provides a way to ensure the integrity of an object’s state data by using Accessor and Mutator methods. An Accessor (or get) method is a public method that returns the value of a private member. A Mutator (or set) method is a public method that sets the value of a private member. C# provides get and set functionality via Properties, which are a simplification of the Accessor and Mutator methods. Let’s update our class definition to provide access to the private data via a property.

public class Car
{
    private int _numberOfDoors;
    // Constructor
    public Car(int numberOfDoors)
    {
         _numberOfDoors = numberOfDoors;
    }
    public int NumberOfDoors
    {
        get => _numberOfDoors;
        set
        {
            if (value >= 2 && value <= 5)
            {
                _numberOfDoors = value;
            }
            else
            {
                throw new ArgumentOutOfRangeException(nameof(value));
            }
        }
    }
}

In the code above, _numberOfDoors is only accessible to callers via the public property NumberOfDoors. The public property ensures that the value assigned is within the valid range, in this case 2-5. Once a property is in place, it appears to the caller that they’re accessing the data directly. However, the appropriate get and set methods are called behind the scenes, preserving Encapsulation.

Setting Private Properties Via a Constructor

A constructor has access to private members. In the example above, the constructor sets the value of the _numberOfDoors member directly. When the member is set directly, no business validation runs on the incoming value, which is a poor approach. A better approach is to validate the incoming data. Rather than repeating the checks, the constructor can assign the value to the NumberOfDoors property and reuse the validation in its setter.

    public Car(int numberOfDoors)
    {
        // Assignment of NumberOfDoors property
        NumberOfDoors = numberOfDoors;
    }

It is also good practice to use properties throughout your class implementation to ensure business rules are validated and to reduce duplication.

Read-Only and Write-Only Properties

When encapsulating private data, you can configure properties to be read-only or write-only. This is accomplished by omitting the set or get block, respectively. Alternatively, you can mark the set block with the private keyword. The property then remains read-only to outside callers, but the constructor can still set the value via the property, so business rules are validated.

    public Car(string make, int numberOfDoors)
    {
        Make = make;
        NumberOfDoors = numberOfDoors;
    }

    public string Make
    {
        get => _make;
        private set
        {
            if (value.Length <= 50)
            {
                _make = value;
            }
            else
            {
                throw new ArgumentOutOfRangeException(nameof(value));
            }
        }
    }

Partial Classes

Lastly, I would like to briefly talk about partial classes. In C#, you can partition a class definition across multiple files. This allows you to isolate code that is frequently used or modified from the standard, base code. This is a simple, yet powerful concept.

One example of partial classes is Entity Framework’s database-first approach. The generated code is placed into partial classes. This allows the developer to, for example, add validation to the properties in a separate file. In other words, developers can add their modifications unobtrusively and keep their changes when EF re-generates the database code.
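Here is a minimal sketch of the idea. The Customer class, the file names in the comments and the IsValid() rule are hypothetical; in a real project the two parts would live in separate files within the same assembly and namespace.

```csharp
using System;

// File: Customer.generated.cs (hypothetical generated code)
public partial class Customer
{
    public string Name { get; set; }
}

// File: Customer.cs (hand-written additions, untouched by re-generation)
public partial class Customer
{
    // Validation lives next to, not inside, the generated code
    public bool IsValid() =>
        !string.IsNullOrWhiteSpace(Name) && Name.Length <= 50;
}
```

The compiler merges both parts into a single Customer type, so callers see Name and IsValid() side by side.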

How to use Lazy initialization in C#

Reading Time: 3 minutes

Background

Lazy initialization (instantiation) is a technique to delay the creation of an object until the first time the application requires the object. This technique provides a mechanism to control when an object is created. Lazy initialization can improve performance by avoiding unnecessary computation and reducing the memory footprint. The most typical implementation of lazy instantiation is to augment a Get method to check if an instance of the object already exists. The Get method creates the instance if it does not already exist; otherwise it returns the existing instance. In this article I cover how to improve performance using lazy initialization in C#.
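The hand-written Get method described above can be sketched as follows. The Order and OrderService classes and the CreationCount property are hypothetical names used for illustration. Note that this simple version is not thread safe.

```csharp
using System;

public class Order
{
    public int ClientId { get; set; }
}

public class OrderService
{
    private Order _order;

    // Counts how many times an Order was actually created
    public int CreationCount { get; private set; }

    public Order GetOrder()
    {
        // Create the instance only on first request
        if (_order == null)
        {
            _order = new Order();
            CreationCount++;
        }

        // Later calls return the stored instance
        return _order;
    }
}
```

No Order exists until GetOrder() is called for the first time, and repeated calls keep returning that same object.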

Typical usage

To create large objects, especially the ones the application will not always use.
To create objects which are expensive to create.
Singleton pattern implementation.
In conjunction with the Factory pattern.

Lazy<T> in C#

The Lazy<T> class has several constructor overloads. I will only cover two of them. The first is the parameter-less constructor. It calls the matching (parameter-less) constructor of the object T. The second takes a delegate as a parameter. The delegate creates the object T, so it decides which constructor of T is called and with which arguments. See the following two examples for the different constructors.

    private static void LazyWithDefaultConstructor()
    {
        Console.WriteLine("Lazy<T> example.");
        // Parameter-less constructor: Order is created via its own parameter-less constructor
        Lazy<Order> order = new Lazy<Order>();

        Console.WriteLine("Order should not yet have been initialized.");

        // The first access to Value creates the Order
        order.Value.ClientId = 10;
        Console.WriteLine($"Order has been initialized and client Id set to {order.Value.ClientId}");

        Console.WriteLine("Press Enter to close the application.");
        Console.ReadLine();
    }

    private static void LazyWithDelegateConstructor()
    {
        Console.WriteLine("Lazy<T> example.");
        // Delegate constructor: the lambda decides how the Order is created
        Lazy<Order> order = new Lazy<Order>(() => new Order(55));

        Console.WriteLine("Order should not yet have been initialized.");

        Console.WriteLine($"Order has been initialized and client Id value, set by the constructor is {order.Value.ClientId}");

        Console.WriteLine("Press Enter to close the application.");
        Console.ReadLine();
    }

A Lazy object does not create an instance of T until the Value property is accessed for the first time. Once initially accessed, the Lazy object creates an instance of T and stores it for future use. In other words, the Value property will always return the same object with which it was initialized.

The Value property is read-only. Therefore, if T is a reference type, a new object cannot be assigned to it. However, any of the object’s public, settable properties can be modified. If T is a value type, Value returns a copy, so you cannot modify the stored value through it. You can, however, create a new Lazy<T> variable and provide a different set of parameters to its constructor.

Thread safety

As a final note, I would like to mention that, by default, Lazy<T> is thread safe. This means that the first thread to access the Value property triggers the object initialization, and every subsequent thread then receives that same instance. Each Lazy<T> object creates and stores exactly one instance of T. For example, if two threads access the Value property of the same Lazy object, both threads receive the same instance. Separate Lazy objects, for example ones created with different constructor parameters, each create their own instance.
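Here is a small sketch that exercises the default thread-safe behavior. The Order class, the LazyThreadDemo helper and the choice of eight parallel readers are assumptions made for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class Order
{
    public int ClientId { get; }

    public Order(int clientId)
    {
        ClientId = clientId;
    }
}

public static class LazyThreadDemo
{
    private static int _factoryCalls;

    public static (int FactoryCalls, bool SameInstance) Run()
    {
        _factoryCalls = 0;

        // Default thread-safety mode: only one thread runs the factory
        var lazyOrder = new Lazy<Order>(() =>
        {
            Interlocked.Increment(ref _factoryCalls);
            return new Order(55);
        });

        // Read Value from several threads at once
        var results = new Order[8];
        Parallel.For(0, results.Length, i => results[i] = lazyOrder.Value);

        // Check that every reader received the same object
        bool same = true;
        foreach (Order order in results)
        {
            same &= ReferenceEquals(order, results[0]);
        }

        return (_factoryCalls, same);
    }
}
```

However the reads are scheduled, the factory delegate should run once and every reader should see the same Order.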

One habit to become a better Software Developer

Reading Time: 3 minutes

In today’s article I discuss one of the more effective habits for becoming a better developer. As a software developer you must learn continuously. Technology changes at an incredibly rapid pace. To keep up with all those changes is like trying to drink water from a fire hose. Therefore, it is important to learn your core technologies in depth.

Picture this scenario, if you will. You are working on an application and you must create some functionality that you don’t know exactly how to build. Or you know how, but you are not sure whether the way you know is the best way to do it. You consult the internet for guidance and you find many different opinions on how to do this. You think you have enough information to come up with a good solution. It works, great! You implement the changes and run the tests. They all pass. You feel good about this solution. Eventually the application goes into production and you move on to the next task, but chances are that you did not learn that part of the technology well enough, if at all.

This is something that happens very often. Don’t believe me? Look at some of the development boards, such as StackOverflow or Reddit. There are countless questions about how to do this, how to use that, or what the best practices are for accomplishing X. This is normal. It’s a process of collaboration and information sharing, to learn from those who have already faced this challenge. The important step here is to not simply learn enough to accomplish your task, but rather to dive deeper and truly understand this part of the technology.

The Learning Steps

There are many ways to learn something. These are the ways that have worked best in my experience.

Start by reading the documentation and understanding any examples provided.
Do a few examples on your own, where you must think, not just copy-and-paste. This is crucial because it forces your brain to think and solidifies the knowledge.
Look for best practices, where applicable. This will give you different perspectives on the problems the technology can help solve, beyond your own.
Understand performance implications, if applicable.

Learn Technologies You Can Practice

It is important to know which technologies you should invest your time in learning. When I say learn your core technologies in depth, I mean just that: technologies that you use, or will use, on a regular basis. Technologies which are part of your day-to-day work.

The reason is that if you are going to devote the time to learn something in depth, it is important that you have the opportunity to practice what you learn. Otherwise, you will likely end up forgetting some or most of the details of what you learned. I say this under the assumption that your time is limited and valuable, therefore you must make the most of it by investing in knowledge you can benefit from, short and long term.

Make It a Lifetime Habit

When you begin digging deeper, your outlook changes and you begin thinking of more in-depth questions. In other words, digging deeper causes you to want to dig even deeper. As a result, it does not take long for the process to become second nature, causing you to ask and think at a deeper level. And when you make this a recurring habit, your knowledge grows quickly, making you a continuously better developer.

Maintaining business-critical data integrity in Azure

Reading Time: 3 minutes

I recently learned about an Azure feature called Immutable Blob Storage. Also known as WORM (write once, read many) storage, this feature makes data non-erasable and non-modifiable. It is quite a simple feature to understand and implement. In this article I discuss the typical scenarios, how it works and provide an example of how to use this feature.

Side Note: As of the time of this writing, this feature is available for Blob Storage accounts in all Azure regions.

Typical Scenarios

These are a few cases for immutable storage:

Immutable storage is crucial for industries such as Financial and Healthcare, where data integrity is paramount. It helps businesses achieve regulatory compliance such as HIPAA and FINRA.
Immutable storage keeps data in a tamper-proof state. This is a requirement for cases such as litigation-related or business documents.
Immutable storage ensures that users cannot modify documents, including those with administrator privileges.

How It Works

Azure Immutable Storage comes with two types of immutable policies: one is time-based and the other is legal hold. Both policies have the same effect, that is, once implemented on a given storage container, all documents in that container go into an immutable state.

It is not possible to delete a container or account if there are any blobs protected by the immutable policy. An attempt to do so will fail while at least one policy is active in at least one storage container in the account.

Policy Types

Time-based policy: When you add a time-based policy to a container, it applies to all documents in that container. For new documents, the time a document remains immutable is the full time specified by the policy. For existing documents, the time it remains immutable is based on when the document was last modified. In other words, for existing documents the time span begins when the document was last modified, not when it was created.

Example: Suppose you have a storage container that has one file created 5 months ago. You add a time-based policy defining a 6-month hold. This file will have an effective hold of 1 month because it has remained in an unmodified state for 5 months prior to when the policy went into effect. Therefore, only the difference in time is the effective hold time.

Meanwhile, a new file added after the policy is created will have an effective hold of 6 months.
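The arithmetic in this example can be sketched in a few lines. This only illustrates the calculation as described above; it is not an Azure API. The RetentionMath class and the EffectiveHold method are hypothetical names, and the months are approximated as 30 days each.

```csharp
using System;

public static class RetentionMath
{
    // Time a document still has to remain immutable, given when it was
    // last modified and the retention interval set by the policy.
    public static TimeSpan EffectiveHold(
        DateTimeOffset lastModified,
        TimeSpan retention,
        DateTimeOffset now)
    {
        DateTimeOffset expires = lastModified + retention;

        // A document older than the interval has no remaining hold
        return expires > now ? expires - now : TimeSpan.Zero;
    }
}
```

With a 180-day policy, a file last modified 150 days ago has 30 days of hold remaining, while a file added today has the full 180 days.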

A time-based policy has two states, locked or unlocked. Once a time-based policy is locked, it can no longer be deleted. Locked policies still allow extensions of the retention interval. Most regulatory compliance rules require the time-based policy to be locked before the storage is considered compliant.

Legal-hold policy: When you add a legal hold policy to a container, all documents in that container remain in an immutable state until the legal hold is removed. A legal hold policy is very straightforward.

How to Implement

Implementing a hold policy is quite simple. Once you have created a storage container, you can add a policy via the Portal or command line.

Here I create a new Resource Group, Storage Account and Container. The container is called immutableblob. Next, open the container’s Access Policy options via the options menu to the right of the container name.

Once here, you have the option for Immutable Blob Storage.

From there, choose the Policy Type – Time Based or Legal Hold. The Time Based policy requires a parameter for the number of days for which it should remain in effect. If the Time Based policy is not locked, it can be removed at any time.

This covers the basics of Azure Immutable Blob Storage. As you can see, the feature is fairly simple to understand and implement. More details, including an FAQ and scripts are available on Microsoft’s documentation page.

Understanding Classes in C#

Reading Time: 4 minutes

In today’s article I discuss Classes in C#. Classes are the most fundamental programming construct in .NET and an essential part of the OOP paradigm. Therefore it is crucial to understand classes in order to fully understand OOP concepts.

Definition

Formally, a class is defined as a user-defined type that is composed of field data and methods that operate on this data. A Class is a reference type, meaning a variable holding an instance of a class stores a pointer, or reference, to the memory address where the values are stored.

Classes are not the same thing as Objects. A Class is a definition, or prototype, that defines the properties, methods, etc., while an Object is an instance of a Class. An object contains state and behavior. Classes allow bundling of data (state) and functionality (behavior) in a single definition, which allows applications to be modeled after real-world entities. This is a very powerful concept.
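A quick sketch shows what being a reference type means in practice. The Counter class and the ReferenceDemo helper are hypothetical names used only for illustration.

```csharp
using System;

public class Counter
{
    public int Value { get; set; }
}

public static class ReferenceDemo
{
    public static int Run()
    {
        var first = new Counter { Value = 1 };

        // Copies the reference, not the object
        Counter second = first;

        // Both variables point at the same object in memory
        second.Value = 42;

        return first.Value;
    }
}
```

Changing the object through the second variable is visible through the first, because only the reference was copied.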

Defining a Class

We begin with the simplest possible declaration of a class:

class Bicycle
{
}

Then add two properties, NumberOfGears and Make. NumberOfGears uses a feature called Auto-Property, which is simple syntactic sugar.

public class Bicycle
{
    // A property created using Auto-Property
    public int NumberOfGears { get; set; }

    // A property created without using Auto-Property
    private string _make;
    public string Make
    {
        get { return _make; }
        set { _make = value; }
    }
}

A Side Note: Class data should not be defined as public fields. It is best practice to use private fields and allow access via properties to preserve the integrity of the state data.

As I previously mentioned, an Object is an instance of a Class. An Object must be instantiated before it can be used. By instantiating an Object, memory is allocated for that Object. An Object is instantiated by using the new keyword, like so:

Bicycle schwinn = new Bicycle();

Constructors

Classes contain constructors, which are special methods that have no return type and can take zero or more parameters. Constructors run when the class is instantiated. A constructor’s purpose is to assign initial values to the class’s state. If a class defines no constructor, the compiler provides a default constructor that takes no parameters and leaves each field and auto-property at the default value for its type. Once any constructor is defined for a class, that default constructor is no longer provided.

Let’s add a constructor to our Bicycle class.

public class Bicycle
{
    // A property created using Auto-Property
    public int NumberOfGears { get; set; }

    // A property created without using Auto-Property
    private string _make;
    public string Make
    {
        get { return _make; }
        set { _make = value; }
    }

    public Bicycle(int numberOfGears, string make)
    {
        Make = make;
        NumberOfGears = numberOfGears;
    }
}

Also notice that our initial instantiation example now shows an error. This is because the default constructor is no longer available, so the only constructor option is the one explicitly defined.

Classes can have multiple constructors. The requirement is that they have different signatures, differing in the number, types or order of their parameters.

Constructors can also be chained; in other words, one constructor can trigger a different constructor. Here is an example of creating a parameter-less constructor, which calls a second constructor and passes default parameters to it.

public class Bicycle
{
    // A property created using Auto-Property
    public int NumberOfGears { get; set; }

    // A property created without using Auto-Property
    private string _make;
    public string Make
    {
        get { return _make; }
        set { _make = value; }
    }

    public Bicycle() : this(0, "N/A")
    {
        Console.WriteLine("Calling parameter-less constructor.");
    }

    public Bicycle(int numberOfGears, string make)
    {
        Make = make;
        NumberOfGears = numberOfGears;

        Console.WriteLine("Calling constructor with 2 parameters.");
    }
}

The best approach for constructor chaining is to start with the constructor that takes the largest number of arguments and use it as the master constructor. The other constructors then chain to the master constructor and provide default values for the parameters they do not take themselves.

The Static Modifier

To understand the static keyword you must first know where it can be applied. The static keyword can be applied to the following:

A Class definition
Properties and fields of a class
Methods of a class
A constructor

When applied to a Class, the class cannot be instantiated, in other words, you cannot use the new operator to instantiate a variable of the Static class type. A Static class is accessed directly by using the class name.

Static classes are generally used as a container for a set of stateless functions. A non-static class can also contain static properties. Static properties cannot be accessed through an instance of that class; in other words, they are only accessible through the class name.

A class can have one static constructor, typically used when the class contains static members. The static constructor is declared as static <className>(). Static constructors do not take an access modifier or parameters and do not have a return type. The static constructor initializes the static members of the class. It runs only once, before the first instance of the class is created or any of its static members is accessed.
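As a short sketch of a static class used as a container for stateless functions, consider the converter below. The TemperatureConverter class and its methods are hypothetical examples.

```csharp
using System;

// Cannot be instantiated: "new TemperatureConverter()" would not compile
public static class TemperatureConverter
{
    public static double CelsiusToFahrenheit(double celsius) =>
        celsius * 9 / 5 + 32;

    public static double FahrenheitToCelsius(double fahrenheit) =>
        (fahrenheit - 32) * 5 / 9;
}
```

The methods are called directly on the class name, for example TemperatureConverter.CelsiusToFahrenheit(100).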

Here is an example of our Bicycle class, where I add a static property to keep track of the NumberOfBikes instantiated.

public class Bicycle
{
    public static int NumberOfBikes { get; set; }
    // A property created using Auto-Property
    public int NumberOfGears { get; set; }

    // A property created without using Auto-Property
    private string _make;
    public string Make
    {
        get { return _make; }
        set { _make = value; }
    }

    static Bicycle()
    {
        NumberOfBikes = 0;
    }
    public Bicycle() : this(0, "N/A")
    { }

    public Bicycle(int numberOfGears, string make)
    {
        Make = make;
        NumberOfGears = numberOfGears;
        NumberOfBikes++;
    }
}

Now our application produces the following output:

As you can see above, the static constructor sets the value of NumberOfBikes to 0, then the non-static constructor increments it by 1. This confirms that the static constructor runs only once. Otherwise the NumberOfBikes would always reset to 1 (initial 0 plus increment) for each instantiation.
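To check the ordering for yourself, here is a minimal sketch that records which constructor runs when. The Tracked class and its Log list are hypothetical and exist only for this illustration.

```csharp
using System.Collections.Generic;

// Hypothetical class that records the order in which its constructors run.
public class Tracked
{
    // Static initializers run before the body of the static constructor.
    public static List<string> Log { get; } = new List<string>();
    public static int Count { get; private set; }

    static Tracked()
    {
        Log.Add("static");
    }

    public Tracked()
    {
        Count++;
        Log.Add("instance " + Count);
    }
}
```

After creating three Tracked objects in a fresh process, Log holds "static", "instance 1", "instance 2", "instance 3", with the "static" entry appearing exactly once.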

Where To Go From Here

This concludes my brief introduction to classes in C#. I have certainly not covered everything there is to know about classes. My intention is to provide a starting point, enough to build upon and learn more advanced concepts.

Microsoft provides excellent and in-depth documentation on this topic.

C# Value Types performance problems and solutions.

Reading Time: 4 minutes

Some years ago I worked on a project where I built a small rules engine. The rules run any time a user logs into the application, and they deal with a fairly large amount of data. Performance is therefore a key consideration.

In this article I discuss:

How Value Types are stored in memory
How the default Object methods can cause performance issues
Solutions for these issues

Value Types and Memory

.Net stores Value Types on the Stack, or inline within the containing type. This means that if you have a Struct such as:

public struct MyStruct
{
    public int Index;
    public bool IsInitialized;
}

The memory for the Index and IsInitialized values is stored within the memory allocated for each MyStruct instance. For example, an array of MyStruct is stored in sequential memory like so:

For comparison, a Reference Type variable does not contain the data directly; it contains a memory reference to the data. Reference Type instances are stored on the Garbage Collected Heap, and each one carries two header fields in addition to its data. This looks something like this:

An array of Boxed MyStruct values would be stored in non-sequential memory and look something like this:

Figure: reference type array memory representation

Reference Types (including boxed values) take up more memory than their Value Type counterparts because of the added headers. Boxing and Unboxing are also computationally expensive, so avoiding them is key to achieving performance gains.

Default Object Methods

In .Net all types are derived from System.Object (or just Object). Object is a Reference type and also a base type of Value Types, but how can this be?

The reason for this is that Value Types derive from a class called ValueType, which is part of the System namespace. The ValueType class overrides virtual methods from Object with more appropriate implementations for Value Types. This means that when a Value Type is expected to behave like a Reference Type, a wrapper that makes it behave like one is created and allocated on the GC Heap. This process is known as Boxing.
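A small sketch makes the copy semantics of Boxing visible. It reuses the MyStruct definition from above; the BoxingDemo class and its Run method are hypothetical helpers added for this illustration.

```csharp
public struct MyStruct
{
    public int Index;
    public bool IsInitialized;
}

// Hypothetical helper that shows a boxed value is an independent copy.
public static class BoxingDemo
{
    public static (int original, int boxedCopy) Run()
    {
        MyStruct s = new MyStruct { Index = 1, IsInitialized = true };

        object boxed = s;   // Boxing: the value is copied into a new heap object
        s.Index = 2;        // changes only the original, not the boxed copy

        MyStruct unboxed = (MyStruct)boxed; // Unboxing: the value is copied back out
        return (s.Index, unboxed.Index);
    }
}
```

Run() returns (2, 1): the boxed wrapper kept the value it was given at the moment of Boxing, which is why every Boxing operation costs an allocation and a copy.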

System.Object defines an Equals method. If we look at its signature, we notice that it takes an Object (Reference Type) as the “compare to” value.

public virtual bool Equals (object obj);

The default Value Type implementation of this method can fall back to Reflection to compare fields, and it causes the value to be Boxed for each comparison. Understanding this is key to improving performance.

In this example, I use the following struct to make a large number of comparisons using the default ValueType.Equals() implementation. The results are shown below. They list the average time per call to ValueType.Equals() and the number of Garbage Collections performed.

public struct MyStruct
{
    public int Index;
    public bool IsInitialized;
}
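A measurement of this kind could be sketched roughly as follows. This is not the harness used for the results in this post; the EqualsBenchmark class and its Measure method are assumptions made for illustration, and a dedicated benchmarking library would give more reliable numbers.

```csharp
using System;
using System.Diagnostics;

public struct MyStruct
{
    public int Index;
    public bool IsInitialized;
}

// Hypothetical, simplified timing loop around the default ValueType.Equals().
public static class EqualsBenchmark
{
    // Returns the average nanoseconds per call and the number of
    // generation 0 Garbage Collections that happened during the loop.
    public static (double nsPerCall, int gen0Collections) Measure(int iterations)
    {
        var a = new MyStruct { Index = 1, IsInitialized = true };
        var b = new MyStruct { Index = 1, IsInitialized = true };
        int matches = 0;

        int gcBefore = GC.CollectionCount(0);
        var timer = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            // MyStruct does not override Equals, so this resolves to
            // ValueType.Equals(object) and the values are Boxed on every call.
            if (a.Equals(b))
            {
                matches++;
            }
        }
        timer.Stop();
        int gcAfter = GC.CollectionCount(0);

        if (matches != iterations)
        {
            throw new InvalidOperationException("Expected every comparison to match.");
        }

        double nsPerCall = timer.Elapsed.TotalMilliseconds * 1_000_000 / iterations;
        return (nsPerCall, gcAfter - gcBefore);
    }
}
```

Calling EqualsBenchmark.Measure with a large iteration count, in a Release build, gives a rough per-call time and Garbage Collection count that can be compared before and after changing the struct.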

The Solution to the Performance Problem

Once you know what causes the issue, the solution is quite simple. The first step is to override the Equals method to prevent unnecessary Reflection. The second is to implement the IEquatable<T> interface, which requires an overload of the Equals() method that takes the type T as its parameter. By implementing IEquatable<T> for a Value Type, you create a strongly typed implementation of the Equals method, which effectively eliminates the need for Boxing. The next iteration of MyStruct shows how this is done.

using System;

public struct MyStructV2 : IEquatable<MyStructV2>
{
    public int Index;
    public bool IsInitialized;

    // Untyped overload inherited from Object; the argument arrives Boxed.
    public override bool Equals(object obj)
    {
        // If the type is not MyStructV2, the values cannot be equal
        if ((obj is MyStructV2) == false)
        {
            return false;
        }

        // Unbox the incoming object and reuse the strongly typed comparison
        return Equals((MyStructV2) obj);
    }

    // Strongly typed overload from IEquatable<T>; no Boxing is needed.
    public bool Equals(MyStructV2 other)
    {
        // Compare every field, as the default ValueType.Equals() does
        return Index == other.Index && IsInitialized == other.IsInitialized;
    }
}

The performance improvement is over 73-fold, with 0 Garbage Collections. This is significant because the cost of Boxing and Reflection adds up quickly at scale.

One last thing to note is that ValueType.GetHashCode is an override of the Object.GetHashCode implementation, and its default behavior is often not well suited to your own Value Types. It is therefore worth providing an implementation of the GetHashCode method for your Value Type, especially when you override the Equals method, because the hash codes of two equal instances must also be equal.

HashSets and Dictionaries make use of GetHashCode. Therefore, if there is any chance your Value Type will be used in one of those collections, you should provide a custom implementation of GetHashCode.
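One possible implementation is sketched below. The MyStructV3 name is hypothetical, the struct is assumed to treat both fields as part of its equality, and HashCode.Combine assumes a runtime that ships System.HashCode (.Net Core 2.1 or later; older frameworks need a different way to combine the field hashes).

```csharp
using System;

// Hypothetical next iteration of the struct, adding GetHashCode.
public struct MyStructV3 : IEquatable<MyStructV3>
{
    public int Index;
    public bool IsInitialized;

    public bool Equals(MyStructV3 other)
    {
        return Index == other.Index && IsInitialized == other.IsInitialized;
    }

    public override bool Equals(object obj)
    {
        return obj is MyStructV3 other && Equals(other);
    }

    // Built from the same fields that Equals compares, so two equal
    // values always produce the same hash code.
    public override int GetHashCode()
    {
        return HashCode.Combine(Index, IsInitialized);
    }
}
```

With this in place, adding two MyStructV3 values with the same Index and IsInitialized to a HashSet<MyStructV3> leaves a single entry, because the set finds the second value through its hash code and then confirms the match with Equals.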

Welcome to my site

Reading Time: < 1 minute

Hello, my name is Alessandro Buchala. I am a full-stack .Net software engineer who’s been working in the Healthcare space for the past 13+ years. This work has spanned a variety of health care systems, including Electronic Medical Records, Data Synchronization and Management, and Clinical Decision Support systems.

In my time away from the keyboard I like to ride my bike, run and build Lego with my son.

I love learning new things and sharing my knowledge with others and I hope you can learn something from my blog.