A developer seeking to avoid bugs in a statically-typed language has one very dependable ally: the compiler - effectively the first line of defence. It provides guaranteed 100% coverage of code for the errors that it is able to detect, and of course the errors it throws up cannot be ignored. However, the compiler needs guidance to do this in the form of distinct types. A pattern of strong typing can therefore provide this guidance in a way that using more general types cannot.

Strong typing will also improve the readability and clarity of code, as well as making it easier for you to refactor and introduce new features with greater confidence that you are not breaking the existing code.

What I want to do below is illustrate how to go about strong typing (it’s not difficult) and expand on the advantages of doing it. The illustrative examples are financial, but the principles would benefit any codebase. Similarly, although the examples and some of the discussion are C#-specific, again the principles are generally applicable to other statically-typed languages, such as Scala or C++. Tools for making the generation of strong types in C# quicker are also available on our GitHub site.


A motivational example

 

Let’s start with a case study. Say you are a stockbroker and have a database for managing customer share accounts, in which data such as account positions are held - that is, a record, by customer account, of the number of shares held in a particular stock. The AccountPosition table might look like this:
 

AccountRef  StockId  Position
ACC001      GOOG     10000
ACC001      MSFT     15000
ACC002      GOOG     20000


Let’s also say you have a .NET API for managing this database that allows one to update the position in a particular stock for an account by specifying a position change and which returns the new position:
 

namespace Demo
{
    public interface IAccountsDb
    {
        int UpdatePosition(string accountRef,
                           string stockId,
                           int positionChange);

        /* ... other methods ... */
    }
}


This API exposes the user to a couple of possible bugs that a compiler cannot prevent:

  • the account reference and stock identifier could be passed in the wrong way round;

  • the new position could be calculated then passed in as the last argument, rather than passing in the position change.

Clearly, good parameter names help avoid mistakes like this, as would documentation, but each relies on being present and kept up to date. Thorough unit testing would also help, but strong typing gives us one more line of defence.

By using strong typing, we can utilise the compiler to avoid the problems above. Essentially, the three parameters and return from the UpdatePosition method represent four different, very specific concepts. However, we have chosen to represent them in code using two very general types, which could each represent a myriad of concepts. Instead, we should define a different type for each and adjust the API accordingly:
 

using System;

namespace Demo
{
    public struct AccountRef : IEquatable<AccountRef>
    {
        private readonly string _value;

        /* ... public interface etc. elided ... */
    }

    public struct StockId : IEquatable<StockId>
    {
        private readonly string _value;

        /* ... public interface etc. elided ... */
    }

    public struct PositionChange : IEquatable<PositionChange>
    {
        private readonly int _value;

        /* ... public interface etc. elided ... */
    }

    public struct Position : IEquatable<Position>
    {
        private readonly int _value;

        public Position Adjust(PositionChange positionChange)
        {
            return new Position(_value + (int)positionChange);
        }

        /* ... rest of public interface etc. elided ... */
    }

    public interface IAccountsDb
    {
        Position UpdatePosition(AccountRef accountRef,
                                StockId stockId,
                                PositionChange positionChange);

        /* ... other methods ... */
    }
}


Now the form of the UpdatePosition method helps to prevent the possible bugs mentioned above by using distinct types for all the concepts: it’s no longer possible to use a stock ID for the first parameter, for instance. It also adds a degree of self-documentation to the code.

We’ll discuss the various merits, demerits and caveats of this approach later, but first we’ll look at how you might want to define these types. Taking the AccountRef type in the example above, a basic definition in C# would be as below:
 

using System;

namespace Demo
{
    public struct AccountRef : IEquatable<AccountRef>
    {
        private readonly string _value;

        private AccountRef(string value)
        {
            _value = value;
        }

        public bool Equals(AccountRef other)
        {
            return string.Equals(_value, other._value);
        }

        public override bool Equals(object obj)
        {
            if (ReferenceEquals(null, obj))
            {
                return false;
            }

            return obj is AccountRef && Equals((AccountRef)obj);
        }

        public override int GetHashCode()
        {
            return _value != null ? _value.GetHashCode() : 0;
        }

        public override string ToString()
        {
            return _value ?? "<<UNKNOWN>>";
        }

        public static explicit operator AccountRef(string value)
        {
            return new AccountRef(value);
        }

        public static explicit operator string(AccountRef value)
        {
            return value._value;
        }
    }
}

You might additionally want to implement the System.IComparable<> interface for types that it would be useful to order by, and possibly mathematical operators for types such as Position.
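As a rough sketch of what that might look like for Position, the version below adds IComparable<Position>, the < and > operators and a + operator that accepts a PositionChange. The PositionChange shown here is a cut-down stand-in for the elided type above, and the choice of operators is an assumption made for illustration rather than part of the original definition.

```csharp
using System;

namespace Demo
{
    // Minimal stand-in for the PositionChange type (equality members omitted).
    public struct PositionChange
    {
        private readonly int _value;

        private PositionChange(int value)
        {
            _value = value;
        }

        public static explicit operator PositionChange(int value)
        {
            return new PositionChange(value);
        }

        public static explicit operator int(PositionChange value)
        {
            return value._value;
        }
    }

    public struct Position : IEquatable<Position>, IComparable<Position>
    {
        private readonly int _value;

        private Position(int value)
        {
            _value = value;
        }

        public bool Equals(Position other)
        {
            return _value == other._value;
        }

        public override bool Equals(object obj)
        {
            return obj is Position && Equals((Position)obj);
        }

        public override int GetHashCode()
        {
            return _value;
        }

        // Ordering simply delegates to the underlying int.
        public int CompareTo(Position other)
        {
            return _value.CompareTo(other._value);
        }

        public static bool operator <(Position left, Position right)
        {
            return left.CompareTo(right) < 0;
        }

        public static bool operator >(Position left, Position right)
        {
            return left.CompareTo(right) > 0;
        }

        // Only "position + change" is defined; adding two positions
        // together is deliberately left without an operator.
        public static Position operator +(Position position, PositionChange change)
        {
            return new Position(position._value + (int)change);
        }

        public static explicit operator Position(int value)
        {
            return new Position(value);
        }

        public static explicit operator int(Position value)
        {
            return value._value;
        }
    }
}
```

With this in place, an expression such as position + change compiles, whereas position + position does not, so the operators themselves document which arithmetic is meaningful.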

Ideally, anywhere in the code that requires an account reference would use the AccountRef type rather than a string. Clearly, however, inputs from and outputs to other systems such as databases will require conversion from or to a string, hence the provision of explicit cast operators.

A public constructor could be used in place of or as well as the string cast operator, but I feel that casting better expresses that this is a distinct type.

It could also be argued that the underlying value should be exposed via a public AccountRef.Value property, rather than held in the private member variable _value. Again, I feel providing a cast operator better expresses intention. Moreover, a public property has another failing. If in the future it were decided that the underlying value would be better as a byte array, say, then the return type of AccountRef.Value would need to change from string to byte[] and so break client code. If instead cast operators are used, then additional byte-array operators could be added, and the existing string-based operators retained and amended to do the necessary conversion, thus preserving client code.
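To make the byte-array scenario concrete, here is a minimal sketch of how AccountRef might look after such a change. It assumes, purely for illustration, that the bytes hold the UTF-8 encoding of the original string; the equality members are omitted for brevity.

```csharp
using System;
using System.Text;

namespace Demo
{
    public struct AccountRef
    {
        // The underlying representation is now a byte array
        // (assumed here to be the UTF-8 encoding of the reference).
        private readonly byte[] _value;

        private AccountRef(byte[] value)
        {
            _value = value;
        }

        // Existing string casts are retained, so client code still compiles;
        // they now convert to and from the new representation.
        public static explicit operator AccountRef(string value)
        {
            return new AccountRef(value == null ? null : Encoding.UTF8.GetBytes(value));
        }

        public static explicit operator string(AccountRef value)
        {
            return value._value == null ? null : Encoding.UTF8.GetString(value._value);
        }

        // New byte-array casts; copies are taken so that callers cannot
        // mutate the stored value.
        public static explicit operator AccountRef(byte[] value)
        {
            return new AccountRef(value == null ? null : (byte[])value.Clone());
        }

        public static explicit operator byte[](AccountRef value)
        {
            return value._value == null ? null : (byte[])value._value.Clone();
        }
    }
}
```

Client code that casts to or from string is unaffected, while new code can opt in to the byte-array casts.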

Discussion

I stated above that the new form of the UpdatePosition method helps prevent bugs of the type mentioned. The pattern essentially provides a framework for writing safer code, but does not provide an instant cure. To illustrate this, say that UpdatePosition was called somewhere in code in the following manner:
 

string accountRef = ...;
string stockId = ...;
int positionChange = ...;

int newPosition =
    (int)_accountDb
        .UpdatePosition((AccountRef)accountRef,
                        (StockId)stockId,
                        (PositionChange)positionChange);


This is only a minor improvement over the equivalent call to the original string- and int-based API, although the casts at least force the programmer to think a bit more. However, the full benefit will only be realised if these types are used throughout the code, except where that is impossible - for example, when interacting directly with a database, UI controls or another system that does not understand these types. In such cases, conversion to the strong type should take place as soon as possible, and no API of any kind - even if it’s just an internal class - should use anything other than strong types.

The benefits of strong typing go far beyond simply reducing ambiguity in method signatures. The act of encapsulation allows one to alter the underlying data type much more easily than would otherwise be the case. If it were decided that a position would be better represented by a double or long rather than the int used above, then changing the data member in Position would have considerably less impact than changing every place where an int is used to store a position. There might be places where casts to or from an int need to be addressed, but finding those places should be straightforward using a modern IDE.

Another benefit of encapsulating is the ability to perform constraint checking. If, for example, an account reference were expected to be three uppercase letters followed by three digits, then the constructor of AccountRef could be made to check that. Clearly, such a check can be made somewhere against a raw string. But the question is what is then responsible for performing that check. If just a string were used then any method handling account references would need to perform the constraint checking or just hope the value is in the correct format. On the other hand, if AccountRef is used by such methods then the correct format is guaranteed by contract. This approach also guarantees that format errors are detected as soon as they are made and not embedded deeper in the code - if, for instance, the text from a text-entry UI control is immediately cast to an AccountRef after a user has entered it, then the error will be presented to the user there and then rather than materialising deeper in the process or in another process.
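A minimal sketch of such a check is shown below, using the three-letters-then-three-digits format from the example. The regular expression, the field name Format and the choice of exception types are assumptions made for illustration, and the equality members are omitted for brevity.

```csharp
using System;
using System.Text.RegularExpressions;

namespace Demo
{
    public struct AccountRef
    {
        // Assumed format: three uppercase letters followed by three digits.
        // \z (rather than $) rejects a trailing newline.
        private static readonly Regex Format = new Regex(@"^[A-Z]{3}[0-9]{3}\z");

        private readonly string _value;

        private AccountRef(string value)
        {
            if (value == null)
            {
                throw new ArgumentNullException("value");
            }

            if (!Format.IsMatch(value))
            {
                throw new FormatException("Invalid account reference: " + value);
            }

            _value = value;
        }

        // The cast is the only way in from a string, so every
        // conversion is validated by the constructor above.
        public static explicit operator AccountRef(string value)
        {
            return new AccountRef(value);
        }

        public static explicit operator string(AccountRef value)
        {
            return value._value;
        }
    }
}
```

One caveat with a struct is that default(AccountRef) bypasses the constructor and so still holds a null value, which is why the ToString and GetHashCode implementations shown earlier guard against null.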

Yet another advantage of encapsulation is implied by the provision of the Position.Adjust method. Though there is nothing to prevent a developer from casting to int, doing the maths and then constructing a new Position instance, the presence of this method defines the correct usage pattern when adjusting positions.

Further ways in which strong typing can help with changing existing code can be illustrated by way of an example. Say there is an interface to a reporting system, which, among other things, allows for the reporting of overall position changes for a stock:
 

namespace Demo
{
    public interface IPositionReporter
    {
        void ReportPositionUpdate(StockId stockId,
                                  Position newPosition);
    }
}


Now let’s also say that we start trading futures contracts as well as stocks, and we represent futures using the strong type FutureId that, like StockId, uses an underlying type of string. Adding a method for reporting futures position updates is as straightforward as providing an overload of ReportPositionUpdate that takes a FutureId instead of a StockId. If both IDs had been represented by a type of string, then a new method name would be required (e.g. ReportFuturesPositionUpdate). Not only would the code become more verbose, there would also then be a decision required as to whether or not to rename the stock-reporting method to be consistent with the naming of the new method. Renaming would possibly widen the impact of adding the new method, particularly if this were a public API. Not renaming would make the existing method usage ambiguous - there’s nothing to say it’s only for stock position reporting. And, again, either way we have the now-familiar issue that there’s nothing the compiler can do to stop you from calling the wrong method.
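The sketch below shows how the two overloads might sit side by side. The StockId, FutureId and Position definitions are cut-down stand-ins for the full types, and RecordingPositionReporter (with its LastReport property) is a hypothetical implementation included only so that the example is self-contained.

```csharp
using System;

namespace Demo
{
    // Cut-down stand-ins for the strong types (equality members omitted).
    public struct StockId
    {
        private readonly string _value;
        private StockId(string value) { _value = value; }
        public static explicit operator StockId(string value) { return new StockId(value); }
        public static explicit operator string(StockId value) { return value._value; }
    }

    public struct FutureId
    {
        private readonly string _value;
        private FutureId(string value) { _value = value; }
        public static explicit operator FutureId(string value) { return new FutureId(value); }
        public static explicit operator string(FutureId value) { return value._value; }
    }

    public struct Position
    {
        private readonly int _value;
        private Position(int value) { _value = value; }
        public static explicit operator Position(int value) { return new Position(value); }
        public static explicit operator int(Position value) { return value._value; }
    }

    public interface IPositionReporter
    {
        void ReportPositionUpdate(StockId stockId,
                                  Position newPosition);

        // New overload: same name, distinguished purely by parameter type.
        void ReportPositionUpdate(FutureId futureId,
                                  Position newPosition);
    }

    // Hypothetical implementation that just records the last report made.
    public sealed class RecordingPositionReporter : IPositionReporter
    {
        public string LastReport { get; private set; }

        public void ReportPositionUpdate(StockId stockId, Position newPosition)
        {
            LastReport = "Stock " + (string)stockId + ": " + (int)newPosition;
        }

        public void ReportPositionUpdate(FutureId futureId, Position newPosition)
        {
            LastReport = "Future " + (string)futureId + ": " + (int)newPosition;
        }
    }
}
```

The compiler selects the overload from the static type of the first argument, so a FutureId can never be routed to the stock-reporting method by accident.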

A different modification to the reporting system might be that there’s a request to display a stock using a friendly name rather than an ID (e.g. "Google Inc." instead of "GOOG"). A new type of StockName could be added, again with an underlying type of string, and the interface adjusted to:
 

namespace Demo
{
    public interface IPositionReporter
    {
        void ReportPositionUpdate(StockName stockName,
                                  Position newPosition);
    }
}


This would break existing code such that it would no longer compile. Had a string been used for stock ID and stock name, then that would not be the case. However, the break is a good thing as it makes it a lot easier to analyse existing code and determine the changes required so that the correct data is reported.

The downsides of strong typing

 

Adding these new types will clearly increase the size of your codebase and binaries. How much of a real issue this is, though, is debatable, and unless you have a very large number of these types the benefits will almost certainly outweigh this negative. Remember too that using them removes the repeated constraint-checking code that would be needed if string or int were used instead.

Probably the biggest barrier to using these types is generating them in the first place, especially if you want the types to be comparable as well as equatable. Once generated, they tend to be very low maintenance. But the initial generation can be a pain, particularly if you have a number of them and particularly in C#. Below we look at possible ways of defining these types without having to write too much boiler-plate code. However, even if you find yourself having to write this boiler-plate code, I would argue that the improvements to your codebase you get from using strong typing are worth it. In addition, using strong typing will reduce the code required elsewhere to check constraints and reduce the amount of unit testing required.
 

The definition of strong types in different languages

 

Both Scala and C++ have language features that can reduce the amount of work involved in defining these types. In Scala you could perhaps use case classes which provide default definitions of certain operators in terms of the class parameters provided:
 

package Demo

case class AccountRef(private val value: String)


In C++, meanwhile, Boost’s BOOST_STRONG_TYPEDEF allows you to define a type in terms of another, giving equatability and comparability. This is shown below with the addition of a hash_value function, which allows the type to be used in std::unordered_set or as a key in std::unordered_map when used in conjunction with boost::hash.
 

#include <cstddef>
#include <string>

#include <boost/functional/hash.hpp>
#include <boost/serialization/strong_typedef.hpp>

namespace Demo
{
    BOOST_STRONG_TYPEDEF(std::string, AccountRef);

    std::size_t hash_value(AccountRef const& accountRef)
    {
        boost::hash<std::string> hasher;
        return hasher(static_cast<std::string>(accountRef));
    }
}


Although BOOST_STRONG_TYPEDEF defines a type that prevents a string being passed to a function expecting an AccountRef, implicit conversions in other circumstances (such as assignment) are still possible. Defining a variation on BOOST_STRONG_TYPEDEF that requires explicit conversions should be straightforward, however, and such a macro could probably include a standard definition of the hash_value function.

In C#, though, there are no handy short-cuts. However, the pain can be alleviated to a degree by using code templates for generation, and to this end you’ll find ready-made templates for use in Visual Studio and Visual Studio Code in our code templates repository on GitHub.


C# language support would be nice …

 

It would be very nice indeed if C# had something built-in to support this pattern such that an AccountRef could perhaps be defined in a way not dissimilar to Scala’s case classes:
 

namespace Demo
{
    public struct AccountRef(private string _value);
}


Ideally this would generate a struct implementing IEquatable<AccountRef> and IComparable<AccountRef>, as well as explicit casts to and from a string. Like case classes, customisation would be allowed so that constraints could be added.

As it happens, there is already a proposal for a language feature called "record" types logged at the C# Language Design repository. In their current incarnation, records wouldn’t necessarily remove all of the boiler-plate code as only IEquatable<> is proposed as being provided by default. But they would be a great step towards making it easier to define strong types. Please head over there and champion the cause if you feel this would be worth adding to the language.