Data Technologies at Winton
Data is front and centre across Winton's businesses. We stream data at different frequencies into column stores; we store huge quantities of raw, uncleaned historical data in document databases; and staff at our San Francisco office explore new data technologies and search for novel data sources. But before all of this, we relied solely on Microsoft SQL Server for our data technology needs.
In 2017, Winton will celebrate its 20th anniversary, and our SQL databases are not far behind in age. The business has continued to develop and innovate, but our workhorse SQL databases have not aged quite so well. Technical debt from years of frenzied, start-up-like development left us with a highly interconnected SQL estate that was increasingly difficult to manage. Consequently, our development costs and deployment risks grew.
Our move to using Visual Studio and SSDT for SQL development was part of a larger re-think of how we designed and managed our SQL estate.
Moving to Visual Studio
The move to manage SQL code in SSDT was not taken lightly. The previous process of managing and developing shared databases through SQL Server Management Studio allowed changes to be deployed quickly. Enabling developers to work in their own isolated development environments, using Visual Studio and a local SQL instance, helped bring down the overheads of builds, tests and formal release processes.
The benefits of Visual Studio and a more traditional development process are clear:
Code is kept in Git
Changes are reviewed via pull requests
Builds catch many syntax and reference errors
Unit tests can be written and executed against the database structure
Releases are deterministic, easily reviewed and applied
Rolling back is also more deterministic – the previous state is always known exactly
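As a sketch of the unit-testing point above: one common approach is the open-source tSQLt framework, which lets tests be written and run entirely in T-SQL (SSDT also ships its own test harness). The function and test names here are invented for illustration, not real Winton objects.

```sql
-- Illustrative only: a tSQLt-style test for a hypothetical scalar function.
EXEC tSQLt.NewTestClass 'PricingTests';
GO
CREATE PROCEDURE PricingTests.[test AnnualisedReturn is zero for a flat series]
AS
BEGIN
    -- dbo.AnnualisedReturn is a made-up function standing in for real code.
    DECLARE @actual FLOAT = dbo.AnnualisedReturn(100.0, 100.0, 252);
    EXEC tSQLt.AssertEquals @Expected = 0.0, @Actual = @actual;
END;
GO
EXEC tSQLt.Run 'PricingTests';
```

Because tests like this run against the deployed database structure, they catch regressions before an upgrade script ever reaches production.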
The basic build and release cycle should be familiar to all developers.
Specific toolsets we use as part of our SQL development are:
SQL Server Data Tools (SSDT)
SSDT enables building, testing and deployment of SQL code inside Visual Studio, alongside other languages. Builds report errors such as missing objects and incorrect parameters at build time, and unit tests help ensure that changes do not break existing code, reducing deployment risk as well as development time and cost.
SSDT also uses the standard MSBuild project structure, so we were able to unify and standardise property sheets for building, testing and deploying database projects, keeping duplication to a minimum.
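To make the property-sheet point concrete, a trimmed-down `.sqlproj` might pull its shared settings from one imported file rather than repeating them per project. The file paths and project name below are illustrative, not our actual layout.

```xml
<!-- Illustrative .sqlproj fragment: common build/test/deploy settings
     live in one shared property sheet imported by every database project. -->
<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <!-- Hypothetical shared property sheet; one place to change settings. -->
  <Import Project="..\Build\Database.Common.props" />
  <PropertyGroup>
    <Name>Prices</Name>
    <DSP>Microsoft.Data.Tools.Schema.Sql.Sql2014DatabaseSchemaProvider</DSP>
  </PropertyGroup>
  <!-- Standard SSDT targets, as in any generated database project. -->
  <Import Project="$(MSBuildExtensionsPath)\Microsoft\VisualStudio\v$(VisualStudioVersion)\SSDT\Microsoft.Data.Tools.Schema.SqlTasks.targets" />
</Project>
```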
Git and Bitbucket
Our SQL codebase is kept in Git, while Bitbucket Server hosts our internal repositories and code reviews.
Jenkins
We have continuous integration and, increasingly, continuous deployment processes for all of our SQL code running inside Jenkins.
Red Gate SQL Compare
SQL Compare allows us to produce upgrade scripts by comparing the current build output with production. Our upgrade scripts are parametrised so that they can be applied to dev, UAT and production environments without further modification.
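Parametrisation of this kind is typically done with SQLCMD scripting variables, so one script serves every environment. The variable and object names below are invented for illustration.

```sql
-- Illustrative upgrade-script fragment. The same script runs against dev,
-- UAT and production by overriding the SQLCMD variable on the command line,
-- e.g.  sqlcmd -i upgrade.sql -v TargetDb="Prices_UAT"
:setvar TargetDb "Prices_Dev"   -- default value, overridden per environment
USE [$(TargetDb)];
GO
-- Hypothetical schema change applied identically in each environment.
ALTER TABLE dbo.Price ADD SourceId INT NULL;
GO
```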
When importing our databases into SSDT, we faced a number of challenges, including the following:
Some legacy SQL would not compile after import: defunct code left behind when applications were retired but their SQL was never removed.
While SQL allows circular references, SSDT projects do not. We found so-called ‘partial projects’ were insufficient to solve these problems in SSDT. Instead, we built a library of ‘Stub’ projects that broke these circular dependencies.
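To illustrate the stub approach (database and object names here are invented): if DatabaseA calls into DatabaseB and vice versa, each project can take a database reference to a small stub project instead of to the other real project, breaking the cycle at build time.

```sql
-- Illustrative contents of a 'DatabaseB.Stubs' SSDT project.
-- Only the signature matters: it exists so that DatabaseA's references
-- resolve at build time, breaking the A <-> B circular project reference.
CREATE PROCEDURE dbo.GetLatestPrice
    @InstrumentId INT
AS
    -- Stub body; the real implementation lives in the DatabaseB project.
    RETURN 0;
```

At deployment time the real DatabaseB is present on the server, so the stubs never ship; they exist purely to satisfy the compiler.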
The size of our SQL estate (over 30,000 objects) was itself a challenge when trying to represent it in SSDT. To reduce development timescales, we:
Created a ‘Jumpstart’ mechanism, rapidly deploying a versioned SQL build onto a developer’s local SQL instance;
Compiled both SSDT projects and Visual Studio solutions in parallel as far as possible;
Produced our own deployment framework that not only deploys databases from SSDT in parallel, but also generates and validates upgrade scripts in parallel as well.
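The parallel-deployment idea can be sketched as a fan-out over background jobs. This is a minimal sketch only: the `deploy` function here just echoes the publish it would perform, standing in for a real `SqlPackage /Action:Publish` call, and the database names are invented.

```shell
#!/bin/sh
# Sketch of deploying several databases in parallel.
# "deploy" echoes instead of running SqlPackage, so the sketch is runnable.
deploy() {
    echo "publishing $1.dacpac to /TargetDatabaseName:$1"
}

log=$(mktemp)
for db in Prices Trades Reference; do
    deploy "$db" >> "$log" &   # each database deploys as a background job
done
wait                           # block until every deployment has finished
deployed=$(wc -l < "$log")
echo "deployed $deployed databases in parallel"
```

The same pattern extends naturally to generating and validating upgrade scripts concurrently, as long as each job writes to its own output.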
To ensure the build and test process correctly validated SQL code, it was essential that our build servers deployed databases onto a clean SQL instance, so we:
- Provided a ‘VMs on Demand’ service, provisioning new servers with a freshly installed and configured SQL instance
- Enabled developers to use this scale-out service directly as required for their development and testing
Has the journey to Visual Studio been worth it? On balance, yes. It hasn't been without hiccups or difficulties, but we are left with a process that moves changes rapidly from development to production servers. At the same time, we are now confident that, should problems arise, we can revert to previous versions of code and identify differences quickly and easily.