Technology Blog

Creating Spark Dataframes without a SparkSession for tests

3/26/2019 12:58:01 PM

Back to the scheduled .NET content after this brief diversion into... Java.

I'm currently helping a team put some tests around a Spark application, and one of the big bugbears is testing the raw data transformations and functions that'll run inside the Spark cluster from outside of it. It turns out that most of the core Spark types hang off a SparkSession and can't really be constructed manually - something a quick StackOverflow query appears to confirm. People just can't seem to create Spark Dataframes outside of a Spark session.

Except you can.

All a Spark Dataframe really is, is a schema and a collection of Rows - so with a little bit of digging, you realise that if you can only create a row and a schema, everything'll be alright. So you dig, and you discover no public constructors and no obvious ways to create the test data you need.

Unless you apply a little bit of reflection magic, and then you can create a schema with some data rows trivially.

Copy pasta to your heart's content. Test that Spark code, it's not going to test itself.

Building .NET Apps in VSCode (Not .NetCore)

11/17/2016 10:09:50 AM

With all the fanfare around .NET Core and VS Code, you might have been led to believe that you can't build your boring old .NET apps inside of VS Code, but that's not the case.

You can build your plain old .NET solutions (PONS? Boring old .NET projects? BONPS? God knows) by shelling out using the external tasks feature of the editor.

First, make sure you have a couple of plugins available

  • C# for Visual Studio Code (powered by OmniSharp)
  • MSBuild Tools (for syntax highlighting)

Now, "Open Folder" on the root of your repository and press CTRL+SHIFT+B.

VS Code will complain that it can't build your program, and open up a file it generates, .vscode\tasks.json, in the editor. It'll be configured to use msbuild, but won't work unless MSBuild is in your path. With a trivial edit to correct the path, you'll be building straight away:


{
    // See the VS Code documentation for the tasks.json format
    "version": "0.1.0",
    "command": "C:\\Program Files (x86)\\MSBuild\\14.0\\Bin\\msbuild.exe",
    "args": [
    ],
    "taskSelector": "/t:",
    "showOutput": "silent",
    "tasks": [
        {
            "taskName": "build",
            "showOutput": "silent",
            "problemMatcher": "$msCompile"
        }
    ]
}

CTRL+SHIFT+B will now build your code by invoking MSBuild.

Get Coding! - I wrote a book.

6/28/2016 1:58:46 PM

Through late 2015 and the start of 2016 I was working on a “secret project” that I could only allude to in conjunction with Walker Books UK and Young Rewired State – to write a book to get kids coding. As a hirer, I’ve seen first hand the difficulty in getting a diverse range of people through the door into technology roles, and I thoroughly believe that the best way we solve the problem is from the ground up - changing the way we teach computer science.

We live in a world where the resources to teach programming are widely available if there’s an appetite for it, and the barrier to entry is lower than ever, so in some ways, this is my contribution to the movement.


Get Coding is a beautifully illustrated (courtesy of Duncan Beedie) book that takes a “broad not deep” approach to teaching HTML5, JavaScript and CSS to kids from ages 8 and up. It was edited by Daisy Jellicoe at Walker Books UK, and without her attention to detail and enthusiasm it wouldn’t have come out half as well as it did. It’s quite long, at 209 pages, and comes with a story that children can follow along to.

Learn how to write code and then build your own website, app and game using HTML, CSS and JavaScript in this essential guide to coding for kids from expert organization Young Rewired State. Over 6 fun missions learn the basic concepts of coding or computer programming and help Professor Bairstone and Dr Day keep the Monk Diamond safe from dangerous jewel thieves. In bite-size chunks learn important real-life coding skills and become a technology star of the future. Young Rewired State is a global community that aims to get kids coding and turn them into the technology stars of the future.

The book is available on Amazon and in major high street bookstores.

I’ve been thrilled by the response from friends, family and the community, and it made the sometimes quite stressful endeavour thoroughly worthwhile. After launch, Get Coding! has ended up number 1 in a handful of Amazon categories, was in the top 2000 books on Amazon for about a week, and has received a series of wonderfully positive reviews, not to mention being recently featured in The Guardian’s Best New Children's Books Guide for Summer 2016.

I’ll leave you with a few photos I’ve been sent or collected and hope that perhaps you’ll buy the book.





Why code reviews are important

4/7/2016 2:40:41 PM

Making sure that more than one set of eyes has seen all the code that we produce is an important part of software development - it makes sure that we catch bugs, keep our code readable, and share patterns and practices across the teams.

Code review should answer the questions

  • Are there any logical errors?
  • Are the requirements implemented?
  • Are all the acceptance criteria of the user story met?
  • Do our unit tests and automation tests around this feature pass? Are we missing any?
  • Does the code match our house style?

The interesting thing about the effectiveness of code review is that it isn't just hearsay; it was measured and reported in the seminal book "Code Complete":

.. software testing alone has limited effectiveness -- the average defect detection rate is only 25 percent for unit testing, 35 percent for function testing, and 45 percent for integration testing. In contrast, the average effectiveness of design and code inspections are 55 and 60 percent. Case studies of review results have been impressive:

  • In a software-maintenance organization, 55 percent of one-line maintenance changes were in error before code reviews were introduced. After reviews were introduced, only 2 percent of the changes were in error. When all changes were considered, 95 percent were correct the first time after reviews were introduced. Before reviews were introduced, under 20 percent were correct the first time.
  • In a group of 11 programs developed by the same group of people, the first 5 were developed without reviews. The remaining 6 were developed with reviews. After all the programs were released to production, the first 5 had an average of 4.5 errors per 100 lines of code. The 6 that had been inspected had an average of only 0.82 errors per 100. Reviews cut the errors by over 80 percent.
  • The Aetna Insurance Company found 82 percent of the errors in a program by using inspections and was able to decrease its development resources by 20 percent.
  • IBM's 500,000 line Orbit project used 11 levels of inspections. It was delivered early and had only about 1 percent of the errors that would normally be expected.
  • A study of an organization at AT&T with more than 200 people reported a 14 percent increase in productivity and a 90 percent decrease in defects after the organization introduced reviews.
  • Jet Propulsion Laboratories estimates that it saves about $25,000 per inspection by finding and fixing defects at an early stage.

We do code reviews because they help us make our code better, and measurably save a lot of money - the cost of fixing software issues only multiplies once they're in production.

Code review check-list

Not sure how to code review? Here's a check-list to get you started, derived from many excellent existing check-lists.


  • Does the code work?
  • Does it perform its intended function?
  • Is all the code easily understood?
  • Does it conform to house style, standard language idioms?
  • Is there any duplicate code?
  • Is the code as modular as possible?
  • Can any global variables be replaced?
  • Is there any commented out code?
  • Can any of the code be replaced with library functions?
  • Can any logging or debugging code be removed?
  • Has the "Boy scout rule" been followed? Is the code now better than before the change?


  • Are all data inputs checked (for the correct type, length, format, and range) and encoded?
  • Where third-party utilities are used, are returning errors being caught?
  • Are output values checked and encoded?
  • Are invalid parameter values handled?


  • Is any unusual behaviour or edge-case handling described?
  • Is there any redundant auto-documentation that can be removed?


  • Is the code unit tested?
  • Do the tests actually test that the code is performing the intended functionality?
  • Could duplication in test code be reduced with builder / setup methods or libraries?

Practical Tips

Don't try to be a human compiler

The first and most important thing to remember when you're doing a code review is that you're not meant to be a human compiler. Ensure you're reviewing the functionality, tests and readability of the code rather than painstakingly inspecting syntax. Syntax and house style are important, but style issues are a much better fit for automated tooling than humans. Don't waste time bickering over style and formatting.

Reviewers are born equal

Our teams are built around mutual trust and respect, and a natural extension of that is that anybody can code review. You may on occasion find yourself working on some code that is someone else's area of particular interest or expertise - but it's better to solicit their advice while you're working than to wait for a review. Code reviews aren't limited to your technical lead, and likewise, a lead should have their code reviewed all the same.

It's just code, you're not marrying it

Don't be precious about your code as a submitter, and as a reviewer be honest and open. It's just code. A code review is an opportunity to make it the best it can be, using the expertise of your colleagues.

That's fine!

Sometimes the code is just fine - don't be the person that nitpicks for change without any quantifiable benefit.


Code reviews should be performed on change-sets via either a branch comparison URL, or ideally, a pull request.

Stash has replaced our legacy tool (FishEye) for code reviews using Git, and as projects migrate from SVN they are expected to migrate to using branch comparison or pull requests in Stash.

Pair programming and code reviews

As one of the founding practices of eXtreme Programming (XP), pair-programming can be seen as "extreme code review":

"code reviews are considered a beneficial practice; taken to the extreme, code can be reviewed continuously, i.e. the practice of pair programming."

Pair programming exhibits exactly the same qualities as a great code review, with the additional benefit of immediacy - it's impossible to ignore a pair critiquing a design in real time, poor choices barely survive, and work doesn't end up blocked in a code review queue somewhere.

Pairing is usually preferable to code review for regular work, and if code is produced as part of a pair it can be considered "code reviewed by default". Unfortunately, for reasons of availability, location or specialisation, you may need to rely on a more traditional code review for some of the code you write.

Realistic expectations

Like everything, code review is not a silver bullet - and while reviews have proven to protect us against a large quantity of visibly obvious bugs, they aren't great at spotting performance bugs or subtle threading and concurrency issues.

You'll need to rely on existing instrumentation and profilers for these types of metrics - finding these kinds of bugs by "mentally executing" the code is unlikely, if not impossible. Just be careful not to believe your code is free of error simply because it's been through a code review or pairing session. Ironically, the kinds of bugs that'll slip through a code review or pairing session will by definition be these tricky and hard to detect edge cases.

Projections in Agile Software Development

11/28/2015 4:44:24 PM

Preparing a timeline for the development of software is a difficult problem – teams have proven time and time again that they’re tremendously bad at estimating how long it takes to develop software at any kind of scale.

Unfortunately, there’s a conflict between the need for financial planning and budgeting, and the unpredictable nature of software delivery – to the extent that agile software advocates frequently dismiss the accuracy of, and need for, timelines and schedules.

This presents a difficult problem – organisations need to plan and budget, but their best technical people tell them that accurate planning is impossible.

We can make projections to try and close this gap.

Projections are common in financial planning and they serve the same purpose in software – they’re a forecast.


Planning out and working on your company's financial projections each year could be one of the most important things you do for your business. The results--the formal projections--are often less important than the process itself. If nothing else, strategic planning allows you to "come up for air" from the daily problems of running the company, take stock of where your company is, and establish a clear course to follow.


Variances from projections provide early warning of problems. And when variances occur, the plan can provide a framework for determining the financial impact and the effects of various corrective actions.

Projections are formal educated guesses – they don’t claim accuracy, but they intend to give a reasonable idea of progress.

With a well thought out projection, your team can continuously evaluate how their current progress lines up with their projection. As and when the team’s real-world progress and the projection start to diverge, they can realistically measure the difference between the two and adjust the projection accordingly.

Used in this way, projections are a useful tool for communicating with business stakeholders – suggesting a “best guess at when we might be doing this work” and a means of communicating changes in timelines divorced and abstracted away from the “coal face” of user stories, velocity, estimation and planning. Teams use their projections and actual progress to explain and adjust for changes in complexity, scope and delivery time.

If estimation and planning in software is hard, then projection is much harder – it’s “The Lord of the Rings” – a sprawling epic of fiction.

On a small scale, planning is predicated on the understanding that “similar things, with a similar team, take a similar amount of time” – and this generally works quite well.

People understand this quite intuitively – it’s the reason your plumber can tell you roughly how long it’ll take and how much it’ll cost to fit a bathroom. At a larger scale, however, planning starts to fall apart. The more complex a job, the more nuance is involved and the more room there is to lose details and accuracy.

The challenge in putting together a projection is to bridge the gap between the small scale accuracy of planning features, and the long term strategic objectives of a product.

The Simplest Possible Process

In order to put together a projection and make sure that it’s realistic, you need a base-line – some proven and known information to use as a basis of the projection.

You’ll need a few things

  • A backlog of “epics” – or large chunks of work
  • Some user stories of work you’ve already completed
  • A record of the actual time taken to finish that work
  • A big wall
  • Some sticky notes

With these things we’re going to

  • Establish a baseline from the work we’ve already completed
  • Place our completed work on the wall using sticky notes
  • Estimate our new work in relation to our completed work
  • Assign our new work a rough time-box

A note on scales of estimation

Agile teams generally use abstract estimation scales to plan work – frequently using the modified Fibonacci sequence or T-shirt sizes. It’s recommended to use a different scale than you would normally use for story estimation for your projections, to prevent people accidentally equating the estimates of stories in a sprint, to estimates on epics in a roadmap.

I find it most useful to use Fibonacci for scoring user stories, and T-shirt sizes for projections.

The trick to a successful projection is to find a completed epic or big chunk of work to “hang your hat on” – you can then make a reasonable estimate on new work by asking the question

“Is this new piece of work larger or smaller than the thing we already did?”

This relative estimation technique is called stack ranking – it’s simplistic, but very easy for an entire team to understand.

Take this first item, write it on a sticky note and position it somewhere in the middle of the wall, making a note of the time it took to complete the work as you stick it up.

You should repeat this exercise with a handful of pieces of work that you’ve already completed – giving you a baseline of the kinds of tasks your team performs, and their relative complexity to each other based on their ordering.

As you stick the notes on the wall, if something is significantly more difficult than something else, leave a bit of vertical space between the two notes as you place them on the wall.


Now you’ve established a baseline around work that you’ve already completed, you can start talking, in broad terms, about the large stories or epics you want to project and schedule.

Work through your backlog of epics, splitting them down if you can into smaller chunks, and arranging them around the stack ranked work on your wall. If something is about as hard as something else, place it at the same level as the thing it’s as hard as.


Once you’ve placed all of your backlog on the wall, arrange a time scale down the right hand side of the wall, based on the work you’ve already completed, and a scale from “XXXS” to “XXXL” on the left hand side.


The whole team can then adjust their estimates based on the time windows, and the relative sizes assembled on the wall. Once the team is happy with their relative estimates, they should assign a T-shirt size to each sticky note of work – which in turn, relates to a rough estimate of time.

You can now remove the sticky notes from the wall, and arrange them horizontally into a timeline using the “medium size” sticky note as a guide.


These timelines are based around a known set of completed work, but nonetheless are just projections. They’re not delivery dates, they’re not release dates – they’re best guess estimates of the time window in which work will be completed.

Adjusting your projections

Projections are only successful when continually re-adjusted – they’re the early warning, a canary in the coal-mine, a way to know when you’re diverging from your expectations. Treating a projection like a schedule will cause nothing but unmet expectations and disappointment.

As you complete actual pieces of work from your projection, you should verify its original projection against its actual time to completion – making note of the percentage of over or under estimation on that particular piece.

Once you complete a piece, you should adjust your projection based on actual results – checking that your assertion that the “medium sized thing takes three months” holds true.
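The adjustment described above is simple arithmetic. As a rough sketch (the `ProjectionMath` class and the figures are made up for illustration, and scaling by a single observed ratio is only one possible way to adjust), the variance on a completed piece can be turned into a correction for the work that remains:

```csharp
using System;

// Hypothetical helper: names and figures are illustrative, not from a real project.
public static class ProjectionMath
{
    // Percentage over (+) or under (-) the original projection.
    public static double VariancePercent(double projectedWeeks, double actualWeeks)
    {
        return (actualWeeks - projectedWeeks) / projectedWeeks * 100.0;
    }

    // Scale a remaining projection by the ratio observed on completed work.
    public static double Adjust(double remainingProjectedWeeks, double projectedWeeks, double actualWeeks)
    {
        return remainingProjectedWeeks * (actualWeeks / projectedWeeks);
    }
}
```

If a "medium" piece was projected at 12 weeks and actually took 15, `VariancePercent(12, 15)` gives 25, and `Adjust(12, 12, 15)` suggests the next medium piece is closer to 15 weeks than 12.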

Once armed with real information, you can have honest conversations about dates based in reality, not fiction, helping your product owners and stakeholders understand if they’re going to be delivering earlier or later than they expected, before it takes them by surprise.

Using C# 6 Language Features In Your Software Tomorrow

7/20/2015 8:42:41 PM

Visual Studio 2015 dropped today, along with C# 6, and a whole host of new language features. While many new features of .NET are tied to framework and library upgrades, language features are specifically tied to compiler upgrades, rather than runtime upgrades.

What this means is that, so long as the build environment for your application supports C# 6, it can produce binaries that make use of the new language features and still run on earlier versions of the .NET Framework. The good news for most developers is that you can use these language features in your existing apps today – you don’t have to wait to roll out framework upgrades across your production servers.

You’ll need to install the latest version of Visual Studio 2015 on your dev machines, and you’ll need to make sure your build server has the latest Build Tools 2015 pack installed to be able to compile code that uses C# 6.
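As an illustration, the hypothetical class below uses a handful of C# 6 features: an auto-property initializer, expression-bodied members, string interpolation, the null-conditional operator and `nameof`. All of these are handled by the compiler, so this is a sketch of the kind of code that should build with the C# 6 compiler while still targeting an older framework version.

```csharp
using System;

// Hypothetical example type; none of these names come from a real library.
public class Greeter
{
    // Getter-only auto-property with an initializer (C# 6).
    public string Greeting { get; } = "Hello";

    // Expression-bodied member using string interpolation (C# 6).
    public string Greet(string name) => $"{Greeting}, {name}!";

    // Null-conditional operator (C# 6): yields null instead of throwing.
    public int? NameLength(string name) => name?.Length;

    // nameof (C# 6) keeps the parameter name in sync when refactoring.
    public void Require(object value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
    }
}
```

Calling `new Greeter().Greet("Ada")` returns "Hello, Ada!", and `NameLength(null)` returns null rather than throwing.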

You can grab the Build Tools 2015 pack from Microsoft here:

I’ve verified that C# 6 apps compiled under VS2015 run, at the very least, on machines that run .NET 4.5.
If you’re doing automated deployment with Kudu to Azure, you may need to wait until they roll out the tools pack to the Web Apps infrastructure.

Have fun!

Passenger - A C# Selenium Page Object Library (Video)

4/27/2015 1:37:20 PM

A lap around Passenger - a page object library I’ve been working on for Selenium.

In this video, we’ll discuss the general concept of page objects for browser automation testing - then we'll work into a live demo comparing and contrasting traditional C# Selenium tests, and tests using Passenger.


Passenger on GitHub -
on NuGet -

Page objects for browser automation - 101

4/1/2015 4:56:11 PM

The page object model is a pattern used by UI automation testers to help keep their test code clean. The idea behind it is very simple - rather than directly using your test driver (Selenium / WatiN etc.) you should encapsulate the calls to its API in objects that describe the pages you're testing.

Page objects gained popularity with people that felt the pain of maintaining a large suite of browser automation tests, and people that practice acceptance test driven development (ATDD) who write their browser automation tests before any of the code to satisfy them - markup included - exists at all.

The common problem in both of these scenarios is that the markup of the pages changes, and the developer has to perform mass find/replaces to deal with the resulting changes. This problem only gets worse over time, with subsequent tests adding to the burden by re-using element selectors throughout the test codebase. In addition to the practical concern of modification, test automation code is naturally verbose and often hides the meaning of the interactions it's driving.

Page objects offer some relief - by capturing selectors and browser automation calls behind methods on a page object, the behaviour is encapsulated and need only be changed in one place.

Some page objects are anaemic, providing only the selector glue that the orchestrating driver code needs, while others are smarter, encapsulating selection, operations on the page, and navigation.

The simplest page object is easy to home roll, being little more than a POCO / POJO that contains a bunch of strings representing selectors, and methods that the current instance of the automation driver gets passed to. Increasingly sophisticated page objects capture behaviour and will end up looking like small domain specific languages (DSLs) for the page being tested.

The simplest page object could be no bigger than

public class Homepage
{
    public string Uri { get { return "/"; } }
    public string Home { get { return "#home-link"; } }
}

This kind of page object can be used to help clean up automation code where the navigation and selection code is external to the page object itself. It's a trivial sample to understand, and effectively serves as a kind of hardcoded configuration.

A more sophisticated page object might look like this:

public class Homepage
{
    public string Uri { get { return "/"; } }
    public string Home { get { return "#home-link"; } }

    public void VisitHomeLink(RemoteWebDriver driver)
    {
        // Use the driver to find the Home selector and click it
    }
}
This richer model captures the behaviour of the tests by internalising the navigation glue code that would be repeated across multiple tests.

The final piece of the puzzle is the concept of Page Components - reusable chunks of page object that represent a site-wide construct - like a global navigation bar. Page components are functionally identical to page objects, with the single exception being that they represent a portion of the page so won't have a Uri of their own.
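A minimal sketch of a page component shared between two page objects follows. Everything here is hypothetical: `IBrowser` stands in for a real driver such as Selenium's RemoteWebDriver so the example runs without a browser, and `NavigationBar`, `ContactPage` and `RecordingBrowser` are invented for illustration.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a real driver such as Selenium's RemoteWebDriver.
public interface IBrowser
{
    void GoTo(string uri);
    void Click(string cssSelector);
}

// A page component: selectors and behaviour, but no Uri of its own.
public class NavigationBar
{
    public string Home { get { return "#home-link"; } }
    public void GoHome(IBrowser browser) { browser.Click(Home); }
}

// Page objects expose the shared component rather than repeating its selectors.
public class Homepage
{
    public string Uri { get { return "/"; } }
    public NavigationBar Navigation { get; } = new NavigationBar();
    public void Open(IBrowser browser) { browser.GoTo(Uri); }
}

public class ContactPage
{
    public string Uri { get { return "/contact"; } }
    public NavigationBar Navigation { get; } = new NavigationBar();
    public void Open(IBrowser browser) { browser.GoTo(Uri); }
}

// Test double that records calls, so the sketch runs without a real browser.
public class RecordingBrowser : IBrowser
{
    public List<string> Calls { get; } = new List<string>();
    public void GoTo(string uri) { Calls.Add("goto " + uri); }
    public void Click(string cssSelector) { Calls.Add("click " + cssSelector); }
}
```

A test can then open the contact page and call `page.Navigation.GoHome(browser)`. If the home link's selector ever changes, only `NavigationBar` needs to be edited.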

By using the idea of page objects, you can codify the application centric portions of your automation suite (selectors, interactions), away from the repetitive test orchestration. You'll end up with cleaner and less repetitive tests, and tests that are more resilient to frequent modification.

Page objects alone aren't perfect - you still end up writing all the boilerplate driver code somewhere, but they're a step in the right direction for your automation suite.

We need to talk about configuration.

3/5/2015 12:41:31 AM

The vast majority of software that you build and use needs to be configured – feature toggles, file and directory paths, startup options. We build configurable software every single day – especially if you build line of business apps.

When we do this we somehow forget that all of the worst software experiences you have suffer from horrible installation and configuration processes.

Ever had to set up a SharePoint cluster?
Muddle through SQL Server replication configuration?
Nobody ever had a good time configuring an application.

Conversely, all the software that you love is defined by its slick user experience. Great software doesn’t require configuration, it just works.

Every time I watch a developer add “just another configuration value” or “just add something extra into the installation script”, I feel a pang of sadness, because I know all that’s really happening is they’re setting themselves a trap for later.

Configuration code takes time to build – you have to build parsers, or pick meaningful names for configuration settings.

You have to plumb those settings into your code somewhere.

For all this work? You’re rewarded with code that people can misconfigure.

A new point of failure in your application.

“But that’s ok!”

I hear you cry.

“I’ll write some documentation!”

I’ve got some bad news for you. That documentation gets out of date, or the person who uses your code doesn’t even know it exists.

But let’s pretend for a second that this is all ok.

Let’s pretend we correctly configure our software for the environment it’s deployed into, with correct database connection strings, paths to dependent services, and weird internal settings for physical paths on the local machine. You start the application and it crashes.

So you trawl through your error logs, and find an exception log that says something like

“Application failed to start due to DirectoryNotFoundException – couldn’t open c:\program files (x86)\YourApp\UserData”.

You take a look, and sure enough, the directory doesn’t exist – so you create it and try again and the app starts.


Friends don’t let friends configure software

If you bought some software and endured that kind of user experience during installation and configuration, you’d probably give up, yet we don’t think twice about exposing our teams in dev and ops to this kind of complexity every single day.

To make things worse, in my experience the vast majority of the configuration and setup that applications go through isn’t even required – it’s just in place to cover the cracks where we could’ve done better as developers.

Every time you’re tempted to add a new configuration setting, ask yourself carefully

“Is there any way my application can infer this configuration setting from its environment?”

And every time you’re tempted to add a custom installation step or an extra line to your setup guide, think

“Can my application verify its environment and do this task at startup?”

Convention and inference are the antidotes to configuration and complexity in setup, and I want to spend a bit of time explaining how you can infer and discover the vast majority of behaviours and settings you think that you need to configure.
We should aim for our software to be safe and configured by default.


Why do we configure?

To understand why we implement configuration, we need to look at the kind of things that get configured.

We configure software…

  • To adjust the behaviour of the application at install time
  • To adapt the application to multiple environments
  • To configure aspects of our application’s functionality

Of the things we configure, you can split the types of configuration into a couple of categories

  • Configuration pointing to environment-specific instances of dependent services
  • Configuration that toggles application behaviour
  • Configuration that deals with the internal behaviour of the software being installed


Environment specific configuration is frequently unavoidable, often repetitive and badly factored.

Feature toggles and internal configuration are much more contentious, frequently defining functionality that shouldn’t be externally configured.

If you’re building software and you need to deploy it to multiple target environments, you’re probably going to end up with some kind of application specific configuration, defining environmental settings, database connection strings, and URIs that you depend on. There’s a certain amount of necessity to environmental connection strings, but we can make it as simple as possible.

Unfortunately, software that requires a lot of configuration and setup is by definition more difficult to deploy. One of our key goals in building reliable, deployable software should be supporting simple automation. Simplicity in usage and installation is mandatory, not optional.

Let’s talk about how we can make our software configuration simpler.


Always verify your environment

Your application should attempt to verify the existence of absolutely everything it depends on, creating things as it needs to.

This means that if your application requires certain data directories that don’t exist, or has expectations of other networked resources, it should detect, verify and create them at application startup.

Failure to detect a dependency it cannot create should be a fatal error, while the lack of a dependency it can create should result in that dependency being correctly created.

This includes

  • Creating queues
  • Registering topics on event busses
  • Creating directories
  • Setting permissions
  • Making exploratory calls to web-services

Your application should refuse to start up in an invalid state, and ideally, should be able to create its entire execution environment from sensible defaults.

Keeping these checks in the codebase of the application, and executing them at startup, dramatically reduces the time to resolution of any issues. It’s simple to implement; there is no excuse.
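As a minimal sketch of such a startup check, assuming a data directory the application is able to create and a file it is not (the `EnvironmentVerifier` class and its method names are hypothetical):

```csharp
using System;
using System.IO;

// Hypothetical startup check; the class and method names are illustrative.
public static class EnvironmentVerifier
{
    // A dependency we can create: make it if it's missing, fail loudly if we can't.
    public static void EnsureDirectory(string path)
    {
        try
        {
            // CreateDirectory does nothing if the directory already exists.
            Directory.CreateDirectory(path);
        }
        catch (Exception ex)
        {
            throw new InvalidOperationException(
                "Cannot start: required directory could not be created: " + path, ex);
        }
    }

    // A dependency we cannot create: its absence is a fatal error.
    public static void RequireFile(string path)
    {
        if (!File.Exists(path))
        {
            throw new InvalidOperationException(
                "Cannot start: required file is missing: " + path);
        }
    }
}
```

Calling these at the very top of the application's entry point means a broken environment fails immediately with a clear message, rather than surfacing later as a DirectoryNotFoundException buried in a log. Queues, topics and permissions could follow the same create-or-fail shape.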


Regularity helps

Unpredictable environments have a cost on the complexity of application configuration.

Do your best to ensure that all the target environments you deploy to follow regularly patterned names, and verify these strong conventions.

Use names like


This regularity in your environment makes mistakes trivial to spot, and reduces the need for verbose and error prone configuration.

Irregularly formed environments require a huge amount of configuration to automate, and that configuration will be brittle and get broken.


Discovery heuristics beat configuration

Anything that a human has to configure, a computer can probably do better. Consider having your software detect the environment it exists in, and configure itself appropriately.

You can lean on environment variables, or even detect which services exist in the environment at startup, either by connecting to them or by using a service registry of some kind.

This is an obvious technique to use if you have configuration for failover services – at runtime, your application can attempt to connect to both the primary and secondary service, and select which one to configure itself against. I’ve had good experiences using this technique to load balance across payment gateways that had a tendency to fail.
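
A minimal sketch of that failover selection, again in Python. The names `is_reachable` and `select_service` are hypothetical, and a plain TCP connect stands in for whatever "is this service alive?" check makes sense for you (a real payment gateway would more likely want an authenticated health-check call).

```python
import socket


def is_reachable(host, port, timeout=1.0):
    # Simplest possible probe: can we open a TCP connection?
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def select_service(candidates, probe=is_reachable):
    """Return the first (host, port) that answers, in priority order."""
    for host, port in candidates:
        if probe(host, port):
            return host, port
    raise RuntimeError("No candidate service is reachable")
```

You'd pass the primary first and the secondary second, and configure the application against whatever comes back. The `probe` parameter is only there so the selection logic can be tested without a network.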


If you must configure, keep it DRY and remove repetition

One of the absolute worst things I see regularly is huge configuration files, and templated transformations, that repeatedly change a single portion of a templated string over and over.

We’ll end up with something like this


<add key="Server1" value="" />

<add key="Server2" value="" />

<add key="Server50" value="" />


and a transform file that looks like this…


<add key="Server1" value="" />

<add key="Server2" value="" />

<add key="Server50" value="" />


When a much simpler approach would’ve been to configure the application with a string template


<add key="env" value="environment1" />

<add key="Server1" value="http://someservice1.{env}.com" />

<add key="Server2" value="http://someservice2.{env}.com" />

<add key="Server50" value="http://someservice50.{env}.com" />


and allow the application to splice the two strings together. This means that your environmental configuration files replace a single line of configuration, not 50. We can take this further and support overriding settings from environment variables, all because the application now controls how it’s configured and doesn’t blindly read a text file and just go with it.

I’ve seen this remove literally hundreds of lines of configuration peppered through lengthy transformation files.

Never repeat yourself in a configuration file.
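
The splicing itself is tiny. Here's a sketch in Python, assuming the settings have already been read into a dictionary of strings; `expand_settings` is a made-up name, and the `{env}` token and keys simply mirror the example above.

```python
def expand_settings(settings, placeholder_key="env"):
    """Replace the {env} token in every value with the configured environment."""
    token = "{" + placeholder_key + "}"
    env = settings[placeholder_key]
    # str.replace leaves any other braces in the values untouched.
    return {key: value.replace(token, env) for key, value in settings.items()}


settings = {
    "env": "environment1",
    "Server1": "http://someservice1.{env}.com",
    "Server2": "http://someservice2.{env}.com",
}
expanded = expand_settings(settings)
print(expanded["Server1"])  # http://someservice1.environment1.com
```

Switching environments is now a change to the single `env` value, which is the whole point.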


Allow important configuration to be overridden

Devise a general way to override any of your inferred or conventional configuration settings.

There will always be the odd environment or special circumstance that requires configuration, so you should make a point of it being the exception rather than the rule.
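
One general override mechanism is to let an environment variable win over whatever the application inferred. A sketch, where the `APP_` prefix and the `resolve_setting` name are purely illustrative assumptions:

```python
import os


def resolve_setting(name, conventional_value, environ=os.environ, prefix="APP_"):
    """Return an explicit override if one is set, else the conventional value."""
    # e.g. the setting "Server1" is overridden by the variable APP_SERVER1.
    return environ.get(prefix + name.upper(), conventional_value)
```

With nothing set, `resolve_setting("Server1", "http://someservice1.environment1.com")` just returns the conventional value; the one odd environment sets `APP_SERVER1` and everything else stays configuration-free.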


Configuration Is Hell

We’ve talked about how regulating deployment environments allows you to configure your software by convention, how predictability makes this easy and divergence makes it hard.

We’ve discussed how allowing your application to verify and create its local dependencies relieves the administrator from doing these mundane tasks.

We’ve also considered how detection heuristics can help you remove any notion of environmental configuration from your code, instead focusing on the services the app can discover when it starts up.

And finally, we’ve looked at how trivial configuration templating can remove much of the friction and duplication from the small amounts of required configuration.

It’s really easy to “just add more configuration” in situations where discovery heuristics and self-configuring applications are the correct answer, but configuring software is a user experience problem. 

You’d not accept it from boxed software you bought, so you should do your best to protect your teams from configuration, and the accidental bugs that occur when they get it wrong.

Cutting CODE! – A livestream show for programmers

2/1/2015 10:37:15 PM

I’ve spent some time recently thinking about and discussing the idea of live-streaming coding sessions. It started with conversations with my brother about how there’s not really a Twitch TV for programming, but if there was I’d be really into that.

In a classic case of “The Simpsons Already Did It”, a week after I floated the original idea of starting a pair-programming streamed show with Rob Cooper, Scott Hanselman posted “Reality TV for Developers – Where is for Programmers?”. At about the same time a new sub-reddit grew out of /r/programming called /r/WatchPeopleCode/ – the timing seemed a little too good to be true, so last Sunday I did a stealth trial run and mutely live-coded two hours of hacking on a random library I’ve been spiking. It was fairly dull stuff, but about 80 people came and went over the duration of two hours.

That’s enough of an audience for now. I love writing software, and I love pairing, so earlier today I got together with Chris Bird and we streamed our first “live-on-air code kata”. It clocks in at about two hours, and was fun to put together. Time allowing, I’m going to aim to put one or two of these together a week, ideally sticking to a ping-pong-pairing with conversation format.

Here’s the YouTube recording of the pilot “Cutting CODE!” stream, where we build an image to ASCII art converter in two hours, entirely driven by tests.