Leaky ORMs

The other day I posted a link to this article on our company Gmail, as we like to do here, to spark a Friday afternoon debate. One of my fellow devs, the most excellent @knocte, came up with this reply, so I thought I’d post it here:

“That’s a good article.

ORM pattern is always going to be leaky* because it’s just a workaround to a bigger problem: impedance mismatch between the relational model of data and the object-oriented model of our languages.

Until we have a language that has the relational model built in as a first-class citizen**, or until we use an object-oriented database (IMO better than the document-oriented ones), we’ll have to live with this abstraction (as in, I prefer to have a leaky abstraction than not having any abstraction).

In the end having a leaky abstraction is not so bad anyways: you need to understand the inner details of what is being abstracted, not treat it just like a black box. With that you get a lot of side benefits: you understand the bigger picture, you will know at what level of abstraction you will need to apply a fix for a certain problem, etc.

I would apply this quote from Joel not only to code generators but to every abstraction:

Code generation tools which pretend to abstract out something, like all abstractions, leak, and the only way to deal with the leaks competently is to learn about how the abstractions work and what they are abstracting. So the abstractions save us time working, but they don’t save us time learning. … Ten years ago, we might have imagined that new programming paradigms would have made programming easier by now. Indeed, the abstractions we’ve created over the years do allow us to deal with new orders of complexity in software development that we didn’t have to deal with ten or fifteen years ago, like GUI programming and network programming. And while these great tools, like modern OO forms-based languages, let us get a lot of work done incredibly quickly, suddenly one day we need to figure out a problem where the abstraction leaked, and it takes 2 weeks.

So whenever you want to avoid an ORM because it’s a leaky abstraction, think first whether you want to be 10 or 15 years behind.

* Let’s say “moderately leaky” to put it in Joel’s terms, as for him all non-trivial abstractions are leaky to some degree.

** No, LINQ is not that.”

Great stuff!!

See also:
http://martinfowler.com/bliki/OrmHate.html

 

Posted in Software Development

Comparing Solr Response Sizes

After seeing some relative success with our Solr implementation’s xml response times by switching on Tomcat’s http gzip compression, I’ve been doing some comparisons between the other formats Solr can return.

We use SolrNet, an excellent open-source .NET Solr client. At the moment it only supports xml responses, but every request sends the “Accept-Encoding: gzip” header as standard, so all you have to do is switch compression on at your server and you’ve got some nicely compressed responses. There is talk of supporting javabin deserialisation, but it’s not there yet.
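If you want to see what that negotiation looks like outside SolrNet, here’s a minimal sketch using plain System.Net (the Solr URL and core are assumptions):

using System;
using System.IO;
using System.Net;

class GzipSolrRequest
{
    static void Main()
    {
        // Hypothetical local Solr instance; adjust the URL to suit.
        var request = (HttpWebRequest)WebRequest.Create(
            "http://localhost:8983/solr/select?q=*:*&rows=10&wt=json");

        // Setting this sends "Accept-Encoding: gzip" and transparently
        // inflates the compressed response body for you.
        request.AutomaticDecompression = DecompressionMethods.GZip;

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            Console.WriteLine("Content-Encoding: " + response.ContentEncoding);
            Console.WriteLine(reader.ReadToEnd().Length + " chars after decompression");
        }
    }
}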

I decided to compare the following using curl, requesting 1,000 rows and 10,000 rows in json, javabin, json/gzip-compressed and javabin/gzip-compressed. My test setup is a Solr 1.4 instance with around 11,000 records in it, sitting behind an nginx reverse proxy handling the gzip compression. As I said, this could just as easily be achieved by switching on gzip compression in Apache Tomcat.

This picture shows the results.

As you can see, for the same 10,000 records returned using q=*:*, the wt=json response with http gzip compression is the smallest, but only marginally smaller than wt=javabin. It would seem that json compresses very well indeed. You can also see the massive drop that simply switching on gzip compression gives to xml.

My conclusion would be that because json is a widely accepted content type, with many well-known and fast deserialisation libraries, it would probably be worth implementing that rather than trying to deserialise javabin. But this was only a quick test, and it doesn’t take into account how quickly Solr handles serialisation of the documents server-side.

 

 

Posted in Solr

Delivering Search

Recently, I was asked to head up a two-man team to deliver our Solr-based search infrastructure, moving it off our existing creaking database.

It was a great project that was really enjoyable to work on and was delivered quickly with a large amount of success. So I just wanted to say a few words about why.

Focused Kanban Board

The first job was to create our own Kanban board, with a work-in-progress (WIP) limit of 1. This allowed us to pair on one job at a time, helping maintain focus and making sure knowledge of a particular piece of work was adequately shared.

The second job was to take any existing search jobs from the backlog and analyse them to see if we could split them in any way. We had a concept of Yellow cards for large tasks that had not been “Split” yet, and then blue cards for each sub-task which we would then work on within our limit. Blue cards were strictly only allowed to be created once a yellow card was moved into our ‘In Development’ column, unless there was a very good reason. Again, this was to help maintain focus.

The result of this was that we were able to get work out very quickly indeed, as we could clearly focus on one job at a time; if any “blockers” occurred, we would stop until we got the problem resolved, or re-prioritise our other blue cards. Another plus was that it was easy for other members of staff to see exactly what we were working on, as our board was not littered with noise.

Finally, we kept our usual pink cards for production bugs, which would always have to jump the queue. Luckily we were able to keep these to a minimum.

Clearly defined roadmap

Another great help to us was the fact that we had a very clearly defined brief; simply put, we needed to replace the current search system with a better one. Acceptance criteria were clearly laid out, easy to understand and easy to build initial tests from, leading to a very successful acceptance-test-led project. This meant we were able to deliver manageable slices of the new Search-Api, endpoint by endpoint, and get quick feedback in the real world.

This allowed us to focus on delivering the work from a technical point of view, without having to worry overly about usability issues. As long as we adhered to the acceptance criteria, we were OK.

Clean and test driven modular code from the off

As this was an entirely new project that existed as a loosely coupled ‘sub’ api called over HTTP from our existing api service, we were able to work in a largely greenfield way. We used OpenRasta to offer an immediate restful interface around the Solr-backed search, which allows you to work very quickly. It also encourages small, loosely coupled classes, which again helps with maintainability and speed of development.

Builds and deployments were handled by our usual TeamCity build system, with a stripped-down set of build projects and no release branch. We used the concept of pinning the latest stable build to deploy into our near-live environment; once signed off, we could deploy straight into live. This enabled us to release small changes often.

The whole project was properly test-driven from the off, with a focus on keeping the tests fast. Our acceptance tests used a dedicated Solr instance, injecting records at test-fixture setup time so we could run a set of tests against a known setup. Solr is lightning quick, and this method of setting up our test environment on the fly and tearing it down at the end allowed us to write extensive and robust tests whilst maintaining running speed. All of which gave us heightened release confidence.
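For a flavour of what those fixtures looked like, here’s a heavily condensed sketch assuming NUnit and SolrNet’s standard startup (the document class, field names and URL are all illustrative, not our real schema):

using Microsoft.Practices.ServiceLocation;
using NUnit.Framework;
using SolrNet;
using SolrNet.Attributes;

public class TestTrack
{
    [SolrUniqueKey("id")]
    public string Id { get; set; }

    [SolrField("title")]
    public string Title { get; set; }
}

[TestFixture]
public class SearchAcceptanceTests
{
    private ISolrOperations<TestTrack> _solr;

    [TestFixtureSetUp]
    public void InjectKnownRecords()
    {
        // Point at the dedicated test instance and inject a known record.
        Startup.Init<TestTrack>("http://localhost:8983/solr");
        _solr = ServiceLocator.Current.GetInstance<ISolrOperations<TestTrack>>();

        _solr.Add(new TestTrack { Id = "1", Title = "Known Track" });
        _solr.Commit();
    }

    [TestFixtureTearDown]
    public void TearDownIndex()
    {
        // Wipe the index so the next fixture starts from a clean slate.
        _solr.Delete(SolrQuery.All);
        _solr.Commit();
    }

    [Test]
    public void Can_find_injected_record()
    {
        var results = _solr.Query(new SolrQuery("title:\"Known Track\""));
        Assert.That(results.Count, Is.EqualTo(1));
    }
}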

Proof

Since we began switching functionality over to the new service, we have been able to deal easily with our recent upsurge in demand, with incredibly fast response times (<25ms) and much more relevant search results!

Next stop, our entire track catalogue gets the Solr treatment.

 

 

 


Posted in Software Development, Solr

Dependency Injection – Checking for null in the constructor

The other day, a colleague and I were having a quite heated debate about the need to null-check an interface passed into a class as a collaborator argument in the ctor (see Dependency Injection).

This is a technique we use extensively, with frameworks such as StructureMap and Castle Windsor, in order to make our classes small, loosely coupled and testable by allowing mocked implementations to be injected in unit/integration tests.

The problem is, as you are allowing an interface to be passed into the constructor and assigned to a member variable, there is a chance that within your class you will be attempting to access a member of a null reference, and therefore an evil NullReferenceException will be thrown! <<shudder>>

My colleague pointed out that we should always check for null, without exception (if you pardon the pun), and throw an ArgumentNullException instead, and spent the next 15 minutes bombarding me with internet links stating that it was standard practice. It was argued that a developer will benefit from a more useful message stating which parameter was null (fair point). Most of these articles did refer to null checks on parameters within a class method rather than a constructor, which I would actually agree with, but my beef was about having them in the constructor.
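For illustration, this is the pattern under debate — a guard clause in the ctor (the class and interface names here are made up):

using System;

public class SearchService
{
    private readonly ISearchRepository _repository;

    // Fail fast at construction time with the offending parameter named,
    // rather than letting a NullReferenceException surface later on.
    public SearchService(ISearchRepository repository)
    {
        if (repository == null)
            throw new ArgumentNullException("repository");

        _repository = repository;
    }
}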

Now my argument is that, within our code base, you DON’T need to check for null everywhere as you run the risk of the following:

  • Violating DRY
  • Sullying your nice clean code with boilerplate guard checks
  • Optimising too far down the stack

These kinds of checks should be made as close to the top of the stack as possible, if at all. Within our code base all of our classes are consumed internally; we don’t really ship compiled class libraries for use by third parties. There are occasions where compiled libraries are passed around between teams, and in this regard we can use the built-in class access modifiers (via the internal keyword) to prevent access to classes that would allow a null argument to be passed in the ctor.

If we really want to help out our fellow developers, we can insert our null checks with meaningful messages as close to the point of usage as possible, within a public class that we have correctly managed the access of, again at the top of the stack/point of entry for that developer.

Also, our DI frameworks generally offer the concept of a bootstrap “facility” (as it’s called in Castle) that allows a developer to perform the standard top-of-the-stack resolution of all the concrete types needed within the code base. This is the method used in third-party OSS libraries like SolrNet.
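In Windsor, for example, an installer gives you that single top-of-stack place to wire everything up, so a missing collaborator surfaces at registration/resolution time rather than deep inside a ctor. A minimal sketch (the component names are hypothetical):

using Castle.MicroKernel.Registration;
using Castle.MicroKernel.SubSystems.Configuration;
using Castle.Windsor;

public class SearchInstaller : IWindsorInstaller
{
    public void Install(IWindsorContainer container, IConfigurationStore store)
    {
        // All concrete types are registered here, once, at the point of entry.
        container.Register(
            Component.For<ISearchRepository>().ImplementedBy<SolrSearchRepository>(),
            Component.For<SearchService>());
    }
}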

Finally, this guard boilerplate is ugly; I would be moved to tears to see it in every single constructor of every single class I had created within a project, just to give a developer a helpful indicator as to which param was null. I know not everyone will agree, but my view is that I would rather use clean code and tests to help a future developer than assert for null.

I’d like to go out with a quote from this post:

“Checking for nulls usually goes against testability, and given a choice between well tested code and untested code with asserts, there is no debate for me which one I chose.”

Amen to that……

 

Posted in Dependency Injection, Software Development

Baby Sochanik

Just wanted to announce the “release” of our baby boy Finn! Deployed at 6:45am Sunday 22nd January 2012.

He’s an absolute smasher and has settled down to life outside the womb like a little trooper.

Have been advised by a colleague to update my dev library with the following books:

“Milk Driven Development”, “Clean Nappy” and “Sleeping Patterns”

Nice….

 

Posted in Uncategorized

OpenRasta and CastleWindsor Concurrency Issue

A couple of months ago we discovered an issue in the 2.0.3 version of the OpenRasta project.

Heisenbug

To cut a long story short we noticed a Heisenbug in our search endpoints, which use OpenRasta as a business layer between our Api and the Apache Solr search engine.

Every now and then, with no apparent pattern, we would see a series of errors being thrown from Castle.Windsor.
This was the error we saw:

System.IndexOutOfRangeException: Index was outside the bounds of the array.
     at System.Collections.Generic.List`1.Add(T item)
     at Castle.MicroKernel.Handlers.AbstractHandler
        .EnsureDependenciesCanBeSatisfied(IDependencyAwareActivator activator)

Checking the event logs on the live servers we noticed that this error always corresponded exactly with an application pool recycle. This then led us to think that the issue must be to do with application start-up.

DependencyResolverAccessor

OpenRasta has a concept of an IDependencyResolverAccessor, which exposes an interface allowing you to implement your own choice of Dependency Injection framework to set up your dependencies. OpenRasta can then resolve instances that have been added to the container at run time in the normal way.

Our DI framework of choice for this project was Castle.Windsor, which is a very mature solution and also integrates very well with SolrNet. The stack trace for the error led us to the WindsorDependencyResolver, which in turn led us to Castle Windsor’s own internal dependency store, which uses a generic List<T>. It turns out that .NET generic Lists are not thread safe.
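You can reproduce the underlying problem without Castle at all. This toy console app hammers a shared List<T> from several threads, and usually dies with the very same IndexOutOfRangeException (it may take a few runs to trigger):

using System;
using System.Collections.Generic;
using System.Threading;

class ListRaceDemo
{
    static void Main()
    {
        var shared = new List<int>();
        var threads = new List<Thread>();

        for (var t = 0; t < 8; t++)
        {
            var thread = new Thread(() =>
            {
                // Unsynchronised Add: the list's size counter and its backing
                // array can drift apart under concurrency.
                for (var i = 0; i < 100000; i++)
                    shared.Add(i);
            });
            threads.Add(thread);
            thread.Start();
        }

        threads.ForEach(th => th.Join());
        Console.WriteLine("Count: " + shared.Count);
    }
}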

The DependencyResolver is set up as a Singleton, and therefore is only ever constructed once, at the start of the application. We then deduced that what must be happening is that at application startup, if a large number of requests come through at the same time, they can access the same List<T>. This in turn can throw the backing array out of sync with the size of the list, resulting in the IndexOutOfRangeException we saw.

To illustrate this, I was able to write an Integration Test that used Threading to fire a large number of concurrent requests at it, each one newing up an instance of WindsorDependencyResolver to emulate application startup.
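Condensed from memory, the shape of that test was roughly this (the thread count is arbitrary; it assumes the usual NUnit, System.Threading and Castle.Windsor usings):

[Test]
public void Concurrent_startup_should_not_corrupt_the_container()
{
    var container = new WindsorContainer();
    var threads = new List<Thread>();

    for (var i = 0; i < 50; i++)
    {
        var thread = new Thread(() =>
        {
            // Each thread news up a resolver against the shared container,
            // emulating a burst of requests during an app-pool recycle.
            new WindsorDependencyResolver(container);
        });
        threads.Add(thread);
        thread.Start();
    }

    threads.ForEach(t => t.Join());
}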

The Fix

To fix the issue, we needed to use the double-check locking pattern around the resolver’s internal container. This ensures that there is indeed only ever one Container set up, even if multiple threads access this on application start-up.

private static volatile IWindsorContainer _windsorContainer;
private static readonly object _synchRoot = new object();

public WindsorDependencyResolver(IWindsorContainer container)
{
    if (_windsorContainer == null) {
        lock (_synchRoot) {
            if (_windsorContainer == null) {
                _windsorContainer = container;
            }
        }
    }
}

Note the use of the C# volatile keyword, used to enforce read/write barriers around all accesses of the singleton IWindsorContainer. This removes the need to use .NET’s Thread.MemoryBarrier().

This has been in production for 2 months and thankfully we’ve seen no repeat of the error!

Posted in OpenRasta, Software Development, Solr

API 2.0 – a more restful proxy around our Api

As we’ve delved further and further into REST and frameworks such as OpenRasta and ServiceStack, I’ve come across a few things that have been niggling me about our own Api. For this innovation time, I wanted to come up with a way of enabling us to kick start our new version of the Api, moving it over to a new framework.

Closer to REST

The few things I wanted to achieve with this first pass were the following, which I’ll go into in a bit more detail shortly:

  • Remove the annoying <response status=””> element from around the entity as this should be handled by Http
  • Return the correct status codes all the way to the client
  • Return the response resource in the correct format as requested in the accept header xml/json or any shiny new format
  • Return OAuth / Authentication responses correctly (as per RFC 2617: http://tools.ietf.org/html/rfc2617#section-1.2)
  • Have it deploy independently from our existing Api

Also, to make this a worthwhile job, I needed:

  • To be able to get something up and running asap
  • To not have to write very much code
  • For any of the logic to be as clean, simple and as modular as possible

To achieve this I decided to build an Api “proxy” in OpenRasta, utilising the IPipelineContributor system to chain a set of tasks together to intercept and modify the response from a request sent to the 7digital api.

PipelineContributors

PipelineContributors are an OpenRasta concept that allows a developer to hook into the request/response process at any point in OpenRasta’s KnownStages and modify the current state of the request’s ICommunicationContext. As I am running under IIS, the ICommunicationContext implementation is OpenRasta’s version of the ASP.NET HttpContext.
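A contributor is just a class implementing IPipelineContributor; a hypothetical example that logs the status code after operation execution might look like this:

public class StatusLoggingContributor : IPipelineContributor
{
    public void Initialize(IPipeline pipelineRunner)
    {
        pipelineRunner.Notify(LogStatusCode)
                      .After<KnownStages.IOperationExecution>();
    }

    static PipelineContinuation LogStatusCode(ICommunicationContext context)
    {
        // Read or modify the per-request state here, then hand control
        // back to the rest of the pipeline.
        Console.WriteLine(context.Response.StatusCode);
        return PipelineContinuation.Continue;
    }
}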

The interesting thing to me was that you can chain these classes together, e.g:

public void Initialize(IPipeline pipelineRunner)
{
    pipelineRunner.Notify(GetApiUrl).Before<ApiRouter>();
}

So this is the route I went down. Currently the PipelineContributors do the following:

FindApiEndpointUri:

As we need to take the current request and route it through to the existing Api v1.2, we first need to find out what the actual Api endpoint URI template is. This is done using the LINQ Except() method to compare the two sets of URI segments, stripping away the host, virtual directory and query string to leave the endpoint segments. This class does exactly that and passes the result up into the pipeline to be used elsewhere.
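The segment matching boils down to something like this (assuming using System.Linq; the segment values are illustrative):

var requestSegments = new[] { "api-proxy", "artist", "details" };
var proxySegments   = new[] { "api-proxy" };

// Except() strips the proxy's own segments, leaving the Api endpoint path.
var endpointPath = string.Join("/",
    requestSegments.Except(proxySegments).ToArray());

// endpointPath == "artist/details"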

ApiRouter:

This is where we send a request to the old Api v1.2, using the same endpoint URI and query string as was passed into the pipeline. It then passes both the response and the Api response status code into the pipeline (usually a 200 even on error, although it does return 401s).

StripLegacyResponse:

The <response> element is not needed, as it contains data that should be handled within the Http headers, and it forces us to write extra logic for deserialising our resources within OpenRasta. This PipelineContributor simply strips it out before passing the response back along the pipeline.

ApiResponseDeserializer:

This is the final stage, hooked up to fire before IHandlerSelection. This is where we deserialise the Api responses into C# DTOs using the SD.Api.Schema library. It then uses a list of Strategy-pattern-style rules to decide which OperationResult it should return, and sets the context.OperationResult to that.

Each PipelineContributor method returns PipelineContinuation.Continue, which tells OpenRasta to continue through the pipeline with our modified context, so that we can still tap into OpenRasta’s Resource – Handler – Codec concept to output our responses.

Single Handler concept

The idea was that everything essentially hits the Api proxy at the root, so we only need one ResourceDefinition, and therefore only one Handler. This is a very simple class that provides a method for each HTTP verb (Get, Post etc). It takes the current per-request ICommunicationContext (injected by OpenRasta) as a ctor collaborator, and for each method returns the current OperationResult.
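A sketch of the shape of that handler (the class name is made up; ICommunicationContext and OperationResult are OpenRasta’s own types):

public class RootProxyHandler
{
    readonly ICommunicationContext _context;

    public RootProxyHandler(ICommunicationContext context)
    {
        _context = context;
    }

    // Each verb method just surfaces whatever OperationResult the
    // PipelineContributors have already placed on the context.
    public OperationResult Get()  { return _context.OperationResult; }
    public OperationResult Post() { return _context.OperationResult; }
}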

Running at website root

I had some problems initially trying to get OpenRasta to register a Resource at the root URI (“/”), but after a bit of poking around I noticed that for a number of our projects we had the OpenRastaHttpHandler path set to “*.rastahook”.

I realised this needed to be “*”; once I’d changed it (and turned off default web pages) I was able to access the service at the root. Good news if we ever want to create a root endpoint listing of URI templates for discoverability.

Summary

The great thing about this project is that it allows us to play around with how we’d like the Api to look whilst Api v1.2 is still running. It has very little code to maintain and took relatively little effort, but as a result we are now able to support json responses and correct http status codes, and the responses returned adhere much more closely to the Representational State Transfer model.

As I’m using PipelineContributors to modify the request, as and when we need to we can remove/tweak this functionality quite easily. We are also free to move functionality across to this new api without affecting the existing api.

Posted in OpenRasta, REST, Software Development

Clean OpenRasta OperationResults

In my post REST in practice and OpenRasta, I alluded to the ability to deliver specific OperationResults based on the actual result of a specific handler. I had decided to move the responsibility for this over to an OperationInterceptor that could contain the same logic for all endpoints.

What quickly became apparent was that this could end up as a huge if statement or an ugly, difficult-to-maintain switch statement, so I came up with the concept of the IOperationResultRule:

public interface IOperationResultRule
{
    bool CanHandle(object entity, int statusCode);
    OperationResult GetResult(object entity);
}

What this means is that within my consuming OperationInterceptor I can supply a list of rules that can be tested, using the CanHandle method, to find the one that can deal with the outcome.

Here’s an example of the consuming code:

foreach (var rule in _rules)
{
    if (!rule.CanHandle(entity, context.Response.StatusCode))
        continue;

    context.OperationResult = rule.GetResult(entity);
    break;
}

Where in this example, entity is the object being passed along the pipeline and StatusCode is the current pipeline context StatusCode, which in this instance I can update elsewhere.

The entity contains the information I need to be able to decide which OperationResult to fire. The first one that “CanHandle” it, handles it.
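As a made-up example of what a rule might look like, a 404 rule could map the pipeline status code straight onto OpenRasta’s NotFound result:

public class NotFoundRule : IOperationResultRule
{
    public bool CanHandle(object entity, int statusCode)
    {
        return statusCode == 404;
    }

    public OperationResult GetResult(object entity)
    {
        // Carry the entity through as the response body.
        return new OperationResult.NotFound { ResponseResource = entity };
    }
}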

As a result we get easily maintainable, atomically testable code and no ugly switch statement. New OperationResultRules can be added as required!

Lovely.

Posted in OpenRasta, Software Development

Creating a basic catalogue endpoint with ServiceStack

Overview

ServiceStack is a comprehensive web framework for .NET that allows you to quickly and easily set up a REST web service with very little effort. We already use OpenRasta to achieve this same goal within our stack, so I thought it would be interesting to compare the two and see how quickly I could get something up and running.

The thing that most interested me initially about ServiceStack was the fact that it claims out-of-the-box support for Memcached, something we already use extensively to cache DTOs, and Redis, the ubiquitous NoSQL key-value store.

Getting cracking

I set myself the task of creating a basic endpoint for accessing 7digital artist, release and track details, whilst taking advantage of ServiceStack’s ability to create a listener from a console app so I didn’t have to waste time setting it up in IIS:

class Program
{
    static void Main(string[] args)
    {
        var appHost = new AppHost();
        appHost.Init();
        appHost.Start("http://localhost:2211/");
        Console.Read();
    }
}

As you can see this couldn’t be simpler. Whilst the thread is running, it will listen at localhost on port 2211 for incoming requests.

AppHost

Every ServiceStack implementation starts with the concept of an AppHost, a catch-all class that exposes the initial setup of your service. For a console-app-based HttpListener setup, it relies on you overriding the AppHostHttpListenerBase.Configure() method, which offers up a Container for access to the built-in IOC. Funq is the weapon of choice in ServiceStack.
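A minimal AppHost for the console-hosted listener looks something like this (the service name and assembly are assumptions; Container is Funq.Container):

public class AppHost : AppHostHttpListenerBase
{
    public AppHost()
        : base("Catalogue endpoint", typeof(ArtistService).Assembly) { }

    public override void Configure(Container container)
    {
        // Funq registrations and Routes setup live here.
    }
}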

It seems a shame that ServiceStack doesn’t abstract away the responsibility of the IOC, allowing the developer the chance to plug in their own IOC implementation as you can with OpenRasta, but the emphasis with ServiceStack is on speed (both of performance and of setup) and Funq is perfectly adequate.

Routes

The AppHost is also where it is suggested you set up your Routes, where, as with the ConfigurationSource in OpenRasta, you define your Resource – UriTemplate relationships:

Routes
    .Add<Artist>("/artist/details")
    .Add<Artist>("/artist/details/{Id}")
    .Add<Release>("/release/details")
    .Add<Release>("/release/details/{ReleaseId}");

As with OpenRasta, the resource is represented by a simple DTO. In exactly the same way (via the KeyedValueBinder attribute in OR) your DTO represents the incoming request parameters, or POST request representation of your resource.

Services

The actual service itself (the equivalent of the Handler in OR) is where the request gets processed. There are a couple of ways you can accomplish this, but I opted for the documented method of deriving from the RestServiceBase<TResource> class. From within here you can then override a set of “On” methods which ServiceStack routes the call through to depending on the verb used, for example:

public override object OnGet(Artist request) {
    // Service logic here
}

Those familiar with OpenRasta will recognise a similar concept in setting up of Handlers, but with a few subtle but important differences. OpenRasta successfully decouples the concept of a handler from your implementation by allowing you to tie it to a resource elsewhere, which from a clean code perspective I prefer.

OpenRasta also more importantly does not assume that you will be implementing all http verbs within a handler, and returns a valid 405 Method Not Allowed if you have not implemented that method for a service.

IOC implementation

ServiceStack’s default Funq works very well, and you can opt for either constructor injection or property injection. Ctor injection is our (and should also be your) preferred way of achieving this, and Funq handled it perfectly. As mentioned earlier, you set up your container within the AppHost. You can then use the familiar Register<TInterface>(ConcreteInstance) to set up your dependencies.

MediaTypes / Features

ServiceStack’s great selling point is the ability to set up a “vertical slice” of a site incredibly quickly, and this it does without fail. Once I had my resource DTOs, service and AppHost set up I was able to access it immediately. It also supports many different media types out of the box, which can be turned off and on within the AppHost like so:

SetConfig(new EndpointHostConfig
{
    EnableFeatures = Feature.All.Remove(Feature.All)
                                .Add(Feature.Xml | Feature.Json | Feature.Html),
});

Caching and Redis

ServiceStack is true to its word that it supports a caching layer out of the box, and it is really easy to set up. It comes with its own MemoryCacheClient, which works well as a basic .NET IDictionary-style implementation of a cache. It supports TTLs, though I’m not sure about LRU (least recently used) or other eviction strategies.

Each cache class implements the ICacheClient interface, and you just set up the dependency in the AppHost in the normal way. You can then inject it into your caching service as normal. I implemented it in my ArtistDetails service, roughly as sketched below.
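Condensed, and with the lookup helper, Id type and TTL made up for the sake of the sketch: register the cache in AppHost.Configure with container.Register<ICacheClient>(new MemoryCacheClient()), then take it as a ctor collaborator:

public class ArtistDetailsService : RestServiceBase<Artist>
{
    readonly ICacheClient _cache;

    public ArtistDetailsService(ICacheClient cache)
    {
        _cache = cache;
    }

    public override object OnGet(Artist request)
    {
        var cacheKey = "artist:details:" + request.Id;

        var cached = _cache.Get<Artist>(cacheKey);
        if (cached != null)
            return cached;

        var artist = LookUpArtist(request.Id);
        _cache.Set(cacheKey, artist, TimeSpan.FromMinutes(10));
        return artist;
    }

    Artist LookUpArtist(string id)
    {
        // Stand-in for the real catalogue lookup.
        return new Artist { Id = id };
    }
}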

Redis works in exactly the same way, and it does just work. I was very impressed with its implementation.

It would have been nice to test its Memcached setup, but that didn’t come with the latest release; it’s only available in the latest cut from GitHub. I downloaded it, but due to some initial setup issues I ran out of time before I could play with it. It’s essentially an adapter around the Enyim library, which we already use for our Memcached setup.

Pipeline vs ResponseFilters

OpenRasta allows you to “hook into” various stages of its pipeline process, outlined here. I wanted to see if ServiceStack did the same, and I got very excited when I saw the concept of ResponseFilters, which again are set up in the AppHost. Sadly I ran out of time on this; I wanted to try to implement a “catch all” way of dealing with my http response codes issue as mentioned above, but this could be something I investigate further at a later date.

HttpStatusCodes

My biggest issue with ServiceStack, after poking around a bit, revolved around status responses. If, within a service, you have not overridden a method for a specific http verb then you do not get a nice instant 405 Method Not Allowed response. Instead you get a 500 Internal Server Error with the available mime-type representation of the error object and stack trace.

In an attempt to rectify this, I ended up creating an interim ErrorHandlingRestServiceBase<TResource> abstract class as follows:

public abstract class ErrorHandlingRestServiceBase<T> : RestServiceBase<T>
{
    protected override object HandleException(T request, Exception ex)
    {
        if (ex is NotImplementedException)
            return new HttpResult(ex)
                   { StatusCode = HttpStatusCode.MethodNotAllowed };

        return base.HandleException(request, ex);
    }
}

The service then derives from this. Not the most elegant solution, but the only way I could see it could be done. Having to implement functionality through inheritance rather than loosely coupled hooks can lead to complexity over time.

CustomSerialization

Another thing I didn’t get a chance to look at in more detail was customising the final mime-type representation of your resource on the way back to the client. OpenRasta handles this excellently through the concept of a Codec, which is hooked up as a representation of a Resource with the ResourceSpace.Has syntax. This helps keep the implementation of the request decoupled from the representation of the resource.

ServiceStack doesn’t seem to have an equivalent of this concept; in an attempt to ease you into an out-of-the-box implementation, it takes care of all of this for you.

Summary

In my opinion ServiceStack does deliver on its promises: it’s intuitive, user friendly and quick to set up. I’m sure if I’d had as much time with it as I’ve had with OpenRasta, I’d have found ways around the issues outlined above. Currently I don’t see anything that would prompt me to use it instead, but as a simple framework to quickly get an application up and running it’s definitely a winner.

The project is available here.

Posted in OpenRasta, REST, Software Development

7digital search and Apache Solr

I’ve recently been working on a huge project to move our existing SQL-based full-text search functionality over to Apache Solr. It was a great project to work on and has been a resounding success so far.

There’s a great article about it on the 7digital blog written by Mark, our product manager.

I’ll write an entry soon about some of the finer points of the project and why it was so rewarding to work on…

Posted in Software Development, Solr