Geeks With Blogs

Joe Mayo

There’s been some discussion recently about whether it’s proper for a language to support null values or not. My guess is that the people having these discussions are a whole lot smarter than I am, so it’s probably best to defer to their judgment on any matters that might conflict with whatever I have to say. That said, there might be semantic differences in what is being referred to when discussing null. Perhaps one discussion regards the logical operation or state of code, whereas my perspective in this post is on data.

There was a time when I was fascinated by subjects such as truth, equality, and the meaning of null. Especially when thinking about data, null is an interesting subject. In a relational context, null means the absence of a value. When writing code, the meaning of null is determined by your domain. You typically won’t have a specific requirement, use case, or story that tells you how to work with a null value when first writing the code, so your logic is often inferred. It’s often only when a bug appears in your code that a formal requirement might be specified, but I say “might” loosely because handling of null is a technical requirement and the bug would probably be stated as the functional requirement to satisfy. This is why null values are often so aggravating: they bubble up later in the application lifecycle, where they’re more painful to deal with. This seemingly mundane topic of existence appears to be more important than one may think.

Today I encountered a null in the wild and all the related discussion compelled me to investigate. I was adding a new field to the LINQ to Twitter Status entity named FavoriteCount. This relates to the favorite_count field of Twitter’s Tweets object. What caught my eye is that the definition of favorite_count identifies the field as Nullable. I know what Nullable means in relational and .NET parlance, but Twitter is a different world. So, I went to the Twitter developer’s site and asked, “What is the Meaning of null favorite_count?”. I’ll quote an important part of @episod’s answer:

The state of null is usually reached because it was not possible at the time of servicing the request to provide a value in that field -- an underlying system could have taken too long to respond, or a null value got cached at some other point in time when that happened.

One could say that this is the same thing as the “absence of a value” definition I mentioned earlier, but there’s an important nuance here in the context of the data source and nature of the application. The “absence of a value” comes from a relational ACID transaction perspective where you have what you’re going to get. However, Twitter is a massively scalable system that uses a NoSQL data source. From the answer and the scalable NoSQL perspective, the true value of favorite_count may not be null at the point in time that you’re looking at it, but you don’t know for sure. I believe the proper term for this phenomenon is Eventual Consistency. There’s some hair splitting that can be done here, but I’m speaking in general terms.

In my initial question about the favorite_count field, I wondered whether it was safe to assume that a null value meant 0. Making a blanket assumption like that could certainly cause some trouble and it would be unwise for me to make that assumption on behalf of the developers using LINQ to Twitter. For example, what if an application would rather monitor a tweet for N minutes to see what the real favorite_count was after it initially appeared as null? There are variations of this theme, but it’s a situation where I would want to expose functionality that is meaningful and allows the developer to know the true state of an entity.
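To make the distinction concrete, here is a minimal sketch of keeping "unknown" separate from 0. It is written in Python for brevity; in C# the same shape would be an int? property. The function names (parse_favorite_count, poll_until_known) and the fetch callback are hypothetical illustrations and not part of LINQ to Twitter or the Twitter API.

```python
import time
from typing import Callable, Optional


def parse_favorite_count(tweet: dict) -> Optional[int]:
    """Return favorite_count, keeping None when the field is null or missing."""
    value = tweet.get("favorite_count")
    # Do not collapse None into 0: "unknown" and "zero favorites" differ.
    return None if value is None else int(value)


def poll_until_known(
    fetch: Callable[[], Optional[int]],
    attempts: int,
    delay_seconds: float = 0.0,
) -> Optional[int]:
    """Call fetch() up to `attempts` times until it yields a non-null count.

    `fetch` is a hypothetical callback that re-reads the tweet and returns
    its favorite count, or None while the value is still unavailable.
    """
    for attempt in range(attempts):
        count = fetch()
        if count is not None:
            return count
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    # Still unknown after every attempt; the caller decides what that means.
    return None
```

A caller that really does want 0 for unknown can still write `parse_favorite_count(tweet) or 0`, but that becomes an explicit decision in the application rather than an assumption baked into the library.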

To understand what null is and put it into context is an important endeavor, especially in keeping customers from hating you and reducing project lifecycle costs. However, you still need language support unless you regress to the kludges of earlier platforms. In the first version of .NET you could encounter code that either treated a value as a string or used some other mechanism, such as an alternate value, to represent null values. I can’t count the number of different representations of a null DateTime value I encountered in those days, or the headaches of accepting the defaults that different databases applied to null values. Of course, ADO.NET has the DBNull type and there were 3rd party libraries that offered special types for representing nullable value types. .NET 2.0 introduced the Nullable<T> type and added special language support to C#, where you could append a ? suffix to a value type, such as DateTime?. These days, it’s easy to represent null data in a consistent way with both .NET Framework and language support.
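As a rough illustration of the alternate-value kludge versus a real nullable type, here is a small Python sketch in which Optional[datetime] plays the role that DateTime? plays in C#. The marker strings and the normalize_date name are assumptions made up for the example; they are not taken from any particular database or library.

```python
from datetime import datetime
from typing import Optional

# Hypothetical stand-ins that legacy code might have used to mean "no date".
LEGACY_NULL_MARKERS = {"", "0000-00-00", "1900-01-01"}


def normalize_date(raw: Optional[str]) -> Optional[datetime]:
    """Convert a legacy date string into a datetime, or None when it means null."""
    if raw is None or raw.strip() in LEGACY_NULL_MARKERS:
        # Every ad hoc representation collapses into one consistent null.
        return None
    return datetime.strptime(raw.strip(), "%Y-%m-%d")
```

Once the conversion happens at the boundary, the rest of the code only has to check for None instead of remembering each magic value.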

Null data is a reality. Old databases often have null values. These old databases also have applications with logic that relies on the null data. The cost/benefit of a re-write can result in leaving things as is. There are also current scenarios where null makes sense, such as when having Created and Modified fields on a record where it doesn’t make sense to provide a value for Modified because the record hasn’t changed since its initial value. One could argue that Modified should have the Created time and application logic would detect this fact, or perhaps another design would be more appropriate, but the fact is that you’ll find people with differing opinions on what the data design should be in any given situation and sometimes the argument for null will win. I’m not one to fall on my sword for a null value when it’s possible to write code to deal with it. Another perspective, as mentioned earlier, is the NoSQL world with eventually consistent data. Regardless of what arguments you have on that, the reality is that you will be working with APIs, public data, and situations where the need for scalability results in null data making sense or being the best choice to meet requirements.

So, I’m not trying to say that someone else’s opinion of null is wrong. However, it seems like there are nuances to consider before declaring that “Null is Evil!” Our industry is plagued with never-ending battles over whose platform or language is best, and the true answer can sometimes be found in between. While scientists debate What is Nothing, VB developers can relate and C# developers can say that’s equivalent to null. Or is nothing really something to care about?


Posted on Thursday, March 28, 2013 5:04 PM

Copyright © Joe Mayo