Data tool investment inertia

Hot-take

Oct 23

Recently I read Benn Stancil's post on the potential gap [1]: the gap between what our data tools are capable of and the extent to which we actually use them. It is a reminder that not fully utilising the capabilities of our tools is our own choice.

As I read and understood his comments, he blames our fear of lock-in and our unwillingness to invest the time and resources needed to utilise the tools we get. This in turn creates an incentive for vendors not to build the comprehensive tools we seek, and the result is a vicious cycle.

I am not denying that there is a potential gap. Anyone who has spent any time in this field can see it. What I want to discuss is his premise that the “fear of lock-in” causes the gap by itself. There are many more factors at play, and blaming a single factor is overly simplistic. This post is an attempt to add some context and nuance to the discussion.

One factor is something Benn himself touches on in his opening illustration of the perfect Microsoft family. He describes it as not being an image of reality, but says that it could be if people dedicated the time to learn the ins and outs and became super skilled at Microsoft. The operative phrase is the time to acquire the necessary skills. I say this as someone who has spent a lot of their career in various modernization projects, dealing with the re-skilling of large numbers of people.

It is the hardest part. The more people you have and the longer you have been with one technology stack, the harder it gets. You have to deal with the people who are unwilling to learn and coax them into what you see as the new reality. The keen people become a challenge too, because regardless of how fast you move, you can never move fast enough for the most enthusiastic users. They start to look for solutions and options that they can control themselves, which starts the process of system chaos that Stancil described as the unfortunate reality.

The next challenge to further utilization can be found in the selection process. It is a rare company (probably a startup with very few people) where the people investigating these tools and deciding on their place in the org are the same people who decide to commit the company to learning them. If your data organization is small, it might be feasible to declare a change and commit to a tool. But then your data challenges are also small and simple compared with those of a larger data org.

Beyond the challenges of training and coordination, we can look at the fear of lock-in itself. I will happily admit that lock-in is always on my mind when evaluating tools and features. My primary concern is not the lock-in itself, but rather the spreading of lock-in. Every time you choose a tool you accept some level of lock-in. There is no such thing as a lock-in-free tool. But it is one thing to accept lock-in, and something else entirely when the lock-in spreads: one tool takes on a new role beyond what you chose it for. Now, if you want to replace it, there are more things that need replacing, and you can’t guarantee that you will find a combination that provides the features you need. I consider this malicious lock-in, and it is the thing I want to avoid happening without being aware of it.

The MAD landscape and the modern data stack model promise us the ability to choose best-of-breed tools to build our data platforms. In my opinion, however, something essential is missing from that promise.

The tools currently being created all have their own interfaces and implementations. For a system ostensibly designed as a set of building blocks, very little is being done in the realm of adaptability. As a platform engineer, I spend a large amount of time trying to duct-tape various components together, and every tool has its own quirks and challenges. I might be a cynic, I might be wrong, and there is probably a spectrum here. But I can’t help feeling that there is a degree of trying to keep us with the tools because it is so much work to change.

What I want from the MAD stack is for it to be built in such a manner that every tool is modular enough to slot in and be replaced. If our tools operated with a standardised interface to the other components, we would create an incentive for vendors to focus on the quality of their features, knowing that if they lag behind or are overtaken, they are easy to replace. As long as we don’t have a common set of interfaces, there is an incentive to make it harder to change. That creates the need for investment and dedication upfront, before the tool has proven its worth. The only tool we as developers have is to be wary of too much lock-in up front.
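To make the idea of a standardised interface concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the `Storage` interface, the two in-memory "vendor" classes and the `load_orders` step are my own stand-ins, not the API of any real tool. The point is only that the pipeline step depends on the shared interface, so either implementation can be swapped in.

```python
from __future__ import annotations

from typing import Iterable, Protocol


class Storage(Protocol):
    """Hypothetical common interface that every storage tool would implement."""

    def write(self, table: str, rows: Iterable[dict]) -> int: ...
    def read(self, table: str) -> list[dict]: ...


class VendorAStorage:
    """Stand-in for one vendor's tool (in-memory, illustration only)."""

    def __init__(self) -> None:
        self._tables: dict[str, list[dict]] = {}

    def write(self, table: str, rows: Iterable[dict]) -> int:
        stored = self._tables.setdefault(table, [])
        before = len(stored)
        stored.extend(rows)
        return len(stored) - before  # number of rows written

    def read(self, table: str) -> list[dict]:
        return list(self._tables.get(table, []))


class VendorBStorage:
    """A competing tool: different internals, same interface."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, dict]] = []

    def write(self, table: str, rows: Iterable[dict]) -> int:
        new = [(table, row) for row in rows]
        self._rows.extend(new)
        return len(new)  # number of rows written

    def read(self, table: str) -> list[dict]:
        return [row for name, row in self._rows if name == table]


def load_orders(storage: Storage) -> list[dict]:
    """Pipeline step that knows only the interface, not the vendor."""
    storage.write("orders", [{"id": 1, "amount": 40}, {"id": 2, "amount": 2}])
    return storage.read("orders")
```

Replacing the tool is then a one-line change at the call site, `load_orders(VendorAStorage())` versus `load_orders(VendorBStorage())`, and nothing else in the platform needs to be unravelled.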

We can prioritise tools that make themselves easy to replace, and then dedicate the work to learning to use them. We can reward the systems that see themselves as part of a collaborative community of processes, while avoiding the ones that try to become your one and only and at the same time make it harder to collaborate with other systems.

Paradoxically, the lock-in from an end-to-end tool like the ones Stancil mentioned is smaller and less insidious than the cumulative lock-in of smaller components. When you commit to a single end-to-end tool, you have already accepted that if you want to replace it, you will need to replace it all. That is preferable to having what you think is a modular platform, wanting to replace a single component, and finding that it has embedded itself in so many other components and areas that replacing it means unravelling much more and finding new solutions for aspects you were happy with.

The challenge with an end-to-end tool, and here I agree with Stancil, is getting the necessary corporate and management buy-in to start there. All organisations want proofs of concept: small tests and low-risk experiments. That trend is anathema to tool utilisation. If I were to hazard a guess as to why it is so hard to get the buy-in, I would say it is because the decision to trial these tools doesn’t originate at the level that needs to buy in. It has to be sold in and presented as a good idea, possibly by showing that it will solve a problem that management isn’t aware of. In environments like this, it is hard to get buy-in for something that requires monumental organisational change and investment. It is much easier to present a small change that holds little to no risk.

In those cases where the change is brought about by an organisational threat, these large end-to-end systems take too long to produce results compared with the immediate need for change.

Another challenge to the implementation of these large-investment tools is people's sense of self-determination. While it might be overcome in smaller organisations, in large organisations there will always be people who resist the change and oppose it on the basis that it isn’t their choice. They probably won’t say it that way, but at the core that’s the case. And the more an end-to-end tool requires everyone to work the same way, the more the resistance will grow.

It all comes back to modularity. When selecting tools or building platforms, the most important feature you can have is the modularity of the system. With solid modularity, you can allow users to choose the tools that suit their needs and make it easy to take one or more tools out and replace them. If your platform supports the co-existence of multiple tools within the same MAD sector, it becomes easier to commit resources to learning them, as there is low perceived lock-in and the tools are allowed to stand on their own merits. You can safely invest the time to learn tools, knowing you have an alternative.

The remaining problem, which unfortunately no technology choice can ever help you with, is convincing your users to acquire the skills they need.

References

[1] https://benn.substack.com/p/the-potential-gap


Simen Svenkerud
