Middle-child MAD company
OK, so this is something that I have had to deal with quite a lot recently. How do you take a tool from the MAD landscape and scale it across as much of an organisation as possible? It is hard. I have spent weeks trawling the Internet, looking for articles on how best to implement these tools at scale, or even how to implement them with others at all. And it is extremely hard to find anything sensible out there. That frustration is part of the reason I started this blog. Partly to scream out into the void, but more importantly, to actually start providing some of these articles myself. It was also then that I started to realise why there were so few of them in the first place. It is hard to make good general recommendations on how to scale the Modern Data Stack. But I promise there are articles in the works on the matter.
But for now, a bit more void screaming first... As you move into the scaling part of building a data platform, you run into an entirely new set of challenges. I will call this phase the scaling phase, although it has nothing to do with the size of the company. I am currently working on a scaling phase project in a large, established company that already has an existing enterprise data platform. We are a typical modernisation project. But our platform is scaling, expanding from a single user team to several new ones at once.
The first thing that hit me when I was looking for advice on how best to set up these tools was that virtually all articles and tutorials were based around small, simple implementations: single teams, tiny data, virtually no dependencies. That made the tutorials easy to follow and understand, but they broke down as soon as I tried to fit the simplified case onto my more complex one. And no, our version of the Modern Data Stack is in no way special or controversial. It is in fact remarkably basic. So I can't blame it on odd tool combinations either.
The next thing was that there was no information on the very large-scale implementations either, which leads me to think that either no one has been able to create a truly enterprise-wide Modern Data Stack (yet), or those who have, for one reason or another, don't see a reason to share how they got there. Oh, I have seen the broadcast conference talks, and been to a few in person. But these talks feature such a heavily sanitised version of reality that they serve more as advertisements than as actual learning points.
The comparison I want to draw is that the scaling phase of the modern data platform is the middle child. The youngest, smallest, trial phase platform gets all the attention, with easy-to-understand documentation and lots of third-party tutorials. The older, senior data platform is able to manage on its own and is rather fed up with the annoying youngsters, really just wanting to do its own thing. And in the middle, we are left to fend for ourselves. Find the path fresh. Likely set to solve problems already solved. Make mistakes already made.
Why does it have to be that way? Why are vendors not more interested in helping scale the user base? Oh, they are happy to see consumption go up (they can bill more), but the actual building of the needed structures and systems is always left to the customers, even though the vendors should have tons of useful material to draw upon from other customers.
Why am I sceptical of vendors? Because I have been in talks with them. Sat down with them, and talked through what they have. And surprisingly often, what they come up with is either the end result of the older sibling or, more often, more of the simplified younger sibling.
Let us all, here and now, acknowledge that this is hard. It won't be clean and nice. It never is. But like the annoying younger sibling, I hope for some attention: that vendors, and those who have completed the path, will start providing more realistic descriptions of how to take the platform from small and simple to a fully functioning enterprise data platform. And in the meantime, I will do my part and start talking about the awkward middle stages, and the solutions that have been implemented in current and previous scaling phase platforms.