Twenty Years of Data Through the Lens
Data Lens Turns 20
LinkedIn reminded me that Data Lens, the data consulting practice I started, turns 20 this month. I originally started the company as a solo BI practitioner, but as I brought in consultants, I never stopped to think about the longevity of Data Lens. I just immersed myself in the work. Data management consulting is a great field, and in the past 20 years we have seen things that either changed dramatically or remained frozen in time. I decided to take a look back through my lens at the past two decades and share observations ranging from the compelling to the just plain silly. Some things have changed while others REALLY need to. These are my random observations – your results may vary.
What Hasn’t Changed
- Marketing still struggles with many of the same data challenges: Before going out on my own, I ran the database marketing group for a global software company. Tools for lead management and campaign tracking were limited, and we ended up building our own primary lead-generation engine. We explored some tools, but there was still a lot of manual work back then. Today, it is surprising that process efficiency is still lacking. Even with major CRM platforms and integrated tools, I see marketing teams weighed down by manual lead tracing, poor list quality, and bad data.
- Shadow IT survives: The fact is, business units need access to their data and reporting, and many decision makers don’t want to wait through delays caused by formal IT practices. Shadow IT will always be the secret weapon used by non-IT departments. If you are a data governance practitioner, I feel your pain, but you have to come to terms with it.
- The need to invest in a data foundation for analytics: Regardless of your industry, data use cases, analytic needs, or vendor preferences – you need to spend time to architect a centralized data platform for analytics. There is no getting around it. Once you have put in the upfront work, the foundation will pay dividends and grow with your business.
- Data privacy issues: Data security breaches are not the only factor here. The democratization of data intelligence tools (and the data to go with them) has increased the focus on data privacy. Did I mention that I feel bad for data governance practitioners?
- Odds and ends: FTP is like an old pair of shoes that still fits great, but that you don’t really want to wear anymore. And who can forget old habits like using underscores when naming objects in your schemas or models? Please, let_them_go.
What Has Changed
- Connecting to your data: Bringing in your data (relational, semi-structured, etc.) is so much easier. BI tools have made steady progress in improving their connectors and reducing the time it takes to put data to work, and THAT is their biggest sales pitch. Another observation is that these tools do not crash like they did in the early 2000s. They also feel more cohesive – although those who got Hadoop’ed a decade ago might not agree.
- Open-source RDBMS platforms are more popular than commercial platforms: In 2001, IBM was still the #1-ranked DBMS vendor overall based on new license sales (Gartner Dataquest, May 2002). That ranking does not include its acquisition of Informix. In 2020, MySQL holds the top spot based on both its community and commercial licenses. It is a stable open-source DBMS that is embedded in many web applications and is a staple for WordPress sites. Among the commercial DBMS vendors, those that made it to the cloud first seem to be the horses to bet on.
- Database deployment: Automated deployment is now possible in the database world on a wide scale. In 2001, this would have sounded crazy as you copied and pasted your next release script. But thanks to open-source collaboration, it did not take the actions of a single commercial DBMS platform to push this forward.
- Data delivery: APIs provide a much-needed delivery mechanism, but the management of your API catalog is still remarkably similar to administering interfaces from the past.
- DBMS pricing methods: Cloud computing completely disrupted how commercial platforms structure their pricing. The platforms that price their storage and compute layers separately have figured out that customers want more flexibility. I still can’t figure out some of the pricing models without talking to a person, so I guess that aspect has not changed in 20 years.
How has data management changed over your career? Are there areas where things have not changed despite advancements in others? I know I will think of 20 more things in the next week, but doing this mind dump reminded me how far we have come. Well…