Francesco Gadaleta PhD is Senior Software Engineer, Chief Data Scientist and Founder of Amethix Technologies. He hosts the popular Data Science at Home podcast, and he’s held key roles in the healthcare, energy, and finance domains. Francesco’s specialisms include advanced machine learning, computer programming, cryptography, and blockchain technology.
In this post, Francesco reflects on the changing role of the data scientist in an organisation. Data platforms have evolved, and will continue to do so at pace, but how can data scientists help organisations adapt to take advantage of these ongoing advancements? As Francesco explains, the solution isn’t always as simple as integrating the latest platform.
The work of the Data Scientist has changed in recent years.
While it has improved for some, it’s probably degraded and become even more frustrating for others. For example, several of the tasks that Data Scientists performed a decade or so ago are no longer needed because there are better tools, better strategies, better platforms – and definitely better architectures.
Think about Oracle, PostgreSQL, MS SQL Server, and many others like them. Over the past couple of decades, these platforms have been the most important systems in organisations in pretty much every sector, regardless of size and data types. As demand for analytical tasks has grown alongside the needs of key decision-makers, engineers have turned to different types of architectures and platforms to fulfil these new requirements. These architectures range from the data warehouse to the data lake, the data fabric, the lakehouse and the data mesh. Chronologically, the data warehouse of the sixties and seventies has since evolved into the data mesh and the data lakehouse. The lakehouse is one of the most recent evolutions in data architecture.
Such a new concept overcomes some of the most evident limitations of databases, whether they hold structured or unstructured data. Setting aside key-value stores, which can manage unstructured data to a certain extent, the idea of the database is based on transactions, on some form of consistency, and on a schema for the data. This means that once you have an ingestion layer that takes data from a data source into the database, the engineer has to know in advance what that data looks like. For some organisations, this is still a valid concept. The fact that data architectures evolve doesn’t necessarily mean that one has to migrate to the most novel approach overnight.
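To make the point about knowing the data in advance more concrete, here is a minimal Python sketch contrasting the two approaches. It is an illustration only, not a description of any particular platform: the `readings` table, the sensor records and the function names are all hypothetical, an in-memory SQLite database stands in for the traditional database, and a plain list of JSON strings stands in for raw files in a lake.

```python
import json
import sqlite3

# Hypothetical schema: the columns the ingestion layer expects to see.
EXPECTED_FIELDS = {"sensor_id", "temperature"}


def make_warehouse():
    """Create an in-memory database whose table layout is fixed up front."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (sensor_id TEXT NOT NULL, temperature REAL NOT NULL)"
    )
    return conn


def ingest_with_schema(conn, record):
    """Schema-on-write: reject any record that does not match the declared columns."""
    if set(record) != EXPECTED_FIELDS:
        raise ValueError(f"unexpected fields: {sorted(set(record) ^ EXPECTED_FIELDS)}")
    conn.execute(
        "INSERT INTO readings (sensor_id, temperature) VALUES (:sensor_id, :temperature)",
        record,
    )


def ingest_raw(store, record):
    """Schema-on-read: keep the record as-is and decide how to interpret it later."""
    store.append(json.dumps(record))


def read_temperatures(store):
    """Apply a schema only at read time, skipping records without a temperature."""
    temperatures = []
    for line in store:
        record = json.loads(line)
        if "temperature" in record:
            temperatures.append(float(record["temperature"]))
    return temperatures
```

In this sketch, a record that arrives with an extra `humidity` field is refused by `ingest_with_schema` until someone changes the table and the ingestion code, whereas `ingest_raw` accepts it unchanged and leaves the interpretation to whoever reads the data. Neither choice is better in general, which is the point made above: for organisations whose data is stable and well understood, the first approach remains perfectly valid.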
If you’d like to learn more about the challenges and opportunities data platforms offer, read this article in full over on our magazine: https://issuu.com/articles/19459899