Tarush Aggarwal is one of the world’s leading experts in helping organisations leverage data for exponential growth. He holds a degree in Computer Engineering from Carnegie Mellon University, and he became the first Data Engineer on the analytics team at Salesforce. Tarush also led the data function for WeWork before founding the 5x Company in 2020, which supports entrepreneurs in scaling their businesses.
We asked Tarush to share his expert knowledge on how organisations can meet the demands of managing their disparate data sources. Tarush explains the benefits of implementing a scalable, managed data platform that will evolve alongside your organisation.
Stay tuned as we delve into Tarush’s inspiring journey and learn valuable insights from his experience in the ever-evolving world of data science.
How has the data infrastructure landscape developed over the last 10 years, Tarush?
When you look at the history of data infrastructure, it began with the online revolution. All of a sudden, we went from storing data on our own personal devices to storing data on the Cloud. With the advent of Facebook and Google, Cloud companies started collecting massive amounts of customer information, so the need to analyse this information is really where the big data revolution came from.
Along with starting to store information in the Cloud, the second thing which became prevalent is that we started having multiple different services to store this information. It was no longer one company which had all of this information. Today, your average start-up has got 10 different sources of data. This could be your backend databases, marketing data from Facebook Ads and Google Ads, data from your CRM, financial data from Xero or QuickBooks, or even Google Sheets mixed with application data from Greenhouse and Lever. The number of different data sources has increased, and this has resulted in a need to centralise data again so that we can make sense of it.
That’s a quick history of how we got to where we are today, and why it’s becoming more and more important for companies to have the right platform or the right infrastructure in place to make sense of all of this data.
What can companies do to tackle this problem of disparate and convoluted data sources?
I think there are four core steps when we think about data platforms today… Step one is how do we pull this data from these different data sources into a single place to analyse? Once you have this, you want to store all of this inside a data warehouse, which is structured to store large amounts of data. Modern-day warehouses are able to separate storage from compute, which makes them really cost-efficient in being able to store lots and lots of data without racking up large bills.
That’s step two.
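To make the first two steps concrete, here is a minimal sketch in Python. It is an illustration only, not Tarush's or 5x's implementation: the two source extracts, the table names and the dates are all hypothetical, and an in-memory SQLite database stands in for a real data warehouse.

```python
import sqlite3

# Hypothetical extracts from two separate tools. A real pipeline would
# pull these rows from each tool's API or export instead of hard-coding them.
AD_SPEND = [
    ("2024-01-01", "facebook_ads", 120.0),
    ("2024-01-01", "google_ads", 80.0),
]
CRM_DEALS = [
    ("2024-01-01", "Acme", 1000.0),
    ("2024-01-01", "Globex", 500.0),
]


def load_into_warehouse(conn):
    """Steps one and two: pull each source into one central store."""
    conn.execute("CREATE TABLE ad_spend (day TEXT, channel TEXT, spend REAL)")
    conn.execute("CREATE TABLE crm_deals (day TEXT, customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO ad_spend VALUES (?, ?, ?)", AD_SPEND)
    conn.executemany("INSERT INTO crm_deals VALUES (?, ?, ?)", CRM_DEALS)
    conn.commit()


def daily_summary(conn):
    """Once the data is centralised, one query can span both sources."""
    return conn.execute(
        """
        SELECT s.day, s.total_spend, d.total_revenue
        FROM (SELECT day, SUM(spend) AS total_spend
              FROM ad_spend GROUP BY day) AS s
        JOIN (SELECT day, SUM(amount) AS total_revenue
              FROM crm_deals GROUP BY day) AS d
          ON s.day = d.day
        ORDER BY s.day
        """
    ).fetchall()


# SQLite in memory is only a stand-in for a cloud warehouse (an assumption).
conn = sqlite3.connect(":memory:")
load_into_warehouse(conn)
summary = daily_summary(conn)
```

The point of the sketch is that marketing spend and CRM revenue only become comparable once both sit in the same place; before that, each number lives in a separate tool.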
For more of Tarush’s insights on data platforms and the changing role of data scientists, read the interview in full over on our magazine: https://issuu.com/datasciencetalent/docs/the_data_scientist_mag_issue_3_digital?fr=xKA9_zU1NQ