There is such a buzz surrounding the wonderful world of Data Science these days. But what actually is a Data Scientist and how do you become one? This question is on the minds of both rookies and professionals in the field of data. Online we are inundated with blogs and articles about this topic, but there are also some great books that sometimes offer a bit more depth into the subject. In this blog I want to introduce you to the contents of a book called “Data Scientist – The Definitive Guide to Becoming a Data Scientist” by Zacharias Voulgaris. I found this book well worth reading, so I want to give you a quick introduction to the book and reflect on what I think are the book’s strengths.
One of the first great things you get from this book is a clear definition of Data Science and an unambiguous distinction that Voulgaris makes between Data Scientists and Data Analysts. Voulgaris describes Data Science basically like this: Data Science concerns all the different aspects relating to handling data, especially Big Data, in an intelligent and methodological way in order to create useful data products. This definition includes a number of valuable perspectives: Big Data; intelligent and methodological way; create data products. Voulgaris therefore clearly shows that Data Science is not merely a rebranding of Data Analysis or Data Engineering but in fact very much a field in its own right at the interface between Data Analysis and more IT-focused roles such as that of Data Engineers. In the rest of the book, he breaks this unique role down in more detail.
This point is further underlined when Voulgaris explicitly distinguishes between Data Scientists and Data Analysts, a distinction that many struggle with right now. A Data Analyst uses techniques that work well on data that could be more or less described as Big Data. However, many of these techniques are inefficient and lack the flexibility that characterises the techniques used by a Data Scientist. Data Analysts rely on a variety of available models to derive useful information from the data, and generate reports to present to colleagues on the business side of the organisation. Data Scientists often develop their own models or adopt an entirely data-driven approach in their analysis. In most cases this delivers something that can be used by many other people, not just the business people within their own organisation. A Data Scientist will, for example, develop an interactive dashboard that displays all the essential information in real time. Rather than delivering data products, a Data Analyst will deliver insights. They are more like a consultant. A Data Scientist has failed to do a good job if their solution is not embedded in the IT landscape in a scalable way. They are more like a builder. In a world like ours, where nothing is exclusively black or white, it is enlightening when someone has the courage to draw a clear line, and Voulgaris does just that. You can agree with it or disagree with it, but at least a clear boundary is drawn. That’s what I take from Voulgaris’s comparison, and it is a starting point I can really buy into, if you briefly overlook all the shades of grey in between.
The book does, however, consist of more than just a definition of Data Science and a comparison between Data Analysts and Data Scientists. First, a bit more about the author: who is Dr Zacharias Voulgaris? He originally trained as an Engineer, completed a PhD in Machine Learning and has worked as an academic researcher. Voulgaris has ‘only’ been a Data Scientist himself since 2013, writing this book not long after he had made his own journey of exploration into what Data Science actually is. In his book, he takes readers through some of the history of Data Science, describes the various different types of Data Scientists that he has identified, shows what it is that makes Data Scientists so unique, and has many tips and interviews that are highly valuable for anyone embarking on their own adventure into the field of Data Science. It is conveniently structured, with each chapter starting with a brief introduction and ending with a recap. Also valuable are the interviews with “real” Data Scientists in the later chapters and an extensive glossary of the hard-hitting Big Data terminology which Voulgaris includes at the back of the book. So, all in all, I recommend you read it! But if you want a sneak preview of its contents before you get started, or you just want a shortcut to the book’s key insights, have a look here first :).
I found Voulgaris’s book highly enjoyable and interesting to read. His clear writing, distinctions and plethora of tips and references to other interesting sources have refined, strengthened and expanded my understanding of Data Science. For those potentially venturing into the field of Data Science, Voulgaris has made the mythical status of the Data Scientist tangible and attainable. I want to end with the final sentences of the last paragraph in the book: “…And as big data technology continues to evolve, more and more interesting ways of making use of existing data will become available. The Data Scientist will continue to be an ever-fascinating role that will rely as much on creativity as it does on technical skills. By then, there will probably be university departments specializing in this field, and future Data Scientists will look back on the Data Scientist of this decade, the pioneers of the field, with great admiration.”
What great, inspiring words to finish with! We can confirm his point that universities, including those in the Netherlands, are indeed developing more and more specialisations in the field of Data Science. For more information about this, read our post about Data Scientist courses at educational institutions in the Netherlands. And would you like further insights into Dutch Data Scientists? If so, you should also be sure to read our posts based on our own research into Dutch Data Scientists, which allowed us to examine the typical Dutch Data Scientist on the basis of more than 1,000 Data Scientist profiles. We have also interviewed a number of leading Data Scientists from the Netherlands and you can expect more on this in the pipeline, so watch this space!