
The Confused Data Professionals

Data Scientist, Data Analyst and Data Engineer

Companies around the globe are not using these professionals in their respective roles

When my nephew and I discussed his education and career, he asked me about the Data Scientist, Data Analyst and Data Engineer roles, since he is going to major in Data Science in his undergraduate studies. To answer his question on these specific roles, I contacted a couple of my friends who have been working in this field for a long time. To my surprise, I learned that the industry does not distinguish these roles very much; instead it mixes their roles and responsibilities and confuses them.

In my discussions with them, I realized that most of them are not happy with what they are doing now.

There is a wide gap between aspiration and reality. The majority of data science aspirants end up in companies where huge amounts of data are available, but under the IT or software departments. The paradigms of these departments are not well suited to Data Scientists and Data Analysts in particular, although a Data Engineer can be well accommodated there. Company size and employee expertise also play a role in who does what, and not all companies have the luxury of drawing really solid lines between these three functions.

Nowadays the software engineering departments of many companies expect their engineers (please note that they are called engineers) to know almost everything in software engineering and to do everything that comes under that stream. Having been a software engineer for more than 20 years, I can understand that this is possible, because there is not much difference between what we do and what we preach, or between what we have learned and what we are still learning. A software engineer can become a VP of Product Engineering or a CTO, and even after reaching that level, most of them can roll up their sleeves to architect, write and deploy a software application if required.

Wrong Thoughts

For the love of everything sacred and holy in the profession, this should not be a dedicated or specialized role. There is nothing more soul-sucking than writing, maintaining, modifying, and supporting ETL to produce data that you yourself never get to use or consume.
Instead, give people end-to-end ownership of the work they produce (autonomy). In the case of data scientists, that means ownership of the ETL. It also means ownership of the analysis of the data and the outcome of the data science.

But in the case of these data professionals that is not possible. Data Engineers cannot step into the shoes of Data Scientists or Data Analysts, and vice versa. Even Data Analysts and Data Scientists cannot swap shoes. The main reasons are the differences in their job responsibilities and educational backgrounds. Let us look at the details.

Data Engineer vs Data Scientist vs Data Analyst

There is a significant overlap between data engineers and data scientists when it comes to skills and responsibilities.

The main difference is one of focus. Data Engineers are focused on building the infrastructure and architecture for data generation. In contrast, Data Scientists are focused on advanced mathematics and research on that generated data, and Data Analysts do statistical analysis to gather information from it.

Data Scientists are engaged in a constant interaction with the data infrastructure that is built and maintained by the data engineers, but they are not responsible for building and maintaining that infrastructure. Instead, they are internal clients, tasked with conducting high-level market and business operation research to identify trends and relations—things that require them to use a variety of sophisticated machines and methods to interact with and act upon data.

In contrast, data engineers work to support data scientists and analysts, providing infrastructure and tools that can be used to deliver end-to-end solutions to business problems. Data engineers build scalable, high performance infrastructure for delivering clear business insights from raw data sources; implement complex analytical projects with a focus on collecting, managing, analyzing, and visualizing data; and develop batch & real-time analytical solutions.

Simply put, data scientists depend on data engineers. Whereas data scientists tend to toil away in advanced analysis tools such as R, SPSS, Hadoop, and advanced statistical modeling, data engineers are focused on the products which support those tools. For example, a data engineer’s arsenal may include SQL, MySQL, NoSQL, Cassandra, and other data organization services.

But when we look at the roles and responsibilities of Data Analysts, we can see that they differ entirely from the roles above. Data Analysts usually gather, pre-process and maintain the data with the help of Data Engineers. Their mindset is to represent the data through reporting and visualization tools that business professionals can understand, to do statistical analysis and data interpretation that give the business more insight, and to optimize the efficiency and quality of the statistics.

Data Engineers need more software engineering skills: programming, databases, ETL and so on, so a software engineering course is perfectly suited for them. Data Scientists need programming, ML and some software skills, but mathematical and statistical skills are more important. They also need the mindset of a scientist: researching the data, experimenting on it, cracking it and purifying it. Data Analysts do not need software engineering skills; instead they need to know tools and applications that represent the data in statistical and historical ways, as well as the analytical skills to understand the data and explain it to the business, so communication skills are very important.

Simple Definitions

A Data Engineer is the facilitator of data, who collects it, pipelines it and stores it for the use of scientists.

A Data Scientist is one who does research on the data, looks for hidden information, figures out patterns and makes predictions that can help the company's business.

A Data Analyst is the person who converts the numbers into statistics, represents them in visual forms and then explains all the data points to the business.
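
To make the three definitions concrete, here is a toy sketch in Python. It is only an illustration under my own assumptions: the sample records, the function names and the straight-line forecast are all made up for this example, and real work in each role is far larger than a single function.

```python
from statistics import mean

# Hypothetical raw records, as they might arrive from a source system.
RAW_ROWS = [
    {"month": "1", "sales": "100"},
    {"month": "2", "sales": "120"},
    {"month": "3", "sales": None},  # incomplete record
    {"month": "4", "sales": "160"},
]


def engineer_load(raw_rows):
    """Data Engineer: collect the records, drop unusable ones, store them typed."""
    return [
        (int(row["month"]), float(row["sales"]))
        for row in raw_rows
        if row["sales"] is not None
    ]


def analyst_report(table):
    """Data Analyst: turn the numbers into statistics the business can read."""
    sales = [s for _, s in table]
    return {"months": len(sales), "average": mean(sales), "best": max(sales)}


def scientist_forecast(table, next_month):
    """Data Scientist: look for a pattern (here a straight-line trend) and predict."""
    xs = [m for m, _ in table]
    ys = [s for _, s in table]
    x_bar, y_bar = mean(xs), mean(ys)
    # Ordinary least-squares slope of sales against month.
    slope = sum((x - x_bar) * (y - y_bar) for x, y in table) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return y_bar + slope * (next_month - x_bar)


table = engineer_load(RAW_ROWS)
print(analyst_report(table))
print(round(scientist_forecast(table, 5), 1))
```

Notice that the analyst and the scientist both start from the table the engineer prepared, but each one asks a different question of it.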

So if you are a VP or a manager, never push your data professionals to mix and match each other's shoes, pants and hats. Remember that we are not running a costume party, but a business. Let them wear their own, so they will be happy, can produce good insights from your data quickly, and your business will sail smoothly.

