What do I think being a data scientist is about? Short answer: dealing with big data. It’s organizing data, and more specifically collecting, cleaning, and transforming massive datasets that statisticians typically wouldn’t touch. Data scientists go through the scientific process that we all know: make observations, ask a question, form a hypothesis, test the hypothesis, interpret the results. In my opinion, the main difference between a data scientist and a statistician is the amount of data. A statistician goes through the scientific process too; however, they have a stronger knowledge of the inner workings of the statistics behind the tests, which allows them to get accurate results from smaller datasets.

I think the main areas of knowledge for a data scientist are:

  1. strong data management/software skills, and
  2. a decent understanding of the statistical tests/procedures required for hypothesis testing on large samples.

I consider myself a statistician rather than a data scientist for a few reasons:

  • Most of my schooling focused solely on statistics, almost entirely ignoring data science.
  • My interests lie in biostatistics and clinical trials.
  • I find more joy in the statistical processes than in data cleaning and working with massive datasets.

Final note: I recognize the need for data scientists and am excited to learn more about the inner workings of the role.

