Dumped on by data scientists

A story in The Chronicle of Higher Education reminded me that I’ve been meaning to write about “data science” for some time.

The headline to the story:

“Dumped On by Data: Scientists Say a Deluge Is Drowning Research”

Rather amusingly, this is abbreviated in the URL to “Dumped-On-by-Data-Scientists”, a nice example of how the same words, broken in the wrong place, can take on a completely different meaning.

Anyway, to the point. The term “data scientist” – a good thing, or not?
I’m throwing this one out there because I spent much of 2010 (a) reading articles that used the term and (b) trying to decide whether I like it or not – and I still can’t decide.

Arguments for:

  • It’s an attention-grabber, designed to make us think about the tools and skills required to analyse “big data” in the same way that “NoSQL” is designed to make us think about alternative database solutions

Arguments against:

  • The “data” part is redundant, since all scientists deal with data
  • It belittles the job title of “scientist”; the term might be construed as dismissive of the education, training and skills required to do “boring old school science” as opposed to “new, flashy sexy data science”
  • Many (most?) “data scientists” do business intelligence, not science; crunching Twitter posts to help formulate a better product marketing strategy is not the same as addressing a genuine scientific problem

At the heart of the issue, I feel, is a different approach to data. In “data science” we start with everything, give it a shake and see if answers to our questions fall out. In “real science” we start with a specific question, generate data designed to answer that question and see what falls out. Perhaps they are just different philosophies and mindsets. Perhaps each can learn from the other.

I guess with one “for” and three “against” I’ve decided that I don’t like the term “data scientist”, but I can’t quite shake the feeling that it has some use. What do you think?

8 thoughts on “Dumped on by data scientists”

  1. Deepak Singh

    I like the term because in science too we often need to generate data first, then interrogate it to see if we can come up with some hypotheses. At the time of interrogation, we should have a purpose. Of course, I do think that science can do with a little bit of a BI mindset as it gets more data-centric.

    Regardless of labels, the connotation to me is you have all sorts of data, from all sorts of sources. Now what are you going to do with it? Are you going to be random, or are you going to be systematic (i.e. scientific) about how you interrogate the data?

  2. Pingback: Tweets that mention Dumped on by data scientists | What You’re Doing Is Rather Desperate -- Topsy.com

  3. Harlan

    As someone with scientific training who uses those tools to solve business intelligence problems, I certainly struggle with a description of my role. “Data Scientist” is pretty good, as it correctly indicates that I use scientific techniques (controlled experiments, correct statistics) to understand our company’s data. I often describe myself as a “Statistician”, too, which gets across some of the same ideas without people having to do a double take and parse a new phrase. I also sometimes describe myself as doing “Operations Research” (aka “Management Science”, although I don’t use that term), since I use some of the tools of that field, as well as of Artificial Intelligence/Machine Learning, to optimize certain objective functions.

    I don’t know what the right answer is. It might depend on the precise person and their precise role. My title, for instance, is the result of a back-and-forth with my boss and HR, trying to find words that have both appropriate internal and external meanings.

    1. nsaunders Post author

      Thanks for this thoughtful comment; it explains nicely both the ambiguity I feel about the term and the reasons why it remains useful.

  4. Winter

    I think you made the argument for the term (or at least *a* term) by delineating the two approaches at the end of the post. Call those who start with the question and then seek out the data, “theoretical scientists” (or “hypothetical scientists”, if you want to be cute), and those who start with the data and look for answers to unknown questions… well, “data scientist” has caught on and is currently sexy, so why not?

  5. Pingback: Somethink to Chew On » “Data Scientist” and other titles

  6. Pingback: Data and a product mindset

  7. Pingback: Data Scientists Dealing with Data | WalterJessen.com
