Data Cleaning and New Epidemiology Bloggers

David McCormick

The day-to-day routine of epidemiology is not always thrilling. Although fieldwork and data collection are fun, once you’ve collected the data you have to analyze it – and before you can analyze it, you have to clean it. Before I started my internship, data cleaning was a mystery to me, but after spending three weeks cleaning a rather large dataset I’ve realized that it’s mostly a tedious process: going through each individual observation (there can be anywhere from a few hundred to over ten thousand) and making sure that all of the variables make sense. However tedious it is, it’s one of the most important steps in the analysis. Computers are fundamentally stupid and only do (exactly) what you tell them. If that weren’t enough, they have a hard time figuring out what letters are, so a major part of data cleaning and entry is devising codes to convert your data into a series of numbers. It’s not exactly the most glamorous part of research, but it needs to be done before you can start asking the questions you set out to answer. The upside of data cleaning is that I spend most of my day looking at screens like the one below:

What I Spend Most of my Day Doing
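
For anyone curious what "devising codes" and "making sure the variables make sense" look like in practice, here is a minimal Python sketch. It is only an illustration, not the dataset or codebook from my internship: the variable names, the numeric codes (1, 2, 9), and the 0 to 120 age range are all made-up assumptions.

```python
# Hypothetical codebook: maps text answers to numeric codes.
SEX_CODES = {"male": 1, "female": 2}
MISSING = 9  # hypothetical code for a missing or unrecognized answer


def code_sex(raw):
    """Convert a free-text answer into a numeric code."""
    key = (raw or "").strip().lower()
    return SEX_CODES.get(key, MISSING)


def check_record(record):
    """Return a list of problems found in one observation."""
    problems = []
    if code_sex(record.get("sex")) == MISSING:
        problems.append("sex missing or unrecognized")
    age = record.get("age")
    # Assumed plausible range for age in years.
    if not isinstance(age, (int, float)) or not 0 <= age <= 120:
        problems.append("age out of range")
    return problems


# Made-up observations, including two with deliberate errors.
records = [
    {"id": 1, "sex": " Female ", "age": 34},
    {"id": 2, "sex": "M", "age": 29},
    {"id": 3, "sex": "male", "age": 340},
]

# Keep only the observations that need a second look.
flagged = {}
for r in records:
    problems = check_record(r)
    if problems:
        flagged[r["id"]] = problems

print(flagged)
```

The computer only knows the answers listed in the codebook, so "M" gets flagged even though a person would read it as "male". That is exactly the kind of thing you end up fixing by hand, one observation at a time.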

It appears that I’m rapidly approaching the end of the data cleaning and finishing up some of my analyses, and I’ll be off to Liwonde for the next 2-3 weeks to start collecting data from the district hospital there. Thankfully, I’ll be able to put the computer away for a few weeks and start to learn how the data that make up our datasets get collected.

In other news, a new cadre of epidemiology students has started blogging about their summer internships over at the blog for the Epidemiology Student Organization. This summer we have students working in fields as diverse as infection control and chronic asthma management. Some of the recent highlights:

Mike shares his thoughts on the inherent problems with interpreting data in any study and how the questionnaire design can influence a study’s outcome.

Stefanie talks about working in a state epidemiology department and the benefits of applying the knowledge that we’ve spent so long learning.

Laurel gives us an update on changes and regulations in the public health scene that may be coming soon to NYC, and Chelsea talks about why infection control matters.
