The day-to-day routine of epidemiology is not always thrilling. Although fieldwork and data collection are fun, once you’ve collected the data you have to analyze it, and before you can analyze it, you have to clean it. Before I started my internship, data cleaning was a mystery to me, but after spending three weeks cleaning a rather large dataset I’ve realized that it’s mostly a tedious process: going through each individual observation (there can be anywhere from a few hundred to over ten thousand) and making sure that the value recorded for every variable makes sense. However tedious it is, it’s one of the most important steps in the analysis. Computers are fundamentally stupid and only do exactly what you tell them. As if that weren’t enough, they have a hard time making sense of text, so a major part of data cleaning and entry is devising codes to convert your data into a series of numbers. It’s not exactly the most glamorous part of research, but it needs to be done before you can start asking the questions you set out to answer. The upside of data cleaning is that I spend most of my day looking at screens like the one below:
It appears that I’m rapidly approaching the end of the data cleaning and finishing up some of my analyses, and I’ll be off to Liwonde for the next two to three weeks to start collecting data from the district hospital there. Thankfully, I’ll be able to put the computer away for a few weeks and start to learn how the data that make up our datasets get collected.
In other news, a new cadre of epidemiology students has started blogging about their summer internships over at the blog for the Epidemiology Student Organization. This summer we have students working in fields as diverse as infection control and chronic asthma management. Some of the recent highlights:
Mike shares his thoughts on the inherent problems with interpreting data in any study and how the questionnaire design can influence a study’s outcome.
Stefanie talks about working in a state epidemiology department and the benefits of applying the knowledge that we’ve spent so long learning.