layout: lesson

Data Carpentry’s aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. This workshop provides a full overview of working with data in the social sciences. We want to provide you with multiple tools that can help you no matter what you are working on.

We will be starting with spreadsheets. Much of your time as a researcher will be spent in the initial 'data wrangling' stage, where you need to organize the data to perform a proper analysis later. It's not the most fun, but it is necessary. In this lesson you will learn how to think about data organization and some practices for more effective data wrangling. With this approach you can better format current data and plan new data collection so less data wrangling is needed.

Next, we will be working with OpenRefine. A part of the data workflow is preparing the data for analysis. Some of this involves data cleaning, where errors in the data are identified and corrected or formatting is made consistent. This step must be taken with the same care and attention to reproducibility as the analysis. OpenRefine (formerly Google Refine) is a powerful free and open-source tool for working with messy data: cleaning it and transforming it from one format into another. This lesson will teach you to use OpenRefine to effectively clean and format data and automatically track any changes that you make. Many people comment that this tool saves them literally months of work trying to make these edits by hand.

Finally, we will be working with R. This lesson will be an introduction to R designed for participants with no programming experience. R gives you the power to truly explore your data by making new variables, exploring summary statistics, and creating visualizations. We will move from basic syntax and data types up to more complex topics such as data manipulation and plotting. To close, we will discuss how to save your work as a reproducible report so that anyone (including you at a later date) can understand exactly what was done to get a set of results.

Getting Started

Data Carpentry's teaching is hands-on, so participants are encouraged to use their own computers to ensure the proper setup of tools for an efficient workflow.

These lessons assume no prior knowledge of the skills or tools.

To get started, follow the directions in the "Setup" tab to download data to your computer and follow any installation instructions.

Prerequisites

This lesson requires a spreadsheet program, OpenRefine, and a working copy of R and RStudio.
To use these materials most effectively, please install everything before working through this lesson. {: .prereq}