---
title: "ggplot2 exercises"
author: "Mark Dunning and Matt Eldridge"
date: '`r format(Sys.time(), "Last modified: %d %b %Y")`'
output: html_document
---
## Part I -- geoms and aesthetics
These first few exercises run through some of the basic principles of
creating a ggplot2 object, assigning aesthetic mappings and adding geoms.
1. Read in the cleaned patients dataset, `patient-data-cleaned.txt`, into a
new object called `patients`.
```{r}
```
### Scatterplots
2. Generate a scatterplot of BMI versus Weight using the patients dataset and add
a colour scale based on the Height variable.
```{r}
```
3. Using an additional geom, add a fitted line to the previous plot as an extra layer.
```{r}
```
4. Does the fit in the previous plot look good? Look at the help page for
`geom_smooth` and adjust the method to fit a straight line without standard
error bounds.
```{r}
```
### Boxplots and Violin plots
5. Generate a boxplot of Score values comparing smokers and non-smokers.
```{r}
```
6. Split the previous boxplot into male and female groups with different colours.
```{r}
```
7. Produce a similar boxplot of Score values, but this time group the data by Sex
and colour the interior of each box (not the outline) by Age. Then change this
plot to a violin plot.
**Note**: Because the data were loaded using `read_tsv`, the `Age` column has
type `dbl` (short for `double`, a `numeric` vector type) as it only contains
numbers. This makes it a **continuous** variable. In order to split the boxplot
by Age and colour each box according to Age, it is necessary to change `Age` to
be a **categorical** variable. We can do this by changing the `Age` column into a
different vector type: a `factor`.
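
As a minimal sketch of the conversion described in the note, the block below uses a small made-up data frame called `toy` (a hypothetical stand-in for `patients`) and base R's `factor()`. The column names mirror the exercise, but the values are invented.

```r
# Toy stand-in for the patients data; the values are invented for illustration.
toy <- data.frame(
  Sex   = c("Male", "Female", "Male", "Female"),
  Age   = c(42, 43, 44, 42),
  Score = c(1.2, 0.8, 1.5, 0.9)
)

# Age starts out numeric, so ggplot2 would treat it as continuous.
is.numeric(toy$Age)

# Convert Age to a factor so that each distinct age becomes its own category.
toy$Age <- factor(toy$Age)

is.factor(toy$Age)
levels(toy$Age)
```

The same assignment applied to `patients$Age` should work once the data have been read in. With dplyr loaded, `mutate(patients, Age = factor(Age))` is an equivalent alternative.
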
```{r}
```
### Histogram and Density plots
8. Generate a histogram of BMIs with each bar coloured blue, choosing a
suitable bin width.
```{r}
```
9. Instead of a histogram, generate a density plot of BMI.
```{r}
```
10. Generate density plots of BMIs coloured by Sex.
_Hint: alpha can be used to control transparency._
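
To illustrate the hint on unrelated data, here is a rough sketch using the built-in `iris` dataset rather than `patients`. The choice of `iris`, the columns plotted and the value `0.5` are assumptions made purely for illustration.

```r
library(ggplot2)

# Built-in iris data, used here only to illustrate the alpha hint.
# alpha is set outside aes() because it is a fixed value, not a mapping.
p <- ggplot(iris, aes(x = Sepal.Length, fill = Species)) +
  geom_density(alpha = 0.5)

p
```

Values of `alpha` closer to 0 make the filled areas more transparent, which helps when the densities overlap.
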
```{r}
```
## Part II -- facets
In this next part you will create plots with faceting. First check that the cleaned
patients dataset has been read in and is available as a data frame in your current
session. If you haven't done so, convert the Age variable to a factor.
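
One possible way to run these checks is sketched below. It uses a hypothetical stand-in object, `patients_demo`, with invented values so that the block runs on its own; in a real session you would substitute `patients`.

```r
# Hypothetical stand-in so the sketch runs on its own; in a real session
# the data frame would come from reading patient-data-cleaned.txt.
patients_demo <- data.frame(Age = c(42, 43, 44), BMI = c(24.1, 27.3, 22.8))

# Confirm the object exists and is a data frame.
exists("patients_demo")
is.data.frame(patients_demo)

# Convert Age only if it is not already a factor.
if (!is.factor(patients_demo$Age)) {
  patients_demo$Age <- factor(patients_demo$Age)
}

is.factor(patients_demo$Age)
```
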
11. Using the patients dataset, generate a scatterplot of BMI versus Weight, add a
colour scale to the scatterplot based on the Height variable, and split the plot
into a grid of plots separated by Smoking status and Sex.
```{r}
```
12. Generate a boxplot of BMIs comparing smokers and non-smokers, colour the boxes
by Sex, and include a separate facet for each age.
```{r}
```
13. Produce a similar boxplot of BMIs, but this time group the data by Sex, colour
by Age and facet by Smoking status.
```{r}
```
## Part III -- scales and themes
In these exercises we look at adjusting the scales and themes of our plots.
Check that the cleaned patients dataset has been read in and is available as a data
frame in your current session. Check also that the Age variable is a factor.
### Scales
14. Generate a scatterplot of BMI versus Weight from the patients dataset.
```{r}
```
15. Starting from the previous plot, adjust the BMI axis to show only labels for
20, 30 and 40, and the Weight axis to show breaks from 60 to 100 in steps of 5,
adding the units (kilograms) to the axis label.
```{r}
```
16. Create a violin plot of BMI by Age where violins are filled using a sequential
colour palette.
```{r}
```
17. Create a scatterplot of BMI versus Weight and add a continuous colour scale for
Height. Give the colour scale a grey midpoint (set at the mean Height) and
extremes of green (low) and red (high).
```{r}
```
### Themes
18. Recreate the scatterplot of BMI versus Weight, this time colouring by age group,
and add a straight-line fit (without standard error/confidence intervals) for each age group.
```{r}
```
19. Remove the legend title from the previous plot, change the background colours of
legend keys to white and place the legend at the bottom of the plot.
```{r}
```
20. Add a title to the plot and remove the minor grid lines.
Save the plot to a 7 by 7 inch image file.
```{r}
```