library(tidyverse)
library(kableExtra)
library(readxl)
library(here)

Lab 07: Midterm / review
Avengers, World Happiness
Goals
This lab will assess the following skills:
- Data wrangling
- Data visualization
- Importing data
- Clarity of code and written communication
- Ability to use data to investigate questions
Getting started
- Download lab-07.qmd, avengers.csv, and World Happiness 2023.xlsx from the course website
- Place the .qmd file in your STAT_4380 folder on your computer
- Place the .csv and .xlsx data files inside the data subfolder within STAT_4380
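As a quick sanity check on the file placement described above, the sketch below tests whether both data files are where the lab expects them. It assumes your working directory is the STAT_4380 project folder; the object names `data_files` and `files_found` are made up for illustration.

```r
# Paths are relative to the project root (assumed to be the STAT_4380 folder)
data_files <- c(
  "data/avengers.csv",
  "data/World Happiness 2023.xlsx"
)

# file.exists() returns TRUE for each path that points to an existing file
files_found <- file.exists(data_files)
names(files_found) <- data_files

files_found
```

If either entry is FALSE, the file is likely missing, misnamed, or outside the data subfolder.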
Packages
We will use the following packages for this lab: tidyverse, kableExtra, readxl, and here.
Data
You will analyze two different datasets in this lab:
- avengers (Exercises 1 - 5): data on 173 Marvel characters
- happiness (Exercises 6 - 9): data on World Happiness metrics for 165 countries from 2008 to 2023
Avengers Data
This data was originally collected for a FiveThirtyEight article. The version of the avengers data we will work with here can be found in the avengers.csv file on the course website. The code below will load the data (assuming it is placed appropriately inside a data subfolder of your R project).
avengers <- read_csv("data/avengers.csv")

This dataset includes information about characters across the entire Marvel Cinematic Universe (MCU), so some of the names will be familiar if you are a fan of the films or comics. Don’t worry if you aren’t a Marvel fan; no background knowledge is needed to successfully complete this lab!
We will focus on the following variables in this lab:
| Variable | Definition |
|---|---|
| name | The full name or alias of the character |
| appearances | The number of comic books that character appeared in as of April 30 |
| current | Is the member currently active on an Avengers-affiliated team? |
| gender | The recorded gender of the character |
| probationary_introl | The date the character was given probationary status as an Avenger. The value will be NA if the character was never given probationary status. |
| full_reserve | The month and year the character was introduced as a full or reserve member of the Avengers |
| year | The year the character was introduced as a full or reserve member of the Avengers |
| years_since_joining | 2015 minus the year |
| death1 | Yes if the Avenger died, No if not. |
| return1 | Yes if the Avenger returned from their first death, No if they did not, blank if not applicable |
See FiveThirtyEight’s GitHub repo for the full codebook.
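To make the definition of years_since_joining concrete, here is a minimal sketch that recomputes it from year on a toy data frame. The character names and years are invented for illustration and are not taken from avengers.csv.

```r
library(tidyverse)

# Toy data with made-up characters and joining years (not from avengers.csv)
toy_avengers <- tibble(
  name = c("Character A", "Character B", "Character C"),
  year = c(1963, 1988, 2012)
)

# years_since_joining is defined as 2015 minus the joining year
toy_avengers <- toy_avengers %>%
  mutate(years_since_joining = 2015 - year)

toy_avengers
```

The same mutate() pattern applies to any derived column that is a simple function of an existing one.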
Exercises
Hint: is.na() will be useful. Check: your new data frame should have 27 observations.
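As a reminder of how is.na() behaves, the sketch below uses a small toy data frame rather than the lab data. The column name `probation_date` and all values are hypothetical.

```r
library(tidyverse)

# Toy data (hypothetical values); NA marks a missing entry
toy <- tibble(
  name = c("A", "B", "C", "D"),
  probation_date = c("Mar-65", NA, NA, "Jun-91")
)

# is.na() returns TRUE wherever a value is missing
missing_flags <- is.na(toy$probation_date)

# Keep only the rows where the value is present
toy_present <- toy %>%
  filter(!is.na(probation_date))

# Keep only the rows where the value is missing
toy_missing <- toy %>%
  filter(is.na(probation_date))
```

Checking nrow() on the filtered result is an easy way to compare against an expected number of observations.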
World Happiness Data
The World Happiness Report is produced annually by the Gallup World Poll. According to their website,
“Since creating the World Poll in 2005, Gallup has conducted studies in more than 160 countries and territories that are home to more than 98% of the world’s adult population. The World Poll survey includes more than 100 global questions as well as region-specific items. Gallup asks residents from Australia to Zimbabwe the same questions, every time, in the same way. This makes it possible to trend data from year to year and make direct country comparisons.” - World Happiness Report
The World Happiness 2023.xlsx data file includes data from 2008 to 2023. The following code will read it in.

happiness <- read_excel("data/World Happiness 2023.xlsx")

| Variable | Description or Question(s) Asked |
|---|---|
| country | Country name (used only as an identifier). |
| year | 2008 – 2023. |
| happiness | Country average response to “Imagine a ladder, with steps numbered from 0 at the bottom to 10 at the top. The top of the ladder represents the best possible life for you and the bottom of the ladder represents the worst possible life for you. On which step of the ladder would you say you personally feel you stand at this time?” |
| logGDPpc | log₁₀ of Gross Domestic Product Per Capita (in 2011 US$). |
| social_support | Having someone to count on in times of trouble. Proportion of people in that country who responded “yes” to “If you were in trouble, do you have relatives or friends you can count on to help you whenever you need them, or not?” |
| life_exp | Life expectancy, in years, of a healthy child at birth. |
| freedom_choices | Proportion of people in that country who responded “satisfied” to “Are you satisfied or dissatisfied with your freedom to choose what you do with your life?” |
| freedom_choices_cat | A categorization of freedom_choices into “low”, “med”, or “high”. |
| generosity | Residual of national average of response to “Have you donated money to a charity in the past month?” on GDP per capita. |
| corruption | Average proportion of people responding “yes” to “Is corruption widespread throughout the government or not?” and “Is corruption widespread within businesses or not?”. |
| affect_pos | Average of three questions: “Did you experience happiness, laughter, and enjoyment during a lot of the day yesterday?” |
| affect_neg | Average of three questions: “Did you experience worry, sadness, and anger during a lot of the day yesterday?” |
Submission
Before submitting your .html (as a .zip file to Blackboard):
- Check your code for neatness - add spaces and line breaks where appropriate to improve readability
- Check visualizations for clean titles and labels
- Suppress extraneous messages/warnings (e.g. set #| warning: false, #| message: false inside code chunks)
- Ensure exercises are clearly labeled and your text responses are visually distinguished
- Confirm neat organization and readable structure
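For the message and warning suppression mentioned in the checklist, the chunk options go at the top of a code chunk as `#|` comment lines. The sketch below shows what the body of such a chunk might look like; the label `load-packages` is an arbitrary example name.

```r
#| label: load-packages
#| warning: false
#| message: false

# With the options above, startup messages and warnings from this chunk
# should not appear in the rendered .html
library(tidyverse)
```

The options apply only to the chunk they appear in, so repeat them in any other chunk that prints unwanted messages.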
Render one last time, check the .html file for accuracy, then compress it into a .zip file to upload to Blackboard.
Grading (100 pts)
| Component | Points |
|---|---|
| Ex 1 | 8 |
| Ex 2 | 12 |
| Ex 3 | 8 |
| Ex 4 | 6 |
| Ex 5 | 12 |
| Ex 6 | 12 |
| Ex 7 | 10 |
| Ex 8 | 8 |
| Ex 9 | 8 |
| Ex 10 | 8 |
| Ex 11 | 8 |
| BONUS | 2 |