The data this week comes from Data.Europa, hat tip to Data is Plural.
Wimdu wrote a short blog post on the most popular ERASMUS destinations.
library(tidyverse)
library(countrycode)

# Translate the two-letter country codes into country names.
# EL, UK and XK are not official ISO 3166-1 alpha-2 codes, so set those names by hand.
erasmus <- read_csv("erasmus.csv") %>%
  mutate(send = countrycode(sending_country_code,
                            origin = 'iso2c',
                            destination = 'country.name')) %>%
  mutate(send = case_when(
    sending_country_code == "EL" ~ "Greece",
    sending_country_code == "UK" ~ "UK",
    sending_country_code == "XK" ~ "Kosovo",
    TRUE ~ send
  )) %>%
  mutate(receive = countrycode(receiving_country_code,
                               origin = 'iso2c',
                               destination = 'country.name')) %>%
  mutate(receive = case_when(
    receiving_country_code == "EL" ~ "Greece",
    receiving_country_code == "UK" ~ "UK",
    receiving_country_code == "XK" ~ "Kosovo",
    TRUE ~ receive
  ))
mobility <- read_csv("mobility.csv")
erasmus_country <- erasmus %>%
  select(academic_year, participant_gender, send, receive,
         participants, special_needs) %>%
  pivot_longer(cols = send:receive, names_to = "how", values_to = "country") %>%
  group_by(country, how, special_needs, participant_gender, academic_year) %>%
  summarize(total = sum(participants))
I was hoping to do more with this plot (including ordering, segmenting the barplots, etc.). Alas, gotta be done for now!
erasmus_country %>%
  filter(total > 100) %>%
  mutate(total = ifelse(how == "send", -total, total)) %>%
  ggplot() +
  geom_bar(aes(x = country, y = total, fill = participant_gender), stat = "identity") +
  geom_hline(yintercept = 0) +
  coord_flip() +
  scale_fill_viridis_d()
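One way the ordering I mentioned could be approached is to turn `country` into a factor sorted by overall volume before plotting. This is only a rough sketch on made-up numbers: the `toy` data frame and its totals are hypothetical, and sorting by sent plus received is just one possible choice.

```r
library(dplyr)
library(ggplot2)

# Hypothetical totals, shaped like erasmus_country but much smaller.
toy <- tibble(
  country = c("Spain", "Spain", "Malta", "Malta", "Poland", "Poland"),
  how     = c("send", "receive", "send", "receive", "send", "receive"),
  total   = c(400, 900, 120, 150, 100, 90)
)

ordered <- toy %>%
  mutate(
    # Senders go left of zero, receivers go right.
    signed_total = if_else(how == "send", -total, total),
    # Sort countries by sent + received, using the unsigned totals.
    country = reorder(country, total, FUN = sum)
  )

p <- ggplot(ordered) +
  geom_col(aes(x = country, y = signed_total, fill = how)) +
  geom_hline(yintercept = 0) +
  coord_flip()
```

After `coord_flip()`, the first factor level sits at the bottom, so the busiest country ends up at the top of the chart.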
Some odd data characteristics… how can a participant's age be -7184 years? Or 1049 years?
erasmus %>%
  select(participant_age) %>%
  summary()
## participant_age
## Min. :-7184.00
## 1st Qu.: 17.00
## Median : 21.00
## Mean : 24.54
## 3rd Qu.: 28.00
## Max. : 1049.00
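A quick way to see how widespread these values are might be to blank out anything outside a plausible range and count what is left. This is a sketch on a toy data frame: the 10 to 100 bounds are my own assumption, not something documented with the data, and `age_clean` is a made-up column name.

```r
library(dplyr)

# Toy stand-in for the real data; the values echo the summary above.
toy <- tibble(participant_age = c(-7184, 17, 21, 28, 1049))

# Assumed plausible range of 10 to 100 years; anything outside becomes NA.
age_checked <- toy %>%
  mutate(age_clean = if_else(between(participant_age, 10, 100),
                             participant_age, NA_real_))

# Count how many rows were flagged as implausible.
age_checked %>%
  summarize(n_implausible = sum(is.na(age_clean)))
```

Swapping `toy` for `erasmus` would show whether the odd ages are a handful of typos or something more systematic.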
I’m also not totally sure what a single row represents. It seems to be one particular combination of demographic characteristics, with participants giving the number of people who share it. But are there really 17 participants with identical characteristics making up a single row?
erasmus %>%
  filter(special_needs == "Yes") %>%
  select(special_needs, participants) %>%
  table()
## participants
## special_needs 1 2 3 4 5 6 7 8 9 17
## Yes 1956 351 114 60 31 15 3 4 1 1
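One check that might help here is to compare the number of rows with the number of people, and to see whether the same demographic combination ever appears on more than one row. The sketch below uses a toy data frame with invented values; which columns together define "the same characteristics" is my assumption.

```r
library(dplyr)

# Invented rows using column names from the real data.
toy <- tibble(
  academic_year      = c("2019-2020", "2019-2020", "2019-2020"),
  participant_gender = c("Female", "Female", "Male"),
  participant_age    = c(21, 21, 23),
  special_needs      = c("No", "No", "Yes"),
  participants       = c(3, 1, 17)
)

# Rows versus people: if each row were one person these would match.
toy %>%
  summarize(rows = n(), people = sum(participants))

# Combinations that show up on more than one row.
dupes <- toy %>%
  count(academic_year, participant_gender, participant_age, special_needs,
        name = "n_rows") %>%
  filter(n_rows > 1)
```

If `dupes` comes back non-empty on the real data, a row is probably defined by more than these demographic columns alone.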