library(tidyverse) # ggplot, lubridate, dplyr, stringr, readr...
library(praise)
Stack Overflow Annual Developer Survey 2024
Data
This week’s dataset is derived from the 2024 Stack Overflow Annual Developer Survey. Conducted in May 2024, the survey gathered responses from over 65,000 developers across seven key sections.
qname_levels_single_response_crosswalk <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-09-03/qname_levels_single_response_crosswalk.csv')
stackoverflow_survey_questions <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-09-03/stackoverflow_survey_questions.csv')
stackoverflow_survey_single_response <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-09-03/stackoverflow_survey_single_response.csv') |>
  mutate(ai_sent2 = case_when(
    ai_sent == 1 ~ "favorable",
    ai_sent == 2 ~ "indifferent",
    ai_sent == 3 ~ "unfavorable",
    ai_sent == 4 ~ "unsure",
    ai_sent == 5 ~ "very favorable",
    ai_sent == 6 ~ "very unfavorable"
  )) |>
  mutate(ai_sent2 = factor(ai_sent2,
                           levels = c("very unfavorable",
                                      "unfavorable",
                                      "unsure",
                                      "indifferent",
                                      "favorable",
                                      "very favorable")))
AI sentiment and number of years programming
How is AI sentiment distributed across the number of years an individual has spent programming?
stackoverflow_survey_single_response |>
  ggplot() +
  geom_bar(aes(fill = ai_sent2, x = years_code),
           position = "fill") +
  labs(fill = "AI sentiment",
       y = "",
       x = "number of years spent programming")
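With position = "fill", each bar is rescaled so the sentiment categories sum to 1 within a given number of years. As a rough sketch of what the plot is showing, the same within-group proportions can be computed as a table. The toy_survey data below is invented purely for illustration and is not taken from the survey.

```r
library(dplyr)

# Toy stand-in for the survey data; these rows are invented for illustration.
toy_survey <- tibble(
  years_code = c(1, 1, 1, 1, 10, 10),
  ai_sent2   = c("favorable", "favorable", "unsure", "unfavorable",
                 "favorable", "unfavorable")
)

# Count each (years_code, ai_sent2) pair, then divide by the total within
# each years_code value, mirroring what position = "fill" does per bar.
toy_props <- toy_survey |>
  count(years_code, ai_sent2) |>
  group_by(years_code) |>
  mutate(prop = n / sum(n)) |>
  ungroup()

toy_props
```

Applying the same count() and mutate() steps to stackoverflow_survey_single_response should give the heights of the coloured segments in the plot.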
Developers across the world
How much faith do developers across the world have in AI? We recode sentiment to a 1-5 scale (1 = very unfavorable, 5 = very favorable; "unsure" responses are treated as missing) and then average it per country. Thank you to @Sarah Penir for the useful code at https://sarahpenir.github.io/r/making-maps/.
survey_country <- stackoverflow_survey_single_response |>
  # recode sentiment so that larger numbers mean more favorable
  mutate(ai_sent = case_when(
    ai_sent == 1 ~ 4,
    ai_sent == 2 ~ 3,
    ai_sent == 3 ~ 2,
    ai_sent == 4 ~ NA_real_,  # "unsure" is treated as missing
    ai_sent == 5 ~ 5,
    ai_sent == 6 ~ 1
  )) |>
  rename(region = country) |>
  # rename countries to match the region names used by map_data("world")
  mutate(region = recode(region,
    "United States of America" = "USA",
    "United Kingdom of Great Britain and Northern Ireland" = "UK",
    "Republic of Korea" = "South Korea",
    "Democratic People's Republic of Korea" = "North Korea",
    "Congo, Republic of the..." = "Republic of Congo",
    "Russian Federation" = "Russia",
    "United Republic of Tanzania" = "Tanzania",
    "Côte d'Ivoire" = "Ivory Coast",
    "Venezuela, Bolivarian Republic of..." = "Venezuela"
  )) |>
  group_by(region) |>
  summarize(ave_sent = mean(ai_sent, na.rm = TRUE),
            n_devel = n())
world <- map_data("world")
full_world <- left_join(world, survey_country, by = "region")
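The recode step matters because left_join() only matches rows whose region names are identical in both tables. One way to spot names that still fail to match is an anti_join(). The sketch below uses two small made-up tables (toy_survey_regions and toy_map_regions, both hypothetical) in place of the real region lists.

```r
library(dplyr)

# Hypothetical region names; in practice these could come from
# distinct(survey_country, region) and distinct(map_data("world"), region).
toy_survey_regions <- tibble(region = c("USA", "UK", "Russian Federation"))
toy_map_regions    <- tibble(region = c("USA", "UK", "Russia"))

# Rows of the survey table with no matching region in the map table.
unmatched <- anti_join(toy_survey_regions, toy_map_regions, by = "region")

unmatched
```

Any region listed in unmatched would show up as a blank country on the map until it is added to the recode() call.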
full_world |>
  ggplot(aes(x = long, y = lat, group = group)) +
  geom_polygon(aes(fill = ave_sent)) +
  scale_fill_distiller(palette = "RdBu", direction = -1) +
  coord_fixed(1.3) +
  theme_void() +
  labs(fill = "average sentiment")
Note that in Mali, the average AI favorability is extremely high. One might think that the extreme average is due to having very few developers in Mali. Below is a map showing how many developers were surveyed in each country. Indeed, there were very few developers from Mali who filled out the survey (n=2).
full_world |>
  ggplot(aes(x = long, y = lat, group = group)) +
  geom_polygon(aes(fill = log(n_devel, 10))) +
  scale_fill_distiller(palette = "RdBu", direction = -1) +
  coord_fixed(1.3) +
  theme_void() +
  labs(fill = "number of developers\n log10 scale")
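Since averages based on a handful of respondents (such as Mali's n=2) can dominate the color scale, one option is to blank out countries below a minimum sample size before mapping. This is only a sketch: the min_n threshold of 30 is an arbitrary assumption, and the toy_country values other than Mali's n=2 are invented for illustration.

```r
library(dplyr)

# Toy stand-in for survey_country; apart from Mali's n = 2, the numbers
# are made up for illustration.
toy_country <- tibble(
  region   = c("Mali", "USA", "Germany"),
  ave_sent = c(5.0, 3.6, 3.4),
  n_devel  = c(2, 11000, 4900)
)

min_n <- 30  # hypothetical cutoff, not taken from the original analysis

# Keep every country in the table but set the average to NA when it is
# based on fewer than min_n respondents, so it appears unfilled on the map.
toy_filtered <- toy_country |>
  mutate(ave_sent = if_else(n_devel >= min_n, ave_sent, NA_real_))

toy_filtered
```

Applied to survey_country before the join, the same mutate() would leave small-sample countries grey in the sentiment map while keeping them in the respondent-count map.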
praise()
[1] "You are slick!"