As I noted in a previous post, APDA is in the middle of finalizing data for a new report. This will be a follow-up to the report released in August 2015. We hope to include data on graduates with no listed placements and on Carnegie Classifications, among other improvements. We aim to release the new report by April 15th, so that it can be useful to those who have applied to graduate programs this year. (Until then, editing on the site has been turned off so that we can verify and analyze the data. We will turn editing back on in May, when we launch a new feature that allows for individual editing.)

In preparation for that report, I have been trying to determine the best way of displaying our data. I am attaching four DRAFT images that present data for 104 universities using pie charts, one each for gender, AOS, job type, and graduation year. The gender and AOS charts use data from APDA alone, whereas the job type and graduation year charts also use graduation information from outside APDA, discussed in this post. I used pie charts because they are visually intuitive and I want the data to be as accessible as possible. I used suggestions from this post to help avoid some common criticisms of pie charts. (Note: I tend to analyze data in R, using ggplot2 for graphs, so that is the code I provide below for anyone with expertise in this area.) At the top left of each image are the data for the full set of 104 universities. (Universities are included only if we have both an external source of graduation data and placement records for that university with recorded graduation years in this time period.)

I am looking for feedback on these charts. Are these easy to understand? Are there alterations that would be beneficial? Two other options, with images below: 1) Replace pie charts with bar graphs (one sample version below). 2) Make university-specific sets of charts. (This is more time-intensive than 1.) 

Note also: We aim to release tables and regression analyses, as we did last time, and any images we release will be in addition to that work. Your input is welcome! 

(Click the images for full size version.)

 

Gender:

[Image: Gender]

AOS:

[Image: AOS]

Job Type:

[Image: Job]

Graduation Year:

[Image: Year]


Code for R:

For Gender:
library(ggplot2)

df <- read.csv("gender.csv", header = TRUE)
# One pie per university: a stacked bar of percentages wrapped into polar coordinates.
p <- ggplot(data = df, aes(x = factor(1), y = Percentage, fill = Gender)) +
  geom_bar(stat = "identity", width = 1) +
  facet_wrap(~ University) +
  coord_polar(theta = "y") +
  xlab("") + ylab("") +
  theme(axis.ticks = element_blank(), axis.text.y = element_blank(), axis.text.x = element_blank()) +
  ggtitle("APDA Records: Gender as Proportion of Graduates 2012-2015")

# Write the chart to a PNG file; print() is needed when the code runs as a script.
png(file = "gender.png", width = 2200, height = 2200, res = 140)
print(p)
dev.off()
 
For AOS:
df2 <- read.csv("AOS.csv", header = TRUE)
# Same layout as the gender chart, filled by first-listed AOS category.
p <- ggplot(data = df2, aes(x = factor(1), y = Percentage, fill = First.Listed.AOS.Category)) +
  geom_bar(stat = "identity", width = 1) +
  facet_wrap(~ University) +
  coord_polar(theta = "y") +
  xlab("") + ylab("") +
  theme(axis.ticks = element_blank(), axis.text.y = element_blank(), axis.text.x = element_blank()) +
  ggtitle("APDA Records: First-Listed AOS Category as Proportion of Graduates 2012-2015")

png(file = "AOS.png", width = 2200, height = 2200, res = 125)
print(p)
dev.off()
 
For Job Type:
df3 <- read.csv("job.csv", header = TRUE)
# Same layout as the gender chart, filled by placement type.
p <- ggplot(data = df3, aes(x = factor(1), y = Percentage, fill = Placement.Type)) +
  geom_bar(stat = "identity", width = 1) +
  facet_wrap(~ University) +
  coord_polar(theta = "y") +
  xlab("") + ylab("") +
  theme(axis.ticks = element_blank(), axis.text.y = element_blank(), axis.text.x = element_blank()) +
  ggtitle("APDA Records: Placement Type as Proportion of Graduates 2012-2015")

png(file = "job.png", width = 2200, height = 2200, res = 125)
print(p)
dev.off()
 
For Graduation Year:
df4 <- read.csv("year.csv", header = TRUE)
# Same layout as the gender chart, filled by graduation year.
p <- ggplot(data = df4, aes(x = factor(1), y = Percentage, fill = Year)) +
  geom_bar(stat = "identity", width = 1) +
  facet_wrap(~ University) +
  coord_polar(theta = "y") +
  xlab("") + ylab("") +
  theme(axis.ticks = element_blank(), axis.text.y = element_blank(), axis.text.x = element_blank()) +
  ggtitle("APDA Records and External Graduation Data: Graduation Year as Proportion of Graduates 2012-2015")

png(file = "year.png", width = 2200, height = 2200, res = 125)
print(p)
dev.off()
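
One possible refinement to the graduation year chart: read.csv() imports a column of years as integers, and ggplot2 then treats the fill as a continuous scale. The following is a minimal sketch of coercing Year to a factor and using a discrete viridis palette, which is designed to stay distinguishable for most colorblind readers. It assumes year.csv has the columns University, Year, and Percentage; the small data frame, the university names, and the percentages are made up so the example runs on its own, and scale_fill_viridis_d() assumes ggplot2 3.0.0 or later.

```r
library(ggplot2)

# Made-up stand-in for year.csv; names and percentages are illustrative only.
df_year <- data.frame(
  University = rep(c("University A", "University B"), each = 4),
  Year = rep(2012:2015, times = 2),
  Percentage = c(20, 30, 25, 25, 10, 40, 30, 20)
)

# A numeric Year is mapped to a continuous colour gradient;
# a factor gives one discrete colour per year instead.
df_year$Year <- factor(df_year$Year)

p_year <- ggplot(df_year, aes(x = factor(1), y = Percentage, fill = Year)) +
  geom_bar(stat = "identity", width = 1) +
  facet_wrap(~ University) +
  coord_polar(theta = "y") +
  scale_fill_viridis_d() +  # discrete viridis palette (assumes ggplot2 >= 3.0.0)
  xlab("") + ylab("") +
  theme(axis.ticks = element_blank(), axis.text = element_blank())
```

Converting the column in the data frame, rather than writing fill = factor(Year) inside aes(), keeps the legend title as "Year".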
 
Here is a sample set of bar graphs:

[Image: Job]
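
The bar-graph version needs only a small change to the pie-chart code. The following is a minimal sketch, not necessarily the code used for the sample image: it assumes the job.csv columns used in the pie-chart code (University, Placement.Type, Percentage), and the data frame, university names, and placement labels are made up so the example runs on its own.

```r
library(ggplot2)

# Made-up stand-in for job.csv; the university names, placement labels,
# and percentages are illustrative only.
df_bar <- data.frame(
  University = rep(c("University A", "University B"), each = 3),
  Placement.Type = rep(c("Permanent", "Temporary", "Unknown"), times = 2),
  Percentage = c(50, 30, 20, 40, 35, 25)
)

# Same aesthetics as the pie charts, but without coord_polar(), so each
# facet is a single stacked bar rather than a pie.
p_bar <- ggplot(df_bar, aes(x = factor(1), y = Percentage, fill = Placement.Type)) +
  geom_bar(stat = "identity", width = 1) +
  facet_wrap(~ University) +
  xlab("") + ylab("") +
  theme(axis.ticks.x = element_blank(), axis.text.x = element_blank()) +
  ggtitle("Placement Type as Proportion of Graduates (illustrative data)")
```

Keeping the y axis labels here (unlike in the pie charts) lets readers compare percentages across universities directly; the plot can be written to a file with png(), print(p_bar), and dev.off(), as in the code for the pie charts.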
Here are sample sets of university-specific graphs:

[Image: Test]

The R code for these:
library(ggplot2)
library(Rmisc)  # provides multiplot(), which is not part of ggplot2

# Fill colors shared by all four sets of charts, matched to the Category levels in order.
fill_colors <- c("#556B2F", "#006400", "#32CD32", "#DCDCDC", "#ADFF2F", "#000080", "#DCDCDC", "#0000FF", "#8B0000", "#FF0000", "#DCDCDC", "#4B0082", "#8B008B", "#BA55D3", "#FF00FF")

# Draw one pie per Data.Type for a single university.
university_pies <- function(df, title) {
  ggplot(data = df, aes(x = factor(1), y = Percentage, fill = Category)) +
    geom_bar(stat = "identity", width = 1) +
    facet_wrap(~ Data.Type) +
    coord_polar(theta = "y") +
    xlab("") + ylab("") +
    theme(axis.ticks = element_blank(), axis.text.y = element_blank(), axis.text.x = element_blank()) +
    ggtitle(title) +
    scale_fill_manual(values = fill_colors)
}

# Each plot reads test.csv, which presumably held a different university's data each time.
df4 <- read.csv("test.csv", header = TRUE)
p4 <- university_pies(df4, "Berkeley, n=18 (AOS, Gender), n=22 (Type, Year)")
df3 <- read.csv("test.csv", header = TRUE)
p3 <- university_pies(df3, "Baylor, n=12 (AOS, Gender), n=23 (Type, Year)")
df2 <- read.csv("test.csv", header = TRUE)
p2 <- university_pies(df2, "Arizona, n=20 (AOS, Gender), n=26 (Type, Year)")
df1 <- read.csv("test.csv", header = TRUE)
p1 <- university_pies(df1, "Arizona State, n=9 (AOS, Gender), n=10 (Type, Year)")

# Stack the four sets of charts in a single PNG.
png(file = "test.png", width = 1000, height = 2000, res = 100)
multiplot(p1, p2, p3, p4)
dev.off()

4 responses to “Visualizing Placement Data (Updated 4/6/16)”

  1. Dan Hicks

    Yay visualizations! Here are a few quick comments:
    1. ggplot2’s default colors are both ugly and problematic for many dichromats. scale_fill_brewer(palette = 'Set1') is much less ugly, but still not good for many dichromats. The dichromat and viridis packages have palettes that are good for most dichromats, though they aren’t as pretty as the Brewer palettes:
    https://stat545-ubc.github.io/block018_colors.html
    https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html
    2. In the job type plot, why is non-academic grouped with unknown?
    3. It looks like R is interpreting Year as continuous; consider coercing it to a factor in the plot.
    4. Pie chart radius could be used to show how many individuals are in each facet.
    5. Maybe you’ve already thought of this; aggregating and faceting at the university level is great for showing variation, but it’s hard to pick out any patterns. Scatterplots would be more useful for showing patterns.

  2. Carolyn Dicey Jennings

    Thanks, Dan! Yann Benétreau-Dupin had a similar worry to 1, and I found this site: http://colorbrewer2.org/. I am now planning to use that to manually set hex values that are colorblind safe. I will fix 2 (I thought the SQL query I had didn’t yield this info and was waiting for a new one, but it does!). On 5, how do you feel about ordered versions? Here is a scatterplot of the data for comparison–I take it that this will be off-putting for some, but perhaps I should include both for all 4 sets of data?

  3. recent grad

    Job type link is missing.

  4. Carolyn Dicey Jennings

    It looks like I lost the source file, but I have now linked to the larger image file that was uploaded into the blog. Thanks for pointing this out.
