Introducing pairedVis() – interactive scatterplot matrix

John Muschelli (@strictlystat) wins the prize for the first adapted d3 visualization! He liked this plot (which we’ll explain further) created by Mike Bostock (the creator of the d3.js library), so he decided to develop a version of it for our package. You can read about his process in his blog post. We touched up a few parts, and the visualization is now part of our package!

So what is this?
A scatterplot matrix is useful for getting a high-level view of a data set. R has the function pairs() which creates such a matrix plot. Let’s use the iris data set (built into R) that Mike used above as an example. pairs(iris) produces the following:

pairs() plot of iris dataset

If you clicked the link to the d3 version of this plot above, you’ll notice 1) a more colorful display and 2) the ability to select groups of points in one plot and have them be highlighted in all the plots. The colors correspond to the categorical variable “Species”.
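If you want a feel for the colored display without leaving base R, here is a rough static sketch. It only approximates the d3 plot (no brushing or linked highlighting), and the color choices and the names num_cols, species_palette and point_colors are our own picks for illustration, not anything taken from the package.

```r
# Keep only the numeric columns for the axes of the matrix
num_cols <- iris[, sapply(iris, is.numeric)]

# Hypothetical palette: one color per level of Species
species_palette <- c(setosa = "red", versicolor = "green3", virginica = "blue")

# Look up a color for every row by its Species label
point_colors <- species_palette[as.character(iris$Species)]

# Static scatterplot matrix with points colored by Species
pairs(num_cols, col = point_colors, pch = 19,
      main = "iris, colored by Species")
```

Swapping in a different factor column in the lookup step recolors the points by that variable, which is roughly what the drop-down does interactively in the d3 version.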

So if you have a data frame with both numeric and categorical columns, the d3 version is a neat alternative way to display it. We added support for multiple categorical variables (a drop-down lets you recolor the points) and a dynamic legend. Let’s add a fake categorical variable to iris and visualize:

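set.seed(1)  # optional: any seed makes the random labels reproducible
test_data <- iris
# Attach a made-up categorical column with four levels, drawn at random
test_data$content <- sample(c("High", "Med", "Low", "None"), nrow(test_data), replace = TRUE)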

We get this:

But what does this have to do with health?
Nothing directly, but that’s not an issue! If you read our FAQ, you’ll see that we are more than excited about others developing any type of d3 visualization they want to integrate with R through our package. Anything you think would be useful for many people and different data sets is eligible.

Cool! I want to use it
No problem. If you don’t have the healthvis package installed already, head over to the install page for instructions. If you have the package, you can update it easily (check the “Update” section on the install page).

This development story seems rigged…
Yes, it’s true that John is in the Hopkins Biostat department, and so are we, but we can assure you that he was not part of the initial package development and learned everything independently! The only real way to prove that the development process is not so bad is for someone completely unaffiliated to give it a try…consider the gauntlet thrown down!

