Channel: Actuate » Fred Sandsmark

Data Driven Digest for August 21


The second* data visualization we all learned in school was the Cartesian coordinate system. By plotting figures on a two-dimensional graph, we learned the relationship between numbers and space, unlocked patterns in those numbers, and established foundations for understanding algebra and geometry. The simple beauty of X-Y coordinates belies the power they hold; indeed, many of the best data visualizations created today rely on, and build upon, the Cartesian plane concept to show complex data sets. Here are three examples. (Note that none of these are textbook Cartesian visualizations, because the X and Y axes represent different units.)

[Image: U.S. college majors, income vs. gender ratio]

Back to School: Our favorite “Data Tinkerer,” Randy Olson, published a blog post this week exploring correlations between earnings, gender, and college major. Using data from the American Community Survey (and building on a FiveThirtyEight article by Ben Casselman), Olson created the graph above to show his findings. Then he generated a variety of graphs (one of which is below) that fit a linear regression onto the data and add bar charts along the graphs’ sides to show quantity along both axes. The results illuminate more aspects of the same data in a compact, efficient format.

[Image: college majors gender]
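For readers curious about what "fitting a linear regression" involves, here is a minimal sketch in Python using ordinary least squares. It is not Olson's code, and the numbers are made up purely for illustration; the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up points for illustration only (not from the American Community Survey).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```

The fitted line is the one that minimizes the sum of squared vertical distances between the points and the line, which is what a regression line drawn over a scatter plot usually represents.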


[Image: hack to scientific glory]

Statistically Significant: Scientists are sometimes accused of adjusting their experiments to yield the answers they want. This practice is called p-hacking (for p-value) and is explained in a fine FiveThirtyEight article by Christie Aschwanden, “Science Isn’t Broken – It’s just a hell of a lot harder than we give it credit for.” The article is accompanied by the endlessly fun interactive shown above; click through to play with it. As you add or subtract parameters, the data on the Cartesian plane and the linear regression of that data change before your eyes. If you can find a connection that yields a p-value of 0.05 or less, Aschwanden says, you have data that’s suitable for publishing in an academic journal. Click here for a great explanation of p-values.
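To make the idea of a p-value a little more concrete, here is a rough sketch in Python. It uses a permutation test rather than the regression-based calculation the FiveThirtyEight interactive presumably performs, so treat it as an illustration of the concept, not a reproduction of their method. The function names and the sample numbers are hypothetical.

```python
import random


def correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(xs, ys, trials=2000, seed=0):
    """Estimate how often shuffled data looks at least as correlated as the real data."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(correlation(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break any real link between x and y
        if abs(correlation(xs, shuffled)) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)


# Made-up numbers for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.2, 1.9, 3.4, 3.8, 5.1, 6.3, 6.8, 8.2]

p = permutation_p_value(xs, ys)
print("p-value estimate:", p)
```

A small p-value means that random shuffles rarely produce a relationship as strong as the one observed. P-hacking amounts to trying many combinations of variables until one of them happens to cross the 0.05 line.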


[Image: Klingebiel and Joseph innovation performance chart]

Business Time: At the Harvard Business Review, Ronald Klingebiel and John Joseph delved into whether it’s better to be a pioneer or a follower by studying a very specific slice of data: mobile-handset makers in the German market in the years 2004-2008. Their chart (above) plots many manufacturers along two axes: the number of features along the X axis, and the month of entry into the market along the Y axis. Klingebiel and Joseph then highlight two companies that succeeded (Samsung and Sagem) and two that didn’t (HP and Motorola). The authors’ hypothesis was that a handset manufacturer was more likely to succeed if it came to market early with lots of features, or if it arrived later with fewer, better-focused features. The chart, while very good, would benefit from interactivity; I’d like to hover over any dot to get the company name, and click any dot to get details of how that company performed. Without this context, I must rely on the authors’ definition of success.

* The number line is the first data visualization I recall using. What was your experience? I’d love to hear about the first data visualization you remember. Please leave a message in the comments.


Like what you see? Every Friday we share great data visualizations and embedded analytics. If you have a favorite or trending example, tell us: Submit ideas to blogactuate@actuate.com or add a comment below. Subscribe (at left) and we’ll email you when new entries are posted.

Recent Data Driven Digests:

August 14: Viewing over oceans, seeing under the sea, mapping rivers’ width

August 7: Twitter sentiment, dataviz tweet chat, Twitter transparency

July 31: English words, machine learning, data science degrees

