Wednesday, March 31, 2010
Monday, March 29, 2010
So I was chatting with a friend of mine on Skype last night and I was
going into some detail about my affinity for pie charts.
He simply responded:
"Pareto and histogram. Everything else is bullshit"
Dan's visual response:
Can’t put a histogram on a map! (at least not in Tableau)
First, histograms are only good for certain situations (a lot of data points, when you are trying to see the overall distribution, etc.). But since the data is put into categories, it is easily manipulated based on which categories you pick (like ignoring the differences between months when grouping data into a full year). Pareto charts have a lot of the same problems.
Is your friend in QA? He's probably following the 7 useful charts that exist according to QA Wikipedia:
As I've said before, pie charts are very effective for certain situations. They're for a general estimate of ratios, not exact percentages (that's why I always label them!). I usually only like a few slices max, because a pie chart starts to get misleading when you are comparing similarly sized pieces (since it's not obvious which is bigger). But showing that this group is 10% of the whole, this other group is 20% of the whole, and 70% other? Well, that's what a pie chart is made for!!
Tell your friend to put THAT in a histogram!!!!!
And, with final thoughts, Dan:
Here are a couple of articles in defense of pie charts if you ever need ammo to support your love of pie charts:
“I don't accept the information design dogma that pie charts should never be used. Pie charts have weaknesses but they also have many strengths. Put them back in your bag of tools and pull them out when appropriate.”
Sunday, March 28, 2010
Saturday, March 27, 2010
Text Analysis on Election '08 stump speeches
by Raif Majeed on February 4, 2008
If you've seen the news, you'll know that there are lots of words flying around nowadays -- political speeches, debates, ads, etc. If you're trying to understand the words and decide how to vote, it can be overwhelming. However, if you look at the words as data, you can suddenly get interesting new insights.
Here's a packaged workbook that shows a text analysis of recent stump speeches by the four major remaining presidential candidates (Hillary Clinton, John McCain, Barack Obama, and Mitt Romney). To give you a flavor of the analysis, the workbook shows the most common 2-word phrases uttered in each candidate's speech:
You can adjust the quick filter under the dashboard to limit yourself to phrases of a certain length in characters (the space between the words counts as one character). I want to keep this post politically neutral, so I'll let you dig in with Tableau (or the free Tableau Reader) and make your own discoveries. I'm sure you'll be surprised by some of the results, as I was.
The speeches were pulled from candidate websites; each was in a different forum -- for instance, Hillary Clinton was speaking in a church and Mitt Romney was speaking to auto workers in Michigan, which accounts for some of the unusual phrases you see.
To get the texts into a form that Tableau could understand, I used a quick Perl script to eliminate non-word characters (except whitespace, apostrophes, and hyphens), then split the text on whitespace and output the result as a CSV with one word per row. To get 2-word analysis, I left-joined the resulting CSV against itself, with an offset-by-one ON condition ("[current].[Position]+1=[next].[Position]", where [current] and [next] are table aliases). I used context filters and dashboards liberally to generate what you see.
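For readers who want to try the same preparation outside of Tableau, here is a rough sketch of the two steps described above, written in Python rather than the original Perl (the original script is not shown, so this is a guess at its behavior, not a copy). The function names, the "Position" and "Word" column names in the CSV, and the sample sentence are all assumptions made for illustration.

```python
import csv
import io
import re
from collections import Counter


def tokenize(text):
    """Drop non-word characters (keeping whitespace, apostrophes, and
    hyphens), then split on whitespace."""
    cleaned = re.sub(r"[^\w\s'-]", "", text)
    return cleaned.split()


def to_csv(words):
    """Write one row per word. The column names are assumed, not taken
    from the original workbook."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Position", "Word"])
    for position, word in enumerate(words, start=1):
        writer.writerow([position, word])
    return buffer.getvalue()


def self_join(words):
    """Mimic the left join of the word table against itself
    ON current.Position + 1 = next.Position.
    The last word has no match, so its partner is None (a NULL)."""
    by_position = dict(enumerate(words, start=1))
    pairs = []
    for position, word in by_position.items():
        pairs.append((word, by_position.get(position + 1)))
    return pairs


def phrase_counts(words):
    """Count 2-word phrases, skipping the unmatched last row."""
    return Counter(
        f"{current} {following}"
        for current, following in self_join(words)
        if following is not None
    )


if __name__ == "__main__":
    sample = "Yes, we can. Yes, we can!"  # made-up sample text
    words = tokenize(sample)
    print(to_csv(words))
    print(phrase_counts(words).most_common())
```

In Tableau the join itself happens in the data connection; the dictionary lookup here only stands in for it so the pairing logic is easy to see.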
I encourage you to play around with the workbook in Tableau and see what patterns you can find. Enjoy!