May 25, 2015
I first got a demo of Alteryx from George Mathew back in San Diego at TCC12. I was working for Facebook at the time, and Mike Roberts from InterWorks set up the meeting, but I didn’t immediately see a particularly good use case for Facebook. Why? Facebook Data Engineers have always coded their own pipelines (and probably always will).
The day before heading to Inspire, I was sitting with Robin Kennedy and told him that I wanted to get a head start on my training. Lo and behold, he showed me all of the fabulous training modules that are built right into Alteryx. I had no idea! I completed about 15 of these on the flight to Boston.
After watching Arsenal draw 1-1 in a drab affair Sunday morning, I headed to the first of three training sessions: Predictive Analytics for Beginners. In this class I learned how to apply different data investigation techniques to help me understand how predictive a data source may be. The instructor showed us how to use the Association Analysis, Violin Plot, and Field Summary tools.
The workflow that we created...
…resulted in this series of violin plots (apologies for the blurry image).
The regression analysis workflow we created...
…resulted in a series of tables and this chart (which shows that charting is not in the Alteryx sweet spot).
The third and final class I attended on Sunday was Intermediate Macro Development. This was a pretty simple class in which we built a workflow and a macro to strip the headings from a messy Excel spreadsheet.
The Information Lab team ALWAYS has fun!
Nice photobomb by the TIL team!
The quantified self work of Tim Ngwena of TIL was a keynote highlight!
In the end, Inspire15 was a fantastic experience for me, a new Alteryx user. I’ve already started applying what I’ve learned and am working on two blog posts. My only regret is that I didn’t start using Alteryx sooner.
May 23, 2015
- Track every purchase that I make
- Categorize each purchase by the type of goods
- Record the location of each place where I made a purchase
Precise times were taken from purchase receipts, along with the categorisations. I then recorded the location of each place with a Swarm check-in; the check-ins were uploaded to a Google Sheet via IFTTT. I downloaded both sets of data into Excel and joined them manually (there were only 19 records, so it wasn’t much effort).
I then explored the data in Tableau to see what stories I could find, if any. This week took me longer than I was expecting, mostly because I was having trouble finding anything interesting in the data. The one point that stuck out the most is that I spent more on ice cream than on Mother’s Day. Oops! Please don’t tell my mom.
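With more than a handful of records, the manual join could be scripted. Here is a minimal sketch in Python, assuming each purchase is matched to the check-in closest in time. The records, the field names, and the 15-minute window are all made up for illustration; they are not the data from this post.

```python
from datetime import datetime

# Made-up illustrative records, not the actual data from this post.
purchases = [
    {"time": "2015-05-10 09:05", "category": "Coffee", "amount": 2.50},
    {"time": "2015-05-10 13:40", "category": "Ice cream", "amount": 4.00},
]
checkins = [
    {"time": "2015-05-10 09:02", "venue": "Cafe (hypothetical)"},
    {"time": "2015-05-10 13:45", "venue": "Ice cream shop (hypothetical)"},
]

def parse(ts):
    """Turn a 'YYYY-MM-DD HH:MM' string into a datetime."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M")

def join_nearest(purchases, checkins, max_minutes=15):
    """Attach to each purchase the check-in closest in time.

    The venue is left as None when the closest check-in is more than
    max_minutes away (an assumed tolerance, adjust to taste).
    """
    joined = []
    for p in purchases:
        nearest = min(
            checkins,
            key=lambda c: abs(parse(c["time"]) - parse(p["time"])),
        )
        gap = abs(parse(nearest["time"]) - parse(p["time"])).total_seconds() / 60
        venue = nearest["venue"] if gap <= max_minutes else None
        joined.append({**p, "venue": venue})
    return joined

joined = join_nearest(purchases, checkins)
for row in joined:
    print(row["time"], row["category"], row["venue"])
```

For 19 records, doing it by hand is still perfectly reasonable; a script like this only pays off once the tracking runs for longer.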
Click on the image below to explore the story.
May 19, 2015
May 18, 2015
Quick makeover this week (we have a Segway tour of Boston at #Inspire15 in 30 minutes). I saw this graphic on the LA Times about the amount of water it takes to produce a single ounce of food.
It’s cute and it’s interactive, but it’s not very good for making comparisons or ranking. Bubble plots are notoriously difficult this way. For example, tell me quickly which food uses the 3rd most water? Tough to tell, right? I also don’t understand why they grouped fruits and vegetables together.
I manually recreated the data in Excel, which you can download here. Hopefully I recorded everything correctly; if not, please let me know. I then quickly built a chart in Tableau. I’ve addressed the issues that bubbles present, ranking and comparison, by using a bar chart instead.
Going back to the previous question, using my viz, which food uses the 3rd most water? Simple, right? How about the vegetable that uses the 10th most water? That’s simple too; all you need to do is click the color on the right.
May 13, 2015
The first example is very basic; I did this intentionally so that the steps would be super easy to follow. The second example is only moderately more complex; it looks at Tableau's SEC financial filings from 2011-2014.
May 11, 2015
They go on to do an analysis, but never really address the story the data is telling in this table. Clearly what this table is screaming out for is to show the difference between the two populations. I’ve been on a bit of a slope graph kick lately, so that’s what I’m using again this week. Why? Because I find slope graphs to be an excellent way to show variances between two data points. Click on the image below for the interactive version.
The slope graph clearly makes the differences stand out. One can easily see that there are fewer Protestants and Catholics in prison, and at the same time see that there are way more Muslims in prison. I then like to supplement the slope graph with a bar chart that shows only the differences.
There’s no clear evidence available as to why this is, but representing the data this way leads to more questions and more discussion. Any time you design a viz and it continues the conversation, you’ve probably done something right.
May 4, 2015
My biggest problem with this viz is that I have to turn my head sideways to read it. In addition:
- The length of the bars isn’t accurate. How can +4.5 be longer than -5.0?
- The bars are in reverse order - the biggest overachievers (Dallas) should be first.
- I have to do the math in my head to get to their predicted wins.
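The three fixes above can be sketched in a few lines of Python. This assumes the original chart lists each team's actual wins and its difference versus the prediction; the team names and numbers below are hypothetical, chosen only to include the +4.5 and -5.0 values mentioned above.

```python
# Hypothetical teams and numbers, for illustration only.
teams = [
    {"team": "Team A", "actual_wins": 50, "vs_predicted": 4.5},
    {"team": "Team B", "actual_wins": 38, "vs_predicted": -5.0},
    {"team": "Team C", "actual_wins": 44, "vs_predicted": 1.0},
]

def bar_length(diff, chars_per_win=2):
    """Bar length is proportional to the size of the difference."""
    return round(abs(diff) * chars_per_win)

for t in teams:
    # Do the math once so the reader doesn't have to:
    # predicted wins = actual wins minus the over/under-achievement.
    t["predicted_wins"] = t["actual_wins"] - t["vs_predicted"]

# Biggest overachievers first.
ranked = sorted(teams, key=lambda t: t["vs_predicted"], reverse=True)

for t in ranked:
    bar = "#" * bar_length(t["vs_predicted"])
    print(f'{t["team"]:<8}{t["vs_predicted"]:+5.1f}  '
          f'predicted {t["predicted_wins"]:.1f}  {bar}')
```

With lengths computed this way, the -5.0 bar can never come out shorter than the +4.5 bar.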
My first thought was to see what this viz looked like if I rotated it counterclockwise.
That definitely makes it more readable, but the story still doesn’t stand out. What the data is screaming for is to show the change and emphasize the winners and losers. To this end, along with accounting for the observations above, I created this interactive version in Tableau. Click on the image below to activate the interaction.
May 2, 2015
April 27, 2015
The global talent pool has never been larger, will grow to 2030, read http://t.co/aA9AyLEXta (pdf) #education #stats pic.twitter.com/aiIiTDbZt9
— OECD (@OECD) April 24, 2015
These pie charts are part of a larger study conducted by OECD, which you can read here. Some thoughts about these pie charts:
- The author is trying to show the change from 2013 to 2030. Using two pie charts makes this more difficult than necessary. At least, though, they kept the countries in the same order.
- The pies do not add up to 100%, I assume due to rounding. The 2013 pie adds up to 101% and the 2030 pie adds up to 102%.
- The focus is on the top 20 countries, so the “Other” category isn’t needed.
- The labels on the pies include both the country name and the value. A table would be better than this. Adding all of these labels makes the chart way too busy.
- There are two key metrics in the data: share of degrees and the number of degrees. The pie chart doesn’t provide enough context for understanding where the number of degrees will be coming from.
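The rounding point above is easy to reproduce. Here is a small Python sketch, using hypothetical counts rather than the OECD figures, showing how shares that sum to exactly 100% can add up to 101% once each label is rounded to a whole number.

```python
# Hypothetical counts chosen to show the effect; not the OECD figures.
counts = {"Country A": 336, "Country B": 336, "Country C": 328}
total = sum(counts.values())

# Exact share of the total for each country, in percent.
exact = {name: 100 * n / total for name, n in counts.items()}

# What the pie labels would show after rounding to whole percents.
rounded = {name: round(share) for name, share in exact.items()}

print(exact)                  # 33.6, 33.6 and 32.8
print(rounded)                # 34, 34 and 33
print(sum(rounded.values()))  # 101
```

Showing one decimal place, or dropping the labels in favor of a table, avoids the distraction.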
Given all of the above, I decided to create a slope graph.
- I included a parameter which allows you to select the metric.
- This option, along with using a slope graph, really helps show how dramatic the change is for China and India.
- Switch back and forth between the parameter options and you’ll see quite a different story.