Data Viz Done Right

May 27, 2016

Fix it Friday: Early Leavers from Education and Training in Europe

On the train to work this morning I was reading through the blogs I follow and ran across this amazing visualisation from Stephanie Evergreen:

I love small multiples and I love slope charts, and this is an amazing combination of the two. Shortly thereafter, I ran across this chart from the Financial Times:

To me, this chart is screaming out for a slope chart. Also, I don’t understand why they didn’t include all countries in Europe. I downloaded the data from Eurostat and created this small multiples slope chart in Tableau.

I also was able to include an option that allows you to pick a gender or the overall. Notice how the title changes color to match the lines in the slope graph. Do you know how I did that?

Which one do you think tells the story better? Does the bar chart or the slope chart make comparing the years easier?

May 24, 2016

Tableau Tip Tuesday: Five Use Cases for Strip Plots

In last week’s Makeover Monday about global warming I included a strip plot at the bottom of my final visualisation. You may hear these also called barcode charts or frequency charts, but whatever their “official” name, they are very useful for:

  1. Seeing a lot of data at a glance
  2. Understanding concentration of the data
  3. Seasonal trends

In this week’s video tip, I walk you through five use cases for strip plots varying from global warming to the frequency of fires to distribution of sales to deprivation in Scotland.

May 23, 2016

Makeover Monday: The Militarization of the Middle East in a Post-9/11 World

After an epic week 20 for Makeover Monday, I had great expectations for this week. Another great data set, this time looking at global arms imports and exports. But dang it was tough! I really struggled this week making something I was happy with. In the end, time is up and I learned a lot.

Let’s start by looking back at the original visualisation.

What works well?

  • The colors clearly distinguish imports and exports.
  • The labels provide the needed context.
  • Nice small line charts for Europe and the Middle East along with an indicator for the rate of change.

What could be improved?

  • The title of the article doesn’t match the chart.
  • It’s hard to compare countries.
  • Why were the countries that are shown selected? Are they the top N?
  • Why is the timeframe 2011-2015? That seems a bit arbitrary.
  • Why are there only sparklines for Europe and the Middle East?
  • The lower section with the flags has nothing to do with the map.
  • In the lower section, why don’t UAE and China show awaiting delivery? It should be consistent.
  • Is there a better story that can be told? The data goes back to 1950 after all.

I decided to focus on the title of the article: “The Militarization of the Middle East”. And I focused even further by looking at the post-9/11 era from 2002-2015. America initiated a war with the Middle East. I wanted to know how that impacted the import of arms to the region.

Once again this was a week of iterations. I started with this small multiple map view, but didn’t think it showed the change through the years very well.

Click the image for the interactive version

I then looked at a slope graph comparing the % of total arms imported in the region by country in 2002 compared to 2015. This definitely shows the rate of change better, but I lose the context of the years in between.
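For readers who think in code, the calculation behind a slope graph like this can be sketched in a few lines of Python. This is only an illustration: the function name `share_of_total`, the country names, and the numbers are all made up, and it is not how the Tableau workbook was built.

```python
def share_of_total(imports_by_country):
    """Return each country's share of the regional total as a percentage."""
    total = sum(imports_by_country.values())
    if total == 0:
        return {country: 0.0 for country in imports_by_country}
    return {country: 100.0 * value / total
            for country, value in imports_by_country.items()}

# Made-up values purely for illustration (not taken from the real data set).
imports_2002 = {"Country A": 300, "Country B": 100, "Country C": 100}
imports_2015 = {"Country A": 200, "Country B": 500, "Country C": 300}

start = share_of_total(imports_2002)
end = share_of_total(imports_2015)

# Each slope line joins a country's 2002 share to its 2015 share;
# the third element is the change in percentage points.
slopes = {country: (start[country], end[country], end[country] - start[country])
          for country in start}
```

Only the two end years are used, which is exactly why the years in between drop out of the picture.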

Click the image for the interactive version

Maybe a DNA chart will work better than the slope graph? Not really, it just flattens it out.

Click the image for the interactive version

I was getting frustrated by this point, so I decided to take the opportunity to learn a new technique. I read Matt Chambers’ blog post recently on how to build ranked bump charts and thought this would make a great use case for this type of chart. In this view, I can see how a country moves year by year in the ranking of arms imported into the Middle East. I really like being able to click on a country and see it highlighted.

What the bump chart loses, though, is the context of the overall value of the arms imported. So to take care of that, I included the sparkline, which also updates when you click or tap on a country. In the end, I’m satisfied and I learned something new. That’s a bit of what Makeover Monday is about.
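The idea behind a bump chart is simply a rank computed separately for each year. Here is a minimal Python sketch of that idea, assuming a hypothetical `{year: {country: amount}}` structure with invented values; ties are broken alphabetically, which is an arbitrary choice for the example.

```python
def rank_by_year(values):
    """Map {year: {country: amount}} to {year: {country: rank}}; rank 1 is the largest."""
    ranks = {}
    for year, by_country in values.items():
        # Sort by amount descending, then by name so ties are deterministic.
        ordered = sorted(by_country, key=lambda c: (-by_country[c], c))
        ranks[year] = {country: position + 1
                       for position, country in enumerate(ordered)}
    return ranks

# Invented numbers, only to show the shape of the result.
imports = {
    2002: {"Country A": 300, "Country B": 100, "Country C": 200},
    2003: {"Country A": 150, "Country B": 400, "Country C": 200},
}
ranks = rank_by_year(imports)
```

The ranks throw away the amounts themselves, which is the lost context a sparkline of the raw values puts back.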

May 19, 2016

Rain Patterns at Mount Diablo: What 60 years of rain data tells us about the Northern California drought

I’ve been getting deep into Alberto Cairo’s latest book “The Truthful Art” and was particularly fascinated by the Rain Patterns in Hong Kong visualisation created by The South China Morning Post.

Immediately I began to think back to our time living in Northern California and the historic drought conditions. I decided to use Mount Diablo as a representative weather station because it was one of the most complete and oldest in the Bay Area.

I decided to use this visualisation as inspiration for a version of my own. Some interesting patterns reveal themselves:

  • It hardly ever rains between the end of May and early October.
  • The highest single-day rain total in the last 60 years was 5 inches on 21-Jan-1967.
  • Out of the 21,404 days in the data set, only 3,892 had any measurable rain (18.2%).
  • We lived in Pleasanton for 1,070 days. During that time, there were only 145 days of measurable rain (13.6%), totalling 57.3 inches (less than 1/2 inch per day that it rained).
  • Over 60 years, there’s an average of 66 days of rain per year.
  • During our time living there, we saw an average of 50 days of rain per year.
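The percentages and per-year averages in the list above can be reproduced from the counts quoted there. The sketch below assumes a year is 365.25 days long when converting day counts into years; that conversion is an assumption made for this example, not something stated in the workbook.

```python
DAYS_PER_YEAR = 365.25  # assumption: average year length, leap years included

# Counts quoted in the list above.
total_days, rain_days = 21_404, 3_892
pleasanton_days, pleasanton_rain_days, pleasanton_inches = 1_070, 145, 57.3

def pct(part, whole):
    """Share of `whole` represented by `part`, as a percentage."""
    return 100 * part / whole

def per_year(count, days):
    """Average count per year over a span of `days` days."""
    return count / (days / DAYS_PER_YEAR)

overall_pct = pct(rain_days, total_days)
pleasanton_pct = pct(pleasanton_rain_days, pleasanton_days)
inches_per_rain_day = pleasanton_inches / pleasanton_rain_days
overall_rain_days_per_year = per_year(rain_days, total_days)
pleasanton_rain_days_per_year = per_year(pleasanton_rain_days, pleasanton_days)
```

With that assumption, the Pleasanton average lands just under 50 rainy days per year, in line with the rounded figure above.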

My version was made with Tableau 10, so I can’t publish it to Tableau Public yet. You can download the workbook here and the data set I used here. The data was sourced from NOAA.

Finally, a special thanks to Data Schooler Nisa Mara for her feedback during the process.

12 Books Every Great Data Analyst Should Read

I had the pleasure of speaking today to the Virtual Government Tableau User Group. I mentioned that great data analysts have an appetite for learning and included a list of books that every great analyst should read. Here are 12 books that I highly recommend (obviously this isn’t an exhaustive list):

May 18, 2016

The Data School Gym - Marimekko Alternative


Former Data Schooler Nicco Cirone posted a challenge to the team at The Information Lab on Monday to create this alternative to a Marimekko chart.

Click the image for a larger version

Nicco got the idea from this post by Jon Peltier about the problems with Marimekko charts. Give it a shot. It’s a bit tricky and one you will surely learn something from. There are several very subtle tricks in here. Try to match mine pixel for pixel.

If you want to use the same version of Superstore Sales that I used, you can download it here. If you think you have it and want to check to see if you got it right, you can view the final solution on my Tableau Public profile here.

Good luck!

May 17, 2016

Tableau Tip Tuesday: How to Create Monthly Radar Charts


In this week’s Tableau Tip Tuesday, I show you how to create radar charts that are based on monthly data. That is, the months go around an imaginary circle like a clock. Thanks to Jonathan Trajkovic for this great explanation and to Ed Hawkins for the inspiration.
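The geometry of "months around a clock" comes down to converting a month and a value into x/y coordinates. The sketch below is one way to do it in Python, assuming January sits at 12 o'clock and months run clockwise; the function name `month_to_xy` and that orientation are choices made for this example and may differ from the video.

```python
import math

def month_to_xy(month, radius):
    """Place a month (1-12) on a circle: January at 12 o'clock, moving clockwise."""
    # 360 degrees / 12 months = 30 degrees per month. Starting at 90 degrees
    # (straight up) and subtracting makes the months run clockwise.
    angle = math.radians(90 - (month - 1) * 30)
    return radius * math.cos(angle), radius * math.sin(angle)

# One point per month at a constant radius traces a circle.
circle = [month_to_xy(month, 1.0) for month in range(1, 13)]
```

In a radar chart the radius would be the measure being plotted (for example the temperature difference for that month), so each year becomes one closed loop.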

May 15, 2016

Makeover Monday: How warm is Earth becoming?


There was a lot of chatter on Twitter last week about this terrific visualisation by Ed Hawkins:

The beauty of this visualisation is in the animation. However, without the animation, it kind of fails to tell the story. Let’s dig a bit deeper.

What works well?

  • There is a clear title.
  • The background circles provide helpful context.
  • Including the month labels makes it easier to understand what you’re seeing.
  • The year in the middle helps tell the story.
  • The animation is compelling.
  • It has a nice color scheme that works well on a black background.

What could be improved?

  • While the title is clear, it could be more eye-catching, like a news headline.
  • If you see this as a static image, you lose the sense of change.
  • You can’t compare any time periods. All you know is 2016 is the warmest.
  • There’s no explanation about what the numbers represent. Though I do see in the Twitter post a link to additional information.
  • The color scale has nothing to do with the temperature change, which I assumed it did until I read the additional information. The colors actually represent the years. That doesn’t add much value. I think coloring by the temperature change would be more impactful.

So, this data set is actually incredibly simple. All we have is one record per month, the temperature, and the confidence intervals.

The first thing I wanted to do was rebuild the radial chart. This wasn’t nearly as easy as I thought. This post by Jonathan Trajkovic was very helpful, but it wasn’t designed for months. I’ll record how I made it for a future Tableau Tip Tuesday.

Click the image for the interactive version

This radial chart is basically the same as the original. However, I can’t make it “play” on Tableau Public, and I also changed the color to be the median temperature difference. Really, I only built this to see if I could. It’s not any more useful than the original.

Next, I took the radial chart and flattened it out.

Click the image for the interactive version

This doesn’t make the understanding all that much easier because I can’t tell which years are which. Maybe I should switch the color legend back to years?

Click the image for the interactive version

Oh wow! What a difference! Now I can easily see the distinction between the older and more recent years. I think this is much, much better than the original, especially in static format. I wanted to keep iterating though.

Whenever I’m working with time-based data, I like to build either calendar heatmaps or heatmaps by year and month. Here’s what this data set looks like as a heatmap:

Click the image for the interactive version

The heatmap makes the series of lines even easier to understand. It’s super easy to see the gradual temperature change over time. This is pretty compelling, yet I wanted to keep going. Was there a better way to tell the story?

Next I looked at the 10-year average, that is, a 120-month moving average of the median temperature change. I then overlaid the confidence intervals.
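A 120-month moving average is just the mean of the current month and the 119 months before it. Here is a minimal Python sketch of a trailing moving average; the function name is made up, and leaving the first months blank until a full window is available is a choice made for this example rather than a description of the workbook.

```python
def trailing_moving_average(values, window=120):
    """Mean of each value and the window-1 values before it; None until a full window exists."""
    result = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just slid out of the window.
            running -= values[i - window]
        result.append(running / window if i >= window - 1 else None)
    return result

# Small window so the behaviour is easy to follow by hand.
smoothed = trailing_moving_average([1, 2, 3, 4, 5], window=3)
```

With monthly data and `window=120`, the first value appears ten years into the series, and each later point summarises the decade ending in that month.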

Click the image for the interactive version

Lastly, I took the 10-year moving average view and replaced the monthly confidence intervals with the monthly values while keeping the overall 10-year average. This is my submission for Makeover Monday. In this view, I like how I can see the drastic monthly fluctuations but still have the overall context. Including a reference line at zero helps emphasize the dramatic change since about 1984.

I also included a strip plot under the graph that shows the average median temperature difference for the entire year. This brings back a bit of the heatmap view above.

In the end, another fun week with a simple data set that provides lots and lots of options. Which one do you like best?

UPDATE: This week has been a fascinating exercise in iterating. That’s the beauty of Tableau. I can get another idea and build it quickly. After seeing some of the submissions for this week, I thought a jitter plot might work well. Thoughts?

Click the image for the interactive version

May 12, 2016

The Data School Gym - Timeline Pareto Chart


Another Data School Gym challenge for you. Today Jonathan MacDonald reached out to me with a question. I’m glad he reached out because I’m constantly asking him questions. He asked:

How can we create a timeline Pareto chart? That is, we need to calculate the cumulative sales for a dimension from its first sale until its most recent, but the time has to be expressed as a % of cumulative days.

So, here’s the challenge for you. Create this chart below. It’s based on Superstore Sales and there are no big tricks. Just see if you can do it. Some hints:

  1. You will need an LOD calc for the x-axis.
  2. Include a single select parameter for the dimensions. Each line on the chart represents one element of the selected dimension.
  3. Include an option to highlight a specific element within the dimension selected.
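Without giving away the Tableau solution, the underlying arithmetic can be sketched in Python. The example assumes one list of `(date, amount)` pairs per element of the dimension and plots cumulative sales as a share of that element's total; both the data shape and the choice to normalise the y-axis are assumptions made for this sketch.

```python
from datetime import date

def timeline_pareto(sales):
    """Turn [(date, amount), ...] for one element into [(pct_of_days, pct_of_sales), ...]."""
    if not sales:
        return []
    ordered = sorted(sales)  # sorts by date first
    first, last = ordered[0][0], ordered[-1][0]
    span = (last - first).days or 1   # avoid dividing by zero for a one-day history
    total = sum(amount for _, amount in ordered) or 1
    points, running = [], 0.0
    for day, amount in ordered:
        running += amount
        # x: share of the element's own first-to-last-sale window that has elapsed
        # y: share of the element's total sales reached so far
        points.append(((day - first).days / span, running / total))
    return points

# Invented orders for a single element of the dimension.
example = timeline_pareto([
    (date(2016, 1, 1), 100),
    (date(2016, 1, 11), 300),
    (date(2016, 1, 6), 100),
])
```

Because every element runs from 0% to 100% of its own selling window, lines for elements with very different first-sale dates can share one axis.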

I’ve disabled the download option on the workbook so you can’t cheat. Tweet me when you think you’ve figured it out.

May 10, 2016

Tableau Tip Tuesday: How to Create Directional Lollipops


In this week’s tip, I show you how to create directional lollipops, an alternative view to time series jittering. In the video, I look at the incredible season by Stephen Curry and his shot results minute-by-minute, game-by-game.


May 8, 2016

Makeover Monday: How Many Hours Do Women Work in OECD Countries?


Since Sunday is Mother’s Day in the States, this week’s Makeover Monday topic is about how many hours women work in various OECD countries. Let’s start by reviewing the original chart by Business Insider:

What works well?

  • The stacked bar chart is relatively easy to understand since it only has four colors and there aren’t that many countries to compare.
  • The chart is sorted by the smallest percentage of women working 40+ hours per week, which makes it easy to compare that category.
  • The colors are easily distinguishable.
  • Easy to read headers

What doesn’t work well?

  • I have no idea what year this data is from. The data goes back to 1976. I assumed it was for 2016, since that’s when the article was written, but after finding the data myself, it looks like it’s from 2014.
  • The title of the article “American women work way more than their European counterparts” isn’t entirely true. The chart doesn’t show all of the countries in Europe from OECD. The U.S. would rank 9th if you compare European countries and the U.S. from 2014.
  • The chart title is useless.
  • Japan isn’t in Europe, so why is that included?
  • Why is the OECD average included if this is supposed to be the U.S. compared to Europe?
  • There’s no rationale to the countries they chose to include. Is the author being deceitful on purpose? I hope it’s merely an oversight.
  • While I don’t think this stacked bar chart is terrible, it does make it very hard to compare any of the other categories of hours worked.

The first thing I did was rebuild the chart including all of the OECD countries and reversing the sort to be by the highest rate of women working 40+ hours.

Click to interact

I included several filtering and sorting options to allow the user to find their own story. The user can scroll through all of the years and see how the story unfolds. This view solves the problem of not being able to sort by any of the other categories of hours worked.

I didn’t love this though, so I created a slightly different version that shrinks the bars and adds dots. Think of it as a stacked dot chart.

Click to interact

This is the beauty of Tableau. I can quickly iterate on ideas and see which one I like best. At first, I thought adding the dots would make it easier to understand. I think it looks pretty neat, but actually, I think I made it harder to understand.

The problem in both of these stacked charts is that I can’t see all of the years in one view. I was really curious as to the patterns. Has the % of women working 40+ hours per week in the U.S. grown? How does that compare to the OECD average? How do other countries compare?

With those thoughts in mind, I created this series of line charts across the different work hours ranges.

Click to interact

I love these types of charts. I created one last week as well. What I like about them is they include lots of context. In this particular example, I can clearly see that the U.S. is higher than the OECD average in the 40+ hours worked per week section. Yet I can also see that there are quite a few OECD countries that are higher than the U.S. I can easily compare Europe to North America. Or only look at the top 10 countries according to U.S. News and World Report. I can zoom into a specific working hours category with a simple tap on the filter.

I almost stopped here, because I think this already is much better than the original. However, I wanted to see if there was a better way to compare the different work hours within a single country. To address that, I thought a small multiples view might work well.

Click to interact

I chose to sort the countries by the highest % of women working 40+ hours per week in 2014. Then you read it in a z-pattern. So this view lets you see where a country ranks amongst the others and you can also compare the hours worked within a single country.
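A z-pattern layout means the sorted list fills the grid left to right, then top to bottom. Here is a minimal Python sketch of assigning each country a row and column; the function name, the percentages, and the number of columns are invented for illustration.

```python
def z_pattern_layout(pct_40_plus, columns):
    """Sort countries by value (highest first) and assign (row, column) grid cells."""
    ordered = sorted(pct_40_plus, key=lambda c: (-pct_40_plus[c], c))
    # Integer division gives the row, the remainder gives the column,
    # so the ranking reads left to right and then down, like a Z.
    return {country: (index // columns, index % columns)
            for index, country in enumerate(ordered)}

# Made-up percentages, not the OECD figures.
layout = z_pattern_layout(
    {"Country A": 40.0, "Country B": 75.0, "Country C": 55.0, "Country D": 20.0},
    columns=3,
)
```

The top-left cell is always the highest-ranked country, so a country's position in the grid doubles as its rank.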

Then it hit me. I quickly went to Andy Cotgreave’s blog and found this viz he created a few weeks ago:

Yes! This is it! It even matches the colors I was using. I duplicated the previous viz and changed it to an area chart. I then added some of the filtering options back.

NOTE: If you’re viewing this on a phone, you’ll see a long skinny version with less filtering and that also has the sorting option removed.

It took me five iterations, but I got there in the end. I’m not sure how I could have done this quicker with any tool other than Tableau. I love how I can fail fast! Which version do you like best?

May 5, 2016

Makeover Monday: The Rising Cost of Tuition in the United States - Highcharts Edition

Makeover Monday has been an incredible learning experience for me. I’ve become particularly fond of designing for mobile devices. Doing so has led me to learn a lot about things Tableau can do to make the mobile experience better. The great news is they listen. I’m having a call with the mobile developers to talk to them about my experience.

However, until these problems are fixed, I need to find another way to create visualisations that work well on mobile. Enter Highcharts. It’s a JavaScript-based charting tool that has proven super easy to learn. I’ve never coded in JS before, yet I was able to reproduce my entire Makeover Monday viz from this week without too much fuss. Yes, it takes me longer than Tableau, but the extra time is worth the control I have over the visualisation.

Give it a play, especially on your phone. I think you’ll find it a much better experience than the Tableau version.

May 4, 2016

Data+Women: Women are Underrepresented on Tech Boards


I was listening to the latest Tableau Wannabe Podcast about Women in Data Month and Emily mentioned how Tableau has no females on its Board of Directors. I’m also preparing to speak at the first Data+Women London meetup tomorrow, so I wanted to educate myself a bit and also verify Emily's comment.

I looked on Google Finance at Tableau to get a list of comparable companies. I then included some more big tech companies from Silicon Valley for comparison purposes. The data is shocking!

Of the 17 companies I selected:

  1. Only seven (7) have boards with at least 25% female composition
  2. 0% of the companies have 50% representation of females
  3. Tableau and MicroStrategy have exactly zero (0) female members on their boards

This is sad, truly sad. My message to the leaders of these companies: “Lean in!”

May 3, 2016

Tableau Tip Tuesday: 12 Use Cases for Parameters


Parameters are one of Tableau’s most powerful features. I remember when they were first introduced and it completely changed the Tableau paradigm. This week’s video takes you through 12 simple use cases for parameters. These 12 barely scratch the surface for what’s possible.

NOTE: This video is 50 minutes long.

May 2, 2016

Makeover Monday: The Rising Cost of Tuition in the United States

This week for Makeover Monday, we look at an article and viz from Online MBA Page about the cost of college tuition in the United States.

What works well?

  • Very clear subtitle that describes the visualisation
  • Used a two-color diverging scale to identify the change
  • Nice labels above the color legend to ensure you know what is good and what is bad
  • Good use of red for bad
  • Including the US Avg for context
  • Using % change provides better context than the raw numbers

What doesn’t work well?

  • The irregular state shapes - is a map even necessary?
  • The color bins are not equal size - notice how the first two are different sizes than the rest
  • Bar chart should be rotated 90 degrees to make it easier to read
  • What year does the bar chart represent? I can’t find anywhere that tells us.
  • Do you really need the ranking numbers below the bar chart?

The data that was available goes back 12 years, so why did the author only choose to look at the change over the last 5 years? For my version, I’m going to consider all of the years because I think that provides better context as to how much tuition has changed.

I started by looking at maps. I created this small multiples version, which shows the difference for each state to the U.S. average year by year.

This doesn’t work well because all of the maps basically look the same. For the most part, the northeast is higher than the national average consistently. It also doesn’t allow me to easily compare one state to the others.

I tried a hexmap and circle maps, but none of them worked well for me. In other words, just because we have geographical data, it doesn’t mean we HAVE to create a map. For me, what’s more important is including context. How can I best compare the States? How can I compare to the U.S. Average? What’s the best way for me to show how the cost of tuition has changed?

Given all of these questions, I decided that a series of three line charts was the best way to go for my story. In this view I can show:

  1. Yearly tuition rates for each State
  2. Highlight a selected State, but include the others for context
  3. Include the U.S. average for context
  4. Show the change over the last 12 years
  5. Show the variance to the U.S. average

Each of these provides much greater context than the original visualisation and tells a more compelling story about the escalating rise in tuition across most of the U.S.
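Two of the measures in that list, the variance to the U.S. average and the change over the whole period, are simple to express in code. The Python sketch below assumes yearly tuition is held in `{year: amount}` dictionaries and that the U.S. average is supplied rather than recomputed; the function names and figures are invented for illustration.

```python
def variance_to_average(state_by_year, us_avg_by_year):
    """Difference between a state's tuition and the U.S. average for each shared year."""
    return {year: state_by_year[year] - us_avg_by_year[year]
            for year in sorted(state_by_year) if year in us_avg_by_year}

def pct_change_over_period(by_year):
    """Percentage change from the earliest year to the latest year."""
    first, last = min(by_year), max(by_year)
    return 100 * (by_year[last] - by_year[first]) / by_year[first]

# Invented tuition figures for a single state and the national average.
state = {2004: 5000, 2010: 7000, 2016: 9000}
us_avg = {2004: 6000, 2010: 7000, 2016: 8000}

variance = variance_to_average(state, us_avg)
change = pct_change_over_period(state)
```

In this made-up case the state starts below the national average and ends above it, the kind of crossover a line chart of the variance makes easy to spot.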