SAS Visual Analytics: Bubble Plots for Less Toil and Trouble

You cannot research bubble plots from any data visualization expert without running into a reference to Hans Rosling’s TED Talk, Let My Dataset Change Your Mindset. It may have been the first time a bubble plot gained mainstream appeal, as Stephen Few noted in his book Show Me the Numbers. Rosling packed an intense amount of information into one data visualization, and he changed my mindset. Here are some pointers on using bubble plots to give your data set a sexy new appeal.

Bubble Plots Provide an Overall View

Bubble plots allow you to understand the relationship between three values. Two numeric values are plotted as the X and Y coordinates, and the third value sets the size of the bubble. In this example, I plotted the Revenue per Order by the Quantity per Order, and the bubble size is the gross margin. Each bubble represents a different product, identified by its color. You can interpret this chart by comparing the bubble size and placement.

A higher gross margin means that we made more profit on the product, so the larger the bubble, the bigger the bank account (yeah!). But what else can we learn? Here’s a similar chart with an additional product added but with the notations removed.

What We Observe

• The red product has a lower profit margin than the other products despite bringing in double the revenue. Wonder if the product was on sale to lure other customer purchases, or if our discount is too much?
• In the bottom corner, the lime green and teal products had identical orders in terms of quantity and revenue, but the lime green bubble is larger, so that product was more profitable. If these are similar products, wonder why the margin is so different? Maybe we switched suppliers and had to pay a higher cost?
• The blue, yellow and teal products had similar margins but at different prices and quantities. How do we sell more of those products!?!

Preparing the Data

When preparing data for this demonstration, I used a sample dataset from the BIRT project about a company selling Classic Cars. I joined several tables to get a dataset that contains the line items from each customer order. Here’s an example of two orders.

The overall revenue and quantity were averaged, but the chart also worked when I used sums instead of averages. Since the numbers represented several orders over a time period, I thought it made more sense to show the average.
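To make the average-versus-sum choice concrete, here is a minimal Python sketch of the same aggregation done outside SAS Visual Analytics. The product names and numbers are made up for illustration and are not taken from the BIRT sample data; `aggregate` is a hypothetical helper, not a SAS function.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical line items: (product, line-item revenue, line-item quantity).
# These values are invented for illustration only.
line_items = [
    ("A", 4000.0, 40),
    ("A", 2000.0, 20),
    ("B", 1500.0, 30),
    ("B", 2500.0, 50),
    ("B", 2000.0, 40),
]


def aggregate(items, how):
    """Group line items by product and aggregate revenue and quantity.

    `how` is an aggregation function such as `mean` or `sum`.
    Returns {product: (aggregated revenue, aggregated quantity)}.
    """
    revenue = defaultdict(list)
    quantity = defaultdict(list)
    for product, rev, qty in items:
        revenue[product].append(rev)
        quantity[product].append(qty)
    return {p: (how(revenue[p]), how(quantity[p])) for p in revenue}


# Each result supplies the X and Y coordinates for one bubble per product.
averages = aggregate(line_items, mean)
totals = aggregate(line_items, sum)

for product in sorted(averages):
    print(product, "average:", averages[product], "sum:", totals[product])
```

With sums, a product that simply has more line items drifts up and to the right; with averages, the position describes a typical line item instead.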

The Gross Margin (%) was a calculated measure that I created based on each order’s line item, as shown in the following figure. I chose a Calculated Item so I would have the value for each row. SAS Visual Analytics understood what to do with the calculation when it was added to the data object, where I set the aggregation to Average (shown in the next step).
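As a rough illustration of a row-level calculation followed by an Average aggregation, here is a small Python sketch. The formula is an assumption (the common definition of gross margin, revenue minus cost divided by revenue); the exact expression used in the figure may differ, and the revenue and cost numbers are hypothetical.

```python
from statistics import mean


def gross_margin_pct(revenue, cost):
    """Gross margin as a percentage of revenue for one line item.

    Assumed formula: (revenue - cost) / revenue * 100.
    """
    if revenue == 0:
        raise ValueError("revenue must be non-zero")
    return (revenue - cost) / revenue * 100


# Hypothetical line items for one product: (revenue, cost).
rows = [(200.0, 120.0), (100.0, 50.0)]

# Step 1: compute the margin on every row (the row-level calculated value).
row_margins = [gross_margin_pct(rev, cost) for rev, cost in rows]

# Step 2: aggregate the row values with Average, as the bubble size would be.
average_margin = mean(row_margins)

# For comparison: the margin computed once from the totals gives a
# different number, because larger line items carry more weight.
total_revenue = sum(rev for rev, _ in rows)
total_cost = sum(cost for _, cost in rows)
margin_of_totals = gross_margin_pct(total_revenue, total_cost)

print(round(average_margin, 2), round(margin_of_totals, 2))
```

The two numbers answer slightly different questions, so it is worth knowing which one your bubble size represents.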

Duplicate and Rename Data Items

I had Line Item Revenue and Line Item Quantity data items, and since I didn’t like their names, my simple solution was to duplicate the items and rename them to Revenue Per Order and Quantity Per Order. Then I changed each data item to display as an average for the chart.

5 Tips for Using Bubble Plots

Many people criticize bubble charts for being difficult to interpret. Don’t let that stop you from using them; just be sure to add the supporting information that helps the user understand. Most business users can learn to interpret the chart easily when it’s properly labeled. This chart is also not suited to reading precise values, because it’s meant to provide an overall look and help the audience identify areas of concern. During this analysis, we were comparing bubble sizes and not actual profit margins. A bubble chart is used to identify situations, not to explain what happened or why it happened.

• The legend on the side helps the user understand what the bubble size and the colors mean, so it’s essential. Otherwise there is no way to tell that the bubble size represents gross margin and the colors represent product codes.
• You can also change the transparency setting so the users can see the overlap when there are more bubbles. The transparency setting is on the Properties pane.
• If there are too many bubbles, it can confuse the reader (see the preceding figure). I suggest adding a List object or a hierarchy to help the reader digest and explore what they are seeing. The List object allows the users to make multiple selections.
You might also notice that this one is an earlier iteration where I had not averaged the Revenue and Quantity. As you can see by the bubble placement, it didn’t really change the message. There is still a question about the margin.
• Add data tips to give the users more insight as they explore the data visualization. I added the suggested retail price (MSRP). Wonder what else I could have added to change the story.
• You can add a fifth dimension of time. I used the quarter the product was sold. This helps the user understand what happened to the product over time. We can see that we started selling more of the green product, which didn’t change the margin that much. Wonder if the product was discounted toward the end of the year, or the customer was given a discount for a larger order?

Ready to Start Using Bubble Plots?

While a bubble plot may require users to pause and think about the data, they will be rewarded with a richer data experience. Remember that this data object is intended to provide a data overview and help the users identify areas of their data to review further. Here’s another blog post where I used bubble plots on a geo-plot.