<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>R on Rob Kabacoff</title>
    <link>https://www.rkabacoff.com/categories/r/</link>
    <description>Recent content in R on Rob Kabacoff</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-us</language>
    <copyright>&amp;copy; 2023</copyright>
    <lastBuildDate>Mon, 25 Feb 2019 00:00:00 +0000</lastBuildDate>
    <atom:link href="/categories/r/" rel="self" type="application/rss+xml" />
    
    <item>
      <title>Better bubble charts</title>
      <link>https://www.rkabacoff.com/post-orig/better-bubble-charts/</link>
      <pubDate>Mon, 25 Feb 2019 00:00:00 +0000</pubDate>
      
      <guid>https://www.rkabacoff.com/post-orig/better-bubble-charts/</guid>
      <description>


&lt;p&gt;A bubble chart is simply a scatterplot with the added feature that point sizes are proportional to the values of a third quantitative variable.&lt;/p&gt;
&lt;p&gt;Here is an example. Using the &lt;a href=&#34;https://www.rdocumentation.org/packages/datasets/versions/3.5.0/topics/mtcars&#34;&gt;mtcars&lt;/a&gt; dataset, we’ll plot car weight vs. mileage and use point size to represent horsepower. We’ll use the &lt;code&gt;ggplot2&lt;/code&gt; package and rely on the defaults.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# create a bubble plot
data(mtcars)
library(ggplot2)
ggplot(mtcars, 
       aes(x = wt, y = mpg, size = hp)) +
  geom_point()&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:bubbleplot1&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://www.rkabacoff.com/post-orig/better-buble-charts_files/figure-html/bubbleplot1-1.png&#34; alt=&#34;Basic bubble plot&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 1: Basic bubble plot
&lt;/p&gt;
&lt;/div&gt;
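&lt;p&gt;Before tuning the plot, it can help to check the ranges of the three variables involved. The following is a minimal sketch using only base R; the object name &lt;code&gt;bubble_vars&lt;/code&gt; is introduced here for illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# inspect the variables mapped to x, y, and point size
data(mtcars)
bubble_vars &amp;lt;- mtcars[, c(&amp;quot;wt&amp;quot;, &amp;quot;mpg&amp;quot;, &amp;quot;hp&amp;quot;)]

# minimum and maximum of each variable
sapply(bubble_vars, range)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this dataset horsepower runs from 52 to 335, so differences in point size should be visible once the bubbles are large enough.&lt;/p&gt;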
&lt;p&gt;While this default plot is useful, we can improve its appearance by&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;increasing the size of the bubbles&lt;/li&gt;
&lt;li&gt;choosing a different point shape and color&lt;/li&gt;
&lt;li&gt;adding some transparency, and&lt;/li&gt;
&lt;li&gt;including more useful labels.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# create an improved bubble plot
ggplot(mtcars, 
       aes(x = wt, y = mpg, size = hp)) +
  geom_point(alpha = .5, 
             fill = &amp;quot;cornflowerblue&amp;quot;, 
             color = &amp;quot;black&amp;quot;, 
             shape = 21) +
  scale_size_continuous(range = c(1, 14)) +
  labs(title = &amp;quot;Auto mileage by weight and horsepower&amp;quot;,
       subtitle = &amp;quot;Motor Trend US Magazine (1973-74 models)&amp;quot;,
       x = &amp;quot;Weight (1000 lbs)&amp;quot;,
       y = &amp;quot;Miles/(US) gallon&amp;quot;,
       size = &amp;quot;Gross\nhorsepower&amp;quot;) &lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:bubbleplot2&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://www.rkabacoff.com/post-orig/better-buble-charts_files/figure-html/bubbleplot2-1.png&#34; alt=&#34;An improved bubble plot&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 2: An improved bubble plot
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Here are some things to note:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The &lt;code&gt;range&lt;/code&gt; parameter in the &lt;code&gt;scale_size_continuous&lt;/code&gt; function specifies the minimum and maximum size of the plotting symbol. The default is &lt;code&gt;range = c(1, 6)&lt;/code&gt;. A larger range will magnify size differences.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;code&gt;shape&lt;/code&gt; option in the &lt;code&gt;geom_point&lt;/code&gt; function specifies a circle with a border color and fill color (&lt;code&gt;shape = 21&lt;/code&gt;).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Labels should include units of measurement whenever possible. Also note the use of &lt;code&gt;\n&lt;/code&gt; to produce a line break in the horsepower label.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Clearly, miles per gallon decreases with increased car weight and horsepower. However, there is one car with low weight, high horsepower, and high gas mileage. Going back to the data, it’s the Lotus Europa.&lt;/p&gt;
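&lt;p&gt;One way to go back to the data is to filter for light cars with comparatively high horsepower. This is a sketch; the cutoffs (under 2,000 lbs and over 100 horsepower) are assumptions chosen for illustration, and the name &lt;code&gt;light_powerful&lt;/code&gt; is introduced here.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# find light cars with relatively high horsepower
# (the cutoffs below are illustrative assumptions)
data(mtcars)
light_powerful &amp;lt;- subset(mtcars, wt &amp;lt; 2 &amp;amp; hp &amp;gt; 100,
                         select = c(wt, hp, mpg))
light_powerful&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With these cutoffs, the only row returned is the Lotus Europa.&lt;/p&gt;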
&lt;div id=&#34;using-color&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Using Color&lt;/h3&gt;
&lt;p&gt;We can add a fourth variable to the plot by mapping its values to the bubble fill color. For example, let’s add the number of a car’s cylinders to the plot above.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# create an improved bubble plot using fill color to represent
# a fourth variable
ggplot(mtcars, 
       aes(x = wt, y = mpg, size = hp, fill=factor(cyl))) +
  geom_point(alpha = .5, 
             color = &amp;quot;black&amp;quot;, 
             shape = 21) +
  scale_size_continuous(range = c(1, 14)) +
  labs(title = &amp;quot;Auto mileage by weight and horsepower&amp;quot;,
       subtitle = &amp;quot;Motor Trend US Magazine (1973-74 models)&amp;quot;,
       x = &amp;quot;Weight (1000 lbs)&amp;quot;,
       y = &amp;quot;Miles/(US) gallon&amp;quot;,
       size = &amp;quot;Gross\nhorsepower&amp;quot;,
       fill = &amp;quot;Cylinders&amp;quot;) +
  theme_minimal() &lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;figure&#34;&gt;&lt;span style=&#34;display:block;&#34; id=&#34;fig:bubbleplot3&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://www.rkabacoff.com/post-orig/better-buble-charts_files/figure-html/bubbleplot3-1.png&#34; alt=&#34;Adding color to a bubble plot&#34; width=&#34;672&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 3: Adding color to a bubble plot
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;I’ve found that mapping a variable to a fill color works best for factors (representing categorical variables). Often mapping quantitative variables to fill colors leads to smooth color gradations with color variations that are difficult to distinguish.&lt;/p&gt;
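&lt;p&gt;The difference can be seen by comparing the two mappings side by side. The sketch below assumes the unmodified &lt;code&gt;mtcars&lt;/code&gt; data; the object names &lt;code&gt;p_gradient&lt;/code&gt; and &lt;code&gt;p_discrete&lt;/code&gt; are introduced here for illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# compare a continuous fill mapping with a discrete one
library(ggplot2)
data(mtcars)

# cyl is stored as a number, so ggplot2 treats it as continuous
class(mtcars$cyl)

# wrapping it in factor() gives three discrete levels
levels(factor(mtcars$cyl))

# numeric fill: colors vary along a gradient
p_gradient &amp;lt;- ggplot(mtcars, 
                     aes(x = wt, y = mpg, size = hp, fill = cyl)) +
  geom_point(alpha = .5, color = &amp;quot;black&amp;quot;, shape = 21)

# factor fill: one distinct color per cylinder count
p_discrete &amp;lt;- ggplot(mtcars, 
                     aes(x = wt, y = mpg, size = hp, fill = factor(cyl))) +
  geom_point(alpha = .5, color = &amp;quot;black&amp;quot;, shape = 21)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Printing &lt;code&gt;p_gradient&lt;/code&gt; and then &lt;code&gt;p_discrete&lt;/code&gt; shows the two legends, a color bar versus three separate keys.&lt;/p&gt;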
&lt;p&gt;In the above graph, a minimal theme was used. This is a personal aesthetic choice.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;interactive-bubble-plots&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Interactive Bubble Plots&lt;/h3&gt;
&lt;p&gt;So far, the plot is static. We can use a package such as &lt;code&gt;plotly&lt;/code&gt; to render an interactive bubble plot. Mouse over the points to see the effects.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# create an interactive bubble plot
library(plotly)

mtcars$cyl &amp;lt;- factor(mtcars$cyl, 
                     labels=c(&amp;quot;4 cyl&amp;quot;, &amp;quot;6 cyl&amp;quot;, &amp;quot;8 cyl&amp;quot;))

plot_ly(mtcars, x = ~wt, y = ~mpg,
        size = ~hp, color = ~cyl,
        type = &amp;quot;scatter&amp;quot;, mode = &amp;quot;markers&amp;quot;,
        marker = list(opacity = 0.5, sizemode = &amp;quot;diameter&amp;quot;),
        text = ~paste(row.names(mtcars), 
                      &amp;quot;&amp;lt;br&amp;gt;horsepower:&amp;quot;, hp)) %&amp;gt;%
  layout(title = &amp;quot;Auto mileage by weight, horsepower, and number of cylinders&amp;quot;,
         xaxis = list(title = &amp;quot;Weight (1000 lbs)&amp;quot;),
         yaxis = list(title = &amp;quot;Miles/(US) gallon&amp;quot;))
&lt;/code&gt;&lt;/pre&gt;
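&lt;p&gt;The hover labels come from the &lt;code&gt;text&lt;/code&gt; argument. As a small sketch, the same &lt;code&gt;paste&lt;/code&gt; call can be run on its own to preview the strings; the name &lt;code&gt;hover_text&lt;/code&gt; is introduced here for illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# preview the hover text built for each car
# &amp;lt;br&amp;gt; is an HTML line break, which plotly renders in the tooltip
hover_text &amp;lt;- paste(row.names(mtcars), 
                    &amp;quot;&amp;lt;br&amp;gt;horsepower:&amp;quot;, mtcars$hp)
head(hover_text, 3)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Because &lt;code&gt;paste&lt;/code&gt; separates its arguments with a space by default, each label reads as the car name, a line break, and then the horsepower value.&lt;/p&gt;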
&lt;p&gt;You can create much more customized interactive bubble plots with &lt;code&gt;plotly&lt;/code&gt;. This &lt;a href=&#34;https://plot.ly/r/bubble-charts/&#34;&gt;help page&lt;/a&gt; provides details and examples.&lt;/p&gt;
&lt;p&gt;You can freely create interactive Plotly graphs on your desktop for local use. However, if you want to &lt;strong&gt;post&lt;/strong&gt; the graph on a webpage, you’ll need to use the Plotly API. There are three steps in this process:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;&lt;p&gt;Obtain a free account on Plotly (&lt;a href=&#34;https://plot.ly&#34; class=&#34;uri&#34;&gt;https://plot.ly&lt;/a&gt;) and record your &lt;em&gt;username&lt;/em&gt; and &lt;em&gt;api key&lt;/em&gt;. This will allow you to post 25 graphs for free (with 500 embedded views per 24-hour period).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Execute the following code in the R console (inserting your actual &lt;em&gt;username&lt;/em&gt; and &lt;em&gt;api key&lt;/em&gt; for the placeholders). Alternatively, you can place the commands in your &lt;code&gt;.Rprofile&lt;/code&gt; (a file that is executed every time you start R).&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;Sys.setenv(&amp;quot;plotly_username&amp;quot;=&amp;quot;your_plotly_username&amp;quot;)
Sys.setenv(&amp;quot;plotly_api_key&amp;quot;=&amp;quot;your_api_key&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;ol start=&#34;3&#34; style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Use the following code to generate the graph.&lt;/li&gt;
&lt;/ol&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# create an interactive bubble plot for placement on webpage
mtcars$cyl &amp;lt;- factor(mtcars$cyl, 
                     labels=c(&amp;quot;4 cyl&amp;quot;, &amp;quot;6 cyl&amp;quot;, &amp;quot;8 cyl&amp;quot;))

p &amp;lt;- plot_ly(mtcars, x = ~wt, y = ~mpg,
             size = ~hp, color = ~cyl,
             type = &amp;quot;scatter&amp;quot;, mode = &amp;quot;markers&amp;quot;,
             marker = list(opacity = 0.5, sizemode = &amp;quot;diameter&amp;quot;),
             text = ~paste(row.names(mtcars), 
                           &amp;quot;&amp;lt;br&amp;gt;horsepower:&amp;quot;, hp)) %&amp;gt;%
  layout(title = &amp;quot;Auto mileage by weight, horsepower, and number of cylinders&amp;quot;,
         xaxis = list(title = &amp;quot;Weight (1000 lbs)&amp;quot;),
         yaxis = list(title = &amp;quot;Miles/(US) gallon&amp;quot;))

chart_link &amp;lt;- api_create(p, filename = &amp;quot;bubble-chart&amp;quot;)
chart_link&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Simply place this code in an R chunk in your R Markdown document.&lt;/p&gt;
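&lt;p&gt;If the upload fails, one thing to check is whether the credentials from step 2 are visible to the R session. The helper below is a hypothetical sketch (&lt;code&gt;plotly_credentials_set&lt;/code&gt; is not part of &lt;code&gt;plotly&lt;/code&gt;); it only tests that both environment variables are non-empty, not that they are valid.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# hypothetical helper: TRUE if both Plotly environment variables are set
plotly_credentials_set &amp;lt;- function() {
  vars &amp;lt;- c(&amp;quot;plotly_username&amp;quot;, &amp;quot;plotly_api_key&amp;quot;)
  # Sys.getenv() returns an empty string for unset variables
  all(nzchar(Sys.getenv(vars)))
}

plotly_credentials_set()&lt;/code&gt;&lt;/pre&gt;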
&lt;/div&gt;
&lt;div id=&#34;final-notes&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Final Notes&lt;/h3&gt;
&lt;p&gt;Bubble charts are controversial for the same reason that pie charts are controversial: people are better at judging length than volume. However, bubble charts are quite popular and their use is growing.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This post is adapted from &lt;a href=&#34;https://rkabacoff.github.io/datavis/&#34;&gt;&lt;strong&gt;Data Visualization with R&lt;/strong&gt;&lt;/a&gt;, a freely available guide for data visualization.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/div&gt;
</description>
    </item>
    
  </channel>
</rss>
