We have been using ggplot2 for a while now, learning how to add fonts, colours, and annotations. Here we will look at changing the scales on the axes of our graphs and plots. Formats such as percentages, ratios, and the like are needed when the quantities depicted demand them, that is, when the data variables on the axes are not simple counts or measurements.
Here we will look at some of the neat features of the scales package.
1.1 Fonts and Themes
## Clean the slate
systemfonts::clear_local_fonts()
systemfonts::clear_registry()
##
showtext_opts(dpi = 96) # set DPI for showtext
sysfonts::font_add(
  family = "Alegreya",
  regular = "../../../../../../fonts/Alegreya-Regular.ttf",
  bold = "../../../../../../fonts/Alegreya-Bold.ttf",
  italic = "../../../../../../fonts/Alegreya-Italic.ttf",
  bolditalic = "../../../../../../fonts/Alegreya-BoldItalic.ttf"
)
sysfonts::font_add(
  family = "Anton",
  regular = "../../../../../../fonts/Anton-Regular.ttf"
) # Only one style is available
sysfonts::font_add(
  family = "Roboto Condensed",
  regular = "../../../../../../fonts/RobotoCondensed-Regular.ttf",
  bold = "../../../../../../fonts/RobotoCondensed-Bold.ttf",
  italic = "../../../../../../fonts/RobotoCondensed-Italic.ttf",
  bolditalic = "../../../../../../fonts/RobotoCondensed-BoldItalic.ttf"
)
sysfonts::font_add(
  family = "IbarraNova",
  regular = "../../../../../../fonts/IbarraRealNova-Regular.ttf",
  bold = "../../../../../../fonts/IbarraRealNova-Bold.ttf",
  italic = "../../../../../../fonts/IbarraRealNova-Italic.ttf",
  bolditalic = "../../../../../../fonts/IbarraRealNova-BoldItalic.ttf"
)
sysfonts::font_add(
  family = "Tangerine",
  regular = "../../../../../../fonts/Tangerine-Regular.ttf",
  bold = "../../../../../../fonts/Tangerine-Bold.ttf"
) # only these two are available
sysfonts::font_add(
  family = "Schoolbell",
  regular = "../../../../../../fonts/Schoolbell-Regular.ttf"
) # Only regular is available
showtext_auto() # set these fonts as default for the session
# sysfonts::font_families() # check which fonts are available in the session
theme_custom <- function() {
  theme_bw(base_size = 10) +
    theme_sub_axis(
      title = element_text(family = "Roboto Condensed", size = 8),
      text = element_text(family = "Roboto Condensed", size = 6)
    ) +
    theme_sub_legend(
      text = element_text(family = "Roboto Condensed", size = 6),
      title = element_text(family = "Alegreya", size = 8)
    ) +
    theme_sub_plot(
      title = element_text(family = "Alegreya", size = 14, face = "bold"),
      title.position = "plot",
      subtitle = element_text(family = "Alegreya", size = 10),
      caption = element_text(family = "Alegreya", size = 6),
      caption.position = "plot"
    )
}

## Use available fonts in ggplot text geoms too!
ggplot2::update_geom_defaults(geom = "text", new = list(
  family = "Roboto Condensed", face = "plain", size = 3.5, color = "#2b2b2b"
))
ggplot2::update_geom_defaults(geom = "label", new = list(
  family = "Roboto Condensed", face = "plain", size = 3.5, color = "#2b2b2b"
))
ggplot2::update_geom_defaults(geom = "marquee", new = list(
  family = "Roboto Condensed", face = "plain", size = 3.5, color = "#2b2b2b"
))
ggplot2::update_geom_defaults(geom = "text_repel", new = list(
  family = "Roboto Condensed", face = "plain", size = 3.5, color = "#2b2b2b"
))
ggplot2::update_geom_defaults(geom = "label_repel", new = list(
  family = "Roboto Condensed", face = "plain", size = 3.5, color = "#2b2b2b"
))

## Set the theme
ggplot2::theme_set(new = theme_custom())

## tinytable options
options("tinytable_tt_digits" = 2)
options("tinytable_format_num_fmt" = "significant_cell")
options(tinytable_html_mathjax = TRUE)

## Set defaults for flextable
flextable::set_flextable_defaults(font.family = "Roboto Condensed")
2 Scales: Breaks and Labels
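The scales package provides functions that are useful for scaling axes and legends in ggplot2. It belongs to the wider tidyverse family and is installed as a dependency of ggplot2, but it is not attached when you load ggplot2: either call library(scales) or prefix its functions with scales::, as in scales::label_percent().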
There are several parts to scaling: the breaks, the labels, and the transformations.
Breaks: These are the points at which ticks appear on an axis. You can specify breaks manually or use functions to generate them automatically.
Labels: These are the text labels that appear at the breaks. You can customize the labels to display values in different formats, such as percentages, currency, dates, or scientific notation.
Transformations: These are functions that can be applied to the data before plotting, such as logarithmic or square root transformations.
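The three parts listed above can be tried out on plain numbers before they go anywhere near a plot. The following is a minimal sketch, assuming a recent version of scales (1.3.0 or later, for transform_log10()); the object names are made up for illustration.

```r
library(scales)

# Breaks: a breaks function takes the range of the data
# and returns the positions where ticks should appear
make_breaks <- breaks_width(0.25)
tick_positions <- make_breaks(c(0, 1))

# Labels: a label function turns those positions into text
make_labels <- label_percent(accuracy = 1)
tick_labels <- make_labels(tick_positions)
tick_labels
# "0%" "25%" "50%" "75%" "100%"

# Transformations: a transform object bundles a function and its inverse
log_tr <- transform_log10()
log_values <- log_tr$transform(c(1, 10, 100))
log_tr$inverse(log_values) # back to 1, 10, 100
```

Functions created this way are what we hand to the breaks, labels, and transform arguments of the ggplot2 scale functions.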
Scales can be applied to all the aesthetics in ggplot: x, y, color, fill, size, shape, and even some more esoteric ones such as area, alpha, and linewidth.
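As an illustration of a scale on a non-positional aesthetic, here is a small sketch that formats the legend of a size scale. The toy data frame and its column names are hypothetical and are not part of the datasets used in this module.

```r
library(ggplot2)
library(scales)

# Hypothetical toy data: four points with a "population" attached to each
toy <- data.frame(
  x = 1:4,
  y = c(2, 4, 3, 5),
  pop = c(5e5, 2e6, 8e6, 2.5e7)
)

# The same kind of label function used for axes also formats legend keys
pop_labels <- label_comma()
size_breaks <- c(1e6, 1e7, 2e7)

p_size <- ggplot(toy, aes(x = x, y = y, size = pop)) +
  geom_point(alpha = 0.8) +
  scale_size_area(
    breaks = size_breaks, # which legend keys to show
    labels = pop_labels, # how to print them
    max_size = 8
  ) +
  labs(size = "Population")
p_size
```

The breaks and labels arguments behave here just as they do on the x and y scales; only the aesthetic being scaled is different.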
3 Case Study: gss-wages
We will work with datasets that we have seen in earlier modules: datasets with variables such as currency amounts, or with variables that vary over several orders of magnitude.
Let us create a basic scatter plot of realrinc vs year, coloured by gender, and a box plot of realrinc by gender:
# Scatter plot: income against year, coloured by gender
p1 <- gf_point(
  data = gss_wages, realrinc ~ year,
  color = ~gender, alpha = 0.8, size = 1
) %>%
  gf_labs(y = "Income", x = "Year", title = "GSS-Wages Dataset") %>%
  gf_refine(scale_color_brewer(palette = "Set1"))
p1

# Box plot: income by gender
p2 <- gf_boxplot(
  data = gss_wages, realrinc ~ gender,
  color = ~gender, fill = ~gender, alpha = 0.8,
  orientation = "x"
) %>%
  gf_labs(y = "Income", x = "Gender", title = "GSS-Wages Dataset") %>%
  gf_refine(scale_color_brewer(palette = "Set1"))
p2
(a) Scatter Plot
(b) Box Plot
Figure 1: Basic Plots
Let us now change the y-axis to show income in thousands of dollars. We will use the scales package for this.
The scale function label_currency has several arguments that can be used to customize the display of currency values. Here we have used scale = 0.001 to convert the values to thousands and suffix = "K" to add a “K” suffix to the labels.
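A minimal sketch of such a y-axis follows, assuming scales 1.3.0 or later (where label_currency() is available). It uses plain ggplot2 and a hypothetical toy data frame in place of gss_wages so that it runs on its own.

```r
library(ggplot2)
library(scales)

# scale = 0.001 divides each value by 1000; suffix = "K" is appended to the label
label_k <- label_currency(scale = 0.001, suffix = "K")
label_k(c(0, 25000, 100000))
# "$0K" "$25K" "$100K"

# Hypothetical toy data standing in for the year and realrinc columns
toy <- data.frame(
  year = c(1980, 1990, 2000, 2010),
  realrinc = c(12000, 25000, 48000, 110000)
)

p_k <- ggplot(toy, aes(x = year, y = realrinc)) +
  geom_point() +
  scale_y_continuous(labels = label_k) +
  labs(y = "Income (thousands of dollars)", x = "Year")
p_k
```

In a ggformula pipeline like the one used for Figure 1, the same scale_y_continuous() call can be passed inside gf_refine().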
We may yet be dissatisfied with this plot and wish to change the y-axis to a logarithmic scale. This is useful when the data spans several orders of magnitude, as is often the case with income data. We will also, for demonstration purposes, set up an excessively detailed y-scale with far too many breaks!
Figure 3: Logarithmic Scale with (Excessively) Detailed Breaks
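One way to sketch such a scale is shown below, again with plain ggplot2 and hypothetical toy data. The 1-2-5 break pattern per decade is an assumption chosen for illustration and is not necessarily the set of breaks used for the figure.

```r
library(ggplot2)
library(scales)

# Hypothetical toy incomes spanning several orders of magnitude
toy <- data.frame(
  year = rep(c(1980, 2000, 2020), each = 4),
  income = c(
    300, 4000, 30000, 250000,
    500, 8000, 45000, 300000,
    900, 12000, 60000, 400000
  )
)

# Deliberately dense breaks: 1, 2, and 5 times each power of ten from 100 to 100,000
detailed_breaks <- sort(as.vector(c(1, 2, 5) %o% 10^(2:5)))

p_log <- ggplot(toy, aes(x = year, y = income)) +
  geom_point() +
  scale_y_log10(
    breaks = detailed_breaks, # manual tick positions
    labels = label_currency() # print them as dollar amounts
  ) +
  labs(y = "Income (log scale)", x = "Year")
p_log
```

On a log axis the breaks are equally spaced only within each 1-2-5 step, so the crowding of labels becomes apparent quickly as more breaks are added.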
5 Case Study: Gapminder
We will work with the gapminder dataset from the gapminder package. This dataset contains data on life expectancy, GDP per capita, and population for various countries over several years.
There are several variables, most of them “man-made”, which tend to vary over several orders of magnitude. Let us see how we can depict those.
# Scatter plot: GDP per capita against year, coloured by continent
p1 <- gf_point(
  data = gapminder, gdpPercap ~ year,
  color = ~continent, alpha = 0.8, size = 1
) %>%
  gf_labs(y = "GDP per Capita", x = "Year", title = "Gapminder Dataset") %>%
  gf_refine(scale_color_brewer(palette = "Set1"))
p1

# Box plot: GDP per capita by continent
p2 <- gf_boxplot(
  data = gapminder, gdpPercap ~ continent,
  color = ~continent, fill = ~continent, alpha = 0.8,
  orientation = "x"
) %>%
  gf_labs(y = "GDP per Capita", x = "Continent", title = "Gapminder Dataset") %>%
  gf_refine(scale_color_brewer(palette = "Set1"))
p2