The Mad Hatter’s Guide to Data Viz and Stats in R

Grain Transportation Cartels

1 Setting up R Packages

library(tidyverse)
library(mosaic)
library(skimr)
library(ggformula)

Plot Fonts and Theme

library(systemfonts)
library(showtext)
## Clean the slate
systemfonts::clear_local_fonts()
systemfonts::clear_registry()
##
showtext_opts(dpi = 96) # set DPI for showtext
sysfonts::font_add(
  family = "Alegreya",
  regular = "../../../../../../fonts/Alegreya-Regular.ttf",
  bold = "../../../../../../fonts/Alegreya-Bold.ttf",
  italic = "../../../../../../fonts/Alegreya-Italic.ttf",
  bolditalic = "../../../../../../fonts/Alegreya-BoldItalic.ttf"
)

sysfonts::font_add(
  family = "Roboto Condensed",
  regular = "../../../../../../fonts/RobotoCondensed-Regular.ttf",
  bold = "../../../../../../fonts/RobotoCondensed-Bold.ttf",
  italic = "../../../../../../fonts/RobotoCondensed-Italic.ttf",
  bolditalic = "../../../../../../fonts/RobotoCondensed-BoldItalic.ttf"
)
showtext_auto(enable = TRUE) # enable showtext
##
theme_custom <- function() {
  font <- "Alegreya" # assign font family up front
  "%+replace%" <- ggplot2::"%+replace%" # nolint

  theme_classic(base_size = 14, base_family = font) %+replace% # replace elements we want to change

    theme(
      text = element_text(family = font), # set base font family

      # text elements
      plot.title = element_text( # title
        family = font, # set font family
        size = 24, # set font size
        face = "bold", # bold typeface
        hjust = 0, # left align
        margin = margin(t = 5, r = 0, b = 5, l = 0)
      ), # margin
      plot.title.position = "plot",
      plot.subtitle = element_text( # subtitle
        family = font, # font family
        size = 14, # font size
        hjust = 0, # left align
        margin = margin(t = 5, r = 0, b = 10, l = 0)
      ), # margin

      plot.caption = element_text( # caption
        family = font, # font family
        size = 9, # font size
        hjust = 1 # right align
      ),

      plot.caption.position = "plot", # align caption to the whole plot, not just the panel

      axis.title = element_text( # axis titles
        family = "Roboto Condensed", # font family
        size = 12
      ), # font size

      axis.text = element_text( # axis text
        family = "Roboto Condensed", # font family
        size = 9
      ), # font size

      axis.text.x = element_text( # margin for axis text
        margin = margin(5, b = 10)
      )

      # since the legend often requires manual tweaking
      # based on plot content, don't define it here
    )
}

## Use available fonts in ggplot text geoms too!
ggplot2::update_geom_defaults(geom = "text", new = list(
  family = "Roboto Condensed",
  fontface = "plain", # the text aesthetic is `fontface`, not `face`
  size = 3.5,
  color = "#2b2b2b"
))
ggplot2::update_geom_defaults(geom = "label", new = list(
  family = "Roboto Condensed",
  fontface = "plain", # the label aesthetic is `fontface`, not `face`
  size = 3.5,
  color = "#2b2b2b"
))

## Set the theme
ggplot2::theme_set(new = theme_custom())

2 Introduction

From: Robert H. Porter (1983). A Study of Cartel Stability: The Joint Executive Committee, 1880-1886. The Bell Journal of Economics, Vol. 14, No. 2 (Autumn, 1983), pp. 301-314:

The Joint Executive Committee (JEC) was a cartel of railroad firms which controlled eastbound freight shipments from Chicago to the Atlantic seaboard in the 1880s. While different railroad firms in the JEC shipped grain to different port cities (for example, Baltimore and New York), most of the wheat handled by the cartel was subsequently exported overseas, and the rates charged by different firms were adjusted to compensate for differences in ocean shipping rates.

Price, rather than quantity, has typically been thought to be the strategic variable of firms in the rail-freight industry. Total demand was quite variable, so the actual market share of any particular railroad firm would depend both on the prices charged by all the firms and on unpredictable (random) forces. Price wars were not random; they were precipitated by periods of slackened demand, which were presumably unpredictable, at least to some extent.

On the other hand, the predictable fluctuations in demand that resulted from the annual opening and closing of the Great Lakes (Superior, Michigan, Huron, Ontario, and Erie) to shipping did not disrupt industry conduct. The lakes froze in winter, and whether they were navigable determined the degree of outside competition the railroads faced. Rates simply adjusted systematically with the lake navigation season.

This dataset is available in Vincent Arel-Bundock’s Rdatasets repository, and is part of the R package AER (Applied Econometrics with R).
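Since the data also ships with AER, it can be loaded without a network connection. The following is a minimal sketch, assuming the AER package is already installed (for example via `install.packages("AER")`); it is an alternative to the CSV route used in this case study, not a replacement for it.

```r
# Assumption: the AER package is installed on this machine.
# data() copies the packaged dataset into the current environment.
data("CartelStability", package = "AER")

# Compact overview of the columns and their types
str(CartelStability)

# Number of observations and variables
dim(CartelStability)
```

The packaged version stores `cartel`, `season`, and `ice` as factors and has no `rownames` column, so it differs slightly from the CSV version. Running `help("CartelStability", package = "AER")` shows the package's own description of each variable, which may help when filling in the Data Dictionary.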

3 Read the Data

cartelstability <- read_csv("https://vincentarelbundock.github.io/Rdatasets/csv/AER/CartelStability.csv")
cartelstability
glimpse(cartelstability)
Rows: 328
Columns: 6
$ rownames <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18…
$ price    <dbl> 0.40, 0.40, 0.40, 0.40, 0.40, 0.40, 0.40, 0.40, 0.40, 0.35, 0…
$ cartel   <chr> "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes"…
$ quantity <dbl> 13632, 20035, 16319, 12603, 23079, 19652, 16211, 22914, 23710…
$ season   <chr> "Jan  1 - Jan 28", "Jan  1 - Jan 28", "Jan  1 - Jan 28", "Jan…
$ ice      <chr> "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes"…

4 Data Dictionary

Note: Quantitative Variables

Write in.

Note: Qualitative Variables

Write in.

Note: Observations

Write in.
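One way to gather the facts needed for the Data Dictionary is sketched below. It assumes the same CSV URL used in the Read the Data section, and it only surfaces types, ranges, and distinct values; deciding what each variable means still requires reading the source paper or the dataset documentation.

```r
library(tidyverse)
library(skimr)

# Same URL as in the "Read the Data" section
cartelstability <- read_csv(
  "https://vincentarelbundock.github.io/Rdatasets/csv/AER/CartelStability.csv",
  show_col_types = FALSE
)

# Per-variable summary: type, missing values, ranges, number of unique values
skim(cartelstability)

# Distinct values of each character column:
# these columns are the candidates for qualitative variables
cartelstability %>%
  select(where(is.character)) %>%
  map(unique)
```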

5 Research Question

Note

How do prices for per-tonne grain transport vary based on whether the cartel is working or not? Does this depend upon whether it is summer time or winter time? Why?

6 Inspect/Analyse/Transform the Data

```{r}
#| label: data-preprocessing
#
# Write in your code here
# to prepare this data as shown below
# to generate the plot that follows
# Rename Variables if needed
# Change data to factors etc.
# Set up Counts, histograms etc
```
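As a starting point for the chunk above, here is one possible pre-processing sketch, not the author's solution. The name `cartel_modified` is hypothetical, and ordering `season` by first appearance assumes the rows are already in calendar order, which the first rows of the data suggest but which should be checked.

```r
library(tidyverse)

cartelstability <- read_csv(
  "https://vincentarelbundock.github.io/Rdatasets/csv/AER/CartelStability.csv",
  show_col_types = FALSE
)

cartel_modified <- cartelstability %>%
  # `rownames` is only a row index added by the CSV export
  select(-rownames) %>%
  mutate(
    # treat the yes/no columns as categorical
    across(c(cartel, ice), as.factor),
    # assumption: first-appearance order equals calendar order
    season = fct_inorder(season)
  )

glimpse(cartel_modified)

# Counts for each combination of cartel status and ice condition
cartel_modified %>% count(cartel, ice)
```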

Some summarizing…
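A grouped summary is one way to approach the research question numerically. The sketch below is illustrative only: the object name `price_summary` and the choice of statistics are assumptions, not the summaries originally shown on this page.

```r
library(tidyverse)

cartelstability <- read_csv(
  "https://vincentarelbundock.github.io/Rdatasets/csv/AER/CartelStability.csv",
  show_col_types = FALSE
)

# Price statistics for every combination of cartel status and ice condition
price_summary <- cartelstability %>%
  group_by(cartel, ice) %>%
  summarise(
    n_obs = n(),
    mean_price = mean(price, na.rm = TRUE),
    median_price = median(price, na.rm = TRUE),
    sd_price = sd(price, na.rm = TRUE),
    .groups = "drop"
  )

price_summary
```

Comparing `mean_price` across the `cartel` rows within each `ice` level separates the effect of the cartel from the effect of the lake navigation season.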

7 Plot the Data
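The original figure is not reproduced in this text, so the following is only one plausible chart for the research question, not a reconstruction of it. It uses plain ggplot2 (loaded with tidyverse); the choice of a box plot, the facet layout, and the labels are all assumptions.

```r
library(tidyverse)

cartelstability <- read_csv(
  "https://vincentarelbundock.github.io/Rdatasets/csv/AER/CartelStability.csv",
  show_col_types = FALSE
)

# Distribution of price by cartel status, in separate panels for ice = no / yes
p <- ggplot(cartelstability, aes(x = cartel, y = price, fill = cartel)) +
  geom_boxplot() +
  facet_wrap(vars(ice), labeller = label_both) +
  labs(
    title = "Grain transport prices under the JEC",
    x = "Cartel operative?",
    y = "Price",
    fill = "Cartel"
  )

p
```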

8 Task and Discussion

  • Complete the Data Dictionary.
  • Select and Transform the variables as shown.
  • Create the graphs shown and discuss the following questions:
    • Identify the type of charts
    • Identify the variables used for various geometrical aspects (x, y, fill…). Name the variables appropriately.
    • What research activity might have been carried out to obtain the data graphed here? Provide some details.
    • What pre-processing of the data was required to create the chart?
    • Explain what happens when it is stated “cartel is working” and “cartel is not working”.
    • How do prices for per-tonne grain transport vary based on whether the cartel is working or not? Does this depend upon whether it is summer time or winter time? Why?
    • Is the cartel beneficial for customers of the JEC? What would be their behaviour based on whether the cartel was operational or not?

License: CC BY-SA 2.0

Website made with ❤️ and Quarto, by Arvind V.
