library(systemfonts)
library(showtext)

## Clean the slate
systemfonts::clear_local_fonts()
systemfonts::clear_registry()

##
showtext_opts(dpi = 96) # set DPI for showtext

sysfonts::font_add(
  family = "Alegreya",
  regular = "../../../../../../fonts/Alegreya-Regular.ttf",
  bold = "../../../../../../fonts/Alegreya-Bold.ttf",
  italic = "../../../../../../fonts/Alegreya-Italic.ttf",
  bolditalic = "../../../../../../fonts/Alegreya-BoldItalic.ttf"
)
sysfonts::font_add(
  family = "Roboto Condensed",
  regular = "../../../../../../fonts/RobotoCondensed-Regular.ttf",
  bold = "../../../../../../fonts/RobotoCondensed-Bold.ttf",
  italic = "../../../../../../fonts/RobotoCondensed-Italic.ttf",
  bolditalic = "../../../../../../fonts/RobotoCondensed-BoldItalic.ttf"
)
showtext_auto(enable = TRUE) # enable showtext

##
theme_custom <- function() {
  theme_bw(base_size = 10) +
    theme_sub_axis(
      title = element_text(family = "Roboto Condensed", size = 8),
      text = element_text(family = "Roboto Condensed", size = 6)
    ) +
    theme_sub_legend(
      text = element_text(family = "Roboto Condensed", size = 6),
      title = element_text(family = "Alegreya", size = 8)
    ) +
    theme_sub_plot(
      title = element_text(family = "Alegreya", size = 14, face = "bold"),
      title.position = "plot",
      subtitle = element_text(family = "Alegreya", size = 10),
      caption = element_text(family = "Alegreya", size = 6),
      caption.position = "plot"
    )
}

## Use available fonts in ggplot text geoms too!
ggplot2::update_geom_defaults(geom = "text", new = list(
  family = "Roboto Condensed", face = "plain", size = 3.5, color = "#2b2b2b"
))
ggplot2::update_geom_defaults(geom = "label", new = list(
  family = "Roboto Condensed", face = "plain", size = 3.5, color = "#2b2b2b"
))
ggplot2::update_geom_defaults(geom = "marquee", new = list(
  family = "Roboto Condensed", face = "plain", size = 3.5, color = "#2b2b2b"
))
ggplot2::update_geom_defaults(geom = "text_repel", new = list(
  family = "Roboto Condensed", face = "plain", size = 3.5, color = "#2b2b2b"
))
ggplot2::update_geom_defaults(geom = "label_repel", new = list(
  family = "Roboto Condensed", face = "plain", size = 3.5, color = "#2b2b2b"
))

## Set the theme
ggplot2::theme_set(new = theme_custom())

## tinytable options
options("tinytable_tt_digits" = 2)
options("tinytable_format_num_fmt" = "significant_cell")
options(tinytable_html_mathjax = TRUE)

## Set defaults for flextable
flextable::set_flextable_defaults(font.family = "Roboto Condensed")
2 Introduction
Network graphs show relationships between entities: what kind they are, how strong they are, and even whether they change over time.
We will examine the data structures that describe the entities and the relationships between them, and the data object that combines the two. Then we will see how these are plotted and what the structure of the plot looks like. We can also calculate metrics for the network, based on its structure. Finally, we will examine the geometric metaphors that can represent various classes of entities and their relationships.
Network graphs can be rendered as either static or interactive plots, and we will examine R packages that render both kinds.
There is another kind of structure: one that combines spatial and network data. We will defer that to a future module!
3 What kind of Network graphs will we make?
Here is a network map of the characters in Victor Hugo’s Les Miserables:
know the types and structures of network data and be able to work with them
understand the basics of modern network packages in R
be able to create network visualizations using tidygraph and ggraph (static visualizations) and visNetwork (interactive visualizations)
see directions for how the network metaphor applies in a variety of domains (e.g. biology/ecology, ideas/influence, technology, transportation, to name a few)
PREDICT Inspect the code and guess at what the code might do, write predictions
RUN the code provided and check what happens
INFER what the parameters of the code do and write comments to explain. What bells and whistles can you see?
MODIFY the parameters of the code provided to understand the options available. Write comments to show what you have aimed for and achieved.
MAKE: take an idea/concept of your own, and graph it.
6 Graph Metaphors
Network graphs are characterized by two key terms: nodes and edges.
Nodes : Entities
Metaphors: Individual People? Things? Ideas? Places? to be connected in the network.
Synonyms: vertices. Nodes have IDs.
Edges: Connections
Metaphors: Interactions? Relationships? Influence? Letters sent and received? Dependence? between the entities.
Synonyms: links, ties.
In R, we create network representations using node and edge information. One way to organize this information is:
Node list: a data frame with a single column listing the node IDs found in the edge list. You can also add attribute columns to the data frame such as the names of the nodes or grouping variables. ( Type? Class? Family? Country? Subject? Race? )
Edge list: a data frame containing two columns: the source node and the destination node of each edge. Source and destination are given as node IDs.
Weighted network graph: An edge list can also contain additional columns describing attributes of the edges such as a magnitude aspect for an edge. If the edges have a magnitude attribute the graph is considered weighted.
Edges Table

| From | To | Relationship       | Weightage |
|------|----|--------------------|-----------|
| 1    | 3  | Financial Dealings | 6         |
| 2    | 1  | History Lessons    | 2         |
| 2    | 3  | Vaccination        | 15        |
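The node-list and edge-list structure described above can be sketched in R as two small tibbles that are then combined into one graph object. This is a minimal sketch: the edge rows are the ones from the Edges Table, while the node `name` and `type` values are hypothetical placeholders, since the text does not give any node attributes.

```r
library(tibble)
library(tidygraph)

# Node list: one row per node. Row position is the node ID used by the edges.
# `name` and `type` are made-up attributes for illustration only.
nodes <- tibble(
  id   = 1:3,
  name = c("Node 1", "Node 2", "Node 3"),
  type = c("Person", "Person", "Institution")
)

# Edge list: source (`from`) and destination (`to`) node IDs,
# plus attribute columns describing each edge.
edges <- tibble(
  from         = c(1L, 2L, 2L),
  to           = c(3L, 1L, 3L),
  relationship = c("Financial Dealings", "History Lessons", "Vaccination"),
  weightage    = c(6, 2, 15)
)

# Combine both tables into a single network object
toy_graph <- tbl_graph(nodes = nodes, edges = edges, directed = TRUE)
toy_graph
```

Because the edge list carries a magnitude column (`weightage`), this toy graph can be treated as a weighted graph.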
Layout: A geometric arrangement of nodes and edges.
Metaphors: Location? Spacing? Distance? Coordinates? Colour? Shape? Size? Provides visual insight due to the arrangement.
Layout Algorithms: Methods that arrange nodes and edges with the aim of optimizing some metric.
Metaphors: Nodes are masses and edges are springs. The layout algorithm minimizes the stretching and compressing of all springs. (BTW, are the spring constants K the same for all springs?…)
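To see that a layout is nothing more than a set of coordinates computed for the nodes, here is a small sketch using `create_layout()` from ggraph. The six-node ring is a hypothetical stand-in graph; "kk" (Kamada-Kawai) is a spring-style layout and "fr" (Fruchterman-Reingold) is a force-directed one.

```r
library(tidygraph)
library(ggraph)

# A toy graph: 6 nodes joined in a cycle
ring <- create_ring(6)

set.seed(42) # some layouts start from random positions
layout_kk <- create_layout(ring, layout = "kk")
layout_fr <- create_layout(ring, layout = "fr")

# Each layout is a data frame with one row per node and x/y coordinates
data.frame(x = layout_kk$x, y = layout_kk$y)
data.frame(x = layout_fr$x, y = layout_fr$y)
```

The same graph gets different coordinates from different algorithms, which is why the choice of layout changes what the picture seems to say.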
Directed and undirected network graph: If the distinction between source and target is meaningful, the network is directed. If the distinction is not meaningful, the network is undirected. Directed edges represent an ordering of nodes, like a relationship extending from one node to another, where switching the direction would change the structure of the network. Undirected edges are simply links between nodes where order does not matter.
Tip: Examples
The World Wide Web is an example of a directed network because hyperlinks connect one Web page to another, but not necessarily the other way around.
Co-authorship networks are examples of undirected networks, where nodes are authors and two authors are connected by an edge if they have written a publication together.
When people send e-mail to each other, the distinction between the sender (source) and the recipient (target) is clearly meaningful, therefore the network is directed.
Connected and Disconnected graphs: If there is some path from any node to any other node, the network is said to be Connected. Otherwise, it is Disconnected.
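These properties can be checked in code. The sketch below builds the same hypothetical five-node edge list once as a directed and once as an undirected graph, and then asks igraph (which tidygraph builds on) whether the result is directed and connected.

```r
library(tibble)
library(tidygraph)

# Hypothetical toy data: a-b-c form one chain, d-e form a separate pair
nodes <- tibble(name = c("a", "b", "c", "d", "e"))
edges <- tibble(from = c(1L, 2L, 4L), to = c(2L, 3L, 5L))

g_dir   <- tbl_graph(nodes, edges, directed = TRUE)
g_undir <- tbl_graph(nodes, edges, directed = FALSE)

igraph::is_directed(g_dir)        # TRUE
igraph::is_directed(g_undir)      # FALSE

# No path links {a, b, c} to {d, e}, so the graph is disconnected
igraph::is_connected(g_undir)     # FALSE
igraph::count_components(g_undir) # 2
```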
7 Predict/Run/Infer-1
7.1 Using tidygraph and ggraph
tidygraph and ggraph are modern R packages for network data. Graph Data setup and manipulation is done in tidygraph and graph visualization with ggraph.
tidygraph Data -> “Network Object” in R.
ggraph Network Object -> Plots using a chosen layout/algo.
Both leverage the power of igraph, which is the Big Daddy of all network packages. We will be using the Grey’s Anatomy dataset in our first foray into networks.
7.2 Step 1. Read the data
Download these two datasets into the data folder of your current project.
Look at the output thumbnails. What attributes (i.e. extra information) are seen for Nodes and Edges?
7.3 Step 2. Create a network object using tidygraph
Key function:
tbl_graph(): (aka “tibble graph”). Key arguments: nodes, edges and directed. Note this is a very versatile command and can take many input forms, such as data structures that result from other packages. Type ?tbl_graph in the Console and see the Usage section.
# A tbl_graph: 54 nodes and 57 edges
#
# An undirected simple graph with 4 components
#
# Node Data: 54 × 7 (active)
name sex race birthyear position season sign
<chr> <chr> <chr> <dbl> <chr> <dbl> <chr>
1 Addison Montgomery F White 1967 Attending 1 Libra
2 Adele Webber F Black 1949 Non-Staff 2 Leo
3 Teddy Altman F White 1969 Attending 6 Pisces
4 Amelia Shepherd F White 1981 Attending 7 Libra
5 Arizona Robbins F White 1976 Attending 5 Leo
6 Rebecca Pope F White 1975 Non-Staff 3 Gemini
7 Jackson Avery M Black 1981 Resident 6 Leo
8 Miranda Bailey F Black 1969 Attending 1 Virgo
9 Ben Warren M Black 1972 Other 6 Aquarius
10 Henry Burton M White 1972 Non-Staff 7 Cancer
# ℹ 44 more rows
#
# Edge Data: 57 × 4
from to weight type
<int> <int> <dbl> <chr>
1 5 47 2 friends
2 21 47 4 benefits
3 5 46 1 friends
# ℹ 54 more rows
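A printout like the one above comes from a call to `tbl_graph()`. The following is a minimal sketch on made-up rows (the names `toy_nodes`, `toy_edges`, `toy_ga` and all their values are hypothetical, not the course data). It assumes the edge list identifies nodes by name, in which case tidygraph matches them against the `name` column of the node table and stores row indices instead.

```r
library(tibble)
library(dplyr)
library(tidygraph)

# Hypothetical node and edge tables, shaped like the printout above
toy_nodes <- tibble(
  name     = c("Ann", "Bob", "Cy"),
  position = c("Attending", "Resident", "Intern")
)
toy_edges <- tibble(
  from   = c("Ann", "Bob"),
  to     = c("Bob", "Cy"),
  weight = c(2, 4),
  type   = c("friends", "benefits")
)

# Build the network object; names in from/to are matched to toy_nodes$name
toy_ga <- tbl_graph(nodes = toy_nodes, edges = toy_edges, directed = FALSE)

# Pull the two tables back out to inspect them
toy_ga %>% activate(nodes) %>% as_tibble()
toy_ga %>% activate(edges) %>% as_tibble()
```

Note that in the edge table the `from` and `to` columns are now integers that point at rows of the node table.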
Note: Questions and Inferences #2
What information does the graph object contain? What attributes do the nodes have? What about the edges?
7.4 Step 3. Plot using ggraph
3a. Quick Plot: autograph(). This is a quick check that the data has been imported properly, before deciding to go on to more elaborate plotting.
Describe this graph, in simple words here. Try to use some of the new domain words we have just acquired: nodes/edges, connected/disconnected, directed/undirected.
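As a minimal sketch of the quick-plot step, the code below calls `autograph()` on a small stand-in graph. The five-node ring is hypothetical; in the lesson you would pass your own network object instead.

```r
library(tidygraph)
library(ggraph)

# Hypothetical stand-in for the course graph
toy <- create_ring(5)

# autograph() picks a layout and default geoms for a first look
quick_plot <- autograph(toy)
quick_plot
```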
3b. More elaborate plot
Key functions:
ggraph(layout = "......"): Create classic node-edge diagrams; i.e. Sets up the graph. Rather like ggplot for networks!
Two kinds of geom: one set for nodes, and another for edges
geom_node_point(aes(.....)): Draws node as “points”. Alternatives are circle / arc_bar / tile / voronoi. Remember the geoms that we have seen before in Grammar of Graphics!
geom_edge_link0(aes(.....)): Draws edges as “links”. Alternatives are arc / bend / elbow / hive / loop / parallel / diagonal / point / span /tile.
geom_node_text(aes(label = ......), repel = TRUE): Adds text labels (non-overlapping). Alternatives are label /...
labs(title = "....", subtitle = "....", caption = "...."): Change main titles, axis labels and legend titles. We know this from our work with ggplot.
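The key functions listed above can be combined as in the following sketch. The four-node graph and its names are hypothetical; the point is only to show one edge geom, one node geom, repelled text labels and `labs()` stacked on a `ggraph()` call.

```r
library(tibble)
library(tidygraph)
library(ggraph) # also attaches ggplot2, which provides aes() and labs()

# Hypothetical toy graph
nodes <- tibble(name = c("Ann", "Bob", "Cy", "Dee"))
edges <- tibble(from = c(1L, 1L, 2L), to = c(2L, 3L, 4L))
toy   <- tbl_graph(nodes, edges, directed = FALSE)

labelled_plot <- ggraph(toy, layout = "kk") +        # set up the graph and layout
  geom_edge_link0(colour = "grey60") +               # edges as straight links
  geom_node_point(size = 4) +                        # nodes as points
  geom_node_text(aes(label = name), repel = TRUE) +  # non-overlapping labels
  labs(title = "A labelled toy network", caption = "Made-up data")
labelled_plot
```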
Show the Code
set_graph_style(family = "Roboto Condensed", size = 16)

# Write Comments next to each line
# About what that line does for the overall graph
ggraph(graph = ga, layout = "kk") + #
  geom_edge_link0(width = 2, color = "pink") + #
  geom_node_point(
    shape = 21, size = 8,
    fill = "blue", color = "green", stroke = 2
  ) +
  labs(
    title = "Whoo Hoo! My First Silly Grey's Anatomy graph in R!",
    subtitle = "Why did I ever get in this course...",
    caption = "Bro, they are doing cool things in the other classes...\n And the show is even more cool!"
  )
Note: Questions and Inferences #4
What parameters have been changed here, compared to the earlier graph? Where do you see these changes in the code above?
Let us Play with this graph and see if we can make some small changes. Colour? Fill? Width? Size? Stroke? Labs? Of course!
Show the Code
set_graph_style(family = "Roboto Condensed", size = 16)

# Change the parameters in each of the commands here to new ones
# Use fixed values for colours or sizes...etc.
ggraph(graph = ga, layout = "kk") +
  geom_edge_link0(width = 2) +
  geom_node_point(
    shape = 21, size = 4,
    fill = "moccasin", color = "firebrick", stroke = 2
  ) +
  labs(
    title = "Whoo Hoo! My next silly Grey's Anatomy graph in R!",
    subtitle = "Why did I ever get in this course...",
    caption = "Bro, they are doing cool things in the other classes..."
  )
Note: Questions and Inferences #5
What did the shape parameter achieve? What are the possibilities with shape? How about including alpha?
3c. Aesthetic Mapping from Node and Edge attribute columns
Up to now, we have assigned specific numbers to geometric aesthetics such as shape and size. Now we are ready (maybe?) to change the meaning and significance of the entire graph and each element within it, and to use aesthetic / metaphoric mappings to achieve new meanings or insights. Let us try using aes() inside each geom to map a variable to a geometric aspect.
Don’t try to use more than 2 aesthetic mappings simultaneously!!
The node elements we can tweak are:
Types of Nodes: geom_node_****()
Node Parameters: inside geom_node_****(aes(...............))
-aes(alpha = node-variable) : opacity; a value between 0 and 1
-aes(shape = node-variable) : node shape
-aes(colour = node-variable) : node colour
-aes(fill = node-variable) : fill colour for node
-aes(size = node-variable) : size of node
The edge elements we can tweak are:
Types of Edges: geom_edge_****()
Edge Parameters: inside geom_edge_****(aes(...............))
-aes(colour = edge-variable) : colour of the edge
-aes(width = edge-variable) : width of the edge
-aes(label = some_variable) : labels for the edge
Type ?geom_node_point and ?geom_edge_link in your Console for more information.
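Here is a small sketch of aesthetic mapping, staying within the two-mapping limit suggested above: edge width is mapped to an edge column and node colour to a node column. The graph, the `sex` values and the `weight` values are all hypothetical.

```r
library(tibble)
library(tidygraph)
library(ggraph)

# Hypothetical toy graph with one node attribute and one edge attribute
nodes <- tibble(
  name = c("Ann", "Bob", "Cy", "Dee"),
  sex  = c("F", "M", "M", "F")
)
edges <- tibble(
  from   = c(1L, 1L, 2L),
  to     = c(2L, 3L, 4L),
  weight = c(1, 3, 2)
)
toy <- tbl_graph(nodes, edges, directed = FALSE)

mapped_plot <- ggraph(toy, layout = "kk") +
  geom_edge_link(aes(width = weight), colour = "grey60") + # edge column -> width
  geom_node_point(aes(colour = sex), size = 5) +           # node column -> colour
  scale_edge_width(range = c(0.5, 2))                      # keep edges readable
mapped_plot
```

Values placed inside `aes()` are read from the node or edge table; values placed outside it (such as `size = 5`) are fixed for every element.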
Describe some of the changes here. What types of edges worked? Which variables were you able to use for nodes and edges and how? What did not work with either of the two?
How does this graph look “metaphorically” different? Do you see a difference in the relationships between people here? Why?
8 Hierarchical layouts
These provide for some alternative metaphorical views of networks. Note that not all layouts are possible for all datasets!!
Show the Code
set_graph_style(family = "Roboto Condensed", size = 16)

# This dataset contains the graph that describes the class
# hierarchy for the Flare visualization library.
# Type ?flare in your Console
head(flare$vertices)
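As a sketch of one hierarchical layout, the `flare` data that ships with ggraph can be turned into a graph object and drawn as a dendrogram. The choice of the "dendrogram" layout and of `geom_edge_diagonal()` here is just one option among several, not necessarily the one used in the lesson.

```r
library(tidygraph)
library(ggraph)

# flare is a list with `vertices` and `edges` data frames;
# the edges refer to vertices by name
flare_graph <- tbl_graph(nodes = flare$vertices, edges = flare$edges)

# A tree-like layout only works because this graph is a hierarchy
flare_plot <- ggraph(flare_graph, layout = "dendrogram") +
  geom_edge_diagonal()
flare_plot
```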
Does splitting up the main graph into sub-networks give you more insight? Describe some of these.
10 Network analysis with tidygraph
The data frame graph representation can be easily augmented with metrics or statistics computed on the graph. Remember how we computed counts with the penguin dataset in Grammar of Graphics.
Before computing a metric on nodes or edges, use the activate() function to activate either the node or the edge data frame. Then use dplyr verbs (filter, arrange, mutate) to carry out your computation on the active data frame.
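The activate-then-compute pattern can be sketched as follows. The graph is a hypothetical four-node example, and the column name `n_ties` is made up for illustration.

```r
library(tibble)
library(dplyr)
library(tidygraph)

# Hypothetical toy graph with weighted edges
toy <- tbl_graph(
  nodes = tibble(name = c("a", "b", "c", "d")),
  edges = tibble(
    from   = c(1L, 1L, 2L, 3L),
    to     = c(2L, 3L, 3L, 4L),
    weight = c(1, 5, 2, 4)
  ),
  directed = FALSE
)

strong_ties <- toy %>%
  activate(edges) %>%                    # work on the edge table
  filter(weight >= 2) %>%                # drop the weakest edge (a-b)
  activate(nodes) %>%                    # switch to the node table
  mutate(n_ties = centrality_degree()) %>% # count remaining edges per node
  arrange(desc(n_ties))                  # best-connected node first
strong_ties
```

After the filter, node c keeps three edges and every other node keeps one, so c ends up at the top of the node table.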
10.1 Network Centrality: Go-To and Go-Through People!
Centrality is an “ill-defined” metric of node and edge importance in a network. It is therefore calculated in many ways. Type ?centrality in your Console.
Let’s add a few columns to the nodes and edges based on network centrality measures:
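A minimal sketch of adding centrality columns, using a hypothetical star-shaped graph so the results are easy to check by hand. The particular measures chosen here (degree, betweenness, edge betweenness) are an assumption; tidygraph offers many other `centrality_*()` functions.

```r
library(tibble)
library(dplyr)
library(tidygraph)

# Hypothetical star: node "a" is linked to b, c, d and e
star <- tbl_graph(
  nodes = tibble(name = c("a", "b", "c", "d", "e")),
  edges = tibble(from = rep(1L, 4), to = 2:5),
  directed = FALSE
) %>%
  activate(nodes) %>%
  mutate(
    degree      = centrality_degree(),      # "go-to": number of direct ties
    betweenness = centrality_betweenness()  # "go-through": shortest paths via node
  ) %>%
  activate(edges) %>%
  mutate(edge_betweenness = centrality_edge_betweenness())

star %>% activate(nodes) %>% as_tibble()
star %>% activate(edges) %>% as_tibble()
```

In this star, "a" has degree 4 and lies on all 6 shortest paths between pairs of outer nodes, while the outer nodes have betweenness 0.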
How do the Centrality Measures show up in the graph? Would you “agree” with the way we have done it? Try to modify the aesthetics by copy-pasting this chunk below and see how you can make an alternative representation.
10.2 Analysis and Visualizing Network Communities
Who is close to whom? Which are the groups you can see?
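Community detection can be sketched with one of tidygraph's `group_*()` functions. The example below uses `group_louvain()` as one possible choice (an assumption, not necessarily the method used in the lesson) on a hypothetical graph of two triangles joined by a single bridge.

```r
library(tibble)
library(dplyr)
library(tidygraph)

# Hypothetical graph: triangle 1-2-3, triangle 4-5-6, and a bridge 3-4
two_groups <- tbl_graph(
  nodes = tibble(name = c("n1", "n2", "n3", "n4", "n5", "n6")),
  edges = tibble(
    from = c(1L, 1L, 2L, 4L, 4L, 5L, 3L),
    to   = c(2L, 3L, 3L, 5L, 6L, 6L, 4L)
  ),
  directed = FALSE
) %>%
  activate(nodes) %>%
  mutate(community = as.factor(group_louvain())) # community label per node

two_groups %>% activate(nodes) %>% as_tibble()
```

The resulting `community` column can then be mapped to an aesthetic such as node colour or fill.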
Is the Community depiction clear? How would you do it, with which aesthetic? Copy Paste this chunk below and try.
10.3 Interactive Graphs with visNetwork
Exploring the visNetwork package. Make graphs wiggle and shake using tidy commands! The package implements interactivity using the physical metaphor of weights and springs we discussed earlier.
The visNetwork() function uses a nodes list and an edges list to create an interactive graph. The nodes list must include an “id” column, and the edge list must have “from” and “to” columns. The function also plots labels for the nodes, using the names in the “label” column of the node list.
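The column requirements described above can be seen in this minimal sketch. The three nodes and two edges are hypothetical; the optional `group` and `value` columns are used by visNetwork to colour nodes and scale edge widths.

```r
library(visNetwork)

# Hypothetical data: `id` and `label` for nodes, `from` and `to` for edges
vis_nodes <- data.frame(
  id    = 1:3,
  label = c("Ann", "Bob", "Cy"),
  group = c("Female", "Male", "Male")
)
vis_edges <- data.frame(
  from  = c(1, 2),
  to    = c(2, 3),
  value = c(2, 4)
)

widget <- visNetwork(vis_nodes, vis_edges) |>
  visOptions(highlightNearest = TRUE, nodesIdSelection = TRUE) |> # click and dropdown controls
  visLayout(randomSeed = 123)                                     # reproducible starting layout
widget
```

Printing `widget` in an interactive session or a rendered Quarto document shows the draggable network.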
Show the Code
library(visNetwork)

# Prepare the data for plotting by visNetwork
grey_nodes
grey_edges

# Relabel greys anatomy nodes and edges for VisNetwork
grey_nodes_vis <- grey_nodes %>%
  rowid_to_column(var = "id") %>%
  rename("label" = name) %>%
  mutate(sex = case_when(sex == "F" ~ "Female", sex == "M" ~ "Male")) %>%
  replace_na(., list(sex = "Transgender?")) %>%
  rename("group" = sex)
grey_nodes_vis

grey_edges_vis <- grey_edges %>%
  select(from, to) %>%
  left_join(., grey_nodes_vis, by = c("from" = "label")) %>%
  left_join(., grey_nodes_vis, by = c("to" = "label")) %>%
  select("from" = id.x, "to" = id.y)
grey_edges_vis
Some idea of interactivity and controls with visNetwork:
Show the Code
# let's look again at the data
starwars_nodes <- read_csv("files/data/star-wars-network-nodes.csv")
starwars_edges <- read_csv("files/data/star-wars-network-edges.csv")
Show the Code
# We need to rename starwars nodes dataframe and edge dataframe columns for visNetwork
starwars_nodes_vis <- starwars_nodes %>%
  rename("label" = name)

# Convert from and to columns to **node ids**
starwars_edges_vis <- starwars_edges %>%
  # Matching Source <- Source Node id ("id.x")
  left_join(., starwars_nodes_vis, by = c("source" = "label")) %>%
  # Matching Target <- Target Node id ("id.y")
  left_join(., starwars_nodes_vis, by = c("target" = "label")) %>%
  # Select "id.x" and "id.y" ONLY
  # Rename them as "from" and "to"
  # keep "weight" column for aesthetics of edges
  select("from" = id.x, "to" = id.y, "value" = weight)

# Check everything once
starwars_nodes_vis
starwars_edges_vis
11.1 Make-2: Literary Network with TV Show / Book / Story / Play
You need to create a Network Graph for your favourite book, play, TV serial or show. (E.g. Friends, BBT, or LB or HIMYM, B99, TGP, JTV… or Hamlet, Little Women, Pride and Prejudice, or LoTR)
In the nodes Excel file, use id and names as your columns. Put any other details in further columns to the right.
In your edges Excel file, use from and to as your first columns.
Entries in these columns can be names or ids, but be consistent and don’t mix them.
Step 3. Decide on 3 answers that you want to seek, and plan the graphs you will make for them.
Step 4. Create graph objects. Say 3 visualizations.
Step 5. Write comments/answers in the code and narrative text. Add pictures from the web using Markdown syntax.
Step 6. Write a Reflection (ok, a short one!) inside your Quarto document. Make sure it renders!
Step 7. Group Submission: Submit the render-able .qmd file AND the data. Quarto Markdown with joint authorship. Each person submits on their Assignments. All get the same grade on this one.
Ask me for clarifications on what to do after you have read the Instructions in your group.