Representing Graphs

Let's spend some time now talking about how to implement these graphs and graph algorithms in code.

We will discuss our choice of API, and also the underlying data structures used to represent the graph. Our decisions can have profound implications on our runtime, memory usage, and difficulty of implementing various graph algorithms.

Graph API

An API (Application Programming Interface) is a list of methods available to a user of our class, including the method signatures (what arguments/parameters each function accepts) and information regarding their behaviors. You have already seen APIs from the Java developers for the classes they provide, such as the Deque.

For our Graph API, let's use the common convention of assigning each unique node to an integer number. This can be done by maintaining a map which can tell us the integer assigned to each original node label. Doing so allows us to define our API to work with integers specifically, rather than introducing the need for generic types.

We can then define our API to look something like this perhaps:

public class Graph {
  public Graph(int V):               // Create empty graph with v vertices
  public void addEdge(int v, int w): // add an edge v-w
  Iterable<Integer> adj(int v):      // vertices adjacent to v
  int V():                           // number of vertices
  int E():                           // number of edges
...

Clients (people who wish to use our Graph data structure), can then use any of the functions we provide to implement their own algorithms. The methods we provide can have a significant impact on how easy/difficult it may be for our clients to implement particular algorithms.

Graph Representations

Next, we'll talk about the underlying data structures that can be used to represent our graph.

Adjacency Matrix

One way we can do this is by using a 2D array. There is an edge connecting vertex s to t iff that corresponding cell is 1 (which represents true). Notice that if the graph is undirected, the adjacency matrix will be symmetric across its diagonal (from the top left to the bottom right corners).

Edge Sets

Another way is to store a single set of all the edges.

Adjacency Lists

A third way is to maintain an array of lists, indexed by vertex number. Iff there is an edge from s to t, the list at array index s will contain t.

In practice, adjacency lists are most common since graphs tend to be sparse (there are not many edges in each bucket).

Efficiency

Your choice of underlying data structure can impact the runtime and memory usage of your graph. This table from the slides summarizes the efficiencies of each representation for various operations. Do not copy this on to your cheatsheet without taking the time to first understand where these bounds come from. The lecture contained walkthroughs explaining the rationale behind several of these cells.

Further, DFS/BFS on a graph backed by adjacency lists runs in $O(V+E)$ , while on a graph backed by an adjacency matrix runs in $O(V^2)$ . See the slides for help in understanding why.

18.2 Representing Graphs