Math is hard!

, ok, I know, but the fact is that many real world situations can be modeled by mathematical objects. In today's world, many technologies rely upon some sort of network to work; For instance, spatial location relies partially on maps and routing packets on the internet relies on the connection between the routers themselves, and between routers and devices. Graphs are the perfect (combinatorial) objects to model networks in these and many other cases.

Informally, a graph is composed by vertices, which are the points in the graph, and edges, which are the lines or arrows connecting these points. Additionally, each edge may have a weight. Here is a figure representing an weighted undirected graph (edges are lines) to the left and an unweighted directed graph (edges are arrows) to the right.

It's not hard to see the many applications that a graph can model. In fact, the essence of many real world problems are nothing more than graph problems. Some examples are:

Scheduling tasks, allocating register antenna interference: Graph Coloring Problem;

Infection spread, social influence: Irreversible k-threshold processes on graphs;

Vehicle, fluid and people flow: Flow on Graphs;

Routing in mobile and wireless networks: Dominating Set Problem;

Drilling of printed circuit boards, vehicle routing: Travelling Salesman Problem.

But first things first, formally, an unweighted directed graph `G`

is a pair `(V,E)`

where `V`

is the vertex set and `E`

⊆ `V`

× `V`

, a set of pairs of `V`

, is the edge set. A weighted directed graph `G`

is a triple `(V,E,w)`

similar to the unweighted directed graph only having the weight function `w : E`

→ `R`

that assigns a number to each edge. The undirected versions of these graphs are basically the same, except that `E`

is a subset of the set composed by all the subsets with two elements of `V`

. Thus, in an undirected graph, the edges `{u,v}`

and `{v,u}`

are the same, but, in a directed graph, the edges `(u,v)`

and `(v,u)`

are not.

One particularly very relevant example of graph problem, both from a theoretical and practical point of view, is the **Single Source Shortest Path** problem. The **Single Source Shortest Path** problem is one of the most applicable problems involving graphs. This problem arise in many real world situations such as finding a shortest path between two locations in a map or the min-delay path in a network.

The **Single Source Shortest Path** problem consists in, given a graph `G`

and a vertex `s`

, finding the length of the shortest path between `s`

and all that other vertices of `G`

, where the length of a path is the sum of all its edges. Sometimes the problem asks for a shortest path, but, in this post, I'm going to worry only about finding the length of the shortest path.

If `G`

is a simple unweighted directed or undirected graph `G=(V,E)`

and `s`

∈ `V`

, the best known algorithm to compute the shortest path from a vertex `s`

to all other vertices is the breadth-first search (bfs), which is simple enough. The main idea of the bfs is that it traverses the graph beginning at `s`

and then, first, visits all vertices at distance one from `s`

, then visits all vertices at distance two of `s`

and so on. The algorithm is the following.

```
BFS(V,E,s)
# q is a queue and dist is an array that stores the shortest distances
for all vertices x in V do
dist[x] := infinite
dist[s] := 0
push s on q
while(size of q > 0)
x := extract first element of q
for every y such that (x,y) in E do
if(dist[y] = infinite)
dist[y] := dist[x]+1
push y on q
return dist
```

By the end of the algorithm, if `dist[x] = infinite`

, then there is no path between `s`

and `x`

on `G`

.

This algorithm is very efficient. In fact, for those familiar with asymptotic notation, its complexity is O(n+m), where `n`

is the size of `V`

and `m`

is the size of `E`

, which is linear in the size of the graph.

There is an animation of the BFS applied to an unweighted undirected graph below. The numbers inside the vertices are the order in which each vertex is visited (unqueued) and the numbers outside are their distance to the vertex on top.

On the other hand, if `G`

is a simple **weighted** graph `G=(V,E,w)`

, breadth-first search just won't do. Instead, we will use Dijkstra's algorithm, which is one of the most popular algorithms to compute the shortest paths from a single source `s`

. Dijkstra's algorithm is somewhat similar to the bfs. One drawback of Dijkstra's algorithm is that it only works if the graph has no negative weights. The algorithm is the following.

```
Dijkstra(V,E,w,s)
# T is a set and dist is an array that stores the shortest distances
insert s in T
dist[s] := 0
for all vertices w in V but s do
dist[w] := infinite
insert w in T
while(size of T > 0)
x := value in T with minimum dist[x]
remove x from T
for every y such that (x,y) in E do
if(dist[y] > dist[x] + w(x,y))
dist[y] := dist[x] + w(x,y)
return dist
```

Again, by the end of the algorithm, if `dist[x] = infinite`

, then there is no path from `s`

to `x`

on `G`

. Here is an animation of Dijsktra's algorithm applied to an weighted directed graph.

Green vertices are vertices currently being visited, i.e., the vertex `x`

inside the loop while, and red vertices are vertices already visited. The distance of all red vertices will not be changed anymore during the rest of the algorithm. The yellow vertices are vertices yet to be visited. The red edges at the end are the minimum paths from the source.

An efficient way to implement the set T is using a min-heap since we only want to extract the minimum of the set. Thus, a more concrete algorithm would be the following.

```
Dijkstra(V,E,w,s)
# h is a min-heap and dist is an array that stores the shortest distances
h.insert(s,0) # inserts element v with key 0
dist[s] := 0
for all vertices w in V but s do
h.insert(w,infinite)
dist[w] := infinite
while(size of h > 0)
(x,d) := h.extract_minimum() # extracts pair (x,d) with minimum key d
for every y such that (x,y) in E do
if(dist[y] > d + w(x,y))
dist[y] := d + w(x,y)
h.decrease_key(y,dist[y]) # decreases key of element y to dist[y]
return dist
```

One of the best heaps to use in Dijkstra's algorithm is the Fibonacci Heap, which performs very well in practice. However, the best known (theoretically speaking) heap to use in Dijkstra's algorithm is Thorup's fibonacci heap, which you can find in this paper by Thorup. In a graph `G = (V,E,w)`

, while Dijkstra's algorithm achieves a complexity of O(m + n log n) with fibonacci heap, with Thorup's fibonacci heap, it achieves a complexity of O(m + n log log n), where `n`

is the size of `V`

and `m`

is the size of `E`

.

It's worth mentioning that it's not hard to extend the algorithms above to compute a shortest path. I'll leave that as an exercise to the reader :). Also, there are several other algorithms for special cases of the Single Source Shortest Path problem, for instance, when all weights are positive integers, and many other algorithms for the **All Pairs Shortest Path** problem, which asks for the shortest distances between every pair of vertices of the graph, like the Floydâ€“Warshall algorithm.

In conclusion, graph theory and its algorithms are very relevant nowadays since graphs can model a vast number of real world applications in many disciplines and situations. Thus, every developer that aims to improve his algorithmic skills should, at some point, take his time to really learn it.

I'm a Ph.D. in algorithms and complexity with experience in software development. Just recently, I realized just how much theoretical knowledge can aid the everyday developer and, since then, I'm keen to show people how computing theory can help them improve their skills.