# Introduction to Graphs

Graphs are basically an abstract data type. They are an organisation of data points and relation between them. In more mathematical sense, graphs are a set of nodes/vertices connected by edges that mey be directed or undirected.

# Introduction

Let us first dive into the technicalities of graphs before getting into the Algorithms and Applications of graphs. So, we get ourselves acquainted with the terminologies and representation techniques of graphs in this section.

## Terminology

It is important to first get acquainted with teh various terminologies associated with graphs before actually diving into the algorithms and concepts associated with graphs.

• Refer to this article to get a hang of the various terminologies associated with graphs.
• Refer to this article to get familiar with the various definitions associated with graphs.

## Representation

There are several ways to represent graphs that are suitable under different scenarios. We take a look at each of them one by one -

• Pictorial Representation - Such a representation involves diagrams that shows nodes connected by edges. But such diagrams are not feasible when the graph consists of a lot of nodes and edges. This is just used as a visualisation technique for small graphs.
• Adjacency Matrix - This is a V*V matrix (V - number of vertices) that stores Boolean values 1 and 0 if there is an edge between two vertices and if there is no edge respectively. This can be modified depending on the use. For example, one can store the number of edges between two vertices or the weights associated with edges between two vertices, but in essence, it is basically, trying to maintain on O(1) lookup for adjacency between two vertices.
• Adjacency List - This is basically a structure consisting of V lists such that each of them stores the vertex number of all the adjacent vertices in the form of a list. This kind of representation is extremely useful in Graph Algorithms as we will see later.

• To get a brief introduction to representation of graphs, head here.
• Head here for a more in depth analysis of the representation techniques of graphs.
• For a complete introduction to various terms, definitions and representation techniques of graphs, this is an extremely useful tutorial.

We now start exploring the various algorithms beginning with traversal techniques on graphs in the next section.

# Graph Traversal

Let us say that we want to understand the connectivity in graphs and reach out to each node and explore the edges associated with it, then we would definitely want to organise things and traverse in a defined manner on the graph. There are two techniques for graph traversals that are extremely common in the world of graphs which we explore next.

## Depth First Search

It is often abbreviated as DFS and we use this henceforth. Depth First Search involves visiting vertices in the parent child grandchild order. This is pretty useful when we want to identify heirarchical properties of the tree and derive the parent-child relation of the nodes. It is widely used in practice majorly beacause it can be simulated using recursion as explained in the articles linked in this section.

• This tutorial gives insights into Breadth First Search Traversal.

We now discuss DFS specifically for Binary Trees where we talk about three kinds of DFS techniques -

• In Order Traversal (Left Root Right) - This involves DFS in the Left Subtree, root and then DFS in the Right Subtree.
• Post Order Traversal (Left Right Root) - This involves DFS in the Left Subtree, DFS in the Right Subtree and then the root.
• Pre Order Traversal (Root Left Right) - This involves root, DFS in the Left Subtree and then DFS in the Right Subtree.

This simulation is extremely useful in understanding the three types of order of traversals that we talked about.

It is often abbreviated as BFS which we use henceforth. Breadth First Search involves visiting vertices on one level first, followed by vertices on the next level. Basically, this can be thought of as visiting edges in order of their heights or distance from the root. Such a traversal is useful when we want to evaluate distance from root or generation level of a vertex from the root. It is an extremely useful search technique. A slight modification in BFS can give rise to several of the algorithms including Prim's Algorithm and Dijekstra as we will see later.

• Refer to this tutorial for insights into Breadth First Search Traversal.

# Shortest Path Algorithms

Let us frst try to think about the problem rather than directly jumping into the algorithm. Say, given any two nodes, we want to find out the shortest path between these two nodes. A path distance is evaluated as the sum of all edge weights that were encountered on the way between the two nodes. Now, on an unweighted graph, this was easy in the sense that a simple DFS/BFS wo give us the distance but not on a weighted graph.

## Dijkstra's Algorithm

Basically, this is a modification of the Breadth First Search Algorithm. Instead of visiting nodes in a level wise manner, we act greedily on the distance from root. At any point of time, we select a node with the minimum distance from root and traverse one edge from this node and evaluate the distances. This kind of greedy work allows us to reach the destination node in the shortest possible time.

This is implemented using a priority queue that always allows us to select the least distant node in O(logn) time.

• This tutorial explains the implementation of the algorithm on Sparse Graphs.

# Directed Acyclic Graph

It is often abbreviated as DAG which we use henceforth. It is basically a graph with directed edges and no cycles. It is widely applicable to various sets of problems. One of the main reason is the topological ordering of the nodes in which every node any edge in the graph is from some node on the left to the right in the ordering of nodes.

## Topolgical Sort

This kind of ordering that we discussed above is sometimes formulated as Topological Sort. It is intutively true that such an ordering exists and is in fact easy to contruct recursively. Basically, any such ordering requires that the leftmost node has no incoming edge, so we pick up such a vertex from the graph and delete this node and the resulting graph is again a DAG. This means inductively, it is possible to generate such an ordering of the nodes. The overall complexity of this approach is O(n+m).

• A smart way to do this is a simple DFS traversal. It is a really clever technique for achieving this ordering. Read this to get into the implementational details of this kind of sort.

# Minimum Spanning Tree

It is often abbreviated as MST which is used henceforth. Let us try to understand the situtation with an example. Say over the years you build a computer Network with several computers directly connected with a certain length of wire. Two computers can communicate if they are directly/indirectly connected. Now, the network has become really complicated and you want to simplify by removing certain connections and still maintaining a network where each pair of computers can communicate with each other. So, this require us to build a tree with some wires as edges and computers as nodes with some of lengths as min as possible. This is precisely what Minimum Spanning Tree is. We discuss two really popular algorithms for obatining MST out of a graph.

## Prim's Algorithm

This is purely greedy approach that selects edges on a logic based on the Shortest Distance Algorithm that we saw earlier. Initially, we choose any node and find the smallest length edge and push that vertex in. Now, jumping to a general time, we have a set of vertices that have been visited and we want to find an edge that links a vertex from this set to a vertex not in this set with the min length. We do this by maintaining a set of edges with their lengths. This greedy approach works correctly in every scenario.

• Refer to this article to get into the details of the algorithm.
• This advanced tutorial covers a lot of different kinds of graphs and what form of the algorithm works best with it's implementation and analysis.

## Kruskal's Algorithm

This algorithm is based on Disjoint Set Union Data Structure that is really popular in the field of Computer Science. We first sort the edges in order of weights and initialise a disjoint set union with isolated nodes as trees. We iterate over the edges and at each step pick an edge and see if it merges two different nodes. If yes, we add this and merge the trees, otherwise, we ignore this edge and move to the next step. The proof of correctness of this algorithm is based on induction over the number of edges selected till any point in the algorithm.

• Refer to this article to get into the details of the algorithm.
• This advanced tutorial covers it in a lot of depth with the proof of correctness and implementation of the algorithm.

# Strongly Connected Components

We defined Connected Components in an undirected graph as a set of vertices that are reachable from one another traversing through the undirected edges. On a similar note, we define Strongly Connected Components as the set of vertices in a Directed graph such that each vertex is reachable from every other vertex through directed edges.

We talk about two really efficient algorithms for finding out strongly connected components in a Directed Graph.

## Kosaraju Algorithm

We build what is called the Condensation Graph out of the given Directed Graph and come up with some properties of these graphs such as Acyclicity.

A single DFS is run on the graph from every non-visited vertex and the ordering is pushed to a vector. This ordering gives us a relation between nodes (strongly connected components) in the condensation graph. We then use transposition of a directed graph and run DFS on that in order to extract the stongly connected components. The overall complexity of the algorithm is O(m+n) where n is the number of nodes and m is the number of edges in the graph.

# Network Flow

A network flow is a directed graph with a function assigned to it that assigns each edge a particluar value or capacity. A flow is essentially assigning values to each edge under some restrictions. A good analogy for this is consider the edges to be water pipes and values associated with the edges to be the capacity of the pipes.

A max Flow is essentially a flow with the max value. We talk about an algorithm to solve the task of obtaining max flow in a given network flow graph. This algorithm is called the Ford-Fulkerson Algorithm.

We define something called the Residual Capacity of the edges/pipes. We begin by assigning each edge a value of 0 and then at each step try tp find some path with values less than the residual capacity. This process is done until we fail to find a path with no possible further increase in the flow value.

• Follow this link to get into the details of the algorithm.
• Refer to this tutorial for implementation details of the algorithm.

# Special Graphs

We talk about a few graphs that are important in fields like Biotechnology, Radio Electronics, Digital Computers and various other domains.

## Eulerian Graphs

A graph is said to be a Eulerian Graph if there exists some path in the graph that goes through each of the edge exactly once. It is easy to prove that a graph is Eulerian if and only if every vetrex has and even degree and the graph is connected.

## Hamiltonian Graphs

A graph is said to be a Hamiltonian Graph if there exists some path that goes through each vertex exactly once. It is a NP hard problem to find a necessary and suffcient condition for a graph to be a Hamiltonian Graph.