TD11 - Implementations of Dijkstra's algorithm

0. Preliminaries

In [1]:
import numpy as np
import sys
In [169]:
class G:
    def __init__(self,value=None,index=None,children_nodes=None,weights=None):
        self.value = value
        self.index = index
        self.children = []
        # guard against the default None arguments before zipping
        for child,weight in zip(children_nodes or [],weights or []):
            self.children.append([child,weight])
        
    def print_graph(self,depth=0,node_list=None):
        # 'node_list' records the nodes already printed, so that cycles
        # in the graph do not cause infinite recursion (a mutable default
        # argument would be shared across calls, hence the None default)
        if node_list is None:
            node_list = []
        if self in set(node_list):
            print("|","---" * depth,">",self.value)
            return
        else:
            print("|","---" * depth,">",self.value)
            node_list.append(self)
            for child,weight in self.children:
                if child is not None:
                    child.print_graph(depth+1,node_list)
In [224]:
def dijkstra(nodes,start_node):
    A=[]                # settled nodes
    notA=[start_node]   # candidate nodes with a finite tentative distance
    distances=np.zeros(len(nodes))
    predecessors = [None] * len(nodes)

    # every node but the start is initially "at infinity"
    for v in [v for v in nodes if v != start_node]:
        distances[v.index]=sys.maxsize//2

    while notA != []:
        # extract the candidate with the smallest tentative distance
        u = min(notA,key=lambda v:distances[v.index])

        A.append(u)
        notA.remove(u)

        # relax every edge (u,v) with v not yet settled
        for v in [v for v in nodes if v not in A]:
            for c,w in v.children:
                if (u==c and (distances[v.index]>distances[u.index]+w)):
                    if distances[v.index]==sys.maxsize//2:
                        notA.append(v)
                    distances[v.index]=distances[u.index]+w
                    predecessors[v.index]=u

    return distances

The complexity of Dijkstra's algorithm for a graph $\mathcal G=(\mathcal V,\mathcal E)$ can be evaluated as:

  • the initialization step costs $O(|\mathcal V|)$
  • the while loop runs at most $O(|\mathcal V|)$ times, since vertices in $\bar A$ (notA) are successively removed and can never reappear.
    • within each loop, a search for the lowest-distance node is performed, which by default also costs $O(|\mathcal V|)$ in the worst case, but this can be improved: say the resulting cost is $O(x)$; the total cost over the while loops is then $O(|\mathcal V|x)$
    • again within each while loop, we conditionally update the distances of the vertices not in $A$ (A), of which there are also $O(|\mathcal V|)$ per iteration. However, only "neighboring nodes" are ever updated, so the total number of iterations of this part of the while loop, summed over all while loops, is rather $O(|\mathcal E|)$ (which in sparse graphs is much less than $O(|\mathcal V|^2)$). Assuming that the actual updating operation has cost $O(y)$, the maximum cost for this part of the loop is then $O(|\mathcal E|y)$.

In total, the complexity is $$O(|\mathcal V|x+|\mathcal E|y).$$

In [225]:
### Random graph generator

graph_size = 10
list_values = np.random.randint(0,100,graph_size)

# Affinity matrix (undirected) of the graph with 'p' probability of connection
# and integer weights from 0 to 'max_weight'
p = .5
max_weight = 100
A = np.random.binomial(1,p,(graph_size,graph_size))*np.random.randint(0,max_weight+1,(graph_size,graph_size))
A = np.triu(A,1)+np.triu(A,1).T

nodes = []

print("List of values in the graph:",list_values)

for i in range(len(list_values)):
    nodes.append(G(value=list_values[i],index=i,children_nodes=[],weights=[]))

for i in range(len(nodes)):
    for j in range(len(nodes)):
        if A[i,j]>0:
            nodes[i].children.append([nodes[j],A[i,j]])
            
print("\nAdjacency matrix:\n",A)
        
node_index = 0
print("\nTree representation of the graph starting from node[",node_index,"]:\n")   
nodes[node_index].print_graph(node_list=[])

print("\nOutput of Dijkstra's algorithm from node",node_index,":",dijkstra(nodes,nodes[node_index]))
List of values in the graph: [ 8 66 18 78 69  1 81 31 32 14]

Adjacency matrix:
 [[ 0  0 68  0 78  0 47  0 35  0]
 [ 0  0  0  0  0  0  0  0 50 22]
 [68  0  0  0 39  0  0 75 47 90]
 [ 0  0  0  0  0 76 15 46  0  0]
 [78  0 39  0  0 16  0 28  0  0]
 [ 0  0  0 76 16  0  0 67 50  0]
 [47  0  0 15  0  0  0 16  2  0]
 [ 0  0 75 46 28 67 16  0  4  0]
 [35 50 47  0  0 50  2  4  0 44]
 [ 0 22 90  0  0  0  0  0 44  0]]

Tree representation of the graph starting from node[ 0 ]:

|  > 8
| --- > 18
| ------ > 8
| ------ > 69
| --------- > 8
| --------- > 18
| --------- > 1
| ------------ > 78
| --------------- > 1
| --------------- > 81
| ------------------ > 8
| ------------------ > 78
| ------------------ > 31
| --------------------- > 18
| --------------------- > 78
| --------------------- > 69
| --------------------- > 1
| --------------------- > 81
| --------------------- > 32
| ------------------------ > 8
| ------------------------ > 66
| --------------------------- > 32
| --------------------------- > 14
| ------------------------------ > 66
| ------------------------------ > 18
| ------------------------------ > 32
| ------------------------ > 18
| ------------------------ > 1
| ------------------------ > 81
| ------------------------ > 31
| ------------------------ > 14
| ------------------ > 32
| --------------- > 31
| ------------ > 69
| ------------ > 31
| ------------ > 32
| --------- > 31
| ------ > 31
| ------ > 32
| ------ > 14
| --- > 69
| --- > 81
| --- > 32

Output of Dijkstra's algorithm from node 0 : [ 0. 85. 68. 52. 67. 83. 37. 39. 35. 79.]

1. Linked lists

As seen above, if the set notA is stored as a list, one has to scan through the complete list to find the smallest distance, inducing a cost $x=O(|\mathcal V|)$ in the worst case. As for the updating rule, it costs in general $y=O(1)$, and the resulting total cost is then: $$O(|\mathcal V|^2+|\mathcal E|)$$ which, since $|\mathcal E|\leq |\mathcal V|^2$, is simply $$O(|\mathcal V|^2).$$

2. Dial's implementation

Say u is the node added to A at iteration $i$. This means it has the smallest distance to the starting node start_node among the nodes in notA. During this same iteration, the distances of the nodes (be they in notA or not) are only updated by addition of a positive value to the distance of u. So at iteration $i+1$, the next node added to A must have a distance (to start_node) greater than or equal to that of u.

For integer distances, this remark can be used to improve Dijkstra's algorithm by creating a table c of buckets, one for each possible distance $0,1,\ldots,w_{\rm max}(|\mathcal V|-1)$ (where $w_{\rm max}$ is the maximum weight value), each bucket holding the nodes currently at that tentative distance, and looping over c in increasing distance order.

The resulting complexity a priori seems larger, because we need to loop over the whole of c, which is of large size. However, c is mostly empty and only $O(|\mathcal E|)$ distance updates ever occur. Together, the complexity may be evaluated as $O(|\mathcal V|w_{\rm max}+|\mathcal E|)$, keeping in mind that most of the $O(|\mathcal V|w_{\rm max})$ operations (skipping empty buckets) come at negligible cost. This implementation is at any rate particularly convenient when $w_{\rm max}$ is small, such as with binary weights.

In [232]:
def dijkstra_Dial(nodes,start_node,maxW):
    n=len(nodes)
    distances=np.zeros(n).astype(int)
    predecessors = [None] * n

    ## creation of the bucket table 'c': c[d] holds the nodes whose
    ## current tentative distance is d
    c=[]
    for _ in range(maxW*(n-1)+1):
        c.append([])
    c[0].append(start_node)

    # every node but the start is initially at the largest possible distance
    for v in [v for v in nodes if v != start_node]:
        distances[v.index]=maxW*(n-1)
        c[maxW*(n-1)].append(v)

    i=0
    while i<=(n-1)*maxW:
        if c[i]!=[]:
            # settle one node at distance i and relax its neighbors
            u = c[i][0]
            c[i].remove(u)
            du = i

            for v in nodes:
                for child,w in v.children:
                    if (u==child and distances[v.index]>du+w):
                        # move v to the bucket of its improved distance
                        c[distances[v.index]].remove(v)
                        distances[v.index]=du+w
                        c[distances[v.index]].append(v)
                        predecessors[v.index]=u
        else:
            i+=1

    return distances
In [233]:
print("\nOutput of Dijkstra-Dial's algorithm from node",node_index,":",dijkstra_Dial(nodes,nodes[node_index],max_weight))
Output of Dijkstra-Dial's algorithm from node 0 : [ 0 85 68 52 67 83 37 39 35 79]

Coming back to the original Dijkstra's method, we can show that, at any iteration of the while loop, $$ {\rm max}({\tt distances[v],~v}\in \bar A) - {\rm min}({\tt distances[v],~v}\in \bar A) \leq w_{\rm max}. $$

This follows by induction on the while loop. Assuming this holds at a given stage of the loop, two situations may occur that change the set of distances in $\bar A$: calling u the vertex selected as having minimal distance within $\bar A$,

  • for every v such that $({\tt u},{\tt v})\in \mathcal E$, if v already belongs to $\bar A$, its distance is either maintained (when ${\tt distances[v]}\leq {\tt distances[u]}+w_{uv}$) or reduced to ${\tt distances[u]}+w_{uv}$, which exceeds ${\tt distances[u]}$ (the minimum over $\bar A$) by at most $w_{\rm max}$, so the condition still holds;
  • for every v such that $({\tt u},{\tt v})\in \mathcal E$, if v did not belong to $\bar A$ (its distance was set to $+\infty$ thus far), then its distance becomes exactly ${\tt distances[u]}+w_{uv}$ which is at most $w_{\rm max}$ away from ${\tt distances[u]}$, again confirming the induction.
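
This invariant can be checked empirically by instrumenting the basic implementation with an assertion on the spread of the distances over notA (a sketch assuming the class G above; the function name dijkstra_check_spread is ours, not part of the original code):

```python
import numpy as np
import sys

def dijkstra_check_spread(nodes, start_node, maxW):
    # instrumented copy of the basic dijkstra: before each extraction we
    # assert that max - min of the tentative distances over notA never
    # exceeds maxW, i.e. the invariant proved above
    A = []
    notA = [start_node]
    distances = np.zeros(len(nodes))
    for v in nodes:
        if v != start_node:
            distances[v.index] = sys.maxsize // 2
    while notA:
        spread = [distances[v.index] for v in notA]
        assert max(spread) - min(spread) <= maxW, "spread invariant violated"
        u = min(notA, key=lambda v: distances[v.index])
        A.append(u)
        notA.remove(u)
        for v in nodes:
            if v in A:
                continue
            for c, w in v.children:
                if u == c and distances[v.index] > distances[u.index] + w:
                    if distances[v.index] == sys.maxsize // 2:
                        notA.append(v)
                    distances[v.index] = distances[u.index] + w
    return distances
```

Note that notA only ever contains nodes with a finite tentative distance, so the assertion is well defined.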

As a consequence of this remark, the array c introduced in the previous Dial approach may be reduced to an array of size $w_{\rm max}+1$, used circularly: there is no need to maintain the complete array of size $(|\mathcal V|-1)w_{\rm max}+1$, since at any time all the tentative distances in $\bar A$ fall within a window of width $w_{\rm max}$, and each slot can be reused once its distance has been processed.
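
This reduced version can be sketched as follows, indexing the buckets modulo $w_{\rm max}+1$ (assumes the class G above; dijkstra_Dial_circular is our own name, and unlike dijkstra_Dial it only inserts a node into a bucket once its distance becomes finite, discarding stale entries lazily):

```python
import numpy as np

def dijkstra_Dial_circular(nodes, start_node, maxW):
    # Dial's method with a circular bucket array of size maxW+1, relying
    # on the remark that all tentative distances in notA lie within a
    # window of width maxW at any time
    n = len(nodes)
    INF = maxW * (n - 1) + 1           # larger than any feasible distance
    distances = np.full(n, INF, dtype=int)
    distances[start_node.index] = 0
    c = [[] for _ in range(maxW + 1)]  # buckets indexed by d mod (maxW+1)
    c[0].append(start_node)
    done = [False] * n
    processed = 0
    d = 0
    while processed < n and d <= maxW * (n - 1):
        bucket = c[d % (maxW + 1)]
        while bucket:
            u = bucket.pop()
            # skip stale entries left over from earlier improvements
            if done[u.index] or distances[u.index] != d:
                continue
            done[u.index] = True
            processed += 1
            # relax the edges out of u (the graph here is undirected)
            for child, w in u.children:
                if distances[child.index] > d + w:
                    distances[child.index] = d + w
                    c[(d + w) % (maxW + 1)].append(child)
        d += 1
    return distances
```

Since every newly assigned distance is at most $w_{\rm max}$ above the distance currently being processed, two live entries can never collide in the same bucket, which is what makes the modulo indexing safe.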

3. Heap implementation

For a heap implementation based on the distances, it suffices to create a heap whose entries are the pairs (node,distances[node.index]), keyed by the distances. Then we replace the calls (see TP8 for a reference):

  • notA=[start_node] by a binary heap constructor notA=BH([start_node])
  • u = min(notA,key=lambda v:distances[v.index]) by a "min extraction" from the heap (notA.remove_minimum())
  • notA.append(v) by a heap insertion (notA.insert(v))
  • after the update distances[v.index]=distances[u.index]+w, not forgetting to also adapt the heap "priority" (notA.reduce_value(v,distances[v.index],distances[v.index]-(distances[u.index]+w)))

With a binary heap, both the minimum extraction and the priority update cost $O(\log |\mathcal V|)$, i.e. $x=y=O(\log |\mathcal V|)$, so the total complexity becomes $O((|\mathcal V|+|\mathcal E|)\log |\mathcal V|)$.
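
As the BH class from TP8 is not reproduced here, the same idea can be sketched with Python's standard heapq module. heapq has no reduce_value/decrease-key operation, so this variant pushes a fresh entry at every distance improvement and lazily discards stale ones on extraction (assumes the class G above; dijkstra_heap is our own name):

```python
import heapq
import sys
import numpy as np

def dijkstra_heap(nodes, start_node):
    # heap-based Dijkstra with lazy deletion instead of decrease-key
    n = len(nodes)
    distances = np.full(n, sys.maxsize // 2)
    distances[start_node.index] = 0
    done = [False] * n
    # heap entries: (distance, index, node); the index breaks ties so
    # that the G objects themselves are never compared by heapq
    heap = [(0, start_node.index, start_node)]
    while heap:
        du, _, u = heapq.heappop(heap)
        if done[u.index]:
            continue  # stale entry: a shorter path was found meanwhile
        done[u.index] = True
        for child, w in u.children:
            if distances[child.index] > du + w:
                distances[child.index] = du + w
                heapq.heappush(heap, (du + w, child.index, child))
    return distances
```

Each edge triggers at most one push, so the heap holds $O(|\mathcal E|)$ entries and the overall cost is $O((|\mathcal V|+|\mathcal E|)\log |\mathcal E|)$, matching the analysis above up to the lazy-deletion overhead.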
In [ ]: