NETWORKX LIBRARY WILL BE USED FOR SOME DEMONSTRATIONS ABOUT HOW MUCH FUN ONE COULD HAVE WITH BORING DATA STRUCTURES LIKE GRAPHS

This article will touch the topic of graphs in Python. I will be using Networkx library to make some demonstrations about how much fun one could have with boring data structures like graphs. So I must point out that there is another way of making graphs not boring aside from college exams which give you constant nightmares, whether you are sleeping or not (by the way, if you ever had a graphs nightmare please share it to the world via Dreamophone).

NetworkX is a Python language software package for the creation, manipulation, and study of the structure, dynamics, and function of complex networks

As they claim on the networkx website (http://networkx.github.io/documentation/networkx-1.9.1/overview.html), and they also claim this is a useful tool for a wide range of people like mathematicians, physicists, biologists, computer scientists, social scientists and whatever other word you want to put before "scientists".

However, having no attribute that looks well before the word 'scientist' I will tackle Networkx from the developer's point of view. And that is I will try to play around with it and see if I can come up with something useful or at least nice and fun. All the code that I will be using for this article can be found at https://github.com/mihaigociu/myrepo/blob/master/networkx_presentation.py. This is the same code that I used for my presentation for the RoPython event (http://ropython.org/2015/06/11/graphs-using-networkx-and-semantic-web-using-rdflib/).

It's important, if you want to get the best out of Networkx, to combine it with other tools from the scipy environment. And please do use the IPython Notebook (http://ipython.org/notebook.html) especially if you're planning a presentation. This way people attending it will at least easily see nice pictures of graphs and don't die of boredom while listening to your ranting.

Other useful tools are:

```
import matplotlib.pyplot as plt
import networkx as nx
import numpy as np
import requests
import pylab
%pylab inline
```

This last line makes it possible to show nice pictures of graphs directly into IPython Notebook which, as I said before, will prevent people from falling asleep during the presentation, which you may want to happen unless you're planning to rob them and run away.

So the first thing I did with Networkx was to grow a tree. You know, a binary tree. Here's how I did it:

```
btree = nx.balanced_tree(2,4)
pos=nx.graphviz_layout(btree,prog='dot')
nx.draw(btree,pos,with_labels=False,arrows=False)
```

So that's a nice looking perfectly balanced binary tree in three lines of code. I used the **balanced_tree** generator provided by Networkx, the **graphviz_layout** which is a wrapper over the **Graphviz** library and the **draw **method of Networkx which is also merely a wrapper over **Matplotlib**.

**Shortest paths**

But you're not impressed. So I will show a Shortest Path example, which you absolutely, 100% know what it is and how it works because otherwise you would have failed that first year exam on Graph Algorithms. Unless you actually did, but I will show you the example nonetheless.

But first, one thing I forgot to mention is this:

```
g_mat = np.random.binomial(1, 0.1, (10, 10))
G = nx.Graph(g_mat)
```

So I'm generating a 10x10 matrix using numpy's binomial distribution and use it as a graph matrix. Yes, networkx is well integrated with scipy and numpy and uses efficient data structures for algorithms that require intensive computation. Like this numpy sparse matrix that Networkx uses as the adjacency matrix for our binary tree:

`In [9]: nx.adjacency_matrix(btree)Out[9]: <31x31 sparse matrix of type '<type 'numpy.int64'>'with 60 stored elements in Compressed Sparse Row format>`

But let's go back to our Shortest Path example. So we have generated a random graph matrix using binomial distribution which basically means we have random edges in the graph. Now to make things more interesting we will add weights on the edges. The weights will be random integers in range 1 to 10.

```
weights = np.random.random_integers(low=1, high=10, size=len(G.edges()))for i, (n1, n2) in enumerate(G.edges()):
G[n1][n2]['weight'] = weights[i]
```

Now that we have our graph prepared, in orther to get a nice drawing of it we will first need to decide where and how do we draw the nodes, and by that I mean the positioning for each node in the 2D space. Now for those of you who believe drawing graphs is an easy thing to do I am sad to inform you how mistaken you are. Take a look at this article to scratch the surface of Graph drawing: https://en.wikipedia.org/wiki/Graph_drawing. But fortunately, as you saw in the case of our binary tree, we have wonderful tools like Graphviz to come to our rescue. So we will use that.

`pos=nx.graphviz_layout(G,prog='neato')`

The **graphviz_layout**, as I pointed out earlier, is a wrapper over Graphviz. The above line will compute the 2D position of each node in our graph using some algorithm which those of you who haven't made it past the exam I was mentioning above have little chance to understand. I'm joking, here's the algo in the Graphviz documentation: http://www.graphviz.org/Documentation/GKN04.pdf.

Now back to shortest path. For our graph, we want to not only compute the shortest path between two nodes but also display it in a way that looks good to the user eye and will make his brain get out of Alpha mode (https://en.wikipedia.org/wiki/Alpha_wave). So I wrote a method that does exactly that. Two methods actually:

```
def draw_shortest_path(G, pos, start, end):
path = nx.shortest_path(G, start, end, weight='weight')
colors = node_colors(G, path)# draw graph with weights on edges
edge_labels = {(n1,n2): G[n1][n2]['weight'] for (n1,n2) in G.edges()}
pylab.figure(1, figsize=(8, 8))
nx.draw_networkx(G, pos, node_color=colors)
nx.draw_networkx_edge_labels(G , pos, edge_labels=edge_labels)
pylab.show()
```

And

```
def node_colors(G, path):
colors = []for node in G.nodes():if node in path:
colors.append('b')else:
colors.append('r')return colors
```

**draw_shortest_path** will compute the shortest path, paint the nodes on the path in blue and all other nodes in red, put a label containing the weight on each edge and draw the result.

So this is how the shortest path between nodes 7 and 8 looks like for our graph (given that it's a random generated graph you will most likely get a different result following the same steps; if that doesn't happen you are either extremely lucky or there's a bug in numpy). You can look at the graph until your eyes ake and you will not find a shorter path than the one computed by Networkx's shortest path algo, which is basically an implementation of Dijkstra's algorithm.

**Simulations and strategies**

Now let's move on to something even more interesting. One of the goals of Networkx is to provide

tools for the study of the structure and dynamics of social, biological, and infrastructure networks

Of all the goals listed at http://networkx.github.io/documentation/networkx-1.9.1/overview.html#goals I found this one the most interesting so I did a little research on the internet to see how people use graphs and Networkx in particular for such tasks. So I found this article http://timotheepoisot.fr/2012/05/18/networkx-metapopulations-python/ which simulates the expansion of populations over a graph-like map. While I found the expansion of metapopulations to be an interesting topic, I thought: you know what is better than simulating metapopulations? Simulating civilizations.

So I came up with a new simulation, one that includes multiple populations (which I call civilizations) which compete against each other, each trying to grab as much land as possible in the detriment of others. The rules are simple:

each civilization starts with a number of maximum three nodes

at each turn (it's a turn-based simulation) each civilization will try to expand over all neighboring nodes

the probability of conquering a node is given by the following formula: (number of neighbors you own) / (number of total neighbors of the node + the node itself); so if you want to exapnd over a node that has 9 neighbors and you already hold 5 of the neighbors you have 50% chance of conquering it

I placed three civs on the map: blue, red and yellow. Let's see how things go.

So this is how our civs start. Now pic a side, place your bet and hope for the best. Here's how the map looks like after 10 iteration

Things are undecided, let's jump 100 iterations forward.

For those of you who picked yellow, I am sorry, you got totaly wiped out. So who wins in the end? I'm not going to tell you that. You can run the simulation on your own, and if you're lucky enough to hit the same map structure and the same placement of the civilisations you will find out the answer. I'm just kidding, red wins in the end and here's a history of the epic clash between civilizations

In the end the reds win (just like in the russian civil war https://en.wikipedia.org/wiki/Russian_Civil_War) and they end up with almost all the 100 nodes after ~1000 iterations.

I was saying above that in order to replicate the result of this simulation you need the same **map structure**, the same **placement of the civilisations** and the same **order of moves**. Because this is what determines the output of this game. So the whole result depends on these three factors: map (network) structure, placement and order of moves. And some randomnes of course, but that's just part of life in general.

**Strategy games**

So while such a simulation is useful for analyzing the above mentioned variables, there's really not much strategy involved in it. So let's turn this simulation into a strategy game. One way to do that is to limit the number of moves (conquer actions) that a civilization can take in one iteration. So I placed the limit at 1/3 of the number of nodes it already has. I also gave each node a value (in 1-10 range, let's say) and use that value in the conquering formula. So instead of

(number of neighbors you own) / (number of total neighbors of the node + the node itself) you will have

sum of values of neighbors you own / sum of values of total neighbors of the node + value of node itself

Given these constraints, a civilization will have to decide which nodes to attack first in one iteration and that will be its strategy.

**What's your strategy?**

So I implemented three strategies: random, naive and big city. But first, let me tell you about the values of the nodes. I divided nodes into three categories. First there are the big cities which have a value of 10 and there is a maximum number of 7 such nodes. The first degree neighbors of a big city can not be big cities themselves so they can at most be cities, having the value of 5 or towns, having the value of 1. Each big city has a maximum number of 4 cities. The second degree neighbors of a big city are all doomed to be boring towns where nothing happens and they will also have the value of 1 (actually these towns will be most often raided and passed from one civilization to the other but that's even worse than being boring).

So the first strategy, the random strategy, does not imply any profound thinking and will do exactly what its name says: pick random nodes to attack at each iteration. Maybe that's why it ends up loosing all the time.

The naive strategy is somehow smarter, and it sorts its neighbors by the chance it has to conquer them, which is the number of neighboring nodes that the civilization already owns. So this strategy will first pick the nodes which it's more sure it can conquer. This greedy approach seems to work very well so far.

The big city strategy is the most complex of the three, but not allways the most successful. It will first try to secure its big cities and their borders (first and second degree neighbors) then it tries to conquer big cities in the vecinity by conquering their neighbors first.

Now let's see how well these strategies do by running a simulation.

The yellow civilization has a big city strategy, the red one has a naive strategy and the blue one has a random strategy. They all start with a big city and two neighboring nodes. Now let's see how things evolve in 10 iterations.

Red grabbed whatever was close at hand, yellow is trying to secure the borders and blue is just acting silly. Let's run 10 more iterations.

The situation hasn't changed that much so I will go ahead and run 100 iterations.

Amazingly, the blues have conquered the reds (which is a very rare thing to happen in the history of simulations) and the yellows have strenghtened their possition.

But who wins in the end? Let's run 1000 itertions and see.

Not enough apparently. So I kept running iterations and after ~9000 iterations the yellows have taken over the world. Good job.

I encourage you to take a look at the code and play with it a bit. Run simulations, change strategies, whatever you think would be interesting. And ask questions if there is anything unclear in the code. Maybe you'll find better strategies or better ways of generating the map or come up with rules to make the simulation more interesting. Then please share your results with all of us.

**Beyond games**

After experimenting a bit with Networkx and writing my strategy game I began thinking how could this kind of simulations be useful in real life. This is still an open question for me but here are some ideas worth exploring: Message propagation in Social Media (based on some previously collected data), economic simulation in a trade network, spreading of behaviors in a society (like coruption or violent behavior). If you have other ideas please drop some comments and we can discuss about it.

You can find other interesting articles on similar topics here.

## Comments