Introduction to the dataset

Intermediate Network Analysis in Python

Eric Ma

Data Carpentry instructor and author of nxviz package

Dataset & case study introduction

  • College forum posting dataset, 6 months
  • Node partitions: students, forums
  • Activities in the chapter:
    • Constructing a graph from a pandas DataFrame
    • Computing unipartite projections of a bipartite graph
    • Visualization
    • Time series filtering & analysis
  • Recap previously used functions

Graphs from DataFrames

df
   customers    products
0  customerA    product1
1  customerB    product2
...
import networkx as nx

G = nx.Graph()
G.add_nodes_from(df['products'], bipartite='products')
G.add_nodes_from(df['customers'], bipartite='customers')
list(G.nodes())
['product1', 'customerC', 'product2', 'customerB', 'customerA']
list(G.edges())
[]

Graphs from DataFrames

G.add_edges_from(zip(df['customers'], df['products']))

list(G.edges())
[('product1', 'customerC'), ('product1', 'customerA'),
 ('customerC', 'product2'), ('product2', 'customerB')]
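
The two slides above can be combined into one self-contained sketch. The DataFrame here is a hypothetical stand-in: the slides elide every row after the first two, so the customerC rows are an assumption chosen to be consistent with the node and edge lists shown.

```python
import networkx as nx
import pandas as pd

# Hypothetical purchase records; the customerC rows are assumed,
# since the slide only shows the first two rows of df.
df = pd.DataFrame({
    'customers': ['customerA', 'customerB', 'customerC', 'customerC'],
    'products': ['product1', 'product2', 'product1', 'product2'],
})

G = nx.Graph()

# Add each partition separately and tag it with a 'bipartite' attribute.
# Repeated values in a column are harmless: a node is only added once.
G.add_nodes_from(df['customers'], bipartite='customers')
G.add_nodes_from(df['products'], bipartite='products')

# Each row of the DataFrame becomes one customer-product edge.
G.add_edges_from(zip(df['customers'], df['products']))

print(sorted(G.nodes()))
# ['customerA', 'customerB', 'customerC', 'product1', 'product2']
print(G.number_of_edges())
# 4
```

Adding the nodes before the edges is what attaches the partition label; add_edges_from alone would create the nodes without a 'bipartite' attribute.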

Bipartite projections

cust_nodes = [n for n in G.nodes()
              if G.nodes[n]['bipartite'] == 'customers']
prod_nodes = [n for n in G.nodes()
              if G.nodes[n]['bipartite'] == 'products']

prodG = nx.bipartite.projected_graph(G, nodes=prod_nodes)
custG = nx.bipartite.projected_graph(G, nodes=cust_nodes)
list(prodG.nodes())
['product1', 'product2']
list(custG.nodes())
['customerC', 'customerB', 'customerA']
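
The node lists alone do not show what a projection does to the edges. The sketch below rebuilds a small graph and inspects them. The edge list is taken from the earlier slide; importing bipartite explicitly is a stylistic choice here, not something the slides do.

```python
import networkx as nx
from networkx.algorithms import bipartite

# Rebuild the small customer-product graph from the earlier slide.
G = nx.Graph()
G.add_nodes_from(['customerA', 'customerB', 'customerC'], bipartite='customers')
G.add_nodes_from(['product1', 'product2'], bipartite='products')
G.add_edges_from([
    ('customerA', 'product1'),
    ('customerC', 'product1'),
    ('customerC', 'product2'),
    ('customerB', 'product2'),
])

# Select each partition by its 'bipartite' node attribute.
cust_nodes = [n for n, d in G.nodes(data=True) if d['bipartite'] == 'customers']
prod_nodes = [n for n, d in G.nodes(data=True) if d['bipartite'] == 'products']

# In a projection, two nodes are linked if they share a neighbor
# in the other partition.
custG = bipartite.projected_graph(G, nodes=cust_nodes)
prodG = bipartite.projected_graph(G, nodes=prod_nodes)

print(custG.has_edge('customerA', 'customerC'))  # True: both bought product1
print(custG.has_edge('customerA', 'customerB'))  # False: no shared product
print(prodG.has_edge('product1', 'product2'))    # True: customerC bought both
```

In the customer projection an edge means "bought at least one product in common"; in the product projection it means "bought by at least one common customer".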

Let's practice!
