Summary:
Network Science has emerged as a field of study to comprehensively understand how physical and virtual artifacts are connected along with the implications of such connectedness. Networks (or graphs) serves as the ubiquitous mathematical abstraction to represent structure induced by a binary relationship. Such model has been widely applied to represent and study information networks (e.g., web pages and hyper-links), social networks (e.g., people and co-authorship), biological networks (e.g., proteins and interactions) and technological networks (e.g., airports and flights). Fueled by the increasingly availability of vast amounts of empirical data, networks with millions of nodes are being studied with the goal of understanding various aspects of their structure. For example, a promising feature of a network is its degree distribution, i.e. the fraction of nodes that have degree k. Network structure can reveal various aspects about nodes, such as their relative importance or their community structure. Centrality metrics such as PageRank and Betweeness impose an importance ordering among the set of nodes, while community detection metrics partitions the nodes into subsets. In this tutorial we will introduce the field of Network Science by first presenting empirical studies of networks and their main findings with respect to structural features. We will identify commonalities among different real networks, such as heavy-tailed degree distribution and short distances, present in Facebook friendship network and the Internet AS-level graph, for example. Besides presenting classical metrics for characterizing networks, such as degrees, distances and clustering, we will also introduce centrality metrics and community structure metrics. We then present and discuss mathematical models for networks, starting with the well-studied Erdós-Réyni (G(n, p)) model followed by BA (Barabási-Albert) model which embodies preferential attachment property and WS (Watts-Strogatz) model which embodies small-world properties.
In the context of network science, random walks emerges as an ubiquitous methodology to characterize and measure different aspects of network structure, in part due to its mathematical simplicity and well-understood properties. Typical questions in the analysis of large complex networks are how large is a network in terms of nodes and links? which nodes are most important/central? the degree distribution follows a power law? if yes, what is the exponent of the power law? how to estimate quickly the clustering coefficient? how to detect quickly principal clusters/communities of the network? All these questions can be answered with the help of the theory of random walks on graphs. In particular, using the theory of random walks on graphs, we can design algorithms with linear or even sublinear complexity. Such light complexity is necessary if we need to deal with networks of billions of nodes and links.