Data drives the world.
Most data, structured or unstructured, can be analysed as a graph, and many practical computing problems concern large graphs. Standard examples include the Web graph and various social networks. The scale of these graphs (in some cases billions of vertices and trillions of edges) poses challenges to their efficient processing.
The data we produce and consume increasingly has a networked structure, and that structure has grown in complexity across domains such as biology, social networks, economy, communication and transport networks. The need to process and analyze such data has driven the emergence of the network science research community, which defines algorithms to characterize these complex structures, to understand their topology and evolution, and to interpret the underlying phenomena.
In addition, distributed storage and parallel computation technologies offer specific tools for networks, based on large-scale graph processing paradigms such as MapReduce and Pregel from Google.
The purpose of this course is to study the main algorithms and their implementation on artificial and real data in a distributed environment.
- Preliminaries, Typology of graphs, Graph analytics measures
- Basic algorithms: Random walk and PageRank
- Label propagation, Community detection, Influence maximisation
- Graph analytics & Deep Learning
The main aim of this repository is to keep track of the work we have done in the Massive Graph Management and Analytics (MGMA) labs. During this course, we will focus on some basic graph algorithms and see how we can use them to analyse our data efficiently. Since graph theory and network science have much in common, you will see us using network science packages as well.
NetworkX is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.
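As a quick illustration of what working with NetworkX looks like, here is a minimal sketch that builds a toy graph and computes a few basic measures. The graph, node names and variable names are made up for this example and are not taken from the labs; it assumes the networkx package is installed.

```python
import networkx as nx

# Build a small undirected graph (toy data, purely illustrative).
G = nx.Graph()
G.add_edges_from([
    ("a", "b"), ("a", "c"), ("b", "c"),  # a triangle
    ("c", "d"),                          # a single edge hanging off the triangle
    ("e", "f"),                          # a separate component
])

# Basic size measures.
num_nodes = G.number_of_nodes()
num_edges = G.number_of_edges()

# Degree of every node, as a plain dict {node: degree}.
degrees = dict(G.degree())

# Connected components, sorted so the result is deterministic.
components = sorted(sorted(c) for c in nx.connected_components(G))

# Shortest path (by number of edges) between two nodes.
path = nx.shortest_path(G, source="a", target="d")

print(f"{num_nodes} nodes, {num_edges} edges")
print("degrees:", degrees)
print("components:", components)
print("shortest path a -> d:", path)
```

The same pattern (build or load a graph, then call analysis functions on it) should carry over to larger graphs, although real datasets would normally be read from files rather than typed in by hand.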
Please check out the lab's details here
If you want to follow along with the lab exercises, make sure to clone the repository and cd to the relevant lab's directory:
git clone https://github.com/mohammadzainabbas/MGMA-Lab.git
cd MGMA-Lab/src/<lab-of-your-choice>
For example, if you want to practice lab 1, you should run:
cd MGMA-Lab/src/lab1
Before starting, you may have to create a new environment for the lab. Please check out the documentation for creating a new environment.
Once you have activated your new environment, you may have to install the dependencies for a given lab (check that a requirements.txt file exists for that lab before running the command below):
pip install -r requirements.txt
To set up pre-commit hooks, please refer to the documentation.