GSoC 2014 project ideas

Prospective mentors:

please add ideas below by copying the template, and include your name as a mentor
with a link to your blog, GitHub profile, etc.

Students:

please feel free to add ideas below
subscribe to the [ganglia-general mailing list](https://lists.sourceforge.net/lists/listinfo/ganglia-general), send an email to introduce yourself, tell us about your ideas and ask for potential mentors
if you like an existing idea, please email the mentors and CC the [ganglia-general mailing list](https://lists.sourceforge.net/lists/listinfo/ganglia-general)
please also see [this blog from Daniel Pocock](http://danielpocock.com/google-summer-of-code-data-science-machine-learning-ganglia), one of the Ganglia developers and GSoC mentors

Everybody: don’t keep this wiki page open in the editor for a long time. Write your idea in a text editor first, then paste it into the page and save quickly. If you keep the page locked, you might find you cannot submit because somebody else submitted at the same time.

Brainstorm – very rough ideas without a confirmed mentor

MongoDB (C programming)
adapt gmetad to store the cluster state to MongoDB
adapt the web interface to read cluster state from MongoDB
Monitoring portal (HTML, PHP, JavaScript)
develop a single portal that integrates Ganglia, Nagios, [LogAnalyzer](http://loganalyzer.adiscon.com) and other popular monitoring tools
patch each of the individual web UIs (Ganglia, Nagios, LogAnalyzer)
to link back to the portal
to link to each other
to expose some data, keyed by hostname, as JSON and RSS for aggregation in the portal

RRDtool plugin for R project statistics (Data Science, Statistics)

Mentors
[Daniel Pocock](http://danielpocock.com) – contributor to the [awesome Ganglia book](http://shop.oreilly.com/product/0636920025573.do), previously mentored two projects under [Debian](http://www.debian.org)
[Tobias Oetiker](http://tobi.oetiker.ch) – author of [RRDtool](http://www.rrdtool.org)
Contact us by email and CC the [ganglia-general email list](https://lists.sourceforge.net/lists/listinfo/ganglia-general)
Development of an [R project](http://www.r-project.org) plugin for reading an [RRDtool](http://www.rrdtool.org) data file
Statistical analysis of network performance data
Integration with Ganglia web reports
Advanced: use AI techniques to make predictions about network growth, performance bottlenecks or deviations from normal behavior

Gweb enhancements for NVIDIA GPU monitoring

Mentors
Bernard Li
Contact by email and CC the [ganglia-general email list](https://lists.sourceforge.net/lists/listinfo/ganglia-general)
NVIDIA GPU metrics can be collected via the [NVIDIA gmond plugin](https://github.com/ganglia/gmond_python_modules/tree/master/gpu/nvidia), but Gweb does not currently visualize them very well
The goal of this project is to clean up the web code and come up with a good way to visualize these metrics for a large cluster
Required skills: PHP, jQuery, Python

Create cluster-wide metric aggregation for arbitrary metrics

Right now, the cluster view shows overall metrics for the cluster above all the individual hosts. It sums load, memory, and network and averages CPU. It would be useful to also have aggregated metrics for many or all other metrics represented in the cluster. One method of doing this at the moment is via an Aggregated Graph. These graphs can be added to views and achieve much of the benefit of aggregating metrics at the cluster level. However, aggregate graphs lose their efficacy in an environment with high machine turnover, such as in an Amazon Auto Scaling Group. As machines leave the cluster, their historic metrics are no longer counted in the aggregate graphs, giving you historic values that are drastically low. By aggregating metrics and storing the aggregated value at a cluster level, this problem goes away.

The task:

create a method for aggregating some or all metrics present in a cluster at the cluster level.
- include a method for specifying the aggregating function to use (sum, average, min, max)
add the aggregated metrics to the cluster view below the individual hosts in the cluster or create a pseudo-host named ‘all-$clustername’ and store all the aggregated metrics there.
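
As a rough illustration of the task above, the aggregation step could look something like the following Python sketch. It is not how gmetad works today; the function names, the input layout (`{hostname: {metric: value}}`) and the default aggregator are all assumptions made for the example. Only the four aggregating functions and the ‘all-$clustername’ pseudo-host name come from the task list.

```python
from statistics import mean

# The four aggregating functions named in the task list.
AGGREGATORS = {"sum": sum, "average": mean, "min": min, "max": max}


def aggregate_cluster(host_metrics, functions, default="sum"):
    """Collapse per-host values into one value per metric.

    host_metrics: {hostname: {metric_name: value}}  (assumed layout)
    functions:    {metric_name: aggregator name}; metrics not listed
                  here fall back to `default`.
    """
    values_by_metric = {}
    for metrics in host_metrics.values():
        for name, value in metrics.items():
            values_by_metric.setdefault(name, []).append(value)
    return {
        name: AGGREGATORS[functions.get(name, default)](values)
        for name, values in values_by_metric.items()
    }


def pseudo_host_name(cluster_name):
    """Name of the pseudo-host that would hold the aggregated metrics."""
    return "all-" + cluster_name
```

For example, with two hosts reporting `load_one` values of 0.5 and 1.5 and `cpu_user` values of 10.0 and 30.0, calling `aggregate_cluster(hosts, {"cpu_user": "average"})` sums the load and averages the CPU. Storing the result under the pseudo-host at collection time is what would keep the history intact when hosts leave the cluster.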

Mentors:

Ben Hartshorne (http://ben.hartshorne.net) – long-time Ganglia metric contributor
Someone familiar with the ganglia core needed
Contact us by email and CC the [ganglia-general email list](https://lists.sourceforge.net/lists/listinfo/ganglia-general)

Internal Ganglia server metrics

It’s slightly ironic that a metric collection system does not produce performance metrics of its own. The Ganglia server daemon suffers performance degradation under extreme load. Internal metrics would make these issues much easier to diagnose and would help increase the scalability of the server daemon.

The task:

add an endpoint to the gmetad daemon; a simple TCP listener would do
the listener would respond to a “STAT” request with a JSON-encoded list of internal metrics
these internal metrics would include things like:
general info about the server, e.g. version number, uptime, port numbers it is listening on
number of clusters, hosts and metrics collected
memory and CPU consumption information
number of RRD writes
number of metrics forwarded to other tools, e.g. Graphite, Riemann
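
To make the request/response shape concrete, here is a small Python prototype of such a listener. The real work would be done in C inside gmetad; everything here is an assumption for illustration, including the port number, the counter names, the JSON field names and the error reply. Only the “STAT” command and the JSON encoding come from the task description.

```python
import json
import socketserver
import time

START_TIME = time.time()

# Hypothetical counters; a real daemon would update these from its own code paths.
COUNTERS = {"clusters": 0, "hosts": 0, "metrics": 0, "rrd_writes": 0}


def build_stats(version="unknown"):
    """Collect the internal metrics into one JSON-serialisable dict."""
    stats = {"version": version, "uptime_seconds": int(time.time() - START_TIME)}
    stats.update(COUNTERS)
    return stats


def handle_request(line):
    """Return the reply for one request line."""
    if line.strip() == "STAT":
        return json.dumps(build_stats()) + "\n"
    return json.dumps({"error": "unknown command"}) + "\n"


class StatHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Read a single line, answer it, then let the connection close.
        line = self.rfile.readline().decode("ascii", errors="replace")
        self.wfile.write(handle_request(line).encode("utf-8"))


def serve(host="127.0.0.1", port=8653):  # port number is a placeholder
    with socketserver.TCPServer((host, port), StatHandler) as server:
        server.serve_forever()
```

Keeping `handle_request` separate from the socket code means the protocol can be exercised without opening a port, which may also be a useful split in the C implementation.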

Mentors:
Nick Satterly (http://github.com/satterly) – Ganglia core committer and contributor to many other monitoring projects
Contact by email and CC the [ganglia-general email list](https://lists.sourceforge.net/lists/listinfo/ganglia-general)

Required skills: C, TCP/IP

API

Allow internals of gmetad/gmond to be exposed via a REST-ful interface, both for monitoring purposes and to provide another method for exchanging data.

The tasks:

Create a simple embedded REST-ful HTTP server with optional authentication
Support, at a minimum, the following tasks:
Basic metric querying for service internals (number of queries submitted, etc)
Reporting uptime and version
Retrieving an object representation of some part of the metric trees
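
The tasks above could be prototyped along these lines before committing to a C design. This Python sketch uses only the standard library; the URL layout (`/info`, `/metrics/<cluster>/<host>/<metric>`), the in-memory tree and the field names are hypothetical, and authentication and query-string handling are left out entirely.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory metric tree: cluster -> host -> metric -> value.
METRIC_TREE = {"web": {"web1": {"load_one": 0.5}}}
# Hypothetical service internals (version, uptime, query counter).
INFO = {"version": "unknown", "uptime_seconds": 0, "queries": 0}


def route(path):
    """Map a request path to (HTTP status, JSON-serialisable body)."""
    parts = [p for p in path.split("/") if p]
    if parts == ["info"]:
        return 200, INFO
    if parts[:1] == ["metrics"]:
        # Walk down the tree one path segment at a time.
        node = METRIC_TREE
        for key in parts[1:]:
            if not isinstance(node, dict) or key not in node:
                return 404, {"error": "not found"}
            node = node[key]
        return 200, node
    return 404, {"error": "not found"}


class ApiHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = route(self.path)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(host="127.0.0.1", port=8080):  # address and port are placeholders
    HTTPServer((host, port), ApiHandler).serve_forever()
```

With this layout, `GET /metrics/web` would return the whole subtree for the hypothetical `web` cluster, while `GET /metrics/web/web1/load_one` would return a single value.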

Mentors:
Jeff Buchbinder (https://github.com/jbuchbinder)
Contact us by email and CC the [ganglia-general email list](https://lists.sourceforge.net/lists/listinfo/ganglia-general)

Required skills: C

Gweb enhancement: Tabular data views

Mentors
Peter Piela
Contact by email and CC the [ganglia-general email list](https://lists.sourceforge.net/lists/listinfo/ganglia-general)
Today Gweb lets users create custom views that are collections of graphs.
Users have requested the ability to define “realtime” tabular views that show current metric values organized in a specified row-column layout.
Required skills: PHP, jQuery
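
One way to picture the requested row-column layout is the sketch below, written in Python only to keep the data handling short; the actual feature would be PHP and jQuery in Gweb. The function name, the hosts-as-rows and metrics-as-columns layout, and the placeholder for missing values are all assumptions.

```python
def build_table(current_values, hosts, metrics, missing="-"):
    """Arrange current metric values into a header row plus one row per host.

    current_values: {hostname: {metric_name: value}}  (assumed layout)
    hosts, metrics: the rows and columns chosen by the user, in display order.
    """
    header = ["host"] + list(metrics)
    rows = [
        [host] + [current_values.get(host, {}).get(m, missing) for m in metrics]
        for host in hosts
    ]
    return [header] + rows
```

A “realtime” view would then amount to rebuilding this table from fresh values at a fixed interval and re-rendering it in the browser.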
