/*
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see
*/
#include
#include
#include
#include
#include"misc.h"
#include"LinkedList.h"
#include"MemoryManager.h"
#include"degeneracy_algorithm.h"
/*! \file degeneracy.c
\brief Execute the algorithm in degeneracy_algorithm.c
and print the number of cliques found and wall clock
execution time.
\author Darren Strash (first name DOT last name AT gmail DOT com)
\copyright Copyright (c) 2011 Darren Strash. This code is released under the GNU Public License (GPL) 3.0.
\image html gplv3-127x51.png
\htmlonly
See GPL 3.0 here
\endhtmlonly
*/
int main()
{
#ifdef MEMORY_DEBUG
fprintf(stderr, "WARNING: MEMORY_DEBUG is defined, timing will be off.\n");
#endif
int n; // number of vertices
int m; // 2x number of edges
LinkedList** adjacencyList = readInGraphAdjList(&n,&m);
int i;
int** adjList; // = Calloc(n, sizeof(int*));
int* degree; // = Calloc(n, sizeof(int));
#ifdef RETURN_CLIQUES_ONE_BY_ONE
LinkedList* cliques = createLinkedList();
#endif
runAndPrintStatsListList( &listAllMaximalCliquesDegeneracy,
"degeneracy",
adjacencyList, adjList,
#ifdef RETURN_CLIQUES_ONE_BY_ONE
cliques,
#endif
degree, n );
#ifdef RETURN_CLIQUES_ONE_BY_ONE
destroyCliqueResults(cliques);
#endif
// Free up memory from adjacency list.
i = 0;
while(i
See GPL 3.0 here
\endhtmlonly
\section intro_sec Introduction
This package contains the code that was used to generate the results of
Eppstein and Strash (2011). (See \ref recommended_reading for references I am using.)
This package contains code to generate all maximal cliques of graphs. It contains implementations
of the following algorithms:
-# (tomita) The algorithm of Tomita et al. (2006), which is known to work well in practice,
but consumes much memory because it uses an adjacency matrix representation of the input graph.
For small sparse graphs, this algorithm is an excellent choice, and it is worst-case
optimal. This algorithm also works very fast in practice for dense graphs.
\see tomita.c tomita_algorithm.c
-# (tomita-adjacency-list) The algorithm of Tomita et al. (2006), modified to use adjacency
lists, therefore only requiring space linear in the size of the graph. This version is explained in the paper
by Eppstein and Strash (2011), and is referred to in their paper as the "max-degree" algorithm. For sparse graphs,
this algorithm can be quite fast on some inputs, but very slow on others. In particular, there are some sparse graphs
that cause this algorithm to run for over 5 hours, when the hybrid and degeneracy algorithms described
next takes 10 and 4 minutes, respectively. This behavior is likely because there are some high-degree vertices whose
neighbor lists are iterated througha many times. Therefore, use caution when running this algorithm.
\see adjlist.c adjlist_algorithm.c
-# (hybrid) An implementation of the algorithm of Eppstein, Löffler, and Strash (2010), which
computes a degeneracy ordering of the input graph, and uses this ordering to speed up computation without any
fancy data structuring. This algorithm is called "hybrid" in Eppstein and Strash (2011). For sparse graphs, this
algorithm is consistently fast in practice, it is especially fast when the degeneracy of the graph is very small
(for example, less than 6), or with graphs whose vertices, on average, have very few later neighbors in the degeneracy
ordering. However, with larger degeneracy graphs, and graphs whose vertices have, on average, many more later neighbors in
the degeneracy ordering, using degeneracy, the algorithm described next, is typically faster. Why? It uses a more
complex data structure that lets us gets us better speed in this case.
\see hybrid.c hybrid_algorithm.c
-# (degeneracy) The algorithm of Eppstein, Löffler, and Strash (2010), which computes a degeneracy ordering of the
input graph, and builds a data structure to speed up computation. This algorithm is called "degeneracy" in Eppstein
and Strash (2011). For sparse graphs, this algorithm is consistently fast
in practice, it is particular fast when the degeneracy of the graph is higher than 6, or with graphs whose vertices,
on average, have many later neighbors in the degeneracy ordering.
\see degeneracy.c degeneracy_algorithm.c
\section install_sec Compilation and Execution
Under Linux systems, or *nix, executing "make" or "make all" from the top-level directory should compile all the source
files in the src directory, place all object files in the obj directory, and place all
executables in the bin directory. The makefile is not very sophisticated, so you may want to
always run "make clean" first. If you have gcc and gnu make, this should just work. This code has not been tested
in other environments.
Executing the bash shell script "test.sh" in the top-level directory will execute every executable
on the data sets in data and print statistics about the data sets.
\subsection Defines
You can modify the defines in the Makefile to alter the behavior of the executables.
- DEBUG - print verbose debug information; may be annoying
- MEMORY_DEBUG - check for malloc or calloc returning NULL; if so, exit gracefully (causes slightly slower code)
- RETURN_CLIQUES_ONE_BY_ONE - add cliques to a linked list that is returned by the clique listing algorithms
- PRINT_CLIQUES_ONE_BY_ONE - print cliques to standard output one per line
- (don't use with define PRINT_CLIQUES_TOMITA_STYLE)
- PRINT_CLIQUES_TOMITA_STYLE - print cliques as described by Tomita et al. (2006)
- prints to standard output: print the vertex when it is added to the partial clique,
print "b" when a vertex is removed, and print "c" when
a maximal clique is found. (don't use with define PRINT_CLIQUES_ONE_BY_ONE)
- ALLOW_ALLOC_ZERO_BYTES - some systems behave strangely when you allocate 0 bytes with either
malloc or calloc, if this is not defined, then we always allocated at least one byte.
The Defines used to generate the results of Eppstein and Strash (2011) are -DALLOW_ALLOC_ZERO_BYTES and -DPRINT_CLIQUES_TOMITA_STYLE
with redirecting the standard output to /dev/null.
\subsection Executables
Each executable reads in the input graph from standard input, and writes the main product to
standard output, any statistical information or by-products are printed to standard error.
- Statistics:
- bin/printnm - print the number of vertices and edges in the data set
- bin/compdegen - compute and print the degeneracy of the data set
- Maximal clique listing: (print number of maximal cliques, and algorithm running time)
- bin/tomita - execute algorithm tomita on the data set
- bin/adjlist - execute algorithm tomita-adjacency-list on the data set
- bin/hybrid - execute algorithm hybrid on the data set
- bin/degeneracy - execute algorithm degeneracy on the data set
\section data_sets Data Sets
Several data sets are in the directory data, in a subdirectory named according to their origin.
\subsection Sources
I have included several data sets from BioGRID version 3.0.65 (http://thebiogrid.org/) in subdirectory biogrid.
See Eppstein and Strash (2011) for more sources
\subsection Format
Each file contains the following:
-# The number of vertices in the graph
-# Twice the number of edges in the graph.
-# a list of edges in x,y format, where 0<=x,yTheoretical Computer Science, 363(1), 2006. (http://dx.doi.org/10.1016/j.tcs.2006.06.015)
- "Listing All Maximal Cliques in Near-Optimal Time" by David Eppstein, Maarten Löffler, and Darren Strash, ISAAC 2010, LNCS volume 6506, pp. 403-416, 2010. (http://dx.doi.org/10.1007/978-3-642-17517-6_36 or http://arxiv.org/abs/1006.5440)
- "Listing All Maximal Cliques in Large Sparse Real-World Graphs" by David Eppstein and Darren Strash, SEA 2011, LNCS volume 6630, pp. 364-375, 2011. (http://dx.doi.org/10.1007/978-3-642-20662-7_31 or http://arxiv.org/abs/1103.0318)
*/