An implementation of the algorithm¶
graphtools Package¶
This package includes graphtools.gengraph.GenGraph, a superclass implementing the descent algorithm, and subclasses of GenGraph which accomodate various graph data structures. Namely
graphtools.dbgraph.DBGraphreads graph data stored in a sqlalchemy-supported database,graphtools.sagegraph.SageGraphwraps asage.graphs.digraph.DiGraphobject, andgraphtools.listgraph.ListGraphreads graph data stored as a list of arrows.
Multiple types for the vertices are supported. The type vertextype refers to the instance-specific type of the vertices.
gengraph Module¶
-
class
graphtools.gengraph.GenGraph[source]¶ Bases:
objectA graph superclass implementing the descent algorithm. Subclasses should overwrite
The method
descent()runs the algorithm, updating the ranks of the vertices.-
count_neighbors(vert, out=True, cond=False, less=True, cutoff=0)[source]¶ Count the number of neighbors of a vertex.
Parameters: - vert (vertextype) – the vertex to count the neighbors of
- out (bool) – if True, count out neighbors, else in
- cond (bool) – if True, count the neighbors satisfying a condition on rank
- less (bool) – if True, count the neighbors with rank less than or equal to cutoff, else more
- cutoff (int) – the cutoff rank for the conditional
Returns: the number of (in or out) neighbors of vert (perhaps satisfying a condition on the rank) Return type: int
-
descend(vert, debug=False)[source]¶ Run one iteration of the descent algorithm.
Parameters: - vert (vertextype) – the vertex whose rank may change
- debug (bool) – if True, verbose output
-
descent(num=1, debug=False)[source]¶ Run
descend()a number of times on random vertices.Parameters:
-
get_num_arrows()[source]¶ Return the number of arrows.
Returns: the number of arrows in the graph Return type: int
-
get_rank(vert)[source]¶ Return the rank of vertex vert.
Parameters: vert (vertextype) – the vertex to get the rank of Returns: the rank of vert Return type: int
-
get_vert_list()[source]¶ Return a list of the vertices of the graph.
Returns: a list of vertices of the graph Return type: list
-
hierarchy_list= None¶ The list of hierarchy scores, updatedwith each run of descend.
-
dbgraph Module¶
-
class
graphtools.dbgraph.DBGraph(users, arrows, conn, group=None)[source]¶ Bases:
graphtools.gengraph.GenGraphA subclass of GenGraph for graphs stored in databases.
The vertices are stored in a table called users with columns
- user_id (any type) and
- rank (int)
(the name comes from the original motivation which was Twitter user subgraphs). The table may also have a column group (any type) specifying the particular graph that the user belongs to if the database contains multiple graphs. If the table has no group column, user_id should be a unique identifier; if there is a group column, user_id and group together should be unique.
The arrows are stored in a table called arrows with columns
- follow_id and
- lead_id
both refering to users.user_id. If users has a group column then arrows should have a corresponding group column.
Parameters: - users (sqlalchemy.schema.Table) – the table of vertices, described above
- arrows (sqlalchemy.schema.Table) – the table of arrows, described above
- conn (sqlalchemy.engine.Connection) – a connection to the database
- group – an optional identifier, described above
The initializer sets all entries in user.rank to 0.
For example,
>>> from graphtools.dbgraph import make_db, DBGraph >>> users, arrows, conn, eng = make_db([[1, 2], [2, 3]]) >>> graph = DBGraph(users=users, arrows=arrows, conn=conn) >>> print graph.get_num_arrows() 2 >>> set(graph.get_vert_list()) == set([1, 2, 3]) True >>> print graph.get_rank(1) 0 >>> graph.set_rank(3,2) >>> graph.get_rank(3) 2 >>> graph.reset_ranks() >>> graph.descend(2) >>> graph.descent(20) >>> hl = graph.hierarchy_list #get the list of hierarchy scores >>> print len(hl) #descend has been run 21 times, plus the initial score 22 >>> print hl[0] #the first score is always 0 0 >>> print hl[-1] #the score after 21 descends will probably be 1.0 1.0
-
graphtools.dbgraph.get_tables(engine, userstablename='users', arrowstablename='arrows')[source]¶ Get the tables and connections required for
DBGraphfrom a sqlalchemy Engine object.Parameters: - engine (sqlalchemy.engine.Engine) – the database engine
- userstablename (str) – the name of the users table
- arrowstablename (str) – the name of the arrows table
Returns: the users table, arrows table and database connection as two sqlalchemy Table objects and a sqlalchemy Connection object, respectively Return type: tuple For example,
>>> from graphtools.dbgraph import testeng, get_tables >>> users, arrows, conn = get_tables(testeng) >>> from sqlalchemy.sql import select >>> result = conn.execute(select([users])) >>> result.fetchall() [(1, 0), (2, 0), (3, 0)] >>> result = conn.execute(select([arrows])) >>> result.fetchall() [(1, 1, 2), (2, 2, 3)]
-
graphtools.dbgraph.make_db(arrows_list, name=None)[source]¶ Make a SQLite database in the correct format for
DBGraph.Parameters: Returns: the users table, arrows table, database connection and engine as two sqlalchemy Table objects, a sqlalchemy Connection object, and a sqlalchemy Engine object, respectively Return type: tuple For example,
>>> from graphtools.dbgraph import make_db >>> users, arrows, conn, eng = make_db([[1, 2], [2, 3]]) >>> from sqlalchemy.sql import select >>> result = conn.execute(select([users])) >>> result.fetchall() [(1, 0), (2, 0), (3, 0)] >>> result = conn.execute(select([arrows])) >>> result.fetchall() [(1, 1, 2), (2, 2, 3)]
listgraph Module¶
-
class
graphtools.listgraph.ListGraph(arrows_list)[source]¶ Bases:
graphtools.gengraph.GenGraphA subclass of GenGraph for graphs given as a list of arrows.
Parameters: arrows_list (list) – a list of the arrows in the graph; each arrow is an ordered list of 2 vertices (any type) where the first vertex is the tail of the arrow and the second is the head. We use the following simple example throughout.
>>> from graphtools.listgraph import ListGraph >>> graph = ListGraph([['a', 'b'], ['b', 'c']]) >>> print graph.get_num_arrows() 2 >>> set(graph.get_vert_list()) == set(['a', 'b', 'c']) True >>> print graph.get_rank('a') 0 >>> graph.set_rank('b',2) >>> graph.get_rank('b') 2 >>> graph.reset_ranks() >>> graph.descend('a') >>> graph.descent(20) >>> hl = graph.hierarchy_list #get the list of hierarchy scores >>> print len(hl) #descend has been run 21 times, plus the initial score 22 >>> print hl[0] #the first score is always 0 0 >>> print hl[-1] #the score after 21 descends will probably be 1.0 1.0
Graphs with isolated vertices (vertices with no neighbors) are not supported. Isolated vertices don’t affect the hierarchy of the graph.
-
neighbors_in(vert)[source]¶ Return the list of in neighbors of vertex vert.
Parameters: vert (vertextype) – the vertex to get the neighbors of Returns: the list of in neighbors of vert Return type: list For example,
>>> from graphtools.listgraph import ListGraph >>> graph = ListGraph([['a', 'b'], ['b', 'c']]) >>> graph.neighbors_in('a') [] >>> graph.neighbors_in('c') ['b']
-
neighbors_out(vert)[source]¶ Return the list of out neighbors of vertex vert.
Parameters: vert (vertextype) – the vertex to get the neighbors of Returns: the list of out neighbors of vert Return type: list For example,
>>> from graphtools.listgraph import ListGraph >>> graph = ListGraph([['a', 'b'], ['b', 'c']]) >>> graph.neighbors_out('a') ['b'] >>> graph.neighbors_out('c') []
-
sagegraph Module¶
-
class
graphtools.sagegraph.SageGraph(dg)[source]¶ Bases:
graphtools.gengraph.GenGraphA subclass of GenGraph which wraps a Sage DiGraph object.
Parameters: dg (sage.graphs.digraph.DiGraph) – the Sage DiGraph to run descent on