#Sociopatterns-neo4j
##contactadder-neo4j.py
###class Neo4jContactAdder - Description
Neo4jContactAdder adds contacts (as defined in sociopatterns.loader) to a neo4j graph database modeled as specified in neo4j-dynagraph, extended with labels on frames (first_frame, last_frame, year, month, day, hour) and an index on actors' names. It uses the database drivers from py2neo 1.6.1, with neo4j and Cypher at version 2.0.0.
The design is driven by performance: py2neo's batches are used to send insertions to the DB in bulk, which reduces REST overhead. Py2neo's batches provide convenient methods that reflect the new functionality of Cypher 2.0.0, such as `get_or_create_path` (=> `CREATE UNIQUE`), but these do not permit local references within the same batch. To work around the problem, three pairs of data structures store the useful references and the KEYs that identify the elements to be added:
- `actors_dict`, `interactions_dict` (dict): store the actor and interaction references. They are updated at every new frame, in the same batch, and contain only real references.
  - `actors_dict`: KEY, `actor_id`; VALUE, actor reference to DB
  - `interactions_dict`: KEY, `tuple(actor_id1, actor_id2)`; VALUE, interaction reference to DB
- `actor_new`, `interaction_new` (set): store the KEYs of the actors and interactions encountered for the first time during the current frame, so that they can be created in the database and stored in `actors_dict`/`interactions_dict`, ready to be used for other purposes.
- `frame_actors`, `frame_interactions` (dict): store all the actors/interactions present in the current frame, independently of the other two pairs of data structures.
  - `frame_actors`: KEY, `actor_id`; VALUE, number of times the actor appears in the current frame
  - `frame_interactions`: KEY, `tuple(actor_id1, actor_id2)`; VALUE, number of times the interaction appears in the current frame

So the idea is to store everything concerning the current frame in the `*_new` and `frame_*` data structures. When a new frame is created, the DB is updated from them, with the `*_dict` structures supplying the references: first `*_new` is used to create the new nodes, whose references are stored in `*_dict`; then `frame_*` is used, together with `*_dict`, to create the new non-temporal relationships.
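
The bookkeeping described above can be sketched in plain Python, without any database. This is only an illustration under stated assumptions: the class name `FrameBookkeeping`, the methods `record` and `close_frame`, and the placeholder reference strings are hypothetical and are not taken from contactadder-neo4j.py, where the references would come from py2neo batch results.

```python
from collections import Counter


class FrameBookkeeping(object):
    """In-memory stand-in for the three pairs of structures (hypothetical)."""

    def __init__(self):
        self.actors_dict = {}          # actor_id -> DB reference (placeholder string here)
        self.interactions_dict = {}    # (actor_id1, actor_id2) -> DB reference
        self.actor_new = set()         # actor KEYs first seen in the current frame
        self.interaction_new = set()   # interaction KEYs first seen in the current frame
        self.frame_actors = Counter()        # actor_id -> occurrences in the current frame
        self.frame_interactions = Counter()  # pair -> occurrences in the current frame

    def record(self, actor_id1, actor_id2):
        """Register one contact in the current frame."""
        pair = (actor_id1, actor_id2)  # assumption: the pair is kept in the order given
        for actor in pair:
            if actor not in self.actors_dict:
                self.actor_new.add(actor)
            self.frame_actors[actor] += 1
        if pair not in self.interactions_dict:
            self.interaction_new.add(pair)
        self.frame_interactions[pair] += 1

    def close_frame(self):
        """Flush the current frame and return what would be linked to it."""
        # Step 1: "create" the new nodes and remember their references.
        for actor in self.actor_new:
            self.actors_dict[actor] = "actor-ref-%r" % (actor,)
        for pair in self.interaction_new:
            self.interactions_dict[pair] = "interaction-ref-%r-%r" % pair
        # Step 2: resolve every element of the frame through *_dict.
        links = {
            "actors": dict((self.actors_dict[a], n)
                           for a, n in self.frame_actors.items()),
            "interactions": dict((self.interactions_dict[p], n)
                                 for p, n in self.frame_interactions.items()),
        }
        # Step 3: reset the per-frame structures; *_dict persists across frames.
        self.actor_new.clear()
        self.interaction_new.clear()
        self.frame_actors.clear()
        self.frame_interactions.clear()
        return links
```

Only the `*_dict` structures survive `close_frame`, so they are the ones a restart would need to rebuild.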
Neo4jContactAdder also provides warm restart: during initialization it issues some queries to the DB to recreate the `*_dict` structures and a few other data. The fact that a frame was created after a warm restart is recorded in a property of frame nodes called session, which is incremented at every warm restart.
Notes : batch commands are processed like transactions.
- Parameters:
  - `run_name` (str): name to identify the run in the DB
  - `start_time` (int): the time when the capture begins, expressed in seconds since the epoch
  - `deltat` (positive int): time granularity of the capture
  - `neo4j_rest` (str): address of the neo4j db

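
To make the roles of `start_time` and `deltat` concrete, here is a minimal sketch of how a timestamp could be assigned to a frame. The formula is an assumption (frames taken as consecutive, half-open intervals of `deltat` seconds starting at `start_time`), and the helper names `frame_index` and `frame_bounds` are hypothetical; neither is taken from the class itself.

```python
def frame_index(timestamp, start_time, deltat):
    """Return the 0-based index of the frame containing `timestamp`.

    Assumption: frame k covers [start_time + k*deltat, start_time + (k+1)*deltat).
    """
    if deltat <= 0:
        raise ValueError("deltat must be a positive int")
    if timestamp < start_time:
        raise ValueError("timestamp precedes the start of the capture")
    return (timestamp - start_time) // deltat


def frame_bounds(index, start_time, deltat):
    """Return the (begin, end) timestamps of frame `index`, end excluded."""
    begin = start_time + index * deltat
    return begin, begin + deltat
```

With `start_time=1000` and `deltat=20`, timestamps 1000 through 1019 fall in frame 0 and timestamp 1020 opens frame 1.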
- Public methods to add contacts:
  - `store_contact(self, contact)`:
    Intended for real use, when a script receives packets from the network, parses them into a Contact object and passes it to `store_contact`. The method unpacks the Contact object and, for each contact it contains, calls `add_single_contact` to add it to the DB.
    - Parameter: `contact` (Contact object as defined in sociopatterns.loader)
    - Returns: `None`
  - `add_single_contact(self, timestamp, actor_id1, actor_id2)`:
    This method adds a single real contact, meaning two actors that met at a certain instant, rather than a Contact object as `store_contact` does. It is therefore also useful in simulations, where the metadata carried by Contact objects is usually not needed.
    - Parameters:
      - `timestamp` (int): increasing timestamp, expressed in seconds since the epoch, identifying the instant of contact.
      - `actor_id1`, `actor_id2` (`__repr__`able objects): usually 16-bit integer IDs, given the nature of Contact objects.
    - Returns: `None`.
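
The relationship between the two public methods can be illustrated with a stand-in that records calls instead of writing to neo4j. Everything here is hypothetical: `FakeContact` and its fields `t`, `id` and `seen_id` are assumptions made for the example and may not match the Contact class in sociopatterns.loader, and `RecordingAdder` only mimics the two method names.

```python
from collections import namedtuple

# Hypothetical stand-in for a Contact object; the field names are assumptions.
FakeContact = namedtuple("FakeContact", ["t", "id", "seen_id"])


class RecordingAdder(object):
    """Mimics the two public methods but only records the calls (no DB)."""

    def __init__(self):
        self.calls = []

    def add_single_contact(self, timestamp, actor_id1, actor_id2):
        # One contact: two actors that met at a given instant.
        self.calls.append((timestamp, actor_id1, actor_id2))

    def store_contact(self, contact):
        # Unpack the object and add each contained contact individually.
        for other_id in contact.seen_id:
            self.add_single_contact(contact.t, contact.id, other_id)


adder = RecordingAdder()
adder.store_contact(FakeContact(t=100, id=7, seen_id=[8, 9]))
adder.add_single_contact(120, 7, 9)  # simulation-style call, no Contact object
```

A simulation would call `add_single_contact` directly with increasing timestamps, while a live capture script would go through `store_contact`.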