marinaangelovska/spark-streaming-frequent-ips

 
 


Frequent IPs Counter

Description

This is a simple Spark Streaming application that processes and monitors TCP-based tuples, each consisting of a port + ip_address. The traffic is produced by a socket-based network traffic simulator written in Java. The Spark Streaming function countByValueAndWindow counts the occurrences of each tuple within a time window, and only the tuples whose count is above a given threshold are kept.
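To make the windowed counting concrete, here is a minimal plain-Python model of the idea, without Spark. It is only a sketch: the function name `frequent_in_window`, the representation of a tuple as a single "port ip_address" string, the explicit timestamps, and the use of `>=` for the threshold comparison are all assumptions made for illustration, not taken from the repository.

```python
from collections import Counter


def frequent_in_window(events, window_start, window_length, min_packets):
    """Count each value seen in [window_start, window_start + window_length)
    and keep only the values seen at least `min_packets` times.

    `events` is an iterable of (timestamp_seconds, value) pairs, where each
    value is a hypothetical "port ip_address" string.
    """
    window_end = window_start + window_length
    counts = Counter(
        value for ts, value in events if window_start <= ts < window_end
    )
    # Threshold step: `>=` is an assumption; the source only says "above".
    return {value: n for value, n in counts.items() if n >= min_packets}


# Made-up sample traffic: (timestamp, "port ip_address")
events = [
    (0, "80 10.0.0.1"),
    (1, "80 10.0.0.1"),
    (2, "443 10.0.0.2"),
    (3, "80 10.0.0.1"),
    (31, "80 10.0.0.1"),  # falls outside the first 30-second window
]

result = frequent_in_window(events, window_start=0, window_length=30, min_packets=3)
print(result)  # {'80 10.0.0.1': 3}
```

In the streaming application, Spark performs the counting step per window through countByValueAndWindow; the threshold check corresponds to a filter applied to the resulting (value, count) pairs.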

How to run

First run the Java helper to simulate the network traffic, then run the PySpark script to monitor and count the tuples. The script is started with the following bash command: spark-submit frequent_ips.py <host> <port> <min_packets> <window>
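The four command-line arguments could be wired into a PySpark DStream job roughly as follows. This is a hedged sketch, not the repository's script: the function names `parse_args` and `run`, the 1-second batch interval, the `checkpoint` directory name, a slide interval equal to the window length, and the `>=` comparison are all assumptions. The Spark imports sit inside `run` so that the argument parsing can be exercised without a Spark installation.

```python
import sys


def parse_args(argv):
    """Parse <host> <port> <min_packets> <window> from the command line."""
    if len(argv) != 5:
        raise SystemExit(
            "usage: spark-submit frequent_ips.py <host> <port> <min_packets> <window>"
        )
    host = argv[1]
    port, min_packets, window = (int(a) for a in argv[2:5])
    return host, port, min_packets, window


def run(host, port, min_packets, window):
    """Count identical input lines per window and print the frequent ones."""
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext(appName="FrequentIPs")
    ssc = StreamingContext(sc, 1)  # batch interval in seconds (assumed)
    # countByValueAndWindow uses an incremental window reduction,
    # which requires a checkpoint directory to be set.
    ssc.checkpoint("checkpoint")

    # Each line received on the socket is treated as one tuple.
    lines = ssc.socketTextStream(host, port)

    # Yields (value, count) pairs; slide == window gives non-overlapping windows.
    counts = lines.countByValueAndWindow(window, window)
    frequent = counts.filter(lambda kv: kv[1] >= min_packets)
    frequent.pprint()

    ssc.start()
    ssc.awaitTermination()
```

Calling `run(*parse_args(sys.argv))` at the end of the script would start the job when it is launched through spark-submit.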

Results

With the command spark-submit frequent_ips.py localhost 9999 5 30, the script listens on port 9999 on localhost, sets min_packets to 5, and counts over a window of 30 seconds. The following pictures show two consecutive time windows:

First time window

Second time window

