This repo parses the Kaggle Global Terrorism Dataset (CSV) and selected GDELT datasets into a suitable format, then imports the results into Elasticsearch for further analysis and visualization.
The code has been tested on:
- Ubuntu 20.04
- Python 3.7.9 (compatible with Python 3+)
Install all the libraries in requirements.txt
Using pip
pip install -r requirements.txt
Using conda
conda install --file requirements.txt
CSV
Params
'--path', '-p':
type=str,
description='path to the data file'
'--dump', '-d':
default=False,
type=bool,
description='whether to dump a log for Beats to fetch (True: dump, False: do not dump)'
'--output', '-o':
default='log.json',
type=str,
description='where to dump the log; only used when --dump=True'
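These flags map naturally onto argparse. The snippet below is only a sketch of how they might be wired up, not the actual csv_parser.py code; the str2bool helper is an assumption added here because argparse's type=bool would treat any non-empty string (including "False") as True.

import argparse

def str2bool(value):
    # Illustrative helper: convert the "True"/"False" strings from the CLI into real booleans
    return str(value).lower() in ('true', '1', 'yes')

parser = argparse.ArgumentParser(description='Parse the Global Terrorism CSV for Elasticsearch')
parser.add_argument('--path', '-p', type=str, help='path to the data file')
parser.add_argument('--dump', '-d', type=str2bool, default=False,
                    help='dump a log for Beats to fetch (True: dump, False: do not dump)')
parser.add_argument('--output', '-o', type=str, default='log.json',
                    help='where to dump the log; only used when --dump is True')
args = parser.parse_args()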
Run
python csv_parser.py -p path_to_csv_file -o output_file -d True/False
# Original way:
python csv_parser.py -p terrorism.csv
# Dump log for Beats, default to log.json
python csv_parser.py -p terrorism.csv -d True
# Dump log for Beats to specific file
python csv_parser.py -p terrorism.csv -d True -o output.json
For more instructions on the available parameters:
python csv_parser.py --help
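Conceptually, the dump step can be pictured like this. The sketch assumes pandas is used and that the Beats log is written as one JSON document per line; the real csv_parser.py transformation and column handling will differ.

import json
import pandas as pd

def dump_for_beats(csv_path, output_path='log.json'):
    # The Kaggle Global Terrorism CSV is typically ISO-8859-1 encoded
    df = pd.read_csv(csv_path, encoding='ISO-8859-1', low_memory=False)
    with open(output_path, 'w') as f:
        for record in df.to_dict(orient='records'):
            # One JSON object per line so Filebeat can ship each event separately
            f.write(json.dumps(record, default=str) + '\n')

dump_for_beats('terrorism.csv', 'output.json')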
GDELT
TV News
python gdelt_parser.py -s startdate -e enddate --station station_list
# Example:
python gdelt_parser.py -s 20210407 -e 20210409 --station CNN BCCNEWS DW
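One plausible way to fetch this data is the public GDELT 2.0 TV API. The sketch below is an assumption about that approach (the station: query operator, the 'terrorism' search term, and the station codes are illustrative), not a description of what gdelt_parser.py actually does.

import requests

def fetch_tv_volume(start, end, stations):
    # Assumed approach: ask the GDELT 2.0 TV API for a coverage timeline per station
    query = 'terrorism (' + ' OR '.join('station:' + s for s in stations) + ')'
    params = {
        'query': query,
        'mode': 'timelinevol',
        'format': 'json',
        # The API expects YYYYMMDDHHMMSS, so pad the YYYYMMDD dates from the CLI
        'startdatetime': start + '000000',
        'enddatetime': end + '235959',
    }
    resp = requests.get('https://api.gdeltproject.org/api/v2/tv/tv', params=params)
    resp.raise_for_status()
    return resp.json()

data = fetch_tv_volume('20210407', '20210409', ['CNN', 'MSNBC'])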
Events 2.0
python gdelt_parser.py -s startdate -e enddate
# Example 1: set all start + end date
python gdelt_parser.py -s 20210412 -e 20210416
# Example 2: set only the start date; the end date defaults to now
python gdelt_parser.py -s 20210416
For more instructions on the available parameters:
python gdelt_parser.py --help
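For Events 2.0, GDELT publishes raw export files every 15 minutes. The sketch below shows one way such a file could be fetched and loaded; it is an assumed mechanism for illustration, not the repo's actual implementation, which would presumably walk all 15-minute files (or the masterfilelist) between the -s and -e dates.

import io
import zipfile
import requests
import pandas as pd

# GDELT publishes the URL of the newest 15-minute Events 2.0 export here
LASTUPDATE = 'http://data.gdeltproject.org/gdeltv2/lastupdate.txt'

def fetch_latest_events():
    # The first line of lastupdate.txt ends with the URL of the newest .export.CSV.zip
    lines = requests.get(LASTUPDATE).text.splitlines()
    export_url = lines[0].split()[-1]
    raw = requests.get(export_url).content
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        name = zf.namelist()[0]
        # Events 2.0 exports are tab-separated and have no header row
        return pd.read_csv(zf.open(name), sep='\t', header=None)

events = fetch_latest_events()
print(len(events), 'events in the latest 15-minute window')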
Customize the mapping body if you want; you can also define an "analyzer" for specific fields, as in the sketch below.
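A minimal example of such a mapping body, assuming the elasticsearch-py 7.x client; the index name, field names, and analyzer are purely illustrative, not the repo's actual mapping.

from elasticsearch import Elasticsearch

es = Elasticsearch('http://localhost:9200')

mapping_body = {
    'settings': {
        'analysis': {
            'analyzer': {
                # Hypothetical analyzer for free-text fields such as attack summaries
                'summary_analyzer': {
                    'type': 'custom',
                    'tokenizer': 'standard',
                    'filter': ['lowercase', 'stop'],
                }
            }
        }
    },
    'mappings': {
        'properties': {
            'eventid': {'type': 'keyword'},
            'event_date': {'type': 'date'},
            'country': {'type': 'keyword'},
            'summary': {'type': 'text', 'analyzer': 'summary_analyzer'},
            'location': {'type': 'geo_point'},
        }
    },
}

es.indices.create(index='terrorism', body=mapping_body)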