Fixed bug related to UTF-8 in certain requests #93

Open · wants to merge 5 commits into base: master
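The changes below thread explicit encode/decode calls through several of the scrapers. For context, this is the Python 2 failure mode the title refers to, shown as a minimal sketch (not part of the diff; the URL is a placeholder):

```python
# -*- coding: utf-8 -*-
# Minimal sketch of the bug class this PR fixes: in Python 2, response
# bodies are byte strings, and printing or concatenating non-ASCII bytes
# falls back to the ASCII codec, raising UnicodeDecodeError/UnicodeEncodeError.
import requests

resp = requests.get("https://example.com/")      # placeholder URL
raw = resp.content                               # byte string (str in Python 2)
text = raw.decode(resp.encoding or 'utf-8')      # decode explicitly -> unicode
print text.encode('utf-8')                       # re-encode before printing
```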
3 changes: 3 additions & 0 deletions .gitignore
@@ -5,10 +5,13 @@ __pycache__/

 config.py
+/config.py
+Dockerimages/config.py
+Dockerimages/src/*
 facebook_user_details.py
 generate_passwords.py
 git_searcher.py


 instaUsernameOsint.py
 ip_to_neighboursites.py
 test.py
36 changes: 36 additions & 0 deletions Dockerimages/Dockerfile
@@ -0,0 +1,36 @@
FROM debian:jessie-slim
MAINTAINER Paul Ganea <[email protected]>
ENV C_FORCE_ROOT=root
RUN apt-get update -y && apt-get install -y build-essential \
curl \
git \
gnupg \
libxml2-dev \
libxmlsec1-dev \
python \
python-dev \
runit \
unzip \
vim \
wget \
zip \
&& apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 0C49F3730359A14518585931BC711F9BA15703C6 \
&& echo "deb http://repo.mongodb.org/apt/debian jessie/mongodb-org/3.4 main" | tee /etc/apt/sources.list.d/mongodb-org-3.4.list \
&& wget -O- https://www.rabbitmq.com/rabbitmq-release-signing-key.asc | apt-key add - \
&& echo 'deb http://www.rabbitmq.com/debian/ testing main' | tee /etc/apt/sources.list.d/rabbitmq.list && apt-get update -y \
&& apt-get install -y mongodb-org rabbitmq-server \
&& curl https://bootstrap.pypa.io/get-pip.py | python \
&& apt-get install -y libxml2-dev python-dev libxmlsec1-dev \
&& git clone https://github.com/DataSploit/datasploit \
&& cd datasploit \
&& mkdir /datasploit/datasploitDb \
&& pip install -r requirements.txt \
&& useradd -ms /bin/bash k4ch0w \
&& apt-get remove -y libxml2-dev python-dev libxmlsec1-dev curl wget build-essential \
&& rm -rf /var/lib/apt/lists/*

USER k4ch0w

Review comment on `USER k4ch0w`:

> Why is the user k4ch0w?

Author:

> The user was left over from development. My bad, I didn't see the Docker image was already made. :( I will remove my image and add a link to the existing one to the documentation for new users of the project.

WORKDIR /datasploit
COPY service /etc/service/
COPY config.py /datasploit/config.py
ENTRYPOINT ["runsvdir", "/etc/service"]
30 changes: 30 additions & 0 deletions Dockerimages/Dockerfile.dev
@@ -0,0 +1,30 @@
FROM debian:jessie-slim
MAINTAINER Paul Ganea <[email protected]>
ENV C_FORCE_ROOT=root
RUN apt-get update -y && apt-get install -y build-essential \
curl \
git \
gnupg \
libxml2-dev \
libxmlsec1-dev \
python \
python-dev \
runit \
unzip \
vim \
wget \
zip \
&& apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 0C49F3730359A14518585931BC711F9BA15703C6 \
&& echo "deb http://repo.mongodb.org/apt/debian jessie/mongodb-org/3.4 main" | tee /etc/apt/sources.list.d/mongodb-org-3.4.list \
&& wget -O- https://www.rabbitmq.com/rabbitmq-release-signing-key.asc | apt-key add - \
&& echo 'deb http://www.rabbitmq.com/debian/ testing main' | tee /etc/apt/sources.list.d/rabbitmq.list && apt-get update -y \
&& apt-get install -y mongodb-org rabbitmq-server \
&& curl https://bootstrap.pypa.io/get-pip.py | python
COPY src/ /datasploit/
WORKDIR /datasploit
RUN cd /datasploit && mkdir /datasploit/datasploitDb \
&& pip install -r requirements.txt
COPY service /etc/service/
#Provide your own config.py! Do not commit it to your forked repo!
COPY config.py /datasploit/config.py
ENTRYPOINT ["runsvdir", "/etc/service"]
48 changes: 48 additions & 0 deletions Dockerimages/README.md
@@ -0,0 +1,48 @@
# Datasploit Docker images

### You will need to provide a config.py file for both of these images!

### Grab coffee :coffee: while it builds; a build takes about 5 minutes
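As a rough sketch of what that config.py needs to hold: the only keys visible in this diff are the Censys credentials used by domain_censys.py, so everything else is omitted here and the file name of the full template is an assumption.

```python
# config.py -- sketch only; supply your own keys.
# Only the fields referenced in this diff are shown. The repo's
# config_sample.py (assumed name) lists the full set of API keys.
censysio_id = "YOUR_CENSYS_API_ID"
censysio_secret = "YOUR_CENSYS_API_SECRET"
```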

## Development image:
### This image lets you test source-code changes
```bash

$ cp -r $EDITED_SOURCE_CODE_DIR src/
$ docker build -t="datasploit/datasploit" -f Dockerfile.dev .
$ docker run -d --name="datasploit" datasploit/datasploit
$ docker exec -it datasploit bash
root@61e05f5b7776:/datasploit#

```


## Working Docker image
### This image clones from the master branch

```bash
$ docker build -t="datasploit/datasploit" -f Dockerfile .
$ docker run -d --name="datasploit" datsploit/datasploit
$ docker exec -it datasploit bash
k4ch0w@5722c53edc24:/datasploit$ python domainOsint.py

    ____/ /____ _ / /_ ____ _ _____ ____ / /____ (_)/ /_
   / __  // __ `// __// __ `// ___// __ \ / // __ \ / // __/
  / /_/ // /_/ // /_ / /_/ /(__  )/ /_/ // // /_/ // // /_
  \__,_/ \__,_/ \__/ \__,_//____// .___//_/ \____//_/ \__/
                                /_/

Open Source Assistant for #OSINT
website: www.datasploit.info

[-] Invalid argument passed.
Usage: domainOsint.py [options]

Options:
-h, --help show this help message and exit
-d DOMAIN, --domain=DOMAIN Domain name against which automated Osint is to be performed.
k4ch0w@5722c53edc24:/datasploit$
```



6 changes: 6 additions & 0 deletions Dockerimages/service/celery/run
@@ -0,0 +1,6 @@
#!/bin/bash

export C_FORCE_ROOT=root
cd /datasploit/core
celery -A core worker -l info --concurrency 20

1 change: 1 addition & 0 deletions Dockerimages/service/mongod/run
@@ -0,0 +1 @@
#!/bin/sh
mongod --dbpath /datasploit/datasploitDb
2 changes: 2 additions & 0 deletions Dockerimages/service/rabbitmq/run
@@ -0,0 +1,2 @@
#!/bin/sh
rabbitmq-server
2 changes: 2 additions & 0 deletions Dockerimages/service/server/run
@@ -0,0 +1,2 @@
#!/bin/sh
python /datasploit/core/manage.py runserver 0.0.0.0:8000
3 changes: 2 additions & 1 deletion active_default_file_check.py
@@ -1,8 +1,9 @@
 import requests
 import re
+import codecs
 import sys

-list_urls = open("check_urls.txt")
+list_urls = codecs.open("check_urls.txt", encoding='utf-8')
 existing_urls = []
 host = sys.argv[1]
 base_url = "http://" + host + "/"
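A quick illustration of why the codecs.open() swap matters (a sketch, not part of the diff): plain open() in Python 2 yields byte strings, so URLs with non-ASCII characters blow up as soon as they are mixed with unicode; codecs.open() decodes each line up front.

```python
# Sketch: reading a UTF-8 wordlist safely in Python 2.
import codecs

with codecs.open("check_urls.txt", encoding='utf-8') as f:  # lines come back as unicode
    for line in f:
        url = line.strip()
        print url.encode('utf-8')  # encode explicitly before writing to stdout
```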
45 changes: 23 additions & 22 deletions core/osint/domain_GooglePDF.py
@@ -1,37 +1,38 @@
-from bs4 import BeautifulSoup
-import sys
-import urllib2
-import re
-import string
+from bs4 import BeautifulSoup
+import sys
+import urllib2
+import re
+import string
 from celery import shared_task

 from osint.utils import *

 '''
-This code is a bit messed up. Lists files from first page only. Needs a lot of modification.
+This code is a bit messed up. Lists files from first page only. Needs a lot of modification.

 '''

 def googlesearch(query, ext):
     print query
     try:
-        google="https://www.google.co.in/search?filter=0&q=site:"
+        google="https://www.google.co.in/search?filter=0&q=site:"
         getrequrl="https://www.google.co.in/search?filter=0&num=100&q=%s&start=" % (query)
-        hdr = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11',
-               'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
-               'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
-               'Accept-Encoding': 'none',
-               'Accept-Language': 'en-US,en;q=0.8',
-               'Connection': 'keep-alive'}
-        req=urllib2.Request(getrequrl, headers=hdr)
-        response=urllib2.urlopen(req)
-        data = response.read()
-        data=re.sub('<b>','',data)
-        for e in ('>','=','<','\\','(',')','"','http',':','//'):
-            data = string.replace(data,e,' ')
+        hdr = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11',
+               'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
+               'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
+               'Accept-Encoding': 'none',
+               'Accept-Language': 'en-US,en;q=0.8',
+               'Connection': 'keep-alive'}
+        req=urllib2.Request(getrequrl, headers=hdr)
+        response=urllib2.urlopen(req)
+        encoding = response.headers.getparam('charset')
+        data = response.read().encode(encoding)
+        data=re.sub('<b>','',data)
+        for e in ('>','=','<','\\','(',')','"','http',':','//'):
+            data = string.replace(data,e,' ')

-        r1 = re.compile('[-_.a-zA-Z0-9.-_]*'+'\.'+ ext)
-        res = r1.findall(data)
+        r1 = re.compile('[-_.a-zA-Z0-9.-_]*'+'\.'+ ext)
+        res = r1.findall(data)
         if not res:
             print "No results were found"
             return []
@@ -52,7 +53,7 @@ def run(domain, taskId):
     return data

 if __name__ == "__main__":
-    domain=sys.argv[1]
+    domain=sys.argv[1]
     print "\t\t\t[+] PDF Files\n"

     result = run(domain)
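The key change above reads the charset advertised in the Content-Type header before touching the body. A sketch of that pattern follows (Python 2 / urllib2, placeholder URL): getparam('charset') returns None when the server advertises no charset, so a fallback is worth adding, and the usual direction for this pattern is to decode the raw bytes.

```python
# Sketch: charset-aware read of a urllib2 response (Python 2).
import urllib2

response = urllib2.urlopen("https://example.com/")          # placeholder URL
encoding = response.headers.getparam('charset') or 'utf-8'  # fallback when absent
data = response.read().decode(encoding)                     # bytes -> unicode
```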
19 changes: 10 additions & 9 deletions core/osint/domain_censys.py
@@ -6,10 +6,9 @@


 def censys_search(domain):
-    #censys_list = []
     pages = float('inf')
     page = 1
-
+    #censys_list = []
     while page <= pages:
         print "Parsed and collected results from page %s" % (str(page))
         params = {'query' : domain, 'page' : page}
@@ -24,17 +23,19 @@ def censys_search(domain):
             proto = r["protocols"]
             proto = [p.split("/")[0] for p in proto]
             proto.sort(key=float)
-            protoList = ','.join(map(str, proto))
+            protoList = ','.join(map(str, proto))

             temp_dict["ip"] = ip
-            temp_dict["protocols"] = protoList
+            temp_dict["protocols"] = protoList

             #print '[%s] IP: %s - aaProtocols: %s' % (colored('*', 'red'), ip, protoList)


+            print temp_dict
             if '80' in protoList:
                 new_dict = view(ip, temp_dict)
                 censys_list.append(new_dict)
             else:
-
                 censys_list.append(temp_dict)

         pages = payload['metadata']['pages']
@@ -47,16 +48,15 @@ def censys_search(domain):

 def view(server, temp_dict):
     res = requests.get("https://www.censys.io/api/v1/view/ipv4/%s" % (server), auth = (cfg.censysio_id, cfg.censysio_secret))
-    payload = res.json()
-
+    payload = res.json()
     try:
         if 'title' in payload['80']['http']['get'].keys():
             #print "[+] Title: %s" % payload['80']['http']['get']['title']
             title = payload['80']['http']['get']['title']
             temp_dict['title'] = title
         if 'server' in payload['80']['http']['get']['headers'].keys():
             header = "[+] Server: %s" % payload['80']['http']['get']['headers']['server']
-            temp_dict["server_header"] = payload['80']['http']['get']['headers']['server']
+            temp_dict["server_header"] = payload['80']['http']['get']['headers']['server']
         return temp_dict

     except Exception as error:
@@ -68,6 +68,7 @@ def view(server, temp_dict):

 def main():
     domain = sys.argv[1]
+    print "Starting censys on : " + domain
     censys_search(domain)
     for x in censys_list:
         print x
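domain_censys.py walks the Censys results a page at a time: pages starts at infinity and is replaced by the real count from the response metadata on the first pass. A condensed sketch of that loop (the search endpoint URL and HTTP verb are assumptions; only the /view/ URL appears in this diff):

```python
# Sketch of the pagination pattern used in censys_search() (Python 2).
import requests

def paged_search(domain, auth):
    results, page, pages = [], 1, float('inf')
    while page <= pages:
        res = requests.get("https://www.censys.io/api/v1/search/ipv4",  # assumed endpoint
                           params={'query': domain, 'page': page}, auth=auth)
        payload = res.json()
        results.extend(payload.get('results', []))
        pages = payload['metadata']['pages']  # real page count, known after page 1
        page += 1
    return results
```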
20 changes: 10 additions & 10 deletions core/osint/domain_subdomains.py
@@ -1,6 +1,6 @@
 import sys
 import json
-import requests
+import requests
 from bs4 import BeautifulSoup
 import re
 from domain_pagelinks import pagelinks
@@ -28,7 +28,7 @@ def subdomains(domain):
     headers = {}
     headers['Referer'] = "https://dnsdumpster.com/"
     req = requests.post("https://dnsdumpster.com/", data = data, cookies = cookies, headers = headers)
-    soup = BeautifulSoup(req.content, 'lxml')
+    soup = BeautifulSoup(req.content.encode('utf-8'), 'lxml')
     subdomains=soup.findAll('td',{"class":"col-md-4"})
     for subd in subdomains:
         if domain in subd.text:
@@ -63,15 +63,15 @@ def find_subdomains_from_wolfram(domain):

     if recalculate != "":
         recalc_code = json.loads(req1.content)['queryresult']['recalculate'].split("=")[1].split("&")[0]

         #third request to get calc_id
         #print "http://www.wolframalpha.com/input/json.jsp?action=recalc&format=image,plaintext,imagemap,minput,moutput&id=%s&output=JSON&output=JSON&scantimeout=10&statemethod=deploybutton&storesubpodexprs=true" % (recalc_code)
         req2 = requests.get("http://www.wolframalpha.com/input/json.jsp?action=recalc&format=image,plaintext,imagemap,minput,moutput&id=%s&output=JSON&output=JSON&scantimeout=10&statemethod=deploybutton&storesubpodexprs=true" % (recalc_code), headers=headers, proxies=proxies)
         pods = json.loads(req2.content)['queryresult']['pods']
         for x in pods:
             if "Web statistics for" in x['title']:
                 async_code = x['async'].split('=')[1]

         #fourth request to get id for subdomains.
         req3 = requests.get("http://www.wolframalpha.com/input/json.jsp?action=asyncPod&format=image,plaintext,imagemap,minput,moutput&formattimeout=20&id=%s&output=JSON&podtimeout=20&statemethod=deploybutton&storesubpodexprs=true" % (async_code), headers=headers, proxies=proxies)
         for x in json.loads(req3.content)['pods'][0]['deploybuttonstates']:
@@ -80,7 +80,7 @@ def find_subdomains_from_wolfram(domain):
                 sub_code = x['input']
             else:
                 pass

         #fifth request to find few subdomains
         url = "http://www.wolframalpha.com/input/json.jsp?async=false&dbid=%s&format=image,plaintext,imagemap,sound,minput,moutput&includepodid=WebSiteStatisticsPod:InternetData&input=%s&output=JSON&podTitle=Web+statistics+for+all+of+%s&podstate=%s&s=%s&statemethod=deploybutton&storesubpodexprs=true&text=Subdomains" % (sub_code, domain, domain, sub_code, server_value)
         req4 = requests.get(url, headers = headers, proxies = proxies)
@@ -97,9 +97,9 @@ def find_subdomains_from_wolfram(domain):
         else:
             more_code = "blank_bro"

-        #wooh, final request bitch.
+        #wooh, final request bitch.
         url = "http://www.wolframalpha.com/input/json.jsp?async=false&dbid=%s&format=image,plaintext,imagemap,sound,minput,moutput&includepodid=WebSiteStatisticsPod:InternetData&input=%s&output=JSON&podTitile=Subdomains&podstate=%s&s=%s&statemethod=deploybutton&storesubpodexprs=true&text=More" % (more_code, domain, more_code, servervalue_for_more)
-        req5 = requests.get(url, headers = headers, proxies = proxies)
+        req5 = requests.get(url, headers = headers, proxies = proxies)
         for x in json.loads(req5.content)['queryresult']['subpods']:
             if x['title'] == "Subdomains":
                 temp_subdomain_list = x['plaintext'].split("\n")
@@ -129,7 +129,7 @@ def subdomains_from_netcraft(domain):
     link_regx = re.compile('<a href="http://toolbar.netcraft.com/site_report\?url=(.*)">')
     links_list = link_regx.findall(req1.content)
     for x in links_list:
-        dom_name = x.split("/")[2].split(".")
+        dom_name = x.split("/")[2].split(".")
         if (dom_name[len(dom_name) - 1] == target_dom_name[1]) and (dom_name[len(dom_name) - 2] == target_dom_name[0]):
             check_and_append_subdomains(x.split("/")[2])
     num_regex = re.compile('Found (.*) site')
@@ -151,7 +151,7 @@ def subdomains_from_netcraft(domain):
         link_regx = re.compile('<a href="http://toolbar.netcraft.com/site_report\?url=(.*)">')
         links_list = link_regx.findall(req2.content)
         for y in links_list:
-            dom_name1 = y.split("/")[2].split(".")
+            dom_name1 = y.split("/")[2].split(".")
             if (dom_name1[len(dom_name1) - 1] == target_dom_name[1]) and (dom_name1[len(dom_name1) - 2] == target_dom_name[0]):
                 check_and_append_subdomains(y.split("/")[2])
             last_item = links_list[len(links_list) - 1].split("/")[2]
@@ -174,7 +174,7 @@ def run(domain, taskId):
     subdomains_from_netcraft(odomain)
     save_record(domain, taskId, "Subdomains", subdomain_list)
     return subdomain_list



 def main():
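For reference, the dnsdumpster call that the encode('utf-8') fix lands in follows a CSRF dance: fetch the page once to pick up a token, then POST it back with a matching Referer. A sketch under those assumptions (the cookie and form-field names are not shown in this hunk and are assumptions here):

```python
# Sketch: the dnsdumpster scrape pattern behind subdomains() (Python 2).
import requests
from bs4 import BeautifulSoup

def scrape_subdomains(domain):
    sess = requests.Session()
    sess.get("https://dnsdumpster.com/")                  # pick up the CSRF cookie
    token = sess.cookies.get('csrftoken')                 # assumed cookie name
    req = sess.post("https://dnsdumpster.com/",
                    data={'csrfmiddlewaretoken': token,   # assumed field names
                          'targetip': domain},
                    headers={'Referer': "https://dnsdumpster.com/"})
    soup = BeautifulSoup(req.content, 'lxml')
    return [td.text.strip() for td in soup.findAll('td', {"class": "col-md-4"})
            if domain in td.text]
```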
4 changes: 2 additions & 2 deletions core/osint/domain_wikileaks.py
@@ -10,7 +10,7 @@ def wikileaks(domain, taskId):
     req = requests.get('https://search.wikileaks.org/?query=&exact_phrase=%s&include_external_sources=True&order_by=newest_document_date'%(domain))
     soup=BeautifulSoup(req.content, "lxml")
     count=soup.findAll('div',{"class":"total-count"})
-    print "Total "+count[0].text
+    print "Total "+count[0].text.encode('utf-8')
     divtag=soup.findAll('div',{'class':'result'})
     links={}
     for a in divtag:
@@ -28,7 +28,7 @@ def main():
         print "%s (%s)" % (lnk, tl)
     print "For all results, visit: "+ 'https://search.wikileaks.org/?query=&exact_phrase=%s&include_external_sources=True&order_by=newest_document_date'%(domain)
     print "\n-----------------------------\n"



 #if __name__ == "__main__":
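The one-line change above is the same theme once more: BeautifulSoup hands back unicode, and in Python 2 printing unicode to a pipe or a non-UTF-8 terminal re-encodes it with the ASCII default. A tiny sketch:

```python
# Sketch: why .encode('utf-8') is applied before printing (Python 2).
title = u'Caf\xe9 leaks'      # stand-in for count[0].text
print title.encode('utf-8')   # explicit encode avoids UnicodeEncodeError
```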