URLCrazy v0.2
urbanadventurer committed Apr 21, 2020
1 parent d94da14 commit 91f7ada
Showing 2 changed files with 156 additions and 59 deletions.
126 changes: 126 additions & 0 deletions README
@@ -0,0 +1,126 @@
Title: UrlCrazy Readme
Version: 0.2
Description: UrlCrazy is for the study of domainname typos / url hijacking.
Release Date: March 2009
Author: horton.nz{at-nospam}gmail, Andrew Horton (urbanadventurer)
Primary-site: code.google.com/p/urlcrazy
Platforms: Linux, Anything with Ruby
Copying-policy: BSD


DESCRIPTION
UrlCrazy is for the study of domainname typos / url hijacking.

It generates domainname typo permutations, tests whether they are in use, estimates their popularity and more.


TYPES OF TYPOS SUPPORTED
Character Omission.
These typos are created by leaving out a letter of the domain name, one letter at a time. For example, www.goole.com and www.gogle.com
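
A minimal sketch of character omission, assuming the registered name and the extension are passed in separately. The method name character_omission is hypothetical and is not necessarily how urlcrazy.rb implements it.

```ruby
# Hypothetical helper: drop one character at a time from the registered name.
def character_omission(name, extension)
  (0...name.length)
    .map { |i| name[0...i] + name[(i + 1)..-1] } # remove the character at index i
    .uniq                                        # repeated letters produce duplicates
    .map { |typo| "#{typo}.#{extension}" }
end

puts character_omission("google", "com")
```

Repeated letters such as the double "o" yield the same typo twice, so the list is de-duplicated before the extension is appended.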

Adjacent Character Swap.
These typos are created by swapping the order of adjacent letters in the domain name. For example, www.googel.com and www.ogogle.com
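
A minimal sketch of the adjacent swap, under the same assumptions as before (a separate name and extension). The method name character_swap here is illustrative only.

```ruby
# Hypothetical helper: swap each pair of neighbouring characters in turn.
def character_swap(name, extension)
  (0...(name.length - 1))
    .map { |i|
      chars = name.chars
      chars[i], chars[i + 1] = chars[i + 1], chars[i]
      chars.join
    }
    .uniq
    .reject { |typo| typo == name } # swapping identical letters changes nothing
    .map { |typo| "#{typo}.#{extension}" }
end

puts character_swap("google", "com")
```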

Adjacent Character Replacement.
These typos are created by replacing each letter of the domain name with letters to the immediate left and right on the keyboard. For example, www.googke.com and www.goohle.com
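
A minimal sketch of adjacent character replacement. The keyboard layout below is an assumed plain QWERTY layout, and keyboard_neighbours and character_replacement are hypothetical names; the Keyboard class in urlcrazy.rb may differ.

```ruby
# Assumption: a plain QWERTY layout, looking only at the same row.
def keyboard_neighbours(char)
  rows = %w[1234567890 qwertyuiop asdfghjkl zxcvbnm]
  rows.each do |row|
    i = row.index(char)
    next if i.nil?
    left = i > 0 ? row[i - 1] : nil
    right = row[i + 1] # nil when char is the last key in the row
    return [left, right].compact
  end
  [] # characters that are not on the layout have no neighbours
end

# Hypothetical helper: replace each character with its left and right neighbour.
def character_replacement(name, extension)
  typos = []
  name.each_char.with_index do |char, i|
    keyboard_neighbours(char).each do |neighbour|
      typo = name.dup
      typo[i] = neighbour
      typos << "#{typo}.#{extension}"
    end
  end
  typos.uniq
end

puts character_replacement("google", "com")
```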

Adjacent Character Insertion.
These typos are created by inserting letters to the immediate left and right on the keyboard of each letter. For example, www.googhle.com and www.goopgle.com
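
A minimal sketch of adjacent character insertion. It assumes a plain QWERTY layout and inserts the neighbouring key directly after each letter, which matches the examples above. The method names are hypothetical.

```ruby
# Assumption: a plain QWERTY layout, looking only at the same row.
def keyboard_neighbours(char)
  rows = %w[1234567890 qwertyuiop asdfghjkl zxcvbnm]
  rows.each do |row|
    i = row.index(char)
    next if i.nil?
    left = i > 0 ? row[i - 1] : nil
    right = row[i + 1] # nil when char is the last key in the row
    return [left, right].compact
  end
  []
end

# Hypothetical helper: insert each neighbouring key after the letter it borders.
def character_insertion(name, extension)
  typos = []
  name.each_char.with_index do |char, i|
    keyboard_neighbours(char).each do |neighbour|
      typos << "#{name[0..i]}#{neighbour}#{name[(i + 1)..-1]}.#{extension}"
    end
  end
  typos.uniq
end

puts character_insertion("google", "com")
```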

Missing Dot.
These typos are created by omitting a dot from the domainname. For example, wwwgoogle.com and www.googlecom
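
A minimal sketch of the missing dot typo, assuming the full domain is given as one string. The method name missing_dot is hypothetical.

```ruby
# Hypothetical helper: remove each dot in turn, one dot per typo.
def missing_dot(domain)
  typos = []
  domain.each_char.with_index do |char, i|
    next unless char == "."
    typos << domain[0...i] + domain[(i + 1)..-1]
  end
  typos.uniq
end

puts missing_dot("www.google.com")
```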

Strip Dashes.
These typos are created by omitting a dash from the domainname. For example, www.domain-name.com becomes www.domainname.com
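
A minimal sketch of stripping dashes. It assumes all dashes are removed at once, and strip_dashes is a hypothetical name rather than the method in urlcrazy.rb.

```ruby
# Hypothetical helper: return the dash-free domain, or nothing if there was no dash.
def strip_dashes(domain)
  stripped = domain.delete("-")
  stripped == domain ? [] : [stripped]
end

puts strip_dashes("www.domain-name.com")
```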

Singular or Pluralise.
These typos are created by making a singular domain plural and vice versa. For example, www.google.com becomes www.googles.com and www.trademe.co.nz becomes www.trademes.co.nz



DOMAIN TESTS
Is the domain valid?
--------------------
UrlCrazy has a database of valid top-level and second-level domains, compiled from Wikipedia and domain registrars. A domain is treated as valid if it matches a known top-level domain and, where the registry restricts them, a known second-level domain. For example, www.trademe.co.bz is valid because Belize (.bz) allows any second-level registration, but www.trademe.xo.nz is not because xo.nz is not an allowed second-level domain in New Zealand.
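
A minimal sketch of this check. The table below is a small illustrative excerpt and not UrlCrazy's real database, and valid_domain? is a hypothetical name.

```ruby
# Hypothetical helper: tlds maps a top-level domain either to :any (open
# second-level registration) or to a list of allowed second-level domains.
def valid_domain?(domain, tlds)
  labels = domain.downcase.split(".")
  return false if labels.length < 2

  allowed = tlds[labels.last]
  return false if allowed.nil?       # unknown top-level domain
  return true if allowed == :any     # any second-level name may be registered

  labels.length >= 3 && allowed.include?(labels[-2])
end

# Illustrative excerpt only; the real lists are much longer.
tlds = {
  "com" => :any,
  "bz"  => :any,
  "nz"  => %w[co org net govt ac],
}

puts valid_domain?("www.trademe.co.bz", tlds)
puts valid_domain?("www.trademe.xo.nz", tlds)
```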

Popularity Estimate
-------------------
We can estimate the relative popularity of a typo by measuring how often that typo appears on web pages. Querying cuil.com for the number of search results for a typo gives us an indication of how popular it is.

The drawback of this approach is that you need to manually identify and omit legitimate domains such as googles.com.

For example, consider the following typos for google.com, each listed with its number of search results.
25424 gogle.com
24031 googel.com
22490 gooogle.com
19172 googles.com
19148 goole.com
18855 googl.com
17842 ggoogle.com
16490 googe.com
16367 googgle.com
15029 google.cm
14773 gogole.com
13227 googlle.com
11646 googlee.com
11345 googlr.com
7417 foogle.com
6132 hoogle.com
5313 googlw.com
5208 giogle.com
5151 googke.com
4838 goigle.com
4662 ogogle.com
4630 gopgle.com
4415 goofle.com
4118 wwwgoogle.com
3894 goohle.com
3399 gooigle.com
2675 gfoogle.com
1942 googlecom.com
1534 gopogle.com
1356 googfle.com
1089 googhle.com
892 googlew.com
747 googlke.com
618 goiogle.com
614 goopgle.com
413 ghoogle.com
341 goolge.com
232 googler.com
228 gpogle.com
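
A minimal sketch of ranking typos by result count while leaving out domains you have identified as legitimate. The counts are taken from the list above; rank_typos is a hypothetical name, and fetching the counts from a search engine is not shown.

```ruby
# Hypothetical helper: drop known-legitimate domains, then sort by hits, highest first.
def rank_typos(counts, legitimate = [])
  counts
    .reject { |domain, _hits| legitimate.include?(domain) }
    .sort_by { |_domain, hits| -hits }
end

counts = {
  "goole.com"   => 19148,
  "googles.com" => 19172,
  "gogle.com"   => 25424,
}

rank_typos(counts, ["googles.com"]).each do |domain, hits|
  puts "#{hits} #{domain}"
end
```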

IP Address
----------
If the typo domainname is in use, UrlCrazy displays the IP it resolves to. An IP repeating for multiple typos, or IPs in a close range, suggests common ownership. For example, gogle.com, gogole.com and googel.com all resolve to 64.233.161.104 which is owned by Google.
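
A minimal sketch of spotting repeated IPs. It assumes the typos have already been resolved into a hash of domain to IP; the lookup itself is not shown, group_by_ip is a hypothetical name, and the last entry is made-up sample data.

```ruby
# Hypothetical helper: group typo domains by IP and keep IPs shared by several typos.
def group_by_ip(resolved)
  resolved
    .group_by { |_domain, ip| ip }
    .map { |ip, pairs| [ip, pairs.map(&:first)] }
    .to_h
    .select { |_ip, domains| domains.length > 1 }
end

resolved = {
  "gogle.com"         => "64.233.161.104",
  "gogole.com"        => "64.233.161.104",
  "googel.com"        => "64.233.161.104",
  "example-typo.test" => "192.0.2.1", # made-up entry for illustration
}

group_by_ip(resolved).each do |ip, domains|
  puts "#{ip}: #{domains.join(', ')}"
end
```

This only catches exact matches; IPs in a close range would need a comparison on the address prefix instead.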



COUNTRY CODE DATABASE
http://en.wikipedia.org/wiki/Top-level_domain
http://en.wikipedia.org/wiki/Country_code_top-level_domain
2nd level domains here:
http://www.iana.org/domains/root/db/


SEE ALSO
http://en.wikipedia.org/wiki/Wikipedia:AutoWikiBrowser/Typos
http://en.wikipedia.org/wiki/Wikipedia:Typo
http://en.wikipedia.org/wiki/Typosquatting

Strider is a tool with similar aims, produced by Microsoft: http://research.microsoft.com/csm/strider/


INSTALLATION
UrlCrazy requires Ruby. If you are using Ubuntu or Debian try:
$ sudo apt-get install ruby

There is no need to install UrlCrazy; run it from its own folder.



CREDITS
Authored by Andrew Horton (urbanadventurer) horton.nz {at-nospam} gmail

Thanks to Ruby on Rails for Inflector, which allows plural and singular permutations.

89 changes: 30 additions & 59 deletions urlcrazy.rb
@@ -1,4 +1,4 @@
-#!/usr/bin/ruby
+#!/usr/bin/env ruby
require 'getoptlong'
require 'singleton'
require 'inflector.rb'
@@ -7,36 +7,37 @@
require 'socket'
require 'net/http'

-$VERSION="0.1"
+$VERSION="0.2"

=begin
urlcrazy
Title: UrlCrazy Readme
Version: 0.2
Description: UrlCrazy is for the study of domainname typos / url hijacking.
Release Date: March 2009
Author: horton.nz{at-nospam}gmail, Andrew Horton (urbanadventurer)
Primary-site: code.google.com/p/urlcrazy
Platforms: Linux, Anything with Ruby
Copying-policy: BSD
generates domainname typo permutations to study typo squatting / url hijacking
Read the README file
similar research http://research.microsoft.com/csm/strider/
TO DO:
new typo - repeating characters
new typo - wrong tld.
anything to .com, anything like .org.nz to .co.nz or .org.au to .com.au
thanks to ruby on rails for inflector allowing plural and singular permutations
2do:
# options to turn on/off resolving, search engine popularity -r, -p
# say 'please wait' then rewrite it
new typo - common misspellings
vowel swap, phonetic spelling
stop the ..com
new typo - substitution of doublecharacters. google => giigle
sort & uniq the typos
show the real domains results 1st
show only unique typos
wrong tld. anything to .com, anything like .org.nz to .co.nz or .org.au to .com.au
common misspelling
vowel swap, phonetic spelling
substitution of doublecharacters. google => giigle
sort results with valid domains 1st
confirm popularity results - - compare popularity vs. AOL search histories
make gui that shows :
domain __________________ GO
@@ -45,38 +46,9 @@
yahoo.com missing-dot yahoocom 149,000 - n - AVAILABLE
dog.co.uk pluralize dogs.co.uk 45,000 .co.uk y 210.2.4.5 [UK] ?
keep results in mysql? or similar in files
let ppl register domains from it - aff links.
country codes backwards -- is it valid? i.e. 2nd level com.au typed as 1st level com.ua
TLDs
http://en.wikipedia.org/wiki/Top-level_domain
http://en.wikipedia.org/wiki/Country_code_top-level_domain
2nd level domains here:
http://en.wikipedia.org/wiki/.uk
http://www.iana.org/domains/root/db/
similar but not related are domains that look the same but are different. change l for 1 etc
Missing-dot typos: These typos occur when a user fails to type the ".", or dot between the "www" and the domain name in the URL. For example, typing http://wwwsecurityfocus.com rather than http://www.securityfocus.com.
Character-omission typos: These typo-domains are created by leaving out a letter of the domain name, one letter at a time. For example, http://www.securityfocs.com and http://www.securityfous.com.
cache/ save results in mysql or text files
Character-permutation typos: These are domains that occur when two of the letters in the domain name are transposed, or swapped while typing. Typo-neighborhood generator generates all such domains by swapping all characters one pair at a time. For example, http://www.securiytfocus.com or http://www.securityfcous.com.
Character-replacement typos: To generate character-replacement typo-domains, the Strider typo-neighborhood generator replaces each letter in the domain with each of the letters adjacent to it on the keyboard. For example, typing http://www.secueityfocus.com or http://www.securityfpcus.com.
extra www
strip dashes
Character-insertion typos: These typo domains are generated by inserting an additional character from one of the letters adjacent to the letter from the domain. It can also include using the same letter twice. For example, http://www.securiotyfocus.com or http://www.securityffocus.com.
singular / pluralise
person => people, car => cars
* A differently phrased domain name: examples.com
let ppl register domains from the gui version
* A common misspelling, or foreign language spelling, of the intended site: exemple.com
phonetic misspelling. OPERATOR vs OPERATAR
@@ -90,22 +62,21 @@
ei = > ie
* A different top-level domain: example.org
.com or .co versions. whitehouse.com instead of whitehouse.gov
References:
http://en.wikipedia.org/wiki/Typosquatting
http://en.wikipedia.org/wiki/Wikipedia:AutoWikiBrowser/Typos
maybe use http://whois.rubyforge.org/rdoc/ for whois info
maybe use maxmind geoip for countries of IPs -- maybe not as useful coz that's just where it's hosted
what does this provide that strider doesn't?
Similar but not related are domains that look the same but are different. change l for 1 etc
what this provides that strider doesn't?
shows potential domain typo popularity
shows available domain typos that aren't hijacked -- no whois yet
focused on finding typos more than discovering sites using typos to serve ads
discovers more classes of typos
crashes on some short inputs
=end

class Keyboard
@@ -185,6 +156,7 @@ def create_typos
t.name=c
@typos<<t
}
self.character_swap.sort.uniq.each {|c|
t=Typo.new
t.type ="Character Swap"
@@ -315,8 +287,8 @@ def stripdashes
def singular_or_pluralise
list= Array.new
-list << ActiveSupport::Inflector.singularize(@registered_name)+"."+@extension
-list << ActiveSupport::Inflector.pluralize(@registered_name)+"."+@extension
+list << ActiveSupport::Inflector.singularize(@registered_name)+"."+@extension.to_s
+list << ActiveSupport::Inflector.pluralize(@registered_name)+"."+@extension.to_s
list.delete(@domain)
list
end
@@ -375,9 +347,9 @@ def usage
d=Domainname.new(ARGV[0].downcase)
abort "Aborting. Invalid domainname." unless d.valid == true
puts "#Please wait ... generating typo's for #{d.domain}"
d.create_typos
columns=Array.new
@@ -393,7 +365,6 @@ def usage
headings[5]="Popularity"
headings[6]="IP"
d.typos.each {|typo|
columns[0] << d.domain.to_s
columns[1] << typo.type.to_s