This repository has been archived by the owner on Jan 18, 2018. It is now read-only.

Domains in list that might never have used Cloudflare #157

Open
Zenexer opened this issue Feb 25, 2017 · 21 comments

Comments

@Zenexer
Contributor

Zenexer commented Feb 25, 2017

8053a43 introduced at least one domain, zoho.com, that might not have any ties to Cloudflare (#83). This commit should be reviewed.

@pirate Do you happen to remember how those domains were found?

cc @deepsk79

@pirate
Owner

pirate commented Feb 25, 2017

These were copied from reports found in the original HN thread.

@Zenexer
Contributor Author

Zenexer commented Feb 25, 2017

Ah, figures. I suppose we'll have to check many of those manually.

@pirate
Owner

pirate commented Feb 25, 2017

e.g. https://news.ycombinator.com/item?id=13720208

Yeah, some manual checking is a good idea.

@Zenexer
Contributor Author

Zenexer commented Feb 25, 2017

Domains in 8053a43 using the CF proxy:

Edit: Updated at 2017-02-25 04:41 UTC to indicate that fitbit.com does actually use the CF proxy.

@Zenexer
Contributor Author

Zenexer commented Feb 25, 2017

StackOverflow was already removed per #21 and zoho.com per #83. That just leaves fitbit.com.

I'm going to leave this issue open, as we should review related commits.

Zenexer added a commit to Zenexer/sites-using-cloudflare that referenced this issue Feb 25, 2017
Doesn't use Cloudflare presently, and didn't within the relevant
timeframe.

Signed-off-by: Paul Buonopane <[email protected]>
@pirate
Owner

pirate commented Feb 25, 2017

@Zenexer I believe Fitbit data appeared in several search engine caches. I'm not sure what domain it was from though, probably a subdomain other than fitbit.com.

@abalabahaha
Contributor

See #158: both www.fitbit.com and api.fitbit.com are behind CF, just not fitbit.com itself.
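
The per-hostname distinction here (the `www` and `api` hosts proxied, the apex domain not) is the kind of thing that can be checked by comparing each hostname's resolved addresses against Cloudflare's published IP ranges. Below is a minimal sketch of that comparison, not the method used to build this list. It assumes a small illustrative subset of Cloudflare's IPv4 ranges (the authoritative list is published by Cloudflare and may differ), and the names `CF_RANGES_SAMPLE`, `is_cloudflare_ip`, and `proxied_hosts`, as well as the hostnames and addresses in `resolved`, are hypothetical stand-ins rather than real lookup results.

```python
import ipaddress

# Illustrative subset of Cloudflare's published IPv4 ranges.
# Assumption: this sample may be incomplete or out of date; a real check
# should load the current list that Cloudflare publishes.
CF_RANGES_SAMPLE = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "103.21.244.0/22",
        "104.16.0.0/13",
        "173.245.48.0/20",
        "198.41.128.0/17",
    )
]


def is_cloudflare_ip(ip, ranges=CF_RANGES_SAMPLE):
    """Return True if the address falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ranges)


def proxied_hosts(resolved, ranges=CF_RANGES_SAMPLE):
    """Given {hostname: [ip, ...]}, return the sorted hostnames that have
    at least one address inside the given networks."""
    return sorted(
        host
        for host, ips in resolved.items()
        if any(is_cloudflare_ip(ip, ranges) for ip in ips)
    )


# Hypothetical lookup results (placeholder hostnames and addresses),
# standing in for whatever a DNS query returned from one vantage point.
resolved = {
    "example.com": ["192.0.2.10"],
    "www.example.com": ["104.16.1.1"],
    "api.example.com": ["104.17.2.3", "192.0.2.11"],
}

print(proxied_hosts(resolved))
```

A check like this only reflects what one resolver returned at one point in time. It says nothing about whether a host was behind the proxy during an earlier window, and answers can differ by resolver location.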

@pirate
Owner

pirate commented Feb 25, 2017

Let's add back those domains then @Zenexer.

@Zenexer
Contributor Author

Zenexer commented Feb 25, 2017

@pirate That merge was closed. The other two have both confirmed they weren't using Cloudflare at the time.

@pirate
Owner

pirate commented Feb 25, 2017

@Zenexer Sorry, I don't follow. Which other two domains? zoho & SO, or the fitbit domains?

@Zenexer
Contributor Author

Zenexer commented Feb 25, 2017

@pirate

I checked zoho.com thoroughly enough to be confident that there weren't any blatant errors like in #158.

@deepsk79
Contributor

deepsk79 commented Feb 25, 2017

Will it be removed from the master list?

@JedrzejMajko

This request (and the list overall) is fundamentally flawed.
Big websites are attached via DNS services that allow them to localize traffic.
Stack Overflow, GitHub, OVH, etc. all use(d) Cloudflare services to fight DDoS attacks.
The first two were using CF as recently as two weeks ago in some regions.

Not only is it grounds for defamation (lawyers! author's info is here: https://github.com/pirate! ;)), but it also assumes that your location is one that gets all the correct DNS answers.

@Phineas
Contributor

Phineas commented Feb 25, 2017

@Coobers This is in no way illegal, it's just a list of all domains using Cloudflare - and people can remove their domains if they prove it wasn't going through the proxy or if it hosts only static content. There are already thousands of sites that scan huge hosts like Cloudflare and find all domains associated, this isn't really new.

This repo is merely to inform people about the whole Cloudflare bug & what sites might have been affected.

@JedrzejMajko

@Phineas I know you think that, but consider this: the same approach would let us create a list of "possible" sexual offenders. You could put anybody on it based on their GitHub avatar color.
From a purely methodological point of view, such a list would be flawed. The basis here is the same.
The methodological flaw leaves this list without any merit other than defamation.

Regarding removal: if this were done via a website, yes, but here the information is stored forever, so it's not really removed.

There's so much wrong here.

@Phineas
Contributor

Phineas commented Feb 25, 2017

@Coobers The repo is called "sites-using-cloudflare". The README also says many times that not all of the websites have used the proxy. And we're not creating a list of "possible sexual offenders", lmao, we're creating a list of websites that could've been affected by Cloudbleed.

@JedrzejMajko

@Phineas Please understand that it was an example, meant to show that it doesn't matter whether you do it in IT or in simpler terms; the end result is the same.

@coderobe
Contributor

@Coobers The README explains that this repo contains unverified domains that could possibly have been affected.

@JedrzejMajko

@coderobe It does that only vaguely, and then reverses it later on. There's a follow-up in #172.

@pirate
Owner

pirate commented Feb 25, 2017

@Coobers I'm unclear as to where you think it's reversing it later on. We try to be very explicit in the README, and honestly not much more is needed than the methodology section, as people can read that and come to their own conclusions as to the accuracy of the list. Anyway, I've changed the header I think you might be referencing: #179

@Zenexer
Contributor Author

Zenexer commented Feb 25, 2017

@Coobers To clarify, this issue was just to address the possibility that there could have been a mistake--not that there actually was a mistake. Ultimately, no action was taken as a result of this issue.

When we're dealing with millions of entries in a dataset, errors are going to be inevitable, no matter how meticulous we are. This is why it's important for us to double-check if there's even a slight suspicion that a mistake could've been made.

It does appear there were initially two domains in the README that weren't using Cloudflare at the time (though at least one of them was in the past). However, from what I can tell, the list didn't serve the same purpose back then; it's come a long way since. Initially, it was just a list of sites that had been mentioned on social media as potentially worth looking into. They were looked into and subsequently removed from the list.


7 participants