dns-blocklists's Introduction

dns-blocklists's People

Contributors

hagezi, xruffkez

dns-blocklists's Issues

cdn.elev.io blocked in light.txt

Could you add cdn.elev.io to the whitelist? Various companies use this service as a knowledge base, and with it blocked, their knowledge base and support pages and in-product help no longer work.

For example, SmugMug uses it for its knowledge base on www.smugmughelp.com.

Thanks!

Possible FPs

Random .de sites from the Top 1M domain list published by Cloudflare, the world's biggest botnet. Some of these domains may be quite familiar to you.

https://blog.cloudflare.com/radar-domain-rankings/

Found in pro.plus edition

2fine.de
2proxy.de
abitur-undwieweiter.de
aikq.de
alphafrau.de
angebotedirekt24.de
antonmack.de
aurum-juweliere.de
auskunft.de
autopfand24.de
av-by-dialog.de
ba-content.de
baustb.de
bdi-services.de
bero-webspace.de
bioverlag-online.de
blacksirius.de
blogtotal.de
blubroid.de
bluesummit.de
bouldercafe-wuppertal.de
boulderwelt-muenchen-west.de
brandl-blumen.de
brennendesreich.de
burda-forward.de
byavma.de
casa-versicherung.de
cashdorado.de
ccm19.de
cgames.de
chefdays.de
cnd-motionmedia.de
coeus-solutions.de
conasmanagement.de
conative.de
d41news.de
daklesa.de
dbvis.de
dealeffect.de
dein-perfekte-deal.de
dentsu.de
digivod.de
dirproxy.dev
div-vertriebsforschung.de
dkuim.de
domainassetmanager.de
domainkompetenz.de
dr-pipi.de
dw-css.de
econda.de
ecosia.de
edv-live.de
elpatra.de
emmamadchen.de
enovos.de
ergebnis-dienst.de
euro-ads.de
evangelische-pfarrgemeinde-tuniberg.de
expepp.de
facettenreich27.de
fairfriends18.de
fensterbau-ziegler.de
finanzenchecken24.de
finanzselect-online.de
fitnessmagazin.de
flife.de
flofire.de
freeware.de
freie-gewerkschaften.de
ft8hua063okwfdcu21pw.de
futurebiz.de
gamingweb.de
gastsicht.de
geisterradler.de
geno-mailings.de
globvill.de
gutscheinexxl.de
gutscheinmailer24.de
gutschein-pim.de
haustiercastingservice.de
hkr-reise.de
huehnerauge-entfernen.de
huesges-gruppe.de
i-arslan.de
ibanner.de
imaginado.de
infonewsletter.de
iqcontentplatform.de
jerling.de
juwelenmarkt.de
kairion.de
kilu.de
kundencontroller.de
kundenmail-worlddirect.de
kunze-immobilien.de
ligiercenter-sachsen.de
live-con-arte.de
logopaedie-blomberg.de
mailcosmos-news.de
mailpfeil-info.de
mank.de
marktplatzblatt.de
marpu.de
mdk-mediadesign.de
mdm.de
meetrics.de
meineinformationen.de
messserver.de
micahkoleoso.de
michaela-bendich.de
mirkoreisser.de
mobileadvertise.de
modellbau-universe.de
mousepad-direkt.de
mps-gba.de
mtusgate.de
mtusrede.de
nachrichtenbox24.de
nachrichtencontainer.de
nachrichtenspiegel24.de
nacktfalter.de
netzathleten-media.de
norovirus-ratgeber.de
ora-it.de
pensionhotel.de
perf-adv.de
pferdebiester.de
planso.de
plantag.de
pl.de
pocket-opera.de
pomodori-pizzeria.de
powermails.de
pph-server.de
praxis-foerderdiagnostik.de
preisdealz.de
profiwin.de
proxite.de
psa-sec.de
pt-arnold.de
purelocalmedia.de
pushfire.de
qlog.de
qualitaetstag.de
rabattoase.de
rabattprominenz.de
ra-staudte.de
reportsign.de
sandraboerner.de
schutzklick.de
serviceimg.de
servscrpt.de
skylinemailing.de
skylineperformance.de
smarketer.de
smartadcheck.de
spielenxxl.de
sportverein-tambach.de
srvdns.de
startfenster.de
stats.de
stroeer.de
stroeermediabrands.de
styles.de
superpromo24.de
supersrv.de
tagesaktuelleangebote.de
tanzschule-kieber.de
telenate.de
theduke.de
thewebguru.de
tmedianewsw1.de
tmedianewsw2.de
tor-gateways.de
tracking-kmms.de
trackingq.de
trend-email.de
triggi.de
trusted.de
trustiseverything.de
tvtelemetrie.de
unicepta-mind.de
uptain.de
usemaxserver.de
utkv6nyu.de
utrace.de
verbraucher-rat24.de
verbraucher-tipp64.de
verifort-capital.de
vermoote.de
vevor.de
viagogo.de
villa-marrakesch.de
walter-lemm.de
welect.de
weltsparer-online.de
westats.dev
wolf-glas-und-kunst.de
wutpostkarte.de
xiji.de
xn--logopdie-leverkusen-kwb.de
yggui.de
yoomedia.de
zimmerei-fl.de
zinspilot.de
zizashop.de
zovvo.de

Shrink pro version

Is it possible to shrink the pro version? I am using AdGuard Pro on iOS, and its only limitation is that it does not work with more than 500k domains. Thanks in advance.

sea.net.edu.cn

sea.net.edu.cn

This is also a portal of the Tsinghua University mail system. Again, it is blocked by someonewhocares's hosts. The author seems quite dissatisfied with Chinese educational institutions.

Nextdns?

Why was NextDNS removed from the sources?

FPs

Found in pro.plus domain txt

Don't forget the www. variants, if any

chinatruck.org news portal about trucks
21food.cn food e-commerce site
8684.cn public transport navigation app
bjtime.cn time query and various info for public
bozhou.gov.cn government site
ccgslb.com.cn content delivery
cert.org.cn computer emergency response team
cggc.cn China Gezhouba Group site
cnnic.cn China Internet Network Information Center
cnnic.net. alias of CNNIC
cnvd.org.cn Chinese National Vulnerability Database
game2.cn guess what the site is about
gxuwz.edu.cn school website
htpm.com.cn precision machinery vendor
maimai.cn career and social-networking platform
neixin.cn OA software
nejmqianyan.cn online medical magazine
openwrite.cn editor tool
quickapp.cn China's instant app platform
rayli.com.cn women's magazine
s11.cn ticket sales
sinastorage.cn,sinastorage.com Sina's cloud storage platform
whnews.cn local news
xyaz.cn Android emulator
z.cn amazon.cn short url
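
The reminder about www. variants can be handled mechanically when preparing an allowlist. The following is a minimal Python sketch, assuming the input is a plain list of domain names; the function name `with_www_variants` is hypothetical and not part of the project's tooling.

```python
def with_www_variants(domains):
    """Return each domain together with its www. counterpart, without duplicates."""
    out = []
    seen = set()
    for domain in domains:
        # Normalize: trim whitespace, lowercase, drop a trailing root dot.
        name = domain.strip().lower().rstrip(".")
        if not name:
            continue
        # Work from the bare name so both input forms give the same pair.
        bare = name[4:] if name.startswith("www.") else name
        for candidate in (bare, "www." + bare):
            if candidate not in seen:
                seen.add(candidate)
                out.append(candidate)
    return out
```

Whether a www. host actually exists for a given domain still has to be checked separately; the sketch only generates the candidates.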

activehosted.com

activehosted.com is on your blocklist, which blocks the "unsubscribe from list" page for mailing lists run through activecampaign.com.

go.e2language.com

It is blocked in pro, so links from Yahoo Mail fail to load. It's an English language review website. Thanks

Trackers

Can you add these to the blocklist, please?
https://raw.githubusercontent.com/hagezi/dns-blocklists/main/domains/light.txt

Trackers by Vivo Mobile (have a couple of dns queries)

tr-vcode-od.vivoglobal.com
tr-vcode-api.vivoglobal.com
tr-st-sl.vivoglobal.com
tr-domaincfg.vivoglobal.com
tr-timesync.vivoglobal.com
tr-romsp-unifyconfig.vivoglobal.com
asia-ex-adlog.vivoglobal.com
asia-analyzer-appstore.vivoglobal.com

Trackers by Vodafone

netperform-is.vodafone.com.tr
vodafonetr.sc.omtrdc.net
mplusps.ims.vodafone.com
mpluswf.ims.vodafone.com

Advertising
advertise.ldplayer.net
creative.eagllwin.com
api.eagllwin.com
storeen.ldmnq.com

Google Spyware (have a couple of dns queries)

telephonyspamprotect-pa.googleapis.com
growth-pa.googleapis.com
deviceintegritytokens-pa.googleapis.com
gmscompliance-pa.googleapis.com
mdh-pa.googleapis.com
firebaseinstallations.googleapis.com
digitalassetlinks.googleapis.com

FP

www.ks.cn

This is the portal of the government of Kunshan, a city in Jiangsu province.

It is blocked as a malicious domain by URLhaus and even some DNS services because of a single file infected with a macro virus.

https://urlhaus.abuse.ch/url/1393110/

de.aukey.com false positive

de.aukey.com is listed in /data/black.list.fake.vzni.
It's the German site for Aukey, directly linked on the US site aukey.com.

The list source appears to be www.verbraucherzentrale-niedersachsen.de/vorsicht-falle which lists the de.aukey.com domain at www.verbraucherzentrale-niedersachsen.de/vorsicht-falle/deaukeycom-sitz-des-unternehmens-ist-china.
If the auto translation is correct, the site is listed because Aukey is located in China, which is true, but I don't understand how that makes it a "fake shop".

Possible FPs

Some academic sites, found in the pro.plus domains list.

atsn.ac.th
iitrpr.ac.in
kanlayanee.ac.th
mut.ac.ke
apu.edu.my
chnu.edu.ua
ifbaiano.edu.br
netco.edu.np
overseas.edu.lk
pamer.edu.pe
ringling.edu
smarttrain.edu.vn
tsatu.edu.ua
unicauca.edu.co
vanividyalaya.edu.in
wcc.edu.in

go.e2language.com

It is used to open links on the E2 pages; E2 is an English language review website.

networkbest.ru.com

networkbest.ru.com is used for video delivery on "some" websites.
Consider unblocking it.

FP

mail.tsinghua.edu.cn

This is the portal of the Tsinghua University mail system.

A reminder: one of 1Hosts' sources is the top 1M domain list by Cisco Umbrella.

https://github.com/badmojr/1Hosts/blob/master/-data/lists/assets.txt (http://s3-us-west-1.amazonaws.com/umbrella-static/top-1m.csv.zip)

Is this source included in the so-called lite version? Shouldn't this list be used as an allow list instead? I know that many ad, tracking, and even gambling and porn sites are among the world's top domains, but compared to the legitimate ones they are few. Using the list as a source is a bold and even aggressive move in my view.

ads trackers

for
https://raw.githubusercontent.com/hagezi/dns-blocklists/main/domains/light.txt

collect.analytics.unity3d.com
statsfe2.update.microsoft.com
static.cloudflareinsights.com
mads.amazon-adsystem.com
unagi.amazon.com
fls-na.amazon.com
data.media-lab.ai
r.remarketingpixel.com
toblog.tobsnssdk.com
www.adspirit.com
cdn.consentmanager.mgr.consensu.org
consentmanager.mgr.consensu.org
crm.adflex.com.tr
v10.events.data.microsoft.com
ic3.events.data.microsoft.com
browser.events.data.microsoft.com
browser.events.data.msn.com
report-edge.agora.io
snap.licdn.com
api-js.datadome.co
js.driftt.com
api.leanplum.com
device-metrics-us.amazon.com
log.core.cloud.vewd.com
customerevents.netflix.com IoT Telemetry
nrdp.nccp.netflix.com IoT Telemetry
api.miwifi.com
app.adjust.net.in
androidtvwatsonfe-pa.googleapis.com
functional.events.data.microsoft.com
global.asimov.events.data.trafficmanager.net
onedscolprdcus09.centralus.cloudapp.azure.com
edge.activity.windows.com
edge.activity.windows.com.akadns.net
edge-global.activity.windows.com.akadns.net
edge-emeu1.activity.windows.com.akadns.net
pf2am3-edge.activity.windows.com.akadns.net
a1h91ecnmw3isq-ats.iot.us-west-1.amazonaws.com
device-metrics-us-2.amazon.com
ap-oversea.agora.io
iot-logser.realme.com
iot-eu-logser.realme.com
browser.pipe.aria.microsoft.com
settings.data.microsoft.com
dlwnextsetting.blob.core.windows.net
evoke-windowsservices-tas.msedge.net
web.diagnostic.networking.aws.dev
cdn.segment.com
s.amazon-adsystem.com
performance.radar.cloudflare.com
applog.matrix.easebar.com
sigma-statistics-push.proxima.nie.easebar.com
mcount.easebar.com
who.nie.easebar.com
appdump.nie.easebar.com
usermetrics.lidlplus.com
search-log-1.tapatalk.com
lidlplusprod.blob.core.windows.net
self.events.data.microsoft.com
mobile.pipe.aria.microsoft.com
mon-va.tiktokv.com
psstats.superkinglabs.com
ota.lokalise.co
bundles.live.lokalise.cloud
livelogs.slike.in
mobile.protectt.ai
report-oversea.agora.io
report-oversea2.agora.io
aax-us-east.amazon-adsystem.com
woodpecker.uc.cn
report.banorte.glassboxdigital.io
acs.m.taobao.com
redevents-bigdata.redefine.pl
ipla.hit.stat24.com
pro.hit.gemius.pl
gapl.hit.gemius.pl
total-rekall.metrics.indazn.com
self-events-data.trafficmanager.net
onedscolprdeus02.eastus.cloudapp.azure.com
onedscolprdcus01.centralus.cloudapp.azure.com
biemffjjakefxlbom6zcwmvbiyy1q1663016176.uaid.imrworldwide.com
4ae1981d8d1a370d08bf3d24f50f5baf47ef1b62.cws.conviva.com
hpoahhuy2rocr9xb2kii0o2xeetb41663016176.uaid.imrworldwide.com
dishanalyticsandtesting.sc.omtrdc.net
tuplogpublic.bangcdn.net
akoss.bangcdn.net
mobile-data.onetrust.io
crash-reporting.duowan.com
lcprd1.samsungcloudsolution.net
clickserve.dartsearch.net
sla-intl.trustlook.com
plausible.io
cdn.ftd.agency
suphelper.com
go.trouter.skype.com

FPs

Found in pro.plus edition

chrunlee.cn personal blog

clarins.com.cn Clarins' official Chinese site

lockview.cn website/IP access restriction

pepsico.com.cn PepsiCo's official Chinese site

rbc.cn local radio site

wilf.cn personal blog

yhjxcj.cn machinery maker

zhongmin.cn insurer

Using Third-Party Allow Lists to Reduce FPs

OVPN support chat

Needed for OVPN support chat:

api-iam.intercom.io
widget.intercom.io

Blocked by the @ookangzheng blacklist.

iOS insufficient blocking

Can you block more bad domains specifically on iOS? I'm using AdGuard for iOS, by the way. Take, for example, the screenshot I added. There may be even more domains that are considered trackers, etc. Thank you in advance.

image

FPs

This is a humble attempt to remove some of the most distinct FPs from the Pro.Plus edition.

All results are based on

bf9d7f0#diff-f6a2b7e14a91f43dd335d0cd5a2d0011a98d9bf1e6181ae3624b231b77e36160

https://tranco-list.s3.amazonaws.com/tranco_K2QXW-1m.csv.zip

Methodology:

  1. Download the pro.plus file.
  2. Download the Tranco list and extract the top 500000 domains.
  3. List the common entries between the two files.
  4. Install and configure TrendMicro's parental control function, leaving six categories unchecked: web advertisement, proxy, illegal drugs, violence/hate/racism, illegal/questionable, and gambling. More info: https://helpcenter.trendmicro.com/en-us/article/tmka-14431
  5. Simulate HTTP browsing with a command-line tool.
  6. Export the results. The most blocked categories are pornography, shopping, games, software downloads, erotic/mature, streaming... To check which category a domain belongs to, you can use https://global.sitesafety.trendmicro.com/.
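
Steps 1 to 3 above can be sketched in Python. This is only an illustration, not the script actually used: it assumes the blocklist is a plain text file with one domain per line and `#` comments, and that the ranking file is a headerless `rank,domain` CSV. All function names are hypothetical.

```python
import csv
import io


def load_blocklist(text):
    """Parse a plain domains file: one domain per line, '#' starts a comment."""
    domains = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip().lower()
        if line:
            domains.add(line)
    return domains


def load_top_domains(csv_text, limit):
    """Parse 'rank,domain' rows and keep the first `limit` domains."""
    top = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        top.append(row[1].strip().lower())
        if len(top) >= limit:
            break
    return top


def common_entries(blocklist, top_domains):
    """Return the blocked domains that also appear in the ranking, in rank order."""
    return [d for d in top_domains if d in blocklist]
```

With the two downloaded files read into strings, `common_entries(load_blocklist(pro_plus_text), load_top_domains(tranco_text, 500000))` would give the candidate list to feed into the categorization in steps 4 to 6.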

Note: I am not responsible for the accuracy and completeness of the results, as a single domain category database cannot cover all domains in a category, and all results come from the TM software.

100xuexi.com
123-games.org
123cha.com
17173.com
17tahun.com
1ting.com
2144.cn
24-sportnews.com
3d2f.com
3dxchat.com
40momporntube.com
4gltemall.com
4howcrack.com
4life.com
51.com
51240.com
51mypc.cn
52pk.com
52xiyou.com
52z.com
54kefu.net
7k7k.com
800pharm.com
88p2p.com
91wan.com
9377.com
9377s.com
973.com
991.com
9wee.com
about.co.kr
abum.com
acclaimimages.com
acestream.me
activeboard.com
ad4game.com
adcell.de
adf.ly
adforgames.com
adlermode.com
admantx.com
adpost.com
adquan.com
adsoftheworld.com
adsxyz.com
adultdates.com
adultsclips.com
affise.com
aikq.de
airydress.com
ajmalperfume.com
al4a.com
ali.ski
alivegirls.com
allsoftwarekeys.com
alxanosoft.com
amandahot.com
amateurseite.com
amazonbusiness.com
amiami.jp
angara.com
anyclip.com
appannie.com
appbrain.com
applixir.com
appolicious.com
apserver.net
arminius.io
artaban.ru
ashleyrnadison.com
atomiclearning.com
atwebpages.com
auctionads.com
avgle.com
avidly.com
avokazu.com
awejmp.com
awempire.com
babesroulette.com
babesuniversity.com
badoo.com
baixaki.com.br
baixing.com
bannersolutions.com
bbpeoplemeet.com
beachsissi.com
beboo.ru
beintoo.com
berrylook.com
besplatnyeprogrammy.ru
beyourxfriend.com
bigboobsalert.com
bigdick.com
biovea.com
bipblog.com
bk.ru
blackpeoplemeet.com
blogads.com
bloggeramt.de
bloglines.com
bmetrack.com
bmsend.com
boards2go.com
bokecc.com
bongacash.com
boobi.biz
boobscategory.com
bookmark.xxx
boostroyal.com
brainberries.co
braincash.com
brazzerssurvey.com
brucemulkey.com
buddygays.com
bunnyteens.com
burn4free.com
buysellads.com
byoblu.com
cameraprive.com
camvideos.tv
canadianpharmacyntv.com
carpediem.fr
celebjihad.com
chatroll.com
chemistry.com
chicme.com
chilliconnect.com
chromefans.org
cichic.com
click.in
clickandchat.com
clickzs.com
cmail19.com
cokodive.com
compricer.se
conservativedailypost.com
contentsbridge.com
cooch.tv
corachic.com
costaction.com
costaspain.net
coupert.com
cr173.com
crackdj.com
cracktube.net
cracxpro.com
createsend.com
creativecdn.com
crptentry.com
cumshots.com
cuntempire.com
cuntwars.com
cupid.com
cusecwhitten.com
cutedolls.top
d3go.com
dailypaul.com
datoporn.com
ddestiny.ru
ddfcash.com
ddownr.com
dealcatcher.com
dealhack.com
dealsfor.life
dedecms.com
deepikarai.com
defendershield.com
desura.com
dirtydating.com
dmanalytics2.com
dollarupload.com
dosenit.com
dovetale.com
dreamlist.top
drivers.com
driverupdate.net
dump.com
duoyi.com
easyrencontre.com
easysplashbuilder.net
ebates.com
ekomi.de
emailaccessonline.com
emalls.ir
endlessvideo.com
erologz.com
eskimi.com
eurolive.com
facebook.com.vn
falcoware.com
faptitans.com
fileeagle.com
filmi7.com
findglocal.com
findologic.com
fleshlight.com
fling.com
fliplineads.com
flowgo.com
foryourparty.com
frama.link
freeforums.org
freeprosoftz.com
freesoftwarefiles.com
friv5online.com
frtym.com
fullbeauty.com
fusepowered.com
futurebiz.de
gallfree.com
game2.cn
gameangel.com
gamehouse.com
gamesgx.net
gamesites200.com
gamesrevenue.com
gamingadult.com
gamingwonderland.com
gammae.com
gamooga.com
gamyun.net
gaoshouyou.com
getfireshot.com
getglue.com
ggcorp.me
ggeek.ru
glyde.com
go.pl
godatafeed.com
goodsearch.com
gotvg.com
gtsouth.com
guangzhuiyuan.com
gutscheinexxl.de
hawalili.com
hbg.com
hitboom.net
hooligapps.com
hoverwatch.com
hubspot.com
huckabuy.com
huffduffer.com
hunan.voc.com.cn
iamactivator.com
iamnaughty.com
iask.com.cn
idates.com
idreamsky.com
ikeymonitor.com
imlive.com
imop.com
indigorose.com
infinario.com
infinity.co
infocart.jp
inplayer.ru
inrdeals.com
insecure.org
instaimgs.com
inthevip.com
intimshop.ru
ioffer.com
istripper.com
iwantu.com
iwon.com
javascriptobfuscator.com
jeunesseglobal.com
jiayuan.com
jivosite.com
jizzy.org
jjyx.com
joinpiggy.com
jokeroo.com
jueriy.com
juicyads.in
justperfact.com
jzyx.com
katushka.net
keytiles.com
kidlogger.net
kiees.com
kizi10.org
koddi.com
koowo.com
koyotesoft.com
kuk8.com
kutt.it
kwanzoo.com
laohu.com
lbtinc.com
ldsplanet.com
librateam.net
limeroad.com
linekong.com
linkfeed.ru
linkr.top
linktech.cn
linkwithin.com
lite14.us
livede55.com
livenetlife.com
livesupporti.com
lnk.ie
localflirtbuddies.com
lolz.guru
loveaholics.com
luckycrush.live
lupoporno.com
macblurayplayer.com
mahimeta.com
mail333.com
mailclub.com
mailkit.eu
maimai.cn
mainadv.com
mallcom.com
mallfinder.com
malwarecrusher.com
manychat.com
marcofama.it
marydating.com
mbaobao.com
medipartner.jp
medisafeproject.com
megacool.co
meninosonline.net
metricool.com
miaxxx.com
microvirt.com
miyuhot.com
mkt2527.com
mobirisesite.com
mobshark.net
mochiads.com
modelsgonebad.com
moe.video
moevideo.biz
momently.com
mptgate.com
muamat.com
mybloglog.com
mycoupons.com
mydates.com
mydirtystuff.com
mysimon.com
neighborhoodsluts.com
newchic.com
newsletter2go.com
nexage.com
nextag.com
ngastatic.com
noracora.com
nosto.com
novynha.com
nsgalleries.com
nudedworld.com
nudespree.com
number1victimofcrime.com
nuubu.com
nydus.org
oasgames.com
offersuperhub.com
ojooo.com
onenightfriend.com
onlinebootycall.com
oo00.biz
ooshirts.com
operasoftware.com
optimumdesk.net
ositracker.com
otherprofit.com
ouku.com
owl.li
pamura.com
papim.net
patchcracks.com
pavtube.com
paydotcom.com
payserve.com
pcgamesupply.com
pcpop.com
peachy18.com
pegasproductions.com
performancing.com
phoenixstyle.com
phpbbex.com
picxxxhub.com
pieprzyc.com
pimproll.com
pingomatic.com
piranho.de
pisem.net
pixelpop.co
pixelunion.net
placelibertine.com
playerassist.com
playtomic.com
playwire.com
poczta.pl
porn2012.com
pornboard.us
porndroids.com
pornimg.xyz
pornobande.com
pornogrund.com
pornrabbit.com
posterburner.com
preauc.com
priceblink.com
pricee.com
privatecash.com
prizee.com
proxyclick.com
prpops.com
purechat.com
pusha.se
redlightcenter.com
redspot.tv
refog.com
repairmsexcel.com
responsiveads.com
reunion.com
reviewstream.com
reviversoft.com
rfrl.pw
rgho.st
rollercoin.com
rstgames.com
rtbstream.com
rusfolder.com
sam4m.com
sasisa.ru
sawlive.tv
scout69.com
searchanise.com
seedr.ru
seemygf.com
segundamano.es
sendspace.pl
sexcounter.com
sexemulator.com
sexsaoy.com
sextracker.com
sexu.com
sexuhot.com
sexwife.net
sexyads.net
sexyxxx.biz
sharpay.io
shnyagi.net
shoebuy.com
shopathome.com
shorte.st
simplestar.com
sinastorage.com
skem1.com
skenglish.com
slfeed.net
slimtrade.com
smarter.com
smutstone.com
snapengage.com
sneakerfreaker.com
socialoomph.com
socialvibe.com
softobase.com
sonnerie.net
soolinen.com
sortable.com
spartoo.co.uk
spartoo.de
spartoo.it
spartoo.pt
spdate.com
speakpipe.com
speedbit.com
speedyshare.com
splayer.org
startface.net
steamanalyst.com
steemkr.com
stir.com
streamen.com
sucksex.com
suo.im
tagtic.cn
tangiblee.com
tantanapp.com
tbdress.com
technorati.com
televisionfanatic.com
tenping.kr
thehun.net
theseoffersforyou.com
thickcash.com
thri.xxx
ticno.com
tipard.com
tny.im
together2night.com
top4top.net
topanasex.com
topbucks.com
topcracked.com
topgamesites.net
torrentz.eu
totemcash.com
trackduck.com
tracker.name
trafficforce.com
trafficmagnates.com
trony.it
trvdp.com
tube.ac
tube2012.com
tubecup.net
tudou.com
tuenti.com
tutuapp.com
tvcok.ru
tweepsmap.com
twinesocial.com
twinred.com
twittercounter.com
twitthis.com
ultrapartners.com
wasap.my
webcamsex.nl
webgozar.com
websas.hu
websta.me
weheartit.com
wellhello.com
wendise.com
wy213.com
x10.com
xblognetwork.com
xiaobaixitong.com
xlovecam.com
xmeeting.com
xmorex.com
xnxx4porn.com
xtremetop100.com
xvideoslive.com
xxxcounter.com
xxxvogue.net
xy.com
xyaz.cn
yandexadexchange.net
yasir252.com
zergnet.com
zero.kz
zipitdeal.com
zipshare.com

Using CWF API to Bulk Classify Domains

I haven't tried it myself, so I cannot vouch for its viability. Maybe you can give it a try.

https://cwf.comodo.com/subscriptions.php

The available categories are listed here
https://cwf.comodo.com/categories.php

Find the common entries between the pro.plus version and the top 1M.

Then remove the malicious ones identified by the Google Safe Browsing API (https://github.com/elliotwutingfeng/Inversion-DNSBL-Blocklists/blob/main/Google_hostnames.txt?raw=true) and the so-called NSFW ones in https://oisd.nl/downloads.

Remove confirmed ad and tracking entries based on sources you trust.

After that, filter out domains containing keywords such as 'sex', 'porn', 'adv', 'click', 'bet', 'casino' and others to further reduce the number.

Finally, use the free API to categorize the remainder.

I believe the number should be fewer than 20000.
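
The reduction steps before the API call could look roughly like this in Python. It is a sketch under assumptions: `known_bad` stands for whatever set of lowercase malicious, NSFW, ad and tracking domains you have collected from the sources above, and the function names are hypothetical. The keyword list is the one given in the text.

```python
KEYWORDS = ("sex", "porn", "adv", "click", "bet", "casino")


def split_by_keyword(domains, keywords=KEYWORDS):
    """Split domains into (keyword matches, remainder) using substring matching."""
    matched, remainder = [], []
    for domain in domains:
        name = domain.lower()
        if any(keyword in name for keyword in keywords):
            matched.append(domain)
        else:
            remainder.append(domain)
    return matched, remainder


def reduce_candidates(candidates, known_bad, keywords=KEYWORDS):
    """Drop domains found in known_bad, then drop keyword matches."""
    survivors = [d for d in candidates if d.lower() not in known_bad]
    _, remainder = split_by_keyword(survivors, keywords)
    return remainder
```

Plain substring matching is crude: 'bet' also matches a name like alphabet.com, so the matched list is worth a quick look before discarding it. Only the remainder returned by `reduce_candidates` would then be sent to the categorization API.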

||js.sentry-cdn.com^

When ||js.sentry-cdn.com^ is blocked, it breaks https://www.pizzahut.com.au/: you can't enter store information, an address, etc.

Unblocking ||js.sentry-cdn.com^ allows the postcode/address to be entered to set up delivery or pickup of pizza.
