sidn / cyclehunter
Python software that reads zone files, extracts NS records, and detects cyclic dependencies
Home Page: https://tsuname.io
License: BSD 2-Clause "Simplified" License
What does it mean? That no cycle was found?
% python3 CycleHunter.py --zonefile bortzmeyer.fr --origin bortzmeyer.fr --base-dir . --workers 1
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 4.82it/s]
Step 1: read timed out zones
Step 2: create Authority objects
Step 3: get only zones without in-bailiwick/in-zone authoritative servers
Step 4: sort which ones are cyclic
step 7: writing down results
step 8: read cyclic domains
./bortzmeyer.fr.2021-05-04.step4.json does not exist; exiting
Hello,
It would be good to add a script that runs periodically from crontab and sends an email if it finds a cycle.
Thank you.
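A minimal sketch of the notification half of such a script, using only the standard library. It assumes the cycles have already been loaded into a dict (for example with json.load from whatever file --save-file produced; the exact layout of that file is an assumption here), and the function name and addresses are hypothetical:

```python
import json
from email.message import EmailMessage


def build_alert(cycles, sender, recipient):
    """Build an e-mail describing the cycles, or return None when there are none.

    `cycles` is assumed to be the already-parsed CycleHunter result (a dict);
    its exact layout is not relied upon, it is simply dumped into the body.
    """
    if not cycles:
        return None
    msg = EmailMessage()
    msg["Subject"] = f"CycleHunter: {len(cycles)} entries with cyclic dependencies"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(json.dumps(cycles, indent=2, sort_keys=True))
    return msg
```

A cron job could run CycleHunter, load the result, call build_alert, and, if it returns a message, hand it to smtplib.SMTP("localhost").send_message(msg), assuming a local mail relay is available.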
I just ran CycleHunter (on FreeBSD) against the com.ua zone file and got this:
$ python CycleHunter.py --zonefile com.ua/zone --origin com.ua. --save-file com.ua/hunt --base-dir com.ua --workers 8
[...]
at else not foundZone
result: False
at else not foundZone
result: False
at else not foundZone
result: False
reading line 1400000 of zone file
reading line 1500000 of zone file
troubleddomains: {}
step 8b: writing it to json
set()
{}
ERROR: could not match domain names to NS records; please check zoneMatcher.py
$
$ pip list | egrep 'async|dns|multip|tqdm'
async-lru 1.0.3
dnspython 2.2.1
multiprocess 0.70.12.2
tqdm 4.64.0
$
The zone file is just over 1.5M lines long:
$ wc -l com.ua/zone
1590970 com.ua/zone
$ ls -l com.ua/
total 59800
-rw-r--r-- 1 dk staff 307261 May 11 15:25 com.ua.2022-05-11.step1.txt
-rw-r--r-- 1 dk staff 474714 May 11 15:37 com.ua.2022-05-11.step2.txt
-rw-r--r-- 1 dk staff 373 May 11 15:37 com.ua.2022-05-11.step3.json
-rw-r--r-- 1 dk staff 42 May 11 15:37 com.ua.2022-05-11.step4.json
-rw-r--r-- 1 dk staff 60347063 May 11 15:23 zone
$
Zone file formatting isn't very straightforward. Some zone files are space-separated, others are tab-separated.
We need to instrument largeZoneParser.py to detect the separator and parse the zone accordingly.
The most important requirement is to be able to retrieve the NS record itself.
At the moment it only handles tab-separated zone files. It needs to be improved to detect the separator and find the exact position of the name server in the record.
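One way to sidestep separator detection is to split each line on any run of whitespace and then locate the type field by position rather than by searching for the token "ns" anywhere in the line. The sketch below is not the code in largeZoneParser.py; the names parse_ns and qualify are hypothetical, TTLs are assumed to be plain digits, and lines that omit the owner name are simply skipped:

```python
# Record classes that may appear between the owner name and the type.
CLASSES = {"in", "ch", "hs"}


def qualify(name, origin):
    """Make a relative name absolute by appending the origin."""
    name = name.lower()
    if name.endswith("."):
        return name
    origin = origin.lower().rstrip(".") + "."
    return origin if name == "@" else f"{name}.{origin}"


def parse_ns(line, origin):
    """Return (owner, nameserver) for an NS record line, or None otherwise.

    Fields are split on any run of spaces and/or tabs, so mixed separators work.
    """
    line = line.split(";", 1)[0]  # drop trailing comments
    if not line.strip() or line[0].isspace():
        return None  # blank line, or owner omitted (not handled in this sketch)
    fields = line.split()
    owner, rest = fields[0], fields[1:]
    # Skip the optional TTL and class, in either order, before the type field.
    while rest and (rest[0].isdigit() or rest[0].lower() in CLASSES):
        rest = rest[1:]
    if len(rest) != 2 or rest[0].lower() != "ns":
        return None
    return qualify(owner, origin), qualify(rest[1], origin)
```

Because the type is only looked for after the owner, TTL, and class, a record whose owner happens to be called ns (such as an AAAA record for the host ns) is not mistaken for an NS record.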
I have a list of public domains and I am having trouble scanning them. Do I need a zone file to initiate a scan? Is there a better example of the scan as well? Thank you so much.
In findCyclicDep.py, there is:
clearedZonesForOK.append(zone.lowe())
which is probably a typo for zone.lower().
So we recently found that longer cycles are possible and resolvers get stuck in them too, like a->b->c->a.
We need to verify whether CycleHunter can detect those, and if not, improve it.
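To make the requirement concrete, here is a sketch of how cycles of any length could be found with a depth-first search over a zone dependency graph. It is not taken from CycleHunter; the function name find_cycles and the input shape (a dict mapping each zone to the set of zones hosting its name servers) are assumptions for illustration. It reports the cycles reached through back edges, which is enough to flag every cyclic group but is not guaranteed to enumerate every elementary cycle:

```python
def find_cycles(deps):
    """Return dependency cycles in `deps`, each as a list like [a, b, c, a].

    `deps` maps a zone to the set of zones that host its name servers.
    """
    cycles = []
    seen_keys = set()  # canonical forms, so each cycle is reported once
    done = set()       # zones whose outgoing edges are fully explored

    def visit(zone, path, on_path):
        if zone in on_path:
            cycle = path[path.index(zone):]
            # Rotate so the smallest zone comes first: a canonical key.
            i = cycle.index(min(cycle))
            key = tuple(cycle[i:] + cycle[:i])
            if key not in seen_keys:
                seen_keys.add(key)
                cycles.append(list(key) + [key[0]])
            return
        if zone in done:
            return
        path.append(zone)
        on_path.add(zone)
        for nxt in sorted(deps.get(zone, ())):
            visit(nxt, path, on_path)
        path.pop()
        on_path.discard(zone)
        done.add(zone)

    for zone in sorted(deps):
        visit(zone, [], set())
    return cycles
```

For very deep dependency chains the recursion would need to be replaced by an explicit stack, since Python limits recursion depth.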
I suggest adding the Python library requirement "async_lru" to the README. On Ubuntu it's just "sudo pip install async_lru".
Thanks!
Say example.com is cyclically dependent on example2.com.
In this case, .com requires that glue records are placed for both domains.
So even if the glue is unresponsive, these domains would still be cyclically dependent.
We need to label these cases in the output.
For some domains I get
Step 4: sort which ones are cyclic
analyzying xxxxxx.zone.. Domain 12 from 14
The DNS operation timed out after 5.1041343212127686 seconds
The DNS operation timed out after 5.105474233627319 seconds
<class 'dns.exception.Timeout'>
The DNS query name does not exist: N.
The DNS query name does not exist: N.
The DNS query name does not exist: X.
The DNS query name does not exist: X.
The DNS query name does not exist: D.
The DNS query name does not exist: D.
The DNS query name does not exist: O.
The DNS query name does not exist: O.
The DNS query name does not exist: M.
The DNS query name does not exist: M.
The DNS query name does not exist: A.
The DNS query name does not exist: A.
The DNS query name does not exist: I.
The DNS query name does not exist: I.
The DNS query name does not exist: N.
The DNS query name does not exist: N.
So it seems that when none of a zone's name servers reply, or none reply with useful data, the script does something unexpected. Is this case not handled?
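One observation on the log above: the names being queried, N, X, D, O, M, A, I, N, spell "NXDOMAIN". That suggests (this is an inference from the output, not something confirmed in the code) that an error marker string is being iterated as if it were a list of name servers, so each character becomes a query name. A small sketch of the pitfall and a possible guard, with hypothetical names throughout:

```python
# Hypothetical error markers a lookup helper might return instead of a list.
ERROR_MARKERS = {"NXDOMAIN", "SERVFAIL", "TIMEOUT"}


def as_name_list(value):
    """Normalise a lookup result to a list of name server names.

    Error markers and None become an empty list; a single name given as a
    string becomes a one-element list instead of being split into characters.
    """
    if value is None:
        return []
    if isinstance(value, str):
        return [] if value in ERROR_MARKERS else [value]
    return list(value)


# The pitfall: iterating over a string yields its characters one by one.
naive = [c + "." for c in "NXDOMAIN"]
safe = as_name_list("NXDOMAIN")
```

Here naive contains one single-letter name per character, matching the pattern in the log, while safe is empty so no further queries would be sent.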
We can speed up this code by adding parallelism to CyclicDetector.py, as we have for CyclicDetector.py
@subbink reported that some zone files are both tab- and space-separated.
We need to fix this: the code currently supports only one or the other.
Hello,
There seems to be a bug somewhere in the script: I get a file-not-found error when running it.
I tried the following (see the error message below):
`
(venv) user@server:/tmp/CycleHunter$ ./CycleHunter.py --zonefile /tmp/cyclic/xxx.com --origin xxx.com --save-file /tmp/cyclic/xxx_com_output
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 62.02it/s]
und jetz?
Step 1: read timed out zones
Step 2: create Authority objects
Step 3: get only zones without in-bailiwick/in-zone authoritative servers
Step 4: sort which ones are cyclic
step 7: writing down results
step 1: read cyclic domains
Traceback (most recent call last):
File "/tmp/CycleHunter/zoneMatcher.py", line 104, in getCyclic
with open(infile, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'xxx.com.2021-02-08.step4.json'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./CycleHunter.py", line 51, in <module>
zone_matcher(cyclic_domain_file=output4, zonefile=args.zonefile, zoneorigin=args.origin, output_file=args.save_file)
File "/tmp/CycleHunter/zoneMatcher.py", line 123, in zone_matcher
cyclic = getCyclic(cyclic_domain_file)
File "/tmp/CycleHunter/zoneMatcher.py", line 108, in getCyclic
with open(infile, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'xxx.com.2021-02-08.step4.json'
`
Best regards,
Dirk
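The traceback shows the step-4 file being opened without first checking that it exists, which fails when the previous step wrote nothing. A defensive reader could look like the sketch below. The function name read_cyclic_domains is hypothetical, and treating a missing file as "no cyclic domains" is an assumption about what the earlier steps intend:

```python
import json
import os


def read_cyclic_domains(path):
    """Return the parsed step-4 data, or an empty dict if the file is absent."""
    if not os.path.isfile(path):
        # Assumption: a missing file means the previous step had nothing to write.
        print(f"{path} does not exist; assuming no cyclic domains were found")
        return {}
    with open(path, "r") as f:
        return json.load(f)
```

An alternative design would be for step 4 to always write its output, even when empty, so that later steps never need to guess why the file is missing.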
What does it mean? This is with a very small zone file where (unlike #26) I just added a cycle:
foo IN NS ns1.bar.bortzmeyer.fr.
bar IN NS ns1.foo.bortzmeyer.fr.
And it crashes CycleHunter:
% python3 CycleHunter.py --zonefile bortzmeyer.fr --origin bortzmeyer.fr --base-dir . --workers 1
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:12<00:00, 3.06s/it]
Step 1: read timed out zones
Step 2: create Authority objects
Step 3: get only zones without in-bailiwick/in-zone authoritative servers
Step 4: sort which ones are cyclic
analyzing foo.bortzmeyer.fr. Domain 1 from 2
analyzing bar.bortzmeyer.fr. Domain 2 from 2
step 7: writing down results
step 8: read cyclic domains
step 8a: read zone file and find them
troubleddomains: {}
step 8b: writing it to json
set()
{}
ERROR: could not match domain names to NS records; please check zoneMatcher.py
The documentation is quite vague about the requirements, mentioning packages but not their versions. For instance, on Debian stable, the packages mentioned do not work:
Traceback (most recent call last):
File "CycleHunter.py", line 5, in <module>
from CyclicDetector import map_nsset
File "/home/bortzmeyer/tmp/CycleHunter/CyclicDetector.py", line 2, in <module>
import dns.asyncresolver
ModuleNotFoundError: No module named 'dns.asyncresolver'
This is because CycleHunter requires a more recent version of dnspython, so the apt install command mentioned in the doc is wrong.
Same issue for tqdm:
Traceback (most recent call last):
File "CycleHunter.py", line 5, in <module>
from CyclicDetector import map_nsset
File "/home/bortzmeyer/tmp/CycleHunter/CyclicDetector.py", line 15, in <module>
import tqdm.asyncio
ModuleNotFoundError: No module named 'tqdm.asyncio'
In both cases, removing the Debian package and installing with pip works.
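Until the documentation pins versions, a quick pre-flight check can tell whether the installed packages are recent enough by testing for the submodules the tracebacks complain about. This is only a sketch: the function name missing_modules is hypothetical, and the module list is limited to the two imports shown above plus async_lru, which another report names as a requirement:

```python
import importlib.util

# Submodules whose absence indicates a too-old or missing package.
REQUIRED = ("dns.asyncresolver", "tqdm.asyncio", "async_lru")


def missing_modules(names=REQUIRED):
    """Return the names from `names` that cannot be found in this environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ModuleNotFoundError:
            found = False  # the parent package itself is absent
        if not found:
            missing.append(name)
    return missing
```

Calling missing_modules() before running CycleHunter returns an empty list when everything is importable, and otherwise names the modules that would need a newer package, for example one installed with pip.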
DEBUG: line ns 86400 IN AAAA 2001:1291:200:84ba::1
Traceback (most recent call last):
File "CycleHunter.py", line 41, in <module>
zone_parser(zonefile=args.zonefile, zonename=args.origin, output_file=output1)
File "/home/bortzmeyer/tmp/CycleHunter/largeZoneParser.py", line 45, in zone_parser
nsset = get_ns_set(zonefile=zonefile, extension=zonename)
File "/home/bortzmeyer/tmp/CycleHunter/largeZoneParser.py", line 38, in get_ns_set
ns_entry = parseNS(line, extension)
File "/home/bortzmeyer/tmp/CycleHunter/largeZoneParser.py", line 22, in parseNS
if ns_entry[-1] != ".":
IndexError: string index out of range
(The DEBUG line was added by me to show the offending line.)
This is apparently because the parser mistakes the owner name of the record (ns, the name of the name server) for the record type NS.
This patch lets the parser continue:
diff --git a/largeZoneParser.py b/largeZoneParser.py
index 3584218..08b8ed4 100644
--- a/largeZoneParser.py
+++ b/largeZoneParser.py
@@ -10,8 +10,9 @@ def parseNS(s, extension=None):
if s[0] != ";":
sp = re.split('[\s]+', s.lower())
foundNS = False
+ firstItem = True
for item in sp:
- if item == 'ns' and foundNS is False and 'rrsig' not in s.lower() and 'nsec' not in s.lower():
+ if item == 'ns' and not firstItem and foundNS is False and 'rrsig' not in s.lower() and 'nsec' not in s.lower():
ns_entry = sp[-1].rstrip()
if len(ns_entry) == 0:
# test if the one before the last has NS
@@ -21,7 +22,7 @@ def parseNS(s, extension=None):
if ns_entry[-1] != ".":
ns_entry = ns_entry + extension
foundNS = True
-
+ firstItem = False
return ns_entry