nccgroup / dirble

Fast directory scanning and scraping tool

License: GNU General Public License v3.0

Makefile 2.09% Rust 95.22% Dockerfile 0.42% Python 1.17% Shell 1.10%

pentest pentest-tool tool web

dirble's Issues

Further false positive detection

Hi!

I just wanted to share another case that it would be great to exclude from the results by marking it as a false positive.

During nonexistent path detection, it would be great to test a random file name with different extensions, as I've seen several cases where the response varies depending only on the extension appended. E.g.:

$ curl -s -o /dev/null -w "%{size_download}" http://[REDACTED]/error/1.html
14
$ curl -s -o /dev/null -w "%{size_download}" http://[REDACTED]/error/1.php
60

In this example, any request that ends in .html has a size of 14 bytes, and any request that ends in .php has a size of 60 bytes.

It would be great if the nonexistent path detection routine could handle these cases too.

My two cents!
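
One way to handle this, as a rough sketch under assumptions: probe one random nonexistent name per extension, record the (status code, size) pair for each, and treat any later response that matches the baseline for its own extension as a false positive. `Baseline`, `extension_of` and `is_false_positive` are hypothetical names and not part of dirble; exact size matching is a simplification.

```rust
use std::collections::HashMap;

/// Status code and body size observed for a path known not to exist.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Baseline {
    code: u32,
    size: usize,
}

/// Hypothetical helper: extension of the last path segment, if it has one.
/// Dotfiles such as ".htaccess" are treated as having no extension.
fn extension_of(path: &str) -> Option<&str> {
    let file = path.rsplit('/').next()?;
    let (stem, ext) = file.rsplit_once('.')?;
    if stem.is_empty() || ext.is_empty() {
        None
    } else {
        Some(ext)
    }
}

/// Baselines are keyed by extension; the `None` key would hold the
/// baseline for paths without an extension.
fn is_false_positive(
    baselines: &HashMap<Option<String>, Baseline>,
    path: &str,
    observed: Baseline,
) -> bool {
    let key = extension_of(path).map(str::to_string);
    // A response is a false positive only if it matches the baseline
    // recorded for its own extension.
    baselines.get(&key).map_or(false, |b| *b == observed)
}
```

With baselines of 14 bytes for html and 60 bytes for php (the sizes from the curl output above), a 14-byte response to /error/1.html would be dropped, while a 14-byte response to /error/1.php would still be reported.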

Filter and display response headers

It would be useful to be able to optionally display all or a subset of the response headers from each request, or to flag up when a response header matches a particular search string.

A recent test I did involved checking a load of API endpoints for header injection, and my requests included "X-Some-Custom-Header: <script>alert(1)</script>", and the server would sometimes duplicate this header in its response, or copy the header value into the response body. Being able to filter responses based on whether they included the payload in the response, against a wordlist of endpoints would dramatically speed up this testing.

Automate builds when new tags are pushed

Proper serialisation into JSON and XML

Currently there is no sanitisation or encoding of RequestResponse data when it gets serialised into XML and JSON. A URL or Location header could contain braces, quotes or angle brackets, which will mess up the structure. We should implement or derive the Serialize trait properly and write unit tests with unusual inputs.

dirble/src/output_format.rs

Lines 97 to 113 in 2eb9801

    pub fn output_xml(response: &RequestResponse) -> String {
        format!("<file url=\"{}\">
            <status_code>{}</status_code>
            <size>{}</size>
            <is_directory>{}</is_directory>
            <is_listable>{}</is_listable>
            <found_from_listable>{}</found_from_listable>
            <redirect_url>{}</redirect_url>
            </file>\n",
            response.url,
            response.code,
            response.content_len,
            response.is_directory,
            response.is_listable,
            response.found_from_listable,
            response.redirect_url)
    }

dirble/src/output_format.rs

Lines 116 to 134 in 2eb9801

    pub fn output_json(response: &RequestResponse) -> String {
        format!("{{\
            \"url\": \"{}\", \
            \"code\": {}, \
            \"size\": {}, \
            \"is_directory\": {}, \
            \"is_listable\": {}, \
            \"found_from_listable\": {}, \
            \"redirect_url\": \"{}\"\
            }}",
            response.url,
            response.code,
            response.content_len,
            response.is_directory,
            response.is_listable,
            response.found_from_listable,
            response.redirect_url)
    }
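
As an illustration of the kind of encoding that is missing, here is a minimal std-only sketch of escaping helpers that could be applied to the url and redirect_url fields before they are interpolated. `xml_escape` and `json_escape` are hypothetical names, not functions from dirble. Deriving the Serialize trait, as proposed above, would make hand-rolled helpers like these unnecessary.

```rust
/// Escape the five XML special characters so a value is safe inside
/// element text or a double-quoted attribute.
fn xml_escape(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&apos;"),
            _ => out.push(c),
        }
    }
    out
}

/// Escape a value for embedding between double quotes in JSON:
/// quotes, backslashes and control characters below U+0020.
fn json_escape(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => {
                out.push_str(&format!("\\u{:04x}", c as u32))
            }
            _ => out.push(c),
        }
    }
    out
}
```

Unit tests with unusual inputs (quotes, angle brackets, ampersands, control characters) would then check that the output still parses as XML or JSON.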

%ext% support

Hi, I see dirble does not support %ext%/%EXT%. Many wordlists use this placeholder, which the scanner is expected to replace with each configured extension.
Ex: admin.%ext% -> admin.php / admin.asp / admin.jsp
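
A minimal sketch of the requested expansion, assuming the placeholder is replaced once per configured extension and entries without a placeholder pass through unchanged. `expand_ext` is a hypothetical name, not an existing dirble function.

```rust
/// Expand `%ext%` / `%EXT%` placeholders in a wordlist entry into one
/// entry per extension. Entries without a placeholder are returned as-is.
fn expand_ext(word: &str, extensions: &[&str]) -> Vec<String> {
    if !word.contains("%ext%") && !word.contains("%EXT%") {
        return vec![word.to_string()];
    }
    extensions
        .iter()
        .map(|&ext| word.replace("%ext%", ext).replace("%EXT%", ext))
        .collect()
}
```

With the extensions php, asp and jsp, the entry admin.%ext% would expand to admin.php, admin.asp and admin.jsp.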

Increase wordlist splitting factor for base scan

Increasing the wordlist splitting factor for the initial scan of the base URL to max(wordlist_split, max_threads - 2) will dramatically increase the speed of the initial discovery phase while leaving a couple of "spare" threads available to start working on any discovered directories. Perhaps this could be the default behaviour, reverting to a fixed splitting factor when the user explicitly provides one on the command line.

Related: I don't think the splitting factor is sanity checked against the thread limit - would it make sense to cap wordlist_split at max_threads when validating the config?
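
The proposal could look roughly like this. It is a sketch only: `effective_split` and its parameters are hypothetical names, and `default_split` stands in for whatever dirble's built-in default is.

```rust
/// Splitting factor for the initial scan of the base URL.
///
/// * If the user gave an explicit value, use it.
/// * Otherwise use max(default_split, max_threads - 2), which leaves a
///   couple of threads spare for newly discovered directories.
/// * In both cases cap the result at max_threads and never go below 1.
fn effective_split(
    user_split: Option<usize>,
    default_split: usize,
    max_threads: usize,
) -> usize {
    let split = match user_split {
        Some(n) => n,
        None => default_split.max(max_threads.saturating_sub(2)),
    };
    split.clamp(1, max_threads.max(1))
}
```

For example, with 10 threads and no explicit value the initial scan would be split 8 ways, while an explicit value of 50 would be capped at 10.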

Include requested host name in error message

dirble/src/arg_parse.rs

Line 874 in b0a33f5

Err(String::from("The provided target URI is invald"))

The error message is not very helpful as it does not say which hostname caused it. Include the hostname in the error message to make debugging easier.

Also there's a typo in the current message.

Directory detection fails if you specify port 80

If you run Dirble with a URL containing port 80 or 443 as follows http://[url]:80, most websites will redirect without the port number, breaking directory detection.
This would be fixed by removing :80 and :443 if the given url begins with http:// or https:// respectively.

Invoked as:
dirble http://[url]:80
Output line showing the issue:
+ http://[url]:80/javascript (CODE:301|SIZE:317|DEST:http://[url]/javascript/
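
The suggested fix could be sketched as a small normalisation step applied to the target before scanning. `strip_default_port` is a hypothetical helper using plain string handling, and it only looks at the authority up to the first '/', so a URL with a query string but no path is left untouched.

```rust
/// Remove ":80" from http:// URLs and ":443" from https:// URLs so that
/// redirect targets without the port still match the scanned prefix.
fn strip_default_port(url: &str) -> String {
    for &(scheme, port) in &[("http://", ":80"), ("https://", ":443")] {
        if let Some(rest) = url.strip_prefix(scheme) {
            // Split "host:port" from the path that follows it, if any.
            let (authority, tail) = match rest.find('/') {
                Some(i) => rest.split_at(i),
                None => (rest, ""),
            };
            if let Some(host) = authority.strip_suffix(port) {
                return format!("{}{}{}", scheme, host, tail);
            }
        }
    }
    // Non-default ports and other schemes are returned unchanged.
    url.to_string()
}
```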

Missing `follow redirects` feature

Hey,

While trying this out, I noticed that it is missing the follow redirects feature which both dirsearch and gobuster have.

It surely helps with servers that redirect every request, for example:

port 80 redirecting request to 443
redirecting to add forward slash at the end.

Thanks

Output missing \n

In output.txt, you are missing \n after Dirble Scan Report for https://domain.com
Ex:

Dirble Scan Report for https://domain.com:8443/:Dirble Scan Report for https://domain.com/:+ https://domain.com/.passwd (CODE:0|SIZE:0)
+ https://domain.com/2005 (CODE:0|SIZE:0)

Options for wordlist.

Hi, can you add an option controlling whether wordlist entries may have a leading '/'? Currently, if a wordlist entry starts with '/', dirble does not remove it and sends requests containing '//'. Some wordlists include the leading '/' by default. Thanks.
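
A minimal sketch of the normalisation being asked for, assuming the base URL and the wordlist entry are joined with exactly one '/'. `join_url` is a hypothetical name, not an existing dirble function.

```rust
/// Join a base URL and a wordlist entry with exactly one '/',
/// whether or not either side already carries one.
fn join_url(base: &str, word: &str) -> String {
    format!(
        "{}/{}",
        base.trim_end_matches('/'),
        word.trim_start_matches('/')
    )
}
```

An option could then switch this trimming off for users who do want the doubled slash to be sent.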

Macos build for v1.4.0

Hi, I see Mac OS build is missing in the new release xD

rename --host to --url

I'm playing with the tool and getting confused every time I run it.

A host is an IP address or a DNS name, but what you really require is a URL (with scheme).
https://danielmiessler.com/study/url-uri/

Raise error when -oA is provided

Currently the "output all" short alias is --oA with two dashes. Nmap uses -oA with one dash, and attempts to use "-oA" with dirble result in weird error messages (e.g. #43 ).

Short term: Error when -oA is specified
Long term: Make it so that -oA works as expected

Library to parse JSON output

It would be useful to have a library that can parse the JSON output and provide iterators over the discovered content.

JSON Output format

In the JSON output, can we include the target? Like -> .target = "https://google.com"

I'm trying to see if I can integrate this wonderful tool with my automation platform (replacing dirsearch), and I'm trying to match the format that was previously going into the database. Do you ever expect the URL to have a different host than the target? Unless you're extracting links from a page that go to other subdomains, I doubt you would, so even just a simple .path = "/api/v1/users/all" would be perfect. I don't really need to have the entire URL in there, but eh.

Thanks 👍

Embed current branch/commit in version number for builds that aren't directly on a release tag

To make it easier to keep track of development builds that don't correspond to releases it would be good to include the current branch and commit in the version string that's printed out in the header.

There are bindings to libgit2 that could be used in a build script to set an appropriate environment variable for the build process.

Write output every n seconds

Hi, can it save the output every n seconds? Thanks.

Scanner Tripped up

Can you add a way to detect a tarpit that answers every request with a 200 and responses of similar size, and then either disengage from the host or, better yet, discard all results that match that size when it happens over and over?
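
A rough sketch of one possible heuristic: count how often each (status code, size) pair occurs and flag the pairs that repeat at least `threshold` times, so that matching results can be discarded. `repeated_signatures` is a hypothetical name, and exact size matching is a simplification of "similar size".

```rust
use std::collections::HashMap;

/// Return every (status code, size) pair that occurs at least
/// `threshold` times in `results`, sorted for stable output.
fn repeated_signatures(
    results: &[(u32, usize)],
    threshold: usize,
) -> Vec<(u32, usize)> {
    let mut counts: HashMap<(u32, usize), usize> = HashMap::new();
    for sig in results {
        *counts.entry(*sig).or_insert(0) += 1;
    }
    let mut noisy: Vec<(u32, usize)> = counts
        .into_iter()
        .filter(|&(_, n)| n >= threshold)
        .map(|(sig, _)| sig)
        .collect();
    noisy.sort();
    noisy
}
```

A tolerance window around each size, instead of exact equality, would cover responses that differ by a few bytes, for example because they echo the requested path.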

Error `Option::unwrap()`

Hi, I got this error with newest source code from github

thread '<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value', src/libcore/option.rs:347:21
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
stack backtrace:
   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
   1: std::sys_common::backtrace::_print
   2: std::panicking::default_hook::{{closure}}
   3: std::panicking::default_hook
   4: std::panicking::rust_panic_with_hook
   5: std::panicking::continue_panic_fmt
   6: rust_begin_unwind
   7: core::panicking::panic_fmt
   8: core::panicking::panic
   9: dirble::output::directory_name
  10: alloc::slice::<impl [T]>::sort_by::{{closure}}
  11: alloc::slice::merge_sort
  12: dirble::output::print_report
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Any', src/libcore/result.rs:999:5
stack backtrace:
   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
   1: std::sys_common::backtrace::_print
   2: std::panicking::default_hook::{{closure}}
   3: std::panicking::default_hook
   4: std::panicking::rust_panic_with_hook
   5: std::panicking::continue_panic_fmt
   6: rust_begin_unwind
   7: core::panicking::panic_fmt
   8: core::result::unwrap_failed
   9: dirble::main
  10: std::rt::lang_start::{{closure}}
  11: std::panicking::try::do_call
  12: __rust_maybe_catch_panic
  13: std::rt::lang_start_internal
  14: main

Embed version numbers in release zips

It would be useful to embed the version number in the release zipfile and binary names to make it easier to keep track of which version has been downloaded and whether it is up to date.

stream did not contain valid UTF-8

I quickly tested dirble and I came across this error message.

$ ./dirble http://127.0.0.1:8000 -w ~/clones/SecLists/Discovery/Web-Content/big.txt
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Custom { kind: InvalidData, error: StringError("stream did not contain valid UTF-8") }', src/libcore/result.rs:1009:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.

It seems like this line causes it. Removing the line fixed the error.

https://github.com/danielmiessler/SecLists/blob/master/Discovery/Web-Content/big.txt#L16072
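
A possible way to avoid the panic is to read the wordlist as raw bytes and convert each line lossily instead of unwrapping a UTF-8 decoding result. This is a sketch, not dirble's actual loader: `read_wordlist` is a hypothetical name, invalid bytes are replaced with U+FFFD, and skipping empty lines is an assumption.

```rust
use std::io::{self, BufRead};

/// Read a wordlist line by line without requiring valid UTF-8.
/// Invalid byte sequences become U+FFFD instead of causing an error.
fn read_wordlist<R: BufRead>(reader: R) -> io::Result<Vec<String>> {
    let mut words = Vec::new();
    for line in reader.split(b'\n') {
        let bytes = line?;
        let text = String::from_utf8_lossy(&bytes);
        // Tolerate Windows line endings.
        let word = text.trim_end_matches('\r');
        if !word.is_empty() {
            words.push(word.to_string());
        }
    }
    Ok(words)
}
```

Any `BufRead` works here, for example a `std::io::BufReader` wrapped around the opened wordlist file. Skipping undecodable lines with a warning would be an alternative to lossy conversion.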

Silent option should not show host timeout

Hi, I think the silent option (-S) should not show host timeouts (Timeout was reached) at runtime. They could be shown in the final result instead.
Also, in silent mode it would be nice to have a progress bar :D

Project Roadmap

A list of features that would be nice to add, listed in no particular order:

Input

Load base request from a file
Load headers from a file
Remove empty lines from a wordlist when it's read in, but always scan [url]/
Support for multiple wordlists
Load command line options from a config file
Better detection of where the default wordlist is located
Option to pause and resume scans later

Error Checking

Check before scanning if a certificate is invalid
Optionally output certificate details
Better errors when curl returns an error, this is currently represented as a code 0
Detection and handling of URL rewriting
Wait after receiving a 429 - Too Many Requests
Detect when all responses are 401 - Unauthorized or 403 - Forbidden

Output

Scraping

Scrape pages for in scope URLs to scan
Printing of interesting comments, things such as todo, urls, high entropy sections such as hashes
Scrape robots.txt for URLs to scan

Scanning

Releasing

Actions

Run tests on Windows, Mac, Linux
Cross-compile for ARM
Build releases
Build dpkg & RPM

Utf8Error - Result::unwrap()

Hi!

I'm using dirble to run a scan using this wordlist: https://gist.github.com/jhaddix/b80ea67d85c13206125806f0828f4d10

with these options:

RUST_BACKTRACE=full ./dirble -l --scrape-listable --scan-401 --scan-403 --show-htaccess -w ../../content_discovery_all.txt -x js,php,java,bak,sql,inc,config,old,1 -u http://blank.blank

and this is what I get:

Dirble 1.4.2 (commit b6c46aa, build 2019-10-28)
Developed by Izzy Whistlecroft
Targets: http://blank.blank
Wordlists: ../../content_discovery_all.txt
No Prefixes
Extensions: 1 bak config inc java js old php sql
No lengths hidden

[INFO] Detected nonexistent paths for http:/blank.blank/ are (CODE:301)
[INFO] Increasing wordlist-split for initial scan of http://blank.blank/ to 8
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Utf8Error { valid_up_to: 41, error_len: Some(1) }', src/libcore/result.rs:1084:5
stack backtrace:
   0:     0x5635b03d54ab - std::panicking::default_hook::{{closure}}::hd4d730f4b49280ac
   1:     0x5635b03d5186 - std::panicking::default_hook::h15ad337e082b11af
   2:     0x5635b03d5c1d - std::panicking::rust_panic_with_hook::h1ae6f71213bb644c
   3:     0x5635b03d57a2 - std::panicking::continue_panic_fmt::h7260e5946830995a
   4:     0x5635b03d5686 - rust_begin_unwind
   5:     0x5635b03ece2d - core::panicking::panic_fmt::h0f33ccf7fc2a1201
   6:     0x5635b03ecf27 - core::result::unwrap_failed::h5f2f3948a0c719bd
   7:     0x5635b02a9bac - dirble::request::make_request::h6d5658e1e763b468
   8:     0x5635b02810af - dirble::request_thread::thread_spawn::h8b4fdc807a27d39e
   9:     0x5635b0284155 - std::sys_common::backtrace::__rust_begin_short_backtrace::hb12b1413905fb8ee
  10:     0x5635b029c476 - std::panicking::try::do_call::h1c781cdca5ded62e
  11:     0x5635b03d87da - __rust_maybe_catch_panic
  12:     0x5635b0285986 - core::ops::function::FnOnce::call_once{{vtable.shim}}::hcdba3607b5c903c6
  13:     0x5635b03c9daf - <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h31390944ec2de39e
  14:     0x5635b03d7ef0 - std::sys::unix::thread::Thread::new::thread_start::h98ef2794a4d7713d
  15:     0x7f2d2d3fd4cf - start_thread
  16:     0x7f2d2d3122d3 - clone
  17:                0x0 - <unknown>

Curl error after requesting c:/Users/Personal%201/Desktop/Portofolio : [1] Unsupported protocol
+ c:/Users/Personal%201/Desktop/Portofolio (CODE:0|SIZE:0)
Curl error after requesting c:/Users/ctyi/Desktop : [1] Unsupported protocol
+ c:/Users/ctyi/Desktop (CODE:0|SIZE:0)
Curl error after requesting c:/Users/ctyi/Desktop1 : [1] Unsupported protocol
+ c:/Users/ctyi/Desktop1 (CODE:0|SIZE:0)
Curl error after requesting c:/Users/K.HOW/Desktop/code/Responsive-Portfolio : [1] Unsupported protocol
+ c:/Users/K.HOW/Desktop/code/Responsive-Portfolio (CODE:0|SIZE:0)

Support for CURLOPT_RESOLVE

The CURLOPT_RESOLVE option instructs libcurl to override DNS lookups for particular (or all) hostnames. This would be useful when scanning a server that requires SNI but does not have a public DNS record, as an alternative to modifying /etc/hosts.
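
For reference, libcurl expects each CURLOPT_RESOLVE entry as a string of the form HOST:PORT:ADDRESS. A small sketch of building such an entry from user input follows; `resolve_entry` is a hypothetical helper limited to IPv4 for simplicity, and how the resulting list is handed to libcurl is left out.

```rust
use std::net::Ipv4Addr;

/// Build one CURLOPT_RESOLVE entry in libcurl's HOST:PORT:ADDRESS form.
fn resolve_entry(host: &str, port: u16, addr: Ipv4Addr) -> String {
    format!("{}:{}:{}", host, port, addr)
}
```

For a host internal.example served on 10.0.0.5 over HTTPS, the entry would be internal.example:443:10.0.0.5.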

Hide-lengths min value 0

Hi!

This is a small patch that allows a value of zero for the hide-lengths argument.
In some cases it is necessary to hide responses of length 0, in order to avoid filling the JSON file with garbage.

+ https://www.example.com/test/index.php (CODE:0|SIZE:0)
Curl error after requesting https://www.example.com/test/index.php : [28] Timeout was reached

Replace min value 1 by 0.

dirble/src/arg_parse.rs

Line 553 in b6c46aa

.min_values(1)

Thanks!

Follow initial redirect

In testing out dirble, I noticed that it will attempt exactly the url that is given, but seems to not understand what to do if, for example, the following scenario is encountered:

./dirble --host abc.com
<dirble brutes abc.com, but abc.com 301's absolutely every request>

curl -skv abc.com
301 to https://abc.com

curl -skv https://abc.com
301 to https://www.abc.com

real site resides on https://www.abc.com, but input provided is just abc.com.

wpscan handles this pretty well with a function called 'follow initial redirect'.
If something like that could be possible here, it would greatly improve workflow!
