dafthack / powermeta

PowerMeta searches for publicly available files hosted on various websites for a particular domain by using specially crafted Google and Bing searches. It can then download those files from the target domain. After retrieving the files, PowerMeta can analyze the metadata associated with them. Some interesting things commonly found in metadata are usernames, domains, software titles, and computer names.

License: MIT License

PowerShell 100.00%

PowerMeta

Public File Discovery

For many organizations it's common to find publicly available files posted on their external websites. Many times these files contain sensitive information that might be of benefit to an attacker like usernames, domains, software titles or computer names. PowerMeta searches both Bing and Google for files on a particular domain using search strings like "site:targetdomain.com filetype:pdf". By default it searches for "pdf, docx, xlsx, doc, xls, pptx, and ppt".
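The search strings described above can be assembled mechanically. The following is a minimal sketch of that idea, not PowerMeta's actual code; the function name New-FileSearchQuery and its parameters are hypothetical and exist only for illustration.

```powershell
# Hypothetical helper (not part of PowerMeta): builds one search string
# per file type, in the "site:<domain> filetype:<ext>" form described above.
function New-FileSearchQuery {
    param(
        [Parameter(Mandatory = $true)][string]$TargetDomain,
        [string]$FileTypes = "pdf, docx, xlsx, doc, xls, pptx, ppt"
    )

    # Split the comma separated list, trim whitespace, and skip empty entries.
    $FileTypes -split ',' |
        ForEach-Object { $_.Trim() } |
        Where-Object { $_ -ne '' } |
        ForEach-Object { "site:$TargetDomain filetype:$_" }
}

# Emits one query string for pdf and one for xml.
New-FileSearchQuery -TargetDomain targetdomain.com -FileTypes "pdf, xml"
```

Keeping one query per file type makes it easy to see which extension produced which results.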

Metadata Extraction

PowerMeta uses ExifTool by Phil Harvey to extract metadata from files. If you would prefer to download the binary from his site directly instead of using the one in this repo, it can be found here: http://www.sno.phy.queensu.ca/~phil/exiftool/. Just make sure the exiftool executable is in the same directory as PowerMeta.ps1 when it is run. By default it extracts only the 'Author' and 'Creator' fields, as these commonly contain usernames. However, all metadata for the files can be extracted by passing PowerMeta the -ExtractAllToCsv option along with the name of the CSV file to write.
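As a rough illustration of why the 'Author' and 'Creator' fields are useful, the sketch below reduces CSV text to a unique list of values. It assumes the CSV has the shape that "exiftool -csv -Author -Creator <dir>" prints (a SourceFile column followed by one column per tag). The function name Get-UniqueAuthorValue and the sample rows are invented for this example and are not taken from PowerMeta.

```powershell
# Hypothetical helper (not part of PowerMeta): given CSV lines with
# SourceFile, Author and Creator columns, return the unique non-empty values.
function Get-UniqueAuthorValue {
    param([Parameter(Mandatory = $true)][string[]]$CsvLines)

    $rows = $CsvLines | ConvertFrom-Csv
    $rows |
        ForEach-Object { $_.Author; $_.Creator } |   # emit both fields per file
        Where-Object { $_ } |                        # drop empty or missing values
        Sort-Object -Unique
}

# Sample input invented for illustration; real data would come from running
# exiftool against the directory of downloaded files.
$sample = @(
    'SourceFile,Author,Creator'
    './files/report.pdf,jsmith,Microsoft Word 2010'
    './files/budget.xlsx,adoe,'
    './files/slides.pptx,jsmith,'
)
Get-UniqueAuthorValue -CsvLines $sample
```

The deduplicated list mixes likely usernames with software titles, which matches the kinds of values the README says commonly appear in metadata.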

Requirements

PowerShell version 3.0 or later

Usage

Import the Module

C:\> powershell.exe -exec bypass
PS C:\> Import-Module PowerMeta.ps1

Basic Search

This command will initiate Google and Bing searches for files on the 'targetdomain.com' domain ending with a file extension of pdf, docx, xlsx, doc, xls, pptx, or ppt. Once it has finished building the list of links, it will prompt the user to confirm whether to download the files from the target domain. After downloading the files, it will prompt again before extracting metadata from them.

PS C:\> Invoke-PowerMeta -TargetDomain targetdomain.com

Changing FileTypes and Automatic Download and Extract

This command will initiate Google and Bing searches for files on the 'targetdomain.com' domain ending with a file extension of pdf or xml. It will then automatically download them from the target domain and extract metadata.

PS C:\> Invoke-PowerMeta -TargetDomain targetdomain.com -FileTypes "pdf, xml" -Download -Extract

Downloading Files From A List

This command passes an existing list of file links, "target-domain-links.txt", to PowerMeta with -TargetFileList so that the files it lists are downloaded from the target domain. To write the links found by a Google and Bing search to disk in the first place, use the -OutputList option described under PowerMeta Options.

PS C:\> Invoke-PowerMeta -TargetDomain targetdomain.com -TargetFileList target-domain-links.txt

Extract All Metadata and Limit Page Search

This command will initiate Google and Bing searches for files on the 'targetdomain.com' domain ending with a file extension of pdf, docx, xlsx, doc, xls, pptx, or ppt, but will only search the first two pages of results. All metadata (not just the default fields) will be saved in a CSV called all-target-metadata.csv.

PS C:\> Invoke-PowerMeta -TargetDomain targetdomain.com -MaxSearchPages 2 -ExtractAllToCsv all-target-metadata.csv
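To make the effect of a page limit concrete, here is a minimal sketch of how paged search URLs could be generated. It is not PowerMeta's implementation. It assumes Google pages results with a 0-based "start" offset and Bing with a 1-based "first" offset at 10 results per page, which should be verified against each engine before use. The function name Get-PagedSearchUrl and the ResultsPerPage parameter are hypothetical.

```powershell
# Hypothetical helper (not part of PowerMeta): builds paged search URLs.
# Assumption: Google uses a 0-based "start" offset and Bing a 1-based
# "first" offset, with 10 results per page.
function Get-PagedSearchUrl {
    param(
        [Parameter(Mandatory = $true)][string]$Query,
        [int]$MaxSearchPages = 2,
        [int]$ResultsPerPage = 10
    )

    # URL-encode the query so ":" and spaces survive in the query string.
    $encoded = [uri]::EscapeDataString($Query)

    for ($page = 0; $page -lt $MaxSearchPages; $page++) {
        $offset = $page * $ResultsPerPage
        "https://www.google.com/search?q=$encoded&start=$offset"
        "https://www.bing.com/search?q=$encoded&first=$($offset + 1)"
    }
}

# Two pages per engine, so four URLs in total.
Get-PagedSearchUrl -Query 'site:targetdomain.com filetype:pdf' -MaxSearchPages 2
```

Capping the loop at MaxSearchPages bounds the number of requests sent to each engine.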

Extract Metadata From Files In A Directory

This command will extract all the metadata from all the files in the folder "\2017-03-031-144953" and save it in a CSV called all-target-metadata.csv.

PS C:\> ExtractMetadata -OutputDir .\2017-03-031-144953\ -ExtractAllToCsv all-target-metadata.csv

PowerMeta Options

TargetDomain        - The target domain to search for files.
FileTypes           - A comma separated list of file extensions to search for. By default PowerMeta searches for "pdf, docx, xlsx, doc, xls, pptx, ppt".
OutputList          - A file to write the list of links discovered through web searching to.
OutputDir           - A directory to store all downloaded files in.
TargetFileList      - List of file links to download.
Download            - Instead of being prompted interactively, pass this flag to auto-download files found.
Extract             - Instead of being prompted interactively, pass this flag to auto-extract metadata from found files.
ExtractAllToCsv     - All metadata (not just the default fields) will be extracted from files to the CSV file named with this option.
UserAgent           - Change the default User Agent used by PowerMeta.
MaxSearchPages      - The maximum number of pages to search on each search engine.

Contributors

dafthack, nidem

powermeta's Issues

Increase Search Page Limit

Right now PowerMeta can only search the original page, and whatever additional pages are linked to that page (Around 5-8 Google pages ~ around 500-800 site results). Need to add logic to find all Google/Bing page results.

PDF download page restriction?

Hi!

I've been trialing your code and it's really great, however, when I enter a targetdomain that I know contains a PDF of say 200 pages, it only downloads a section of the report and only say 30 pages. I have very limited programming experience so just wondering what the reason might be.

Many thanks

Extracting Meta & Saving to CSV

Having some issues extracting and saving to CSV.

Here's the command I tried using to extract metadata and save it to CSV:

Invoke-PowerMeta NAMECHANGED.com -ExtractAllToCsv all-target-metadata.csv

Here's the output after completing the search:

Extract Metadata?
Would you like to extract metadata from all of the files downloaded now?
[Y] Yes  [N] No  [?] Help (default is "Y"): y
[*] Now extracting metadata from the files.
ExtractMetadata : Missing an argument for parameter 'ExtractAllToCsv'. Specify a parameter of type 'System.String' and try again.
At C:\PowerMeta-master\PowerMeta.ps1:378 char:63
+ ...                ExtractMetadata -OutputDir $OutputDir -ExtractAllToCsv
+                                                          ~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidArgument: (:) [ExtractMetadata], ParameterBindingException
    + FullyQualifiedErrorId : MissingArgument,ExtractMetadata

I have exiftool in the same directory as the folder as well.

Execution of program

Hi, I am new to PowerShell. How do I run Invoke-PowerMeta? Mine gives me an error: "is not recognized as the name of a cmdlet, function, script file", etc. :)

Add CAPTCHA Answer Functionality

PowerMeta doesn't take into account rate limiting by sites like Google. Need to add the ability to answer CAPTCHA requests on the fly.

Invalid characters for filenames are not scrubbed

Links to download pages instead of direct to a file will look like /view.aspx?src=... which causes an error like this:

Invoke-WebRequest : Cannot perform operation because the wildcard path·
C:\Users\egypt\Desktop\powermeta-data\2018-03-12T11.45.30\view.aspx?src=https%3A%2F%2Fgo.redacted.com%2Frs%2Fredactedinc%2Fimages%2FQ314IndexData.xlsx di
At Z:\repo\PowerMeta\PowerMeta.ps1:479 char:11
+           Invoke-WebRequest $link -UserAgent $UserAgent -UseBasicPars ... 
+           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : OpenError: (C:\Users\egypt\...4IndexData.xlsx:String) [Invoke-WebRequest], FileNotFoundException
        + FullyQualifiedErrorId : FileOpenFailure,Microsoft.PowerShell.Commands.InvokeWebRequestCommand
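One possible way to address this is to scrub the local file name before saving. The sketch below is an assumption about how that could be done, not a patch taken from the PowerMeta repository. The function name ConvertTo-SafeFileName is hypothetical, and the character set is the list of characters Windows rejects in file names plus the square brackets that PowerShell treats as wildcards.

```powershell
# Hypothetical helper (not part of PowerMeta): derives a safe local file
# name from a download link.
function ConvertTo-SafeFileName {
    param([Parameter(Mandatory = $true)][string]$Link)

    # Take everything after the last "/" of the link as the candidate name.
    $name = $Link.Substring($Link.LastIndexOf('/') + 1)

    # Replace \ / : * ? " < > | [ ] with underscores.
    $safe = $name -replace '[\\/:*?"<>|\[\]]', '_'

    # Fall back to a fixed name if nothing usable is left.
    if ([string]::IsNullOrWhiteSpace($safe)) { $safe = 'download' }
    $safe
}

# The "?" in a view.aspx?src=... style link becomes "_".
ConvertTo-SafeFileName -Link 'https://example.com/view.aspx?src=https%3A%2F%2Fexample.com%2Fdata.xlsx'
```

The scrubbed name could then be joined to the output directory before the download call, so that the "?" no longer ends up in the output path.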

FileNames?

Hi again!

Also, how does one add the option of searching for specific file names as well (in conjunction with file types)?

Many thanks
Rudolph

Add Multi-Threading

Right now PowerMeta only downloads one file at a time. Need to add in multi-threading ability.
