Web Defacement Detection Tool

This tool uses archived defaced websites as a source for learning defacement signatures. The defacement signature is the basic concept in this tool: it represents the set of elements that are typical for a specific defacer (notifier). Knowing a defacer's signature (or signatures) makes it possible to detect similar web defacements in the future, or to detect and prevent a defacement attack on the server side. A defacement signature is represented by 5 types of web elements that are typical for any defacer: all visible text, Images, backgroundImages, alerts, and embedded audio or video content. Detecting these signatures is not trivial, because signature elements are sometimes embedded in an otherwise legitimate website. The problem is therefore to eliminate elements that are not part of the signature. The idea, the algorithm, and the theory behind signature noise elimination are fully covered in DEFACEMENTS.docx.
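As a rough illustration of the five element types listed above, a signature could be modelled as a small immutable record. This is only a sketch: the class name `Signature`, its field names, and the `is_empty` helper are hypothetical and are not taken from the tool's code or database schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Signature:
    """Hypothetical container for one notifier's defacement signature."""

    notifier: str
    # One set per element type named in the description above.
    texts: frozenset = field(default_factory=frozenset)
    images: frozenset = field(default_factory=frozenset)
    background_images: frozenset = field(default_factory=frozenset)
    alerts: frozenset = field(default_factory=frozenset)
    media: frozenset = field(default_factory=frozenset)  # embedded audio or video

    def is_empty(self) -> bool:
        """Return True when no element of any type has been recorded."""
        return not (
            self.texts
            or self.images
            or self.background_images
            or self.alerts
            or self.media
        )
```

Sets are used here on the assumption that the order of elements on a page does not matter for a signature.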

The tool consists of four Python scripts and one PostgreSQL database.

collector.py

This script uses the Chrome web driver, controlled through Selenium, to access the HTML elements of archived defaced websites. It also navigates through the list of reported defacements on the publishing source (zone-h.org). All defacements are collected periodically and stored in the database together with the complete set of elements from the defaced webpage. Each defacement is associated with its notifier in the database.
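To make the element collection concrete, the sketch below pulls the five element types out of a static HTML string. It deliberately swaps Selenium and the Chrome web driver for Python's standard-library `html.parser`, so it runs without a browser; the real script works on the rendered page, which this stand-in does not reproduce. The names `ElementCollector` and `extract_elements`, the dictionary keys, and the two regular expressions are assumptions made for illustration.

```python
import re
from html.parser import HTMLParser

# url(...) inside a CSS background or background-image declaration.
CSS_BACKGROUND = re.compile(
    r"background(?:-image)?\s*:[^;}]*?url\(\s*['\"]?([^'\")]+)"
)
# String argument of a JavaScript alert('...') call.
ALERT = re.compile(r"alert\(\s*(['\"])(.*?)\1\s*\)")


class ElementCollector(HTMLParser):
    """Collects text, images, background images, alerts and media sources."""

    def __init__(self):
        super().__init__()
        self.elements = {
            "text": [], "images": [], "backgroundImages": [],
            "alerts": [], "media": [],
        }
        self._raw = None  # "script" or "style" while inside such a tag

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._raw = tag
        if tag == "img" and attrs.get("src"):
            self.elements["images"].append(attrs["src"])
        if tag in ("audio", "video", "source", "embed") and attrs.get("src"):
            self.elements["media"].append(attrs["src"])
        if attrs.get("background"):  # legacy <body background="...">
            self.elements["backgroundImages"].append(attrs["background"])
        if attrs.get("style"):
            self.elements["backgroundImages"] += CSS_BACKGROUND.findall(attrs["style"])

    def handle_endtag(self, tag):
        if tag == self._raw:
            self._raw = None

    def handle_data(self, data):
        if self._raw == "script":
            self.elements["alerts"] += [m.group(2) for m in ALERT.finditer(data)]
        elif self._raw == "style":
            self.elements["backgroundImages"] += CSS_BACKGROUND.findall(data)
        elif data.strip():
            self.elements["text"].append(data.strip())


def extract_elements(html):
    """Return a dict mapping each element type to the values found in html."""
    collector = ElementCollector()
    collector.feed(html)
    collector.close()
    return collector.elements
```

Because this only reads static markup, it misses anything that scripts add at load time, which is one reason a browser-driven approach is the better fit for the actual collector.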

processor.py

This script maintains the database size, generates defacement signatures, and deletes old signatures that defacers no longer use. Because the complete content of each webpage is saved in the database, the database grows quickly. Old defacements that no longer provide significant input to the signature detection algorithm are deleted from the database. All defacements from the past 6 months are kept in the database for possible future testing with the algorithm. The script takes the necessary input from the database, calls the signature detection algorithm, and saves the resulting signatures back to the database. All signatures older than 3 years are deleted from the database.
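The retention rules above can be sketched as two cutoff dates and a pair of parameterized DELETE statements. Everything specific here is an assumption: the table names `defacement` and `signature`, the column names `collected_at` and `last_seen`, the approximation of 6 months as 183 days and 3 years as 1095 days, and the `%s` placeholder style (the psycopg2 convention; the README does not say which driver is used). The sketch also simplifies the first rule by deleting every defacement older than the 6-month window, whereas the script decides which old defacements are still significant.

```python
from datetime import datetime, timedelta

DEFACEMENT_RETENTION = timedelta(days=183)     # roughly 6 months
SIGNATURE_RETENTION = timedelta(days=3 * 365)  # roughly 3 years


def retention_cutoffs(now):
    """Return (defacement_cutoff, signature_cutoff) relative to now."""
    return now - DEFACEMENT_RETENTION, now - SIGNATURE_RETENTION


def cleanup_statements(now):
    """Build (sql, params) pairs; table and column names are hypothetical."""
    defacement_cutoff, signature_cutoff = retention_cutoffs(now)
    return [
        ("DELETE FROM defacement WHERE collected_at < %s", (defacement_cutoff,)),
        ("DELETE FROM signature WHERE last_seen < %s", (signature_cutoff,)),
    ]
```

Each pair could then be passed to a database cursor's `execute(sql, params)` call, which keeps the dates out of the SQL string itself.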

crawler.py

The script reads the domains.list file and scans every URL in it for known signatures. The Chrome web driver, controlled through Selenium, retrieves elements from the scanned webpages, which are then compared against the detected signatures. The comparison algorithm is to be further improved and described afterwards. The script returns a scan result for each URL, stating whether any signature and its associated defacer were detected.
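Since the comparison algorithm is not specified here, the following is only one possible way to compare a page against stored signatures: score each signature by the share of its elements that also appear on the page, and report the best-scoring notifier if the score passes a threshold. The function names `match_score` and `scan`, the dictionary layout, and the default threshold of 0.5 are all hypothetical.

```python
def match_score(page_elements, signature_elements):
    """Fraction of signature elements that also occur on the page (0.0 to 1.0)."""
    total = 0
    hits = 0
    for kind, sig_values in signature_elements.items():
        sig_set = set(sig_values)
        page_set = set(page_elements.get(kind, ()))
        total += len(sig_set)
        hits += len(sig_set & page_set)
    return hits / total if total else 0.0


def scan(pages, signatures, threshold=0.5):
    """Map each URL to the best-matching notifier, or None if nothing matches.

    pages:      {url: {element_type: iterable of values}}
    signatures: {notifier: {element_type: iterable of values}}
    """
    results = {}
    for url, elements in pages.items():
        score, notifier = max(
            ((match_score(elements, sig), name) for name, sig in signatures.items()),
            default=(0.0, None),
        )
        results[url] = notifier if score >= threshold else None
    return results
```

Exact set membership is the simplest choice; a real comparison would likely need fuzzier matching for text and image URLs that vary slightly between defacements.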

WebDfcAlg.py

This script implements the signature noise elimination algorithm, which is described in detail in DEFACEMENTS.docx.
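The actual algorithm lives in DEFACEMENTS.docx and is not reproduced here. Purely to illustrate what noise elimination means, the sketch below uses a simple frequency filter as a stand-in: elements that recur across most of one notifier's defacements are kept, while elements seen on only a few pages are treated as leftover content from the victim sites. The function name `eliminate_noise` and the `min_share` parameter are hypothetical, and this should not be read as the method implemented in WebDfcAlg.py.

```python
from collections import Counter


def eliminate_noise(defacements, min_share=0.6):
    """Keep elements present in at least min_share of the given defacements.

    defacements: list of element collections, one per defaced page,
                 all attributed to the same notifier.
    """
    if not defacements:
        return set()
    counts = Counter()
    for elements in defacements:
        counts.update(set(elements))  # count each element once per page
    needed = min_share * len(defacements)
    return {element for element, seen in counts.items() if seen >= needed}
```

For example, with three pages that share only the text "Hacked by example", that text is the single element that survives the filter.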

schema.sql

Database schema generated by pg_dump.

Contributors

marm0
