perimeter-scanner's Introduction

perimeter-scanner

A schedulable OSINT scanner using recon-ng which allows for the analysis of an attack surface over time.

Background

This project comes from a desire to learn how to build applications on AWS, and a practical need to understand the attack surface of an organisation as a function of time. This initial, very basic release does domain enumeration periodically with effectively indefinite persistence of results. This will hopefully serve as a helpful baseline for data analysis.

Overall Architecture and CICD

AWS helpfully explains how we should segment large Organisations into multiple Accounts, and I have used a similar architecture for this project. Specifically:

  • I have production and non-production workloads in their own accounts; and
  • The CICD pipeline and its associated components also have their own account.

The pipeline itself comprises a source stage in GitHub, a build step, a deployment to a non-production environment (I refer to it herein as devtest), and finally a deployment to production.

The App

Fundamentally: recon-ng with a custom workflow run from the shell of an EC2 Instance created for the task. I could not figure out how to make recon-ng serverless, and so I've done what I hope is the next best thing:

  • An EC2 Instance which is configured to launch, install recon-ng, clone this repo, set up cron, set up some environment variables, then stop.
  • A scheduled EventBridge task that invokes a Lambda, which itself starts the previously launched EC2 Instance. Starting the Instance causes the enumeration script to run, resulting in a CSV file which is copied to S3 and then deleted locally. That EC2 Instance is then stopped (because I am cheap / frugal).
  • S3 event notifications then cause another Lambda to invoke, which parses the file, loads contents into a DynamoDB table, then deletes the CSV file.
  • A secondary DynamoDB table that summarises when hosts were first detected and most recently detected.
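The flow above can be sketched in a few small functions. This is a minimal sketch under stated assumptions, not the code in this repo: the CSV column name (host), the attribute names (first_seen, last_seen) and the function names are all hypothetical. The boto3 clients are passed in rather than created here, so the logic can be exercised without AWS credentials.

```python
import csv
import io
from urllib.parse import unquote_plus


def start_scanner(ec2_client, instance_id):
    """Start the pre-built scanner Instance (the job of the scheduled Lambda)."""
    ec2_client.start_instances(InstanceIds=[instance_id])


def s3_object_from_event(event):
    """Return (bucket, key) for the first record of an S3 event notification."""
    record = event["Records"][0]["s3"]
    # Object keys arrive URL-encoded in S3 event notifications.
    return record["bucket"]["name"], unquote_plus(record["object"]["key"])


def hosts_from_csv(csv_text, column="host"):
    """Extract the non-empty, de-duplicated host names, in file order.

    The column name is an assumption about the CSV layout.
    """
    hosts = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        host = (row.get(column) or "").strip()
        if host and host not in hosts:
            hosts.append(host)
    return hosts


def summary_update(host, timestamp):
    """Build kwargs for a DynamoDB Table.update_item call on the summary table.

    last_seen is always overwritten; first_seen is only written if absent.
    """
    return {
        "Key": {"host": host},
        "UpdateExpression": (
            "SET last_seen = :ts, first_seen = if_not_exists(first_seen, :ts)"
        ),
        "ExpressionAttributeValues": {":ts": timestamp},
    }
```

In a real Lambda the kwargs would be passed as table.update_item(**summary_update(host, ts)), where table is a boto3 DynamoDB Table resource.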

Instructions

  1. Deploy the crossAccountRoles stack to the Accounts you intend to use for production and devtest.
  2. Deploy the pipeline stack to the Account you intend to use for CICD.
  3. The CodePipeline object will run automatically once the pipeline stack reaches CREATE_COMPLETE. This will result in a built application.
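If you would rather script the steps above than click through the console, something like the following could work. It is only a sketch: the function name is hypothetical, and the Capabilities value is my assumption that the templates create named IAM roles. The create_stack call and the stack_create_complete waiter are standard boto3 CloudFormation features.

```python
def deploy_stack(cfn_client, stack_name, template_body, parameters=None):
    """Create a stack and block until CloudFormation reports CREATE_COMPLETE."""
    cfn_client.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=[
            {"ParameterKey": key, "ParameterValue": value}
            for key, value in (parameters or {}).items()
        ],
        # Assumption: the templates create named IAM roles.
        Capabilities=["CAPABILITY_NAMED_IAM"],
    )
    # The waiter polls until the stack reaches CREATE_COMPLETE (or fails).
    cfn_client.get_waiter("stack_create_complete").wait(StackName=stack_name)
```

You would call it once per Account with a CloudFormation client built from that Account's credentials: first the crossAccountRoles stack for production and devtest, then the pipeline stack for CICD.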

Assumptions / Parameters

OK, so despite my best efforts the application is not perfectly self-contained. I.e. there are some items that need to be set up prior to deploying the pipeline stack.

  • The Pipeline's source stage connects to GitHub through a CodeStar Connection. This needs to be instantiated, and its ARN given as a parameter.
  • Since this repo is currently private, it cannot be accessed without authentication. Loath as I am to store secrets in repos of any kind - I have placed the required Personal Access Token into the Systems Manager Parameter Store.
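Reading the token back at run time might look like the sketch below. The parameter name is hypothetical (use whatever name you stored it under), and I am assuming it was stored as a SecureString, hence WithDecryption. The SSM client is passed in so the function can be tested with a stand-in.

```python
def read_github_token(ssm_client, name="/perimeter-scanner/github-pat"):
    """Fetch the Personal Access Token from the SSM Parameter Store.

    The default parameter name is a placeholder, not the one this repo uses.
    """
    response = ssm_client.get_parameter(Name=name, WithDecryption=True)
    return response["Parameter"]["Value"]
```

With boto3 installed, the client would come from boto3.client("ssm") in the Account that holds the parameter.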

perimeter-scanner's People

Contributors

pa-wills

perimeter-scanner's Issues

CICD not updating the user data on updatestack

So - I update the AMI, which forces the resource to rebuild, which it then seemingly does with the old user data. So annoying. I wonder if it will do that when I do the May AMI update.

Seems like a real aberration.

Port the ec2 enumerator to a container

Maerk's lecture on ECS Soln Archs is interesting.

Suggests that I should really just define the whole thing as a Task, launch it in Fargate, and then invoke it on a schedule. I think this would drastically simplify the maintenance. For one thing - no need to adjust AMIs periodically, etc.
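For reference, launching such a Task from a Lambda could look roughly like this. It is a sketch of the idea, not something that exists in this repo: the cluster, task definition and subnets are placeholders you would supply, and assignPublicIp is set on the assumption that the Task runs in a public subnet and needs outbound internet access. EventBridge can also target an ECS Task directly, which would remove the Lambda altogether.

```python
def run_scan_task(ecs_client, cluster, task_definition, subnets):
    """Launch one Fargate Task and return the ARNs of the Tasks that started."""
    response = ecs_client.run_task(
        cluster=cluster,
        taskDefinition=task_definition,
        launchType="FARGATE",
        count=1,
        networkConfiguration={
            "awsvpcConfiguration": {
                "subnets": list(subnets),
                # Assumption: public subnet, outbound internet required.
                "assignPublicIp": "ENABLED",
            }
        },
    )
    return [task["taskArn"] for task in response.get("tasks", [])]
```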

Items being enqueued multiple times

The queue entry logic is essentially - enqueue it if it hasn't been nmap'd for a week.

This was fine until I started experimenting with the frequency of the queue-filling lambda. It's now putting the same item in there multiple times. I need a check that says - here's the last time I enqueued this item to run, and gate it on this constraint as well, or find a way of ensuring deduplication at a message level.

Perhaps the SQS deduplication feature (available on FIFO queues).
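One caveat on the SQS route: FIFO deduplication only applies within a 5-minute window, so on its own it would not stop the same host being enqueued again an hour later. A gate in the queue-filling Lambda could look like the sketch below. The function name, the one-day pending timeout and the idea of storing a last-enqueued timestamp per host are all my assumptions.

```python
from datetime import timedelta


def should_enqueue(
    last_scanned,
    last_enqueued,
    now,
    scan_interval=timedelta(days=7),
    pending_timeout=timedelta(days=1),
):
    """Decide whether a host should be put on the nmap queue.

    Enqueue only if the last scan is at least scan_interval old (or never
    happened) AND the host was not already enqueued within pending_timeout.
    All arguments are datetimes, or None when there is no record yet.
    """
    scan_is_stale = last_scanned is None or now - last_scanned >= scan_interval
    still_pending = (
        last_enqueued is not None and now - last_enqueued < pending_timeout
    )
    return scan_is_stale and not still_pending
```

The pending timeout means a message that gets lost is retried after a day rather than never, while a Lambda that runs every few minutes cannot stack up duplicates.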

Put the bash file onto the EC2 host

As part of the EC2 commissioning process - I need to go and get the bash script. Seems to me the easiest way to do this would be to clone the existing repo, but it's private.

So, one of two solutions:

  1. Give the EC2 instance a ssh key and then go clone the repo as part of launch.
  2. Stash the file in S3 as part of the stack's creation, and then cp the file over from s3.

Either way though seems to involve extending trust from GitHub to the stack, so might as well keep it very simple and go with (1).

Probably take the key pair information from CFN as a parameter, and then stash it in the SSM Param Store, and pull it out when needed. Good little exercise.

In the interim - I will hack my way through this, and just put the file in place.

Schedule / cron

Also - just to simplify this -

  • configure the enumerate script to run on boot.
  • configure that script to stop the Instance after execution, maybe? Or, create a flag, or a signal or something?
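A small wrapper, run at boot, is one way to get both behaviours. This is a sketch only: the script path in the usage note is a placeholder, and the stop action is passed in as a callable so the wrapper itself stays testable.

```python
import subprocess


def run_then_stop(command, stop):
    """Run the enumeration command, then call stop() whether or not it worked.

    Returns the command's exit code. stop() still runs if the command
    cannot be launched at all, so the Instance never stays up by accident.
    """
    try:
        return subprocess.run(command, check=False).returncode
    finally:
        stop()
```

On the Instance this might be called as run_then_stop(["/bin/bash", "/path/to/enumerate.sh"], lambda: subprocess.run(["shutdown", "-h", "now"])). An OS-level shutdown stops (rather than terminates) the Instance as long as its shutdown behaviour is left at the default of stop.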

template.yaml is becoming unwieldy

I'd like to break out the lambdas at the very least. Can I do this without requiring a CodeBuild project? Probably not. Probably I can mitigate things though by putting build into its own stage.

Alerting

So I find a new enumerated host or a new open port. What then? Emails and ITOM would be my guess.

DyDB archiving

I need to be able to tail off some of the old entries, and stick them into S3 possibly with a lifecycle rule. Not so much a cost issue, but because the lambdas take a long time to run otherwise.
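The selection step could be as simple as the sketch below. The 90-day retention period and the scanned_at attribute name are assumptions, and timestamps are assumed to be stored as ISO 8601 strings with an explicit UTC offset.

```python
from datetime import datetime, timedelta


def split_for_archive(items, now, keep=timedelta(days=90), ts_field="scanned_at"):
    """Split items into (keep_items, archive_items) by age.

    Items whose timestamp is older than now - keep are marked for archiving.
    """
    cutoff = now - keep
    keep_items, archive_items = [], []
    for item in items:
        scanned_at = datetime.fromisoformat(item[ts_field])
        if scanned_at < cutoff:
            archive_items.append(item)
        else:
            keep_items.append(item)
    return keep_items, archive_items
```

The archive list would then be written to S3 (where a lifecycle rule can take over) and deleted from the table.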

Configure backups

  • Make PITR active for the production DyDB table.
  • AWS Backup for both Prod and DevTest, though with different settings.
  • Probably back up the production EBS Instances as well. Weekly is probably fine; it's stateless after all.
  • Configure all of this into the CFN template.
