threat-hunting-samples

Introduction

In this GitHub repository, Mossé Cyber Security Institute provides multiple datasets for practicing threat hunting.

For educational purposes, the answers to dataset 1 have been made available. For the other datasets, it will be up to you to determine which devices have been compromised.

Getting Started

Step 1 - Download Anaconda

We strongly recommend that you download and use Anaconda:

Anaconda offers the easiest way to perform Python data science and machine learning on a single machine.

Step 2 - Download the required Python Packages

Install pandas, PyArrow, and NumPy:

python -m pip install -r requirements.txt
  • Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.
  • Apache Arrow is a development platform for in-memory analytics. It contains a set of technologies that enable big data systems to store, process and move data fast.
  • NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.
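
After installation, a quick sanity check can confirm that the three packages import correctly. This is a minimal sketch; `installed_versions` is a hypothetical helper name and is not part of the repository.

```python
import importlib


def installed_versions(packages=("pandas", "pyarrow", "numpy")):
    """Return {package: version}, with None for any package that fails to import."""
    versions = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
            # Fall back to "unknown" for modules that expose no __version__.
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

If any package is reported as not installed, re-run the pip command above inside the same environment that Jupyter uses.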

Step 3 - Use Jupyter Notebook

We recommend that you work in a Jupyter Notebook:

Command: jupyter notebook

Step 4 - Bookmark online resources

If you're new to threat hunting and Pandas, then we recommend that you bookmark the following pages:

Important: Make sure to watch the introduction video (the first link).

Solving Dataset #1

We provide solutions for identifying all the Indicators of Compromise (IOCs) in dataset 1.

Disclaimer: The solutions provided are designed to be simple. In the real world, you'll need to engineer smarter ways of detecting attacks.

Step 1 - Reading the Datasets

Start a Jupyter Notebook and confirm that you can read one of the datasets:

import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

dataset = pq.ParquetDataset('dataset-1/w32services/')
table = dataset.read()
w32services = table.to_pandas()

Step 2 - Identify Path Interception by Unquoted Path

Adversaries may execute their own malicious payloads by hijacking vulnerable file path references. Adversaries can take advantage of paths that lack surrounding quotations by placing an executable in a higher level directory within the path, so that Windows will choose the adversary's executable to launch. (source)

Here's how you can find Path Interception IOCs in the first dataset:

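# Load the process table as in Step 1 (directory name assumed from the variable name).
w32processes = pq.ParquetDataset('dataset-1/w32processes/').read().to_pandas()

# C:\Program.exe is what an unquoted "C:\Program Files\..." path can resolve to.
search_1 = w32processes[w32processes['name'] == 'Program.exe']

print("> Machines with Path Interception:")
print(search_1[['hostname', 'path', 'arguments']].to_string(index=False))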
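
The search above finds the adversary's planted `Program.exe`. A complementary check is to look for the vulnerable references themselves: command lines whose executable path contains a space but is not wrapped in quotation marks. The following is a rough sketch of that idea, not the repository's method. The function names are hypothetical, and the `path` column default is an assumption about tables other than `w32processes`.

```python
import pandas as pd


def is_unquoted_path_with_space(command_line):
    """True if the executable path is unquoted and has a space before '.exe'."""
    if not isinstance(command_line, str):
        return False
    text = command_line.strip()
    if not text or text.startswith('"'):
        return False
    exe_end = text.lower().find(".exe")
    if exe_end == -1:
        return False
    # A space inside the path portion means Windows may try a shorter prefix first.
    return " " in text[:exe_end]


def flag_unquoted_paths(frame, column="path"):
    """Return the rows of `frame` whose `column` holds an unquoted path with a space."""
    mask = frame[column].map(is_unquoted_path_with_space).astype(bool)
    return frame[mask]
```

A call such as `flag_unquoted_paths(w32services, column="path")` would list candidate services to review, provided the table has a column with the service's binary path. This heuristic only inspects the text up to the first `.exe`, so treat its results as leads rather than confirmed findings.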

Step 3 - Identify Procdump.exe

ProcDump is a command-line utility whose primary purpose is monitoring an application for CPU spikes and generating crash dumps during a spike that an administrator or developer can use to determine the cause of the spike. ProcDump also includes hung window monitoring (using the same definition of a window hang that Windows and Task Manager use), unhandled exception monitoring and can generate dumps based on the values of system performance counters. It also can serve as a general process dump utility that you can embed in other scripts.

Here's how you can find machines where procdump.exe was run; inspect the arguments column to confirm that the adversary used it to dump the memory of LSASS:

search_2 = w32processes[w32processes['name'] == 'procdump.exe']

print("> Machines with procdump.exe: %d" % len(search_2))
print(search_2[['hostname', 'arguments']].to_string(index=False))
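
The search above returns every `procdump.exe` execution, and its name match is case-sensitive. To narrow the results to LSASS dumps specifically, the arguments can be filtered as well. This is a minimal sketch that assumes the `name` and `arguments` columns used above; `find_lsass_dumps` is a hypothetical helper, and matching `procdump64.exe` (the 64-bit Sysinternals binary) is an addition not present in the original search.

```python
import pandas as pd


def find_lsass_dumps(processes):
    """Return procdump rows whose arguments mention lsass (case-insensitive)."""
    # Normalise case and replace missing values so the comparisons never see NaN.
    name = processes["name"].fillna("").str.lower()
    args = processes["arguments"].fillna("").str.lower()
    is_procdump = name.isin(["procdump.exe", "procdump64.exe"])
    targets_lsass = args.str.contains("lsass", regex=False)
    return processes[is_procdump & targets_lsass]
```

Calling `find_lsass_dumps(w32processes)[['hostname', 'arguments']]` would show only the suspicious invocations. An adversary who renames the binary or passes a process ID instead of the name `lsass` would evade this filter.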

Step 4 - Identify Accessibility Feature backdoors

Adversaries may establish persistence and/or elevate privileges by executing malicious content triggered by accessibility features. Windows contains accessibility features that may be launched with a key combination before a user has logged in (ex: when the user is on the Windows logon screen). An adversary can modify the way these programs are launched to get a command prompt or backdoor without logging in to the system. (source)

Here's how you can detect the Accessibility Feature backdoors in the dataset:

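# Load the registry table as in Step 1 (directory name assumed from the variable name).
w32registry = pq.ParquetDataset('dataset-1/w32registry/').read().to_pandas()

# Keep 'Debugger' values set under Image File Execution Options keys.
search_3 = w32registry[w32registry['valuename'] == 'Debugger']
search_3 = search_3[search_3['keypath'].str.contains('Image File Execution Options', na=False)]

print("> Machines with Accessibility Features Backdoors:")
print(search_3[['hostname', 'keypath', 'text']].to_string(index=False))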
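
A `Debugger` value under Image File Execution Options can also be set for legitimate debugging, so it may help to keep only the keys that target accessibility programs. The sketch below assumes the `keypath` and `valuename` columns used above and that the last component of `keypath` is the target executable name. `find_accessibility_backdoors` is a hypothetical helper, and the binary list is a commonly cited set that is not guaranteed to be exhaustive.

```python
import pandas as pd

# Accessibility programs that can be launched from the Windows logon screen.
ACCESSIBILITY_BINARIES = (
    "sethc.exe", "utilman.exe", "osk.exe", "magnify.exe",
    "narrator.exe", "displayswitch.exe", "atbroker.exe",
)


def find_accessibility_backdoors(registry):
    """Return IFEO 'Debugger' rows whose key targets an accessibility binary."""
    keypath = registry["keypath"].fillna("").str.lower()
    is_debugger = registry["valuename"].fillna("").str.lower() == "debugger"
    is_ifeo = keypath.str.contains("image file execution options", regex=False)
    # The last path component of the key is the executable being hijacked.
    target = keypath.str.rsplit("\\", n=1).str[-1]
    return registry[is_debugger & is_ifeo & target.isin(ACCESSIBILITY_BINARIES)]
```

For instance, `find_accessibility_backdoors(w32registry)[['hostname', 'keypath', 'text']]` would list only the entries that hijack programs such as `sethc.exe` (Sticky Keys) or `utilman.exe`.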

Hints

Dataset | Machines     | Hints
1       | 25 machines  | LSASS process dumping, Path Interception, Accessibility Features Backdoor
2       | 50 machines  | DLL Injection, PowerShell Execution, MSHTA Execution, Regsvr32 Execution
3       | 75 machines  | Malicious User Accounts, Living off the Land, DLL Injection
4       | 100 machines | JScript Backdoor, PowerShell Dropper, 2x Reverse Shells
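
Once several searches have produced results, it can be useful to combine them into one list of suspected machines per dataset. The sketch below is one possible way to do that; `summarize_findings` is a hypothetical helper, the technique labels are free-form strings you choose, and each result is assumed to be a DataFrame with a `hostname` column, as in the dataset 1 solutions.

```python
import pandas as pd


def summarize_findings(findings):
    """Map each hostname to the sorted list of technique labels it was flagged for.

    `findings` is a dict of {technique label: DataFrame with a 'hostname' column}.
    """
    summary = {}
    for technique, frame in findings.items():
        for hostname in frame["hostname"].dropna().unique():
            summary.setdefault(hostname, set()).add(technique)
    return {host: sorted(techniques) for host, techniques in sorted(summary.items())}
```

With the dataset 1 searches, a call might look like `summarize_findings({"Path Interception": search_1, "LSASS dump": search_2, "Accessibility backdoor": search_3})`. Comparing the number of hosts returned with the machine counts in the table gives a rough sense of coverage.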

Contact Us

We invite you to contact us if you have any questions or would like to report errors with the datasets. Our email is [email protected]

Have fun!
