This jupyter notebook consists a project which implemets K mean clustering with PySpark. Meta data of each session showed that the hackers used to connect to their servers were found, for system that was breached. This data is used whether to identify whether 2 or 3 hackers were involved of the potential 3 hackers
- Jupyter notebook - an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.
- PySpark
- Python
- Git clone the folder to your computer
- Run the jupyternote book. ( make sure you have either Anaconda or virtual env running jupyternotebook )
- Run each block of code to see what happens at each step.
Before installing pySpark, make sure you have Java 8 or higher installed on your computer.
-Farhana