brannondorsey / passgan


A Deep Learning Approach for Password Guessing (https://arxiv.org/abs/1709.00440)

License: MIT License

Python 100.00%

password-cracking machine-learning deep-learning gan password password-strength

passgan's Introduction

PassGAN

This repository contains code for the PassGAN: A Deep Learning Approach for Password Guessing paper.

The model used in PassGAN is taken from Improved Training of Wasserstein GANs, and it is assumed that the authors of PassGAN used the improved_wgan_training TensorFlow implementation in their work. For this reason, I have modified that reference implementation in this repository to make it easy to train (train.py) and sample from (sample.py). This repo contributes:

A command-line interface
A pretrained PassGAN model trained on the RockYou dataset

Getting Started

# requires CUDA 8 to be pre-installed
pip install -r requirements.txt

Generating password samples

Use the pretrained model to generate 1,000,000 passwords, saving them to gen_passwords.txt.

python sample.py \
	--input-dir pretrained \
	--checkpoint pretrained/checkpoints/195000.ckpt \
	--output gen_passwords.txt \
	--batch-size 1024 \
	--num-samples 1000000

Training your own models

Training a model on a large dataset (100MB+) can take several hours on a GTX 1080.

# download the rockyou training data
# contains 80% of the full rockyou passwords (with repeats)
# that are 10 characters or less
curl -L -o data/train.txt https://github.com/brannondorsey/PassGAN/releases/download/data/rockyou-train.txt

# train for 200000 iterations, saving checkpoints every 5000
# uses the default hyperparameters from the paper
python train.py --output-dir output --training-data data/train.txt

You are encouraged to train using your own password leaks and datasets. Some great places to find those include:

LinkedIn leak (1.7GB compressed, direct download. Mirror from Hashes.org)
Exploit.in torrent (10GB+, 800 million accounts. Infamous!)
Hashes.org: Awesome shared password recovery site. Consider donating if you have the resources ;)

Results

I've yet to do an exhaustive analysis of my attempt to reproduce the results from the PassGAN paper. However, using the pretrained RockYou model to generate 10⁸ password samples, I was able to match 630,347 (23.97%) unique passwords in the test data, using an 80%/20% train/test split.
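A match count like the one above can be computed with a set intersection. This is a minimal sketch, not the script used to produce the figure: the function name `match_rate` and the toy lists are assumptions for illustration, and it assumes the percentage is taken relative to the number of unique test passwords.

```python
def match_rate(generated, test):
    """Return (matches, fraction) of unique test passwords that were generated."""
    generated_unique = set(generated)
    test_unique = set(test)
    matches = len(generated_unique & test_unique)
    # Guard against an empty test set to avoid dividing by zero.
    fraction = matches / len(test_unique) if test_unique else 0.0
    return matches, fraction


# Toy data standing in for gen_passwords.txt and a held-out test file.
generated = ["123456", "password", "iloveyou", "123456", "qwerty1"]
test = ["password", "dragon", "123456", "monkey", "password"]

matches, fraction = match_rate(generated, test)
print(matches, "matches,", "{:.2%}".format(fraction), "of unique test passwords")
```

For real files, the two lists would be read line by line from the generated samples and the test split, with the trailing newline stripped from each line.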

In general, I am somewhat surprised (and disappointed) that the authors of PassGAN referenced prior work in the ML password generation domain but did not compare their results to that research. My initial experience with PassGAN leads me to believe that it would significantly underperform both the RNN and Markov-based approaches mentioned in that paper, and I hope that this is not the reason the authors chose not to compare results.

Attribution and License

This code is released under an MIT License. You are free to use, modify, distribute, or sell it under those terms.

The majority of the credit for the code in this repository goes to @igul222 for his work on improved_wgan_training. I've simply modularized his code a bit, added a command-line interface, and specialized it for the PassGAN paper.

The PassGAN research and paper were published by Briland Hitaj, Paolo Gasti, Giuseppe Ateniese, and Fernando Perez-Cruz.


passgan's Issues

I have three training datasets. How do I use train.py to process them?
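The training command in the README passes a single file via --training-data, so one option is to merge the datasets into one file first. The following is a minimal sketch under that assumption; the helper name `merge_password_files` and the file names are hypothetical, and the files are assumed to hold one password per line.

```python
import os
import tempfile


def merge_password_files(input_paths, output_path):
    """Concatenate several one-password-per-line files into one file."""
    written = 0
    with open(output_path, "w", encoding="utf-8") as out:
        for path in input_paths:
            # errors="ignore" silently drops bytes that are not valid UTF-8.
            with open(path, "r", encoding="utf-8", errors="ignore") as src:
                for line in src:
                    password = line.rstrip("\r\n")
                    if password:  # skip blank lines
                        out.write(password + "\n")
                        written += 1
    return written


# Demonstration with three small temporary files.
workdir = tempfile.mkdtemp()
paths = []
for i, words in enumerate([["abc123", "letmein"], ["hunter2"], ["qwerty", "", "monkey"]]):
    path = os.path.join(workdir, "part{}.txt".format(i))
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(words) + "\n")
    paths.append(path)

merged_path = os.path.join(workdir, "train.txt")
count = merge_password_files(paths, merged_path)
print(count, "passwords written to", merged_path)
```

The merged file can then be passed to train.py as in the README: python train.py --output-dir output --training-data data/train.txt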

Win 7

Does this work with Win 7?

Thanks

failed to map segment from shared object

Hello guys, I got this error and don't know how to fix it. Stack Overflow says it is related to permissions.

ImportError: /home/USER/.local/lib/python3.10/site-packages/tensorflow/python/platform/_pywrap_cpu_feature_guard.so: failed to map segment from shared object

Cheers

Requires CUDA 8

This implementation requires CUDA 8. If CUDA 9 or greater is installed an error similar to the one below will be thrown. I've received questions about this via email, so I've created this issue for future reference. I've also updated the README to specify CUDA 8 is required.

Traceback (most recent call last):
  File "train.py", line 8, in <module>
    import tensorflow as tf
  File "/home/bf/.local/lib/python2.7/site-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import *
  File "/home/bf/.local/lib/python2.7/site-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/home/bf/.local/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 72, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/home/bf/.local/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/home/bf/.local/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/home/bf/.local/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
ImportError: libcublas.so.8.0: cannot open shared object file: No such file or directory

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.

Passgzn

TensorFlow doesn't support NCHW

`UnimplementedError (see above for traceback):  Generic conv implementation only supports NHWC tensor format for now.
	 [[Node: Generator.1.1/conv1d/Conv2D = Conv2D[T=DT_FLOAT, data_format="NCHW", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Generator.1.1/conv1d/ExpandDims, Generator.1.1/conv1d/ExpandDims_1)]]`

my tensorflow is 1.4.0
python is 2.7

ASCII code error

To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-24 12:27:40.098752: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Traceback (most recent call last):
  File "/home/morpheuslord/Desktop/PassGAN-master/sample.py", line 74, in <module>
    charmap = pickle.load(f)
              ^^^^^^^^^^^^^^
UnicodeDecodeError: 'ascii' codec can't decode byte 0x83 in position 0: ordinal not in range(128)

This is the error I am getting, in addition to many missing-bracket errors, which were easy to locate and correct. This one, however, I am not able to wrap my head around.
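This traceback is consistent with a pickle written by Python 2 being read under Python 3, although that is an assumption and not confirmed in this thread. In that situation, `pickle.load` accepts an `encoding` argument that controls how Python 2 byte strings are decoded. The sketch below uses a hand-written pickle as a stand-in for such a file; it does not read the repository's actual charmap.

```python
import io
import pickle

# Hand-written protocol-0 pickle of the Python 2 byte string 'caf\xe9'.
# It stands in for a pickle file written by Python 2.
py2_pickle = b"S'caf\\xe9'\np0\n."

# By default, Python 2 str objects are decoded as ASCII, which fails here.
try:
    pickle.load(io.BytesIO(py2_pickle))
    default_ok = True
except UnicodeDecodeError:
    default_ok = False

# Telling pickle how to decode Python 2 str objects avoids the error.
value = pickle.load(io.BytesIO(py2_pickle), encoding="latin1")
print(default_ok, repr(value))
```

Applied to sample.py, this would mean opening the charmap file in binary mode ("rb") and passing encoding="latin1" to pickle.load. Whether that alone resolves this particular report is untested.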

How do you use the thing?

If I want to crack/brute-force a login page, how do I make PassGAN do that?

I found many backticks in the password samples. Is that right?

root@07d198dcd43f:/notebooks# tail /mnt/passgan/samples/samples_172900.txt

Baugedd*76
ashas#1```
r.macj````
AvirTAbN``
dice@35```
Salderr65`
Ml4pann```
Ep044j8```
s?j009````
koph@9306`
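If the trailing backticks are padding that fills each sample to a fixed length (an assumption; the uniform 10-character width of the lines above is consistent with it), they can be stripped in a post-processing step. The helper name `strip_padding` is hypothetical.

```python
PAD_CHAR = "`"  # assumed padding character


def strip_padding(sample, pad_char=PAD_CHAR):
    """Remove trailing padding characters from one generated sample."""
    return sample.rstrip(pad_char)


samples = ["Baugedd*76", "ashas#1```", "r.macj````", "koph@9306`"]
cleaned = [strip_padding(s) for s in samples]
print(cleaned)
```

Note that a password that genuinely ends in a backtick would be trimmed as well.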

Syntax error in train.py ?

Windows 10 64 Bit
Python 3.6.5 64 bit
Grabbed all requirements, but with tensorflow 1.5.0, because pip couldn't find 1.4.1 for some reason

Wanted to train the GAN using the linkedin leak, so using

python train.py --output-dir output --training-data data/in.txt

And it throws me a syntax error:

File "D:\PassGAN\train.py", line 144
print " print "validation set JSD for n={}: {}".format(i+1, true_char_ngram_lms[i].js_with(validation_char_ngram_lms[i]))

Syntax Error: invalid syntax. The error is reported at the end of the quoted string. I have little knowledge of Python, as I have never worked with it, so I can't solve it myself.
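Under Python 3, `print` is a function, so a Python 2 style `print "..."` statement is rejected as a syntax error. The tracebacks elsewhere on this page show Python 2.7, which suggests the code was written for Python 2 (an assumption). A minimal sketch of the Python 3 form, with hypothetical stand-in values in place of the n-gram language models:

```python
# Hypothetical stand-in values; in train.py these come from n-gram language models.
jsd_per_n = [0.0123, 0.0456]

lines = []
for i, jsd in enumerate(jsd_per_n):
    # Python 3 form: print is called as a function, with parentheses.
    message = "validation set JSD for n={}: {}".format(i + 1, jsd)
    print(message)
    lines.append(message)
```

Every other `print` statement in the code would need the same change, or the scripts can be run with a Python 2.7 interpreter instead.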

Question about PassGAN implementation

Hi, I have some questions:

Is the data encoded as onehot representation for each character of the password?
How do you enforce the generator to return onehot encoded characters to be fed into the discriminator?
How do you force the network to produce variable-length passwords?
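For readers unfamiliar with the terms in these questions: one common way to feed variable-length strings to a fixed-size network is to pad every string to a fixed length and one-hot encode each character. The sketch below illustrates only that general technique. It is not taken from this repository, and the pad symbol, sequence length, and function names are assumptions.

```python
PAD_CHAR = "`"  # assumed padding symbol
SEQ_LENGTH = 10  # assumed fixed sequence length


def build_charmap(passwords, pad_char=PAD_CHAR):
    """Map each character (plus the pad symbol) to an integer index."""
    chars = sorted(set("".join(passwords)) | {pad_char})
    return {ch: idx for idx, ch in enumerate(chars)}


def one_hot_encode(password, charmap, seq_length=SEQ_LENGTH, pad_char=PAD_CHAR):
    """Pad (or truncate) to seq_length, then return a seq_length x vocab 0/1 matrix."""
    padded = password.ljust(seq_length, pad_char)[:seq_length]
    vocab = len(charmap)
    return [[1 if charmap[ch] == col else 0 for col in range(vocab)] for ch in padded]


passwords = ["abc", "ab1"]
charmap = build_charmap(passwords)
encoded = one_hot_encode("abc", charmap)
print(len(encoded), len(encoded[0]))
```

With this scheme, a shorter password is simply one whose tail decodes to the pad symbol.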

Thanks

Consider to Stop Hotlinking Hashes.org Files

Full disclosure, I'm not affiliated with Hashes.org.

Every visitor of Hashes.org can read the following disclaimer on their website

We host Hashes.org out of our own pockets. Even a small donation can really help!

In your README.md file you are hotlinking a rather large (2.9 GB) file from their server.
Please consider changing this hyperlink to the homepage of Hashes.org instead of pointing directly to the file. Independent of this, you really should promote donating for their services or stop linking to their site completely. Otherwise you can be sure that the great places to find those will soon be gone.

Cheers!

Code has many errors

Hi. I'm trying this code in Python 3.9 (Windows 10 build 1904), but I think the code is probably for Python 2.x, as I am getting a lot of errors:

PS C:\github\PassGAN> python sample.py --output ./new_trained.txt --num-samples 1000000 --input-dir ./pretrained --checkpoint ./pretrained/checkpoints/195000.ckpt
2021-04-24 14:50:47.249852: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
Traceback (most recent call last):
  File "sample.py", line 74, in <module>
    charmap = pickle.load(f)
_pickle.UnpicklingError: the STRING opcode argument must be quoted
PS C:\github\PassGAN>

Are you planning on updating this code at some point to work in python 3.x ?

Errors in thread operation

Hi Brannon

Why does the application use only one thread?

When I trace this thread with strace, I see many messages like 'futex_wait_bitset_private futex_clock_realtime connection timed out'

My system:
00:1e.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: NVIDIA Corporation GM204GL [Tesla M60]
	Physical Slot: 30
	Flags: bus master, fast devsel, latency 248, IRQ 155
	Memory at 94000000 (32-bit, non-prefetchable) [size=16M]
	Memory at 80000000 (64-bit, prefetchable) [size=256M]
	Memory at 92000000 (64-bit, prefetchable) [size=32M]
	I/O ports at c100 [size=128]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Endpoint, MSI 00
	Kernel driver in use: nvidia
	Kernel modules: nvidia_384_drm, nvidia_384
Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz

thanks

Conv2DCustomBackpropFilterOp only supports NHWC

Hi,
When I'm trying to run PassGAN model there is an error:
2018-01-21 19:45:25.121447: E tensorflow/core/common_runtime/executor.cc:643] Executor failed to create kernel. Invalid argument: Conv2DCustomBackpropFilterOp only supports NHWC.
[[Node: gradients_2/Discriminator.5.2/conv1d/Conv2D_grad/Conv2DBackpropFilter = Conv2DBackpropFilter[T=DT_FLOAT, data_format="NCHW", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Discriminator.5.2/conv1d/ExpandDims, gradients_2/Discriminator.5.2/conv1d/Conv2D_grad/ShapeN:1, gradients_2/Discriminator.5.2/conv1d/Squeeze_grad/Reshape)]]

Cuda and tensorflow-gpu are working correctly, I verified this for example by running CUDNN minimal deep learning training code sample using LeNet (https://github.com/tbennun/cudnn-training), no issues there.

After changing data format from NCHW to NHWC in files: batchnorm.py, conv1d.py and conv2d.py I have got another error:
ValueError: Dimensions must be equal, but are 10 and 128 for 'Generator.1.1/conv1d/Conv2D' (op: 'Conv2D') with input shapes: [64,1,128,10], [1,5,128,128].

Does anybody have the same issue? Any ideas about what is wrong and how to solve it?
BTW, the solution from the closed issue (#3) did not work for me...

My mistake: for python2.7 I had both tensorflow and tensorflow-gpu installed. After leaving only the GPU version, the problem was solved.
Closed issue.

PC details:
Linux Mint 18.3
Intel i7-8700K
Titan X(Pascal)
Nvidia drivers: 390.12

Setup details:
nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 6
#define CUDNN_MINOR 0
#define CUDNN_PATCHLEVEL 21
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
#include "driver_types.h"

Will there be a ROCm version?

requirements.txt is missing

The README refers to requirements.txt, but it is not present in the repo.

Python and syntax errors and more in this project

There are Python syntax errors and other problems in this project. Due to these issues, the project is not working properly and can be considered a waste of time for anyone trying to use it.

Error in training and password generation

1- I am getting the following error during training the model

python train.py --output-dir output --training-data data/rockyou-train.txt

  File "train.py", line 97, in <module>
    real_inputs_discrete = tf.placeholder(tf.int32, shape=[args.batch_size, args.seq_length])
AttributeError: module 'tensorflow' has no attribute 'placeholder'

2- I am getting the following error while generating passwords with the following command

python sample.py \
	--input-dir pretrained \
	--checkpoint pretrained/checkpoints/195000.ckpt \
	--output gen_passwords.txt \
	--batch-size 1024 \
	--num-samples 1000000

Traceback (most recent call last):
  File "sample.py", line 74, in <module>
    charmap = pickle.load(f)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x83 in position 0: ordinal not in range(128)

Kindly help.

about memory

When the train.txt file is large (2 GB), train.py uses a lot of memory. Is this normal?

How to train model with LinkedIn leak ?

I unpacked the "LinkedIn leak" archive into the "/data" directory and renamed it to train.txt. Launched with the following command:
python2 train.py --output-dir output --training-data data/train.txt
It throws the following error:

loaded 0 lines in dataset
Traceback (most recent call last):
  File "train.py", line 99, in <module>
    fake_inputs = models.Generator(args.batch_size, args.seq_length, args.layer_dim, len(charmap))
  File "/share/home/nikita/rd/PassGAN/models.py", line 24, in Generator
    output = tf.transpose(output, [0, 2, 1])
  File "/home/nikita/.local/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 1336, in transpose
    ret = gen_array_ops.transpose(a, perm, name=name)
  File "/home/nikita/.local/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 5694, in transpose
    "Transpose", x=x, perm=perm, name=name)
  File "/home/nikita/.local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/nikita/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2958, in create_op
    set_shapes_for_outputs(ret)
  File "/home/nikita/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2209, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "/home/nikita/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2159, in call_with_requiring
    return call_cpp_shape_fn(op, require_shape_fn=True)
  File "/home/nikita/.local/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 627, in call_cpp_shape_fn
    require_shape_fn)
  File "/home/nikita/.local/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 691, in _call_cpp_shape_fn_impl
    raise ValueError(err.message)
ValueError: Dimension must be 2 but is 3 for 'transpose' (op: 'Transpose') with input shapes: [64,10], [3].

P.S. Training on the sample RockYou data was successful.
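The "loaded 0 lines in dataset" message suggests that no line of the file was accepted as a training sample. One possible cause, which is an assumption and not confirmed in this thread, is that the file holds hashes or other long lines rather than plaintext passwords of at most 10 characters, the length the README gives for the RockYou training data. A quick check of line lengths can help rule this in or out; `count_usable_lines` is a hypothetical helper.

```python
import io


def count_usable_lines(lines, max_length=10):
    """Return (usable, total), where usable lines have 1..max_length characters."""
    usable = 0
    total = 0
    for line in lines:
        total += 1
        password = line.rstrip("\r\n")
        if 0 < len(password) <= max_length:
            usable += 1
    return usable, total


# Toy input: two short plaintext passwords and one 40-character string.
sample = io.StringIO("letmein\n" + "a" * 40 + "\nhunter2\n")
usable, total = count_usable_lines(sample)
print(usable, "of", total, "lines are usable")
```

For a real dataset, the function can be given an open file handle for data/train.txt in place of the toy input.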

Auto Save Checkpoint and other stuff.

Please add a configurable auto-save checkpoint option, e.g. --checkpoint 1000000 (every million passwords).
Update the requirements for the latest Python, 3.11.
Add a GUI.
Add automatic detection of the dictionary list encoding and support for CSV files.

Its not working

PassGAN/utils.py

Line 70 in 5eeca01

return float(num) / denom

Why use Tensorflow==1.4.1?

Is there a reason requirements.txt says tensorflow==1.4.1 and not the latest, like 1.7.0, which is compatible with CUDA 9.1? Is it just what you had at the time, or are there other library or dependency issues?

PassGAN did not compare with FLA?

You said that PassGAN did not compare their results to FLA (Results section of README.md).

However, I think PassGAN did compare their results to the results of FLA in the latest version of the paper.

Am I right?

TensorFlow uses only single GPU

I have a system with multiple GPUs, but the TensorFlow library only makes use of one of them. Is it possible to parallelize the training and password generation across multiple GPUs?