
passgan's Introduction

PassGAN

This repository contains code for the PassGAN: A Deep Learning Approach for Password Guessing paper.

The model from PassGAN is taken from Improved Training of Wasserstein GANs and it is assumed that the authors of PassGAN used the improved_wgan_training tensorflow implementation in their work. For this reason, I have modified that reference implementation in this repository to make it easy to train (train.py) and sample (sample.py) from. This repo contributes:

  • A command-line interface
  • A pretrained PassGAN model trained on the RockYou dataset

Getting Started

# requires CUDA 8 to be pre-installed
pip install -r requirements.txt

Generating password samples

Use the pretrained model to generate 1,000,000 passwords, saving them to gen_passwords.txt.

python sample.py \
	--input-dir pretrained \
	--checkpoint pretrained/checkpoints/195000.ckpt \
	--output gen_passwords.txt \
	--batch-size 1024 \
	--num-samples 1000000

Training your own models

Training a model on a large dataset (100MB+) can take several hours on a GTX 1080.

# download the rockyou training data
# contains 80% of the full rockyou passwords (with repeats)
# that are 10 characters or less
curl -L -o data/train.txt https://github.com/brannondorsey/PassGAN/releases/download/data/rockyou-train.txt

# train for 200000 iterations, saving checkpoints every 5000
# uses the default hyperparameters from the paper
python train.py --output-dir output --training-data data/train.txt
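If you bring your own leak, a split similar to the one described in the comments above could be prepared roughly like this. It is only a sketch: the helper name `split_passwords`, the fixed shuffle seed, and the exact filtering are assumptions, not the script that produced rockyou-train.txt.

```python
import random


def split_passwords(lines, max_len=10, train_frac=0.8, seed=0):
    """Hypothetical helper: drop long passwords, then split train/test.

    Repeats are kept on purpose, mirroring the "with repeats" note above.
    """
    passwords = [line.rstrip("\r\n") for line in lines]
    passwords = [pw for pw in passwords if 0 < len(pw) <= max_len]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(passwords)
    cut = int(len(passwords) * train_frac)
    return passwords[:cut], passwords[cut:]


raw = ["123456\n", "password\n", "123456\n", "averyveryverylongpassword\n", "qwerty\n"]
train, test = split_passwords(raw)
print(len(train), len(test))
```

The training half would then be written one password per line to the file passed as --training-data.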

You are encouraged to train using your own password leaks and datasets. Some great places to find those include:

Results

I've yet to do an exhaustive analysis of my attempt to reproduce the results from the PassGAN paper. However, using the pretrained rockyou model to generate 10⁸ password samples, I was able to match 630,347 (23.97%) unique passwords in the test data, using an 80%/20% train/test split.
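A match figure of this kind can be computed along the following lines. This is a minimal sketch that assumes "matched" means the number of unique test passwords that also appear among the generated samples; the function name `match_rate` is hypothetical and the exact counting used for the number above is not documented here.

```python
def match_rate(generated, test_passwords):
    """Return (matched unique test passwords, fraction of unique test set)."""
    unique_test = set(test_passwords)
    matched = unique_test & set(generated)
    return len(matched), len(matched) / len(unique_test)


generated = ["123456", "abc123", "zzzz"]
test_set = ["123456", "123456", "abc123", "letmein", "dragon"]
count, fraction = match_rate(generated, test_set)
print(count, f"{fraction:.2%}")
```

For 10⁸ samples, both inputs would be streamed from files such as gen_passwords.txt rather than held in small lists.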

In general, I am somewhat surprised (and disappointed) that the authors of PassGAN referenced prior work in the ML password generation domain but did not compare their results to that research. My initial experience with PassGAN leads me to believe that it would significantly underperform both the RNN and Markov-based approaches mentioned in that paper, and I hope that it is not for this reason that the authors have chosen not to compare results.

Attribution and License

This code is released under an MIT License. You are free to use, modify, distribute, or sell it under those terms.

The majority of the credit for the code in this repository goes to @igul222 for his work on the improved_wgan_training. I've simply modularized his code a bit, added a command-line interface, and specialized it for the PassGAN paper.

The PassGAN research and paper was published by Briland Hitaj, Paolo Gasti, Giuseppe Ateniese, Fernando Perez-Cruz.

passgan's Issues

TensorFlow uses only single GPU

I have a system with multiple GPUs, but the TensorFlow library only makes use of one of them. Is it possible to parallelize the training and password generation across multiple GPUs?

ASCII code error

To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-24 12:27:40.098752: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Traceback (most recent call last):
  File "/home/morpheuslord/Desktop/PassGAN-master/sample.py", line 74, in <module>
    charmap = pickle.load(f)
              ^^^^^^^^^^^^^^
UnicodeDecodeError: 'ascii' codec can't decode byte 0x83 in position 0: ordinal not in range(128)

This is the error I am getting, in addition to many missing-bracket errors, which were easy to locate and correct. This one, however, I cannot wrap my head around.
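One plausible cause, assuming the pretrained charmap pickle was written by Python 2 and is being read by Python 3: Python 3 decodes pickled Python 2 `str` objects as ASCII unless told otherwise. The sketch below reproduces the same `UnicodeDecodeError` with a tiny hand-written pickle and shows the `encoding` argument of `pickle.loads` as a possible workaround. The byte string is illustrative only and is not the real charmap file.

```python
import pickle

# Hand-written protocol-0 pickle of a Python 2 str holding the single byte 0x83.
py2_pickle = b"S'\\x83'\np0\n."

try:
    pickle.loads(py2_pickle)  # default encoding is ASCII
    default_error = None
except UnicodeDecodeError as exc:
    default_error = exc

# latin1 maps every byte 0x00-0xff to a code point, so decoding cannot fail.
value = pickle.loads(py2_pickle, encoding="latin1")
print(repr(default_error), repr(value))
```

Applied to sample.py, the equivalent change would be `pickle.load(f, encoding="latin1")`, with the file opened in binary mode. Whether the resulting characters are the ones the model was trained on has not been verified.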

How to train a model with the LinkedIn leak?

I unpacked the "LinkedIn leak" archive into the "/data" directory and renamed it to train.txt. Launched with the following command:
python2 train.py --output-dir output --training-data data/train.txt
It throws the following error:

loaded 0 lines in dataset
Traceback (most recent call last):
File "train.py", line 99, in
fake_inputs = models.Generator(args.batch_size, args.seq_length, args.layer_dim, len(charmap))
File "/share/home/nikita/rd/PassGAN/models.py", line 24, in Generator
output = tf.transpose(output, [0, 2, 1])
File "/home/nikita/.local/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 1336, in transpose
ret = gen_array_ops.transpose(a, perm, name=name)
File "/home/nikita/.local/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 5694, in transpose
"Transpose", x=x, perm=perm, name=name)
File "/home/nikita/.local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/nikita/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2958, in create_op
set_shapes_for_outputs(ret)
File "/home/nikita/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2209, in set_shapes_for_outputs
shapes = shape_func(op)
File "/home/nikita/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2159, in call_with_requiring
return call_cpp_shape_fn(op, require_shape_fn=True)
File "/home/nikita/.local/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 627, in call_cpp_shape_fn
require_shape_fn)
File "/home/nikita/.local/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 691, in _call_cpp_shape_fn_impl
raise ValueError(err.message)
ValueError: Dimension must be 2 but is 3 for 'transpose' (op: 'Transpose') with input shapes: [64,10], [3].

P.S. Training on the sample rockyou data was successful.
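The "loaded 0 lines in dataset" message suggests that no line of the file was accepted before the model was built. One thing worth checking, assuming the loader skips lines longer than the sequence length (10 characters in the README's rockyou data), is how many lines of the leak actually fit. The helper name `count_usable` is hypothetical, and the sample lines only imitate a hash:password style dump.

```python
def count_usable(lines, max_len=10):
    """Count lines that are non-empty and at most max_len characters long."""
    total = usable = 0
    for line in lines:
        total += 1
        password = line.rstrip("\r\n")
        if 0 < len(password) <= max_len:
            usable += 1
    return usable, total


sample = [
    "7c4a8d09ca3762af61e59520943dc26494f8941b:123456\n",  # hash:plain, too long
    "hunter2\n",
]
usable, total = count_usable(sample)
print(usable, "of", total, "lines usable")
```

For a real file, pass an open file object, for example `count_usable(open("data/train.txt", errors="replace"))`. If the count is zero, the file probably needs to be reduced to one plaintext password per line first.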

Requires CUDA 8

This implementation requires CUDA 8. If CUDA 9 or greater is installed an error similar to the one below will be thrown. I've received questions about this via email, so I've created this issue for future reference. I've also updated the README to specify CUDA 8 is required.

Traceback (most recent call last):
  File "train.py", line 8, in <module>
    import tensorflow as tf
  File "/home/bf/.local/lib/python2.7/site-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import *
  File "/home/bf/.local/lib/python2.7/site-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/home/bf/.local/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 72, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/home/bf/.local/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/home/bf/.local/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/home/bf/.local/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
ImportError: libcublas.so.8.0: cannot open shared object file: No such file or directory

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.

Conv2DCustomBackpropFilterOp only supports NHWC

Hi,
When I'm trying to run PassGAN model there is an error:
2018-01-21 19:45:25.121447: E tensorflow/core/common_runtime/executor.cc:643] Executor failed to create kernel. Invalid argument: Conv2DCustomBackpropFilterOp only supports NHWC.
[[Node: gradients_2/Discriminator.5.2/conv1d/Conv2D_grad/Conv2DBackpropFilter = Conv2DBackpropFilter[T=DT_FLOAT, data_format="NCHW", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Discriminator.5.2/conv1d/ExpandDims, gradients_2/Discriminator.5.2/conv1d/Conv2D_grad/ShapeN:1, gradients_2/Discriminator.5.2/conv1d/Squeeze_grad/Reshape)]]

Cuda and tensorflow-gpu are working correctly, I verified this for example by running CUDNN minimal deep learning training code sample using LeNet (https://github.com/tbennun/cudnn-training), no issues there.

After changing data format from NCHW to NHWC in files: batchnorm.py, conv1d.py and conv2d.py I have got another error:
ValueError: Dimensions must be equal, but are 10 and 128 for 'Generator.1.1/conv1d/Conv2D' (op: 'Conv2D') with input shapes: [64,1,128,10], [1,5,128,128].

Does anybody have the same issue? Any ideas about what is wrong and how to solve it?
BTW, the solution from the closed issue (#3) did not work for me...

My mistake: for python2.7 I had both tensorflow and tensorflow-gpu installed. After leaving only the GPU version, the problem was solved.
Closed issue.

PC details:
Linux Mint 18.3
Intel i7-8700K
Titan X(Pascal)
Nvidia drivers: 390.12

Setup details:
nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 6
#define CUDNN_MINOR 0
#define CUDNN_PATCHLEVEL 21
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
#include "driver_types.h"

Syntax error in train.py ?

Windows 10 64 Bit
Python 3.6.5 64 bit
Installed all requirements, but with tensorflow 1.5.0 because pip couldn't find 1.4.1 for some reason

Wanted to train the GAN using the linkedin leak, so using

python train.py --output-dir output --training-data data/in.txt

And it throws me a syntax error:

File "D:\PassGAN\train.py", line 144
print " print "validation set JSD for n={}: {}".format(i+1, true_char_ngram_lms[i].js_with(validation_char_ngram_lms[i]))

SyntaxError: invalid syntax. The error is reported at the end of the quoted string. I have little knowledge of Python, as I have never worked with it, so I can't solve it myself.

Auto-save checkpoints and other stuff

Please add a configurable auto-save checkpoint option, e.g. --checpoint 1000000 (every million).
Update the requirements for the latest Python, 3.11.
Add a GUI.
Add automatic detection of the dictionary list encoding, and add support for CSV files.

PassGAN did not compare with FLA?

You said that PassGAN did not compare their results to FLA (see the Results section of README.md).

However, I think PassGAN did compare their results to those of FLA in the latest version of the paper.

Am I right?

Question about PassGAN implementation

Hi, I have some questions:

  • Is the data encoded as onehot representation for each character of the password?
  • How do you enforce the generator to return onehot encoded characters to be fed into the discriminator?
  • How do you force the network to produce variable-length passwords?
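One common way to handle all three points in character-level GANs, shown here as an assumption rather than a confirmed description of this repository, is to pad every password to a fixed length with a reserved character, one-hot encode each position, let the generator emit a probability distribution per position, and take the argmax when decoding. Variable length then falls out of stripping the padding. The names `one_hot_encode`, `decode`, and the backtick pad character are illustrative.

```python
def one_hot_encode(password, charmap, seq_len=10, pad="`"):
    """Pad to seq_len, then emit one one-hot row per character position."""
    padded = password.ljust(seq_len, pad)[:seq_len]
    size = len(charmap)
    return [[1.0 if charmap[ch] == i else 0.0 for i in range(size)] for ch in padded]


def decode(rows, inv_charmap, pad="`"):
    """Argmax each row (works for one-hot or softmax rows), strip padding."""
    chars = [inv_charmap[max(range(len(row)), key=row.__getitem__)] for row in rows]
    return "".join(chars).rstrip(pad)


inv_charmap = ["`", "a", "b", "c"]
charmap = {ch: i for i, ch in enumerate(inv_charmap)}
encoded = one_hot_encode("abc", charmap)
print(len(encoded), decode(encoded, inv_charmap))
```

Because `decode` uses an argmax, the same function works on the soft per-position outputs a generator would produce, not only on exact one-hot rows.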

Thanks

Win 7

Does this work with Win 7?

Thanks

Consider to Stop Hotlinking Hashes.org Files

Full disclosure, I'm not affiliated with Hashes.org.

Every visitor of Hashes.org can read the following disclaimer on their website

We host Hashes.org out of our own pockets. Even a small donation can really help!

In your README.md file you are hotlinking a rather large (2.9 GB) file from their server.
Please consider to change this hyperlink to the homepage of Hashes.org, instead of pointing directly to the file. Independent of this, you really should promote donating for their services or stop linking their site completely. Otherwise you can be sure that the great places to find those will soon be gone.

Cheers!

about memory

When the train.txt file is large (2G), train.py uses a lot of memory. Is this situation normal?

failed to map segment from shared object

Hello guys, I got this error and don't know how to fix it. Stack Overflow says it is related to permissions.

ImportError: /home/USER/.local/lib/python3.10/site-packages/tensorflow/python/platform/pywrap_cpu_feature_guard.so: failed to map segment from shared object

Cheers

Why use Tensorflow==1.4.1?

Is there a reason requirements.txt says tensorflow==1.4.1 and not the latest, like 1.7.0, which is compatible with CUDA 9.1? Is it just what you had at the time, or are there other library or dependency issues?

TensorFlow doesn't support NCHW

`UnimplementedError (see above for traceback):  Generic conv implementation only supports NHWC tensor format for now.
	 [[Node: Generator.1.1/conv1d/Conv2D = Conv2D[T=DT_FLOAT, data_format="NCHW", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Generator.1.1/conv1d/ExpandDims, Generator.1.1/conv1d/ExpandDims_1)]]`

my tensorflow is 1.4.0
python is 2.7

Errors in thread operation

Hi Brannon

Why does the application use only one thread?

If I trace this thread with strace, I see many messages like 'futex_wait_bitset_private futex_clock_realtime connection timed out'.

My system:
00:1e.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: NVIDIA Corporation GM204GL [Tesla M60]
	Physical Slot: 30
	Flags: bus master, fast devsel, latency 248, IRQ 155
	Memory at 94000000 (32-bit, non-prefetchable) [size=16M]
	Memory at 80000000 (64-bit, prefetchable) [size=256M]
	Memory at 92000000 (64-bit, prefetchable) [size=32M]
	I/O ports at c100 [size=128]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Endpoint, MSI 00
	Kernel driver in use: nvidia
	Kernel modules: nvidia_384_drm, nvidia_384
Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz

thanks

Code has many errors

Hi. I am trying this code in Python 3.9 (Windows 10 build 1904), but I think the code is probably for Python 2.x, as I am getting a lot of errors:

PS C:\github\PassGAN> python sample.py --output ./new_trained.txt --num-samples 1000000 --input-dir ./pretrained --checkpoint ./pretrained/checkpoints/195000.ckpt
2021-04-24 14:50:47.249852: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
Traceback (most recent call last):
File "sample.py", line 74, in
charmap = pickle.load(f)
_pickle.UnpicklingError: the STRING opcode argument must be quoted
PS C:\github\PassGAN>

Are you planning on updating this code at some point to work in python 3.x ?
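A possible explanation for this particular message, offered as an assumption and not confirmed against the repository's files: a text-format (protocol 0) pickle whose line endings were converted from LF to CRLF, for example by a Windows checkout, no longer has its quoted strings ending in a quote. The sketch reproduces the error on a tiny hand-written pickle and shows that normalising the line endings makes it load again.

```python
import pickle

# Hand-written protocol-0 pickle of the string 'a', with Unix line endings.
good = b"S'a'\np0\n."

# Simulate a checkout that rewrote LF as CRLF.
crlf = good.replace(b"\n", b"\r\n")

try:
    pickle.loads(crlf)
    crlf_error = None
except pickle.UnpicklingError as exc:
    crlf_error = exc

# Undo the conversion before unpickling.
repaired = crlf.replace(b"\r\n", b"\n")
value = pickle.loads(repaired)
print(crlf_error, repr(value))
```

If the charmap file on disk contains CRLF line endings, reading it in binary mode and applying the same replacement before `pickle.loads` may be worth trying.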
