dgen

Generate evil test data

dgen is a CLI tool for generating pseudorandom data in arbitrary formats. The goal is a tool that works equally well for textual and binary formats. Under the hood, dgen is an interpreter for a simple domain-specific functional language. The syntax is documented here. The language documentation is pretty sparse at the moment, but it should improve soon. PRs are welcome!

Example

The following program will print 5 random quoted words, separated by newlines:

$ dgen -p 'repeat_delimited(5, double_quote(words()), "\n")'
"gule"
"erugation"
"avouchment"
"hymnless"
"reclusory"
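
To make the functional style of that one-liner concrete, here is a rough Python analogue of the same composition. It is only a sketch of the idea and not how dgen is implemented: the function names mirror the dgen program above, while SAMPLE_WORDS and the explicit rng argument are hypothetical stand-ins for dgen's own word source and random state.

```python
import random

# Hypothetical stand-in for dgen's word source (assumption, not dgen's real list).
SAMPLE_WORDS = ["gule", "erugation", "avouchment", "hymnless", "reclusory"]


def words(rng):
    """Return a generator function that produces one random word per call."""
    return lambda: rng.choice(SAMPLE_WORDS)


def double_quote(gen):
    """Wrap another generator so its output is surrounded by double quotes."""
    return lambda: '"' + gen() + '"'


def repeat_delimited(count, gen, delimiter):
    """Call gen `count` times and join the results with `delimiter`."""
    return lambda: delimiter.join(gen() for _ in range(count))


rng = random.Random(0)  # fixed seed so runs are repeatable
program = repeat_delimited(5, double_quote(words(rng)), "\n")
output = program()
print(output)
```

Each function returns a new generator instead of a value, so the inner expression is re-evaluated for every repetition. That is why the five words can differ from one another.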

dgen programs are invoked by calling dgen and providing the program input in one of several ways:

  • -p, --program: the program can be provided on the command line. This is easiest for simple expressions.
  • -f, --program-file: interpret the given file as a program. Nice for more complex programs.
  • -s, --stdin: read the program from stdin

You can also add your own libraries to the program scope using the --lib option.

dgen file1 file2 fileN can also be used as a shortcut for dgen --lib file1 --lib file2 -f fileN. This allows you to run an executable dgen script by simply putting a shebang (#!dgen) at the top of the file.
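
The shortcut rule can be written out as a small Python function. This is a sketch of the rule as described above, not dgen's actual argument parser, and expand_shortcut is a hypothetical name used only for illustration.

```python
def expand_shortcut(files):
    """Expand positional file arguments into the long-form options.

    Every file except the last is treated as a library (--lib);
    the last file is treated as the program to run (-f).
    """
    if not files:
        raise ValueError("at least one file is required")
    *libs, program = files
    args = []
    for lib in libs:
        args += ["--lib", lib]
    args += ["-f", program]
    return args


expanded = expand_shortcut(["file1", "file2", "fileN"])
single = expand_shortcut(["script.dgen"])
print(expanded)
print(single)
```

With a single file the expansion is just -f followed by that file, which is the shebang case: the operating system invokes dgen with the script path as its only argument.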

dgen has a bunch of builtin functions, too. You can list the builtin functions by executing dgen help. You can optionally filter the list of functions by name with dgen help --function <name>. Of course dgen --help will print out info on all of the available options.

Take a look at the examples for more.

Goals

  • Make it easy to generate files and streams of data for testing
  • Make it easy to share and re-use programs for data generation
  • Make it easy to test data that can be represented in multiple ways
  • Easily integrate with various testing tools and workflows

Non-Goals

  • Be a general-purpose language. Currently the language is not even Turing-complete, and it's not really clear that there would be any benefit to Turing-completeness.
  • Understand the semantics of your data. dgen is more focused on how data is represented than on what it means.

What's different about dgen?

dgen focuses on how data is represented. Take JSON for example. It's easy to find random data generators that will output the data as well-formed JSON. But if you're testing a JSON parser, then you want to use JSON with different and inconsistent formatting! Like many other formats, JSON has many valid ways to represent the same data. If your parser is lenient, you might also want to test keys that are sometimes surrounded with double quotes and sometimes single-quoted or unquoted (strict JSON only allows double quotes). dgen is meant to fill the gap between a well-formed dataset generator (which will typically produce consistently formatted representations) and a fuzz tester (which is more useful for testing invalid input).
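
As an illustration of "same data, different representation", the Python sketch below serializes one value several times with randomly varied whitespace and checks that every variant parses back to the same data. It is not dgen code. The render function and the sample data are hypothetical, and it only varies whitespace (which strict JSON permits), not quoting style.

```python
import json
import random


def render(value, rng):
    """Serialize value as valid JSON with randomly chosen insignificant whitespace."""

    def ws():
        # JSON allows spaces, tabs, and newlines between tokens.
        return rng.choice(["", " ", "\n", "\t", "  "])

    if isinstance(value, dict):
        items = [
            ws() + json.dumps(key) + ws() + ":" + ws() + render(val, rng) + ws()
            for key, val in value.items()
        ]
        return "{" + (",".join(items) if items else ws()) + "}"
    if isinstance(value, list):
        items = [ws() + render(val, rng) + ws() for val in value]
        return "[" + (",".join(items) if items else ws()) + "]"
    # Scalars (strings, numbers, booleans, None) use the standard encoding.
    return json.dumps(value)


data = {"name": "dgen", "tags": ["test", "data"], "count": 3, "ok": True, "none": None}
rng = random.Random(42)  # fixed seed so runs are repeatable
variants = [render(data, rng) for _ in range(5)]

for text in variants:
    # Every variant must decode to the original value.
    assert json.loads(text) == data
    print(repr(text))
```

A well-formed dataset generator would emit only one of these layouts. A parser test benefits from seeing many of them.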

Build

dgen is built with Rust, and requires version 1.31 or later. Just a simple cargo build --release is all it takes to get a release binary. The build is currently tested only on OSX and GNU/Linux, but the intent is to start testing on Windows as well soon™️.

Stability

This project is still in the super early stages, so major breaking changes can happen at any time. I'm still experimenting with various aspects of the language and syntax. If you have any input on that, please file an issue!

Contributions

New contributors are welcome! Please feel free to send over a PR or file an issue.
