Giter VIP home page Giter VIP logo

go-anonymize-mysqldump's Issues

Installation example doesn't work ("not in gzip format")

I just tried to follow the installation instructions for Mac. I ran

curl -OL https://github.com/humanmade/go-anonymize-mysqldump/releases/download/latest/go-anonymize-mysqldump_darwin_amd64.gz

and then tried

gunzip go-anonymize-mysqldump_darwin_amd64.gz

However, the latter returns

gunzip: go-anonymize-mysqldump_darwin_amd64.gz: not in gzip format

Make invalid field type error more clear

I'm getting a:

panic: interface conversion: sqlparser.Expr is *sqlparser.NullVal, not *sqlparser.SQLVal

goroutine 33 [running]:
main.modifyValues(0xc0004f4000, 0x4fd, 0x555, 0xc000014420, 0x9, 0xc0000f8700, 0x4, 0x4, 0x0, 0x0, ...)
	/Users/aang/Code/golang/src/github.com/humanmade/go-anonymize-mysqldump/anonymize-mysqldump.go:299 +0x572
main.applyConfigToInserts(0xc0000a2dc0, 0xc0000c08c0, 0x2, 0x4, 0x0, 0x0, 0x0)
	/Users/aang/Code/golang/src/github.com/humanmade/go-anonymize-mysqldump/anonymize-mysqldump.go:277 +0x1f9
main.applyConfigToParsedLine(0x66be60, 0xc0000a2dc0, 0xc0000c08c0, 0x2, 0x4, 0x0, 0x0, 0x0, 0x0)
	/Users/aang/Code/golang/src/github.com/humanmade/go-anonymize-mysqldump/anonymize-mysqldump.go:250 +0x68
main.processLine(0xc000560000, 0xff0d5, 0xc0000c08c0, 0x2, 0x4, 0x0, 0x0)
	/Users/aang/Code/golang/src/github.com/humanmade/go-anonymize-mysqldump/anonymize-mysqldump.go:218 +0x2c1
main.processInput.func1(0xc000014550, 0xc0000c08c0, 0x2, 0x4, 0xc0003dc360, 0xc000560000, 0xff0d5)
	/Users/aang/Code/golang/src/github.com/humanmade/go-anonymize-mysqldump/anonymize-mysqldump.go:194 +0xab
created by main.processInput
	/Users/aang/Code/golang/src/github.com/humanmade/go-anonymize-mysqldump/anonymize-mysqldump.go:192 +0x41c
mysqldump: Got errno 32 on write

I believe it is because i set some field types that are not yet supported.
If that's the case, i think the error should be more clear, so it's easier to know that it's my fault and not a real error : O

Enable ability to specify custom configs

I want to be able to add additional configs, as in when applications have additional fields to anonymize or aren't even WordPress. Ideally this would be in a JSON/YAML/TOML format.

Get this working with multi-line statements

I discovered today that this won't be able to parse multi-line insert statements like the following SQL:

INSERT INTO `wp_cavalcade_jobs` VALUES
(1,1,"wp_version_check","a:0:{}","2017-08-24 08:15:12","2017-09-11 20:15:12",43200,"failed"),
(2,1,"wp_update_plugins","a:0:{}","2017-08-24 08:15:12","2017-09-11 20:15:12",43200,"failed"),
(3,1,"wp_update_themes","a:0:{}","2017-08-24 08:15:12","2017-09-11 20:15:12",43200,"failed"),
(4,1,"wp_scheduled_delete","a:0:{}","2017-10-15 10:11:15","2017-10-15 10:11:15",86400,"failed");

Because we're parsing line by line, it reads each value as its own SQL statement which breaks the parser. I need to either merge them all into a single line or find some other solution.

Generate random data instead of fixed strings

While I've been building the prototype, I've been replacing values using functions with fixed strings. I want to use a faker-type library to actually generate random data, but unit testing randomly generated data is tough so I've been deferring solving that problem.

I just realized I could use the existing functions as mocks for testing purposes, now I just need to write the integration.

Ability to skip rows

We're currently using production database in dev.

We plan to improve it by anonymizing the user data.

However we have some test/dev/admin accounts with a variety of roles and associated data which we want to preserve.

Would be nice being able to skip any transformation to a row if a condition matches. Example config we'd use to achieve this

      "tableName": "users",
      "ignore": [
        {
          "field": "mail",
          "position": 6,
          "value": "[email protected]"
        },
        {
          "field": "mail",
          "position": 6,
          "value": "[email protected]"
        }
      ],
      "fields": [

As a work around:

We take 2 mysqldumps, the first one with all data going through anonymization.

The 2nd dump is appended to the original with "REPLACE INTO" queries with the users we plan to keep as is.

mysqldump --no-create-info --where "email like '%admin.com'" --replace application user_data > own_users.sql

Add CI

Need to run tests automatically.

[BUG] Generated emails & usernames aren't unique

If you have a UNIQUE index set for those fields, anonymize will break the dump as there's no option for truly unique emails/usernames.

My work around has been to add new transformation functions:

// add these imports
import (
    "strconv"
    "time"
)

func generateUniqueEmail(value *sqlparser.SQLVal) *sqlparser.SQLVal {
	return sqlparser.NewStrVal([]byte(strconv.Itoa(int(time.Now().UnixNano())) + "@gmail.com"))
}
func generateUniqueUsername(value *sqlparser.SQLVal) *sqlparser.SQLVal {
	return sqlparser.NewStrVal([]byte(faker.Internet().UserName() + strconv.Itoa(int(time.Now().UnixNano()))))
}

Crash if asked to replace a value which can be NULL

Given a database like

DROP TABLE IF EXISTS `test`;
CREATE TABLE `test` (
`email` varchar(255),
);
INSERT INTO `test` VALUES ('foo'), (NULL), ('hodger');

if you use a config like

{
  "patterns": [
    {
      "tableName": "test",
      "fields": [
        {
          "field": "email",
          "position": 1,
          "type": "email",
          "constraints": null
        }
      ]
    }
  ]
}

the tool will crash with a panic:

panic: interface conversion: sqlparser.Expr is *sqlparser.NullVal, not *sqlparser.SQLVal

goroutine 35 [running]:
main.modifyValues(0xc0003f6060, 0x3, 0x4, 0xc00008e2f8, 0x4, 0xc0001e6000, 0x1, 0x4, 0x0, 0x0, ...)
	/Users/aang/Code/golang/src/github.com/humanmade/go-anonymize-mysqldump/anonymize-mysqldump.go:299 +0x572
main.applyConfigToInserts(0xc00041a0b0, 0xc0001e4000, 0x1, 0x4, 0x0, 0x0, 0x0)
	/Users/aang/Code/golang/src/github.com/humanmade/go-anonymize-mysqldump/anonymize-mysqldump.go:277 +0x1f9
main.applyConfigToParsedLine(0x126f020, 0xc00041a0b0, 0xc0001e4000, 0x1, 0x4, 0x0, 0x0, 0x0, 0x0)
	/Users/aang/Code/golang/src/github.com/humanmade/go-anonymize-mysqldump/anonymize-mysqldump.go:250 +0x68
main.processLine(0xc0001cc1c0, 0x36, 0xc0001e4000, 0x1, 0x4, 0x0, 0x0)
	/Users/aang/Code/golang/src/github.com/humanmade/go-anonymize-mysqldump/anonymize-mysqldump.go:218 +0x2c1
main.processInput.func1(0xc00008e320, 0xc0001e4000, 0x1, 0x4, 0xc0003f20c0, 0xc0001cc1c0, 0x36)
	/Users/aang/Code/golang/src/github.com/humanmade/go-anonymize-mysqldump/anonymize-mysqldump.go:194 +0xab
created by main.processInput
	/Users/aang/Code/golang/src/github.com/humanmade/go-anonymize-mysqldump/anonymize-mysqldump.go:192 +0x41c

If NULL does not appear, it works fine.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.