feliixx / mgodatagen
Generate random data for MongoDB
License: MIT License
I have defined a "startDate" field to be generated, but I also need to create a "createdAt" field with the same value as startDate (or the day before). Is it possible?
Seen on mongoplayground:
2021/08/05 21:29:07 http2: panic serving 104.200.153.231:36142: runtime error: index out of range [0] with length 0
goroutine 348926 [running]:
net/http.(*http2serverConn).runHandler.func1(0xc0067a2030, 0xc0002d7f8e, 0xc000280f00)
/opt/hostedtoolcache/go/1.16.6/x64/src/net/http/h2_bundle.go:5716 +0x193
panic(0xeeb6a0, 0xc005a98600)
/opt/hostedtoolcache/go/1.16.6/x64/src/runtime/panic.go:965 +0x1b9
github.com/feliixx/mgodatagen/datagen/generators.(*fromArrayGenerator).EncodeValueAsString(0xc0000aee00)
/home/runner/go/pkg/mod/github.com/feliixx/[email protected]/datagen/generators/from_array_generator.go:62 +0x145
github.com/feliixx/mgodatagen/datagen/generators.(*stringFromPartGenerator).EncodeValue(0xc005d3e780)
/home/runner/go/pkg/mod/github.com/feliixx/[email protected]/datagen/generators/string_from_parts_generator.go:40 +0xc9
github.com/feliixx/mgodatagen/datagen/generators.(*DocumentGenerator).Generate(0xc005a72260, 0x14, 0x14, 0xc000168dc0)
/home/runner/go/pkg/mod/github.com/feliixx/[email protected]/datagen/generators/generators.go:36 +0x27d
github.com/feliixx/mongoplayground/internal.createDBFromMgodatagen(0xc0070f7140, 0xc000147180, 0x128, 0x140, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc0000321b0,
/home/runner/work/mongoplayground/mongoplayground/internal/run.go:195 +0x1eb
github.com/feliixx/mongoplayground/internal.(*storage).createDatabase(0xc0003f2000, 0xc0070f7140, 0x0, 0xc000147180, 0x128, 0x140, 0xc0070f7100, 0x0
/home/runner/work/mongoplayground/mongoplayground/internal/run.go:137 +0x44e
github.com/feliixx/mongoplayground/internal.(*storage).run(0xc0003f2000, 0x11728f0, 0xc0003f67c0, 0xc005c20b68, 0xe, 0x10, 0x1164e00, 0xc0070f8060,
/home/runner/work/mongoplayground/mongoplayground/internal/run.go:113 +0x253
github.com/feliixx/mongoplayground/internal.(*storage).runHandler(0xc0003f2000, 0x11711d0, 0xc0067a2030, 0xc005dfe700)
/home/runner/work/mongoplayground/mongoplayground/internal/run.go:92 +0x355
net/http.HandlerFunc.ServeHTTP(0xc0070f8060, 0x11711d0, 0xc0067a2030, 0xc005dfe700)
/opt/hostedtoolcache/go/1.16.6/x64/src/net/http/server.go:2049 +0x44
net/http.(*ServeMux).ServeHTTP(0xc000078000, 0x11711d0, 0xc0067a2030, 0xc005dfe700)
/opt/hostedtoolcache/go/1.16.6/x64/src/net/http/server.go:2428 +0x1ad
github.com/feliixx/mongoplayground/internal.latencyObserver.func1(0x11711d0, 0xc0067a2030, 0xc005dfe700)
/home/runner/work/mongoplayground/mongoplayground/internal/server.go:83 +0xa7
net/http.HandlerFunc.ServeHTTP(0xc00000e270, 0x11711d0, 0xc0067a2030, 0xc005dfe700)
/opt/hostedtoolcache/go/1.16.6/x64/src/net/http/server.go:2049 +0x44
net/http.serverHandler.ServeHTTP(0xc0003a09a0, 0x11711d0, 0xc0067a2030, 0xc005dfe700)
/opt/hostedtoolcache/go/1.16.6/x64/src/net/http/server.go:2867 +0xa3
net/http.initALPNRequest.ServeHTTP(0x1172998, 0xc000401da0, 0xc000315500, 0xc0003a09a0, 0x11711d0, 0xc0067a2030, 0xc005dfe700)
/opt/hostedtoolcache/go/1.16.6/x64/src/net/http/server.go:3439 +0x8d
net/http.(*http2serverConn).runHandler(0xc000280f00, 0xc0067a2030, 0xc005dfe700, 0xc00000e918)
/opt/hostedtoolcache/go/1.16.6/x64/src/net/http/h2_bundle.go:5723 +0x8b
created by net/http.(*http2serverConn).processHeaders
/opt/hostedtoolcache/go/1.16.6/x64/src/net/http/h2_bundle.go:5453 +0x505
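The trace ends in an index into an empty slice inside fromArrayGenerator. One plausible trigger (an assumption, since the failing configuration is not shown) is a fromArray generator whose "in" array is empty. A minimal Go sketch of a guard follows; valueAt and errEmptyArray are hypothetical names, not part of mgodatagen:

```go
package main

import (
	"errors"
	"fmt"
)

// errEmptyArray is a hypothetical error value, not one defined by mgodatagen.
var errEmptyArray = errors.New("fromArray: 'in' must contain at least one value")

// valueAt returns values[n mod len(values)]. For an empty slice it returns an
// error instead of panicking with "index out of range".
func valueAt(values []string, n uint64) (string, error) {
	if len(values) == 0 {
		return "", errEmptyArray
	}
	return values[n%uint64(len(values))], nil
}

func main() {
	if _, err := valueAt(nil, 0); err != nil {
		fmt.Println("rejected:", err)
	}
	v, _ := valueAt([]string{"a", "b", "c"}, 4)
	fmt.Println(v)
}
```

Rejecting the empty array while the configuration is validated would turn the server panic into an ordinary configuration error.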
Hello, I tried to rebuild my docker image today. I always get the following error.
Step 15/16 : RUN go get -u "github.com/feliixx/mgodatagen"
---> Running in 23c7fe93e440
unrecognized import path "go.mongodb.org/mongo-driver/bson/mgocompat": reading https://go.mongodb.org/mongo-driver/bson/mgocompat?go-get=1: 404 Not Found
The command '/bin/sh -c go get -u "github.com/feliixx/mgodatagen"' returned a non-zero code: 1
The corresponding part in the docker file looks like this
...
RUN wget https://dl.google.com/go/go1.14.1.linux-amd64.tar.gz
RUN tar -xvf go1.14.1.linux-amd64.tar.gz
RUN mv go /usr/local
RUN go get -u "github.com/feliixx/mgodatagen"
...
Do you have any idea what might be causing it?
Hello, this is a very nice and useful project. I would like to use it, but does it support SRV connection strings such as mongodb+srv://username:password@host/db? I tried passing the host, username and password as options, but it still does not work. I get an error like this:
./mgodatagen -f config.json -h host -u username -p password
connecting to mongodb://host:27017 connection failed cause: server selection error: server selection timeout, current topology: { Type: Unknown, Servers: [{ Addr: host:27017, Type: Unknown, State: Connected, Average RTT: 0, Last error: connection() : dial tcp: lookup host: no such host }, ] }
Please help, thanks.
In a configuration containing the following index:
{ "key": {
"suID": 1,
"dvID": 1,
"tm": 1
},
"name": "suID_1_dvID_1_tm_1"
}
the following index was created:
{ "key": {
"tm": 1,
"suID": 1,
"dvID": 1
},
"name": "suID_1_dvID_1_tm_1"
}
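One possible cause (an assumption, not confirmed here) is that the "key" object is decoded into a Go map, whose iteration order is unspecified. The sketch below shows one way to read the field names of a flat JSON object in their written order using only encoding/json; orderedKeys is a hypothetical helper, not a function of mgodatagen:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// orderedKeys returns the field names of a flat JSON object in the order in
// which they appear in the input.
func orderedKeys(raw string) ([]string, error) {
	dec := json.NewDecoder(strings.NewReader(raw))
	tok, err := dec.Token()
	if err != nil {
		return nil, err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return nil, fmt.Errorf("expected a JSON object, got %v", tok)
	}
	var keys []string
	for dec.More() {
		tok, err := dec.Token()
		if err != nil {
			return nil, err
		}
		key, ok := tok.(string)
		if !ok {
			return nil, fmt.Errorf("expected a string key, got %v", tok)
		}
		keys = append(keys, key)
		// Consume the value that belongs to this key without interpreting it.
		var value json.RawMessage
		if err := dec.Decode(&value); err != nil {
			return nil, err
		}
	}
	return keys, nil
}

func main() {
	keys, err := orderedKeys(`{"suID": 1, "dvID": 1, "tm": 1}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(keys) // prints [suID dvID tm]
}
```

An ordered list of keys like this can then be turned into an ordered index specification, so that the compound index matches its name.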
Is it possible to distribute particular enum values according to user-defined probabilities? This would help in preparing collections that resemble real-world scenarios.
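To make the request concrete, here is a minimal Go sketch of weighted selection. It only illustrates the idea; the weightedValue type and both functions are hypothetical and do not describe an existing mgodatagen option:

```go
package main

import (
	"fmt"
	"math/rand"
)

// weightedValue pairs a candidate value with a relative, non-negative weight.
type weightedValue struct {
	Value  string
	Weight int
}

// totalWeight sums all weights and rejects negative ones.
func totalWeight(values []weightedValue) (int, error) {
	total := 0
	for _, v := range values {
		if v.Weight < 0 {
			return 0, fmt.Errorf("negative weight for %q", v.Value)
		}
		total += v.Weight
	}
	if total == 0 {
		return 0, fmt.Errorf("total weight must be positive")
	}
	return total, nil
}

// pickAt maps a draw in [0, total) to a value: each value owns a contiguous
// range of draws whose width equals its weight.
func pickAt(values []weightedValue, draw int) string {
	for _, v := range values {
		if draw < v.Weight {
			return v.Value
		}
		draw -= v.Weight
	}
	return "" // only reached when draw >= total
}

func main() {
	values := []weightedValue{{"active", 70}, {"suspended", 25}, {"deleted", 5}}
	total, err := totalWeight(values)
	if err != nil {
		panic(err)
	}
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		counts[pickAt(values, rand.Intn(total))]++
	}
	fmt.Println(counts)
}
```

With the weights above, roughly 70% of the generated values would be "active", 25% "suspended" and 5% "deleted".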
Is it possible to generate more than one nested object, for example through a parameter such as "objectContentCount"?
Consider this configuration:
[{"database":"db",
"collection":"cd_card_tree",
"count":1,
"content":{
"id_cleint":{"type":"long","maxLong":999999999999},
"id_card":{"type":"object", "objectContentCount":2, "objectContent":{
"card_no":{"type":"long","maxLong":9999999999},
"card_date":{"type":"date"},
"transaction":{"type":"object", "objectContentCount":2, "objectContent":{
"tran_sum":{"type":"decimal"},
"tran_date":{"type":"date"}}
}
}
}
}
}]
The desired result would be:
[
{
"_id": ObjectId("5a934e000102030405000001"),
"id_cleint": NumberLong(863926077139),
"id_card":[{
"card_date": "2021-10-01",
"card_no": NumberLong(1539691641),
"transaction":[
{
"tran_date": "2022-03-11",
"tran_sum": 100.0
},
{
"tran_date": "2022-03-11",
"tran_sum": 200.0
}]
},
{
"card_date": "2021-11-21",
"card_no": NumberLong(4539691641),
"transaction":[
{
"tran_date": "2022-03-11",
"tran_sum": 100.0
},
{
"tran_date": "2022-03-11",
"tran_sum": 200.0
}]
}]
}
]
It would be great to have an Enum type basic generator:
"fieldName": {
"type": "enum",
"values": ["Taurus", "Cancer", "Virgo", ..]
}
Currently, an array takes only a single generator, arrayContent. There are two problems with it:
"fieldName": {
"type": "array",
"arrayContent": [
{
"type": "constant",
"constVal": "one"
},
{
"type": "constant",
"constVal": 2
}
]
}
Having this template:
[
{
"collection": "collection",
"count": 1,
"content": {
"users": {
"type": "array",
"size": 0
}
}
}
]
The actual result is:
error in configuration:
fail to create collection collection: invalid generator for field 'key'
cause: make sure that 'size' >= 0
And the expected result:
[
{
"_id": ObjectId("5a934e000102030405000000"),
"users": []
}
]
I can see a bug in collinfo.go:542 (the check should be if config.Size < 0), but even if that is fixed, an "arrayContent" is also required at line 545, which would be unnecessary for an empty array.
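The two checks described above could be relaxed roughly as follows. This is a sketch under stated assumptions: arrayConfig and validateArray are hypothetical stand-ins and are not copied from collinfo.go.

```go
package main

import (
	"errors"
	"fmt"
)

// arrayConfig is a hypothetical stand-in for the settings of an array
// generator. ArrayContent is nil when "arrayContent" is absent.
type arrayConfig struct {
	Size         int
	ArrayContent *string
}

// validateArray accepts size 0 and only requires "arrayContent" when at
// least one element has to be generated.
func validateArray(c arrayConfig) error {
	if c.Size < 0 {
		return errors.New("make sure that 'size' >= 0")
	}
	if c.Size > 0 && c.ArrayContent == nil {
		return errors.New("'arrayContent' is required when 'size' > 0")
	}
	return nil
}

func main() {
	fmt.Println(validateArray(arrayConfig{Size: 0}))  // prints <nil>
	fmt.Println(validateArray(arrayConfig{Size: -1})) // prints the size error
}
```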
Hi, is it possible to pass an array of ObjectIds in the "in" property of fromArray?
If yes, can you please share some demo code?
"user": { "type": "fromArray", "in": [ "6107c50de31053fa30380735", "6107c50de31053fa30380736", "6107c50de31053fa30380737", "6107c50de31053fa30380738", "6107c50de31053fa30380739", "6107c50de31053fa3038073a", "6107c50de31053fa3038073b", "6107c50de31053fa3038073c", "6107c50de31053fa3038073d", "6107c50de31053fa3038073e", "6107c50de31053fa3038073f", "6107c50de31053fa30380740", "6107c50de31053fa30380741", "6107c50de31053fa30380742", "6107c50de31053fa30380743", "6107c50de31053fa30380744", "6107c50de31053fa30380745", "6107c50de31053fa30380746", "6107c50de31053fa30380747", "6107c50de31053fa30380748" ], "randomOrder": true }
Hi,
I'm trying to generate a collection with TTL Index, but I'm getting this error:
error in configuration file: object / array / Date badly formatted:
json: unknown field "expireAfterSeconds"
This is my schema:
[
{
"database": "example",
"collection": "test",
"count": 1,
"content": {
"expireAt": {
"type": "date",
"startDate": "2020-01-01T22:00:00+00:00",
"endDate": "2020-01-05T22:00:00+00:00"
}
},
"indexes": [
{
"name": "expireAt_1",
"key": {
"expireAt": 1
},
"expireAfterSeconds": 0
}
]
}
]
Thank you.
Is there a way to use x509 auth when connecting to the DB?
Is it possible to use generators for object keys?
Hi!
First, thank you for this awesome tool, which I have been using for many years now for MongoDB performance evaluations.
I'm currently trying to generate a field that consists of a couple of random strings. My approach was to use stringFromParts together with string generators as parts. Unfortunately, I cannot get it to work. Here is a sample configuration file:
[
{
"database": "someDatabase",
"collection": "someCollection",
"count": 10,
"content": {
"someString": {
"type": "string",
"minLength": 3,
"maxLength": 5
},
"someStringFromParts": {
"type": "stringFromParts",
"parts": [
{
"type": "string",
"minLength": 3,
"maxLength": 5
},
{
"type": "string",
"minLength": 3,
"maxLength": 5
}
]
}
}
}
]
which gives the following in the database
One can see that the property someString is filled while someStringFromParts is not, although the string generator settings are exactly the same.
Here is a playground with the configuration above:
Hi,
Not sure if I'm doing something wrong, but I want a field that is an array of objects (questions), where each object has an ID and some text. When I do this, the question text is different for each question but the ID is the same. I want the ID to be unique so that I can reference it from another collection, like a primary key/foreign key for joins, so to speak.
My Mongo playground is: https://mongoplayground.net/p/YgxGJnMHmr-
Thanks,
Sean
I'm not sure if this is already possible.
I'm trying to generate a date in the following format: YYYY/MM/DD
Example: 1970/01/01
I can't use int, and I can't use decimal. Is there a way I could format the int?
Also, I'm not sure what "nullPercentage" means.
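For reference, this is how a timestamp can be rendered as YYYY/MM/DD in Go, the language mgodatagen is written in. The sketch only illustrates the formatting step and does not claim that mgodatagen exposes such an option; formatUnixDate is a hypothetical helper:

```go
package main

import (
	"fmt"
	"time"
)

// formatUnixDate renders a Unix timestamp (seconds, UTC) as YYYY/MM/DD.
// Go layouts are written in terms of the reference date 2006-01-02.
func formatUnixDate(sec int64) string {
	return time.Unix(sec, 0).UTC().Format("2006/01/02")
}

func main() {
	fmt.Println(formatUnixDate(0)) // prints 1970/01/01
}
```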
Currently it is not possible to add a reference of one document to another one in the same collection like this:
[
{
"database": "db",
"collection": "users",
"count": 100,
"content": {
"profileName": {
"type": "faker",
"nullPercentage": 0,
"method": "UserName"
},
"friendsList": {
"type": "array",
"nullPercentage": 0,
"size": 30,
"arrayContent": {
"type": "ref",
"nullPercentage": 0,
"id": 0,
"refContent": {
"type": "objectId",
"nullPercentage": 0
}
}
},
"_id": {
"type": "ref",
"nullPercentage": 0,
"id": 0
}
}
}
]
I followed your install instructions, and there are no build files for go to execute.
Files:
codecov.yml datagen demo.gif go.mod go.sum LICENSE main.go README.md test.sh
go install does not generate any ./mgodatagen executable.
"location": {
"type": "object",
"objectContent": {
"type": {
"type": "constant",
"constVal": "Point"
},
"coordinates": {
"type": "position"
}
}
},
Error:
Error: (Location16755) Index build failed: c07cf731-43b6-4371-80e0-f29ef058cc90: Collection iloveu.profile ( 7783600a-51ea-4165-8acd-9f856a8b26eb ) :: caused by :: Can't extract geo keys: { _id: ObjectId('62215774a8f4a4641a612445'), username: "Idell", description: ""Vegan Thundercats Yuccie vinegar DIY Yuccie." - Devan Gerlach", subject_id: "04d5cce3-f1c6-4462-a605-58d06a2a8e94", seeking: "", location: { type: "Point", coordinates: [ 36.50425432207004, 110.2707559292275 ] }, identity: "m", gender: "m", birthdate: "1952/4/20" } longitude/latitude is out of bounds, lng: 36.5043 lat: 110.271
db := client.Database(config.Database)
indexOpts := mopts.CreateIndexes().SetMaxTime(config.DefaultTimeout)
// Index to location 2dsphere type.
pointIndexModel := mongo.IndexModel{
Options: mopts.Index(),
Keys: bsonx.MDoc{"location": bsonx.String("2dsphere")},
}
pointIndexes := db.Collection("profile").Indexes()
_, e = pointIndexes.CreateOne(ctx, pointIndexModel, indexOpts)
if e != nil {
return e
}
Either a new type, geo2d, should be added (since you're generating data for MongoDB, which uses geo2d), or an option should be added to switch the long/lat ordering.
I would change it locally, but I have no idea what
g.buffer.Write(float64Bytes(90 * float64(i+1) * (2*(float64(g.pcg64.Random())/(1<<64)) - 1)))
means...
Now I get it: you push the y value (latitude) first and then the x value (longitude), because the first value is scaled by 90 and the second by 180.
smh
func (g *positionGenerator) EncodeValue() {
current := g.buffer.Len()
g.buffer.Reserve()
// val 1: longitude, in [-180, 180]
g.buffer.WriteSingleByte(byte(bson.TypeDouble))
g.buffer.WriteSingleByte(indexesBytes[0])
g.buffer.WriteSingleByte(byte(0))
g.buffer.Write(float64Bytes(90 * float64(2) * (2*(float64(g.pcg64.Random())/(1<<64)) - 1)))
// val 2: latitude, in [-90, 90]
g.buffer.WriteSingleByte(byte(bson.TypeDouble))
g.buffer.WriteSingleByte(indexesBytes[1])
g.buffer.WriteSingleByte(byte(0))
g.buffer.Write(float64Bytes(90 * float64(1) * (2*(float64(g.pcg64.Random())/(1<<64)) - 1)))
g.buffer.WriteSingleByte(byte(0))
g.buffer.WriteAt(current, int32Bytes(int32(g.buffer.Len()-current)))
}
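The arithmetic can be isolated from the BSON encoding. GeoJSON points list longitude first and latitude second, which is also the order the 2dsphere error above reports ("lng: ... lat: ..."). The sketch below reproduces only the scaling step; scale and geoJSONPoint are hypothetical names, and the random inputs stand in for the generator's pcg64 source:

```go
package main

import (
	"fmt"
	"math/rand"
)

// scale maps u in [0, 1) to the symmetric interval [-limit, limit).
// It is the same formula as limit * (2*u - 1) in the generator.
func scale(u, limit float64) float64 {
	return limit * (2*u - 1)
}

// geoJSONPoint returns coordinates in GeoJSON order: [longitude, latitude].
func geoJSONPoint(uLng, uLat float64) [2]float64 {
	return [2]float64{scale(uLng, 180), scale(uLat, 90)}
}

func main() {
	p := geoJSONPoint(rand.Float64(), rand.Float64())
	fmt.Printf("lng=%.4f lat=%.4f\n", p[0], p[1])
}
```

Emitting the value scaled by 180 first keeps every generated point within the bounds that a 2dsphere index accepts.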
Is there a command line switch or a config file option so that the generated documents are sent to stdout or to a file?
For prototyping documents it would be useful to not need to round-trip through a MongoDB database/collection. It would also be useful in situations when a MongoDB is temporarily unavailable due to maintenance, network non-availability, etc.
Hello,
As valueAggregator returns an array, it would be nice if I could pass the array to the parameter "in" of fromArray.
This is my issue:
We have 3 collections : stores, users, events
stores:
{ "_id": ObjectId('store') }
users :
{ "_id": ObjectId('user'), "firstName":..., "lastName":..., "storeId": ObjectId('store') }
events:
{ "_id": ObjectId('event'), "participants": [ { "id": ObjectId('user') } ], "storeId": ObjectId('store') }
What do I want?
I want the participants of an event to be populated with a random number of users that belong to the same store.
If we could combine fromArray with valueAggregator, I think it would solve the issue.
I want to insert static data (countries, languages), which I have as a JSON array.
This data needs generated keys that I can reference in other documents.
Is this planned to be added?
Hello, is it possible to use a reference to a fromArray?
If I read the documentation correctly, only the object type is possible.
This would be super helpful for linking accounts with randomly generated data.
In the sample config.json file there are two keys for object:
"object": {
"type": "object",
"objectContent": {
"k1": {
"type": "string",
"unique": true,
"minLength": 4,
"maxLength": 4
},
"k2":{
"type": "int",
"minInt": -10,
"maxInt": -5
}
}
}
However, when data is generated using mgodatagen it produces object as:
"object" : {
"object" : "aaaa"
}
The second key is not generated. Also, is there any way to name the keys in an object? In the example above, the key is named object, but that name is not specified in the object content.
Please help.
Hi!
First of all, thank you for this neat little tool. It really fills a gap.
I have one problem: I cannot generate a constant int value.
The use case is to generate documents with a field version that is always set to 0. Using the int32 generator, I must use maxInt > minInt. Using the constant generator and supplying 0, it generates a double 1.0.
Any suggestions on this?
Thx in advance!
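A general Go detail may be relevant here: when JSON is decoded into an interface{} value, encoding/json turns every number into a float64 unless the decoder is told otherwise. Whether that is what happens inside the constant generator is an assumption. The sketch shows how Decoder.UseNumber keeps integer literals as integers; decodeConst is a hypothetical helper:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// decodeConst reads {"constVal": ...} and returns the value. Integer literals
// come back as int64, other numbers as float64, anything else unchanged.
func decodeConst(raw string) (interface{}, error) {
	dec := json.NewDecoder(strings.NewReader(raw))
	dec.UseNumber() // keep numbers as json.Number instead of float64
	var doc map[string]interface{}
	if err := dec.Decode(&doc); err != nil {
		return nil, err
	}
	num, ok := doc["constVal"].(json.Number)
	if !ok {
		return doc["constVal"], nil // not a number
	}
	if i, err := num.Int64(); err == nil {
		return i, nil
	}
	f, err := num.Float64()
	if err != nil {
		return nil, err
	}
	return f, nil
}

func main() {
	for _, raw := range []string{`{"constVal": 0}`, `{"constVal": 1.5}`} {
		v, err := decodeConst(raw)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%T %v\n", v, v)
	}
}
```

The program prints "int64 0" followed by "float64 1.5", so a constant such as 0 would keep its integer type.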
Hello,
is there an option for the generator to produce a UUID as binary (not as a string)?
For example, this is how you would write a binary UUID for use with mongoimport/mongoexport:
{
"$binary": {
"base64": "zggG33O7Tgiaf0MIfkBp/g==",
"subType": "04"
}
}
In MongoDB Compass it is then shown as UUID('3508bdf2-b598-44a4-918c-4966e54a4d27'), not as a plain string.
This is common practice with, for example, the MongoDB Java Driver. It would be very useful to make this possible in the generator for ids, fields, references, etc.
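The relationship between the textual UUID and the "base64" field of the extended JSON form can be sketched in Go as follows. The helpers are hypothetical and only cover the conversion itself (16 raw bytes, standard base64), not any mgodatagen configuration:

```go
package main

import (
	"encoding/base64"
	"encoding/hex"
	"fmt"
	"strings"
)

// uuidToBase64 converts a textual UUID into the base64 payload used by the
// {"$binary": {"base64": ..., "subType": "04"}} representation.
func uuidToBase64(s string) (string, error) {
	raw, err := hex.DecodeString(strings.ReplaceAll(s, "-", ""))
	if err != nil {
		return "", err
	}
	if len(raw) != 16 {
		return "", fmt.Errorf("a UUID has 16 bytes, got %d", len(raw))
	}
	return base64.StdEncoding.EncodeToString(raw), nil
}

// base64ToUUID is the inverse: it renders the 16 bytes in 8-4-4-4-12 form.
func base64ToUUID(b string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(b)
	if err != nil {
		return "", err
	}
	if len(raw) != 16 {
		return "", fmt.Errorf("a UUID has 16 bytes, got %d", len(raw))
	}
	h := hex.EncodeToString(raw)
	return h[0:8] + "-" + h[8:12] + "-" + h[12:16] + "-" + h[16:20] + "-" + h[20:32], nil
}

func main() {
	b, err := uuidToBase64("3508bdf2-b598-44a4-918c-4966e54a4d27")
	if err != nil {
		panic(err)
	}
	fmt.Printf(`{"$binary": {"base64": %q, "subType": "04"}}`+"\n", b)
}
```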
Hi!
Maybe this one is somehow related to #92
Having the following configuration file
[
{
"database": "someDatabase",
"collection": "someCollection",
"count": 10,
"content": {
"someInt": {
"type": "int"
}
}
}
]
what I get in the database is the following
I already noticed something similar with the string generator, where I needed to specify minLength and maxLength, or else I got the empty string every time.
As soon as one adds minInt and maxInt, it starts to generate values other than 0. This is also similar to how the string generator behaves, as it starts to generate values as soon as minLength and maxLength are given. Unfortunately, this does not help with stringFromParts (see #92).
See the following playground for an interactive example for int:
https://mongoplayground.net/p/u_dNIJBpf3R
And here for string without further configuration:
Thanks for making this!
Is it possible to create an array of random length? Something along these lines is what I'm looking for:
"fieldName": {
"type": "array", // required
"size": `randomInt`, // required, size of the array
"arrayContent": `generator`, // generator used to create the elements that fill the array.
// can be of any type
"nullPercentage": `int`, // optional
"maxDistinctValue": `int` // optional
}
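What such an option would have to do can be sketched in a few lines of Go. The minimum and maximum bounds are an assumed shape for the request, and all names below are hypothetical:

```go
package main

import (
	"fmt"
	"math/rand"
)

// newSeeded returns a deterministic random source, so runs are reproducible.
func newSeeded(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

// randomSize draws a size uniformly from the closed range [minSize, maxSize].
func randomSize(r *rand.Rand, minSize, maxSize int) (int, error) {
	if minSize < 0 || maxSize < minSize {
		return 0, fmt.Errorf("invalid size range [%d, %d]", minSize, maxSize)
	}
	return minSize + r.Intn(maxSize-minSize+1), nil
}

// randomArray builds an array of random length and calls gen once per element.
func randomArray(r *rand.Rand, minSize, maxSize int, gen func(i int) interface{}) ([]interface{}, error) {
	n, err := randomSize(r, minSize, maxSize)
	if err != nil {
		return nil, err
	}
	out := make([]interface{}, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, gen(i))
	}
	return out, nil
}

func main() {
	r := newSeeded(42)
	arr, err := randomArray(r, 0, 5, func(i int) interface{} { return i * i })
	if err != nil {
		panic(err)
	}
	fmt.Println(len(arr), arr)
}
```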
Currently, autoincrement only supports the start (int/long) parameters, but it would be very useful to add prefix/suffix values so that one could create values like "[email protected]", "[email protected]".
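A small Go sketch of the requested behaviour, with a hypothetical prefix "user" and suffix "@example.com" chosen purely for illustration (they are not the values from the request, and autoIncrement is not an mgodatagen function):

```go
package main

import "fmt"

// autoIncrement returns count values of the form prefix + counter + suffix,
// where the counter starts at start and increases by one each time.
func autoIncrement(prefix, suffix string, start int64, count int) []string {
	if count <= 0 {
		return nil
	}
	out := make([]string, 0, count)
	for i := 0; i < count; i++ {
		out = append(out, fmt.Sprintf("%s%d%s", prefix, start+int64(i), suffix))
	}
	return out
}

func main() {
	for _, v := range autoIncrement("user", "@example.com", 1, 3) {
		fmt.Println(v)
	}
}
```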
Is there a way to generate UUIDs instead of ObjectIds?
I am trying to create a collection with a unique index. mgodatagen creates the data first and then creates the unique index.
Duplicate records already in the collection cause the createIndex call to fail with: E11000 duplicate key error collection
"test_objectid": {
"type": "constant",
"constVal": {
"$oid": "5a934e000102030405000001"
}
}
https://mongoplayground.net/p/kLGsx6V2bHY
Output
{
"_id": ObjectId("5a934e000102030405000000"),
"test_objectid": {
"$oid": "5a934e000102030405000001"
}
}
test_objectid is saved as { "$oid": "5a934e000102030405000001" }; it is not converted to an ObjectId.
[
{
"test_objectid": {
"$oid": "5a934e000102030405000001"
}
}
]
https://mongoplayground.net/p/2qIZHXMG561
Output is correct
[
{
"_id": ObjectId("5a934e000102030405000000"),
"test_objectid": ObjectId("5a934e000102030405000001")
}
]
db.collection.find({}).toArray()
.forEach(function (doc) {
try { doc.test_objectid = ObjectId( doc.test_objectid.$oid ) } catch(e) {} // convert manually
db.collection.save(doc);
});
int, long and double are documented in the README as having min and max limits, but decimal is not.
Why is that?
Is there a way to limit decimals?