Giter VIP home page Giter VIP logo

mongodb-to-postgresql's Introduction

mongodb-to-postgresql

A tool to migrate and replicate data from MongoDB to PostgreSQL in Node.js.

The tool works by creating an object data model which includes all data fields that you want to migrate to SQL and executing a migration script. The tool has the option to either run as a one off script, or can be setup to keep your replicated PostgreSQL database in sync with MongoDB.

Getting Started

These instructions will get the project set up on your local machine for running the tool.

Prerequisites

  1. Node.js (Tested on V10.15.3)
  2. Node Package Manager (Tested on 6.4.1)
  3. PostgreSQL server (Tested on 10.3)
  4. MongoDB server (Tested on 4.0.0)

Setup

There are four steps for setup:

  1. Setting up your local environment
  2. Configure database connection settings
  3. Configure Data Model
  4. Configure custom fields (if required)
  5. Run scripts

Environment Setup

Once you have cloned this repository into a local folder, navigate to the folder in your choice of console/terminal, and run:

npm install

Once installed, you can type 'npm list' into the command prompt and should see something similar to (list below is truncated and will include more packages in your local install):

Database Connection Settings

Enter your MongoDB and PostgreSQL database connection settings in the db_details.json file:

Example:
{
    "mongo_connection": {
        "url": "mongodb://localhost:27017",
        "database": "mongodb_to_postgres"
    },
    "postgres_connection": {
        "user": "postgres",
        "host": "localhost",
        "database": "mongo_test",
        "password": "xxxxx",
        "port": "5432"
    }
}

Configure Database Model

An example database model setup has already been created based on the JSON data in the 'dummy_data' folder. Please see the section further down for instructions on running the example data.

The object data model is an object created in the data_model.js file. this contains a list of objects (one for each table), and within each of these an array with information on the table, field names and field types. The field and table names need to match each other to pick up the correct collection name and fields

Example:

module.exports.all_models = {
    table_1_model: [
        'table_1', // first item is always table (collection) name
        '_id VARCHAR(50) PRIMARY KEY NOT NULL', // second item is primary key field
        'created_at TIMESTAMPTZ', // this field is necessary for automatic sync (optional)
        'updated_at TIMESTAMPTZ', // this field is necessary for automatic sync (optional)
        'title TEXT',
        'authors _TEXT', // this is an array field
        'languages JSONB' // this is a json field
    ],
    table_2_model: [
        'table_2',
        '_id VARCHAR(50) PRIMARY KEY NOT NULL'
    ]
}

Configure Custom Fields (If Required)

There may be custom fields that you want to access that require more complex logic. Some example use cases are:

  1. The MongoDB field has a different name to the field name you want in Postgres
  2. The MongoDB field is nested within an object (see the example process below for more detail on this)
  3. You want to perform a transformation of the data prior to loading into Postgres

These custom field rules are created in the migrate_data.js file, in rows 321 onwards. The code block looks like this:

// custom rules applied in this switch statement if needed, otherwise default will be used
// -------------------------------------------------------
switch (columns[j]) {
    // custom rule for extracting value from child level i.e. 'common' that is stored in the 'name' object
    case 'common_name':
        insert_row.push(
            json_key(data_row.name, 'common', j)
        )
        break
    default:
        insert_row.push(
            json_key(data_row, columns[j], j)
        )
}
// -------------------------------------------------------
}

In the example shown here, the column defined in the data model as common_name will have a custom rule applied. In this case, a property 'common' in the 'name' object in a MongoDB document is what wants to be migrated.

Run Database Creation & Migration Scripts

There are two basic scripts for this tool. To run them, navigate to the root folder of this repository and run:

node start.js createdb

This will create the database model defined in the data_model.js file into your chosen Postgres instance. For basic changes, you can make changes to your data model and re-run the script to add or delete columns.

node start.js migratedata

This will execute the data migration from MongoDB to PostgreSQL based on the data model and custom rules setup. This can be run once to migrate data as a one off process. Optionally, if using 'created_at' and 'updated_at' timestamp dates in MongoDB and these have been included in the data_model.js setup, the script can be run at regular intervals to transfer new or updated documents to Postgres

Example

Steps to run example:

  1. Complete Steps 1 & 2 of the Setup instructions above
  2. Upload dummy MongoDB data
  3. Configure Data Model
  4. Configure custom fields
  5. Run scripts

To be continued...

Other Information

Further Notes

Please don't hesitate to provide feedback, feature improvements and bugs on this repository.

To Do

  • Provide example data step through
  • Use case information for timestamp fields
  • Make timestamp field names customisable
  • Currently, migration script can only be run once without timestamp fields, with data needing to be deleted before performing a new migration. If not using timestamp fields, potentially create option to remove and re-migrate data automatically in entire script

mongodb-to-postgresql's People

Contributors

kurt213 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

mongodb-to-postgresql's Issues

error: malformed array literal: "[USA]"

I am using the mongodb_sample_schema, with the following transformation and encountering the error: malformed array literal: "[USA]" when when running "node start.js migratedata".
// data_model.js movies: [ 'movies', '_id VARCHAR(50) PRIMARY KEY NOT NULL', 'plot TEXT', 'genras TEXT[]', '"cast" TEXT[]', 'title TEXT', 'year NUMERIC', 'runtime NUMERIC', 'num_mflix_comments NUMERIC', 'fullplot TEXT', 'countries TEXT[]', 'type VARCHAR(100)', 'tomatoes JSONB', 'awards JSONB', ],
I also encountered a number of other problems mainly with the package.json but after upgrading some of the packages the createdb errors disappeared.

{ "name": "mongodb-to-postgres-npm", "version": "1.0.0", "description": "", "main": "index.js", "scripts": { "test": "echo \"Error: no test specified\" && exit 1" }, "author": "", "license": "ISC", "dependencies": { "async": "^3.1.0", "async-waterfall": "^0.1.5", "moment": "^2.24.0", "mongodb": "^4.2.2", "pg": "^8.7.1" } }

I think this is mostly because postgress expects the array in countries to be presented as {USA} and not [USA]

How to make a model to take objects within objects and duplicate the lines when this occurs?

Hello! I'm new to using NodeJs and I thought your project was amazing, so I'm using it to help me.
However, I'm trying to do something that I don't understand how to execute within your project.
How can I get objects nested inside another object and the lines multiply according to the recurrence of these cases?
It's similar to a 1-to-many Inner Join.
Example:
Table A has one row with ID = X and Table B has 7 rows with ID = X and Collumn NAME = [a,b,c,d,e,f,g].
When I do the INNER JOIN the end result is a single table with 7 Rows:
ID NAME
X a
X b
X c
X d
X e
X f
X g

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.