tap-salesforce

Singer tap that extracts data from a Salesforce Account and produces JSON-formatted data following the Singer spec.

This is a forked version of tap-salesforce (v1.4.24) that is maintained by the Meltano team.

Main differences from the original version:

  • Support for username/password/security_token authentication
  • Support for concurrent execution (8 threads by default) when accessing different API endpoints to speed up the extraction process
  • Support for much faster discovery

Quickstart

Install the tap

This version of tap-salesforce is not available on PyPI, so you have to fetch it directly from the Meltano-maintained project:

python3 -m venv venv
source venv/bin/activate
pip install git+https://github.com/MeltanoLabs/tap-salesforce.git

Create a Config file

Required

{
  "api_type": "BULK",
  "select_fields_by_default": true
}

Required for OAuth based authentication

{
  "client_id": "secret_client_id",
  "client_secret": "secret_client_secret",
  "refresh_token": "abc123"
}

Required for username/password based authentication

{
  "username": "Account Email",
  "password": "Account Password",
  "security_token": "Security Token"
}

Optional

{
  "start_date": "2017-11-02T00:00:00Z",
  "state_message_threshold": 1000,
  "max_workers": 8,
  "streams_to_discover": ["Lead", "LeadHistory"]
}

The client_id and client_secret keys are your OAuth Salesforce App secrets. The refresh_token is a secret created during the OAuth flow. For more info on the Salesforce OAuth flow, visit the Salesforce documentation.

The start_date is used by the tap as a bound on SOQL queries when searching for records. This should be an RFC3339 formatted date-time, like "2018-01-08T00:00:00Z". For more details, see the Singer best practices for dates.
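As a small illustration of the expected format, the sketch below produces RFC 3339 UTC timestamps suitable for start_date. The helper name rfc3339_utc and the 30-day window are assumptions made for this example; they are not part of the tap.

```python
from datetime import datetime, timedelta, timezone


def rfc3339_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp ending in 'Z'."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Fixed date matching the example in the text.
print(rfc3339_utc(datetime(2018, 1, 8, tzinfo=timezone.utc)))  # 2018-01-08T00:00:00Z

# Rolling window: 30 days before now (the window length is an arbitrary choice).
print(rfc3339_utc(datetime.now(timezone.utc) - timedelta(days=30)))
```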

The api_type is used to switch the behavior of the tap between using Salesforce's "REST" and "BULK" APIs. When new fields are discovered in Salesforce objects, the select_fields_by_default key describes whether or not the tap will select those fields by default.

The state_message_threshold is used to throttle how often STATE messages are generated when the tap is using the "REST" API. This is a balance between not slowing down execution due to too many STATE messages produced and how many records must be fetched again if a tap fails unexpectedly. Defaults to 1000 (generate a STATE message every 1000 records).
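The trade-off can be seen in a minimal sketch of threshold-based STATE emission. This is not the tap's actual code: the function emit_with_state, the bookmark layout under "value", and the sample records are assumptions for illustration, and the sketch assumes records arrive ordered by the replication key.

```python
import json
from typing import Iterable, Iterator


def emit_with_state(records: Iterable[dict], stream: str,
                    replication_key: str, threshold: int = 1000) -> Iterator[str]:
    """Yield Singer RECORD lines, plus a STATE line after every `threshold` records."""
    count = 0
    bookmark = None

    def state_line() -> str:
        # Illustrative bookmark layout; a real tap may structure its state differently.
        return json.dumps(
            {"type": "STATE", "value": {"bookmarks": {stream: {replication_key: bookmark}}}}
        )

    for record in records:
        yield json.dumps({"type": "RECORD", "stream": stream, "record": record})
        bookmark = record[replication_key]
        count += 1
        if count % threshold == 0:
            yield state_line()

    # Emit a final STATE so the last partial batch is not fetched again next run.
    if bookmark is not None and count % threshold != 0:
        yield state_line()


sample = [{"Id": str(i), "SystemModstamp": f"2018-01-0{i}T00:00:00Z"} for i in range(1, 6)]
for line in emit_with_state(sample, "Lead", "SystemModstamp", threshold=2):
    print(line)
```

With a lower threshold, a failure costs fewer re-fetched records but more STATE lines are written; a higher threshold reverses that balance.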

The max_workers value sets the maximum number of threads used to extract data for streams concurrently. Defaults to 8 (extract data for 8 streams in parallel).
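The general pattern of one worker per stream, capped at max_workers, might look like the sketch below. It uses Python's standard ThreadPoolExecutor and is only an illustration of the idea, not the tap's implementation; extract_stream is a hypothetical stand-in for syncing one stream.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Dict, Iterable


def extract_stream(name: str) -> int:
    """Hypothetical stand-in for syncing one stream; returns a fake record count."""
    return len(name)


def extract_all(streams: Iterable[str], max_workers: int = 8) -> Dict[str, int]:
    """Run extract_stream for each stream, using at most `max_workers` threads."""
    results: Dict[str, int] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(extract_stream, name): name for name in streams}
        for future in as_completed(futures):
            # result() re-raises any exception raised inside the worker thread.
            results[futures[future]] = future.result()
    return results


print(extract_all(["Lead", "LeadHistory"], max_workers=2))
```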

The streams_to_discover value may contain a list of Salesforce streams (each ending up in a target table) to which discovery is limited. By default, discovery covers all existing streams, which can take several minutes. When limited to the few entities users typically need, it takes a few seconds. The disadvantage is that you have to keep this list in sync with the select section, where you specify all properties (each ending up in a table column).
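Since the fragments above are meant to be combined into a single config.json, the following sketch merges them with Python. The helper build_config and its validation rules (exactly one complete set of authentication keys, api_type limited to "REST" or "BULK") are assumptions made for this example; the tap itself may validate its config differently.

```python
import json
from typing import Optional

REQUIRED = {"api_type": "BULK", "select_fields_by_default": True}
OAUTH_KEYS = {"client_id", "client_secret", "refresh_token"}
PASSWORD_KEYS = {"username", "password", "security_token"}


def build_config(auth: dict, optional: Optional[dict] = None) -> dict:
    """Merge the required, authentication, and optional settings into one dict."""
    if set(auth) not in (OAUTH_KEYS, PASSWORD_KEYS):
        raise ValueError("auth must hold exactly the OAuth keys or the password keys")
    config = {**REQUIRED, **auth, **(optional or {})}
    if config["api_type"] not in ("REST", "BULK"):
        raise ValueError('api_type must be "REST" or "BULK"')
    return config


config = build_config(
    auth={
        "username": "Account Email",
        "password": "Account Password",
        "security_token": "Security Token",
    },
    optional={
        "start_date": "2017-11-02T00:00:00Z",
        "max_workers": 8,
        "streams_to_discover": ["Lead", "LeadHistory"],
    },
)

# json.dumps never emits trailing commas, so the result is valid JSON.
print(json.dumps(config, indent=2))
```

To produce the file itself, the printed text can be saved as config.json.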

Run Discovery

To run discovery mode, execute the tap with the config file.

tap-salesforce --config config.json --discover > properties.json

Sync Data

To sync data, select fields in the properties.json output and run the tap.
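Selecting can be done by hand or with a short script. The sketch below assumes properties.json follows the usual Singer catalog layout (a top-level "streams" list whose entries carry a "metadata" list, with the stream-level entry identified by an empty "breadcrumb"); the helper select_streams and the tiny inline catalog are illustrative, so check the real discovery output before relying on it.

```python
import json
from typing import Set


def select_streams(catalog: dict, wanted: Set[str]) -> dict:
    """Set "selected" on each stream-level metadata entry; modifies catalog in place."""
    for stream in catalog.get("streams", []):
        for entry in stream.get("metadata", []):
            if entry.get("breadcrumb") == []:  # empty breadcrumb = the stream itself
                entry.setdefault("metadata", {})["selected"] = (
                    stream.get("tap_stream_id") in wanted
                )
    return catalog


# Tiny stand-in for discovery output, only to show the shape being edited.
catalog = {
    "streams": [
        {"tap_stream_id": "Lead", "metadata": [{"breadcrumb": [], "metadata": {}}]},
        {"tap_stream_id": "Account", "metadata": [{"breadcrumb": [], "metadata": {}}]},
    ]
}

select_streams(catalog, {"Lead"})
print(json.dumps(catalog, indent=2))
```

For a real run, the catalog would be loaded from properties.json with json.load and written back with json.dump before invoking the tap.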

tap-salesforce --config config.json --properties properties.json [--state state.json]

Copyright © 2017 Stitch
