workday / prism-python
Python client library for interacting with Workday’s Prism API.
License: Apache License 2.0
Hi Team, hopefully this is the right place to ask; if not, I'd appreciate it if you could direct me.
I'm the founder of cloudquery.io, a high performance open source ELT framework.
Our users are interested in a Workday plugin, but as we cannot maintain all the plugins ourselves, I was curious if this would be an interesting collaboration, where we would help implement an initial source plugin, and you will help maintain it.
This will give your users the ability to sync Workday data to any of their datalakes/data-warehouses/databases easily using any of the growing list of CQ destination plugins.
Best,
Yevgeny
A quick fix, but this:
# create an empty API table with your schema
table = prism.create_table("my_new_table", schema=schema["fields"])
Should be:
table = prism.create_table(p, "my_new_table", schema=schema["fields"])
Documentation is missing for the class attribute version.
Line 71 in dd9a7c0
Hi, I'm thinking of contributing to this repo and I'm familiarizing myself with the code. I noticed a few whitespace issues (found with flake8) in two of the files: _version.py and versioneer.py.
An example of the flake8 output for the current file has been attached. Please see the E203 "whitespace before ':'" entries.
While this is a relatively small issue and the changes would not affect the performance, I would still like to submit a PR for this and will do so. If it is decided that the changes will not be merged I can move on to trying to look at the other issues.
Version 2 is the future of the Prism API. In order to evolve prism-python to incorporate more features of V2 of the Prism API, we need to deprecate support for V1.
In an effort to make prism-python as intuitive as possible, we will be renaming many of the existing functions to be better aligned with the nomenclature of the Prism API V2 (e.g.: tables instead of datasets).
While these changes will be considered "breaking", we believe the benefit of increased functionality outweighs the cost of some minor refactoring after the new version is released.
When calling the function prism.upload_file(), an error can occur yet the process continues. Instead of just logging the error, we should raise an exception to break the process.
Line 224 in df35c67
Add an optional parameter named fields that contains the JSON schema for the table:
Hard coding the bucket schema makes it so that no other file configuration besides this can be used (e.g.: change delimiter, skip rows, etc.)
Instead, we should look to see if we can dynamically pick up the existing schema from the API.
Lines 390 to 402 in 276b5ed
Within the file prism.py, the following:
Line 265 in 790a0f8
Should be changed to: url = self.prism_endpoint + "/wBuckets"
Error message returned when using prism upload
Add a simple unit test to make sure the package successfully imports. Unit tests can be expanded in the future to offer greater coverage.
When creating a new wBucket, the name is created as follows:
Line 179 in 89f7902
This name should be changed to something like prismpython_123456. This change will enable better auditing of wBuckets created using this package.
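A minimal sketch of how such a name could be generated. The helper name make_bucket_name and the choice of a random six-digit suffix are assumptions for illustration; the issue only says the name should look "something like" prismpython_123456.

```python
import random
import string


def make_bucket_name(prefix: str = "prismpython", digits: int = 6) -> str:
    """Return a name such as 'prismpython_123456' (hypothetical helper)."""
    # Random digits keep names distinct while the fixed prefix makes
    # buckets created by this package easy to spot when auditing.
    suffix = "".join(random.choices(string.digits, k=digits))
    return f"{prefix}_{suffix}"


print(make_bucket_name())
```

A timestamp suffix would serve the same auditing purpose; random digits are used here only because they match the shape of the example name.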
Documentation should be built for this project using Sphinx. The documentation should be built and uploaded to gh-pages using the GitHub action Sphinx to GitHub Pages V3. To use this feature, we must enable the GitHub action to build and deploy the Sphinx documentation as described here.
One way of finding the table_wid of an existing table is as follows:
all_tables = p.list_table()
for table in all_tables['data']:
    if table['name'] == "my_new_table":
        print(table)
        break
This snippet of code is not only wordy but also error-prone if you have more than 100 tables, as that is the maximum returned from list_tables().
I propose we create a new function that is something like find_table("table_name") that will search all of your existing tables, on all available pages of results, and if the search string is found, that table will be returned. If multiple tables contain the same search string, maybe we return them all in a list? The function should also indicate what has or has not been found through logging messages to the user.
# find the table
> table = p.find_table("table_name_BDS")
2020-12-07 01:07:39 INFO: Found 1 table(s) containing "table_name_BDS"
# inspect table data
> type(table)
dict
# find the table
> table = p.find_table("BDS")
2020-12-07 01:07:39 INFO: Found 10 table(s) containing "BDS"
# inspect table data
> type(table)
list
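A rough sketch of how the proposed search might page through results. Here fetch_page is a hypothetical callable standing in for whatever the client uses to request one page of tables; its offset/limit parameters and the page size of 100 are assumptions. Only the 'data' and 'name' keys come from the snippet above.

```python
import logging
from typing import Callable, Dict, List, Union

logging.basicConfig(
    format="%(asctime)s %(levelname)s: %(message)s", level=logging.INFO
)

Table = Dict[str, object]


def find_table(
    fetch_page: Callable[[int, int], Dict[str, List[Table]]],
    search: str,
    page_size: int = 100,
) -> Union[Table, List[Table], None]:
    """Search every page of tables for names containing `search`.

    `fetch_page(offset, limit)` is a hypothetical callable returning one
    page of results shaped like {"data": [{"name": ...}, ...]}.
    """
    matches: List[Table] = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size).get("data", [])
        matches.extend(t for t in page if search in str(t.get("name", "")))
        if len(page) < page_size:  # a short page means nothing is left
            break
        offset += page_size

    logging.info('Found %d table(s) containing "%s"', len(matches), search)
    if not matches:
        return None
    # Mirror the proposal: one hit comes back as a dict, several as a list.
    return matches[0] if len(matches) == 1 else matches
```

Returning either a dict or a list follows the proposal as written, but it forces callers to check the type; always returning a list may be simpler to consume.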
If a request fails, we should retry it before moving on. The parameters for retry (e.g.: number of attempts, back-off time, etc.) should be configurable by the user, but we should choose sensible defaults.
Check out #35636367 on StackOverflow for an example of how the Retry class can be mounted to a Requests session. For more information about the Retry class, refer to the documentation.
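For illustration, mounting a Retry object on a Requests session might look roughly like this. The function name make_session and the default values (3 attempts, 0.5 back-off factor, the listed status codes) are assumptions, not settings chosen by this project.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_session(total: int = 3, backoff_factor: float = 0.5) -> requests.Session:
    """Build a session that retries failed requests (hypothetical helper)."""
    retry = Retry(
        total=total,                    # maximum number of retry attempts
        backoff_factor=backoff_factor,  # scales the sleep between attempts
        status_forcelist=(429, 500, 502, 503, 504),  # statuses worth retrying
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    # Every request sent through this session to these schemes uses the adapter.
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


session = make_session()
```

Note that by default Retry does not retry POST requests on the listed status codes, so uploads would need additional configuration if they should be retried as well.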
The following section, among others:
$ export workday_base_url=<INSERT WORKDAY BASE URL HERE>
$ export workday_tenant_name=<INSERT WORKDAY TENANT NAME HERE>
$ export prism_client_id=<INSERT PRISM CLIENT ID HERE>
$ export prism_client_secret=<INSERT PRISM CLIENT SECRET HERE>
$ export prism_refresh_token=<INSERT PRISM REFRESH TOKEN HERE>
Should be changed to:
export workday_base_url=<INSERT WORKDAY BASE URL HERE>
export workday_tenant_name=<INSERT WORKDAY TENANT NAME HERE>
export prism_client_id=<INSERT PRISM CLIENT ID HERE>
export prism_client_secret=<INSERT PRISM CLIENT SECRET HERE>
export prism_refresh_token=<INSERT PRISM REFRESH TOKEN HERE>
If the verbose flag isn't passed, only print out the table name and table id. If the verbose flag is passed, print out all the details.
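A small sketch of the formatting logic this describes. The function name format_tables and the 'id' key are assumptions; only the 'name' key appears elsewhere in these issues.

```python
import json
from typing import Dict, List


def format_tables(tables: List[Dict[str, object]], verbose: bool = False) -> str:
    """Render tables for CLI output (hypothetical helper)."""
    if verbose:
        # Verbose: dump every field of every table.
        return json.dumps(tables, indent=2)
    # Default: one line per table with just the name and id.
    return "\n".join(f"{t.get('name')}\t{t.get('id')}" for t in tables)


sample = [{"name": "my_new_table", "id": "abc123", "documentation": "demo"}]
print(format_tables(sample))
print(format_tables(sample, verbose=True))
```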
What is Versioneer?
This is a tool for managing a recorded version number in distutils-based python projects. The goal is to remove the tedious and error-prone "update the embedded version string" step from your release process. Making a new release should be as easy as recording a new tag in your version-control system, and maybe making new tarballs.
With the recent changes to move to focus on tables instead of datasets, the CLI has become outdated and needs attention.
Currently, if a request is not successful, the error code is returned. However, additional information could be returned to better identify the issue.
For example, if p.complete_bucket(bucket["id"]) returns a 400 status code, we should also return the content of the response.
> r.content
b'{"error":"invalid request: validation errors","errors":[{"error":"A different wBucket is in the Processing stage and is currently loading data to the same target table."}]}'
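One possible shape for this, sketched under assumptions: PrismRequestError, describe_error and check_response are hypothetical names, and the parsing relies only on the "error" and "errors" keys visible in the response body above. Raising instead of returning the status code would also stop the process when a call such as an upload fails.

```python
import json


def describe_error(content: bytes) -> str:
    """Pull human-readable messages out of an error response body."""
    try:
        body = json.loads(content)
    except ValueError:
        # Not JSON: fall back to the raw text.
        return content.decode("utf-8", errors="replace")
    if not isinstance(body, dict):
        return str(body)
    details = [
        e.get("error", "") for e in body.get("errors", []) if isinstance(e, dict)
    ]
    summary = body.get("error", "")
    return "; ".join(part for part in [summary, *details] if part)


class PrismRequestError(Exception):
    """Hypothetical exception carrying the status code and response body."""

    def __init__(self, status_code: int, content: bytes):
        self.status_code = status_code
        self.content = content
        super().__init__(f"HTTP {status_code}: {describe_error(content)}")


def check_response(status_code: int, content: bytes) -> None:
    """Raise instead of silently returning an error status code."""
    if status_code >= 400:
        raise PrismRequestError(status_code, content)
```

A caller would pass r.status_code and r.content after each request, so the validation message reaches the user rather than only the number 400.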