libify's Introduction

Libify

Libify makes it easy to import notebooks in Databricks. Notebook imports can also be nested to create complex workflows easily. Supports Databricks Runtime Version 5.5 and above.

Installation
  1. Click the Clusters icon in the sidebar
  2. Click a cluster name (make sure the cluster is running)
  3. Click the Libraries tab
  4. Click Install New
  5. Under Library Source, choose PyPI
  6. Under Package, write libify
  7. Click Install

Typical Usage

After installing the package, add the following code snippets to the notebooks:

  1. In the importee notebook (the notebook to be imported), add the following cell at the end of the notebook. Make sure that dbutils.notebook.exit is not used anywhere in the notebook and that the last cell contains exactly the following snippet and nothing else:

    import libify
    libify.exporter(globals())
  2. In the importer notebook (the notebook that imports other notebooks), first import libify:

    import libify

    and then use the following code to import the notebook(s) of your choice:

    mod1 = libify.importer(globals(), '/path/to/importee1')
    mod2 = libify.importer(globals(), '/path/to/importee2')

    Everything defined in importee1 and importee2 is now contained in the namespaces mod1 and mod2 respectively, and can be accessed using dot notation, e.g.

    x = mod1.function_defined_in_importee1()
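To make the dot-notation access concrete, here is a minimal sketch using only the standard library. It is not libify's implementation: `make_namespace`, `double`, and `ANSWER` are hypothetical names, and the sketch only illustrates how definitions collected from a notebook's globals could be exposed as attributes of a module-like object.

```python
import types


def make_namespace(name, definitions):
    """Build a module-like namespace from a dict of definitions (hypothetical helper)."""
    mod = types.ModuleType(name)
    for key, value in definitions.items():
        # Skip private and dunder entries so only public names are exposed.
        if not key.startswith("_"):
            setattr(mod, key, value)
    return mod


def double(x):
    return 2 * x


# Stand-in for the globals() of an importee notebook.
importee_globals = {"double": double, "ANSWER": 42, "_private": "hidden"}
mod1 = make_namespace("importee1", importee_globals)

# Public definitions are reachable with dot notation.
result = mod1.double(mod1.ANSWER)
```

The importer notebook's own globals stay untouched in this sketch, which is the main benefit of a namespace over a flat import.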

Databricks Community Cloud Workaround

Databricks Community Cloud (https://community.cloud.databricks.com) does not allow calling one notebook from another notebook, but notebooks can still be imported using the following workaround. However, both of the following steps will have to be run each time a cluster is created/restarted.

  1. Run step 1 from above (Typical Usage). Make a note of the output of the last cell (only the dictionary portion, which was highlighted in the original screenshot).

  2. In the importer notebook, call libify.importer with the config parameter as the dictionary obtained from the previous step:

    import libify
    mod1 = libify.importer(globals(), config={"key": "T5gRAUduh9uSbhHIrj2c9R4UbrXUt2WiA4aYIpl3gGo=", "file": "/tmp/tmpmcoypj24"})

libify's People

Contributors

vagrantism

libify's Issues

About importing function

Code that I run:

    !pip install libify
    import libify
    io = libify.importer(globals(), '/Users/[email protected]/src/input_output/test_notebook')
    function = io.cast_days()

Error that I get:

    AttributeError: module 'libified_' has no attribute 'cast_days'

    AttributeError Traceback (most recent call last)
    in
    ----> 1 function = io.cast_days()

    AttributeError: module 'libified_' has no attribute 'cast_days'

Code in the notebook:

    import pandas as pd
    import libify

    def cast_days(days):
        if not isinstance(days, pd.DatetimeIndex):
            return pd.DatetimeIndex(list(days))
        return days

    libify.exporter(globals())

libified functions can't be pickled

Hey,

Thanks a lot for the lib, very useful! :) There is however one annoying issue with it at the moment.

Whenever we need to use a function imported with libify inside a DataFrame foreach/foreachPartition, we get one of these errors:

    ModuleNotFoundError: No module named 'libified_'

or

    Can't pickle <class 'libified_.BW'>: it's not found as libified_.BW

My use case is the following: I want to create a database connection for each partition to do a batch export from a Spark DataFrame using https://github.com/SAP/PyHDB

If I create a pyhdb connection directly with pyhdb.connect(...) inside my foreachPartition, it works fine. But if I use a library imported with libify that does

    def get_connection():
      return pyhdb.connect(...)

it fails with the errors mentioned above.
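One plausible explanation, offered as an assumption rather than a confirmed diagnosis: Python's pickle stores functions and classes by reference (module name plus attribute name), so an object defined in a module that was created at runtime can only be restored where that module can be looked up. The sketch below reproduces the effect with the standard `pickle` module only. The module name `libified_demo` is made up, and Spark's own serializer may behave differently in detail.

```python
import pickle
import sys
import types


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


# Build a module at runtime, loosely mimicking a dynamically created
# namespace. "libified_demo" is a hypothetical name for this example.
source = "def get_connection():\n    return 'connection'\n"
mod = types.ModuleType("libified_demo")
exec(source, mod.__dict__)

# The module is not registered in sys.modules, so pickle cannot
# resolve "libified_demo.get_connection" by name.
before = can_pickle(mod.get_connection)

# Registering the module makes the lookup succeed in this process only;
# a separate worker process would still need to be able to import it.
sys.modules["libified_demo"] = mod
after = can_pickle(mod.get_connection)

print(before, after)
```

If this is indeed the cause, defining the small wrapper function directly in the notebook that calls foreachPartition, as the report already observes with pyhdb.connect(...), avoids the by-reference lookup on the workers.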

Parse Exception

Hi Team,

I am facing a parse exception on libify.importer(globals(), config={}). In config I am passing the key and file path through a JSON. See the screenshot below:

image

I also tried passing the key and file this way:

    libify.importer(globals(), config={"key"= "afdasdfsadfsf" , "file" : "/abc/abc"})

Can someone please help me here to understand the issue and its resolution?

Thanks,
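Judging only from the second snippet, one likely cause is the `=` after "key": a Python dict literal separates every key from its value with `:`, so `{"key"= ...}` does not parse. The sketch below is an illustration under that assumption. The values are the placeholders from the report, `is_valid_python` is a hypothetical helper, and libify itself is not called.

```python
import json

# Placeholder values copied from the report; not a real key or path.
raw = '{"key": "afdasdfsadfsf", "file": "/abc/abc"}'
config = json.loads(raw)  # JSON text -> Python dict

# Equivalent dict literal: ":" after every key, never "=".
config_literal = {"key": "afdasdfsadfsf", "file": "/abc/abc"}


def is_valid_python(expr):
    """Return True if expr parses as a Python expression (hypothetical helper)."""
    try:
        compile(expr, "<config>", "eval")
        return True
    except SyntaxError:
        return False


bad = '{"key"= "afdasdfsadfsf" , "file" : "/abc/abc"}'
good = '{"key": "afdasdfsadfsf", "file": "/abc/abc"}'
print(is_valid_python(bad), is_valid_python(good))
```

Either `config` or `config_literal` could then be passed as the config parameter, in the same form as the dictionary shown in the Community Cloud workaround.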

Libify speed & debugging

Hi,

this is more of a question than a bug report. I just implemented the lib to organize code for my somewhat complex Databricks project. However, loading is significantly slower than with the standard %run './utils': it went from seconds to minutes.

Is there something I could do to improve the loading speed?

Also, how do you debug such code when the actual error messages are not reported in the importing notebook?

Thanks.

Br,

MF

Missing the dependencies setting in setup.py

Summary

setup.py in this package seems to have no install_requires. As a result, importing libify fails with "ModuleNotFoundError: No module named 'cryptography'", because pip does not know the package's dependencies.

Version

v0.78

Steps to Reproduce

$ pip install libify
Collecting libify
  Downloading libify-0.78-py3-none-any.whl (4.3 kB)
Installing collected packages: libify
Successfully installed libify-0.78

$ python -c "import libify"
Traceback (most recent call last):    
  File "<string>", line 1, in <module>
  File "/.../python3.8/site-packages/libify/__init__.py", line 3, in <module>
    from cryptography.fernet import Fernet
ModuleNotFoundError: No module named 'cryptography'

Possible Solution

Declaring the dependencies in setup.py should let pip find and install them correctly.

--- a/setup.py
+++ b/setup.py
@@ -20,4 +20,7 @@ setup(name='libify',
         'Operating System :: OS Independent',
     ],
     python_requires='>=3.5',
+    install_requires=[
+      'cryptography>=37.0.2',
+    ],
     zip_safe=False)
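As a quick check that is independent of the proposed fix, the sketch below lists which top-level modules from a given set cannot be found in the current environment. `missing_modules` is a hypothetical helper, not part of libify; `cryptography` is the module named in the traceback above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located (hypothetical helper)."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "cryptography" is the import that fails in the traceback above.
print(missing_modules(["cryptography"]))
```

An empty list would mean the dependency is already present, for example because another package pulled it in, which may explain why the problem does not appear in every environment.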
