Comments (17)
For operators, you can derive BaseOperator anywhere outside of the Airflow code and use them in your DAGs (we do that internally for operators that aren't relevant to the open source community).
Executors are a little more deeply embedded in the code, as you pointed out. If you need a hook in https://github.com/mistercrunch/airflow/blob/master/airflow/executors/__init__.py I'll be happy to accept it. It could be something like:
    try:
        from airflow_custom import DEFAULT_EXECUTOR
    except ImportError:
        DEFAULT_EXECUTOR = None
And then somehow have this take precedence over the if statements underneath.
Then all you need is an airflow_custom.py module in your environment that defines DEFAULT_EXECUTOR as a derivative of BaseExecutor.
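One way the precedence described above could work is sketched below. This is only an illustration, not the code in airflow/executors/__init__.py: the BaseExecutor and SequentialExecutor classes are stand-ins, and load_custom_executor and resolve_default_executor are hypothetical helper names.

```python
import importlib


class BaseExecutor:
    """Stand-in for Airflow's BaseExecutor (assumption, for illustration only)."""


class SequentialExecutor(BaseExecutor):
    """Stand-in for one of the built-in executors."""


def load_custom_executor(module_name="airflow_custom"):
    """Return DEFAULT_EXECUTOR from the optional module, or None if it is absent."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "DEFAULT_EXECUTOR", None)


def resolve_default_executor(configured_name, module_name="airflow_custom"):
    """Prefer the custom executor; otherwise fall back to the configured one."""
    custom = load_custom_executor(module_name)
    if custom is not None:
        if not isinstance(custom, BaseExecutor):
            raise TypeError("DEFAULT_EXECUTOR must derive from BaseExecutor")
        return custom
    # Hypothetical stand-in for the existing if statements.
    if configured_name == "SequentialExecutor":
        return SequentialExecutor()
    raise ValueError("Unknown executor: %s" % configured_name)
```

Catching only ImportError (rather than a bare except) keeps real errors inside airflow_custom.py visible instead of silently falling back to the default.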
from airflow.
To ease community development of operators, I think a plugin mechanism such as the one provided by Yapsy would be best. A plugin dir in the AIRFLOW dir could simply hold those airflow_XXX operators, and init would simply load the selected plugin (if it is not one of the base operators).
I can fork the project and propose such a solution with a pull request if you want.
In reply to Maxime Beauchemin's comment of Sat, 6 Jun 2015, 00:21.
from airflow.
Sounds perfect. Would the folder structure be $REPO/plugins/<plugin_set>/operators/<operator>.py or simply $REPO/plugins/operators/<operator>.py? How should we enable plugins or plugin sets? Load everything we find in the folders when present?
from airflow.
+1 for plugin architecture via Yapsy.
from airflow.
@osallou, I just read about Yapsy and this sounds like a great idea. We can squeeze a plugin_folder setting into airflow.cfg, and operators / executors / macros in that folder would get discovered and integrated.
Internally we have discussed the possibility of having plugins with UI components. That seems doable too, eventually.
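A rough sketch of what discovery from a plugin_folder might look like, using only the standard library rather than Yapsy. The BaseOperator class here is a stand-in and discover_plugins is a hypothetical helper; the real integration could differ substantially.

```python
import importlib.util
import inspect
import os


class BaseOperator:
    """Stand-in for Airflow's BaseOperator (assumption, for illustration only)."""


def discover_plugins(plugin_folder, base_classes):
    """Import every .py file in plugin_folder and group subclasses by base class name."""
    found = {base.__name__: [] for base in base_classes}
    for filename in sorted(os.listdir(plugin_folder)):
        # Skip non-Python files and private modules such as __init__.py.
        if not filename.endswith(".py") or filename.startswith("_"):
            continue
        path = os.path.join(plugin_folder, filename)
        spec = importlib.util.spec_from_file_location(filename[:-3], path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        for _, obj in inspect.getmembers(module, inspect.isclass):
            for base in base_classes:
                # Keep strict subclasses only, not the base class itself.
                if issubclass(obj, base) and obj is not base:
                    found[base.__name__].append(obj)
    return found
```

Calling discover_plugins(folder, [BaseOperator]) returns a dict keyed by base class name, so operators, executors and other plugin types can be handled separately.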
from airflow.
I have used Yapsy in several of my projects. It is really easy to use, and you can group plugins by "type" (operators, executors, ...). Then you only need to match your config with the available plugins.
Do you want me to code it and send a pull request or do you prefer to manage it yourself?
from airflow.
Regarding structure and Yapsy use, I think all plugins could go directly in a plugin dir (defined in the config, or $airhome/plugins by default); then you load the one defined in the config file (for operators).
With base objects, you can group plugins and load the expected one with something like:
    from yapsy.PluginManager import PluginManager

    # Build the manager
    simplePluginManager = PluginManager()
    # Tell it the default place(s) where to find plugins
    simplePluginManager.setPluginPlaces([plugins_dir_from_config])
    # Group plugins into categories by their base class
    simplePluginManager.setCategoriesFilter({
        "Operator": BaseOperator,
        "Executor": BaseExecutor
    })
    # Collect all plugins
    simplePluginManager.collectPlugins()
    # Get an instance of the executor defined in config
    for pluginInfo in simplePluginManager.getPluginsOfCategory("Executor"):
        if pluginInfo.plugin_object.get_name() == executor_defined_in_config:
            self.executor = pluginInfo.plugin_object
from airflow.
Sounds like we'd need a bit of code in operators/__init__.py to integrate the plugins that derive from BaseOperator if we want them to be namespaced there. They could also be namespaced under airflow.plugins.operators.PluggedInOperator.
I'm not sure how it's usually done, but it seems like it'd be nice to have them integrated in airflow.operators (same goes for executors and macros).
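As a minimal sketch of the namespacing idea, plugin classes could be attached as attributes of an existing module. The operators module below is a stand-in built with types.ModuleType, and integrate_plugins is a hypothetical helper, not something Airflow provides.

```python
import types


def integrate_plugins(namespace_module, plugin_classes):
    """Expose each plugin class as an attribute of namespace_module."""
    for cls in plugin_classes:
        # Refuse to shadow a name that the module already defines.
        if hasattr(namespace_module, cls.__name__):
            raise ValueError("Name clash: %s" % cls.__name__)
        setattr(namespace_module, cls.__name__, cls)


class PluggedInOperator:
    """Hypothetical operator that would come from a plugin."""


# Stand-in for the airflow.operators package (assumption).
operators = types.ModuleType("operators")
integrate_plugins(operators, [PluggedInOperator])
```

After this, operators.PluggedInOperator resolves to the plugin class, and the clash check prevents a plugin from silently replacing a built-in name.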
from airflow.
A plugin system would be useful. I just wrote a hook and sensor operator for RabbitMQ so I could fire off a task when a queue became empty. master...codewithcheese:rabbitmq_hook_sensor
from airflow.
I'm going to work on a plugin system over the next week; it seems pretty straightforward. @codewithcheese, we support defining pools of tasks in Airflow. Can your use case be handled by Airflow pools? http://pythonhosted.org/airflow/concepts.html#pools
from airflow.
Actually, I am using RabbitMQ for data processing in a different project and need to run some bash commands when it is complete. I shared that to show that a plugin system for hooks, and not necessarily only for operators, would be useful to me.
from airflow.
I'm starting work on a plugin system using Yapsy. I'll paste a link to the PR here when it's baked. I'm planning on integrating hooks, operators, macros, webviews and executors, and I think that's it for now. We have use cases internally, so that justifies the work.
from airflow.
Merged 22ac771
Documented here: http://pythonhosted.org/airflow/plugins.html
Let me know what you think
from airflow.
It's out on pypi (v1.1.0)
from airflow.
Sounds nice and fits the need.
Thanks
In reply to Maxime Beauchemin's comment of Wed, 17 Jun 2015, 03:09.
from airflow.
Sweet, thanks! I'll try it out. Now I can stop maintaining a fork for my deployment.
from airflow.