Comments (24)
I'm OK with removing them since they have been moved into Gluten now. Let's see if the Voltron Data guys still want to work on it.
from velox.
I'm asking in the substrait slack as well (substrait.slack.com)
from velox.
I was considering creating a velox-substrait repo
A separate repo would be preferred. Thanks.
from velox.
thanks for the fast response!
from velox.
I'll try to have drin/velox-substrait#1 finished soon so that you can resolve this issue.
from velox.
I don't think anyone else can use the version that's inside Gluten. Ideally the Substrait->Velox code would live in its own repository so it can be used by anyone needing a Substrait gateway to Velox (similar to datafusion's substrait offering).
from velox.
I don't think anyone else can use the version that's inside Gluten. Ideally the Substrait->Velox code would live in its own repository so it can be used by anyone needing a Substrait gateway to Velox (similar to datafusion's substrait offering).
Yes, but the problem now is that it's not mature yet and no one is developing or maintaining the code. I don't know if anyone still uses it. In the future, if someone takes ownership of the work, they can start from the current code.
The Gluten team currently has no resources to work on it; it's not high-priority work for us. If Voltron Data is going to make it mature, Gluten will switch from our internal one to Velox's.
from velox.
Is velox open to someone else maintaining the code in a separate repository as @EpsilonPrime mentioned?
I am researching the use of substrait to span engines such as Acero, DuckDB, etc. I would be willing to be a maintainer for a separate repository for the foreseeable future; or, I can coordinate with VoltronData or @EpsilonPrime on some sort of shared maintainership.
from velox.
With a very brief look at the code, it seems like the code doesn't need to be compiled in-tree, but let me know if that's incorrect.
from velox.
Is velox open to someone else maintaining the code in a separate repository as @EpsilonPrime mentioned?
Definitely. You may contribute if you need to. The code in Velox is not complete yet. You may refer to the one Gluten currently is using:
https://github.com/oap-project/gluten/tree/main/cpp/velox/substrait
The Substrait version we used is pretty old, so you may want to upgrade to the latest version. Also, Substrait itself is not complete yet. There should be some PRs to add functions.
from velox.
Is velox open to someone else maintaining the code in a separate repository as @EpsilonPrime mentioned?
Sure. That works. Thanks.
from velox.
You may refer to the one Gluten currently is using
Thanks for the pointer. Was the substrait code donated to the Gluten codebase? The licenses attribute different organizations.
If Voltron Data is going to make it mature, Gluten will switch from our internal one to Velox's.
Are Gluten developers maintaining the substrait -> velox code? What would be the process for Gluten to switch from an internal implementation to an external implementation? I'm wondering if I can work with Gluten developers to move the code out sooner rather than later.
Probably for both of the above I should just ask Gluten devs anyways, but figured I would ask here in case you know.
from velox.
Thanks for the pointer. Was the substrait code donated to the Gluten codebase? The licenses attribute different organizations.
No, the Substrait support in Velox was initially added by the Gluten team. Gluten only holds the code that has not been upstreamed to Velox yet. But over time the held-back code kept growing, so we stopped the upstreaming work. Your contribution should be to Velox directly. Not Gluten.
Are Gluten developers maintaining the substrait -> velox code? What would be the process for Gluten to switch from an internal implementation to an external implementation? I'm wondering if I can work with Gluten developers to move the code out sooner rather than later.
Probably for both of the above I should just ask Gluten devs anyways, but figured I would ask here in case you know.
Gluten developers only maintain the substrait -> Velox code in the Gluten repo, not the one in Velox now. You may port the code from Gluten and upstream it to Velox, or you can start fresh. Either way works. If the conversion in Velox is mature enough, we will remove the code in Gluten and switch to the Velox one.
from velox.
Your contribution should be to Velox directly.
If the conversion in Velox is mature enough
To clarify: it is preferred that I just help maintain the substrait conversion in the velox repo itself?
I was considering creating a velox-substrait repo and porting most of the CMakeLists.txt from gluten to place in the new repo. Then, the code can be compiled out of the velox source tree and any systems that use velox can leverage that as a submodule or something similar. However, it sounds like this is not the preferred approach?
Would a preferred approach be to somehow facilitate changes upstreamed from gluten into velox?
The reason I ask is that it seems that if gluten is willing to use substrait conversion in the velox repo, but they're maintaining their own version already, what is the barrier to them just migrating the code themselves and working in the velox repo?
I also want to clarify that I am currently a grad student who is not funded by anything gluten or velox related, so if some of the need is for someone to review changes to upstream into velox then I probably would not be able to keep up with that.
from velox.
@drin Thanks.
from velox.
The reason I ask is that it seems that if gluten is willing to use substrait conversion in the velox repo, but they're maintaining their own version already, what is the barrier to them just migrating the code themselves and working in the velox repo?
Because upstreaming to Velox still needs a lot of cleanup work and takes a lot of time. It's not high-priority work for Gluten.
If you are going to set up a new repo, let's remove the code in Velox now.
from velox.
@mbasmanova Is it possible to host the substrait-Velox repo as a side project of Velox?
from velox.
@FelixYBW Binwei, what do you mean by "side project"?
from velox.
I mean hosting the repo somewhere like https://github.com/facebookincubator/substrait-velox
That way we needn't maintain it in Velox. People who want to maintain it can submit PRs to the new repo. In the future, if it's really commonly used, we can move it back to Velox.
from velox.
@FelixYBW I don't think we can create a repo like that without maintaining it.
from velox.
@FelixYBW do you want to maintain the repo in a separate org? maybe we can ask to keep it in https://github.com/substrait-io/?
from velox.
@FelixYBW do you want to maintain the repo in a separate org? maybe we can ask to keep it in https://github.com/substrait-io/?
Yes, it's another choice. @EpsilonPrime
from velox.
The initial responses seem to be the following:
I'd like to see at least one committer and one non-committer that are invested in maintaining such a project
and
Presumably a new repo would require an SMC vote
So, I think for a short-term solution either me owning it or making an unofficial "incubation" organization is the most timely solution
from velox.