Comments (21)
In order to support Stream UDF, we need a Lua runtime in the client. We have weighed using the C Lua runtime, but haven't been comfortable with that idea thus far.
I will leave this open, make it an "enhancement" and rename the title.
from aerospike-client-go.
+1
from aerospike-client-go.
According to the documentation, UDFs run inside the database server.
Why do we need a Lua runtime in the client?
Is there anything I've misunderstood?
from aerospike-client-go.
Stream UDFs (not to be confused with simple UDFs) provide Map/Reduce functionality.
Each node applies the map/reduce functions to its own data, and then each node's final result is sent to the client. The final reduce stage is done on the client side, hence Lua is needed on the client side.
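The flow described above can be sketched in plain Go. This is only an illustration, not the client's implementation: the `node` type, `localAggregate`, and `clientAggregate` are hypothetical names, the data is made up, and a sum is assumed as the reduce function so that 0 can serve as the starting value.

```go
package main

import "fmt"

// node is a hypothetical stand-in for one cluster node and the values it stores locally.
type node struct {
	values []int
}

// localAggregate mimics the server-side part: map every local value,
// then reduce on the node so that only ONE partial result leaves it.
// Assumption: 0 is a valid starting value for reduceFn (true for a sum).
func localAggregate(n node, mapFn func(int) int, reduceFn func(a, b int) int) int {
	acc := 0
	for _, v := range n.values {
		acc = reduceFn(acc, mapFn(v))
	}
	return acc
}

// clientAggregate mimics the client: it receives one partial result per node
// and runs the same reduce function over those partials.
func clientAggregate(nodes []node, mapFn func(int) int, reduceFn func(a, b int) int) int {
	total := 0
	for _, n := range nodes {
		total = reduceFn(total, localAggregate(n, mapFn, reduceFn))
	}
	return total
}

func main() {
	nodes := []node{
		{values: []int{1, 2, 3}},
		{values: []int{10, 20}},
		{values: []int{100}},
	}
	double := func(v int) int { return v * 2 }
	sum := func(a, b int) int { return a + b }
	fmt.Println(clientAggregate(nodes, double, sum)) // prints 272
}
```

The point of the sketch is that the client has to run the same reduce function the nodes ran, which is why it needs a way to execute the registered Lua code.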
from aerospike-client-go.
Got it, Thanks @khaf
from aerospike-client-go.
Any update on when we'll see stream udfs in the go client? We're currently wrapping the c client using cgo for this call. Thanks!
from aerospike-client-go.
Sorry, I still don't get it.
We do register the .lua file on the server, so the server has the function for the reduce step.
Why can't a "random" node in the cluster do the reduce step for us?
from aerospike-client-go.
@cstivers78 can you share current wins and problems? Maybe we can somehow help with this feature?
from aerospike-client-go.
@nizsheanez we simply did not design it this way. You would have to ship all the data to that "random node", which would dramatically unbalance performance. Perhaps we should have it as a backup for clients that don't support a Lua scripting system, but we have not yet done that. The Go aggregation work is behind another critical feature, and as we use an agile scheduling system we can't say with accuracy when the work for this will be done. It is certainly high on our priority list.
from aerospike-client-go.
@abramovic, can you show how you deal with the C bindings?
from aerospike-client-go.
@bbulkow, can we have some pattern for filtering without a reduce step (not only by secondary index)?
For example:
In the map step:
Check that the 1st bin is true, the 2nd bin is true, and the 3rd bin is true.
Then write the "id" bin of the record to some key (append it to some large list or something).
from aerospike-client-go.
@khaf, sorry for the stupid question. Why can't we do the reduce step in Go (no Lua)?
from aerospike-client-go.
When you write your Map/Reduce in Lua, all map steps and almost all reduce steps happen on server nodes, without any data streaming to the client. Each node will return ONE result to the client, and the client will do the last <num nodes> - 1 reduce actions on the client side.
If you wanted to do this in Go, you would have to stream ALL data from database nodes to your client (could be Terabytes) to do the reduction there, which is awfully inefficient, even for small amounts of data.
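To make the "<num nodes> - 1" count concrete, here is a small Go sketch that folds per-node partial results and counts the reduce calls. The function name `finalReduce`, the sample partials, and the choice of "max" as the reduce function are all assumptions made for illustration.

```go
package main

import "fmt"

// finalReduce folds the per-node partial results on the client and
// reports how many reduce calls were needed to do so.
func finalReduce(partials []int, reduceFn func(a, b int) int) (result int, calls int) {
	if len(partials) == 0 {
		return 0, 0
	}
	result = partials[0]
	for _, p := range partials[1:] {
		result = reduceFn(result, p)
		calls++
	}
	return result, calls
}

func main() {
	// One hypothetical partial result per node in a 4-node cluster.
	partials := []int{12, 60, 200, 7}
	maxOf := func(a, b int) int {
		if a > b {
			return a
		}
		return b
	}
	result, calls := finalReduce(partials, maxOf)
	fmt.Println(result, calls) // prints 200 3
}
```

With four nodes the client performs three reduce calls over four small values, regardless of how many records each node scanned.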
from aerospike-client-go.
@khaf, thanks for the detailed answer.
I understand that filtering and aggregation will execute on the nodes.
But why can't the last small client-side reduce step be done in Go? Why must we do it in Lua?
from aerospike-client-go.
You can do it in Go as well if you don't mind. Example:
const udfFilter = `
local function map_record(record)
  -- Add name and age to returned map.
  -- Could add other record bins here as well.
  -- This code shows different data access to record bins
  return map {bin4=record.bin1, bin5=record["bin2"]}
end

function filter_records(stream, name)
  local function filter_name(record)
    -- Here is your compound where clause
    return (record.bin1 == -1) and (record.bin2 == name)
  end
  return stream : filter(filter_name) : map(map_record)
end
`

regTask, err := client.RegisterUDF(nil, []byte(udfFilter), "udfFilter.lua", LUA)
panicOnErr(err)

// wait until UDF is created
err = <-regTask.OnComplete()
panicOnErr(err)

// setup statement
stm := NewStatement(namespace, set)
// set the predicate, index the field with better selectivity
stm.Addfilter(NewRangeFilter(bin3.Name, 0, math.MaxInt16/2))
// This is where UDF is set for filter
stm.SetAggregateFunction("udfFilter", "filter_records", []Value{NewValue("Aeropsike")}, true)

recordset, err := client.Query(policy, stm)
panicOnErr(err)

for rec := range recordset.Records {
	fmt.Println(rec.Bins)
}
This would return rec.Bins as:
map[SUCCESS:map[bin4:constValue bin5:1]]
map[SUCCESS:map[bin4:constValue bin5:19]]
map[SUCCESS:map[bin4:constValue bin5:3]]
map[SUCCESS:map[bin4:constValue bin5:7]]
map[SUCCESS:map[bin4:constValue bin5:162]]
map[SUCCESS:map[bin4:constValue bin5:215]]
map[SUCCESS:map[bin4:constValue bin5:122]]
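Once rows like these arrive, the last reduce step can be written directly in Go. The sketch below is an assumption-laden illustration: `sumBin5` is a hypothetical helper, the rows are rebuilt by hand from the printed output, and the nested value under "SUCCESS" is assumed to be a `map[interface{}]interface{}` holding `int` values, which may differ from what your client version actually returns.

```go
package main

import "fmt"

// sumBin5 adds up bin5 across result rows shaped like the printed output.
// Assumption: each row has a "SUCCESS" key holding a nested map whose
// "bin5" entry is an int; adjust the type assertions to your client version.
func sumBin5(rows []map[string]interface{}) (int, error) {
	total := 0
	for i, row := range rows {
		inner, ok := row["SUCCESS"].(map[interface{}]interface{})
		if !ok {
			return 0, fmt.Errorf("row %d: no SUCCESS map", i)
		}
		v, ok := inner["bin5"].(int)
		if !ok {
			return 0, fmt.Errorf("row %d: bin5 is not an int", i)
		}
		total += v
	}
	return total, nil
}

func main() {
	// Rows rebuilt by hand from the sample output; not fetched from a server.
	var rows []map[string]interface{}
	for _, v := range []int{1, 19, 3, 7, 162, 215, 122} {
		rows = append(rows, map[string]interface{}{
			"SUCCESS": map[interface{}]interface{}{"bin4": "constValue", "bin5": v},
		})
	}
	total, err := sumBin5(rows)
	if err != nil {
		panic(err)
	}
	fmt.Println(total) // prints 529
}
```

In a real program the loop body would run inside `for rec := range recordset.Records`, using `rec.Bins` in place of the hand-built rows.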
from aerospike-client-go.
Wow-Wow!
This is what I was trying to find!
Going to test it...
Many thanks, @khaf
from aerospike-client-go.
Maybe this Lua VM in pure Go could help:
https://github.com/Shopify/go-lua
from aerospike-client-go.
Thanks. We are already using it to bring the Aggregation to the client.
from aerospike-client-go.
Great, can I use this feature now?
Is there a roadmap for "full Lua reduce support"?
from aerospike-client-go.
Not yet, but I hope before the end of the year.
from aerospike-client-go.
All, we just released this feature. Please give it a spin and let us know what you think.
from aerospike-client-go.