Comments (5)
Hi Alexey !
So far there are no NULLs in Alenka. I agree that it would be a good idea to have NULLs in database and that storing NULLs as maximum values would be the best way to implement it. However it would complicate operations on numbers - for arithmetic operations we would have to check for NULL values before adding or multiplying vectors.
It also would be necessary to implement functions (like NVL in Oracle ) to convert NULLS to some values.
So probably it is not critical to implement NULLS right now - for now NULL values become 0's and empty strings.
I would rather focus now on improving performance and may be getting some real projects to use Alenka.
Btw, the exception that you got from loading files is a bug, so I fixed it.
Best regards,
Anton
from alenka.
Hi Anton!
Big thanks for your fix!
Yes, I am now testing on a small part of the real data from the customer's project and faced with such difficulties.
For testing right now on a real project hampered by the absence of the free GPU at the customer at this moment. But I can download a small part of data, that will skipped the automatic security system to test the project on my computer.
You are right, that it would complicate operations on numbers. But when joining tables the maximum value will be much less likely to coincide with the real values, ββthan it will happen with 0. And in this case we can't solve this problem.
Please, could you implement it, an example through #define - by switching between 0 and a maximum values, for integers and decimals? And NULL strings also always remain empty - all chars are 0.
In this case, for arithmetic operations on numbers, the maximum value converts to zero through easily implementing functions Oracle decode/nvl:
SQL: select sum(nvl(field, 0)) from table;
C++: nvl(field, 0);
template <typename T>
__host__ __device__ inline T decode(T &field, T const& compare_val, T const& then_val, T const& else_val)
{
return (field == compare_val)? then_val: else_val;
}
template<typename T>
__host__ __device__ inline T nvl(T &field, T const& then_val)
{
decode(field, NULL_VAL, then_val, field);
}
from alenka.
Hi Alexey !
You are right about joining on NULL values - they can conflict with existing valid 0 values.
I will have to modify a bit the compression routines which check min and max values of vectors and also the data load procedure and arithmetic calculations as well. I suppose that you still want NULLS to be converted to 0's in calculations ? That is, 1 + 3 + NULL = 4 ? Or the Oracle way, 1 + 3 + NULL = NULL ?
Anton
from alenka.
Hi Anton!
If this is not difficult, it is better as in Oracle: 1 + 3 + NULL = NULL
- In expression behaves as maximum value, in summation behaves as overflow and saves its maximum value, i.e. 1 + 3 + NULL = NULL.
- In the aggregate functions behaves as 0. An example sum(field) equal to: sum(nvl(field, 0)). And count(field) equal to: sum(decode(field, NULL, 0, 1)).
- In JOIN behaves as maximum value. But NULL must not match in join. (An example, if using sort-merge-join and ordered-group-by, then before this, sorted vectors must reduce size to exclude maximum values.)
Behavior as in Oracle in the examples:
select 2 + NULL from dual;
// result: NULL
with t1 as (
select NULL as f1, 'a' as f2 from dual
union all
select 2 as f1, 'b' as f2 from dual
)
select sum(f1) from t1;
// result: 2
with t1 as (
select NULL as f1, 'a' as f2 from dual
union all
select 2 as f1, 'b' as f2 from dual
), t2 as (
select NULL as f1, 'a' as f2 from dual
union all
select 2 as f1, 'b' as f2 from dual
)
select * from t1 join t2 on t1.f1 = t2.f1;
// result: 2 b 2 b
Thanks for your future fixes!
from alenka.
Hi,
I came across the discussion, and I love to see how this project will turn out. AlexeyAB correctly explains how nulls should be handled. I just wanted to add that wikipedia has a great article on how nulls are handled: http://en.wikipedia.org/wiki/Null_(SQL)#Arithmetic_operations it references to the ISO/IEC (2003). ISO/IEC 9075-2:2003.
from alenka.
Related Issues (20)
- Device Errors HOT 5
- Loading error HOT 2
- Alenka unable to find a table HOT 1
- trivial_copy_n: an illegal memory access was encountered HOT 14
- Project license? HOT 2
- What is the syntax for appending data into a table? HOT 1
- Data partitioning architecture HOT 12
- OpenCL port HOT 4
- Compilation on Windows HOT 2
- Generic Interface for Storage Behavior Layout HOT 1
- Compilation time very long HOT 4
- Benchmarking HOT 3
- License? HOT 1
- Compile problem for strings_sort_device.cu HOT 4
- Is this project live? HOT 1
- Out of memory error HOT 3
- Compile problem HOT 1
- Hardware spec HOT 4
- Case sensitive columns HOT 1
- mgpu::cuda_exception_t "Invalid device function" exception HOT 6
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
π Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. πππ
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google β€οΈ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from alenka.