Comments (8)
The basic implementation of this is done, but I need to test it further. Right now it works as follows:
SELECT VAR(B) FROM SANE_R;
gives the following result:
+------------------+
| EXPR_1_VAR       |
+------------------+
| 0                |
| 0                |
| 8.96187201351742 |
| 0                |
| 0                |
| 0                |
| 0                |
+------------------+
from mimir.
8.96 seems ... a bit high. If memory serves, SANE_R's B attribute can only take values from 1 to 5. A variance of 9 shouldn't even be possible, even with an even split of samples between B=1 and B=5.
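To put a number on that bound, here is a quick Python sketch. It assumes B really is limited to the range 1-5 (the "if memory serves" part above); the helper name `max_population_variance` is made up for illustration and is not part of Mimir.

```python
from statistics import pvariance, variance

def max_population_variance(lo, hi):
    """Upper bound on the variance of any values confined to [lo, hi].

    The bound ((hi - lo) / 2) ** 2 is reached when half of the mass
    sits at each endpoint (Popoviciu's inequality).
    """
    return ((hi - lo) / 2) ** 2

# Assumption: B only takes values in 1..5.
bound = max_population_variance(1, 5)

# Worst case: an even split between B=1 and B=5.
even_split = [1.0, 5.0] * 50
worst_population = pvariance(even_split)

# The unbiased sample variance is largest with just two samples.
worst_sample = variance([1.0, 5.0])

reported = 8.96187201351742  # value from the query output above

print("bound:", bound)
print("even split, population variance:", worst_population)
print("two samples, sample variance:", worst_sample)
print("reported value exceeds both:", reported > max(bound, worst_sample))
```

Under that assumption the population variance tops out at 4, and even the two-sample unbiased estimate only reaches 8, so 8.96 is out of range either way.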
from mimir.
Just tested it out. The numbers seem a bit more sane. We should do a code review, but otherwise it looks good. Let's integrate it into the UI.
from mimir.
I had made a mistake with the indexes in the earlier version. But the variance calculation is still wrong. When I get the classes from the classifier's getVotesForInstance method, I assumed that we get the PDF for the model, but the probabilities don't add up to one. I'm not sure whether I have misunderstood something here.
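For reference, this is roughly the check I mean, sketched in Python rather than against the real API. The vote vector is made up for illustration (it is not actual getVotesForInstance output), and `is_distribution` and `pmf_variance` are hypothetical helper names.

```python
import math

def is_distribution(probs, tol=1e-9):
    """True if probs are non-negative and sum to one (within tol)."""
    return all(p >= 0 for p in probs) and math.isclose(sum(probs), 1.0, abs_tol=tol)

def pmf_variance(values, probs):
    """Variance of a discrete variable, given a proper probability mass function."""
    if not is_distribution(probs):
        raise ValueError("probabilities do not sum to one; not a valid PMF")
    mean = sum(p * x for x, p in zip(values, probs))
    return sum(p * (x - mean) ** 2 for x, p in zip(values, probs))

values = [1, 2, 3, 4, 5]

# Made-up vote vector: each entry is a per-class score, so the sum is not 1.
votes = [0.9, 0.2, 0.1, 0.4, 0.7]
print("votes form a distribution:", is_distribution(votes))

# A proper PMF over the same values, for comparison.
uniform = [0.2] * 5
print("variance under a uniform PMF:", pmf_variance(values, uniform))
```

The variance formula only makes sense when the second argument is a real PMF, which is why a vote vector that doesn't sum to one can't be plugged in directly.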
from mimir.
OK, the returned values are the probabilities that the per-class classifiers each assign to the given instance. They are not the PDF.
There was a function for variance in the moa Instances class which, when called with the class index, returned the required attribute's variance. The variance seems to be working correctly now.
from mimir.
Yeah, the only lens we have at the moment is a categorical version of the DCR lens. Realistically, this is not something for which variance will be particularly helpful. We probably need a few more lenses before we can test this process end-to-end rather than with just unit test cases.
from mimir.
Variance is now calculated by sampling.
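The general idea, as a minimal Python sketch and not the actual Mimir code: draw many values from the model and take the sample variance. The sampler `draw_b`, the uniform weights, and the sample count are all assumptions made up for illustration.

```python
import random
from statistics import variance

def estimate_variance(draw_sample, n_samples=10_000):
    """Monte Carlo estimate: draw n_samples values and return their sample variance."""
    samples = [draw_sample() for _ in range(n_samples)]
    return variance(samples)

# Hypothetical stand-in for a model over B: uniform over the values 1..5.
rng = random.Random(42)  # fixed seed so the run is repeatable
values = [1, 2, 3, 4, 5]
weights = [0.2] * 5

def draw_b():
    """Draw one value of B from the stand-in model."""
    return rng.choices(values, weights=weights, k=1)[0]

estimate = estimate_variance(draw_b)
print("estimated variance:", estimate)
```

For this stand-in model the true variance is 2, so the estimate should land close to that; more samples tighten it at the cost of more draws.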
from mimir.
Cool! Close the ticket after the code review.
from mimir.