Got a version of transpose that does not cause nil's and thus no crash from ...

No missing attribute values possible in naive-bayes (cl-mlep, CLOSED)

fzalkow commented on August 29, 2024

No missing attribute values possible in naive-bayes

from cl-mlep.

Comments (9)

fzalkow commented on August 29, 2024

Thanks for your suggestion.

Could you describe a situation where the version currently used crashes? It is intended for internal use on lists of lists of equal length.

The version you proposed may be an elegant recursive one, but it has a drawback: it is not tail-recursive, so for arbitrarily large lists it can cause a stack overflow.

(defun transpose-list-new (l)
  (cond ((some #'null l) '())
        (t (cons (mapcar #'car l)
                 (transpose-list-new (mapcar #'cdr l))))))

(transpose-list-new (list (loop repeat 100000 collect (random 100))
                          (loop repeat 100000 collect (random 100)))) ; Stack Overflow

This could be avoided by a tail-recursive version:

(defun transpose-list (l)
  (labels ((transpose-intern (l result)
             (if (some #'null l)
                 (nreverse result)
               (transpose-intern (mapcar #'cdr l)
                                 (cons (mapcar #'car l) result)))))
    (transpose-intern l nil)))

This is actually a nice one, but is there a specific reason to replace the current one by this?
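One caveat worth noting: the Common Lisp standard does not require implementations to eliminate tail calls, so even the tail-recursive version may still exhaust the stack on some implementations or under some optimization settings. A minimal iterative sketch that does not rely on tail-call elimination could look like the following. The name transpose-list-iter is hypothetical and not part of cl-mlep; like the versions above, it stops at the shortest row.

```lisp
;; Hypothetical helper, not part of cl-mlep: an iterative transpose
;; that stops as soon as any row is exhausted (shortest row wins).
(defun transpose-list-iter (rows)
  "Transpose ROWS, a list of lists, without recursion."
  (loop until (or (null rows) (some #'null rows))
        ;; The heads of all rows form the next output row.
        collect (mapcar #'car rows)
        ;; Advance every row by one element.
        do (setf rows (mapcar #'cdr rows))))

;; Usage: (transpose-list-iter '((1 2 3) (4 5 6)))
```

Because the loop keeps no pending calls, its stack use does not grow with the row length.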

from cl-mlep.

Harag commented on August 29, 2024

The data I am using for training sometimes has 27 fields and sometimes
26. So my question is: would having different row lengths be detrimental
to the calculations? If not, stopping the system from crashing in those
scenarios would be great; otherwise the user has to fake missing fields.

Testing with a record of different length does not crash. Once again, the
question is: would this adversely affect the calculations? If so, then I
would expect an error for that as well.

If you want, I can give you some example data and describe my
scenario in more detail, but let's clear up the above assumptions before I
give you more detail.

Regards
Phil

Harag commented on August 29, 2024

Had to throw in a reverse for it to work

(defun transpose-list (l)
  (labels ((transpose-intern (l result)
             (if (some #'null l)
                 result
                 (transpose-intern (mapcar #'cdr l)
                                   (cons (mapcar #'car l) result)))))
    (reverse (transpose-intern l nil))))

fzalkow commented on August 29, 2024

Oh, ok, here we have the problem: In fact Naive Bayes currently doesn't support missing attribute values! One has to think of how to handle this, because when just leaving a value out in a row, there is no information about which attribute is missing there.

I'll change the title of your issue; I hope that is OK with you. I won't have time to work on this in the near term.

Thanks for spotting this issue. Haven't worked on learning with missing attribute values so far.
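To make the problem visible early, one option is to reject data whose rows differ in length instead of silently truncating them. The following is only a sketch under that assumption; check-row-lengths is a hypothetical helper and not part of cl-mlep.

```lisp
;; Hypothetical helper, not part of cl-mlep: signal an error when a row
;; does not have the same number of values as the first row.
(defun check-row-lengths (rows)
  "Return ROWS unchanged if all rows have equal length, else signal an error."
  (let ((expected (length (first rows))))
    (loop for row in rows
          for index from 0
          unless (= (length row) expected)
            do (error "Row ~D has ~D values, expected ~D."
                      index (length row) expected))
    rows))

;; Usage: (check-row-lengths '((1 2 3) (4 5)))  ; signals an error
```

A row with 26 instead of 27 values would then be reported, and the caller can decide which attribute is actually missing and mark it explicitly.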

PS: The reverse in my version is inside the local function. It should work just like your version.

Harag commented on August 29, 2024

Is there another model that works, then?

Or are missing values an issue that has to be handled by each model on its
own?

I take it that the order of attributes is important for Bayes as well, then?

fzalkow commented on August 29, 2024

Unfortunately, there is currently none...

An easy method would be to fill in a missing value with the mean across all values of that attribute. That would be quite easy to add, but it is only possible for numerical data. For categorical data one would have to use the most frequent value.
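As a rough illustration of this idea (not the cl-mlep API), the sketch below assumes that missing values are marked explicitly with the keyword :missing and that all rows have the same length. The names impute-mean and most-frequent are hypothetical.

```lisp
;; Hypothetical helpers, not part of cl-mlep. Missing values are assumed
;; to be marked with :MISSING and all rows are assumed to have equal length.
(defun impute-mean (rows &optional (missing :missing))
  "Replace MISSING entries in numerical ROWS by the mean of their column."
  (let* ((width (length (first rows)))
         (means (loop for i below width
                      collect (let ((present (loop for row in rows
                                                   for v = (nth i row)
                                                   unless (eql v missing)
                                                     collect v)))
                                ;; A column with no known value stays missing.
                                (if present
                                    (/ (reduce #'+ present) (length present))
                                    missing)))))
    (loop for row in rows
          collect (mapcar (lambda (v mean) (if (eql v missing) mean v))
                          row means))))

(defun most-frequent (items)
  "Return the most frequent element of ITEMS (first to reach the top count)."
  (let ((counts (make-hash-table :test #'equal))
        (best nil)
        (best-count 0))
    (dolist (item items best)
      (when (> (incf (gethash item counts 0)) best-count)
        (setf best item
              best-count (gethash item counts))))))

;; Usage: (impute-mean '((1 10) (:missing 20) (3 :missing)))
```

For a categorical column, the known values could be passed to most-frequent and the result used in place of the mean.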

The order of attributes is important in all machine learning methods I know!

Harag commented on August 29, 2024

I implemented a Naive Bayes that handles missing values fine. According
to the reading I did Naive Bayes can just exclude missing values from
the calculations.
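A sketch of what "excluding missing values" can mean when scoring one class: only the attributes that are present contribute a likelihood term. This is not the mlep implementation. The function name log-score, the :missing marker, and the log-likelihood-fn callback (taking an attribute index and a value) are all assumptions made for illustration.

```lisp
;; Hypothetical sketch, not the mlep implementation. Missing values are
;; assumed to be marked with :MISSING. LOG-LIKELIHOOD-FN is an assumed
;; callback returning log P(value | class) for a given attribute index.
(defun log-score (instance log-prior log-likelihood-fn
                  &optional (missing :missing))
  "Return log P(class) plus the log-likelihoods of the present attributes."
  (loop with score = log-prior
        for value in instance
        for index from 0
        ;; Missing attributes contribute no term to the sum.
        unless (eql value missing)
          do (incf score (funcall log-likelihood-fn index value))
        finally (return score)))

;; Usage: (log-score '(a :missing b) (log 0.5)
;;                   (lambda (index value)
;;                     (declare (ignore index value))
;;                     (log 0.5)))
```

The class with the highest score would be predicted, and every class is scored over the same set of present attributes.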

The order of the columns is a minor issue. My real issue is that I don't
know what the columns are. Identifying columns is a major issue, especially
when they contain proper names like surnames, towns, company names, etc. I
have a suspicion that a Hidden Markov Model could help with that, but I
have yet to find a write-up on HMMs that was not written for a
mathematician.

Any suggestions on how to identify columns?

Regards
Phil

fzalkow commented on August 29, 2024

Did you extend the mlep version of Naive Bayes? If you want to, you can send me the code or do a pull request, to contribute your work to mlep.

The column identification is maybe a little off topic here. Could you write me a mail and describe it in a little more detail? I didn't get it.

Anyway, there are HMM libraries for Common Lisp out there, see mulm and cl-hmm. (I haven't tried them...)

fzalkow commented on August 29, 2024

I added an imputer class, see its documentation. Does this solve this issue for you?
