Hi team, I decided to give Refinery a try with a classification prob

Classification weak learners should allow more than one input feature (refinery issue, closed)

code-kern-ai commented on May 18, 2024

Classification weak learners should allow more than one input feature

from refinery.

Comments (8)

jhoetter commented on May 18, 2024

Thanks for the input @agravier. We already have a format to upload existing data (https://docs.kern.ai/docs/project-creation-and-data-upload#uploading-existing-labeled-data), but I agree that this requires UX improvement. We'll work on this, and I'd be happy to have your feedback again when that's implemented :)

agravier commented on May 18, 2024

Thanks for the heads up @jhoetter, I'll give it a try at the next occasion. Cheers

JWittmeyer commented on May 18, 2024

Hi @agravier,

thank you for reaching out to us and for your feedback. You are right, both options (1. multi-attribute embeddings and 2. "calculated" columns) aren't part of our current UI. Calculated columns are on our roadmap for 2022.

jhoetter commented on May 18, 2024

Hi! That point is 100% valid, and we thought about it too. We're thinking about the following, and I'd be curious what you think about it:

Currently, you have one programming interface, i.e. the one in the heuristics section.
In the near future (Q4), you'll be able to use a similar programming interface to write computed attributes, e.g.

def word_a_cat_word_b(record):
    return str(record["word_a"]) + str(record["word_b"])

Also, we're continuing our work on our embedder library. Here again, we want to provide a programmatic interface that is similar to the active learning templates, but with which you can compute your very own customized (and finetuned) embeddings, e.g.

from embedders.classification.contextual import TransformerSentenceEmbedder
def classification_word_a_cat_word_b_distilbert(record):
    embedder = TransformerSentenceEmbedder("distilbert-base-cased")
    return embedder.fit_transform(record["word_a_cat_word_b"], record["is_oxymoron"])

of course, not 100% sure about the exact interface here, but that is the general idea.
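
To make the computed-attribute idea concrete, here is a minimal sketch in plain Python of how such a function could be applied to every record. It does not use refinery's actual interface: `add_computed_attribute` and the sample records are hypothetical, and only the column names `word_a`, `word_b` and `is_oxymoron` come from the discussion above.

```python
def word_a_cat_word_b(record):
    # Same computed attribute as in the comment above.
    return str(record["word_a"]) + str(record["word_b"])


def add_computed_attribute(records, name, fn):
    """Return copies of the records with one extra key computed by fn.

    Hypothetical helper for illustration; not part of refinery.
    """
    return [{**record, name: fn(record)} for record in records]


# Hypothetical records; only the column names come from the discussion.
records = [
    {"word_a": "deafening", "word_b": "silence", "is_oxymoron": True},
    {"word_a": "green", "word_b": "grass", "is_oxymoron": False},
]

enriched = add_computed_attribute(records, "word_a_cat_word_b", word_a_cat_word_b)
for record in enriched:
    print(record["word_a_cat_word_b"])
```

The helper returns new dictionaries rather than mutating the input, so the original records stay available if the computed attribute needs to be redefined.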

And thanks for trying out refinery, means a lot! :)

agravier commented on May 18, 2024

Thanks for getting back to me @JWittmeyer and @jhoetter. Sounds good, as long as the UX is there to make all this clear. Another couple of things that you may want to consider, from my trial: tabular data export (not that JSON is horrible, but the data lends itself to a tabular format) and "partially annotated input reconciliation", for when one of the columns of the imported data already contains some labels. Obviously this raises some more questions that could be presented to the user about what to do with this data, like which annotator to assign it to, etc.
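
Until a tabular export exists, a JSON export can be flattened with the Python standard library. The sketch below assumes the export is a JSON array of flat objects, which may not match refinery's real export format; `json_records_to_csv` and the sample data are hypothetical.

```python
import csv
import io
import json


def json_records_to_csv(json_text):
    """Convert a JSON array of flat objects into CSV text."""
    records = json.loads(json_text)
    # Collect column names in first-seen order, since records may miss some keys.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # missing keys are written as empty cells
    return buffer.getvalue()


# Hypothetical export; the real export's keys and nesting may differ.
export = json.dumps([
    {"word_a": "deafening", "word_b": "silence", "label": "oxymoron"},
    {"word_a": "green", "word_b": "grass"},
])
csv_text = json_records_to_csv(export)
print(csv_text)
```

The second record has no `label`, so its cell is left empty, which is also one simple way to represent partially annotated data in a table.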

agravier commented on May 18, 2024

I'll revisit in a few months, all the best, cheers!

jhoetter commented on May 18, 2024

This will first be solved by implementing #40. You'll be able to modify any attribute, so in this case you could create e.g. a concatenation of word_a and word_b (similar to this):

def word_a_cat_word_b(record):
    return str(record["word_a"]) + str(record["word_b"])

Afterward, you can apply encoding to this attribute.

We'll ultimately provide an extensive interface to program embeddings, but that is a bit further down the road :)
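
As a rough illustration of "concatenate first, then encode", the sketch below turns the concatenated attribute into a fixed-size vector of hashed character trigram counts. This is only a stand-in for whatever encoding refinery applies to the attribute: `char_ngram_vector`, its parameters and the sample record are assumptions made for illustration.

```python
import hashlib


def char_ngram_vector(text, n=3, dim=16):
    """Count character n-grams of text into a fixed-size vector (hashing trick)."""
    vector = [0] * dim
    for i in range(len(text) - n + 1):
        ngram = text[i:i + n]
        # hashlib gives a hash that is stable across runs, unlike built-in hash().
        digest = hashlib.md5(ngram.encode("utf-8")).digest()
        vector[int.from_bytes(digest[:4], "big") % dim] += 1
    return vector


def word_a_cat_word_b(record):
    # Same attribute modification as in the comment above.
    return str(record["word_a"]) + str(record["word_b"])


record = {"word_a": "deafening", "word_b": "silence"}  # hypothetical record
vector = char_ngram_vector(word_a_cat_word_b(record))
print(len(vector), sum(vector))
```

Because the vector is computed from the concatenated string, a single encoding step sees both input features at once, which is the point of the original request.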

jhoetter commented on May 18, 2024

@agravier This is solved with the release of version 1.3.0. You can now do attribute modifications, which allow you to then create exactly the embeddings you like. Let us know what you think :)
