
edm2016's People

Contributors

chaitue, karklin, khwilson


edm2016's Issues

Online Prediction Accuracy

Regarding the paper titled “Back to the basics: Bayesian extensions of IRT outperform neural networks for proficiency estimation”, I am interested in the online prediction accuracy evaluation metric.

A couple of questions (in relation to the 1PL IRT model):

  1. In this metric, students are split into training and testing populations. In a real-life scenario, the initial training population used to determine item-level parameters would not always be available, especially in a flashcard application, where predictions are required immediately without any prior item-level parameter estimation.

In such a situation, is an IRT model unsuitable? Must the IRT model have initial data to work with before making predictions, or can the model be continuously trained from the start? If so, what would the default parameters be to start with? (A sketch of this continuous-training idea follows the questions.)

  2. When you say the students are split into training and testing populations, what is the ratio between them? 70/30? 60/40?

Thanks so much for your time; I'm looking forward to your response.
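
For context on what "continuously trained from the start" could look like, here is a minimal sketch of an online 1PL update. It is not taken from this repo: ability and difficulty both start at the conventional default of 0.0 and are nudged after every response by the gradient of the Bernoulli log-likelihood.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def online_1pl_update(theta, b, correct, lr=0.1):
        """One stochastic-gradient step on the 1PL log-likelihood."""
        p = sigmoid(theta - b)    # predicted P(correct) before seeing the answer
        grad = correct - p        # d log-likelihood / d theta (and -grad for b)
        return theta + lr * grad, b - lr * grad

    theta, b = 0.0, 0.0           # cold-start defaults: no prior data at all
    for correct in [1, 0, 1, 1]:  # toy response stream for one student/item pair
        p = sigmoid(theta - b)    # the "online prediction" made before updating
        theta, b = online_1pl_update(theta, b, correct)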

Obtaining Prediction Result

We're working to build a learner model in our research project, and we'd love to be able to use your code base as a starting point. I am looking to obtain prediction results for each individual instead of an aggregate AUC. Can you point me towards where to look in the code?

Thanks in advance!
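
In case it helps frame the question, per-interaction probabilities can in principle be reconstructed from the learned parameters. The sketch below is illustrative only; the array names are assumptions, not this repo's API.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # Hypothetical post-processing: one predicted probability per interaction
    # from learned 1PL parameters, instead of an aggregate AUC.
    def per_interaction_predictions(thetas, difficulties, student_idx, item_idx):
        return sigmoid(thetas[student_idx] - difficulties[item_idx])

    probs = per_interaction_predictions(
        np.array([0.3, -0.5]),       # toy per-student abilities
        np.array([0.0, 1.2, -0.4]),  # toy per-item difficulties
        np.array([0, 0, 1]),         # student index for each interaction
        np.array([0, 2, 1]))         # item index for each interaction
    print(probs)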

Questions on training the model

  1. What are your thoughts on training a model with pre-existing data? For instance, DKT/logistic models do not require any student- or item-specific parameters; hence, the model could be trained with data collected elsewhere and then deployed in an application to make immediate, accurate predictions.
    For a model that does not require item or student parameters, would this be appropriate? What are the benefits of using data from the same students/items to train general weights? A model trained on dataset A could then be tested on datasets B and C, much like a real flashcard scenario. What are your thoughts on this?

  2. When trying to evaluate an IRT model through online prediction accuracy, after determining all item parameters, is the ability parameter updated by retraining the model with ALL the data collected thus far (all the students + training data), or just the students’ INDIVIDUAL data? In other words, what data is used to train the student-level parameters? (See the sketch after this list.)

Thanks again; looking forward to your response.
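
To make question 2 concrete, here is a sketch of the "individual data only" reading: item difficulties stay frozen, and one student's ability is re-estimated from just that student's responses by a grid search over the 1PL log-likelihood. This is illustrative, not the repo's implementation.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def fit_ability(responses, difficulties, grid=np.linspace(-4, 4, 801)):
        """MLE of one student's ability with item difficulties held fixed."""
        p = sigmoid(grid[:, None] - difficulties[None, :])  # |grid| x |items|
        loglik = (responses * np.log(p) +
                  (1 - responses) * np.log(1 - p)).sum(axis=1)
        return grid[np.argmax(loglik)]

    theta_hat = fit_ability(np.array([1, 0, 1]), np.array([-0.5, 1.0, 0.2]))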

problem_id key error

Hi!

I'm trying to reproduce the results of the EDM2016 paper to see if HIRT is viable (rapid enough) for real-time computation, and it seems that I have some issues with the code (I didn't modify it). When I try to run HIRT with the Bridge to Algebra dataset, I get the following error:

    ...
      File "/path/edm2016/.tox/py27/lib/python2.7/site-packages/rnn_prof/cli.py", line 163, in irt
        data, _, _, _, _ = load_data(data_file, source, data_opts)
      File "/path/edm2016/.tox/py27/lib/python2.7/site-packages/rnn_prof/data/wrapper.py", line 77, in load_data
        min_interactions_per_user=data_opts.min_interactions_per_user)
      File "/path/edm2016/.tox/py27/lib/python2.7/site-packages/rnn_prof/data/kddcup.py", line 99, in load_data
        data = data.sort(sort_keys)
    ...
    KeyError: u'problem_id'

I don't know why this happens; any help would be appreciated!
Thanks
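
A quick way to see why the sort fails is to inspect the columns that actually survive loading. The sketch below assumes the standard tab-separated KDD Cup file with a header row; the filename is a placeholder for your local copy.

    import pandas as pd

    # List the raw headers: the loader presumably derives 'problem_id' from a
    # KDD Cup header such as 'Problem Name', so a missing or renamed header
    # in the file would lead directly to the KeyError above.
    df = pd.read_csv('bridge_to_algebra_train.txt', sep='\t', nrows=5)
    print(df.columns.tolist())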

Key error: Hashing function?

I am just wondering what the format requirements are for the Problem Id and Step Name columns. I have tried to reuse the code for a research project I am doing, but there is always a key error when I run the code on my data, unless I use the exact same problem ids as the ones in the KDD Cup dataset.

Any help will be appreciated.
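
For what it's worth, one workaround to try is renaming custom columns to the KDD Cup 2010-style headers before handing the file to the loader. This is illustrative only: the target names follow the KDD Cup format, and the source names on the left are placeholders.

    import pandas as pd

    # Map your own column names onto the KDD Cup-style headers ('Problem
    # Name', 'Step Name') and re-export as a tab-separated file, since the
    # loader appears keyed to those exact names.
    df = pd.read_csv('my_data.csv')
    df = df.rename(columns={'my_problem_col': 'Problem Name',
                            'my_step_col': 'Step Name'})
    df.to_csv('my_data_kdd_format.txt', sep='\t', index=False)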

Code for TIRT?

Hi, I read your paper Back to the Basics: Bayesian extensions of IRT outperform neural networks for proficiency estimation, and I really want to try TIRT on my dataset, but I can't find the code in this repo. Would it be possible to add it to this repo, or is it confidential? Thanks!

Running without Tox

Hi there,
I'm very interested in implementing IRT; this repo seems to be well suited to what I aim to achieve. However, I find it very difficult to play around with the code when I have to re-run the command tox after any change. What would be the best approach to achieve the same results WITHOUT using tox? Which files should I run individually?

I also hope to implement this in a web application in the future, so knowing which files to run (without tox) would help as well.

Thanks.
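
One possible route around tox, assuming the package has been installed into the active environment (for example with pip install -e . from the repo root, which tox otherwise does inside its own virtualenv), is to drive the click command from Python directly. The arguments below simply mirror the CLI usage quoted elsewhere in these issues and are assumptions, not documented API.

    from click.testing import CliRunner
    from rnn_prof.cli import irt  # the click command seen in the tracebacks above

    runner = CliRunner()
    result = runner.invoke(irt, ['assistments', 'skill_builder_data_big.txt',
                                 '--onepo'])
    print(result.output)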

How to get the parameters of each item?

Hi, I'm trying to get the parameters of each item in the HIRT model. I found these two functions in OnePOLearner, but they differ only in sign, and I'm wondering what the offset coefficient means.

    def get_difficulty(self, item_ids):
        """ Get the difficulties (in standard 1PO units) of an item or a set of items.
        :param item_ids: ids of the requested items
        :return: the difficulties (-offset_coeff)
        :rtype: np.ndarray
        """
        return -self.nodes[OFFSET_COEFFS_KEY].get_data_by_id(item_ids)

    def get_offset_coeff(self, item_ids):
        """ Get the offset coefficient of an item or a set of items.
        :param item_ids: ids of the requested items
        :return: the offset coefficient(s)
        :rtype: np.ndarray
        """
        return self.nodes[OFFSET_COEFFS_KEY].get_data_by_id(item_ids)
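
Reading the two methods above, difficulty is defined as the negated offset coefficient. If, as the "1PO" (one-parameter ogive) name suggests, the response model uses a normal-ogive link, the relationship would be roughly the following; this is an illustration under that assumption, not code from the repo.

    from scipy.stats import norm

    # Illustrative, assuming a normal-ogive (probit) 1PO response model:
    # P(correct) = Phi(ability + offset_coeff), so difficulty = -offset_coeff
    # and a larger offset makes the item easier.
    theta, offset_coeff = 0.5, -1.2
    difficulty = -offset_coeff                  # what get_difficulty returns
    p_correct = norm.cdf(theta + offset_coeff)  # == norm.cdf(theta - difficulty)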

Maximum interactions?

I must be reading the code wrong in some way, but I'm running into an issue where the max_inter variable drops all values in the data frame. When I look at the common options, I see the default set to 0, which seems to be the origin of the issue.

    common_options.add('--max-inter', '-m', type=int, default=0,
                       help="Maximum interactions per user",
                       extra_callback=lambda ctx, param, value: value or None)
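
For what it's worth, the extra_callback in that option looks like it is meant to turn the falsy default into "no cap" rather than "keep zero interactions". A tiny illustration of what the value or None expression evaluates to:

    # The option's extra_callback maps a falsy value (the default 0) to None,
    # which downstream code should presumably treat as "no maximum", not
    # "drop every interaction per user".
    convert = lambda ctx, param, value: value or None
    print(convert(None, None, 0))    # -> None (no cap)
    print(convert(None, None, 50))   # -> 50 (cap at 50 interactions)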

About reproduce the result in this paper

Hi, I read this paper a few weeks ago and want to implement IRT. I read several papers but still don't know how to implement it, so I downloaded this project to see how the code was organized. First I wanted to reproduce the results, but when I run

    rnn_prof irt assistments skill_builder_data_big.txt --onepo \
        --drop-duplicates --no-remove-skill-nans --num-folds 5 \
        --item-id-col problem_id --concept-id-col single

the system warns that "rnn_prof is not an internal or external command". So I would like to know what mistake I have made, and how I can reproduce the result.
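
That Windows message usually means the rnn_prof console script was never installed onto PATH. A minimal check from Python, assuming the package is meant to be installed with pip (tox normally does this inside its own virtualenv):

    # An ImportError here means the package is not installed in this
    # interpreter; installing it (e.g. `pip install -e .` from the repo
    # root) also creates the `rnn_prof` console script the shell could
    # not find.
    import rnn_prof
    print(rnn_prof.__file__)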
