Let's say I want to add a PyMC version of the <a href="https://github.com/MansMeg/post

Adding models of different frameworks needs duplication in many posterior files (posteriordb issue, closed)

stan-dev commented on July 30, 2024

Adding models of different frameworks needs duplication in many posterior files

from posteriordb.

Comments (11)

MansMeg commented on July 30, 2024

That sounds like a very good idea. Essentially, we move the individual PPF out of the posterior files and into the model info files instead. This is very reasonable.


eerolinna commented on July 30, 2024

I have one possible folder structure that contains the PPF info in model info files here: https://github.com/eerolinna/posterior_database (models directory). I also have a script that can do the change automatically so we don't need to manually move the PPF information to model info files.

Would this structure be good, or should we change it somehow? It's easy to change.


MansMeg commented on July 30, 2024

Alright. I'm currently adding some more stuff and a test suite. As soon as I'm done with that, could you do a PR with this change?


eerolinna commented on July 30, 2024

Sure! I'll show here what would change for the 8_schools|centered posterior so that we'll more clearly see if there's something that needs to be changed.

Current files

Currently we have posteriors/8_schools|centered.json:

{
  "data": "content/data/8_schools.json",
  "model": {
    "stan": "content/models/8_schools/8_schools_centered.stan"
  },
  "gold_standard": null,
  "keywords": []
}

and also the model info file content/models/8_schools/8_schools_centered.info.json:

{
  "title": "A centered hiearchical model for 8 schools",
  "description": "A centered hiearchical model for the 8 schools example of Rubin (1981)",
  "urls": ["http://www.stat.columbia.edu/~gelman/arm/examples/schools"],
  "references": ["Rubin (1981)", "Gelman et. al. (2014)"],
  "keywords": ["bda3_example", "hiearchical"]
}

After the change

After the change we have posteriors/8_schools|centered.json:

{
  "data": "8_schools",
  "model": "8_schools_centered",
  "gold_standard": null,
  "keywords": []
}

models/8_schools_centered.json:

{
  "model_code": {
    "stan": "content/models/8_schools/8_schools_centered.stan"
  },
  "title": "A centered hiearchical model for 8 schools",
  "description": "A centered hiearchical model for the 8 schools example of Rubin (1981)",
  "urls": ["http://www.stat.columbia.edu/~gelman/arm/examples/schools"],
  "references": ["Rubin (1981)", "Gelman et. al. (2014)"],
  "keywords": ["bda3_example", "hiearchical"]
}

datasets/8_schools.json:

{
  "data_file": "content/data/8_schools.json",
  "maybe_other_keys": "like keywords, references, description, etc"
}

It might not be 100% necessary to also create the new data info file but I feel like it would be a good idea. What do you think?
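The restructuring shown above can be sketched as a small conversion step. This is only an illustration in Python, not the migration script mentioned earlier in the thread: the function name `split_posterior` and its arguments are hypothetical, and it works on in-memory dictionaries rather than on the actual files.

```python
import json


def split_posterior(old_posterior, old_model_info, model_name, data_name):
    """Return (posterior, model_info, data_info) dicts in the proposed layout.

    Hypothetical helper: the names and signature are assumptions made
    for this sketch, not part of posteriordb.
    """
    # The posterior file keeps its keys but refers to model and data by name.
    new_posterior = dict(old_posterior)
    new_posterior["data"] = data_name
    new_posterior["model"] = model_name

    # The per-framework code paths move into the model info file.
    new_model_info = {"model_code": dict(old_posterior["model"])}
    new_model_info.update(old_model_info)

    # The data file path moves into a new data info file.
    new_data_info = {"data_file": old_posterior["data"]}
    return new_posterior, new_model_info, new_data_info


old_posterior = {
    "data": "content/data/8_schools.json",
    "model": {"stan": "content/models/8_schools/8_schools_centered.stan"},
    "gold_standard": None,
    "keywords": [],
}
old_model_info = {
    "title": "A centered hiearchical model for 8 schools",
    "keywords": ["bda3_example", "hiearchical"],
}

posterior, model_info, data_info = split_posterior(
    old_posterior, old_model_info, "8_schools_centered", "8_schools"
)
print(json.dumps(posterior, indent=2))
```

With this split, the posterior file no longer mentions any framework, so adding a second implementation of the model touches only the model info file.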


MansMeg commented on July 30, 2024

I'm now all in for these changes, with one exception. I still want to keep the pointers, e.g. models/8_schools_centered.json, in the posterior file. This is currently just for convenience and we can remove it later, but for now it would make more sense.

Could you make these changes and do a PR?


eerolinna commented on July 30, 2024

Are you in favor of eventually moving to plain model names or do you feel that it is better to keep the model file paths for good?

If yes

What are the things that are currently more convenient with the file path instead of model name?

If no

Can you explain what is better about having the file paths?

Feel free to just post yes/no at first and answer the follow-up questions later.


MansMeg commented on July 30, 2024

Yes, I am.

My code is currently built on it, so keeping the paths means less work to fix right now.

Also, remember that the model info and data info files should be called [model].info.json, both for clarity and because the data will be stored in JSON format.
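The naming convention matters because a data file and its info file would otherwise both end in plain .json. A minimal Python sketch of the convention follows; the helper names `info_file_for` and `is_info_file` are hypothetical and only illustrate the suffix rule described above.

```python
from pathlib import PurePosixPath


def info_file_for(content_file):
    """Map a model or data file to its info file (hypothetical helper).

    The last extension is dropped and ".info.json" is appended.
    """
    path = PurePosixPath(content_file)
    return str(path.with_name(path.stem + ".info.json"))


def is_info_file(path):
    """Tell an info file apart from a data file that is also JSON."""
    return PurePosixPath(path).name.endswith(".info.json")


print(info_file_for("content/models/8_schools/8_schools_centered.stan"))
print(info_file_for("content/data/8_schools.json"))
```

For the Stan model this reproduces the existing info file path quoted earlier in the thread, content/models/8_schools/8_schools_centered.info.json.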


MansMeg commented on July 30, 2024

How is it going?


eerolinna commented on July 30, 2024

Are you referring to code in this repository / bayesbenchr or some other code? If you can link to the files that would need to change, I can check whether I could make those changes in the same PR, so there would be no extra effort for you.

The code will probably need some changes anyway to accommodate the new posterior info format. Adding the model names likely doesn't require much additional effort. On the other hand, in my experience delaying this change might mean that we never actually get around to doing it, or that it requires more effort later than it would now.

If you can link to the code, I will check whether it is reasonable to do everything in one go or whether we should break the change into two parts.


eerolinna commented on July 30, 2024

I found these locations in the code that need to be changed:

https://github.com/MansMeg/posterior_database/blob/d696a3185d94d83d5e0be2c446416be208f2f8d4/rpackage/R/model_code.R#L10

https://github.com/MansMeg/posterior_database/blob/d696a3185d94d83d5e0be2c446416be208f2f8d4/rpackage/R/model_code.R#L12

https://github.com/MansMeg/posterior_database/blob/d696a3185d94d83d5e0be2c446416be208f2f8d4/rpackage/R/model_data_info.R#L7

https://github.com/MansMeg/posterior_database/blob/d696a3185d94d83d5e0be2c446416be208f2f8d4/rpackage/R/model_data_info.R#L15

https://github.com/MansMeg/posterior_database/blob/d696a3185d94d83d5e0be2c446416be208f2f8d4/rpackage/R/data.R#L29

https://github.com/MansMeg/posterior_database/blob/d696a3185d94d83d5e0be2c446416be208f2f8d4/rpackage/R/data.R#L34

https://github.com/MansMeg/posterior_database/blob/d696a3185d94d83d5e0be2c446416be208f2f8d4/rpackage/R/data.R#L45

For all of these, it seems straightforward to use model names directly instead of model file paths.

I haven't checked bayesbenchr yet.
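To illustrate what a name-based lookup could look like, here is a minimal Python sketch. It is not the R package's code: the function `model_code_path`, the in-memory `model_infos` mapping, and the PyMC file path are all assumptions made for the example.

```python
def model_code_path(posterior, model_infos, framework="stan"):
    """Resolve a posterior's model name to a code file for one framework.

    Hypothetical helper: `model_infos` maps model names to the contents
    of the proposed model info files.
    """
    model_name = posterior["model"]
    model_code = model_infos[model_name]["model_code"]
    if framework not in model_code:
        raise KeyError(f"model {model_name!r} has no {framework!r} implementation")
    return model_code[framework]


model_infos = {
    "8_schools_centered": {
        "model_code": {
            "stan": "content/models/8_schools/8_schools_centered.stan",
            # Hypothetical path, added once here instead of in every posterior file.
            "pymc": "content/models/8_schools/8_schools_centered.py",
        }
    }
}
posterior = {"data": "8_schools", "model": "8_schools_centered"}

print(model_code_path(posterior, model_infos))
print(model_code_path(posterior, model_infos, framework="pymc"))
```

Because every posterior that uses the model goes through the same model info entry, a new framework is registered in one place.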


eerolinna commented on July 30, 2024

I made PR #14 with a slightly hacky solution to the new PDB structure.

There is probably a better way to do it, but I don't know what it is yet.

Note that the actual PDB structure is not modified yet; I will create a separate PR for that.

