lyscripts's Introduction

Hey, I'm Roman 👋

🔭 Working on probabilistic models to predict how cancer spreads
👯 Interested in collaborating on datasets of lymphatic progression patterns in head & neck cancer
💬 Always happy to hear feedback on our interactive Lymphatic Progression eXplorer (LyProX)

📚 Research fields

I am a postdoc in the medical physics research group of Prof. Jan Unkelbach at the University of Zurich and the University Hospital Zurich.

In our main project, we model the risk of metastases in the lymphatic system of patients with squamous cell carcinomas in the head & neck region. You can read more about this in an excellent paper by a postdoc in our group: Pouymayou et al. You can also check out our code for the lymph model, a Python package that learns and computes this risk of lymphatic metastases using Bayesian networks (as in the mentioned paper) and, more recently, hidden Markov models (Ludwig et al.).

Another project deals with optimal fractionation schemes. Fractionation means splitting a prescribed radiation dose, designed to kill the cancer cells in a tumor, into multiple sessions so that the healthy parts of the body can recover better. Innovative technologies like the MR-LinAc at our institution enable us to tackle this problem with reinforcement learning.

🔭 Topics I'm interested in

  • probabilistic models
  • interpretable machine learning methods
  • statistical learning theory

and also (though not necessarily research-related)

  • 🌌 (theoretical) astrophysics (I did my master's in this group)
  • web development
  • open source

🛠️ Tech Stack

Writing: Markdown, Quarto, LaTeX
Coding: Python, NumPy, Pandas, SciPy, Jupyter Notebook, Django
Dev: Git, GitHub, GitHub Actions, CodeCov
Software: Microsoft Office, Affinity Photo, Inkscape
Learning: Exercism, Julia, JavaScript

Thanks a lot for reading 😃

📫 In case you want to reach me: [email protected]

lyscripts's People

Contributors

rmnldwg

Forkers

larstwi julianbro

lyscripts's Issues

colliding modalities

Both the enhance script and the sample script (and maybe some more) access the key modalities in the params.yaml file. However, they use it for different purposes: The former combines all defined modalities into "consensus" diagnoses, while the sampling program uses all defined modalities for inference.

This clash needs to be resolved. An idea would be to use different lists of modalities for the different scripts or have it provided to the scripts via an optional argument.

refactor scripts into library

I would like to refactor the scripts as they are into libraries of atomic, reusable functions that I can then again combine into versatile and declarative scripts.

add precompute commands

For speedier computation of risks and prevalences, it can make sense to precompute prior and posterior state distributions. The new lymph API allows that via the methods state_dist() and posterior_state_dist(). Thus, this package should take advantage of that.

`enhance` command not deterministic

The output of the lyscripts data enhance command is not fully deterministic: The order of the columns varies from run to run. As far as I can tell, the content remains the same. Nonetheless, this is annoying and should be fixed.
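
The cause is not stated here. One possibility, and this is only an assumption, is that column names pass through a `set` somewhere: the iteration order of a set of strings can change between interpreter runs because of hash randomization. A minimal sketch of a fix under that assumption is to keep the existing order and sort whatever gets added (the function and column names are hypothetical):

```python
def ordered_columns(existing: list[str], added: set[str]) -> list[str]:
    """Keep the original column order and append new columns sorted by name."""
    new = sorted(col for col in added if col not in existing)
    return list(existing) + new


existing = ["CT", "MRI"]
added = {"max_llh", "consensus", "CT"}  # a set has no reliable order
print(ordered_columns(existing, added))  # ['CT', 'MRI', 'consensus', 'max_llh']
```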

wrapped function in `rich` context

When a function that is wrapped in the report_state decorator gets called inside a report.status context, a rich.errors.LiveError gets raised. It complains about two "live displays" running at the same time.

So, I should make sure that all such wrapped functions stand on their own and not inside rich contexts.

Create docs by version

Right now, the documentation exists only for the latest version. But it would be helpful to look at earlier versions' docs, too. Maybe I can adapt the respective GitHub action to enable that.

See here for some ideas.

implement logging

It would be great to have the ability to log progress and intermediate results in a file. I think this should be quite straightforward with the use of rich, but extending this would also be nice.
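
As a starting point, here is a minimal sketch that uses only the standard library's `logging` module to write progress messages to a file. The logger name, file name, and logged numbers are placeholders. rich also ships a logging handler (`rich.logging.RichHandler`) that could be attached to the same logger for console output, but that is not shown here.

```python
import logging
import tempfile
from pathlib import Path


def make_file_logger(path: Path, name: str = "lyscripts-sketch") -> logging.Logger:
    """Create a logger that writes timestamped messages to `path`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(path, mode="w", encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


# Write to a temporary directory so the sketch leaves no files behind in the repo.
log_path = Path(tempfile.mkdtemp()) / "sampling.log"
logger = make_file_logger(log_path)

# Placeholder numbers, standing in for intermediate results of a sampling run.
logger.info("burn-in finished after %d steps", 1200)
logger.info("mean acceptance fraction: %.2f", 0.31)

print(log_path.read_text(encoding="utf-8"))
```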

make sampling deterministic

Use numpy's seed function before starting the sampling rounds, so that providing the same seed value reproduces the same sampling round.

I am not sure this will work, as I think I have tried this before.
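
The principle can be sketched with a toy Metropolis sampler. To stay dependency-free, it uses the standard library's `random.Random` as a stand-in for a NumPy generator, and it is not the sampler used by lyscripts. The point is that all randomness flows through one explicitly seeded object. If seeding the global state did not help before, one possible reason (an assumption) is that some random numbers were drawn from a generator that the global seed does not control.

```python
import math
import random


def sample_chain(seed: int, nsteps: int = 50) -> list[float]:
    """Toy Metropolis sampler of a standard normal, driven by one seeded generator."""
    rng = random.Random(seed)  # every random draw below comes from this object
    x = 0.0
    chain = []
    for _ in range(nsteps):
        proposal = x + rng.gauss(0.0, 1.0)
        # Log of the ratio of standard normal densities at proposal and x.
        log_ratio = 0.5 * (x**2 - proposal**2)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        chain.append(x)
    return chain


print(sample_chain(seed=1) == sample_chain(seed=1))  # True
```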

wrong indent length in nested markdown docs

The utility function generate_markdown_docs in the lyproxify.py file uses three spaces as the indentation depth for nested lists. This is wrong; it should be four spaces.
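
For illustration, a small renderer with the indentation length as a parameter might look like the sketch below. It is not the actual `generate_markdown_docs` implementation; the function name and the example dictionary are hypothetical.

```python
def render_nested_list(items: dict, depth: int = 0, indent_len: int = 4) -> str:
    """Render a nested dict as a markdown list with `indent_len` spaces per level."""
    lines = []
    for key, value in items.items():
        lines.append(" " * (indent_len * depth) + f"- {key}")
        if isinstance(value, dict) and value:
            lines.append(render_nested_list(value, depth + 1, indent_len))
    return "\n".join(lines)


# Hypothetical column structure, only used to show the nesting.
columns = {"patient": {"age": None, "sex": None}, "tumor": {"subsite": None}}
print(render_nested_list(columns))
```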

Filter command

It could be useful to have a command that filters datasets based on some common features. E.g., filter based on tumor location, subsite, T-category, ...

Exporting histograms & plots to HDF5

Right now, lyscripts contains utilities and commands to compute predicted and observed distributions over prevalences and risks. It also defines functions to plot these computed values, e.g. when the computations have been stored as HDF5 files. What is missing are methods to export these plots so that they can be re-read later. This is sort of the missing link to efficiently use lyscripts as a library for computing and plotting histograms over risks and prevalences.

convergence sampling does not thin

When sampling until convergence, the script does not thin out the chain by e.g. keeping only every fifth sample. This leads to repeated values when a new proposal for a walker gets rejected multiple times and hence decreases the statistical power of subsequently computed values.

Ideally, the thin_by parameter that is read from the parameter YAML file and used in the TI procedure should also apply to the convergence sampling procedure.
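
The effect of thinning can be sketched on a plain list, assuming the convention that only every `thin_by`-th sample is kept. The helper below is hypothetical and independent of how the sampler itself implements thinning.

```python
def thin(chain: list, thin_by: int) -> list:
    """Keep only every `thin_by`-th sample of a chain."""
    if thin_by < 1:
        raise ValueError("thin_by must be a positive integer")
    return chain[thin_by - 1 :: thin_by]


# A toy chain of one walker: repeated values correspond to rejected proposals.
chain = [0.1, 0.1, 0.1, 0.4, 0.4, 0.7, 0.7, 0.7, 0.7, 0.2]
print(thin(chain, thin_by=5))  # [0.4, 0.2]
```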

sampling is uninformative

There are several issues with the information provided by the sampling script:

  1. During burn-in, it implies that it knows how long the sampling will take, but it doesn't. It'd be better if it conveyed to the user that it samples until convergence.
  2. There's always a warning about not finding the attribute _random in the ConvenienceSampler. I think this is due to a recent update of the emcee package.
  3. When performing a thermodynamic integration, it only provides the value of the beta parameter. Providing acceptance rates, or maybe even the mean and standard deviation of the sampled parameters, would also be super helpful.

update type hints to Python 3.10

I am still using the old syntax for type hints. I can replace the typing type hints with the built-in version of Python 3.10 now.

split functionality into subcommands

The argparse package provides functionality for sub-commands, which I would like to implement, so that I could run commands like

python -m lyscripts thermoint --args

and so on, similar to how git provides many subcommands.
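
A minimal sketch of this with argparse's `add_subparsers` is shown below. The `--steps` and `--seed` options and the two handler functions are hypothetical; the attribute name `run_main` is borrowed from the traceback quoted in a later issue, but the wiring shown here is only one way to set it up.

```python
import argparse


def run_thermoint(args: argparse.Namespace) -> str:
    return f"thermodynamic integration with {args.steps} steps"


def run_sample(args: argparse.Namespace) -> str:
    return f"sampling with seed {args.seed}"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="lyscripts")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # Each subcommand gets its own parser, options, and entry point.
    thermoint = subparsers.add_parser("thermoint", help="run thermodynamic integration")
    thermoint.add_argument("--steps", type=int, default=10)  # hypothetical option
    thermoint.set_defaults(run_main=run_thermoint)

    sample = subparsers.add_parser("sample", help="draw samples")
    sample.add_argument("--seed", type=int, default=42)  # hypothetical option
    sample.set_defaults(run_main=run_sample)
    return parser


args = build_parser().parse_args(["thermoint", "--steps", "20"])
print(args.run_main(args))  # thermodynamic integration with 20 steps
```

Storing the handler via `set_defaults` keeps the top-level entry point free of any `if`/`elif` chain over command names.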

use generators over samples

Instead of complicated custom enumerators inside functions that compute likelihoods, prevalences and risks, I could simply implement them as generators. In this way, I could set up progress bars and what not outside the function generating these values.
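
A sketch of the idea, with a toy risk formula that has nothing to do with the actual lymph model: the function only yields values, and the caller decides how to report progress (a plain `print` here, where a progress bar could go instead).

```python
from collections.abc import Iterable, Iterator


def toy_risk(sample: tuple[float, ...]) -> float:
    """Placeholder computation: probability that at least one event occurs."""
    prob_none = 1.0
    for p in sample:
        prob_none *= 1.0 - p
    return 1.0 - prob_none


def generate_risks(samples: Iterable[tuple[float, ...]]) -> Iterator[float]:
    """Lazily yield one risk per parameter sample, without any progress logic."""
    for sample in samples:
        yield toy_risk(sample)


samples = [(0.1, 0.3), (0.2, 0.2), (0.4, 0.1)]
risks = []
for i, risk in enumerate(generate_risks(samples), start=1):
    print(f"{i}/{len(samples)} samples processed")  # progress lives outside the generator
    risks.append(risk)
```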

Allow selecting no. of cores via `Pool()`

Add an argument to the sample script that allows one to choose the number of cores to use in parallel and - if possible - disable using the multiprocessing library altogether.

sampling crashes when `pools` argument not given

When the optional argument pools is not given, the sampling crashes. This is because I forgot to handle the case where the variable npools inside the run_mcmc_with_burnin function is set to None.

use sphinx for documentation

While pydoc is super nice for small and straightforward projects, I think the flexibility of sphinx is needed here. E.g., there is this nice extension for automatically inserting commands and their output.

combine data scripts into one command

Right now, I have the subcommands generate, clean, join, enhance and split that all deal with the data before sampling. It would make sense to group them under one intermediate command data.

implement plotting utils from lyThesis

While writing my thesis I implemented some nice plotting utilities using simple dataclasses and a functional approach. Copy that over to the plot commands in lyscripts.

negative sublevels override superlevel

In case two sublevels, e.g. "IIa" and "IIb", are both reported/inferred to be healthy, they override the information in the superlevel, even if that does report involvement.

histograms.py not working

After sampling and predicting the prevalences, the histograms.py file does not output the plotted histograms but shows the following error:

Traceback (most recent call last):
  File "/opt/anaconda3/envs/lynforigin/bin/lyscripts", line 8, in <module>
    sys.exit(main())
  File "/opt/anaconda3/envs/lynforigin/lib/python3.8/site-packages/lyscripts/__init__.py", line 123, in main
    args.run_main(args)
  File "/opt/anaconda3/envs/lynforigin/lib/python3.8/site-packages/lyscripts/plot/histograms.py", line 124, in main
    draw(
  File "/opt/anaconda3/envs/lynforigin/lib/python3.8/site-packages/lyscripts/plot/utils.py", line 270, in draw
    axes.hist(content.values, **tmp_hist_kwargs)
  File "/opt/anaconda3/envs/lynforigin/lib/python3.8/site-packages/matplotlib/__init__.py", line 1446, in inner
    return func(ax, *map(sanitize_sequence, args), **kwargs)
  File "/opt/anaconda3/envs/lynforigin/lib/python3.8/site-packages/matplotlib/axes/_axes.py", line 6944, in hist
    p._internal_update(kwargs)
  File "/opt/anaconda3/envs/lynforigin/lib/python3.8/site-packages/matplotlib/artist.py", line 1223, in _internal_update
    return self._update_props(
  File "/opt/anaconda3/envs/lynforigin/lib/python3.8/site-packages/matplotlib/artist.py", line 1197, in _update_props
    raise AttributeError(
AttributeError: Polygon.set() got an unexpected keyword argument 'kwargs'

Used command: lyscripts plot histograms models/prevalences.hdf5 plots/hist_prev_ipsiI.png --names ipsiI/early ipsiI/late

Lyscripts version: 0.7.2
Lymph version: 0.4.3

Used files: used_files.zip

Workaround: Use lyscripts version 0.5.11 to plot the histograms.
