libpetab-python's People

Contributors

cthoyt, dantongwang, dependabot[bot], dilpath, dweindl, elbaraim, erikadudki, fbergmann, ffroehlich, janhasenauer, jvanhoefer, larafuhrmann, lcontento, leonardschmiester, loosc, m-philipps, merktsimon, paulflang, pauljonasjost, paulstapor, plakrisenko, yannikschaelte

libpetab-python's Issues

Simulation is not plotted correctly using visualization specification

Hey,

I am using PEtab in the develop branch on a WSL.

I tried to plot some simulations and experimental data using the visualization specification.
These are the files I used:
issue.zip

and here is the code:

from petab.visualize import plot_data_and_simulation
import matplotlib.pyplot as plt


ax = plot_data_and_simulation(data_file_path,
                              condition_file_path,
                              visualization_file_path,
                              simulation_file_path)

It returns ([screenshot: wrong]):

The simulations are not correctly plotted.

If lines 138 and 139 in PEtab/petab/visualize/plot_data_and_simulation.py are commented out, the simulations are plotted correctly ([screenshot: correct]).

I am not sure what lines 138 and 139 are for, but at least for my model they seem to be problematic, although I had no problems re-plotting the Isensee and Fujita test models locally.

Errorbars not directly plotted

Good morning,

I am using PEtab in the develop branch on a WSL.

I tried to plot the experimental data with error bars using the visualization specification.
These are the files I used:
issue.zip

and here is the code:

from petab.visualize import plot_data_and_simulation
import matplotlib.pyplot as plt


ax = plot_data_and_simulation(data_file_path,
                              condition_file_path,
                              visualization_file_path,
                              simulation_file_path)

The noise for each data point is given directly in the measurement table, so I chose plotTypeData = provided in the visualization specification file. Still, no error bars appear ([screenshot: wrong]).

To plot the error bars, plotted_noise='provided' has to be passed to plot_data_and_simulation ([screenshot: correct]).

Error bars for e.g. plotTypeData = MeanAndSD are plotted directly, without any additional argument having to be set. It would be nice to implement this for provided as well, or otherwise to mention in the example notebook/documentation that the plotted_noise argument needs to be set.

Add: Allow plotting by observableID & simCondID at the same time

In some cases it could be useful to plot e.g. a single observable for only a few conditions.
Currently, an error is raised: "Plotting without visualization specification file and datasetId can be performed via grouping by simulation conditions OR observables, but not both. Stopping."

Cleanup visualization docstrings

In petab/visualize/*:

  • Ensure things look fine with sphinx (e.g. cd doc; make html; firefox build/html/index.html)
  • Add typehints and remove types from docstrings
  • Document arguments
  • Document return types

Visualization error if datasetId is not given but yValues is: yValues need to be sorted alphabetically

If datasetId is not given in the visualization-specification file but yValues is, the yValues need to be sorted alphabetically; otherwise the following error is thrown:

Traceback (most recent call last):
  File "/home/erika/Documents/Python/PEtab_my_files/visu_test.py", line 37, in <module>
    simulation_file_path
  File "/home/erika/Documents/env-petabpypesto37/lib/python3.7/site-packages/petab/visualize/plot_data_and_simulation.py", line 136, in plot_data_and_simulation
    plotted_noise)
  File "/home/erika/Documents/env-petabpypesto37/lib/python3.7/site-packages/petab/visualize/helper_functions.py", line 572, in create_or_update_vis_spec
    vis_spec = expand_vis_spec_settings(vis_spec, columns_dict)
  File "/home/erika/Documents/env-petabpypesto37/lib/python3.7/site-packages/petab/visualize/helper_functions.py", line 487, in expand_vis_spec_settings
    vis_spec[select_conditions].loc[:, column].values[0])
IndexError: index 0 is out of bounds for axis 0 with size 0

Process finished with exit code 1

This is because

obs_uni = list(np.unique(exp_data[OBSERVABLE_ID]))

(helper_functions.py, line 446)
sorts observables alphabetically.
It would be good if it were not mandatory for yValues to be sorted alphabetically.
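The underlying behaviour is easy to demonstrate: np.unique always returns its result sorted, regardless of input order, which is why the order of yValues currently has to match.

```python
import numpy as np

# np.unique returns a sorted array, regardless of input order
observables = ["obs_c", "obs_a", "obs_b"]
obs_uni = list(np.unique(observables))
print(obs_uni)  # ['obs_a', 'obs_b', 'obs_c']
```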

Should unspecified optional strings be the empty string or NaN?

At the moment, there can be NaNs (after pd.read_csv) in optional PEtab string columns, such as observableNames, that, if interpreted as a string, are converted to the string literal 'nan'.

>>> import numpy as np
>>> str(np.nan)
'nan'

An issue can occur in the AMICI plotting functions. This issue can be fixed by replacing

elif model.getObservableNames()[iy] != '':

with

elif model.getObservableNames()[iy] not in ['', 'nan']:

to correctly identify unspecified observable names. However, testing for the string 'nan' seems unintuitive, and this fix might cause another issue if an observable is named 'nan'.

Here's a solution that could be implemented in PEtab and might resolve the issue in AMICI.

$ cat test_str.csv
observableId    observableName
a_id    a_name
b_id    
>>> import pandas as pd
>>> df1 = pd.read_csv('test_str.csv', sep='\t')
>>> df2 = pd.read_csv('test_str.csv', sep='\t')
>>> df2['observableName'] = df2['observableName'].fillna('')
>>> df1
  observableId observableName
0         a_id         a_name
1         b_id            NaN
>>> df2
  observableId observableName
0         a_id         a_name
1         b_id               
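An alternative to fillna after reading would be to disable the NaN conversion at read time; a minimal sketch (using an inline table rather than a file) with keep_default_na=False, which makes empty cells come back as empty strings:

```python
import io

import pandas as pd

tsv = "observableId\tobservableName\na_id\ta_name\nb_id\t\n"

# keep_default_na=False leaves empty cells as '' instead of NaN
df = pd.read_csv(io.StringIO(tsv), sep="\t", keep_default_na=False)
print(df["observableName"].tolist())  # ['a_name', '']
```

Note that keep_default_na=False also stops literals like 'NA' or 'NaN' from being parsed as missing in other columns, so fillna('') on specific string columns may be the safer choice.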

pandas 1.2.x requirement?

I just tried installing petab in the Colab environment; unfortunately that failed, since Colab only seems to work with pandas up to 1.1.5. Any chance we could lower our requirement here?

Collecting petab
  Downloading petab-0.1.20-py3-none-any.whl (84 kB)
...
Collecting pandas>=1.2.0
  Downloading pandas-1.3.3-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.3 MB)
     |████████████████████████████████| 11.3 MB 12.6 MB/s 
...
Installing collected packages: pandas, colorama, petab
  Attempting uninstall: pandas
    Found existing installation: pandas 1.1.5
    Uninstalling pandas-1.1.5:
      Successfully uninstalled pandas-1.1.5
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires pandas~=1.1.0; python_version >= "3.0", but you have pandas 1.3.3 which is incompatible.
Successfully installed colorama-0.4.4 pandas-1.3.3 petab-0.1.20
WARNING: The following packages were previously imported in this runtime:
  [pandas]
You must restart the runtime in order to use newly installed versions.

encoding issue while writing out experiment

If, after reading the Sneyd_PNAS2002 model from the benchmark collection, I write it out again using the petab library, I get the following exception:

self = <encodings.cp1252.IncrementalEncoder object at 0x000001DBE8E097B8>
input = 'Ca_dose_response__1\t10 μM IP_3,  0.1 μM Ca^{2+}\t10.0\t0.1\r\r\n'
final = False

    def encode(self, input, final=False):
>       return codecs.charmap_encode(input,self.errors,encoding_table)[0]
E       UnicodeEncodeError: 'charmap' codec can't encode character '\u03bc' in position 23: character maps to <undefined>

cp1252.py:19: UnicodeEncodeError

It turns out the unit contains non-ASCII characters. Since PEtab had no problems reading the file, the writers should probably pass a UTF-8 encoding when writing out the tables, rather than just:

conditions.py:55: in write_condition_df
    df.to_csv(fh, sep='\t', index=True)
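A sketch of the suggested fix, with hypothetical data: pass an explicit UTF-8 encoding to to_csv (or open the file handle with encoding='utf-8') so non-ASCII condition names survive on platforms whose default codec is cp1252.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame(
    {"conditionName": ["10 \u03bcM IP_3"]},
    index=pd.Index(["Ca_dose_response__1"], name="conditionId"))

path = os.path.join(tempfile.mkdtemp(), "conditions.tsv")
# An explicit encoding avoids UnicodeEncodeError on cp1252-default platforms
df.to_csv(path, sep="\t", index=True, encoding="utf-8")

round_trip = pd.read_csv(path, sep="\t", index_col=0, encoding="utf-8")
print(round_trip.loc["Ca_dose_response__1", "conditionName"])  # 10 μM IP_3
```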

Add: plot STD if the noiseParameters column has values and no replicates are available

E.g. the Chen model specifies noiseParameters in the measurement data, and no replicate measurements are available, so the standard deviation cannot be calculated from replicates.
If no measurement replicates are available but the noiseParameters column has values, these values should be plotted as the standard deviation.
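The fallback could be sketched like this (hypothetical data; this assumes the noiseParameters column holds numeric standard deviations rather than parameter names):

```python
import pandas as pd

measurement_df = pd.DataFrame({
    "observableId": ["obs_a", "obs_a"],
    "time": [0.0, 1.0],
    "measurement": [1.2, 2.5],
    "noiseParameters": [0.1, 0.3],  # numeric SDs, no replicates
})

grouped = measurement_df.groupby(["observableId", "time"])["measurement"]
n_replicates = grouped.transform("size")

# Use the replicate SD where replicates exist, otherwise fall back to
# the values given in the noiseParameters column
sd = grouped.transform("std")
sd = sd.where(n_replicates > 1,
              pd.to_numeric(measurement_df["noiseParameters"],
                            errors="coerce"))
print(sd.tolist())  # [0.1, 0.3]
```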

writer functions do not perform sanity checks + use misleading options for writing tsv

When having dedicated writer functions such as petab.conditions.write_condition_df, I would actually expect them to perform some kind of sanity check on the output. Pretty bare wrappers around pd.to_csv don't seem too helpful, especially since the writers use index=True, which is not consistent with the provided spec and, since the readers don't use index_col=0, leaves an 'Unnamed: 0' column in the imported DataFrame.
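A minimal reproduction of the round-trip problem described above:

```python
import io

import pandas as pd

# Unnamed index, as produced when the index name is lost
df = pd.DataFrame({"conditionName": ["c1"]}, index=["cond1"])

buf = io.StringIO()
df.to_csv(buf, sep="\t", index=True)   # what the writers currently do
buf.seek(0)

df_back = pd.read_csv(buf, sep="\t")   # readers do not pass index_col=0
print(df_back.columns.tolist())        # ['Unnamed: 0', 'conditionName']
```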

Missing validity check for simulationConditionId.

Which problem would you like to address? Please describe.
petab.lint.lint_problem does not check whether simulationConditionIds in measurement_df are valid condition Ids.

Describe the solution you would like
petab.lint.lint_problem flags condition ids in measurement_df['simulationConditionId'] that are not defined in the condition table.

Describe alternatives you have considered
Actually specifying valid simulationConditionIds

Additional context
Add any other context about the request here.
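The requested check boils down to a set difference between the two tables; a sketch with hypothetical contents:

```python
import pandas as pd

condition_df = pd.DataFrame(
    {"conditionName": ["a", "b"]},
    index=pd.Index(["cond_a", "cond_b"], name="conditionId"))

measurement_df = pd.DataFrame(
    {"simulationConditionId": ["cond_a", "cond_typo"]})

# Flag simulationConditionIds not defined in the condition table
undefined = (set(measurement_df["simulationConditionId"])
             - set(condition_df.index))
print(undefined)  # {'cond_typo'}
```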

add case in visualization routine: in measurementData file, if same simulationConditionId, but differing preequilibrationId etc.

In file get_data_to_plot.py, see ll. 92-102.
TODO: The following case is not handled yet: if entries in the measurement file (preequilibrationConditionId, time, observableParameters, noiseParameters, observableTransformation, noiseDistribution) differ, the data should be split into separate groups.
Currently, the code searches for the group of rows sharing a unique simulationConditionId. E.g. rows 0, 6, 12, 18 share the same simulationConditionId; the code then checks whether the other column entries are the same (currently they are), takes the intersection of those rows with the matching columns (here again rows 0, 6, 12, 18), and continues.
If the other columns differ at some point, say rows 12 and 18 have different noiseParameters than rows 0 and 6, the current code would take rows 0 and 6 and silently drop rows 12 and 18.
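The grouping described above amounts to grouping by every distinguishing column, not only simulationConditionId; a sketch with hypothetical data:

```python
import pandas as pd

measurement_df = pd.DataFrame({
    "simulationConditionId": ["c1", "c1", "c1", "c1"],
    "noiseParameters": ["sigma1", "sigma1", "sigma2", "sigma2"],
    "measurement": [1.0, 2.0, 3.0, 4.0],
})

# Grouping by all distinguishing columns keeps rows with differing
# noiseParameters in separate groups instead of dropping them
group_cols = ["simulationConditionId", "noiseParameters"]
groups = {key: idx.tolist()
          for key, idx in measurement_df.groupby(group_cols).groups.items()}
print(groups)
# {('c1', 'sigma1'): [0, 1], ('c1', 'sigma2'): [2, 3]}
```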

Implement validation of PEtab visualization files

  • Implement consistency checks
    • plotId
    • [plotName]
    • plotTypeSimulation
    • plotTypeData
    • datasetId
    • [xValues]
    • [xOffset]
    • [xLabel]
    • [xScale]
    • [yValues]
    • [yOffset]
    • [yLabel]
    • [yScale]
    • [legendEntry]
  • Add as command line option to petablint

see also #1
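The column-level part of such a validator could look like this (a sketch; check_vis_spec_columns is a hypothetical helper, with column names taken from the list above):

```python
import pandas as pd

REQUIRED = {"plotId"}
OPTIONAL = {"plotName", "plotTypeSimulation", "plotTypeData", "datasetId",
            "xValues", "xOffset", "xLabel", "xScale",
            "yValues", "yOffset", "yLabel", "yScale", "legendEntry"}

def check_vis_spec_columns(vis_df: pd.DataFrame) -> list:
    """Return a list of human-readable problems (empty means OK)."""
    problems = [f"Missing required column: {col}"
                for col in sorted(REQUIRED - set(vis_df.columns))]
    problems += [f"Unknown column: {col}"
                 for col in sorted(set(vis_df.columns) - REQUIRED - OPTIONAL)]
    return problems

vis_df = pd.DataFrame(columns=["plotId", "typo_column"])
print(check_vis_spec_columns(vis_df))  # ['Unknown column: typo_column']
```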

observables but not species supported in `noiseFormula`

The documentation states that the noiseFormula in the observable table can be specified like so:
noiseParameter1_observable_pErk + noiseParameter2_observable_pErk*pErk

However, when I use a species in the noise formula, I get the following error:

TypeError                                 Traceback (most recent call last)
/media/sf_DPhil_Project/Project07_Parameter Fitting/PEtab/petab/calculate.py in evaluate_noise_formula(measurement, noise_formulas, parameter_df, simulation)
    187     try:
--> 188         noise_value = float(noise_value)
    189     except TypeError:

~/venvs/std/lib/python3.8/site-packages/sympy/core/expr.py in __float__(self)
    324             raise TypeError("can't convert complex to float")
--> 325         raise TypeError("can't convert expression to float")
    326 

TypeError: can't convert expression to float

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-11-8dc41b43303c> in <module>
     34 problem = core.DisFitProblem(PETAB_YAML)
     35 problem.write_jl_file()
---> 36 problem.optimize()
     37 problem.plot_results('c0', path='plot.pdf')
     38 problem.write_results()

/media/sf_DPhil_Project/Project07_Parameter Fitting/df_software/DisFit/DisFit/core.py in optimize(self)
    386         print(self.petab_problem.parameter_df)
    387 
--> 388         self._results['fval'] = -petab.calculate_llh(self.petab_problem.measurement_df.loc[:, cols],
    389             pd.concat([self.petab_problem.simulation_df.rename(columns={'measurement': 'simulation'}), ndf], axis=1),
    390             self.petab_problem.observable_df,

/media/sf_DPhil_Project/Project07_Parameter Fitting/PEtab/petab/calculate.py in calculate_llh(measurement_dfs, simulation_dfs, observable_dfs, parameter_dfs)
    272     for (measurement_df, simulation_df, observable_df, parameter_df) in zip(
    273             measurement_dfs, simulation_dfs, observable_dfs, parameter_dfs):
--> 274         _llh = calculate_llh_for_table(
    275             measurement_df, simulation_df, observable_df, parameter_df)
    276         llhs.append(_llh)

/media/sf_DPhil_Project/Project07_Parameter Fitting/PEtab/petab/calculate.py in calculate_llh_for_table(measurement_df, simulation_df, observable_df, parameter_df)
    314 
    315         # get noise standard deviation
--> 316         noise_value = evaluate_noise_formula(
    317             row, noise_formulas, parameter_df, petab.scale(simulation, scale))
    318 

/media/sf_DPhil_Project/Project07_Parameter Fitting/PEtab/petab/calculate.py in evaluate_noise_formula(measurement, noise_formulas, parameter_df, simulation)
    188         noise_value = float(noise_value)
    189     except TypeError:
--> 190         raise TypeError(
    191             f"Cannot replace all parameters in noise formula {noise_value} "
    192             f"for observable {observable_id}.")

TypeError: Cannot replace all parameters in noise formula 0.1*A + 0.5 for observable obs_a.

When I replace the species with an observable everything works fine.
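The failure can be reproduced with sympy alone: float() raises while a free symbol remains in the expression, and substituting the simulated species value (hypothetical here) resolves it:

```python
import sympy as sp

A = sp.Symbol("A")
noise_formula = sp.sympify("0.1*A + 0.5")

# float() fails while the species symbol A is still unresolved
try:
    float(noise_formula)
except TypeError as err:
    print(err)

# Substituting the simulated species value makes the expression numeric
noise_value = float(noise_formula.subs(A, 2.0))
print(noise_value)  # 0.7
```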

Logic of petab.visualize.helper_functions.get_vis_spec_dependent_columns_dict

So I am currently running into the following issue:

Traceback (most recent call last):
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/pydevd.py", line 1434, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "pretrain_per_sample.py", line 70, in <module>
    petab.visualize.plot_data_and_simulation(
  File "/Users/ffroehlich/Documents/HMS/mechanismEncoder/venv/lib/python3.8/site-packages/petab/visualize/plot_data_and_simulation.py", line 128, in plot_data_and_simulation
    exp_data, vis_spec = create_or_update_vis_spec(exp_data,
  File "/Users/ffroehlich/Documents/HMS/mechanismEncoder/venv/lib/python3.8/site-packages/petab/visualize/helper_functions.py", line 572, in create_or_update_vis_spec
    vis_spec = expand_vis_spec_settings(vis_spec, columns_dict)
  File "/Users/ffroehlich/Documents/HMS/mechanismEncoder/venv/lib/python3.8/site-packages/petab/visualize/helper_functions.py", line 487, in expand_vis_spec_settings
    vis_spec[select_conditions].loc[:, column].values[0])
IndexError: index 0 is out of bounds for axis 0 with size 0

I explicitly specify PLOT_ID but no DATASET_ID in my visualization table (if I do specify DATASET_ID, which is not specified in my measurements_df, petab will just show empty plots, ... bummer).

This means that I end up with https://github.com/PEtab-dev/PEtab/blob/c3d51d6ce98a533c686243dd9ba163276785fc44/petab/visualize/helper_functions.py#L558, which makes no effort at all to respect the previously defined plot_id_list, but instead creates a new set of plot ids of the form plot{i}. Those PLOT_ID of course don't match what's in my vis_spec, which means that the select_conditions in https://github.com/PEtab-dev/PEtab/blob/c3d51d6ce98a533c686243dd9ba163276785fc44/petab/visualize/helper_functions.py#L484 will never match anything.

I am at a loss as to how this is conceptually supposed to work, so I would appreciate it if anybody could enlighten me so that I can fix this issue.

visualization, multiple yValues in one plot

The documentation isn't very explicit about whether this should be only a single value or whether it can be multiple values (and, if so, in which format they should be passed).

plotting petab problem with simulations by observables does not work

Which problem would you like to address? Please describe.
Plotting a petab.Problem together with a simulation_df by observable_ids does not work properly: the simulation is plotted as constant zero.

plot_petab_problem(
    petab_problem,
    sim_data=simulation_df.rename(columns={"measurement": "simulation"}),
    observable_id_list=[["abs_pSCTF"]],
)

leads to:

[screenshot: simulation plotted as constant zero]

Describe the solution you would like
Simulations should be plotted by their values.

petab.visualize.* changes matplotlib rcParams

... this is at least a major inconvenience, as it prevents users from using their own styles, and should therefore be changed.

Currently affects petab/visualize/plotter.py and petab/visualize/helper_functions.py.
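A non-invasive alternative would be to scope any style changes with matplotlib.pyplot.rc_context, so user settings are restored on exit; a minimal sketch:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch
import matplotlib.pyplot as plt

before = plt.rcParams["lines.linewidth"]

# rcParams changes inside rc_context do not leak into the user's session
with plt.rc_context({"lines.linewidth": 5.0}):
    fig, ax = plt.subplots()
    ax.plot([0, 1], [0, 1])
    plt.close(fig)

print(plt.rcParams["lines.linewidth"] == before)  # True
```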

Visualization: if column 'XValues' not provided and multiple conditions should be plotted automatically

If the column xValues is not provided in the visualization table, it should be inferred:

see conversation in PEtab-dev/PEtab#283.

Visualization: cropped error bars

It often happens (in line plots) that the error bars are not completely visible. It seems the y-limits are set based on the measurement or simulation values, ignoring the error bars. This should be changed. It happens at both the lower and the upper end.
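A sketch of a possible fix, with hypothetical data: derive the y-limits from measurement ± error rather than from the measurements alone.

```python
import numpy as np

y = np.array([1.0, 4.0, 2.5])      # measurements
yerr = np.array([0.5, 1.5, 0.2])   # error-bar half-lengths

# Include the error bars when computing the axis limits
lower, upper = (y - yerr).min(), (y + yerr).max()
margin = 0.05 * (upper - lower)
ylim = (float(lower - margin), float(upper + margin))
print(ylim)  # (0.25, 5.75)
```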

Implement validation of CompositeProblem

  • All basic checks via lint_problem
  • basic parameter table checks
  • Special checks related to parameter table dependence on multiple model and measurement and condition files

Visualization: plotTypeData 'replicate' not working

Which problem would you like to address? Please describe.
I try to plot individual replicates of the data.

  1. If I have the column plotTypeData set to replicate in the visualization table, the data is still plotted as MeanAndSTD. You manually have to add plotted_noise='replicate':

plot_data_and_simulation(data_file_path,
                         condition_file_path,
                         visualization_file_path,
                         simulation_file_path,
                         plotted_noise='replicate')

  2. Doing that, an error arises:
Traceback (most recent call last):
  File "/home/erika/Documents/Python/PEtab_my_files/Boehm_visu_test.py", line 21, in <module>
    simulation_file_path, plotted_noise='replicate'
  File "env-petabpypesto37/lib/python3.7/site-packages/petab/visualize/plot_data_and_simulation.py", line 165, in plot_data_and_simulation
    exp_conditions, sim_data)
  File "env-petabpypesto37/lib/python3.7/site-packages/petab/visualize/helper_functions.py", line 765, in handle_dataset_plot
    plot_lowlevel(plot_spec, ax, conditions, measurement_to_plot, plot_sim)
  File "env-petabpypesto37/lib/python3.7/site-packages/petab/visualize/plotting_config.py", line 93, in plot_lowlevel
    conditions[conditions.index.values],
AttributeError: 'numpy.ndarray' object has no attribute 'index'

So conditions seems to have the wrong data type; here it is a numpy array with the time points that should be plotted on the x-axis.

Describe the solution you would like

  1. It would be nice if setting plotTypeData to replicate in the visualization table were enough, without additionally having to pass plotted_noise='replicate' to plot_data_and_simulation. That would be easier for beginners to understand.
  2. Apart from addressing the error, it would be nice if a single data point could also be plotted as a replicate, so that it can be shown with the marker 'x'. For now it has to be plotted with MeanAndSTD, which uses the marker '.', leading to invisible overlap between data points and simulation points ('o').
    [Screenshot from 2021-02-03 19-37-54]

single objectivePriorParameters are cast to numpy.float64

When all objectivePriorParameters entries are either empty or set to a single value, the respective column will be read with dtype numpy.float64, which may cause problems down the line, as the spec says this should be a string. This is not an issue currently, since all currently available priors require two parameters, but it may lead to problems if that changes.
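A workaround at read time could be to force the column to string dtype; a sketch with an inline table:

```python
import io

import pandas as pd

tsv = "parameterId\tobjectivePriorParameters\np1\t0.5\np2\t0.5\n"

df_default = pd.read_csv(io.StringIO(tsv), sep="\t")
print(df_default["objectivePriorParameters"].dtype)  # float64

# Forcing str dtype preserves the spec-mandated string representation
df_str = pd.read_csv(io.StringIO(tsv), sep="\t",
                     dtype={"objectivePriorParameters": str})
print(df_str["objectivePriorParameters"].tolist())  # ['0.5', '0.5']
```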

datasetId not optional for visualization table

Trying to use petab.visualize.plot_data_and_simulation without the column datasetId in the visualization table and without datasetId and preequilibrationConditionId produces the following error (file paths edited), while petablint does not report any error. Attached is a working example as well as the fix. One could make datasetId and preequilibrationConditionId mandatory.
example.zip
@elbaraim

  File "...AMICI_PEtab_simulation.py", line 62, in <module>
    ax = plot_data_and_simulation(exp_data=dir_measurments,
  File "...petab/visualize/plot_data_and_simulation.py", line 128, in plot_data_and_simulation
    exp_data, vis_spec = create_or_update_vis_spec(exp_data,
  File "...petab/visualize/helper_functions.py", line 572, in create_or_update_vis_spec
    vis_spec = expand_vis_spec_settings(vis_spec, columns_dict)
  File   "...petab/visualize/helper_functions.py", line 487, in expand_vis_spec_settings
    vis_spec[select_conditions].loc[:, column].values[0])
IndexError: index 0 is out of bounds for axis 0 with size 0

Python API for linter

A one-line Python API for the linter would be nice, basically calling petablint::main with arguments.
