petab-dev / libpetab-python
Python package for working with PEtab files
Home Page: https://libpetab-python.readthedocs.io
License: MIT License
I feel like xOffset should be adapted to the scale that is employed; applying linear offsets on a log scale doesn't work well.
simulation_and_data__cell_cycle_no_M_washout_ridge__AU565__LEE011_Ribociclib__lambda10.0.pdf
Hey,
I am using PEtab in the develop branch on a WSL.
I tried to plot some simulations and experimental data using the visualization specification.
These are the files I used:
issue.zip
and here is the code:
from petab.visualize import plot_data_and_simulation
import matplotlib.pyplot as plt

ax = plot_data_and_simulation(data_file_path,
                              condition_file_path,
                              visualization_file_path,
                              simulation_file_path)
It returns:
The simulations are not plotted correctly.
If, in PEtab/petab/visualize/plot_data_and_simulation.py, lines 138 and 139 are commented out, the simulations are plotted correctly.
I am not sure what the use of lines 138 and 139 is, but at least for my model they seem to be problematic, although I had no problem re-plotting the test models Isensee and Fujita locally.
simulation_and_data__cell_cycle_no_M_washout_ridge__AU565__LEE011_Ribociclib__lambda10.0.pdf
Would be great if coloring of individual label entries were consistent across plotIds.
Allowed in YAML file, but not implemented in library.
I don't think so. In libpetab-python/petab/parameter_mapping.py, line 365 in 0917194: if overridee_id is not present in mapping, the code will simply add a new entry and not throw a KeyError.
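For reference, this matches plain dict semantics in Python: assignment never raises KeyError, it inserts; only lookups raise for absent keys. A minimal illustration (the dict contents and key names here are made up):

```python
# Plain-dict assignment inserts missing keys instead of raising KeyError.
# Only lookups (mapping[key]) raise for an absent key.
mapping = {"existing_id": 1.0}
mapping["overridee_id"] = 2.0  # key was absent before: silently added
```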
Good morning,
I am using PEtab in the develop branch on a WSL.
I tried to plot the experimental data with error bars using the visualization specification.
These are the files I used:
issue.zip
and here is the code:
from petab.visualize import plot_data_and_simulation
import matplotlib.pyplot as plt

ax = plot_data_and_simulation(data_file_path,
                              condition_file_path,
                              visualization_file_path,
                              simulation_file_path)
The noise for each datapoint is directly given in the measurement table, so I chose plotTypeData = provided in the visualization specification file. Still, no error bars appear. To plot the error bars, plotted_noise='provided' has to be added to the plot_data_and_simulation call. The error bars for e.g. plotTypeData = MeanAndSD are plotted directly, without any additional argument having to be set. I think it would be nice to implement this for provided as well, or instead mention in the example notebook/documentation the need to set the plotted_noise argument.
Could in some cases be useful to plot e.g. a single observable for only a few conditions.
So far, an error is raised: "Plotting without visualization specification file and datasetId can be performed via grouping by simulation conditions OR observables, but not both. Stopping."
In petab/visualize/*:
cd doc; make html; firefox build/html/index.html
If in the visualization specification file datasetId is not given, but yValues is given, the yValues need to be sorted alphabetically; otherwise the following error is thrown:
Traceback (most recent call last):
File "/home/erika/Documents/Python/PEtab_my_files/visu_test.py", line 37, in <module>
simulation_file_path
File "/home/erika/Documents/env-petabpypesto37/lib/python3.7/site-packages/petab/visualize/plot_data_and_simulation.py", line 136, in plot_data_and_simulation
plotted_noise)
File "/home/erika/Documents/env-petabpypesto37/lib/python3.7/site-packages/petab/visualize/helper_functions.py", line 572, in create_or_update_vis_spec
vis_spec = expand_vis_spec_settings(vis_spec, columns_dict)
File "/home/erika/Documents/env-petabpypesto37/lib/python3.7/site-packages/petab/visualize/helper_functions.py", line 487, in expand_vis_spec_settings
vis_spec[select_conditions].loc[:, column].values[0])
IndexError: index 0 is out of bounds for axis 0 with size 0
Process finished with exit code 1
This is because obs_uni = list(np.unique(exp_data[OBSERVABLE_ID])) (helper_functions.py, line 446) sorts observables alphabetically. It would be good if it were not mandatory for yValues to be sorted alphabetically.
Verify all tests pass on Windows
At the moment, there can be NaNs (after pd.read_csv) in optional PEtab string columns, such as observableName, that, if interpreted as a string, are converted to the string literal 'nan'.
>>> import numpy as np
>>> str(np.nan)
'nan'
An issue can occur in the AMICI plotting functions. This issue can be fixed by replacing
elif model.getObservableNames()[iy] != '':
with
elif model.getObservableNames()[iy] in ['', 'nan']:
to correctly identify unspecified observable names. However, testing for the string 'nan' seems unintuitive, and this fix might cause another issue if an observable is named 'nan'.
Here's a solution, which could be implemented in PEtab and might resolve the issue in AMICI.
$ cat test_str.csv
observableId observableName
a_id a_name
b_id
>>> import pandas as pd
>>> df1 = pd.read_csv('test_str.csv', sep='\t')
>>> df2 = pd.read_csv('test_str.csv', sep='\t')
>>> df2['observableName'] = df2['observableName'].fillna('')
>>> df1
observableId observableName
0 a_id a_name
1 b_id NaN
>>> df2
observableId observableName
0 a_id a_name
1 b_id
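A minimal sketch of that solution applied to a reader, assuming the set of optional string columns is known up front (the helper and the column list are illustrative, not the actual petab API):

```python
import io
import pandas as pd

# Illustrative subset of optional PEtab string columns.
OPTIONAL_STRING_COLUMNS = ["observableName"]

def read_observable_df(path_or_buf) -> pd.DataFrame:
    """Read a PEtab observable table, filling NaN in optional string
    columns with '' so they never become the string literal 'nan'."""
    df = pd.read_csv(path_or_buf, sep="\t")
    for col in OPTIONAL_STRING_COLUMNS:
        if col in df.columns:
            df[col] = df[col].fillna("")
    return df

tsv = "observableId\tobservableName\na_id\ta_name\nb_id\t\n"
df = read_observable_df(io.StringIO(tsv))
```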
I just tried installing petab in the colab environment, unfortunately that failed, since colab only seems to work with pandas up to 1.1.5. Any chance we could lower our requirement here?
Collecting petab
Downloading petab-0.1.20-py3-none-any.whl (84 kB)
...
Collecting pandas>=1.2.0
Downloading pandas-1.3.3-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.3 MB)
|████████████████████████████████| 11.3 MB 12.6 MB/s
...
Installing collected packages: pandas, colorama, petab
Attempting uninstall: pandas
Found existing installation: pandas 1.1.5
Uninstalling pandas-1.1.5:
Successfully uninstalled pandas-1.1.5
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires pandas~=1.1.0; python_version >= "3.0", but you have pandas 1.3.3 which is incompatible.
Successfully installed colorama-0.4.4 pandas-1.3.3 petab-0.1.20
WARNING: The following packages were previously imported in this runtime:
[pandas]
You must restart the runtime in order to use newly installed versions.
petab.lint.check_condition_df does not check whether index entries are unique, but this is required according to the documentation.
Use the tempfile....File context managers with delete=True, and yield results in between if necessary.
c.f. discussion at end of PEtab-dev/PEtab#408
No error is raised when noise is given as normally distributed with noiseParameter1_{observableId} in the observables table, but noiseParameters is missing in the measurement table.
For example,
Zhao_QuantBiol2020.zip
passes petablint.
Would be nice if measurements could be visualized without providing a condition table or visualization table. Currently not possible afaik.
Condition data should only be required for dose-response plots. Those should be plotted without the respective dose information.
If, after reading the Sneyd_PNAS2002 model from the benchmark collection, I write it out again using the petab library, I get the following exception:
self = <encodings.cp1252.IncrementalEncoder object at 0x000001DBE8E097B8>
input = 'Ca_dose_response__1\t10 μM IP_3, 0.1 μM Ca^{2+}\t10.0\t0.1\r\r\n'
final = False
def encode(self, input, final=False):
> return codecs.charmap_encode(input,self.errors,encoding_table)[0]
E UnicodeEncodeError: 'charmap' codec can't encode character '\u03bc' in position 23: character maps to <undefined>
cp1252.py:19: UnicodeEncodeError
It turns out the unit is given in special characters. Since PEtab had no problems reading it, maybe the writers should pass a UTF-8 encoding attribute when writing out the tables, rather than just:
conditions.py:55: in write_condition_df
df.to_csv(fh, sep='\t', index=True)
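A sketch of what that could look like: the explicit UTF-8 encoding is the only change to the writer, and it keeps characters like 'μ' intact on platforms whose default codec is e.g. cp1252 (function signature assumed to stay as-is):

```python
import os
import tempfile
import pandas as pd

def write_condition_df(df: pd.DataFrame, path: str) -> None:
    # An explicit UTF-8 encoding avoids UnicodeEncodeError on platforms
    # whose default codec (e.g. cp1252 on Windows) cannot encode 'μ'.
    # newline="" also avoids the doubled '\r\r\n' line endings.
    with open(path, "w", encoding="utf-8", newline="") as fh:
        df.to_csv(fh, sep="\t", index=True)

df = pd.DataFrame(
    {"conditionName": ["10 μM IP_3, 0.1 μM Ca^{2+}"]},
    index=pd.Index(["Ca_dose_response__1"], name="conditionId"),
)
path = os.path.join(tempfile.mkdtemp(), "conditions.tsv")
write_condition_df(df, path)
roundtrip = pd.read_csv(path, sep="\t", index_col=0, encoding="utf-8")
```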
https://github.com/readthedocs/recommonmark/tree/2357251067481413916b309c6c84575eb9018ba7
Should use MyST instead?
The function import_from_file is currently not used at all, cf.
https://github.com/PEtab-dev/PEtab/blob/40ca2a012295a8009ed72cca22b60048cacf4e7b/petab/visualize/helper_functions.py#L29
Imo it does not provide extra value, because all the importing is done in plot_data_and_simulation. Since it has to be kept up to date if e.g. changes happen to functions it calls, I was wondering if we might just delete it.
E.g. the Chen model has noiseParameters specified in the measurement data, and no replicate measurement data is available (so the standard deviation can't be calculated from replicates).
For plotting the standard deviation: if no measurement replicates are available and the noiseParameters column has values, plot these values as the standard deviation.
Allowed in YAML file, but not implemented in library.
We should check that the experimentalCondition file does not contain "time" or "condition" as parameters.
When having dedicated writer functions such as petab.conditions.write_condition_df etc., I would actually expect them to perform some kind of sanity check on the output. Having pretty bare wrappers around pd.to_csv doesn't seem too helpful, especially since the writers use index=True, which is not consistent with the provided spec and, since the readers don't use index_col=0, leaves an 'Unnamed: 0' column in the imported DataFrame.
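A sketch of a writer/reader pair that round-trips cleanly; the function names mirror the petab ones, but the validation shown is only illustrative (a real implementation could call the lint routines instead):

```python
import io
import pandas as pd

def write_condition_df(df: pd.DataFrame, buf) -> None:
    # Minimal sanity check before writing; illustrative only.
    if df.index.name != "conditionId":
        raise ValueError("condition table index must be 'conditionId'")
    df.to_csv(buf, sep="\t", index=True)

def read_condition_df(buf) -> pd.DataFrame:
    # index_col=0 restores the conditionId index instead of leaving
    # an 'Unnamed: 0' column behind.
    return pd.read_csv(buf, sep="\t", index_col=0)

df = pd.DataFrame({"k1": [1.0]},
                  index=pd.Index(["c0"], name="conditionId"))
buf = io.StringIO()
write_condition_df(df, buf)
buf.seek(0)
roundtrip = read_condition_df(buf)
```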
Which problem would you like to address? Please describe.
petab.lint.lint_problem does not check whether simulationConditionIds in measurement_df are valid condition IDs.
Describe the solution you would like
petab.lint.lint_problem flags condition IDs in measurement_df['simulationConditionId'] that are not defined in the condition table.
Describe alternatives you have considered
Actually specifying valid simulationConditionIds.
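A hedged sketch of the requested check (the function name and return type are made up; the real lint API may differ):

```python
import pandas as pd

def check_simulation_condition_ids(measurement_df: pd.DataFrame,
                                   condition_df: pd.DataFrame) -> set:
    """Return the simulationConditionIds used in the measurement table
    that are not defined in the condition table (empty set = OK)."""
    known = set(condition_df.index)
    used = set(measurement_df["simulationConditionId"])
    return used - known

condition_df = pd.DataFrame(index=pd.Index(["c0", "c1"], name="conditionId"))
measurement_df = pd.DataFrame({"simulationConditionId": ["c0", "c2"]})
missing = check_simulation_condition_ids(measurement_df, condition_df)
```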
In file 'get_data_to_plot.py', see ll. 92-102.
TODO, not handled here: if entries in the measurement file (preequCondId, time, observableParams, noiseParams, observableTransf, noiseDistr) are not the same, these data should be split into different groups.
Currently: group by simulationConditionId, i.e. search for groups of rows with the same simulationConditionId. E.g. rows 0, 6, 12, 18 share the same simulationConditionId; then check whether the other column entries are the same (currently they are), take the intersection of rows 0, 6, 12, 18 with the rows where the checked columns match (currently again 0, 6, 12, 18), and continue with the code.
If at some point there is a difference in the other columns, say rows 12 and 18 have different noiseParams than rows 0 and 6, the current code would take rows 0 and 6 and forget about rows 12 and 18.
petablint
see also #1
The documentation states that the noiseFormula in the observable table can be specified like so:
noiseParameter1_observable_pErk + noiseParameter2_observable_pErk*pErk
However, when I use a species in the noise formula, I get the following error:
TypeError Traceback (most recent call last)
/media/sf_DPhil_Project/Project07_Parameter Fitting/PEtab/petab/calculate.py in evaluate_noise_formula(measurement, noise_formulas, parameter_df, simulation)
187 try:
--> 188 noise_value = float(noise_value)
189 except TypeError:
~/venvs/std/lib/python3.8/site-packages/sympy/core/expr.py in __float__(self)
324 raise TypeError("can't convert complex to float")
--> 325 raise TypeError("can't convert expression to float")
326
TypeError: can't convert expression to float
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
<ipython-input-11-8dc41b43303c> in <module>
34 problem = core.DisFitProblem(PETAB_YAML)
35 problem.write_jl_file()
---> 36 problem.optimize()
37 problem.plot_results('c0', path='plot.pdf')
38 problem.write_results()
/media/sf_DPhil_Project/Project07_Parameter Fitting/df_software/DisFit/DisFit/core.py in optimize(self)
386 print(self.petab_problem.parameter_df)
387
--> 388 self._results['fval'] = -petab.calculate_llh(self.petab_problem.measurement_df.loc[:, cols],
389 pd.concat([self.petab_problem.simulation_df.rename(columns={'measurement': 'simulation'}), ndf], axis=1),
390 self.petab_problem.observable_df,
/media/sf_DPhil_Project/Project07_Parameter Fitting/PEtab/petab/calculate.py in calculate_llh(measurement_dfs, simulation_dfs, observable_dfs, parameter_dfs)
272 for (measurement_df, simulation_df, observable_df, parameter_df) in zip(
273 measurement_dfs, simulation_dfs, observable_dfs, parameter_dfs):
--> 274 _llh = calculate_llh_for_table(
275 measurement_df, simulation_df, observable_df, parameter_df)
276 llhs.append(_llh)
/media/sf_DPhil_Project/Project07_Parameter Fitting/PEtab/petab/calculate.py in calculate_llh_for_table(measurement_df, simulation_df, observable_df, parameter_df)
314
315 # get noise standard deviation
--> 316 noise_value = evaluate_noise_formula(
317 row, noise_formulas, parameter_df, petab.scale(simulation, scale))
318
/media/sf_DPhil_Project/Project07_Parameter Fitting/PEtab/petab/calculate.py in evaluate_noise_formula(measurement, noise_formulas, parameter_df, simulation)
188 noise_value = float(noise_value)
189 except TypeError:
--> 190 raise TypeError(
191 f"Cannot replace all parameters in noise formula {noise_value} "
192 f"for observable {observable_id}.")
TypeError: Cannot replace all parameters in noise formula 0.1*A + 0.5 for observable obs_a.
When I replace the species with an observable, everything works fine.
The recent merge PEtab-dev/PEtab#214 allows visualization specs with a large number of subplots to be saved individually to files (c.f. also PEtab-dev/PEtab#213).
This can take some time and should be well parallelizable.
So I am currently running into the following issue:
Traceback (most recent call last):
File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/pydevd.py", line 1434, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "pretrain_per_sample.py", line 70, in <module>
petab.visualize.plot_data_and_simulation(
File "/Users/ffroehlich/Documents/HMS/mechanismEncoder/venv/lib/python3.8/site-packages/petab/visualize/plot_data_and_simulation.py", line 128, in plot_data_and_simulation
exp_data, vis_spec = create_or_update_vis_spec(exp_data,
File "/Users/ffroehlich/Documents/HMS/mechanismEncoder/venv/lib/python3.8/site-packages/petab/visualize/helper_functions.py", line 572, in create_or_update_vis_spec
vis_spec = expand_vis_spec_settings(vis_spec, columns_dict)
File "/Users/ffroehlich/Documents/HMS/mechanismEncoder/venv/lib/python3.8/site-packages/petab/visualize/helper_functions.py", line 487, in expand_vis_spec_settings
vis_spec[select_conditions].loc[:, column].values[0])
IndexError: index 0 is out of bounds for axis 0 with size 0
I explicitly specify PLOT_ID but no DATASET_ID in my visualization table (if I do specify a DATASET_ID that is not specified in my measurements_df, petab will just show empty plots, ... bummer).
This means that I end up with https://github.com/PEtab-dev/PEtab/blob/c3d51d6ce98a533c686243dd9ba163276785fc44/petab/visualize/helper_functions.py#L558, which makes no effort at all to respect the previously defined plot_id_list, but instead creates a new set of plot ids of the form plot{i}. Those PLOT_IDs of course don't match what's in my vis_spec, which means that the select_conditions in https://github.com/PEtab-dev/PEtab/blob/c3d51d6ce98a533c686243dd9ba163276785fc44/petab/visualize/helper_functions.py#L484 will never match anything.
I am at a loss as to how this is conceptually supposed to work at all, so I would appreciate it if anybody could enlighten me so that I can fix this issue.
The documentation isn't very explicit about whether this should only be a single value or whether it can be multiple values (and, if so, in which format they should be passed).
Which problem would you like to address? Please describe.
Plotting a petab.Problem together with a simulation_df by observable IDs does not work properly: the simulation is plotted as constant zero.
plot_petab_problem(
petab_problem,
sim_data=simulation_df.rename(columns={"measurement": "simulation"}),
observable_id_list=[["abs_pSCTF"]],
)
leads to
Describe the solution you would like
Simulations should be plotted by their values.
Would be quite convenient if Paths could be used in addition to path strings.
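Supporting this is usually a one-liner per entry point; a sketch based on os.fspath, which accepts anything implementing __fspath__ (the helper name here is hypothetical):

```python
import os
from pathlib import Path
from typing import Union

def normalize_path(path: Union[str, os.PathLike]) -> str:
    # os.fspath accepts str, bytes, and any os.PathLike (such as
    # pathlib.Path), so callers can pass either type.
    return os.fspath(path)
```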
... this is at least a major inconvenience, as it prevents users from using their own styles, and should therefore be changed.
Currently affects petab/visualize/plotter.py and petab/visualize/helper_functions.py.
If, in the visualization table, the column xValues is not provided, it is inferred:
see conversation in PEtab-dev/PEtab#283.
It often happens (in line plots) that the error bars are not completely visible. It seems the y-limits are set based on the measurement or simulation, ignoring the error bars. This should be changed. It happens at both the lower and the upper end.
Which problem would you like to address? Please describe.
I try to plot individual replicates of the data. With plotTypeData = replicate in the visualization table, the data is still plotted as MeanAndSD. You manually have to add plotted_noise='replicate':

plot_data_and_simulation(data_file_path,
                         condition_file_path,
                         visualization_file_path,
                         simulation_file_path,
                         plotted_noise='replicate')
Traceback (most recent call last):
File "/home/erika/Documents/Python/PEtab_my_files/Boehm_visu_test.py", line 21, in <module>
simulation_file_path, plotted_noise='replicate'
File "env-petabpypesto37/lib/python3.7/site-packages/petab/visualize/plot_data_and_simulation.py", line 165, in plot_data_and_simulation
exp_conditions, sim_data)
File "env-petabpypesto37/lib/python3.7/site-packages/petab/visualize/helper_functions.py", line 765, in handle_dataset_plot
plot_lowlevel(plot_spec, ax, conditions, measurement_to_plot, plot_sim)
File "env-petabpypesto37/lib/python3.7/site-packages/petab/visualize/plotting_config.py", line 93, in plot_lowlevel
conditions[conditions.index.values],
AttributeError: 'numpy.ndarray' object has no attribute 'index'
So conditions seems to have the wrong data type; here it is a numpy array with the time points that should be plotted on the x-axis.
Describe the solution you would like
Setting plotTypeData to replicate in the visualization table would be enough (without additionally having to add plotted_noise='replicate' to the plot_data_and_simulation call). This could be easier to understand for beginners.

MeanAndSD uses the marker '.', which leads to invisible overlapping between data points and simulation points ('o').

This can also be extended to resolve name collisions from different models, where parameters have the same IDs but different meanings.
Simply adding ';' at the end of a petab.OBSERVABLE_PARAMETERS entry should ideally lead to a check for '' in the parameter table, which fails.
When all objectivePriorParameters are either empty or set to a single value, the respective column will be read with dtype numpy.float64, which may cause problems down the line, as the spec says this should be a string. Not an issue currently, since all currently available priors require two parameters, but it may lead to problems if that changes.
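One possible fix on the reading side is to force the column to string dtype; a sketch (the actual petab readers may choose a different mechanism):

```python
import io
import pandas as pd

# Force objectivePriorParameters to str, as the spec requires, even when
# every entry happens to parse as a single float.
tsv = "parameterId\tobjectivePriorParameters\np1\t1.0\np2\t2.0\n"
df = pd.read_csv(io.StringIO(tsv), sep="\t",
                 dtype={"objectivePriorParameters": str})
```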
Trying to use petab.visualize.plot_data_and_simulation without the column datasetId in the visualization table and without datasetId and preequilibrationConditionId produces the following error (file paths edited), while petablint does not produce any error. Attached is a working example as well as the fix. One could make datasetId and preequilibrationConditionId mandatory.
example.zip
@elbaraim
File "...AMICI_PEtab_simulation.py", line 62, in <module>
ax = plot_data_and_simulation(exp_data=dir_measurments,
File "...petab/visualize/plot_data_and_simulation.py", line 128, in plot_data_and_simulation
exp_data, vis_spec = create_or_update_vis_spec(exp_data,
File "...petab/visualize/helper_functions.py", line 572, in create_or_update_vis_spec
vis_spec = expand_vis_spec_settings(vis_spec, columns_dict)
File "...petab/visualize/helper_functions.py", line 487, in expand_vis_spec_settings
vis_spec[select_conditions].loc[:, column].values[0])
IndexError: index 0 is out of bounds for axis 0 with size 0
A 1-line python API to the linter would be nice, basically calling petablint::main with arguments.
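A sketch of what such a wrapper could look like. The '-y' flag is an assumption about the petablint CLI (check petablint --help); the wrapper names are hypothetical, and lint simply builds and runs the CLI invocation:

```python
import shutil
import subprocess

def petablint_command(yaml_path: str) -> list:
    # Flag name assumed; verify against `petablint --help`.
    return ["petablint", "-y", yaml_path]

def lint(yaml_path: str) -> int:
    """Run petablint on a PEtab problem YAML; return its exit code."""
    if shutil.which("petablint") is None:
        raise FileNotFoundError("petablint not found on PATH")
    return subprocess.run(petablint_command(yaml_path)).returncode
```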