Comments (8)
October 12th, 2017 Edit: This topic is now its own part of the user guide. Please see http://plot.ly/dash/sharing-data-between-callbacks
Ah, that's interesting. Thanks for sharing! I'm not sure if these arguments apply to Dash, but I'm curious to learn more. Some notes:
- Note that chained dependencies / intermediate inputs / intermediate steps are already supported (see the "Multiple Outputs" section in the user guide here: https://plot.ly/dash/getting-started-part-2)
- For the sake of code modularity, you can just use regular functions like this:
    global_df = pd.read_csv('...')

    app.layout = html.Div([
        dcc.Graph(id='graph'),
        html.Table(id='table'),
        dcc.Dropdown(id='dropdown')
    ])

    def clean_data(df, value):
        # some expensive clean data step
        [...]
        return cleaned_df

    @app.callback(Output('graph', 'figure'), [Input('dropdown', 'value')])
    def update_graph(value):
        dff = clean_data(global_df, value)
        figure = create_figure(dff)
        return figure

    # html.Table is updated through its 'children' property, not 'figure'
    @app.callback(Output('table', 'children'), [Input('dropdown', 'value')])
    def update_table(value):
        dff = clean_data(global_df, value)
        table = create_table(dff)
        return table
- In this case, we're performing the clean_data step twice when the dropdown changes. In the case of something like a shared reactive expression, this could potentially only be done once. However, in Dash, all of these callbacks are executed in parallel on the server so you wouldn't end up being faster (as long as you aren't request bound).
- If performance was really an issue, then you could add caching around clean_data, so that long expensive computations are only performed once (see https://plot.ly/dash/performance for more details)
- If we did something like intermediate expressions, we'd have to send the intermediate data back to the client (the browser), which would incur a network delay cost
- You can sort of already do this by just serializing your data as a string and displaying it in a hidden div. Again, this will incur a network delay cost, so this solution might not be any faster than just performing the calculation (albeit twice) in 2 different callbacks (which will be executed in parallel) and/or caching the intermediate values
Here's how you would do this in a hidden div:
    global_df = pd.read_csv('...')

    app.layout = html.Div([
        dcc.Graph(id='graph'),
        html.Table(id='table'),
        dcc.Dropdown(id='dropdown'),
        html.Div(id='intermediate-value', style={'display': 'none'})
    ])

    @app.callback(Output('intermediate-value', 'children'), [Input('dropdown', 'value')])
    def clean_data(value):
        # some expensive clean data step
        cleaned_df = your_expensive_clean_or_compute_step(value)
        return cleaned_df.to_json()  # or, more generally, json.dumps(cleaned_df)

    @app.callback(Output('graph', 'figure'), [Input('intermediate-value', 'children')])
    def update_graph(jsonified_cleaned_data):
        dff = pd.read_json(jsonified_cleaned_data)  # or, more generally json.loads(jsonified_cleaned_data)
        figure = create_figure(dff)
        return figure

    @app.callback(Output('table', 'children'), [Input('intermediate-value', 'children')])
    def update_table(jsonified_cleaned_data):
        dff = pd.read_json(jsonified_cleaned_data)  # or, more generally json.loads(jsonified_cleaned_data)
        table = create_table(dff)
        return table
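To see the data flow of the hidden-div pattern above without running a Dash app, here is a minimal sketch that uses plain functions and the standard json module. The names expensive_clean_step and call_count are hypothetical and only for illustration; in a real app the three functions would be registered as callbacks and the JSON string would travel through the browser.

```python
import json

call_count = {"clean": 0}  # counts how often the expensive step runs


def expensive_clean_step(value):
    """Hypothetical stand-in for the expensive cleaning step."""
    call_count["clean"] += 1
    return [{"x": i, "y": i * value} for i in range(5)]


def clean_data(value):
    """Plays the role of the callback that writes to the hidden div."""
    return json.dumps(expensive_clean_step(value))


def update_graph(jsonified_cleaned_data):
    """Plays the role of the graph callback: parse the JSON, build a figure-like dict."""
    rows = json.loads(jsonified_cleaned_data)
    return {"data": [{"x": [r["x"] for r in rows], "y": [r["y"] for r in rows]}]}


def update_table(jsonified_cleaned_data):
    """Plays the role of the table callback: parse the JSON, build rows."""
    rows = json.loads(jsonified_cleaned_data)
    return [[r["x"], r["y"]] for r in rows]


# One dropdown change: the expensive step runs once, both consumers reuse its output.
hidden_div_children = clean_data(3)
figure = update_graph(hidden_div_children)
table = update_table(hidden_div_children)
```

The trade-off is the one described above: the expensive step runs once, but each consumer pays for deserialization, and in a real app the string also makes a round trip over the network.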
Finally, note that when you run just app.run_server(), only a single process is running, which means that only one request can be handled at a time. If you run the app with gunicorn or, for development purposes, just add app.run_server(processes=4), then multiple requests can be handled at the same time. This means that callbacks will be executed in parallel (reducing the time cost of recomputing shared values).
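As a rough illustration of the caching option mentioned earlier in this comment, the sketch below memoizes a stand-in clean_data with functools.lru_cache from the standard library. This is an assumption made for illustration, not the approach from the linked performance page: lru_cache lives inside a single process, so with several worker processes each one keeps its own cache, and the arguments must be hashable. The calls list is a hypothetical helper that only records when the expensive body actually runs.

```python
from functools import lru_cache

calls = []  # records each time the expensive body really executes


@lru_cache(maxsize=32)
def clean_data(value):
    """Hypothetical expensive step; returns a tuple so the cached result is immutable."""
    calls.append(value)
    return tuple(v * 2 for v in range(value))


def update_graph(value):
    """Stand-in for the graph callback."""
    return {"points": list(clean_data(value))}


def update_table(value):
    """Stand-in for the table callback."""
    return [[p] for p in clean_data(value)]


# Both callbacks ask for the same input; only the first call does the work.
graph = update_graph(4)
table = update_table(4)
```

Within one process the second callback gets a cache hit. Across multiple processes, a cache shared between workers would be needed for the same effect.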
from dash.
As someone who works a lot in Shiny and is trying out Dash, I really expected an equivalent of this in Dash in some form or other. This is kind of a game changer for me.
I disagree that Dash overcomes this with multi-processing, specifically this statement:
"In this case, we're performing the clean_data step twice when the dropdown changes. In the case of something like a shared reactive expression, this could potentially only be done once. However, in Dash, all of these callbacks are executed in parallel on the server so you wouldn't end up being faster (as long as you aren't request bound)."
What if the function (e.g. clean_data) is 1) called more than 4 times (the number of cores/processors available) and 2) a long-running and/or memory- or compute-intensive algorithm?
from dash.
@chriddyp and @kmader, the hidden Div appears to be a bad idea in practice. I just spent the better (or worse) part of 2 days trying to figure out why my app was working flawlessly when run on my desktop but stumbled when pushed to the server (RHEL 7.1). As it turns out, the issue had to do with my use of a hidden Div to pass data between callbacks. When the DataFrame reached a certain number of rows (861), it would fail to execute the callbacks that depended on those Divs. I am guessing this has something to do with a size limit on Divs in HTML, but I am not certain.
I will try to post a reproducible example here in the next couple of days.
So far, Dash has been fantastic! But this has been a massive frustration. Live and learn...
from dash.
Thanks, the Output('intermediate-value', 'children') approach seems to be the closest match. The additional step of de/serialization is a bit clumsy but maybe a good start. The primary issue I was having in a current use case is that I had quite repetitive code, with the same input arguments being copied and pasted across multiple callbacks.
On the performance side, the additional benefit of having 'ReactiveExpressions' is that the caching could be handled invisibly and globally by Dash rather than on a function-by-function basis (this is what Shiny does). I'll need to study the code a bit more to see if there is anything else that might work.
from dash.
For future readers, I have written up some solutions to this problem in a new section of the user guide: http://plot.ly/dash/sharing-data-between-callbacks
from dash.
As it turns out, the issue had to do with my use of a hidden Div to pass data between callbacks. When the DataFrame reached a certain number of rows (861), it would fail to execute the callbacks that depended on those Divs. I am guessing this has something to do with a size limit on Divs in HTML, but I am not certain.
I will try to post a reproducible example here in the next couple of days.
Please try to create a reproducible example. I have used this method with 5MB of data successfully before, and I'm not aware of any inherent limitations. Another issue could be a request or response size limitation on the server that you are deploying it on (frequently the default is something like 1MB).
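One way to check whether a size limit is plausible is to measure the serialized payload before it goes into the hidden div. The sketch below is only an illustration: the 1 MB figure is the rough default mentioned above, not a known setting of any particular server, and ASSUMED_LIMIT_BYTES, payload_size_bytes and fits_within_limit are hypothetical names.

```python
import json

# Assumption: a 1 MB limit, taken from the rough default mentioned in this thread.
ASSUMED_LIMIT_BYTES = 1 * 1024 * 1024


def payload_size_bytes(records):
    """Size in bytes of the JSON string that would be stored in the hidden div."""
    return len(json.dumps(records).encode("utf-8"))


def fits_within_limit(records, limit=ASSUMED_LIMIT_BYTES):
    """True if the serialized records stay at or under the assumed limit."""
    return payload_size_bytes(records) <= limit


# A small payload and a deliberately large one (made-up data for illustration).
small = [{"row": i, "value": i * 0.5} for i in range(100)]
large = [{"row": i, "value": "x" * 100} for i in range(20000)]

print(payload_size_bytes(small), fits_within_limit(small))
print(payload_size_bytes(large), fits_within_limit(large))
```

If the payload that fails on the server is well above the configured limit while the one that works is below it, that points to the server configuration rather than to the hidden div itself.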
from dash.
I'll try to recreate an example. It is probably the request/response size limitation as you indicate. I'm a newb web-dev, hence why Dash has been fantastic for me!
I'm also checking with my server admin to see if they're imposing some kind of size limit.
Thanks for the speedy response.
from dash.
Closing this. We have to do things differently than other frameworks because we support multiple processes. We have documented several ways to pass intermediate data around in https://dash.plot.ly/sharing-state-between-callbacks, and this will only get better with declarative client-side transformations (#266) and support for multiple outputs (#80).
from dash.