doubleml / doubleml-docs
Documentation and User Guide for DoubleML - Double Machine Learning in Python & R
Home Page: https://docs.doubleml.org
License: BSD 3-Clause "New" or "Revised" License
Change the Gallery to use Sphinx-Gallery.
Also update versions of
Reminder: Adapt the new notebooks to the changes in PR #73
R and Python notebook for A/B example: #75
Preprocessing file required for the demand elasticity example: https://github.com/DoubleML/doubleml-docs/blob/p-add-demand-data-preprocessing/doc/examples/py_elasticity_preprocessing.ipynb
Depending on the timing of PR #73, this might also involve adjusting the analysis notebook for demand elasticity estimation
Changes required due to PRs #73, DoubleML/doubleml-for-py#151 and DoubleML/doubleml-for-r#161
1.3.0
The example for the evaluate_learners() method is not complete. Since the IRM contains np.nan values for some targets of the nuisance function, the mean_absolute_error metric has to be adjusted.
The example
import numpy as np
import doubleml as dml
from sklearn.metrics import mean_absolute_error
from doubleml.datasets import make_irm_data
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
np.random.seed(3141)
ml_g = RandomForestRegressor(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
ml_m = RandomForestClassifier(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
data = make_irm_data(theta=0.5, n_obs=500, dim_x=20, return_type='DataFrame')
obj_dml_data = dml.DoubleMLData(data, 'y', 'd')
dml_irm_obj = dml.DoubleMLIRM(obj_dml_data, ml_g, ml_m)
dml_irm_obj.fit()
dml_irm_obj.evaluate_learners(metric=mean_absolute_error)
raises a ValueError: Input contains NaN
Adding
def mae(y_true, y_pred):
    # Drop observations whose target is NaN before computing the metric
    subset = np.logical_not(np.isnan(y_true))
    return mean_absolute_error(y_true[subset], y_pred[subset])
dml_irm_obj.evaluate_learners(metric=mae)
will solve this issue.
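The same masking idea can be written once and reused for any metric. The following is a minimal sketch, not part of DoubleML: nan_safe and plain_mae are hypothetical helper names, and it assumes the metric is called with the arguments y_true and y_pred, as in the mae function above. A NumPy-only MAE stands in for sklearn's mean_absolute_error so the snippet has no further dependencies.

```python
import numpy as np


def nan_safe(metric):
    """Wrap metric(y_true, y_pred) so entries with NaN targets are dropped first.

    Hypothetical helper for illustration; not part of the DoubleML API.
    """
    def wrapped(y_true, y_pred):
        y_true = np.asarray(y_true, dtype=float)
        y_pred = np.asarray(y_pred, dtype=float)
        keep = ~np.isnan(y_true)          # True where a target is available
        return metric(y_true[keep], y_pred[keep])
    return wrapped


def plain_mae(y_true, y_pred):
    """Mean absolute error without any NaN handling."""
    return float(np.mean(np.abs(y_true - y_pred)))


# Toy data: two of the four targets are missing.
y_true = np.array([1.0, np.nan, 3.0, np.nan])
y_pred = np.array([1.5, 2.0, 2.0, 4.0])

mae_with_nan = plain_mae(y_true, y_pred)                # NaN propagates
mae_ignoring_nan = nan_safe(plain_mae)(y_true, y_pred)  # uses rows 0 and 2 only
```

Under the same assumption about the call signature, nan_safe(mean_absolute_error) could be passed as the metric argument in place of the hand-written mae.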
Currently the navigation (and logo) on the left and right are not sticky (see screenshot). Usually with pydata_sphinx_theme this is the case.
Tests with a minimal working example show that it seems to be related to sphinx_panels. Looking into pandas, we can presumably fix it with the following lines: pandas-dev/pandas@b5622c6#diff-d8d3ed25802824d15bf411f8e97416d8fc6f9247e821b0617f2b869dc584b99cR68-R146.
Currently, the code cells are not displayed in the DoubleML Workflow, see https://docs.doubleml.org/dev/workflow/workflow.html
Build problems (https://github.com/DoubleML/doubleml-docs/runs/5689388602?check_suite_focus=true) have existed since the release of jinja2 in version 3.1.0. The problems go back to deprecated functions, and the root cause seems to be in nbconvert. See, among others, sphinx-doc/sphinx#10289, jupyter/nbconvert#1736, jupyter/nbconvert#1624.
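To see quickly whether a local environment falls into the affected range, the installed jinja2 version can be compared against 3.1.0. This is only a rough sketch: parse_version, is_affected and installed_version are hypothetical helper names, and the simple tuple comparison ignores pre-release suffixes.

```python
from importlib import metadata


def parse_version(text):
    """Turn '3.1.0' into (3, 1, 0); stops at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_affected(installed, first_bad="3.1.0"):
    """True if the installed version is at or above the first problematic release."""
    return parse_version(installed) >= parse_version(first_bad)


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


jinja2_version = installed_version("jinja2")
if jinja2_version is not None:
    print("jinja2", jinja2_version, "affected:", is_affected(jinja2_version))
```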
In #117 I pinned the version number for sphinx. However, we should check whether we can update the sphinx version soon. There have been several releases and changes since version 4.5.0.
Currently we have two workflows for this repo. One is deploying our site to dev and one to stable. The dev deploy is also used to check whether everything is still working (e.g. checking for broken jupyter notebooks due to dependency updates). I would like to adapt and extend the workflows:
_build folder with the html files etc. as artifacts of the workflow. The test workflow should then also be activated for pull requests. There we currently have no checks activated. In the test workflow I also want to integrate the linkcheck (see #55).
Currently, we are relying on sphinx==4.5.0 and pydata-sphinx-theme==0.9.0 (see requirements).
I would suggest updating these to sphinx==5.0.2 and pydata-sphinx-theme==0.13.1, as these are the latest versions on Anaconda.
But some files have to be adjusted. The current version would look like this:
html.zip
Especially, our templates for the workflow and guide have to be reworked as the sidebars do not work properly and the guide spacing at depth 3 is off. Further, it might be nice to include a version switcher to change between the dev and stable version of the website.
Further, our example gallery now includes more cases. Maybe it is possible to define another level at the left panel.
pydata_sphinx_theme in version 0.6.0
Broken links due to the newly generated API documentation with roxygen2 (see DoubleML/doubleml-for-r#167 and https://github.com/DoubleML/doubleml-docs/runs/7918698190?check_suite_focus=true)
We have more and more links in our documentation. Recently many of the links to mlr3book were broken (see #48). Therefore, we should add an automatic check via the sphinx tool linkcheck (see e.g. https://www.writethedocs.org/guide/tools/testing/#sphinx). This should be possible via GitHub Actions. It won't prevent broken links, but at least we get notified.
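A small Python wrapper around the linkcheck builder could look like the sketch below. The directory names doc and _build are assumptions based on paths mentioned elsewhere in this repo, and linkcheck_command and run_linkcheck are hypothetical helper names; Sphinx is expected to exit with a non-zero status when it finds broken links, which is what a CI job would react to.

```python
import subprocess
import sys


def linkcheck_command(source_dir="doc", build_dir="_build"):
    """Build the argument list for Sphinx's linkcheck builder.

    source_dir and build_dir are assumed defaults, adjust them to the repo layout.
    """
    return [
        sys.executable, "-m", "sphinx",
        "-b", "linkcheck",            # select the linkcheck builder
        source_dir,
        f"{build_dir}/linkcheck",     # report files are written here
    ]


def run_linkcheck(source_dir="doc", build_dir="_build"):
    """Run the link check and return True if Sphinx exited with status 0."""
    result = subprocess.run(linkcheck_command(source_dir, build_dir))
    return result.returncode == 0
```

Calling run_linkcheck() from a scheduled workflow step and failing the step when it returns False would give the notification described above.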
See deprecation warnings for basically all jobs, for example for this run
For reference: https://github.blog/changelog/2022-09-22-github-actions-all-actions-will-begin-running-on-node16-instead-of-node12/
Using the copy button for the R code also copies the row numbers.
See for example: https://docs.doubleml.org/stable/intro/intro.html
Error message:
Run install.packages('remotes')
Installing package into ‘/home/runner/work/doubleml-docs/doubleml-docs/tmp_r_libs_user’
(as ‘lib’ is unspecified)
trying URL 'https://cloud.r-project.org/src/contrib/remotes_2.4.0.tar.gz'
Content type 'application/x-gzip' length 149836 bytes (146 KB)
==================================================
downloaded 146 KB
* installing *source* package ‘remotes’ ...
** package ‘remotes’ successfully unpacked and MD5 sums checked
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (remotes)
The downloaded source packages are in
‘/tmp/RtmpxK56gG/downloaded_packages’
Using bundled GitHub PAT. Please add your own PAT to the env var `GITHUB_PAT`
Error: Failed to install 'unknown package' from GitHub:
HTTP error 401.
Bad credentials
Rate limit remaining: 59/60
Rate limit reset at: 2021-08-06 13:57:47 UTC
Execution halted
Error: Process completed with exit code 1.
See https://github.com/DoubleML/doubleml-docs/runs/3262342769?check_suite_focus=true
Currently the sphinx build with pydata_sphinx_theme in version 0.10.1 fails, see https://github.com/DoubleML/doubleml-docs/actions/runs/2991978426. The issue might be related to pydata/pydata-sphinx-theme#911 / pydata/pydata-sphinx-theme#878.
When running sphinx-build with increased verbosity, the following exception is thrown:
Traceback (most recent call last):
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/sphinx/builders/html/__init__.py", line 1048, in handle_page
output = self.templates.render(templatename, ctx)
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/sphinx/jinja2glue.py", line 188, in render
return self.environment.get_template(template).render(context)
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/jinja2/environment.py", line 1301, in render
self.environment.handle_exception()
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/jinja2/environment.py", line 936, in handle_exception
raise rewrite_traceback_stack(source=source)
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/sphinx/themes/basic/page.html", line 10, in top-level template code
{%- extends "layout.html" %}
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/pydata_sphinx_theme/theme/pydata_sphinx_theme/layout.html", line 25, in top-level template code
{% set remove_sidebar_secondary = (meta is defined and meta is not none
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/sphinx/themes/basic/../basic/layout.html", line 169, in top-level template code
{%- block content %}
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/pydata_sphinx_theme/theme/pydata_sphinx_theme/layout.html", line 75, in block 'content'
{% block docs_navbar %}
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/pydata_sphinx_theme/theme/pydata_sphinx_theme/layout.html", line 77, in block 'docs_navbar'
{%- include "sections/header.html" %}
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/pydata_sphinx_theme/theme/pydata_sphinx_theme/sections/header.html", line 16, in top-level template code
{% include navbar_item %}
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/pydata_sphinx_theme/theme/pydata_sphinx_theme/components/navbar-nav.html", line 6, in top-level template code
{{ generate_header_nav_html(n_links_before_dropdown=theme_header_links_before_dropdown) }}
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/jinja2/sandbox.py", line 393, in call
return __context.call(__obj, *args, **kwargs)
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/pydata_sphinx_theme/__init__.py", line 250, in generate_header_nav_html
title = app.env.titles[page].astext()
KeyError: 'self'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/sphinx/cmd/build.py", line 276, in build_main
app.build(args.force_all, filenames)
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/sphinx/application.py", line 330, in build
self.builder.build_update()
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/sphinx/builders/__init__.py", line 286, in build_update
self.build(to_build,
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/sphinx/builders/__init__.py", line 350, in build
self.write(docnames, list(updated_docnames), method)
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/sphinx/builders/__init__.py", line 524, in write
self._write_serial(sorted(docnames))
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/sphinx/builders/__init__.py", line 534, in _write_serial
self.write_doc(docname, doctree)
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/sphinx/builders/html/__init__.py", line 625, in write_doc
self.handle_page(docname, ctx, event_arg=doctree)
File "/home/malte/github_projects/doubleml-docs/venv/lib/python3.10/site-packages/sphinx/builders/html/__init__.py", line 1055, in handle_page
raise ThemeError(__("An error happened in rendering the page %s.\nReason: %r") %
sphinx.errors.ThemeError: An error happened in rendering the page api/api.
Reason: KeyError('self')
Theme error:
An error happened in rendering the page api/api.
Reason: KeyError('self')
As an interim solution, I will try to (temporarily) bump or pin the pydata_sphinx_theme version.
Warning is being thrown for the workflow:
FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
See failed scheduled run: https://github.com/DoubleML/doubleml-docs/runs/4919219709?check_suite_focus=true.
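For reference, the replacement the warning message asks for looks like the sketch below. Whether the warning originates in our own notebooks or in a dependency is not clear from the run, so this only illustrates the suggested pattern; the label values are made up.

```python
import pandas as pd

# Illustrative labels only; any integer sequence behaves the same way.
labels = [10, 20, 30]

# Instead of the deprecated pandas.Int64Index(labels), build a plain Index
# and state the dtype explicitly, as the warning message suggests.
index = pd.Index(labels, dtype="int64")

# Type checks that previously used isinstance(..., pd.Int64Index) can
# inspect the dtype instead.
is_integer_index = pd.api.types.is_integer_dtype(index.dtype)
```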
We should update our user guide by adding brief demos and explanations on
The labels in the legend of the last graph on the page are in the wrong order.
I tried running the code to generate the graph on my system but I failed to reproduce it. The graph that I got after running the code was the right one.
I even tried to compile the code on my system to generate the HTML files but it still generated the right graph.
Below is my system info that I got after running the following code:
import platform; print(platform.platform())
import sys; print("Python", sys.version)
import doubleml; print("DoubleML", doubleml.__version__)
import sklearn; print("Scikit-Learn", sklearn.__version__)
import seaborn; print("Seaborn", seaborn.__version__)
Linux-5.11.0-41-generic-x86_64-with-glibc2.10
Python 3.8.8 (default, Apr 13 2021, 19:58:26)
[GCC 7.3.0]
DoubleML 0.4.1
Scikit-Learn 0.24.1
Seaborn 0.11.1
The histograms produced by the R code in https://docs.doubleml.org/stable/guide/basics.html are not shown.
Probably caused by changes in #109.
Just a small correction is necessary at the end of this sentence: "our the".
"Add the notebook to doubleml-docs/doc/examples/index.rst in order to have it listed in the Sandbox section of our the gallery."
rpy2. Seems like blp_data is no longer an object of class pd.DataFrame.
The tree diagram explaining the class structure of the object-oriented implementation still has the old names for the private methods ml_nuisance_and_score_elements and ml_nuisance_tuning (see screenshot below)
https://docs.doubleml.org/stable/_images/oop.svg
Should be nuisance_est and nuisance_tuning instead.
Changes to be made here: https://github.com/DoubleML/doubleml-docs/blob/master/doc/oop.svg
The figure has already been updated in the Python and R package repos, e.g. https://github.com/DoubleML/doubleml-for-r/blob/master/man/figures/oop.svg
The appearance (and size) of figures in the Python notebooks in the example gallery changed in one of the recent deploys. See https://docs.doubleml.org/stable/examples/py_double_ml_multiway_cluster.html and the screenshot.
The notebook https://github.com/DoubleML/doubleml-docs/blob/master/doc/examples/R_double_ml_pipeline.ipynb requires a new release of the R package which includes the changes from DoubleML/doubleml-for-r#141.
Seems to be caused by the new scikit-learn in version 1.0.0, see https://github.com/DoubleML/doubleml-docs/runs/3742278774?check_suite_focus=true.
All formulas are broken. For example on this page:
https://docs.doubleml.org/stable/guide/models.html#models
I'm using a corporate laptop with Windows & MS Edge browser