ecoinvent_interface's Introduction

ecoinvent_interface

This is an unofficial and unsupported Python library to get ecoinvent data.

Quickstart

from ecoinvent_interface import Settings, EcoinventRelease, ReleaseType
my_settings = Settings(username="John.Doe", password="example")
release = EcoinventRelease(my_settings)
release.list_versions()
>>> ['3.9.1', '3.9', '3.8', '3.7.1', ...]
release.list_system_models('3.7.1')
>>> ['cutoff', 'consequential', 'apos']
release.get_release(version='3.7.1', system_model='apos', release_type=ReleaseType.ecospold)
>>> PosixPath('/Users/JohnDoe/Library/Application Support/'
              'EcoinventRelease/cache/ecoinvent 3.7.1_apos_ecoSpold02')

The ecospold files are downloaded and extracted automatically.

Usage

Authentication via Settings object

Authentication is done via the Settings object. Accessing ecoinvent requires supplying a username and password.

Note that you must accept the ecoinvent license and personal identifying information agreement on the website before using your user account via this library.

You can provide credentials in three ways:

  • Manually, via arguments to the Settings object instantiation:
from ecoinvent_interface import Settings
my_settings = Settings(username="bob", password="example")
  • Via the EI_PASSWORD and EI_USERNAME environment variables:
export EI_USERNAME=bob
export EI_PASSWORD=example

If your environment variable values have special characters, using single quotes should work, e.g. export EI_PASSWORD='compl\!cat$d'.

Followed by:

from ecoinvent_interface import Settings
# Environment variables read automatically
my_settings = Settings()
  • Via secrets files, which the permanent_setting utility function writes for you:
from ecoinvent_interface import Settings, permanent_setting
permanent_setting("username", "bob")
permanent_setting("password", "example")
# Secrets files read automatically
my_settings = Settings()

Secrets files are stored in ecoinvent_interface.storage.secrets_dir.

For each value, manually set values always take precedence over environment variables, which in turn take precedence over secrets files.

A reasonable guide for choosing between the three is to use secrets on your private, local machine, and to use environment variables on servers or containers.
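
The precedence rule above (explicit argument, then environment variable, then secrets file) can be illustrated with a small standard-library sketch. This is not the library's implementation: resolve_setting is a hypothetical helper, and the one-file-per-setting layout of the secrets directory is an assumption made only for this illustration.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_setting(
    name: str,
    explicit: Optional[str] = None,
    secrets_dir: Optional[Path] = None,
    env_prefix: str = "EI_",
) -> Optional[str]:
    """Return the first value found: explicit, then environment, then secrets file."""
    # 1. A manually supplied value always wins.
    if explicit is not None:
        return explicit
    # 2. Next, look for an environment variable such as EI_USERNAME.
    env_value = os.environ.get(f"{env_prefix}{name.upper()}")
    if env_value is not None:
        return env_value
    # 3. Finally, fall back to a secrets file.
    #    Assumed layout (hypothetical): one text file per setting, named after it.
    if secrets_dir is not None:
        secret_file = Path(secrets_dir) / name
        if secret_file.is_file():
            return secret_file.read_text().strip()
    return None
```

Calling resolve_setting("username", explicit="bob") returns "bob" regardless of what the environment or the secrets directory contains.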

EcoinventRelease interface

To interact with the ecoinvent website, instantiate EcoinventRelease.

from ecoinvent_interface import EcoinventRelease, ReleaseType, Settings
my_settings = Settings()
ei = EcoinventRelease(my_settings)

Database releases

To get a database release, we need to make three selections. First, the version:

ei.list_versions()
>>> ['3.9.1', '3.9', '3.8', '3.7.1', ...]

Second, the system model:

ei.list_system_models('3.7.1')
>>> ['cutoff', 'consequential', 'apos']

The ecoinvent API uses a short and long form of the system model names; you can get the longer names by passing translate=False. You can use either form in all EcoinventRelease methods.

ei.list_system_models('3.7.1', translate=False)
>>> [
  'Allocation cut-off by classification',
  'Substitution, consequential, long-term',
  'Allocation at the Point of Substitution'
]
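
If your own code needs to accept either form, a small lookup table is enough. The sketch below pairs the short and long names from the two listings above for version 3.7.1; the pairing is inferred from those listings, and to_short_name is a hypothetical helper, not part of the library.

```python
# Short and long system model names, paired from the two listings for 3.7.1.
SYSTEM_MODELS = {
    "cutoff": "Allocation cut-off by classification",
    "consequential": "Substitution, consequential, long-term",
    "apos": "Allocation at the Point of Substitution",
}


def to_short_name(system_model: str) -> str:
    """Accept a short or long system model name and return the short form."""
    if system_model in SYSTEM_MODELS:
        return system_model
    for short, long_name in SYSTEM_MODELS.items():
        if long_name == system_model:
            return short
    raise ValueError(f"Unknown system model: {system_model!r}")
```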

Finally, the type of release. The options are stored in the ReleaseType Enum. There are six release types; if you just want the database for calculations, choose the ecospold type.

  • ReleaseType.ecospold: The single-output unit process files in ecospold2 XML format
  • ReleaseType.matrix: The so-called "universal matrix export"
  • ReleaseType.lci: LCI data in ecospold2 XML format
  • ReleaseType.lcia: LCIA data in ecospold2 XML format
  • ReleaseType.cumulative_lci: LCI data in Excel
  • ReleaseType.cumulative_lcia: LCIA data in Excel

See the ecoinvent website for more information on what these values mean.

Once we have made a selection for all three choices, we can get the release files. They are saved to a cache directory and extracted by default.

ei.get_release(version='3.7.1', system_model='apos', release_type=ReleaseType.matrix)
>>> PosixPath('/Users/JohnDoe/Library/Application Support/'
              'EcoinventRelease/cache/universal_matrix_export_3.7.1_apos')
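
When the version and system model come from user input, it can help to check them against the available options before starting a download. The following is a minimal sketch under stated assumptions: fetch_release is a hypothetical wrapper, it only relies on the three methods shown above, and it compares against the short system model names that list_system_models returns by default.

```python
from typing import Any


def fetch_release(ei: Any, version: str, system_model: str, release_type: Any) -> Any:
    """Validate the version and system model, then call ei.get_release."""
    versions = ei.list_versions()
    if version not in versions:
        raise ValueError(f"Version {version!r} is not one of {versions}")
    # Only short names are checked here; long names would need translate=False.
    models = ei.list_system_models(version)
    if system_model not in models:
        raise ValueError(f"System model {system_model!r} is not one of {models}")
    return ei.get_release(
        version=version, system_model=system_model, release_type=release_type
    )
```

With the objects defined above, fetch_release(ei, '3.7.1', 'apos', ReleaseType.matrix) would forward to the same get_release call, but fail early with a readable message if either selection is misspelled.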

The default cache location is determined by platformdirs and is OS-dependent. You can use a custom cache directory by specifying output_dir when creating the Settings instance.

You can work with the cache when offline:

from ecoinvent_interface import CachedStorage
cs = CachedStorage()
list(cs.catalogue)
>>> ['ecoinvent 3.7.1_LCIA_implementation.7z']
cs.catalogue['ecoinvent 3.7.1_LCIA_implementation.7z']
>>> {
  'path': '/Users/<your username>/Library/Application Support/'
          'EcoinventRelease/cache/ecoinvent 3.7.1_LCIA_implementation',
  'extracted': True,
  'created': '2023-09-03T20:23:57.186519'
}
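
Because each catalogue entry is a plain dictionary with the keys shown above, ordinary dictionary code is enough to query it offline. The sketch below assumes the catalogue behaves like a mapping of filename to metadata; extracted_entries is a hypothetical helper, not a library function.

```python
from typing import Dict, Mapping


def extracted_entries(catalogue: Mapping[str, dict]) -> Dict[str, str]:
    """Map each catalogue key whose files were extracted to its local path."""
    return {
        key: meta["path"]
        for key, meta in catalogue.items()
        if meta.get("extracted")
    }
```

For example, extracted_entries(cs.catalogue) would give the local paths of everything already unpacked in the cache.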

EcoinventRelease extra files

There are two other kinds of files available: reports, and what we call extra files. Let's see the extra files for version '3.7.1':

ei.list_extra_files('3.7.1')
>>>  {'ecoinvent 3.7.1_LCIA_implementation.7z': {
    'uuid': ...,
    'size': ...,
    'modified': datetime.datetime(2023, 4, 25, 0, 0)
  },
  ...
}

This returns a dictionary of filenames and metadata. We can download the ecoinvent 3.7.1_LCIA_implementation.7z file; by default it will automatically be extracted.

ei.get_extra(version='3.7.1', filename='ecoinvent 3.7.1_LCIA_implementation.7z')
>>> PosixPath('/Users/<your username>/Library/Application Support'
              '/EcoinventRelease/cache/ecoinvent 3.7.1_LCIA_implementation')
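
Since list_extra_files returns a dictionary of filenames and metadata, you can pick a file programmatically before downloading it. This sketch assumes every entry has a 'modified' datetime as in the output above; newest_file is a hypothetical helper.

```python
from typing import Mapping, Optional


def newest_file(files: Mapping[str, dict], suffix: str = "") -> Optional[str]:
    """Return the filename with the latest 'modified' value, or None if no match."""
    # Keep only filenames ending in the requested suffix (all files if empty).
    candidates = {name: meta for name, meta in files.items() if name.endswith(suffix)}
    if not candidates:
        return None
    return max(candidates, key=lambda name: candidates[name]["modified"])
```

The returned name can then be passed as the filename argument of get_extra.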

EcoinventRelease reports

Reports require a login but not a version number:

ei.list_report_files()
>>> {
  'Allocation, cut-off, EN15804_documentation.pdf': {
    'uuid': ...,
    'size': ...,
    'modified': datetime.datetime(2021, 10, 1, 0, 0),
    'description': ('This document provides a documentation on the calculation '
                    'of the indicators in the “Allocation, cut-off, EN15804” '
                    'system model.')
  }
}

Downloading follows the same pattern as before:

ei.get_report('Allocation, cut-off, EN15804_documentation.pdf')
>>> PosixPath('/Users/<your username>/Library/Application Support/EcoinventRelease/cache/Allocation, cut-off, EN15804_documentation.pdf')

Zip and 7z files are extracted by default.

EcoinventProcess interface

This class gets data and reports for specific processes. It first needs to know what release version and system model to work with:

from ecoinvent_interface import EcoinventProcess, Settings
my_settings = Settings()
ep = EcoinventProcess(my_settings)
ep.set_release(version="3.7.1", system_model="apos")

Finding a dataset id

The ecoinvent API identifies datasets by integer indices, and these values aren't found in the release files. We have cached these indices for versions 3.7.1, 3.8, and 3.9.1. If you already know the integer index, you can use that:

ep.select_process(dataset_id="1")

You can also use the filename, if you know it:

F = "b0eb27dd-b87f-4ae9-9f69-57d811443a30_66c93e71-f32b-4591-901c-55395db5c132.spold"
ep.select_process(filename=F)
ep.dataset_id
>>> "1"

Finally, you can pass in a set of attributes. You should use the name, reference product, and/or location to uniquely identify a process. You don't need to give all attributes, but you will get an error if the ones you give don't identify a single process.

attributes is a dictionary, and can take the following keys: name or activity_name, reference product or reference_product, and location or geography. The system will adapt the names as needed to find a match.
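
To make the key aliases concrete, here is a minimal sketch of how such a dictionary could be normalized to one spelling per attribute. It is not the library's code: normalize_attributes is hypothetical, and choosing activity_name, reference_product, and geography as the canonical spellings is an assumption for this illustration.

```python
# Alias -> canonical key. The canonical spellings are an assumption.
ALIASES = {
    "name": "activity_name",
    "reference product": "reference_product",
    "location": "geography",
}
ALLOWED = set(ALIASES.values())


def normalize_attributes(attributes: dict) -> dict:
    """Rewrite alias keys to canonical ones and reject unknown or duplicate keys."""
    normalized = {}
    for key, value in attributes.items():
        canonical = ALIASES.get(key, key)
        if canonical not in ALLOWED:
            raise KeyError(f"Unsupported attribute key: {key!r}")
        if canonical in normalized:
            raise ValueError(f"Attribute {canonical!r} given more than once")
        normalized[canonical] = value
    return normalized
```

With this sketch, {"name": "x", "location": "CH"} and {"activity_name": "x", "geography": "CH"} normalize to the same dictionary.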

ep.select_process(
    attributes={
        "name": "rye seed production, Swiss integrated production, for sowing",
        "location": "CH",
        "reference product": "rye seed, Swiss integrated production, for sowing",
    }
)
ep.dataset_id
>>> "40"

Basic process information

Once you have selected the process, you can get basic information about that process:

ep.get_basic_info()
>>> {
  'index': 1,
  'version': '3.7.1',
  'system_model': 'apos',
  'activity_name': 'electricity production, nuclear, boiling water reactor',
  'geography': 'FI',
  'reference_product': 'electricity, high voltage',
  'has_access': True
}

You can also call ep.get_documentation() to get a representation of the ecospold2 XML file in Python.

Process documents

You can use ep.get_file with one of the following file types to download process files:

  • ProcessFileType.upr: Unit Process ecospold XML
  • ProcessFileType.lci: Life Cycle Inventory ecospold XML
  • ProcessFileType.lcia: Life Cycle Impact Assessment ecospold XML
  • ProcessFileType.pdf: PDF Dataset Report
  • ProcessFileType.undefined: Undefined (unlinked and multi-output) Dataset PDF Report

For example:

from ecoinvent_interface import ProcessFileType
from pathlib import Path
ep.get_file(file_type=ProcessFileType.lcia, directory=Path.cwd())

This downloads the life cycle impact assessment ecospold XML file to the current working directory. The get_file method requires a directory argument.
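
Because get_file needs a directory, a thin wrapper that creates the target directory first can be convenient when fetching several file types. This is a sketch under stated assumptions: download_process_files is a hypothetical helper, it only uses the get_file keyword arguments shown above, and it returns whatever get_file returns without interpreting it.

```python
from pathlib import Path
from typing import Any, Iterable, List


def download_process_files(
    ep: Any, file_types: Iterable[Any], directory: Path
) -> List[Any]:
    """Create the directory if needed, then call ep.get_file for each file type."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    return [ep.get_file(file_type=ft, directory=directory) for ft in file_types]
```

For example, download_process_files(ep, [ProcessFileType.upr, ProcessFileType.pdf], Path.cwd() / "process_files") would place both documents for the selected process in one folder.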

Relationship to EIDL

This library initially started as a fork of EIDL, the ecoinvent downloader. As of version 2.0, it has been completely rewritten. Currently only the authentication code comes from EIDL.

Differences with EIDL:

  • Designed to be a lower-level infrastructure library. All user and web browser interaction was removed.
  • Username and password can be specified using pydantic_settings.
  • Can download all release files, plus reports and "extra" files.
  • Will autocorrect filenames when possible for ecoinvent inconsistencies.
  • Can download data on inventory processes.
  • Can find inventory processes using their filename or attributes.
  • Uses a more robust caching and cache validation strategy.
  • More reasonable token refresh strategy.
  • No HTML parsing or filename string hacks.
  • Streaming downloads.
  • Descriptive logging and error messages.
  • No shortcuts for Brightway or other LCA software.
  • Custom library headers are set to allow users of this library to be identified. No user information is transmitted.
  • Comprehensive tests.

Contributing

Contributions are very welcome, but please note the following:

  • This library consumes an unpublished and under-development API
  • Extensions of the current API to get process LCI or LCIA data or LCIA scores won't be included
  • Brightway-specific code won't be included

To learn more, see the Contributor Guide.

License

Distributed under the terms of the MIT license, ecoinvent_interface is free and open source software.

Issues

If you encounter any problems, please file an issue along with a detailed description.

ecoinvent_interface's People

Contributors

cmutel, ernestorocha, haasad, jsvgoncalves, pjamesjoyce, raphaeljolivet, renovate[bot], stephane-dubois, tngtudor

ecoinvent_interface's Issues

Fix invalid unit process datasets

The pyecospold library finds some invalid unit process datasets, but the missing elements are easy to add. We could manually patch the releases (like we already do with minor versions) instead of serving invalid data to our users.

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Detected dependencies

github-actions
.github/workflows/python-package-deploy.yml
  • actions/setup-python v5
.github/workflows/python-test.yml
  • actions/checkout v4@a5ac7e51b41094c92402da3b24376905380afc29
  • actions/setup-python v5
  • codecov/codecov-action v4
pep621
pyproject.toml
  • setuptools >=68.0

Support for python3.8

Python 3.8 is still in use.
I found that a couple of type hint fixes make the library compatible with Python 3.8.

A PR is in progress.
Compatibility could be checked automatically with tox.

installing the package under python 3.12 fails

Current

When doing "python -m pip install ecoinvent_interface" in a python=3.12 environment, the installation fails with:

...
Collecting lxml
  Using cached lxml-4.9.2.tar.gz (3.7 MB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [7 lines of output]
      /tmp/pip-install-gp53p6d9/lxml_0e364b4116bd4d7c9728b6f67fdcb15a/setup.py:117: SyntaxWarning: invalid escape sequence '\.'
        is_interesting_header = re.compile('^(zconf|zlib|.*charset)\.h$').match
      /tmp/pip-install-gp53p6d9/lxml_0e364b4116bd4d7c9728b6f67fdcb15a/setup.py:67: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
        import pkg_resources
      Building lxml version 4.9.2.
      Building without Cython.
      Error: Please make sure the libxml2 and libxslt development packages are installed.
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

Expected

works under python 3.12

System to find API dataset IDs

Ecoinvent datasets are identified by the dataset_id field, an integer. But users don't know this value; they know the process metadata, or the filename (with activity and product UUIDs). We need some way to build a cache to look up the ID from the attributes, and to share this cache via some free web resource.

  • Utility function to get activity, product, unit, and location from downloaded release files
  • Utility function to combine above with polling each possible ascending integer ID
  • Combine the two above and save this mapping
  • System to download the mappings if available from some web site
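
As a rough sketch of the mapping step, the code below builds a lookup from (activity name, reference product, geography) to the integer ID and stores it as JSON. It assumes records shaped like the get_basic_info output shown in the README; build_lookup, save_lookup, and load_lookup are hypothetical names, and polling the API and publishing the file are left out.

```python
import json
from pathlib import Path
from typing import Dict, Iterable, Tuple

Key = Tuple[str, str, str]  # (activity_name, reference_product, geography)


def build_lookup(records: Iterable[dict]) -> Dict[Key, int]:
    """Map process attributes to the integer dataset ID."""
    lookup: Dict[Key, int] = {}
    for record in records:
        key = (record["activity_name"], record["reference_product"], record["geography"])
        if key in lookup and lookup[key] != record["index"]:
            raise ValueError(f"Conflicting IDs for {key}")
        lookup[key] = record["index"]
    return lookup


def save_lookup(lookup: Dict[Key, int], path: Path) -> None:
    """Write the mapping as a JSON list, since JSON keys cannot be tuples."""
    rows = [{"key": list(key), "index": index} for key, index in lookup.items()]
    Path(path).write_text(json.dumps(rows, indent=2))


def load_lookup(path: Path) -> Dict[Key, int]:
    """Read a mapping written by save_lookup."""
    rows = json.loads(Path(path).read_text())
    return {tuple(row["key"]): row["index"] for row in rows}
```

A file in this form could be generated once per version and system model and then shared, so that clients never need to poll IDs themselves.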
