pyIATI

A developers’ toolkit for IATI.

Stability: varies between experimental and unstable (see docstrings).

General Information

This is a Python module containing IATI functionality that would otherwise be replicated in many different locations by many different software projects.

The contents of this library are at best unstable, and much is experimental. Expect the contents and API to change over the short-to-medium term. Warning sections in docstrings flag particular known stability concerns. Todo sections describe known missing or incorrect features.

Feedback, suggestions, use-case descriptions, bug reports and so on are much appreciated: it is far better to know of issues early in the development cycle. Please use GitHub Issues for this.

At present the library (core) represents much of the contents of the IATI Single Source of Truth (SSOT).

More pleasant API naming, better hiding of underlying lxml, full documentation, improved error handling, and a greater number of tests for edge-cases are known key areas for improvement.

General Installation for System Use

# install software dependencies
apt-get install python-pip libxml2-dev libxslt-dev python-dev pandoc

# install this package
pip install pyIATI

Documentation

The docs are not currently hosted and must be locally generated. To do this you must first:

  1. Clone this repo.
  2. Create a new virtualenv using Python 3.5 or later.
  3. pip install -r requirements_dev.txt

At present, an HTML documentation site can be generated using the following commands:

# to build the documentation
cd pyIATI
make -B docs
open docs/build/index.html # for Mac OS
xdg-open docs/build/index.html # for linux

IATI Version Support

pyIATI fully supports versions 1.04, 1.05, 2.01, 2.02 and 2.03 of the IATI Standard.

Schemas for versions 1.01, 1.02 and 1.03 are included in the iati/resources/standard directory but are not yet accessible using the available pyIATI functions to return default schemas.
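
As a minimal sketch of the support levels described above, the version lists can be written down in a small helper that checks a version string before a default schema is requested. The names FULLY_SUPPORTED, SCHEMA_ONLY and support_level are hypothetical and are not part of pyIATI.

```python
# Hypothetical helper, not part of pyIATI: encodes the support levels listed above.
FULLY_SUPPORTED = ('1.04', '1.05', '2.01', '2.02', '2.03')
# Schemas are bundled for these versions, but the default-schema functions do not return them.
SCHEMA_ONLY = ('1.01', '1.02', '1.03')


def support_level(version):
    """Return 'full', 'schema-only' or 'unsupported' for an IATI version string."""
    if version in FULLY_SUPPORTED:
        return 'full'
    if version in SCHEMA_ONLY:
        return 'schema-only'
    return 'unsupported'


for candidate in ('2.03', '1.02', '3.00'):
    print(candidate, support_level(candidate))
```

A check like this could run before calling the default schema functions, so that an unsupported version is reported early.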

Usage

WARNING: This library is currently in active development. All usage examples are subject to change, as we iteratively improve functionality. Therefore, the following examples are provided for illustrative purposes only. As the library matures, this message and other documentation will be updated accordingly.

Once installed, the library provides functionality to represent IATI Schemas, Codelists and publisher datasets as Python objects. The IATI Standard Schemas and Codelists are provided out of the box (using iati.default); these can be modified if bespoke versions of the Schemas/Codelists are required.

Loading an XSD Schema

A number of default IATI .xsd schema files are included as part of the library. They are stored in the folder: iati.core/iati/core/resources/schemas/

A schema must now be instantiated with a specified version.

The following example loads the latest IATI Activity Schema:

import iati.default
schema = iati.default.activity_schema('2.03')

By default, the returned Schema is populated with additional information, such as the Codelists and Rulesets for the specified version of the Standard.

To access an Organisation Schema for version 1.05, with no additional information added:

import iati.default
schema = iati.default.organisation_schema('1.05', False)

Helper functions will be written in due course to return all XPaths within a Schema, as well as documentation for each element. Work in this area can be seen in the get-data-from-schema branch.

Loading Codelists

A given IATI Codelist can be added to a Schema. For example, using the Country codelist:

import iati.default
country_codelist = iati.default.codelist('Country', '2.03')
schema.codelists.add(country_codelist)

All Codelists for the latest version of the Standard can be accessed with:

import iati.default
all_latest_codelists = iati.default.codelists('2.03')

Loading Rulesets

The default IATI Ruleset can be loaded by using:

import iati.default
iati.default.ruleset('2.03')

If you wish to load your own Ruleset you can do this using:

import iati.rulesets
import iati.utilities

# Load a local Ruleset
ruleset_str = iati.utilities.load_as_string('/absolute/path/to/ruleset.json')

# To create a Ruleset object from your ruleset_str:
iati.Ruleset(ruleset_str)
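
Before passing a custom Ruleset string to the library, it may be useful to confirm that the text parses as JSON. The following is a minimal sketch using only the standard library. The helper name check_ruleset_text is hypothetical, the requirement for a top-level JSON object is an assumption about the Ruleset format, and the sample text is a placeholder rather than a meaningful Ruleset.

```python
import json


def check_ruleset_text(text):
    """Return `text` unchanged if it parses as a JSON object, else raise ValueError."""
    try:
        parsed = json.loads(text)
    except ValueError as error:  # JSON decode errors are ValueError subclasses
        raise ValueError('Ruleset is not valid JSON: {0}'.format(error))
    # Assumption: a Ruleset is a JSON object at the top level.
    if not isinstance(parsed, dict):
        raise ValueError('Ruleset JSON must be an object at the top level.')
    return text


# Placeholder content for illustration only; not a meaningful Ruleset.
placeholder_ruleset = '{"//iati-activity": {}}'
ruleset_str = check_ruleset_text(placeholder_ruleset)
print(len(ruleset_str))
```

This only checks JSON syntax; it does not check that the contents are a valid Ruleset.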

Working with IATI Datasets

Loading a dataset - local

import iati.utilities

# Load a local file
dataset = iati.utilities.load_as_dataset('/absolute/path/to/iati-activities.xml')

Loading a dataset - remote

This functionality converts XML strings into bytes and passes them through some internal validation using lxml. Because of this, Unicode strings that contain an encoding declaration cannot currently be instantiated as Datasets without additional steps. See: Python Unicode Strings for more information.

import iati.data

# Load a remote file
# Assumes the Requests library is installed: http://docs.python-requests.org/
import requests
dataset_as_string = requests.get('http://XML_FILE_URL_HERE').text

dataset = iati.Dataset(dataset_as_string)
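
One possible additional step for the Unicode caveat above is to remove a leading XML declaration from the string before creating the Dataset. The following is a minimal sketch using only the standard library. The helper name strip_xml_declaration is hypothetical, and whether the resulting string is accepted by iati.Dataset in every case is an assumption.

```python
import re

# Matches a leading XML declaration such as <?xml version="1.0" encoding="UTF-8"?>
XML_DECLARATION = re.compile(r'^\s*<\?xml[^>]*\?>')


def strip_xml_declaration(text):
    """Return `text` without a leading XML declaration, if one is present."""
    return XML_DECLARATION.sub('', text, count=1).lstrip()


sample = '<?xml version="1.0" encoding="UTF-8"?>\n<iati-activities version="2.03"/>'
print(strip_xml_declaration(sample))
```

The cleaned string could then be passed on in place of dataset_as_string in the example above.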

Validating datasets

A Dataset object can be validated for adherence to XML and/or the IATI schemas. Validation is performed using functions in iati.validator.

Simple validation

Each of the following functions returns a boolean:

import iati.default
import iati.validator

# Set-up a sample dataset and get the default v2.03 schema
>>> dataset = iati.Dataset("""
... <iati-activities version="2.03">
...   <iati-activity>
...   </iati-activity>
... </iati-activities>
... """)  # This dataset is XML, but not IATI XML as it's missing mandatory elements.
>>> v203_schema = iati.default.activity_schema('2.03')

# Check whether the dataset is valid XML.
>>> iati.validator.is_xml(dataset)
True

# Check whether the dataset is valid IATI XML according to the 2.03 schema version.
>>> iati.validator.is_iati_xml(dataset, v203_schema)
False

# Check whether the dataset is valid according to the 2.03 IATI schema and ruleset.
>>> iati.validator.is_valid(dataset, v203_schema)
False

Detailed validation

Datasets can be validated to return a ValidationErrorLog. This can be performed using:

import iati.default
import iati.validator

# Set-up a sample dataset and get the default v2.03 schema
>>> dataset = iati.Dataset("""
... <iati-activities version="2.03">
...   <iati-activity>
...   </iati-activity>
... </iati-activities>
... """)  # This dataset is XML, but not IATI XML as it's missing mandatory elements.
>>> v203_schema = iati.default.activity_schema('2.03')

# Perform full validation against the schema. Returns a ValidationErrorLog object.
>>> error_log = iati.validator.full_validation(dataset, v203_schema)

# The error log can be read using the following:
>>> len(error_log)  # Number of errors or warnings found
25

>>> error_log.contains_errors()  # True if at least one error is present
True

>>> error_log.contains_warnings()  # True if at least one warning is present
True

# A breakdown of the first error found:
>>> first_error = error_log[0]
>>> first_error.info
"<string>:2:0:ERROR:SCHEMASV:SCHEMAV_ELEMENT_CONTENT: Element 'iati-activity': Missing child element(s). Expected is ( iati-identifier )."

>>> first_error.description
'A different element was found than was expected.'

>>> first_error.help
'There are a number of mandatory elements that an IATI data file must contain. Additionally, these must occur in the required order.\nFor more information about what an XML element is, see https://www.w3schools.com/xml/xml_elements.asp'

>>> first_error.status
'error'

>>> first_error.name
'err-not-iati-xml-missing-required-element'
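
For longer logs, a summary by status can be easier to read than individual entries. The following is a minimal sketch that relies only on len(), integer indexing and the status attribute, all of which appear in the example above. The helper name count_by_status is hypothetical, the FakeError entries are stand-ins for demonstration, and the 'warning' status value is an assumption (only 'error' is shown above).

```python
from collections import Counter, namedtuple


def count_by_status(error_log):
    """Count the entries in an error log by their `status` attribute."""
    counts = Counter()
    # Only len() and integer indexing are used, as in the example above.
    for index in range(len(error_log)):
        counts[error_log[index].status] += 1
    return dict(counts)


# Stand-in entries for demonstration only; a real log comes from full_validation().
FakeError = namedtuple('FakeError', ['name', 'status'])
fake_log = [
    FakeError('err-not-iati-xml-missing-required-element', 'error'),
    FakeError('hypothetical-warning-name', 'warning'),
    FakeError('hypothetical-error-name', 'error'),
]
print(count_by_status(fake_log))
```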

Accessing data

The Dataset object contains an xml_tree attribute (itself an lxml.etree object). XPath expressions can be used to extract desired information from the dataset. For example:

# WARNING: The following examples assume the source dataset file is produced in IATI v2.x format

# Show the activities contained within the dataset
>>> dataset.xml_tree.xpath('iati-activity')
[<Element iati-activity at 0x2c5a5f0>, <Element iati-activity at 0x2c5ac68>, <Element iati-activity at 0x2c5acf8>, <Element iati-activity at 0x2c5ad40>]

# Show the title for each project
>>> dataset.xml_tree.xpath('iati-activity/title/narrative/text()')
['\nIMPROVING MATERNAL HEALTH AND REDUCING CHILD MORTALITY THROUGH DEVELOPING HEALTH SERVICE DELIVERY FOR THE POOR AND MARGINALISED COMMUNITY OF BAGHBANAN, NORTH WEST PAKISTAN\n', '\nIMPROVING MATERNAL HEALTH AND REDUCING CHILD MORTALITY THROUGH DEVELOPING HEALTH SERVICE DELIVERY FOR THE POOR AND MARGINALISED COMMUNITY OF BAGHBANAN, NORTH WEST PAKISTAN\n', '\nImproving maternal health and reducing child mortality through developing health service delivery for the poor and marginalised community in Baghbanan, North West Pakistan\n', '\nIMPROVED HEALTH SERVICE DELIVERY IN NORTH WEST PAKISTAN (\n']

# For the first activity only, show the actual start date (i.e. activity date type = 2 in IATI v2.x)
>>> dataset.xml_tree.xpath('iati-activity[1]/activity-date[@type=2]/@iso-date')
['2014-01-01']
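
The same kind of query can be tried without a publisher file. The following is a minimal sketch that uses the standard library's xml.etree.ElementTree (which supports a limited XPath subset) instead of the lxml tree held by a Dataset. The sample XML, its titles and dates, and the helper name summarise_activities are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample data in an IATI v2.x-like shape, for illustration only.
SAMPLE_XML = """
<iati-activities version="2.03">
  <iati-activity>
    <title><narrative>
First project
</narrative></title>
    <activity-date type="1" iso-date="2013-06-01" />
    <activity-date type="2" iso-date="2014-01-01" />
  </iati-activity>
  <iati-activity>
    <title><narrative>Second project</narrative></title>
    <activity-date type="2" iso-date="2015-03-15" />
  </iati-activity>
</iati-activities>
"""


def summarise_activities(xml_text):
    """Return a list of (title, type_2_date) tuples, one per activity."""
    root = ET.fromstring(xml_text.strip())
    summaries = []
    for activity in root.findall('iati-activity'):
        # Strip the surrounding whitespace that narrative text often carries.
        title = activity.findtext('title/narrative', default='').strip()
        date_element = activity.find("activity-date[@type='2']")
        iso_date = date_element.get('iso-date') if date_element is not None else None
        summaries.append((title, iso_date))
    return summaries


summaries = summarise_activities(SAMPLE_XML)
for title, iso_date in summaries:
    print(title, iso_date)
```

Stripping the narrative text removes the leading and trailing newlines visible in the title output above.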

Python Version Support

This code supports Python 2.7 and 3.4+. We advise use of Python 3.5 (or above) as these versions of the language provide some rather useful features that will likely be integrated into this codebase.

Dev Installation

# install software development dependencies
apt-get install python-pip python-virtualenv

# create and start a virtual environment
virtualenv -p python3 pyenv
source pyenv/bin/activate

# install Python package dependencies
pip install -r requirements_dev.txt

Tests

# to run the tests
py.test iati/

# to run the linters
pylint iati
flake8 iati/
pydocstyle iati/
# OR
pylint iati; echo; flake8 iati/; echo; pydocstyle iati/

# to run the complexity and maintainability checks
radon mi iati/ -nb
radon cc iati --no-assert -nc

Alternatively, the Makefile can be used.

make test
make lint
make complexity
make docs

# OR

make all

Licensing

This software is available under the MIT License (see LICENSE.txt), and utilises third party libraries and tools that are distributed under their own terms (see LICENSE-3RD-PARTY.txt). Details of the authors of this software are provided in AUTHORS.txt.
