Comments (4)
If I understand this correctly, this is not actually a bug but rather potentially unintuitive behavior. It is true that the highly generic example above is confusing, i.e.,
model: model_a
native_regions:
  - region_a: Region_A
common_regions:
  - Region_B:
    - region_b
but in an actual project setting, this would look as follows:
model: model_a
native_regions:
  - EUR: model_a|EUR
common_regions:
  - Europe:
    - EUR
In my opinion, this is not a major reason for concern, but rather highlights that we need to improve our documentation, not change the logic of the processing.
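To make the naming rule concrete, here is a minimal Python sketch (not nomenclature's actual implementation) that reads the mapping above as a plain dict and lists the region names that would appear in the processed data. The function name `upload_names` and the dict layout are assumptions for illustration, as is the handling of plain-string entries.

```python
def upload_names(mapping):
    """Return the region names that end up in the processed data."""
    names = []
    # native_regions: each dict entry maps the model's name to the new name
    for entry in mapping.get("native_regions", []):
        if isinstance(entry, dict):
            names.extend(entry.values())  # renamed: the value is kept
        else:
            names.append(entry)  # assumed: a plain string keeps its name
    # common_regions: each entry maps the new name to its constituents
    for entry in mapping.get("common_regions", []):
        names.extend(entry.keys())  # aggregated: the key is kept
    return names


mapping = {
    "model": "model_a",
    "native_regions": [{"EUR": "model_a|EUR"}],
    "common_regions": [{"Europe": ["EUR"]}],
}
print(upload_names(mapping))  # ['model_a|EUR', 'Europe']
```

The native region contributes its value and the common region contributes its key, so both names can coexist in the output.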
from nomenclature.
I partially agree. The documentation certainly needs to be better, and I'll make sure I get on that right away.
There are, however, in my opinion two issues that go beyond that:
- Potential for data loss: If the common_region in question is only renamed, as in your example (- Europe: EUR), the resulting data will be obtained through region aggregation alone. Not only is this unnecessary work that slows down the processing, since the provided data could be used directly, it can also lead to data loss if the aggregation fails.
  We had this issue in ngfs-internal recently, where the variable Price|Carbon, which uses a weighted aggregation (with Emissions|Kyoto Gases as the weight), had a negative weight after the year 2060. As is the pyam default, the negative values were dropped. This led to missing values in Price|Carbon even though the complete Price|Carbon timeseries was provided natively by the model.
  This could be mitigated¹ by skipping region aggregation and simply taking the provided model-native results in case a common_region consists of just a single region.
- Inconsistency with the renaming: I frequently find myself going over the question "What is region x now going to be uploaded as?". For native_regions, the 'upload name' is the second value (taking your previous example: - EUR: model_a|EUR), while for common_regions it is the first one (- Europe: EUR). Maybe it's just me, but I find that a bit inconsistent and, as a result, confusing.
Footnotes
1. In the specific case of ngfs-internal, it was solved by choosing a different aggregation weight that did not have negative values.
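The data-loss scenario described above can be reproduced with a small, self-contained sketch. It uses plain Python dicts rather than pyam, and the numbers are made up; `weighted_aggregate` is a hypothetical helper that only imitates the "drop negative weights" behavior mentioned in the comment.

```python
# Hypothetical values for a single native region "EUR"
price = {"EUR": {2050: 100.0, 2060: 150.0, 2070: 200.0}}  # Price|Carbon
weight = {"EUR": {2050: 5.0, 2060: 1.0, 2070: -2.0}}  # Emissions|Kyoto Gases


def weighted_aggregate(values_by_region, weights_by_region):
    """Weighted mean across regions per year, skipping negative weights."""
    regions = list(values_by_region)
    years = sorted(values_by_region[regions[0]])
    result = {}
    for year in years:
        weights = [weights_by_region[r][year] for r in regions]
        if any(w < 0 for w in weights):
            continue  # imitates dropping negative weights: the year is lost
        values = [values_by_region[r][year] for r in regions]
        result[year] = sum(v * w for v, w in zip(values, weights)) / sum(weights)
    return result


aggregated = weighted_aggregate(price, weight)
renamed = dict(price["EUR"])  # a pure rename keeps every year

print(sorted(aggregated))  # [2050, 2060]
print(sorted(renamed))  # [2050, 2060, 2070]
```

With a single constituent region, the aggregation can at best reproduce the native values, so taking the native data directly loses nothing and avoids the dropped year.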
As a suggestion for 1, we could split out common-region mappings that only have one component and treat them via renaming.
For 2, I still believe that a clear separation into common regions (used for comparison in a project) vs. native regions is the more intuitive approach for modelers. The alternative (assuming you would want to save both the native region and the comparison region) would be to have
model: model_a
native_regions:
  - EUR: model_a|EUR
  - EUR: Europe
which will be just as confusing for larger models...
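The suggestion for 1 could be sketched roughly as follows. This is an illustration under assumptions, not the package's code: `split_common_regions` is a hypothetical helper, the mapping is a plain list of dicts, and the regions "World" and "USA" are invented for the example.

```python
def split_common_regions(common_regions):
    """Split common-region entries into renames and true aggregations."""
    renames = {}  # constituent region -> common-region name
    aggregations = {}  # common-region name -> list of constituents
    for entry in common_regions:
        for name, constituents in entry.items():
            if len(constituents) == 1:
                # one constituent: treat as a rename, no aggregation needed
                renames[constituents[0]] = name
            else:
                aggregations[name] = list(constituents)
    return renames, aggregations


common_regions = [
    {"Europe": ["EUR"]},
    {"World": ["EUR", "USA"]},
]
renames, aggregations = split_common_regions(common_regions)
print(renames)  # {'EUR': 'Europe'}
print(aggregations)  # {'World': ['EUR', 'USA']}
```

Only the entries left in `aggregations` would then go through weighted or summed aggregation, while the entries in `renames` reuse the model-native data unchanged.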
Both proposed points sound good to me, let's do it that way.