energieid / entsoe-py

Python client for the ENTSO-E API (European Network of Transmission System Operators for Electricity)

License: MIT License

Python 100.00%

entsoe-py's People

Contributors

bfauser etesio fgenoese alexandrehuynen sn4i1 ssozcan atemmo crlsmcl octiembre80 dmartid gjertro powersym olafsk diostam jpaduart mohamedmogy nsandev wbworks fabianhofmann paronax kryzhov waldemarmeier bchaudron tinkaa consideratio kosto gmohandas cateye0 duizendnegen dipoll hellowlol thedirtyfew powermart-aps openstrategist aslten jeroone mikaello yuanzy97 daveparr rablancomo watttime ishansaraswat giuliobeseghi nadia-el philipdefteraios xytreyum ewapazd fleimgruber ad-pro tolgayan r4ch45 maurerle sikunzzz norllysenergytrading pponl dimoschi henri-chat-noir leoldn tommasogallingani zuinnote antobr96 pierreattard jimich its-a-python frankboermantennet h0rn3t branx dave-cz dariohett jakobkruse1 pcacunar ment911 dimitris94ece riccardopizzi andre-ortner robertoquadrini eamonn-bell francescortiz woutertp1 mattewen stevenriedijkgreenchoice johnifx drieshugaerts nhcb timonviola hammershoj dkgh-90 wuifi fgotzens mmanana fcambrea lpirl p-reuber ac-jdaley r-uben shatteringlass chris-fr-c corralien borisken lizat-i

entsoe-py's Issues

Typo in TIMEZONE_MAPPINGS Dict

Typo in Timezone mappings Dict: Must be 'Tallinn' not 'Talinn'

crossborder_flows: help

Hello
This is my first issue in GitHub.
I am trying to use this project to build a PowerBI application.
I can use some functions (installed_generation_capacity, unavailability_of_generation_units, query_generation, withdrawn_unavailability_of_generation_units).
I don't get any data when I try to use crossborder_flows.
The message is "The table is empty".
No error is returned.
Can you help me?
Thank you

CODE:
from entsoe import EntsoePandasClient
import pandas as pd
client = EntsoePandasClient(api_key='KEY')
start = pd.Timestamp('20200101', tz='Europe/Brussels')
end = pd.Timestamp('20210101', tz='Europe/Brussels')
df = client.query_crossborder_flows('ES', 'FR', start=start, end=end)
print (df)

Time interval

Hello,
I really appreciate the effort that has been put into making this package; it has just saved me a lot of time. For some reason, a query using the time interval described in the documentation returns the same values as a query without the time of day. For example:

start = pd.Timestamp('2019-06-22 12:12:00', tz='Europe/Zurich')
end = pd.Timestamp('2019-06-23 12:12:00', tz='Europe/Zurich')
country_code = 'CH'  # swiss
pandaDp=client2.query_day_ahead_prices(country_code, start=start, end=end)

The result I get from this piece of code is exactly the same as when I don't specify the hour:

start = pd.Timestamp('2019-06-22', tz='Europe/Zurich')
end = pd.Timestamp('2019-06-23', tz='Europe/Zurich')
country_code = 'CH'  # swiss
pandaDp=client2.query_day_ahead_prices(country_code, start=start, end=end)

I would like to get the values from one specific hour to the next specific hour. However, I am just getting the exact same thing, namely the values from 00:00 to 00:00 the next day. Am I missing anything? Once again, thanks a lot.

KeyError: 'start'

Hello,

I am trying to use the ENTSOE package to download gas burn in CCGTs in Germany. I get the following error when I try to run the query below

de_ccgt=client.query_generation(country_code, start, end, psr_type=None)

error

Traceback (most recent call last):

  File "<ipython-input-67-2c98097570c2>", line 1, in <module>
    de_ccgt=client.query_generation(country_code, start, end, psr_type=None)

  File "C:\Users\ah76671\AppData\Local\Continuum\anaconda2\lib\site-packages\entsoe\entsoe.py", line 393, in year_wrapper
    start = kwargs.pop('start')

KeyError: 'start'

Any idea how to solve this? Thanks for your help
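
The traceback shows `year_wrapper` reading `start` with `kwargs.pop('start')`, which suggests the wrapper only sees `start` and `end` when they are passed as keyword arguments. A minimal sketch of that behaviour follows; `year_wrapper_sketch` is a hypothetical stand-in written for illustration, not the library's actual code.

```python
import functools


def year_wrapper_sketch(func):
    """Hypothetical stand-in for a wrapper that reads start/end from kwargs."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Raises KeyError('start') when start was passed positionally,
        # because positional values end up in *args, not in **kwargs.
        start = kwargs.pop('start')
        end = kwargs.pop('end')
        return func(*args, start=start, end=end, **kwargs)
    return wrapper


@year_wrapper_sketch
def query_generation(country_code, start, end, psr_type=None):
    # Toy function: just echoes its arguments.
    return (country_code, start, end, psr_type)


# Keyword arguments reach the wrapper's kwargs, so this call works.
ok = query_generation('DE', start='2019-01-01', end='2019-01-02', psr_type=None)

# Positional start/end reproduce the KeyError from the traceback.
try:
    query_generation('DE', '2019-01-01', '2019-01-02', psr_type=None)
    failed_on_start = False
except KeyError as exc:
    failed_on_start = exc.args[0] == 'start'
```

If that reading is right, calling `client.query_generation(country_code, start=start, end=end, psr_type=None)` with keywords should avoid the error.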

duplicated timestamps for ranges longer than one year

When looking for date ranges that include more than one year, the result has (at least) a duplicate element.

client = EntsoePandasClient(api_key=api_key)

start = pd.Timestamp('20180101', tz='UTC')
end = pd.Timestamp('20190901', tz='UTC')

dat = client.query_day_ahead_prices('AT', start=start, end=end)

dat.index.shape
(8041,)
dat.index.unique().shape
(8040,)
dat[dat.index.duplicated()]
2019-01-01 01:00:00+01:00 39.76

start = pd.Timestamp('20150101', tz='UTC')
end = pd.Timestamp('20190901', tz='UTC')
dat = client.query_day_ahead_prices('DK-1', start=start, end=end)
dat.shape

(40898,)
dat.index.unique().shape
(40894,)

dat[dat.index.duplicated()]
2016-01-01 01:00:00+01:00 16.04
2017-01-01 01:00:00+01:00 20.90
2018-01-01 01:00:00+01:00 26.43
2019-01-01 01:00:00+01:00 10.07

I assume this comes from the splitting into yearly blocks.

Version: 0.2.10
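
The suspected cause above (splitting into yearly blocks) can be illustrated with a small standard-library sketch. It assumes, without having checked the library code, that each block keeps both of its end points, so the stamp on the year boundary is returned twice. The helper names `year_blocks` and `hourly_inclusive` are made up for this example.

```python
from datetime import datetime, timedelta, timezone


def year_blocks(start, end):
    """Split [start, end] into consecutive blocks that break at each 1 January."""
    blocks = []
    cursor = start
    while cursor < end:
        next_year = datetime(cursor.year + 1, 1, 1, tzinfo=cursor.tzinfo)
        block_end = min(next_year, end)
        blocks.append((cursor, block_end))
        cursor = block_end
    return blocks


def hourly_inclusive(block_start, block_end):
    """Hourly stamps including BOTH end points of the block."""
    stamps = []
    moment = block_start
    while moment <= block_end:
        stamps.append(moment)
        moment += timedelta(hours=1)
    return stamps


start = datetime(2018, 12, 31, 22, tzinfo=timezone.utc)
end = datetime(2019, 1, 1, 2, tzinfo=timezone.utc)

merged = []
for block_start, block_end in year_blocks(start, end):
    merged.extend(hourly_inclusive(block_start, block_end))

# The boundary stamp 2019-01-01 00:00 UTC shows up in both blocks.
duplicates = [t for i, t in enumerate(merged) if t in merged[:i]]

# Keeping the first occurrence of each stamp removes the overlap.
deduplicated = list(dict.fromkeys(merged))
```

Note that 2019-01-01 00:00 UTC is 01:00 at +01:00, which matches the duplicated index values reported above. On an already downloaded pandas series, a possible workaround is `dat[~dat.index.duplicated(keep='first')]`.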

NoMatchingDataError for query_load with country_code 'DE' (Germany)

Compliments for the package, it was very useful for our covid-19 research / blogpost https://amperon.co/blog/covid-19-impact-on-global-energy-markets/

Almost everything I desired worked, except downloading the German load data.
Running the next code block in my Jupyter notebook with my own api_key returned a NoMatchingDataError. Running the same for country_code = 'FR' returned proper data.
Running client.query_wind_and_solar_forecast for country_code = 'DE' also worked.

from entsoe import EntsoePandasClient
import pandas as pd
client = EntsoePandasClient(api_key='xxx')
start = pd.Timestamp('20200108', tz='Europe/Brussels')
end = pd.Timestamp('20200109', tz='Europe/Brussels')
country_code = 'DE'
entsoe_load = pd.DataFrame(client.query_load(country_code, start=start,end=end)).reset_index()
entsoe_load.head()

Exact outcome:

NoMatchingDataError: between 2020-01-08 00:00:00+01:00 and 2020-01-09 00:00:00+01:00

NoMatchingDataError Traceback (most recent call last)
in ()
5 end = pd.Timestamp('20200109', tz='Europe/Brussels')
6 country_code = 'DE'
----> 7 entsoe_load = pd.DataFrame(client.query_load(country_code, start=start,end=end)).reset_index()
8 entsoe_load.head()

/home/geert/.local/lib/python3.6/site-packages/entsoe/entsoe.py in year_wrapper(start, end, *args, **kwargs)
740 if sum([f is None for f in frames]) == len(frames):
741 # All the data returned are void
--> 742 raise NoMatchingDataError
743
744 df = pd.concat(frames, sort=True)

NoMatchingDataError:

About my Jupyter Notebook:

The version of the notebook server is: 5.2.2
The server is running on this version of Python:
Python 3.6.8 (default, Oct 7 2019, 12:59:55) [GCC 8.3.0]
Current Kernel Information:
Python 3.6.8 (default, Oct 7 2019, 12:59:55)
IPython 5.5.0 -- An enhanced Interactive Python.

query_contracted_reserve_prices error

Hello,

There are some errors in the API when getting data for "query_contracted_reserve_prices" and "query_contracted_reserve_amount" for Romania.

NoMatchingDataError Traceback (most recent call last)
in
----> 1 client.query_contracted_reserve_amount(country_code=Area.RO, start=start, end=end, type_marketagreement_type='A01')

f:\ML\Git_Repo\Market-Price-Forecast\1. DAM_Market_Price_Forecast\env\lib\site-packages\entsoe\entsoe.py in year_wrapper(start, end, *args, **kwargs)
835 if sum([f is None for f in frames]) == len(frames):
836 # All the data returned are void
--> 837 raise NoMatchingDataError
838
839 df = pd.concat(frames, sort=True)

NoMatchingDataError:

Also, Day-Ahead Market Price, Actual Load, Load Forecast, Generation Forecast, Crossborder Flows and Scheduled Exchanges don't have a default column name, but this can be solved with a pandas DataFrame function.

Petrica

time zone inconsistency with UK spot prices

I suspect that there might be a time zone inconsistency with the results of the UK spot price query (may possibly extend to other UK variables). When the timezone is set to 'Europe/Berlin' or 'CET', the returned pandas series datetime index does not match that on the ENTSOE website when displayed in the equivalent timezone.

The following code snippet

country_code = 'GB'
start = pd.Timestamp('20190614', tz='Europe/Berlin')
end = pd.Timestamp('20190615', tz='Europe/Berlin')
dap = client.query_day_ahead_prices(country_code, start=start, end=end)

returns the following

This does not match that on the website

response status code 400 -- No RESULT OBTAINED --

Hi, I tried to use entsoe-py, I can make valid base_requests, but when I try to get a query_price I get the response:

response status code 400
-- No RESULT OBTAINED --

So this is possibly a feature request: what about adding a logger to entsoe-py to see what is going on and to be able to see what went wrong? I am pretty sure I invoked the query_price(...) method with the correct parameters. Some examples would also be nice, to see how the module actually works. I had some problems getting the imports to work in my virtual environment. Otherwise nice work, and I hope I get it going.

Retrieve static data

Dear EnergieID team,

Thanks for your work! Is there any possibility to query static data such as installed capacity per generation unit? If I see correctly, only dynamic data is available so far.

Thank you

Issue with Hydro Pumped Storage values

Hi,

The entsoe pandas client (probably also the raw client, but I didn't test it) does not take into consideration the "negative values" of Hydro Pumped Storage. These plants are able to pump water and so can have negative production (they then act as a load, not a generator).

The ENTSO-E API website makes the distinction

But the Python client does not:

I also don't know how to tell the difference in the raw XML:

	<TimeSeries>
		<mRID>5</mRID>
		<businessType>A01</businessType>
		<objectAggregation>A08</objectAggregation>
		<inBiddingZone_Domain.mRID codingScheme="A01">10YFR-RTE------C</inBiddingZone_Domain.mRID>
		<quantity_Measure_Unit.name>MAW</quantity_Measure_Unit.name>
		<curveType>A01</curveType>
		<MktPSRType>
			<psrType>B10</psrType>
		</MktPSRType>
		<Period>
			<timeInterval>
				<start>2019-05-20T05:00Z</start>
				<end>2019-05-20T06:00Z</end>
			</timeInterval>
			<resolution>PT60M</resolution>
			<Point>
				<position>1</position>
				<quantity>1902</quantity>
			</Point>
		</Period>
	</TimeSeries>
	<TimeSeries>
		<mRID>6</mRID>
		<businessType>A01</businessType>
		<objectAggregation>A08</objectAggregation>
		<outBiddingZone_Domain.mRID codingScheme="A01">10YFR-RTE------C</outBiddingZone_Domain.mRID>
		<quantity_Measure_Unit.name>MAW</quantity_Measure_Unit.name>
		<curveType>A01</curveType>
		<MktPSRType>
			<psrType>B10</psrType>
		</MktPSRType>
		<Period>
			<timeInterval>
				<start>2019-05-20T03:00Z</start>
				<end>2019-05-20T05:00Z</end>
			</timeInterval>
			<resolution>PT60M</resolution>
			<Point>
				<position>1</position>
				<quantity>1236</quantity>
			</Point>
			<Point>
				<position>2</position>
				<quantity>308</quantity>
			</Point>
		</Period>
	</TimeSeries>

The first TimeSeries should be considered positive and the second negative.

I have sent an email to the ENTSO-E platform for more information.
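
In the XML above, the only visible difference between the two series is `inBiddingZone_Domain.mRID` versus `outBiddingZone_Domain.mRID`. The sketch below takes the issue's reading as an assumption (an `outBiddingZone` series is consumption and gets a negative sign) and applies it with the standard-library XML parser. The function name `signed_quantities` is made up, and the sample omits the XML namespace that a real response would probably carry.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<Document>
  <TimeSeries>
    <inBiddingZone_Domain.mRID codingScheme="A01">10YFR-RTE------C</inBiddingZone_Domain.mRID>
    <Period>
      <Point><position>1</position><quantity>1902</quantity></Point>
    </Period>
  </TimeSeries>
  <TimeSeries>
    <outBiddingZone_Domain.mRID codingScheme="A01">10YFR-RTE------C</outBiddingZone_Domain.mRID>
    <Period>
      <Point><position>1</position><quantity>1236</quantity></Point>
      <Point><position>2</position><quantity>308</quantity></Point>
    </Period>
  </TimeSeries>
</Document>
"""


def signed_quantities(xml_text):
    """Return quantities, negated for series that carry an outBiddingZone element."""
    root = ET.fromstring(xml_text)
    values = []
    for series in root.iter('TimeSeries'):
        # Assumption taken from the issue: outBiddingZone marks pumping (consumption).
        is_consumption = any(
            child.tag == 'outBiddingZone_Domain.mRID' for child in series
        )
        sign = -1.0 if is_consumption else 1.0
        for point in series.iter('Point'):
            values.append(sign * float(point.find('quantity').text))
    return values


values = signed_quantities(SAMPLE)
```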

Cyprus DOMAIN_MAPPINGS and TIMEZONE_MAPPINGS missing

Cyprus mappings are missing in the mappings dictionaries.
What I could find for sure is the fix for DOMAIN_MAPPINGS & TIMEZONE_MAPPINGS,

item for DOMAIN_MAPPINGS: 'CY': '10YCY-1001A0003J',
item for TIMEZONE_MAPPINGS: 'CY': 'Asia/Nicosia'

query_generation and ENTSO-E point NoneType

On a large request, I got an error:

The code:

country = 'DE'
start = pd.Timestamp('2018-01-01 00:00:00', tz='Europe/Brussels')
end = pd.Timestamp('2018-12-31 12:00:00', tz='Europe/Brussels')

result = client.query_generation(country, start=start, end=end, psr_type=None)

The log:

Traceback (most recent call last):
  File "./project/test.py", line 23, in <module>
    data = ENTSOE.getData(countries, start, end)
  File "./project/entsoe/ENTSOE.py", line 72, in getData
    result = client.query_generation(country, start=start, end=end, psr_type=None)
  File "./project/entsoe/entsoe.py", line 537, in year_wrapper
    in blocks]
  File "./project/entsoe/entsoe.py", line 536, in <listcomp>
    frames = [func(*args, start=_start, end=_end, **kwargs) for _start, _end
  File "./project/entsoe/entsoe.py", line 665, in query_generation
    df = parse_generation(text)
  File "./project/entsoe/parsers.py", line 71, in parse_generation
    ts = _parse_generation_forecast_timeseries(soup)
  File "./project/entsoe/parsers.py", line 247, in _parse_generation_forecast_timeseries
    quantities.append(float(point.find('quantity').text))
AttributeError: 'NoneType' object has no attribute 'text'

I imagine that the line quantities.append(float(point.find('quantity').text)) does not check whether the quantity exists? It looks like ENTSO-E is providing incorrectly formatted data.
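
A defensive variant of that loop could skip points without a quantity instead of failing. The sketch below uses the standard-library `xml.etree.ElementTree` rather than the soup object seen in the traceback, and `parse_points` is a hypothetical helper, not a function of the package.

```python
import xml.etree.ElementTree as ET

SAMPLE_PERIOD = """
<Period>
  <Point><position>1</position><quantity>10</quantity></Point>
  <Point><position>2</position></Point>
  <Point><position>3</position><quantity>12.5</quantity></Point>
</Period>
"""


def parse_points(period_xml):
    """Return (position, quantity) pairs, skipping points that lack either field."""
    period = ET.fromstring(period_xml)
    pairs = []
    for point in period.iter('Point'):
        position = point.findtext('position')
        quantity = point.findtext('quantity')
        if position is None or quantity is None:
            # Malformed point: one of the expected child elements is missing.
            continue
        pairs.append((int(position), float(quantity)))
    return pairs


points = parse_points(SAMPLE_PERIOD)
```

Whether a missing quantity should be skipped, treated as zero, or reported as an error is a design decision; skipping is only one option.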

Query cross-border flows (enhancement)

Very useful tool, congratulations to the team.

I see that currently one cannot query the cross-border flows, e.g.:
https://transparency.entsoe.eu/transmission-domain/physicalFlow/show

Do you plan to add this in a future release?

SSLCertVerificationError

Hello,

I have this problem with the API. Can you help me? Thanks a lot!

MaxRetryError: HTTPSConnectionPool(host='transparency.entsoe.eu', port=443): Max retries exceeded with url: /api?documentType=A44&in_Domain=10YRO-TEL------P&out_Domain=10YRO-TEL------P&securityToken=XXXXXXXXXX&periodStart=202010112200&periodEnd=202010122200 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1076)')))

DE-AT-LU bidding zone split

I have found this in the ENTSO-E news section concerning a 'DE-AT-LU' bidding zone split that is coming up:

Dear Users,

Please be informed, that we will start preparation of the platform for DE-AT-LU bidding zone split on the 23rd of August. BZN|DE-AT-LU will be separated into 2 new bidding zones BZN|DE-LU and BZN|AT.

New bidding zones will be active from the 1st of October, however, first data submissions, like month ahead forecasts, are expected from the 1st of September.

Validity end date for BZN|DE-AT-LU is the end of September 2018.

EIC codes for new bidding zones:
BZN|DE-LU - 10Y1001A1001A82H
BZN|AT - 10YAT-APG------L (same as control area or country)

Thanks for your patience and understanding.

Best Regards
Transparency Platform Team.

These new EIC codes should be added to the code, while making sure that you can still perform historical queries.

Mappings to extend

There is a document "ENTSO-E codelists" which contains a complete list of constants like "processType" and others. You may extend your mappings with data from there. See links below.

The whole library: https://www.entsoe.eu/publications/electronic-data-interchange-edi-library/

The code list itself (sorry, only link to current latest version): https://www.entsoe.eu/Documents/EDI/Library/CodelistV71.zip

It contains xsd files and probably you could just extract mappings from there instead of hardcoding constants in python code.

query_installed_generation_capacity_per_unit broken

The truncation in

entsoe-py/entsoe/entsoe.py

Line 761 in ae578fc

df = df.truncate(before=start, after=end)

leads to an error because the dataframe df does not have a DatetimeIndex. Commenting out the line resolves the bug.

Some periods are missing

Hi!
I'm using the generation query for several years and countries.
I count the number of distinct periods, and some periods are missing. The function doesn't return those dates. I don't know which ones are missing.

COUNT
COUNTRY   2018   2019
ES        8759   8755
FR        8757   8751
IT        8751   8760
PT        8760   8760

I use the same code, changing only the country.

CODE
from entsoe import EntsoePandasClient
import pandas as pd
client = EntsoePandasClient(api_key='MY KEY')
start = pd.Timestamp('20180101', tz='Europe/Brussels')
end = pd.Timestamp('20190101', tz='Europe/Brussels')
country_code = 'FR'
ts = client.query_generation(country_code, start=start,end=end, psr_type=None)
print (ts)
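
To find out which periods are missing, the returned timestamps can be compared against the full expected range. This is a minimal standard-library sketch that assumes hourly resolution; `missing_stamps` is a made-up helper name.

```python
from datetime import datetime, timedelta, timezone


def missing_stamps(stamps, start, end, step=timedelta(hours=1)):
    """Return every expected stamp in [start, end) that is absent from stamps."""
    present = set(stamps)
    missing = []
    cursor = start
    while cursor < end:
        if cursor not in present:
            missing.append(cursor)
        cursor += step
    return missing


start = datetime(2018, 1, 1, tzinfo=timezone.utc)
end = start + timedelta(hours=5)

# Example data with the 02:00 stamp left out on purpose.
received = [start + timedelta(hours=h) for h in (0, 1, 3, 4)]
gaps = missing_stamps(received, start, end)
```

With pandas, the same comparison can be done by building the expected index with `pd.date_range` and calling `.difference(ts.index)` on it.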

Making another request - EntsoePandasClient

Is it possible to make a request that is not among those listed for EntsoePandasClient?

As far as I can see, EntsoeRawClient makes it possible to issue another request if the API call I want is not in the list, but I cannot find the same thing for EntsoePandasClient.

thanks in advance.

Gabriele

Does anyone happen to know the rate limit of entsoe?

Thanks for making entsoe-py, it is a very helpful library!

I am trying to run the code periodically but I am getting HTTP error (too many requests) from time to time, does anyone happen to know what is the approximate rate limit for entsoe's API?

Thanks
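
Independent of the exact limit, a periodic job can be made more tolerant by retrying with an increasing delay. The sketch below is generic: `TooManyRequests` is a hypothetical exception standing in for however the HTTP 429 error surfaces in your code, and `fetch` is any zero-argument callable that performs the request.

```python
import time


class TooManyRequests(Exception):
    """Hypothetical marker for an HTTP 429 (too many requests) response."""


def call_with_backoff(fetch, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(), waiting base_delay * 2**attempt seconds after each 429."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except TooManyRequests:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable only so the helper can be exercised without actually waiting.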

Download of imbalance prices has stopped working

Download of imbalance prices has stopped working, it is very likely because API request for imbalance prices now returns ZIP file, whereas API request for day ahead prices (which still works) returns directly XML data.
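
If the response is sometimes a ZIP archive and sometimes plain XML, the payload can be inspected before parsing. This is a sketch using only the standard library; `extract_xml_documents` is a hypothetical helper, and it assumes the archive members are UTF-8 encoded XML files.

```python
import io
import zipfile


def extract_xml_documents(payload: bytes):
    """Return a list of XML strings from either a ZIP archive or a plain XML body."""
    buffer = io.BytesIO(payload)
    if zipfile.is_zipfile(buffer):
        buffer.seek(0)
        with zipfile.ZipFile(buffer) as archive:
            # Assumption: every member of the archive is a UTF-8 XML document.
            return [archive.read(name).decode('utf-8') for name in archive.namelist()]
    return [payload.decode('utf-8')]
```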

Truncate requires a sorted index

Executing for example this query for Belgian offshore wind for the year 2018

client.query_generation('BE', 
    start=pd.Timestamp('2018-01-01', tz='UTC'), 
    end=pd.Timestamp('2019-01-01', tz='UTC'), 
    psr_type='B18'
)

results in a ValueError: truncate requires a sorted index from pandas which requires the truncated series to be sorted first. This happens on line 903 in entsoe.py:

 df = df.truncate(before=start, after=end)

If I manually do

 df = df.sort_index().truncate(before=start, after=end)

in a debugger, it works fine.

I’m running entsoe-py v. 0.2.12 and pandas v. 1.0.3.

missing zones in the timezone-mapping

Example:

Input

client.query_import(start=pd.TimeS, end=end, country_code="DE-AT-LU")

Last Error Message

D:\Programme\Anaconda3\lib\site-packages\entsoe\entsoe.py in query_crossborder_flows(self, country_code_from, country_code_to, start, end, lookup_bzones)
    973             lookup_bzones = lookup_bzones)
    974         ts = parse_crossborder_flows(text)
--> 975         ts = ts.tz_convert(TIMEZONE_MAPPINGS[country_code_from])
    976         ts = ts.truncate(before=start, after=end)
    977         return ts

KeyError: 'IT-NORD-AT'

In my eyes, the TIMEZONE_MAPPINGS misses the following lines:

'IT-NORD-AT': 'Europe/Rome',
'IT-NORD-FR': 'Europe/Rome',
'IT-GR': 'Europe/Rome'

Transmission capacity forecast

I believe that in addition to 'query_crossborder_flows', it is quite straightforward to create 'query_crossborder_Capacity_Forecast' using documenttype 'A61'. The codes will be equivalent, except that extra granularities ( contract_MarketAgreement = 'A01','A02','A03') are allowed for the capacity forecasts.

GB seeming to incorrectly resolve in mappings.py

Hi guys,

I noticed an issue earlier I thought I'd make you aware of.

When calling query_load with 'GB' as the country it resolves the area to '10YGB----------A', however I believe it should resolve to simply 'GB' as this is the value that is used to get country load in the Entsoe API docs.

from entsoe import EntsoePandasClient
import pandas as pd

client = EntsoePandasClient(api_key=ENTSOE_API_KEY)

start = pd.Timestamp('2020-11-1', tz='Europe/London')
end= pd.Timestamp('2020-11-2', tz='Europe/London')

print(client.query_load('GB', start=start, end=end))

I'd expect the above to display the data for the Country Total Load for United Kingdom however it instead returns the data for the Bidding Zone.

Let me know if you need any further information.

Regards,
Ditchell

Possibility to get country area for load ?

I did a modification in the code on my side:

    @year_limited
    def query_load(self, country_code, start, end, psr_type=None,
                         lookup_bzones=False) -> pd.Series:
        """
        Parameters
        ----------
        country_code : str
        start : pd.Timestamp
        end : pd.Timestamp
        psr_type : str
            filter on a single psr type
        lookup_bzones : bool
            if True, country_code is expected to be a bidding zone

        Returns
        -------
        pd.Series
        """
        text = super(EntsoePandasClient, self).query_generation(
            country_code=country_code, start=start, end=end, psr_type=psr_type,
            lookup_bzones=lookup_bzones)
        series = parse_loads(text)
        series = series.tz_convert(TIMEZONE_MAPPINGS[country_code])
        return series

This makes it possible to get the load per country. Would it be possible to add this to the code in this repo?

from entsoe import EntsoePandasClient does not work

Thank you for the downloader. I am trying to create one for ENTSO-E that takes generation on a unit basis, which is usually given only in 24-hour frames, and extends the downloaded generation data to a calendar-year series.
For this I wanted to test your downloader and seek inspiration.
However, although I installed entsoe-py (pip install) and all the requirements were satisfied, the very first step, 'from entsoe import EntsoePandasClient', fails: the module is not found. I am using Python 3 and Jupyter notebooks. What am I doing wrong? Please do not judge me for not knowing; I am quite new to the downloader business.

scheduled commercial exchanges

It seems to me that the query_crossborder_flows method can only return physical flows. Is there a way to retrieve scheduled commercial exchanges ?

From what I can see in the code, the function has a hardcoded documentType of A11, corresponding to physical flows. It should be enough to turn this into a keyword argument to make retrieval of scheduled exchanges possible. The documentType for scheduled exchanges is A09 (please double check).

A caveat is that scheduled exchanges have a "Day Ahead" version and a "Total" version in the ENTSOE table. If I try to naively modify the function as detailed above, the code works but retrieves both versions, with no field specifying what is what:

def query_crossborder_flows(self, country_code_from, country_code_to, start, end, lookup_bzones=False, documentType="A11"):
        """
        Parameters
        ----------
        country_code_from : str
        country_code_to : str
        start : pd.Timestamp
        end : pd.Timestamp
        lookup_bzones : bool
            if True, country_code is expected to be a bidding zone
        documentType :  "A09" for scheduled commercial flows, "A11" for physical flows

        Returns
        -------
        str
        """
        if not lookup_bzones:
            domain_in = DOMAIN_MAPPINGS[country_code_to]
            domain_out = DOMAIN_MAPPINGS[country_code_from]
        else:
            domain_in = BIDDING_ZONES[country_code_to]
            domain_out = BIDDING_ZONES[country_code_from]

        params = {
            'documentType': documentType,
            'in_Domain': domain_in,
            'out_Domain': domain_out
        }
        response = self.base_request(params=params, start=start, end=end)
        return response.text

client.query_installed_generation_capacity not working as expected

Hi, the above function does not seem to return the expected data. Here is my code:

from entsoe import EntsoePandasClient
import pandas as pd
entsoe_api_key = 'my_key'
client = EntsoePandasClient(api_key=entsoe_api_key)
st = pd.Timestamp('20170101', tz='Europe/Brussels')
en = pd.Timestamp('20170601', tz='Europe/Brussels')
cc = 'DE' # Germany
igc = client.query_installed_generation_capacity(country_code=cc, start=st, end=en)
print(igc)

It returns only a single line from 2017-01-01. I have tried many date combinations; it always returns the first date in the range and no other dates. Many thanks.

Time zone ignored in start and end parameters to base request

Hi,
When giving time-zoned timestamps to start and end arguments of, e.g., EntsoePandasClient.generation, the latter are converted to strings without timezones in entsoe.base_request and given as periodStart and periodEnd parameters to requests.Session().get.
In the ENTSO-E Transparency API documentation, it reads that these parameters use local time-zones.

However, the TimeInterval parameter should allow one to give time-zoned time intervals in ISO UTC format.

Could we add a TimeInterval argument to give to the query functions instead of the start and end arguments to ask for data in ISO UTC format?

The code to reproduce this behavior:

from entsoe import EntsoePandasClient
import pandas as pd
client = EntsoePandasClient(api_key=security_token)

start = pd.Timestamp('2015-01-01T00:00Z') 
end = pd.Timestamp('2016-01-01T00:00Z') 
client.generation('SE-4', start=start, end=end, psr_type='B19', lookup_bzones=True)
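
One way to make the time-zone handling explicit is to convert the time-zone-aware timestamp to UTC before formatting it as a `yyyymmddHHMM` string. The sketch below assumes that the period parameters are meant to be sent in UTC; `to_period_string` is a hypothetical helper, and a fixed +02:00 offset stands in for Europe/Brussels summer time.

```python
from datetime import datetime, timedelta, timezone


def to_period_string(moment):
    """Format a timezone-aware datetime as yyyymmddHHMM in UTC."""
    if moment.tzinfo is None:
        raise ValueError('expected a timezone-aware datetime')
    return moment.astimezone(timezone.utc).strftime('%Y%m%d%H%M')


# Fixed offset used as a stand-in for Europe/Brussels in summer (CEST).
cest = timezone(timedelta(hours=2))
period_start = to_period_string(datetime(2019, 8, 1, tzinfo=cest))
```

Here `period_start` is `'201907312200'`, i.e. local midnight expressed in UTC.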

Error fetching unavailability of generation units with pandas client

I always get the following error when trying to fetch:

File "u:\code\new\entsoe-outage-fetcher.py", line 29, in <module>
    run()
  File "u:\code\new\entsoe-outage-fetcher.py", line 24, in run
    df = client.query_unavailability_of_generation_units(country_code, start=start_date, end=end_date, docstatus=None)
  File "C:\Users\me\AppData\Roaming\Python\Python36\site-packages\entsoe\entsoe.py", line 537, in year_wrapper
    in blocks]
  File "C:\Users\me\AppData\Roaming\Python\Python36\site-packages\entsoe\entsoe.py", line 536, in <listcomp>
    frames = [func(*args, start=_start, end=_end, **kwargs) for _start, _end
  File "C:\Users\me\AppData\Roaming\Python\Python36\site-packages\entsoe\entsoe.py", line 518, in pagination_wrapper
    df = func(*args, start=start, end=end, **kwargs)
  File "C:\Users\me\AppData\Roaming\Python\Python36\site-packages\entsoe\entsoe.py", line 792, in query_unavailability_of_generation_units
    df = parse_unavailabilities(content)
  File "C:\Users\me\AppData\Roaming\Python\Python36\site-packages\entsoe\parsers.py", line 352, in parse_unavailabilities
    with zipfile.ZipFile(BytesIO(response), 'r') as arc:
TypeError: a bytes-like object is required, not 'DataFrame'

end_date: Timestamp('2015-01-02 00:00:00+0100', tz='CET')
start_date: Timestamp('2015-01-01 00:00:00+0100', tz='CET')

client.query_unavailability_of_generation_units('DE', start=start_date, end=end_date, docstatus=None)

EDIT: I think you are not converting the zip file into a DataFrame, see here. You do, however, seem to convert the bytes into a DataFrame some lines below.

EDIT 2: Could you elaborate more on the difference between generation and production units? :)

Time periods?

These are my first steps into Python.
How can I get the time period as a column when I use this code?
I think this is important when the clock changes from winter to summer time (or from summer to winter).
Can the functions return the time period?
Thank you.

CODE
from entsoe import EntsoePandasClient
import pandas as pd
client = EntsoePandasClient(api_key='KEY')
start = pd.Timestamp('20200101', tz='Europe/Brussels')
end = pd.Timestamp('20210101', tz='Europe/Brussels')
country_code = 'ES'

# methods that return Pandas DataFrames

ts = client.query_generation(country_code, start=start,end=end, psr_type=None)
print (ts)

Add timeout to session request

I propose adding a timeout for the request in entsoe.py to prevent it from hanging if the server does not respond. (As it did last week.)

In line 105
response = self.session.get(url=URL, params=params, proxies=self.proxies)

add the parameter timeout, for example timeout=1.

Missing <resolution> type

_resolution_to_timedelta(res_text) function in parsers.py.

Add PT30M as a possible resolution text in the soup:

'''
elif res_text == 'PT30M':
    delta = '30min'
'''
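
Instead of adding one branch per resolution, the ISO 8601 duration text could be parsed generically. The sketch below returns a `datetime.timedelta` rather than the string alias used in the snippet above, covers only day, hour and minute components, and uses a made-up function name; it is not the package's `_resolution_to_timedelta`.

```python
import re
from datetime import timedelta

# Matches durations such as PT15M, PT30M, PT60M, PT1H or P1D.
_DURATION = re.compile(r'^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?)?$')


def resolution_to_timedelta_sketch(res_text):
    """Convert a simple ISO 8601 duration (days, hours, minutes) to a timedelta."""
    match = _DURATION.match(res_text)
    if match is None or not any(match.groups()):
        # Years, months, weeks and seconds are deliberately not handled here.
        raise NotImplementedError(f'unsupported resolution: {res_text}')
    days, hours, minutes = (int(group) if group else 0 for group in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes)


half_hour = resolution_to_timedelta_sketch('PT30M')
```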

Imbalance prices not available for bidding zones but only for countries

When using request for imbalance prices, the expected area code is mapped from DOMAIN_MAPPINGS, thus only allows to specify country code.

When reading imbalance prices manually from the Transparency Platform it is possible to access imbalance prices per bidding zone which should also be the case when using Python API.

Suggestion - substitute domain in query_imbalance_prices from DOMAIN_MAPPINGS to BIDDING_ZONES.

query_generation_per_plant for Germany

I saw a few closed issues related to this, but I can't seem to make it work.
Did ENTSO-E remap Germany since the last update?

Tried 'DE', 'DE-LU', 'DE-AT-LU' with and without lookup_bzones, and got a Bad Request for all cases :(
Thanks!

Generation Forecast Austria contains unexplained zeros

I am running query_generation_forecast for 'AT' using version 0.3.0 (pip installed) and am receiving the following, as an example:
2017-09-28 18:00:00+00:00 7724.0
2017-09-28 19:00:00+00:00 0.0
2017-09-28 20:00:00+00:00 0.0
2017-09-28 21:00:00+00:00 5145.0

When the entsoe data platform shows (CET):
20:00 - 21:00 7724
21:00 - 22:00 5541
22:00 - 23:00 5408
23:00 - 00:00 5145
The corresponding Scheduled Consumption in the data platform is 0.

As you can see some returned values are correct, others are just zero. This behaviour happens consistently every day and doesn't seem to follow a specific intuitive rule.
Before updating to 0.3.0, I had exactly the same issue with 'DE-AT-LU'

Confusion between PeriodStart/End and PeriodStartUpdate/EndUpdate?

Hi there,

I'm trying to look at outages published on the Transparency Platform, and I'm having some problems with dates.

From what I understood of the API, it is possible to specify in a request two different intervals. One representing the dates/times of the last update of outages, and the other representing the dates/times of the outage itself. If I'm correct, the former is set using the parameters PeriodStartUpdate and PeriodEndUpdate, and the latter using PeriodStart and PeriodEnd.

However, in your EntsoePandasClient, it looks like you're mixing them, and more precisely in _query_unavailability where you use start/end (passed as PeriodStart/PeriodEnd in the actual request) to truncate the dataframe on its index. However, this index represents the last update time, so shouldn't this truncation be done on periodstartupdate/periodendupdate when they are given?

Best,
Yannick

lookup_bzones in query imbalance prices

Hello,

Great job, this package helped me a lot!

Can you add lookup_bzones logic to other queries as well? I see data exist for:
query_imbalance_prices,
query_generation,
query_wind_and_solar_forecast,
query_installed_generation_capacity,

generation by unit for germany

I am trying to query generation by unit for Germany with the following:

start = pd.Timestamp("20190801", tz="Europe/Brussels")  
end = pd.Timestamp("20190802", tz="Europe/Brussels")  
zone_code = "DE"  
degen = client.query_generation_per_plant(country_code="DE", start=start, end=end)

but the code returns the following error:
400 Client Error: Bad Request for url: https://transparency.entsoe.eu/api?documentType=A73&processType=A16&in_Domain=10Y1001A1001A83F&securityToken=35e58724-cbfb-4bf6-b12d-ad8ab9c49171&periodStart=201907312200&periodEnd=201908012200

Looking in the ENTSOE transparency portal I see that generation by unit in Germany seems to be reported only by control area, not by country total and I wonder if this is the reason why the query fails. If this is the case, I don't see in the documentation how I could query by control area.

An example of what I am trying to retrieve is here:

Logging definition mismatches

Hello,

Thanks for the package - has saved me a lot of time and effort!

I get a crash and a flood of log output when I use the package together with the logging implemented as part of my wider project.
Code to reproduce:
`import sys, os
import datetime as dt

from entsoe import EntsoePandasClient
import pandas as pd

import logging

# logging settings

LOGNAME = "entsoe_test"
LOGTOFILE = True # true = create and send all messages or errors to a log file
LOGFILENEWONSTART = False # true = create a new file each startup, false = append across multiple startups
LOGCONSOLETOFILE = True # true = echo the stderr and stdout to the log file as well as application messages; it is usually desirable to keep this turned on
LOGFILENAME = "test.log"
LOGLEVEL = "DEBUG"

global logger variable

logger = logging.getLogger(LOGNAME)

proxies = None
entsoe_api_key = ""

class StreamLogger(object):
def init(self, stream, prefix=""):
self.stream = stream
self.prefix = prefix
self.data = ""

def write(self, data):
    self.stream.write(data)
    self.stream.flush()

    self.data += data
    tmp = str(self.data)
    if "\x0a" in tmp or "\x0d" in tmp:
        tmp = tmp.rstrip("\x0a\x0d")
        logger.info("%s%s" % (self.prefix, tmp))
        self.data = ""

def flush(self):
    pass

def init_logging(runid, name=LOGFILENAME):
print(f"Saving LOGFILE to {name}")

if LOGTOFILE == True:

    if LOGFILENEWONSTART == True:
        try:
            os.remove(name)
        except:
            None

    hdlr = logging.FileHandler(name)
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    hdlr.setFormatter(formatter)
    logger.addHandler(hdlr)

    if LOGLEVEL == "DEBUG":
        logger.setLevel(logging.DEBUG)
    if LOGLEVEL == "INFO":
        logger.setLevel(logging.INFO)
    if LOGLEVEL == "WARNING":
        logger.setLevel(logging.WARNING)
    if LOGLEVEL == "ERROR":
        logger.setLevel(logging.ERROR)
    if LOGLEVEL == "CRITICAL":
        logger.setLevel(logging.CRITICAL)

    # with this redirect stdout and std error will go to both the consol and the log.
    if LOGCONSOLETOFILE == True:

        sys.stdout = StreamLogger(sys.stdout, ":  [stdout] ")
        sys.stderr = StreamLogger(sys.stderr, ":  [stderr] ")

# log some useful info to the log file
logger.info("---------------------------------")
logger.info("---------------------------------")
logger.info(f"RUNID: {runid}")
try:
    logger.info(f"USER: {os.getlogin()}")
    logger.info(f"COMPUTERNAME: {os.environ['COMPUTERNAME']}")
    logger.info(f"OS: {os.environ['OS']}")
    logger.info(f"CODE EXE: {sys.executable}")
except:
    pass

def writelog(level, msg):

if LOGTOFILE == True:

    if level == "DEBUG":
        logger.debug(LOGNAME + ":  " + msg)
    if level == "INFO":
        logger.info(LOGNAME + ":  " + msg)
    if level == "WARNING":
        logger.warning(LOGNAME + ":  " + msg)
    if level == "ERROR":
        logger.error(LOGNAME + ":  " + msg)
    if level == "CRITICAL":
        logger.critical(LOGNAME + ":  " + msg)
    if level == "EXCEPTION":
        logger.exception(LOGNAME + ":  " + msg)

else:  # at least tell the console
    if level == "DEBUG":
        print("DEBUG: " + LOGNAME + ":  " + msg)
    if level == "INFO":
        print("INFO: " + LOGNAME + ":  " + msg)
    if level == "WARNING":
        print("WARNING: " + LOGNAME + ":  " + msg)
    if level == "ERROR":
        print("ERROR: " + LOGNAME + ":  " + msg)
    if level == "CRITICAL":
        print("CRITICAL: " + LOGNAME + ":  " + msg)
    if level == "EXCEPTION":
        print("EXCEPTION: " + LOGNAME + ":  " + msg)

init_logging("TEST", name="test.log")

writelog("INFO", "Setting up")

client = EntsoePandasClient(api_key=entsoe_api_key, proxies=proxies)

start = dt.date(2020, 11, 15)
end = dt.date(2020, 11, 19)

start = pd.Timestamp(start, tz="Europe/London")
end = pd.Timestamp(end, tz="Europe/London")

country = "FR"
writelog("DEBUG", f"ENTSOE: Gathering Data for {country}")
df_country = client.query_load_forecast(country, start=start, end=end).to_frame(
f"{country}_LOAD_FORECAST"
)

writelog("INFO", "Done")
`

To fix it, I can define a module-level logger in entsoe.py:
logger = logging.getLogger(__name__)
and use it like this:
logger.debug(f'Performing request to {URL} with params {params}')

Does that work as a fix?
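
My understanding of why this might help, as a small stdlib-only sketch: the module-level functions such as logging.debug() call logging.basicConfig() when the root logger has no handlers, which attaches a StreamHandler writing to sys.stderr, whereas a named logger does not touch the root configuration. With sys.stderr redirected into the logger as in my setup, that extra handler would presumably feed log output back into itself. The helper `root_handlers_added_by` and the logger name `entsoe_demo` are made up for the demonstration.

```python
import logging


def root_handlers_added_by(call):
    """Run `call` with a handler-free root logger and count the handlers it leaves behind."""
    root = logging.getLogger()
    saved = root.handlers[:]
    root.handlers = []
    try:
        call()
        return len(root.handlers)
    finally:
        root.handlers = saved  # restore whatever was configured before


# Module-level function: implicitly calls logging.basicConfig() on a bare root logger.
via_root = root_handlers_added_by(lambda: logging.debug("request sent"))

# Named logger: leaves the root logger's configuration untouched.
named = logging.getLogger("entsoe_demo")
via_named = root_handlers_added_by(lambda: named.debug("request sent"))

print(via_root, via_named)
```

The first count shows the handler that the module-level call installs on its own; the second shows that the named logger installs none.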

Issue with Installed Capacity per Production Type

I would like to download the installed capacity per production type for each country. For this, I am using the following:

client.query_installed_generation_capacity(country_code, start=start, end=end, psr_type=None)

For some country codes I get an empty dataframe even though the data exists online, e.g. for FI (Finland). Is there a reason for that?

Thank you in advance.

Discarded data in EntsoePandasClient._query_unavailability

Hi there,

While looking at the data returned by a call to EntsoePandasClient._query_unavailability, I saw that there was data missing from what we can see on the transparency website.

It turns out that the issue comes from the year_limited decorator, and more precisely from this line (l.839 of entsoe.py in the current version):
df = df.loc[~df.index.duplicated(keep='first')]

What is the intended purpose of this line? I'm asking this because at the moment, data returned by the API for doctypes A77/A80 is split by time periods, not by version or something like that. That means that this line only keeps the first period of the entire actual outage, discarding lots of data in the process.

In my opinion, what needs to change is the parsing of outages (function _outage_parser in parsers.py): it should concatenate the different time periods and thus return a dataframe with only one row per outage.
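
To illustrate the effect without pandas, here is a sketch that compares keeping only the first entry per index value with merging all periods of an outage into one row. The record layout and the helper names are invented for the example and do not mirror the real _outage_parser output; merging into a single start/end span is just one possible reading of "concatenate".

```python
# Made-up parsed periods: outage A is split into three time periods that all
# share the same index value, outage B has a single period.
periods = [
    ("2020-03-01T10:00Z", {"unit": "A", "start": "2020-06-01", "end": "2020-06-02"}),
    ("2020-03-01T10:00Z", {"unit": "A", "start": "2020-06-02", "end": "2020-06-03"}),
    ("2020-03-01T10:00Z", {"unit": "A", "start": "2020-06-03", "end": "2020-06-04"}),
    ("2020-03-05T08:00Z", {"unit": "B", "start": "2020-07-01", "end": "2020-07-02"}),
]


def keep_first(rows):
    """Mimic df.loc[~df.index.duplicated(keep='first')]: one row per index value."""
    seen = {}
    for index, row in rows:
        seen.setdefault(index, row)
    return list(seen.items())


def merge_periods(rows):
    """Alternative: one row per index value that spans all of its periods."""
    merged = {}
    for index, row in rows:
        if index not in merged:
            merged[index] = dict(row)
        else:
            # ISO dates compare correctly as strings.
            merged[index]["start"] = min(merged[index]["start"], row["start"])
            merged[index]["end"] = max(merged[index]["end"], row["end"])
    return list(merged.items())


first_only = keep_first(periods)
spanning = merge_periods(periods)
print(first_only[0][1]["end"])  # 2020-06-02: later periods of outage A are lost
print(spanning[0][1]["end"])    # 2020-06-04: the full outage is kept
```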

Best,
Yannick

Bad request for query_imbalance_prices

Dear all,
First of all, thanks very much for developing this package; it saved me a lot of time learning to work with the original API.
For the most part it works fine for me, and I can easily query load and generation data for different countries.
However, I'm having trouble with the "query_imbalance_prices" function. My code looks like this:

from entsoe import EntsoePandasClient
import pandas as pd
from datetime import date

s_token = 'my_token'

year = 2015
start_month = 1
end_month = 1
end_day = 31
country = 'DE'
client = EntsoePandasClient(api_key=s_token)
tz = 'Europe/Berlin'
start = pd.Timestamp(year=year, month=start_month, day=1, hour=0, minute=0, tz=tz)
end = pd.Timestamp(year=year, month=end_month, day=end_day, hour=23, minute=45, tz=tz)
load_df = client.query_imbalance_prices(country_code=country, start=start, end=end, psr_type='A95')
HTTPError: 400 Client Error: Bad Request for url: https://transparency.entsoe.eu/api?documentType=A85&controlArea_Domain=10Y1001A1001A83F&psrType=A96

I tried many different values for psr_type and also tried some "DocumentTypes", such as 'A81' - 'A86', and 'BSN-types' ('A95'-'98'), all resulting in the same error.
In the API documentation I found that you should also include the bidding zone, which the query function does not seem to do.

Am I missing something, or should the "query_imbalance_prices" function take BSN-types or DocumentTypes instead of psr_types?

Thanks very much for your help!

best
Philipp

Length mismatch: Expected axis has 24 elements, new values have 1 elements

Running:

from entsoe import EntsoePandasClient
import pandas as pd

client = EntsoePandasClient(api_key=api_key)

tz = "Europe/Dublin"

start = pd.Timestamp("20180101", tz=tz)
end = pd.Timestamp("20201015", tz=tz)

country_code = "IE-SEM"

df = client.query_day_ahead_prices(country_code, start=start, end=end)

fails with

Traceback (most recent call last):
  File "C:/Users/user/dev/entsoe-py/scripts/dayahead.py", line 16, in <module>
    df = client.query_day_ahead_prices(country_code, start=start, end=end)
  File "C:\Users\user\dev\entsoe-py\entsoe\entsoe.py", line 740, in year_wrapper
    frame = func(*args, start=_start, end=_end, **kwargs)
  File "C:\Users\user\dev\entsoe-py\entsoe\entsoe.py", line 798, in query_day_ahead_prices
    series = parse_prices(text)
  File "C:\Users\user\dev\entsoe-py\entsoe\parsers.py", line 39, in parse_prices
    series = series.append(_parse_price_timeseries(soup))
  File "C:\Users\user\dev\entsoe-py\entsoe\parsers.py", line 313, in _parse_price_timeseries
    series.index = _parse_datetimeindex(soup)
  File "C:\Users\user\AppData\Local\Continuum\miniconda3\envs\entsoe-py\lib\site-packages\pandas\core\generic.py", line 5143, in __setattr__
    return object.__setattr__(self, name, value)
  File "pandas\_libs\properties.pyx", line 66, in pandas._libs.properties.AxisProperty.__set__
  File "C:\Users\user\AppData\Local\Continuum\miniconda3\envs\entsoe-py\lib\site-packages\pandas\core\series.py", line 424, in _set_axis
    self._mgr.set_axis(axis, labels)
  File "C:\Users\user\AppData\Local\Continuum\miniconda3\envs\entsoe-py\lib\site-packages\pandas\core\internals\managers.py", line 226, in set_axis
    raise ValueError(
ValueError: Length mismatch: Expected axis has 24 elements, new values have 1 elements

My guess is that

entsoe-py/entsoe/parsers.py

Line 449 in 5d17669

end = pd.Timestamp(soup.find('end').text)

finds the first end tag <end>2018-01-01T01:00Z</end> in soup when it should really find the last one, i.e. <end>2018-01-02T00:00Z</end> so it ends up with a TimeIndex of length 1 (one hour) and fails with the above error.

I also tried the above code with country_code = "BE", since this is what the tests cover, and in that case the response contains only one end tag. I am now wondering why the returned XML looks different for "BE" and "IE-SEM". Should we handle the XML differently, or can we simply pick the last end tag? In the latter case the fix might be minimal and would work for both cases.
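
A small sketch of the "pick the last end tag" idea, using xml.etree.ElementTree from the standard library instead of the BeautifulSoup calls in parsers.py. The XML fragment is a simplified, made-up stand-in for the real response, and `parse_utc` and `first_and_last_end` are hypothetical helpers.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Simplified, made-up fragment with one <end> per hourly period.
XML = """
<TimeSeries>
  <Period><timeInterval><start>2018-01-01T00:00Z</start><end>2018-01-01T01:00Z</end></timeInterval></Period>
  <Period><timeInterval><start>2018-01-01T01:00Z</start><end>2018-01-01T02:00Z</end></timeInterval></Period>
  <Period><timeInterval><start>2018-01-01T23:00Z</start><end>2018-01-02T00:00Z</end></timeInterval></Period>
</TimeSeries>
"""


def parse_utc(text):
    """Parse timestamps of the form 2018-01-01T01:00Z as timezone-aware UTC datetimes."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%MZ").replace(tzinfo=timezone.utc)


def first_and_last_end(xml_text):
    """Return the first and the last <end> timestamp in document order."""
    root = ET.fromstring(xml_text.strip())
    ends = [parse_utc(el.text) for el in root.iter("end")]
    return ends[0], ends[-1]


first_end, last_end = first_and_last_end(XML)
print(first_end.isoformat())  # what taking the first match yields
print(last_end.isoformat())   # what the index should extend to
```

When the response contains a single end tag, as for "BE", the first and last match coincide, so taking the last one should not change that case.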
