
Comments (8)

sjpfenninger commented on July 3, 2024

As of 47a49c4 this is sort of possible by extending _TIMESERIES_PARAMS -- but that's still hardcoded.


brynpickering commented on July 3, 2024

To get around the hardcoding issue, it seems sensible to populate _TIMESERIES_PARAMS based on file references in the YAML files. Then it is a matter of deciding how to catch only timeseries files (rather than spatial or load-rate based ones). You could:

  • assume that a file is a time series file if its number of lines matches the length of set_t, but this could cause issues if a file with as many lines refers to something else (unlikely, since the time series is usually the longest dimension, but still possible).
  • name the file to make it obvious (e.g. append "_t", or similar), but then the onus is on the user to remember to do so.
  • have the first column of all time-based files be a timestamp rather than an integer. This would also allow time dependency at intervals that differ from set_t (e.g. a daily change rather than hourly), but then you'd need to know how to interpolate between values (step change, straight line, etc.) -- see the sketch after this list.
  • simply ask the user to list, at the start of the YAML file, all values which change in time. Easier to process, but relies on the user remembering to fill it in.
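
A minimal sketch of how the YAML-based discovery plus the timestamp convention (option 3) might look -- find_file_params and read_timeseries_csv are hypothetical names, not Calliope's actual API:

```python
import pandas as pd
import yaml


def find_file_params(config, prefix=""):
    """Walk a parsed YAML dict and yield (dotted.key, filename) for every
    value of the form 'file=some.csv', rather than hardcoding the list."""
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from find_file_params(value, path)
        elif isinstance(value, str) and value.startswith("file="):
            yield path, value.split("=", 1)[1]


def read_timeseries_csv(filename, set_t):
    """Option 3: require the first CSV column to be a timestamp, then align
    it to the model's time index; NaN rows reveal any mismatch."""
    df = pd.read_csv(filename, index_col=0, parse_dates=True)
    return df.reindex(pd.DatetimeIndex(set_t))


example = yaml.safe_load("""
techs:
  ccgt:
    constraints:
      e_eff: file=ccgt_e_eff.csv
      r_eff: 0.95
""")
print(dict(find_file_params(example)))
# -> {'techs.ccgt.constraints.e_eff': 'ccgt_e_eff.csv'}
```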


brynpickering commented on July 3, 2024

Having tested extending _TIMESERIES_PARAMS, I'm certain that there'll need to be an overhaul if it were to go beyond just r and e_eff. The reason is that during constraint generation other values are assumed to be static and so are not indexed over time (e.g. here for r_eff); get_option is simply used instead.

Could we have an optional time input in get_option? I'll test it out as a possibility, but I do wonder whether it might not be cleaner to just create arrays for the entire dataset, not just for r and e_eff. In this case, everything would be searchable as e.g. model.m.r_eff['ccgt', 'r1', '2015 01 01 09:00'] rather than model.get_option('ccgt.constraints.r_eff', x='r1'), and static values would just be repeated across the time dimension.
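
For illustration, a minimal sketch (using xarray, with hypothetical data) of what "static values repeated across the time dimension" could look like:

```python
import numpy as np
import pandas as pd
import xarray as xr

set_t = pd.date_range("2015-01-01", periods=24, freq="h")

# A genuinely time-varying parameter, indexed over (t, x)...
r = xr.DataArray(
    np.random.uniform(0.1, 0.8, (len(set_t), 1)),
    coords={"t": set_t, "x": ["r1"]},
    dims=["t", "x"],
)
# ...and a static one, broadcast along t so every lookup works the same way.
r_eff = xr.DataArray([0.95], coords={"x": ["r1"]}, dims=["x"]).expand_dims(t=set_t)

# Both are now addressed identically:
print(r_eff.loc["2015-01-01 09:00", "r1"].item())  # 0.95
```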


sjpfenninger commented on July 3, 2024

Re the first comment, I would say that option 3 (time-based data indexed by timestamps rather than integers) is probably best -- although it would require some thinking to ensure that all data files contain a consistent time dimension (which is easier to do if the time dimension is specified only once in a separate file).

Re the second comment, it is true that values like r_eff are currently assumed to be static in time. Hmm... I suspect the main issue with adding more parameters defined over time and space will be the performance impact, particularly on Pyomo. Perhaps this requires some testing? Iterating over a pandas DataFrame is likely slower than get_option currently is, and I think iterating over a Pyomo parameter would be too (but I'm not certain about that, nor am I certain whether that is a relevant time cost in the context of all the other stuff Pyomo does when constructing the model).
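
One way to test that concern would be a micro-benchmark along these lines (a rough sketch with made-up data, not the actual Calliope code paths):

```python
import timeit

import numpy as np
import pandas as pd

set_t = pd.date_range("2005-01-01", periods=8760, freq="h")
df = pd.DataFrame({"e_eff": np.random.uniform(0.8, 1.0, len(set_t))}, index=set_t)
options = {"ccgt.constraints.e_eff": 0.9}  # stand-in for get_option's store
t = set_t[1234]

# Dict lookup (roughly what a static get_option amounts to)...
print(timeit.timeit(lambda: options["ccgt.constraints.e_eff"], number=100_000))
# ...versus a per-timestep pandas label lookup inside a constraint rule.
print(timeit.timeit(lambda: df.at[t, "e_eff"], number=100_000))
```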


brynpickering commented on July 3, 2024
  1. The benefit of specifying the time dimension per file would be to remove the need for that consistency. For instance, if your set_t file were hourly from 2005-01-01 to 2005-12-31, you could have an e_eff file with daily values over the same range, and the system could simply infer hourly values (see the resampling sketch after this list). This could get quite complicated, but it essentially means that, provided a file's time range has the same lower and upper bounds as set_t, you could theoretically define any time-step granularity and it would be dealt with. Granted, I'm not certain whether such functionality would be truly desirable.
  2. I've created a version in my fork that can load all the efficiencies in time. I'll test out heavy and light file-reading scenarios and see what happens.
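
As a sketch of point 1 (with hypothetical data), pandas can already infer hourly values from a daily series, with the interpolation choice made explicit:

```python
import numpy as np
import pandas as pd

set_t = pd.date_range("2005-01-01", "2005-12-31 23:00", freq="h")
daily = pd.Series(
    np.random.uniform(0.8, 1.0, 365),
    index=pd.date_range("2005-01-01", "2005-12-31", freq="D"),
)

# Two of the interpolation choices mentioned above:
step_change = daily.reindex(set_t, method="ffill")          # hold each daily value
straight_line = daily.reindex(set_t).interpolate().ffill()  # linear between points
```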


brynpickering commented on July 3, 2024

OK, so I've used the example model running over 768 time steps to do some speed tests, below. They show an understandable increase in preprocessing time, as more files need to be opened and more data sets have to be produced (instead of just relying on get_option). Once the data is produced, though, the optimisation runtime is unchanged: the constraints all iterate over time anyway, so it doesn't matter whether a value is a single number or one that varies in time. It also shows that searching the data array is just as quick as getting a static value from a dictionary.
Basically, Pyomo doesn't change much, but the act of opening and reading CSVs takes its toll. This suggests that additional preprocessing (e.g. folding r_eff and c_eff into e_eff, so that only e_eff is loaded as a time-dependent set) wouldn't improve solution time by much. Instead, each CSV file would need to carry more information, so that fewer files need opening overall (see the sketch after the notes below).
I should also point out that CPLEX reads the Pyomo LP file in ~1s and runs the optimisation in ~1-2s in all cases.

| Case | Preprocessing | Optimisation | Total |
| --- | --- | --- | --- |
| 1. No efficiencies loaded from file (base case) | 25s | 44s | 69.2s |
| 2. e_eff from file for csp, all static for ccgt | 68s | 46s | 113.7s |
| 3. e_eff from file for both csp and ccgt | 68s | 41s | 108.5s |
| 4. e_eff and r_eff from file for both csp and ccgt | 114s | 47s | 161.1s |
| 5. e_eff, r_eff and c_eff from file for both csp and ccgt | 173s | 49s | 222.1s |
| 6. e_eff, r_eff and c_eff from file for just csp | 172s | 59s (not sure what happened here!) | 230.9s |
| 7. e_eff, r_eff and c_eff from file for just ccgt | 168s | 51s | 219.1s |

Note:

  1. All efficiency files were loaded with randomised data: between 0.1 and 0.8 for csp.e_eff and ccgt.r_eff, and between 0.8 and 1 for all other e_eff, r_eff, and c_eff cases.
  2. r is always loaded from file for demand and csp.
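
As a sketch of the "more information per CSV file" idea (the wide layout and column names here are hypothetical):

```python
import io

import pandas as pd

# One wide file: a timestamp column plus one column per tech.parameter,
# so a single read covers several time-varying constraints at once.
csv_text = """t,csp.e_eff,csp.r_eff,ccgt.e_eff
2005-01-01 00:00,0.35,0.78,0.92
2005-01-01 01:00,0.36,0.80,0.91
"""
wide = pd.read_csv(io.StringIO(csv_text), index_col="t", parse_dates=True)

# Split back out into one series per parameter.
per_param = {col: wide[col] for col in wide.columns}
print(per_param["ccgt.e_eff"].loc["2005-01-01 01:00"])  # 0.91
```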


brynpickering commented on July 3, 2024

So this has been on the backburner for a while. I came back to it today and have done some more in-depth comparisons.
The time penalty comes purely from fetching a data element at a given time step during the constraint-setting phase. The previous set of timings used the loc function to search the DataArrays of the timeseries constraints, which is quite time-consuming.
I then tested getting the data from the Pyomo Param instead, which was significantly faster. Below are the results in some more detail (1/01 - 31/03 in the example model):

| Case | Approach | Preprocessing | Solving | Total |
| --- | --- | --- | --- | --- |
| 1. No efficiencies from file | Current master | 77.5s | 115s | 196s |
| | DataArray searching | 86.55s | 112.86s | 204s |
| | Pyomo Param searching | 78s | 112s | 195s |
| 2. e_eff from file | Current master | 87s | 119s | 209s |
| | DataArray searching | 248s | 127s | 378s |
| | Pyomo Param searching | 85s | 119s | 207s |
| 3. e_eff & r_eff from file | Current master | not possible | - | - |
| | DataArray searching | 437s | 190s | 671s |
| | Pyomo Param searching | 86s | 124s | 214s |

So loading multiple efficiencies from file doesn't seem to have any greater effect than loading a single efficiency from file, provided that Pyomo Params are used for storing and accessing the data. In the original master this was done for both r and e_eff anyway; I moved away from it to introduce generality, but I've since been able to provide sufficient generality while still using Params for any possible time-dependent constraint.
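
For reference, a minimal sketch of the two access patterns on a toy model (illustrative only, not the actual constraint code):

```python
import pandas as pd
import xarray as xr
from pyomo.environ import ConcreteModel, Param, Set

set_t = [t.isoformat() for t in pd.date_range("2005-01-01", periods=24, freq="h")]

m = ConcreteModel()
m.t = Set(initialize=set_t)
m.e_eff = Param(m.t, initialize={t: 0.9 for t in set_t})

da = xr.DataArray([0.9] * len(set_t), coords={"t": set_t}, dims=["t"])

# Inside a constraint rule, the Pyomo Param is a plain keyed lookup...
fast = m.e_eff[set_t[9]]
# ...whereas DataArray.loc resolves labels on every call, which adds up
# over thousands of constraint evaluations.
slow = da.loc[set_t[9]].item()
```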

I've attached an Excel file of the Python profiler results for these runs (including all functions with tottime > 1s):
Issue 7 profiler results.xlsx


brynpickering commented on July 3, 2024

This is now set to be merged into master as part of pull request #28.
