The repository contains Python scripts for processing a population.xml for a MATSim model of the Strasbourg region. The model is developed within the framework of both the Smart Charging and SILO projects.
Please refer to the wiki for the theoretical background of the project.
The current activity-based model is built on the EMD mobility survey and census data and comprises the following steps:
- Exploratory data analysis
- Processing the EMD survey and modelling location choice
Apart from the standard library, the following Python packages have to be installed:
- pandas~=1.1.5
- numpy~=1.19.3
- scipy~=1.5.4
- statsmodels~=0.12.0
- matplotlib~=3.3.2
- seaborn~=0.11.0
- biogeme~=3.2.6
- sklearn~=0.0
- scikit-learn~=0.23.2
- Shapely~=1.7.1
- geopandas~=0.8.1
- requests~=2.24.0
- gdal~=3.1.2
- overpass~=0.7
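To check an existing environment against these pins, a small script along the following lines may help. It is a sketch rather than part of the repository: `REQUIRED` copies only a few of the pins above, and `is_compatible` is a hypothetical helper that only approximates the `~=` (compatible release) rule for purely numeric versions.

```python
from importlib import metadata

# A few of the pins listed above (package name -> pinned version); extend as needed.
REQUIRED = {
    "pandas": "1.1.5",
    "numpy": "1.19.3",
    "scipy": "1.5.4",
    "geopandas": "0.8.1",
}


def parse(version):
    """Keep only the leading numeric components, e.g. "3.3.2" -> (3, 3, 2)."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_compatible(installed, pinned):
    """Approximate "~=": same leading components except the last, and not older."""
    inst, pin = parse(installed), parse(pinned)
    prefix = pin[:-1]
    return inst[:len(prefix)] == prefix and inst >= pin


def check_environment(required=REQUIRED):
    """Return a {package: status} report for the current interpreter."""
    report = {}
    for name, pinned in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if is_compatible(installed, pinned):
            report[name] = "ok"
        else:
            report[name] = f"found {installed}, expected ~={pinned}"
    return report


if __name__ == "__main__":
    for package, status in check_environment().items():
        print(f"{package}: {status}")
```

Installing the pinned versions with pip remains the more reliable route; the script only reports what is already installed.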
- Thesis
- [Documentation](X:\Groups\N41\GROUP-ORGA\MOBILITY_cluster\Activities\Smart_Charging\Method\20201002 FURTHER DEVELOPMENTS IN TRAFFIC MODEL.docx)
- GIS data: OSM
- `matsim_pop_gen_strasbourg/input/size_with_iris`: Each activity file with the suffix `with_size` describes the magnitude/size of individuals doing that particular activity, together with their IRIS. Format: `DCOMIRIS` and `size`.
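As an illustration of the `DCOMIRIS` and `size` format, the sketch below reads such a file and turns the sizes into shares. It assumes a comma-separated layout with a header row, which the text above does not confirm; `read_size_file` is a hypothetical helper and the two IRIS codes and sizes are made up.

```python
import csv
import io


def read_size_file(handle):
    """Return {DCOMIRIS: size} from a with_size file (CSV layout assumed)."""
    sizes = {}
    for row in csv.DictReader(handle):
        # IRIS codes are kept as strings so that leading zeros are not lost.
        sizes[row["DCOMIRIS"]] = float(row["size"])
    return sizes


# Made-up sample standing in for one activity file.
sample = io.StringIO("DCOMIRIS,size\n674820101,120.5\n674820102,80\n")
sizes = read_size_file(sample)

# Share of the activity's total size that falls in each IRIS.
total = sum(sizes.values())
shares = {iris: size / total for iris, size in sizes.items()}
print(shares)
```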
**A. "Ancillary" Python scripts**
- Script for geocoding the SIRENE database by address using the geocode_data_gouv_fr API: `helper_scripts/geocode_data_gouv_fr.py`
**B. `_abm` (activity-based model) Python scripts**
Run the scripts in the following order:
- `daily_activity_pattern.py`
- `time_of_day.py`
- `location_of_activity.py`
- `data_analysis.py`

Alternatively, simply run `main.py`.
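The run order above could be scripted roughly as follows. This is a sketch and not the content of `main.py`, which may chain the scripts differently; `build_commands` and `run_pipeline` are hypothetical names, and the scripts are assumed to be runnable without extra command-line arguments.

```python
import subprocess
import sys
from pathlib import Path

# Order taken from the list above.
SCRIPT_ORDER = [
    "daily_activity_pattern.py",
    "time_of_day.py",
    "location_of_activity.py",
    "data_analysis.py",
]


def build_commands(script_dir="."):
    """Build one interpreter command per script, in pipeline order."""
    return [[sys.executable, str(Path(script_dir) / name)] for name in SCRIPT_ORDER]


def run_pipeline(script_dir=".", dry_run=False):
    """Run each script in turn and stop at the first failure."""
    for command in build_commands(script_dir):
        if dry_run:
            print(" ".join(command))
            continue
        # check=True raises CalledProcessError if a script exits with an error,
        # so later steps never run on incomplete intermediate data.
        subprocess.run(command, check=True)


if __name__ == "__main__":
    run_pipeline(dry_run=True)  # only prints the commands
```

Stopping at the first failure matters here because each script builds on the dataframes produced by the previous one.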
About the scripts:
- `daily_activity_pattern.py`: Generates dataframes for the different chain types (activity, tour, trip) in non-condensed format, where each row is an individual and each column describes that individual.
- `time_of_day.py`: The dataframes generated by this script contain the data from `daily_activity_pattern.py` as well as the start and end time of each activity, the trip duration and the activity duration in a chain/sequence format (refer to issue #10).
- `location_of_activity.py`: The dataframes generated by this script contain the data from `time_of_day.py` as well as the location of each activity based on its IRIS or EMD zone.
- `data_analysis.py`: Analyses the data generated by all the scripts above in different ways.
- `distance_between_iris.py`: Finds the distance between EMD zones, IRIS zones or random points within an EMD zone.
- `prepare_emd.py`: Takes the EMD survey data and merges it into one dataset to be used for further analysis.
- `regression_analysis.py`: Runs the multinomial logistic regression based on different parameters.
- `generate_graphs.py`: Generates all types of graphs.
- `emd_to_iris.py`: Converts each EMD zone to its specific IRIS based on a distributive algorithm (refer to issue #9).
- `define_area_of_interest.py`: Helps to understand the area of interest for the model.
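To give an idea of what a zone-to-zone distance computation can look like, the sketch below uses the great-circle (haversine) distance between zone centroids. This is an assumption made for illustration: `distance_between_iris.py` may instead work with projected coordinates or network distances. The helper names and the centroid coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_matrix(centroids):
    """Return {(zone_a, zone_b): distance_km} for every ordered pair of zones."""
    zones = sorted(centroids)
    return {
        (a, b): haversine_km(*centroids[a], *centroids[b])
        for a in zones
        for b in zones
    }


# Made-up centroids (latitude, longitude) for three illustrative zones.
centroids = {
    "zone_a": (48.583, 7.746),
    "zone_b": (48.573, 7.752),
    "zone_c": (48.600, 7.780),
}
matrix = distance_matrix(centroids)
print(round(matrix[("zone_a", "zone_b")], 2), "km")
```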
**C. Run the file `run.py` with the arguments set in `CONFIG.txt`:**
- Read the population by age group per zone
- Read the activity behaviour from a precalculated file (activities per age group)
- Multiply the population matrix (IRIS × age group) by the activity matrix (age group × activity)
- Generate attraction vectors using JT's OSM approach (a method specific to France)
  - WORK: calculate the average number of employees of each NAF code based on SIRENE data
  - SHOPPING: calculate the average number of customers from the number of employees of each NAF code (SIRENE data) multiplied by the Bosserhoff ratio
  - LEISURE: leisure activities are derived from OSM data and processed with the Bosserhoff ratio (see "Data input and output")
  - EDUCATION: derived from the census
  - ACCOMPANY: derived from all activities
- Calculate DISTANCE MATRIX
  - NOTE: `get_distance_vector.py`, which uses the OTP model to generate the distance matrix between zones, does not work properly
- Use the gravity model to combine all the data processed in the steps above and calculate `betas.pkl`, `omtl.pkl` and `trips.pkl`
  - Extras: `helper_scripts/Check_Travel_Time_Distribution_pkl.py` reads files in .pkl format. OUTPUT --> TRIP matrix
- Process `population.xml` for the MATSim model by calling the script `generate_xml.py` with the arguments set in `CONFIG.txt`
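Two of the steps above, the matrix multiplication and the gravity model, can be sketched with NumPy as follows. All numbers are made up, and the gravity model form is an assumption: the sketch uses a production-constrained model with an exponential deterrence function and a hand-picked `beta`, whereas the repository calibrates its own parameters (stored in `betas.pkl`) in a way not described here. `gravity_trips` is a hypothetical helper.

```python
import numpy as np

# population[i, g]: persons of age group g living in IRIS i (made-up numbers).
population = np.array([[100.0, 50.0],
                       [30.0, 70.0],
                       [60.0, 40.0]])

# activity_rates[g, a]: average number of activities of type a per person in age group g.
activity_rates = np.array([[0.8, 0.3],
                           [0.1, 0.6]])

# (IRIS x age group) @ (age group x activity) -> (IRIS x activity)
productions = population @ activity_rates


def gravity_trips(production, attraction, distance, beta):
    """Production-constrained gravity model with exponential deterrence.

    production: (n_origins,), attraction: (n_destinations,),
    distance: (n_origins, n_destinations), beta: deterrence parameter (assumed).
    """
    # Attractiveness of each destination as seen from each origin.
    weights = attraction * np.exp(-beta * distance)
    # Normalise per origin so that every row of the result sums to its production.
    shares = weights / weights.sum(axis=1, keepdims=True)
    return production[:, None] * shares


# Made-up attraction vector and distance matrix (km) for the first activity type.
attraction = np.array([200.0, 50.0, 120.0])
distance = np.array([[0.5, 2.0, 4.0],
                     [2.0, 0.5, 3.0],
                     [4.0, 3.0, 0.5]])

trips = gravity_trips(productions[:, 0], attraction, distance, beta=0.1)
print(trips.round(1))
```

Because the model is production-constrained, each row of `trips` sums to the number of activities produced in that IRIS; destination totals are not forced to match the attraction vector.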