
dose's People

Contributors

appultaart, keerthana-d, khadijashahrukh, mauriceling, sharlene98

dose's Issues

Organism.identity and Organism.deme in genetic.py

  1. Organism.identity is specified in Organism class as "Each organism is identifiable by a randomly generated 32-character 'name' as Organism.identity" but not implemented.
    Suggestion: generate Organism.identity in the __init__ method.
  2. Organism.deme records the subpopulation of the organism. In some respects, a deme is like a clan or tribe name: cross-deme mating is allowed because the organisms belong to the same population (a population can also be seen as a species). At simulation initiation, every organism's deme is set to the population name.
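
A minimal sketch of both points, assuming a stripped-down stand-in class rather than the real Organism in genetic.py. The alphabet used for the name, the status-dictionary layout and the default deme value are illustrative assumptions.

```python
import random
import string


class Organism:
    """Hypothetical stand-in for genetic.Organism, for illustration only."""

    def __init__(self, population_name='pop_01'):
        # Assumption: the 32-character name is drawn from A-Z and 0-9.
        alphabet = string.ascii_uppercase + string.digits
        identity = ''.join(random.choice(alphabet) for _ in range(32))
        self.status = {
            'identity': identity,
            # At simulation start the deme equals the population name.
            'deme': population_name,
        }


org = Organism()
```

Generating the name inside the constructor means no caller can forget to assign one.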

Genetic distance between long migration / no cross-cell mating

  1. Within a 5x5x1 cell eco-system (flatland), 1250 genetically identical (chromosome length = 50) individuals are evenly distributed. 50 individuals per cell.
  2. Mating can only occur within cell (use dose.filter_location function).
  3. 10% of the population inside each eco_cell would be migrating individually to random eco_cells across the entire world (implement in organism_location function with the use of dose.filter_location function).
  4. eco_cell_capacity shall be ignored to enable random movement to other cells.
  5. Monitor the 25 cells for 1000 generations.
  6. Report genetic status every 10 generations. Location reporting is required.

Genetic distance between adjacent migration / no cross-cell mating

  1. Within a 5x5x1 cell eco-system (flatland), 1250 genetically identical (chromosome length = 50) individuals are evenly distributed. 50 individuals per cell.
  2. Mating can only occur within cell (use dose.filter_location function).
  3. A random portion of the population inside each eco_cell would be migrating individually to random eco_cells adjacent to its current position (implement in organism_movement function with the use of dose.filter_location function).
  4. eco_cell_capacity shall be ignored to enable random movement to other cells.
  5. Monitor the 25 cells for 1000 generations.
  6. Report genetic status every 10 generations. Location reporting is required.
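
The adjacent-migration rule in point 3 can be sketched as follows. This is not DOSE code: it assumes locations are plain (x, y, z) tuples, that "adjacent" means the 8 neighbouring cells on the same z-plane, that the world does not wrap at its edges, and that `fraction` is a hypothetical per-organism migration probability.

```python
import random


def adjacent_cells(x, y, size=5):
    """Return the in-bounds 8-neighbours of (x, y) on a size x size flatland."""
    cells = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if (dx, dy) == (0, 0):
                continue  # skip the current cell
            nx, ny = x + dx, y + dy
            if 0 <= nx < size and 0 <= ny < size:
                cells.append((nx, ny, 0))
    return cells


def migrate(locations, fraction, rng=random):
    """Move each organism to a random adjacent cell with probability `fraction`.

    eco_cell_capacity is deliberately not checked, as the issue requests.
    """
    moved = []
    for (x, y, z) in locations:
        if rng.random() < fraction:
            moved.append(rng.choice(adjacent_cells(x, y)))
        else:
            moved.append((x, y, z))
    return moved
```

Corner cells have 3 neighbours and edge cells have 5, so organisms near the boundary choose among fewer destinations.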

add population name into Organism.status dictionary

  1. Currently, Organisms have a deme in the status dictionary, which defines sub-population.
  2. However, organisms are not aware of their original population name.
  3. Biologically, an organism can mate between different demes (breeds, races, etc) but not between distinct populations (like between species).
  4. When reviving a simulation from a database, it is often not possible to know which population an organism belongs to unless we assume that deme is the population. That was not the original intention of deme, as deme can change within a simulation.
  5. A population name (from simulation parameters) needs to be added to Organism.status.

database - database file

  1. A simulation can log all the data (simulation parameters, Organism.status, Organism.genome, World.ecosystem) in a database file.
  2. By default, Organism.status, Organism.genome, and World.ecosystem are logged every generation. However, the user can change this in simulation parameters as "database_logging_frequency".
  3. The database file path/name is given in simulation parameters as "database_file". If the file does not exist, it will be created. If it exists, the existing database file will be used.

Incorrect calling of custom functions inside dose.py

  1. Because populations are now initialized outside Entities/simulation_functions, the custom functions designed for these populations should be called with the Populations themselves as parameters.
  2. Consider renaming all 'Entities' variables in dose.py to 'dose_functions', as it is now strictly just a class of custom functions.

Eg.

Change:
SIM_01.py

  • def mating(self): pass
  • def postpopulation_control(self): pass

dose.step()

  • Entities.mating()
  • Entities.postpopulation_control()

Into:
SIM_01.py

  • def mating(Populations, pop_name): pass
  • def postpopulation_control(Populations, pop_name): pass

dose.step()

  • dose_functions.mating(Populations, pop_name)
  • dose_functions.postpopulation_control(Populations, pop_name)

mixed licensing in DOSE

  1. Currently, DOSE uses 2 licenses:
    -- Python Software Foundation License version 2
    -- GNU General Public License version 3
  2. Needs to standardize these licensing terms.
  3. Removed all license statements from all files except those in copads sub-directory.
  4. All files in copads sub-directory to continue as Python Software Foundation License version 2 unless otherwise specified, as per https://github.com/copads/copads
  5. All other files to be licensed under GNU General Public License version 3.

deployment schemes: [pre-defined] random deployment

  1. Deployment scheme is to allocate organisms to a particular eco-cell before the simulation starts (before generation 1).
  2. Deployment schemes will be given as part of simulation parameters.
  3. We should have a Random deployment scheme where the entire population gets randomly distributed across a set of eco-cells (defined by the user). Due to randomness, the number of organisms per eco-cell may not be the same, although it is possible for every eco-cell to receive the same number.
  4. Random deployment scheme to have deployment_code=2.
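
A minimal sketch of random deployment, assuming each organism is represented by its own status dictionary with a 'location' key. The function name `deploy_random` and the data layout are hypothetical, not the DOSE implementation.

```python
import random


def deploy_random(agents, cells, rng=random):
    """Assign every agent to an independently chosen eco-cell."""
    for agent in agents:
        agent['location'] = rng.choice(cells)
    return agents


# Each agent gets its own dictionary so locations are not shared.
agents = [{'location': None} for _ in range(100)]
cells = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
deploy_random(agents, cells)
```

Because each choice is independent, per-cell counts vary from run to run.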

world_builder class inside dose.py shows an unnecessary nesting of classes as initialization of World can be done directly with dose_world.World

  1. world_builder's function is to call out dose_world.World without referring to it explicitly (using super()).
  2. Calling dose_world.World directly makes little difference, as World no longer contains classes that require inheritance now that it is separated from simulation_functions. The main advantages of using super() therefore no longer apply.
  3. Remove world_builder and make the initialization of World directly referenced to dose_world.World.

Means to continue simulation from database logs

  1. Currently, database logging is implemented.
  2. All key-value pairs in simulation parameters are logged in the "parameters" table. All key-value pairs in the Organism.status dictionary are logged in the "organisms" table. All chromosomes are logged in the "organisms" table. An ecological cell is implemented as a dictionary, and all key-value pairs of this dictionary are logged in the "world" table.
  3. Hence, it should be possible to extract all the needed data from the database using (start time, generation count) to build the needed objects to continue simulation.
  4. New "maximum_generations" in simulation parameters may be given.
  5. A record should be inserted into "miscellaneous" table or any other suitable table to signify continued simulation event.

deployment schemes: [pre-defined] all-in-one

  1. Deployment scheme is to allocate organisms to a particular eco-cell before the simulation starts (before generation 1).
  2. Deployment schemes will be given as part of simulation parameters.
  3. We should have an All-in-one deployment scheme where the entire population gets parachuted into just one eco-cell.
  4. All-in-one deployment scheme to have deployment_code=1.

Basic simulation examples

  1. "01_basic_functions_one_cell_deployment.py"

This simulation will have the most basic parameters and functions. Most parameters will be set to their default values and most custom functions will simply pass (except for those that must be overridden).

  2. "02_basic_functions_even_deployment.py"

Like "01_basic_functions_one_cell_deployment.py", this simulation shall feature basic parameters and basic custom functions, except that it has 2 populations and uses an even deployment scheme.

extract() support methods

To aid in future data analyses, there should be a function to help the user extract a specific set of information from the database.

Children produced in mating scheme share status dictionaries similar to Issue #14

New organisms created inside a mating scheme share the same status dictionary, similar to issue #14.

  1. The location of every new organism takes the last location in the iteration of locations, which suggests that the location statuses share the same object identity (id).
  2. The clone function inside genetic.py does not solve this problem as it did for issue #14, because these organisms are newly created and not cloned.
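
The symptom in point 1 is what Python produces when several objects reference one dictionary. The sketch below reproduces it with plain dictionaries and shows one possible remedy (copy.deepcopy). It illustrates the aliasing mechanism only; the source does not confirm that this is the exact cause in the DOSE mating code.

```python
import copy

template = {'location': (0, 0, 0), 'deme': 'pop_01'}
targets = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]

# Buggy pattern: every child refers to the same dictionary object.
shared = [template for _ in range(3)]
for child, loc in zip(shared, targets):
    child['location'] = loc
shared_locations = [child['location'] for child in shared]

# Possible fix: give each child an independent copy of the status.
independent = [copy.deepcopy(template) for _ in range(3)]
for child, loc in zip(independent, targets):
    child['location'] = loc
independent_locations = [child['location'] for child in independent]
```

In the buggy case every child reports the last assigned location, which matches the behaviour described above.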

deployment schemes: user-defined

  1. Deployment scheme is to allocate organisms to a particular eco-cell before the simulation starts (before generation 1).
  2. Deployment schemes will be given as part of simulation parameters.
  3. We should have a means for user-defined deployment scheme - as a method "def deployment_scheme(self)"
  4. User-defined deployment scheme to have deployment_code=0.

Genetic distance between no migration / no cross-cell mating

  1. To demonstrate the workings of DOSE, we are going to use a scientific case study.
  2. Within a 5x5x1 cell eco-system (flatland), 1250 genetically identical (chromosome length = 50) individuals are evenly distributed. 50 individuals per cell.
  3. Mating can only occur within cell (use dose.filter_location function).
  4. No cross-cell travel/migration allowed.
  5. Monitor the 25 cells for 1000 generations.
  6. Report genetic status every 10 generations. Location reporting is required.

deployment schemes: [pre-defined] centralized deployment

  1. Deployment scheme is to allocate organisms to a particular eco-cell before the simulation starts (before generation 1).
  2. Deployment schemes will be given as part of simulation parameters.
  3. We should have a Centralized deployment scheme where the entire population is centered about an eco-cell. For example, if the capacity of the eco-cell [location = (3,3,3)] is 50 and 100 organisms are to be deployed, then eco-cell (3,3,3) will have 50 organisms deployed. The remaining 50 organisms will be deployed randomly across the 8 adjacent cells on the same z-axis. That is, 50 organisms randomly deployed into (2,2,3), (2,3,3), (2,4,3), (3,2,3), (3,4,3), (4,2,3), (4,3,3), (4,4,3).
  4. Centralized deployment scheme to have deployment_code=4.
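
The worked example in point 3 can be sketched as below. The function name and the dictionary-based agents are hypothetical. The sketch also assumes that the adjacent cells have unlimited capacity and that the centre is not on the world boundary, neither of which the issue specifies.

```python
import random


def deploy_centralized(agents, centre, capacity, rng=random):
    """Fill the centre cell to capacity, then scatter the rest around it."""
    cx, cy, cz = centre
    # The 8 cells surrounding the centre on the same z-plane.
    ring = [(cx + dx, cy + dy, cz)
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)]
    for index, agent in enumerate(agents):
        if index < capacity:
            agent['location'] = centre
        else:
            agent['location'] = rng.choice(ring)
    return ring


agents = [{'location': None} for _ in range(100)]
ring = deploy_centralized(agents, centre=(3, 3, 3), capacity=50)
at_centre = sum(1 for a in agents if a['location'] == (3, 3, 3))
```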

Reconstruct write_parameters() to handle all kinds of simulation

  1. The two kinds of simulation write their parameters differently: normal simulations call write_parameters(), whereas revive_simulation() calls write_rev_parameters().
  2. write_rev_parameters() does not properly handle both database revivals and file revivals. Conflicts arise because they do not have the same parameters.
  3. Merge write_parameters() and write_rev_parameters() into one function that can handle any kind of simulation, properly printing out all the keys in the parameters (with special exceptions).

deployment schemes: [pre-defined] even deployment

  1. Deployment scheme is to allocate organisms to a particular eco-cell before the simulation starts (before generation 1).
  2. Deployment schemes will be given as part of simulation parameters.
  3. We should have an Even deployment scheme where the entire population gets evenly distributed across a set of eco-cells (to be defined by the user). The designated eco-cells should have the same number of organisms if the number of organisms is divisible by the number of eco-cells without remainder.
  4. Even deployment scheme to have deployment_code=3.
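
One way to satisfy point 3 is round-robin assignment, sketched here with hypothetical dictionary-based agents. When the population does not divide evenly, this particular approach leaves cell counts differing by at most one, which is a design choice of the sketch and not something the issue mandates.

```python
from collections import Counter


def deploy_even(agents, cells):
    """Assign agents to cells in round-robin order."""
    for index, agent in enumerate(agents):
        agent['location'] = cells[index % len(cells)]


# The 5x5x1 flatland used in the case studies: 1250 organisms, 25 cells.
cells = [(x, y, 0) for x in range(5) for y in range(5)]
agents = [{'location': None} for _ in range(1250)]
deploy_even(agents, cells)
counts = Counter(agent['location'] for agent in agents)
```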

Needs simulation_functions class in dose.py for documentation

  1. Currently, all simulation functions are implemented as a class in examples. This is correct but class simulation_functions does not inherit any class from DOSE.
  2. This is strange as then the user will need to know what functions there are to implement.
  3. We should have a corresponding empty class called "simulation_functions" in dose.py to serve as prototype as well as documentation - let users know what functions there are to implement.

class simulation_functions():
    def organism_movement(self, World, x, y, z): pass
    def organism_location(self, World, x, y, z): pass
    def ecoregulate(self, World): pass
    def update_ecology(self, World, x, y, z): pass
    def update_local(self, World, x, y, z): pass
    def report(self, World): pass
    def fitness(self, Populations, pop_name): pass
    def mutation_scheme(self, organism): pass
    def prepopulation_control(self, Populations, pop_name): pass
    def mating(self, Populations, pop_name): pass
    def postpopulation_control(self, Populations, pop_name): pass
    def generation_events(self, Populations, pop_name): pass
    def population_report(self, Populations, pop_name): pass
    def deployment_scheme(self, Populations, pop_name, World): pass

  4. So in the examples, "class simulation_functions()" should be "class simulation_functions(dose.simulation_functions)"

08_revive_simulation_03.py returns a TypeError exception when reconstructing World

 C:\Users\13035006\SkyDrive\GitHub\dose\examples>python 08_revive_simulation_03.py
 Traceback (most recent call last):
   File "08_revive_simulation_03.py", line 111, in <module>
     dose.revive_simulation(rev_parameters, simulation_functions)
   File "C:\Users\13035006\SkyDrive\GitHub\dose\dose\dose.py", line 568, in revive_simulation
     simulation_core(sim_functions, rev_parameters, Populations, World)
   File "C:\Users\13035006\SkyDrive\GitHub\dose\dose\simulation_calls.py", line 68, in simulation_core
     pop_name, World)
   File "C:\Users\13035006\SkyDrive\GitHub\dose\dose\simulation_calls.py", line 204, in interpret_chromosome
     (x,y,z) = coordinates(location)
   File "C:\Users\13035006\SkyDrive\GitHub\dose\dose\simulation_calls.py", line 89, in coordinates
     x = location[0]
 TypeError: 'NoneType' object has no attribute '__getitem__'

Renaming "p", "Entities" and "population" in dose.py

  1. In dose.py, there are several functions with "Populations", "Entities" and "population" as parameters. For example - def deploy(p, Populations, Entities, population)
  2. "Populations", "Entities" and "population" are confusing. For example, what is the difference between "Populations" and "population"?
  3. It seems that "Entities" refer to "World" - Line 82: Entities.ecosystem[x][y][z]['organisms'] += 1
  4. "population" seems to refer to "population_name" - Line 72: for individual in Populations[population].agents
  5. "p" seems to refer to "simulation parameter"
  6. If so, they should be renamed
    -- p to sim_param
    -- population to pop_name
    -- Entities to World

refactoring simulation_calls.deploy() function

  1. Currently, all 4 modes of deployment are in the same function, and more deployment modes may be added over time.
  2. Hence, it is better to start refactoring out by deployment modes into separate functions such as deploy0(), deploy1() etc.
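
A sketch of how the split might look, using a lookup table keyed by deployment_code. The deploy0() to deploy4() bodies here are placeholders that only return a label, and the `deploy` signature is an assumption. The code-to-scheme mapping follows the deployment-scheme issues in this list.

```python
def deploy0(agents):
    return 'user-defined'      # deployment_code = 0


def deploy1(agents):
    return 'all-in-one'        # deployment_code = 1


def deploy2(agents):
    return 'random'            # deployment_code = 2


def deploy3(agents):
    return 'even'              # deployment_code = 3


def deploy4(agents):
    return 'centralized'       # deployment_code = 4


DEPLOYERS = {0: deploy0, 1: deploy1, 2: deploy2, 3: deploy3, 4: deploy4}


def deploy(sim_parameters, agents):
    """Dispatch to the function registered for the requested code."""
    code = sim_parameters['deployment_code']
    if code not in DEPLOYERS:
        raise ValueError('unknown deployment_code: %r' % (code,))
    return DEPLOYERS[code](agents)
```

Adding a new mode then means writing one function and registering it, without touching the existing ones.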

Offspring to store parents' identities as a list in status['parents']

  1. Currently, parental identities are not stored in offspring, which makes ancestry tracing impossible.
  2. Commit (96092be) added Organism.status['parents'] as a place holder to store parent(s)' identities as a list, such as [parentA identity, parentB identity].
  3. Database logging (dose.database_report_populations function) logs parental identities as a string delimited by "|", for example |
  4. The database_calls.db_reconstruct_organisms() function has been updated to process the "parents" key.
  5. The following examples (03, 04, 05) should have parental identity logging.
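
The list-to-string round trip described in points 2 to 4 can be sketched as follows. The helper names are hypothetical and the sketch assumes identities never contain the "|" character. It does not reproduce the actual code of dose.database_report_populations or database_calls.db_reconstruct_organisms().

```python
def encode_parents(parents):
    """Join parental identities into one '|'-delimited string for logging."""
    return '|'.join(parents)


def decode_parents(field):
    """Rebuild the list of parental identities from the logged string."""
    return field.split('|') if field else []


logged = encode_parents(['parentA_identity', 'parentB_identity'])
restored = decode_parents(logged)
```

An empty string decodes to an empty list, so organisms without recorded parents survive the round trip.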

find_organisms_by_XXX support methods

  1. To aid in mating scheme development, there should be functions to enable an individual to select for organisms within the population of specific qualities.
  2. These functions should be support functions and not be part of Population class.
  3. The following should be available:
    -- find_organisms_by_location(x, y, z, agents)
    -- find_organisms_by_deme(deme_name, agents)
    -- find_organisms_by_age(minimum, maximum, agents)
    -- find_organisms_by_gender(gender, agents)
    -- find_organisms_by_vitality(minimum, maximum, agents)
    -- find_organisms_by_status(status, (selection), agents) where (selection) can be a range, such as (30, 50) to mean 30 to less than 50, or a specific value ('alive')
  4. Each of the functions in (3) must return a list of agents. This will enable chaining of functions, such as find_organisms_by_age(10, 20, find_organisms_by_location(2, 3, 4, agents)), to find all the organisms aged between 10 and 20 in eco-cell (2, 3, 4).
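
A minimal sketch of two of these support functions and the chaining they allow. The Organism class here is a hypothetical stand-in holding only a status dictionary, and treating the age range as inclusive at both ends is an assumption, since the issue does not say.

```python
class Organism:
    """Hypothetical stand-in: only the status dictionary is modelled."""

    def __init__(self, **status):
        self.status = status


def find_organisms_by_location(x, y, z, agents):
    return [a for a in agents if a.status['location'] == (x, y, z)]


def find_organisms_by_age(minimum, maximum, agents):
    # Assumption: both bounds are inclusive.
    return [a for a in agents if minimum <= a.status['age'] <= maximum]


agents = [
    Organism(location=(2, 3, 4), age=15),
    Organism(location=(2, 3, 4), age=25),
    Organism(location=(0, 0, 0), age=15),
]
# Chaining works because every function takes and returns a list of agents.
selected = find_organisms_by_age(
    10, 20, find_organisms_by_location(2, 3, 4, agents))
```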

Refactor revive_simulation() and simulate() to call simulation_core()

  1. simulate() and revive_simulation() in dose.py contain the exact same block of code, with the exception that:
    • simulate() calls on prepare_simulation() whereas revive_simulation() calls prepare_revival()
    • simulate() deploys the organism whereas revive_simulation() does not
    • simulate() depends on "maximum_generations" for its generation count, whereas revive_simulation() depends on "extend_gen"
  2. It seems that the largest difference between the two is how they prepare the simulation (creation of simulation parameters, functions, World and Populations).
  3. Refactor both revive_simulation() (rename it to 'rev_simulate()') and simulate() to call simulation_core(), which contains the code they have in common.

Possible infinite loop in Ragaraja interpreter

  1. Ragaraja interpreter function (in register_machine.py) - interpret(source, functions, function_size=1, inputdata=[], array=None, size=30, max_instructions=1000) - hence, default maximum instructions to execute is 1000.
  2. The number of max instructions to execute can be changed by "max_codon" option in simulation parameters.
  3. simulation_calls.interpret_chromosome() function called the interpreter as
    register_machine.interpret(source, ragaraja.ragaraja, 3, inputdata, array,
    sim_parameters["max_cell_population"],
    sim_parameters["max_codon"])
  4. Hence, this should prevent an infinite loop. However, a simulation can still run very slowly, which may look like an infinite loop.

Moving functions called by dose.simulation into simulation_calls.py

  1. Currently, dose.py contains all the simulation functions, but in reality only dose.simulate() is called by user simulations.
  2. Hence, dose.py is like the programming interface for DOSE and should be kept as simple as possible.
  3. Functions called by dose.simulate(), such as dose.step(), will never be called directly by users. They are "private" functions to a large extent.
  4. Is it possible to move all these "private" functions to simulation_calls.py and import it into dose?
    Technically, we can use "from simulation_calls import " so that we do not have to change dose.simulation() codes.
  5. Identified functions are
    -- spawn_populations
    -- eco_cell_iterator
    -- coordinates
    -- adjacent_cells
    -- deploy
    -- interpret_chromosome
    -- step
    -- report_generation
    -- bury_world
    -- write_parameters
  6. filter_xxx functions in dose.py are required for user-defined mating schemes; hence, should be retained in dose.py

Removed all unused modules in COPADS

  1. dose.COPADS contains a lot of modules; of which, a number of modules are not used, especially data structures.
  2. Remove the unused modules to make copads subpackage slimmer.
  3. Modules can be added as required.

Rename max_cell_population in simulation_parameters into tape_length

  1. max_cell_population is a parameter in simulation_parameters that serves as the size parameter when calling interpret function in ragaraja.
  2. It defines the length of the array to be used when interpreting the chromosome.
  3. The names "eco_cell_capacity" and "max_cell_population" seem to collide with each other, even though they mean two entirely different things.
  4. To prevent confusion, rename "max_cell_population" into "tape_length".

database - log simulation parameters

  1. A table called parameters should hold all the simulation parameters.
  2. Fields:
    -- UID - unique ID, primary key; either auto-increment number or a random key
    -- key - key/field name of the parameters
    -- value - parameter value
  3. Function to log parameters: uid = db_log_simulation_parameters(parameters)
  4. Depends on Issue #18
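
One possible reading of this design, sketched with the standard-library sqlite3 module. Several details are assumptions rather than the DOSE implementation: the function takes an open connection as an extra argument, UID is a random key shared by all rows of one simulation run, and the primary key is the pair (UID, key) so that one run can store many parameters.

```python
import sqlite3
import uuid


def db_log_simulation_parameters(con, parameters):
    """Store each parameter as a (UID, key, value) row and return the UID."""
    con.execute('CREATE TABLE IF NOT EXISTS parameters '
                '(UID TEXT, "key" TEXT, value TEXT, '
                'PRIMARY KEY (UID, "key"))')
    uid = uuid.uuid4().hex  # random key identifying this run
    rows = [(uid, str(k), str(v)) for k, v in parameters.items()]
    con.executemany('INSERT INTO parameters VALUES (?, ?, ?)', rows)
    con.commit()
    return uid


con = sqlite3.connect(':memory:')
uid = db_log_simulation_parameters(
    con, {'maximum_generations': 1000, 'database_logging_frequency': 1})
logged = con.execute('SELECT "key", value FROM parameters '
                     'WHERE UID = ? ORDER BY "key"', (uid,)).fetchall()
```

Values are stored as text here, so a reader of the table would need to convert them back to their original types.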

reporting organism.status['location'] for every organism in the population does not follow deployment schemes

Deployment schemes 2-4 do not properly change the 'location' status of every deployed organism.

  1. Something could be overwriting the changed 'location' status or,
  2. Deployment schemes are bugged, unable to change 'location' status of every individual properly or,
  3. Reporting [str(org.status['location']) for org in population.agents] is an incorrect reporting scheme to see the changes in 'location' status for every individual.

Means to continue from frozen population(s) and buried ecosystem

  1. Population.freeze() is a method to pickle (aka freeze) all or part of the population into a file. The reverse method is Population.revive(). This is similar to freezing down bacterial cultures and reviving frozen cultures in research settings.
  2. Similarly, World.eco_burial() is a method to pickle (aka bury) the entire world into a file. The reverse method is World.eco_excavate().
  3. But currently, there is no method/function to read in a list of frozen population files and a buried world file, and continue the simulation from there.
