asteca / asteca
Code for the ASteCA package.
Home Page: http://asteca.github.io/
License: GNU General Public License v3.0
Change the positioning of the diagrams.
This function needs to be optimized; it takes far too long to finish for medium-sized clusters and above.
Sometimes this will happen:
.../functions/get_p_value.py", line 142, in get_pval
p_vals_cl.append(float(str(p_val_cl)[4:]))
ValueError: invalid literal for float(): 0,2902803
Need to add localization so commas are never used.
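A minimal sketch of one way to make the conversion tolerant of comma decimal marks, assuming the value reaches the code as a string. The helper name `to_float` is hypothetical and not part of ASteCA.

```python
def to_float(text):
    """Convert a numeric string to float, accepting ',' or '.' as decimal mark.

    Assumes the string holds a single number with no thousands separators.
    """
    return float(text.strip().replace(',', '.'))


print(to_float("0,2902803"))
print(to_float("0.2902803"))
```

This only works around the symptom; forcing a fixed numeric locale where the value is produced would avoid the comma in the first place.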
Add best fitted parameters (metallicity, age, extinction, distance modulus) to final output file.
There's a bug in synth_clust where sometimes (haven't reproduced it) the array mass_bin0 is a float, possibly because m1 is empty.
Once the data_output is stable, fix 'clusters_input.dat'.
1- It could match the output file's format, or
2- have a different, easier to read format.
Check Cabrera-Caño & Alfaro 1990, there appears to be a missing 1/(n-1) term in the likelihood function.
To be used on frames with complicated geometries or bad pixels that show blank portions.
Empty or bad pixel regions could be either filled with an average sample of stars from the frame or marked as ignored.
Give the option to supply a file with as many lines as needed, each holding xmin, xmax, ymin, ymax values, to leave out rectangular sections of the frame that are either empty or unusable for some reason.
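A rough sketch of how such a file could be applied, assuming whitespace-separated `xmin xmax ymin ymax` rows and stars given as `(x, y)` tuples. The function names and the `#` comment convention are assumptions made for illustration, not existing ASteCA code.

```python
def read_excluded_regions(lines):
    """Parse 'xmin xmax ymin ymax' rows; blank lines and '#' comments are skipped."""
    regions = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        xmin, xmax, ymin, ymax = (float(v) for v in line.split())
        regions.append((xmin, xmax, ymin, ymax))
    return regions


def filter_stars(stars, regions):
    """Keep only the (x, y) stars that fall outside every excluded rectangle."""
    def excluded(x, y):
        return any(xmin <= x <= xmax and ymin <= y <= ymax
                   for xmin, xmax, ymin, ymax in regions)
    return [(x, y) for x, y in stars if not excluded(x, y)]


# An open file object can be passed instead of a list of lines.
regions = read_excluded_regions(["# xmin xmax ymin ymax", "0 100 0 50"])
stars = [(10.0, 20.0), (150.0, 20.0), (50.0, 60.0)]
kept = filter_stars(stars, regions)
```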
The colorbar in the CMD with the membership probs keeps moving around when the GA and/or the p-value test are disabled/enabled.
If the cluster is located near a border (see NGC1863) the zoom will look stretched because the axes won't be of the same size and the plot is always square.
Obtain not only the integrated magnitude curve but also the integrated color index.
Use those errors in the likelihood calculus.
Use exceptions to handle any possible crash in any function. Make it so that the code itself doesn't crash but instead jumps to the next cluster file in the loop (if any).
Resources:
https://docs.python.org/2/tutorial/errors.html#handling-exceptions
http://openbookproject.net/thinkcs/python/english3e/exceptions.html
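The idea can be sketched as a loop that wraps each cluster in a `try`/`except` block. `process_all` and `demo_analyze` are hypothetical names; `demo_analyze` only stands in for the real per-cluster pipeline.

```python
import traceback


def process_all(cluster_files, analyze):
    """Run `analyze` on each file; report failures and continue with the next."""
    failed = []
    for path in cluster_files:
        try:
            analyze(path)
        except Exception:
            # Print the full traceback so the failure can be diagnosed later.
            traceback.print_exc()
            failed.append(path)
    return failed


def demo_analyze(path):
    # Stand-in for the real per-cluster analysis (hypothetical).
    if "bad" in path:
        raise ValueError("could not process " + path)


failed = process_all(["cl_a.dat", "bad_cl.dat", "cl_b.dat"], demo_analyze)
```

Catching `Exception` rather than using a bare `except` keeps Ctrl-C working while still letting the loop move on to the next file.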
Check if I should subtract the averaged field integrated magnitude curve from the cluster region integrated magnitude curve to obtain a more accurate estimate of the true cluster integrated magnitude.
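If the subtraction is done, it has to happen in flux space, since magnitudes are logarithmic. A minimal sketch for a single integrated value, assuming each field region covers the same area as the cluster region; the function names are hypothetical.

```python
import math


def integrated_magnitude(mags):
    """Integrated magnitude of a list of stars: -2.5*log10(sum of fluxes)."""
    flux = sum(10 ** (-0.4 * m) for m in mags)
    return -2.5 * math.log10(flux)


def field_corrected_magnitude(cluster_mags, field_mags_list):
    """Subtract the mean field flux from the cluster-region flux.

    Returns None when the mean field flux is at least as large as the
    cluster-region flux, since the net magnitude is then undefined.
    """
    cl_flux = sum(10 ** (-0.4 * m) for m in cluster_mags)
    fl_fluxes = [sum(10 ** (-0.4 * m) for m in mags) for mags in field_mags_list]
    mean_field = sum(fl_fluxes) / len(fl_fluxes)
    net = cl_flux - mean_field
    return -2.5 * math.log10(net) if net > 0 else None
```

Applying the same function at each limiting magnitude would give the corrected curve rather than a single value.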
See Ref 27/SL351, the defined cluster region cuts off a portion of the cluster. Perhaps define it as a square centered on the center of the cluster of length 2*1.5*r_cl.
If the cluster is too large or the frame too small, it could happen that the region selected for obtaining the background falls inside the cluster.
In this case the radius, background, density, field regions, bayesian decont algor, p-value test should be skipped.
Add a flag so that the user can indicate when this happens or.
Write a function that calculates Saha's W parameter between the cluster region and all the field regions. It is another version of what the p-value distribution function does.
On second thought, not sure it is the same thing.
This short article The W-function applied to the age of Globular Clusters, Rengel & Bruzual (2002), uses the W function to estimate ages for GCs.
The method used is similar to what ASteCA does to estimate the cluster probability of being a real physical entity through the KDE p-value: compares synthetic clusters of the same age with each other to generate a distribution of W values, then compares the observed cluster with synthetic clusters of the same age, and finally selects the "best" age estimate as that which produces the largest overlap between distributions.
More details can be found in the PhD Thesis (dead) on which the article is based. There it is stated that the number of model points is fixed to the number of observed points (stars), see p. 29.
Confirmed by Dr P Saha: the W function should be used when the number of model points is fixed.
Dr Saha suggested fixing this parameter to a large value (as large as possible) and assigning per-star masses after the fitting is completed. But, as stated by Dr Saha: "If M is very large, W should go to the Poisson formula", which sort of defeats the purpose of using W.
This statistic is also discussed in Bayesian isochrone fitting and stellar ages, Valls-Gabaud (2014), who concludes that W is:
the statistic of choice to be used in the context of CMD modeling
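For reference, a sketch of the statistic itself, assuming the commonly quoted binned form W = prod_i (m_i + s_i)! / (m_i! s_i!), where m_i and s_i are the model and observed star counts in CMD bin i. The function name is hypothetical and the binning is left to the caller; the logarithm is returned to avoid overflow in the factorials.

```python
from math import lgamma


def log_saha_w(model_counts, data_counts):
    """Return ln(W) for two equally binned sets of integer star counts."""
    total = 0.0
    for m, s in zip(model_counts, data_counts):
        # ln[(m + s)! / (m! s!)] written with log-gamma: n! = gamma(n + 1)
        total += lgamma(m + s + 1) - lgamma(m + 1) - lgamma(s + 1)
    return total
```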
Add a file containing all the values that are currently hard-coded.
There's an issue in the elitism/decode/fitness_eval block where the best solution is apparently not being passed along to the fitness_eval function.
I suspect this is related to the decode_ function not transforming the solution correctly.
Right now, before comparing the cluster region with a field region, a number of n_f stars are removed from the cluster region, where n_f is the number of stars in that field region.
This results, for heavily contaminated regions, in a cluster region almost devoid of stars which forces high p-values for the cluster-field regions comparisons. For clusters not too contaminated the effect is diminished.
This was introduced via issue #12.
Generate an average of all the field regions defined and add it to the integrated magnitude plot.
Also add completeness limit line to the integ mag plot.
Finish isoch fit process and merge into main code.
Retrieving data from Vizier: http://www.aspylib.com/doc/astrometry_queries.html
Main package: http://www.astropy.org/
Generalize the code to process a CMD from any arbitrary photometric system defined in the Girardi set.
Old attempts: Old 0.2.0 branch with 70 commits, 34 older commits
inp\input_params
params_input data as pd dict, get rid of global variables.
cld dictionary with keys: id, coordinates, magnitudes (and errors), colors (and errors)
clp dictionary as they are found.
func_caller
.params_input.dat
Make the total mass and the binary fraction that the synthetic cluster is created with two more variable parameters.
Correct the description of flag_area_stronger (currently tied to the decont algor in the comments) and other things that need it.
Currently the steps defined in the input file for these two parameters have little use.
Make it so that these values are used when reading the isochrone files so as to skip values in between.
Add an option to read the membership probabilities from file (from a previous run or user-provided) to speed up calculations.
See what happens with the plotted KDE when this happens.
Restrict the selection to a given CMD so that it defines the magnitude, color and photometric system used.
There's no point in letting these things be picked until the code learns how to deal with them separately (currently it does not).
When the cluster has a high CI (cont index) the radius should be restricted to a value lower than the one found by the get_radius function. This way fewer field stars will be present in the r<r_cl CMD and the isochrone fitting process will be more accurate.
It's useless.
The cluster-region array is constantly being used and the stars in it are always being filtered to only use stars inside the cluster's radius. Re-write this so this "cleaning" is not needed anymore.
It needs a re-name and a re-write. The name does not accurately describe what it does anymore and neither do the descriptions inside the function.
Should I subtract the averaged integrated magnitude from the field regions from the cluster region integrated magnitude curve?
That.
Curve goes below the plot, minimum is set too high. See BSDL654.
Create a file that can store all the values necessary for the p-value and qq-plot functions to be processed without running them, just reading data from said file.
1- Move the integ magnitude plot to the first column fifth row
2- Move the memb probability distribution to the second column fifth row
3- Discard the m_p>0.75 diagram
4- Discard the N_c CMD diagram.
5- Displace the full CMD two columns to the right
6- Replace the m_p>0.5 for m_p>mu and locate it after the full CMD.
Plot the cluster and field regions LFs in a single graph, flip the x axis and use high-steps.
Use it along with Saha's W parameter to estimate the field-cluster region fit.
Same issue as with Saha's W parameter, not sure it does the same thing as the p-value.
Finish.
Read on possible CCC replacements and/or useful additions here:
http://stats.stackexchange.com/questions/78640/assessing-fit-with-identity-line-in-q-q-plot
Check if R_t is better obtained leaving the KP to find the background automatically.
The method is introduced in Cartwright & Whitworth (2004) and applied in Sánchez & Alfaro (2009), Gieles et al. (2008), Sánchez & Alfaro (2010), etc.
Have all the text the code shows in terminal saved to a final output file.
Resources:
https://docs.python.org/2/howto/logging.html#logging-from-multiple-modules
http://www.shutupandship.com/2012/02/how-python-logging-module-works.html
http://inventwithpython.com/blog/2012/04/06/stop-using-print-for-debugging-a-5-minute-quickstart-guide-to-pythons-logging-module/
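A minimal sketch with the standard `logging` module, sending each message both to the terminal and to a file. The logger name, the file name and `setup_logging` itself are assumptions made for illustration.

```python
import logging
import os
import sys
import tempfile


def setup_logging(log_path):
    """Send every message both to the terminal and to `log_path`."""
    logger = logging.getLogger("asteca_demo")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    # Remove handlers left over from a previous call to avoid duplicate lines.
    for old in list(logger.handlers):
        logger.removeHandler(old)
    fmt = logging.Formatter("%(message)s")
    for handler in (logging.StreamHandler(sys.stdout),
                    logging.FileHandler(log_path, mode="w")):
        handler.setFormatter(fmt)
        logger.addHandler(handler)
    return logger


# Demo: write the log to a temporary folder.
log_path = os.path.join(tempfile.mkdtemp(), "asteca_output.log")
logger = setup_logging(log_path)
logger.info("Processing cluster NGC1863")
```

Other modules can then call `logging.getLogger("asteca_demo")` and reuse the same handlers, so existing `print` calls can be replaced one at a time.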
This is important
The 'cluster_region' should be randomly cleaned by removing a given number of stars so that it will have the same number of stars as the field region being compared to.
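A short sketch of the random cleaning, assuming the cluster region is a list of stars and `n_field` is the number of stars in the field region being compared. The function name is hypothetical.

```python
import random


def match_star_count(cluster_region, n_field, rng=random):
    """Return a random subset of the cluster region holding n_field stars.

    If the cluster region already has n_field stars or fewer, a copy is
    returned unchanged.
    """
    stars = list(cluster_region)
    if len(stars) <= n_field:
        return stars
    # Sampling without replacement keeps n_field distinct stars.
    return rng.sample(stars, n_field)


subset = match_star_count(["s1", "s2", "s3", "s4", "s5"], 3)
```

Passing `rng=random.Random(seed)` makes the selection reproducible between runs.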
The radius is assigned perhaps too fast by using the first density point that falls within the back+delta threshold (point A).
Instead, take the first four density points starting from point A and select the one that lies closest to the background value as the radius.
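The proposed selection can be sketched as follows, assuming "within the back+delta threshold" means a density at or below `background + delta`. `pick_radius` and its arguments are hypothetical names.

```python
def pick_radius(radii, densities, background, delta):
    """Pick the radius among the first four points from point A.

    Point A is the first density point at or below background + delta.
    Of the (up to) four points starting at A, the one whose density is
    closest to the background is selected. Returns None if the density
    profile never reaches the threshold.
    """
    for i, dens in enumerate(densities):
        if dens <= background + delta:
            window = range(i, min(i + 4, len(densities)))
            best = min(window, key=lambda j: abs(densities[j] - background))
            return radii[best]
    return None


r_cl = pick_radius([1, 2, 3, 4, 5, 6], [10, 6, 3.5, 2.8, 3.05, 2.9],
                   background=3.0, delta=1.0)
```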
Currently the best fitting algorithm makes use of all the stars in the cluster region to compare with the synthetic clusters and obtain a best fit.
Wouldn't it be more reasonable to just use the N_c stars (most probable members) from the cluster region in this comparison?
Make it so that if a data input file exists then it acts, otherwise skip it.