Comments

yymao commented:

Thanks to @evevkovacs. #34 closes this.

yymao commented:

Reopening this issue, as we haven't finalized the validation criteria.

yymao commented:

@evevkovacs has implemented this test but we still need to impose a validation criterion.

The plot below is from the Nz_i_Coil2004_maglim test, taken from this DESCQA run:
[image]

The plot below is from the Nz_r_DEEP2_JAN test, taken from this DESCQA run:
[image]

rmandelb commented:

Just to smooth out some of the structure, could we imagine fitting the data to the same parametric form used for the N(z) from DEEP2? The advantage of this in my mind is that (a) we don't have to worry about structure in the mocks that is just due to the small areas involved, and (b) the test remains at least somewhat valid even when only z < 1 is covered.

rmandelb commented:

(And then our validation criterion can be on the values of the fit parameters.)

evevkovacs commented:

@rmandelb @janewman-pitt-edu Yes, that is a possibility. The plots above are both for the small-area catalogs, so are you imagining that this will be an interim criterion until cosmoDC2 is available? For comparison, here is a similar plot for Buzzard (10k sq. deg):
[image]
So, if you think there is a better metric for this plot than comparing fits, that would be the preferred one to implement, so that we don't have to redo it when cosmoDC2 arrives. (Or we could have one method for big catalogs and another for small ones, but I'm a bit reluctant to invest a lot of effort in tailoring something special for small catalogs when the larger one is coming.)
The other issue with comparing the fits (which are z^2 exp(-z/z0)) is that the current values I have for the fit parameter (z0) do not include errors (with one exception), so we have no way to evaluate whether a deviation is acceptable. That is, suppose the criterion is +/- 3 sigma for z0: we don't have a value for sigma (except for the i-band over a limited range of magnitudes). So we would have to look for other validation data sets that include errors. Jeff may be able to help here.
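(For concreteness, a minimal sketch of how a fit of this form could also return a statistical error on z0. The binned counts, Poisson weights, and function names here are assumptions for illustration, not the current test code.)

```python
import numpy as np
from scipy.optimize import curve_fit

def nz_model(z, amp, z0):
    """Parametric form discussed above: N(z) ~ z^2 exp(-z/z0)."""
    return amp * z**2 * np.exp(-z / z0)

def fit_z0(z_mid, counts):
    """Fit binned N(z) counts; return z0 and its statistical error.

    z_mid:  redshift bin centers (illustrative input, numpy array)
    counts: galaxy counts per bin; sqrt(N) Poisson errors are assumed
    """
    sigma = np.sqrt(np.maximum(counts, 1.0))
    popt, pcov = curve_fit(nz_model, z_mid, counts,
                           p0=[counts.max(), 0.3], sigma=sigma)
    return popt[1], np.sqrt(pcov[1, 1])  # z0, sigma_z0 (statistical only)
```

Note that sigma_z0 here is purely statistical; a +/- 3 sigma criterion would still need the cosmic-variance contribution discussed above.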

janewman-pitt-edu commented:

The errors in z0 coming from the fit of z0 vs. magnitude limit are quite small. Inaccuracy in the model would be a bigger worry (i.e., a fit like z^2 exp(-(z/z0)^1.25) or ^1.5 would also give an acceptable solution for some value of z0, given the sample/cosmic variance in DEEP2). I actually think these plots look pretty acceptable, especially because they look better at the fainter end, which is the LSST regime anyway.

rmandelb commented:

@janewman-pitt-edu - speaking of the statement that these plots look acceptable, I had some thoughts about this:

  • Correct me if I'm wrong, but I think anything with i>24 from Coil et al is an extrapolation of a fitting formula, right? I think that any validation tests using those extrapolations should be designed to be less stringent.

  • I don't think we need validation tests for all the magnitude ranges shown. We could pick ~2 mag ranges and stick with those.

  • I was wondering what to make of the bright ones looking somewhat odd. My first thought was "OK, we don't care too much if i<21 is wonky for much of our science". But can/should these plots be used as input into the tuning of the galaxy models that @aphearin is doing?

To answer the question from @evevkovacs - I wasn't proposing that test as an interim solution until cosmoDC2 is available; I was proposing it as something we do now and keep for later. I like it because it works both for large-area and small-area catalogs, with somewhat limited or very broad redshift ranges, and probably gets at enough of the features in the N(z) that it can provide a good enough test for our purposes. Curious to hear if @janewman-pitt-edu agrees with this sentiment.

With that said, the bigger issue is (a) should we allow a bit more freedom in the fitting formula? (It doesn't make sense to change the power law out front from z^2, but it could make sense to change the exponent in the exponential, as Jeff says.) And (b) what is our validation criterion? I have code to re-fit those distributions, and I'm sure Jeff does too, so most likely either of us could provide statistical errors. But I'm not sure that basing the criterion on a few-sigma statistical error in the fit parameters is the way to go. For example, there is incompleteness in the spec-z sample, especially at the faint end, and that incompleteness is likely a function of z, so there is systematic uncertainty in the distributions in the data. And for some fainter magnitude ranges these aren't even parameters output from a fit to real data, but rather extrapolations of fitting formulae based on fit parameters for brighter magnitude ranges! It's really the systematic error in that extrapolation that is likely to dominate. I am not sure of a good way to decide whether the distribution is close enough to the data to not affect our science too badly. Curious to hear if @slosar has some thoughts for LSS.

janewman-pitt-edu commented:

@rmandelb : I'm pretty sure the biggest uncertainty for the DEEP2 fits is sample/cosmic variance, not incompleteness. Without much more area it's just not possible to distinguish between exp(-(z/z0)^1) and ^1.5. (We can rule out ^2 or ^0.5.)

> Correct me if I'm wrong, but I think anything with i>24 from Coil et al is an extrapolation of a fitting formula, right? I think that any validation tests using those extrapolations should be designed to be less stringent.

Past i=23 (or r=24). The z0 vs magnitude limit curve is such a straight line (with very small residuals) that I have confidence in those extrapolations; what I would say is more uncertain is the functional form to use.

> I don't think we need validation tests for all the magnitude ranges shown. We could pick ~2 mag ranges and stick with those.

Absolutely.

> I was wondering what to make of the bright ones looking somewhat odd. My first thought was "OK, we don't care too much if i<21 is wonky for much of our science". But can/should these plots be used as input into the tuning of the galaxy models that @aphearin is doing?

I have thought of this as a sanity check. I think if we are really matching (a) the number counts, (b) the galaxy color distributions, and (c) the luminosity functions, we should be getting this right; but a, b, and c are all easier to measure empirically than dN/dz. I thus feel we should focus on things like those for the validation tests we want to strictly enforce. (I'll note that I prefer luminosity functions to mass functions, as the latter have systematic uncertainties in the observations and, at the high-z end, again have sample/cosmic variance issues.)

I believe it would be possible for me to do the z0 vs. r or i magnitude limit fits for ^1.5 and probably ^1.25, if I can remember how the code works :)

slosar commented:

I don't have much to add, except that the absolute number of objects as a function of magnitude matters much more than how they are distributed across redshift. The former affects the number of blends, etc., which propagates into pretty much everything and will be crucial for some decisions (do you just throw really bad ones away, or are there just too many so you need to fight them?), while the latter just changes the relative SNR and perhaps photo-z accuracy; these affect the FoM, etc., but they don't fundamentally change the way we want to do data reduction.

evevkovacs commented:

@rmandelb @yymao @janewman-pitt-edu @slosar So to summarize:
i) dN/dmag is much more important than dN/dz;
ii) dN/dz could be checked "by eye" if the fits to the data were more representative of the true uncertainties involved and the catalog data included errors for cosmic variance;
iii) the fits to the data for dN/dz need to explore other functional forms.

Actually, since Coil et al. do have other fits available, with exp(-(z/z0)^1.2), I propose that I should:

  1. include this variant in the plots and make some kind of shaded band for the data fits;
  2. implement jack-knife errors for the catalogs to get more realistic error bars on the catalog data (roughly as in the sketch below).

Then we can see how the plots look. What do you all think?
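(A rough sketch of what item 2 could look like, assuming leave-one-patch-out jack-knife resampling over sky patches; the patch labels are an illustrative input, and the existing DESCQA1 module mentioned below may work differently.)

```python
import numpy as np

def jackknife_nz(z, patch_id, z_edges):
    """Leave-one-patch-out jack-knife errors on a binned N(z).

    z:        galaxy redshifts
    patch_id: sky-patch label per galaxy (e.g. a coarse healpix index)
    """
    patches = np.unique(patch_id)
    n = len(patches)
    # leave-one-out counts, rescaled to the full survey area
    reps = np.array([np.histogram(z[patch_id != p], bins=z_edges)[0]
                     * n / (n - 1.0) for p in patches])
    mean = reps.mean(axis=0)
    # standard jack-knife variance: (n-1)/n * sum of squared deviations
    err = np.sqrt((n - 1.0) / n * ((reps - mean) ** 2).sum(axis=0))
    return mean, err
```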

aphearin commented:

@evevkovacs @yymao - a lot of work needs to be done on the catalogs in order for them to meet basic specs. This test you describe is important, but for example implementing jack-knife errors on the catalogs seems to me a much lower priority than inspecting the result and trying to make adjustments to the catalog. Worth keeping in mind as we have very limited time remaining.

evevkovacs commented:

@aphearin @yymao We already have a jack-knife module from DESCQA1, so I hope this will be easy.

yymao commented:

Not necessary for this specific case, but generally I agree with @aphearin that we are currently limited by person power and really need to prioritize our efforts.

janewman-pitt-edu commented:

@pefreeman has been testing out applying a photo-z algorithm and found some anomalies that caused us to investigate further. @sschmidt23 has done a couple of tests showing that the halo mass cutoffs in protoDC2 are having a big effect -- such that, although redshift distributions look OK for integrated bins in magnitude, they will look way off for differential bins (see https://github.com/sschmidt23/DC2stuff/blob/master/protoDC2noerrorscheck.ipynb for DC2, vs. https://github.com/sschmidt23/DC2stuff/blob/master/Buzzard_quicklook.ipynb for Buzzard).

The key issue is visible in this plot:

[image]

where the solid red line indicates the LSST gold sample weak lensing limit and the dashed line represents a deeper limit that could perhaps be used for LSS studies. The mass limit causes a deficiency in faint objects at low redshift (where they do exist in real samples). Amongst other things, this will cause photo-z performance to be poor if realistic magnitude priors are used with template techniques, whereas it will be too good with training-based techniques as magnitude will be more informative about redshift than it should be.

Other surprising things that Peter and Sam have found are the gaps in the color-color diagram (g-r vs. r-i) and the jumps at what we believe are the boundaries between the different-redshift cubes used to construct the light cones (most visible in the r-i vs. redshift plot at higher redshifts).

janewman-pitt-edu commented:

Here's the color-color plot:

[image]

janewman-pitt-edu commented:

And here's the r-i vs. z plot:

[image]

rmandelb commented:

@janewman-pitt-edu - thanks for these plots. A few people had been looking at color vs. redshift, but I'm not sure anybody has considered magnitude vs. redshift - and indeed that's a pretty important gap. We should not just be testing the 1D dN/dmag and dN/dz, because that will cause us to miss important features in these distributions.

I just tried to find how the mass limit changes from proto-DC2 to cosmo-DC2, and failed. Must have been looking in the wrong place. Hopefully @evevkovacs or @yymao will comment. It could be that the move to cosmo-DC2 will help with this problem.

aphearin commented:

@janewman-pitt-edu - Thanks for posting the tests of protoDC2. I agree with your assessment that the chunky edges shown in your z_phot vs. z_spec are most likely due to finite time-stepping of the simulation snapshots used to construct the lightcone.

I've also noticed some unrealistic features in color-color and color-magnitude space for protoDC2. I've been working on rescaling a single snapshot of protoDC2, using a mishmash of Monte Carlo resampling methods and drawing upon the UniverseMachine to resample model galaxies as a function of {M*, sSFR}, so that I can get two-point clustering correct as a function of these variables.
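(A toy illustration of that kind of {M*, sSFR}-conditioned resampling, with invented bin definitions and inputs; the actual protoDC2 rescaling is considerably more involved.)

```python
import numpy as np

def resample_by_mstar_ssfr(tgt_mstar, tgt_ssfr, src_mstar, src_ssfr,
                           nbins=20, seed=0):
    """For each target galaxy, draw a donor from the source sample that
    sits in the same 2D (log M*, log sSFR) bin; returns donor indices,
    or -1 where a bin has no donors."""
    rng = np.random.default_rng(seed)
    m_edges = np.linspace(src_mstar.min(), src_mstar.max(), nbins + 1)
    s_edges = np.linspace(src_ssfr.min(), src_ssfr.max(), nbins + 1)

    def bin_ids(m, s):
        im = np.clip(np.digitize(m, m_edges) - 1, 0, nbins - 1)
        js = np.clip(np.digitize(s, s_edges) - 1, 0, nbins - 1)
        return im * nbins + js

    donors = {}
    for idx, key in enumerate(bin_ids(src_mstar, src_ssfr)):
        donors.setdefault(key, []).append(idx)

    out = np.full(len(tgt_mstar), -1)
    for idx, key in enumerate(bin_ids(tgt_mstar, tgt_ssfr)):
        if key in donors:
            out[idx] = rng.choice(donors[key])
    return out
```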

Since I'm currently only focused on a single snapshot, I can't plot things as a function of redshift, but this four-panel plot shows a range of quantities that are scaling better than what you're showing here. All panels show results for a volume-limited sample at z=0, complete in stellar mass down to 10**9.5 Msun (~4e5 galaxies).

[image: four_panel_color_magnitude]

@janewman-pitt-edu - I looked at the notebooks you posted and I do not follow your argument about halo mass cutoffs. None of those plots show halo mass on any axis, and none of the plotted samples have had any halo-mass masking applied. Apologies if I'm being dense or just missed something, but why are you saying that halo mass cutoffs are connected to the problems you are seeing?

CC @evevkovacs @dkorytov @katrinheitmann

janewman-pitt-edu commented:

@aphearin : My belief that this is due to a mass cutoff is based on past discussions with @evevkovacs et al. about how Galacticus was run (at least in the past). The shape looks entirely consistent with that to me; basically, the envelope of i magnitude vs. redshift is set by the mass cutoff combined with the highest mass-to-light ratio among the low-mass galaxies (or maybe I'm flipping that and it's the lowest). I.e., it looks like a line of ~constant luminosity. At high z that luminosity is fainter than our magnitude limit, but at low z it is brighter (as the distance is smaller).

You mentioned the UniverseMachine mapping is happening in a box; just to confirm: galaxies will still end up with observed colors that properly k-correct to their assigned z's? Otherwise photo-z's will be very messed up...

evevkovacs commented:

I rechecked the parameter file that I used to run Galacticus. In order to speed up the calculation of luminosities, there is a cut-off in absolute magnitude at -10; i.e., if the galaxy is fainter than -10, it is not evaluated. However, at z ~ 0.2, I estimated that this would cut out galaxies fainter than apparent magnitude ~30, so unless I made a mistake, this cut would not produce the behavior seen above. We will investigate further.

janewman-pitt-edu commented:

@evevkovacs : I come up with ~30 too. That suggests again that it might be due to a mass limit on how Galacticus was run (or the simulations/merger trees used as input) rather than a luminosity limit directly.

evevkovacs commented:

@dkorytov OK good. We are checking now, but I believe Dan added an additional cut of M < -16 to protoDC2 because that is where the number of galaxies started to fall off. And M < -16 would correspond to m ~ 24 at z = 0.2, so that is exactly where you see the cutoff. We can easily remove this cut. The number density of the resulting galaxies would probably be too low, but at least there would be some to look at.
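(A quick check of this arithmetic, assuming a generic flat LCDM cosmology and ignoring K-corrections; protoDC2's exact parameters may differ slightly.)

```python
from astropy.cosmology import FlatLambdaCDM

cosmo = FlatLambdaCDM(H0=70, Om0=0.3)    # generic cosmology for the estimate
dm = cosmo.distmod(0.2).value            # distance modulus at z = 0.2, ~40 mag

for M in (-10, -16):
    print(f"M = {M}: m ~ {M + dm:.1f}")  # -> m ~ 30 and m ~ 24, as quoted
```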

janewman-pitt-edu commented:

That would explain it!

Why would the number density be low?

aphearin commented:

@dkorytov is correct about simulation resolution - properly resolving the halos hosting galaxies fainter than Mr = -16 is not possible for present-day simulations of cosmological volumes.

janewman-pitt-edu commented:

I know it's late for DC2, but for DC3 can we at least try to emulate the subhalos down to lower masses? Redshift distributions and photo-z tests will be off if we don't.

aphearin commented:

We may be able to supplement the existing simulation on DC2 timescales, @janewman-pitt-edu - for example using something along the lines of this paper, which is similar to what is done in Buzzard. For the purposes of the present discussion, I thought it was worth making sure you were aware that the specs you are asking for are beyond the current capabilities of any simulation that has ever been run.

rmandelb commented:

@yymao and @evevkovacs - I wanted to ask you about an update to this test that could address the issue @janewman-pitt-edu raised. Since Andrew showed that it would be very challenging in DC2 to get a sample to the depth of the LSST gold sample in the lowest redshift range (see discussion in this issue), I think that for the dN/dz test we need to use only the redshift range where the sample is complete down to the magnitude we are using for the test. So basically, we'd need to make a 2D plot like the one Jeff showed in #11 (comment), use it to find the redshift at which the limiting magnitude reaches the one we are using for the test, and only require the test to pass above that redshift.

In other words: if we're using i<X (X=25 or 24 or whatever), then we take narrow redshift bins, and in each one, we ask what its limiting magnitude is. Only when the limiting magnitude is X or larger do we consider the test valid. That becomes a lower limit in redshift, zmin, for this test, and we only test the dN/dz above that zmin.

Is that a simple change to the setup? Is there some obstacle? My impression is that this should still be an issue in cosmoDC2 or any current N-body-based simulation, so when devising the test to be generally applicable we should account for this.
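(A sketch of the zmin logic described above; the completeness proxy used here, the faint-end percentile of magnitudes per redshift bin, and the helper name are assumptions for illustration, not existing DESCQA code.)

```python
import numpy as np

def zmin_for_cut(z, mag, mag_cut, z_edges, pct=99.0, min_count=50):
    """Lowest redshift-bin edge above which the catalog reaches mag_cut.

    A bin is taken as complete to mag_cut if its faint envelope (the
    pct-th percentile of observed magnitudes) is at least as faint as
    mag_cut; dN/dz would then only be tested above the returned zmin.
    """
    for lo, hi in zip(z_edges[:-1], z_edges[1:]):
        in_bin = (z >= lo) & (z < hi)
        if in_bin.sum() < min_count:
            continue  # too few galaxies to judge completeness
        if np.percentile(mag[in_bin], pct) >= mag_cut:
            return lo
    return None  # catalog never reaches mag_cut
```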

aphearin commented:

@rmandelb - I think this is generally a good idea. I will just briefly point out that we have made some progress on this faint-end problem since receiving input from @janewman-pitt-edu and @sschmidt23. The plot here shows that we're now pushing down past the previous hard cutoff at Mi=-14 (shown with the black curve to guide the eye). The hard cut in M* is currently what drives the cut in Mi, and it should be possible to extend this further still for cosmoDC2, if not the next release of protoDC2.

[image: new_scatter]

yymao commented:

@rmandelb the procedure you proposed is certainly doable. It does require some work to modify the current code.

Given that, I am unchecking the box for "code to reduce mock data", as it still needs to be worked out.

yymao commented:

@janewman-pitt-edu re: your comment at #50 (comment) --- I think what @rmandelb suggested above, if I interpret it correctly, is N(m < X, z), but only for the range of z over which the catalog is complete down to X.

Is your suggestion at #50 (comment) something different?

janewman-pitt-edu commented:

It's slightly different, as Anze was requesting doing everything in a general framework (but it's 100% inspired by Rachel's discussion here). Basically we'd be comparing a particular simulation over a restricted z and magnitude range, but we'd define the test quantity over broader ranges than that (and just use the relevant cells).

yymao commented:

@janewman-pitt-edu hmm, they seem the same to me operationally (count the number of galaxies in bins of m and z). Maybe I missed something? Or are you suggesting a different way to compare with the validation data?

slosar commented:

OK, this looks like we're going around in circles a bit. You count galaxies in bins of m and z, and then compare only those bins in which your test datasets are complete. Once we have more datasets, this will naturally extend to other bins.

yymao commented:

@slosar thanks --- I guess what I am trying to figure out is how to compare with data, or, more specifically, how to normalize the counts in bins of mag and z and how to define the validation criteria.

I think what you are saying is to just use the number of galaxies per unit sky area, in bins of mag and z. I thought Rachel's suggestion was to look at P(z) for z > z*, where z* is the redshift above which the galaxy catalog is complete to m < X. I understand that they are not that different, but the choice affects how the validation criteria are defined.

janewman-pitt-edu commented:

I think we want to:

  1. do the analysis for integral bins in magnitude (i.e. m < X), not differential bins in magnitude (A < m < B). That is tied down better by the data than differential bins (see the sketch after this comment).

  2. have both minimum and maximum redshifts for any given comparison. Most observational samples will have an effective z limit.

I agree this is all delving into the details.
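(Operationally, both suggestions then reduce to a counts-in-cells table with integral magnitude bins, along these lines; the sky area and bin edges are illustrative inputs, and incomplete (z, X) cells would be masked before comparison.)

```python
import numpy as np

def counts_grid(z, mag, area_deg2, z_edges, mag_cuts):
    """N(m < X, z) per square degree: rows are magnitude limits X,
    columns are redshift bins. Compare to validation data only in
    cells where both the catalog and the data are complete."""
    grid = np.empty((len(mag_cuts), len(z_edges) - 1))
    for i, X in enumerate(mag_cuts):
        grid[i] = np.histogram(z[mag < X], bins=z_edges)[0] / area_deg2
    return grid
```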

yymao commented:

I think we have come to some agreement on how to implement the criteria given the above discussion.

@evevkovacs what's the current status of this test? Are you actively working on it or should we find help? If you are currently working on it, can you provide a status update? Thanks!

evevkovacs commented:

@yymao @janewman-pitt-edu I am working on the test. I need to do a little code refactoring, and I need a source of validation data with galaxy number densities. Right now, I only have shape information to compare with (from Jeff and from Coil et al. 2004). Once I have the data, I can redo the test to conform to it. Thanks.

rmandelb commented:

If we're already doing a number density validation of N(<mag) in a separate test, then can we just test N(z) based on shape, ignoring normalization?

evevkovacs commented:

@janewman-pitt-edu Is there anything more up-to-date than the Coil et al. 2004 data that I have been using for the shape test of the N(z) distributions? Thanks.

yymao commented:

This test has been done for a while. There's a bug fix currently open in #119, but this issue should be closed.
