Giter VIP home page Giter VIP logo

gaia_tools's Introduction

gaia_tools

Tools for working with the ESA/Gaia data and related data sets (APOGEE, GALAH, LAMOST DR2, and RAVE).

AUTHORS

  • Jo Bovy - bovy at astro dot utoronto dot ca

with contributions from

  • Miguel de Val-Borro
  • Simon Walker

ACKNOWLEDGING USE OF THIS CODE

Please refer back to this repository when using this code in your work. If you use the TGAS selection-function code in gaia_tools.select.tgasSelect, please cite Bovy (2017, in preparation).

INSTALLATION

Standard python setup.py build/install

Either

sudo python setup.py install

or

python setup.py install --prefix=/some/directory/

DEPENDENCIES AND PYTHON VERSIONS

This package requires NumPy, astropy, fitsio, tqdm. Some functions require Scipy and galpy. The selection-function code requires healpy; the effective-selection-function code requires mwdust for dealing with extinction. If the apogee package is installed, this package will use that to access the APOGEE data; otherwise they are downloaded separately into the GAIA_TOOLS_DATA directory (see below).

This package should work in both python 2 and 3. Please open an issue if you find a part of the code that does not support python 3.

DATA FILES AND ENVIRONMENT VARIABLES

This code will download and store various data files. The top-level location of where these are stored is set by the GAIA_TOOLS_DATA environment variable, which is the path of the top-level directory under which the data will be stored. To use the apogee functionality, you also need to set the environment variables appropriate for that package.

BASIC USE

Catalog reading and cross-matching

The basic use of the code is to read various data files and match them to each other. For example, to load the TGAS data, do:

import gaia_tools.load as gload
tgas_cat= gload.tgas()

The first time you use this function, it will download the TGAS data and return the catalog (the data is stored locally in a manner that mirrors the Gaia archive, so downloading only happens once).

Similarly, to get data for the GALAH survey's DR1, do:

galah_cat= gload.galah()

Through an interface to the more detailed apogee package, you can also load various APOGEE data files, for example:

apogee_cat= gload.apogee()
rc_cat= gload.apogeerc()

If you don't have the apogee package installed, the code will still download the data, but less options for slicing the catalog are available.

Similarly, you can load the RAVE and RAVE-on data as:

rave_cat= gload.rave()
raveon_cat= gload.raveon()

Last but not least, you can also load the LAMOST DR2 data as:

lamost_cat= gload.lamost()

or:

lamost_star_cat= gload.lamost(cat='star')

for just the stars.

To match catalogs to each other, use the tools in gaia_tools.xmatch. For example, to match the GALAH and APOGEE-RC catalogs loaded above and compare the effective temperatures for the stars in common, you can do:

from gaia_tools import xmatch
m1,m2,sep= xmatch.xmatch(rc_cat,galah_cat,colDec2='dec')
print(rc_cat[m1]['TEFF']-galah_cat[m2]['Teff'])
     Teff     
     K       
--------------
-12.3999023438
 0.39990234375

which matches objects using their celestial coordinates using the default maximum separation of 2 arcsec. To match catalogs with coordinates at epoch 2000.0 to the TGAS data, which is at epoch 2015., give the epoch1 and epoch2 keyword. For example, to cross-match the APOGEE-RC data and TGAS do:

tgas= gload.tgas()
aprc= gload.apogeerc()
m1,m2,sep= xmatch.xmatch(aprc,tgas,colRA2='ra',colDec2='dec',epoch2=2015.)
aprc= aprc[m1]
tgas= tgas[m2]

Further, it is possible to cross-match any catalog to the catalogs in the CDS database using the CDS cross-matching service. For example, to match the GALAH catalog to the Tycho-2 catalog, do the following:

tyc2_matches, matches_indx= xmatch.cds(galah_cat,colDec='dec',xcat='vizier:Tycho2')
print(galah_cat['RA'][matches_indx[0]],tyc2_matches['RA_1'][0],tyc2_matches['pmRA'][matches_indx[0]],tyc2_matches['pmDE'][matches_indx[0]])
('209.8838244', 209.88408100000001, -23.100000000000001, -10.699999999999999)

Let's see how these proper motions hold up in Gaia DR1! If you want to download a catalog from CDS, you can use gaia_tools.load.download.vizier.

The TGAS selection function

Bovy (2017, in preparation) determine the raw TGAS selection function over the 48% of the sky where the TGAS selection is well behaved. This selection function gives the fraction of true point-like objects observed as a function of (J,J-Ks) 2MASS photometry and as a function of position on the sky. Bovy (2017, in prep.) also discuss how to turn this raw selection function into an effective selection function that returns the fraction of true stars contained in the TGAS catalog as a function of distance and position on the sky, for a given stellar population and how to compute the fractional volume of a given spatial region that is effectively contained in TGAS (this is the denominator in N/V when computing bias-corrected densities based on TGAS star counts in a certain spatial region). Tools to work with the raw and effective selection functions are contained in the gaia_tools.select.tgasSelect sub-module.

The raw selection function is contained in an object and can be instantiated as follows:

>>> import gaia_tools.select
>>> tsf= gaia_tools.select.tgasSelect()

When you run this code for the first time, a ~200 MB file that contains 2MASS counts necessary for the selection function will be downloaded. When instantiating the tgasSelect object, it is possible to make different choices for some of the parameters described by Bovy (2017, in prep.), but it is best to leave all keywords at their default values. To then evaluate the fraction observed at J=10, J-Ks = 0.5, RA= 10 deg, Dec= 70.deg, do:

>>> tsf(10.,0.5,10.,70.)
array([ 0.7646336])

Another example:

>>> tsf(10.,0.5,10.,20.)
array([ 0.])

The latter is exactly zero because the (RA,Dec) combination falls outside of the part of the sky over which the selection function is well behaved. The method tsf.determine_statistical can return the part of your TGAS sub-sample that is part of the sky over which the selection function is well behaved. For example, to plot the data in TGAS for which the selection function is determined, do:

>>> import gaia_tools.load as gload
>>> tgas_cat= gload.tgas()
>>> twomass= gload.twomass()
>>> indx= tsf.determine_statistical(tgas_cat,twomass['j_mag'],twomass['k_mag'])
>>> import healpy
>>> healpy.mollview(title="")
>>> healpy.projplot(tgas_cat['l'][indx],tgas_cat['b'][indx],'k,',lonlat=True,alpha=0.03)

which gives

image

We can turn the raw TGAS selection function into an effective selection function that is a function of distance rather than magnitude for a given stellar population by specifying a sampling of true intrinsic absolute M_J and true J-Ks for this stellar population. We also require a three-dimensional extinction map, although by default the extinction is set to zero (for this, you need to install mwdust). A simple example of this is the following instance:

>>> import mwdust
>>> tesf= gaia_tools.select.tgasEffectiveSelect(tsf,dmap3d=mwdust.Zero(),MJ=-1.,JK=0.65)

which is close to a red-clump effective selection function. We can then evaluate tesf as a function of (distance,RA,Dec) to give the fraction of stars with absolute M_J = -1 and J-Ks = 0.65 contained in TGAS, for example at 1 kpc distance and (RA,Dec) = (10,70):

>>> tesf(1.,10.,70.)
array([ 0.89400531])

We could do the same taking extinction into account:

>>> tesf_ext= gaia_tools.select.tgasEffectiveSelect(tsf,dmap3d=mwdust.Combined15(filter='2MASS J'),MJ=-1.,JK=0.65)
>>> tesf_ext(1.,10.,70.)
array([ 0.27263462])

This is much lower, because the extinction toward (RA,Dec) = (70,10) =~ (l,b) = (122,7.1) is very high (A_J =~ 0.7). Note that the MJ and JK inputs can be arrays, in which case the result will be averaged over these, and they can also be changed on-the-fly when evaluating the effective selection function.

We can also compute the effective volume as defined by Bovy (2017, in prep.). For this, we need to define a function that defines the volume over which we want to compute the effective volume. For example, a cylindrical volume centered on the Sun is:

def cyl_vol_func(X,Y,Z,xymin=0.,xymax=0.15,zmin=0.05,zmax=0.15):
    """A function that bins in cylindrical annuli around the Sun"""
    xy= numpy.sqrt(X**2.+Y**2.)
    out= numpy.zeros_like(X)
    out[(xy >= xymin)*(xy < xymax)*(Z >= zmin)*(Z < zmax)]= 1.
    return out

We can then compute the effective volume for a cylinder of radius 0.15 kpc from z=0.1 kpc to 0.2 kpc as:

>>> dxy= 0.15
>>> zmin= 0.1
>>> zmax= 0.2
>>> tesf.volume(lambda x,y,z: cyl_vol_func(x,y,z,xymax=dxy,zmin=zmin,zmax=zmax),ndists=101,xyz=True,relative=False)
0.0023609512382473932

Setting relative=True would return the fractional effective volume, that is, the effective volume divided by the true spatial volume; computing the relative volume and multiplying it with the true volume is a more robust method for computing the effective volume (because pixelization effects in the computation of the effective volume cancel out). Compare:

>>> tesf.volume(lambda x,y,z: cyl_vol_func(x,y,z,xymax=dxy,zmin=zmin,zmax=zmax),ndists=101,xyz=True,relative=False)/(numpy.pi*dxy**2.*(zmax-zmin))
0.33400627552533657

with:

>>> tesf.volume(lambda x,y,z: cyl_vol_func(x,y,z,xymax=dxy,zmin=zmin,zmax=zmax),ndists=101,xyz=True,relative=True)
0.3332136527277989

As you are running these examples, you will notice that evaluating the effective volume is much faster the second time you do it (even for a different volume). This is because the evaluation of the selection function gets cached and re-used. Taking extinction into account (that is, running these examples using tesf_ext rather than tesf) takes much longer. Tools to use multiprocessing are available in this case.

For more examples of how to use this code, please see the tgas-completeness repository, which contains all of the code to reproduce the results of Bovy (2017, in prep.).

RECIPES

Match RAVE to TGAS taking into account the epoch difference

RAVE celestial positions (and more generally all of the positions in the spectoscopic catalogs) are given at epoch J2000, while TGAS reports positions at J2015. To match stars between RAVE and TGAS, we therefore have to take into account the proper motion to account for the 15 year difference. This can be done as follows:

tgas= gaia_tools.load.tgas()
rave_cat= gaia_tools.load.rave()
m1,m2,sep= gaia_tools.xmatch.xmatch(rave_cat,tgas,
                colRA1='RAdeg',colDec1='DEdeg',
                colRA2='ra',colDec2='dec',
                epoch1=2000.,epoch2=2015.,swap=True)
rave_cat= rave_cat[m1]
tgas= tgas[m2]
print(len(rave_cat))
216201

The xmatch function is setup such that the second catalog is the one that contains the proper motion if the epochs are different. This is why TGAS is the second catalog. Normally, xmatch finds matches for all entries in the first catalog. However, RAVE contains duplicates, so this would return duplicate matches and the resulting matched catalog would still contain duplicates. Because TGAS does not contain duplicates, we can do the match the other way around using swap=True and get a catalog without duplicates. There is currently no way to rank the duplicates by, e.g., their signal-to-noise ratio in RAVE.

Match LAMOST to TGAS

Similar to RAVE above, we do:

tgas= gaia_tools.load.tgas()
lamost_cat= gaia_tools.load.lamost()
m1,m2,sep= gaia_tools.xmatch.xmatch(lamost_cat,tgas,
                colRA1='ra',colDec1='dec',
                colRA2='ra',colDec2='dec',
                epoch1=2000.,epoch2=2015.,swap=True)
lamost_cat= lamost_cat[m1]
tgas= tgas[m2]
print(len(lamost_cat))
108910

Match APOGEE or APOGEE-RC to TGAS

Similar to RAVE above, we do:

tgas= gaia_tools.load.tgas()
apogee_cat= gaia_tools.load.apogee()
m1,m2,sep= gaia_tools.xmatch.xmatch(apogee_cat,tgas,
                colRA2='ra',colDec2='dec',
                epoch1=2000.,epoch2=2015.,swap=True)
apogee_cat= apogee_cat[m1]
tgas= tgas[m2]
print(len(apogee_cat))
20113

Make that second line apogee_cat= gaia_tools.load.apogeerc() for the APOGEE-RC catalog.

Match GALAH to TGAS

Similar to RAVE above, we do:

tgas= gaia_tools.load.tgas()
galah_cat= gaia_tools.load.galah()
m1,m2,sep= gaia_tools.xmatch.xmatch(galah_cat,tgas,
                colRA1='RA',colDec1='dec',
                colRA2='ra',colDec2='dec',
                epoch1=2000.,epoch2=2015.,swap=True)
galah_cat= galah_cat[m1]
tgas= tgas[m2]
print(len(galah_cat))
7919

API

(May or may not be fully up-to-date)

  • gaia_tools.load
    • gaia_tools.load.apogee
    • gaia_tools.load.apogeerc
    • gaia_tools.load.galah
    • gaia_tools.load.lamost
    • gaia_tools.load.rave
    • gaia_tools.load.raveon
    • gaia_tools.load.tgas
      • gaia_tools.load.download.vizier
  • gaia_tools.select
    • gaia_tools.select.tgasSelect
      • ``__call__
    • determine_statistical
    • plot_mean_quantity_tgas
    • plot_2mass
    • plot_tgas
    • plot_cmd
    • plot_magdist
    • gaia_tools.select.tgasEffectiveSelect
      • __call__
    • volume
  • gaia_tools.xmatch
    • gaia_tools.xmatch.xmatch
    • gaia_tools.xmatch.cds
    • gaia_tools.xmatch.cds_matchback

gaia_tools's People

Contributors

jobovy avatar simonrw avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.