cata2data's Introduction

Cata2Data

Use FITS catalogues, FITS image data and astropy to turn your data into a dynamically read dataset!

See the example.ipynb notebook for how to load several datasets into a single iterable dataset that is read into memory on demand.

Installation

With this package at ROOT_PATH/Cata2Data on your device, install it using pip:

pip install -e ROOT_PATH/Cata2Data

Quick Introduction

The quickest introduction:

from cata2data import CataData
from torch.utils.data import DataLoader

field_names = ["A", "B"]
catalogue_paths = ["CAT_A.fits", "CAT_B.fits"]
image_paths     = ["IMG_A.fits", "IMG_B.fits"]

data = CataData(
    catalogue_paths=catalogue_paths,
    image_paths=image_paths,
    field_names=field_names
)

dataloader = DataLoader(data, batch_size=64, shuffle=True)

Features

See the docstrings for detailed notes on all of the parameters that CataData accepts. The main features are:

  • Memory mapping (mmap)
  • Cutout size can be set once for the whole dataset, or
  • Cutout sizes can be controlled by the catalogue on a source-by-source basis, including non-uniform cutout heights and widths.
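
To illustrate the two cutout options, here is a minimal sketch in plain NumPy. It is not CataData's implementation: the `cutout` helper, the pixel-coordinate inputs and the `row`/`col`/`height`/`width` keys are all hypothetical names chosen for illustration.

```python
import numpy as np

def cutout(image: np.ndarray, row: int, col: int, height: int, width: int) -> np.ndarray:
    """Hypothetical helper: crop a (height, width) window centred on (row, col)."""
    top = max(row - height // 2, 0)
    left = max(col - width // 2, 0)
    # Slicing past the array edge is safe in NumPy; the cutout is simply smaller.
    return image[top:top + height, left:left + width]

# Stand-in for image data that would normally come from a FITS file.
image = np.arange(100 * 100, dtype=np.float32).reshape(100, 100)

# Option 1: one cutout size for the whole dataset.
uniform = cutout(image, row=50, col=50, height=32, width=32)

# Option 2: a size taken from each catalogue entry (hypothetical keys).
sources = [
    {"row": 20, "col": 30, "height": 16, "width": 24},
    {"row": 70, "col": 60, "height": 40, "width": 10},
]
per_source = [cutout(image, **source) for source in sources]
```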

⚠️ Note that currently catalogues are indexed through their "ra" and "dec" columns. Use the catalogue_preprocessing parameter to correctly name the columns.
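
A minimal sketch of such a renaming step follows. It assumes that catalogue_preprocessing accepts a callable which receives each catalogue as a pandas DataFrame and returns the modified DataFrame; check the docstrings for the actual signature. The input column names `RA_J2000` and `DEC_J2000` are hypothetical.

```python
import pandas as pd

def rename_coordinates(catalogue: pd.DataFrame) -> pd.DataFrame:
    """Rename hypothetical coordinate columns to the expected "ra" and "dec"."""
    return catalogue.rename(columns={"RA_J2000": "ra", "DEC_J2000": "dec"})

# Stand-in for a catalogue that would normally be read from a FITS table.
raw = pd.DataFrame(
    {"RA_J2000": [10.5, 11.2], "DEC_J2000": [-30.1, -29.8], "flux": [1.2, 0.7]}
)
clean = rename_coordinates(raw)
print(list(clean.columns))
```

Under the assumption above, the function would be passed as `CataData(..., catalogue_preprocessing=rename_coordinates)`.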

Contributing

Open an issue and let us know what sort of issue you are experiencing.

Open a pull request if you have added functionality or fixed a bug.

Conceptual Workflow

CataData takes in fields of images and catalogues. The catalogues are merged into one dataframe, with each entry labelled by its field name. The length of a CataData object is the length of that dataframe. Entries are indexed through the dataframe, and each sample is cut out from the corresponding image using the coordinates provided in the "RA" and "DEC" columns of the catalogue.
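
The merging step can be sketched with pandas as below. This is an illustration of the idea rather than the package's actual code; the label column name `field` and the toy catalogues are assumptions.

```python
import pandas as pd

field_names = ["A", "B"]
# Stand-ins for catalogues that would normally be read from FITS tables.
catalogues = [
    pd.DataFrame({"ra": [10.1, 10.4], "dec": [-30.0, -30.2]}),
    pd.DataFrame({"ra": [150.3], "dec": [2.1]}),
]

# Label each catalogue with its field, then stack them into one dataframe.
merged = pd.concat(
    [cat.assign(field=name) for cat, name in zip(catalogues, field_names)],
    ignore_index=True,
)

n_samples = len(merged)                  # plays the role of the dataset length
source_field = merged.loc[2, "field"]    # field whose image the third sample comes from
```

Here the merged dataframe has three rows, and the third row carries the label of field "B", so a lookup by index knows which image to cut from.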

If catalogued features are needed to manipulate the images, we recommend using an image processing wrapper around CataData objects, i.e. a function like image_postprocessing(catadata_instance, index) -> np.ndarray, which calls catadata_instance[index] and manipulates the resulting image as required before returning it.
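
A minimal sketch of such a wrapper is shown below. To keep it self-contained it uses a stand-in class instead of a real CataData object. The `df` attribute, the `peak_flux` column, and the assumption that indexing returns a single NumPy array are all hypothetical; adapt them to what your CataData instance actually exposes.

```python
import numpy as np
import pandas as pd

class FakeCataData:
    """Stand-in for a CataData object; `df` and array-valued items are assumptions."""

    def __init__(self):
        self.df = pd.DataFrame({"peak_flux": [2.0, 4.0]})
        self._images = [np.full((8, 8), 2.0), np.full((8, 8), 1.0)]

    def __len__(self):
        return len(self.df)

    def __getitem__(self, index):
        return self._images[index]

def image_postprocessing(catadata_instance, index: int) -> np.ndarray:
    """Scale a cutout by a catalogued value (here a hypothetical peak flux)."""
    image = np.asarray(catadata_instance[index], dtype=float)
    peak = catadata_instance.df.loc[index, "peak_flux"]
    return image / peak

data = FakeCataData()
normalised = image_postprocessing(data, 0)
```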

cata2data's People

Contributors

miguelcarcamov, mb010, as595, inigoval
