Giter VIP home page Giter VIP logo

elefits's People

Contributors

degauden avatar grizonnetm avatar kabasset avatar teobouvard avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar

elefits's Issues

Warnings with GCC 11.2.0 (use reference type loop)

/home/grizonnetm/projets/euclid/src/EleFits/EleFitsData/tests/src/Raster_test.cpp: In member function ‘void Raster_test::sectionning_test::test_method()’:
/home/grizonnetm/projets/euclid/src/EleFits/EleFitsData/tests/src/Raster_test.cpp:175:19: warning: loop variable ‘p’ creates a copy from type ‘const Euclid::Fits::Position<3>’ [-Wrange-loop-construct]
  175 |   for (const auto p : section3D.domain()) {
      |                   ^
/home/grizonnetm/projets/euclid/src/EleFits/EleFitsData/tests/src/Raster_test.cpp:175:19: note: use reference type to prevent copying
  175 |   for (const auto p : section3D.domain()) {
      |                   ^
      |                   &
/home/grizonnetm/projets/euclid/src/EleFits/EleFitsData/tests/src/Raster_test.cpp:182:19: warning: loop variable ‘p’ creates a copy from type ‘const Euclid::Fits::Position<2>’ [-Wrange-loop-construct]
  182 |   for (const auto p : section2D.domain()) {
      |                   ^
/home/grizonnetm/projets/euclid/src/EleFits/EleFitsData/tests/src/Raster_test.cpp:182:19: note: use reference type to prevent copying
  182 |   for (const auto p : section2D.domain()) {
      |                   ^
      |                   &
/home/grizonnetm/projets/euclid/src/EleFits/EleFitsData/tests/src/Raster_test.cpp:189:19: warning: loop variable ‘p’ creates a copy from type ‘const Euclid::Fits::Position<1>’ [-Wrange-loop-construct]
  189 |   for (const auto p : section1D.domain()) {
      |                   ^
/home/grizonnetm/projets/euclid/src/EleFits/EleFitsData/tests/src/Raster_test.cpp:189:19: note: use reference type to prevent copying
  189 |   for (const auto p : section1D.domain()) {

Row-wise bintable IOs

Create BintableRows or something similar to read bintables row by row, possibly with selection mechanism.
Being able to join two tables would be wonderful!

  • Select features
  • Design API
  • Implement!

ReadOnlyError

Implement some ReadOnlyError to be thrown as early as possible, e.g. when trying to call ImageRaster::write() which outputs a cryptic message as of today.

This needs a bit of refactoring:

  • MefFile = Hdu container + HduInfo container (index, name, version, category...) => Hdu not mutable anymore; only modifies MefFile's HduInfos
  • Hdu::editThisHdu() -> MefFile::edit(Hdu)
  • Need to intercept "protected" keyword writing to avoid even more cryptic messages or UBs!

Document concurrent file access

Several instances of a SifFile/MefFile can be created for the same file. In this case, they should all be read-only.
This should be documented.

Simplify and robustify contextualCopy()

As of today, the function is a simplified version of imcopy. Wrapped CFITSIO functions should be used wherever possible. Moreover, bad_alloc should be worked around like in imcopy.

Remove Subraster

This is merely an implementation details which brings complexity and maintenance for little added value.

Clarify ext creation methods

There is some asymmetry and some ambiguity in the extension creation methods of MefFile.
For example, initImageExt creates a data unit full of zeros, while initBintableExt creates no data unit.
Extending the current methods for insertion would mean adding an index as parameter, but would not be reflected in the method name.

Potential options are:

  • append or insert
  • image or binary table
  • header only (minimal or with user records) or with data (zeros or provided values)

Which potentially makes a lot of methods (although some of them may be less useful).

Proposed naming:

  • append or insert;
  • Image or Bintable;
  • Header (header-only) or Ext (non-empty data);

To avoid the proliferation of methods, the first parameter would be the user records as a RecordSeq, which means the RecordVec constructors should cover all needs, including an empty sequence.

template<typename T = unsinged char>
appendImageHeader(const RecordSeq& records = {}); // No data

template<typename T, N>
appendImageExt(const RecordSeq& records, Position<N> shape); // Fill with zeros, useful?

template<typename TRaster>
appendImageExt(const RecordSeq& records, const TRaster& raster); // Write raster

template<typename... TInfos>
appendBintableHeader(const RecordSeq& records = {}, TInfos... infos); // No data, possibly no column

template<typename... TInfos>
appendBintableExt(const RecordSeq& records, long rowCount, TInfos... infos); // Fill with zeros, useful?

template<typename... TColumns>
appendBintableExt(const RecordSeq& records, TColumns... columns);

Bintable row buffer iterator

Would be nice to have an incrementable object to return the next Segment/FileMemSegment corresponding to the next buffer location, i.e. move by Bintable::readBufferRows() with bound checking.

Attach keys to the strategy actions

In order to allow removing or copying actions, they should be attached a key (e.g. an encapsulated std::list::iterator, which gets not invalidated by adding or removing actions). A special value could be assigned to compression.

Clean string dispatch

Avoid double specializations like string and const string, e.g. similarly to isTuple.

Parse units

Instead of merely reading Record units as strings, parsing them as a proper object, like boost::unit::quantity, would be nice, e.g. Record::quantity<Unit>(). Yet, seems complex as I didn't find any convincing runtime quantity/UoM library, and implementing one is not as straightforward as it might seem at first sight (which might be why there is none).

One thing which would be simpler in our case than in general purpose library, is that we'd only need to check that the user-required unit is (compatible with) that in the Fits (writing is easy).

Write Releasing Guide

At least a checklist for not forgetting issues, doc publishing, quality review...

Bintable compression

Bintable compression is standardized but not recognized by viewers. Therefore, the feature is tagged as experimental but seems to be working when copying an HDU (cannot create compressed table and then write in it like for images). Yet, the feature would be useful in EleFitsCompress, at least for testing.

Question to be asked to CFITSIO devs: is there any hope to get the same level of support as for images?

Explicitely accept any SizedData where possible in ImageRaster and BintableColumns

Methods like:

void ImageRaster::readTo(TRaster& raster)
void BintableColumns::readTo(ColumnKey key, TColumn& column)

(should) only use TRaster/TColumn::value_type, raster/column.size() and raster/column.data() (to be checked) so that the declaration and doc could be changed to:

void ImageRaster::readTo(TContainer& container)
void BintableColumns::readTo(ColumnKey key, TContainer& container)

without even changing the definition!

This would explicitly allow bypassing Raster and Column, e.g.:

// Given: ImageRaster du
std::vector<float> raster(du.readSize());
du.readTo(raster);

// Given: BintableColumns du
std::vector<float> column(du.readRowCount());
du.readTo("NAME", column);

Replace Hdu::m_type with m_category

Provide as much information as possible without touching the HDU (e.g., Primary, Efited) for optimization purpose, e.g.:

hdu.type() {
  return m_categoty.type();
}

hdu.matches() {
  if (not filter.accepts(m_category)) return false;
  return filter.accepts(readCategory());
}

// To be implemented, too
HduCategory::type(); // Image or Bintable
HduCategory::overwrite(HduCategory); // Forces constrained, even if incompatible

This way, Hdu::readCategory() doesn't even touches the HDU.

/!\ ImageHdu and BintableHdu must touchThisHdu() by their own when Hdu stops doing it.

Look for SPDX

SPDX looks like a good alternative to traditional license files and notices.
Would be nice to see how to apply it.
A first task would be to replace preamble notices, which can be automated.

Implement WriteTemplateHeader action

The action would add some specified keywords from a user-provided list, e.g. TELESCOP. Other keywords could be interpreted, e.g. DATE.

Here is a tentative list of keywords:

  • ORIGIN
  • AUTHOR
  • REFERENC
  • TELESCOP
  • INSTRUME
  • DATE-OBS
  • OBSERVER
  • DATE (computable)
  • DATAMINMAX (computable)
  • TDMIN/MAXn (computable)

Support for DATE keywords

Support reading and writing of keyword records which represent dates, e.g. as std::tm*.
This includes standard date keywords.

For reference, see CFitsIO's utils.

*Fractional time not supported by std::tm => use boost?.

Add Strategy setters, inc. in MefFile constructor

First, it should be possible to set the strategy as a whole, instead of appending actions, e.g.:

f.strategy() = makeEuclidStrategy();

Then, the constructor should accept actions and strategy, e.g.:

MefFile f(filename, FileMode::Create, HCompress());

MefFile f(filename, FileMode::Create, makeEuclidStrategy());

This may be how the default CiteEleFits should be exposed in the public API.

Richer KeywordCategory

Refine keyword categories using fits_get_keyclass as:

  • Structural (inc TDIM, TUNIT, TDISP)
  • Compression
  • Scaling
  • Null
  • Range
  • HduId
  • Checksum
  • Wcs
  • ReferenceSystem
  • Comment
  • Continue
  • User

Check if all cases are handled!

Enable streaming in appendCopy()

If binary copy is not adequate (e.g. due to compression) and the raster is too big to be read at once, then it should iteratively be split and processed in halves. In this case, the Initializer would have no data, which can have border effects, e.g. PLIO may be discarded.

Move any to variant

It seems std::variant (or boost::variant) is better suited to our needs. It is safer and more efficient, given that all possible types are known. Additional services like std::visit would also be welcome to clean the current macro-based implementation.

The probable future update to C++17 would be the best opportunity.

Use CfitsioWrapper in contextualCopy

For now, the HDU copy is slightly adapted from imcopy, but relying on wrapped functions instead of raw CFITSIO would bring more robustness, reduce complexity and ease maintenance.

rowCount-based Column constructor doesn't accept int

The rowCount-based constructor only accepts longs:

VecColumn<float> col ({ "NAME", "", 3}, 10L); // OK col.rowCount() = 10
VecColumn<float> col ({ "NAME", "", 3}, 10); // FIXME col.rowCount() = 3

In the second case, the preferred overload is the forwarding constructor, which calls std::vector<foat>(10).
By the way, rowCount should probably be 4 in this case, or an error (size mismatch?) should be raised.

This behavior is so confusing that it can be considered an actual bug.

Make MefFile accessors names more standard

Here is the current list of MefFile accessors:

const T& access(long index);
const Hdu& operator[](long index);
const T& accessFirst(const std::string& name, long version = 0);
const T& access(const std::string& name, long version = 0);
const ImageHdu& primary();

For homogeneity and standardization, they could be renamed as:

const T& as(long index);
const Hdu& operator[](long index);
const T& find(const std::string& name, long version = 0);
const T& as(const std::string& name, long version = 0);
const ImageHdu& primary();

where as<T>() denotes the requirement to access the only available such HDU (analogously to RecordVec where keywords must not be duplicated), while find<T>() stops when one HDU matches.
Find could even be open to user functions.

// Given two extensions:
// - 1: named "FOO" with version 1 and no data
// - 2: named "FOO" with version 2 and some data
const auto hdu = f.as<ImageRaster>(1); // 1
const auto hdu = f.as<ImageRaster>("FOO"); // Throws
const auto hdu = f.find<ImageRaster>("FOO"); // 1
const auto hdu = f.find<ImageRaster>("FOO", 2); // 2
const auto hdu = f.find<ImageRaster>("FOO", 0, [](const auto& hdu) { return hdu.readSize() > 0; }); // 2

Update to C++17

Simplify metaprogramming dispatching, test structured bindings, replace boost libs where possible.

Run coverity and clang-tidy on EleFits

Here is the procedure to perform static analysis of Elefits project using coverity and clang-tidy.

Coverity scanner

Follow EleFits installation procedure until the make command and then run the make command using the coverity-build tool that should be downloaded on coverity website (compilation takes more time of course)

  • Configure
cmake -DCMAKE_INSTALL_PREFIX=/usr/local -DCMAKE_PREFIX_PATH=/usr/local ..

Create directory to store coverity analyser:

mkdir cov-int

Run coverity build tool and make command (not make install):

cov-build --dir cov-int make -j10

In this case build is done with using 10 threads.

  • Compress coverity analysis:
tar czvf elefits.tgz cov-int
  • Upload results to coverity server (manually):

https://scan.coverity.com/projects/elefits

It can be automated locally using coverity-submit or by integrating it in github (github actions? Travis CI?)

clang-tidy

  • Configure Elefits and generate compilation database with cmake to let clang tools figure out specific build options:
cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DCMAKE_INSTALL_PREFIX=/usr/local -DCMAKE_PREFIX_PATH=/usr/local  ..
  • Use run-clang-tidy script to run clang-tidy over all files in a compilation database:
run-clang-tidy-13 -p . -j 10 -export-fixes clang-tidy-fixes.log ../Ele*

In this case clang-tidy runs without default checks, with 10 threads in parallel and export errors/fixes to clang-tidy-fixes.log file

attached in the issue:
clang-tidy-fixes.log

TODO: Compile Elements and Elefits with clang before running clang-tidy.

Improve doxygen footer

Copyright (CNES?)
License (CCby)
Funding
Acknowledgements (EC, ESA, HEASARC, Elements, Doxygen, Doxygen Awesome, GitHub corner, Caro...)

Review doc

Following the 5.0 refactoring, many doc pages should be updated.

Standardize method names

We need something more global than #23 and #21!

Here is a list of debatable present/proposed/standard method names:

  • access: precise but not standard
  • add: standard but sometimes imprecise (see append, insert)
  • alloc: OK but is it useful?
  • append: OK
  • apply: OK (see transform)
  • as: for conversion only? avoid f.as<ImageHdu>(1)? avoid records.as<float>("EXPTIME")?
  • assign: standard but imprecise (see append, insert, write)
  • at: OK, but dedicated to bound checking, and ugly with template parameter (e.g. f.at<ImageHdu>(1))
  • cast: prefer as (e.g. record.as<float>())
  • find: OK
  • findAll: OK but is it needed (see iterFind)?
  • get: fuzzy (e.g. ̀f.get(1)`)
  • init: non standard and imprecise (see alloc and assign)
  • insert: OK
  • iter: OK but is it needed (see begin, end)?
  • iterFind: ugly but standard-ish (e.g. f.iterFind<ImageHdu>(Created))
  • reinit: same as init (see reshape)
  • remove: OK
  • reshape: OK
  • select: not in line with SQL (see where, see iterFind)
  • transform: OK but for class methods (use apply instead, e.g. transform(container) or container.apply(...))
  • update: OK except where it means "write or update" (e.g. hdu.updateName() vs. hdu.rename() or hdu.name())
  • where: standard but ugly in some contexts (e.g. f.where<ImageHdu>(Created))

Provide access to the CFITSIO fitsfile pointer

There would be no way back because the state of the MefFile cannot be rebuilt at low cost (e.g. need to recount the HDUs, lose HDU states...)

Proposal:

fitsfile* MefFile::handoverToCfitsio() {
  auto fptr = m_fptr;
  m_fptr = nullptr;
  return fptr;
}

Memory issue when writing long (long) comment for HIERARCH keyword

Internally reported by @grizonnetm

I found some strange memory issue (invalid write) when I create a record with a hierarchical keyword (more than 8 characters) and a long (very long) comment.
I don't know exactly if this is a limitation from fits or CfitsIO and I had some trouble to find a way to reproduce the issue as I did not have the same output if I run all the tests, unit test per unit test, with valgrind...
I've created a branch in el_fitsIO which update unit tests which use the checkHierarchKeywordIsReadBack function and increase the size of the comment associated to the hierarchical keyword:
https://gitlab.euclid-sgs.uk/EuclidLibs/EL_FitsIO/-/commit/25e82146e8bc35cb3608105d766965a6078bd8e2
Branch:
https://gitlab.euclid-sgs.uk/EuclidLibs/EL_FitsIO/-/tree/bug-long-comment-with-hierarch-keyword
In this branch, I've let only 1 test related to hierarchical keyword to illustrate
And I've attached the valgrind log (valgrind ./build.x86_64-conda_cos6-gcc73-o2g/bin/EL_FitsFile_RecordHdu_test).
I'll update the issue if I get more information.

Refactor HduCategory

Instead of the trits-based selection, variable-size types could be used in the form of enums wrapped in some Selector decorator which would provide the Unconstrained case and the operators.

The following enums would be added:

enum HduType { Primary, Image, Bintable };
enum ImageType { Int, Float }; // Deprecated with Compress
enum CompressionType { None, Lossless, Lossy };
enum DataBlocks { None, Single, Multiple }; // Or a long?
enum HduStatus { Untouched, Read, Edited, Created };

They could be used standalone in subsequent versions, too, e.g. Hdu::type() -> HduType.

Implement fine categorization of HDUs

readCategory only sets the following trits:

  • ImageBintable
  • PrimaryExt
  • UntouchedTouched
  • ExisitedCreated
  • ReadEdited

The following flags should be set, too:

  • MetadataData -> read the image size or bintable row count; if > 0, set Data, else Metadata
  • IntFloatImage -> in case of image HDU, read (Z)BITPIX -> if > 0, set Int, else Float
  • RawCompressedImage -> read fits_is_compressed_image

Implement Strategy

A Strategy is a collection of Action = std::function<void(const Hdu&)>.
It is indexed by some ActionTrigger, e.g. PostFileOpen, PreFileClose, PreHduInit, PostHduInit, PostFirstHduAccess (naming too rude, to be refined) and the HduCategory.
Possibly, some ActionTag (e.g. int or string) could be used to add or remove actions.

Something like this must be feasible:

Strategy s;
s.add(Compress(Plio()), ActionTrigger::PreHduInit, HduCategory::IntImage);
s.add(Compress(HCompress()), ActionTrigger::PreHduInit, HduCategory::FloatImage);
s.add(ValidateChecksums(), ActionTrigger::PostFileOpen); // default category is Any
s.add(ValidateChecksums(), ActionTrigger::PostFirstHduAccess);
s.add(UpdateChecksums(), ActionTrigger::PreFileClose, HduCategory::Edited);
s.add([](const Hdu& hdu) {
        hdu.header().updateVersion(hdu.header().readVersion() + 1);
    }, ActionTrigger::PreFileClose, HduCategory::Edited);

MefFile f("file.fits", FileMode::Create, s); // default is empty strategy

At some point, shortcuts could be introduced, e.g. s.preHduInit(Compress(Plio()), HduCategory::IntImage);.

For implementation ease, tags could also be replaced with ids, e.g.:

auto plio = s.add(Compress(Plio()), ...);
...
f.disableAction(plio);
f.initImage(...);
f.enableAction(plio);

Given that such local behaviors should be used with care, such an extra complexity would probably be acceptable.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.