hsf / documents
Repository for HSF documents (e.g. technical notes)
Before making the Licensing TN final, a few improvements to the contents are needed:
Issue dedicated to a topic started by @davidlange6 in #105:
how are people cloning this documents repository? The entire thing just to get one of the N documents?
Eventually we will all care when there are enough papers there. Should think how to organize for the future.
The repository is currently small (75 MB), but this could become a problem if we keep adding PDFs and other binary files to the repo. Should we move these files to Git LFS? (I believe LFS ships as a standard extension alongside recent Git distributions, if I'm right.)
@jouvin I see that with your last changes to a2tex.py you added a \bigskip to the output between each line of the institutions. This adds a lot of unnecessary vertical space and makes the institution list grow from 3.5 pages to 6.
Was there a good reason for this or can we revert?
P.S. I reviewed this, so I am also to blame for not noticing!
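For context, here is a minimal sketch (hypothetical, not the actual a2tex.py code) of the formatting choice at issue: joining institution lines with a plain newline versus inserting \bigskip between them.

```python
# Hypothetical illustration of the a2tex.py change discussed above;
# the institution strings are placeholders.
institutions = [
    "CERN, Geneva, Switzerland",
    "Fermilab, Batavia, IL, USA",
]

# Current behaviour: extra vertical space between entries.
with_bigskip = "\\bigskip\n".join(institutions)

# Proposed revert: plain line breaks, compact list.
compact = "\n".join(institutions)

print(with_bigskip)
print(compact)
```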
The published version of the technical note on software project best practices is evidently generated from a LaTeX version of the document. However, all that is in GitHub is the original Markdown.
Can either of @hegner or @drbenmorgan remember how we did this change or where the LaTeX is?
I would propose that after accepting the recent change in #148 we just convert the master document to LaTeX (with pandoc) and consider the Markdown version obsolete.
What do you think?
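A sketch of what the one-off conversion could look like, assuming pandoc is available; the file names here are placeholders, not the actual paths in the repo.

```python
import subprocess  # used if you uncomment the run line below

def md_to_latex_cmd(md_path, tex_path):
    """Build a pandoc invocation for a standalone Markdown -> LaTeX conversion."""
    return ["pandoc", md_path,
            "--from=markdown", "--to=latex",
            "--standalone",
            f"--output={tex_path}"]

cmd = md_to_latex_cmd("best-practices.md", "best-practices.tex")
# subprocess.run(cmd, check=True)  # uncomment to actually run pandoc
print(" ".join(cmd))
```

`--standalone` makes pandoc emit a complete document (preamble, \begin{document}, etc.) rather than a fragment, which is what you want for a master document.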
I found that the JHEP author output made a few mistakes: there was no separator between the authors' names, and the order was wrong (it should be "Forename Surname"). Getting the commas and the final "and Author X" right is quite fiddly.
There is a fix for that in #72.
One other issue, though, is that no footnotes are printed in the JHEP-styled output. Most papers like to at least identify the editors, and other authors will want their grants acknowledged.
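The comma/"and" logic referred to above can be sketched like this (a hypothetical helper for illustration, not the code from #72):

```python
def format_author_list(authors):
    """Join 'Forename Surname' strings JHEP-style:
    'A', 'A and B', 'A, B and C' (no comma before the final 'and')."""
    if not authors:
        return ""
    if len(authors) == 1:
        return authors[0]
    # Comma-separate all but the last author, then append 'and <last>'.
    return ", ".join(authors[:-1]) + " and " + authors[-1]

print(format_author_list(["Jane Roe", "John Doe", "Ann Example"]))
# -> Jane Roe, John Doe and Ann Example
```

The fiddly part is exactly the edge cases: the empty list, a single author, and two authors (no comma at all), each of which needs its own branch.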
We asked for, and got, some feedback by checking the reco-trigger author list; these changes might be of interest for the full CWP list as well:
EPFL: Institute of Physics, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
ETHZurich: ETH Zürich - Institute for Particle Physics and Astrophysics (IPA), Zürich, Switzerland
LMU: Fakultät für Physik, Ludwig-Maximilians-Universität München, München, Germany
MIT: Laboratory for Nuclear Science, Massachusetts Institute of Technology, Cambridge, MA, USA
(MIT is different from UMass)
At least some Nikhef authors are affiliated with the Vrije Universiteit, not the University of Amsterdam (YMMV here):
NIKHEF: Nikhef National Institute for Subatomic Physics and Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
UZurich: Physik-Institut, Universität Zürich, Zürich, Switzerland
In the first version of the TN, there is no explicit paragraph on license compatibility issues. @valassi provided a first draft for such a paragraph in #8. @sextonkennedy also suggested in #4 (comment) that we add a reference to:
https://en.wikipedia.org/wiki/GPL_linking_exception
as this clause has some relevance for our case since many of us use a framework where dynamic loading of libraries is a core strategy in forming our large applications.
This TN revision should also address @valassi's comment in #4 (depending on whether #8 is merged or not).
I read the note on licensing and found it helpful. However, one case is not quite clear to me, and perhaps it can be clarified in the "Recommendations" section. I imagine a common situation where a software developer is employed by a non-profit organization such as a national lab or a university and funded by a government agency like the DOE. In that case, should the organization and/or the agency be mentioned in the copyright notice as an owner, along with the actually contributing authors? Is there a common rule for that, or should it be decided on a case-by-case basis? I would be interested to know how this applies in the US.
These are the comments received by email on Sept. 5, turned into a task list for easier tracking. Be sure to update the Google Docs-based draft answer to the CSBS reviewers appropriately before marking a task done.
Introduction
Section 3.2:
line 684: The first action item for detector simulation seems out of place. While it is desirable to extend the validity of the physics modeling towards the FCC, this is not really a computing issue, nor does it impact the speed of the simulations that will be necessary for the HL-LHC, except insofar as making the modeling more accurate results in slower code. If I go to the section on current practices, this is a bit better targeted towards improving accuracy AND efficiency. My suggestion is to at least mention software performance as a goal in this bullet.
lines 715-727: it is striking that this section (and section 3.1) discusses human resources, while the following sections do not. It might be more impactful to isolate all human resource discussions to section 4.
Section 3.3:
Section 3.4:
“Scope and Challenges”: I’m sure that HEP Data Management also adheres to the FAIR principles. This is a much-used buzzword, and not using it in this context might raise the question of why. Hence, you should make a conscious decision whether or not to mention the FAIR principles.
line 1334: “quasi-real time”; perhaps better: “near real-time”
The first piece of the R&D, "enable ... to be plugged in dynamically", could do with some more specificity; the bullet is very general. Is it in conflict with the third-to-last bullet, about interacting and exchanging data?
line 1371: provide examples (incl. references) for "...emergence of new analysis tools coming from industry and open source projects..."
line 1386 ff.: This is also called “provenance”. You might want to use this term here.
lines 1476ff.: please provide references for each given example technology
line 1569: I would have expected more milestones, e.g. in the direction of uptake of the software (or at least an evaluation) mentioned above or integrating/interfacing them in existing environments
Section 3.5 (Machine Learning):
Section 3.7:
Section 3.8:
Section 3.10:
Section 3.11:
Section 3.12:
Section 3.13 (Security)
Section 4:
Section 4.1:
Section 4.2 and 4.3:
These detailed comments were preceded by the following general remarks:
This document provides the High Energy Physics community with a clear picture of the current software and computing practices, established over the last decade to enable the exploitation of the data produced by the four detectors installed at the Large Hadron Collider at CERN. It also proposes (and this is its main objective) a program of work (PoW) in various identified areas that must be carried out in order to successfully process the data that the upgraded LHC (HL-LHC) will produce in less than 10 years from now.
Reviewing the document was made more difficult due to the poor formatting of the manuscript. Therefore, some of the line numbers mentioned below might be wrong and we encourage the authors to seek a solution for the formatting problems before resubmission.
In addition, it seems that many or most of the references (judging only by the reference titles) are from within the HEP community. A broader view should be taken, i.e. references from outside the HEP community should be considered.
The proposed program of work appears to be exhaustive from a technology point of view, at least given the current HEP community's knowledge of the evolution of hardware and software that may occur during this period.

Some general points were mentioned in the referee reports. These might be addressed in the Introduction and/or Conclusion (or Section 2?):
More attention must be paid to:
- carefully taking into account the consequences for the computing facilities (Tiers 0 to 1) of any solution that may arise from the PoW and lead to extra cost; for instance, HPC or GP-GPU facilities, even if they provide more efficient (in CPU time) ways to do specific processing, are much more expensive than traditional HTC farms;
- the extra cost that may be generated by hybrid computing, especially if it is made up of commercial resources;
- not producing solutions that may lead to the setup of dedicated hardware that cannot then be shared with other experiments supported by the same funding agencies as the HL-LHC;
- involving the Tier 1s in this work at the earliest possible stage.
Even though it is well understood that this document focuses on HEP computing challenges, the proposed program does not make any reference to the European or worldwide research-computing context. It is mandatory that all this work take into account the evolution of the e-infrastructures it relies on. This may require some close interactions with bodies like EGI, OSG, EUDAT, PRACE, NRENs… This "collaborative" work must be more clearly identified within the relevant PoW, if not within a dedicated one.
Finally, the work package on Security and Authentication/Authorization Infrastructure proposes a summary of the most important actions expected to be achieved. As these issues go beyond the scope of the HEP community and depend on the various national security policies, it is of the first importance that this work package is handled from the very first stage of the proposed roadmap.