Comments (6)
There are at least two issues at play here, one in the underlying code and one in the plotting code.
When the data is grouped, get_counts() itself returns numbers in counts/channel, i.e. if several channels are grouped together, it divides the summed counts by the number of channels in the group and then assigns this number (which is smaller than the original sum) to a "grouped channel" whose value is the mid-point of the grouped channels.
According to @anetasie this is not the expected behavior.
On the other hand, the plot_data command does indeed specify counts/channel as the unit for the plot, but this issue is implying that, again, this is not the desired behavior and should be fixed.
In particular:
When the PHA is grouped and set_analysis is "channel", "counts", two things must happen:
a) get_counts().sum() should return the same number as calc_data_sum(), i.e. the same number that it would return without grouping. In other words, the number of counts should not be divided by the group width.
b) the plot should display the number of counts, not counts/channel, i.e. the sum of counts corresponding to the middle point of the channel group?
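To make point (a) concrete, here is a minimal pure-Python sketch of the arithmetic. It does not call Sherpa; the helper group_counts and the sample numbers are hypothetical, and the grouping flags are assumed to follow the OGIP convention (1 starts a group, -1 continues it).

```python
def group_counts(counts, grouping):
    """Sum per-channel counts into groups (hypothetical helper, not Sherpa API).

    grouping is assumed to follow the OGIP convention:
    1 starts a new group, -1 continues the current one.
    """
    sums, widths = [], []
    for value, flag in zip(counts, grouping):
        if flag == 1 or not sums:
            sums.append(0)
            widths.append(0)
        sums[-1] += value
        widths[-1] += 1
    return sums, widths


# Made-up data: six channels grouped as (1-3), (4-5), (6).
counts = [3, 5, 2, 7, 1, 4]
grouping = [1, -1, -1, 1, -1, 1]

sums, widths = group_counts(counts, grouping)

# What the thread says is returned today: counts divided by group width.
per_channel = [s / w for s, w in zip(sums, widths)]

print("total counts      :", sum(counts))
print("sum of group sums :", sum(sums))
print("sum of counts/chan:", sum(per_channel))
```

The group sums add up to the ungrouped total, which is the behavior requested in (a); the counts/channel values do not.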
from sherpa.
@olaurino issue (a) is still not resolved and we need to track it.
(b) is fixed.
from sherpa.
@anetasie not sure I am following. Until #86 is merged none of the issues in this ticket are resolved. #86 cannot be merged until I get approval, which I haven't had for the past three years.
from sherpa.
[I don't think this is surprising given @olaurino's comments, but I am just trying to be explicit in what we currently see with grouped and ungrouped data, to decide where we should go]
So, there are a number of potentially interlinked issues here, but we see a difference in behavior in plot("data") - and in this case also plot_data - for grouped and ungrouped PHA data sets. The following two plots each show a two-by-two grid where the data is displayed with set_analysis(xaxis, type=yaxis, factor=0) and the label in the plot gives the form xaxis-yaxis. So top left is energy-rate and bottom-right is channel-counts. The first plot is grouped data and the second is ungrouped data.
For the grouped data we always divide by the bin width here (so the y axis is always "/channel" or "/energy") whereas for the ungrouped case we have per-bin for type='rate' but unbinned for type='counts'. In some ways this makes sense, since dividing by a channel width of 1 gives you the same answer (xaxis=channel), but it's less obvious it makes sense for xaxis=energy.
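As a rough illustration of why the channel case is harmless and the energy case is not, the sketch below divides the same counts by two sets of bin widths. It is plain Python rather than Sherpa, and the bin edges and the per_width helper are invented for the example.

```python
def per_width(values, lo, hi):
    """Divide each value by its bin width (hypothetical helper)."""
    return [v / (h - l) for v, l, h in zip(values, lo, hi)]


counts = [12, 30, 18]

# Ungrouped channels: every bin has width 1.
chan_lo = [1, 2, 3]
chan_hi = [2, 3, 4]

# Made-up energy edges in keV: widths 0.5, 1.0 and 2.0.
energy_lo = [0.5, 1.0, 2.0]
energy_hi = [1.0, 2.0, 4.0]

per_channel = per_width(counts, chan_lo, chan_hi)
per_energy = per_width(counts, energy_lo, energy_hi)

print(per_channel)  # same numbers as counts
print(per_energy)   # differs from counts because the widths are not 1
```

With xaxis=channel the "per channel" and "unbinned" values coincide for ungrouped data, so the label is the only thing at stake; with xaxis=energy the two are genuinely different quantities.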
QUESTION: Are we happy with the current behavior for these plots?
For reference, the code used to generate these plots is:
from matplotlib import pyplot as plt

from sherpa.astro import ui


def load_datasets(infile, factor=0):
    """Load infile once per xaxis/yaxis combination, using ids like 'energy-rate'."""
    ui.clean()
    for xaxis in ["energy", "channel"]:
        for yaxis in ["counts", "rate"]:
            idval = "{}-{}".format(xaxis, yaxis)
            ui.load_pha(idval, infile)
            ui.set_analysis(idval, quantity=xaxis,
                            type=yaxis, factor=factor)


def make_plots(outfile):
    """Plot the four data sets in one figure and label each panel."""
    fig = plt.figure(figsize=(11, 8))
    labels = ["energy-rate", "channel-rate",
              "energy-counts", "channel-counts"]

    # Interleave the plot type and dataset id: "data", id1, "data", id2, ...
    args = [val for pair in zip(["data", "data", "data", "data"],
                                labels)
            for val in pair]
    ui.plot(*args)

    for axis, label in zip(fig.axes, labels):
        axis.text(0.5, 0.8, label, transform=axis.transAxes)

    if outfile is not None:
        plt.savefig(outfile)
        print("Created: {}".format(outfile))

    plt.close(fig)


print("Ungrouped data:")
load_datasets('sherpa-master/sherpa-test-data/sherpatest/obs1.pi')
make_plots('ylabel-ungrouped.png')

print("Grouped data:")
load_datasets('sherpa-master/sherpa-test-data/sherpatest/3c273.pi')
make_plots('ylabel-grouped.png')
from sherpa.
@DougBurke are you using the same data, 3C273, for the grouped and ungrouped displays, or are there two different sources?
from sherpa.
@DougBurke thanks for the set of current plots. The current behavior is ok. However, it also depends on the user's expectation. There is no easy way to get a channel-counts plot for grouped data that shows the group labels instead of the original channel labels.
This is missing from our standard list of plots. I can see that we may want to expand the list of default plots or just provide the data access for such a plot.
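For what it's worth, the data for such a plot could be assembled along these lines. This is only a sketch in plain Python, not an existing Sherpa call: group_numbers is a hypothetical helper, the sample data is made up, and the grouping flags are assumed to follow the OGIP convention (1 starts a group, -1 continues it).

```python
def group_numbers(grouping):
    """Map each channel to a 1-based group number (hypothetical helper)."""
    numbers = []
    current = 0
    for flag in grouping:
        if flag == 1 or current == 0:
            current += 1
        numbers.append(current)
    return numbers


# Made-up data: six channels grouped as (1-3), (4-5), (6).
counts = [3, 5, 2, 7, 1, 4]
grouping = [1, -1, -1, 1, -1, 1]

numbers = group_numbers(grouping)

# Total counts per group, keyed by group number rather than channel.
by_group = {}
for number, value in zip(numbers, counts):
    by_group[number] = by_group.get(number, 0) + value

for number, total in by_group.items():
    print("group {}: {} counts".format(number, total))
```

The x axis would then be the group number and the y axis the summed counts, with no division by the group width.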
One issue with the grouped data per channel plot is that we use the
from sherpa.