Overall, I enjoyed the JOSE paper. It was well-written and offered great insights.


JOSE Review - 221: Text Clarity about performance HOT 3 CLOSED

lebebr01 commented on July 18, 2024

JOSE Review - 221: Text Clarity


Comments (3)

lebebr01 commented on July 18, 2024

Thanks! Looks great. The added text that makes the methods' assumptions explicit is helpful; I had misread that part.


rempsyc commented on July 18, 2024

Thanks for the review @lebebr01!

pg 2, lines 43 - 46: There is a discussion of the mean and standard deviation not being robust, which is great. I was surprised to see the second part regarding these statistics assuming a Normal distribution. I agree and understand your point, but this is overstated for those learning statistics.

To be clear, in the sentence "they assume normally distributed data", "they" refers to the methods based on the mean and SD, not to the mean and SD themselves. Would you be OK with us rephrasing this sentence to specify that we are referring to the methods?
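As a rough illustration of the robustness point (a minimal Python sketch with made-up data; it is not code from the paper or from the performance package, and the function names and the threshold of 3 are assumptions), a single extreme value can inflate the mean and SD enough to hide itself from a z-score rule, while a median/MAD rule still flags it:

```python
import statistics

def zscore_outliers(values, threshold=3.0):
    """Flag values whose mean/SD z-score exceeds the (assumed) threshold."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD; itself inflated by extreme values
    return [v for v in values if abs(v - mean) / sd > threshold]

def robust_outliers(values, threshold=3.0):
    """Flag values whose median/MAD robust z-score exceeds the threshold."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    # 1.4826 makes the MAD comparable to the SD under a normal distribution.
    # Note: this sketch does not handle the case where the MAD is zero.
    scaled_mad = 1.4826 * mad
    return [v for v in values if abs(v - median) / scaled_mad > threshold]

scores = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]  # hypothetical data
print(zscore_outliers(scores))  # [] : the value 50 masks itself
print(robust_outliers(scores))  # [50]
```

The scaling constant is where the normality assumption enters: it belongs to the detection method, not to the median or MAD as descriptive statistics.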

pg 7, lines 244 - 247: I liked this example and appreciate the idea of thinking about research context for extreme values/outliers. Would it be worth adding/framing this idea into statistical terms and being very explicit about what you mean by context?

For context, the example is:

For example, if we are studying the effects of X on Y among teenagers and we have one observation from a 20-year-old, this observation might not be a statistical outlier, but it is an outlier in the context of our research, and should be discarded to allow for valid inferences.

Here, I think the point of this example is that this is an undetected error outlier: it is perhaps not detected by the statistical outlier detection methods, but it still does not belong to the theoretical or empirical distribution of interest (i.e., teenagers). So the takeaway from this paragraph is that we should not blindly rely on statistical outlier detection methods, and that we should do our due diligence to investigate error outliers that are missed by the statistical methods. I will try to clarify this paragraph, but I am not sure I can reframe it in statistical terms, since in a way we are stepping outside the statistical perspective here. I can, however, mention the distribution of interest.


rempsyc commented on July 18, 2024

Here is the revised paragraph for point 2 (updated on the JOSE branch):

We should also keep in mind that there might be error outliers that are not detected by statistical tools, but should nonetheless be found and removed. For example, if we are studying the effects of X on Y among teenagers and we have one observation from a 20-year-old, this observation might not be a statistical outlier, but it is an outlier in the context of our research, and should be discarded. We could call these observations undetected error outliers, in the sense that although they do not statistically stand out, they do not belong to the theoretical or empirical distribution of interest (e.g., teenagers). In this way, we should not blindly rely on statistical outlier detection methods; doing our due diligence to investigate undetected error outliers relative to our specific research question is also essential for valid inferences.
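The teenager example can be sketched in a few lines of Python (illustrative only; the ages are made up, the 13 to 19 definition of "teenager" and the threshold of 3 are assumptions, and none of this is code from the paper or from performance). A median/MAD rule does not flag the 20-year-old, whereas a check against the population of interest does:

```python
import statistics

def robust_z(values):
    """Absolute median/MAD robust z-scores (assumes the MAD is non-zero)."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    return [abs(v - median) / (1.4826 * mad) for v in values]

ages = [13, 14, 15, 16, 17, 18, 19, 15, 16, 20]  # hypothetical sample
TEEN_MIN, TEEN_MAX = 13, 19  # assumed definition of the population of interest

# Statistical check: nothing stands out relative to the rest of the sample.
statistical = [a for a, z in zip(ages, robust_z(ages)) if z > 3.0]

# Contextual check: does each observation belong to the target population?
contextual = [a for a in ages if not TEEN_MIN <= a <= TEEN_MAX]

print(statistical)  # []
print(contextual)   # [20]
```

The second check encodes knowledge about the research question rather than the shape of the data, which is why a statistical tool alone cannot supply it.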
