Open-source statistical package for Python and Pandas users.
Intended for researchers, data scientists, psychologists, students, and anyone else who uses Pandas and is interested in hypothesis testing. statmanager-kr aims to be a package that is "convenient to use", "uncomplicated to use", and "convenient for viewing results". The end goal of statmanager-kr is to be a simple and useful package that can be used even by people who don't know much about Python and Pandas.
Currently, the supported language settings are Korean and English.
For updates, please see the notices in the official documentation or the GitHub releases.
Please use GitHub Discussions to let me know about any questions, bugs you encounter, suggestions, and so on. Of course, you can also email the developer directly.
Available functions for making figures or graphs
P-P plot
Q-Q plot
Histogram
Histogram (cumulative)
Pointplot (within differences)
Boxplot (between-group differences)
Dependencies
pandas
statsmodels
scipy
numpy
matplotlib
seaborn
XlsxWriter
Recommendation
Using "Jupyter Notebook" is STRONGLY RECOMMENDED. (Of course, statmanager-kr works just as well in a plain Python environment.)
Hi!
Nice software you have there. I will use this issue to keep track of problems I find while using it.
the differences to Pingouin are mentioned in the paper, but I would also add them to the readme under "related software" with a brief discussion of the strengths of each
a compatible Python version should be stated. I'm running my tests with Python 3.12, the current version
clarification: "with df that have a structure of WIDE-RANGE. " => is WIDE-RANGE the same as a tidy dataframe in R?
while the error message is generally good (it tells me what I need to do), I don't understand why I need to specify an index for the DF.
Search sm.howtouse("fgiure") for the function to draw pictures and graphs! => typo in figure
in the howtouse() output it would be nice to get an example of how to actually use a method, e.g. something like sm.progress(method = 'ttest_ind', vars = 'age', group_vars = 'sex').figure() - (not for every method, just one general example to get an idea of how to use it)
if I define an id="species", I cannot use it anymore as grouping variable
if I specify more than 2 groups in ttest_ind, I get a very cryptic error AxisError: axis must be an integer, a tuple of integers, or None.
typo: Indenpendent in howtouse
if I run ttest_rel I get a KeyError: 's' if I use the wrong call syntax (which is impossible to find out from the REPL/jupyter notebook, you have to look into the actual manual).
automated testing: While the docs in Notion contain some printed comparisons against e.g. scipy, it is not clear whether these are run automatically (I didn't find the code for the documentation), or whether the author checked each one for equivalence
in a spot-check I saw that the Cohen's d calculation was implemented by the author, but I cannot confirm that a test exists for these functions.
The JOSS part "Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support" is not completed as of now (or I missed it)
I will continue later, but unit testing is really important, especially in a stats package
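To make the unit-testing request concrete, here is a minimal sketch of what an automated test for an effect-size function could look like. The `cohens_d` function below is a hypothetical stand-in written for this example (pooled-standard-deviation form for two independent samples); it is not the package's own implementation, and the function and test names are assumptions.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples, using the pooled standard deviation.

    Hypothetical stand-in for illustration; not taken from statmanager-kr.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


def test_cohens_d_known_value():
    # Hand-computed case: means 3 and 5, both sample variances 2.5,
    # so the pooled SD is sqrt(2.5) and d = -2 / sqrt(2.5).
    d = cohens_d([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
    assert math.isclose(d, -2 / math.sqrt(2.5))


def test_cohens_d_is_antisymmetric():
    # Swapping the groups should only flip the sign.
    a, b = [2.0, 4.0, 6.0], [1.0, 1.5, 5.0]
    assert math.isclose(cohens_d(a, b), -cohens_d(b, a))
```

Tests in this shape can be collected by a runner such as pytest and run on every commit, which would answer the question of whether the comparisons are checked automatically.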
I've been exploring your package and appreciate its capabilities. However, I encountered a couple of issues that I would like to bring to your attention:
The use of index_col="name" as suggested in the Quickstart guide does not work with the provided testdata/testdf.csv. It functions correctly when I switch to another variable, such as "id".
The bar plot generated by the command sm.progress(method='ttest_ind', vars='age', group_vars='sex').figure() appears misaligned. Is there a specific version of the plotting library that I should be using to ensure proper alignment?
Currently, statmanager-kr is coded so that the analysis proceeds in a prescribed manner within manager.py.
While it functions as intended and seems to be mostly bug-free, updating or fixing individual features is somewhat inconvenient.
So I plan to rework the code before adding more statistical analysis features.
As a result, updates may be delayed a bit. Thanks for your patience!
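One common way to make individual analyses easier to add or change is a small registry that maps a method name to the function implementing it. The sketch below only illustrates that general pattern. It is not how manager.py is actually written or will be refactored, and every name in it (`ANALYSES`, `register`, `run`, `mean_diff`) is hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical registry: method name -> function that performs the analysis.
ANALYSES: Dict[str, Callable[..., dict]] = {}


def register(name: str):
    """Decorator that adds an analysis function to the registry under `name`."""
    def decorator(func):
        ANALYSES[name] = func
        return func
    return decorator


@register("mean_diff")
def mean_diff(a: List[float], b: List[float]) -> dict:
    # Toy analysis used only to demonstrate the dispatch mechanism.
    return {"method": "mean_diff", "difference": sum(a) / len(a) - sum(b) / len(b)}


def run(method: str, **kwargs) -> dict:
    """Look up `method` in the registry and call it, with a readable error if unknown."""
    try:
        func = ANALYSES[method]
    except KeyError:
        known = ", ".join(sorted(ANALYSES))
        raise ValueError(f"Unknown method {method!r}. Available methods: {known}") from None
    return func(**kwargs)
```

With this layout, adding a new analysis means writing one decorated function, and a mistyped method name produces a message listing the valid options instead of a bare KeyError.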