Comments (20)
One issue submitted, one to go.
from pyinaturalist.
OK, I think it's finally good to go! The tests appear to have passed. Let me know if I'm wrong.
from pyinaturalist.
That would be great, thanks! Let me know if you run into any issues with that.
After taking a quick look, one thing I noticed about those endpoints is that both have parameters identical to GET /observations, except they don't list pagination parameters. It appears that the usual parameters (page, per_page, order_by, etc.) work just fine with those, though, so that seems to be an error in the docs.
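For illustration, here is a minimal sketch of how the pagination parameters mentioned above could be attached to a request for one of these endpoints. It only builds the URL and sends nothing. The build_url helper and the project_id value are hypothetical, and the API root is assumed to be the public v1 address.

```python
from urllib.parse import urlencode

# Assumed root of the iNaturalist v1 API
API_ROOT = "https://api.inaturalist.org/v1"


def build_url(endpoint: str, **params) -> str:
    """Build a request URL (hypothetical helper; no request is sent)."""
    # Sort the parameters so the resulting URL is deterministic
    query = urlencode(sorted(params.items()))
    return f"{API_ROOT}/{endpoint}?{query}"


# project_id=1234 is a made-up value for illustration only
url = build_url(
    "observations/observers",
    project_id=1234,
    page=2,
    per_page=30,
    order_by="species_count",
)
print(url)
```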
from pyinaturalist.
Happy holidays!
I was able to modify get_observations() and get_all_observations() to serve the GET /observations/observers and GET /observations/identifiers endpoints. Both endpoints do allow the pagination parameters, however, there appear to be bugs in each endpoint that might need to be worked out before we can implement these new functions here.
Here are the bugs that I've encountered so far:
- /observers returns pages of results that seem to overlap and that are mostly duplicates (bug documented in the /projects endpoint here and here). If we could find the "end" of the records, we could just filter out duplicates, but I can't figure out how to get to the last page. It goes way past the expected number of pages, n_pages = ceil(total_records / per_page).
- order_by works for /observers with the options observation_count (default) and species_count. The default seems to work as expected, but the latter returns {"error":"Error","status":500} when you request a page after the ~500th record.
- /identifiers returns a maximum of 500 records for (apparently) any query, even if total_results is much higher.
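As a rough sketch of the two ideas in that list: the expected page count is taken here to be the ceiling of total_records / per_page, and duplicates across overlapping pages are filtered by a record key. The user_id key and the sample pages are assumptions made for illustration, not confirmed details of the API response.

```python
import math


def expected_pages(total_records: int, per_page: int) -> int:
    """Number of pages needed to hold total_records at per_page records each."""
    return math.ceil(total_records / per_page)


def dedupe_records(pages, key="user_id"):
    """Flatten pages of records, keeping only the first record seen for each key."""
    seen = set()
    unique = []
    for page in pages:
        for record in page:
            if record[key] not in seen:
                seen.add(record[key])
                unique.append(record)
    return unique


# Made-up overlapping pages, for illustration only
pages = [
    [{"user_id": 1}, {"user_id": 2}],
    [{"user_id": 2}, {"user_id": 3}],
]
print(expected_pages(1050, 200))
print(dedupe_records(pages))
```

This only helps once the last page can be reached, which is the part that does not currently work.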
I haven't pushed my updates yet because they don't really work due to these bugs. What do you think we should do here @JWCook ?
from pyinaturalist.
Thanks for the detailed report! I should have some time to take a look at this later this week.
Since these are the same endpoints used by inaturalist.org, there must be a combination of parameters that works (like per_page=30 in the similar issue with /projects), at least as a temporary workaround.
It would be also worth creating an issue on INaturalistAPI for this so they're aware. Would you be willing to do that?
from pyinaturalist.
Yes, it would make sense that there's some parameter combo that works for the website backend...we just need to find it.
I will open an issue on github. Is it better to open 2 issues (one for each endpoint) or one bigger one, given that they aren't necessarily connected issues?
from pyinaturalist.
One issue per endpoint would be good. Thanks!
from pyinaturalist.
Well this is an interesting one. This is different from inaturalist/iNaturalistAPI#227; in that one, the offset was wrong, and each page would return a certain number of unique and non-unique results. With these endpoints, I see that each page contains only unique results until ~500 results, like you noted, and then doesn't return any at all. The issue with Elasticsearch sharding also doesn't seem to apply here.
The relevant code for the /observers endpoint is here: https://github.com/inaturalist/iNaturalistAPI/blob/main/lib/controllers/v1/observations_controller.js#L955-L1089
It looks like this works differently from the other endpoints, and contains 2 subqueries that behave differently depending on order_by. This may be a bit trickier to debug.
I think the case you found that causes an HTTP 500 server error may be the most useful here, as that is likely to produce some error logs that the iNat devs would have access to.
from pyinaturalist.
Another thing I noticed is that the observer 'leaderboards' for a given species only show 500 results; for example, Common Milkweed in North America.
So it's possible that the limit of 500 results is the intended behavior. Even so, this behavior isn't documented, and the HTTP 500 error is unexpected, so it's still worth creating an issue.
Since your use case is "how many users have made 10+ observations in project X?", 500 results may actually be enough to do what you want. Even for some of the most commonly observed species (like the milkweed example above), the 500th observer has fewer than 10 observations of that species.
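A minimal sketch of that use case, assuming each result carries an observation_count field (the field name matches the order_by option discussed earlier, but the record shape and the sample data here are made up for illustration):

```python
def count_active_observers(results, min_observations=10):
    """Count observers with at least min_observations observations."""
    return sum(
        1 for record in results
        if record.get("observation_count", 0) >= min_observations
    )


# Made-up sample records, for illustration only
sample_results = [
    {"user_id": 1, "observation_count": 42},
    {"user_id": 2, "observation_count": 10},
    {"user_id": 3, "observation_count": 9},
]
print(count_active_observers(sample_results))
```

If the 500th observer in the response already falls below the threshold, the 500-result cap does not affect the count.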
from pyinaturalist.
@willkuhn Until that issue is answered, would you like to go ahead and commit what you have? You can just add a note in the docstring that it will currently return no more than 500 results, and then set the page size to 500 if not specified:
def get_observers(..., **params):
    params.setdefault('per_page', 500)
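To make that suggestion concrete, here is a runnable sketch of the same idea. It is not the actual pyinaturalist implementation: the fetch argument and the fake_fetch stand-in are hypothetical, used only so the example runs without touching the network.

```python
from typing import Any, Callable, Dict

# Limit observed in this thread; it is not documented by the API
MAX_RESULTS = 500


def get_observers(
    fetch: Callable[[str, Dict[str, Any]], Dict[str, Any]], **params: Any
) -> Dict[str, Any]:
    """Request observers for a query.

    Note: the endpoint currently appears to return no more than 500 results.
    """
    # Only applies when the caller did not pass per_page explicitly
    params.setdefault("per_page", MAX_RESULTS)
    return fetch("observations/observers", params)


def fake_fetch(endpoint: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for a real HTTP call; echoes back its inputs."""
    return {"endpoint": endpoint, "params": params, "results": []}


response = get_observers(fake_fetch, project_id=1234)  # made-up project ID
print(response["params"])
```

Using setdefault rather than plain assignment means a caller who asks for a smaller page size still gets it.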
from pyinaturalist.
@JWCook I've got that ready. Do you want me to add it as a patch again like before?
from pyinaturalist.
Yes, go ahead and submit a pull request for it. Thanks!
from pyinaturalist.
Well, I think it's done. The coding I can do but the gitting confuses the hell out of me.
from pyinaturalist.
Yeah, git can have quite the learning curve. I'm happy to help if there are specific features or tasks in git that you'd like to learn. Or at least point you to some good resources; there are plenty of good ones out there, but also a lot of bad ones that can make it difficult to sort through them all. Are you using a git interface within an IDE (like PyCharm), a standalone UI (like GitKraken), or the git command line (my personal favorite)?
Atlassian has some tutorials that are relatively clear and straightforward. The most useful ones are in the Collaborating and Advanced Tips sections.
from pyinaturalist.
@willkuhn If your changes are done, you can submit a pull request by going to Pull Requests (from your fork) -> New Pull Request. Then select niconoe/pyinaturalist (dev) as the base repository, and the feature branch from your fork as the 'head repository':
from pyinaturalist.
If you would like some practice with the git command line, this is a good opportunity. I've pushed some more changes to dev since you started, so there is a minor merge conflict. This is a really common situation and will demonstrate several git concepts at once.
This can be fixed with either a merge or a rebase. There's a good explanation of the differences here. When dealing with your own feature branch, rebasing is usually the best option.
Rebase commands
First, add the upstream (base) repo as another 'remote' so you can get my recent changes. You can name this whatever you want, but here I'll just name it 'upstream':
git remote add upstream https://github.com/niconoe/pyinaturalist.git
Then, update your dev branch with mine:
git checkout dev # Switch to your local dev branch
git pull upstream dev # Pull in the changes from my dev branch
git push origin dev # (optional) Push those changes back to your remote dev branch (in your fork on GitHub)
You can run those commands again anytime you want to pull the latest upstream changes.
Now you're ready to rebase:
git checkout patch-1
git rebase dev
Git will now start with the current dev branch, and then stick your changes on top of it, one commit at a time. Here's a diagram from the Atlassian docs linked above (rebasing onto a master branch instead of dev):
Fixing a merge conflict
The rebase process will pause when it gets to a merge conflict and show you this message:
You are currently rebasing branch 'patch-1' on '67916d8'.
  (fix conflicts and run "git rebase --continue")
  (use "git rebase --skip" to skip this patch)
  (use "git rebase --abort" to check out the original branch)
Unmerged paths:
  (use "git restore --staged <file>..." to unstage)
  (use "git add <file>..." to mark resolution)
    both modified: test/test_node_api.py
Translation: we both modified the same part of test/test_node_api.py, and git doesn't know how to automatically merge it.
Run git diff to see the relevant lines:
diff --cc test/test_node_api.py
index 2c04856,bdeda49..0000000
mode 100644,100644..100755
--- a/test/test_node_api.py
+++ b/test/test_node_api.py
@@@ -15,11 -17,8 +17,12 @@@ from pyinaturalist.node_api import
get_controlled_terms,
get_geojson_observations,
get_observation,
++<<<<<<< HEAD
+ get_observation_histogram,
++=======
+ get_observation_identifiers,
+ get_observation_observers,
++>>>>>>> appeasing black
get_observation_species_counts,
get_observations,
get_places_autocomplete,
Translation: I added one import (above the === line), and you added two imports (below the line).
This is the easiest kind of merge conflict to resolve, since we just want to keep all three lines without modifying any of them. So open up test/test_node_api.py and remove the lines added by git (the lines with <<<, ===, >>>):
get_observation,
get_observation_histogram,
get_observation_identifiers,
get_observation_observers,
get_observation_species_counts,
(Note that there is a way to do that part automatically, but it's actually more complicated, not less!)
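For the curious, the manual edit described above amounts to something like the following sketch. This is not the automatic git mechanism alluded to; keep_both_sides is a hypothetical helper that simply drops the marker lines and keeps everything else. It would also drop any legitimate line that happens to start with one of the markers, so it is only suitable for simple cases like this one.

```python
# Prefixes of the marker lines git writes into a conflicted file
MARKERS = ("<<<<<<<", "=======", ">>>>>>>")


def keep_both_sides(text: str) -> str:
    """Remove conflict marker lines, keeping the lines from both sides."""
    kept = [line for line in text.splitlines() if not line.startswith(MARKERS)]
    return "\n".join(kept) + "\n"


conflicted = """\
    get_observation,
<<<<<<< HEAD
    get_observation_histogram,
=======
    get_observation_identifiers,
    get_observation_observers,
>>>>>>> appeasing black
    get_observation_species_counts,
"""
print(keep_both_sides(conflicted))
```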
Finish rebasing
Almost done! Now you can add your change and finish the rebase:
git add .
git rebase --continue
And finally push your changes back to GitHub:
git push --force
--force is needed there because rebase will rewrite your existing commits instead of adding new ones, so you need to explicitly tell git that you want to overwrite your previous commits.
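Here is a toy model of why the commits get rewritten. It assumes only that a commit's ID depends on its parent's ID as well as its own change. The hashing scheme below is a simplification and not git's real object format, and the commit descriptions are made up.

```python
import hashlib
from typing import List


def commit_id(parent_id: str, change: str) -> str:
    """Toy commit ID: a short hash of the parent ID plus the change."""
    return hashlib.sha1(f"{parent_id}\n{change}".encode()).hexdigest()[:7]


def build_history(base_id: str, changes: List[str]) -> List[str]:
    """Apply changes one at a time on top of base_id and return the new IDs."""
    ids = []
    parent = base_id
    for change in changes:
        parent = commit_id(parent, change)
        ids.append(parent)
    return ids


old_base = commit_id("", "dev when the feature branch was created")
new_base = commit_id(old_base, "newer upstream commit on dev")
changes = ["add observers and identifiers functions", "appeasing black"]

before_rebase = build_history(old_base, changes)
after_rebase = build_history(new_base, changes)
print(before_rebase, after_rebase)
```

The same changes end up with different IDs after the base moves, so the remote branch no longer shares history with the local one, and a plain push is rejected.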
Let me know if any of that needs more explanation.
from pyinaturalist.
@JWCook thank you so much for taking the time to write up that helpful guide! I really appreciate that! I found the conflict that you pointed out and I believe it's resolved and PRed. Please let me know if that worked or not.
from pyinaturalist.
@willkuhn Great! I don't think the PR got submitted, though. Can you try again? It will show up here after being submitted: https://github.com/niconoe/pyinaturalist/pulls
from pyinaturalist.
PR done but I forgot to run unittests locally after rebasing (I think that's the right word) so there are some failed tests.
Palm to forehead
Working on that...
from pyinaturalist.
Great! It's almost ready to merge. Just added a couple comments on your PR.
from pyinaturalist.