Comments (22)

derat commented on May 25, 2024

Almost all of the non-rewritten URLs that I see have paths of the forms /<username> or /search?q=#...:

% nitter-rss-proxy -format json -instances https://nitter.kylrth.com -user nasa 2>/dev/null | \
    grep -oP 'https?://nitter\.kylrth\.com/[^\\]*'
http://nitter.kylrth.com/chandraxray
http://nitter.kylrth.com/NASA
http://nitter.kylrth.com/BoeingSpace
http://nitter.kylrth.com/search?q=%23Starliner
http://nitter.kylrth.com/NASA_Astronauts
http://nitter.kylrth.com/Space_Station
http://nitter.kylrth.com/BoeingSpace
http://nitter.kylrth.com/search?q=%23Starliner
http://nitter.kylrth.com/Space_Station
http://nitter.kylrth.com/NASA
http://nitter.kylrth.com/search?q=%23Artemis
http://nitter.kylrth.com/NASASTEM
http://nitter.kylrth.com/search?q=%23YourPlaceInSpace
http://nitter.kylrth.com/BoeingSpace
http://nitter.kylrth.com/search?q=%23Starliner
http://nitter.kylrth.com/Space_Station
http://nitter.kylrth.com/Space_Station
http://nitter.kylrth.com/search?q=%23Artemis
http://nitter.kylrth.com/NASA_Orion
http://nitter.kylrth.com/DoNASAScience
http://nitter.kylrth.com/nasa_eyes
http://nitter.kylrth.com/pic/card_img%2F1639869985400193025%2FeVRqQkMJ%3Fformat%3Djpg%26name%3D420x420_2
http://nitter.kylrth.com/search?q=%23Artemis
http://nitter.kylrth.com/NASAArtemis
http://nitter.kylrth.com/POTUS
http://nitter.kylrth.com/csa_asc
http://nitter.kylrth.com/search?q=%23Artemis
http://nitter.kylrth.com/CNES
http://nitter.kylrth.com/JHUAPL
http://nitter.kylrth.com/search?q=%23Dragonfly
http://nitter.kylrth.com/search?q=%23AskAstrobio

It's tricky for the proxy to rewrite most of these URLs when using the https://twiiit.com redirector, since the URLs will have an arbitrary hostname belonging to the underlying Nitter instance. However, it looks like twiiit.com issues a redirect, so the proxy can probably use that to figure out which hostname it needs to look for.

from nitter-rss-proxy.

derat commented on May 25, 2024

Thanks again for reporting this and testing it!

derat commented on May 25, 2024

Sorry, I don't understand -- Nitter links should already be rewritten by default. Can you provide an example?

cxplay commented on May 25, 2024

Oh yes, you can check the Twitter user AnimalsHolbox and search its feed for the keyword "nitter". In the tweet https://twitter.com/AnimalsHolbox/status/1641392056848330754, some hashtags are not processed:

  <entry>
    <title>These cruel and useless experiments were done by #UWMadison to receive more tha…</title>
    <updated>2023-03-30T10:48:56Z</updated>
    <id>https://twitter.com/AnimalsHolbox/status/1641392056848330754</id>
    <content type="html">&lt;p&gt;These cruel and useless experiments were done by &lt;a href=&#34;https://nitter.privacydev.net/search?q=%23UWMadison&#34;&gt;#UWMadison&lt;/a&gt; to receive more than $3 million in tax money through the &lt;a href=&#34;https://nitter.privacydev.net/search?q=%23NIH&#34;&gt;#NIH&lt;/a&gt; : 💰 💰 💰 💷 💳&lt;/p&gt;&lt;br&gt;&lt;img src=&#34;https://pbs.twimg.com/ext_tw_video_thumb/928220573418717184/pu/img/hi68jyf983uMDpwe.jpg&#34; style=&#34;max-width:250px;&#34; /&gt;</content>
    <link href="https://twitter.com/AnimalsHolbox/status/1641392056848330754" rel="alternate"></link>
    <summary type="html">&lt;p&gt;These cruel and useless experiments were done by &lt;a href=&#34;https://nitter.privacydev.net/search?q=%23UWMadison&#34;&gt;#UWMadison&lt;/a&gt; to receive more than $3 million in tax money through the &lt;a href=&#34;https://nitter.privacydev.net/search?q=%23NIH&#34;&gt;#NIH&lt;/a&gt; : 💰 💰 💰 💷 💳&lt;/p&gt;&lt;br&gt;&lt;img src=&#34;https://pbs.twimg.com/ext_tw_video_thumb/928220573418717184/pu/img/hi68jyf983uMDpwe.jpg&#34; style=&#34;max-width:250px;&#34; /&gt;</summary>
    <author>
      <name>@AnimalsHolbox</name>
    </author>
  </entry>

cxplay commented on May 25, 2024

I also noticed that this may be a problem with the instance, because unprocessed Nitter links don't show up every time.
I'm not sure...
Sometimes it's a picture, sometimes it's a username or a link to another tweet referenced within a tweet, which is very strange.

derat commented on May 25, 2024

Thanks for the details.

In the account that you gave, it looks like search URLs like https://nitter.privacydev.net/search?q=%23UWMadison aren't being rewritten. The proxy doesn't currently rewrite /search URLs (you can see the list of rewrites in rewritePatterns). I'm worried that it might be hard to rewrite these in a generic way without also breaking some non-Nitter links, since /search?q=... seems like a common pattern. Maybe /search?q=#... (note the #) would be fairly safe, though.
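
A narrow pattern along those lines might look like the sketch below. The regular expression, the `rewriteHashtags` helper, and the `https://twitter.com/hashtag/...` target are illustrative assumptions and are not taken from the proxy's rewritePatterns.

```go
package main

import (
	"fmt"
	"regexp"
)

// hashtagSearch is a hypothetical pattern (not one of the proxy's actual
// rewritePatterns). It only matches search URLs whose query starts with an
// encoded '#' (%23) followed by ASCII word characters.
var hashtagSearch = regexp.MustCompile(`https?://[^/"'\s]+/search\?q=%23(\w+)`)

// rewriteHashtags replaces matching search URLs with Twitter hashtag URLs
// and leaves every other URL untouched.
func rewriteHashtags(s string) string {
	return hashtagSearch.ReplaceAllString(s, "https://twitter.com/hashtag/${1}")
}

func main() {
	// A hashtag search link like the one in the feed above is rewritten.
	fmt.Println(rewriteHashtags(`<a href="https://nitter.privacydev.net/search?q=%23UWMadison">#UWMadison</a>`))
	// An ordinary search link on some other site is left alone.
	fmt.Println(rewriteHashtags(`<a href="https://example.org/search?q=cats">cats</a>`))
}
```

Because the pattern does not check the hostname, a non-Nitter site that happens to use /search?q=%23... would still be rewritten, and hashtags made of percent-encoded non-ASCII characters would not match at all.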

Can you provide some examples of non-search URLs that also aren't being rewritten? Those might be safer to fix.

cxplay commented on May 25, 2024

Sure, but I think it uses a random instance by default, so the only way I can find examples is to use an account with a lot of valid inline links in its tweets, like nasa. As I said earlier, the problem doesn't always occur on a fixed instance, so it takes several refreshes to hit one whose links aren't handled. This way I found several instances (nitter.kylrth.com, nitter.poast.org, nitter.fdn.fr) where almost none of the links were processed correctly.

derat commented on May 25, 2024

Note that you can pass e.g. -instances https://nitter.kylrth.com to use a specific instance. Maybe some of the instances are formatting links in such a way that they aren't matched by the proxy's regular expressions.

cxplay commented on May 25, 2024

> Note that you can pass e.g. -instances https://nitter.kylrth.com to use a specific instance. Maybe some of the instances are formatting links in such a way that they aren't matched by the proxy's regular expressions.

This is indeed a solution, but it loses the support of twiiit.com, which is designed to avoid single points of failure. What about the -instances parameter: does it support specifying multiple instances to try in turn?

derat commented on May 25, 2024

Yes, you can supply multiple comma-separated instances via -instances, e.g. -instances https://n1.example.org,https://n2.example.org.

But just to be clear, I was just suggesting -instances so you could use it to provide more examples of URLs that aren't being rewritten correctly.
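
For illustration only, a comma-separated list like this could be split and validated as in the sketch below. `parseInstances` is a hypothetical helper and may differ from how the proxy actually parses its -instances flag.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseInstances (hypothetical helper) splits a comma-separated list of
// base URLs, trimming whitespace and skipping empty items such as the one
// left behind by a trailing comma.
func parseInstances(list string) ([]*url.URL, error) {
	var out []*url.URL
	for _, item := range strings.Split(list, ",") {
		item = strings.TrimSpace(item)
		if item == "" {
			continue
		}
		u, err := url.Parse(item)
		if err != nil {
			return nil, fmt.Errorf("bad instance %q: %v", item, err)
		}
		if u.Scheme == "" || u.Host == "" {
			return nil, fmt.Errorf("instance %q needs a scheme and a host", item)
		}
		out = append(out, u)
	}
	return out, nil
}

func main() {
	urls, err := parseInstances("https://n1.example.org, https://n2.example.org,")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, u := range urls {
		fmt.Println(u.Host)
	}
}
```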

cxplay commented on May 25, 2024

> Yes, you can supply multiple comma-separated instances via -instances, e.g. -instances https://n1.example.org,https://n2.example.org.
>
> But just to be clear, I was just suggesting -instances so you could use it to provide more examples of URLs that aren't being rewritten correctly.

Okay, I get it.

cxplay commented on May 25, 2024

Well, I've set up a new test instance. Besides setting the timeout parameter to 20, it passes 83 valid instances, filtered from the instance list, via the instances parameter. You can check it here: https://nitter-rss-proxy-mod-v2.fly.dev/nasa. In my own testing, almost none of the Nitter links are being handled correctly.

Instances in use:

https://nitter.lacontrevoie.fr,https://nitter.1d4.us,https://nitter.kavin.rocks,https://nitter.unixfox.eu,https://birdsite.xanny.family,https://nitter.moomoo.me,https://twitter.censors.us,https://nitter.grimneko.de,https://twitter.076.ne.jp,https://nitter.fly.dev,https://notabird.site,https://nitter.weiler.rocks,https://nitter.sethforprivacy.com,https://nitter.cutelab.space,https://nitter.nl,https://nitter.mint.lgbt,https://nitter.bus-hit.me,https://nitter.esmailelbob.xyz,https://tw.artemislena.eu,https://nitter.tiekoetter.com,https://nitter.spaceint.fr,https://nitter.privacy.com.de,https://nitter.poast.org,https://nitter.bird.froth.zone,https://nitter.dcs0.hu,https://twitter.dr460nf1r3.org,https://nitter.garudalinux.org,https://twitter.femboy.hu,https://nitter.privacydev.net,https://nitter.kylrth.com,https://nitter.foss.wtf,https://unofficialbird.com,https://nitter.projectsegfau.lt,https://nitter.eu.projectsegfau.lt,https://singapore.unofficialbird.com,https://canada.unofficialbird.com,https://india.unofficialbird.com,https://nederland.unofficialbird.com,https://uk.unofficialbird.com,https://nitter.qwik.space,https://read.whatever.social,https://nitter.rawbit.ninja,https://nitter.privacytools.io,https://nitter.sneed.network,https://n.sneed.network,https://nitter.smnz.de,https://nitter.twei.space,https://nitter.inpt.fr,https://nitter.d420.de,https://nitter.caioalonso.com,https://nitter.at,https://nitter.pw,https://nitter.nicfab.eu,https://bird.habedieeh.re,https://nitter.hostux.net,https://nitter.adminforge.de,https://nitter.platypush.tech,https://nitter.pufe.org,https://nitter.us.projectsegfau.lt,https://nitter.arcticfoxes.net,https://t.com.sb,https://nitter.kling.gg,https://nitter.ktachibana.party,https://nitter.riverside.rocks,https://ntr.odyssey346.dev,https://nitter.lunar.icu,https://twitter.moe.ngo,https://nitter.freedit.eu,https://ntr.frail.duckdns.org,https://nitter.librenode.org,https://n.opnxng.com,https://nitter.plus.st,https://nitter.in.projectsegfau.lt,
https://nitter.tux.pizza,https://t.floss.media,https://twit.hell.rodeo,https://nitter.edist.ro,https://twt.funami.tech,https://nitter.nachtalb.io,https://n.quadtr.ee,https://nitter.altgr.xyz,https://jote.lile.cl,https://nitter.one

derat commented on May 25, 2024

Please let me know if you still see non-rewritten URLs in a binary that includes 6d66c12 and 4825e4d.

cxplay commented on May 25, 2024

> Please let me know if you still see non-rewritten URLs in a binary that includes 6d66c12 and 4825e4d.

OK!

cxplay commented on May 25, 2024

Today I tested the new version of the binary and filtered out some examples that still have problems. The instances are still taken from the list above; all the rest are working fine so far!

Next I would like to describe the instances that still have problems.
Apart from instances with server errors, most of the ones that still fail are instances that reverse-proxy or redirect to other Nitter instances, for example:

notabird.site >> nitter.fly.dev
twitter.dr460nf1r3.org >> nitter.garudalinux.org
nitter.privacytools.io >> nitter.net
n.sneed.network >> nitter.sneed.network

There is also a group of instances, apparently run by the same organization, whose links cannot be rewritten directly:

nitter.in.projectsegfau.lt
nitter.projectsegfau.lt
nitter.eu.projectsegfau.lt

Therefore, I recommend replacing the default twiiit.com with the official Nitter instance (nitter.net) in the binary. After testing, not all instances from the list are suitable for use behind the proxy: some have broken RSS generation even though they work fine as Twitter front-ends (probably due to instance server issues). The official instance has been tested with rewrites, so it is a safer default than a list of uncertain quality.

derat commented on May 25, 2024

Thanks for the additional testing!

I've changed the default back to nitter.net as you recommended. That was the original default, but I moved away from it in aab0616, apparently due to it sometimes returning bogus "user not found" errors. Hopefully whatever the problem was has been fixed -- it worked the few times that I tried it just now.

One other option that I thought about was changing the proxy's code to rewrite all URLs with "nitter" appearing in the hostname. That seems like it would handle all of the instances that you listed, but I'm not sure that it's a good idea.
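
A rough sketch of what such a hostname check could look like, and of where it falls short, follows. `looksLikeNitter` is a hypothetical helper rather than code from the proxy; the hostnames are taken from the instance list earlier in this thread.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// looksLikeNitter (hypothetical heuristic) reports whether the hostname of
// rawURL contains the substring "nitter".
func looksLikeNitter(rawURL string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	return strings.Contains(strings.ToLower(u.Hostname()), "nitter")
}

func main() {
	for _, s := range []string{
		"https://nitter.projectsegfau.lt/NASA", // hostname contains "nitter"
		"https://notabird.site/NASA",           // an instance without "nitter" in its name
		"https://unofficialbird.com/NASA",      // likewise
	} {
		fmt.Printf("%-40s %v\n", s, looksLikeNitter(s))
	}
}
```

The check would also match any unrelated site that happens to have "nitter" in its hostname, which is one reason to be cautious about it.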

cxplay commented on May 25, 2024

If I understand correctly, some Nitter instances do not use custom domains containing "nitter", so there may be mismatches if the proxy relies on a single keyword match. My suggestion is to provide a list of keywords to check and replace, like the -instances parameter, and possibly to define one or more keywords for each instance separately (since there may be multiple redirects to handle).

derat commented on May 25, 2024

Hmm, I'd rather not add more configuration and code to handle weird instances. If a particular instance causes problems due to a strange setup, I think it's straightforward enough to just not pass it via the -instances flag.

cxplay commented on May 25, 2024

> Hmm, I'd rather not add more configuration and code to handle weird instances. If a particular instance causes problems due to a strange setup, I think it's straightforward enough to just not pass it via the -instances flag.

Indeed. The idea of "rewriting everything with nitter keywords" is also worth considering; perhaps open a new branch to test its feasibility? The current rewrite method already works very well, but new matching rules are worth trying too.

cxplay commented on May 25, 2024

Also, regarding the "user not found" error: I've seen it on other instances too, so maybe the proxy could detect this output and move directly on to the next configured instance?

derat commented on May 25, 2024

Many of the instances listed at https://github.com/zedeus/nitter/wiki/Instances don't have nitter in their hostnames, so I don't think there's much value in adding a rewrite pattern that will only work sometimes.

Regarding "user not found" errors, it probably makes more sense to report this as a Nitter bug (if it's not already reported). I don't want the proxy to send a bunch of extra unnecessary requests whenever someone mistypes a username or a Twitter account is deleted. (If there's some way to distinguish between bogus errors and real ones caused by nonexistent users, I'm happy to add code to move on to the next instance, though.)

cxplay commented on May 25, 2024

Well, so far the problem has been solved, congratulations!
