I used this to download a ton of my data from reddit. Figured I'd give some feedback on how I used it, what worked, and what didn't.
"[X] It looks like you are not authenticated well ..."
Sometimes I would get "[X] It looks like you are not authenticated well. [X] Please check your credentials and retry.". Upon further inspection it was not an authentication issue: it was the result of querying a link_id that returned a 403 response. As an example, link_id 8vkhv8 returns a 403 because the subreddit is now private. A better error message for that exception (or a separate except case) would probably help.
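One way to separate the two cases, sketched as a small helper (an assumption on my part about how the script could be structured; in PRAW a 403 surfaces as prawcore.exceptions.Forbidden, so that except block could call something like this with the response's status code):

```python
def classify_fetch_error(status_code):
    # Map an HTTP status to a user-facing error category so the script can
    # print "subreddit unavailable" instead of "check your credentials".
    if status_code == 401:
        return "authentication"   # genuinely bad/expired credentials
    if status_code in (403, 404):
        return "unavailable"      # private/banned subreddit or deleted post
    return "unknown"
```

The names here are hypothetical; the point is just that 403 and 401 deserve different messages.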
JSON format
I found the html format to be nice but not machine-digestible. I added a pretty botched json export alongside my html export. After "Submission downloaded", I added:
@jsonpickle.handlers.register(praw.models.reddit.submission.Submission, base=True)
class SubmissionHandler(jsonpickle.handlers.BaseHandler):
    def flatten(self, obj, data):
        return {}

@jsonpickle.handlers.register(praw.reddit.Reddit, base=True)
class RedditHandler(jsonpickle.handlers.BaseHandler):
    def flatten(self, obj, data):
        return {}

write_json(jsonpickle.encode(submission.comments[:]), submission, submission_id, now, args.output)
I also added jsonpickle as a dependency to make this work. Someone might find this useful; I liked having both exports.
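For reference, the write_json helper above is my own; a minimal sketch of what it does (the filename scheme and simplified signature here are assumptions, not RedditArchiver's actual conventions):

```python
import os

def write_json(encoded, submission_id, timestamp, output_dir):
    # Write the jsonpickle-encoded string next to the HTML export,
    # named after the submission id and the run timestamp.
    path = os.path.join(output_dir, f"{submission_id}-{timestamp}.json")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(encoded)
    return path
```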
-i take an array?
It would be nice if -i could take an array. I used jq and xargs to pipe all link_ids from my rexport dump into RedditArchiver-standalone, in case you're interested. The command was:

jq '[.submissions[].id, .saved[].id, .upvoted[].id, .comments[].link_id[3:]] | unique | .[]' ~/Seafile/archive/ExportedServiceData/reddit-apiexport/export-username-2023-06-11.json | xargs -i python ~/Seafile/projects/FORKED/RedditArchiver-standalone/RedditArchiver.py -c ./config-username.yml -i {} -o /home/username/Seafile/archive/ExportedServiceData/redditarchiver

This collects all submission ids, saved ids, upvoted ids, and the link_id from comments, and pipes them through xargs to RedditArchiver.py. It would have been nice to send the entire array to redditarchiver in one go; this worked, but it would probably be faster without spinning up the Python interpreter for every invocation.
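A sketch of how -i could accept multiple ids via argparse's nargs (an assumption about how RedditArchiver parses arguments, which I haven't checked):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("-i", "--id", dest="ids", nargs="+", required=True,
                    help="one or more submission ids to archive")

# nargs="+" collects every value after -i into a single list,
# so one invocation can cover a whole jq dump.
args = parser.parse_args(["-i", "8vkhv8", "abc123"])
# → args.ids == ["8vkhv8", "abc123"]; the download loop then runs once per id.
```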
praw.ini / config.yml
I had to make a praw.ini to use refresh_token.py. It's kind of annoying to archive multiple accounts, because now I have three praw.inis and config.ymls. It would be nice if this repo just read praw.ini for all client-specific secrets, or ran refresh_token for you. Probably more work than it's worth, though.
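One stock PRAW feature that might ease the multi-account pain (this is standard praw.ini behavior, though I haven't tried wiring it into refresh_token.py): a single praw.ini can hold one section per account, selected with praw.Reddit("section_name") in Python.

```ini
; one praw.ini, one section per account;
; select with praw.Reddit("account_alice")
[account_alice]
client_id=...
client_secret=...
refresh_token=...

[account_bob]
client_id=...
client_secret=...
refresh_token=...
```

That would at least collapse the three praw.inis into one, even if the config.ymls stay separate.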
Thanks
Thanks for the library! It really saved my weekend. No pressure to implement any of this; I just thought I'd share my pain points.