pixivutil2's Introduction

Requirements:

Capabilities:

  • Download by member_id
  • Download by image_id
  • Download by tags
  • Download from list (list.txt)
  • Download from bookmarked artists (/bookmark.php?type=user) including private/hidden bookmarks.
  • Download from bookmarked images (/bookmark.php) including private/hidden bookmarks.
  • Download from tags list (tags.txt)
  • Download new illustrations from bookmarked artist (/bookmark_new_illust.php)
  • Download by Title/Caption
  • Download by Tag and Member Id
  • Download Member Bookmark (/bookmark.php?id=)
  • Download by Group Id
  • Download from supported artists (FANBOX)
  • Download by artist/creator id (FANBOX)
  • Download by post id (FANBOX)
  • Download from followed artists (FANBOX)
  • Re-encoding of all ugoira present in folder
  • Batch Download from batch_job.json (experimental) See https://github.com/Nandaka/PixivUtil2/wiki/Using-Batch-Job-(Experimental)
  • Manage database:
    • Show all member
    • Show all downloaded images
    • Export list (member_id only)
    • Export list (detailed)
    • Export local database (image_id)
    • Show member by last downloaded date
    • Show image by image_id
    • Show member by member_id
    • Show image by member_id
    • Delete member by member_id
    • Delete image by image_id
    • Delete member and image (cascade deletion)
    • Blacklist image by image_id
    • Show all deleted member
    • Export FANBOX post list
    • Delete FANBOX download history by member_id
    • Delete FANBOX download history by post_id
    • Delete Sketch download history by member_id
    • Delete Sketch download history by post_id
    • Clean Up Database (remove db entry if downloaded file is missing)
  • Export user bookmark (member_id) to a text file.

Docker

$ docker build -t pixivutil2 .
$ docker run -it --rm \
  -v $(pwd):/workdir \
  -w /workdir \
  pixivutil2 \
  /bin/bash -c "python PixivUtil2.py"

WARNING

Overuse can lead to Pixiv blocking your IP for a few hours.

FAQs

A. Usage

Q1. How do I paste Japanese tags into the console window?
    - Click the top-left icon -> select Edit -> Paste (Ctrl-V does not work).
      If the text shows up as question marks, change the language for
      non-Unicode programs to Japanese (google it).
    - Or use an online URL encoder (http://meyerweb.com/eric/tools/dencoder/)
      and paste the encoded tag back into the console.
    - Or paste it into tags.txt and select download by tags list. Separate
      tags with a space, and put each new query on a new line.
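
If you prefer not to use an online encoder, the same percent-encoding can be produced locally. This is a minimal sketch using only the Python standard library; the helper name `encode_tag` and the sample tag are illustrative and not part of PixivUtil2.

```python
from urllib.parse import quote, unquote


def encode_tag(tag: str) -> str:
    """Percent-encode a tag as UTF-8 so it contains only ASCII characters."""
    # safe="" also encodes "/" so the whole tag is treated as one value.
    return quote(tag, safe="")


if __name__ == "__main__":
    tag = "オリジナル"  # sample tag, replace with your own
    encoded = encode_tag(tag)
    print(encoded)                   # ASCII-only text you can paste into the console
    print(unquote(encoded) == tag)   # decoding restores the original tag
```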

Q2. My password doesn't show up in the console!
    - This is normal. The program still reads it.
    - Or put it in config.ini if you are not sure.

Q3. I cannot log in to Pixiv!
    - Check your password.
    - Try logging in on the Pixiv website.
    - Try filling in the [Authentication] section of config.ini.
    - Check your date and time setting (e.g.: https://www.timeanddate.com/)
    - Disable Daylight Saving Time and try again.
    - Copy your session value from the browser:
      1. Open Firefox.
      2. Go to the Pixiv website and log in; remember to tick the [Remember Me]
         check box.
      3. Press F12 to open Developer Tools, and select the Storage tab.
      4. Click Cookies and select the pixiv.net entry.
      5. Look for the cookie named PHPSESSID.
      6. Copy its value. https://imgur.com/a/BppHOoQ
      7. Open config.ini, go to the [Authentication] section, and paste the
         value into cookie. https://imgur.com/VB2g3qn
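
The last step can also be scripted. The sketch below uses Python's standard `configparser` to set the `cookie` option in `[Authentication]`; the helper name `set_cookie` is hypothetical. It assumes config.ini parses as a plain INI file, and note that `configparser` drops comments when it writes the file back, so keep a backup of your original config.ini.

```python
import configparser
import io


def set_cookie(config_text: str, phpsessid: str) -> str:
    """Return config_text with [Authentication] cookie set to phpsessid."""
    # interpolation=None: values such as %Y in other options stay literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names' case (e.g. cookieFanbox)
    parser.read_string(config_text)
    if not parser.has_section("Authentication"):
        parser.add_section("Authentication")
    parser.set("Authentication", "cookie", phpsessid)
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = "[Authentication]\nusername = me@example.com\ncookie = \n"
    print(set_cookie(sample, "value-copied-from-browser"))
```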

Q4. PixivUtil works from a local terminal on my Linux box, but not over SSH
    with PuTTY!
    - Run export LANG=en_US.UTF-8. PuTTY does not set the locale correctly, and
      without it Python does not know which encoding to write (Thanks to nho!)
    - ... and run export PYTHONIOENCODING=utf-8, so it can create the DB and
      populate it properly (Thanks to Mailia!)
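
If you start PixivUtil2 from another Python script (for example a scheduled job over SSH), the two variables can be set for the child process only. This is a sketch; the helper name `utf8_environment` is hypothetical.

```python
import os


def utf8_environment(base=None):
    """Copy an environment mapping and force the UTF-8 settings from Q4."""
    env = dict(os.environ if base is None else base)
    env["LANG"] = "en_US.UTF-8"
    env["PYTHONIOENCODING"] = "utf-8"
    return env


# Example use (not executed here):
#   import subprocess, sys
#   subprocess.run([sys.executable, "PixivUtil2.py"], env=utf8_environment())
```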

Q5. How do I delete a member id from the database?
    - Open the application, choose Manage Database (d), then select Delete
      member by member_id.
    - Or open the database (db.sqlite) directly with an SQLite browser and use
      an SQL command to delete it.
    - If you are downloading using Download from List.txt (3), you can create
      ignore_list.txt to skip the member id.

Q6. The app doesn't download all the images! (I want to download SFW images too).
    - Pixiv only allows searching up to 1000 pages if you don't have Pixiv
      Premium.
    - Check your pixiv website settings (refer to https://goo.gl/gQi09v),
      then delete the cookie value in config.ini and retry.
    - Check the value of r18mode in config.ini. Setting it to True will only
      download R-18 images.

Q7. The app shows squares/question marks in the console output!
    - This is because Windows is not set to Japanese in the Regional Settings
      in Control Panel.
    - Since version 20161114, you need to set the console font to one with
      Unicode support (e.g. Arial Unicode, MS Gothic).

Q8. Where can I get FFmpeg? How do I enable `createwebm`?
    - Download the stable version of FFmpeg from https://www.ffmpeg.org/download.html.
    - For Windows:
      - Extract the archive to a folder.
      - Open the extracted folder and go to the `/bin` folder.
      - Copy `ffmpeg.exe` to your PixivUtil2 folder.
    - For Linux:
      - Install the package using your favorite package manager.

Q9. The downloaded images are corrupted. How do I download them again?
    - You can delete the download history by manually deleting the image id
      from the database (enter d, followed by 10).
    - Or you can set alwaysCheckFileSize = True and verifyimage = True in config.ini
      and retry the download.
      
Q10. I got this error: またはメールアドレス、パスワードが正しいかチェックしてください。
     ("Or check that your email address and password are correct.")
    - Use your email address for the username, or check your password in config.ini.

Q11. Are older Windows versions (e.g. Win7) supported?
    - You can try running from source with the latest supported Python 3.x.
      See the instructions here: https://github.com/Nandaka/PixivUtil2/wiki/IDE-Enviroment-(Windows)

B. Bugs/Source Code/Support

Q1. Where can I report bugs?
    - Please report any bug to https://github.com/Nandaka/PixivUtil2/issues.

Q2. Where can I support/donate to you?
    - You can send it to my PayPal account (nchek2000[at]gmail[dot]com).
    - Or visit https://bit.ly/PixivUtilDonation.

Q3. I want to use/modify the source code!
    - Feel free to use/modify the source code as long as you give credit to me
      and make the modified source code open.
    - If you want to add a feature or bug fix, fork the repository at
      https://github.com/Nandaka/PixivUtil2 and open a Pull Request.

Q4. I got ValueError: invalid literal for int() with base 10: '<something>'
    - Please modify _html.py from the mechanize library: search for
      'def unescape_charref(data, encoding):' and replace it with the patch at
      https://pastebin.com/5bT5HFkb.

Q5. I got '<library_name> module not found error'
    - Download the library from the source (see links from the Requirements
      section) and copy the file into your Lib\site-packages directory.
    - Or use pip install (google how to use it).

C. Log Messages

Q1: HTTPError: HTTP Error 404: Not Found
    - This is because the file doesn't exist on the Pixiv server, usually
      because there is no big version of the image for manga mode (currently the
      app tries to download the big version first, then falls back to the
      normal size if that fails; this applies only to manga mode and is normal).

Q2: Error at process_image(): (<type 'exceptions.WindowsError'>, WindowsError
    (32, 'Prosessi ei voi kayttaa tiedostoa, koska se on toisen prosessin
    kaytossa')
    - The file is being used by another process (google translate). Either you
      ran multiple instances of the Pixiv downloader from the same folder, or
      another process is locking the file/db.sqlite (usually an antivirus or
      some sync/backup application).

Q3: Error at process_image(): (<type 'exceptions.AttributeError'>,
    AttributeError ("'NoneType' object has no attribute 'find'",)
    - Usually this is because of a failed login (cookie not valid). Try changing
      your password to a simple one for testing, or copy the cookie from the browser:
      1. Open Firefox/Chrome.
      2. Log in to Pixiv.
      3. On a Pixiv page, press F12 and choose the Storage tab (Firefox), or
         right-click the leftmost part of the address bar/the (i) icon (Chrome).
      4. Click the View Cookies button.
      5. Look for the cookie named PHPSESSID.
      6. Copy its value.
      7. Open config.ini, go to the [Authentication] section, and paste the
         value into cookie.
    - Or Pixiv has changed its page layout, so the Pixiv downloader cannot
      parse the page correctly. Please tell me by posting a comment if this
      happens and include the details, such as the member/image id, dump
      html, and log file (check the application folder).

Q4: URLError: <urlopen error [Errno 11004] getaddrinfo failed>
    - Update to a version newer than pixivutil20221029.
    - This happens because the Pixiv downloader cannot resolve the address to
      download the images. Try restarting the network connection, or run
      ipconfig /flushdns to refresh the DNS cache (Windows).

Q5: Error at download_image(): (<class 'socket.timeout'>, timeout('timed out',)
    - This is because the Pixiv downloader did not receive a reply from Pixiv
      within the timeout specified in config.ini. Please retry the download
      later.

Q6: httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt
    - Set userobots = False in config.ini

Command Line Option

Run with --help for the latest information.

  -h, --help            show this help message and exit
  -s STARTACTION, --startaction=STARTACTION
                        Action you want to load your program with:
                        1 - Download by member_id
                            (required: list of member_ids separated by space
                             optional: --include_sketch to also download Pixiv Sketch)
                        2 - Download by image_id
                            (required: followed by image_ids separated by space)
                        3 - Download by tags
                            (required: tags
                             optional: --use_wildcard_tag, --sp=START_PAGE, and --ep=END_PAGE, --start_date, --end_date)
                        4 - Download from list
                            (required: -f LIST_FILE and followed with optional tag)
                        5 - Download from user bookmark
                            (optional: -p BOOKMARK_FLAG [y/n/o] for private bookmark, --sp=START_PAGE, and --ep=END_PAGE)
                        6 - Download from image bookmark
                            (required: -p BOOKMARK_FLAG [y/n/o] for private bookmark
                             optional: --sp=START_PAGE, and --ep=END_PAGE, and followed with tag)
                        7 - Download from tags list
                            (required: -f LIST_FILE,
                             optional: --sp=START_PAGE, and --ep=END_PAGE, --start_date, --end_date)
                        8 - Download new illust from bookmark
                            (optional: --sp=START_PAGE, and --ep=END_PAGE)
                        9 - Download by Title/Caption
                            (required: title/caption
                             optional: --sp=START_PAGE, and --ep=END_PAGE, --start_date, --end_date)
                        10 - Download by Tag and Member Id
                            (required: member_id, followed by tags
                             optional: --sp=START_PAGE, and --ep=END_PAGE)
                        11 - Download Member's Bookmarked Images
                            (required: followed by member_ids separated by space)
                        12 - Download by Group ID
                            (required: Group ID, limit, and process external[y/n])
                        13 - Download by Manga Series ID
                            (required: Manga Series ID separated by space
                            optional: --sp=START_PAGE, and --ep=END_PAGE)
                        f1 - Download from supported artists (FANBOX)
                            (optional: End Page)
                        f2 - Download by artist/creator id (FANBOX)
                            (required: artist(digits only)/creator ids separated by space,
                             optional: end page)
                        f3 - Download by post id (FANBOX)
                            (required: post ids, separated with space)
                        f4 - Download from followed artists (FANBOX)
                            (optional: End Page)
                        f5 - Download from custom artist list (FANBOX)
                            (optional: End page, path to list)
                        b - Batch Download from batch_job.json (experimental)
                            (optional: --bf=BATCH_FILE)
                        l - Export local database image_id/post_id
                            (required: --up=USE_PIXIV, and --uf=USE_FANBOX, and --us=USE_SKETCH)
                        e - Export online bookmark
                            (required: -p BOOKMARK_FLAG [y/n/o] for private bookmark,
                             optional: --ef=EXPORT_FILENAME)
                        m - Export online user bookmark
                            (required: member_id, optional: --ef=EXPORT_FILENAME)
                        d - Manage database
  -x, --exitwhendone    Exit programm when done.
                        (only useful when DB-Manager)
  -i, --irfanview       start IrfanView after downloading images using
                        downloaded_on_%date%.txt
  -n NUMBEROFPAGES, --numberofpages=NUMBEROFPAGES
                        temporarily overwrites numberOfPage set in config.ini
  -c [PATH], --config [PATH] provide different config.ini
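
For scripted runs, the options above can be assembled into an argument list. The sketch below builds a command for start action 1 (download by member_id) with an optional page limit. It assumes the member ids are passed as trailing positional arguments, which the help text suggests but does not spell out; check `--help` on your version. The helper name is hypothetical.

```python
import sys


def build_member_download_command(member_ids, number_of_pages=None,
                                  script="PixivUtil2.py"):
    """Build an argv list for '-s 1' (download by member_id)."""
    command = [sys.executable, script, "-s", "1"]
    if number_of_pages is not None:
        # -n temporarily overrides numberOfPage from config.ini
        command += ["-n", str(number_of_pages)]
    # Assumption: ids follow the options, separated by spaces.
    command += [str(member_id) for member_id in member_ids]
    return command


if __name__ == "__main__":
    print(build_member_download_command([123456, 654321], number_of_pages=2))
    # Pass the list to subprocess.run(...) to actually start the download.
```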

Error Codes

  • 100 = Not Logged in.
  • 1001 = User ID not exist/deleted.
  • 1002 = User Account is Suspended.
  • 1003 = Unknown Member Error.
  • 1004 = No image found.
  • 1005 = Cannot login.
  • 2001 = Unknown Error in Image Page.
  • 2002 = Not in MyPick List, Need Permission.
  • 2003 = Public works can not be viewed by the appropriate level.
  • 2004 = Image not found/already deleted.
  • 2005 = Image is disabled for under 18, check your setting page (R-18/R-18G).
  • 2006 = Unknown Image Error.
  • 9000 = Download Failed.
  • 9001 = Download Failed: Harddisk related.
  • 9002 = Download Failed: Network related.
  • 9005 = Server Error.
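
When reading logs from a script, the codes above can be kept in a small lookup table. This is only a convenience sketch built from the list in this section; the function name and the fallback text are not part of PixivUtil2.

```python
ERROR_CODES = {
    100: "Not Logged in",
    1001: "User ID not exist/deleted",
    1002: "User Account is Suspended",
    1003: "Unknown Member Error",
    1004: "No image found",
    1005: "Cannot login",
    2001: "Unknown Error in Image Page",
    2002: "Not in MyPick List, Need Permission",
    2003: "Public works can not be viewed by the appropriate level",
    2004: "Image not found/already deleted",
    2005: "Image is disabled for under 18, check your setting page (R-18/R-18G)",
    2006: "Unknown Image Error",
    9000: "Download Failed",
    9001: "Download Failed: Harddisk related",
    9002: "Download Failed: Network related",
    9005: "Server Error",
}


def describe_error(code: int) -> str:
    """Return the documented meaning of an error code."""
    # The fallback string is our own wording for codes not listed above.
    return ERROR_CODES.get(code, "Unrecognized error code")


if __name__ == "__main__":
    print(describe_error(2004))
```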

config.ini

[Authentication]

  • username

    Your pixiv username. Needed for OAuth. Please make sure the combination of username and password is valid in case of OAuth error. If you get error 103, please try changing username from pixiv ID to email address or the other way around.

  • password

    Your pixiv password, in clear text! Needed for OAuth. Please make sure the combination of username and password is valid in case of OAuth error.

  • cookie

    Your cookie for pixiv login; it is updated automatically on login. See #814 (comment) for details.

  • cookieFanbox

    Cookie for fanbox.cc, normally no need to fill in.

  • refresh_token

    OAuth refresh token, used to avoid logging in again too often. Automatically generated upon successful OAuth login.

[Pixiv]

  • numberofpage

    Number of pages to process; set to 0 to process all pages.

  • r18mode

    Only list images tagged R18, for member, member's bookmark, and search by tag. Set to True to enable.

  • r18Type

    Filter by R-18 type (R-18 or R-18G). Set r18Type to 0 = both R-18 and R-18G, 1 = only R-18, or 2 = only R-18G.

  • dateformat

    Pixiv DateTime format, leave blank to use default format (YYYY-MM-DD). Refer to http://strftime.org/ for syntax. Quick Reference:

    • %d = Day, %m = Month, %Y = Year (4 digit)
    • %H = Hour (24h), %M = Minute, %S = Seconds
  • autoAddMember

    Automatically save the member id to the db for all downloads.

  • aiDisplayFewer

    If true, AI-generated images are filtered out and not downloaded.
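
To preview what a `dateformat` value produces before putting it in config.ini, the strftime codes from the quick reference can be tried directly in Python. The helper name is illustrative; the fallback mirrors the documented default (YYYY-MM-DD) when the setting is left blank.

```python
from datetime import datetime


def format_works_date(moment: datetime, date_format: str = "") -> str:
    """Format a datetime with a strftime pattern, or YYYY-MM-DD if blank."""
    return moment.strftime(date_format or "%Y-%m-%d")


if __name__ == "__main__":
    sample = datetime(2023, 1, 5, 14, 30, 0)
    print(format_works_date(sample))                        # 2023-01-05
    print(format_works_date(sample, "%d/%m/%Y %H:%M:%S"))   # 05/01/2023 14:30:00
```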

[FANBOX]

  • filenameFormatFanboxContent

    Similar to filename format, but for files inside FANBOX posts.

  • filenameFormatFanboxCover

    Similar to filename format, but for FANBOX post cover images

  • filenameFormatFanboxInfo

    Similar to filename format, but for info dumps.

  • writeHtml

    A switch to decide whether to write FANBOX posts into HTMLs or not.

    • If set to True, article-type posts are always written into HTMLs, while non-article posts are controlled by minTextLengthForNonArticle and minImageCountForNonArticle.
    • If set to False, no post will be written into HTMLs.
    • filenameFormatFanboxInfo will be used for the filename.
    • For the HTML format, please refer to the 'HTML Format' section.
  • minTextLengthForNonArticle

    Works with minImageCountForNonArticle. When 'writeHtml' is True, a non-article post should contain text longer than this value to be written into HTML.

  • minImageCountForNonArticle

    Works with minTextLengthForNonArticle. When writeHtml is True, a non-article post should contain at least this many files/images to be written into HTML.

  • useAbsolutePathsInHtml

    Set to True to use absolute paths in HTMLs. Set to False to use relative paths.

  • downloadCoverWhenRestricted

    Set to True to download FANBOX post cover images even if they are restricted.

  • checkDBProcessHistory

    Each FANBOX post has an updated_date value, which is recorded/updated in the database after the post is processed.

    • When this is True, the value in the database is checked when processing each post. If the recorded date is no earlier than the newly retrieved date, meaning the post has not changed since it was last processed, the post is skipped.
    • When this is False, posts are processed regardless.
  • listPathFanbox

    The list file for fanbox creators. One creator per line. Doesn't support custom path.
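
The skip rule described for checkDBProcessHistory can be summarised as a small decision function. This is a sketch of the documented behaviour, not the application's code; the function and parameter names are hypothetical, and treating a missing database record as "never processed" is an assumption.

```python
from datetime import datetime
from typing import Optional


def should_skip_post(recorded: Optional[datetime], retrieved: datetime,
                     check_history: bool) -> bool:
    """Decide whether a FANBOX post can be skipped.

    recorded:  updated_date stored in the database (None if no record)
    retrieved: updated_date just fetched for the post
    """
    if not check_history:
        return False  # checkDBProcessHistory = False: always process
    if recorded is None:
        return False  # assumption: no record means never processed
    # Skip when the stored date is no earlier than the retrieved one.
    return recorded >= retrieved


if __name__ == "__main__":
    old, new = datetime(2023, 1, 1), datetime(2023, 2, 1)
    print(should_skip_post(new, new, True))   # unchanged post
    print(should_skip_post(old, new, True))   # post updated since last run
```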

[Network]

  • useproxy

    Set True to use proxy server, or False to disable it.

  • proxyaddress

    Proxy server address, use this format:

    • http://<username>:<password>@<proxy_server>:<port> or
    • socks5://<username>:<password>@<proxy_server>:<port> or
    • socks4://<username>:<password>@<proxy_server>:<port>
  • useragent

    Browser user agent to spoof. You can check it from https://www.whatismybrowser.com/detect/what-is-my-user-agent

  • userobots

    Download robots.txt for mechanize.

  • timeout

    Time to wait before giving up the connection, in seconds.

  • retry

    Number of retries.

  • retrywait

    Waiting time for each retry, in seconds.

  • downloadDelay

    Set random delay up to n seconds for each image post. Set to 0 to disable.

  • checkNewVersion

    Set to True to check for new releases on GitHub.

  • notifyBetaVersion

    Set to False to ignore beta releases.

  • openNewVersion

    Set to False to disable opening new releases in browser.

  • enableSSLVerification

    Enable SSL verification. Only set to False if you always encounter SSL errors (this disables the security check).
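
A malformed `proxyaddress` is easy to miss, so it can help to check the value before saving it. The sketch below validates a string against the three formats listed for proxyaddress using the standard library; the helper name and the example host are made up, and the application itself may accept or reject values differently.

```python
from urllib.parse import urlsplit

SUPPORTED_SCHEMES = {"http", "socks5", "socks4"}


def parse_proxy_address(address: str) -> dict:
    """Split a proxy address into its parts, raising ValueError if malformed."""
    parts = urlsplit(address)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported proxy scheme: {parts.scheme!r}")
    if not parts.hostname or parts.port is None:
        raise ValueError("proxy address needs both a server and a port")
    return {
        "scheme": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "server": parts.hostname,
        "port": parts.port,
    }


if __name__ == "__main__":
    print(parse_proxy_address("socks5://user:secret@proxy.example:1080"))
```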

[Debug]

  • logLevel

    Set log level, valid values are CRITICAL, ERROR, WARNING, INFO, DEBUG, and NOTSET

  • enableDump

    Enable HTML Dump. Set to False to disable.

  • skipDumpFilter

    Skip HTML Dump based on error code (using regex format). E.g.: 1.|2. => skip all HTML dump for error code 1xxx/2xxx.

  • dumpMediumPage

    Dump all medium pages for debugging. Set to True to enable.

  • dumpTagSearchPage

    Dump tags search page for debugging.

  • debughttp

    Print HTTP headers, useful for debugging. Set to False to disable.
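
To see which error codes a `skipDumpFilter` pattern would cover, you can try it against the codes from the Error Codes section. This sketch assumes the pattern is matched from the start of the code written as text (`re.match`); the application may apply the regex differently, so treat the result as a rough check. The helper name is hypothetical.

```python
import re


def dump_is_skipped(error_code: int, skip_filter: str) -> bool:
    """Return True if skip_filter matches the start of the error code."""
    if not skip_filter:
        return False  # an empty filter skips nothing
    return re.match(skip_filter, str(error_code)) is not None


if __name__ == "__main__":
    for code in (1001, 2004, 9000):
        print(code, dump_is_skipped(code, "1.|2."))
```

Under this assumption the example pattern `1.|2.` also matches code 100, since it only looks at the first two characters.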

[IrfanView]

  • IrfanViewPath

    Set directory where IrfanView is installed (needed to start IrfanView)

  • startIrfanView

    Set to True to start IrfanView with downloaded images when exiting pixivUtil

    • This will create download-lists
    • Be sure to set IrfanView to load Unicode-Plugin on startup when there are unicode-named files!
  • startIrfanSlide

    Set to True to start IrfanView-Slideshow with downloaded images when exiting pixivUtil.

    • This will create download-lists
    • Be sure to set IrfanView to load Unicode-Plugin on startup when there are unicode-named files!
    • Slideshow-options will be same as you have set in IrfanView before!
  • createDownloadLists

    Set to True to automatically create download-lists.

[Settings]

  • downloadlistdirectory

    Path to list.txt, also used for the download-lists needed by createDownloadLists and IrfanView handling. If left blank, download-lists are created in the pixivUtil directory.

  • uselist

    Set to True to parse list.txt. This will update the DB content from the list.txt (member_id and custom folder).

  • processfromdb

    Set True to use the member_id from the DB.

  • rootdirectory

    Your root directory for saving the images.

  • downloadavatar

    Set to True to download the member avatar as 'folder.jpg'

  • usesuppresstags

    Remove the suppressed tags from the %tags% meta for the filename. The list is taken from suppress_tags.txt, one tag per line.

  • tagsLimit

    Number of tags to be used for %tags% meta in filename. Use -1 to use all tags.

  • writeImageJSON

    Set to True to export the compact image information to a JSON file. The filename follows filename(Manga)Infoformat + .json. If you want the original info from the source, use it with writeRawJSON.

  • writeimageinfo

    Set to True to export the compact image information to a text file. The filename follows filename(Manga)Infoformat + .txt. If you want the original info from the source, use it with writeRawJSON.

  • writeRawJSON

    Set to True to export the original, untouched JSON of the image (used with writeImageJSON).

  • RawJSONFilter

    Enter the JSON keys which you want to filter out for writeRawJSON. Keys are separated by a comma.

  • includeSeriesJSON

    Set to True to export the series information to JSON. Non-series artwork doesn't have this info. The filename follows filenameSeriesJSON + .json.

  • writeImageXMP

    Set to True to export the image information to a .XMP sidecar file, this does not add XMP metadata to the image header.

  • writeImageXMPPerImage

    Set to True to export the image information to a .XMP sidecar file, one per image in the album. The data contained within the file is the same but some software requires matching file names to detect the metadata. If set to True, then writeImageXMP is ignored. Additionally, enabling this option will create a .XMP sidecar for every ugoira encoding enabled, and allow you to customise the name of each file using %image_ext%. For example, if you enable createWebp and createGif, then set your filenameInfoFormat to something like %urlFilename%.%image_ext%, then you will end up with <image ID>.gif.xmp and <image ID>.webp.xmp files created.

  • verifyimage

    Check if downloaded files are valid image or zip. Set the value to True to enable.

  • writeUrlInDescription

    Write all URLs found in the image description to a text file at the root directory. Set to True to enable. The list will be saved to the application folder as url_list_.txt.

  • stripHTMLTagsFromCaption

    Remove all HTML tags and their contents from the image caption/description when writing metadata to files. The contents of any links will be lost, so consider enabling writeUrlInDescription to retain them.

  • urlBlacklistRegex

    Used to filter out the url in the description using regular expression.

  • dbPath

    Use different database.

  • setLastModified

    Set last modified timestamp based on pixiv upload timestamp to the file.

  • useLocalTimezone

    Use local timezone in the .txt file of writeimageinfo and .XMP file of writeImageXMP.

  • defaultSketchOption

    Skip the "Include Pixiv Sketch" prompt when downloading by member_id option by using a default option. Set the value to y to always include sketches or n to exclude sketches from the download.
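
The interaction of usesuppresstags and tagsLimit can be pictured as follows. This is an illustrative sketch only: the function name is made up, and applying the suppression before the limit is an assumption, as is treating any negative limit like -1.

```python
def select_tags(tags, suppressed=(), tags_limit=-1):
    """Pick the tags that would end up in the %tags% part of a filename."""
    hidden = set(suppressed)
    # Assumption: suppressed tags are dropped first, then the limit applies.
    kept = [tag for tag in tags if tag not in hidden]
    if tags_limit >= 0:
        kept = kept[:tags_limit]
    return kept


if __name__ == "__main__":
    tags = ["original", "landscape", "sky", "clouds"]
    print(select_tags(tags, suppressed=["landscape"], tags_limit=2))
    print(select_tags(tags, suppressed=["landscape"]))  # -1 keeps all the rest
```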

[DownloadControl]

  • minFileSize

    Skip if file size is less than minFileSize, set 0 to disable.

  • maxFileSize

    Skip if file size is more than maxFileSize, set 0 to disable.

  • checkLastModified

    If the last-modified timestamp of the local file is the same as the upload date of the artwork, it logs "match" and skips processing the current image_id. Requires setlastmodified = True in config.ini to work properly.

  • alwaysCheckFileSize

    The file size is always checked. However, if this is False, overwrite is also False, and the file is recorded in the db, the current image_id is skipped. Setting this to True overrides the image_id check from the db (the image page is always fetched to check the remote size).

  • overwrite

    If True and the file size differs, the existing file is deleted (unless backupOldFile is True) and the image is downloaded again.

  • backupOldFile

    Set to True to back up the old file if the file size is different. The old file will be renamed to filename.unix-time.extension.

  • daylastupdated

    Only process a member_id if at least x days have passed since it was last checked.

  • checkUpdatedLimit

    Jump to the next member id after seeing n previously downloaded images. alwaysCheckFileSize must be set to False.

  • useblacklisttags

    Skip the image if it contains blacklisted tags. The list is taken from blacklist_tags.txt, one tag per line.

  • useblacklisttitles

    Skip the image if the title contains a blacklisted character sequence. The list is taken from blacklist_titles.txt, one sequence per line.

  • useblacklisttitlesregex

    Make the title blacklist check interpret each sequence as a regular expression.

  • dateDiff

    Process only new images within the given date difference. Set 0 to disable. Skip to next member id if in 'Download by Member', stop processing if in 'Download New Illust' mode.

  • enableInfiniteLoop

    Enable infinite loop for download by tags. Only applicable for download in descending order (newest first).

  • useBlacklistMembers

    Skip image by member id based on blacklist_members.txt in the same folder of the application.

  • downloadResized

    Download the medium size, rather than the original size.

  • skipUnknownSize

    Skip downloading if the remote size is not known when alwaysCheckFileSize is set to True.

  • enablePostProcessing

    If True, run the post-processing command for every downloaded file. Default: False.

  • postProcessingCmd

    Command to execute. Add %filename% to pass the downloaded filename. NO ERROR HANDLING AT ALL, use at your own risk.

  • extensionFilter

    Provide a |-separated list of acceptable file extensions to download, e.g. jpg|png|gif|ugoira

  • downloadBuffer

    Download buffer, in kilobytes, held before writing to disk; the default is 512 kB. You can change it based on your download speed. Mainly useful for a smoother progress bar. Usually there is no need to change this value.
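
The size and extension filters in this section amount to two simple checks. The sketch below shows one way to express them; the function names are hypothetical, the unit of the size values is not stated in this section, and treating an empty extensionFilter as "accept everything" is an assumption.

```python
def size_is_allowed(size: int, min_file_size: int = 0,
                    max_file_size: int = 0) -> bool:
    """Apply minFileSize/maxFileSize; a value of 0 disables that bound."""
    if min_file_size and size < min_file_size:
        return False
    if max_file_size and size > max_file_size:
        return False
    return True


def extension_is_allowed(filename: str, extension_filter: str = "") -> bool:
    """Check a filename against a '|'-separated list such as jpg|png|gif."""
    if not extension_filter:
        return True  # assumption: no filter means no restriction
    allowed = {ext.strip().lower()
               for ext in extension_filter.split("|") if ext.strip()}
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    return extension in allowed


if __name__ == "__main__":
    print(size_is_allowed(500, min_file_size=1000))
    print(extension_is_allowed("12345_p0.PNG", "jpg|png|gif|ugoira"))
```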

[FFmpeg]

  • ffmpeg

    ffmpeg executable path.

  • ffmpegcodec

    Codec to be used for encoding, default is using libvpx-vp9.

  • ffmpegExt

    The file extension (container format) to use for encoding. default: webm.

  • ffmpegparam

    Parameter to be used to encode webm, default: -lossless 0 -crf 15 -b 0 -vsync 0.

  • mkvcodec

    Codec to be used for encoding mkv, default is using copy.

  • mkvparam

    Parameter to be used to encode mkv, default: .

  • webpcodec

    Codec to be used for encoding webp, default is using libwebp.

  • webpparam

    Parameter to be used to encode webp, default: -lossless 0 -compression_level 5 -quality 100 -loop 0 -vsync 0.
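
To get a feel for how the codec, extension and parameter settings combine, the sketch below assembles an ffmpeg argument list from them. It is not how PixivUtil2 builds its command (that is not documented in this section); the function name, the single `-i` input and the `-y` overwrite flag are assumptions made for illustration. The defaults are the documented ffmpegcodec, ffmpegExt and ffmpegparam values.

```python
import shlex


def build_encode_command(ffmpeg: str, source: str, target_stem: str,
                         codec: str = "libvpx-vp9", ext: str = "webm",
                         param: str = "-lossless 0 -crf 15 -b 0 -vsync 0"):
    """Combine the [FFmpeg] settings into an argument list for ffmpeg."""
    return [
        ffmpeg,
        "-y",               # overwrite the output file if it exists
        "-i", source,       # assumed single input; the app's input may differ
        "-c:v", codec,
        *shlex.split(param),
        f"{target_stem}.{ext}",
    ]


if __name__ == "__main__":
    print(build_encode_command("ffmpeg", "frames.txt", "12345_ugoira"))
```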

[Ugoira]

  • writeugoirainfo

    If set to True, it will write the info of ugoira frames to a filename(Manga)Infoformat+.zip.js file. writeImageJSON contains this info as well.

  • createugoira

    If set to True, it will create a .ugoira file. This is Pixiv's own format for animated images. You can use Honeyview to see the animation.

  • createmkv

    Set to True to create an mkv file (video format). The default setting is lossless (no encoding); it packs the images into the container. Very large file size. Requires createUgoira = True and the ffmpeg executable.

  • createwebm

    Set to True to create a webm file (video format). The default encoding setting is lossy but high quality, with the smallest file size. Requires createUgoira = True and the ffmpeg executable.

  • createwebp

    Set to True to create a webp file (image format). The default encoding setting is lossy but high quality, with a smaller file size. Requires createUgoira = True and the ffmpeg executable.

  • creategif

    Set to True to convert the ugoira file to gif. The default encoding setting is lossy with moderate quality and a smaller file size. Requires createUgoira = True and the ffmpeg executable.

  • createapng

    Set to True to convert the ugoira file to animated png. The default encoding setting is lossless, with a very large file size. Requires createUgoira = True and the ffmpeg executable.

  • deleteugoira

    Set to True to delete the created .ugoira after conversion.

  • deleteZipFile

    If set to True, it will delete the original .zip (i.e. the actual image) file. Only active if createUgoira = True.

[Filename]

  • filenameformat

    The format for the filename. Reserved/illegal characters are replaced with an underscore '_', and repeated spaces are trimmed to a single space. The filename (+full path) is trimmed to the first 250 characters (Windows limitation). Refer to Filename Format Syntax for the available formats.

  • filenamemangaformat

    Similar to filename format, but for manga pages.

  • filenameinfoformat

    Similar to filename format, but for info dumps.

  • filenameSeriesJSON

    Similar to filename format, but for series JSON dumps.

  • avatarNameFormat

    Similar to filename format, but for the avatar image. Not all formats are available.

  • backgroundNameFormat

    Similar to filename format, but for the background image. Not all formats are available.

  • tagsseparator

    Separator for each tag in filename, put %space% for space and %ideo_space% for ideographic space (U+3000).

  • createmangadir

    Create a directory if the imageMode is manga. The directory is created by splitting the image_id by the '_pxx' pattern. This setting depends on the %urlFilename% format.

  • usetagsasdir

    Append the query tags in tagslist.txt to the root directory as save folder.

  • urlDumpFilename

    Define the dump filename, use python strftime() format. Default value is 'url_list_%Y%m%d'

  • filenameFormatSketch

    Similar to filename format, but for Pixiv Sketch.

  • customBadChars

    For sanitizing filenames with custom rules. Supports regular expressions. For detailed syntax, please refer to 'Bad chars' section.

  • customCleanUpRe

    TODO.

Filename Format Syntax

Available for filenameFormat, filenameMangaFormat, avatarNameFormat, filenameInfoFormat, filenameFormatFanboxCover, filenameFormatFanboxContent and filenameFormatFanboxInfo:

-> %member_token%
   Member token, might change.
-> %member_id%
   Member id, in number.
-> %artist%
   Artist name, might change too.
-> %urlFilename%
   The actual filename stored in server without the file extensions.
-> %date%
   Current date in YYYYMMDD format.
-> %date_fmt{format}%
   Current date using custom format.
   Use Python string format notation, refer: https://goo.gl/3UiMAb
   e.g. %date_fmt{%Y-%m-%d}%
-> %image_ext%
   The image's file extension (jpg, png, etc.), the "." is not included.
   The correct file extension is already appended to the end of all files.
   This is available if you want to add more, or want to add the image's file extension to info files etc.
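
To make the placeholder syntax concrete, here is a minimal sketch of how %name% tokens could be expanded. It is an assumption-laden illustration, not the actual PixivUtil2 code: expand_format and the sample values are made up, and tokens that take an argument, such as %date_fmt{...}%, are deliberately not handled.

```python
import re

def expand_format(fmt, values):
    """Replace each %name% token with its value; unknown tokens are left untouched."""
    def substitute(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))
    return re.sub(r"%([A-Za-z0-9_-]+)%", substitute, fmt)

example = expand_format(
    "%member_id% (%member_token%) %urlFilename%",
    {"member_id": 123456, "member_token": "sample_token", "urlFilename": "29126463_p0"},
)
print(example)  # 123456 (sample_token) 29126463_p0
```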

Available for filenameFormat and filenameMangaFormat:

-> %image_id%
   Image id, in number. (Post id for FANBOX and sketches)
-> %title%
   Image title, usually in Japanese characters.
-> %tags%
   Image tags, usually in Japanese characters. (not implemented for FANBOX yet)
-> %works_date%
   Works date, complete with time.
-> %works_date_only%
   Only the works date.
-> %works_date_fmt{<format>}%
   Works date using a custom format.
   Use Python string format notation, refer: https://goo.gl/3UiMAb
   e.g. %works_date_fmt{%Y-%m-%d}%
-> %works_res%
   Image resolution; contains the page count if the work is a manga.
-> %works_tools%
   Tools used for the image.
-> %R-18%
   Append R-18/R-18G based on image tag, can be used for creating directory
   by appending directory separator, e.g.: %R-18%\%image_id%.
-> %page_big%
   for manga mode, add big in the filename.
-> %page_index%
   for manga mode, add page number with 0-index. It will auto-pad with 0 based on the total count.
-> %page_number%
   for manga mode, add page number with 1-index. It will auto-pad with 0 based on the total count.
-> %bookmark%
   for bookmark mode, add 'Bookmarks' string.
-> %original_member_id%
   for bookmark mode, put original member id.
-> %original_member_token%
   for bookmark mode, put original member token.
-> %original_artist%
   for bookmark mode, put original artist name.
-> %searchTags%
   for download by tags and bookmarked images, put searched tags.
-> %bookmark_count%
   Bookmark count, will have overhead except on download by tags.
-> %image_response_count%
   Image response count, will have overhead except on download by tags.
-> %manga_series_order%
   the order in the manga series.
-> %manga_series_id%
   original manga series id.
-> %manga_series_title%
   original manga series title, different from work title.
-> %AI%
   Add 'AI' for AI-generated images (aiType==2).
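
The zero-padding described for %page_index% and %page_number% can be sketched as follows. The exact padding rule used by PixivUtil2 is not spelled out here, so this assumes the width equals the number of digits in the total page count; pad_page is a hypothetical name.

```python
def pad_page(position, total_pages):
    """Zero-pad position to the digit count of total_pages (assumed rule)."""
    width = len(str(total_pages))
    return str(position).zfill(width)

total = 12
page_index = [pad_page(i, total) for i in range(total)]       # 0-based, like %page_index%
page_number = [pad_page(i + 1, total) for i in range(total)]  # 1-based, like %page_number%
print(page_index[0], page_number[0], page_number[-1])  # 00 01 12
```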

Specific to Pixiv Sketch (option 1 if Pixiv Sketch is included, s1, and s2):

-> %sketch_member_id%
   Pixiv Sketch artist id, might be different from Pixiv's artist id.

Specific for Fanbox:

-> %fanbox_name%
   Fanbox name, might be different from Pixiv's artist name.
   Useful if the artist is suspended from Pixiv and there is no record in the DB, to avoid interruption.

list.txt Format

  • This file should be built in the following way (white space will be trimmed), see example:
member_id1 directory1
member_id2 directory2
  ...
#comment - lines starting with # will be ignored
  • member_id = in number only

  • directory = path to download-directory for member_id

    • %root%\directory will save the folder in the rootFolder specified in config.ini
    • \directory will save the folder in the root of your PixivUtil-drive
    • C:\directory will save the folder in drive C: (change to any other drive as you wish)
    • .\directory will save the folder in same directory as PixivUtil2.exe
    • directory-path can end with \ or not
  • Examples for list:

### START EXAMPLE LIST####
# this is a comment line, lines starting with # will be ignored
# here is the first member:
123456
# you can see, the line has only the member id
# usually I use it the following way:
#
# username (so I can recognize it ;) )
123456
#
# next 2 lines contain a special folder for this member
123456 .\test
123456 ".\test"
# now all images from member no. 123456 will be saved in directory "test" in the
# same directory as PixivUtil2
# as you can see you can use it with "" or without ;)
#
# next will be stored at the same partition as PixivUtil, but the directory is
# located in root-part of it
123456 \test
123456 "\test"
# this will lead to "C:\test" when pixivUtil is located on "C:\"
#
# next line uses complete path to store the files
123456 F:\new Folder\test
123456 "F:\new Folder\test"
# this lets you set the folder anywhere on your partitions
#
123456 %root%\special folder
123456 "%root%\special folder"
# this will set the download location to "special folder" in your rootDirectory
# given in config
http://www.pixiv.net/member.php?id=123456
http://www.pixiv.net/member_illust.php?id=123456
# also support url format.
### END EXAMPLE LIST####
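
The rules above can be sketched as a small line parser. This is only an illustration under stated assumptions: parse_list_line is a hypothetical name, the first space is taken as the separator between member id and directory, and the real parser in PixivUtil2 may treat edge cases differently.

```python
import re

def parse_list_line(line):
    """Return (member_id, directory or None); None for blank, comment or unrecognized lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    if line.startswith("http"):
        # URL form: take the numeric id= query parameter.
        match = re.search(r"[?&]id=(\d+)", line)
        return (int(match.group(1)), None) if match else None
    member_id, _, directory = line.partition(" ")
    if not member_id.isdigit():
        return None
    # The directory may be wrapped in double quotes and may contain spaces.
    directory = directory.strip().strip('"')
    return int(member_id), directory or None

print(parse_list_line('123456 "F:\\new Folder\\test"'))
print(parse_list_line("http://www.pixiv.net/member.php?id=123456"))
```

Lines that do not start with a numeric member id are skipped rather than raising an error, which is a design choice of this sketch.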

tags.txt Format

  • This file will be used as source for Download from tags list (7)
  • Separate tags with space, ensure to set Use Wildcard to 'y'.
  • Each line will be treated as one search.
  • Save the files with UTF-8 encoding.

suppress_tags.txt Format

  • This file is used for suppressing the tags from being used in %tags%.
  • If matches, the tags will be removed from filename.
  • Each line is one tag only.
  • Save the files with UTF-8 encoding

blacklist_tags.txt Format

  • This file is used for tag blacklist checking for downloading image.
  • If matches, the image will be skipped.
  • Each line is one tag only.
  • Save the files with UTF-8 encoding
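
As a rough sketch of how suppress_tags.txt and blacklist_tags.txt could be applied: the helpers below are hypothetical, and they assume a tag "matches" only when it is exactly equal to a line in the file, which the text above does not specify.

```python
def load_tag_file(path):
    """Read one tag per line from a UTF-8 file, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}

def filter_tags(tags, suppressed):
    """Drop suppressed tags, keeping the original order (for %tags%)."""
    return [tag for tag in tags if tag not in suppressed]

def is_blacklisted(tags, blacklist):
    """True if any tag of the image is on the blacklist, meaning the image is skipped."""
    return any(tag in blacklist for tag in tags)

image_tags = ["landscape", "sketch", "original"]
print(filter_tags(image_tags, {"sketch"}))        # ['landscape', 'original']
print(is_blacklisted(image_tags, {"original"}))   # True
```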

blacklist_members.txt Format

  • similar to list.txt, but without custom folder.

HTML Format

  • A simple default format will be used when no 'template.html' is provided.
  • Urls originally in the post will be overwritten with local paths.
  • Currently available syntaxes are:
-> %coverImage%
   A 'div' tag with its 'class' set to 'cover', and a child 'img' tag with 
   the url to the cover image as its 'src' attribute.
-> %coverImageUrl%
   Simply the url to the cover image in clear text.
-> %artistName%
   Same as %artist% in 'Filename Format Syntax' in clear text.
-> %imageTitle%
   Title of the post in clear text.
-> %worksDate%
   Published date of the post in clear text.
-> %body_text(article)%
   This works for article type posts only.
   A 'div' tag with its 'class' set to 'article', and the post's content,
   which is already formatted HTML if the post is article, as its inner text.
-> %images(non-article)%
   This works for non-article type posts only.
   A 'div' tag with its 'class' set to 'non-article images', and 'a' tags
   of all files in the post as its children tokens.
   For each 'a' tag, its 'href' would be url to the file, and the inner text
   would be an 'img' tag with its 'src' set to the url to the file if the
   file's extension is 'jpg', 'jpeg', 'png' or 'bmp'. Otherwise the inner text
   would simply be the url to the file.
-> %text(non-article)%
   This works for non-article type posts only.
   A 'div' tag with its 'class' set to 'non-article text' and all paragraphs
   of text put in 'p' tags as its children tokens.
  • If there is a 'div' tag with 'main' in its 'class' in the template, 'article' or 'non-article' would be appended to its 'class' depending on the type of the post.
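
A minimal sketch of how a template with these tokens might be filled in. Everything here is illustrative: the template text, the render function and the sample values are invented, values are inserted verbatim without escaping, and the 'main' class handling described in the last bullet is not implemented.

```python
TEMPLATE = """<html>
<body>
<div class="main">
<h1>%imageTitle%</h1>
<p>%artistName% / %worksDate%</p>
%coverImage%
</div>
</body>
</html>"""

def render(template, artist_name, image_title, works_date, cover_url):
    """Fill the clear-text tokens and build the %coverImage% div by hand."""
    replacements = {
        "%coverImage%": '<div class="cover"><img src="{0}"/></div>'.format(cover_url),
        "%coverImageUrl%": cover_url,
        "%artistName%": artist_name,
        "%imageTitle%": image_title,
        "%worksDate%": works_date,
    }
    for token, value in replacements.items():
        template = template.replace(token, value)
    return template

page = render(TEMPLATE, "Some Artist", "Some Title", "2024-01-31", "cover.jpg")
print(page)
```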

Bad chars

  • Originally for removing single bad chars for use between different OSs.
  • Now also supports strings and regular expressions.
  • The value set in option customBadChars would be parsed from left to right.
  • Currently available syntaxes are:
-> %replace<default>(your_default_replace_with)%
   Use this syntax to define default value to replace with.
   If this syntax gets used multiple times in the option value, the first value would be used.
   If this value is not set, "_" would be used.
-> %pattern<your_group_name>(your_pattern)%
-> %replace<your_group_name>(your_replace_with)%
   Use these two syntaxes to set groups of rules. Supports regular expression.
   You should not use "default" as group names, otherwise the first replace would
   be parsed as default value to replace with, while the others would be ignored.
   Groups with no "pattern" would be ignored.
   Groups with no "replace" use default value.
   If multiple "pattern"s or "replace"s share the same group name, the last value set
   would be used.
  • Chars/string not wrapped with syntaxes above would be considered single chars to be replaced with global replacement char/string, "_" if unset.
  • When the configuration is written to file, customBadChars is replaced with the parsed valid value. Single chars are placed first, followed by %replace<default>(your_default_replace_with)%, and then each group.
  • Examples:
# If you just want to replace some single chars with "_"
\@[]
# If you want to replace them with "@":
\@[]%replace<default>(@)%
# If you want to replace certain words:
# This example would first replace all "maze" with "labyrinth",
# then all "labyrinth" with "nevermind"
%pattern<1>(maze)%%replace<1>(labyrinth)%%pattern<2>(labyrinth)%%replace<2>(nevermind)%
# If you want to replace characters within certain unicode range,
# then remove all continuous "_"s with a single "_":
%pattern<unicode>([\U0001d400-\U0001ffff])%%pattern<1>(_+)%%replace<1>(_)%
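
The effect of the examples above can be sketched with plain re.sub calls. This does not parse the customBadChars syntax itself; it assumes the value has already been split into single chars and (pattern, replacement) groups, and that single chars are handled before the groups. The sanitize function is hypothetical.

```python
import re

def sanitize(name, single_chars, groups, default="_"):
    """Replace single bad chars, then apply each (pattern, replacement) group in order."""
    for char in single_chars:
        name = name.replace(char, default)
    for pattern, replacement in groups:
        # A group without its own replacement falls back to the default value.
        name = re.sub(pattern, default if replacement is None else replacement, name)
    return name

# Single chars replaced with "_"
print(sanitize("a\\b@c[d]", "\\@[]", []))  # a_b_c_d_
# Two groups applied one after the other
print(sanitize("a maze", "", [("maze", "labyrinth"), ("labyrinth", "nevermind")]))  # a nevermind
# Unicode range replaced with "_", then runs of "_" collapsed
print(sanitize("x\U0001d400\U0001d401y", "", [("[\U0001d400-\U0001ffff]", None), ("_+", "_")]))  # x_y
```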

Development

PixivUtil2 possesses a robust test suite. To run it, you need pytest:

pip install --user pytest

pytest -v ./test_*

Credits/Contributor

** If I forget someone, please send me a pull request with the commit/merge id.

License Agreement

See LICENSE.

pixivutil2's People

Contributors

anton-evseev, awiebe, baa14453, bluerthanever, byjtje, cokemine, denden047, dragontamer8740, fireattack, hamuko, hi117, jwshields, markhuang3310, nandaka, newuserha, nhorus, nixxquality, patrickl546, pixtrix, prototype27, qazmlpok, rachmadaniharyono, split-n, terryble2, toyem, whinette, wmjdgla, woky, yukihoaa, yzaoui

pixivutil2's Issues

selecting “5 – Download from user bookmark” and also exporting online bookmarks every entry is doubled

Firstly, thanks for this awesome time saving program. I really appreciate it.

I did notice something though, when selecting “5 – Download from user bookmark” and also exporting online bookmarks every entry is doubled. So instead of ending up with let’s say 100 bookmarks, it doubles to 200. Not quite sure if this is a new issue or not. Obviously it increases download times and also hits the Pixiv servers twice which isn’t a good thing.

For anyone else who is having the same problem… I solved it by exporting the online bookmarks, copy-pasting the list into http://textop.us/Lines-tools/Delete-Duplicate-Lines to remove duplicates, and then using the “uselist” config setting for Pixiv Downloader instead. Just thought I’d let you and everyone else know, so we can all connect to Pixiv responsibly and minimize our bandwidth consumption.

http://nandaka.wordpress.com/2012/05/19/pixiv-downloader-20120519/#comment-1799
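
The same clean-up can also be done locally instead of through a website. A minimal sketch, assuming the exported list has one entry per line; dedupe_lines is a made-up name, not part of PixivUtil2.

```python
def dedupe_lines(lines):
    """Drop blank and repeated lines, keeping the first occurrence in the original order."""
    seen = set()
    unique = []
    for line in lines:
        key = line.strip()
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique

exported = ["171007", "806502", "171007", "2704736", "806502"]
print(dedupe_lines(exported))  # ['171007', '806502', '2704736']
```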

Have a problem in Download from list (list.txt)

in win8 64 cmd

PixivDownloader2 version 20140325
https://nandaka.wordpress.com/tag/pixiv-downloader/
Reading I:\pixivutil\config.ini ...
done.
Creating database... done.
Only process member where day last updated >= 7
Using Username: aixinggecheng
logging in with saved cookie
Trying to log with saved cookie
done.
PixivDownloader2 version 20140325
https://nandaka.wordpress.com/tag/pixiv-downloader/
1. Download by member_id
2. Download by image_id
3. Download by tags
4. Download from list
5. Download from online user bookmark
6. Download from online image bookmark
7. Download from tags list
8. Download new illust from bookmark
9. Download by Title/Caption
10. Download by Tag and Member Id
11. Download Member Bookmark
12. Download by Group Id
------------------------
d. Manage database
e. Export online bookmark
r. Reload config.ini
p. Print config.ini
x. Exit
Input: 4
Processing from database.
Select only last 7 days.
Found 0 items.
PixivDownloader2 version 20140325
https://nandaka.wordpress.com/tag/pixiv-downloader/
1. Download by member_id
2. Download by image_id
3. Download by tags
4. Download from list
5. Download from online user bookmark
6. Download from online image bookmark
7. Download from tags list
8. Download new illust from bookmark
9. Download by Title/Caption
10. Download by Tag and Member Id
11. Download Member Bookmark
12. Download by Group Id
------------------------
d. Manage database
e. Export online bookmark
r. Reload config.ini
p. Print config.ini
x. Exit
Input:4 I:\pixivutil\list.txt
PixivDownloader2 version 20140325
https://nandaka.wordpress.com/tag/pixiv-downloader/
1. Download by member_id
2. Download by image_id
3. Download by tags
4. Download from list
5. Download from online user bookmark
6. Download from online image bookmark
7. Download from tags list
8. Download new illust from bookmark
9. Download by Title/Caption
10. Download by Tag and Member Id
11. Download Member Bookmark
12. Download by Group Id
------------------------
d. Manage database
e. Export online bookmark
r. Reload config.ini
p. Print config.ini
x. Exit

I get it;

log

2014-04-06 02:18:54,934 - PixivUtil20140325 - INFO - Starting...
2014-04-06 02:18:54,937 - PixivUtil20140325 - INFO - Setting log level to: DEBUG
2014-04-06 02:18:54,938 - PixivUtil20140325 - INFO - No default cookie jar available, creating... 
2014-04-06 02:18:54,944 - PixivUtil20140325 - INFO - Only process member where day last updated >= 7
2014-04-06 02:18:54,944 - PixivUtil20140325 - INFO - Using Username: aixinggecheng
2014-04-06 02:18:54,944 - PixivUtil20140325 - INFO - logging in with saved cookie
2014-04-06 02:18:54,944 - PixivUtil20140325 - INFO - Trying to log with saved cookie
2014-04-06 02:18:58,305 - PixivUtil20140325 - INFO - Logged in using cookie
2014-04-06 02:19:04,210 - PixivUtil20140325 - INFO - Export Bookmark mode.

What should I do ?

I used the exe to export list.txt,
like this:

Export date: 2014-04-06 02:19:18.592000

171007
806502
2704736
2789630
1043550
3030723
2539243
393713
5770091
68864

END-OF-FILE

But when I use it, it doesn't work,
so I tried changing the list
like this:
1721007
1721007 .\test
It doesn't work either.

Add ability to move files and detect moved files

At the moment, if you move files in the database (move your folder somewhere else) and change the storage path in your config, the script assumes everything is deleted.

It'd be useful to have the script detect a change in the storage path from the config file and scan the new directory for the old files. Or have a command under the database section where you can change the path and have the script move everything.

Download by member id - downloads only few images and says it sees only 1 page of 2

http://www.pixiv.net/member_illust.php?id=20352

It downloads 10 images and says there is only 1 page but on website it says "31件" images and two pages

Log:
2013-01-19 00:13:46,079 - PixivUtil20121215 - INFO - ###############################################################
2013-01-19 00:13:46,079 - PixivUtil20121215 - INFO - Starting...
2013-01-19 00:13:46,079 - PixivUtil20121215 - INFO - Using Username: *******
2013-01-19 00:13:46,079 - PixivUtil20121215 - INFO - logging in with saved cookie
2013-01-19 00:13:46,079 - PixivUtil20121215 - INFO - Trying to log with saved cookie
2013-01-19 00:13:49,502 - PixivUtil20121215 - INFO - Logged in using cookie
2013-01-19 00:13:51,611 - PixivUtil20121215 - INFO - Member id mode.
2013-01-19 00:13:58,266 - PixivUtil20121215 - INFO - Processing Member Id: 20352
2013-01-19 00:13:58,282 - PixivUtil20121215 - INFO - Member Url: http://www.pixiv.net/member_illust.php?id=20352&p=1
2013-01-19 00:14:00,766 - PixivUtil20121215 - INFO - Member_id: 20352 complete, last image_id: 3095802

window direct log:
PixivDownloader2 version 20121215
https://nandaka.wordpress.com/tag/pixiv-downloader/
Reading V:\Program Files\PixivD\config.ini ...
done.
Creating database... done.
Using Username: *******
logging in with saved cookie
Trying to log with saved cookie
done.
PixivDownloader2 version 20121215
https://nandaka.wordpress.com/tag/pixiv-downloader/

1. Download by member_id
2. Download by image_id
3. Download by tags
4. Download from list
5. Download from online user bookmark
6. Download from online image bookmark
7. Download from tags list
8. Download new illust from bookmark
9. Download by Title/Caption
10. Download by Tag and Member Id
11. Download Member Bookmark
d. Manage database
e. Export online bookmark
x. Exit
Input: 1
Member id: 20352
Start Page (default=1):
End Page (default=0, 0 for no limit):
Processing Member Id: 20352
Reading V:\Program Files\PixivD\config.ini ...
done.
Page 1
Member Url: http://www.pixiv.net/member_illust.php?id=20352&p=1
Member Name : ???
Member Avatar: http://i1.pixiv.net/img03/profile/tateha/5646840.jpg
Member Token : tateha
#1
Already downloaded: 29155508
#1
Already downloaded: 29022590
#1
Already downloaded: 26879813
#1
Already downloaded: 20830865
#1
Already downloaded: 19781678
#1
Already downloaded: 16862443
#1
Already downloaded: 8160208
#1
Already downloaded: 6427420
#1
Already downloaded: 3362027
#1
Already downloaded: 3095802
Last Page
Done.

cfg file:
[Settings]
proxyaddress =
useproxy = False
useragent = Mozilla/5.0 (X11; U; Unix i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1
debughttp = False
userobots = False
filenameformat = %member_id% (%member_token%)%urlFilename% - %title%
filenamemangaformat = %member_id% (%member_token%)%image_id%%urlFilename%
timeout = 60
uselist = False
processfromdb = False
overwrite = False
tagsseparator = ,
daylastupdated = 7
rootdirectory = C:\DL Image Packs
retry = 3
retrywait = 5
createdownloadlists = False
downloadlistdirectory = .
irfanviewpath = C:\Program Files\IrfanView
startirfanview = False
startirfanslide = False
alwayscheckfilesize = False
checkupdatedlimit = 0
downloadavatar = False
createmangadir = False
usetagsasdir = False
useblacklisttags = False
usesuppresstags = False
tagslimit = -1
writeimageinfo = False

[Pixiv]
numberofpage = 0
r18mode = False

*Moved issue to right repository, sry ^_^

It looks like the image list comes from both page 1 and page 2 if you compare it with Pixiv... Maybe the server returns the wrong number of images?

Yes images are both from page 1 and 2 but just not close to all of em, how do I know if it returns wrong number of images? And why would it do so for me and not for you? What can be done?

EDIT: I logged in manually with the account I made specially for pixivdownloader and opened http://www.pixiv.net/member_illust.php?id=20352, and yes, it shows 10 images total =_= - wtf is this? How? Do they restrict an account's right to view all images after some "inactivity" time or something? Do you know? Cuz I wouldn't want to store my main Pixiv account credentials in a plain text file on my hard drive T_T

HTTP Error 403: request disallowed by robots.txt

Can't download anything T_T
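
For context, the "request disallowed by robots.txt" message is raised on the client side: mechanize consults the site's robots.txt and refuses to send a request that the file disallows (the RobotExclusionError seen in the traceback). The standard library's urllib.robotparser performs the same kind of check, sketched below. The robots.txt content and the user agent name are invented for illustration and are not Pixiv's actual file.

```python
from urllib.robotparser import RobotFileParser

def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Hypothetical robots.txt that blocks one directory for every client.
sample = "User-agent: *\nDisallow: /img44/\n"
print(allowed(sample, "ExampleDownloader", "http://i2.pixiv.net/img44/img/believer_a/29126463.png"))  # False
print(allowed(sample, "ExampleDownloader", "http://i2.pixiv.net/other/1.png"))  # True
```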

PixivDownloader2 version 20120806
https://nandaka.wordpress.com/tag/pixiv-downloader/
Reading V:\Program Files\PixivD\config.ini ...
done.
Creating database... done.
Only process member where day last updated >= 7
Using Username: test56
logging in with saved cookie
Trying to log with saved cookie
Cookie already expired/invalid.
Log in using form.
done.
new cookie value: 22d476541f275bad092a260a60f9f6f8
Writing config file... done.
PixivDownloader2 version 20120806
https://nandaka.wordpress.com/tag/pixiv-downloader/

1. Download by member_id
2. Download by image_id
3. Download by tags
4. Download from list
5. Download from online user bookmark
6. Download from online image bookmark
7. Download from tags list
8. Download new illust from bookmark
9. Download by Title/Caption
10. Download by Tag and Member Id
d. Manage database
e. Export online bookmark
x. Exit
Input: 1
Member id: 1471757
Start Page (default=1):
End Page (default=0, 0 for no limit):
Processing Member Id: 1471757
Reading V:\Program Files\PixivD\config.ini ...
done.
Page 1
Member Name : ??????????????
Member Avatar: http://i2.pixiv.net/img44/profile/believer_a/4859407.png
Member Token : believer_a
#1

Processing Image Id: 29126463
Title: ??????????
Tags : ?????? ??????????? ?????????????? ???? ????????? ???? ???????
Mode : big
Image URL : http://i2.pixiv.net/img44/img/believer_a/29126463.png
Filename : C:\DL Image Packs\1471757 (believer_a)\29126463.png
HTTP Error 403: request disallowed by robots.txt
403
1 2 3 4
HTTP Error 403: request disallowed by robots.txt
403
1 2 3 4
HTTP Error 403: request disallowed by robots.txt
403
1 2 3 4
HTTP Error 403: request disallowed by robots.txt
403
Traceback (most recent call last):
File "PixivUtil2.py", line 672, in processImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 122, in downloadImage
File "mechanize_mechanize.pyc", line 203, in open
File "mechanize_mechanize.pyc", line 255, in _mech_open
httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt
Error at processImage(): (<class 'mechanize._response.httperror_seek_wrapper'>,
<httperror_seek_wrapper (mechanize._http.RobotExclusionError instance) at 0x1649
c38 whose wrapped object = <closeable_response at 0x16e2940 whose fp = <cStringI
O.StringI object at 0x016FE908>>>, <traceback object at 0x016FC378>)
Dumping html to: Error Medium Page for image 29126463.html
Cannot dump page for image_id: 29126463
Stuff happened, trying again after 2 second ( 1 )
local variable 'parseBigImage' referenced before assignment
Processing Image Id: 29126463
Title: ??????????
Tags : ?????? ??????????? ?????????????? ???? ????????? ???? ???????
Mode : big
Image URL : http://i2.pixiv.net/img44/img/believer_a/29126463.png
Filename : C:\DL Image Packs\1471757 (believer_a)\29126463.png
HTTP Error 403: request disallowed by robots.txt
403
1 2 3 4
HTTP Error 403: request disallowed by robots.txt
403
1 2 3 4
HTTP Error 403: request disallowed by robots.txt
403
1 2 3 4
HTTP Error 403: request disallowed by robots.txt
403
Traceback (most recent call last):
File "PixivUtil2.py", line 672, in processImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 122, in downloadImage
File "mechanize_mechanize.pyc", line 203, in open
File "mechanize_mechanize.pyc", line 255, in _mech_open
httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt
Error at processImage(): (<class 'mechanize._response.httperror_seek_wrapper'>,
<httperror_seek_wrapper (mechanize._http.RobotExclusionError instance) at 0x1720
d88 whose wrapped object = <closeable_response at 0x17d2a30 whose fp = <cStringI
O.StringI object at 0x017E5F98>>>, <traceback object at 0x017D7AD0>)
Dumping html to: Error Medium Page for image 29126463.html
Cannot dump page for image_id: 29126463
Stuff happened, trying again after 2 second ( 2 )
local variable 'parseBigImage' referenced before assignment
Processing Image Id: 29126463
Title: ??????????
Tags : ?????? ??????????? ?????????????? ???? ????????? ???? ???????
Mode : big
Image URL : http://i2.pixiv.net/img44/img/believer_a/29126463.png
Filename : C:\DL Image Packs\1471757 (believer_a)\29126463.png
HTTP Error 403: request disallowed by robots.txt
403
1 2 3 4
HTTP Error 403: request disallowed by robots.txt
403
1 2 3 4
HTTP Error 403: request disallowed by robots.txt
403
1 2 3 4
HTTP Error 403: request disallowed by robots.txt
403
Traceback (most recent call last):
File "PixivUtil2.py", line 672, in processImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 122, in downloadImage
File "mechanize_mechanize.pyc", line 203, in open
File "mechanize_mechanize.pyc", line 255, in _mech_open
httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt
Error at processImage(): (<class 'mechanize._response.httperror_seek_wrapper'>,
<httperror_seek_wrapper (mechanize._http.RobotExclusionError instance) at 0x1817
df8 whose wrapped object = <closeable_response at 0x184d878 whose fp = <cStringI
O.StringI object at 0x018C5CB0>>>, <traceback object at 0x01851AD0>)
Dumping html to: Error Medium Page for image 29126463.html
Cannot dump page for image_id: 29126463
Stuff happened, trying again after 2 second ( 3 )
local variable 'parseBigImage' referenced before assignment
Processing Image Id: 29126463
Title: ??????????
Tags : ?????? ??????????? ?????????????? ???? ????????? ???? ???????
Mode : big
Image URL : http://i2.pixiv.net/img44/img/believer_a/29126463.png
Filename : C:\DL Image Packs\1471757 (believer_a)\29126463.png
HTTP Error 403: request disallowed by robots.txt
403
1 2 3 4
HTTP Error 403: request disallowed by robots.txt
403
1 2 3 4
HTTP Error 403: request disallowed by robots.txt
403
1 2 3 4
HTTP Error 403: request disallowed by robots.txt
403
Traceback (most recent call last):
File "PixivUtil2.py", line 672, in processImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 122, in downloadImage
File "mechanize_mechanize.pyc", line 203, in open
File "mechanize_mechanize.pyc", line 255, in _mech_open
httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt
Error at processImage(): (<class 'mechanize._response.httperror_seek_wrapper'>,
<httperror_seek_wrapper (mechanize._http.RobotExclusionError instance) at 0x18db
e68 whose wrapped object = <closeable_response at 0x1971738 whose fp = <cStringI
O.StringI object at 0x0197BDB8>>>, <traceback object at 0x01974878>)
Dumping html to: Error Medium Page for image 29126463.html
Cannot dump page for image_id: 29126463
Stuff happened, trying again after 2 second ( 4 )
local variable 'parseBigImage' referenced before assignment
Processing Image Id: 29126463
Title: ??????????
Tags : ?????? ??????????? ?????????????? ???? ????????? ???? ???????
Mode : big
Image URL : http://i2.pixiv.net/img44/img/believer_a/29126463.png
Filename : C:\DL Image Packs\1471757 (believer_a)\29126463.png
HTTP Error 403: request disallowed by robots.txt
403
1 2 3 4
HTTP Error 403: request disallowed by robots.txt
403
1 2 3 4
HTTP Error 403: request disallowed by robots.txt
403
1 2 3 4
HTTP Error 403: request disallowed by robots.txt
403
Traceback (most recent call last):
File "PixivUtil2.py", line 672, in processImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 122, in downloadImage
File "mechanize_mechanize.pyc", line 203, in open
File "mechanize_mechanize.pyc", line 255, in _mech_open
httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt
Error at processImage(): (<class 'mechanize._response.httperror_seek_wrapper'>,
<httperror_seek_wrapper (mechanize._http.RobotExclusionError instance) at 0x19a1
ed8 whose wrapped object = <closeable_response at 0x1a52ad0 whose fp = <cStringI
O.StringI object at 0x01A6AE90>>>, <traceback object at 0x018E2DA0>)
Dumping html to: Error Medium Page for image 29126463.html
Cannot dump page for image_id: 29126463
Giving up image_id: 29126463
PixivDownloader2 version 20120806
https://nandaka.wordpress.com/tag/pixiv-downloader/

1. Download by member_id
2. Download by image_id
3. Download by tags
4. Download from list
5. Download from online user bookmark
6. Download from online image bookmark
7. Download from tags list
8. Download new illust from bookmark
9. Download by Title/Caption
10. Download by Tag and Member Id
d. Manage database
e. Export online bookmark
x. Exit
Input:

and logfile:

2012-08-09 17:17:47,533 - PixivUtil20120806 - INFO - ###############################################################
2012-08-09 17:17:47,549 - PixivUtil20120806 - INFO - Starting...
2012-08-09 17:17:47,690 - PixivUtil20120806 - INFO - Only process member where day last updated >= 7
2012-08-09 17:17:47,690 - PixivUtil20120806 - INFO - Using Username: test56
2012-08-09 17:17:47,690 - PixivUtil20120806 - INFO - logging in with saved cookie
2012-08-09 17:17:47,690 - PixivUtil20120806 - INFO - Trying to log with saved cookie
2012-08-09 17:17:51,611 - PixivUtil20120806 - INFO - Cookie already expired/invalid.
2012-08-09 17:17:51,611 - PixivUtil20120806 - INFO - Log in using form.
2012-08-09 17:17:57,503 - PixivUtil20120806 - INFO - Logged in
2012-08-09 17:18:38,331 - PixivUtil20120806 - INFO - Member id mode.
2012-08-09 17:18:43,924 - PixivUtil20120806 - INFO - Processing Member Id: 1471757
2012-08-09 17:18:51,299 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:18:55,908 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:19:01,753 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:19:06,378 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:19:06,378 - PixivUtil20120806 - ERROR - Error at processImage(): (<class 'mechanize._response.httperror_seek_wrapper'>, <httperror_seek_wrapper (mechanize._http.RobotExclusionError instance) at 0x1649c38 whose wrapped object = <closeable_response at 0x16e2940 whose fp = <cStringIO.StringI object at 0x016FE908>>>, <traceback object at 0x016FC378>)
2012-08-09 17:19:06,378 - PixivUtil20120806 - ERROR - Error at processImage(): 29126463
Traceback (most recent call last):
File "PixivUtil2.py", line 672, in processImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 122, in downloadImage
File "mechanize_mechanize.pyc", line 203, in open
File "mechanize_mechanize.pyc", line 255, in _mech_open
httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt
2012-08-09 17:19:06,378 - PixivUtil20120806 - ERROR - Dumping html to: Error Medium Page for image 29126463.html
2012-08-09 17:19:06,378 - PixivUtil20120806 - ERROR - Cannot dump page for image_id: 29126463
2012-08-09 17:19:12,815 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:19:17,440 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:19:22,065 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:19:26,690 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:19:26,690 - PixivUtil20120806 - ERROR - Error at processImage(): (<class 'mechanize._response.httperror_seek_wrapper'>, <httperror_seek_wrapper (mechanize._http.RobotExclusionError instance) at 0x1720d88 whose wrapped object = <closeable_response at 0x17d2a30 whose fp = <cStringIO.StringI object at 0x017E5F98>>>, <traceback object at 0x017D7AD0>)
2012-08-09 17:19:26,690 - PixivUtil20120806 - ERROR - Error at processImage(): 29126463
Traceback (most recent call last):
File "PixivUtil2.py", line 672, in processImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 122, in downloadImage
File "mechanize_mechanize.pyc", line 203, in open
File "mechanize_mechanize.pyc", line 255, in _mech_open
httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt
2012-08-09 17:19:26,690 - PixivUtil20120806 - ERROR - Dumping html to: Error Medium Page for image 29126463.html
2012-08-09 17:19:26,690 - PixivUtil20120806 - ERROR - Cannot dump page for image_id: 29126463
2012-08-09 17:19:33,283 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:19:37,908 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:19:47,096 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:19:51,721 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:19:51,721 - PixivUtil20120806 - ERROR - Error at processImage(): (<class 'mechanize._response.httperror_seek_wrapper'>, <httperror_seek_wrapper (mechanize._http.RobotExclusionError instance) at 0x1817df8 whose wrapped object = <closeable_response at 0x184d878 whose fp = <cStringIO.StringI object at 0x018C5CB0>>>, <traceback object at 0x01851AD0>)
2012-08-09 17:19:51,721 - PixivUtil20120806 - ERROR - Error at processImage(): 29126463
Traceback (most recent call last):
File "PixivUtil2.py", line 672, in processImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 122, in downloadImage
File "mechanize_mechanize.pyc", line 203, in open
File "mechanize_mechanize.pyc", line 255, in _mech_open
httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt
2012-08-09 17:19:51,721 - PixivUtil20120806 - ERROR - Dumping html to: Error Medium Page for image 29126463.html
2012-08-09 17:19:51,721 - PixivUtil20120806 - ERROR - Cannot dump page for image_id: 29126463
2012-08-09 17:19:58,206 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:20:02,815 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:20:07,424 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:20:12,033 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:20:12,033 - PixivUtil20120806 - ERROR - Error at processImage(): (<class 'mechanize._response.httperror_seek_wrapper'>, <httperror_seek_wrapper (mechanize._http.RobotExclusionError instance) at 0x18dbe68 whose wrapped object = <closeable_response at 0x1971738 whose fp = <cStringIO.StringI object at 0x0197BDB8>>>, <traceback object at 0x01974878>)
2012-08-09 17:20:12,033 - PixivUtil20120806 - ERROR - Error at processImage(): 29126463
Traceback (most recent call last):
File "PixivUtil2.py", line 672, in processImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 122, in downloadImage
File "mechanize_mechanize.pyc", line 203, in open
File "mechanize_mechanize.pyc", line 255, in _mech_open
httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt
2012-08-09 17:20:12,033 - PixivUtil20120806 - ERROR - Dumping html to: Error Medium Page for image 29126463.html
2012-08-09 17:20:12,033 - PixivUtil20120806 - ERROR - Cannot dump page for image_id: 29126463
2012-08-09 17:20:18,846 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:20:23,440 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:20:28,049 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:20:32,674 - PixivUtil20120806 - ERROR - HTTPError: HTTP Error 403: request disallowed by robots.txt(http://i2.pixiv.net/img44/img/believer_a/29126463.png)
2012-08-09 17:20:32,674 - PixivUtil20120806 - ERROR - Error at processImage(): (<class 'mechanize._response.httperror_seek_wrapper'>, <httperror_seek_wrapper (mechanize._http.RobotExclusionError instance) at 0x19a1ed8 whose wrapped object = <closeable_response at 0x1a52ad0 whose fp = <cStringIO.StringI object at 0x01A6AE90>>>, <traceback object at 0x018E2DA0>)
2012-08-09 17:20:32,674 - PixivUtil20120806 - ERROR - Error at processImage(): 29126463
Traceback (most recent call last):
File "PixivUtil2.py", line 672, in processImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 122, in downloadImage
File "mechanize_mechanize.pyc", line 203, in open
File "mechanize_mechanize.pyc", line 255, in _mech_open
httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt
2012-08-09 17:20:32,674 - PixivUtil20120806 - ERROR - Dumping html to: Error Medium Page for image 29126463.html
2012-08-09 17:20:32,674 - PixivUtil20120806 - ERROR - Cannot dump page for image_id: 29126463
2012-08-09 17:20:32,674 - PixivUtil20120806 - ERROR - Giving up image_id: 29126463

and dumped page: http://rghost.ru/39673731

Refuses to work for unknown reasons

Sorry for the useless title, but I really can't express it any other way. After trying to download (this happens on all artist pages, by the way), the following stuff gets written (and nothing is downloaded):

2013-02-20 20:01:43,140 - PixivUtil20130128 - INFO - Processing Member Id: 941624
2013-02-20 20:01:43,140 - PixivUtil20130128 - INFO - Member Url: http://www.pixiv.net/member_illust.php?id=941624&p=1
2013-02-20 20:01:46,000 - PixivUtil20130128 - ERROR - Error at processing Artist Info: (<type 'exceptions.AttributeError'>, AttributeError("'NoneType' object has no attribute 'find'",), <traceback object at 0x02CCCE68>)
2013-02-20 20:01:46,000 - PixivUtil20130128 - ERROR - Error at processing Artist Info: 941624
Traceback (most recent call last):
File "PixivUtil2.py", line 414, in processMember
File "PixivModel.pyc", line 44, in __init__
File "PixivModel.pyc", line 60, in ParseInfo
AttributeError: 'NoneType' object has no attribute 'find'
2013-02-20 20:01:50,000 - PixivUtil20130128 - INFO - Member Url: http://www.pixiv.net/member_illust.php?id=941624&p=1
2013-02-20 20:01:52,953 - PixivUtil20130128 - ERROR - Error at processing Artist Info: (<type 'exceptions.AttributeError'>, AttributeError("'NoneType' object has no attribute 'find'",), <traceback object at 0x02E33FA8>)
2013-02-20 20:01:52,953 - PixivUtil20130128 - ERROR - Error at processing Artist Info: 941624
Traceback (most recent call last):
File "PixivUtil2.py", line 414, in processMember
File "PixivModel.pyc", line 44, in __init__
File "PixivModel.pyc", line 60, in ParseInfo
AttributeError: 'NoneType' object has no attribute 'find'

config.ini contents (nothing changed since it last worked):
[Settings]
proxyaddress =
useproxy = False
useragent = Mozilla/5.0 (X11; U; Unix i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1
debughttp = False
userobots = True
filenameformat = %member_token% (%member_id%)%image_id% - %title%
filenamemangaformat = %member_token% (%member_id%)%image_id%.%page_index% - %title%
timeout = 60
uselist = False
processfromdb = True
overwrite = False
tagsseparator = ,
daylastupdated = 7
rootdirectory = S:\Downloads\Pixiv Downloader
retry = 3
retrywait = 5
createdownloadlists = False
downloadlistdirectory = .
irfanviewpath = C:\Program Files\IrfanView
startirfanview = False
startirfanslide = False
alwayscheckfilesize = False
checkupdatedlimit = 0
downloadavatar = False
createmangadir = False
usetagsasdir = False
useblacklisttags = False
usesuppresstags = False
tagslimit = -1
writeimageinfo = False

[Pixiv]
numberofpage = 0
r18mode = False

[Authentication]
username = ---
password = ---
cookie = ---
usessl = False

Any help would be appreciated.
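
The traceback above ends in a `.find` call on a value that is already `None`, which usually means an earlier lookup did not find the element it expected (for example because the page layout changed, or a login page came back instead of the artist page). The following is a minimal sketch of a guard that turns this into a clearer message. It uses the standard library's `xml.etree.ElementTree` purely for illustration; `PageLayoutError`, `find_required`, `parse_member_name`, and the sample markup are hypothetical names and are not taken from PixivUtil2's parser.

```python
import xml.etree.ElementTree as ET


class PageLayoutError(Exception):
    """Raised when an expected element is missing from a parsed page."""


def find_required(node, path):
    """Return node.find(path), or raise a descriptive error instead of None."""
    found = node.find(path)
    if found is None:
        raise PageLayoutError("expected element %r not found" % path)
    return found


def parse_member_name(markup):
    # Hypothetical layout: <div><h1>name</h1></div>
    root = ET.fromstring(markup)
    return find_required(root, "h1").text


if __name__ == "__main__":
    print(parse_member_name("<div><h1>yamasan</h1></div>"))
    try:
        parse_member_name("<div><p>Please log in</p></div>")
    except PageLayoutError as err:
        print("Layout problem:", err)
```

With a guard like this, the log would name the missing element rather than reporting an `AttributeError` on `NoneType`.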

Replace root does not seem to work on a huge SQLite db

I moved my collection to another path this weekend.
When I try to replace the root path, a sqlite-journal file is created and the CLI exits the operation (it says it is returning to the db menu), but the journal does not disappear and no change is made to the db (verified with sqlitebrowser).
I waited around one hour and nothing seemed to happen.

The db is ~174MB.
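
A journal file that stays behind may indicate that the update was started but never committed. As a point of comparison, here is a minimal sketch of a prefix rewrite done as a single committed transaction with Python's `sqlite3` module. The table name `images`, the column `save_name`, and the function `replace_root` are assumptions made for illustration; the tool's real schema and code may differ.

```python
import sqlite3


def replace_root(conn, table, column, old_root, new_root):
    """Rewrite a leading path prefix in one transaction; return rows changed.

    `table` and `column` are trusted identifiers (not user input).
    """
    sql = (
        "UPDATE {t} SET {c} = ? || substr({c}, ?) "
        "WHERE substr({c}, 1, ?) = ?"
    ).format(t=table, c=column)
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            sql, (new_root, len(old_root) + 1, len(old_root), old_root)
        )
    return cur.rowcount


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE images (image_id INTEGER, save_name TEXT)")
    conn.executemany(
        "INSERT INTO images VALUES (?, ?)",
        [(1, "C:\\old\\a.png"), (2, "D:\\other\\b.png")],
    )
    print(replace_root(conn, "images", "save_name", "C:\\old", "E:\\new"))
```

Only rows whose path starts with the old root are touched, and the explicit commit is what lets SQLite remove the journal afterwards.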

Error: [Errno 28] No space left on device

Currently, if there is no space left on the drive, the program just throws errors and keeps producing them. It would be very nice if, on finding that there is no space left, the program paused itself, notified the user, and showed a prompt such as "no space left on drive, clean up some space and press ENTER to continue". The image that last failed because of the space problem would then be downloaded again.

Page 4
Member Name : yamasan
Member Avatar: http://i1.pixiv.net/img19/profile/yamamasa/2814862.png
Member Token : yamamasa
#61

Processing Image Id: 18558854
Title: ???????????
Tags : ???? ??????????????????? ??????????? ??????? ??? ?????? ????????????? ???
??? ??????? ???????????
Mode : big
Image URL : http://i1.pixiv.net/img19/img/yamamasa/18558854.png
Filename : C:\DL Image Packs\346855 (yamamasa)\18558854.png
Start downloading... 722763 of 722763 Bytes Complete.
done.
#62

Processing Image Id: 18361849
Title: ???????
Tags : ???? ???????????? ??????????????????? ????? ????????????? ????? ?????? ??
???? ???????????? ?????
Mode : big
Image URL : http://i1.pixiv.net/img19/img/yamamasa/18361849.png?1303610439
Filename : C:\DL Image Packs\346855 (yamamasa)\18361849.png
Start downloading... 151552 of 1194460 Bytes Downloaded file incomplete!
151552 of 1194460 Bytes
Filename = C:\DL Image Packs\346855 (yamamasa)\18361849.png
URL = http://i1.pixiv.net/img19/img/yamamasa/18361849.png?1303610439
Traceback (most recent call last):
File "PixivUtil2.py", line 164, in downloadImage
IOError: [Errno 28] No space left on device
1 2 3 4
Found file with different filesize, removing...
Start downloading... 184320 of 1194460 Bytes Downloaded file incomplete!
184320 of 1194460 Bytes
Filename = C:\DL Image Packs\346855 (yamamasa)\18361849.png
URL = http://i1.pixiv.net/img19/img/yamamasa/18361849.png?1303610439
Traceback (most recent call last):
File "PixivUtil2.py", line 164, in downloadImage
IOError: [Errno 28] No space left on device
1 2 3 4
Found file with different filesize, removing...
Start downloading... 53248 of 1194460 Bytes Downloaded file incomplete!
53248 of 1194460 Bytes
Filename = C:\DL Image Packs\346855 (yamamasa)\18361849.png
URL = http://i1.pixiv.net/img19/img/yamamasa/18361849.png?1303610439
Traceback (most recent call last):
File "PixivUtil2.py", line 164, in downloadImage
IOError: [Errno 28] No space left on device
1 2 3 4
Found file with different filesize, removing...
Start downloading... 270336 of 1194460 Bytes Downloaded file incomplete!
270336 of 1194460 Bytes
Filename = C:\DL Image Packs\346855 (yamamasa)\18361849.png
URL = http://i1.pixiv.net/img19/img/yamamasa/18361849.png?1303610439
Traceback (most recent call last):
File "PixivUtil2.py", line 164, in downloadImage
IOError: [Errno 28] No space left on device
Traceback (most recent call last):
File "PixivUtil2.py", line 672, in processImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 223, in downloadImage
File "PixivUtil2.py", line 164, in downloadImage
IOError: [Errno 28] No space left on device
Error at processImage(): (<type 'exceptions.IOError'>, IOError(28, 'No space lef
t on device'), <traceback object at 0x00FD0F58>)
Dumping html to: Error Medium Page for image 18361849.html
Cannot dump page for image_id: 18361849
Stuff happened, trying again after 2 second ( 1 )
local variable 'parseBigImage' referenced before assignment
Processing Image Id: 18361849
Title: ???????
Tags : ???? ???????????? ??????????????????? ????? ????????????? ????? ?????? ??
???? ???????????? ?????
Mode : big
Image URL : http://i1.pixiv.net/img19/img/yamamasa/18361849.png?1303610439
Filename : C:\DL Image Packs\346855 (yamamasa)\18361849.png
Found file with different filesize, removing...
Start downloading... 65536 of 1194460 Bytes
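
The pause-and-retry behaviour requested in this report could look roughly like the sketch below. It assumes the download step is available as a callable; `write_with_disk_full_pause`, `write_once`, and `wait_for_user` are hypothetical names, not functions from PixivUtil2. Only `errno.ENOSPC` (the `[Errno 28]` seen in the log) triggers the pause, and every other error is re-raised unchanged.

```python
import errno


def write_with_disk_full_pause(write_once, wait_for_user, max_pauses=3):
    """Call write_once(); on a disk-full error, pause and then retry."""
    pauses = 0
    while True:
        try:
            return write_once()
        except OSError as err:  # IOError is an alias of OSError in Python 3
            if err.errno != errno.ENOSPC or pauses >= max_pauses:
                raise
            pauses += 1
            # Blocks until the user confirms that space has been freed.
            wait_for_user(
                "no space left on drive, clean up some space "
                "and press ENTER to continue"
            )
```

In an interactive session, passing the built-in `input` as `wait_for_user` would show the message and wait for ENTER before the same image is attempted again.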

Tag list mode switched to next tag without reason

Hello,

While using tag list mode (option 7), with the starting page set to 1 and the default limit left untouched (by typing nothing and pressing "enter"), I noticed that the program switched tags at page 615.

2014-10-12 16:29:41,052 - PixivUtil20141006 - INFO - Looping... for http://www.pixiv.net/search.php?s_mode=s_tag&p=615&word=%E3%83%95%E3%83%A9%E3%83%B3%E3%83%89%E3%83%BC%E3%83%AB%E3%83%BB%E3%82%B9%E3%82%AB%E3%83%BC%E3%83%AC%E3%83%83%E3%83%88
2014-10-12 16:29:52,842 - PixivUtil20141006 - INFO - Searching for: (フランドール) %E3%83%95%E3%83%A9%E3%83%B3%E3%83%89%E3%83%BC%E3%83%AB
2014-10-12 16:29:52,842 - PixivUtil20141006 - INFO - Looping... for http://www.pixiv.net/search.php?s_mode=s_tag&p=1&word=%E3%83%95%E3%83%A9%E3%83%B3%E3%83%89%E3%83%BC%E3%83%AB

If you open the first address ( http://www.pixiv.net/search.php?s_mode=s_tag&p=615&word=%E3%83%95%E3%83%A9%E3%83%B3%E3%83%89%E3%83%BC%E3%83%AB%E3%83%BB%E3%82%B9%E3%82%AB%E3%83%BC%E3%83%AC%E3%83%83%E3%83%88 ) you will find that the page is full, and that there are up to 1000 pages (and probably more, but pixiv doesn't allow going higher than 1000).

I don't know if it can be tested easily (615 pages of downloads is a pretty high number!), but I thought I should report it.

In the meantime I just restarted the tag list action, setting the starting page to 614 to make sure I got everything right. It doesn't seem to be complaining and is at page 617 going like it should right now.

Thank you for making this and I hope it's a silly bug that you can resolve :3

Crash caused by Pixiv's new animated images feature

As of June 25th 2014, Pixiv has introduced a new function that allows animations, as announced here: http://www.pixiv.net/info.php?id=2476.

The software obviously cannot process the new format, since these images cannot even be saved normally: they are played by a script-driven player in the page rather than shown as a plain image. So every time the software reaches an image_id that uses it, it shuts down immediately. Within hours, several artists have already started to use the new feature, so option 8 has become unusable, and the problem will extend to other functions once use of the new feature spreads.

A solution, or at least a temporary one, may be to have the software skip the image_id once it finds there is no "traditional" image.

Also, judging from the page source of some of the new images, they seem to link to zip files that contain the individual animation frames.

For example, for
http://www.pixiv.net/member_illust.php?mode=medium&illust_id=44301046

I find
{"src":"http://i1.pixiv.net/img-zip-ugoira/img/2014/06/25/17/58/59/44301046_ugoira600x600.zip"
"src":"http://i1.pixiv.net/img-zip-ugoira/img/2014/06/25/17/58/59/44301046_ugoira1920x1080.zip"
(I took only these and not the commands that follow, which regulate the delay of each individual frame)

Besides skipping, managing to get these may be the "key".
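
Both ideas from this report (skipping, and fetching the zip) start with spotting the zip links in the page source. Below is a minimal sketch based only on the `"src":"...zip"` fragments quoted above. The names `UGOIRA_ZIP`, `find_ugoira_zips`, and `has_ugoira` are hypothetical, and the assumption that the links always appear in exactly this form is untested against the live site.

```python
import re

# Matches "src":"...zip" pairs such as the ones quoted in the report.
UGOIRA_ZIP = re.compile(r'"src"\s*:\s*"([^"]+?\.zip)"')


def find_ugoira_zips(page_source):
    """Return zip URLs found in the page source, in order of appearance."""
    # JSON embedded in HTML often escapes slashes as \/ so undo that.
    return [url.replace("\\/", "/") for url in UGOIRA_ZIP.findall(page_source)]


def has_ugoira(page_source):
    """True if the page appears to describe an animated work."""
    return bool(find_ugoira_zips(page_source))
```

A caller could use `has_ugoira` to skip the image_id as a stopgap, and later use the returned URLs to download the frame archives.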

AttributeError: 'NoneType' object has no attribute 'string'

Not sure exactly what happened here, but these two files consistently throw errors at me upon trying to download them. (There are 2 HTML files that I can upload if you need them.)

Log:

PixivDownloader2 version 20120724
https://nandaka.wordpress.com/tag/pixiv-downloader/

  1. Download by member_id
  2. Download by image_id
  3. Download by tags
  4. Download from list
  5. Download from online user bookmark
  6. Download from online image bookmark
  7. Download from tags list
  8. Download new illust from bookmark
  9. Download by Title/Caption
  10. Download by Tag and Member Id

d. Manage database
e. Export online bookmark
x. Exit
Input: 4
Processing from database.
Found 32 items.

(snip)

done.
Processing Member Id: 491042
Reading config file... done.
Page 1
Member Name : ??
Member Avatar: http://source.pixiv.net/source/images/no_profile.png
Member Token : akkn-akikan
#1

Processing Image Id: 29074091
Traceback (most recent call last):
File "PixivUtil2.py", line 554, in processImage
File "PixivModel.pyc", line 169, in __init__
File "PixivModel.pyc", line 217, in ParseInfo
AttributeError: 'NoneType' object has no attribute 'string'
Error at processImage(): (<type 'exceptions.AttributeError'>, AttributeError("'N
oneType' object has no attribute 'string'",), <traceback object at 0x0262EF30>)

Dumping html to: Error Medium Page for image 29074091.html
Stuff happened, trying again after 2 second ( 1 )
'NoneType' object has no attribute 'string'
Processing Image Id: 29074091
Traceback (most recent call last):
File "PixivUtil2.py", line 554, in processImage
File "PixivModel.pyc", line 169, in __init__
File "PixivModel.pyc", line 217, in ParseInfo
AttributeError: 'NoneType' object has no attribute 'string'
Error at processImage(): (<type 'exceptions.AttributeError'>, AttributeError("'N
oneType' object has no attribute 'string'",), <traceback object at 0x033C3E18>)

Dumping html to: Error Medium Page for image 29074091.html
Stuff happened, trying again after 2 second ( 2 )
'NoneType' object has no attribute 'string'
Processing Image Id: 29074091
Traceback (most recent call last):
File "PixivUtil2.py", line 554, in processImage
File "PixivModel.pyc", line 169, in __init__
File "PixivModel.pyc", line 217, in ParseInfo
AttributeError: 'NoneType' object has no attribute 'string'
Error at processImage(): (<type 'exceptions.AttributeError'>, AttributeError("'N
oneType' object has no attribute 'string'",), <traceback object at 0x034F8D28>)

Dumping html to: Error Medium Page for image 29074091.html
Stuff happened, trying again after 2 second ( 3 )
'NoneType' object has no attribute 'string'
Processing Image Id: 29074091
Traceback (most recent call last):
File "PixivUtil2.py", line 554, in processImage
File "PixivModel.pyc", line 169, in __init__
File "PixivModel.pyc", line 217, in ParseInfo
AttributeError: 'NoneType' object has no attribute 'string'
Error at processImage(): (<type 'exceptions.AttributeError'>, AttributeError("'N
oneType' object has no attribute 'string'",), <traceback object at 0x0363D670>)

Dumping html to: Error Medium Page for image 29074091.html
Stuff happened, trying again after 2 second ( 4 )
'NoneType' object has no attribute 'string'
Processing Image Id: 29074091
Traceback (most recent call last):
File "PixivUtil2.py", line 554, in processImage
File "PixivModel.pyc", line 169, in __init__
File "PixivModel.pyc", line 217, in ParseInfo
AttributeError: 'NoneType' object has no attribute 'string'
Error at processImage(): (<type 'exceptions.AttributeError'>, AttributeError("'N
oneType' object has no attribute 'string'",), <traceback object at 0x03B31C60>)

Dumping html to: Error Medium Page for image 29074091.html
Giving up image_id: 29074091
done.

(snip)

done.
Processing Member Id: 1014524
Reading config file... done.
Page 1
Member Name : ??
Member Avatar: http://i1.pixiv.net/img35/profile/kemoisumi/4107803.jpg
Member Token : kemoisumi
File exist! (Identical Size)
#1

Processing Image Id: 29082131
Traceback (most recent call last):
File "PixivUtil2.py", line 554, in processImage
File "PixivModel.pyc", line 169, in __init__
File "PixivModel.pyc", line 217, in ParseInfo
AttributeError: 'NoneType' object has no attribute 'string'
Error at processImage(): (<type 'exceptions.AttributeError'>, AttributeError("'N
oneType' object has no attribute 'string'",), <traceback object at 0x03105468>)

Dumping html to: Error Medium Page for image 29082131.html
Stuff happened, trying again after 2 second ( 1 )
'NoneType' object has no attribute 'string'
Processing Image Id: 29082131
Traceback (most recent call last):
File "PixivUtil2.py", line 554, in processImage
File "PixivModel.pyc", line 169, in __init__
File "PixivModel.pyc", line 217, in ParseInfo
AttributeError: 'NoneType' object has no attribute 'string'
Error at processImage(): (<type 'exceptions.AttributeError'>, AttributeError("'N
oneType' object has no attribute 'string'",), <traceback object at 0x03631F80>)

Dumping html to: Error Medium Page for image 29082131.html
Stuff happened, trying again after 2 second ( 2 )
'NoneType' object has no attribute 'string'
Processing Image Id: 29082131
Traceback (most recent call last):
File "PixivUtil2.py", line 554, in processImage
File "PixivModel.pyc", line 169, in __init__
File "PixivModel.pyc", line 217, in ParseInfo
AttributeError: 'NoneType' object has no attribute 'string'
Error at processImage(): (<type 'exceptions.AttributeError'>, AttributeError("'N
oneType' object has no attribute 'string'",), <traceback object at 0x025E5260>)

Dumping html to: Error Medium Page for image 29082131.html
Stuff happened, trying again after 2 second ( 3 )
'NoneType' object has no attribute 'string'
Processing Image Id: 29082131
Traceback (most recent call last):
File "PixivUtil2.py", line 554, in processImage
File "PixivModel.pyc", line 169, in __init__
File "PixivModel.pyc", line 217, in ParseInfo
AttributeError: 'NoneType' object has no attribute 'string'
Error at processImage(): (<type 'exceptions.AttributeError'>, AttributeError("'N
oneType' object has no attribute 'string'",), <traceback object at 0x03BD1918>)

Dumping html to: Error Medium Page for image 29082131.html
Stuff happened, trying again after 2 second ( 4 )
'NoneType' object has no attribute 'string'
Processing Image Id: 29082131
Traceback (most recent call last):
File "PixivUtil2.py", line 554, in processImage
File "PixivModel.pyc", line 169, in __init__
File "PixivModel.pyc", line 217, in ParseInfo
AttributeError: 'NoneType' object has no attribute 'string'
Error at processImage(): (<type 'exceptions.AttributeError'>, AttributeError("'N
oneType' object has no attribute 'string'",), <traceback object at 0x03D30580>)

Dumping html to: Error Medium Page for image 29082131.html
Giving up image_id: 29082131

(snip)

Last Page
Done.

done.
PixivDownloader2 version 20120724
https://nandaka.wordpress.com/tag/pixiv-downloader/

  1. Download by member_id
  2. Download by image_id
  3. Download by tags
  4. Download from list
  5. Download from online user bookmark
  6. Download from online image bookmark
  7. Download from tags list
  8. Download new illust from bookmark
  9. Download by Title/Caption
  10. Download by Tag and Member Id

d. Manage database
e. Export online bookmark
x. Exit
Input:

Error when downloading from tag list on an image with a blacklisted tag

I got an error while downloading from the tag list when the program encountered an image with a blacklisted tag:

Image Id: xxxxxxxx
Bookmark Count: 11
Processing Image Id: xxxxxxxx
Skipping image_id: xxxxxxxx because contains blacklisted tags: 3D
Traceback (most recent call last):
File "PixivUtil2.py", line 656, in processImage
UnboundLocalError: local variable 'viewPage' referenced before assignment
Error at processImage(): (, UnboundLocalErr
or("local variable 'viewPage' referenced before assignment",), )
Cannot dump page for image_id: 27377905
Error at processTags(): (, UnboundLocalErro
r("local variable 'mediumPage' referenced before assignment",), )
Error at processTagsList(): (, UnboundLocal
Error("local variable 'mediumPage' referenced before assignment",), )
Traceback (most recent call last):
File "PixivUtil2.py", line 1372, in main
File "PixivUtil2.py", line 1166, in menuDownloadFromTagsList
File "PixivUtil2.py", line 779, in processTagsList
File "PixivUtil2.py", line 743, in processTags
File "PixivUtil2.py", line 668, in processImage
UnboundLocalError: local variable 'mediumPage' referenced before assignment
press enter to exit.

http://nandaka.wordpress.com/2012/05/19/pixiv-downloader-20120519/#comment-1796
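
The two `UnboundLocalError` messages in this log suggest that cleanup or error-handling code reads a variable (`viewPage`, `mediumPage`) that is only assigned later in the normal path, so an early exit such as the blacklist skip leaves it unbound. A common remedy is to bind the variable before the `try` block. The sketch below shows that pattern under stated assumptions: `process_image`, its parameters, and its return strings are hypothetical and do not mirror the tool's real function.

```python
def process_image(image_id, tags, blacklist, fetch_page):
    """Return a short status string; never raises UnboundLocalError."""
    medium_page = None  # bound up front so the handler below can always read it
    try:
        blocked = sorted(set(tags) & set(blacklist))
        if blocked:
            # Early exit: no page has been fetched at this point.
            return "skipped: " + ", ".join(blocked)
        medium_page = fetch_page(image_id)
        return "ok: %d bytes" % len(medium_page)
    except Exception as err:
        # Safe even when the failure happened before the page was fetched.
        dumped = "dumped" if medium_page is not None else "nothing to dump"
        return "error: %s (%s)" % (err, dumped)
```

Because `medium_page` starts as `None`, the handler can decide whether there is anything to dump instead of failing a second time.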

Issue when downloading from tags (option 3)

When downloading from tags, the program sometimes closes suddenly halfway through downloading or shows an error. Below is an example of the error message that comes up, after which the program has to be closed. The issue mainly appears when downloading from a tag with a large number of results (more than about 10,000), although it has also happened with tags that have fewer results.

Image #76
Image Id: 36367640
Bookmark Count: 53
Skipping imageId=36367640 because less than bookmark count limit (500 > 53)
Looping... for http://www.pixiv.net/search.php?s_mode=s_tag_full&word=%E3%83%93%
E3%82%AD%E3%83%8B&p=170
Error at process_tags(): (<class 'urllib2.URLError'>, URLError(error(10054, 'An
existing connection was forcibly closed by the remote host'),), <traceback objec
t at 0x02FD66C0>)
Cannot dump page for search tags:ビキニ
Traceback (most recent call last):
File "PixivUtil2.py", line 1856, in main
File "PixivUtil2.py", line 1655, in main_loop
File "PixivUtil2.py", line 1423, in menu_download_by_tags
File "PixivUtil2.py", line 985, in process_tags
UnboundLocalError: local variable 'search_page' referenced before assignment
press enter to exit.

Update: I also got this message while downloading from another tag.

urlopen error [Errno 10054] An existing connection was forcibly closed by the remote host
1 2 3 4
Traceback (most recent call last):
File "PixivUtil2.py", line 718, in process_image
File "mechanize_mechanize.pyc", line 569, in follow_link
File "mechanize_mechanize.pyc", line 550, in click_link
File "mechanize_mechanize.pyc", line 443, in viewing_html
BrowserStateError: not viewing any document
Error at process_image(): (<class 'mechanize._mechanize.BrowserStateError'>, Bro
wserStateError('not viewing any document',), <traceback object at 0x03022828>)
Dumping html to: Error Medium Page for image 32445747.html
Error at process_tags(): (<class 'mechanize._mechanize.BrowserStateError'>, Brow
serStateError('not viewing any document',), <traceback object at 0x03022760>)
Dumping html to: Error page for search tags DOA.html
Traceback (most recent call last):
File "PixivUtil2.py", line 1856, in main
File "PixivUtil2.py", line 1655, in main_loop
File "PixivUtil2.py", line 1423, in menu_download_by_tags
File "PixivUtil2.py", line 944, in process_tags
File "PixivUtil2.py", line 718, in process_image
File "mechanize_mechanize.pyc", line 569, in follow_link
File "mechanize_mechanize.pyc", line 550, in click_link
File "mechanize_mechanize.pyc", line 443, in viewing_html
BrowserStateError: not viewing any document
press enter to exit.
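
Both logs in this report begin with the remote host closing the connection (error 10054), and the crash follows from how that failure is handled. One way to absorb such transient failures is to retry the fetch a few times before giving up. The sketch below is a hypothetical helper, not the tool's code: `fetch_with_retry` and its parameters are made-up names, and the defaults of 3 attempts and a 5 second wait are illustrative values only.

```python
import time


def fetch_with_retry(fetch, retries=3, wait_seconds=5, sleep=time.sleep):
    """Call fetch() until it succeeds or `retries` attempts have failed."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return fetch()
        except OSError as err:
            # In Python 3, urllib.error.URLError and ConnectionResetError
            # are both subclasses of OSError.
            last_error = err
            if attempt < retries:
                sleep(wait_seconds)
    raise last_error
```

If every attempt fails, the last error is re-raised, so the caller still sees the real network problem rather than a follow-on `UnboundLocalError` or `BrowserStateError`.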

New cannot download by member_id error

PixivDownloader2 version 20140926
https://nandaka.wordpress.com/tag/pixiv-downloader/
Reading C:\Users\D5114360\Downloads\images\pixivutil20140926\config.ini ...
done.
Creating database... done.
Only process member where day last updated >= 7
Username ?
Password ?
logging in with saved cookie
Trying to log with saved cookie
done.
PixivDownloader2 version 20140926
https://nandaka.wordpress.com/tag/pixiv-downloader/

  1. Download by member_id
  2. Download by image_id
  3. Download by tags
  4. Download from list
  5. Download from online user bookmark
  6. Download from online image bookmark
  7. Download from tags list
  8. Download new illust from bookmark
  9. Download by Title/Caption
  10. Download by Tag and Member Id
  11. Download Member Bookmark
  12. Download by Group Id

d. Manage database
e. Export online bookmark
r. Reload config.ini
p. Print config.ini
x. Exit
Input: 1
Member ids: 39290
Start Page (default=1):
End Page (default=0, 0 for no limit):
Processing Member Id: 39290
Reading C:\Users\D5114360\Downloads\images\pixivutil20140926\config.ini ...
done.
Page 1
Member Url: http://www.pixiv.net/member_illust.php?id=39290&p=1
Traceback (most recent call last):
File "PixivUtil2.py", line 427, in process_member
File "PixivModel.pyc", line 46, in __init__
File "PixivModel.pyc", line 93, in ParseImages
AttributeError: 'NoneType' object has no attribute 'ul'
Error at process_member(): (<type 'exceptions.AttributeError'>, AttributeError("
'NoneType' object has no attribute 'ul'",), <traceback object at 0x0321EFD0>)
Dumping html to: Error page for member 39290.html
Traceback (most recent call last):
File "PixivUtil2.py", line 1813, in main
File "PixivUtil2.py", line 1609, in main_loop
File "PixivUtil2.py", line 1315, in menu_download_by_member_id
File "PixivUtil2.py", line 427, in process_member
File "PixivModel.pyc", line 46, in __init__
File "PixivModel.pyc", line 93, in ParseImages
AttributeError: 'NoneType' object has no attribute 'ul'
press enter to exit.

Suggested feature: do not download pictures older than X days

As requested, I post here this suggestion I originally posted on the blog.

The idea would be to have a setting in the .ini so that the downloader gets only pictures that were posted within the last X days, counted from the system date. So a value of 0 would mean no limit, 7 would mean only pics that are a week old or newer, 60 only those from the last two months, and so on.

Since the program also reads the works' dates (in fact, you can even set it to include them in the filenames), it would be technically possible.
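
The rule proposed in this suggestion can be sketched in a few lines. The function name `is_recent_enough` and its parameters are hypothetical, and the sketch assumes the work's date has already been parsed into a `datetime`.

```python
from datetime import datetime, timedelta


def is_recent_enough(posted_at, max_age_days, now=None):
    """True if a work posted at `posted_at` passes the age limit."""
    if max_age_days <= 0:  # 0 means "no limit", as proposed in the suggestion
        return True
    now = now or datetime.now()
    return now - posted_at <= timedelta(days=max_age_days)


if __name__ == "__main__":
    today = datetime(2014, 6, 30)
    print(is_recent_enough(datetime(2014, 6, 25), 7, now=today))
    print(is_recent_enough(datetime(2014, 6, 1), 7, now=today))
```

Passing `now` explicitly keeps the check easy to test; in normal use it would default to the system date.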

[Feature] Error dump suppression

Requesting the ability to arbitrarily suppress dumps for errors.
A line in the configuration file, "supresserrordumps = ", would take True (everything suppressed), False (nothing suppressed), or a string of error codes (1001, 1002, 1004, 2004 and so on) to fine-tune what is suppressed. The separator is either a space ' ' or a comma ','.
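
A possible way to interpret such a setting is sketched below. The helper names `parse_suppress_setting` and `should_dump` are hypothetical, and treating an empty value as False is an assumption that the request does not spell out.

```python
import re


def parse_suppress_setting(raw):
    """Return True (suppress all), False (suppress none), or a set of codes."""
    value = raw.strip()
    if value.lower() == "true":
        return True
    if value.lower() in ("false", ""):  # empty value treated as False (assumption)
        return False
    # Codes may be separated by spaces, commas, or a mix of both.
    return {int(code) for code in re.split(r"[,\s]+", value) if code}


def should_dump(error_code, setting):
    """Decide whether an error with this code should still produce a dump."""
    if isinstance(setting, bool):
        return not setting
    return error_code not in setting
```

A value that is neither a boolean nor a list of integers raises `ValueError` here, which would surface typos in the configuration early.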

Not logging in.

Hi, when I run pixivutil, it will not log in. It just stays there after saying "Log in using...".
I turned SSL on/off and also tried without cookies to see if it would generate a new one.

http://pastebin.com/njzz9jjK

Error on memberid 5944544

When I download artist id 5944544, I get:
Member Url: http://www.pixiv.net/member_illust.php?id=5944544&p=1
Traceback (most recent call last):
File "PixivUtil2.py", line 420, in process_member
File "PixivModel.pyc", line 49, in __init__
File "PixivModel.pyc", line 69, in ParseInfo
IndexError: list index out of range
Error at processing Artist Info: (, IndexError('list index o
ut of range',), )

Set useRobots = False but still 403?

I set useRobots = False but am still getting the 403 error. I am running Python 2.7.5 on OS X with the current modules freshly installed. I changed time zones to avoid daylight saving time, copied the cookie, and set keepsignedin = 1. Am I doing something wrong, or is something messed up somewhere else?

Log in using form.
Error at pixiv_login(): (<class 'mechanize._response.httperror_seek_wrapper'>, <httperror_seek_wrapper (mechanize._http.RobotExclusionError instance) at 0x10745c120 whose wrapped object = <closeable_response at 0x107453170 whose fp = <cStringIO.StringI object at 0x107440360>>>, <traceback object at 0x107454cf8>)
failed
Traceback (most recent call last):
  File "pixivutil2.py", line 1805, in main
    result = pixiv_login(username, password)
  File "pixivutil2.py", line 260, in pixiv_login
    __br__.open(req)
  File "build/bdist.macosx-10.9-intel/egg/mechanize/_mechanize.py", line 203, in open
    return self._mech_open(url, data, timeout=timeout)
  File "build/bdist.macosx-10.9-intel/egg/mechanize/_mechanize.py", line 255, in _mech_open
    raise response
httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt
press enter to exit.

can't download by id: ValueError: time data '6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'

My old PixivDownloader gave this error, so I updated, but the new version has exactly the same problem. Downloading by id does not work and only produces errors:

PixivDownloader2 version 20140606
https://nandaka.wordpress.com/tag/pixiv-downloader/
Reading C:\Private\PixivDownloader\config.ini ...
done.
Creating database... done.
Only process member where day last updated >= 7
Using Username: test56
logging in with saved cookie
Trying to log with saved cookie
done.
PixivDownloader2 version 20140606
https://nandaka.wordpress.com/tag/pixiv-downloader/

  1. Download by member_id
  2. Download by image_id
  3. Download by tags
  4. Download from list
  5. Download from online user bookmark
  6. Download from online image bookmark
  7. Download from tags list
  8. Download new illust from bookmark
  9. Download by Title/Caption
  10. Download by Tag and Member Id
  11. Download Member Bookmark

12. Download by Group Id

d. Manage database
e. Export online bookmark
r. Reload config.ini
p. Print config.ini
x. Exit
Input: 1
Member id: 1422579
Start Page (default=1):
End Page (default=0, 0 for no limit):
Processing Member Id: 1422579
Reading C:\Private\PixivDownloader\config.ini ...
done.
Page 1
Member Url: http://www.pixiv.net/member_illust.php?id=1422579&p=1
Member Name : Saru
Member Avatar: http://i1.pixiv.net/img43/profile/longbb/6295157.jpg
Member Token : longbb

1

Processing Image Id: 43863737
Traceback (most recent call last):
File "PixivUtil2.py", line 617, in process_image
File "PixivModel.pyc", line 256, in init
File "PixivModel.pyc", line 349, in ParseWorksData
File "_strptime.pyc", line 325, in _strptime
ValueError: time data '6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'
Error at process_image(): (<type 'exceptions.ValueError'>, ValueError("time data
'6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'",), <traceback object
at 0x02ABFBE8>)
Dumping html to: Error Medium Page for image 43863737.html
Stuff happened, trying again after 2 second ( 1 )
Traceback (most recent call last):
File "PixivUtil2.py", line 513, in process_member
File "PixivUtil2.py", line 617, in process_image
File "PixivModel.pyc", line 256, in init
File "PixivModel.pyc", line 349, in ParseWorksData
File "_strptime.pyc", line 325, in _strptime
ValueError: time data '6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'
Processing Image Id: 43863737
Traceback (most recent call last):
File "PixivUtil2.py", line 617, in process_image
File "PixivModel.pyc", line 256, in init
File "PixivModel.pyc", line 349, in ParseWorksData
File "_strptime.pyc", line 325, in _strptime
ValueError: time data '6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'
Error at process_image(): (<type 'exceptions.ValueError'>, ValueError("time data
'6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'",), <traceback object
at 0x02DD53F0>)
Dumping html to: Error Medium Page for image 43863737.html
Stuff happened, trying again after 2 second ( 2 )
Traceback (most recent call last):
File "PixivUtil2.py", line 513, in process_member
File "PixivUtil2.py", line 617, in process_image
File "PixivModel.pyc", line 256, in init
File "PixivModel.pyc", line 349, in ParseWorksData
File "_strptime.pyc", line 325, in _strptime
ValueError: time data '6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'
Processing Image Id: 43863737
Traceback (most recent call last):
File "PixivUtil2.py", line 617, in process_image
File "PixivModel.pyc", line 256, in init
File "PixivModel.pyc", line 349, in ParseWorksData
File "_strptime.pyc", line 325, in _strptime
ValueError: time data '6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'
Error at process_image(): (<type 'exceptions.ValueError'>, ValueError("time data
'6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'",), <traceback object
at 0x030E6418>)
Dumping html to: Error Medium Page for image 43863737.html
Stuff happened, trying again after 2 second ( 3 )
Traceback (most recent call last):
File "PixivUtil2.py", line 513, in process_member
File "PixivUtil2.py", line 617, in process_image
File "PixivModel.pyc", line 256, in init
File "PixivModel.pyc", line 349, in ParseWorksData
File "_strptime.pyc", line 325, in _strptime
ValueError: time data '6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'
Processing Image Id: 43863737
Traceback (most recent call last):
File "PixivUtil2.py", line 617, in process_image
File "PixivModel.pyc", line 256, in init
File "PixivModel.pyc", line 349, in ParseWorksData
File "_strptime.pyc", line 325, in _strptime
ValueError: time data '6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'
Error at process_image(): (<type 'exceptions.ValueError'>, ValueError("time data
'6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'",), <traceback object
at 0x02A9B7D8>)
Dumping html to: Error Medium Page for image 43863737.html
Stuff happened, trying again after 2 second ( 4 )
Traceback (most recent call last):
File "PixivUtil2.py", line 513, in process_member
File "PixivUtil2.py", line 617, in process_image
File "PixivModel.pyc", line 256, in init
File "PixivModel.pyc", line 349, in ParseWorksData
File "_strptime.pyc", line 325, in _strptime
ValueError: time data '6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'
Processing Image Id: 43863737
Traceback (most recent call last):
File "PixivUtil2.py", line 617, in process_image
File "PixivModel.pyc", line 256, in init
File "PixivModel.pyc", line 349, in ParseWorksData
File "_strptime.pyc", line 325, in _strptime
ValueError: time data '6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'
Error at process_image(): (<type 'exceptions.ValueError'>, ValueError("time data
'6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'",), <traceback object
at 0x02A57738>)
Dumping html to: Error Medium Page for image 43863737.html
Stuff happened, trying again after 2 second ( 5 )
Traceback (most recent call last):
File "PixivUtil2.py", line 513, in process_member
File "PixivUtil2.py", line 617, in process_image
File "PixivModel.pyc", line 256, in init
File "PixivModel.pyc", line 349, in ParseWorksData
File "_strptime.pyc", line 325, in _strptime
ValueError: time data '6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'
Processing Image Id: 43863737
Traceback (most recent call last):
File "PixivUtil2.py", line 617, in process_image
File "PixivModel.pyc", line 256, in init
File "PixivModel.pyc", line 349, in ParseWorksData
File "_strptime.pyc", line 325, in _strptime
ValueError: time data '6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'
Error at process_image(): (<type 'exceptions.ValueError'>, ValueError("time data
'6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'",), <traceback object
at 0x02992878>)
Dumping html to: Error Medium Page for image 43863737.html
Stuff happened, trying again after 2 second ( 6 )
Traceback (most recent call last):
File "PixivUtil2.py", line 513, in process_member
File "PixivUtil2.py", line 617, in process_image
File "PixivModel.pyc", line 256, in init
File "PixivModel.pyc", line 349, in ParseWorksData
File "_strptime.pyc", line 325, in _strptime
ValueError: time data '6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'
Processing Image Id: 43863737
Traceback (most recent call last):
File "PixivUtil2.py", line 617, in process_image
File "PixivModel.pyc", line 256, in init
File "PixivModel.pyc", line 349, in ParseWorksData
File "_strptime.pyc", line 325, in _strptime
ValueError: time data '6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'
Error at process_image(): (<type 'exceptions.ValueError'>, ValueError("time data
'6 3 2014, 00:01' does not match format '%Y-%m-%d %H:%M'",), <traceback object
at 0x035946C0>)
Dumping html to: Error Medium Page for image 43863737.html
Giving up image_id: 43863737
PixivDownloader2 version 20140606
https://nandaka.wordpress.com/tag/pixiv-downloader/

  1. Download by member_id
  2. Download by image_id
  3. Download by tags
  4. Download from list
  5. Download from online user bookmark
  6. Download from online image bookmark
  7. Download from tags list
  8. Download new illust from bookmark
  9. Download by Title/Caption
  10. Download by Tag and Member Id
  11. Download Member Bookmark

12. Download by Group Id

d. Manage database
e. Export online bookmark
r. Reload config.ini
p. Print config.ini
x. Exit
Input:
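The ValueError above comes from `strptime` being given a date string in a layout other than the one it expects. One possible way to tolerate this is to try several formats in turn, as in the sketch below. The second format string is an assumption inferred from the single sample '6 3 2014, 00:01' in the log (month-day order is a guess), and `parse_works_date` is a hypothetical name, not the program's actual function.

```python
from datetime import datetime

# Candidate layouts, tried in order. The first is the format named in the
# error message; the second is an assumed match for '6 3 2014, 00:01'.
DATE_FORMATS = ("%Y-%m-%d %H:%M", "%m %d %Y, %H:%M")


def parse_works_date(text):
    """Parse a works date, falling back through the known formats."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue  # try the next layout
    raise ValueError("unrecognised date: %r" % text)


if __name__ == "__main__":
    print(parse_works_date("2014-06-03 00:01"))
    print(parse_works_date("6 3 2014, 00:01"))
```

Since the layout of the date probably depends on the account's language setting, a fixed list like this may still miss other variants.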

unable to download images by member_id after latest pixiv update (probably)

PixivDownloader2 version 20140926
https://nandaka.wordpress.com/tag/pixiv-downloader/
Reading V:\folder\config.ini ...
Error at loadConfig(): (<class 'ConfigParser.NoOptionError'>, No option 'writeUg
oiraInfo' in section: 'Settings', <traceback object at 0x020F0FA8>)
Some configuration have invalid value, replacing with the default value.
Writing config file... Backing up old config (error exist!) to config.ini.error-
1412089005
done.
done.
Creating database... done.
Only process member where day last updated >= 7
Using Username: test56
logging in with saved cookie
Trying to log with saved cookie
done.
PixivDownloader2 version 20140926
https://nandaka.wordpress.com/tag/pixiv-downloader/

  1. Download by member_id
  2. Download by image_id
  3. Download by tags
  4. Download from list
  5. Download from online user bookmark
  6. Download from online image bookmark
  7. Download from tags list
  8. Download new illust from bookmark
  9. Download by Title/Caption
  10. Download by Tag and Member Id
  11. Download Member Bookmark

12. Download by Group Id

d. Manage database
e. Export online bookmark
r. Reload config.ini
p. Print config.ini
x. Exit
Input: 1
Member ids: 1573847
Start Page (default=1):
End Page (default=0, 0 for no limit):
Processing Member Id: 1573847
Reading V:\folder\config.ini ...
done.
Page 1
Member Url: http://www.pixiv.net/member_illust.php?id=1573847&p=1
Traceback (most recent call last):
File "PixivUtil2.py", line 427, in process_member
File "PixivModel.pyc", line 46, in init
File "PixivModel.pyc", line 93, in ParseImages
AttributeError: 'NoneType' object has no attribute 'ul'
Error at process_member(): (<type 'exceptions.AttributeError'>, AttributeError("
'NoneType' object has no attribute 'ul'",), <traceback object at 0x0285D6C0>)
Dumping html to: Error page for member 1573847.html
Traceback (most recent call last):
File "PixivUtil2.py", line 1813, in main
File "PixivUtil2.py", line 1609, in main_loop
File "PixivUtil2.py", line 1315, in menu_download_by_member_id
File "PixivUtil2.py", line 427, in process_member
File "PixivModel.pyc", line 46, in init
File "PixivModel.pyc", line 93, in ParseImages
AttributeError: 'NoneType' object has no attribute 'ul'
press enter to exit.

filenames for manga (_big_)

I tried downloading by user id. The manga pages were named like 27842092_big_p0.jpg, but they were identical to 27842092_p0.jpg. If the big and regular images are the same, could the files be named without _big_?

Unable to download manga since 09/25/14

Possibly due to changes in Pixiv's code, manga, and only manga (single pictures and ugoira still work fine), can no longer be downloaded.

Here’s an extract from the log of a session where I tried to download a 3-page manga; after some retries it ends up skipping the pages one by one.

2014-09-25 15:41:30,926 – PixivUtil20140712 – INFO – ###############################################################
2014-09-25 15:41:30,926 – PixivUtil20140712 – INFO – Starting…
2014-09-25 15:41:30,931 – PixivUtil20140712 – INFO – Setting log level to: DEBUG
2014-09-25 15:41:30,933 – PixivUtil20140712 – INFO – No default cookie jar available, creating…
2014-09-25 15:41:30,937 – PixivUtil20140712 – INFO – Only process member where day last updated >= 7
2014-09-25 15:41:30,937 – PixivUtil20140712 – INFO – Using Username: cpgendo
2014-09-25 15:41:30,938 – PixivUtil20140712 – INFO – logging in with saved cookie
2014-09-25 15:41:30,940 – PixivUtil20140712 – INFO – Trying to log with saved cookie
2014-09-25 15:41:34,384 – PixivUtil20140712 – INFO – Logged in using cookie
2014-09-25 15:41:35,707 – PixivUtil20140712 – INFO – Image id mode.
2014-09-25 15:41:42,953 – PixivUtil20140712 – DEBUG – Sanitized Filename: D:\Davide\pixivup\757127\757127 – 46132631_big_p0 – 9-23-2014.png
2014-09-25 15:41:42,960 – PixivUtil20140712 – ERROR – download_image()
2014-09-25 15:41:46,967 – PixivUtil20140712 – ERROR – download_image()
2014-09-25 15:41:50,974 – PixivUtil20140712 – ERROR – download_image()
2014-09-25 15:41:54,986 – PixivUtil20140712 – ERROR – download_image()
2014-09-25 15:41:54,989 – PixivUtil20140712 – ERROR – Giving up url: http://i2.pixiv.net/img30/img/yukinokeisuke/46132631_big_p0.png
2014-09-25 15:41:54,990 – PixivUtil20140712 – ERROR – Error when download_image(): http://i2.pixiv.net/img30/img/yukinokeisuke/46132631_big_p0.png
Traceback (most recent call last):
File “PixivUtil2.py”, line 724, in process_image
File “PixivUtil2.py”, line 215, in download_image
File “PixivUtil2.py”, line 215, in download_image
File “PixivUtil2.py”, line 215, in download_image
File “PixivUtil2.py”, line 86, in download_image
File “mechanize_mechanize.pyc”, line 199, in open_novisit
File “mechanize_mechanize.pyc”, line 230, in mech_open
File “mechanize_opener.pyc”, line 188, in open
File “mechanize_urllib2_fork.pyc”, line 1043, in do_request

URLError:
2014-09-25 15:41:54,990 – PixivUtil20140712 – DEBUG – Sanitized Filename: D:\Davide\pixivup\757127\757127 – 46132631_p0 – 9-23-2014.png
2014-09-25 15:41:54,999 – PixivUtil20140712 – ERROR – download_image()
2014-09-25 15:41:59,019 – PixivUtil20140712 – ERROR – download_image()
2014-09-25 15:42:03,025 – PixivUtil20140712 – ERROR – download_image()
2014-09-25 15:42:07,032 – PixivUtil20140712 – ERROR – download_image()
2014-09-25 15:42:07,032 – PixivUtil20140712 – ERROR – Giving up url: http://i2.pixiv.net/img30/img/yukinokeisuke/46132631_p0.png
2014-09-25 15:42:07,033 – PixivUtil20140712 – ERROR – Error when download_image(): http://i2.pixiv.net/img30/img/yukinokeisuke/46132631_p0.png

AttributeError("'NoneType' object has no attribute 'ul'",)

Running on Mac OS X 10.10 (Yosemite), encountered this error when trying both the user bookmarks and the image bookmarks.

After searching around in the HTML and the program's code, I found the issue:

It seems that Pixiv (at least for me) changed the class from 'display_works linkStyleWorks' to 'display_works linkStyleWorks ' (with a space at the end).

So I added the space to the relevant lines in the code, and everything works fine. I'm not sure if this is a Mac-only error or if it is now a universal change for Pixiv.
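A match that does not depend on the exact spacing of the class attribute would survive this kind of change. The sketch below compares class names as a set of tokens instead of comparing the raw attribute string. `has_all_classes` is a hypothetical helper, not something from the program's code.

```python
def has_all_classes(class_attr, required):
    """Return True if every class in `required` appears in `class_attr`.

    Both arguments are whitespace-separated class lists, so trailing or
    repeated spaces and the order of the classes do not matter.
    """
    present = set(class_attr.split())
    return set(required.split()) <= present


if __name__ == "__main__":
    wanted = "display_works linkStyleWorks"
    print(has_all_classes("display_works linkStyleWorks ", wanted))
    print(has_all_classes("display_works", wanted))
```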

Regardless, I thought I'd report it!

-Matt

[Feature] Counter in titlebar

When downloading from a list, the title bar of the program is updated with the MemberID being downloaded and the page number. Could you include a counter as well, so we can quickly see how far along we are in the list? Something along the lines of MemberId: 1234 Page: 5 Order: 9 of 17.

Allow formatting of manga folder & filename

I added a comment to your blog about this feature (http://nandaka.wordpress.com/2012/07/04/pixiv-downloader-20120704/#comment-1983)

It would be good to have some control over the folder and file names used for the manga entries. Presumably, it could use the same tags and formatting method used for file names.

Can I suggest two new tags for handling the page number part of the file name? How about:

  • %page_number% or %page_num% for page numbering using human-readable numbering (e.g. 1 of 7) starting at 1
  • %page_index% or %page_idx% for page numbering using pixiv's internal numbering with the index starting at 0

It would also be good to be able to specify whether to use leading zeros, with the number of zeros based on the total number of images. For 100 images using %page_number% the first image would be 001. Using %page_index% the first image of 100 would be 00 and the last, 99.
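The padding rule described above can be sketched as follows: the width is taken from the largest value the tag can produce for that work. The function name and its parameters are hypothetical; only the two proposed tags come from the suggestion itself.

```python
def format_page(index, total, one_based):
    """Format a page for a file name with leading zeros.

    index     -- pixiv's internal zero-based page index
    total     -- total number of pages in the work
    one_based -- True for the proposed %page_number%, False for %page_index%
    """
    largest = total if one_based else total - 1
    width = len(str(max(largest, 0)))
    value = index + 1 if one_based else index
    return str(value).zfill(width)


if __name__ == "__main__":
    # First page of a 100-page work under both numbering schemes.
    print(format_page(0, 100, one_based=True))
    print(format_page(0, 100, one_based=False))
```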

bookmark download does not recover from errors.

2014-08-31 22:14:02,578 - PixivUtil20140712 - INFO - Image ID (2456####): 2002 'Not in MyPick List, Need Permission!'
2014-08-31 22:14:02,584 - PixivUtil20140712 - ERROR - Error at process_image(): (<type 'exceptions.AttributeError'>, AttributeError("'NoneType' object has no attribute 'imageTitle'",), <traceback object at 0x037C7698>)
2014-08-31 22:14:02,584 - PixivUtil20140712 - ERROR - Error at process_image(): 2456####
Traceback (most recent call last):
File "PixivUtil2.py", line 644, in process_image
AttributeError: 'NoneType' object has no attribute 'imageTitle'
2014-08-31 22:14:02,585 - PixivUtil20140712 - ERROR - Error at process_image_bookmark(): (<type 'exceptions.AttributeError'>, AttributeError("'NoneType' object has no attribute 'imageTitle'",), <traceback object at 0x033A50A8>)
Traceback (most recent call last):
File "PixivUtil2.py", line 977, in process_image_bookmark
File "PixivUtil2.py", line 644, in process_image
AttributeError: 'NoneType' object has no attribute 'imageTitle'
2014-08-31 22:14:02,588 - PixivUtil20140712 - ERROR - Unknown Error: 'NoneType' object has no attribute 'imageTitle'
Traceback (most recent call last):
File "PixivUtil2.py", line 1801, in main
File "PixivUtil2.py", line 1607, in main_loop
File "PixivUtil2.py", line 1470, in menu_download_from_online_image_bookmark
File "PixivUtil2.py", line 977, in process_image_bookmark
File "PixivUtil2.py", line 644, in process_image
AttributeError: 'NoneType' object has no attribute 'imageTitle'
2014-08-31 22:18:06,513 - PixivUtil20140712 - INFO - EXIT
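In the log above, a single failing image ends the whole bookmark run. One way to keep going is to catch the error per image and collect the failures, as in this sketch. `process_all` and `process_one` are hypothetical stand-ins and do not reflect how PixivUtil2 is actually structured.

```python
def process_all(image_ids, process_one):
    """Process every id, recording failures instead of stopping."""
    failed = []
    for image_id in image_ids:
        try:
            process_one(image_id)
        except Exception as exc:  # keep the batch alive on any per-image error
            failed.append((image_id, exc))
    return failed


if __name__ == "__main__":
    def fake_process(image_id):
        # Stand-in for the real per-image work; id 2 simulates the failure.
        if image_id == 2:
            raise AttributeError("'NoneType' object has no attribute 'imageTitle'")

    for image_id, error in process_all([1, 2, 3], fake_process):
        print("skipped", image_id, "because:", error)
```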

'NoneType' object has no attribute 'ul'

This happens sometimes; restarting and entering the same input usually works fine. (Probably because the program was open for too long and the cookie expired?)

Log:

PixivDownloader2 version 20120704
https://nandaka.wordpress.com/tag/pixiv-downloader/

  1. Download by member_id
  2. Download by image_id
  3. Download by tags
  4. Download from list
  5. Download from online user bookmark
  6. Download from online image bookmark
  7. Download from tags list
  8. Download new illust from bookmark
  9. Download by Title/Caption

10. Download by Tag and Member Id

d. Manage database
e. Export online bookmark
x. Exit
Input: 1
Member id: 76266
Start Page (default=1):
End Page (default=0, 0 for no limit):
Processing Member Id: 76266
Reading config file... done.
Page 1
'NoneType' object has no attribute 'ul'
1 2 3 4
'NoneType' object has no attribute 'ul'
1 2 3 4
'NoneType' object has no attribute 'ul'
1 2 3 4
'NoneType' object has no attribute 'ul'
1 2 3 4
'NoneType' object has no attribute 'ul'
1 2 3 4
'NoneType' object has no attribute 'ul'
1 2 3 4
'NoneType' object has no attribute 'ul'
1 2 3 4
'NoneType' object has no attribute 'ul'
1 2 3 4
'NoneType' object has no attribute 'ul'
1 2 3 4
'NoneType' object has no attribute 'ul'
1 2 3 4
'NoneType' object has no attribute 'ul'
1 2 3 4
'NoneType' object has no attribute 'ul'
1 2 3 4
'NoneType' object has no attribute 'ul'
1 2 3 4
'NoneType' object has no attribute 'ul'
1 2 3 4
'NoneType' object has no attribute 'ul'
1 2 3 4
'NoneType' object has no attribute 'ul'
1 2 3 4
'NoneType' object has no attribute 'ul'
1 2 3 4
'NoneType' object has no attribute 'ul'
1

Error at processImage() for non-English folders

PixivDownloader2 version 20120704
https://nandaka.wordpress.com/tag/pixiv-downloader/
Reading config file... done.
Creating database... done.
Only process member where day last updated >= 7
Using Username: test56
logging in with saved cookie
Trying to log with saved cookie
Cookie already expired/invalid.
Log in using form.
done.
new cookie value: ad9248ee374adac1b63e855ce01f679f
Writing config file... done.
PixivDownloader2 version 20120704
https://nandaka.wordpress.com/tag/pixiv-downloader/

  1. Download by member_id
  2. Download by image_id
  3. Download by tags
  4. Download from list
  5. Download from online user bookmark
  6. Download from online image bookmark
  7. Download from tags list
  8. Download new illust from bookmark
  9. Download by Title/Caption

10. Download by Tag and Member Id

d. Manage database
e. Export online bookmark
x. Exit
Input: 1
Member id: 1598540
Start Page (default=1):
End Page (default=0, 0 for no limit):
Processing Member Id: 1598540
Reading config file... done.
Page 1
Member Name : DanteWontDie
Member Avatar: http://i2.pixiv.net/img46/profile/dantewontdie/4706570.jpg
Member Token : dantewontdie
#1

Processing Image Id: 28628322
Title: Want some fun?
Tags : ???? ???? ??? ?????? ??
Mode : big
Image URL : http://i2.pixiv.net/img46/img/dantewontdie/28628322.jpg
Traceback (most recent call last):
File "PixivUtil2.py", line 649, in processImage
File "PixivHelper.pyc", line 49, in sanitizeFilename
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 32: ordinal
not in range(128)
Error at processImage(): (<type 'exceptions.UnicodeDecodeError'>, UnicodeDecodeE
rror('ascii', 'V:\Documents and Settings\Admin\xd0\xe0\xe1\xee\xf7\xe8\xe9
xf1\xf2\xee\xeb\DL Image Packs', 32, 33, 'ordinal not in range(128)'), <trace
back object at 0x016A67D8>)
Dumping html to: Error Medium Page for image 28628322.html
Cannot dump page for image_id: 28628322
Stuff happened, trying again after 2 second ( 1 )
local variable 'parseBigImage' referenced before assignment
Processing Image Id: 28628322
Title: Want some fun?
Tags : ???? ???? ??? ?????? ??
Mode : big
Image URL : http://i2.pixiv.net/img46/img/dantewontdie/28628322.jpg
Traceback (most recent call last):
File "PixivUtil2.py", line 649, in processImage
File "PixivHelper.pyc", line 49, in sanitizeFilename
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 32: ordinal
not in range(128)
Error at processImage(): (<type 'exceptions.UnicodeDecodeError'>, UnicodeDecodeE
rror('ascii', 'V:\Documents and Settings\Admin\xd0\xe0\xe1\xee\xf7\xe8\xe9
xf1\xf2\xee\xeb\DL Image Packs', 32, 33, 'ordinal not in range(128)'), <trace
back object at 0x01787198>)
Dumping html to: Error Medium Page for image 28628322.html
Cannot dump page for image_id: 28628322
Stuff happened, trying again after 2 second ( 2 )
local variable 'parseBigImage' referenced before assignment
Processing Image Id: 28628322
Title: Want some fun?
Tags : ???? ???? ??? ?????? ??
Mode : big
Image URL : http://i2.pixiv.net/img46/img/dantewontdie/28628322.jpg
Traceback (most recent call last):
File "PixivUtil2.py", line 649, in processImage
File "PixivHelper.pyc", line 49, in sanitizeFilename
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 32: ordinal
not in range(128)
Error at processImage(): (<type 'exceptions.UnicodeDecodeError'>, UnicodeDecodeE
rror('ascii', 'V:\Documents and Settings\Admin\xd0\xe0\xe1\xee\xf7\xe8\xe9
xf1\xf2\xee\xeb\DL Image Packs', 32, 33, 'ordinal not in range(128)'), <trace
back object at 0x016BDA58>)
Dumping html to: Error Medium Page for image 28628322.html
Cannot dump page for image_id: 28628322
Stuff happened, trying again after 2 second ( 3 )
local variable 'parseBigImage' referenced before assignment
Processing Image Id: 28628322
Title: Want some fun?
Tags : ???? ???? ??? ?????? ??
Mode : big
Image URL : http://i2.pixiv.net/img46/img/dantewontdie/28628322.jpg
Traceback (most recent call last):
File "PixivUtil2.py", line 649, in processImage
File "PixivHelper.pyc", line 49, in sanitizeFilename
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 32: ordinal
not in range(128)
Error at processImage(): (<type 'exceptions.UnicodeDecodeError'>, UnicodeDecodeE
rror('ascii', 'V:\Documents and Settings\Admin\xd0\xe0\xe1\xee\xf7\xe8\xe9
xf1\xf2\xee\xeb\DL Image Packs', 32, 33, 'ordinal not in range(128)'), <trace
back object at 0x01904788>)
Dumping html to: Error Medium Page for image 28628322.html
Cannot dump page for image_id: 28628322
Stuff happened, trying again after 2 second ( 4 )
local variable 'parseBigImage' referenced before assignment
Processing Image Id: 28628322
Title: Want some fun?
Tags : ???? ???? ??? ?????? ??
Mode : big
Image URL : http://i2.pixiv.net/img46/img/dantewontdie/28628322.jpg
Traceback (most recent call last):
File "PixivUtil2.py", line 649, in processImage
File "PixivHelper.pyc", line 49, in sanitizeFilename
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 32: ordinal
not in range(128)
Error at processImage(): (<type 'exceptions.UnicodeDecodeError'>, UnicodeDecodeE
rror('ascii', 'V:\Documents and Settings\Admin\xd0\xe0\xe1\xee\xf7\xe8\xe9
xf1\xf2\xee\xeb\DL Image Packs', 32, 33, 'ordinal not in range(128)'), <trace
back object at 0x019081C0>)
Dumping html to: Error Medium Page for image 28628322.html
Cannot dump page for image_id: 28628322
Giving up image_id: 28628322
PixivDownloader2 version 20120704
https://nandaka.wordpress.com/tag/pixiv-downloader/

  1. Download by member_id
  2. Download by image_id
  3. Download by tags
  4. Download from list
  5. Download from online user bookmark
  6. Download from online image bookmark
  7. Download from tags list
  8. Download new illust from bookmark
  9. Download by Title/Caption

10. Download by Tag and Member Id

d. Manage database
e. Export online bookmark
x. Exit
Input:
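The UnicodeDecodeError above is what happens when a byte string holding a non-ASCII folder name is decoded as ASCII. The sketch below reproduces this with the bytes from the log, written for Python 3. It assumes the folder name is encoded in Windows-1251 and that the character lost at the line wrap in the log is a space; both are guesses. `decode_folder_name` is a hypothetical helper.

```python
# Bytes of the folder name as they appear in the error message
# (the space in the middle is assumed, see the note above).
FOLDER_BYTES = b"\xd0\xe0\xe1\xee\xf7\xe8\xe9 \xf1\xf2\xee\xeb"


def decode_folder_name(raw_bytes, encoding="cp1251"):
    """Decode a path component using the file system's code page."""
    return raw_bytes.decode(encoding)


if __name__ == "__main__":
    try:
        FOLDER_BYTES.decode("ascii")
    except UnicodeDecodeError as exc:
        print("ascii fails:", exc)
    print("cp1251 works:", decode_folder_name(FOLDER_BYTES))
```

In practice the encoding would be taken from the system (for example via `sys.getfilesystemencoding()`) rather than hard-coded.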

Cannot use "createdownloadlists = True"

Whenever createdownloadlists is set to True, errors occur when the program tries to create or add to the download lists. I have downloadlistdirectory = ., but it happens no matter what the directory is set to. The images themselves do download.

Start downloading... 620567 of 620567 Bytes Complete.
Traceback (most recent call last):
File "PixivUtil2.py", line 178, in downloadImage
File "codecs.pyc", line 881, in open
IOError: [Errno 2] No such file or directory: u'C:\Users\nokobon\Desktop\Pixiv Util 8-6-2012b\library.zip.\Downloaded_on_2012_08-26.txt
1 2 3 4

Images from the showcase are also downloaded when downloading by tags

Need to remove the images in the showcase section; the ul elements are otherwise identical:

    Showcase in html body.abtest-u6 div#wrapper div#page-search section.column-main section#search-result.image-list div.user-ad-container section.showcase ul.images xpath=/html/body/div/div/section/section/div/section/ul

    Actual search result in html body.abtest-u6 div#wrapper div#page-search section.column-main section#search-result.image-list ul.images xpath=/html/body/div/div/section/section/ul
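Since the two lists differ only in their ancestors, the filter has to look at where each `ul.images` sits. Below is a sketch using only the standard library's `html.parser`; it collects links from `ul.images` lists unless an enclosing `section` carries the `showcase` class. The class `SearchResultParser` is hypothetical and is not how the program currently parses pages.

```python
from html.parser import HTMLParser


class SearchResultParser(HTMLParser):
    """Collect links in ul.images lists that are not inside section.showcase."""

    VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}

    def __init__(self):
        super().__init__()
        self.stack = []  # (tag, set of classes) for each open element
        self.links = []

    def _in_real_results(self):
        in_list = any(t == "ul" and "images" in c for t, c in self.stack)
        in_showcase = any(t == "section" and "showcase" in c for t, c in self.stack)
        return in_list and not in_showcase

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        if tag == "a" and attrs.get("href") and self._in_real_results():
            self.links.append(attrs["href"])
        if tag not in self.VOID_TAGS:
            self.stack.append((tag, classes))

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, if there is one.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break


if __name__ == "__main__":
    page = (
        '<section id="search-result" class="image-list">'
        '<div class="user-ad-container"><section class="showcase">'
        '<ul class="images"><li><a href="/showcase-item">x</a></li></ul>'
        '</section></div>'
        '<ul class="images"><li><a href="/real-result">y</a></li></ul>'
        '</section>'
    )
    parser = SearchResultParser()
    parser.feed(page)
    print(parser.links)
```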

Suggested feature: command to interrupt operations only on the image or manga currently in download

Normally, if you want to interrupt the downloader without forcing it to close, you have to press CTRL+C, which terminates any operation on the current member_id (when you are downloading from a list, it skips to the next member_id).

The idea is to have another keyboard command that applies to the image_id instead of the member_id. Pressing it would terminate the operation (be it downloading or checking) on the current image or manga, and the program would skip to the next image_id.

SOCKS proxy support

Hey there! I've used your script and I was wondering if you could add SOCKS proxy support?

All it takes is this:

  • pip install https://socksipy-branch.googlecode.com/files/SocksiPy-branch-1.02.tar.gz to install the SOCKS library.

  • Add something like this to your code:

    if __config__.useProxy and __config__.proxy.startswith('socks'):
        import socks, urlparse
        # Split a proxy URL such as socks5://host:port into its parts.
        parseResult = urlparse.urlparse(__config__.proxy)
    
        assert parseResult.scheme and parseResult.hostname and parseResult.port
    
        socksType = socks.PROXY_TYPE_SOCKS5 if parseResult.scheme == 'socks5' else socks.PROXY_TYPE_SOCKS4
    
        # Use the type derived from the scheme rather than always SOCKS5.
        socks.setdefaultproxy(socksType, parseResult.hostname, parseResult.port)
    
        # Route the standard HTTP modules through the proxy.
        socks.wrapmodule(urllib)
        socks.wrapmodule(urllib2)
        socks.wrapmodule(httplib)

    underneath the urllib and httplib imports.

I've done something similar locally but I don't understand your code well enough to integrate it properly ^.^

UseTagAsDir generate wrong directory

I downloaded files with %tag1%, then later switched to %tag2%. I’m using ‘usetagsasdir’ so I expected something like /root/%searchtag%/. It worked fine with the first batch, but the second batch resulted in /root/%tag1%/%tag2%/

need to check processTags

After the last update, only even-numbered pages are downloaded in manga mode

Here is a full log:
screenshot: https://www.dropbox.com/s/0e83l5i3wtba2p3/Capture%20d%27%C3%A9cran%202014-10-03%2019.13.22.png?dl=0

Image ids: 30507937
Processing Image Id: 30507937
Title: 紅楼夢8新刊「共食い禁止令!」
Tags : 東方, 東方紅楼夢, 地霊殿, 霊烏路空, 古明地さとり
Date : 2012-10-01 21:27:00
Mode : manga
Fetching big image page: http://www.pixiv.net/member_illust.php?mode=manga_big&illust_id=3
0507937&page=0
Fetching big image page: http://www.pixiv.net/member_illust.php?mode=manga_big&illust_id=3
0507937&page=1
Fetching big image page: http://www.pixiv.net/member_illust.php?mode=manga_big&illust_id=3
0507937&page=2
Fetching big image page: http://www.pixiv.net/member_illust.php?mode=manga_big&illust_id=3
0507937&page=3
Fetching big image page: http://www.pixiv.net/member_illust.php?mode=manga_big&illust_id=3
0507937&page=4
Page Count : 5
Image URL : http://i2.pixiv.net/img14/img/kagamino/30507937_big_p0.jpg
Filename : I:\Pixiv\193027 kagamino\烏丸あみる _\30507937_big_p0 - 紅楼夢8新刊「共食い禁レ
Using Referer: http://www.pixiv.net/member_illust.php?mode=medium&illust_id=30507937
Start downloading... 357364 of 357364 Bytes Completed in 2.793s (124.95 KiB/s)
done.

Image URL : http://i2.pixiv.net/img14/img/kagamino/30507937_big_p2.jpg
Filename : I:\Pixiv\193027 kagamino\烏丸あみる _\30507937_big_p2 - something.jpg
Using Referer: http://www.pixiv.net/member_illust.php?mode=medium&illust_id=30507937
Start downloading... 622938 of 622938 Bytes Completed in 2.935s (207.27 KiB/s)
done.

Filename : I:\Pixiv\193027 kagamino\烏丸あみる _\30507937_big_p4 - something.jpg
Using Referer: http://www.pixiv.net/member_illust.php?mode=medium&illust_id=30507937
Start downloading... 930539 of 930539 Bytes Completed in 2.761s (329.13 KiB/s)
done.

Requests

I use your pixiv downloader and am grateful for it.

However, I have come to need a feature and would like to suggest adding it.

Please add a feature that saves an image's description to a text file with the same name as the image.

Many artists write a lot of background text for their images, and saving it separately by hand is a lot of work.

Please consider it.

Manga Mode downloading mobile site images following site redesign.

Following a recent redesign of Pixiv's manga sub-pages, the utility has been attempting to download images from the mobile site as well as the normal 'big' image. It returns a 404 when it looks for the 'big' image on the mobile site, and downloads the 128x128 preview image instead.

It appears that the new design is hosting the mobile images in a way that causes the utility to register them as normal images. When downloading a two-image submission, it registers four images on the page.
#4

Processing Image Id: 35718646
Title: ??????????????????
Tags : ?????????? ???????????? ?????? ??
Mode : manga
Page Count : 4
Image URL : http://i1.pixiv.net/img05/img/hitman/35718646_big_p0.jpg
Filename : K:\Download Pile\Pixiv\Test\35718646_big_p0.jpg
Using Referer: http://www.pixiv.net/member_illust.php?mode=manga&illust_id=35718646
Start downloading... 1332421 of 1332421 Bytes Complete.
done.

Image URL : http://i1.pixiv.net/img05/img/hitman/mobile/35718646_128x128_big_p0.jpg
Filename : K:\Download Pile\Pixiv\Test\35718646_128x128_big_p0.jpg
Using Referer: http://www.pixiv.net/member_illust.php?mode=manga&illust_id=35718646
[downloadImage()] HTTP Error 404: Not Found (http://i1.pixiv.net/img05/img/hitman/mobile/35718646_128x128_big_p0.jpg)
No big manga image available, try the small one

Image URL : http://i1.pixiv.net/img05/img/hitman/mobile/35718646_128x128_p0.jpg
Filename : K:\Download Pile\Pixiv\Test\35718646_128x128_p0.jpg
Using Referer: http://www.pixiv.net/member_illust.php?mode=manga&illust_id=35718646
Start downloading... 14602 of 14602 Bytes Complete.
done.

Image URL : http://i1.pixiv.net/img05/img/hitman/35718646_big_p1.jpg
Filename : K:\Download Pile\Pixiv\Test\35718646_big_p1.jpg
Using Referer: http://www.pixiv.net/member_illust.php?mode=manga&illust_id=35718646
Start downloading... 161704 of 161704 Bytes Complete.
done.

Image URL : http://i1.pixiv.net/img05/img/hitman/mobile/35718646_128x128_big_p1.jpg
Filename : K:\Download Pile\Pixiv\Test\35718646_128x128_big_p1.jpg
Using Referer: http://www.pixiv.net/member_illust.php?mode=manga&illust_id=35718646
[downloadImage()] HTTP Error 404: Not Found (http://i1.pixiv.net/img05/img/hitman/mobile/35718646_128x128_big_p1.jpg)
No big manga image available, try the small one

Image URL : http://i1.pixiv.net/img05/img/hitman/mobile/35718646_128x128_p1.jpg
Filename : K:\Download Pile\Pixiv\Test\35718646_128x128_p1.jpg
Using Referer: http://www.pixiv.net/member_illust.php?mode=manga&illust_id=35718646
Start downloading... 7727 of 7727 Bytes Complete.
done.
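
One possible way to keep the mobile previews out of the page list is to filter the collected URLs before downloading. This is only a sketch: `drop_mobile_thumbnails` is a hypothetical helper, not a PixivUtil2 function, and it assumes the previews can be recognised by the "/mobile/" path segment and the WIDTHxHEIGHT marker seen in the log above.

```python
import re

# Assumption: preview images live under a "/mobile/" path segment or carry a
# WIDTHxHEIGHT marker (e.g. "_128x128_") in the file name, as in the log above.
_THUMBNAIL = re.compile(r"/mobile/|_\d+x\d+_")


def drop_mobile_thumbnails(urls):
    """Return only the URLs that do not look like mobile preview images."""
    return [url for url in urls if not _THUMBNAIL.search(url)]
```

With the four URLs registered for the two-image submission above, this would leave only the two 'big' images.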

Question: Is it possible to run on Mac OS X?

Does building from source mean that it can run on Mac OS X? I tried replacing the default setup.py file with one created by 'py2applet', which uses 'py2app' to make a Mac OS X distribution, but the generated application crashes after startup. Does anyone know how this should be done?

I used the following command to create a setup file:

py2applet --make-setup PixivUtil2.py --iconfile icon-mac.icns 

I made small changes in the setup file to avoid some errors. This is the resulting file:

"""
This is a setup.py script generated by py2applet

Usage:
    python setup.py py2app
"""

from setuptools import setup

APP = ['PixivUtil2.py']
DATA_FILES = []
OPTIONS = {'argv_emulation': True,
 'iconfile': '/Users/connect/Desktop/PixivUtil2-master/icon-mac.icns'}

setup(
    app=APP,
    data_files=DATA_FILES,
    options={'py2app': OPTIONS},
    setup_requires=['py2app','BeautifulSoup']
)

Then I tried to build an executable by entering:

python setup.py py2app

After starting the application, a window shows up with the message "PixivUtil2 Error". The console shows these errors:

05-09-14 14:14:20,755 PixivUtil2[1409]: PixivDownloader2 version 20140712
05-09-14 14:14:20,756 PixivUtil2[1409]: https://nandaka.wordpress.com/tag/pixiv-downloader/
05-09-14 14:14:20,756 PixivUtil2[1409]: Reading /Users/connect/Desktop/PixivUtil2-master/dist/PixivUtil2.app/Contents/MacOS/config.ini ...
05-09-14 14:14:20,756 PixivUtil2[1409]: Error at loadConfig(): (<type 'exceptions.IOError'>, IOError(2, 'No such file or directory'), <traceback object at 0x10481ff80>)
05-09-14 14:14:20,756 PixivUtil2[1409]: Some configuration have invalid value, replacing with the default value.
05-09-14 14:14:20,757 PixivUtil2[1409]: Writing config file... Backing up old config (error exist!) to config.ini.error-1409919260
05-09-14 14:14:20,757 PixivUtil2[1409]: done.
05-09-14 14:14:20,757 PixivUtil2[1409]: done.
05-09-14 14:14:20,762 PixivUtil2[1409]: Traceback (most recent call last):
05-09-14 14:14:20,762 PixivUtil2[1409]:   File "/Users/connect/Desktop/PixivUtil2-master/dist/PixivUtil2.app/Contents/Resources/__boot__.py", line 373, in <module>
05-09-14 14:14:20,762 PixivUtil2[1409]:     _run()
05-09-14 14:14:20,762 PixivUtil2[1409]:   File "/Users/connect/Desktop/PixivUtil2-master/dist/PixivUtil2.app/Contents/Resources/__boot__.py", line 358, in _run
05-09-14 14:14:20,762 PixivUtil2[1409]:     exec(compile(source, path, 'exec'), globals(), globals())
05-09-14 14:14:20,762 PixivUtil2[1409]:   File "/Users/connect/Desktop/PixivUtil2-master/dist/PixivUtil2.app/Contents/Resources/PixivUtil2.py", line 1828, in <module>
05-09-14 14:14:20,762 PixivUtil2[1409]:     main()
05-09-14 14:14:20,762 PixivUtil2[1409]:   File "/Users/connect/Desktop/PixivUtil2-master/dist/PixivUtil2.app/Contents/Resources/PixivUtil2.py", line 1718, in main
05-09-14 14:14:20,763 PixivUtil2[1409]:     dfilename = PixivHelper.toUnicode(sys.path[0], encoding=sys.stdin.encoding) + os.sep + dfilename
05-09-14 14:14:20,763 PixivUtil2[1409]:   File "PixivHelper.pyc", line 279, in toUnicode
05-09-14 14:14:20,763 PixivUtil2[1409]: TypeError: unicode() argument 2 must be string, not None
05-09-14 14:14:20,781 PixivUtil2[1409]: PixivUtil2 Error
05-09-14 14:14:20,781 PixivUtil2[1409]: 2014-09-05 14:14:20.780 PixivUtil2[1409:707] PixivUtil2 Error
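
Going by the traceback, sys.stdin.encoding appears to be None inside the app bundle (probably because no terminal is attached), and that None is then passed on as the encoding argument. A minimal sketch of a fallback follows; `pick_encoding` is a hypothetical helper, not part of PixivUtil2, and it is written for Python 3 rather than the Python 2 shown in the log.

```python
import locale
import sys


def pick_encoding(stream=None, default="utf-8"):
    """Return a usable encoding name even when the stream reports none."""
    # Hypothetical helper: fall back to sys.stdin when no stream is given.
    stream = sys.stdin if stream is None else stream
    # A detached stdin may be None or may have encoding set to None.
    encoding = getattr(stream, "encoding", None)
    if encoding:
        return encoding
    # Otherwise use the locale's preferred encoding, then a fixed default.
    return locale.getpreferredencoding(False) or default
```

The result could then be passed in place of sys.stdin.encoding, so the conversion never receives None.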

Request: Animated images "Ugoira" metadata support

http://www.pixiv.net/info.php?id=2476&lang=en

Currently PixivUtil2 downloads these as .zip files. However, it doesn't save any extra data, namely the delays between frames. Thus, if an image submission gets removed, there is no way to recover that information.

There can be several options for converting Ugoira to a viewable format:

  • gif. The most compatible option, but a rather bad one due to quality loss.
  • apng. 100% quality, but supported only by some browsers and image viewers.
  • webm. Some quality loss is to be expected, but the resulting file would be rather small and is supported by most browsers.

Of course, all of these options would require some serious coding work or relying on more prerequisites such as ImageMagick or apngasm.

At the same time, there is another (easier, in my opinion) option that would at least preserve the extra info for later, when more options might emerge:

  • save the whole html page containing the image, so it can still be viewed directly in any browser, just like the original.
    (I think it should be the same as the "ugoira_view" page)
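
A lighter variant of the same idea, swapped in here for illustration, would be to write only the frame timing to a small JSON file next to the .zip instead of the whole html page. The sketch below is not how PixivUtil2 works: `save_frame_delays` and the `delay_ms` key are hypothetical, and it assumes each entry in the page's frame list carries a `file` name and a `delay` value.

```python
import json
from pathlib import Path


def save_frame_delays(zip_path, frames):
    """Write frame timing next to the ugoira zip as <name>.json."""
    # Assumption: each frame is a mapping with "file" and "delay" keys.
    sidecar = Path(zip_path).with_suffix(".json")
    payload = {
        "frames": [
            {"file": frame["file"], "delay_ms": int(frame["delay"])}
            for frame in frames
        ]
    }
    sidecar.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return sidecar
```

Keeping the delays in a plain sidecar file would leave the downloaded .zip untouched while still allowing a later conversion to gif, apng or webm.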

There is also a userscript that converts Ugoira into svg, but it requires extracting the images from the zip and relies on an svg-compliant browser. Even though not many people would want that, saving the original .html would give them the option to do so.

Script in question, if that helps:

// ==UserScript==
// @name        Ugoira backup
// @include     http://www.pixiv.net/member_illust.php*
// @grant       none
// @run-at      document-start
// ==/UserScript==

document.addEventListener('DOMContentLoaded', function(){

var frames = pixiv.context.ugokuIllustFullscreenData.frames,
    numFrames = frames.length,
    width = pixiv.context.illustSize[0],
    height = pixiv.context.illustSize[1];

var result = '';

for(var i=0; i<=numFrames-1; i++) {
    result += ''+
        ''+
        ''+
        '';
}
result +='';

$("._ugoku-illust-player-container").after(
    $("<a>", {
        href : 'data:image/svg+xml,'+result,
        text: 'Get ugoira .svg ' ,
        target:'_blank'
    })
);

$("._ugoku-illust-player-container").after(' | ');

$("._ugoku-illust-player-container").after(
    $("<a>", {
        href : pixiv.context.ugokuIllustFullscreenData.src,
        text: 'Get ugoira .zip '
    })
);

}, false);

Option 8 (download from bookmark) doesn't get anything

As of July 9th, option 8 doesn't work (it did last time I used it, about 24 hours earlier), but it's not the type of error that crashes the software. It just goes:

Processing new Illust from bookmark
Page #1
No Images!
Done.

The other functions I use regularly, 1 and 2, work as usual, so my only guess is that Pixiv may have changed something in the coding of the bookmark_new_illust.php page.

Attribute Error: 'NoneType' object has no attribute 'ul'

http://i.imgur.com/oD5ugXk.jpg
20140325

HTML error dump:
http://a.pomf.se/trlfvz.html

Firewall was cleared, etc. I redid the cookie config.ini, same result.

2014-10-24 12:19:43,779 - PixivUtil20140325 - INFO - ###############################################################
2014-10-24 12:19:43,779 - PixivUtil20140325 - INFO - Starting...
2014-10-24 12:19:43,785 - PixivUtil20140325 - INFO - Setting log level to: DEBUG
2014-10-24 12:19:43,786 - PixivUtil20140325 - INFO - No default cookie jar available, creating...
2014-10-24 12:19:43,792 - PixivUtil20140325 - INFO - Only process member where day last updated >= 7
2014-10-24 12:19:43,793 - PixivUtil20140325 - INFO - Using Username: conquistadork
2014-10-24 12:19:43,795 - PixivUtil20140325 - INFO - logging in with saved cookie
2014-10-24 12:19:43,796 - PixivUtil20140325 - INFO - Trying to log with saved cookie
2014-10-24 12:19:45,996 - PixivUtil20140325 - INFO - Logged in using cookie
2014-10-24 12:19:48,657 - PixivUtil20140325 - INFO - Member id mode.
2014-10-24 12:19:57,851 - PixivUtil20140325 - INFO - Processing Member Id: 17040
2014-10-24 12:19:57,884 - PixivUtil20140325 - INFO - Member Url: http://www.pixiv.net/member_illust.php?id=17040&p=1
2014-10-24 12:20:00,312 - PixivUtil20140325 - ERROR - Error at process_member(): (<type 'exceptions.AttributeError'>, AttributeError("'NoneType' object has no attribute 'ul'",), <traceback object at 0x034CAAD0>)
2014-10-24 12:20:00,312 - PixivUtil20140325 - ERROR - Error at process_member(): 17040
Traceback (most recent call last):
File "PixivUtil2.py", line 420, in process_member
File "PixivModel.pyc", line 46, in __init__
File "PixivModel.pyc", line 122, in ParseImages
AttributeError: 'NoneType' object has no attribute 'ul'
2014-10-24 12:20:00,667 - PixivUtil20140325 - ERROR - Dumping html to: Error page for member 17040.html
2014-10-24 12:20:00,671 - PixivUtil20140325 - ERROR - Unknown Error: 'NoneType' object has no attribute 'ul'
Traceback (most recent call last):
File "PixivUtil2.py", line 1833, in main
File "PixivUtil2.py", line 1630, in main_loop
File "PixivUtil2.py", line 1341, in menu_download_by_member_id
File "PixivUtil2.py", line 420, in process_member
File "PixivModel.pyc", line 46, in __init__
File "PixivModel.pyc", line 122, in ParseImages
AttributeError: 'NoneType' object has no attribute 'ul'

Unable to log in.

PixivUtil2 throws an error right when it tries to log in.
It throws the error whether I log in manually or through the config.ini credentials.

Here's the debug log.

2014-01-29 22:18:45,232 - PixivUtil20140126 - INFO - ###############################################################
2014-01-29 22:18:45,233 - PixivUtil20140126 - INFO - Starting...
2014-01-29 22:18:45,237 - PixivUtil20140126 - INFO - Setting log level to: DEBUG
2014-01-29 22:18:45,239 - PixivUtil20140126 - INFO - No default cookie jar available, creating...
2014-01-29 22:18:45,246 - PixivUtil20140126 - INFO - Only process member where day last updated >= 7
2014-01-29 22:18:45,246 - PixivUtil20140126 - INFO - Using Username: ultimatenoob
2014-01-29 22:18:45,246 - PixivUtil20140126 - INFO - logging in with saved cookie
2014-01-29 22:18:45,249 - PixivUtil20140126 - INFO - Trying to log with saved cookie
2014-01-29 22:18:46,767 - PixivUtil20140126 - INFO - Failed to login using cookie, returned page: http://www.pixiv.net/index.php?return_to=%2Fmypage.php
2014-01-29 22:18:46,769 - PixivUtil20140126 - INFO - Cookie already expired/invalid.
2014-01-29 22:18:46,769 - PixivUtil20140126 - INFO - Log in using form.
2014-01-29 22:18:48,417 - PixivUtil20140126 - ERROR - Error at pixiv_login(): (<class 'mechanize._response.httperror_seek_wrapper'>, <httperror_seek_wrapper (mechanize._http.RobotExclusionError instance) at 0x2539ea0 whose wrapped object = <closeable_response at 0x25a9440 whose fp = <cStringIO.StringI object at 0x02552560>>>, <traceback object at 0x025A75A8>)
Traceback (most recent call last):
File "PixivUtil2.py", line 258, in pixiv_login
File "mechanize\_mechanize.pyc", line 203, in open
File "mechanize\_mechanize.pyc", line 255, in _mech_open
httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt
2014-01-29 22:18:48,420 - PixivUtil20140126 - ERROR - Unknown Error: HTTP Error 403: request disallowed by robots.txt
Traceback (most recent call last):
File "PixivUtil2.py", line 1746, in main
File "PixivUtil2.py", line 258, in pixiv_login
File "mechanize\_mechanize.pyc", line 203, in open
File "mechanize\_mechanize.pyc", line 255, in _mech_open
httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt
2014-01-29 22:18:56,233 - PixivUtil20140126 - INFO - EXIT
2014-01-29 22:18:56,233 - PixivUtil20140126 - INFO - ###############################################################
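
The RobotExclusionError in the log indicates that the 403 is raised on the client side by mechanize's robots.txt check, not by a rejected password. How such a rule blocks a URL can be sketched with the standard library's urllib.robotparser. The rules below are made up for illustration and are not Pixiv's actual robots.txt, and `is_allowed` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# Made-up rules for illustration only; not Pixiv's real robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /login.php
"""


def is_allowed(robots_txt, url, user_agent="*"):
    """Report whether the given robots.txt text permits fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

mechanize exposes a `set_handle_robots` switch on its `Browser` class for this check; whether PixivUtil2 should turn it off is a decision for the maintainer.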
