webrecorder-desktop's Introduction

Deprecated: The Webrecorder Desktop App has been replaced by ArchiveWeb.page, available as an Electron app and as a Chrome extension.

The new Electron app is close to supporting all features of Webrecorder Desktop, and should work much better on the latest versions of Windows and OS X.

The Chrome extension is compatible with all Chromium-based browsers, and allows recording and replay directly in the browser.

Development for ArchiveWeb.page continues at: https://github.com/webrecorder/archiveweb.page.


Webrecorder Desktop App

The Webrecorder Desktop App is a complete packaging of the Webrecorder hosted service as an Electron application, with an integrated Chromium browser.

It includes the same functionality available on Webrecorder.io running as a local app, including the new Autopilot feature.

All data captured is stored in a local directory on your machine, in Webrecorder-Data in your Documents directory.

Webrecorder Desktop can be downloaded below or from Releases.

  • OS X: .dmg
  • Windows (64-bit): .exe (64-bit)
  • Windows (32-bit): .exe (32-bit)
  • Linux: .AppImage

Note: Running on Linux requires installation of Redis, available as a package on most distros. OS X and Windows versions come with a bundled version of Redis.
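
On Linux, a quick way to confirm that Redis is installed before launching the app is to look for the redis-server executable on the PATH. The sketch below is only an illustration of that check; the helper name find_redis_server is hypothetical and not part of Webrecorder.

```python
import shutil
from typing import Optional


def find_redis_server(search_path: Optional[str] = None) -> Optional[str]:
    """Return the full path of the redis-server executable, or None if absent.

    search_path is a PATH-style string; None means the current PATH.
    """
    return shutil.which("redis-server", path=search_path)


if __name__ == "__main__":
    location = find_redis_server()
    if location is None:
        print("redis-server not found; install it with your distro's package manager")
    else:
        print(f"redis-server found at {location}")
```

If the executable is missing, installing the Redis package for your distro should make it available.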

Current Features

In addition to the core Webrecorder functionality, the desktop app includes additional features specific to the desktop environment. A few of these brand new features are still experimental or in beta, as listed below, so please let us know if anything is not working as expected!

Latest Chromium Browser with Flash Support

Like Webrecorder Player, the Webrecorder Desktop app is built with Electron and includes the latest release of Chromium, ensuring capture and replay are done with a modern browser. The app also includes a recent Flash plugin to allow for capture and replay of any Flash content. (The App Settings screen lists the versions of all components.)

Local Storage of All Data

All Webrecorder Data is stored in the <Documents>/Webrecorder-Data directory, with actual WARC files under the storage subdirectory. The Autopilot behaviors are placed in the ‘behaviors’ subdirectory. The directory layout may be updated in the future as we work towards a more standardized directory format for web archives.
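
As an illustration of this layout, the following sketch lists the WARC files under the storage subdirectory. It assumes the files end in .warc or .warc.gz and that the Documents folder is literally named "Documents" under the home directory; both are assumptions, and the function names are hypothetical. Since the layout may change, treat this as a sketch rather than a stable interface.

```python
from pathlib import Path
from typing import List


def default_data_dir() -> Path:
    # Location described above; the literal "Documents" folder name is an assumption.
    return Path.home() / "Documents" / "Webrecorder-Data"


def list_warcs(data_dir: Path) -> List[Path]:
    """Return all WARC files found anywhere under <data_dir>/storage, sorted."""
    storage = data_dir / "storage"
    if not storage.is_dir():
        return []
    found = [
        p for p in storage.rglob("*")
        if p.is_file() and p.name.endswith((".warc", ".warc.gz"))  # assumed extensions
    ]
    return sorted(found)


if __name__ == "__main__":
    for warc in list_warcs(default_data_dir()):
        print(warc)
```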

Capture, Replay & Curation

The app includes capture, replay, and patching, as well as curation and collection management features, the same as those found on https://webrecorder.io. Existing collections can also be imported (as WARC files), and exported collections can be uploaded to https://webrecorder.io if desired.

Autopilot

The desktop app includes the full Autopilot capabilities for capture of certain dynamic websites, introduced with our last release. Unlike a regular browser, Webrecorder Desktop can run Autopilot in the background and be minimized without affecting the quality of Autopilot capture. For example, users can start Autopilot and have it run in the background while doing other work. (There is an option to mute audio in the Options menu for this use case.) There is no limit to how long Autopilot can run locally; the only limits are available network bandwidth and disk storage!

Preview Mode (Beta)

The desktop app includes a new Preview mode that allows browsing content without capture. In particular, this can be used to preview a page before capturing it but also to log in to any sites that require login without capturing the login itself.

After logging in to a site in Preview mode, users can switch to capture mode via the dropdown menu, so that capture begins only after the login has completed.

This workflow is recommended for capturing any sites that require a login. To reset all logins, there is also a “Clear Cookies” option in the Options menu. (This feature is currently in beta and we welcome any feedback on this!)

Mobile Device Emulation Mode (Experimental)

The desktop app also includes an experimental mobile device emulation mode, toggleable from the Options menu. With this mode, Webrecorder Desktop will act as a mobile browser and allow for capturing of mobile-only content. The window can be resized as needed to support any mobile device. (This feature is currently in beta and we welcome any feedback on this!)

DAT Protocol Support (Experimental)

The app includes our previously announced approach to sharing web archive collections via the Dat peer-to-peer protocol. To enable sharing of a collection, select Share via Dat from the collection menu. The collection will then have a unique dat:// URL, which allows the full collection (and future updates) to be synced using various tools that support the Dat protocol, for example to automate backup of local collections. There is not (yet) a way to import existing collections via Dat, but import is planned for a future update.

Capture Cache (Experimental)

When browsing sites that share resources, Webrecorder Desktop enables the browser cache to avoid capturing the same resources multiple times and writing them to WARC. The cache is reset per recording session, but can also be cleared manually via the Clear Cache option in the Options menu. The cache should reduce duplicate resources loaded over the network and speed up browsing, and thereby the capture process. This feature is still experimental.

TOR Capture Support (Experimental)

Webrecorder Desktop can capture web content over Tor, including Tor hidden services. However, this requires a bit of manual setup. A Tor relay must be installed locally.

Then, from a command line, run export SOCKS_HOST=localhost before starting Webrecorder Desktop to have it use the Tor SOCKS proxy. Future versions may simplify this process.
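
Before launching, it can help to confirm that a SOCKS proxy is actually listening. The sketch below probes port 9050, which is the Tor daemon's default SocksPort; whether Webrecorder Desktop connects to that port is an assumption, since only SOCKS_HOST is documented here. The function names are hypothetical.

```python
import os
import socket
from typing import Dict, Mapping

TOR_DEFAULT_SOCKS_PORT = 9050  # Tor daemon default; assumed, not stated in this README


def socks_proxy_listening(host: str = "localhost",
                          port: int = TOR_DEFAULT_SOCKS_PORT,
                          timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def env_with_tor(base: Mapping[str, str]) -> Dict[str, str]:
    """Copy an environment mapping and add SOCKS_HOST=localhost, as described above."""
    env = dict(base)
    env["SOCKS_HOST"] = "localhost"
    return env


if __name__ == "__main__":
    if socks_proxy_listening():
        env = env_with_tor(os.environ)
        print("SOCKS proxy reachable; start the app with SOCKS_HOST =", env["SOCKS_HOST"])
    else:
        print("Nothing is listening on the assumed Tor SOCKS port")
```

The check only confirms that the port accepts connections; it does not verify that the listener is really Tor.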

Building Webrecorder Desktop

To build Webrecorder Desktop locally, follow these steps:

  1. Clone with submodules (the submodule is the main webrecorder/webrecorder repository, which contains most of the code):

    git clone --recurse-submodules https://github.com/webrecorder/webrecorder-desktop.git

     This will check out the Webrecorder submodule as well.

  2. Build the Webrecorder Python binaries and install them into python-binaries:

     This will build the Webrecorder project and install PyInstaller 3.3. Python 3.5 is recommended for now, ideally in a separate virtualenv.

    ./build-wr.sh

  3. Build the Webrecorder frontend:

    node build-desktop.js

  4. Run in dev mode:

    yarn run start-dev

  5. Build the Electron binary:

    yarn run dist

  6. If all goes well, the binary image will be placed in the ./dist/{mac,linux,win} directory, depending on your platform.

webrecorder-desktop's People

Contributors

chid, ikreymer, m4rk3r, machawk1, n0tan3rd, phette23

webrecorder-desktop's Issues

[usecase] Daum FanCafe - [object Object] problem; little captured

Version 2.0.1 of Webrecorder Desktop (and possibly Webrecorder) does not correctly capture content that involves Daum's fancafe BBS. From what I can understand, Daum appears to be modifying the URL of the page through JavaScript after it has been loaded. What Daum is doing seems to work in Preview mode, but does not work in Capture mode. When the WARC is reviewed after the capture, very little appears to be correctly captured.

For example, the following hyperlink is for a fancafe on Daum: http://cafe.daum.net/ok1221

Preview Mode
When a user first accesses that URL in Preview mode, the URL will be the following:
http://cafe.daum.net/ok1221

When the user clicks one of the boards in that bulletin board system, the URL will correctly reflect that:

Capturing Mode
If we attempt to access the original URL again when in Capture mode, the end of the URL eventually changes from the original desired URL to something ending with "[object Object]".
http://cafe.daum.net/[object%20Object]

When a user clicks one of the boards or one of the posts, the "object Object" problem keeps appearing:

I've seen somewhat similar behavior when accessing that kind of content with Firefox, but I'm not sufficiently experienced with web design and diagnosis to understand precisely why this problem is occurring.

How to change the Data Directory

Hi,
Thanks for sharing your app. I have a question: is it possible to change or configure the data directory (/home/USERNAME/Documents/Webrecorder-Data)?

Thanks a lot

Preview / recording sites with login

I've tried recording a site that has a login, but the app just hangs. Should I be seeing a login prompt? Cookies have been cleared at each attempt.

Redis refuses to start on Windows 10

Every time I try to install WR Desktop, it just hangs there and the processes stay up while nothing happens. I did a bit of digging and it turns out it's because Redis thinks the pagefile is too small.

From redis.log:

[13948] 07 Nov 11:32:34.854 # 
The Windows version of Redis allocates a memory mapped heap for sharing with
the forked process used for persistence operations. In order to share this
memory, Windows allocates from the system paging file a portion equal to the
size of the Redis heap. At this time there is insufficient contiguous free
space available in the system paging file for this operation (Windows error 
0x5AF). To work around this you may either increase the size of the system
paging file, or decrease the size of the Redis heap with the --maxheap flag.
Sometimes a reboot will defragment the system paging file sufficiently for 
this operation to complete successfully.

Please see the documentation included with the binary distributions for more 
details on the --maxheap flag.

Redis can not continue. Exiting.

Is there any way to fix this?

Ensure extract + patching is working

Extract + patching not yet working in desktop app, need to:

  • port over react components to same logic as record/replay
  • ensure public web archives is built

No recorder/browser window drawn on MacOS BigSur 11.1 Intel

Versions affected

Webrecorder 2.0.3.131
Webrecorder Player 1.8.0

On BigSur 11.1 (Intel) both Webrecorder and Webrecorder Player (1.8.0) will open but not display a capture/playback window. Also, when opening an existing WARC no window is created. The app itself doesn't freeze, and the menus are accessible. Deleting the stored configuration in //Application Support/Webrecorder doesn't lead to the window being redrawn.
Unfortunately I couldn't find info on how to pass a debug argument to the app - if anyone would kindly direct me in the right direction I'll gladly post the output. (I have tried > open -F -a Webrecorder , but I'm not observing any different behaviour).

Minimum UI changes page for desktop app beta

Related to #4, what are the minimum UI changes needed for the desktop app to be beta tested, besides user settings, to make it more clear it's a desktop app?
Mostly everything works, but perhaps more distinction is needed (eg. no need to set collections to public?)

[Linux] Unable to launch on Debian; Electron sandbox issues?

I've encountered issues with launching it on Debian. Yes, I did install the redis-server package.

From what I can tell, there's some kind of issue with the Google Chrome sandbox as used by Electron. I can get it to launch when I disable the sandbox with the --no-sandbox argument. However, as I see the sandbox as a security feature, I don't want to run without it.

(EDIT: Nothing appears when I launch it from the file manager; it only outputs the following errors to the command line)

Attempts to launch on Debian 10 x86_64 (stable branch):

kevin@HomeServer:/mnt/workspaces/kevin/Webrecorder$ ./webrecorder-2.0.2.AppImage
[15998:0914/080744.740890:FATAL:setuid_sandbox_host.cc(158)] The SUID sandbox helper binary was found, but is not configured correctly. Rather than run without sandboxing I'm aborting now. You need to make sure that /tmp/.mount_webrecTCkBjG/chrome-sandbox is owned by root and has mode 4755.
Trace/breakpoint trap
kevin@HomeServer:/mnt/workspaces/kevin/Webrecorder$ ./webrecorder-2.0.2.AppImage
[17438:0914/083447.446031:FATAL:setuid_sandbox_host.cc(158)] The SUID sandbox helper binary was found, but is not configured correctly. Rather than run without sandboxing I'm aborting now. You need to make sure that /tmp/.mount_webrecKuaVix/chrome-sandbox is owned by root and has mode 4755.
Trace/breakpoint trap
kevin@HomeServer:/mnt/workspaces/kevin/Webrecorder$ /tmp/.mount_webrecKuaVix/webrecorder: error while loading shared libraries: libffmpeg.so: cannot open shared object file: No such file or directory

kevin@HomeServer:/mnt/workspaces/kevin/Webrecorder$ ./webrecorder-2.0.2.AppImage
[17449:0914/083455.029303:FATAL:setuid_sandbox_host.cc(158)] The SUID sandbox helper binary was found, but is not configured correctly. Rather than run without sandboxing I'm aborting now. You need to make sure that /tmp/.mount_webrechKi6Rj/chrome-sandbox is owned by root and has mode 4755.
Trace/breakpoint trap
kevin@HomeServer:/mnt/workspaces/kevin/Webrecorder$ uname -a
Linux HomeServer 4.19.0-10-amd64 #1 SMP Debian 4.19.132-1 (2020-07-24) x86_64 GNU/Linux
kevin@HomeServer:/mnt/workspaces/kevin/Webrecorder$ 

I have another system running Debian 11 beta on the same architecture (testing branch):

kevin@wander:~/Documents$ ./webrecorder-2.0.2.AppImage.elf 
[33047:0914/084811.647580:FATAL:setuid_sandbox_host.cc(158)] The SUID sandbox helper binary was found, but is not configured correctly. Rather than run without sandboxing I'm aborting now. You need to make sure that /tmp/.mount_webrec57muY2/chrome-sandbox is owned by root and has mode 4755.
Trace/breakpoint trap
kevin@wander:~/Documents$ ./webrecorder-2.0.2.AppImage.elf 
[33059:0914/084814.884237:FATAL:setuid_sandbox_host.cc(158)] The SUID sandbox helper binary was found, but is not configured correctly. Rather than run without sandboxing I'm aborting now. You need to make sure that /tmp/.mount_webrecJS1JlZ/chrome-sandbox is owned by root and has mode 4755.
Trace/breakpoint trap
kevin@wander:~/Documents$ ./webrecorder-2.0.2.AppImage.elf 
[33067:0914/084816.897788:FATAL:setuid_sandbox_host.cc(158)] The SUID sandbox helper binary was found, but is not configured correctly. Rather than run without sandboxing I'm aborting now. You need to make sure that /tmp/.mount_webrecy5jWXA/chrome-sandbox is owned by root and has mode 4755.
Trace/breakpoint trap
kevin@wander:~/Documents$ /tmp/.mount_webrecy5jWXA/webrecorder: error while loading shared libraries: libffmpeg.so: cannot open shared object file: No such file or directory

kevin@wander:~/Documents$ ./webrecorder-2.0.2.AppImage.elf 
[33077:0914/084825.195227:FATAL:setuid_sandbox_host.cc(158)] The SUID sandbox helper binary was found, but is not configured correctly. Rather than run without sandboxing I'm aborting now. You need to make sure that /tmp/.mount_webrecZgALcl/chrome-sandbox is owned by root and has mode 4755.
Trace/breakpoint trap
kevin@wander:~/Documents$ uname -a
Linux wander 5.7.0-3-amd64 #1 SMP Debian 5.7.17-1 (2020-08-23) x86_64 GNU/Linux
kevin@wander:~/Documents$ 

I have the old file for 2.0.1 downloaded on the first system shown here, and when I try to run that, the same issue is evident:

kevin@HomeServer:/mnt/workspaces/kevin/Webrecorder$ ./webrecorder-2.0.1.AppImage.elf 
[18532:0914/085326.051324:FATAL:setuid_sandbox_host.cc(157)] The SUID sandbox helper binary was found, but is not configured correctly. Rather than run without sandboxing I'm aborting now. You need to make sure that /tmp/.mount_webrecmKbMxM/chrome-sandbox is owned by root and has mode 4755.
Trace/breakpoint trap
kevin@HomeServer:/mnt/workspaces/kevin/Webrecorder$ 
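
The error message above states two conditions for the chrome-sandbox helper: it must be owned by root and have mode 4755. A minimal sketch of checking both follows; the function names are hypothetical. Note that the logs show a different /tmp/.mount_webrec* path on each launch, so the path has to be supplied while the AppImage is mounted or point at an extracted copy.

```python
import os
import stat


def sandbox_helper_ok(owner_uid: int, st_mode: int) -> bool:
    """True when the file is owned by root (uid 0) and its permission bits are exactly 4755."""
    return owner_uid == 0 and stat.S_IMODE(st_mode) == 0o4755


def check_sandbox_helper(path: str) -> bool:
    """Stat the given chrome-sandbox path and apply the two conditions from the error message."""
    info = os.stat(path)
    return sandbox_helper_ok(info.st_uid, info.st_mode)
```

This only reports whether the conditions hold; it does not change ownership or permissions.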

Importing a list of links to record, then autopilot each one

Hi all!

I love the wonderful web archiving tool you have developed. Please continue the great work!

I have one request. Would it be possible, in the future, for a user to import a list of links from a text file (e.g., each line corresponds to one link) so that webrecorder-desktop would visit each one, ensure the page is fully loaded, then run Autopilot on each page?

I could do it in a somewhat painful way using macro scripts (e.g., AutoHotkey) to do the same task, but I think enhanced automation via link importing would save time for web archivists!
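
For illustration only, the input half of this request (reading one link per line) could be sketched as below. The function name parse_url_list is hypothetical, and nothing here drives Webrecorder Desktop, since no such interface is described in this repository.

```python
from typing import List
from urllib.parse import urlsplit


def parse_url_list(text: str) -> List[str]:
    """Return http(s) URLs from text with one link per line.

    Blank lines, lines starting with '#', non-URLs, and duplicates are skipped.
    """
    urls: List[str] = []
    seen = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = urlsplit(line)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue
        if line not in seen:
            seen.add(line)
            urls.append(line)
    return urls
```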

Thank you for your consideration!

Sharing between WR Desktop and webrecorder.io only without DAT?

While #7 is more general, perhaps a transfer only between webrecorder.io and desktop makes sense? Syncing an entire user account? This could get more tricky, however, and may not really make sense, esp. if a lot of data is added on webrecorder.io.
Just adding this to think about

Backend DAT Support

  • Add dat-share server support! Currently only runs in dev mode, related to build issue in #1
  • Support for cloning a DAT outside of dat-share into temp dir, then calling webrecorder python api to import directory. The JS side mostly already there from player implementation (python api still needed)

Support live/preview/paused mode for logging in to sites without capturing login info

We've had this in webrecorder.io originally but removed it as it didn't work for hosted service, but works quite well locally. The idea, similar to Browsertrix profiles, is that a user can 'prepare' the browser by logging in to whatever sites they'd like without recording anything. When they're all logged in, they switch to recording. It should probably be another mode, only available in the app, not sure what best term for it is.
Workflow might be like this:

  • From landing page user has option to 'Start' or 'Prepare Browser'
  • If 'Prepare Browser', they're taken to the live browsing mode (mode selector says Preparing/Not Recording)
  • User can log in to as many sites as they'd like
  • User can press stop, which would return to landing page, or switch modes to Capture (which would reload the page fully, but with user already logged in)
  • The account settings page can have a 'Clear Cache' button that will clear the cache on the browser/reset all logins. (In the future, could even support multiple profiles but probably not needed for now)

Custom user settings page for desktop

Customize user settings page for desktop app:

  • support for moving data directory? (Default: <downloads>/Webrecorder-Data)
  • support for changing max size, or just set to available hd space?

AppImage file fails with various errors on Linux Mint 19.2

I just downloaded the v. 2.0.1 AppImage and made it executable. Since nothing happened after double-clicking on the file, I tried to launch it from a terminal window using:

./webrecorder-2.0.1.AppImage

This gave me the following output:

    Dat Share api server listening on
    http://localhost:44927
    └── /
        ├── s
        │   ├── wagger (GET)
        │   │   └── / (GET)
        │   │       ├── json (GET)
        │   │       ├── yaml (GET)
        │   │       ├── static/
        │   │       │   └── * (GET)
        │   │       └── * (GET)
        │   ├── hare (POST)
        │   └── ync (POST)
        ├── init (POST)
        ├── unshare (POST)
        └── num
            ├── Sharing (GET)
            └── Dats (GET)

    --no-browser --loglevel info -d /home/johan/Documents/Webrecorder-Data -u johan --port 0 --behaviors-tarfile /tmp/.mount_webrecQHAGW7/resources/python-binaries/behaviors.tar.gz --dat-share-port 44927
    [5405] Failed to execute script webrecorder_full

    Traceback (most recent call last):
      File "redis/connection.py", line 484, in connect
      File "redis/connection.py", line 541, in _connect
      File "redis/connection.py", line 529, in _connect
      File "gevent/_socket3.py", line 335, in connect
    ConnectionRefusedError: [Errno 111] Connection refused

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "redis/client.py", line 667, in execute_command
      File "redis/connection.py", line 610, in send_command
      File "redis/connection.py", line 585, in send_packed_command
      File "redis/connection.py", line 489, in connect
    redis.exceptions.ConnectionError: Error 111 connecting to 127.0.0.1:7679. Connection refused.

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "redis/connection.py", line 484, in connect
      File "redis/connection.py", line 541, in _connect
      File "redis/connection.py", line 529, in _connect
      File "gevent/_socket3.py", line 335, in connect
    ConnectionRefusedError: [Errno 111] Connection refused

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "webrecorder/standalone/localredisserver.py", line 109, in start
      File "webrecorder/standalone/localredisserver.py", line 104, in _init_redis
      File "redis/client.py", line 711, in client_setname
      File "redis/client.py", line 673, in execute_command
      File "redis/connection.py", line 610, in send_command
      File "redis/connection.py", line 585, in send_packed_command
      File "redis/connection.py", line 489, in connect
    redis.exceptions.ConnectionError: Error 111 connecting to 127.0.0.1:7679. Connection refused.

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "webrecorder/standalone/webrecorder_full.py", line 132, in <module>
      File "webrecorder/standalone/standalone.py", line 119, in main
      File "webrecorder/standalone/webrecorder_full.py", line 36, in __init__
      File "webrecorder/standalone/standalone.py", line 49, in __init__
      File "webrecorder/standalone/webrecorder_full.py", line 69, in _runner_init
      File "webrecorder/standalone/localredisserver.py", line 112, in start
      File "webrecorder/standalone/localredisserver.py", line 170, in create_redis_server
      File "site-packages/psutil-5.6.3-py3.6-linux-x86_64.egg/psutil/__init__.py", line 1431, in __init__
      File "gevent/subprocess.py", line 627, in __init__
      File "gevent/subprocess.py", line 1505, in _execute_child
    FileNotFoundError: [Errno 2] No such file or directory: 'redis-server'

    no process

System info

I'm getting this result on a machine with Linux Mint 19.2 (Tina), equivalent to Ubuntu Bionic. Earlier today I got the same problem while trying to run it on another machine with an older Mint version (18.3 if I recall correctly).

[Linux] Embed the Redis server internally, not OS-level

Currently, the advice for Linux is to install redis-server alongside the desktop app. This makes the OS responsible for packaging Redis, which can cause version dependency issues, amongst others.

The request would be to make Redis part of the AppImage.

Options available:

  • Compiled, specific version of Redis
  • Include a Redis AppImage, inside of the existing AppImage

Other options are welcome! These are just the ones that I can list off of the top of my head. 👍

[Suggestion] Give the option to crawl an entire website + other features

I'd like to capture an entire website using webrecorder desktop instead of having to manually click each individual page. It would be nice to:

  • have the option to split the WARC in x GB pieces if it were to grow that big
  • import a regex set of ignored urls (to prevent something like &rs=5&rs=5&rs=5&rs=5)
  • choose how fast or slow the pages are downloaded (to prevent being banned on some sites)
  • set the number of retries on webpage/resource errors
  • choose what user agent to use from a list or give the user the option to manually specify one

maybe in an "advanced" dropdown menu for advanced users
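
As a rough illustration of the ignore-list idea, a set of regular expressions could be checked against each URL before it is fetched. This is a sketch of the suggestion, not an existing feature; the function names are hypothetical and the rule shown targets the repeated &rs=5 example above.

```python
import re
from typing import Iterable, List


def compile_ignore_rules(rules: Iterable[str]) -> List[re.Pattern]:
    """Compile user-supplied regular expressions once, up front."""
    return [re.compile(rule) for rule in rules]


def is_ignored(url: str, rules: List[re.Pattern]) -> bool:
    """True if any rule matches somewhere in the URL."""
    return any(rule.search(url) for rule in rules)


if __name__ == "__main__":
    # Matches a query parameter "&rs=5" repeated two or more times in a row.
    rules = compile_ignore_rules([r"(&rs=5){2,}"])
    print(is_ignored("http://example.com/page?a=1&rs=5&rs=5&rs=5", rules))
```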

Good programme so far !!

Storage integration with Google Drive, Dropbox, etc...

To allow users to store their data not just on their hd, possibly provide integration with individual cloud storage, like Google Drive, Dropbox, etc..

Might not even be needed if users can mount Google Drive, Dropbox, etc... as a directory on their machine

This behaviour is currently not supported

Hi,

I get the message 'this behavior is currently not supported' for all behaviors, except autoscrolling. Is this a bug or are all behaviors gone?

[Screenshot, 2020-10-14 14:42:53]

I noticed that the hosted version of Conifer also has this issue.

I use the latest version (2.0.2) on macOS High Sierra and in the Webrecorder-Data folder I have a behaviors folder with all behaviors in the dist folder.

Puppetting webrecorder-desktop

Is it possible to externally control webrecorder-desktop to navigate to a page?

I'm using webrecorder-desktop because the in-browser version of webrecorder (either via webrecorder.io or APP_HOST) doesn't work with the SSO system used by the site I'm scraping.

I'm ideally hoping for an APP_HOST endpoint I can fire a POST request at with a URL as a payload, to trigger webrecorder-desktop to navigate to that URL.

Workflow for importing collections via DAT

Already have support/UI for sharing via DAT, but need interface for importing.
Full Proposal:
Import via DAT button on user page, along with New Collection and Upload, which would show a dialog for entering a dat, and would start the sync process. If successful, the external collection is added to the user's list of collections. The collection is then automatically synced via Dat when it is updated. However, the user cannot edit this collection as it is synced remotely.
Maybe the collection is listed under a 'Watched' section for collections that are not owned by the user. The user could 'unlink' from DAT and start editing it, or copy the collection to make a duplicate. Until DAT has multiwriter support, this is probably the best we can do for collaborative editing. Note that this is not needed for the initial implementation, which can simply be read-only at first.

This would allow sharing between individual Webrecorder Desktop instances as well as Webrecorder Desktop and webrecorder.io

Better Capturing through Tor Support

Tor support is available if setting export SOCKS_HOST=localhost.
Add a way to toggle capture through Tor if the tor proxy is running.
Detect if Tor proxy (or Tor browser?) is already running.

Also, evaluate using the Tor User-Agent (ie. Firefox User-Agent) when using Tor for better anonymity?

Zoom text only

Please add the ability to zoom text or increase font size (without zooming the entire page). I use the "Zoom Text Only" Chrome extension but can't do that here.

Captured Pages Requires internet to work

First time user - webrecorder-win-x86_64-2.0.1 (windows 10)

Stored pages don't work when the app is opened offline. Once I connect to the internet, it'll keep working even if I go offline. But close the app, go offline, open the app and it'll no longer work.

My 4 collections are affected.

Using the separate webplayer didn't solve the issue. Exporting the collection to a separate WARC file didn't either.

[Screenshot: Capture]

Uploading WARC files - Issue with replay?

I am looking to upload a locally created WARC file into webrecorder (either to web or desktop app) but not having any success. I have tried both WARCIT and the Chrome extension WARCreate in order to generate WARC files locally before uploading into webrecorder. It appears as though the file is uploading and indexing successfully in webrecorder but there is nothing displayed and no ability to replay the file. The other point that might be worth mentioning is that from the main collections page the size of the upload is represented even though nothing is displayed within the collection.

Offline usage not possible under Windows

Hi, I am trying to view a WARC file/collection in offline mode (without internet access).
It seems that this works in neither Webrecorder Desktop 2.0.1 nor Player 1.8.0 for Windows 10.

Is there anything that I am missing?
On the home page of webrecorder.io it literally says:

Webrecorder Player App
Use this desktop web archive viewer to browse exported collections, even when you are offline.

I am not doing anything specific:

  1. Just downloading the windows release.
  2. Opening and capturing a session.
  3. Restart the app with internet disabled.
  4. The result is that the sessions/pages don't load in the Browsing mode.

Is this option only available in webrecorder.io?

PS: I even created account on webrecorder.io and exported WARC collection from there in order to double check.

General build cleanup/Fix build process :)

The build process needs some cleanup, possibly to:

  • simplify babel config, with new electron?
  • remove the double package.json setup?
  • fix standalone build on linux and when adding dat-share (getting errors on spread op, then heap out of memory!)

webrecorder desktop not really open after upgrade to catalina

upgraded laptop to catalina. uninstalled webrecorder. re-downloaded dmg and re-installed. rec'd error from apple security but said open anyway, and open. WR icon appears in dock, but never opens desktop app. double click doesn't work. ctrl/click/open doesn't either. ?

File:// support

I want to convert WARC to MHTML, but the app just displays MHTML as plain text. Since it uses Chromium, it should render MHTML without a problem. The cause is probably that Chrome only renders MHTML when it is loaded over the file:// protocol and treats MHTML served over http/https as plain text, so MHTML can only be viewed from offline file:// URLs.

Please allow file://. If it is allowed, an MHTML-to-WARC converter will be possible.

There's been an error: No such page or content is not accessible.

I downloaded the available .dmg file (v2.0.0) for Mac (macOS 10.14.6) and I can't get it to work. I get this error message:

[Screenshot, 2019-09-26 18:18:53]

Both links inside the "user" icon (top right) don't do anything. All that works inside this window is the "help" link that sends me to this repo.

Any suggestions to fix this? Thanks in advance.

It could be ideal for forensic activities

Hi:)
This is a great project! It could also be very useful for forensic activities.

For this use case, two features should be added:

  • export the SSL/TLS keys to decrypt traffic dumped with a parallel Wireshark acquisition;
    Chrome and Chromium have the --ssl-key-log-file option for this purpose
    (https://softwaretester.info/https-and-wireshark/)

  • show or export certificates when viewing a WARC, as evidence of authenticity

I hope it can be a helpful idea :)

Best Regards

Capturing a tweet while signed in via preview mode results in a session with 0 pages

Steps to reproduce:

  1. Sign in to Twitter (in preview mode)
  2. Go to a tweet
  3. Switch to capture mode
  4. Stop the capture

The application returns to Collection Manager, but the captured page is not in the Pages list. When you go to the Manage Sessions view, you can see that the session is there, but there are no pages in it.

[Screenshot: Session without pages]

Sometimes the page does get saved, but when you try to browse it, Webrecorder shows a Resource not Found error. Trying to patch the page does not seem to help, because browsing the patched session will again show the Resource not Found error.

These issues happen only when signed in to Twitter. Because of this, private tweets cannot currently be captured.
