Topic: web-archiving
Something interesting about web-archiving
web-archiving,Wayback Machine API interface & a command-line tool
User: akamhy
Home Page: https://pypi.org/project/waybackpy/
web-archiving,🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
Organization: archivebox
Home Page: https://archivebox.io
web-archiving,Official ArchiveBox browser extension: automatically/manually preserve your browsing history using ArchiveBox.
Organization: archivebox
Home Page: https://chromewebstore.google.com/detail/archivebox-exporter/habonpimjphpdnmcfkaockjnffodikoj
web-archiving,Home of the official apt/deb package for Ubuntu/Debian-based systems.
Organization: archivebox
Home Page: https://launchpad.net/~archivebox/+archive/ubuntu/archivebox
web-archiving,DigestBox takes any webpage URL (news article, video link, comment thread, etc.) and gives you just the raw content. It's powered by ArchiveBox.io under the hood.
Organization: archivebox
Home Page: https://DigestBox.io
web-archiving,Source for the GitHub Wiki / ReadTheDocs documentation for ArchiveBox, the self-hosted internet archiving solution.
Organization: archivebox
Home Page: https://docs.archivebox.io
web-archiving,Desktop Electron app for ArchiveBox internet archiver. (ALPHA: not ready for general use)
Organization: archivebox
Home Page: https://archivebox.io
web-archiving,Homebrew formula for the ArchiveBox self-hosted internet archiving solution.
Organization: archivebox
Home Page: https://archivebox.io
web-archiving,Official Python package for ArchiveBox, the self-hosted internet archiving solution.
Organization: archivebox
Home Page: https://pypi.org/project/archivebox/
web-archiving,Automatically archive links to videos, images, and social media content from Google Sheets (and more).
Organization: bellingcat
Home Page: https://pypi.org/project/auto-archiver/
web-archiving,A toolkit for CDX indices such as Common Crawl and the Internet Archive's Wayback Machine
Organization: cocrawler
web-archiving,ArchiveBoxMatic: configure ArchiveBox with the simplicity of a yaml file.
User: dbeley
web-archiving,WarcDB: Web crawl data as SQLite databases.
User: florents-tselai
Home Page: https://WarcDB.tselai.com
web-archiving,CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile)
User: gildas-lormeau
web-archiving,Social Feed Manager user interface application.
Organization: gwu-libraries
Home Page: http://gwu-libraries.github.io/sfm-ui
web-archiving,An Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed at Internet Archive.
User: helgeho
web-archiving,Perpetual Access To The Scholarly Record
Organization: internetarchive
Home Page: https://guide.fatcat.wiki
web-archiving,Backend, IA-specific tools for crawling and processing the scholarly web. Content ends up in https://fatcat.wiki
Organization: internetarchive
web-archiving,Support for writing WARC files with Scrapy
Organization: internetarchive
web-archiving,🐋 Web Archiving Integration Layer: One-Click User Instigated Preservation
User: machawk1
Home Page: https://matkelly.com/wail
web-archiving,Chrome extension to "Create WARC files from any webpage"
User: machawk1
Home Page: https://warcreate.com
web-archiving,🗄️ A simple CLI for converting WARC to Parquet.
User: maxcountryman
web-archiving,Parse And Create Web ARChive (WARC) files with node.js
User: n0tan3rd
web-archiving,Experimental continuous web crawler for web archiving
Organization: nla
web-archiving,Chrome debugging protocol client for Java
Organization: nla
web-archiving,Converts HTTrack crawls to WARC files
Organization: nla
web-archiving,Web archive index server based on RocksDB
Organization: nla
web-archiving,A Tool To Push Web Resources Into Web Archives
Organization: oduwsdl
web-archiving,InterPlanetary Wayback: A distributed and persistent archive replay system using IPFS
Organization: oduwsdl
web-archiving,A Memento Aggregator CLI and Server in Go
Organization: oduwsdl
Home Page: https://memgator.cs.odu.edu/api.html
web-archiving,Recover lost websites from the Web Infrastructure
Organization: oduwsdl
Home Page: https://code.google.com/p/warrick
web-archiving,A suite of tools for mirroring and hoarding the web pages you visit for later offline viewing, i.e. your own personal Wayback Machine. It can also archive HTTP POST requests and responses, as well as most other HTTP-level data, and follows an "archive everything now, figure out what to do with it later" philosophy.
Organization: own-data-privateer
Home Page: https://oxij.org/software/pwebarc/
web-archiving,🎭 An introduction to the Internet Archiving ecosystem, tooling, and some of the ethical dilemmas that the community faces.
User: pirate
Home Page: https://pirate.github.io/internet-archiving-talk/
web-archiving,The repository and website hosting the peer review process for new Programming Historian lessons
Organization: programminghistorian
Home Page: http://programminghistorian.github.io/ph-submissions
web-archiving,Archiveror will help you preserve the webpages you love. 💾
User: rahiel
Home Page: https://www.rahielkasim.com/archiveror/
web-archiving,Collect and revisit web pages.
Organization: rhizome-conifer
Home Page: https://conifer.rhizome.org
web-archiving,Conifer setup and deployment via Ansible
Organization: rhizome-conifer
web-archiving,Shepherding our web archives from crawl to access.
Organization: ukwa
web-archiving,A High-Fidelity Web Archiving Extension for Chrome and Chromium based browsers!
Organization: webrecorder
Home Page: https://chrome.google.com/webstore/detail/webrecorder/fpeoodllldobpkbkabpblcfaogecpndd
web-archiving,Browsertrix is the hosted, high-fidelity, browser-based crawling service from Webrecorder designed to make web archiving easier and more accessible for all!
Organization: webrecorder
Home Page: https://browsertrix.com
web-archiving,Run a high-fidelity browser-based crawler in a single Docker container
Organization: webrecorder
Home Page: https://crawler.docs.browsertrix.com
web-archiving,CDXJ Indexing of WARC/ARCs
Organization: webrecorder
web-archiving,A prototype server to swarm multiple DATs for Webrecorder
Organization: webrecorder
web-archiving,Core Python Web Archiving Toolkit for replay and recording of web archives
Organization: webrecorder
Home Page: https://pypi.python.org/pypi/pywb
web-archiving,Serverless replay of web archives directly in the browser
Organization: webrecorder
Home Page: https://replayweb.page
web-archiving,Streaming WARC/ARC library for fast web archive IO
Organization: webrecorder
Home Page: https://pypi.python.org/pypi/warcio
web-archiving,Webrecorder Player for Desktop (OSX/Windows/Linux). (Built with Electron + Webrecorder)
Organization: webrecorder
web-archiving,A server to collect & archive websites that also supports video downloads
User: xarantolus
Home Page: https://010.one/Collect/
web-archiving,Create "perfect" snapshots of web pages
Organization: zytedata