Comments (7)
Thoughts on adding a `name` field to easily label and differentiate configs?
from browsertrix.
> Thoughts on adding a `name` field to easily label and differentiate configs?
Yes, definitely a good idea. Just free-form text that can be searched by, right?
@ikreymer initial pass at mockup: https://app.mockplus.com/run/rp/TgN5xi_FPJvV/2DkJibt0vw?cps=expand&rps=expand&nav=1&ha=1&la=1&fc=0&out=0&rt=1
Questions:
- what are sane defaults for all these fields? I'll update the mockup to reflect default values.
- can you go into more detail as to how scope type and page limit should apply to the URLs? Should the term "seed" be user-facing, or should the UI display seeds as something like "URL group"/"Page groups"?
- should the JSON configuration represent the whole config/template, or just the nested `.config` object? IMO the former is more user-friendly if we expect users to download JSON configs and then upload or copy-paste them in the future.
Simplified the config for now: just send the URL list for seeds, with everything else moved to the outer scope, e.g.:
{
  "schedule": null,
  "runNow": true,
  "config": {
    "seeds": [
      "https://webrecorder.net/"
    ],
    "scopeType": "prefix"
  },
  "name": "Example Name",
  "colls": [],
  "crawlTimeout": "0",
  "parallel": 1
}
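For anyone scripting against this shape, here is a minimal Python sketch that assembles the same request body. The helper name `build_crawl_config`, its keyword defaults, and the empty-seed check are assumptions made for illustration. They are not part of the API described in this thread.

```python
import json


def build_crawl_config(name, seeds, scope_type="prefix", run_now=True,
                       schedule=None, colls=None, crawl_timeout="0", parallel=1):
    """Assemble a request body in the shape shown in the comment above.

    Hypothetical helper: the defaults simply mirror the example values.
    """
    if not seeds:
        # Assumption: a config without any seed URL is not useful to submit.
        raise ValueError("at least one seed URL is required")
    return {
        "schedule": schedule,
        "runNow": run_now,
        # Only the seed list and scope type stay in the nested "config" object.
        "config": {"seeds": list(seeds), "scopeType": scope_type},
        # Everything else lives in the outer scope.
        "name": name,
        "colls": list(colls or []),
        "crawlTimeout": crawl_timeout,
        "parallel": parallel,
    }


payload = build_crawl_config("Example Name", ["https://webrecorder.net/"])
print(json.dumps(payload, indent=2))
```

Keeping the seed list and scope type as the only nested keys matches the simplified layout above, so the whole object can be downloaded and re-uploaded as one JSON document.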
Per discussion on call:
- change default to "run instantly" and schedule to "weekly"; @ikreymer to document other defaults
- "seed URL" is the user-facing term (TODO link glossary); group seed configs in UI
- render whole JSON object
@ikreymer minor issue with the POST crawlconfigs endpoint: leaving out the trailing `/` throws a `Method Not Allowed` error on the server
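Until the server accepts both forms, a client could normalize the URL before sending the POST. This is a minimal sketch using only the Python standard library. The function name `with_trailing_slash` and the example host and path are hypothetical, since the full endpoint URL is not given in this thread.

```python
from urllib.parse import urlsplit, urlunsplit


def with_trailing_slash(url):
    """Return url with a trailing "/" on its path, leaving query and fragment intact."""
    parts = urlsplit(url)
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


# Hypothetical URL used only to illustrate the normalization.
print(with_trailing_slash("https://example.com/api/crawlconfigs"))
```

Appending to the parsed path rather than to the raw string keeps any query string after the slash instead of before it.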
All done!
Crawl scaling and tags to be separate issues.