alex-moreno / glitcherbot
Visual regression testing made easy. Automating the boring stuff.
License: GNU Affero General Public License v3.0
At the moment, Vagrant or installing the dependencies locally are the only ways to run the crawler.
Ideally we'd have a way to run the crawler inside a Docker container, so that dependencies are reduced as much as possible for anyone who chooses Docker over Vagrant.
Next level: now that we have some stored historical data, I wonder whether it would be worth looking at something like https://www.chartjs.org/ to put together some simple visualisations?
Or, we could get the data into grafana :)
On a brand new install, or when the database is empty because no crawl has run yet, most of the pages display an error.
Instead, it would be great to show the user what they need to do to start seeing results.
When a status code or size changes between crawls, this should be highlighted in the table of results.
For the size, a threshold should be used to define acceptable tolerances.
We can use Bootstrap classes for the highlighting:
table-danger if the size changes outside of tolerances, or if the status code changes
table-warning if the size changes but is inside tolerances

Push any errors from the process into a log file, somewhere in /etc/* maybe.
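The tolerance check described above could be sketched like this. A minimal sketch only: the function name and the 5% default threshold are assumptions, not part of the codebase.

```php
<?php

// Pick a Bootstrap class for a results row, given two crawls of the same URL.
// Hypothetical helper: the 5% size tolerance is an assumed default.
function highlightClass(int $oldStatus, int $newStatus, int $oldSize, int $newSize, float $tolerance = 0.05): string
{
    if ($newStatus !== $oldStatus) {
        return 'table-danger'; // status code changed
    }

    $delta = $oldSize > 0 ? abs($newSize - $oldSize) / $oldSize : 0.0;

    if ($delta > $tolerance) {
        return 'table-danger';   // size changed outside tolerance
    }
    if ($newSize !== $oldSize) {
        return 'table-warning';  // size changed, but within tolerance
    }

    return ''; // nothing to highlight
}
```

The returned string can be dropped straight into the `class` attribute of the table row.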
Describe the bug
There are a few notices and warnings being produced by the tool when in use. For example:
Notice: Undefined index: date1 in src/ScraperBot/Routing/Controllers/SitesController.php on line 84
Notice: Undefined index: date2 in src/ScraperBot/Routing/Controllers/SitesController.php on line 84
Notice: Undefined variable: persistNaughty in src/ScraperBot/Routing/Controllers/SitesController.php on line 84
Notice: Undefined variable: persistLatest in src/ScraperBot/Routing/Controllers/SitesController.php on line 84
To Reproduce
Errors are produced on most pages of the site, but also when using the command line interface.
Error also produced by a missing favicon.ico file.
Expected behavior
No errors should be produced during the normal use of the tool.
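Notices like the ones listed above usually come from reading array keys or variables that may not be set. A hedged sketch of the kind of fix that silences them; the variable names mirror the notices, but the actual controller code is an assumption:

```php
<?php

// Simulated request parameters, standing in for $_GET or the request object.
$query = ['date1' => '2021-01-01'];

// The null coalescing operator avoids "Undefined index" notices
// when a key is missing, by supplying an explicit default.
$date1 = $query['date1'] ?? null;
$date2 = $query['date2'] ?? null;

// Variables that may never be assigned can be initialised up front,
// avoiding "Undefined variable" notices further down.
$persistNaughty = $persistNaughty ?? false;
$persistLatest  = $persistLatest ?? false;
```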
At the moment, the comparison between crawls is done using the whole URL.
It would be useful to select specific crawls for which we compare only the final part of the URL, so we can have different environments (e.g. stage and prod) and compare results between the two.
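Comparing only the final part of the URL could be done by stripping the scheme and host before comparing. A minimal sketch; the function name is an assumption:

```php
<?php

// Compare two URLs by path (and query string), ignoring scheme and host,
// so https://stage.example.com/about matches https://example.com/about.
function samePath(string $a, string $b): bool
{
    $pathOf = function (string $url): string {
        $parts = parse_url($url);
        return ($parts['path'] ?? '/') . (isset($parts['query']) ? '?' . $parts['query'] : '');
    };

    return $pathOf($a) === $pathOf($b);
}
```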
Whilst I think the inclusion of a full Vagrant/Docker setup is a good idea, providing a simpler way to get up and running would lower the barrier to entry. I had a few troubles getting the Vagrant commands to run correctly on my setup, so I resorted to an alternative method that worked well.
Can I therefore suggest alternate instructions that allow the site to be served directly using the built-in PHP server? All that's needed is the following command.
php -S 0.0.0.0:8000 -t html html/index.php
I was able to get the site up and running using this and view my statistics. The command can be wrapped in the scripts section of composer.json so it can be run easily without having to remember it. Adding this:
"scripts": {
    "start": "php -S 0.0.0.0:8000 -t public public/index.php"
}
allows the site to be started by simply running:
composer start
This does assume that the user has all of the local dependencies installed, which are nicely detailed in the composer.json file :)
Idea: execute directly:
php bin/visual_regression_bot.php bot:crawl-xml-sitemap sitemap.xml
Thank you for working on a non-JS based regression tool.
I am having trouble diffing two different environments, I am assuming the workflow is the following:
php bin/visual_regression_bot.php bot:crawl-sites production-site.csv
php bin/visual_regression_bot.php bot:crawl-sites development-site.csv
php bin/visual_regression_bot.php bot:compare-crawls production-site.csv development-site.csv
If we use some plugin or API kind of mechanism, we'd be able to trigger snapshots when, for example, we detect a threshold has been passed.
At the moment the sites are only kept in the CSV or the JSON. When doing the first crawl we need to store those sites and indexes in a new table.
Potentially use ncurses for the terminal, and clean up the output using a log file maybe.
I think a nice feature would be to provide a UI similar to https://validator.w3.org/ where users can either paste a list of URLs or upload a CSV or JSON file to be processed.
Tasks:
1 - Provide an upload field to let users upload a CSV or JSON file.
2 - A textarea field to allow people to paste a list of URLs.
3 - Add validation of the data in the backend.
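The backend validation in task 3 could be sketched like this. A minimal sketch, assuming the pasted input arrives as one URL per line and that PHP's built-in filter_var validation is acceptable:

```php
<?php

// Split a pasted textarea value into individual URLs and keep only
// the ones that pass PHP's built-in URL validation.
function parseUrlList(string $input): array
{
    $lines = preg_split('/\R+/', trim($input)) ?: [];

    $valid = [];
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line !== '' && filter_var($line, FILTER_VALIDATE_URL) !== false) {
            $valid[] = $line;
        }
    }

    return $valid;
}
```

The same function could be reused for the upload path by running the decoded CSV/JSON rows through it.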
An abstract source interface, used by the crawler to fetch sites
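Such an interface might look like the sketch below. The method name and shape are assumptions for illustration, not the project's actual API:

```php
<?php

// Hypothetical abstraction over where the list of sites comes from,
// so the crawler does not care whether it is a CSV, JSON, or sitemap.
interface SourceInterface
{
    /** @return string[] list of URLs to crawl */
    public function getLinksList(): array;
}

// Example implementation backed by a plain PHP array (e.g. decoded JSON).
class ArraySource implements SourceInterface
{
    public function __construct(private array $urls) {}

    public function getLinksList(): array
    {
        return $this->urls;
    }
}
```

A CSV-backed or sitemap-backed source would then just be another implementation of the same interface.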
When any errors happen, it would be good to log them in the database, so when drilling down on a site we can find out what went wrong at the moment we were trying to fetch it.
This is a TODO.
Ideally we should also be able to provide a sitemap, so the user can simply give a main URL and the bot can crawl everything inside it.
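Extracting the URLs from a sitemap could be sketched with SimpleXML, assuming a standard sitemap.xml structure in the usual sitemaps.org namespace:

```php
<?php

// Pull the <loc> entries out of a standard XML sitemap string.
function sitemapUrls(string $xml): array
{
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        return [];
    }

    // Register the standard sitemap namespace so XPath can match <loc>.
    $doc->registerXPathNamespace('sm', 'http://www.sitemaps.org/schemas/sitemap/0.9');
    $locs = $doc->xpath('//sm:loc') ?: [];

    return array_map('strval', $locs);
}
```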
Describe the bug
None of the routes work after taking the latest changes from the repository. I am using the Vagrant-based setup.
To Reproduce
Steps to reproduce the behavior:
0. I set up a VirtualHost in Apache with the domain name dashboard.glitcherbot.local:
sudo /etc/apache2/sites-available/dashboard.glitcherbot.local.conf
<VirtualHost *:80>
ServerAdmin [email protected]
ServerName dashboard.glitcherbot.local
ServerAlias www.dashboard.glitcherbot.local
DocumentRoot /var/www/html
DirectoryIndex index.php index.html
ErrorLog ${APACHE_LOG_DIR}/dashboard.glitcherbot.com-error.log
CustomLog ${APACHE_LOG_DIR}/dashboard.glitcherbot-access.log combined
</VirtualHost>
sudo apache2ctl -M | grep rewrite && sudo a2enmod rewrite
sudo systemctl restart apache2.service
Expected behavior
Apache should route requests through index.php automatically, so all the menu items can be navigated.
Screenshots
File: src/templates/menu.twig
Not working
<a class="nav-link" href="/sites">Diffs</a>
Working
<a class="nav-link" href="/index.php/sites">Diffs</a>
When on the diff page, it would be good to filter results by specific status codes. Say, I only want to see pages which returned a 500 status.
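Filtering the diff rows by status code could be as simple as the sketch below; the row shape with 'url' and 'status' keys is an assumption for illustration:

```php
<?php

// Keep only result rows whose HTTP status matches the requested code.
// The ['url' => ..., 'status' => ...] row shape is an assumption.
function filterByStatus(array $rows, int $status): array
{
    return array_values(array_filter(
        $rows,
        fn (array $row): bool => ($row['status'] ?? 0) === $status
    ));
}
```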
Unless I am misunderstanding how to use it, the sitemap crawl using the bot:crawl-xml-sitemap command is failing as follows:
execute:
php bin/visual_regression_bot.php bot:crawl-xml-sitemap sitemap.xml
fails with:
Sitemaps crawling>>>> PHP Fatal error: Uncaught Error: Call to undefined method ScraperBot\Source\XmlSitemapSource::getCurrentIndex() in /Projects/glitcherbot/src/ScraperBot/Command/CrawlSitesCommand.php:83
Stack trace:
#0 /Projects/glitcherbot/vendor/symfony/console/Command/Command.php(256): ScraperBot\Command\CrawlSitesCommand->execute(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#1 /Projects/glitcherbot/vendor/symfony/console/Application.php(971): Symfony\Component\Console\Command\Command->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#2 /Projects/glitcherbot/vendor/symfony/console/Application.php(290): Symfony\Component\Console\Application->doRunCommand(Object(ScraperBot\Command\CrawlXmlSitemapCommand), Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Compo in /Projects/glitcherbot/src/ScraperBot/Command/CrawlSitesCommand.php on line 83
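The fatal error above suggests XmlSitemapSource is missing a method that CrawlSitesCommand expects on its sources. A hedged sketch of what such a method might look like, assuming the source keeps a simple position counter; the class shape and semantics here are assumptions, not the project's actual code:

```php
<?php

// Hypothetical sketch: a source that tracks how far through its URL
// list the crawler has progressed, exposing it via getCurrentIndex().
class XmlSitemapSourceSketch
{
    private int $currentIndex = 0;

    public function __construct(private array $urls) {}

    // Return the next URL to crawl, or null when the list is exhausted.
    public function next(): ?string
    {
        if ($this->currentIndex >= count($this->urls)) {
            return null;
        }
        return $this->urls[$this->currentIndex++];
    }

    public function getCurrentIndex(): int
    {
        return $this->currentIndex;
    }
}
```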