Siteshooter

Automate full-website screenshots and PDF generation, with support for multiple viewports.

Features

  • Crawls specified host and generates a sitemap.xml on the fly
  • Generates screenshots of the entire website based on sitemap.xml
  • Supports multiple user-defined viewports
  • Automated PDF generation
  • Includes crawled metadata in the generated PDF
  • Reports on broken website links (HTTP 404 responses)
  • Supports HTTP basic authentication
  • Supports Microsoft Online 3-step authentication
  • Supports Salesforce Visualforce 3-step authentication
  • Supports sitemaps with HTTP, HTTPS, and FTP protocol URLs
  • Follows HTTP 301 redirects
  • Custom JavaScript inject file, injected into each page before the screenshot is taken
  • Trigger page events by passing querystring values to a custom inject.js file

Do you need a website and workflow management platform?

Give Catapult, a website and workflow management platform, a shot.


In This Documentation

  1. Getting Started
  2. Siteshooter Configuration File
  3. CLI Options
  4. Tests
  5. Troubleshooting & FAQ

Getting Started

Dependencies

Install the following prerequisite on your development machine:

  • Node.js (which provides the npm command used in Quick Start)

Notable npm Modules

Quick Start

$ npm install siteshooter --global 

If siteshooter is installed, make sure you have the latest version by running:

$ npm update siteshooter --global 
  • You may need to run these commands with elevated privileges (e.g. sudo). You will be prompted to do so if needed.
  • Installing with the --global flag makes the siteshooter command available on your machine's command line from any path.
  • Read more about the --global flag in the npm documentation.

Create a Siteshooter Configuration File

$ siteshooter --init 

Update Siteshooter Configuration File

View the full siteshooter.yml example

Inside siteshooter.yml, add additional options.

  • All Simple Web Crawler options can be added to sitecrawler_options and will pass through to the crawler process
  • Generated screenshot image files are optimized with the imagemin and imagemin-pngquant modules, which reduces the overall size of generated PDFs. To adjust the image quality, update the image_quality option in your siteshooter.yml file.

domain:
  name: https://www.devopsgroup.io
  auth:
    user:
    pwd:

pdf_options:
  excludeMeta: true

screenshot_options:
  delay: 2000
  image_quality: '60-80'
  transparent_background: false

sitecrawler_options:
  exclude:
    - "pdf"
  stripQuerystring: false
  ignoreInvalidSSL: true

viewports:
  - viewport: desktop-large
    width: 1600
    height: 1200
  - viewport: tablet-landscape
    width: 1024
    height: 768
  - viewport: iPhone5
    width: 320
    height: 568
  - viewport: iPhone6
    width: 375
    height: 667
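
The image_quality value in the example configuration is a 'min-max' string. As a minimal sketch, assuming the value follows the 0 to 100 min-max form used by pngquant's quality setting, a range could be sanity-checked before a run like this. parseQualityRange is a hypothetical helper for illustration and is not part of Siteshooter.

```javascript
// Hypothetical helper (not part of Siteshooter): checks an image_quality
// string such as '60-80'. Assumes a "min-max" format with values from 0 to 100.
function parseQualityRange(value) {
  var match = /^(\d{1,3})-(\d{1,3})$/.exec(String(value).trim());
  if (!match) {
    throw new Error('image_quality must look like "min-max", got: ' + value);
  }
  var min = Number(match[1]);
  var max = Number(match[2]);
  if (min > max || max > 100) {
    throw new Error('image_quality must satisfy 0 <= min <= max <= 100, got: ' + value);
  }
  return { min: min, max: max };
}

console.log(parseQualityRange('60-80')); // { min: 60, max: 80 }
```

A lower range generally produces smaller images, and therefore smaller PDFs, at the cost of visual fidelity.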

CLI Options

$ siteshooter --help

Usage: siteshooter [options]

OPTIONS
_______________________________________________________________________________________
-c --config        Show configuration
-C --cwd           Set working directory, which will load a siteshooter.yml file in the specified path
-e --debug         Output exceptions
-h --help          Print this help
-i --init          Create siteshooter.yml template file in working directory
-p --pdf           Generate PDFs, by defined view ports, based on screen shots created via Siteshooter
-q --quiet         Only return final output
-s --screenshots   Generate screen shots, by view ports, based on sitemap.xml file
-S --sitemap       Crawl domain name specified in siteshooter.yml file and generate a local sitemap.xml file
-v --version       Print version number
-V --verbose       Verbose output
-w --website       Report on website information based on Siteshooter crawled results

When running a siteshooter command without any options, the following options will run in order by default:

  • --sitemap
  • --screenshots
  • --pdf
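
The default ordering above can be modeled with a small sketch. This is not Siteshooter's actual option handling: resolveTasks and DEFAULT_ORDER are hypothetical names, and the sketch assumes that explicitly requested steps keep the same relative order as the default run, which this README does not state.

```javascript
// Hypothetical model of the default run order; Siteshooter's real
// implementation may differ.
var DEFAULT_ORDER = ['sitemap', 'screenshots', 'pdf'];

// flags: an array of long CLI flags, e.g. ['--pdf', '--screenshots'].
function resolveTasks(flags) {
  // Keep only the pipeline steps that were requested, in pipeline order.
  var requested = DEFAULT_ORDER.filter(function (task) {
    return flags.indexOf('--' + task) !== -1;
  });
  // With no pipeline step requested, fall back to the full default run.
  return requested.length > 0 ? requested : DEFAULT_ORDER.slice();
}

console.log(resolveTasks([]));                         // [ 'sitemap', 'screenshots', 'pdf' ]
console.log(resolveTasks(['--pdf', '--screenshots'])); // [ 'screenshots', 'pdf' ]
```

The order matters because each step consumes the previous step's output: screenshots are based on sitemap.xml, and PDFs are based on the screenshots.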

Custom JavaScript Inject File

To manipulate the DOM before a screenshot is taken, add an inject.js file to the same working directory as siteshooter.yml.

Example: inject.js file

/**
 * @file: inject.js
 * @description: used to inject custom JavaScript into a web page prior to a screen shot.
 */
console.log('JavaScript injected into page.');

if ( typeof(jQuery) !== "undefined" ) {
  jQuery(document).ready(function() {
    console.log('jQuery loaded.');
  });
}

Trigger JavaScript Events

When using the optional inject.js file, events can be triggered through the pevent querystring parameter.

<!-- Add a URL with the pevent querystring parameter to the generated sitemap.xml -->
<url>
  <loc>https://www.devopsgroup.io?pevent=open-privacy-overlay</loc>
  <changefreq>weekly</changefreq>
</url>
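
When several pages need a pevent entry, writing the XML by hand gets repetitive. The following Node.js sketch builds one url entry for a page and event name. buildSitemapEntry is a hypothetical helper, not something Siteshooter provides, and it assumes the entry is pasted into sitemap.xml manually. It runs in Node.js, not inside the page being captured.

```javascript
// Hypothetical helper (not part of Siteshooter): builds a sitemap <url>
// entry whose <loc> carries the pevent querystring parameter.
function buildSitemapEntry(pageUrl, pevent, changefreq) {
  var url = new URL(pageUrl);
  // Adds pevent, or replaces it if the URL already has one.
  url.searchParams.set('pevent', pevent);

  // Escape characters that are special in XML, e.g. "&" between parameters.
  var loc = url.toString()
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');

  return [
    '<url>',
    '  <loc>' + loc + '</loc>',
    '  <changefreq>' + (changefreq || 'weekly') + '</changefreq>',
    '</url>'
  ].join('\n');
}

console.log(buildSitemapEntry('https://www.devopsgroup.io', 'open-privacy-overlay'));
```

Note that the URL class normalizes a bare host to include a trailing slash, so the generated loc reads https://www.devopsgroup.io/?pevent=open-privacy-overlay.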

Example: Event detection & triggering

/**
 * @file: inject.js
 * @description: used to inject custom JavaScript into a web page prior to a screen shot.
 */

// Returns the decoded value of a querystring parameter, or undefined if absent.
function getQueryVariable(variable) {
  var query = window.location.search.substring(1);
  var vars = query.split('&');
  for (var i = 0; i < vars.length; i++) {
    var pair = vars[i].split('=');
    if (decodeURIComponent(pair[0]) == variable) {
      return decodeURIComponent(pair[1]);
    }
  }
}

if ( typeof(jQuery) !== "undefined" ) {
  jQuery(document).ready(function() {
    var pageName = window.location.pathname.replace('/', ''),
        pageEvent = getQueryVariable('pevent');

    console.log('document ready.');
    console.log('userAgent', navigator.userAgent);
    console.log('Page: ', pageName);
    console.log('Event: ', pageEvent);

    switch (pageName) {
      // home
      case '':
        switch (pageEvent) {
          case 'open-privacy-overlay':
            jQuery('a[data-target~="#modal-privacy"]').trigger('click');
            break;
        }
        break;
    }
  });
}

Tests

Tests are written with Mocha and can be run with npm test.

Troubleshooting

If you're having issues with Siteshooter, submit a GitHub Issue.

  • Make sure you have a siteshooter.yml file in your working directory and that the YAML file is well formed
  • Experiencing font-loading issues? Try increasing the delay setting in your siteshooter.yml file
screenshot_options:
  delay: 2000
  • Trying to take a screenshot of a page with a video? Unfortunately, PhantomJS does not support videos. As such, here's one approach to showing a video's poster image.
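/**
 * @file: inject.js
 * @description: used to display a video's poster image
 */
if ( typeof(jQuery) !== "undefined" ) {
  // Handle each video separately so every element gets its own poster image.
  jQuery('video').each(function() {
    var video = jQuery(this);
    video.parent().prepend('<img src="' + video.attr('poster') + '"/>');
    video.remove();
  });
}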
  • SimpleCrawler TypeError: The header content contains invalid characters
    • Try setting the acceptCookies option to false
sitecrawler_options:
  acceptCookies: false

Code of Conduct

Take a moment to read our Code of Conduct.

Contributing to the project

We are always looking for quality contributions! Please check the CONTRIBUTING.md for contribution guidelines.
