The SEO Spider Tool Crawls & Reports On The Following

A quick summary of some of the data collected in a crawl includes:
  • Errors – Client errors such as broken links & server errors (No responses, 4XX, 5XX).
  • Redirects – Permanent or temporary redirects (3XX responses).
  • Blocked URLs – View & audit URLs disallowed by the robots.txt protocol.
  • External Links – All external links and their status codes.
  • Protocol – Whether the URLs are secure (HTTPS) or insecure (HTTP).
  • URI Issues – Non-ASCII characters, underscores, uppercase characters, parameters, or long URLs.
  • Duplicate Pages – Algorithmic check of hash values (MD5 checksums) for exact duplicate pages.
  • Page Titles – Missing, duplicate, over 65 characters, short, pixel width truncation, same as h1, or multiple.
  • Meta Description – Missing, duplicate, over 156 characters, short, pixel width truncation or multiple.
  • Meta Keywords – Mainly for reference, as they are not used by Google, Bing or Yahoo.
  • File Size – Size of URLs & images.
  • Response Time.
  • Last-Modified Header.
  • Page Depth Level.
  • Word Count.
  • H1 – Missing, duplicate, over 70 characters, multiple.
  • H2 – Missing, duplicate, over 70 characters, multiple.
  • Meta Robots – Index, noindex, follow, nofollow, noarchive, nosnippet, noodp, noydir etc.
  • Meta Refresh – Including target page and time delay.
  • Canonical link element & canonical HTTP headers.
  • X-Robots-Tag.
  • rel=“next” and rel=“prev”.
  • AJAX – The SEO Spider obeys Google’s AJAX Crawling Scheme.
  • Inlinks – All pages linking to a URI.
  • Outlinks – All pages a URI links out to.
  • Anchor Text – All link text. Alt text from images with links.
  • Follow & Nofollow – At page and link level (true/false).
  • Images – All URIs that link to an image & all images from a given page. Images over 100 KB, missing alt text, alt text over 100 characters.
  • User-Agent Switcher – Crawl as Googlebot, Bingbot, Yahoo! Slurp, mobile user-agents or your own custom UA.
  • Configurable Accept-Language Header – Supply an Accept-Language HTTP header to crawl locale-adaptive content.
  • Redirect Chains – Discover redirect chains and loops.
  • Custom Source Code Search – Find anything you want in the source code of a website, whether that’s Google Analytics code, specific text, or other code.
  • Custom Extraction – You can collect any data from the HTML of a URL using XPath, CSS Path selectors or regex.
  • Google Analytics Integration – You can connect to the Google Analytics API and pull in user and conversion data directly during a crawl.
  • Google Search Console Integration – You can connect to the Google Search Analytics API and collect impression, click and average position data against URLs.
  • XML Sitemap Generator – You can create an XML sitemap and an image sitemap using the SEO Spider.
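
To make a few of the checks in the list above concrete, here is a minimal Python sketch of three of them: bucketing response codes, flagging page title issues, and grouping exact duplicates by MD5 checksum. It is an illustration only and is not taken from the SEO Spider itself. The function names (`classify_status`, `title_issues`, `exact_duplicates`) and the input shapes are assumptions made for this example; the 65 character title limit comes from the list above.

```python
import hashlib
from collections import defaultdict


def classify_status(code):
    """Map an HTTP status code (None means no response) to a report bucket."""
    if code is None:
        return "no response"
    if 300 <= code < 400:
        return "redirect (3XX)"
    if 400 <= code < 500:
        return "client error (4XX)"
    if 500 <= code < 600:
        return "server error (5XX)"
    return "ok"


def title_issues(titles, max_chars=65):
    """Return issues for the <title> values found on one page.

    `titles` is a hypothetical input: a list of the title strings
    extracted from a single page, in document order.
    """
    if not titles or not titles[0].strip():
        return ["missing"]
    issues = []
    if len(titles) > 1:
        issues.append("multiple")
    if len(titles[0]) > max_chars:
        issues.append(f"over {max_chars} characters")
    return issues


def exact_duplicates(pages):
    """Group URLs whose response bodies share the same MD5 digest.

    `pages` is a hypothetical mapping of URL -> body bytes. Only byte-for-byte
    identical bodies are grouped; near duplicates are not detected.
    """
    groups = defaultdict(list)
    for url, body in pages.items():
        groups[hashlib.md5(body).hexdigest()].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]
```

For example, `classify_status(404)` returns `"client error (4XX)"`, and `exact_duplicates({"/a": b"x", "/b": b"x", "/c": b"y"})` returns `[["/a", "/b"]]`. A real crawler would also need fetching, robots.txt handling and HTML parsing, which are left out here to keep the sketch short.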