When it comes to SEO and websites in general, there are a million and one tools that everyone says should be a “must-use”, especially the companies that build said tools. Some tools cost ridiculous amounts of money and just generate “pretty looking” reports and dashboards, while others just regurgitate data from another source into a different layout that is “executive friendly” or more “presentation ready”. One thing I really like about working here at Vizion Interactive is that we, as an independent agency, are constantly testing, exploring and re-testing various tools across our organization. Given that we all have robust experience (7+ years each), often with slightly different concentrations, our internal reviews also give us a diverse view of any tool we use.
Today, I am going to talk about Deepcrawl, a tool I have now been evaluating for more than 5 months. I have a lot to say about this tool, but will try to keep to just a few highlights that stand out for me, our team and our clients.
First, as an agency, the key is making sure the reported information, especially in exported format, is valuable and legible for clients of all sizes and complexities. This is an area where Deepcrawl seems to excel, especially when it comes to identifying errors across a website. The fields that get the most compliments from our clients are the “found_at” data columns. Deepcrawl provides two of them in its exports, “found_at_sitemap” and “found_at_url”, which give the exact locations where a URL was found. This works great for general clean-up of old 404s that are still being actively linked to, 302s and 301s that are being linked to instead of the new destination URLs, and so on. While many tools can identify these errors, they are not always great at providing the actual page where the offending link was found. This comes up more frequently than you would think, given that many clients (and websites in general) have been publishing content for long periods of time. When site changes have been made (redesigns with new URLs, taxonomy optimizations, out of stock/obsolete products, etc.), we can more easily identify where these old URLs are linked from, go straight there and edit/update the sources.
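To make the clean-up workflow a little more concrete, here is a minimal Python sketch of how an export could be regrouped by the page that contains the offending links. Only the “found_at_url” and “found_at_sitemap” column names come from the export described above; the “url” and “http_status_code” column names, the sample rows and the helper function are assumptions for illustration, so adjust them to match whatever your actual export contains.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample export. Only the found_at_* column names are taken
# from the article; "url" and "http_status_code" are assumed names.
SAMPLE_EXPORT = """url,http_status_code,found_at_url,found_at_sitemap
https://example.com/old-page,404,https://example.com/blog/post-1,
https://example.com/moved,301,https://example.com/blog/post-1,https://example.com/sitemap.xml
https://example.com/ok,200,https://example.com/,https://example.com/sitemap.xml
"""


def group_broken_links_by_source(rows, bad_statuses=("301", "302", "404")):
    """Group problem URLs under the page that links to them."""
    by_source = defaultdict(list)
    for row in rows:
        status = (row.get("http_status_code") or "").strip()
        source = (row.get("found_at_url") or "").strip()
        # Keep only rows with a problem status and a known source page.
        if status in bad_statuses and source:
            by_source[source].append((row["url"], status))
    return dict(by_source)


rows = list(csv.DictReader(io.StringIO(SAMPLE_EXPORT)))
report = group_broken_links_by_source(rows)

# One entry per page to edit, listing the links on it that need updating.
for source, links in report.items():
    print(source)
    for url, status in links:
        print(f"  {status}  {url}")
```

Grouping by source page rather than by broken URL mirrors how the fix actually gets done: someone opens one page and updates every bad link on it in a single pass.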
Secondly, this tool provides great detail when a website is going through a redesign/relaunch project. It lets us continuously review development and staging environments to identify potential issues before they ever see the light of day. This means that clean-up can happen throughout the project, prior to launch, instead of the typical race we have all experienced once a site is launched and we (or our agencies) start to find these low-hanging-fruit issues. Typically, there is already enough going on post launch without knowingly (or unknowingly) adding a whole clean-up phase right after it. Additionally, the tool can easily identify cases where the sitemap files have aged and are potentially causing search engine spiders to spend more time than they should going through a site because old or bad URLs show up in them. This becomes more and more crucial for larger websites with higher page counts, given that we want search engines to efficiently find the pages we want them to find, and to properly index and rate those pages against their algorithms. For the same efficiency reason, we also want to make sure that any rules for including pages in the sitemap files are being implemented properly, so that the files contain only the URLs we want crawled and none of the URLs that should be excluded.
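As a rough illustration of the sitemap check described above (not Deepcrawl's own method), the Python sketch below lists the URLs in a sitemap file and flags any that did not return a 200 in a crawl. The sample sitemap, the `crawl_status` lookup and both function names are hypothetical; in practice the status codes would come from whatever crawl data you have on hand.

```python
import xml.etree.ElementTree as ET

# Standard namespace used by sitemap files (sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content for illustration.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/old-product</loc></url>
  <url><loc>https://example.com/moved</loc></url>
</urlset>"""


def sitemap_urls(xml_text):
    """Return every <loc> value listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def stale_sitemap_entries(xml_text, crawl_status):
    """Return (url, status) pairs for sitemap URLs that are not a clean 200.

    A status of None means the URL never appeared in the crawl data.
    """
    stale = []
    for url in sitemap_urls(xml_text):
        status = crawl_status.get(url)
        if status != 200:
            stale.append((url, status))
    return stale


# Hypothetical crawl results: URL -> HTTP status code.
crawl_status = {
    "https://example.com/": 200,
    "https://example.com/old-product": 404,
    "https://example.com/moved": 301,
}

stale = stale_sitemap_entries(SAMPLE_SITEMAP, crawl_status)
for url, status in stale:
    print(status, url)
```

Anything this flags is a candidate for removal from the sitemap, or for replacement with its final destination URL, so that spiders are not sent to pages we no longer want crawled.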
Thirdly, the reports in the interface are very clean and well grouped, and they let you dig deeper, even into individual pages. I’ll start with the high-level data groupings. They break down as follows: Summary, Indexation, Configuration, Content, Validation, Links, Mobile, Traffic, Source Gap and Extraction. There is also a Change tracking aspect: once a site has been crawled and reviewed more than once, the tool begins to highlight the differences between crawls. This is an immense help in identifying the progress of work, spotting potential issues being exacerbated by something and, in general, providing visibility into the ever-changing landscape that is a website. Each of these areas is then broken down into even more granular details as shown below:
While this is, simply put, a HUGE amount of data, the folks at Deepcrawl have managed to make it fairly easily digestible through the organization of things shown above. Each of the Blue options is also individually expandable making it easy to dive into a certain area or facet of the website or to view things at a higher level to make the decisions around prioritization. An additional piece of the ease of use is that, when reviewing data via the interface on Deepcrawl, almost every element is clickable and will take you to a more granular report specific to the element that caught your attention all the way down to an individual page/URL record.
Speaking of the individual page reports, there is a ton of data for each and every URL reviewed by Deepcrawl. Each report includes not only a high-level summary of the page, but also a robust menu covering various aspects of that page/URL:
Highest Level Page Summary:
More Detailed Summary Menu:
Even with this much data within easy reach, they take it further, going beyond the ‘Overview’ with a variety of additional page-level options and data:
…and believe it or not, they still have even more data on the same page based on the overall macro view menus I showed earlier on:
Now, you can see that I am not kidding when I say there is “a HUGE amount of data” available through Deepcrawl.
One of the last features that we personally like is that the crawls are not run from our own computers (looking at you, Screaming Frog), but from Deepcrawl’s servers. This means that we can crawl multiple sites at once without being bogged down by RAM issues and, best of all, we can set crawls to run from various IPs, dynamic IPs or country-specific IPs. There is even a ‘Stealth mode crawl’ should a site have things on lockdown (note: using this mode precludes linking a Google Analytics account, and the platform where we most commonly run into this issue is HubSpot).
In the example above, we used a UK IP specifically because the client is UK based and we wanted to get a more native feel for the site versus crawling from a U.S.-based IP. This tool lets us actually do this and, if we have a client that does business across multiple nations, we can crawl the site from each location and see what differences occur.
Overall, this tool has become an additional component of our toolbox when it comes to our work for clients. There are many features that we have yet to fully tap into, but the information we are getting has been well received by the team and, most importantly, by our clients. It makes the issues easier for them to understand, and they can use the data along with our detailed recommendation write-ups to implement fixes more quickly. Often this is the crucial component in building success with a client. When recommendations are implemented in a timely manner, helped along by clean information from Deepcrawl that has been translated appropriately for the client and their level of internal knowledge, we get changes made more quickly, with less time spent on explanation and more time spent on measuring results.
So, is your agency using Deepcrawl? If you don’t have an agency and are managing your own SEO, are you using Deepcrawl internally? Deepcrawl even provides a limited free trial of the tool, so what are you waiting for?