Spiders Crawling the Web

Understanding the basics of how search engines work will help you get more of your website's pages indexed. Spiders are programs that search engines use to scour the web and collect information about pages as they are created and updated each day.

Spiders!  Yes, that’s right – Spiders!

Think about it for a moment: the internet is an interconnected "web" of pages strung together by link upon link.  Google's spiders are constantly at work, crawling billions and billions of web pages.  Picture a spider's web now and study how each strand connects to the others.  The spider crawls along each strand (or link between pages) to get from A to B.

What is a search engine spider?

A search engine spider is a program that automatically fetches pages from your website.  It is also known as a bot, robot, or crawler, and we like to refer to this activity as "spidering" or "crawling" a web page.  As it crawls, the spider collects information about each of your web pages.
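To make the idea concrete, here is a minimal sketch of a spider in Python, using only the standard library.  It fetches one page and collects the links it finds.  The URL and the "MySpider/1.0" User-Agent string are placeholders for illustration, not anything a real search engine uses:

    from urllib.request import urlopen, Request
    from html.parser import HTMLParser

    class LinkCollector(HTMLParser):
        # Collect the href of every <a> tag found on the page.
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    # Fetch a page, identifying ourselves with a User-Agent string
    # the way real crawlers do.
    request = Request("https://www.example.com/",
                      headers={"User-Agent": "MySpider/1.0"})
    page = urlopen(request).read().decode("utf-8", errors="replace")

    # Parse out the links. Each one is a new "web strand" to follow.
    collector = LinkCollector()
    collector.feed(page)
    print(collector.links)

A real spider then queues up each of those links and repeats the process, which is how it works its way across the entire web.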

How do I get spiders to visit my web pages?

In the past, getting a spider to visit your web page was as simple as going to Google and submitting your website.  Today, to get a spider to crawl your web page, you need links from other websites.  Think about the logic behind this for a moment.  As the web grows every day, spiders make the leap from one page to the next by following links from pages they already know about.  So, bottom line: you need to get links on other websites pointing to your own.
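What does such a link look like?  It is just an ordinary HTML anchor on someone else's page; when a spider crawls their site, it finds the link and follows it to yours (the URL here is a placeholder):

    <!-- On another website, pointing at yours -->
    <a href="https://www.yoursite.com/">A helpful article on this topic</a>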

What does a spider do when it arrives on your web page?

The first thing a spider will do is make a couple of quick checks to see whether it is even allowed to crawl the page in question. There are two ways a spider knows whether it may crawl or index a web page, as illustrated in the examples just after this list:

  1. by checking the robots.txt file, or
  2. by checking HTTP headers such as X-Robots-Tag.
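Here is what each of these looks like.  First, a short robots.txt file, which lives at the root of your site (the paths below are made up for illustration):

    # Rules for every spider
    User-agent: *
    Disallow: /admin/

    # Rules just for Google's spider
    User-agent: Googlebot
    Disallow: /private/

And second, an HTTP header the server can send along with a page to tell spiders not to index it:

    X-Robots-Tag: noindex

One subtlety worth knowing: robots.txt controls whether a spider may crawl a page at all, while the X-Robots-Tag header (like the robots meta tag) controls whether a crawled page may be indexed.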

Once permission is granted, the spider will collect any HTML meta data provided by the page. Meta data is information that helps categorize and organize the web page content for search engines.  I'll write another article about meta data very soon!
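As a quick preview, here is what that meta data looks like inside a page's <head> section (the title and description are invented for illustration):

    <head>
      <title>Handmade Leather Wallets | Example Shop</title>
      <meta name="description"
            content="Durable, hand-stitched leather wallets made in small batches.">
      <meta name="robots" content="index, follow">
    </head>

The description frequently becomes the snippet shown under your page in search results, so it is worth writing by hand.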

What do web spiders check?

After the spider has gone through the preliminary checks for permission and meta data, it will begin to collect content.  All of the following are considered part of your content:

  • Page titles
  • Sub-headings
  • Images
  • Paragraphs
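In HTML terms, here is where a spider finds each of these (the example content is invented for illustration):

    <h1>How to Choose a Web Host</h1>                    <!-- page title -->
    <h2>Shared vs. dedicated hosting</h2>                <!-- sub-heading -->
    <img src="servers.jpg"
         alt="A rack of web servers">                    <!-- image: spiders read the alt text -->
    <p>Shared hosting is the cheapest way to start.</p>  <!-- paragraph content -->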

Above all, the most important element you can provide in your content is links, both to other pages within your own site and to other people's websites. Linking to websites that are genuinely useful in relation to your own content improves the user experience.  But why would you want to link to someone else's website and send visitors from your site to theirs?  Because Google puts enormous emphasis on creating a great user experience, something reflected in its mission statement: "To organize the world's information and make it universally accessible and useful."

Real people employed by Google to view your website

Google takes "user experience" to the next level by employing real people to visit indexed websites and work through a checklist of items that gauge the quality of the user experience.  This is important to understand, because Google knows that spiders can only do so much when it comes to evaluating and collecting information.  The best advice I can give for your underlying Search Engine Optimization strategy is to create the best possible user experience you know how.

In Conclusion…

Competing for keywords your competition is already ranking for is not easy, and more often than not it is an uphill climb.  On top of that, Google is putting more and more emphasis on users' mobile experience when they view your website.  There are so many factors to consider in Search Engine Optimization, and we've only begun to scratch the surface!  Do you need Search Engine Optimization services?  If so, contact us and fill out our website quote form.  You can thank us later!