This is a very common question that worries a lot of bloggers and webmasters. Many are unsure how to treat junk links from spam sites, and whether spam links to 404 pages have any effect on the rankings of existing pages.
The short answer is no: spam links to 404 pages or to existing pages won’t have any effect on the rankings of your content. Google engineer John Mueller (JohnMu) clarified this in a Google webmaster help thread, saying:
I have never seen a case where bad links pointing to URLs that return 404 have ever caused a website any noticeable problem in web-search. 404s are a part of the internet, they’re expected to be seen when a non-existent URL is crawled, there’s no reason that I can think of where it would make sense to count 404s against a site.
The problem is more prevalent when link farms copy each and every post from your site, publish an excerpt and link back to the original story. A couple of years later, you delete a few hundred pages and suddenly there is a storm of crawl errors in your Google Webmaster Tools account. What’s more daunting is that some of these pages never existed on your domain, yet Google Webmaster Tools continues to show a whole bunch of crawl errors and 404 not found errors for them.
Here is a brief FAQ about how to properly deal with 404 pages:
Should You Redirect 404 Pages To The Homepage?
Q: Is it a proper practice to 301 redirect 404 pages to the homepage of the same website? You don’t want users to see a dead end and hence it makes sense to redirect them to another useful resource section on the same domain or the index page.
A: Never redirect 404 pages to the homepage or any other page of your site, if their content is not similar.
A 301 redirect tells users and search engines that a page has permanently moved to a new location and that the older address is now invalid or outdated. When you 301 redirect a 404 page to the homepage of your site, you are passing a false message to users and search engines, as if the earlier page had something to do with the homepage of your domain. The worst case is when you redirect pages in bulk to a single page on your domain whose content does not match the content of the original pages (now 404).
Instead, it is a far better idea to include links to your site’s most important sections on the 404 page itself and make navigation a breeze.
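The rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `MOVED`, `KEY_SECTIONS` and `respond` are hypothetical names, and the paths are made up. The idea is to send a 301 only when a page with genuinely similar content exists, and otherwise to return a real 404 whose body links to the site’s main sections.

```python
from http import HTTPStatus

# Hypothetical map of deleted URLs to pages with genuinely similar content.
MOVED = {
    "/old-seo-guide": "/seo-guide",
}

# Hypothetical list of important sections to link from the 404 page.
KEY_SECTIONS = ["/", "/archive", "/contact"]


def respond(path, existing_pages):
    """Return (status, headers, body) for a requested path."""
    if path in existing_pages:
        return HTTPStatus.OK, {}, f"Content of {path}"
    if path in MOVED:
        # A real equivalent exists, so a permanent redirect is honest.
        return HTTPStatus.MOVED_PERMANENTLY, {"Location": MOVED[path]}, ""
    # No equivalent page: answer with a true 404 and help the visitor navigate.
    links = "\n".join(f'<a href="{s}">{s}</a>' for s in KEY_SECTIONS)
    return HTTPStatus.NOT_FOUND, {}, f"<h1>Page not found</h1>\n{links}"
```

Calling `respond("/deleted-post", {"/", "/seo-guide"})` yields a 404 with navigation links and no `Location` header, while `/old-seo-guide` is redirected to `/seo-guide`.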
Will Spam Links To 404 Pages Affect Rankings Of Existing Pages?
Q: A lot of spam sites have re-blogged or plagiarized content from your site, linking back to an old story which is no longer available. How will Google and other search engines treat links from spam sites? Will they have any effect on the rankings of your existing pages?
A: If an external site is linking to a nonexistent (404) page on your domain, there is not much you can do. In general, I wouldn’t worry about it at all because of the golden rule: external links that point to your domain can either help your site rank better or have no effect whatsoever. In any case, no one can intentionally or unintentionally hurt your site’s reputation through spam link bombing.
In short, spam links to 404 pages do not have any negative impact on the rankings of your existing pages. Trying to get rid of junk links from external sites or contacting the owner of the spam site is just a waste of time.
What Is The Correct Way to Ensure That Google Removes 404 Pages From Its Index?
Q: You have used Google Webmaster Tools to fix crawl errors that occurred when you deleted a good number of pages or blog posts from your domain. But these pages keep coming back in Google Webmaster Tools reports. What is the easiest way to ensure that Google will drop all the 404 pages from its index?
A: Google and other search engines crawl new and existing pages via hyperlinks. When a hyperlink to a page is not available, Google may sometimes refer to the XML sitemap.
Hence, in order to ensure that Google does not try to recrawl your nonexistent pages, here are a few things you should do:
1. Clean up your site and remove broken links periodically. Google and other search bots will fetch those links whenever they see them, and each fetch triggers a new crawl error in your Google Webmaster Tools account. A large number of broken links is not a good user experience and might cause a drop in rankings for existing pages of your site.
2. Remove references to the old pages from your XML sitemap. It is better to rebuild a fresh sitemap of your entire website every six months.
3. Check the HTTP header of a 404 page and see whether the header returned is really 404 or something else.
And that is all. Leave the nonexistent 404 pages as they are; Google will eventually drop them from its index.
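Steps 2 and 3 of the checklist above can be sketched in Python using only the standard library. Treat this as a rough illustration, not a finished tool: `fetch_status` and `prune_sitemap` are hypothetical helper names, and the sketch assumes a plain sitemap file that uses the standard sitemaps.org namespace.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def fetch_status(url, timeout=10):
    """Return the HTTP status code a URL finally answers with.

    urllib follows redirects, so a deleted page that is redirected to
    the homepage reports 200 here, which is a sign of a misconfigured 404.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is what we want.
        return error.code


def prune_sitemap(xml_text, get_status=fetch_status):
    """Return the sitemap XML without entries whose URL answers 404."""
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(xml_text)
    for url_node in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url_node.findtext(f"{{{SITEMAP_NS}}}loc", default="").strip()
        if get_status(loc) == 404:
            root.remove(url_node)
    return ET.tostring(root, encoding="unicode")
```

`get_status` is a parameter so the pruning logic can be tried with a stub before it is pointed at a live site. Some servers answer `HEAD` requests differently from `GET`, so a surprising status is worth confirming with a normal request.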
Google Webmasters Tools Has A URL Removal Tool. Should You Use It To Fix Crawl Errors Occurring from 404 Pages?
Google’s URL removal tool is used to remove pages from Google search results, and it has been designed to address urgent requests only. This tool should not be used to clear URLs from the crawl error reports in your Google Webmaster Tools account. Here is a useful FAQ by the Google Webmaster Tools team which answers common questions about Google’s URL removal tool.
In general, you won’t need the URL removal tool at all. Honestly, I have wasted a good amount of time trying to fix crawl errors by removing 404 pages with the URL removal tool; the idea that this helps is a misconception.
When Can 404 Pages On My Website Have a Negative Impact?
The only possible scenario in which 404 pages may have a negative impact is when a good number of internal links on the same site point to those 404 pages. These links would dilute the Google juice from your existing pages and might annoy the bots, which will hit a dead end over and over again.
Otherwise, 404 pages are quite normal for any website and there is no need to panic over a thousand crawl errors in your Google Webmaster Tools account. Just take care of those broken links and you should be in pretty good shape.
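Finding those internal broken links can be automated. The sketch below is one possible approach, assuming you already have a page’s HTML and a set of paths known to return 404; `LinkCollector` and `broken_internal_links` are hypothetical names used only for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def broken_internal_links(page_url, html, dead_paths):
    """Return links on a page that stay on the same host and hit a dead path."""
    collector = LinkCollector()
    collector.feed(html)
    site = urlparse(page_url).netloc
    broken = []
    for href in collector.hrefs:
        # Resolve relative links against the page they were found on.
        target = urlparse(urljoin(page_url, href))
        if target.netloc == site and target.path in dead_paths:
            broken.append(target.geturl())
    return broken
```

Links to other domains are skipped on purpose, since only internal links are under your control.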