Iframes, SEO and the trouble with talking to the parents

May 11, 2011   //   by Jezza   //   Code, HTML/XHTML, Javascript, PHP

In the world of HTML and front end development, the use of frames on web pages went out of fashion somewhere near the start of the last decade. Frames, while serving a function in the era of the pre-semantic web, fell from grace for a number of reasons, the first and second being, in no particular order, limited accessibility and zero search engine index-ability. There is in this day and age no excuse whatsoever for their use, as there are numerous alternatives to just about every purpose they were ever put to. Iframes, while having all of the same accessibility and SEO shortcomings of frames, are generally deemed to be more acceptable as they can sit on a perfectly index-able and accessible parent page. So long as the iframe is not holding the bulk of the main content, then the damage caused is diminished. As such, iframes are commonly employed in advertising, tracking and other marketing activities.

Ok, so… recently I was called upon to create a module that would sit on a CMS page. This was to be part of a new micro-site that would initially consist of twelve pages, eventually growing to about thirty pages. The module was to appear on five of the pages and would be fed by five XML files update-able by the client. We did not have the access to write any server side code on the CMS. We had to simply supply a snippet which would be entered into a free HTML area of the CMS to achieve the desired result. The module was essentially a carousel that had a number of thumbnails underneath as shortcuts to various slides. Each slide could hold a number of different content types. It was a requirement that the page scroll to bring the carousel to the top of the viewport, if it was not already there, whenever a new slide in the carousel was selected. Also, navigation between the slides on the carousel was to fire an event on the CMS page as a metric that could be tracked with the analytics package they were using. The problems faced were therefore as follows:

  1. Content is dynamic – fed by XML files
  2. Server-side scripts can only be run from a different domain
  3. Module content would be the main content on the page
  4. Scroll page on interaction with the module
  5. Track carousel usage as a metric of the CMS page

The solution we decided to follow was one where the module was created and hosted on one of our servers and then embedded onto the CMS page using an iframe. The XML files were also hosted on the same server. We were therefore proposing that we would serve up the main content of a number of CMS pages via iframes, a complete no-no, as set out in the previous paragraphs. This was not ideal by any means but we were dealing with a number of restrictions that were themselves far from ideal.

So the page content was created on our server from the XML data, and placed onto each of the five CMS pages via iframes. There was still the requirement to scroll the parent page to the relevant position, and to track the interactions occurring within the iframe as page metrics of the parent page. This would involve some sort of communication between the child frame content and the parent page. But the two are on different servers, on different domains. In newer browsers, we can use window.postMessage(), but what about the legacy browsers? Tricky! Or so I thought. There is a hack known as ‘Fragment ID messaging’ whereby the child frame writes a fragment onto the end of the parent’s URL and the parent, by means of a setInterval() or setTimeout(), watches for this every second or so. This little trick has been nicely made into a Dojo plugin and a jQuery plugin. Here was now a means of passing event messages from the child to the parent. Perfect!
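The child-to-parent messaging described above can be sketched roughly as follows. This is a minimal illustration, not the API of the Dojo or jQuery plugins: the function names sendToParent and createFragmentWatcher are hypothetical, messages are assumed to be short plain strings such as "slide=3", and the child is assumed to have been told the parent's URL and origin (for example via the iframe's query string), since a cross-domain child can assign to parent.location but cannot read it.

```javascript
// Child side (hypothetical helper): prefer postMessage where the browser
// has it, otherwise fall back to writing a fragment onto the parent's URL.
function sendToParent(win, parentUrl, parentOrigin, message) {
  if (typeof win.parent.postMessage === "function") {
    // Restrict delivery to the expected parent origin.
    win.parent.postMessage(message, parentOrigin);
  } else {
    // Changing only the fragment does not reload the parent page.
    var base = parentUrl.split("#")[0];
    win.parent.location = base + "#" + encodeURIComponent(message);
  }
}

// Parent side (hypothetical helper): returns a function that, each time it
// is called, checks whether the fragment has changed and reports the new
// message. It is meant to be driven by setInterval().
function createFragmentWatcher(win, onMessage) {
  var lastHash = win.location.hash;
  return function poll() {
    var hash = win.location.hash;
    if (hash === lastHash) {
      return; // nothing new since the last check
    }
    lastHash = hash;
    if (hash.length > 1) {
      onMessage(decodeURIComponent(hash.slice(1)));
    }
  };
}
```

On the parent page this would be wired up with something like setInterval(createFragmentWatcher(window, handler), 1000), where handler scrolls the page and fires the analytics event. One limitation of this simple version is that sending the same message twice in a row leaves the fragment unchanged, so the second message goes unnoticed unless the child also appends something unique such as a counter.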

Not so fast. About a week after going live, the client noticed that we were listed on the second page of Google for the main keywords that were required. This was deemed unacceptable and the blame immediately fell onto the use of iframes. Though the proportion of pages within the micro-site that used the module was about 40%, this would fall as the site grew with other content. The simple solution was to put more index-able content on those parent pages containing the iframes.

Going forward, a solution was required that would allow the use of the module, whilst allowing its content to be index-able. I read in various places that the Googlebot had the ability to read content that was written to the page using the archaic document.write() method. With this in mind, I rewrote my PHP script to compose the module from the XML and output it as JSONP. This would allow me to read the data across domains and place it straight onto the CMS page using document.write(). Though this appeared to solve the problem, I could not get definitive confirmation that content written with document.write() was readable by Google. It also had to be remembered that Google was not the only search engine out there. Because of this, we could not have sufficient confidence in this solution to carry on with it.
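To illustrate the shape of the JSONP approach, here is a minimal sketch. The original script was written in PHP; the server side is shown in JavaScript here purely for illustration, and the names toJsonp, makeModuleWriter and renderCarousel, as well as the html field, are assumptions rather than anything from the real script.

```javascript
// Server side (illustrative stand-in for the PHP script): wrap the module
// data in a call to a named callback, which is all that JSONP is.
function toJsonp(callbackName, data) {
  // Only accept a plain identifier, so the response cannot carry extra script.
  if (!/^[A-Za-z_$][\w$]*$/.test(callbackName)) {
    throw new Error("invalid callback name");
  }
  return callbackName + "(" + JSON.stringify(data) + ");";
}

// Client side: create the callback that the JSONP response will invoke.
// It writes the pre-built module markup straight into the page.
function makeModuleWriter(doc) {
  return function renderCarousel(data) {
    doc.write(data.html);
  };
}
```

In the CMS snippet, the callback would be defined globally (for example window.renderCarousel = makeModuleWriter(document)) and followed by a script tag pointing at the remote JSONP URL. Because that script tag is loaded while the page is still being parsed, document.write() places the markup at that point in the page.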

Ultimately the best way to get content indexed is to have it as part of the HTML from the start, and in this instance this requires the module to appear directly on the page. I modified the PHP script again so that, once the client had updated the XML, the script would show a preview of the module as well as the embed code that the client could copy and paste directly into the CMS. This embed code would place the module directly onto the page. This is where I am at the moment. I will return to document any further progress I make.
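As a rough sketch of the embed-code idea, the following builds static markup from slide data and wraps it in a copy box for the preview page. The real script was PHP and read the XML directly; here the slides are assumed to be already parsed, and the title and body fields, the carousel class name and all function names are hypothetical.

```javascript
// Escape text so it can be placed safely inside HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the static, index-able markup for the module.
// `title` and `body` are assumed fields; the XML schema is not described.
function buildEmbedCode(slides) {
  var items = slides.map(function (slide) {
    return "  <li><h3>" + escapeHtml(slide.title) + "</h3><p>" +
      escapeHtml(slide.body) + "</p></li>";
  });
  return '<ul class="carousel">\n' + items.join("\n") + "\n</ul>";
}

// Wrap the embed code in a textarea so the client can copy it.
// The markup is escaped again so the browser shows it as text.
function buildCopyBox(embedCode) {
  return '<textarea readonly rows="10" cols="80">' +
    escapeHtml(embedCode) + "</textarea>";
}
```

The preview page would output buildEmbedCode(slides) once as live markup and once through buildCopyBox() as the text to paste into the CMS.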
