Ecommerce information architecture: Key components and guidelines

2nd Jun 2014

Information Architecture (IA) may sound dull but it’s a critical component of ecommerce and helps put the right data structures and standards in place to enable, amongst other things:

  • Site & catalogue structure
  • Core processes & functions e.g. site search
  • Business reporting & web analytics
  • SEO

This article collates my recent 'The devil is in the detail' series on Econsultancy, to take a look at some of the key components and guidelines for what ecommerce teams need to think about. 

1. Site structure

This is a sensible starting point for your IA: mapping out user journey flows through the website to identify key page types and the relationships between pages.

The site structure should provide a visual page hierarchy segmented into levels and inheritances (where a page or group of pages belong to a page at a higher level in the hierarchy).

In my experience there are three common oversights with site structure planning.

Insufficient depth of core page groups:

It’s easy for a disconnect to arise between how the product experts in the business categorise products and where users would expect to find them on the website.

A shallow catalogue structure can actually make it harder for users to find relevant products. One caveat is that intelligent faceted navigation can reduce the click path, but this creates additional challenges for SEO. For example, how do you ensure maximum coverage in the search index for the URL variations created when faceted navigation adds parameters?

Lack of attention to detail for non-product content:

Usually this is a lack of long-term thinking, focusing only on what the business has now and how it can be presented to users. However, you need to think about where your content strategy will be in the future and build an architecture that supports that.

Of course plans change and evolve but it pays to think ahead so you don’t have to do lots of fiddly development to accommodate new ideas.

For example, how easy is it to integrate non-product content like blogs, buying guides and videos into your site search results page? Do you have the relevant data tags set against each content asset to enable the search index to return only the most relevant matches to avoid diluting quality?

Loose structure for common assets:

A good example is product pages. By default on some platforms, a product page URL usually contains the string for the category the product is allocated to. However, products can sit in multiple categories, which then creates duplicate URLs.

Trying to iron out duplication issues with something like the canonical tag isn’t always easy in this situation. One solution is to use a dedicated folder for each content asset type, so use /products for all products. You can then allocate a product to as many categories as you want and maintain one clean URL.
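
As a rough illustration of the dedicated-folder idea, here is a minimal Python sketch. The domain, the `Product` class and the function names are all hypothetical and not tied to any particular platform; the point is simply that the URL is built from the product alone, so category allocations never create extra versions.

```python
from dataclasses import dataclass, field

BASE_URL = "https://www.example.com"  # placeholder domain (assumption)


@dataclass
class Product:
    slug: str  # URL-safe product name, e.g. "waxed-field-jacket"
    categories: list[str] = field(default_factory=list)  # any number of allocations


def product_url(product: Product) -> str:
    """Build the single public URL for a product; categories play no part."""
    return f"{BASE_URL}/products/{product.slug}"


def canonical_tag(product: Product) -> str:
    """Canonical link element pointing at the one clean product URL."""
    return f'<link rel="canonical" href="{product_url(product)}">'


jacket = Product("waxed-field-jacket", ["coats-jackets", "new-in", "sale"])
print(product_url(jacket))    # https://www.example.com/products/waxed-field-jacket
print(canonical_tag(jacket))
```

Adding or removing a category changes where the product is listed, not where it lives.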

Of course, there are other implications to consider. If you have a single product allocated to multiple categories, you still need to decide how to report on sales/revenue. Are sales allocated to a primary category, or do you report on sales based on which category on the site it was added to the basket from? Or do you have an aggregated roll-up view that can be split down into category level detail?

It’s important to ask this question and agree business-reporting needs with the finance team before setting data structures in stone.
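
To make the reporting question concrete, the sketch below rolls the same order lines up in two ways: by primary category and by the category the item was added to the basket from. The field names and figures are invented for illustration; your own order data will look different.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical order lines: each records the product's primary category and
# the category page the item was added to the basket from.
order_lines = [
    {"sku": "FJ-001", "revenue": Decimal("120.00"),
     "primary_category": "coats-jackets", "added_from": "sale"},
    {"sku": "FJ-001", "revenue": Decimal("120.00"),
     "primary_category": "coats-jackets", "added_from": "coats-jackets"},
    {"sku": "SC-002", "revenue": Decimal("45.00"),
     "primary_category": "accessories", "added_from": "sale"},
]


def revenue_by(lines, key):
    """Sum revenue per value of `key` ('primary_category' or 'added_from')."""
    totals = defaultdict(Decimal)
    for line in lines:
        totals[line[key]] += line["revenue"]
    return dict(totals)


by_primary = revenue_by(order_lines, "primary_category")
by_source = revenue_by(order_lines, "added_from")
print(by_primary)
print(by_source)
```

Both views add up to the same total, but the category-level split differs, which is exactly why the finance team needs to agree which view is the master.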

2. Catalogue structure

A key challenge with an ecommerce site is the number of levels within the product catalogue. There is a fine balance between reducing the click path and ensuring there is sufficient depth of category classification to make it obvious to users where to find products.

For retailers with a small product range, the decision is much easier to make than for a large catalogue retailer with hundreds of thousands of SKUs. 

Start by putting yourself in the shoes of your customer and ask the following questions:

  • What category structure would make it easy for users to find your products?
  • How many levels do you need within the catalogue?
  • Does each level need its own page template, or can you use faceted navigation to minimise the click path?
  • Are categories/sub-categories clearly named to make it obvious what’s in them?
  • Do you have enough categories to have a meaningful number of products within each category?
  • Is the language you’re using to name categories in keeping with what your users call them?

You also need to assess the SEO benefit of deep category structures; do you want unique URLs for long tail product searches so that each category level has its own page template, or do you prefer keeping the site hierarchy shallow and using on-site navigation tools to get users to products quickly?

Let’s use an example of a men’s fashion retailer selling Coats & Jackets as a sub-category within Clothing.

There are two key SEO ‘friendly’ options:

  • Use a sub-sub-category page for each range within Coats & Jackets so that this unique URL can be indexed and used as a landing page e.g.
  • Don’t have a unique sub-sub-category page but use URL parameters to differentiate between sub-category pages and submit each version to the search index e.g.”fieldjacket”.

The challenge with the second option is that usually faceted navigation has multiple options (product type, brand, colour, size, price etc.), so you need to define which facets generate URLs that need to be indexed and which ones don’t.

For example, if in the URL above the user also refined the page display by colour and size, additional URL parameters would be generated (e.g.”fieldjacket”&colour=”green”&size=”medium”), creating a new URL. Should this URL also be indexed or is it best to focus on the original URL as a primary for all searches relating to ‘mens field jackets’? 

There isn’t a ‘right decision’ so you need to weigh up the practicalities of elaborate rules for URL indexation.

If you submit a unique URL for ‘mens green medium field jackets’, do you have enough products to warrant this? Are you likely at some point to have zero items, therefore the user experience will be compromised and you’re likely to see conversion rates drop?
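
One way to keep these rules manageable is to write them down as a simple, testable function. The sketch below is only an assumption of how that might look: the parameter name `type`, the example URLs and the minimum product threshold are all made up, and your own rules may be more elaborate.

```python
from urllib.parse import parse_qs, urlsplit

INDEXABLE_FACETS = {"type"}  # assumed: only product type creates landing pages
MIN_PRODUCTS = 5             # assumed threshold for a worthwhile landing page


def should_index(url: str, product_count: int) -> bool:
    """Return True if a faceted URL should be submitted to the search index."""
    facets = set(parse_qs(urlsplit(url).query))  # facet names used in the URL
    if not facets <= INDEXABLE_FACETS:
        return False  # refined by a facet we don't want indexed
    return product_count >= MIN_PRODUCTS  # avoid thin or empty pages


base = "https://www.example.com/coats-jackets"
print(should_index(f"{base}?type=fieldjacket", 24))                           # True
print(should_index(f"{base}?type=fieldjacket&colour=green&size=medium", 3))   # False
```

URLs that fail the check would typically be handled with a canonical or noindex rule, depending on what your platform supports.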

3. URL structure

A common problem for ecommerce sites is system-generated URLs that contain strings that mean nothing to users.

This is often because the platform generates internal references for site components, for example a numerical code for a product category, which will be included in the URL by default unless over-written using an optimised URL structure.

Your platform may well support an SEO friendly URL in addition to the system generated one. You need to make sure that the SEO friendly URL is the one being served to users, sitemaps and search engines.

The table below is an example of how you can map out the URL structure for your website. It’s important to map out all different page types and create a consistent URL structure.

This also has an SEO benefit as it ensures you are using contextually relevant URLs.
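
If it helps, the same mapping can be expressed in code so that every page type builds its URL from one agreed pattern. The patterns and page types below are hypothetical examples rather than a recommended structure, and `slugify` is a simple helper written for this sketch.

```python
import re
import unicodedata

# Hypothetical URL patterns, one per page type.
URL_PATTERNS = {
    "category": "/{category}",
    "sub_category": "/{category}/{sub_category}",
    "product": "/products/{product}",
    "buying_guide": "/guides/{guide}",
}


def slugify(text: str) -> str:
    """Turn a display name into a lower-case, hyphen-separated URL segment."""
    ascii_text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def build_url(page_type: str, **names: str) -> str:
    """Fill the pattern for `page_type` with slugified names."""
    slugs = {key: slugify(value) for key, value in names.items()}
    return URL_PATTERNS[page_type].format(**slugs)


print(build_url("sub_category", category="Clothing", sub_category="Coats & Jackets"))
# /clothing/coats-jackets
print(build_url("product", product="Waxed Field Jacket"))
# /products/waxed-field-jacket
```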

Another common issue with URLs is the indexation of URLs that you don’t want in the search index.

For example, a product page may have session IDs appended to its URL, which creates a new version of the URL for every session.

If these URL parameters aren’t identified and managed effectively, the end result is content duplication for search and cluttered analytics reports where multiple URLs exist instead of a single version.

This makes data analysis and reporting complicated and unnecessarily time consuming.
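
A common way to manage this is to normalise URLs before they reach canonical tags, sitemaps or analytics. The sketch below uses Python's standard `urllib.parse` module; the list of parameter names to ignore is an assumption, so check what your own platform actually generates.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names of parameters that never change the page content.
IGNORED_PARAMS = {"sessionid", "sid", "phpsessid"}


def normalise_url(url: str) -> str:
    """Drop ignored parameters and sort the rest so one page has one URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    kept.sort()  # stable ordering avoids duplicates that differ only by order
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(normalise_url("https://www.example.com/products/waxed-field-jacket?sessionid=abc123"))
# https://www.example.com/products/waxed-field-jacket
```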

One client had more than 100,000 such URLs showing in their Webmaster Tools, and they had been sitting there for more than 12 months.

It really pays to keep a close eye on your indexation status and be proactive in addressing issues as they arise, otherwise you risk diluting your SEO efforts by clogging the index with irrelevant URLs.

4. Data formats

There are many different data types that help a website work.

You need to understand what format the data needs to be captured in to enable the back-end processing, such as order management and financial reporting and reconciliation.

Let’s use the example of defining data tables for the checkout, one of the most important processes on a transactional website.

There are four steps to think about:

  • Map out each data field that is required. Streamline this because the less data entry, the less chance for drop-out.
  • Define the type of data that is being captured e.g. is it free text, or is it a drop-down field where the user can’t enter any data and must instead select from a list?
  • Define data formatting requirements e.g. what is the maximum number of characters that the data field can support?
  • Set whether the field is required or not i.e. does the user have to enter a value into this field to progress? 
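
The four steps above can be captured in a simple field specification. This is a minimal sketch with invented field names, lengths and options, not a definitive checkout schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    name: str
    kind: str                          # "text" (free entry) or "select" (pick from a list)
    required: bool
    max_length: Optional[int] = None   # formatting limit for text fields
    options: tuple[str, ...] = ()      # allowed values for select fields


# Hypothetical checkout fields; keep this list as short as the process allows.
CHECKOUT_FIELDS = [
    FieldSpec("email", "text", required=True, max_length=254),
    FieldSpec("postcode", "text", required=True, max_length=10),
    FieldSpec("country", "select", required=True, options=("GB", "IE", "FR")),
    FieldSpec("delivery_note", "text", required=False, max_length=200),
]


def check_field(spec: FieldSpec, value: str) -> list[str]:
    """Return the problems with `value` for this field (empty list means OK)."""
    if not value:
        return [f"{spec.name} is required"] if spec.required else []
    errors = []
    if spec.kind == "select" and value not in spec.options:
        errors.append(f"{spec.name} must be one of the listed options")
    if spec.max_length is not None and len(value) > spec.max_length:
        errors.append(f"{spec.name} must be {spec.max_length} characters or fewer")
    return errors


print(check_field(CHECKOUT_FIELDS[2], "DE"))
```

Keeping the specification in one place means the front-end form, the back-end validation and the data tables can all be checked against the same definition.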

With regards to required data fields, you then need to define the process for handling exceptions.

First of all, is it clear to users what they need to enter e.g. do you make it clear that the password field must be a minimum of eight characters and contain at least one number?

Then you need to define the error scenarios and how the site responds to manage the user experience. For example, if an invalid password is entered, is the error message shown in-line so that it is as contextually relevant as possible?
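
Here is a small sketch of the password rule mentioned above (a minimum of eight characters and at least one number), returning messages keyed by field so they can be displayed in-line. The message wording and function names are my own assumptions.

```python
def password_errors(password: str) -> list[str]:
    """Check the example rule: at least eight characters and one number."""
    errors = []
    if len(password) < 8:
        errors.append("Password must be at least 8 characters long.")
    if not any(ch.isdigit() for ch in password):
        errors.append("Password must contain at least one number.")
    return errors


def validate_form(form: dict[str, str]) -> dict[str, list[str]]:
    """Map each failing field to its messages so the UI can show them in-line."""
    errors = {}
    problems = password_errors(form.get("password", ""))
    if problems:
        errors["password"] = problems
    return errors


print(validate_form({"password": "secret"}))
print(validate_form({"password": "s3cretpass"}))  # {}
```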

It pays to think about the user first. What’s the minimum amount of data you can capture to effectively support the process or feature they’re using? Start with this and you won’t bloat data capture requirements.

You can then look for neat UI design techniques to subtly ask for more data without interrupting the user journey.

5. SEO

Why is SEO vital for your IA? Because if you don’t think through the impact of your development decisions on your ability to get webpages indexed for relevant keyword searches, you’ve compromised your online marketing from the start.

Technical SEO is often an afterthought in the web development process, usually considered only after the website has gone live. The truth is that most of the issues that get flagged in a technical SEO audit could be avoided if they were considered in the early stages of web development.

Key things to define early on:

  • The rules for the generation of page titles, meta descriptions, URLs and H tags.
  • Redirect requirements for domain versions e.g. is the non-www version the primary, with the www. version redirecting to it? Or vice-versa?
  • How you will use the canonical tag to reduce the risk of content duplication.
  • Which pages you want to be indexed and which you don't (using the noindex tag).
  • Which internal links you want to be followed by search engines and which you don't (using the nofollow tag).
  • How you will handle pagination for product pages, including the use of ‘rel=prev’ and ‘rel=next’ tags and/or the canonical tag on a “View all” page.
  • What URL parameters the website will generate and how you need to manage these to control indexation issues.
  • How you will use 301/302 redirects and set-up the systems to enable you to create these without additional development work.
  • How you will use the robots.txt file and ensure this is easy to access and edit on the fly.
  • Which sitemaps you need, how they will be created and automated to ensure they are always updated.
  • Your 404 error page (soft and hard), ensuring it provides relevant links back to the website.
  • Search friendly file names for content assets like images, ensuring images also have alt text.
  • Rules for any blog content on the website e.g. will links in comments automatically have the ‘nofollow’ tag to avoid link spam?
  • Opportunities for using relevant mark-up such as authorship and reviews.

It's not an exhaustive list but it gives you a good rallying point.
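
To show how one of these rules might be pinned down, here is a sketch that generates pagination link tags for a paginated product listing. The `?page=` parameter and the example URL are assumptions, and search engines' handling of these hints has changed over time, so check current guidance before relying on them.

```python
def pagination_links(base_url: str, page: int, total_pages: int) -> list[str]:
    """Build rel="prev" / rel="next" link tags for page `page` of `total_pages`."""

    def page_url(number: int) -> str:
        # Page 1 lives at the clean URL so it isn't duplicated as ?page=1.
        return base_url if number == 1 else f"{base_url}?page={number}"

    tags = []
    if page > 1:
        tags.append(f'<link rel="prev" href="{page_url(page - 1)}">')
    if page < total_pages:
        tags.append(f'<link rel="next" href="{page_url(page + 1)}">')
    return tags


for tag in pagination_links("https://www.example.com/coats-jackets", 2, 3):
    print(tag)
```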

6. Integration of non-product content

It pays to sit down and think carefully about where on the site your users would expect/need to find relevant content to help them make decisions. You can then map out all page templates that need to be designed with this information requirement in mind, for example including blog content on search results pages.

Key tips:

  • Define what types of content you want to surface next to products e.g. buying guides, articles, blogs etc.
  • Identify which page templates you require this to happen on.
  • Define what data you need to store against the content assets to help automate the presentation on product pages.
  • Make sure your data systems are set-up to capture this data.
  • Define business rules for the presentation of the content e.g. only show blogs from the last 30 days. 

For example, if each content asset sits on its own unique webpage, you may decide to use the meta keywords property to insert keyword relevant tags e.g. jeans / denim for a buying guide on men’s jeans.

A script can be added to the PDP that matches the product title with the meta keywords property, so that a product called ‘Men’s Skinny Jeans’ will return a match for the buying guide but ‘Men’s Cashmere Scarf’ will not.
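
A matching script of this kind could look something like the sketch below. The content assets and their keyword strings are invented for illustration, and the matching is deliberately simple (whole-word overlap between the product title and the keywords); a real implementation would probably need stemming or synonyms.

```python
import re

# Hypothetical content assets with their meta keywords values.
content_assets = [
    {"title": "Buying guide: men's jeans", "meta_keywords": "jeans, denim"},
    {"title": "How to care for knitwear", "meta_keywords": "knitwear, jumper, cardigan"},
]


def words(text: str) -> set[str]:
    """Lower-case the text and split it into whole words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def related_content(product_title: str, assets: list[dict]) -> list[dict]:
    """Return assets whose meta keywords share a word with the product title."""
    title_words = words(product_title)
    return [asset for asset in assets if title_words & words(asset["meta_keywords"])]


print([a["title"] for a in related_content("Men’s Skinny Jeans", content_assets)])
# ["Buying guide: men's jeans"]
print(related_content("Men’s Cashmere Scarf", content_assets))
# []
```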

Some websites also make use of the meta keywords property for generating site search results pages, so you can kill two birds with one stone.


I appreciate that this is a whistle stop tour and in reality there is far more detail involved and a lot more elements to consider but hopefully this gives you food for thought.

Creating an IA for ecommerce is a critical element of your project and it needs to be a constant work in progress, updating based on the evolving needs of the business and your website users.

My recommendation is to give someone ownership of this and make sure it’s clearly documented, then use version control to update the master document over time.

James Gurd is owner of Digital Juggler

This article is an abridged version of his 'Ecommerce information architecture: The devil is in the detail' series from Econsultancy:

