How to create a document schema for product variants and SKUs for your ecommerce search experience

In this article, we'll explore the concepts of variants and SKUs in ecommerce, and how best to handle them when modeling data for your ecommerce search experience. The data models shown here are optimized for Elastic Enterprise Search.

Understanding variants and SKUs

In ecommerce, products are not just "products." There can be a hierarchy to products, which generally follows this pattern:

Product → Variant → SKU

Let's explore these terms and how they factor into ecommerce stores and search experiences.

Variants

In ecommerce, a variant is one version of a product that differs from the product's other versions in one or more attributes, such as color.

For example, a store may sell a Breezy Classic Button Down Shirt, in three different colors: green, red, and blue. Each color variation of the shirt is considered a variant of the product. For instance, a Red Breezy Classic Button Down Shirt is a variant of the Breezy Classic Button Down Shirt.

A shirt with three separate color variations

SKUs

The term SKU refers to a stock keeping unit. This is a code assigned to products to track inventory. SKUs are important to understand in order to power search features like "in stock" search filters.

A SKU can be assigned at various levels, including the product, variant, or even sub-variant level.

SKU at the product level

As an example, say you have an electronics store and you sell a video game called Mega Mayhem Machines. You'd likely assign a SKU directly to that product. There are no variations of that product — no colors, no sizes, and no variants — just a single product. So SKUs and products are one and the same in this case, and the data model is extremely simple.

ex. Mega Mayhem Machines has a one-to-one product to SKU relationship:

{
  product_id: 1,
  name: "Mega Mayhem Machines",
  sku: "12345",
  in_stock: true,
}

SKU at the variant level

If you have a retail shop selling a one-size-fits-all hat available in three different colors, then you'd likely have three different variants for the hat. You’d assign a SKU to each one of the color variants.

ex. Hat SKU assigned at variant level:

{
  product_id: 2,
  name: "Hat",
  variants: [
    {
      id: "2-1",
      color: "red",
      sku: "123423",
      in_stock: true,
    },
    {
      id: "2-2",
      color: "blue",
      sku: "91930",
      in_stock: true,
    },
    {
      id: "2-3",
      color: "green",
      sku: "039181",
      in_stock: true,
    }
  ]
}

Multiple SKUs within a single variant

If you have a retail store selling the Breezy Classic Button Down Shirt, available in three different colors, and also in three different sizes, then you may end up assigning a SKU based on size within each variant.

{
  product_id: 2,
  name: "Breezy Classic Button Down Shirt",
  variants: [
    {
      id: "2-1",
      color: "red",
      skus: [
        { id: "2-1-1", size: "s", sku: "847128", in_stock: true },
        { id: "2-1-2", size: "m", sku: "933372", in_stock: true  },
        { id: "2-1-3", size: "l", sku: "9010201", in_stock: true  },
      ]
    },
    {
      id: "2-2",
      color: "blue",
      skus: [
        { id: "2-2-1", size: "s", sku: "83312", in_stock: true  },
        { id: "2-2-2", size: "m", sku: "11937", in_stock: true  },
        { id: "2-2-3", size: "l", sku: "9993871", in_stock: true  },
      ]
    },
    {
      id: "2-3",
      color: "green",
      skus: [
        { id: "2-3-1", size: "s", sku: "991999", in_stock: true  },
        { id: "2-3-2", size: "m", sku: "981828", in_stock: true  },
        { id: "2-3-3", size: "l", sku: "003012", in_stock: true  },
      ]
    }
  ]
}
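Because a SKU can sit at any of these three levels, code that reads the catalog has to handle each shape. The following is a minimal sketch, assuming the field names used in the examples above (sku, variants, skus); the collectSkus helper is hypothetical and not part of any Elastic library.

```javascript
// Hypothetical helper: returns every SKU code in a product,
// whichever level the SKU is assigned at.
function collectSkus(product) {
  // Product-level SKU: the product itself carries the code.
  if (product.sku !== undefined) return [product.sku];

  return (product.variants ?? []).flatMap((variant) =>
    // Variant-level SKU, or several SKUs (one per size) inside the variant.
    variant.sku !== undefined
      ? [variant.sku]
      : (variant.skus ?? []).map((entry) => entry.sku)
  );
}

const game = { product_id: 1, name: "Mega Mayhem Machines", sku: "12345", in_stock: true };

const hat = {
  product_id: 2,
  name: "Hat",
  variants: [
    { id: "2-1", color: "red", sku: "123423", in_stock: true },
    { id: "2-2", color: "blue", sku: "91930", in_stock: true },
  ],
};

const shirt = {
  product_id: 2,
  name: "Breezy Classic Button Down Shirt",
  variants: [
    {
      id: "2-1",
      color: "red",
      skus: [
        { id: "2-1-1", size: "s", sku: "847128", in_stock: true },
        { id: "2-1-2", size: "m", sku: "933372", in_stock: true },
      ],
    },
  ],
};

console.log(collectSkus(game));  // [ '12345' ]
console.log(collectSkus(hat));   // [ '123423', '91930' ]
console.log(collectSkus(shirt)); // [ '847128', '933372' ]
```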

As you can see, ecommerce datasets can turn into heavily normalized, nested data structures. Now let’s explore how to optimize product, variant, and SKU data for search!

Deciding how to display product hierarchy in search results

When creating a search experience for ecommerce, you must first make a key decision: At what level do you want to show search results? Will each product be a result? Each variant? Or each SKU?

When a customer searches for "button down shirt," and the Breezy Classic Button Down Shirt appears in the results, you can show it three different ways:

Option A: Breezy Classic Button Down Shirt as a single search result: In this scenario, there is only one search result for the product.

One search result for the product

Option B: Each color variant of Breezy Classic Button Down Shirt as a single search result: In this case, there are three search results for the product: one result for each color.

Three search results, one for each color

Option C: Each SKU (or size + color variation combination) as a single result:  In this instance, there are nine results for the Breezy Classic Button Down Shirt: one for each size and color.

Nine results, one for each size and color

There's no right or wrong answer for how you choose to display search results; you can imagine scenarios where each method makes sense.

If you have only a few types of shirts available on your website and each product only has a few variants, then you might choose to show each variant as a single search result (option B). A search that generates three results for the term "button down shirt" allows customers to easily parse the results and find what they're looking for. 

However, as your data set grows, variant-level search results can quickly become overwhelming. Imagine 30 shirts that match your query, each with 30 color variations. In this situation, a customer would no longer see 3 results, but 900! Here, you may opt to show each product as a result (option A) rather than each variant.
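To make the arithmetic concrete, here is a small sketch that counts how many results one query would produce at each display level. The countResults helper and the synthetic catalog are illustrative assumptions; the three sizes per color are borrowed from the shirt example and are not part of the 30-by-30 scenario itself.

```javascript
// Hypothetical helper: number of results for one query at each display level.
function countResults(products, level) {
  if (level === "product") return products.length; // Option A

  const variants = products.flatMap((product) => product.variants);
  if (level === "variant") return variants.length; // Option B
  if (level === "sku") return variants.flatMap((variant) => variant.skus).length; // Option C

  throw new Error(`Unknown level: ${level}`);
}

// Synthetic catalog: 30 matching shirts, each with 30 colors and 3 sizes.
const catalog = Array.from({ length: 30 }, (_, p) => ({
  product_id: p + 1,
  variants: Array.from({ length: 30 }, (_, v) => ({
    id: `${p + 1}-${v + 1}`,
    skus: ["s", "m", "l"].map((size) => ({ size })),
  })),
}));

console.log(countResults(catalog, "product")); // 30
console.log(countResults(catalog, "variant")); // 900
console.log(countResults(catalog, "sku"));     // 2700
```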

Indexing products for search

In order to index product data for search in Elastic Enterprise Search, you take your normalized data model (shown above) and de-normalize it. In other words, take the objects that are nested in your data model and pull their values up into top-level fields or into separate documents. The documents you index should be flat, with no nested objects.

Using our Breezy Classic Button Down shirt example from above:

{
  product_id: 2,
  name: "Breezy Classic Button Down Shirt",
  variants: [
    {
      id: "2-1",
      color: "red",
      skus: [
        { id: "2-1-1", size: "s", sku: "847128" },
        { id: "2-1-2", size: "m", sku: "933372" },
        { id: "2-1-3", size: "l", sku: "9010201" },
      ]
    },
    {
      id: "2-2",
      color: "blue",
      skus: [
        { id: "2-2-1", size: "s", sku: "83312" },
        { id: "2-2-2", size: "m", sku: "11937" },
        { id: "2-2-3", size: "l", sku: "9993871" },
      ]
    },
    {
      id: "2-3",
      color: "green",
      skus: [
        { id: "2-3-1", size: "s", sku: "991999" },
        { id: "2-3-2", size: "m", sku: "981828" },
        { id: "2-3-3", size: "l", sku: "003012" },
      ]
    }
  ]
}

This becomes:

{
  id: "2-1",
  name: "Red Breezy Classic Button Down Shirt",
  product_id: 2,
  product_name: "Breezy Classic Button Down Shirt",
  color: "red",
  sizes: ["s", "m", "l"]
},
{
  id: "2-2",
  name: "Blue Breezy Classic Button Down Shirt",
  product_id: 2,
  product_name: "Breezy Classic Button Down Shirt",
  color: "blue",
  sizes: ["s", "m", "l"]
},
{
  id: "2-3",
  name: "Green Breezy Classic Button Down Shirt",
  product_id: 2,
  product_name: "Breezy Classic Button Down Shirt",
  color: "green",
  sizes: ["s", "m", "l"]
}

Let’s break down the steps.

First, you take each variant and make it a unique object that gets indexed into your search engine. Each object must include both the top-level product information (product_id and product_name) and the individual variant attributes, like id and name.

In your model, you also want to make sure that you capture each attribute you want to create user filters for. In this case, you intend to allow a customer to filter by size and color, so you pull them into top-level values.
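These steps can be sketched in a few lines of JavaScript. This is a minimal illustration, not an official indexing routine: the flattenProduct and capitalize helpers are hypothetical, and building the variant name from the color plus the product name is an assumption based on the example documents.

```javascript
// Hypothetical helper: upper-cases the first letter of a word.
function capitalize(word) {
  return word.charAt(0).toUpperCase() + word.slice(1);
}

// Hypothetical helper: turns one nested product into one flat document per variant.
function flattenProduct(product) {
  return product.variants.map((variant) => ({
    id: variant.id,
    // Assumption: the variant name is the capitalized color plus the product name.
    name: `${capitalize(variant.color)} ${product.name}`,
    // Product-level information is repeated on every variant document.
    product_id: product.product_id,
    product_name: product.name,
    // Filterable attributes are pulled up into top-level fields.
    color: variant.color,
    sizes: variant.skus.map((entry) => entry.size),
  }));
}

const nestedShirt = {
  product_id: 2,
  name: "Breezy Classic Button Down Shirt",
  variants: [
    {
      id: "2-1",
      color: "red",
      skus: [
        { id: "2-1-1", size: "s", sku: "847128" },
        { id: "2-1-2", size: "m", sku: "933372" },
        { id: "2-1-3", size: "l", sku: "9010201" },
      ],
    },
    {
      id: "2-2",
      color: "blue",
      skus: [
        { id: "2-2-1", size: "s", sku: "83312" },
        { id: "2-2-2", size: "m", sku: "11937" },
        { id: "2-2-3", size: "l", sku: "9993871" },
      ],
    },
  ],
};

const documents = flattenProduct(nestedShirt);
console.log(JSON.stringify(documents, null, 2));
```

Each element of documents is flat, so it can be indexed as its own document while still carrying the product_id it came from.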

Grouping variants

Assuming you've indexed your data in Elastic Enterprise Search using the data model above, think about how customers will search that data and how you want to render the results.

Going back to the earlier question about how to display search results, the data model above supports results at either the variant level or the product level. If you want to show individual results at the variant level, you'll just show each variant, as seen in Option B. Nothing additional is required.

However, if you want to show results at the product level, you can take advantage of the group option in the Search API in Elastic Enterprise Search. It allows you to group your variants together in order to show them as a single result.

In order to use the group option, you must find a common field within your schema that ties all related product variants together. In this case, you can keep product_id in your schema specifically for this purpose. When executing your search request, you specify the group option on that field, which gives you the following result:

[
  {
    "id": { "raw": "2-1"},
    "name": { "raw": "Red Breezy Classic Button Down Shirt"},
    "product_id": { "raw": 2},
    "product_name": { "raw": "Breezy Classic Button Down Shirt"},
    "color": { "raw": "red"},
    "sizes": { "raw" : ["s", "m", "l"] },
    "_group": [
      {
        "id": { "raw": "2-2"},
        "name": { "raw": "Blue Breezy Classic Button Down Shirt"},
        "product_id": { "raw": 2},
        "product_name": { "raw": "Breezy Classic Button Down Shirt"},
        "color": { "raw": "blue"},
        "sizes": { "raw" : ["s", "m", "l"] }
      },
      {
        "id": { "raw": "2-3"},
        "name": { "raw": "Green Breezy Classic Button Down Shirt"},
        "product_id": { "raw": 2},
        "product_name": { "raw": "Breezy Classic Button Down Shirt"},
        "color": { "raw": "green"},
        "sizes": { "raw" : ["s", "m", "l"] }
      }
    ]
  }
]
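A request body that asks for this kind of grouped response might be built as in the sketch below. The buildGroupedSearch helper is hypothetical; the query and group.field keys follow the Search API's documented request shape as best understood here, so check them against the API reference for your version. The endpoint URL, engine name, and API key depend on your deployment and are intentionally left out.

```javascript
// Hypothetical helper: builds the JSON body for a search request
// whose results are grouped on a single field.
function buildGroupedSearch(query, groupField) {
  return {
    query,
    // All documents sharing the same value in this field collapse into one result.
    group: { field: groupField },
  };
}

const groupedBody = buildGroupedSearch("button down shirt", "product_id");
console.log(JSON.stringify(groupedBody, null, 2));
```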

Notice that you have a single result in your result set, with an additional _group field that holds the remaining variants. This lets you cleanly keep all Breezy Classic Button Down Shirts together as a single result. Additionally, having all of the variants available together lets you do things like iterate over the available color variants to show color swatches inline in each result.
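As an illustration of that iteration, the sketch below collects one swatch per variant from a grouped result. The colorSwatches helper is hypothetical, and the sample result is trimmed to the fields the helper reads.

```javascript
// Hypothetical helper: one swatch per variant, taken from the top-level
// result plus every variant nested under its _group field.
function colorSwatches(result) {
  const variants = [result, ...(result._group ?? [])];
  return variants.map((variant) => ({
    id: variant.id.raw,
    color: variant.color.raw,
  }));
}

// Trimmed grouped result in the response shape shown above.
const groupedResult = {
  id: { raw: "2-1" },
  color: { raw: "red" },
  _group: [
    { id: { raw: "2-2" }, color: { raw: "blue" } },
    { id: { raw: "2-3" }, color: { raw: "green" } },
  ],
};

const swatches = colorSwatches(groupedResult);
console.log(swatches.map((swatch) => swatch.color)); // [ 'red', 'blue', 'green' ]
```

A result with no _group field yields a single swatch, so the same helper works for ungrouped responses.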

Variations shown as a single result, with color swatches

You'll also notice in the graphic above that you can use the top-level sizes and color fields to create filters using the facet feature. (Note that this is simple to do if you use the Search UI library.)
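If you build the request yourself rather than through Search UI, the body might combine grouping, facets, and filters as in the sketch below. The buildFacetedSearch helper is hypothetical, and the facets and filters shapes are written from memory of the Search API's request format, so treat them as assumptions and verify them against the API reference.

```javascript
// Hypothetical helper: builds a search body that groups by product,
// requests value facets for color and sizes, and applies selected filters.
function buildFacetedSearch(query, selected = {}) {
  const body = {
    query,
    group: { field: "product_id" },
    // Value facets return the counts used to render each filter option.
    facets: {
      color: [{ type: "value" }],
      sizes: [{ type: "value" }],
    },
  };

  // Each selected value becomes one clause; "all" means every clause must match.
  const clauses = Object.entries(selected).map(([field, value]) => ({ [field]: value }));
  if (clauses.length > 0) body.filters = { all: clauses };

  return body;
}

const unfiltered = buildFacetedSearch("button down shirt");
const redMedium = buildFacetedSearch("button down shirt", { color: "red", sizes: "m" });
console.log(JSON.stringify(redMedium, null, 2));
```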

Adding an in-stock filter

Perhaps you want to show an in-stock filter so that users can choose to only see results that are currently in stock. In the original data model, you tracked whether an item was in stock on each SKU, in the same nested object as its size.

Since the data model you're creating for search needs to be flat, you can no longer keep the size, sku, and in_stock fields beside each other in a nested object. Instead, you can create a new field called in_stock_sizes that lists only the sizes currently in stock, which allows customers to filter on both "in stock" and "size" at the same time.

{
  id: "2-1",
  name: "Red Breezy Classic Button Down Shirt",
  product_id: 2,
  product_name: "Breezy Classic Button Down Shirt",
  color: "red",
  sizes: ["s", "m", "l"],
  in_stock_sizes: ["s", "m", "l"]
}

Now you're able to let customers toggle the in-stock filter and only see sizes and colors that are actually in stock in your store.
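One way to derive in_stock_sizes from the nested data is sketched below. The inStockSizes helper is hypothetical, and the sample variant marks size "m" as out of stock purely for illustration; in the article's data every size is in stock.

```javascript
// Hypothetical helper: sizes of a variant whose SKU is currently in stock.
function inStockSizes(variant) {
  return variant.skus
    .filter((entry) => entry.in_stock)
    .map((entry) => entry.size);
}

// Nested red variant, with "m" set to out of stock for illustration only.
const redVariant = {
  id: "2-1",
  color: "red",
  skus: [
    { id: "2-1-1", size: "s", sku: "847128", in_stock: true },
    { id: "2-1-2", size: "m", sku: "933372", in_stock: false },
    { id: "2-1-3", size: "l", sku: "9010201", in_stock: true },
  ],
};

// Flat search document: sizes lists everything sold, in_stock_sizes only what is available.
const searchDocument = {
  id: redVariant.id,
  color: redVariant.color,
  sizes: redVariant.skus.map((entry) => entry.size),
  in_stock_sizes: inStockSizes(redVariant),
};

console.log(searchDocument.sizes);          // [ 's', 'm', 'l' ]
console.log(searchDocument.in_stock_sizes); // [ 's', 'l' ]
```

With the in-stock toggle on, a size filter would target in_stock_sizes instead of sizes, so a search for size "m" would no longer match this document.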

Colors and sizes shown for in-stock items only

Conclusion

In this article, you learned about the concepts of variants and SKUs in ecommerce and how to model them to build a best-in-class search experience for online storefronts using Elastic Enterprise Search.

Now it’s time to get started creating your own ecommerce search implementation. For some help implementing ecommerce search, check out Elastic’s ecommerce guide for Search UI.