Semantic Product Search as an extension to standard search in online shops

Summary

Many online shops still rely on simple keyword-based search, which often leads to irrelevant or no results at all.
We asked ourselves how these shops could benefit from the latest trends in AI development without requiring a migration to an enterprise commerce search solution like Coveo or Algolia. The answer: semantic product search. Especially for older B2B shops, semantic search offers a fast and cost-effective way to create added value and enhance existing systems.
We begin by explaining what semantic search is. Using the Google Embeddings API, we illustrate how product information can be transformed into a semantic representation and apply this concept to a concrete shop system: SAP Commerce / hybris with a Solr-based standard search. Finally, we take a look at the costs and provide an outlook on future development opportunities.

Why semantic search is important

Semantic search understands your customers’ search intent – not just the keywords. It delivers more relevant results, even when the exact search terms don’t appear in the product description.
For example, users can enter search queries based on use cases, such as “industrial adhesive for heat-resistant metal bonding” or “Mounting a TV on the wall”, without requiring an exact word match. In the latter case, a keyword-based search would likely return all TVs among the top results, even if they aren’t specifically suited for wall mounting.

Advantages

Embeddings

First, the product information must be transformed into a vector representation: a list of numbers (an embedding) arranged so that texts with similar meaning end up close to each other. This is where GenAI models come into play. In this post, we focus solely on text embeddings. Suitable standard models are listed in the Google Cloud Platform documentation. Other providers, such as OpenAI, also offer relevant models.

A high-quality semantic product search depends on the quality of your product data. Incomplete or poorly maintained information leads to poor search results. In such cases, you should consider using a PIM system like Akeneo or Atamya to optimize your product data.

Generating the embeddings can easily be tested in an initial proof of concept using a Jupyter Notebook, for example via Colab on Google Cloud Platform.
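As an illustration of this transformation step, the following sketch first condenses a product into one text and then sends a batch of such texts to an embedding model. It assumes the Vertex AI Python SDK (`vertexai.language_models`) and the model name `text-embedding-004`; the `Product` fields and the way they are joined are hypothetical and would have to be adapted to your own product data.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Product:
    # Hypothetical minimal product record; real shops will have more attributes.
    code: str
    name: str
    description: str
    category: str = ""


def product_to_text(product: Product) -> str:
    """Join the non-empty text fields into the string that gets embedded."""
    parts = [product.name, product.category, product.description]
    return ". ".join(p.strip() for p in parts if p and p.strip())


def embed_products(
    products: List[Product], model_name: str = "text-embedding-004"
) -> Dict[str, List[float]]:
    """Return a mapping of product code to embedding vector.

    Assumptions: the Vertex AI SDK is installed, vertexai.init(project=...,
    location=...) was called beforehand, and the model name is available in
    your project. Request batching for large catalogs is omitted here.
    """
    # Imported lazily so the text preparation above can be tried without the SDK.
    from vertexai.language_models import TextEmbeddingInput, TextEmbeddingModel

    model = TextEmbeddingModel.from_pretrained(model_name)
    inputs = [
        TextEmbeddingInput(text=product_to_text(p), task_type="RETRIEVAL_DOCUMENT")
        for p in products
    ]
    embeddings = model.get_embeddings(inputs)
    return {p.code: e.values for p, e in zip(products, embeddings)}


example = Product(
    code="4711",
    name="TV wall mount",
    description="Tilting bracket for 32 to 65 inch screens",
    category="TV accessories",
)
print(product_to_text(example))
```

Which fields go into the text is the main design decision here: anything a customer might describe in a query (use case, material, compatibility) is worth including, while internal attributes mostly add noise.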

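A simple approach to searching this data is to generate an embedding for the search text (as described above) and compute its dot product with the embedding of each existing product. The higher the value, the closer the match. If the embeddings are normalized to unit length, the dot product equals the cosine similarity, so a value of 1.00 means that the two vectors point in the same direction.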
This leads directly to the following use case: By sorting the list of dot products in descending order for a given product embedding, you can retrieve similar products.
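The two use cases above (search by query and similar products) can be sketched in a few lines of plain Python. The three-dimensional vectors are toy values standing in for real embeddings, and all function names are chosen for illustration only. The vectors are normalized first so that the dot product behaves like a cosine similarity.

```python
import math


def normalize(vector):
    """Scale a vector to unit length so the dot product equals cosine similarity."""
    length = math.sqrt(sum(x * x for x in vector))
    if length == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / length for x in vector]


def dot(a, b):
    """Dot product of two equally long vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_by_similarity(query_embedding, product_embeddings, top_k=3):
    """Return (product_id, score) pairs sorted by descending similarity."""
    query = normalize(query_embedding)
    scored = [
        (pid, dot(query, normalize(emb))) for pid, emb in product_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def similar_products(product_id, product_embeddings, top_k=3):
    """Rank all other products against the embedding of the given product."""
    others = {pid: emb for pid, emb in product_embeddings.items() if pid != product_id}
    return rank_by_similarity(product_embeddings[product_id], others, top_k)


# Toy vectors standing in for real embeddings (assumption for illustration).
catalog = {
    "tv-wall-mount": [0.9, 0.1, 0.0],
    "tv-55-inch": [0.6, 0.8, 0.0],
    "metal-adhesive": [0.0, 0.1, 0.9],
}
query = [1.0, 0.2, 0.0]  # pretend embedding of "mounting a TV on the wall"

results = rank_by_similarity(query, catalog)
related = similar_products("tv-55-inch", catalog)
```

This brute-force variant compares the query with every product, which is fine for a proof of concept with a few thousand products.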

The steps above can be followed in one of these two environments:

  1. Colab: requires a Google Cloud environment: Google Colab Notebook
  2. Cloud Skills Boost Lab: similar to the first option, but the environment is pre-configured for the exercise. However, there is a time limit, and resources unrelated to the lab cannot be created: Cloud Skills Boost Lab

At this point, one could stop and integrate this approach as is. While this is generally possible and cost-effective, it is not practical for large product catalogs, frequent product updates, or a high volume of parallel search queries.
In such cases, a vector database and vector search should be used.

Vector Database and Vector Search

Without going too deep into the details: The vectors/embeddings are stored in a suitable database.
Google’s Vector Search provides this functionality and indexes the embeddings. The search is based on an approximation algorithm that significantly reduces computation time while allowing for minor deviations.
For production use, we recommend Vector Search. It also supports hybrid search, which combines semantic and keyword-based search for even better results.
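To make the idea of hybrid search more concrete: one common way to merge a semantic and a keyword-based result list is reciprocal rank fusion (RRF). The sketch below illustrates that general technique on two hand-written result lists. It is not meant as a description of how Vector Search combines results internally, and the constant `k=60` is simply a frequently used default.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked ID lists into one list.

    Each list contributes 1 / (k + rank) per entry, so IDs that rank well in
    several lists end up first. Ties are broken alphabetically by ID.
    """
    scores = {}
    for ranking in rankings:
        for rank, product_id in enumerate(ranking, start=1):
            scores[product_id] = scores.get(product_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


# Hand-written example rankings (assumption for illustration).
semantic_hits = ["tv-wall-mount", "tv-bracket", "tv-55-inch"]
keyword_hits = ["tv-55-inch", "tv-wall-mount", "tv-43-inch"]

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
```

Products that appear in both lists move to the top, while products found by only one of the two searches still remain in the result.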

General Integration

Integration works by generating an embedding for the search query, executing the search, and returning unique IDs as results. These IDs can then either be directly converted back into product data or passed to the existing search provider for a follow-up query.
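The flow described above can be sketched as a small function that only wires the three steps together. The callables `embed_query`, `find_neighbors` and `load_products` are hypothetical placeholders for the embedding API, the vector search endpoint and the shop's product lookup; the `fake_*` functions exist only to make the example executable.

```python
from typing import Callable, Dict, List, Sequence


def semantic_search(
    query: str,
    embed_query: Callable[[str], Sequence[float]],
    find_neighbors: Callable[[Sequence[float], int], List[str]],
    load_products: Callable[[List[str]], Dict[str, dict]],
    limit: int = 10,
) -> List[dict]:
    """Embed the query, fetch the nearest product IDs, resolve them to products."""
    vector = embed_query(query)
    product_ids = find_neighbors(vector, limit)
    products = load_products(product_ids)
    # Keep the order returned by the vector search and silently skip IDs
    # that no longer exist in the shop (e.g. recently removed products).
    return [products[pid] for pid in product_ids if pid in products]


# Stand-ins for demonstration only; real implementations would call the
# embedding model, the vector index and the shop backend.
def fake_embed(text: str) -> List[float]:
    return [float(len(text)), 1.0]


def fake_neighbors(vector: Sequence[float], limit: int) -> List[str]:
    return ["P3", "P1", "P9"][:limit]


def fake_load(ids: List[str]) -> Dict[str, dict]:
    shop = {
        "P1": {"code": "P1", "name": "TV 55 inch"},
        "P3": {"code": "P3", "name": "TV wall mount"},
    }
    return {pid: shop[pid] for pid in ids if pid in shop}


hits = semantic_search("mounting a TV on the wall", fake_embed, fake_neighbors, fake_load)
```

Passing the three steps in as functions keeps the flow independent of a specific provider, so the product lookup can later be swapped for a follow-up query against the existing search index.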

Basic semantic search shop integration with Google Vector Search

Integration Example: SAP Commerce and Solr

In SAP Commerce (Cloud), the search provides filter values (facets) that allow users to refine the result set further. These can be easily retrieved through a follow-up query to the Solr index.
From a technical perspective, this can be implemented quickly and efficiently via a custom extension, ensuring seamless integration of semantic search with existing Solr-based filtering.
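At the level of the Solr request, such a follow-up query could look roughly like the sketch below. It restricts the result to the IDs returned by the semantic search using Solr's `terms` query parser and requests facet counts with the standard `facet` parameters. The field names `code_string` and `category_string_mv` are assumptions; the actual names depend on your indexed type configuration. In SAP Commerce itself this logic would live in the custom extension; Python is used here only to keep the example short.

```python
from urllib.parse import urlencode


def build_solr_followup_params(product_ids, facet_fields, id_field="code_string"):
    """Build Solr parameters that filter on the given IDs and request facets.

    Assumption: the IDs contain no commas, because the terms query parser
    uses a comma as its default separator.
    """
    if not product_ids:
        raise ValueError("product_ids must not be empty")
    params = [
        ("q", "*:*"),
        ("fq", "{!terms f=%s}%s" % (id_field, ",".join(product_ids))),
        ("rows", str(len(product_ids))),
        ("facet", "true"),
        ("facet.mincount", "1"),
    ]
    params += [("facet.field", field) for field in facet_fields]
    return params


def restore_semantic_order(product_ids, solr_docs, id_field="code_string"):
    """Re-sort Solr documents into the order delivered by the vector search."""
    by_id = {doc[id_field]: doc for doc in solr_docs}
    return [by_id[pid] for pid in product_ids if pid in by_id]


ids = ["P3", "P1"]
params = build_solr_followup_params(ids, ["category_string_mv"])
query_string = urlencode(params)

# Solr returns the documents in its own order, here P1 before P3.
docs = [{"code_string": "P1"}, {"code_string": "P3"}]
ordered = restore_semantic_order(ids, docs)
```

The second helper matters because the filter query does not carry any ranking: without re-sorting, the relevance order from the semantic search would be lost.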

SAP Commerce Cloud Semantic Search shop integration with Google Vector Search

Costs

To estimate the costs of the described approach on the Google Cloud Platform, we recommend using the Google Cloud Pricing Calculator. A sample setup with 500,000 vectors and an average configuration results in costs starting at approximately €150 per month.
The detailed pricing setup can be viewed in the calculator and adjusted as needed.

Google Price Calculator Vector Search

Outlook

Integrating AI-powered search doesn’t have to be a massive undertaking. With the Google Cloud Platform, you can quickly and cost-effectively build a proof of concept and, if successful, transition to a production setup for further development.

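A key area for expansion is hybrid search (semantic + keyword-based search), which requires little additional effort. This is especially relevant for B2B shops, where customers often search directly for an exact identifier such as an article number, a case in which keyword matching remains more reliable than semantic similarity.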

Another exciting development is multimodal search, incorporating images and audio to enhance product discovery.

As a Google Cloud-certified service partner, we’re here to support you in designing and implementing your AI-driven search solution. Contact us for a personalized consultation!