Category Archives: Indexing

Untangling the Sitecore Search LINQ to SolR queries

Problem

It can be very difficult to identify why you do not get the search results you expected from Sitecore Search, but there is a simple way to help untangle what is going on.

Solution

It is possible to see the query that Sitecore generates and sends to SolR and then use the query on the SolR instance to see what data is returned to Sitecore.

This is such a huge help when trying to understand why your queries do not work!

Step 1 – Find the Query that was sent to SolR from Sitecore

Sitecore logs all the queries it sends to SolR in the standard sitecore log folder, look for files named Search.log.xxx.yyy.txt .

Step 2 – Execute the query in your SolR instance

Go to your Solr instance, and use the core selector drop down to select the index your Sitecore Search query is being executed against.

Select Query, from the menu

Then paste the query from the sitecore log, and you can see the result that is returned to Sitecore.

This has helped me a lot, so I hope this helps others untangling their search results using Sitecore Search 🙂

 

 

 

SolR

Introduce a (SolR) Sitecore Search Abstraction

After my previous post on Supporting Integrations, I received a few comments asking why was SolR was in the integration’s module group, as it is part of the sitecore API.

In this blog i will explain why and in more detail how to isolate a SolR integration.

Yes Sitecore Search is part of the Sitecore API, but it relies on an 3rd party system! Please read my previous post about why you need to identify, separate and isolate modules with external dependencies, as Sitecore Search API faces exactly the same challenges.

With the bonus that there are 3 supported implementations (Lucene, SolR and Azure Search) which are almost the same, but not quite!

Sitecore Search Issues

In most of the helix-based solution I have seen indexing is implemented in the framework layer which provides some helper extensions. Then each feature uses indexing module with Sitecore Search API to implement their requirements. This typically leads to the following issues:

  • Duplicated code across features
  • No clear definition of the indexing/constraints/sorting requirements for the solution.
  • Non-consistent implementation across the solution i.e. Predicate builder vs LINQ.
  • Optimization is difficult.

With each feature implement their indexing requirements, it leads to duplicated code as it feature needs to build the query to add sitecore root item, base templates, language etc. for each request, before adding the feature specific part of the query.

Therefore when fixing a bug or performance issues you must track down all the places where Search is used and then determine if they require the same fix and or the optimization.

How to abstract away the SolR Search Implementation

  • Identify the indexing requirements
    • Introduce an abstraction in the foundation layer (Indexing).
  • Create the implementation (Solr Indexing) that implements the abstraction define by Indexing in the foundation layer.
    • Address the sorting issues (i.e. different items templates have different date fields)
  • Let the features use the indexing abstractions (i.e. Course, News, Calendar, etc.)

Identify the indexing requirements

There are 3 main components to define the indexing requirements constraints, pagination & sorting.

Constraints

Constraints define what the filters can be applied to reduce the number of items that are returned. In this example it will be possible to apply the following constraints:

  • Location in tree sitecore (i.e. site specific news folder, all content, etc.)
  • Language (i.e. return items with an English language version)
  • Template, i.e. does the item inherit from a specific template (i.e. news, calendar, etc.)
  • Taxonomy – return items based on their categorization (i.e. football, skiing, etc.)

Pagination

Defines the number of search result per page and which page you require.

Sorting

Is responsible for defining what is used to sort the result items and the direction (ascending or descending), for example using date to get the 10 latest news.

If you want to sort by date, one challenge is to determine how to sort he results, as different pages will have different fields. Some pages have no date apart from created/updated, news normally has a specific news date and calendar events have start/end dates.

The SolR implementation must NOT KNOW ABOUT PAGE TYPES, see my blog post with a solution.

The following code defines the indexing requirements.

public interface IConstraint
{
    Item RootItem { get; }
    Language Language { get; }
    ID BaseTemplate { get; }
    IEnumerable<Category> Categories { get; }
}
public interface IPagination
{
    int Number { get; }
    int Size { get; }
}
public enum SortDirection
{
    Ascending,
    Descending
}

Then we need to define the result of making a search and a repository to make the search

public interface IPagedSearchResult
{
    IEnumerable<Item> Results { get; }
    IPagination Pagination { get; }
    int TotalHits { get; }
    bool HasMoreResults { get; }
}
public interface IPagedSearchResultRepository
{
    IPagedSearchResult Get([NotNull] IConstraint searchConstraint, [NotNull] IPagination pagination, SortDirection sortDirection);
}

The definition of the search result could of been type safe, i.e. return a model of type T instead of the Item, but I wanted to keep the example simple and not use a specific binding framework.

Anyway I hope this post will help, Alan

Sitecore 7 – Disable indexing – It is not enough to set Indexing.UpdateInterval to 00:00:00.

I was working on a setup for a customer where they have a dedicated publish server and therefore it was not necessary to update the indexes. There are many post stating that to disable sitecore indexing you have to set the Indexing.UpdateInterval setting to 00:00:00 i.e.

<setting name="Indexing.UpdateInterval" value="00:00:00"/>

The website has been recently upgraded from 6.6 to 7.5 and I noticed that the indexes were still being built?

So I used Brian’s jobs page (see his blog) which listed all the running jobs to see what was running and to my shock there were a lot of index update jobs? Which meant that instead of using all the machines power to publish it was busy indexing ALL the time?

running jobs

But since Sitecore 7.0 there is an extra setting which has to be changed in order to disable indexing.

Solution

To disable indexing the BucketConfiguration.ItemBucketsEnabled setting also must be set to false in the Sitecore.Buckets.config i.e.

<setting name="BucketConfiguration.ItemBucketsEnabled" value="false"/>

Hope this helps 🙂