Sitecore Analytics: Custom Dimensions and Segments

In Sitecore Analytics, a Dimension represents a set of key/value pairs that is updated daily.

For each key, an entry holds an instance of the class SegmentMetricsValue, which contains the following fields: Visits, a custom Value, Bounces, Conversions, Time on Site, Page Views, and Count. Each of these can be assigned arbitrarily, and the method that calculates them has a default implementation in the abstract class DimensionBase, in the namespace Sitecore.ExperienceAnalytics.Aggregation.Dimensions.

Further, each Analytics Dimension has one or more Segments. By default a Segment contains all the data of its parent Dimension, but it can apply Filters so that only a subset of that data is displayed, based on an assortment of conditions.

[Image: DimensionsFilters]

All Dimensions of each Sitecore instance are updated every day from the data collected in the xDB Analytics database; writing a custom Dimension involves writing a mapping from xDB’s VisitData to our Dimensions, and requires understanding the format of both.

Visit Data

The details of each Visit of a user in our Sitecore application are recorded in the xDB Analytics database; they can be visualized with tools such as Robomongo:

[Image: VisitAggregationContext]

This image shows the details of a single visit spanning three different pages, two of which contain a custom PageEvent (we will not cover their generation in this article).

The corresponding class, VisitData, is defined in Sitecore.Analytics.Model.dll:

public class VisitData : InteractionData {
    public VisitData();
    public VisitData(Guid contactId);
    public VisitData(Guid interactionId, Guid contactId);

    [Obsolete("Deprecated")]
    public virtual string AspNetSessionId { get; set; }
    public virtual BrowserData Browser { get; set; }
    public virtual int ContactVisitIndex { get; set; }
    public virtual Guid DeviceId { get; set; }
    public virtual WhoIsInformation GeoData { get; set; }
    public virtual byte[] Ip { get; set; }
    public virtual string Keywords { get; set; }
    public virtual string Language { get; set; }
    public virtual Guid? LocationId { get; set; }
    [Obsolete]
    public virtual MvTestData MvTest { get; set; }
    public virtual OperatingSystemData OperatingSystem { get; set; }
    public virtual List<PageData> Pages { get; set; }
    public virtual Dictionary<string, ProfileData> Profiles { get; set; }
    public virtual string Referrer { get; set; }
    public virtual string ReferringSite { get; set; }
    public virtual ScreenData Screen { get; set; }
    [Obsolete("Deprecated.")]
    public virtual string SitecoreDeviceName { get; set; }
    public virtual string SiteName { get; set; }
    public virtual int TrafficType { get; set; }
    public virtual string UserAgent { get; set; }
    public virtual int Value { get; set; }
    public virtual int VisitPageCount { get; set; }

    public virtual VisitData Clone();
}

Creating a custom Analytics Dimension, a default Segment and registering it in Sitecore.config

Let’s say that we want to create a Dimension to record how much time each user spent on our site. We begin by creating the Dimension item and its associated, unfiltered Segment in Sitecore; giving both an appropriate name is all that is needed at this stage.

[Image: CustomSegment]

We need to do two things for our new Segment to become fully integrated with Sitecore Analytics:

  • It needs to be added to the Sitecore configuration;
  • The Segment itself must be deployed.

1) Under sitecore/experienceAnalytics/aggregation/dimensions a new Dimension entry must be added:

    <dimension id="{[Sitecore ID]}" type="[Class Type]" />

Here the id is the ID of the Dimension we created, and the type is the class (inheriting from Sitecore.ExperienceAnalytics.Aggregation.Dimensions.DimensionBase) that we are going to create.
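As a rough sketch, the entry could live in its own include file rather than in Sitecore.config directly. The element path is the one given above and the wrapper follows the usual Sitecore include-file layout; the GUID, the class name and the assembly name are placeholders invented for illustration, not values from this article.

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <experienceAnalytics>
      <aggregation>
        <dimensions>
          <!-- Placeholder values: use the ID of your own Dimension item
               and the fully qualified name of your DimensionBase subclass -->
          <dimension id="{00000000-0000-0000-0000-000000000000}"
                     type="MySite.Analytics.TimeSpentDimension, MySite.Analytics" />
        </dimensions>
      </aggregation>
    </experienceAnalytics>
  </sitecore>
</configuration>
```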

2) We must select our Segment in Sitecore, pick the Review tab and click on Deploy:

[Image: SegmentDeployment]

The DimensionBase implementation

This is the “main course” of the implementation work: we must now think of an algorithm to map a set of VisitData to our SegmentMetricsValue.

The basic premise is that the GetData() method will be passed the VisitAggregationContext for every visit of every user/visitor of our site, and it will return an IEnumerable of DimensionData.

To build an Analytics Dimension that tracks the time spent by each user, we need to do the following:

  • Find the relevant key for the current visit;
  • Calculate how much time has been spent during the current visit;
  • Store this value in the SegmentMetricsValue for the current visit.

Several classes that inherit from DimensionBase already implement this process, and their code is not too dissimilar from ours:

public override IEnumerable<DimensionData> GetData(IVisitAggregationContext context) {
    // Make sure we have a context and a visit to work with
    Assert.ArgumentNotNull(context, "context");
    Assert.IsNotNull(context.Visit, "visit");

    // Visits with a single page are skipped
    if (context.Visit.Pages.Count > 1) {
        // Generate standard metrics for this context
        SegmentMetricsValue metrics = this.CalculateCommonMetrics(context, 0);

        // Build the dictionary mapping our key (the contact) to the time spent in this visit
        ConcurrentDictionary<string, int> keyCount = GetDimensionKeys(context);

        if (metrics != null && keyCount != null) {
            // Return one DimensionData per key, storing the time spent in Count
            foreach (KeyValuePair<string, int> entry in keyCount) {
                SegmentMetricsValue metricsValue = metrics.Clone();
                metricsValue.Count = entry.Value;

                yield return new DimensionData() {
                    DimensionKey = entry.Key,
                    MetricsValue = metricsValue
                };
            }
        }
    }
}

This is only half the job, however: we still need the method GetDimensionKeys() to build a dictionary of key/value pairs (in our case, only one pair) for the current Visit:

public ConcurrentDictionary<string, int> GetDimensionKeys(IVisitAggregationContext context) {
    ConcurrentDictionary<string, int> concurrentDictionary = new ConcurrentDictionary<string, int>();

    // Use the contact's identifier as the dimension key; give up if there is none
    string key = null;
    if (context.Contact != null && context.Contact.Identifiers != null) {
        key = context.Contact.Identifiers.Identifier;
    }
    if (key == null) { return null; }

    // Add up the time spent on every page of the visit
    int time = 0;
    foreach (PageData page in context.Visit.Pages) {
        time += page.Duration;
    }

    // Scale the total down by 10,000 so that accumulated values do not overflow
    time = time / 10000;
    concurrentDictionary.AddOrUpdate(key, time, (existingKey, oldValue) => oldValue + time);

    return concurrentDictionary;
}

Here we retrieve the user’s ID from the context.Contact object, and we cycle through all the Pages in context.Visit to calculate the total time spent for the Visit.

We also divide the resulting value by 10,000, so that what we store is expressed in units of 10 seconds. The Duration field is measured in milliseconds and, in the past, adding up the milliseconds a given user had spent on the site over the course of several years ended up overflowing the variable’s positive limit.
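To get a feel for the numbers, here is a small standalone sketch with no Sitecore dependency. It assumes the accumulated value is held in a 32-bit signed int, as the Count field is in the code above; the class and method names are hypothetical and exist only for this illustration. The helper also sums into a 64-bit accumulator before scaling, which is a variation on the article's code rather than what it does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VisitDurationSketch
{
    // Hypothetical helper: adds up page durations (in milliseconds) using a
    // 64-bit accumulator, then scales the total down to units of 10 seconds.
    public static int ToTenSecondUnits(IEnumerable<int> pageDurationsMs)
    {
        long totalMs = pageDurationsMs.Sum(d => (long)d);
        // checked: throw instead of silently wrapping if the result does not fit an int
        return checked((int)(totalMs / 10000));
    }

    public static void Main()
    {
        // Three pages viewed for 42 s, 15.5 s and 125 s: 182,500 ms in total
        int[] durations = { 42000, 15500, 125000 };
        Console.WriteLine(ToTenSecondUnits(durations)); // prints 18

        // How long until an int overflows, counting raw milliseconds (in days)
        // versus counting units of 10 seconds (in years)
        double daysOfMilliseconds = int.MaxValue / 1000.0 / 86400.0;
        double yearsOfTenSecondUnits = int.MaxValue * 10.0 / 86400.0 / 365.25;
        Console.WriteLine(daysOfMilliseconds);
        Console.WriteLine(yearsOfTenSecondUnits);
    }
}
```

Under that assumption, raw milliseconds run out of room after a little under 25 days of accumulated time, while units of 10 seconds last for roughly 680 years. The price is precision: the integer division discards anything below 10 seconds per visit.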

And that’s about it. We can debug our code in action while rebuilding the Reporting database or, less aggressively, check the results by browsing the ReportDataView view in the Reporting database after a few days.
