Data design for nested groups of ordered items

Introduction

Traveling the path from a problem to its solution while the specifics are still unraveling makes reaching the end even more rewarding than having every step determined before taking it. I want to share the path we found ourselves treading when faced with the challenge of designing a data structure for a problem common among our users.

Our users are owners and staff of small to medium-sized reservation-based businesses. Think hair salons, massage parlors, repair shops, vets, psychologists, chiropractors, etc. (let's refer to them as outlets for convenience).

The problem

Within each reservation there are multiple connections between various entities, such as a company, one of its employees, a customer and the services provided. As the reservation progresses, it might become connected to monetary transactions, followed by an invoice that might even contain additional services or physical products that the customer bought.

The number of products and services within a single outlet can easily reach the tens, and many outlets even keep catalogs containing hundreds of them. For the outlets' staff to be able to properly navigate these catalogs, and perhaps even present them to their customers in one way or another, the catalogs need to be organized.

Making it work

Opinions on how to organize such a catalog vary greatly, so we realized that it most likely wouldn't be a good idea to implement an opinionated organization scheme for these catalogs. We decided instead to supply each outlet with the means to construct its own organization scheme for both of these catalogs.

Divisibility

Even though opinions vary, there are a few common concepts that most or all outlets would want to be able to apply in the organization process. The most obvious one is grouping. Having to start somewhere, we came up with the following simple data structure for items that can be shepherded into groups:

type Item = {
  _id: string;
  ...
}

type Group = {
  _id: string;
  itemIds: string[];
  ...
}
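
As a minimal sketch of how this first structure could be consumed, the hypothetical helper itemsInGroup below resolves a group's itemIds to the matching items. The name field, the helper and the sample data are assumptions added purely for illustration; the original types elide their remaining fields.

```typescript
// Illustrative types: `name` is an assumed field, not part of the original structure.
type Item = {
  _id: string;
  name: string;
};

type Group = {
  _id: string;
  itemIds: string[];
  name: string;
};

// Hypothetical helper: resolve a group's item references to the items themselves,
// keeping the order of `itemIds`. References to unknown items are skipped.
function itemsInGroup(group: Group, items: Item[]): Item[] {
  const resolved: Item[] = [];
  for (const id of group.itemIds) {
    const match = items.filter((item) => item._id === id)[0];
    if (match !== undefined) {
      resolved.push(match);
    }
  }
  return resolved;
}

// Sample data (made up for this example).
const items: Item[] = [
  { _id: "i1", name: "Haircut for kids" },
  { _id: "i2", name: "Beard trim" },
];
const haircuts: Group = { _id: "g1", name: "Haircuts", itemIds: ["i1", "i2", "missing"] };

console.log(itemsInGroup(haircuts, items).map((item) => item.name));
```

Note that the group only stores references, so the items themselves stay unaware of any grouping.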

Ordering

The next common-ground organization concept that needed to be implemented was ordering. Some items are more important than others and I'm sure everyone can agree that without order, there exists little organization. Items need to be orderable and the groups should as well since some groups truly are more important than others.

At this point, data design decisions get a bit more challenging. Wait, what 🤔... why wouldn't we just inject both the Item and Group types with an order: number entity and get on with our lives?

Duplication

As you might have noticed, nothing in the data structure for Item and Group prevents a group from containing an item already contained by another group. Although not entirely intentional from the get-go, it is a fortunate side effect, since an outlet's staff might, for example, want to be able to access a service called "Haircut for kids" within both a group called "Services for kids" and a group called "Haircuts". It might even make sense for more intricate services to be displayed within more than two groups. It became apparent that we shouldn't restrict items to a single group.

Now let's reconsider our initial thought of injecting the Item type with an order: number entity. By doing so, the item would have the same order within all of its containing groups. As an alternative, we might consider altering our types as such:

type Item = {
  _id: string;
  parentGroups: {
    _id: string;
    order: number;
  }[]
  ...
}

type Group = {
  _id: string;
  ...
}

By doing so we'd render our Group type oblivious to its content, which feels grossly contradictory to its purpose. Another alternative, and the one we decided to commit to, is somewhere along these lines:

type Item = {
  _id: string;
  ...
}

type Group = {
  _id: string;
  items?: {
    itemId: string;
    order: number;
  }[];
  ...
}

This felt more comfortable since both types seemed more true to their domain purpose.
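
To illustrate why the order lives on the group's reference rather than on the item itself, here is a rough sketch in which one item sits at a different position in each of two groups. The helper name orderedItemIds and the sample IDs are assumptions made for this example.

```typescript
type Group = {
  _id: string;
  items?: {
    itemId: string;
    order: number;
  }[];
};

// Hypothetical helper: list a group's item IDs sorted by their per-group order.
function orderedItemIds(group: Group): string[] {
  const refs = group.items ?? []; // a group without `items` is treated as empty
  return refs
    .slice() // copy first so the stored array is not mutated by sort
    .sort((a, b) => a.order - b.order)
    .map((ref) => ref.itemId);
}

// "kids-haircut" comes first in one group and last in the other.
const servicesForKids: Group = {
  _id: "services-for-kids",
  items: [
    { itemId: "kids-massage", order: 1 },
    { itemId: "kids-haircut", order: 0 },
  ],
};
const haircuts: Group = {
  _id: "haircuts",
  items: [
    { itemId: "kids-haircut", order: 2 },
    { itemId: "mens-haircut", order: 0 },
    { itemId: "womens-haircut", order: 1 },
  ],
};

console.log(orderedItemIds(servicesForKids));
console.log(orderedItemIds(haircuts));
```

With a single order on the Item type, the two positions of "kids-haircut" could not differ between groups.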

Gut feelings shouldn't always drive your decisions, and that is certainly true of the Item type in this scenario. However! Sometimes your insight is your best sight, and that might just be the case for the Group type, since there is no domain-related reason to suggest that a group should be duplicable. A group can therefore carry its own order directly. For orderable groups of orderable items our structure now looks like this:

type Group = {
  _id: string;
  items?: {
    itemId: string;
    order: number;
  }[];
  order: number;
  ...
}

Nesting

Not to be confused with making one's home as comfortable as possible, nesting groups within other groups is vital for our outlets to achieve good organization of their catalogs. If a single level of nesting were sufficient, we'd likely have committed to the following structure:

type Group = {
  _id: string;
  items?: {
    itemId: string;
    order: number;
  }[];
  subGroups?: {
    groupId: string;
  }[];
  order: number;
  ...
}

Groups can now live securely and comfortably within other groups 😌.

At the risk of going into too much detail, it should be quite obvious that a group of services called "Haircuts" should ideally be divisible further into groups such as "Men's haircuts", "Women's haircuts" and "Kids' haircuts". The "Men's haircuts" group might then even be divided into "Trims", "Fades" and so on. This calls for theoretically infinite nestability of groups, since we have no way of knowing exactly how many levels of nesting would be sufficient for all future outlets. At first glance, it seems that the structure we already have supports this requirement. However, it contains some sneaky flaws.

First, the only way for a front-end application to determine which groups are top-level groups (and as such should be rendered first) is to dig through all the other groups' subGroups arrays to make sure the groupId in question doesn't exist in any of them. We could address that rather easily by injecting the structure with an isTopLevelGroup: boolean entity.

Second, as with items referenced within an items array, a group can be referenced within the subGroups arrays of multiple groups, making it globally nonunique, or duplicable. We'd probably be able to restrict that at the application level, but for multiple reasons, we want our data structure to handle the restriction if possible.
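
Both flaws can be made concrete with a small sketch. The helpers findTopLevelGroups and findDuplicatedSubGroupIds, along with the sample data, are hypothetical and only meant to show the extra work that the subGroups structure pushes onto the application.

```typescript
type Group = {
  _id: string;
  subGroups?: { groupId: string }[];
  order: number;
};

// Hypothetical helper: a group is top-level only if no other group lists it
// as a subgroup, so every group's subGroups array has to be scanned.
function findTopLevelGroups(groups: Group[]): Group[] {
  return groups.filter(
    (candidate) =>
      !groups.some((other) =>
        (other.subGroups ?? []).some((ref) => ref.groupId === candidate._id)
      )
  );
}

// Hypothetical helper: IDs of groups that are referenced by more than one
// subGroups entry, which the structure itself does nothing to prevent.
function findDuplicatedSubGroupIds(groups: Group[]): string[] {
  const referenceCount: Record<string, number> = {};
  for (const group of groups) {
    for (const ref of group.subGroups ?? []) {
      referenceCount[ref.groupId] = (referenceCount[ref.groupId] ?? 0) + 1;
    }
  }
  return Object.keys(referenceCount).filter((id) => (referenceCount[id] ?? 0) > 1);
}

// Sample data (made up): "kids" is claimed by two different parents.
const groups: Group[] = [
  { _id: "haircuts", order: 0, subGroups: [{ groupId: "mens" }, { groupId: "kids" }] },
  { _id: "services-for-kids", order: 1, subGroups: [{ groupId: "kids" }] },
  { _id: "mens", order: 0 },
  { _id: "kids", order: 1 },
];

console.log(findTopLevelGroups(groups).map((group) => group._id));
console.log(findDuplicatedSubGroupIds(groups));
```

The top-level check compares every group against every subGroups entry, and the duplicate check only detects the problem after the fact rather than ruling it out.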

Making it good

Just because a data structure can solve all the problems it set out to solve doesn't mean it can't be improved. Consider the following structure:

type Item = {
  _id: string;
  ...
}

type Group = {
  _id: string;
  items?: {
    itemId: string;
    order: number;
  }[];
  parentGroupId?: string;
  order: number;
}

By removing the subGroups array and introducing a parentGroupId we've reversed the responsibility of reference maintenance. The parent group is no longer aware of which subgroups it contains; instead, each subgroup is aware of which group it belongs to. It might not be obvious why that is a good thing. The reason is that it forces each group to be a subgroup of at most one group, in other words globally unique and non-duplicable. On top of that, making the parentGroupId entity optional opens up the possibility of omitting it, which implies that the group has no parent and is therefore a top-level group.
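
As a rough sketch of how a front end might consume this structure, the hypothetical helpers below find the top-level groups, look up a group's direct subgroups, and walk the whole tree to any depth. The function names and the sample data are assumptions for illustration only.

```typescript
type Group = {
  _id: string;
  items?: { itemId: string; order: number }[];
  parentGroupId?: string;
  order: number;
};

// Top-level groups are simply the ones without a parentGroupId.
function topLevelGroups(groups: Group[]): Group[] {
  return groups
    .filter((group) => group.parentGroupId === undefined)
    .sort((a, b) => a.order - b.order); // filter returns a copy, so sorting is safe
}

// Direct subgroups of the given parent, sorted by their order.
function subGroupsOf(parentId: string, groups: Group[]): Group[] {
  return groups
    .filter((group) => group.parentGroupId === parentId)
    .sort((a, b) => a.order - b.order);
}

// Depth-first walk that returns one indented line per group, at any nesting depth.
function outline(groups: Group[], parentId?: string, indent: string = ""): string[] {
  const level =
    parentId === undefined ? topLevelGroups(groups) : subGroupsOf(parentId, groups);
  const lines: string[] = [];
  for (const group of level) {
    lines.push(indent + group._id);
    for (const line of outline(groups, group._id, indent + "  ")) {
      lines.push(line);
    }
  }
  return lines;
}

// Sample data (made up), deliberately stored out of order.
const groups: Group[] = [
  { _id: "massages", order: 1 },
  { _id: "fades", parentGroupId: "mens-haircuts", order: 1 },
  { _id: "haircuts", order: 0 },
  { _id: "womens-haircuts", parentGroupId: "haircuts", order: 1 },
  { _id: "mens-haircuts", parentGroupId: "haircuts", order: 0 },
  { _id: "trims", parentGroupId: "mens-haircuts", order: 0 },
];

console.log(outline(groups).join("\n"));
```

Because each group names at most one parent, no group can appear under two parents, and the top-level check no longer needs to inspect any other group.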

Now if this isn't a stellar example of less being more, I don't think I'll ever see one.

Everything works