Efficiently Computing Permissions at Scale: Our Engineering Approach

Eugene Nilo

Eugène is a staff engineer at GitGuardian. He loves building tools that benefit everyone and works to improve the quality of GitGuardian products.

A few weeks ago, we introduced a new role-based access control (RBAC) feature in the GitGuardian Internal Monitoring platform. This release was the result of several months of hard work, during which we had to thoroughly revise our data model and implement a very resource-efficient mechanism for computing permissions. I thought this was the perfect opportunity to provide a deep dive into the research, the problems, and the dead ends we encountered on this journey.

Disclaimer: I will be using Django in my code examples, but the ideas can be generalized. A relational database, however, is a stronger requirement.

First, defining the problem

In short, the RBAC feature introduces the notion of “teams”: a perimeter in which each member can see and act upon a limited number of incidents. In our domain, an incident is a logical unit corresponding to a unique leaked secret. Since a secret can leak in multiple places, we call occurrences the various locations of this secret in one or more repositories. A team is defined by a group of repositories, so a user belonging to the team can act on any secret that has been found one or more times in one of these repositories.

Since an incident can have two occurrences owned by two different teams, our first conceptual problem was: how do we distribute incidents across teams?

💡

Note: Attaching repositories directly to a team is by no means the only possibility. We could decide, for example, to assign an entire GitHub organization to a team so that repositories created later in this organization are automatically added to the team perimeter. But this implementation is beyond the scope of this article, so we’ll assume we have a direct link between teams and repositories.

But that is not all. We also needed to allow access to a specific incident to be granted to a user or a team. A user has their own perimeter, which is the union of their teams’ perimeters and the incidents to which they have been individually granted access.

Here’s a visualization to help you understand the relationships between these concepts:

Class diagram for our models

Finally, knowing which incidents are part of a user’s perimeter was only half the story; what we ultimately wanted to know was what the user could do with them. Here are the three permission levels:

  • READ: The user can see the incident.
  • WRITE: The user can handle the incident: ignore, waive, resolve, etc.
  • ADMIN: The user can share the incident with other users and teams, adding it to their perimeter.

These permissions can, again, be inherited from the teams the user is a member of, or they can be assigned directly. For a specific incident, the user’s permission is the maximum level of permission granted by these two methods. This permission must be computed dynamically (and quickly).

Why we didn’t go with the straightforward solution

One straightforward solution is to keep a table of per-user permissions. But this would be very difficult to maintain. Why? Let’s imagine a user is removed from a team. Incidents on which some permissions were inherited through team membership may no longer be in the user’s perimeter. Therefore, all permissions on the team’s incidents must be recalculated, to check whether the user has lost access or whether their permission on an incident has been reduced.

Maintaining a per-user permission table necessarily means an order of magnitude more operations to keep all user permissions up to date.

Since we wanted to keep table operations as synchronous as possible, we added permission fields on three relationships to lighten the workload:

  • The user-incident relationship
  • The team-incident relationship
  • The user-team relationship

After doing some research, we decided to compute these permissions in SQL. Not storing permissions per user also means we can’t rely on popular Django permission libraries (including django.contrib.auth), which are all object-based.

In the table below, we list the number of rows affected by each new event (new incident, new repository added to a team perimeter, etc.). We can see that the per-object solution scales linearly with the number of users in the team, and we don’t want our team sizes to be limited:

| Event                             | # UserIncident rows affected            | # TeamIncident rows affected |
|-----------------------------------|-----------------------------------------|------------------------------|
| New incident                      | # of teams × # of team users            | # of teams                   |
| New repository in the team        | # of repository incidents × # of team users | # of repository incidents |
| New user in the team              | # of team incidents                     | 0                            |
| New team incident (direct access) | # of team users                         | 1                            |
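
To make the table above more concrete, here is a small sketch that plugs numbers into two of its rows. The `TeamStats` class, the function names, and the sizes are all hypothetical and only serve to illustrate the orders of magnitude involved.

```python
from dataclasses import dataclass


@dataclass
class TeamStats:
    """Hypothetical sizes for one team; the numbers used below are made up."""
    users: int
    incidents: int


def rows_touched_new_repository(team: TeamStats, repo_incidents: int) -> dict:
    """Rows to write when a repository joins the team perimeter."""
    return {
        "user_incident": repo_incidents * team.users,  # per-user table
        "team_incident": repo_incidents,               # per-team table
    }


def rows_touched_new_user(team: TeamStats) -> dict:
    """Rows to write when a user joins the team."""
    return {
        "user_incident": team.incidents,  # one row per team incident
        "team_incident": 0,               # the membership row alone is enough
    }


team = TeamStats(users=200, incidents=5_000)
print(rows_touched_new_repository(team, repo_incidents=50))
print(rows_touched_new_user(team))
```

With these made-up sizes, the per-user table needs thousands of writes for events that cost the per-team table a few dozen rows at most.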

Although we ruled out the user-incident relationship as the ultimate source of truth early on, we had to use per-object permissions for the team-incident relationship. This choice was driven by performance: reads going through the repository and incident tables were very slow, and we assumed there would be fewer teams than users.

Second, how does our model work?

A simple trick: use binary masks

Once we had defined the permission specifications, we needed to decide how to store them in our database. I mentioned three levels of permission, but it was clear that in the future we would need to add more to allow finer granularity in business-domain roles. To avoid having many boolean fields and to simplify the logic of checking authorizations, we preferred to store permissions in their binary representation. Thanks to binary masks, we can store all permissions in one integer field.

💡 How to check permissions stored as a binary mask
Let’s say we have two resources, A and B, and two permissions, READ and WRITE.
We’ll store them in 4 bits, 2 per resource. Let’s assume for simplicity that WRITE implies READ.

For A:

  • 0b0011 is the WRITE:A permission
  • 0b0001 is the READ:A permission

For B:

  • 0b1100 is the WRITE:B permission
  • 0b0100 is the READ:B permission

Then, with a bitwise OR, we can encode, for example, 0b0111 for holding both the WRITE:A and the READ:B permissions. Conversely, to check a permission, all we have to do is apply a bitwise AND between the permission mask and the binary value of the field.

So to check whether the user has the WRITE:A permission, we compute 0b0011 & user_permission. The result equals the mask only if the user has the permission:

  • 0b1111 & 0b0011 = 0b0011 → OK
  • 0b0111 & 0b0011 = 0b0011 → OK
  • 0b1101 & 0b0011 = 0b0001 → not OK
  • 0b0000 & 0b0011 = 0b0000 → not OK
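
The checks listed above can be reproduced in a few lines of plain Python. This is only a sketch of the two-resource example from this note; the constant names and the `has_permission` helper are illustrative and not part of our codebase.

```python
# Two bits per resource: A uses the two low bits, B the next two.
READ_A, WRITE_A = 0b0001, 0b0011  # WRITE implies READ
READ_B, WRITE_B = 0b0100, 0b1100


def has_permission(granted: int, mask: int) -> bool:
    """True only if every bit of `mask` is also set in `granted`."""
    return (granted & mask) == mask


# Permissions are combined with a bitwise OR...
granted = WRITE_A | READ_B  # 0b0111

# ...and checked with a bitwise AND against the mask.
print(has_permission(granted, WRITE_A))
print(has_permission(granted, READ_B))
print(has_permission(granted, WRITE_B))
```

The first two checks succeed, while the last one fails because only one of the two WRITE:B bits is set.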

To implement this in Django, we used the IntegerChoices class, along with a simple helper to check permissions in our Python code.

from django.db import models

class Permission(models.IntegerChoices):
    READ = 0b001
    WRITE = 0b011
    ADMIN = 0b111

    @classmethod
    def is_authorized(
        cls, mask: "Permission", scope: "int | Permission"
    ) -> bool:
        """
        GIVEN a mask and a scope
        Return true if the scope matches the mask
        ex: 0b100 & 0b110 = 0b100 != 0b110
        """
        return bool((scope & mask) == mask)

Django models

Now that we know the relationships between our objects and where to store the permissions we need, we can implement them using Django models.

Assuming we’re using Django’s default user model, here are ours:

class TeamUser(models.Model):
    team = ForeignKey("Team", ...)
    user = ForeignKey("User", ...)
    permission = PositiveSmallIntegerField(default=Permission.READ)

class Team(models.Model):
    name = TextField(...)
    users = ManyToManyField("User", through="TeamUser", ...)

class TeamIncident(models.Model):
    team = ForeignKey("Team", ...)
    incident = ForeignKey("Incident", ...)
    permission = PositiveSmallIntegerField(default=Permission.READ)

class UserIncident(models.Model):
    user = ForeignKey("User", ...)
    incident = ForeignKey("Incident", ...)
    permission = PositiveSmallIntegerField(default=Permission.READ)

class Incident(models.Model):
    name = TextField(...)
    teams = ManyToManyField(Team, through="TeamIncident", ...)
    users = ManyToManyField(User, through="UserIncident", ...)

Pretty straightforward; let’s move on to the use cases.

Filtering incidents for a user

First, getting all of a user’s incidents, or all the users who can access an incident, is simple: the very existence of a relationship row implies READ permission, so we don’t have to check permissions. We can do the following:

# list incidents of user
Incident.objects.filter(Q(users=user) | Q(teams__users=user)).distinct()

# list user having access to an incident
User.objects.filter(Q(incidents=incident) | Q(teams__incidents=incident)).distinct()

💡

The distinct() call is required because a user can be granted access to the same incident through multiple paths.

The query can be written with subqueries instead. In practice, we take advantage of the fact that we already have access to the user’s teams to simplify it.
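
To see why duplicates appear in the first place, here is a minimal sketch that uses SQLite from the Python standard library instead of the Django ORM. The table and column names are simplified stand-ins for the models above, not our actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE team_user (team_id INTEGER, user_id INTEGER);
    CREATE TABLE team_incident (team_id INTEGER, incident_id INTEGER);
    -- User 1 belongs to teams 10 and 20; both teams can see incident 100.
    INSERT INTO team_user VALUES (10, 1), (20, 1);
    INSERT INTO team_incident VALUES (10, 100), (20, 100), (20, 101);
""")

JOIN = """
    SELECT {select} ti.incident_id
    FROM team_incident ti
    JOIN team_user tu ON tu.team_id = ti.team_id
    WHERE tu.user_id = ?
    ORDER BY ti.incident_id
"""

# Without DISTINCT, the join yields one row per (team, incident) path.
with_duplicates = [r[0] for r in conn.execute(JOIN.format(select=""), (1,))]
deduplicated = [r[0] for r in conn.execute(JOIN.format(select="DISTINCT"), (1,))]

print(with_duplicates)  # incident 100 shows up once per team
print(deduplicated)
```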

After checking which incidents will be shown to the user, we want to know what permissions they have on these incidents to see what actions they are allowed to perform.

Let’s stay with three levels then:

  • 0b001 is READ, which allows the user to see the incident
  • 0b011 is WRITE (implies READ), which allows the user to handle the incident
  • 0b111 is ADMIN (implies READ + WRITE), which allows granting access to the incident to other users and teams.

And of course, 0b000 means no permission at all.

Let’s write a Django query for this, by building a user_permission annotation that will contain the aggregate permission of the user on each incident.

The permission of the user through a team is the lowest permission (computed with the binary AND operation) between the permission of the team on the incident and the permission of the user in the team:

F("team__team_incident__permission").bitand(F("team__team_user__permission"))

# and filter the relation by
queryset.filter(team__team_user=user)

User permission within multiple teams is the highest permission (computed with the binary OR operation) across all teams:

BitOr(
    F("team__team_incident__permission").bitand(F("team__team_user__permission")),
    output_field=PositiveSmallIntegerField(),
)

But the user can also be granted access to incidents individually, so we’ll use Coalesce(..., 0), which replaces null values with 0, our “no permission” value, when the user has no access through teams or no individual access. Otherwise we wouldn’t be able to apply our binary operation (NULL is not a binary value).

user_permission_expression = Coalesce(
    BitOr(
        F("team_incident__permission").bitand(F("team_incident__team__team_user__permission")),
        output_field=PositiveSmallIntegerField(),
    ),
    0,
).bitor(Coalesce(F("user_incident__permission"), 0))

Finally, we filter the queryset for our user:

queryset = Incident.objects.filter(
    Q(user_incident__user=user) | Q(team_incident__team__team_user__user=user)
).annotate(user_permission=user_permission_expression).distinct()
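
The logic of this annotation can be mirrored in plain Python to check it by hand. The sketch below is not what runs in production (that is the SQL generated by the query above); the `user_permission` function and its arguments are hypothetical. Note that AND behaves like a minimum and OR like a maximum only because the three masks are nested (0b001, 0b011, 0b111).

```python
from functools import reduce
from operator import or_

READ, WRITE, ADMIN = 0b001, 0b011, 0b111


def user_permission(team_paths, direct=None):
    """
    team_paths: iterable of (team_incident_permission, team_user_permission)
                pairs, one per team linking the user to the incident.
    direct:     permission from a direct user-incident grant, or None.
    """
    # AND inside a team keeps the lower of the two levels...
    per_team = (ti & tu for ti, tu in team_paths)
    # ...OR across teams keeps the highest; the initial 0 plays the role
    # of Coalesce(..., 0) when there is no team path at all.
    via_teams = reduce(or_, per_team, 0)
    return via_teams | (direct or 0)


# ADMIN team access held by a READ member, plus WRITE access via another team.
print(bin(user_permission([(ADMIN, READ), (WRITE, WRITE)])))
```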

Filtering the queryset by permission

We have everything we need, but it is not yet practical to fetch all the objects on which a user has a certain permission level using our binary logic.

We could craft a custom queryset filter, but let’s make something more reusable: let’s define a custom lookup that implements the Permission.is_authorized method directly in SQL:

from django.db.models import Field, Lookup


class IsAuthorized(Lookup):
    """
    GIVEN a mask (right-hand side) and a scope (left-hand side)
    Match rows where the scope contains every bit of the mask
    ex: 0b100 & 0b110 = 0b100 != 0b110, so scope 0b100 does not match mask 0b110
    """

    lookup_name = "isauthorized"

    def as_sql(self, compiler, connection):
        lhs, lhs_params = self.process_lhs(compiler, connection)
        rhs, rhs_params = self.process_rhs(compiler, connection)
        # The right-hand side appears twice in the SQL, so its params do too.
        params = [*lhs_params, *rhs_params, *rhs_params]

        # The binary operation happens here
        return "(%s & %s) = %s" % (lhs, rhs, rhs), params


Field.register_lookup(IsAuthorized)

# usage, assuming the of_user queryset method annotates the user_permission
Model.objects.of_user().filter(user_permission__isauthorized=Permission.WRITE)
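
To see what the predicate emitted by this lookup does once it reaches the database, here is a minimal sketch that runs the same comparison in SQLite through the Python standard library. The incident_permission table is a hypothetical stand-in for the annotated queryset, not a table from our schema.

```python
import sqlite3

READ, WRITE, ADMIN = 0b001, 0b011, 0b111

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incident_permission (incident_id INTEGER, user_permission INTEGER)"
)
conn.executemany(
    "INSERT INTO incident_permission VALUES (?, ?)",
    [(1, 0), (2, READ), (3, WRITE), (4, ADMIN)],
)


def incidents_with(mask: int) -> list:
    """Incident ids whose stored permission contains every bit of `mask`."""
    rows = conn.execute(
        "SELECT incident_id FROM incident_permission "
        "WHERE (user_permission & ?) = ? ORDER BY incident_id",
        (mask, mask),  # the mask is bound twice, like rhs_params in the lookup
    )
    return [row[0] for row in rows]


print(incidents_with(WRITE))
```

Filtering on WRITE keeps the incidents stored with WRITE or ADMIN, since both contain the two WRITE bits.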

It is important to note that although computing incident permissions this way works in all cases, we must not forget about shortcuts.

For example, the Manager role gives access to all incidents, so it doesn’t make sense to compute permissions for it. Likewise, the “All incidents” team provides access to all of the organization’s incidents, which allows us to skip the perimeter computation.

Also, on paginated endpoints, we only have to compute the permissions for the page we want to return!
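
As a rough illustration of these two shortcuts, here is a sketch in plain Python. Everything in it is hypothetical: the `permissions_for_page` function, the `compute` callable standing in for the annotated query, and the assumption that a manager holds the ADMIN level on every incident.

```python
ADMIN = 0b111


def permissions_for_page(incident_ids, page, page_size, compute, is_manager=False):
    """
    incident_ids: ordered ids already filtered to what the user may see.
    compute:      callable taking a list of ids and returning {id: permission};
                  stands in for the annotated query (hypothetical).
    """
    start = (page - 1) * page_size
    page_ids = incident_ids[start:start + page_size]
    if is_manager:
        # Shortcut: assumed here that a manager gets the top level everywhere,
        # so the expensive computation is skipped entirely.
        return {incident_id: ADMIN for incident_id in page_ids}
    # Only the ids on the requested page reach the expensive computation.
    return compute(page_ids)


def fake_compute(ids):
    """Dummy computation granting READ (0b001) on every requested id."""
    return {incident_id: 0b001 for incident_id in ids}


page = permissions_for_page(list(range(1, 101)), page=2, page_size=10, compute=fake_compute)
print(sorted(page))
```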

We are done!

Implementing the Teams feature has been far from easy, and I know we’re not the first engineering team to face this kind of challenge. It took careful thought about the data models we use, and about how to implement the feature with the lowest possible impact on performance and on the rest of the application. In the end, I think this was a really good exercise, and we learned several things that we will be able to apply to other parts of our code.

Time for our next challenge!

*** This post is syndicated from the GitGuardian Blog. Read the original post at: https://blog.gitguardian.com/efficiently-computing-permissions-at-scale-our-engineering-approach/
