Usage

The Permissions API uses the same base URL and authorization details as the Sync API.

About Permission Syncs

Permission Syncs use available APIs to query all ways that users can access or inherit access to a file. For example, for the Google Drive integration, Paragon syncs permissions for:
  • Direct access assigned to files
  • Inherited access from parent folders
  • Google Group member access (direct or inherited)
  • Google Workspace organization-wide access
    • If the file is not designated as “searchable by organization”: access granted once a user has opened the link
Because evaluating permissions can be difficult and error-prone in practice, Permission Syncs store permission data in a fine-grained authorization server that dramatically simplifies the query to check for access. Instead of searching across role relationships and executing specific rules for each integration, you can query the Permissions API in a standard format:
Query to check if a user has access to File ID
POST /api/permissions/{sync_id}/check

{
    "user": "[email protected]",
    "role": "can_read",
    "object": "[File UUID]"
}

// Response:
{
    "hasAccess": true
}
This query format works across any File Storage sync that supports Permissions. Permission Syncs are currently supported on a select number of File Storage integrations and run automatically when files are processed.
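A minimal Python sketch of this check, using only the standard library. The helper names, the `base_url` and `token` parameters, and the bearer-token header are assumptions for illustration; substitute the base URL and authorization details you already use for the Sync API.

```python
import json
import urllib.request


def build_check_request(base_url, token, sync_id, user, file_id, role="can_read"):
    """Build (without sending) the POST request for the check endpoint."""
    body = json.dumps({"user": user, "role": role, "object": file_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/permissions/{sync_id}/check",
        data=body,
        method="POST",
        headers={
            # Assumption: bearer-token auth; use your Sync API authorization details.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def has_access(request, opener=urllib.request.urlopen):
    """Send the request and return the `hasAccess` flag from the JSON response."""
    with opener(request) as response:
        # Treat a missing flag as "no access" so that errors fail closed.
        return bool(json.load(response).get("hasAccess", False))
```

The `opener` parameter exists only so the network call can be replaced in tests; in normal use, call `has_access(build_check_request(...))`.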

Implementing Permissions API

To implement Permissions API in a production context, your application will need to query the Permissions API to check for access whenever documents are requested. Below is a diagram illustrating how Sync API and Permissions API can be used together in a RAG application (“Your App”) where an organization admin connects their Google Workspace to your app to ingest workspace files on behalf of everyone in that organization. In this flow, Permissions API narrows the documents that your vector/search database returns for the user’s query to only the files that the user has access to in Google Workspace.
[Figure: Permissions API diagram, illustrating the post-search filtering approach]

Depending on the architecture of your app and your indexing requirements, Permissions API can be implemented as a pre-search filter, post-search filter, or in a hybrid approach of both to ensure that users can only view documents that they have access to in upstream applications (Google Drive, SharePoint, etc.).

Pre-search Filtering

This approach will only scale to a small corpus of documents (≤ 1000 total). If your users will sync more documents, we recommend using a post-search filtering or hybrid approach as described below.
In pre-filtering, you can use Permissions API to collect a list of document IDs before you query your vector database or search index. Use List Objects to enumerate all searchable documents, and provide this list as a metadata filter on the query to your search database:
POST /api/permissions/{sync_id}/list-objects

{
  "objectType": "file",
  "user": "[email protected]",
  "role": "can_read"
}
The API will respond with an array of document IDs (file.id from the File schema) that you can use to limit your search index parameters.
Vector database query (Pinecone-style metadata filter):
index.query(
    vector=[...],
    top_k=10,
    filter={
        "file_id": {"$in": permitted_doc_ids} 
    }
)
Search index query (Elasticsearch-style query DSL):
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "content": "search text"
        }
      },
      "filter": {
        "terms": {
          "file_id": permitted_doc_ids
        }
      }
    }
  }
}
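The two steps above can be tied together roughly as follows. This is a sketch, not a reference implementation: `run_query` is a hypothetical callable that wraps your own search client (for example, a thin wrapper around `index.query`), and `permitted_doc_ids` is assumed to be the array of IDs already parsed from the List Objects response.

```python
def pre_filtered_search(permitted_doc_ids, run_query, top_k=10):
    """Run a search restricted to the documents the user may read.

    `run_query` is a hypothetical wrapper around your search client that
    accepts `filter` and `top_k` keyword arguments.
    """
    # De-duplicate while keeping the order returned by List Objects.
    ids = list(dict.fromkeys(permitted_doc_ids))
    if not ids:
        # The user can read nothing: skip the search entirely rather than
        # risk running an unfiltered query.
        return []
    return run_query(filter={"file_id": {"$in": ids}}, top_k=top_k)
```

Returning early on an empty list is a deliberate choice in this sketch, since some search backends may reject or ignore an empty `$in` filter.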

Post-search Filtering

In post-search filtering, you can use Permissions API to filter the results of your search query to only include documents that the user has access to. After you receive ranked search results from your vector database, send a request to the Batch Check Access endpoint to check if the current user has access to the result documents by passing a list of IDs:
POST /api/permissions/{sync_id}/batch-check

{
  "checks": [
    {
      "object": "doc-1",
      "user": "user:[email protected]",
      "role": "can_read"
    },
    {
      "object": "doc-2",
      "user": "user:[email protected]",
      "role": "can_read"
    }
  ]
}
The API responds with a list of objects, each with a Boolean allowed property that tells you whether the current user has access to the file (directly or via a transitive relationship). Use this list to filter the search results from your database so that only accessible documents are included in your LLM context. We recommend over-fetching from your search or vector database by a factor of at least 2x, because some relevant documents will not be available to the user and will be filtered out.
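As an illustration, the request body and the filtering step might look like the sketch below. The function names are hypothetical, and the code assumes that the response list is in the same order as the submitted checks; confirm this against the actual response before relying on it.

```python
def build_batch_checks(doc_ids, user, role="can_read"):
    """Build the request body for the Batch Check Access endpoint."""
    return {
        "checks": [{"object": doc_id, "user": user, "role": role} for doc_id in doc_ids]
    }


def filter_permitted(ranked_doc_ids, check_results, top_k):
    """Keep the IDs whose check came back allowed, preserving rank order.

    Assumption: check_results[i] is the result for ranked_doc_ids[i].
    """
    if len(check_results) != len(ranked_doc_ids):
        raise ValueError("expected exactly one check result per document ID")
    permitted = [
        doc_id
        for doc_id, result in zip(ranked_doc_ids, check_results)
        if result.get("allowed") is True  # anything other than True is denied
    ]
    # The search was over-fetched, so trim back to the size the caller wants.
    return permitted[:top_k]
```

For example, to return 10 results you would request 20 or more from the search database, pass their IDs through `build_batch_checks`, and call `filter_permitted(..., top_k=10)` on the response.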

Hybrid Approach

You can combine pre-search and post-search filtering techniques to limit the search space upfront for your database queries while ensuring that permissions enforcement is accurate for document sets of more than 1000 files. Use coarse-grained pre-search filters to limit the initial search space (for example, removing documents from teams or spaces that the user is not assigned to), and use the Permissions API to filter the final results, verifying that the user has access to the documents that are ultimately used in a response. Some examples of coarse pre-search filters you can use:
  • Folder IDs that the user has access to
  • Space/Drive IDs that the user has access to
  • Group/Team IDs that the user is a member of
This metadata will need to be indexed in your search database at ingestion time so that you can filter the results based on it. For example, if a user is part of the Engineering team inside of your app, an organization admin can choose during sync setup which top-level folders or drives belong to the Engineering team and should be searchable by other members. When ingesting files from this sync, you will need to include this metadata in the indexed documents so that you can use it as a pre-search filter.
Vector database query (Pinecone-style metadata filter):
index.query(
    vector=[...],
    top_k=10,
    filter={
        "team_id": {"$in": ["engineering-team-id"]}
    }
)
Search index query (Elasticsearch-style query DSL):
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "content": "search text"
        }
      },
      "filter": {
        "terms": {
          "team_id": ["engineering-team-id"]
        }
      }
    }
  }
}
After you receive the search results, you can use the Batch Check Access endpoint to filter the result set to the documents that the user has access to. As in Post-search Filtering, we recommend over-fetching by a factor of at least 2x from your search / vector database to account for scenarios where some relevant documents will not be available to the user.
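Put together, the hybrid flow could be sketched as below. Both `search` and `batch_check` are hypothetical callables that you would supply: `search` wraps your search client and returns ranked document IDs, and `batch_check` wraps the Batch Check Access endpoint and returns the set of IDs the user is allowed to read.

```python
def hybrid_search(search, batch_check, user, team_ids, top_k=10, overfetch=2):
    """Coarse pre-search filter, then a fine-grained permission check.

    `search` and `batch_check` are hypothetical wrappers around your search
    client and the Batch Check Access endpoint, respectively.
    """
    # Step 1: narrow the search space with metadata indexed at ingestion time.
    coarse_filter = {"team_id": {"$in": list(team_ids)}}
    # Step 2: over-fetch so that results removed in step 3 can be replaced.
    candidates = search(filter=coarse_filter, top_k=top_k * overfetch)
    # Step 3: verify access to every candidate before it reaches the response.
    permitted = batch_check(candidates, user)
    return [doc_id for doc_id in candidates if doc_id in permitted][:top_k]
```

The coarse filter only reduces the search space; the permission check in the last step is what decides whether a document is shown to the user.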