Notion AI security and privacy practices

At Notion, we want to be transparent with our customers about our products and how we use AI to enhance our users’ experience. Below is an overview of Notion’s AI functionality and related security and privacy practices. 🔒

What is Notion AI?

Notion AI is a collection of AI products and features, which currently includes*:

AI assistant: Boost your productivity with instant answers to your questions as well as help with generating written content and brainstorming, all by using information from across your Notion workspace and the web.
Autofill: Generate text across many pages in a database simultaneously by writing a custom prompt or selecting a pre-configured prompt.

Notion AI features appear seamlessly in your workspace but leverage technology from several AI Subprocessors to provide you with the service. Check out our Subprocessor page for a complete list of our current Subprocessors, and learn more about how Notion uses AI in this article.

*Notion AI will continue expanding to include more features over time.

How does Notion AI work?

Who are Notion’s Large Language Model Providers?

Notion currently uses various large language models (LLMs) hosted by Notion as well as by organizations such as Anthropic and OpenAI. We continuously evaluate LLM providers and their models to provide the highest quality experience to our Notion AI users. Any third parties that process Customer Data will be published on our Subprocessor page.

How do I subscribe to new Subprocessor notifications?

Customers may sign up to receive notification of new Subprocessors by e-mailing team@makenotion.com with the subject “Subscribe to New Subprocessors.” Once a customer has signed up to receive new Subprocessor notifications, Notion will provide that customer with notice of any new Subprocessors before authorizing the new Subprocessor to process Customer Data. For additional information, please see our Data Processing Addendum.

How does Autofill work?

When you use Notion AI to set up an Autofill property, several steps occur in the background:

Notion receives a prompt from a user.
Data relevant to the prompt is sent to an AI LLM Subprocessor, which produces an output to send back to Notion.
Notion then processes the LLM’s output so that it adheres to the right format and language and displays the output to the user.
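
The three steps above can be pictured with a minimal sketch. This is an illustration only, not Notion's implementation: the AutofillRequest class, the fake_llm stand-in, and the run_autofill function are hypothetical names, and the "LLM" here simply echoes part of its input.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AutofillRequest:
    prompt: str      # the user's custom or pre-configured prompt
    page_text: str   # content of the page the Autofill property belongs to


def fake_llm(payload: str) -> str:
    """Hypothetical stand-in for an LLM Subprocessor call."""
    last_line = payload.splitlines()[-1]
    return f"  summary: {last_line[:40]}  \n"


def run_autofill(request: AutofillRequest, llm: Callable[[str], str]) -> str:
    # Step 1: the prompt is received (here, as the request object).
    # Step 2: data relevant to the prompt is sent to the LLM.
    payload = f"{request.prompt}\n{request.page_text}"
    raw_output = llm(payload)
    # Step 3: the raw output is tidied up before it is displayed.
    text = raw_output.strip()
    return text[:1].upper() + text[1:]


request = AutofillRequest(
    prompt="Summarize this page",
    page_text="Q3 launch plan for the mobile app",
)
print(run_autofill(request, fake_llm))
```

Passing the LLM in as a plain function keeps the sketch easy to test and makes it clear that the model call is the only step that leaves the application.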

How is my data protected?

When sending data to our AI LLM Subprocessors, the data is encrypted in transit using TLS 1.2 or greater, and no Customer Data is used to train the model.
All our AI LLM Subprocessors retain data for only 30 days or fewer before deletion.
Only data the user has access to on the specific page where Autofill is used will be sent to AI LLM Subprocessors to generate the output, meaning that the generated outputs provided to the user will not incorporate any data to which the user did not already have access.

How does AI assistant work?

Notion AI is a personalized assistant powered by GPT-4. It can search across your workspace, create and edit content, and chat with you about any topic.

The AI assistant works in two key phases:

Creating embeddings.
Generating responses.

What are embeddings?

Embeddings are numerical representations of text or documents. These representations capture the meaning and context of the text in a multidimensional space, where similar topics have similar numerical representations. By using embeddings, vector search algorithms can efficiently compare and find similarities between different pieces of text or documents. In the case of Notion’s AI assistant feature, embeddings are created from workspace content to enable the system to provide accurate and relevant responses to user questions.

Here is an example of an embedding from OpenAI:

[ -0.02541878, -0.0104167685, -0.0015037002, ..., -0.004155378, -0.00043069973, -0.01679479 ]
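
To make "similar topics have similar numerical representations" concrete, here is a small sketch that compares vectors with cosine similarity, a common measure for this purpose. The three-dimensional vectors are made up for illustration; real embeddings have hundreds or thousands of dimensions, and this article does not state which similarity measure Notion's vector search uses.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for the embeddings of three pages.
meeting_notes = [0.9, 0.1, 0.0]
standup_agenda = [0.8, 0.2, 0.1]
pasta_recipe = [0.0, 0.1, 0.9]

related = cosine_similarity(meeting_notes, standup_agenda)
unrelated = cosine_similarity(meeting_notes, pasta_recipe)
print(f"related pages:   {related:.3f}")
print(f"unrelated pages: {unrelated:.3f}")
```

The two work-related vectors score much closer to 1.0 than the unrelated pair, which is the property a vector search relies on to find relevant pages.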

How are embeddings created?

For each page in your workspace, we generate an embedding by using an OpenAI zero-retention embeddings API.
Notion receives an embedding for each Notion page and stores it in a vector database (e.g., Pinecone), which is ultimately used to find the original page when it is relevant to providing a response to the user.
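
The indexing step can be sketched as follows. Everything here is a hypothetical stand-in: toy_embed replaces the real embeddings API with a simple hashed word count, and InMemoryVectorStore replaces a hosted vector database. The sketch only shows the shape of the process: one vector per page, stored under the page's ID.

```python
import zlib
from typing import Dict, List

DIM = 16  # toy dimensionality; real embeddings are far larger


def toy_embed(text: str) -> List[float]:
    """Hypothetical embedding: count words into DIM hashed buckets."""
    vector = [0.0] * DIM
    for word in text.lower().split():
        bucket = zlib.crc32(word.encode("utf-8")) % DIM
        vector[bucket] += 1.0
    return vector


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self) -> None:
        self._vectors: Dict[str, List[float]] = {}

    def upsert(self, page_id: str, vector: List[float]) -> None:
        self._vectors[page_id] = vector

    def get(self, page_id: str) -> List[float]:
        return self._vectors[page_id]

    def delete(self, page_id: str) -> None:
        self._vectors.pop(page_id, None)

    def __len__(self) -> int:
        return len(self._vectors)


def index_pages(pages: Dict[str, str], store: InMemoryVectorStore) -> None:
    # One embedding per page, keyed by page ID so the page can be found later.
    for page_id, text in pages.items():
        store.upsert(page_id, toy_embed(text))


store = InMemoryVectorStore()
index_pages({"page-1": "hello world", "page-2": "quarterly roadmap"}, store)
print(len(store), "pages indexed")
```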

How are embeddings used to generate responses?

Notion receives a question from a user.
The request is passed to an LLM Subprocessor. If the request does not require searching the Notion workspace, the LLM will generate a response at this point and skip the rest of the process. If the user’s request requires searching their workspace, the LLM generates a search query most relevant for the user request.
The query is passed to a vector database, where a list of pages is found based on relevance to the query.
Notion sends the query, along with the pages identified by the vector database, to a Notion-hosted LLM, where the pages are refined and ranked by relevance to the query.
The query, refined list of pages, and ranking of pages are processed by our LLM Subprocessors to generate a response that fulfills the user’s request.
Notion processes the output to adhere to the right format and language and displays the output to the user.
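
The steps above can be sketched end to end with simple stand-ins. This is an assumption-laden illustration rather than Notion's code: plan_search_query, vector_search, rerank, and answer are hypothetical names, the "LLM" decisions are hard-coded rules, and the embedding is a toy bag-of-words count.

```python
import math
from collections import Counter
from typing import Dict, List, Optional

PAGES = {
    "roadmap": "Q3 roadmap for the mobile launch",
    "recipes": "pasta recipes for dinner",
    "offsite": "team offsite travel plan",
}


def embed(text: str) -> Counter:
    """Toy embedding: word counts (real embeddings are dense float vectors)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def plan_search_query(question: str) -> Optional[str]:
    """Stand-in for the first LLM call: return a query, or None if no search is needed."""
    normalized = question.lower().rstrip("?")
    # Hypothetical rule: questions about "my" content need a workspace search.
    return normalized if " my " in f" {normalized} " else None


def vector_search(query: str, index: Dict[str, Counter], top_k: int = 2) -> List[str]:
    q = embed(query)
    ranked = sorted(index, key=lambda pid: cosine(q, index[pid]), reverse=True)
    return ranked[:top_k]


def rerank(query: str, page_ids: List[str], pages: Dict[str, str]) -> List[str]:
    """Stand-in for the ranking model: prefer pages sharing more query words."""
    q_words = set(query.split())
    return sorted(
        page_ids,
        key=lambda pid: len(q_words & set(pages[pid].lower().split())),
        reverse=True,
    )


def answer(question: str, pages: Dict[str, str]) -> str:
    query = plan_search_query(question)
    if query is None:
        return "Answered directly, without searching the workspace."
    # In this sketch the index is built on the fly; a real index is built ahead of time.
    index = {pid: embed(text) for pid, text in pages.items()}
    candidates = vector_search(query, index)
    best = rerank(query, candidates, pages)[0]
    return f"Based on page '{best}': {pages[best]}"


print(answer("What is on my Q3 roadmap?", PAGES))
print(answer("Tell me a joke", PAGES))
```

The early return mirrors the step where a request that does not need a workspace search is answered immediately and skips the rest of the process.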

How are embeddings protected?

Although embeddings are only a numerical representation of Customer Data, Notion treats embeddings with the same level of security and privacy considerations as Customer Data. All our Customer Data commitments outlined in our Master Service Agreement (MSA) and Data Processing Agreements (DPA) apply to embeddings. View our Terms and Privacy Page for more information.

We store embeddings in vector databases such as Pinecone. These vector databases have been vetted by our security team, and each has been assessed by an external auditor to obtain its SOC 2 Type II report.

How is my data protected?

Does Notion AI respect existing permissions?

Yes, Notion AI honors existing permissions. The LLM used to generate AI responses for a user cannot see or use any information to which that user does not already have access.
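
One way to picture this guarantee is a filter that runs before any content reaches the model. The sketch below is a hypothetical illustration, not Notion's permission system: the Page class, the allowed_users field, and both functions are invented for the example.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Page:
    page_id: str
    text: str
    allowed_users: Set[str] = field(default_factory=set)


def visible_pages(user_id: str, pages: List[Page]) -> List[Page]:
    """Keep only the pages this user can already open."""
    return [page for page in pages if user_id in page.allowed_users]


def build_llm_context(user_id: str, pages: List[Page]) -> str:
    # Only permitted pages are ever placed in the context sent to the model.
    return "\n".join(page.text for page in visible_pages(user_id, pages))


pages = [
    Page("handbook", "Team handbook and onboarding steps", {"alice", "bob"}),
    Page("payroll", "Confidential salary bands", {"alice"}),
]

print(build_llm_context("bob", pages))
```

Because the filter is applied to the input rather than the output, the model never sees content the user cannot access, so it cannot repeat it.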

How is Customer Data protected when sent to AI Subprocessors?

Notion AI is designed to protect your Customer Data and prevent information leaks to other users of the service.

Prior to engaging any third-party Subprocessor or vendor, Notion evaluates their privacy, security, and confidentiality practices, and executes an agreement implementing its applicable security, privacy, and legal obligations. All Subprocessors are monitored and reviewed at least annually to ensure continued compliance with Notion’s security and privacy expectations. This includes reviewing documents such as attestation reports, penetration tests, and other artifacts based on the Subprocessor’s criticality and other risk factors. As part of onboarding and ongoing reviews, vendors are required to complete technology security questionnaires. Significant public security events are also assessed to reduce supply chain risk.

When we send your Customer Data to third parties, it is encrypted in transit using TLS 1.2 or greater.
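
For readers who want to enforce the same floor in their own integrations, the following sketch shows how a client can refuse anything older than TLS 1.2 using Python's standard ssl module. It illustrates the general technique only and says nothing about how Notion's own services are configured; make_tls_context is a hypothetical helper name.

```python
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Build a client context that rejects TLS versions below 1.2."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_tls_context()
print(context.minimum_version.name)
```

The resulting context can be passed to standard-library clients, for example urllib.request.urlopen(url, context=context).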

For more information about how Notion processes your data, please refer to our Data Processing Addendum.

Will my data be used to train any models?

By default, Notion and its AI Subprocessors do not use Customer Data to train any models. We specifically have contractual agreements in place with our AI Subprocessors that prohibit the use of Customer Data to train their models.

Your use of Notion AI does not grant Notion any right or license to your Customer Data to train our machine learning models.

How is Customer Data segregated?

Individual customer accounts are kept separate in our production environment. We do not mix or process data from different customers together during AI processing. This means we do not expose your data to other Notion customers.

What are the data retention obligations of third-party AI providers?

Notion AI Subprocessors have data retention policies that allow Notion to meet our obligations to customers for the processing of data.

When you use the Notion AI assistant and Autofill, our LLMs retain Customer Data for only 30 days or fewer before deletion. Notion's AI assistant is additionally powered by OpenAI's embeddings. OpenAI does not retain any Customer Data through its embeddings service.

Embeddings stored in vector databases are deleted within 60 days from when the page or workspace is deleted.

If a user deletes a Notion page or Notion workspace, we can restore the content within 30 days. After 30 days, the data is deleted and unrecoverable. This includes any AI-generated data and embeddings. For more information about deleting or restoring your data, please refer to this article.
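
As a worked example of these windows, the sketch below computes the relevant dates for a page deleted on a given day. The 30-day and 60-day figures come from this article; the function and key names are hypothetical, and the 60-day date is an upper bound rather than a scheduled deletion date.

```python
from datetime import date, timedelta

RESTORE_WINDOW_DAYS = 30         # deleted content can be restored within 30 days
EMBEDDING_DELETION_MAX_DAYS = 60  # embeddings are deleted within 60 days


def retention_timeline(deleted_on: date) -> dict:
    """Return the key dates that follow a page or workspace deletion."""
    return {
        "restorable_until": deleted_on + timedelta(days=RESTORE_WINDOW_DAYS),
        "embeddings_deleted_by": deleted_on
        + timedelta(days=EMBEDDING_DELETION_MAX_DAYS),
    }


timeline = retention_timeline(date(2024, 1, 1))
for label, day in timeline.items():
    print(f"{label}: {day.isoformat()}")
```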

What compliance standards does Notion AI meet?

Notion AI is included in the scope of Notion’s SOC 2 Type 2 report and ISO 27001 certification, demonstrating our commitment to various regulatory and industry standards.

We are actively working to enable Notion AI to meet HIPAA requirements by utilizing LLM providers’ zero-retention APIs, which would allow for the processing of protected health information (PHI).

What controls can be added?

Can data loss prevention (DLP) be configured to alert for data being used by Notion AI?

Customers on our Enterprise plan can trigger data loss prevention (DLP) alerts for sensitive content in their Notion workspace using third-party integration partners. This includes content in an AI prompt and content generated by AI. Learn more about our DLP integrations here.

What are the legal considerations?

Are there rules against what I can do with Notion AI?

The Notion AI Supplementary Terms apply to your usage of Notion AI. In addition, Notion’s Content & Use Policy applies to any content on Notion, including content generated by Notion AI. Violating these terms can result in removal of your content or suspension of access to your workspace.

Who owns the rights to content generated by Notion AI?

Notion does not claim ownership of your input or the generated output. This is addressed in the Notion AI Supplementary Terms in the "Input and Output" section:

You may provide input to be processed by Notion AI (“Input”), and receive output generated and returned by Notion AI based on the Input (“Output”). When you use Notion AI, Input and Output are your Customer Data.

You can also reference our standard data protection practices.