I remember the first time I stumbled upon Notion in 2020. It was an enticing space where you could organize your thoughts and projects, but little did I know that behind this user-friendly interface lay a complex web of database management designed to handle an explosion of users. Fast forward just a few years, and this platform went from 1,000 users to a staggering 100 million! How did Notion manage to keep its service robust while catering to that many users? Let’s dive into the intricacies of their technical architecture and innovative strategies that allowed them to scale tremendously.
The Initial Tech Stack: Understanding Notion’s Formative Years
Notion has become a household name in productivity software. But have you ever wondered how it got there? Let’s dive into its architecture and the challenges it faced along the way.
Notion's Architecture in 2021
In 2021, Notion operated on a single PostgreSQL database. This setup was simple yet effective at first. Each Notion page consists of blocks. Think of these blocks as individual pieces of a puzzle. Each block has a unique ID and can reference other blocks, which is how a page’s nested structure (headings, paragraphs, lists, and so on) gets built up.
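To make the block idea concrete, here is a minimal sketch of what a block record might look like. The field names are illustrative, loosely modeled on how Notion has publicly described its block model, not an exact copy of their schema.

```python
from dataclasses import dataclass, field
from typing import Optional
from uuid import UUID, uuid4

@dataclass
class Block:
    """Illustrative block record: every piece of content is a block
    that can point to other blocks, forming a tree per page."""
    id: UUID = field(default_factory=uuid4)             # unique ID for each block
    type: str = "text"                                   # e.g. "text", "heading", "page"
    properties: dict = field(default_factory=dict)       # the block's actual content
    parent_id: Optional[UUID] = None                     # the block (or page) containing it
    content: list[UUID] = field(default_factory=list)    # ordered child block IDs

# A tiny page: a heading block and a text block, both children of the page block.
page = Block(type="page", properties={"title": "Meeting notes"})
heading = Block(type="heading", properties={"text": "Agenda"}, parent_id=page.id)
note = Block(type="text", properties={"text": "Review Q3 roadmap"}, parent_id=page.id)
page.content = [heading.id, note.id]
```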
However, as Notion grew, this architecture began to show its limitations. By 2022, Notion had amassed a staggering 20 billion blocks. That’s a lot of data! With only one database handling all that information, it was only a matter of time before things slowed down. As Lewis pointed out, “Notion is one of the fastest growing softwares ever created.” But even the fastest can hit a wall.
Challenges with User Growth
Let’s talk numbers. Notion started with just 1,000 users in 2020. Fast forward to 2024, and they expect to reach 100 million users. That’s a hundred-thousand-fold increase in roughly four years. But with this rapid growth came significant challenges.
- Bottlenecks: As users created more content, the system began to struggle.
- Performance Issues: The single database couldn’t keep up with the demand.
Imagine trying to pour a gallon of water through a straw. It just doesn’t work! As the number of blocks surged, Notion faced the risk of crashing. “If they broke down, everybody loses access to their notes,” Lewis emphasized. This was a serious concern for the team.
The Decision-Making Process for Scalability
Faced with these challenges, Notion had to make some tough decisions. They opted for a strategy called “sharding.” This means splitting the database into smaller, more manageable pieces. Instead of one massive database, they created 32 separate Postgres instances, each holding 15 logical shards (480 in total). This decision was crucial for scaling their systems.
But the transition wasn’t easy. Notion employed a double write strategy. This means they wrote data to both the old and new databases simultaneously. It’s like having two notebooks open at once to ensure you don’t miss anything important. They even used a powerful machine with 96 CPUs to backfill old data over three days.
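As a rough illustration of the double-write idea, here is a minimal sketch in Python. The connection strings, table, and columns are invented for the example; Notion’s actual migration tooling isn’t public, so treat this as the shape of the technique rather than their implementation.

```python
import psycopg2

# Hypothetical connection strings; the real topology is far more involved.
OLD_DB_DSN = "dbname=notion_monolith host=old-db.internal"
NEW_SHARD_DSN = "dbname=notion_shard_07 host=shard-07.internal"

def double_write(block_id: str, payload: str) -> None:
    """Write the same row to the legacy database and to its new shard,
    so both stay in sync while the backfill and verification run."""
    old_conn = psycopg2.connect(OLD_DB_DSN)
    new_conn = psycopg2.connect(NEW_SHARD_DSN)
    try:
        for conn in (old_conn, new_conn):
            # Each 'with conn' commits on success; a production system also
            # needs a plan for when one side succeeds and the other fails.
            with conn, conn.cursor() as cur:
                cur.execute(
                    "INSERT INTO block (id, payload) VALUES (%s, %s) "
                    "ON CONFLICT (id) DO UPDATE SET payload = EXCLUDED.payload",
                    (block_id, payload),
                )
    finally:
        old_conn.close()
        new_conn.close()
```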
As they implemented these changes, they also introduced a data lake concept. This allowed them to store raw data separately from user-facing services. It’s a bit like having a storage room for all your supplies while keeping your workspace tidy. By using tools like Fivetran and Snowflake, Notion improved its data analytics capabilities.
However, challenges didn’t stop there. Shards began hitting 90% utilization rates. To combat this, the team decided to triple their database machines from 32 to 96. This cut the number of logical shards per machine from 15 down to 5, easing the load on each machine and optimizing performance.
Testing was crucial during this transition. They conducted “dark reads,” sending requests to both the old and new databases to ensure everything was consistent. The meticulous process involved stopping connections, verifying data integrity, and updating connection management tools. All of this was done without any downtime for users, showcasing the engineering prowess of the Notion team.
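Here is a simplified sketch of what a dark read can look like; the two fetch functions are placeholders standing in for queries against the monolith and the new shards.

```python
import logging

logger = logging.getLogger("dark_reads")

def fetch_from_old(block_id):   # placeholder for a query against the old database
    ...

def fetch_from_new(block_id):   # placeholder for a query against the sharded cluster
    ...

def get_block(block_id):
    """Serve reads from the old database, but also query the new shards
    and log any mismatch so inconsistencies surface before cutover."""
    primary = fetch_from_old(block_id)
    try:
        shadow = fetch_from_new(block_id)
        if shadow != primary:
            logger.warning("dark read mismatch for block %s", block_id)
    except Exception:
        logger.exception("dark read failed for block %s", block_id)
    return primary  # users only ever see the old database's answer
```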
In summary, Notion’s journey from a single database to a scalable architecture is a testament to its commitment to user experience. As they continue to grow, their focus on innovation and adaptability remains key to their success.
The Smart Sharding Solution
Understanding Sharding
Sharding is a powerful strategy for scaling databases. But what exactly is it? In simple terms, sharding involves breaking a large database into smaller, more manageable pieces called shards. Each shard holds a portion of the data, allowing for better performance and easier management. Think of it like dividing a big pizza into slices. Each slice is easier to handle and can be served faster.
Notion, a popular productivity tool, faced significant challenges as its user base exploded from just 1,000 users in 2020 to a staggering 100 million in 2024. Its single PostgreSQL database struggled to keep up: with over 20 billion blocks of data, the system began to slow down. This is where sharding came into play.
How Notion Implemented Sharding
Notion's engineers took a strategic approach to sharding. They decided to base their sharding on workspace IDs. This meant that each workspace would have its own set of database shards. By doing this, they could efficiently manage the growing amount of data while ensuring that users experienced minimal latency.
The decision led to the creation of 32 separate database instances, each containing 15 shards. In total, this resulted in 480 logical shards. This transformation was not just about splitting the database; it was about creating a more efficient and manageable block table. As one engineer put it,
“The engineers at Notion looked at their block model and decided to shard everything related to this block table.”
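Notion hasn’t published its exact routing function, but a common pattern is to hash the workspace ID into one of the 480 logical shards and then map each logical shard to a physical instance. A toy sketch under that assumption:

```python
import uuid

NUM_LOGICAL_SHARDS = 480   # 32 physical instances x 15 logical shards each
NUM_INSTANCES = 32

def logical_shard_for(workspace_id: str) -> int:
    """Map a workspace UUID to one of the 480 logical shards.
    (Illustrative only: a real system needs a stable, well-documented hash.)"""
    return uuid.UUID(workspace_id).int % NUM_LOGICAL_SHARDS

def instance_for(logical_shard: int) -> int:
    """Pack 15 consecutive logical shards onto each physical instance."""
    return logical_shard // (NUM_LOGICAL_SHARDS // NUM_INSTANCES)

ws = "9f1a2b3c-4d5e-4f60-8a7b-1c2d3e4f5a6b"
shard = logical_shard_for(ws)
print(f"workspace {ws} -> logical shard {shard} -> instance {instance_for(shard)}")
```

Because every block belongs to a workspace, routing on the workspace ID keeps all of a workspace’s blocks on the same shard, which is what keeps page loads from fanning out across the whole fleet.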
Challenges Faced
However, the transition to sharding was not without its challenges. One major hurdle was implementing a double write strategy. This meant that data had to be written to both the old and new databases simultaneously. It was a complex process, but it was necessary to ensure data consistency during the transition.
Notion also had to deal with the limitations of their existing infrastructure. As shards began to hit 90% utilization rates, they realized they needed to triple their database machines from 32 to 96. This would help reduce the number of shards per machine, making it easier to manage the data.
Positive Outcomes
Despite the challenges, the outcomes of Notion's sharding strategy were overwhelmingly positive. The new system allowed for better performance and scalability. Users experienced faster load times and improved responsiveness. Additionally, the engineering team was able to implement new tools to organize and route incoming data more effectively.
Notion's journey into sharding serves as a valuable lesson for other companies facing similar scaling issues. By focusing on splitting their database rather than simply upgrading their host machine, they created a more efficient system. This strategic shift not only solved their immediate problems but also positioned them for future growth.
In conclusion, sharding is a vital strategy for any growing database. It allows for better management of data and improved performance. Notion's experience shows that with the right approach, even the most significant challenges can be overcome. As we continue to explore the world of database scaling, it's clear that innovative solutions like sharding will play a crucial role in the future of technology.
Building a Data Lake for Enhanced Data Management
What is a Data Lake and Why Does it Matter for Notion?
A data lake is a centralized repository that allows organizations to store vast amounts of raw data in its native format. This is crucial for companies like Notion, which generate massive amounts of data daily. Think of it as a large, flexible container that can hold everything from structured data to unstructured data, like documents and images.
Why does this matter? Well, as Notion grew from 1,000 users to a staggering 100 million in just four years, the need for efficient data management became critical. A data lake enables Notion to handle this influx of data without compromising performance. It allows them to store data without the constraints of traditional databases, making it easier to analyze and derive insights.
Traditional vs. Data Lake Storage Methods
Let’s break down the differences between traditional storage methods and data lakes:
- Traditional Storage: Typically involves structured databases like SQL. Data must be cleaned and organized before storage, which can be time-consuming.
- Data Lakes: Store data in its raw form. This means you can throw in everything—structured, semi-structured, and unstructured data—without worrying about formatting.
- Scalability: Traditional databases can struggle with scalability as data grows. Data lakes, on the other hand, are designed to scale effortlessly.
- Cost-Effectiveness: Maintaining a traditional database can be expensive. Data lakes often utilize open-source technologies, which can significantly reduce costs.
In essence, data lakes provide flexibility and efficiency that traditional methods simply can’t match. This is vital for a rapidly growing company like Notion.
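To make the schema-on-write versus schema-on-read contrast concrete, here is a toy sketch that uses SQLite and local files as stand-ins for a warehouse and an object store:

```python
import json, pathlib, sqlite3
from datetime import date, datetime, timezone

event = {
    "block_id": "b-123",
    "type": "text",
    "edited_at": datetime.now(timezone.utc).isoformat(),
    "extra": {"device": "web"},   # an unanticipated field: fine in a lake
}

# Schema-on-write (traditional database/warehouse): the table shape is fixed
# up front, and anything that doesn't fit must be cleaned or dropped first.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE block_edits (block_id TEXT, type TEXT, edited_at TEXT)")
db.execute(
    "INSERT INTO block_edits VALUES (?, ?, ?)",
    (event["block_id"], event["type"], event["edited_at"]),
)

# Schema-on-read (data lake): dump the raw event as-is into cheap storage,
# partitioned by date; structure is imposed later, at analysis time.
lake_dir = pathlib.Path("lake") / f"dt={date.today().isoformat()}"
lake_dir.mkdir(parents=True, exist_ok=True)
with open(lake_dir / "events.jsonl", "a") as f:
    f.write(json.dumps(event) + "\n")
```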
How Notion Created an In-House Data Lake
Faced with the challenges of managing unstructured data, Notion decided to take matters into their own hands.
“They decided to build their own data lake designed specifically for Notion.” This was a strategic move to tailor their data management system to their unique needs.
By using technologies like Apache Spark for data processing alongside Snowflake for analytics, Notion could efficiently handle the massive amounts of data generated by its users. The decision to build an in-house solution saved Notion millions of dollars. It also allowed them to keep pace with their rapid growth, which saw block generation rates doubling every six to twelve months.
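As a flavor of what a Spark batch job in this kind of pipeline can look like, here is a minimal PySpark sketch. The bucket paths and column names are invented for illustration; Notion’s real ingestion pipeline is considerably more sophisticated.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical locations; stand-ins for wherever raw events and the lake live.
RAW_EVENTS = "s3://example-raw-bucket/block-updates/"
LAKE_TABLE = "s3://example-lake-bucket/blocks/"

spark = SparkSession.builder.appName("block-updates-to-lake").getOrCreate()

# Read the raw block-update events in whatever shape they arrived.
updates = spark.read.json(RAW_EVENTS)

# Light, non-destructive processing: derive a partition date, drop exact repeats.
cleaned = (
    updates
    .withColumn("ds", F.to_date(F.col("edited_at")))
    .dropDuplicates(["block_id", "edited_at"])
)

# Append to the lake, partitioned by day so downstream jobs can prune cheaply.
cleaned.write.mode("append").partitionBy("ds").parquet(LAKE_TABLE)
```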
Notion's approach was not just about saving money; it was about creating a system that could evolve with their needs. They recognized the importance of cost-effectiveness and efficiency in data management. By leveraging open-source technologies, they could innovate without the heavy financial burden that often accompanies proprietary solutions.
In conclusion, Notion's journey to build a data lake is a testament to their commitment to efficient data management. As they continue to grow, their in-house data lake will play a crucial role in ensuring they can handle the ever-increasing volume of data while maintaining performance and user satisfaction.
Rethinking Infrastructure: The Path Forward
In today's fast-paced digital world, infrastructure is everything. As companies grow, their backend systems must evolve. This is especially true for data-driven architectures. When we look at the recent changes in Notion's infrastructure, we see a clear example of this evolution in action. The need for another round of sharding became evident as utilization rates spiked. But what does that really mean?
The Need for Sharding
Sharding is a method of splitting a database into smaller, more manageable pieces. Think of it like dividing a large pizza into slices. Each slice can be served independently, making it easier to handle. For Notion, the original PostgreSQL database was becoming overwhelmed. With over 20 billion blocks of data, it was clear that something had to change.
- 96 database machines were implemented to handle the increased load.
- Connection limits were reached with previously established systems.
In the first round of sharding, Notion had split its database into 32 separate instances, each with 15 shards; the move to 96 machines spread those same 480 logical shards across more hardware. This strategic approach allowed them to manage traffic more efficiently and reduce latency for users. But the transition wasn’t without its challenges.
Innovative Uses of pgBouncer
One of the key tools in this transition was pgBouncer. This tool manages database connections, helping to optimize traffic flow. Imagine pgBouncer as a traffic cop at a busy intersection, directing cars (or in this case, database requests) to ensure smooth passage. Notion utilized pgBouncer to handle the influx of connections effectively.
As utilization rates reached critical levels, pgBouncer played a crucial role in managing the database connections. The team had to be innovative, finding ways to further optimize connection management. They implemented additional divisions of pgBouncer, allowing for better organization and routing of incoming data.
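For context, here is a heavily trimmed, illustrative pgbouncer.ini fragment showing the two ideas at play: one entry per shard in the [databases] section, and transaction pooling so many client connections share a small pool of real Postgres connections. This is not Notion’s actual configuration.

```ini
; Illustrative pgbouncer.ini fragment -- not Notion's real setup.
[databases]
; One entry per logical shard; the application connects by shard name.
shard_000 = host=db-instance-00.internal port=5432 dbname=shard_000
shard_001 = host=db-instance-00.internal port=5432 dbname=shard_001
; ... one entry for each remaining logical shard ...

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; Transaction pooling lets thousands of application connections share a
; small number of real Postgres connections, which is what keeps the
; databases from hitting their connection limits.
pool_mode = transaction
max_client_conn = 5000
default_pool_size = 40
```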
Seamless Transition Without Downtime
One of the most impressive aspects of Notion's overhaul was their ability to transition without user downtime. They executed a meticulous plan. "They went one database at a time doing these steps: stop accepting connections, verify data integrity, and resume traffic to the new shards." This careful approach ensured that users experienced no interruptions.
Planning and methodical execution were key. The team employed a double write strategy, where data was written to both the old and new databases simultaneously. This strategy helped maintain data integrity during the transition. A powerful machine with 96 CPUs was used to backfill old data over three days. After verification, they switched over seamlessly.
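Sketched in Python, the per-database cutover loop might look something like this; every helper is a stub standing in for real tooling (pgBouncer admin commands, checksum jobs, config pushes), so this is the shape of the procedure rather than Notion’s scripts.

```python
# Stubs standing in for real operational tooling.
def pause_new_connections(db): ...
def wait_for_catch_up(db): ...
def data_matches(db): return True          # e.g. compare row counts / checksums
def repoint_connection_pooler(db): ...
def resume_traffic(db, target): ...

def cut_over_one_database(db_name: str) -> None:
    """One database at a time: stop accepting connections, verify data
    integrity, then resume traffic against the new shards."""
    pause_new_connections(db_name)
    wait_for_catch_up(db_name)
    if not data_matches(db_name):
        resume_traffic(db_name, target="old")   # fall back; users never notice
        raise RuntimeError(f"verification failed for {db_name}")
    repoint_connection_pooler(db_name)          # route new connections to the shards
    resume_traffic(db_name, target="new")

for db in [f"instance-{i:02d}" for i in range(96)]:
    cut_over_one_database(db)
```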
Conclusion
Engaging in a major infrastructure overhaul was not just necessary; it was vital for continued growth and user satisfaction. Managing a data-driven architecture requires constant evolution. As we reflect on Notion's journey, it's clear that their commitment to innovation and user experience has set a standard in the industry. The challenges they faced are a reminder that with careful planning and execution, even the most complex transitions can be successful. As we move forward, we must embrace change and remain adaptable in our approach to infrastructure.
TL;DR: Notion transformed its system architecture from a single Postgres database to a sharded network of multiple instances, ultimately building a custom data lake to handle intensive data operations efficiently, accommodating up to 100 million users and facilitating real-time changes seamlessly.