Scaling Playwright Tests: How I Solved CI Memory Leaks and Built an Open Source Solution
Engineering a robust solution for Playwright page lifecycle management that reduced CI memory usage by 85% and became an open source tool used by thousands of developers.
#The Problem: When Test Isolation Becomes Resource Exhaustion
As a Senior QA Automation Engineer managing test suites at scale, I encountered a critical problem that many engineering teams face but few solve systematically: resource leaks in end-to-end testing. Our CI environment was crashing with out-of-memory errors, accumulating 800+ browser tabs during test execution, and turning 15-minute builds into 45-minute disasters.
This isn't just a technical curiosity—it's a production problem that affects development velocity, CI costs, and team confidence in automated testing.
#Why This Problem Matters in Real Systems
Testing conversational interfaces requires verifying multiple perspectives simultaneously. When User A sends a message to User B, you need to validate:
- Sender experience: Message delivery confirmations, chat state updates, typing indicators
- Recipient experience: Message arrival, notification badges, read receipts, UI state changes
This requires multiple browser contexts, and at scale, this becomes a resource management challenge that can cripple your development pipeline.
Our testing environment served a team of 20+ engineers with:
- 50+ development environments running nightly regression suites
- 1,800 tests per environment across desktop, mobile web, Chrome, and Safari
- Multi-perspective scenarios requiring 2-5 browser pages per test
- CI infrastructure costs growing linearly with resource usage
#Technical Root Cause Analysis
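The issue wasn't with Playwright's design. It was with how teams typically handle multi-page scenarios. Playwright's context isolation model ensures each test starts clean, but only for the built-in context and page fixtures. The browser fixture is shared by every test that runs in the same worker, and each browser.newPage() call creates a page in its own new context that the test runner does not track. Unless the test closes them, those pages persist until the worker's browser shuts down, not until the individual test ends.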
This creates a resource accumulation pattern:
    // The hidden resource leak pattern
    import { test } from '@playwright/test'

    test('conversation flow test 1', async ({ browser }) => {
      const senderPage = await browser.newPage() // Page 1: Never cleaned up
      const recipientPage = await browser.newPage() // Page 2: Never cleaned up
      // Test logic...
      // ❌ Pages remain in memory after test completion
    })

    test('conversation flow test 2', async ({ browser }) => {
      const senderPage = await browser.newPage() // Page 3: Accumulates
      const recipientPage = await browser.newPage() // Page 4: Accumulates
      // Test logic...
    })

    // Result: Linear memory growth = Test count × Pages per test
#The CI Infrastructure Impact
The resource leak manifested as a cascading failure in our development pipeline:
Memory Growth Pattern:
- Tests 1-50: 2 pages each = 100 leaked pages so far
- Tests 51-100: 2 pages each = 200 leaked pages so far
- Tests 101-200: 2 pages each = 400 leaked pages so far
- Result: 8GB Jenkins instances running out of memory before completion
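The growth pattern is plain multiplication, and a small sketch makes it easy to project for other suite sizes. The function name `leakedPagesAfter` is illustrative, and it assumes every test leaks the same number of pages inside a single worker.

```typescript
// Illustrative helper (not part of Playwright or PageMan): pages still open
// in one worker after `testsRun` tests, if no test closes the pages it created.
function leakedPagesAfter(testsRun: number, pagesPerTest: number): number {
  return testsRun * pagesPerTest
}

// Checkpoints after 50, 100, and 200 tests at 2 pages per test.
const checkpoints = [50, 100, 200].map((testsRun) => leakedPagesAfter(testsRun, 2))
// checkpoints is [100, 200, 400]
```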
Business Impact:
- ❌ Build reliability: 60% of CI builds failing due to resource exhaustion
- ❌ Developer productivity: 45-minute feedback cycles instead of 15 minutes
- ❌ Infrastructure costs: Scaling Jenkins instances horizontally without fixing the root cause
- ❌ Team confidence: Engineers avoiding comprehensive test coverage due to CI instability
#Engineering Solution Architecture
Rather than implementing manual cleanup everywhere (which creates maintenance overhead and error-prone code), I designed an automatic solution that handles page lifecycle management at the framework level.
#Design Principles
- Zero Configuration: Should work out-of-the-box with existing Playwright tests
- Automatic Tracking: Intercept browser.newPage() calls transparently
- Reliable Cleanup: Use Playwright's fixture system for guaranteed cleanup
- Graceful Failure Handling: Clean up pages even when tests fail or timeout
- Developer Experience: Maintain existing test code patterns
#The Cleanup Chaos
Our first attempt was the obvious one: manual cleanup.
    import { test } from '@playwright/test'

    test('messaging with manual cleanup', async ({ browser }) => {
      const senderPage = await browser.newPage()
      const recipientPage = await browser.newPage()
      try {
        // Test logic here...
      } finally {
        await senderPage.close()
        await recipientPage.close()
      }
    })

This worked... until it didn't. A failure while creating the second page happened before the try block was entered, so the first page leaked. A close() call that rejected inside the finally block skipped the one after it. Async operations hung, and network timeouts left pages in weird states.
We ended up with:
    import { test } from '@playwright/test'

    test('messaging with paranoid cleanup', async ({ browser }) => {
      let senderPage, recipientPage
      try {
        senderPage = await browser.newPage()
        recipientPage = await browser.newPage()
        // Test logic...
      } finally {
        if (senderPage && !senderPage.isClosed()) {
          await senderPage.close().catch(console.error)
        }
        if (recipientPage && !recipientPage.isClosed()) {
          await recipientPage.close().catch(console.error)
        }
      }
    })
Multiply this everywhere. Every test became a defensive programming exercise. Our test code was 40% actual testing, 60% memory management ceremony.
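One way to shrink that ceremony, short of removing it, is to pull the guards into a single helper. The sketch below is an assumption about how such a helper could look, not code from our suite: `closeAllSafely` and the `PageLike` interface are hypothetical names, and `PageLike` only declares the two Playwright page methods the helper relies on so the example stays self-contained.

```typescript
// Minimal structural type: the only page methods this helper relies on.
// Playwright's Page provides both close() and isClosed().
interface PageLike {
  close(): Promise<void>
  isClosed(): boolean
}

// Hypothetical helper: close every page that was actually created and is
// still open, without letting one failed close() skip the rest.
// Returns how many close() calls rejected so the caller can log it.
async function closeAllSafely(pages: Array<PageLike | undefined>): Promise<number> {
  const open = pages.filter(
    (page): page is PageLike => page !== undefined && !page.isClosed(),
  )
  const results = await Promise.allSettled(open.map((page) => page.close()))
  return results.filter((result) => result.status === 'rejected').length
}
```

Each test would still need its own `finally { await closeAllSafely([senderPage, recipientPage]) }`, which is exactly the per-test burden that remained.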
#The Jenkins Memory Battle
Meanwhile, our DevOps team was frantically adjusting Jenkins configurations:
    # Increase memory allocation (spoiler: didn't help)
    memory: 8GB → 16GB → 32GB
    # Reduce parallel workers (slowed everything down)
    workers: 8 → 4 → 2 → 1
    # Add aggressive timeouts (killed valid long-running tests)
    timeout: 30s → 10s
We were treating the symptoms, not the disease. More memory just meant we could accumulate more leaked pages before crashing. Fewer workers meant longer build times. Aggressive timeouts meant false negatives.
The real problem wasn't memory—it was lifecycle management.
#The Eureka Moment
During one particularly frustrating debugging session, I noticed something odd. When I manually closed a browser window in headed mode, the test would suddenly start behaving normally. It wasn't a test logic problem—it was a resource management problem.
That's when it hit me: What if page cleanup was automatic and reliable, just like Playwright's context isolation?
I wanted the same confidence I had with contexts—knowing that every test starts clean and finishes clean, regardless of what happens in between.
#Building the Solution
I spent the next few days building what would become Playwright PageMan. The concept was elegantly simple:
- Track every extra page created during test execution
- Auto-close them after each test via Playwright's fixture lifecycle
- Handle failures gracefully so cleanup always happens
- Make it automatic for the common case (browser.newPage())
Here's how the same test looks with PageMan:
    import { test, expect } from 'playwright-pageman'

    test('user can send and receive messages', async ({ browser }) => {
      // This page is automatically tracked!
      const senderPage = await browser.newPage()
      await senderPage.goto('/chat/sender-view')

      // This page is automatically tracked too!
      const recipientPage = await browser.newPage()
      await recipientPage.goto('/chat/recipient-view')

      // ... test both sides of the conversation ...
      // No cleanup needed! Pages auto-close after the test.
    })
That's it. No try-finally blocks. No manual cleanup. No defensive programming. Just the test logic that actually matters.
#Auto-Tracking Magic
The secret sauce is automatic tracking. PageMan intercepts browser.newPage() calls and automatically adds them to a cleanup queue. When the test finishes (whether it passes, fails, or times out), Playwright's fixture system ensures all tracked pages get closed.
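To make the idea concrete, here is a minimal sketch of the intercept-and-clean-up pattern. It is not PageMan's actual source: `runWithPageTracking`, `PageLike`, and `BrowserLike` are hypothetical names, and the interfaces declare only the methods the sketch needs so it runs without a browser. In a real Playwright setup this logic would more likely live in a fixture created with `test.extend`, where the code after `await use(...)` runs as teardown.

```typescript
interface PageLike {
  close(): Promise<void>
  isClosed(): boolean
}

interface BrowserLike {
  newPage(): Promise<PageLike>
}

// Sketch only: wrap browser.newPage() so every page it returns is recorded,
// run the test body, then close whatever is still open, even if the body threw.
async function runWithPageTracking(
  browser: BrowserLike,
  testBody: (browser: BrowserLike) => Promise<void>,
): Promise<void> {
  const tracked: PageLike[] = []
  const originalNewPage = browser.newPage

  browser.newPage = async () => {
    const page = await originalNewPage.call(browser)
    tracked.push(page) // add to the cleanup queue
    return page
  }

  try {
    await testBody(browser)
  } finally {
    browser.newPage = originalNewPage // undo the patch for the next test
    for (const page of tracked) {
      if (!page.isClosed()) {
        // One failed close() must not block the remaining pages.
        await page.close().catch(() => {})
      }
    }
  }
}
```

The `finally` block is what gives the "passes or fails" guarantee in this sketch; covering test timeouts as well depends on the runner executing teardown after a timeout, which is the part the fixture system provides.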
For the less common cases—pages created via context.newPage() or popup windows—you can manually track them:
    import { test } from 'playwright-pageman'

    test('handle popup messages', async ({ page, context, extraPages }) => {
      // Open a popup window
      const [popup] = await Promise.all([
        context.waitForEvent('page'),
        page.click('#open-conversation-popup'),
      ])

      // Track it for auto-cleanup
      extraPages.push(popup)

      // Test the popup interface...
      // Popup auto-closes after the test
    })
#The CI Transformation
After rolling out PageMan, our Jenkins builds went from disaster to delight:
Before PageMan:
- ❌ Builds timing out after 45 minutes
- ❌ Out of memory errors
- ❌ 800+ leaked browser windows
- ❌ Tests failing due to resource exhaustion
- ❌ Manual cleanup everywhere
After PageMan:
- ✅ Reliable 15-minute build times
- ✅ Stable memory usage (under 2GB)
- ✅ Zero leaked pages
- ✅ Tests failing only when they should
- ✅ Clean, focused test code
#Real-World Benefits
The impact went beyond just CI stability:
For QA Engineers:
- Write tests that focus on behavior, not cleanup
- No more defensive try...finally everywhere
- Reliable execution in both local and CI environments
For DevOps Teams:
- Predictable resource usage in CI pipelines
- Smaller Jenkins instances (saved actual money)
- No more 3 AM alerts about crashed test runners
For Development Teams:
- Faster feedback loops with stable CI
- More confidence in test results
- Multi-page testing patterns become trivial
#The Global Access Innovation
One of the most requested features came from page object models. Teams wanted to track pages created inside helper functions without passing fixtures around:
    // helpers/conversation-helper.ts
    import type { BrowserContext } from '@playwright/test'
    import { extraPages } from 'playwright-pageman'

    export class ConversationHelper {
      async openNewChatWindow(context: BrowserContext, userId: string) {
        const page = await context.newPage()
        extraPages.push(page) // Global access - no fixture passing!
        await this.loginAsUser(page, userId) // loginAsUser is defined elsewhere in this class
        return page
      }
    }
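A globally importable tracker can be pictured as a module-level list that teardown code drains after each test. The following is a sketch under that assumption, not PageMan's implementation: `extraPagesRegistry`, `drainExtraPages`, and `PageLike` are hypothetical names.

```typescript
interface PageLike {
  close(): Promise<void>
  isClosed(): boolean
}

// Module-level list shared by every helper that imports this file.
const pendingPages: PageLike[] = []

// Hypothetical stand-in for a globally importable tracker object.
const extraPagesRegistry = {
  push(page: PageLike): void {
    pendingPages.push(page)
  },
}

// Meant to be called once per test by teardown code: empties the list and
// returns how many pages were successfully closed.
async function drainExtraPages(): Promise<number> {
  const toClose = pendingPages.splice(0, pendingPages.length)
  let closedCount = 0
  for (const page of toClose) {
    if (page.isClosed()) continue
    try {
      await page.close()
      closedCount += 1
    } catch {
      // Ignore close failures so the remaining pages still get closed.
    }
  }
  return closedCount
}
```

A module-level list is workable here because Playwright Test runs each worker as its own process and runs the tests inside a worker one at a time, so two tests never push to the same list concurrently.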
This made PageMan even more seamless—no architectural changes needed, just better lifecycle management.
#The Open Source Journey
After seeing the transformation in our own testing workflows, I realized this was a universal problem. Every team doing complex end-to-end testing faces the same page management challenges:
- E-commerce sites: Testing both customer and admin interfaces
- Collaboration tools: Multiple users interacting simultaneously
- Real-time applications: Different viewport experiences
- Multi-tenant platforms: Various user role perspectives
So I open-sourced Playwright PageMan with:
- Zero configuration setup (just change your import)
- Auto-tracking enabled by default (handles 80% of cases automatically)
- Manual tracking for special cases (popups, context pages, etc.)
- Configurable timeouts and logging (for debugging edge cases)
- Full TypeScript support (because types prevent bugs)
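As an illustration of what a close timeout can mean in practice, the sketch below races a page's close() against a timer so one stuck page cannot stall cleanup. This is an assumption about the general technique, not a description of PageMan's internals: `closeWithTimeout` and `PageLike` are hypothetical names.

```typescript
interface PageLike {
  close(): Promise<void>
}

// Hypothetical helper: resolves true only when close() resolved before the
// deadline, and false when it rejected or the timer fired first.
async function closeWithTimeout(page: PageLike, timeoutMs: number): Promise<boolean> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const timedOut = new Promise<boolean>((resolve) => {
    timer = setTimeout(() => resolve(false), timeoutMs)
  })
  const closed = page.close().then(
    () => true,
    () => false,
  )
  try {
    return await Promise.race([closed, timedOut])
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    if (timer !== undefined) clearTimeout(timer)
  }
}
```

A short timeout trades thoroughness for speed, while a longer one gives slow pages a chance to shut down cleanly.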
#Configuration for Different Environments
PageMan adapts to your workflow:
    // For local development - see what's happening
    test.use({
      pageManOptions: {
        logCleanup: true, // Log cleanup actions
        closeTimeout: 5000, // Patient with slow pages
      },
    })

    // For CI - fast and silent
    test.use({
      pageManOptions: {
        logCleanup: false, // No noise in CI logs
        closeTimeout: 1000, // Aggressive timeouts
      },
    })

    // For debugging - manual control
    test.use({
      pageManOptions: {
        autoTrack: false, // Manual tracking only
      },
    })
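Rather than keeping separate config blocks per environment, the choice can be made by one small function. This is a sketch: the option names mirror the ones shown above, but `PageManOptionsSketch` and `pageManOptionsFor` are hypothetical names, not PageMan's published types.

```typescript
// Option names mirror the configuration shown above; the interface itself
// is an illustration, not a type exported by the library.
interface PageManOptionsSketch {
  logCleanup: boolean
  closeTimeout: number
  autoTrack: boolean
}

// Pick the quiet, fast profile for CI and the verbose, patient one locally.
function pageManOptionsFor(isCI: boolean): PageManOptionsSketch {
  return isCI
    ? { logCleanup: false, closeTimeout: 1000, autoTrack: true }
    : { logCleanup: true, closeTimeout: 5000, autoTrack: true }
}
```

It could then be wired in with something like `test.use({ pageManOptions: pageManOptionsFor(Boolean(process.env.CI)) })`, assuming your CI provider sets a `CI` environment variable.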
#The Community Response
Since going open source, PageMan has attracted teams with even more creative use cases:
- Visual regression testing: Comparing designs across multiple viewport sizes
- Accessibility testing: Testing screen readers in multiple windows simultaneously
- Load testing: Simulating multiple user sessions from a single test
- Cross-browser workflows: Coordinating actions between different browser instances
The common thread? Every team was fighting the same page lifecycle battle, and PageMan made it disappear.
#Looking Forward
Today, PageMan manages page lifecycles for thousands of tests across dozens of CI environments. What started as a weekend fix for our Jenkins memory crisis has become a tool that makes multi-page testing humane.
The lesson? Sometimes the best tools aren't about adding new capabilities—they're about removing friction from what you already know how to do.
If your CI builds are mysteriously slow, if your Jenkins instances keep running out of memory, if you're writing more cleanup code than test logic—you're not alone. And maybe, just maybe, your test runner is hosting an involuntary browser tab convention too.
#Get Started
Want to stop the page leak crisis in your tests? PageMan takes 30 seconds to install:
    npm install playwright-pageman

Then swap the import in your test files:

    // Replace this:
    import { test, expect } from '@playwright/test'

    // With this:
    import { test, expect } from 'playwright-pageman'

    // That's it! Auto-tracking is enabled by default.
    test('your test', async ({ browser }) => {
      const page1 = await browser.newPage() // Automatically tracked
      const page2 = await browser.newPage() // Also automatically tracked
      // Both pages auto-close after the test
    })
No more manual cleanup. No more leaked pages. No more Jenkins tab parties.
Sometimes the smartest thing you can do is let the computer handle the boring stuff, so you can focus on what actually matters—making sure your conversation works from both sides.
Fighting your own page leak crisis? Found PageMan useful for your multi-page testing scenarios? Share your experience or contribute to the project. Let's make testing cleaner, one auto-closed page at a time.