6 changes: 6 additions & 0 deletions Bouncer/.gitignore
@@ -2,3 +2,9 @@ node_modules/
dist/*
# Generated by generate-manifests.mjs from manifest.base.json + manifest.<target>.json
/manifest.json

# Playwright E2E: live login sessions (X cookies + extension Firebase auth) and reports
e2e/.userdata/
playwright-report/
test-results/
.auth/
80 changes: 80 additions & 0 deletions Bouncer/e2e/README.md
@@ -0,0 +1,80 @@
# E2E tests (Playwright + real X)

These drive the **built, unpacked extension** in a real Chromium against the
live `x.com`. Playwright can load Chrome extensions only in a *persistent*
context, so all auth lives in one Chrome profile directory (`e2e/.userdata`,
gitignored):

- the **X website** session (cookies), and
- the **extension's Firebase** session (Firebase persists auth in the
extension's IndexedDB, which lives inside the profile — it is *not* captured by
Playwright's `storageState`).

Log in once; every later run reuses that profile.
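
The fixture that opens this profile is not part of this README. As a rough sketch only, the inputs it would hand to Playwright's `chromium.launchPersistentContext()` could be assembled as below. `persistentLaunchInputs` is a hypothetical helper, and the `dist` path in the usage line is an assumption about where the build lands; the two Chromium flags are the standard ones for loading an unpacked extension.

```typescript
// Hypothetical helper, not part of the repo: it only assembles values that a
// fixture could pass to chromium.launchPersistentContext(userDataDir, { args }).
interface PersistentLaunchInputs {
  userDataDir: string; // profile directory holding X cookies and Firebase IndexedDB
  args: string[]; // Chromium flags that load the unpacked extension
}

function persistentLaunchInputs(repoRoot: string, extensionDir: string): PersistentLaunchInputs {
  const root = repoRoot.replace(/\/+$/, ''); // drop trailing slashes
  return {
    userDataDir: `${root}/e2e/.userdata`,
    args: [
      `--disable-extensions-except=${extensionDir}`,
      `--load-extension=${extensionDir}`,
    ],
  };
}

// The extension directory here is an assumed build output location.
const inputs = persistentLaunchInputs('/repo/Bouncer/', '/repo/Bouncer/dist');
console.log(inputs.userDataDir); // /repo/Bouncer/e2e/.userdata
```

Keeping the profile path in one place makes it harder for the login script and the test fixture to drift onto different directories.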

## Prerequisites

1. A **dedicated throwaway X account** (don't automate your personal account).
2. `.env.prod` populated with the Imbue keys (`FIREBASE_*`, `GOOGLE_CLIENT_ID`,
`IMBUE_WS_URL`). Without them the build stubs out auth and there's no Google
sign-in button. (Already set in this repo.)
3. Install the browser binary once: `npx playwright install chromium`.

## One-time login

```bash
npm run test:e2e:login
```

A real browser window opens on x.com. **By hand:** (1) log in to X, then (2) in
the feed click the Bouncer **Settings** gear → *Activate Bouncer* and complete
Google sign-in. The in-feed settings load the same UI as the toolbar popup, so
there's no separate popup tab. Then press ▶ Resume in the Playwright Inspector
(or close it). The profile is saved to `e2e/.userdata`.

## Run the suite

```bash
npm run test:e2e # headed (a browser window is visible)
npm run test:e2e:headless # no window — uses Chromium's new headless mode
```

Both scripts run `npm run build` first so the loaded extension is current.

### Headless

`HEADLESS=1` runs the suite with no visible window. Extensions do **not** load in
Playwright's old headless or the bundled `headless-shell`, so the fixture switches
to `channel: 'chromium'` (the full build) with `headless: true`, which uses
Chromium's *new* headless mode — the only headless variant that supports
extensions. The one-time `test:e2e:login` must stay **headed** (you log in by
hand); don't set `HEADLESS` for it.
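
One way to express that switch, as a sketch: the fixture itself is not shown, so `browserMode` and its return shape are hypothetical, and treating `HEADLESS=1` as the only value that enables headless is an assumption.

```typescript
// Hypothetical sketch of the HEADLESS switch described above.
interface BrowserMode {
  headless: boolean;
  channel?: 'chromium'; // full Chromium build, needed for extensions in headless
}

function browserMode(env: Record<string, string | undefined>): BrowserMode {
  if (env.HEADLESS === '1') {
    // New headless mode: the full build plus headless: true.
    return { headless: true, channel: 'chromium' };
  }
  // Default: a visible window, which the one-time login needs.
  return { headless: false };
}

console.log(browserMode({ HEADLESS: '1' }));
console.log(browserMode({}));
```

The result can be spread into the options object passed to the persistent-context launch.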

## When tests fail with "filter bar not visible"

The saved session probably expired (X rotates cookies; Firebase tokens lapse).
Just re-run `npm run test:e2e:login`.
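
To tell an expired X session apart from a genuine injection bug, a test could inspect where the page ended up. The sketch below is an assumption-heavy illustration: `looksLoggedOut` is a hypothetical helper, and the path prefixes are guesses about where X sends logged-out visitors, not something this repo defines.

```typescript
// Hypothetical check. The prefixes are assumptions about X's logged-out
// redirects and may need adjusting.
const LOGGED_OUT_PREFIXES = ['/login', '/i/flow/login'];

function looksLoggedOut(currentUrl: string): boolean {
  const { pathname } = new URL(currentUrl);
  return LOGGED_OUT_PREFIXES.some(
    (prefix) => pathname === prefix || pathname.startsWith(`${prefix}/`)
  );
}

console.log(looksLoggedOut('https://x.com/home')); // false
```

A fixture could call this with `page.url()` and fail with a "re-run the login script" message instead of a generic timeout. It covers only the X session; an expired Firebase session would need a separate check.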

## What's covered

- `x-feed.spec.ts` — content script injects on the feed; a typed phrase persists.
- `filter-management.spec.ts` — comma-key commit, persistence across reload, no
duplicates, chip removal, the in-feed settings modal, and filter-bar
re-injection after SPA navigation.
- `popup-settings.spec.ts` — "filter replies" toggle, experimental AI-text
toggle + section reveal, threshold slider, and BYOK Anthropic key enable
(valid + invalid, with the verification request mocked via `context.route`).
- `filtering.spec.ts` — the end-to-end filtering behavior: seeds the Anthropic
BYOK path and mocks `api.anthropic.com` to mark every post as a match, then
asserts a post is hidden (`data-filtered-by-extension`), the "View filtered"
counter advances, and the filtered-posts modal lists it. The classification
fetch runs in the background service worker — `context.route` intercepts it.
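
To make the mocked classifier concrete, here is a sketch of the response body that `filtering.spec.ts` fulfills with, plus a parser that follows the rule its comments attribute to `parseAPIResponse()`. `mockClassifierBody` and `parseTagged` are hypothetical names. The extension's real parser is not shown here and may differ; in particular, the case-insensitive comparison and treating a missing category as "no match" are assumptions.

```typescript
// Builds a Messages-style body like the one the spec returns from its route.
function mockClassifierBody(category: string, reasoning: string): string {
  return JSON.stringify({
    content: [
      {
        type: 'text',
        text: `<reasoning>${reasoning}</reasoning><category>${category}</category>`,
      },
    ],
  });
}

interface Verdict {
  matched: boolean;
  category: string;
  reasoning: string;
}

// Hypothetical stand-in for the extension's parser: any category other than
// "no match" or "unknown" counts as a hit.
function parseTagged(body: string): Verdict {
  const parsed = JSON.parse(body) as { content?: Array<{ type: string; text?: string }> };
  const text = (parsed.content ?? [])
    .filter((block) => block.type === 'text')
    .map((block) => block.text ?? '')
    .join('');
  const tag = (name: string): string => {
    const m = text.match(new RegExp(`<${name}>([\\s\\S]*?)</${name}>`));
    return (m?.[1] ?? '').trim();
  };
  const category = tag('category');
  const normalized = category.toLowerCase();
  const matched = normalized !== '' && normalized !== 'no match' && normalized !== 'unknown';
  return { matched, category, reasoning: tag('reasoning') };
}

const verdict = parseTagged(mockClassifierBody('crypto', 'Promotes cryptocurrency.'));
console.log(verdict.matched, verdict.reasoning); // true Promotes cryptocurrency.
```

Returning `no match` from the same builder would give a negative-path test (nothing hidden) without touching the rest of the setup.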

## Notes / limits

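- `workers` is pinned to 1: every test reuses the single `e2e/.userdata`
profile, and Chromium locks a profile directory to one running browser, so
persistent contexts on it can't run in parallel.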
- The filtering test mocks the classifier, so it proves the *pipeline + hide +
View-filtered* path is wired correctly — not that real AI inference is
accurate. It deliberately matches every post for determinism.
- This is **local-only**. Running against real X from GitHub CI is possible but
flaky (datacenter IPs get challenged, cookies expire); add it as a nightly /
manual workflow later, not a PR gate.
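
For reference, the single-worker limit above corresponds to a config shape along these lines. This is a sketch, not the repo's actual `playwright.config.ts`: `e2eConfig` and the `testDir` value are assumptions, and in a real config the object would typically be passed to `defineConfig` from `@playwright/test`.

```typescript
// Assumed values; only `workers: 1` is stated by this README.
const e2eConfig = {
  testDir: './e2e', // assumption: specs live next to this README
  workers: 1, // one profile directory, so one browser at a time
  fullyParallel: false, // keep tests within a file sequential as well
};

console.log(e2eConfig.workers); // 1
```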
100 changes: 100 additions & 0 deletions Bouncer/e2e/filter-management.spec.ts
@@ -0,0 +1,100 @@
// FILTER_INPUT was imported here but never used in this file.
import { test, expect, readStorage, openFeed, waitForFilterBar } from './fixtures';

const DESCRIPTIONS_KEY = 'descriptions_twitter';

async function descriptions(context: Parameters<typeof readStorage>[0]): Promise<string[]> {
  const all = await readStorage(context);
  return (all[DESCRIPTIONS_KEY] as string[] | undefined) ?? [];
}

test.describe('Filter management on the real X feed', () => {
  test.beforeEach(async ({ context }) => {
    let [sw] = context.serviceWorkers();
    if (!sw) sw = await context.waitForEvent('serviceworker');
    await sw.evaluate((key) => chrome.storage.local.remove(key), DESCRIPTIONS_KEY);
  });

  test('comma key commits a phrase', async ({ context }) => {
    const page = await openFeed(context);
    const input = await waitForFilterBar(page);

    await input.click();
    await input.fill('crypto');
    await input.press(','); // the input commits on Enter OR comma

    await expect.poll(() => descriptions(context), { timeout: 10_000 }).toContain('crypto');
  });

  test('phrases persist across a page reload', async ({ context }) => {
    const page = await openFeed(context);
    const input = await waitForFilterBar(page);
    await input.click();
    await input.fill('engagement bait');
    await input.press('Enter');
    await expect.poll(() => descriptions(context), { timeout: 10_000 }).toContain('engagement bait');

    await page.reload({ waitUntil: 'domcontentloaded' });

    // The phrase is re-rendered as a chip from storage after reload.
    await expect(
      page.locator('.filter-phrase-inline:visible', { hasText: 'engagement bait' }).first()
    ).toBeVisible({ timeout: 15_000 });
    expect(await descriptions(context)).toContain('engagement bait');
  });

  test('the same phrase is not added twice', async ({ context }) => {
    const page = await openFeed(context);
    const input = await waitForFilterBar(page);

    for (let i = 0; i < 2; i++) {
      await input.click();
      await input.fill('politics');
      await input.press('Enter');
      await expect.poll(() => descriptions(context), { timeout: 10_000 }).toContain('politics');
    }

    expect((await descriptions(context)).filter((d) => d === 'politics')).toHaveLength(1);
  });

  test('clicking a phrase chip removes it', async ({ context }) => {
    const page = await openFeed(context);
    const input = await waitForFilterBar(page);
    await input.click();
    await input.fill('sports');
    await input.press('Enter');
    await expect.poll(() => descriptions(context), { timeout: 10_000 }).toContain('sports');

    const chip = page.locator('.filter-phrase-inline:visible', { hasText: 'sports' }).first();
    await expect(chip).toBeVisible();
    await chip.click(); // chips are titled "Click to remove"

    await expect.poll(() => descriptions(context), { timeout: 10_000 }).not.toContain('sports');
  });

  test('the settings gear opens the in-feed settings modal', async ({ context }) => {
    const page = await openFeed(context);
    await waitForFilterBar(page);

    await page.locator('.filter-settings-btn:visible').first().click();

    const modal = page.locator('.settings-modal-overlay');
    await expect(modal).toBeVisible({ timeout: 10_000 });
    // The modal hosts the same popup UI in an iframe.
    const iframe = modal.locator('iframe.settings-modal-iframe');
    await expect(iframe).toHaveAttribute('src', /popup\.html$/);
  });

  test('the filter bar re-injects after SPA navigation', async ({ context }) => {
    const page = await openFeed(context);
    await waitForFilterBar(page);

    // Navigate away within the SPA, then back; the MutationObserver should
    // re-inject the bar without a full page load.
    await page.locator('a[data-testid="AppTabBar_Explore_Link"], a[href="/explore"]').first().click();
    await page.waitForURL('**/explore', { timeout: 15_000 });
    await page.locator('a[data-testid="AppTabBar_Home_Link"], a[href="/home"]').first().click();
    await page.waitForURL('**/home', { timeout: 15_000 });

    await waitForFilterBar(page);
  });
});
91 changes: 91 additions & 0 deletions Bouncer/e2e/filtering.spec.ts
@@ -0,0 +1,91 @@
import { test, expect, openFeed } from './fixtures';

/**
 * Tier 2: the actual filtering behavior. We point the classification pipeline
 * at the Anthropic BYOK path (selectedModel = "anthropic:…" + a stored key) and
 * mock that HTTP endpoint to mark EVERY post as a match. That makes the outcome
 * deterministic without live AI inference or a real key.
 *
 * Note: the classification fetch is made by the extension's *background service
 * worker*, so this also exercises Playwright's service-worker route interception.
 */

const SEED = {
  selectedModel: 'anthropic:claude-haiku-4-5-20251001',
  anthropicApiKey: 'sk-ant-e2e-test',
  descriptions_twitter: ['crypto'],
  filterReplies: true,
};

async function setStorage(context: Parameters<typeof openFeed>[0], items: Record<string, unknown>) {
  let [sw] = context.serviceWorkers();
  if (!sw) sw = await context.waitForEvent('serviceworker');
  await sw.evaluate((i) => chrome.storage.local.set(i), items);
}

test.describe('Filtering behavior (mocked Anthropic BYOK)', () => {
  test.afterEach(async ({ context }) => {
    let [sw] = context.serviceWorkers();
    if (!sw) sw = await context.waitForEvent('serviceworker');
    await sw.evaluate(() =>
      chrome.storage.local.remove(['selectedModel', 'anthropicApiKey', 'descriptions_twitter'])
    );
  });

  test('a matching post is hidden and appears in "View filtered" with reasoning', async ({
    context,
  }) => {
    let intercepted = 0;
    await context.route('https://api.anthropic.com/**', (route) => {
      intercepted++;
      // parseAPIResponse() treats any <category> other than "no match"/"unknown"
      // as a hit, and surfaces <reasoning> in the filtered-posts view.
      return route.fulfill({
        status: 200,
        contentType: 'application/json',
        body: JSON.stringify({
          content: [
            {
              type: 'text',
              text: '<reasoning>Promotes cryptocurrency.</reasoning><category>crypto</category>',
            },
          ],
        }),
      });
    });

    await setStorage(context, SEED);
    const page = await openFeed(context);

    // Wait for the live feed to render real tweets. Use 'attached', not
    // 'visible': the mock matches every post, so the extension may have already
    // hidden (display:none) them by the time this runs.
    await page.locator('article[data-testid="tweet"]').first().waitFor({
      state: 'attached',
      timeout: 20_000,
    });

    // Hidden posts are marked on their cell container by the Twitter adapter.
    await expect
      .poll(
        () => page.locator('[data-testid="cellInnerDiv"][data-filtered-by-extension="true"]').count(),
        { timeout: 45_000, message: 'no post was hidden (was the SW fetch intercepted?)' }
      )
      .toBeGreaterThan(0);

    // Confirms the mock actually intercepted the background service worker's fetch.
    expect(intercepted).toBeGreaterThan(0);

    // The in-feed "View filtered (N)" counter should advance past zero.
    await expect
      .poll(async () => (await page.locator('.filtered-toggle-count:visible').first().textContent()) ?? '', {
        timeout: 10_000,
      })
      .not.toBe('(0)');

    // Opening the modal lists the filtered post(s); reasoning/category flow through.
    await page.locator('.filtered-toggle-btn:visible').first().click();
    await expect(page.locator('.filtered-view-container')).toBeVisible({ timeout: 10_000 });
    await expect(page.locator('.slop-post-wrapper').first()).toBeVisible();
  });
});