Commit f30d185

fixup title extraction across our use-cases (#62 improves #25)
2 parents 1e03587 + 95ae871 commit f30d185

11 files changed, +1753 -70 lines changed

.claude/commands/corpus-loop.md

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
+---
+argument-hint: [corpus_slug]
+description: uses Playwright MCP and the `corpus:view` to parse page elements
+---
+
+- using Playwright MCP, navigate to `http://localhost:3001/corpus/$1/gitcasso`
+- the page will have a div with id `gitcasso-comment-spots`, wait 500ms for it to settle
+- inside the `gitcasso-comment-spots` div you will see something like this:
+
+```json
+{
+  "url": "https://github.com/diffplug/selfie/issues/523",
+  "allTextAreas": [
+    {
+      "textarea": "id='feedback' name='feedback' className='form-control width-full mb-2'",
+      "spot": "NO_SPOT"
+    },
+    {
+      "textarea": "id=':rn:' name='' className='prc-Textarea-TextArea-13q4j overtype-input'",
+      "spot": {
+        "domain": "github.com",
+        "number": 523,
+        "slug": "diffplug/selfie",
+        "title": "TODO_TITLE",
+        "type": "GH_ISSUE_ADD_COMMENT",
+        "unique_key": "github.com:diffplug/selfie:523"
+      }
+    }
+  ]
+}
+```
+
+- this output means that this page is simulating the url `https://github.com/diffplug/selfie/issues/523`
+- every textarea on the page is represented
+- `NO_SPOT` means that the spot was not enhanced
+- `type: GH_ISSUE_ADD_COMMENT` means that it was enhanced by whichever implementation of `CommentEnhancer` returns the spot type `GH_ISSUE_ADD_COMMENT`
+- if you search for that string in `src/lib/enhancers` you will find the correct one
+- the `tryToEnhance` method returned a `CommentSpot`, and that whole data is splatted out above
+
+If you make a change to the code of the enhancer, you can click the button with id `gitcasso-rebuild-btn`. It will trigger a rebuild of the browser extension, and then refresh the page. You'll be able to see the effects of your change in the `gitcasso-comment-spots` div described above.
+
+## Common extraction workflow
+
+If you see `"title": "TODO_TITLE"` or similar hardcoded `TODO` values in the JSON output, this indicates the enhancer needs some kind of extraction implemented:
+
+1. **Find the enhancer**: Search for the `type` value (e.g., `GH_ISSUE_ADD_COMMENT`) in `src/lib/enhancers/`
+2. **Implement extraction**: Replace hardcoded title with DOM extraction:
+```javascript
+const title = document.querySelector('main h1')!.textContent.replace(/\s*#\d+$/, '').trim()
+```
+4. **Test with rebuild**: Click the 🔄 button to rebuild and verify the title appears correctly in the JSON
+
+## Extraction code style
+
+- Don't hedge your bets and write lots of fallback code or strings of `?.`. Have a specific piece of data you want to get, use non-null `!` assertions where necessary to be clear about getting.
+- If a field is empty, represent it with an empty string. Don't use placeholders when extracting data.
+- The pages we are scraping are going to change over time, and it's easier to fix broken ones if we know exactly what used to work. If the code has lots of branching paths, it's harder to tell what it was doing.
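
Reviewer sketch (not part of the commit): the title extraction in the workflow above is a replace followed by a trim. The string handling can be tried in isolation from the DOM, as below. `stripIssueNumber` is a hypothetical name, since the commit inlines the expression, and the sample headings are invented.

```typescript
// Hypothetical helper: the commit inlines this expression rather than naming it.
function stripIssueNumber(headingText: string): string {
  // Remove a trailing "#<digits>" plus any whitespace directly before it,
  // then trim what is left. Same order as the diff: replace first, trim second.
  return headingText.replace(/\s*#\d+$/, '').trim()
}

// Invented sample heading that ends exactly at the issue number.
const typical = stripIssueNumber('Fix flaky snapshot test #523')

// Whitespace after the number stops the `$`-anchored pattern from matching
// (no `m` flag), so only the trim applies and the number stays in the title.
const trailingNewline = stripIssueNumber('Fix flaky snapshot test #523\n')
```

Whether the real `main h1` text can end in whitespace depends on GitHub's markup, which this page does not show. If it can, trimming before the replace would be the more robust order.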

.gitignore

Lines changed: 1 addition & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -17,9 +17,5 @@ dist/
 .DS_Store
 Thumbs.db
 
-# playright
+# playwright
 .playwright-mcp/
-browser-extension/dist-playground/
-browser-extension/playwright-report/
-browser-extension/playwright/
-browser-extension/test-results/

CLAUDE.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,3 @@
-Please refer to `CONTRIBUTING.md` and `README.md`.
+Refer to `CONTRIBUTING.md` for the project's architecture and useful commands.
 
 Whenever you complete a task, if you wish some info had been provided to you ahead of time instead of figuring it out from scratch, you have permission to edit this `CLAUDE.md` to add any helpful context.

src/lib/enhancers/github/githubIssueAddComment.tsx

Lines changed: 6 additions & 3 deletions
@@ -42,7 +42,10 @@ export class GitHubIssueAddCommentEnhancer implements CommentEnhancer<GitHubIssu
     const slug = `${owner}/${repo}`
     const number = parseInt(numberStr!, 10)
     const unique_key = `github.com:${slug}:${number}`
-    const title = 'TODO_TITLE'
+    const title = document
+      .querySelector('main h1')!
+      .textContent.replace(/\s*#\d+$/, '')
+      .trim()
     return {
       domain: location.host,
       number,
@@ -77,7 +80,7 @@ export class GitHubIssueAddCommentEnhancer implements CommentEnhancer<GitHubIssu
     )
   }
 
-  tableTitle(_spot: GitHubIssueAddCommentSpot): string {
-    return 'TITLE_TODO'
+  tableTitle(spot: GitHubIssueAddCommentSpot): string {
+    return spot.title
   }
 }

src/lib/enhancers/github/githubIssueNewComment.tsx

Lines changed: 10 additions & 3 deletions
@@ -9,6 +9,7 @@ interface GitHubIssueNewCommentSpot extends CommentSpot {
   type: 'GH_ISSUE_NEW_COMMENT'
   domain: string
   slug: string // owner/repo
+  title: string
 }
 
 export class GitHubIssueNewCommentEnhancer implements CommentEnhancer<GitHubIssueNewCommentSpot> {
@@ -17,9 +18,12 @@ export class GitHubIssueNewCommentEnhancer implements CommentEnhancer<GitHubIssu
   }
 
   tryToEnhance(
-    _textarea: HTMLTextAreaElement,
+    textarea: HTMLTextAreaElement,
     location: StrippedLocation,
   ): GitHubIssueNewCommentSpot | null {
+    if (textarea.id === 'feedback') {
+      return null
+    }
     if (location.host !== 'github.com') {
       return null
     }
@@ -34,9 +38,12 @@ export class GitHubIssueNewCommentEnhancer implements CommentEnhancer<GitHubIssu
     const [, owner, repo] = match
     const slug = `${owner}/${repo}`
     const unique_key = `github.com:${slug}:new`
+    const titleInput = document.querySelector('input[placeholder="Title"]') as HTMLInputElement
+    const title = titleInput?.value || ''
     return {
       domain: location.host,
       slug,
+      title,
       type: 'GH_ISSUE_NEW_COMMENT',
       unique_key,
     }
@@ -62,8 +69,8 @@ export class GitHubIssueNewCommentEnhancer implements CommentEnhancer<GitHubIssu
     )
   }
 
-  tableTitle(_spot: GitHubIssueNewCommentSpot): string {
-    return 'New Issue'
+  tableTitle(spot: GitHubIssueNewCommentSpot): string {
+    return spot.title || 'New Issue'
   }
 
   buildUrl(spot: GitHubIssueNewCommentSpot): string {
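
Reviewer sketch (not part of the commit): the new `tableTitle` relies on `||` treating an empty title as missing, which fits the convention of storing an empty field as `''`. The minimal example below shows that behavior next to `??`, which would keep the empty string. The interface is a reduced stand-in for `GitHubIssueNewCommentSpot`, and `tableTitleNullish` exists only for contrast.

```typescript
// Reduced stand-in for the spot type: only the field this example needs.
interface NewIssueSpotSketch {
  title: string
}

// Mirrors the `tableTitle` body in the diff: an empty title falls back to a label.
function tableTitle(spot: NewIssueSpotSketch): string {
  return spot.title || 'New Issue'
}

// For contrast only: `??` falls back on null or undefined, not on ''.
function tableTitleNullish(spot: { title: string | null }): string {
  return spot.title ?? 'New Issue'
}

const emptyTitle = tableTitle({ title: '' }) // falls back to the label
const filledTitle = tableTitle({ title: 'Crash on paste' }) // invented sample title
const emptyWithNullish = tableTitleNullish({ title: '' }) // keeps the empty string
```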

src/lib/enhancers/github/githubPRAddComment.tsx

Lines changed: 6 additions & 3 deletions
@@ -38,7 +38,10 @@ export class GitHubPRAddCommentEnhancer implements CommentEnhancer<GitHubPRAddCo
     const slug = `${owner}/${repo}`
     const number = parseInt(numberStr!, 10)
     const unique_key = `github.com:${slug}:${number}`
-    const title = 'TODO_TITLE'
+    const title = document
+      .querySelector('main h1')!
+      .textContent.replace(/\s*#\d+$/, '')
+      .trim()
     return {
       domain: location.host,
       number,
@@ -70,7 +73,7 @@ export class GitHubPRAddCommentEnhancer implements CommentEnhancer<GitHubPRAddCo
     )
   }
 
-  tableTitle(_spot: GitHubPRAddCommentSpot): string {
-    return 'TITLE_TODO'
+  tableTitle(spot: GitHubPRAddCommentSpot): string {
+    return spot.title
   }
 }

src/lib/enhancers/github/githubPRNewComment.tsx

Lines changed: 16 additions & 7 deletions
@@ -8,7 +8,10 @@ import { prepareGitHubHighlighter } from './githubHighlighter'
 interface GitHubPRNewCommentSpot extends CommentSpot {
   type: 'GH_PR_NEW_COMMENT'
   domain: string
-  slug: string // owner/repo/base-branch/compare-branch
+  slug: string // owner/repo
+  title: string
+  head: string // `user:repo:branch` where changes are implemented
+  base: string // branch you want changes pulled into
 }
 
 export class GitHubPRNewCommentEnhancer implements CommentEnhancer<GitHubPRNewCommentSpot> {
@@ -38,13 +41,19 @@ export class GitHubPRNewCommentEnhancer implements CommentEnhancer<GitHubPRNewCo
 
     if (!match) return null
     const [, owner, repo, baseBranch, compareBranch] = match
-    const slug = baseBranch
-      ? `${owner}/${repo}/${baseBranch}...${compareBranch}`
-      : `${owner}/${repo}/${compareBranch}`
-    const unique_key = `github.com:${slug}`
+    const slug = `${owner}/${repo}`
+    const base = baseBranch || 'main'
+    const head = compareBranch!
+    const unique_key = `github.com:${slug}:${base}...${head}`
+    const titleInput = document.querySelector('input[placeholder="Title"]') as HTMLInputElement
+    const title = titleInput!.value
+
     return {
+      base,
       domain: location.host,
+      head,
       slug,
+      title,
       type: 'GH_PR_NEW_COMMENT',
       unique_key,
     }
@@ -70,8 +79,8 @@ export class GitHubPRNewCommentEnhancer implements CommentEnhancer<GitHubPRNewCo
     )
   }
 
-  tableTitle(_spot: GitHubPRNewCommentSpot): string {
-    return 'TITLE_TODO'
+  tableTitle(spot: GitHubPRNewCommentSpot): string {
+    return spot.title || 'New Pull Request'
   }
 
   buildUrl(spot: GitHubPRNewCommentSpot): string {
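
Reviewer sketch (not part of the commit): after this change `slug` is always `owner/repo`, and the branches move into `base`, `head`, and the `unique_key`. The standalone function below mirrors those lines so the resulting keys are easy to inspect. `buildPrSpotKey` and the branch names are hypothetical, and the URL regex that produces `match` is not shown in the diff, so it is left out here.

```typescript
// Hypothetical standalone version of the key-building lines in `tryToEnhance`.
// `baseBranch` is optional because the URL match may not capture it.
function buildPrSpotKey(
  owner: string,
  repo: string,
  baseBranch: string | undefined,
  compareBranch: string,
): { slug: string; base: string; head: string; unique_key: string } {
  const slug = `${owner}/${repo}`
  const base = baseBranch || 'main' // same default as the diff
  const head = compareBranch
  const unique_key = `github.com:${slug}:${base}...${head}`
  return { slug, base, head, unique_key }
}

// Invented branch names, used only to show the two shapes of key.
const withBase = buildPrSpotKey('diffplug', 'gitcasso', 'release', 'feature/titles')
const withoutBase = buildPrSpotKey('diffplug', 'gitcasso', undefined, 'feature/titles')
```

The `'main'` default is taken from the diff as written. A repository whose default branch has a different name would get a key that names `main` anyway.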

tests/corpus-view.ts

Lines changed: 11 additions & 14 deletions
@@ -477,28 +477,28 @@ function createCommentSpotDisplayScript(urlParts: ReturnType<typeof getUrlParts>
 
     function updateCommentSpotDisplay() {
       const textareas = document.querySelectorAll('textarea');
-      const spotsFound = [];
+      const allTextAreas = [];
 
       for (const textarea of textareas) {
-        const forValue = 'id=' + textarea.id + ' name=' + textarea.name + ' className=' + textarea.className;
+        const forValue = "id='" + textarea.id + "' name='" + textarea.name + "' className='" + textarea.className + "'";
         const enhancedItem = window.gitcassoTextareaRegistry ? window.gitcassoTextareaRegistry.get(textarea) : undefined;
         if (enhancedItem) {
-          spotsFound.push({
-            for: forValue,
+          allTextAreas.push({
+            textarea: forValue,
             spot: enhancedItem.spot,
-            title: enhancedItem.enhancer.tableTitle(enhancedItem.spot),
           });
         } else {
-          spotsFound.push({
-            for: forValue,
+          allTextAreas.push({
+            textarea: forValue,
             spot: 'NO_SPOT',
           });
         }
       }
-
-      console.log('Enhanced textareas:', spotsFound.filter(s => s.spot !== 'NO_SPOT').length);
-      console.log('All textareas on page:', textareas.length);
-      commentSpotDisplay.innerHTML = '<div style="' + styles.header + '"><pre>${urlParts.href}\\n' + JSON.stringify(spotsFound, null, 2) + '</pre></div>';
+      const harness = {
+        url: '${urlParts.href}',
+        allTextAreas: allTextAreas
+      }
+      commentSpotDisplay.innerHTML = '<div style="' + styles.header + '"><pre>' + JSON.stringify(harness, null, 1) + '</pre></div>';
     }
 
     // Initial update
@@ -508,9 +508,6 @@ function createCommentSpotDisplayScript(urlParts: ReturnType<typeof getUrlParts>
     setTimeout(updateCommentSpotDisplay, 400);
     setTimeout(updateCommentSpotDisplay, 800);
 
-    // Update display periodically
-    setInterval(updateCommentSpotDisplay, 2000);
-
     document.body.appendChild(commentSpotDisplay);
   `
 }
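
Reviewer sketch (not part of the commit): the display now emits a single JSON object with `url` and `allTextAreas`, so a consumer can parse it in one step. The types below are an assumed shape inferred from the sample in `corpus-loop.md`, not types exported by the project, and `enhancedSpots` is a hypothetical helper.

```typescript
// Assumed shape of the harness JSON, inferred from the sample in corpus-loop.md.
type SpotSketch = { type: string; unique_key: string; [field: string]: unknown }

interface HarnessEntry {
  textarea: string
  spot: 'NO_SPOT' | SpotSketch
}

interface Harness {
  url: string
  allTextAreas: HarnessEntry[]
}

// Hypothetical helper: return only the spots that some enhancer claimed.
function enhancedSpots(harnessJson: string): SpotSketch[] {
  const harness = JSON.parse(harnessJson) as Harness
  const spots: SpotSketch[] = []
  for (const entry of harness.allTextAreas) {
    if (entry.spot !== 'NO_SPOT') spots.push(entry.spot)
  }
  return spots
}

// Sample data copied from the JSON example in corpus-loop.md.
const sampleJson = JSON.stringify({
  url: 'https://github.com/diffplug/selfie/issues/523',
  allTextAreas: [
    {
      textarea: "id='feedback' name='feedback' className='form-control width-full mb-2'",
      spot: 'NO_SPOT',
    },
    {
      textarea: "id=':rn:' name='' className='prc-Textarea-TextArea-13q4j overtype-input'",
      spot: {
        domain: 'github.com',
        number: 523,
        slug: 'diffplug/selfie',
        title: 'TODO_TITLE',
        type: 'GH_ISSUE_ADD_COMMENT',
        unique_key: 'github.com:diffplug/selfie:523',
      },
    },
  ],
})

const claimed = enhancedSpots(sampleJson)
```

A check such as "no claimed spot still has a `TODO` title" could be built on top of `claimed`, which is the condition the extraction workflow asks the reader to look for by eye.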

tests/corpus/_corpus-index.ts

Lines changed: 5 additions & 0 deletions
@@ -17,6 +17,11 @@ export const CORPUS: Record<string, CorpusEntry> = {
     type: 'html',
     url: 'https://github.com/diffplug/gitcasso/issues/56',
   },
+  gh_issue_new_populated: {
+    description: 'a new issue wiht some fields filled out',
+    type: 'html',
+    url: 'https://github.com/diffplug/gitcasso/issues/new',
+  },
   gh_issue_populated_comment: {
     description: 'comment text box has some text',
     type: 'html',

tests/corpus/gh_issue_new_populated.html

Lines changed: 1586 additions & 0 deletions
Large diffs are not rendered by default.
