
feat: bulk connections export tools #170

Open
Desperado wants to merge 13 commits into stickerdaniel:main from Desperado:feature/bulk-connections-export

Conversation

Desperado commented Feb 28, 2026

Summary

Adds two new MCP tools for bulk LinkedIn connections export:

get_my_connections

Collects connection usernames via infinite scroll on the connections page. Configurable limit and max_scrolls. Returns {username, name, headline} for each connection.
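For illustration, a successful call might return a shape like the following sketch. The values and the exact envelope are invented here; the summary above only guarantees {username, name, headline} per connection:

```python
# Hypothetical return shape of get_my_connections(limit=2); all values are invented.
result = {
    "connections": [
        {"username": "jdoe", "name": "Jane Doe", "headline": "Data Engineer at Example"},
        {"username": "asmith", "name": "Alan Smith", "headline": "Product Manager"},
    ],
    "total": 2,
}

# Each connection record carries the three documented fields.
for c in result["connections"]:
    print(c["username"])
```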

extract_contact_details

Enriches profiles with structured contact data by scraping the main profile page and contact info overlay. Returns parsed fields instead of raw text:

Field sources:

  • first_name, last_name: profile page (first line)
  • headline: profile page (after the connection degree marker)
  • location: profile page (after the headline)
  • company: profile page (after the "Contact info" label)
  • email, phone, website, birthday: contact info overlay (labeled sections)
  • profile_raw, contact_info_raw: original innerText, kept as fallback
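Illustratively, one enriched record might look like this sketch (field names follow the table above; the values are invented, and fields a profile doesn't expose come back as None):

```python
# Hypothetical example of a single record returned by extract_contact_details.
record = {
    "first_name": "Ada",
    "last_name": "Lovelace",
    "headline": "Software Engineer at Example Corp",
    "location": "London, United Kingdom",
    "company": "Example Corp",
    "email": "ada@example.com",
    "phone": None,       # not every profile exposes every field
    "website": None,
    "birthday": None,
    "profile_raw": "Ada Lovelace\n...",            # original innerText kept as fallback
    "contact_info_raw": "Email\nada@example.com",  # overlay text kept as fallback
}
print(record["first_name"], record["email"])
```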

Rate-limit handling: Processes profiles in chunked batches with configurable chunk_size (default 5) and chunk_delay (default 30s). Stops early on hard rate limit, returns partial results with rate_limited flag. Individual page loads retry once after 5s backoff on soft rate limits (empty-content responses).
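The chunking scheme described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: fetch_profile, RateLimitError, and the return keys are placeholders standing in for the real extractor API.

```python
import asyncio

class RateLimitError(Exception):
    """Placeholder for a hard rate-limit signal."""

async def fetch_profile(username: str) -> str:
    """Stand-in for a real page load; returns profile text (empty on a soft limit)."""
    return f"profile text for {username}"

async def process_in_chunks(usernames, chunk_size=5, chunk_delay=30.0):
    """Chunked batch processing: soft limits retry once, hard limits stop early."""
    contacts, failed, rate_limited = [], [], False
    for start in range(0, len(usernames), chunk_size):
        for username in usernames[start:start + chunk_size]:
            try:
                text = await fetch_profile(username)
                if not text:  # soft rate limit: one retry after a short backoff
                    await asyncio.sleep(5)
                    text = await fetch_profile(username)
                if not text:
                    failed.append(username)
                    continue
                contacts.append({"username": username, "text": text})
            except RateLimitError:  # hard limit: record the username and stop
                failed.append(username)
                rate_limited = True
                break
        if rate_limited:
            break
        if start + chunk_size < len(usernames):
            await asyncio.sleep(chunk_delay)  # pause between chunks
    return {"contacts": contacts, "failed": failed, "rate_limited": rate_limited}

out = asyncio.run(process_in_chunks(["u1", "u2", "u3"], chunk_size=2, chunk_delay=0))
print(len(out["contacts"]), out["rate_limited"])  # → 3 False
```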

Files changed

  • linkedin_mcp_server/tools/connections.py — New tool module (follows tools/person.py pattern)
  • linkedin_mcp_server/scraping/extractor.py — Added scrape_connections_list(), scrape_contact_batch(), and _parse_contact_record() parser
  • linkedin_mcp_server/server.py — Registered register_connections_tools(mcp)

Test plan

  • All 105 existing tests pass
  • ruff format, ruff check, ty check all clean
  • get_my_connections with limit=10 returns 10 valid usernames
  • extract_contact_details with 3 usernames returns structured fields (emails found for 2/3)
  • No rate limiting triggered during testing
  • Test with larger batch (50+) to verify chunking and inter-chunk delays

🤖 Generated with Claude Code

Greptile Summary

This PR adds two new MCP tools for bulk LinkedIn connections export: get_my_connections (infinite scroll collection of connection usernames) and extract_contact_details (batch enrichment with structured contact data). The implementation follows existing code patterns and includes thoughtful rate-limit handling with configurable chunking and delays.

Key changes:

  • New connections.py module with two tools following the existing tool registration pattern
  • Bulk export methods in extractor.py with chunked batch processing and soft/hard rate-limit handling
  • Structured contact parser _parse_contact_record that extracts fields like email, phone, location from raw profile text
  • ERR_ABORTED navigation error handling for timing edge cases
  • Network degree filter (1st/2nd/3rd+) added to search_people
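The structured parser is only described at a high level above. A simplified line-oriented sketch of the idea (not the PR's actual _parse_contact_record) might look like:

```python
def parse_contact_record(profile_text: str, contact_text: str) -> dict:
    """Toy line-based parser: first profile line -> name; overlay label lines
    ("Email" / "Phone") are followed by their value on the next line."""
    lines = [l.strip() for l in profile_text.splitlines() if l.strip()]
    first = lines[0].split() if lines else []
    record = {
        "first_name": first[0] if first else None,
        "last_name": " ".join(first[1:]) if len(first) > 1 else None,
        "email": None,
        "phone": None,
        "profile_raw": profile_text,  # raw text kept as fallback
    }
    overlay = [l.strip() for l in contact_text.splitlines() if l.strip()]
    for label, value in zip(overlay, overlay[1:]):
        key = label.lower()
        if key in ("email", "phone") and record[key] is None:
            record[key] = value
    return record

rec = parse_contact_record("Ada Lovelace\nEngineer", "Email\nada@example.com\nPhone\n+44 123")
print(rec["first_name"], rec["email"], rec["phone"])  # → Ada ada@example.com +44 123
```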

Code quality: The implementation is well-structured with proper error handling, progress reporting, deduplication, and defensive programming. Most issues from previous review rounds have been addressed.

Minor issue: Inconsistent ERR_ABORTED handling between initial navigation and re-navigation (line 559) could cause crashes in rare edge cases where the page navigates away during scroll and re-navigation encounters the same timing issue.

Confidence Score: 4/5

  • This PR is safe to merge with low risk - well-tested implementation following established patterns
  • Score reflects thorough testing (105 tests passing), clean code following existing patterns, and comprehensive rate-limit handling. Deducting 1 point for the ERR_ABORTED inconsistency that could cause failures in rare edge cases where page navigation timing issues occur during re-navigation.
  • Pay attention to linkedin_mcp_server/scraping/extractor.py - verify the ERR_ABORTED handling inconsistency at line 559 won't impact production usage

Important Files Changed

  • linkedin_mcp_server/tools/connections.py: New file adding two MCP tools for bulk connection export: get_my_connections for scraping the connection list and extract_contact_details for enriching profiles with contact data. Follows existing tool patterns and includes proper error handling and progress reporting.
  • linkedin_mcp_server/scraping/extractor.py: Adds bulk export methods scrape_connections_list and scrape_contact_batch with chunked rate-limit handling, plus the _parse_contact_record parser for structured field extraction. Includes ERR_ABORTED navigation handling and soft rate-limit sentinels. Small network filter added to search_people.
  • linkedin_mcp_server/server.py: Simple registration of the new connections tools module; follows the existing pattern for tool registration with no changes to other logic.
  • linkedin_mcp_server/tools/person.py: Adds an optional network parameter to search_people for filtering by connection degree (1st, 2nd, 3rd+); minimal, focused change with proper parameter passthrough.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Tool as connections.py<br/>(extract_contact_details)
    participant Extractor as extractor.py<br/>(scrape_contact_batch)
    participant Browser as LinkedIn Pages
    
    Client->>Tool: extract_contact_details(usernames, chunk_size, chunk_delay)
    Tool->>Tool: Parse & deduplicate usernames
    Tool->>Extractor: scrape_contact_batch(usernames, chunk_size, chunk_delay)
    
    loop For each chunk
        loop For each username in chunk
            Extractor->>Browser: Navigate to profile page
            Browser-->>Extractor: profile_text
            
            alt Soft rate limit (empty content)
                Extractor->>Extractor: Check for _RATE_LIMITED_MSG sentinel
                Extractor->>Extractor: Skip username, add to failed[]
            else Success
                Extractor->>Browser: Navigate to contact info overlay
                Browser-->>Extractor: contact_text
                Extractor->>Extractor: _parse_contact_record(profile, contact)
                Extractor->>Extractor: Add to contacts[]
            end
            
            alt Hard rate limit (RateLimitError)
                Extractor->>Extractor: Add to failed[], set rate_limited=true
                Extractor->>Extractor: Break loop
            end
        end
        
        Extractor->>Tool: Report progress
        
        alt Not last chunk
            Extractor->>Extractor: Sleep(chunk_delay)
        end
    end
    
    Extractor-->>Tool: {contacts[], total, failed[], rate_limited, pages_visited[]}
    Tool-->>Client: Return enriched data

Last reviewed commit: b9add13

greptile-apps bot (Contributor) left a comment

4 files reviewed, 20 comments


Comment on lines +560 to +570
// Headline: try known selectors, then parse card text
let headline = '';
if (card) {
    const headlineEl = card.querySelector(
        '.mn-connection-card__occupation, .entity-result__primary-subtitle, span.t-normal'
    );
    if (headlineEl) headline = headlineEl.innerText.trim();
}
if (!headline && card) {
    // Fallback: split card text by newlines, second non-empty line is usually headline
    const lines = card.innerText.split('\\n').map(l => l.trim()).filter(Boolean);
Contributor:

Soft rate-limit sentinel silently corrupts contact records

extract_page returns the module-level _RATE_LIMITED_MSG string sentinel ("[Rate limited] LinkedIn blocked this section…") when a soft rate limit persists after one retry, instead of raising RateLimitError. scrape_contact_batch never checks for this sentinel before calling _parse_contact_record, so the sentinel is treated as valid profile text.

The result is a silently corrupted record:

  • first_name → "[Rate"
  • last_name → "limited] LinkedIn blocked this section. Try again later or request fewer sections."
  • headline, location, company → None
  • The contact overlay is still scraped unnecessarily
  • The record is added to contacts with no error indication (only profile_raw would reveal the problem)

The same risk applies if _extract_overlay returns the sentinel for contact_text.

A guard should be added before calling _parse_contact_record:

# Scrape main profile page
profile_text = await self.extract_page(profile_url)
pages_visited.append(profile_url)

if profile_text == _RATE_LIMITED_MSG:
    logger.warning("Soft rate limit on profile %s, skipping", username)
    failed.append(username)
    await asyncio.sleep(_NAV_DELAY)
    continue

# Scrape contact info overlay
contact_text = await self._extract_overlay(contact_url)
pages_visited.append(contact_url)

if contact_text == _RATE_LIMITED_MSG:
    contact_text = ""  # fall back to empty; parsed fields will be None

Author:

Fixed in beebf5e — added guards for both _RATE_LIMITED_MSG sentinels. Profile sentinel skips the username (added to failed), contact overlay sentinel falls back to empty string.

Comment on lines +514 to +519
url = "https://www.linkedin.com/mynetwork/invite-connect/connections/"

await self._page.goto(url, wait_until="domcontentloaded", timeout=30000)
await detect_rate_limit(self._page)

try:
Contributor:

chunk_size=0 causes an unhandled ValueError

range(0, total, chunk_size) raises ValueError: range() arg 3 must not be zero when chunk_size is 0. This exception propagates to the tool handler in connections.py and is returned via handle_tool_error, but the error message ("range() arg 3 must not be zero") is opaque to the caller.

A simple guard at the top of the method would produce a much clearer error:

async def scrape_contact_batch(
    self,
    usernames: list[str],
    chunk_size: int = 5,
    chunk_delay: float = 30.0,
    progress_cb: Callable[[int, int], Awaitable[None]] | None = None,
) -> dict[str, Any]:
    if chunk_size <= 0:
        raise ValueError(f"chunk_size must be a positive integer, got {chunk_size}")

Author:

Fixed in beebf5e — added if chunk_size <= 0: raise ValueError(...) guard at the top of scrape_contact_batch.

Comment on lines +596 to +601
progress_cb: Callable[[int, int], Awaitable[None]] | None = None,
) -> dict[str, Any]:
"""Enrich a list of profiles with contact details in chunked batches.

For each username: scrapes main profile + contact_info overlay.

Contributor:

Rate-limited username is not added to failed

When RateLimitError is caught, the current username is not appended to failed before breaking out of the loop. The return value only signals rate_limited: True but doesn't record which username triggered the stop, making it difficult for callers to resume from where processing halted.

except RateLimitError:
    logger.warning("Rate limited during contact batch at %s", username)
    failed.append(username)  # record the username that triggered the stop
    rate_limited = True
    break

Author:

Fixed in beebf5e — added failed.append(username) before the break.

Comment on lines +527 to +582
await scroll_to_bottom(self._page, pause_time=1.0, max_scrolls=max_scrolls)

# Extract connection data from profile link elements
raw_connections: list[dict[str, str]] = await self._page.evaluate(
    """() => {
    const results = [];
    const seen = new Set();
    const links = document.querySelectorAll('main a[href*="/in/"]');
    for (const a of links) {
        const href = a.getAttribute('href') || '';
        const match = href.match(/\\/in\\/([^/?#]+)/);
        if (!match) continue;
        const username = match[1];
        if (seen.has(username)) continue;
        seen.add(username);

        // Walk up to the connection card container
        const card = a.closest('li') || a.parentElement;

        // Name: try known selectors, then the link's own visible text
        let name = '';
        if (card) {
            const nameEl = card.querySelector(
                '.mn-connection-card__name, .entity-result__title-text, span[dir="ltr"], span.t-bold'
            );
            if (nameEl) name = nameEl.innerText.trim();
        }
        if (!name) {
            // The profile link itself often contains the person's name
            const linkText = a.innerText.trim();
            if (linkText && linkText.length < 80) name = linkText;
        }

        // Headline: try known selectors, then parse card text
        let headline = '';
        if (card) {
            const headlineEl = card.querySelector(
                '.mn-connection-card__occupation, .entity-result__primary-subtitle, span.t-normal'
            );
            if (headlineEl) headline = headlineEl.innerText.trim();
        }
        if (!headline && card) {
            // Fallback: split card text by newlines, second non-empty line is usually headline
            const lines = card.innerText.split('\\n').map(l => l.trim()).filter(Boolean);
            if (lines.length >= 2) headline = lines[1];
        }

        results.push({ username, name, headline });
    }
    return results;
}"""
)

# Apply limit
if limit > 0:
    raw_connections = raw_connections[:limit]
Contributor:

Inefficient when limit is small - scrolls through all connections before truncating.

If limit=10 but user has 500 connections, this scrolls through all 500 (~8 minutes with 1s pauses), then discards 490. Consider checking len(results) >= limit inside the JavaScript loop and breaking early.


Author:

Won't fix — the suggestion to break early in the JS loop wouldn't help because the expensive part is scroll_to_bottom(), which runs before the JS extraction. By the time the DOM query executes, all scrolling is already done. Users already control scroll depth via the max_scrolls parameter (e.g. max_scrolls=3 for quick results). A proper fix would require refactoring the generic scroll_to_bottom utility to accept an early-exit predicate, which is out of scope for this PR.
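For reference, the out-of-scope refactor mentioned above (an early-exit predicate on the scroll utility) could take roughly this shape. This is a sketch under assumptions: FakePage and enough are illustrative stand-ins, not the project's actual code, and the real utility would drive a Playwright page.

```python
import asyncio
from typing import Awaitable, Callable, Optional

async def scroll_to_bottom(page, pause_time: float = 1.0, max_scrolls: int = 50,
                           stop_when: Optional[Callable[[], Awaitable[bool]]] = None) -> int:
    """Scroll until the page stops growing, max_scrolls is hit, or the
    optional early-exit predicate returns True. Returns scrolls performed."""
    last_height = await page.evaluate("document.body.scrollHeight")
    for i in range(max_scrolls):
        if stop_when is not None and await stop_when():
            return i  # caller decided we have enough (e.g. limit reached)
        await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
        await asyncio.sleep(pause_time)
        height = await page.evaluate("document.body.scrollHeight")
        if height == last_height:
            return i + 1  # no new content loaded; done
        last_height = height
    return max_scrolls

class FakePage:
    """Minimal stand-in for a Playwright page, for demonstration only."""
    def __init__(self):
        self.height = 0
    async def evaluate(self, script: str):
        if script.startswith("window.scrollTo"):
            self.height += 100  # each scroll loads more content
            return None
        return self.height  # scrollHeight query

collected = {"n": 0}
async def enough():
    collected["n"] += 10  # pretend each pass collects 10 more connections
    return collected["n"] >= 30

scrolls = asyncio.run(scroll_to_bottom(FakePage(), pause_time=0, stop_when=enough))
print(scrolls)  # → 2
```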

Comment on lines +37 to +38
limit: int = 0,
max_scrolls: int = 50,
Contributor:

No validation for negative values - max_scrolls=-10 would bypass scrolling entirely (range produces empty sequence).

Consider adding validation: if max_scrolls < 0: raise ValueError("max_scrolls must be non-negative")


Author:

Won't fix — max_scrolls=-10 simply makes range(max_scrolls) produce an empty sequence, meaning no scrolling occurs. This doesn't crash or corrupt data; it just returns whatever connections are visible without scrolling. Since this is an MCP tool parameter with a sensible default of 50, a negative value is an obvious caller mistake that produces a self-explanatory empty result. Adding validation here would be pure noise.

)
async def get_my_connections(
ctx: Context,
limit: int = 0,
Contributor:

Missing validation for negative limit values. If limit=-10, it bypasses the if limit > 0 check on extractor.py:581 and behaves as unlimited, which is counterintuitive.


Author:

Won't fix — limit=-10 behaving as unlimited is fine. The parameter semantics are "0 = unlimited" and any non-positive value logically means "no limit". This is consistent and not a bug.

usernames: str,
ctx: Context,
chunk_size: int = 5,
chunk_delay: float = 30.0,
Contributor:

Missing validation for negative chunk_delay. A negative value would cause asyncio.sleep() to return immediately, bypassing the delay entirely.


Author:

Won't fix — asyncio.sleep() with a negative value returns immediately (same as 0), which just means "no delay". This is an MCP tool parameter with a sensible default of 30s; passing a negative value is a caller error with a harmless outcome.

async def extract_contact_details(
usernames: str,
ctx: Context,
chunk_size: int = 5,
Contributor:

Missing validation for chunk_size < 1. While extractor.py validates <= 0, the error won't be clear to callers.


Author:

Won't fix — the extractor already validates chunk_size <= 0 with a clear ValueError. Duplicating validation at the tool layer adds no value; the error message from the extractor ("chunk_size must be a positive integer, got 0") is already user-friendly.

Desperado and others added 13 commits March 22, 2026 15:46
…contact_details)

Two new MCP tools for collecting LinkedIn connections and enriching
them with contact details (email, phone, etc.) in rate-limit-aware
chunked batches.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- chunk_delay: int → float to match scrape_contact_batch signature
- Report actual completed count instead of total on early rate-limit stop

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of returning raw innerText blobs, parse profile and contact
overlay text into structured fields (first_name, last_name, email,
phone, headline, location, company, website, birthday). Raw text
kept as _raw suffix fields for fallback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use %.0fs for chunk_delay in log message (float, not int)
- Update scrape_contact_batch docstring to list actual structured fields

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When rate limiting stops processing early, the progress message now
shows "Stopped early due to rate limit (N/M processed)" instead of
the misleading "Complete".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
The GitHub suggestion merge created duplicate lines (completed/msg
assigned twice, report_progress called twice). Cleaned up to single
correct version.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… regex, failed tracking

- Guard against _RATE_LIMITED_MSG sentinel corrupting parsed records
  (skip profile on soft rate limit, fall back to empty contact text)
- Validate chunk_size > 0 with clear error message
- Extend degree regex to match ordinals like "3rd+" and "4th"
- Add rate-limited username to failed list for caller resumability

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Prevents scraping the same profile twice when duplicate usernames
are passed (e.g. "user1,user1,user2").

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Support filtering LinkedIn people search by connection degree (1st/2nd/3rd+)
via the `network` parameter passed through to LinkedIn's search URL.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add stabilization delay after scroll_to_bottom and re-navigate if
LinkedIn redirected away from the connections page during infinite
scroll. Prevents "Execution context was destroyed" errors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Catch ERR_ABORTED on initial goto (happens when page is already
  loaded or LinkedIn redirects during navigation), retry after delay
- Add stabilization delay after scroll_to_bottom
- Re-navigate if LinkedIn redirected away during infinite scroll

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Desperado force-pushed the feature/bulk-connections-export branch from b9add13 to 5c7d9f4 on March 22, 2026 at 14:54
greptile-apps bot (Contributor) commented Mar 22, 2026

Greptile Summary

This PR adds two new MCP tools — get_my_connections (bulk connection list via infinite scroll) and extract_contact_details (batch profile enrichment with contact data) — plus a network filter on the existing search_people tool. The tool-layer wiring, progress reporting, deduplication, chunked rate-limit handling, and the _parse_contact_record parser are all well-structured.

However, extract_contact_details is entirely broken in its current state due to a single root-cause error in scrape_contact_batch:

  • extract_page and _extract_overlay both require a positional section_name: str argument (see extractor.py lines 440–443, 553–556). Both calls in scrape_contact_batch omit this argument, causing TypeError on every iteration.
  • The TypeError is caught by the bare except Exception block and the username is silently added to failed — so the method always returns an empty contacts list with all usernames in failed.
  • Even after providing section_name, both methods return ExtractedSection objects, not strings. The sentinel comparisons (profile_text == _RATE_LIMITED_MSG) will always be False, and passing ExtractedSection objects to _parse_contact_record (which expects str) would raise AttributeError. The fix is to extract .text from the returned objects, consistent with every other caller in the codebase.

The fix is a targeted, mechanical change in scrape_contact_batch (see inline comment). Everything else in the PR is on the happy path.

Confidence Score: 3/5

  • Not safe to merge — extract_contact_details silently returns empty results for every input due to a TypeError from missing section_name arguments.
  • The primary new feature (extract_contact_details) is completely non-functional: every profile enrichment attempt throws TypeError (missing required section_name arg), is caught silently, and lands in failed. This breaks the main user path. The fix is mechanical and confined to ~10 lines of scrape_contact_batch, so it's one targeted change away from being mergeable. All prior review concerns are resolved, and get_my_connections and the network filter are clean.
  • linkedin_mcp_server/scraping/extractor.py — specifically scrape_contact_batch (lines 1411–1431)

Important Files Changed

  • linkedin_mcp_server/scraping/extractor.py: Adds _parse_contact_record, scrape_connections_list, and scrape_contact_batch. Critical bug: scrape_contact_batch calls extract_page and _extract_overlay without the required section_name argument and treats the ExtractedSection return value as a raw string, causing a TypeError (silently swallowed) for every profile; extract_contact_details will never enrich any profile.
  • linkedin_mcp_server/tools/connections.py: New tool module for get_my_connections and extract_contact_details. The tool layer looks correct: progress reporting, deduplication, rate-limit messaging, and error handling are all properly implemented. Quality depends on the underlying extractor being fixed.
  • linkedin_mcp_server/server.py: One-line registration of the new connections tool module; straightforward and correct.
  • linkedin_mcp_server/tools/person.py: Adds an optional network filter parameter to search_people; clean, minimal change, correctly threaded through to the extractor.

Sequence Diagram

sequenceDiagram
    participant Client as MCP Client
    participant CT as connections.py
    participant EX as LinkedInExtractor
    participant LI as LinkedIn

    Client->>CT: extract_contact_details(usernames, chunk_size, chunk_delay)
    CT->>EX: scrape_contact_batch(usernames, chunk_size, chunk_delay, progress_cb)

    loop Each chunk
        loop Each username in chunk
            EX->>LI: extract_page(profile_url, section_name) → ExtractedSection
            LI-->>EX: profile innerText
            EX->>LI: _extract_overlay(contact_url, section_name) → ExtractedSection
            LI-->>EX: contact overlay innerText
            EX->>EX: _parse_contact_record(profile_text, contact_text)
            EX-->>CT: progress_cb(completed, total)
        end
        EX->>EX: asyncio.sleep(chunk_delay)
    end

    EX-->>CT: {contacts, total, failed, rate_limited, pages_visited}
    CT-->>Client: result dict

    Client->>CT: get_my_connections(limit, max_scrolls)
    CT->>EX: scrape_connections_list(limit, max_scrolls)
    EX->>LI: goto /mynetwork/invite-connect/connections/
    LI-->>EX: connections page
    EX->>EX: scroll_to_bottom(max_scrolls)
    EX->>LI: page.evaluate() — extract username/name/headline
    LI-->>EX: raw_connections[]
    EX-->>CT: {connections, total, url, pages_visited}
    CT-->>Client: result dict
Prompt To Fix All With AI
This is a comment left during a code review.
Path: linkedin_mcp_server/scraping/extractor.py
Line: 1413-1433

Comment:
**`extract_page` / `_extract_overlay` called with wrong signature — every profile fails**

Both `extract_page` and `_extract_overlay` have a required `section_name: str` second parameter (see lines 440–443 and 553–556 respectively). Calling them without it raises `TypeError: extract_page() missing 1 required positional argument: 'section_name'` / `_extract_overlay() missing 1 required positional argument: 'section_name'` for every iteration.

That `TypeError` is silently swallowed by the `except Exception` block, so every username ends up in `failed` and `contacts` is always empty — making `extract_contact_details` functionally broken.

Even after adding the missing argument, `extract_page` and `_extract_overlay` return `ExtractedSection` objects, not raw strings. The current comparisons to `_RATE_LIMITED_MSG` (e.g. `if profile_text == _RATE_LIMITED_MSG`) will always be `False` (comparing dataclass to `str`), and passing the objects directly to `_parse_contact_record(profile_text, contact_text)` would raise `AttributeError: 'ExtractedSection' object has no attribute 'split'`. The rest of the codebase consistently accesses `.text` (e.g. line 1126: `if extracted.text and extracted.text != _RATE_LIMITED_MSG`).

The fix requires both changes together:

```python
                    # Scrape main profile page
                    extracted_profile = await self.extract_page(profile_url, section_name="profile")
                    pages_visited.append(profile_url)
                    profile_text = extracted_profile.text

                    if profile_text == _RATE_LIMITED_MSG:
                        logger.warning(
                            "Soft rate limit on profile %s, skipping", username
                        )
                        failed.append(username)
                        await asyncio.sleep(_NAV_DELAY)
                        continue

                    # Scrape contact info overlay
                    extracted_contact = await self._extract_overlay(contact_url, section_name="contact_info")
                    pages_visited.append(contact_url)
                    contact_text = extracted_contact.text

                    if contact_text == _RATE_LIMITED_MSG:
                        contact_text = ""  # fall back to empty; parsed fields will be None
```


Reviews (9). Last reviewed commit: "fix: Handle ERR_ABORTED and context dest..."

Comment on lines +1413 to +1433:

```python
                    profile_text = await self.extract_page(profile_url)
                    pages_visited.append(profile_url)

                    if profile_text == _RATE_LIMITED_MSG:
                        logger.warning(
                            "Soft rate limit on profile %s, skipping", username
                        )
                        failed.append(username)
                        await asyncio.sleep(_NAV_DELAY)
                        continue

                    # Scrape contact info overlay
                    contact_text = await self._extract_overlay(contact_url)
                    pages_visited.append(contact_url)

                    if contact_text == _RATE_LIMITED_MSG:
                        contact_text = (
                            ""  # fall back to empty; parsed fields will be None
                        )

                    parsed = _parse_contact_record(profile_text, contact_text)
```