📝 Walkthrough
Overview: migrates robots.txt from server-side dynamic generation to a static public file, and removes the second slide from the announcement carousel.
Estimated review effort: 🎯 2 (simple) | ⏱️ ~8 minutes
🚥 Pre-merge checks: ✅ 3 passed
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c350113602
```diff
@@ -0,0 +1,3 @@
+User-agent: *
+
+Sitemap: https://www.furrycons.cn/sitemap/index.xml
```
Use deployment host in robots sitemap URL
The new static robots file hardcodes https://www.furrycons.cn/sitemap/index.xml, which regresses the previous host-aware behavior for any deployment where the hostname is not www.furrycons.cn (for example staging or self-hosted mirrors). The codebase still supports configurable public hosts via NEXT_PUBLIC_WEBSITE_URL (see src/utils/env.ts and URL/canonical helpers), so this fixed value can cause crawlers to fetch the wrong sitemap and miss the current deployment’s URLs.
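The host-aware behavior the comment refers to could be restored with a small helper that derives the file from the environment. The sketch below is illustrative only: `NEXT_PUBLIC_WEBSITE_URL` comes from the review, while the function name and the production flag are hypothetical, not the repo's actual implementation.

```typescript
// Sketch only: builds a host-aware robots.txt body. NEXT_PUBLIC_WEBSITE_URL is
// the env var named in the review; buildRobotsTxt and isProduction are assumed.
function buildRobotsTxt(host: string | undefined, isProduction: boolean): string {
  // Fall back to the production host when the env var is unset.
  const base = (host ?? "https://www.furrycons.cn").replace(/\/+$/, "");
  const lines = [
    "User-agent: *",
    // An empty Disallow allows everything; "Disallow: /" blocks previews entirely.
    isProduction ? "Disallow:" : "Disallow: /",
    `Sitemap: ${base}/sitemap/index.xml`,
  ];
  return lines.join("\n") + "\n";
}

// Example: a staging host gets its own sitemap URL and a full crawl block.
console.log(buildRobotsTxt("https://staging.example.com/", false));
```

With this shape, every deployment emits a sitemap URL matching its own hostname instead of the hardcoded production one.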
Actionable comments posted: 1
🧹 Nitpick comments (1)
public/robots.txt (1)
1-2: Consider adding an explicit Disallow or Allow directive to make intent clear
There are no Disallow or Allow rules after `User-agent: *`. Per the robots.txt convention, omitting Disallow is technically equivalent to "allow all paths", but an explicit declaration reads better and avoids parsing ambiguity in some crawlers. ♻️ Suggested change: add an empty Disallow to explicitly allow crawling of the whole site.
```diff
 User-agent: *
+Disallow:
 Sitemap: https://www.furrycons.cn/sitemap/index.xml
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@public/robots.txt` around lines 1-2, add an explicit crawl rule under the existing "User-agent: *" to remove ambiguity: in the same file, append either an empty Disallow line ("Disallow: ") or an explicit Allow ("Allow: /") under "User-agent: *" — for example, add "Disallow: " to explicitly allow crawling the whole site, which improves readability and avoids crawler parsing ambiguity.
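To make the ambiguity concrete, here is a minimal illustration (not code from the repository, and not a full robots.txt parser) of how the two Disallow forms behave under standard prefix matching:

```typescript
// Minimal illustration: a Disallow value is a path prefix, and an empty
// value matches no path at all.
function isDisallowed(disallowValue: string, path: string): boolean {
  if (disallowValue === "") return false; // "Disallow:" blocks nothing
  return path.startsWith(disallowValue);  // "Disallow: /" blocks every path
}

console.log(isDisallowed("", "/events"));  // → false
console.log(isDisallowed("/", "/events")); // → true
```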
```diff
@@ -0,0 +1,3 @@
+User-agent: *
+
+Sitemap: https://www.furrycons.cn/sitemap/index.xml
```
🧩 Analysis chain
🏁 Script executed:

```shell
git ls-files | head -20
```

Repository: FurryConsChina/website
Length of output: 464

🏁 Script executed:

```shell
# Find key config and deployment files
fd -e yml -e yaml -e json -e sh | grep -E "(github|gitlab|jenkins|ci|cd|deploy|build)" | head -20
```

Length of output: 48

🏁 Script executed:

```shell
# Check for robots.txt related files and configs
fd "robots" --type f
```

Length of output: 83

🏁 Script executed:

```shell
# Look for environment variable configurations and overrides
fd -e env -e config | head -20
```

Length of output: 48

🏁 Script executed:

```shell
# Read the current robots.txt content
cat public/robots.txt
```

Length of output: 132

🏁 Script executed:

```shell
# Check deployment workflow for production
cat .github/workflows/deploy-prod-cn.yml
```

Length of output: 1723

🏁 Script executed:

```shell
# List all GitHub workflows to see if there are staging/preview deployments
ls -la .github/workflows/
```

Length of output: 526

🏁 Script executed:

```shell
# Check .env.example for environment variables
cat .env.example
```

Length of output: 257

🏁 Script executed:

```shell
# Search for any robots-related typescript/javascript files
find . -name "*robots*" -type f
```

Length of output: 85

🏁 Script executed:

```shell
# Search for NEXT_PUBLIC_WEBSITE_URL usage
rg "NEXT_PUBLIC_WEBSITE_URL" -A 2 -B 2
```

Length of output: 231

🏁 Script executed:

```shell
# Search for any typescript/javascript robots files in src directory
find src -name "*robots*" -type f 2>/dev/null
```

Length of output: 48

🏁 Script executed:

```shell
# Check next.config.js for robots.txt handling
cat next.config.js
```

Length of output: 3630

🏁 Script executed:

```shell
# Check Dockerfile for environment-specific handling
cat Dockerfile
```

Length of output: 2441

🏁 Script executed:

```shell
# Search for staging, preview, or dev deployment configurations
rg -i "staging|preview|dev" --glob "*.yml" --glob "*.yaml" --glob "*.json" -A 2 -B 2
```

Length of output: 2500

🏁 Script executed:

```shell
# Check if there are environment-specific public directory or build steps
rg -rn "public/robots" -A 2 -B 2
```

Length of output: 48

🏁 Script executed:

```shell
# Search for any robots-related environment overrides
rg -rn "robots\.txt" -A 3 -B 3
```

Length of output: 48

🏁 Script executed:

```shell
# Search more broadly for any staging/preview/dev references
rg -i "(staging|preview|dev|development)" . --type-list | head -30
```

Length of output: 455

🏁 Script executed:

```shell
# Check README for deployment information
cat README.md | grep -i -A 5 -B 5 "deploy\|staging\|preview"
```

Length of output: 1322

🏁 Script executed:

```shell
# Check if there are any build scripts or deployment scripts
find . -name "*.sh" -o -name "Makefile" | head -20
```

Length of output: 48

🏁 Script executed:

```shell
# Search for any references to multiple environments or domain configurations
rg "furrycons\.cn" -A 2 -B 2
```

Length of output: 6175

🏁 Script executed:

```shell
# Check for any conditional environment handling in next.config.js or build process
rg "process\.env\." next.config.js
```

Length of output: 135
Sitemap URL hardcodes the production domain; non-production environments risk being indexed
robots.txt is now a static file, with the Sitemap address fixed to https://www.furrycons.cn/sitemap/index.xml. More seriously, the file contains no Disallow: / rule at all — it holds only User-agent: * and the Sitemap directive — which means search engines will index every page of whatever environment serves this file.
Risks:
- Non-production environments get fully indexed: the README mentions that PR submissions trigger Firebase deployments; if those preview environments are publicly reachable, crawlers will fetch a robots.txt with no Disallow rules and fully index these temporary environments.
- Manual maintenance on domain changes: a static file must be updated by hand whenever the domain changes.
Suggestions:
- Restore the dynamic generation approach (using the NEXT_PUBLIC_WEBSITE_URL environment variable) so each environment automatically adapts to its own domain, and inject a Disallow: / rule for non-production environments; or
- If the static file is kept, add an explicit override step to the CI/CD pipeline for non-production deployments that injects a Disallow: / rule to prevent indexing.
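The second suggestion could be sketched as a small build-time script. Everything here is hypothetical: the `DEPLOY_ENV` variable name, the function, and the wiring into CI are assumptions, not taken from the repository's actual workflows.

```typescript
// Hypothetical CI/build step: before a non-production deploy, overwrite the
// committed static robots.txt with one that blocks all crawlers.
// DEPLOY_ENV, the function name, and the file path are illustrative assumptions.
import { mkdirSync, writeFileSync } from "node:fs";
import { dirname } from "node:path";

function overrideRobotsForPreview(envName: string, outFile: string): string | null {
  if (envName === "production") {
    return null; // keep the committed static file for production
  }
  const content = "User-agent: *\nDisallow: /\n";
  mkdirSync(dirname(outFile), { recursive: true });
  writeFileSync(outFile, content);
  return content; // return what was written, for logging or verification
}

// Example: simulate a staging deploy overwriting the static file.
overrideRobotsForPreview("staging", "preview_public/robots.txt");
```

Run as a step before upload, this keeps the simplicity of a static file in production while ensuring preview and staging deployments are never indexed.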
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@public/robots.txt` at line 3, robots.txt currently hardcodes "Sitemap:
https://www.furrycons.cn/sitemap/index.xml" and lacks any Disallow rule; update
the implementation so the sitemap URL is generated from NEXT_PUBLIC_WEBSITE_URL
(or equivalent env var) and ensure non-production environments emit a "Disallow:
/" line—specifically modify the robots.txt generation/serving logic (or CI
deployment step that writes public/robots.txt) to build the sitemap entry from
NEXT_PUBLIC_WEBSITE_URL and conditionally prepend "Disallow: /" when NODE_ENV
!== "production" (or when the env indicates a preview), so preview/staging
deployments are not indexed.
Summary by CodeRabbit
Release notes
- Website configuration: updated