WIP: Introduce typesense search #7877

Draft: wants to merge 19 commits into base: master
6 changes: 6 additions & 0 deletions source/_static/css/homepage-v1.css

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion source/_static/css/homepage-v1.css.map

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions source/_static/js/myscript-v1.js
@@ -2,6 +2,7 @@ $(document).ready(function () {
// Function to set the custom theme attribute based on the current theme
function setCustomTheme(theme) {
$('body').attr('data-custom-theme', theme);
$('html').attr('data-theme', theme);
}

// Check for a manually set theme
7 changes: 7 additions & 0 deletions source/_static/scss/partials/_sidebar.scss
@@ -50,7 +50,13 @@
}
}

.DocSearch-Container {
margin-top: 55px;
}

.sidebar-search {
display: none;

border-radius: 4px;
border: 1px solid rgba(63, 67, 80, 0.16);
background: #FFF;
@@ -69,6 +75,7 @@ body:not([data-custom-theme="light"]) {
}

.sidebar-search-container:before {
height: 0px;
left: 32px;
z-index: 20;
background-color: currentColor;
33 changes: 28 additions & 5 deletions source/_templates/custom-nav.html
@@ -1,3 +1,26 @@
<script src="https://cdn.jsdelivr.net/npm/typesense-docsearch.js@3.4.0"></script>
<script>
window.addEventListener('load', () => {
const options = location.hostname === 'localhost' ? {
node: { host: 'localhost', port: '8108', protocol: 'http' },
apiKey: 'test_api_key',
} : {
node: { host: 'h0agqxfpir543j9lp-1.a1.typesense.net', port: '443', protocol: 'https' },
apiKey: 'ZOLa3xKupe9e7DRPhkv56g8VUoCygd00',
};

docsearch({
container: '.sidebar-search-container',
typesenseCollectionName: 'mm_product_docs',
typesenseServerConfig: {
nodes: [options.node],
apiKey: options.apiKey
},
typesenseSearchParams: {}
});
});
</script>

<div data-swiftype-index=false class="notification-bar sticky-top">
<div class="notification-bar__content">
<a class="notification-bar__close" data-ol-has-click-handler="">
@@ -69,7 +92,7 @@
<a href="https://mattermost.com/channels/">
Channels
</a>
</li>
</li>
<li class="sub-menu__links--single">
<a href="https://mattermost.com/playbooks/">
Playbooks
@@ -127,7 +150,7 @@
<a href="https://mattermost.com/enterprise/cloud/">
Cloud
</a>
</li>
</li>
</ul>
</div>
</div>
@@ -173,7 +196,7 @@
<a href="https://mattermost.com/solutions/use-cases/integrated-security-operations/">
Integrated Security Operations
</a>
</li>
</li>
<li class="sub-menu__links--single">
<a href="https://mattermost.com/solutions/use-cases/out-of-band-incident-response/">
Out-of-Band Incident Response
@@ -384,7 +407,7 @@
<div class="sub-menu__dev-link-col">
<p class="sub-menu__link-col-header sub-menu__link-col-header-space">Documentation</p>
<ul>

<li class="sub-menu__links--single">
<a href="https://academy.mattermost.com/">
Academy
@@ -429,7 +452,7 @@
<a href="https://mattermost.com/community/">
Join Community
</a>
</li>
</li>
<li class="sub-menu__links--single">
<a href="https://mattermost.com/contribute/">
Contribute
3 changes: 2 additions & 1 deletion source/conf.py
@@ -4072,7 +4072,8 @@ def setup(_: Sphinx):
html_css_files = [
"css/mattermost-global.css",
"css/homepage-v1.css",
"css/compass-icons.css",
"https://cdn.jsdelivr.net/npm/typesense-docsearch-css@0.3.0",
Review comment (Contributor Author): Probably want to bundle this instead of serving it from the CDN, to be consistent with existing assets.

]

# A list of JavaScript filenames. The entry must be a filename string or a tuple containing the filename string and the
8 changes: 8 additions & 0 deletions typesense/.env.sample
@@ -0,0 +1,8 @@
export TYPESENSE_API_KEY=test_api_key
export TYPESENSE_HOSTNAME=localhost
export TYPESENSE_ORIGIN=http://localhost:8108
export TYPESENSE_PORT=8108
export TYPESENSE_PROTOCOL=http

export TYPESENSE_COLLECTION_NAME="mm_product_docs_1745019244"
export DOCS_SITE_ORIGIN="http://localhost:8000"
1 change: 1 addition & 0 deletions typesense/.gitignore
@@ -0,0 +1 @@
.env
43 changes: 43 additions & 0 deletions typesense/README.md
@@ -0,0 +1,43 @@
# Using Typesense for documentation search

With three terminals open, run the following in the first two:
- `make livehtml` - Run the Sphinx docs server
- `cd typesense && docker compose up` - Run the Typesense server and the Typesense dashboard. You can access the dashboard at http://localhost:8001

After those are up and running, run this in the third terminal:
- `cd typesense && docker compose --profile optional up scraper` - Run scraper to populate Typesense. The process will exit once complete.

After running the scraper, we need to do some processing to make search result URLs relative to the docs site.
- `cd typesense && ./post-process-typesense-data.sh`
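
To illustrate what this post-processing does to each indexed document, here is a minimal Python sketch. The function name `strip_origin` and the sample record are hypothetical; the field names `url`, `url_without_anchor`, and `url_without_variables` are the ones the cleanup script in this PR rewrites.

```python
import json

# URL-bearing fields that the cleanup script in this PR rewrites.
URL_FIELDS = ("url", "url_without_anchor", "url_without_variables")


def strip_origin(doc: dict, origin: str) -> dict:
    """Return a copy of doc with the origin prefix removed from its URL fields."""
    cleaned = dict(doc)
    for key in URL_FIELDS:
        value = cleaned.get(key)
        if isinstance(value, str) and value.startswith(origin):
            cleaned[key] = value[len(origin):]
    return cleaned


if __name__ == "__main__":
    # Hypothetical scraped record, for illustration only.
    sample = {"url": "http://localhost:8000/guides/install.html#docker"}
    print(json.dumps(strip_origin(sample, "http://localhost:8000")))
```

Documents whose URLs do not start with the given origin pass through unchanged, so running the step twice should be harmless.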

If you'd like to re-index the Typesense collection, you can run:

```sh
cd typesense

# Optionally delete all existing documents in the collection. Typesense will de-duplicate docs naturally, but this reset operation forces it to remove metadata from previous runs that we may want to remove as we change the schema/filters.
./scripts/reset-typesense-collection.sh

# Re-run scraper to populate Typesense
docker compose --profile optional up scraper
```

To export the index into a JSONL file, run:

```sh
cd typesense

./scripts/download-typesense-collection.sh
```

The output of the command will be a `documents.jsonl` file in the current directory.

---

The scripts mentioned above support the following environment variables for configuration:

- `TYPESENSE_API_KEY` - Defaults to `test_api_key`
- `TYPESENSE_ORIGIN` - Defaults to `http://localhost:8108`
- `TYPESENSE_HOSTNAME` - Defaults to `localhost`
- `TYPESENSE_PORT` - Defaults to `8108`
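- `TYPESENSE_PROTOCOL` - Defaults to `http`
- `TYPESENSE_COLLECTION_NAME` - Defaults to `mm_product_docs`
- `DOCS_SITE_ORIGIN` - Defaults to `http://localhost:8000`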
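
For tooling written outside the shell, the same fallback behavior can be sketched in Python. This is a minimal illustration, not part of the PR: `resolve_settings` and `DEFAULTS` are hypothetical names, and the values are copied from the list above.

```python
import os

# Defaults copied from the README list above.
DEFAULTS = {
    "TYPESENSE_API_KEY": "test_api_key",
    "TYPESENSE_ORIGIN": "http://localhost:8108",
    "TYPESENSE_HOSTNAME": "localhost",
    "TYPESENSE_PORT": "8108",
    "TYPESENSE_PROTOCOL": "http",
}


def resolve_settings(env=None) -> dict:
    """Mirror the shell idiom ${VAR:-default}: unset or empty values fall back."""
    env = os.environ if env is None else env
    return {name: env.get(name) or default for name, default in DEFAULTS.items()}


if __name__ == "__main__":
    for name, value in resolve_settings().items():
        print(f"{name}={value}")
```

Treating an empty string like an unset variable matches how `${VAR:-default}` behaves in the scripts.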
39 changes: 39 additions & 0 deletions typesense/config.json
@@ -0,0 +1,39 @@
{
"index_name": "mm_product_docs",
"allowed_domains": [
"localhost",
"mattermost-docs-preview-pulls.s3-website-us-east-1.amazonaws.com"
],
"start_urls": [
{
"url": "http://localhost:8000",
"tags": []
}
],
"sitemap_urls": [
"http://localhost:8000/sitemap.xml"
],
"selectors": {
"default": {
"lvl0": "article h1",
"lvl1": "article h2",
"lvl2": "article h3",
"lvl3": "article h4",
"lvl4": "article h5",
"lvl5": "article h6",
"text": "article p, article li"
}
},
Review comment on lines +7 to +26 (Contributor Author):

We can add selectors for config settings and section off user and admin docs. Below is an example from the DocSearch docs for sectioning off different content:
https://docsearch.algolia.com/docs/legacy/config-file/#selectors_key-tailor-your-selectors

{
  "start_urls": [
    {
      "url": "http://www.example.com/docs/faq/",
      "selectors_key": "faq"
    },
    {
      "url": "http://www.example.com/docs/"
    }
  ],
  "selectors": {
    "default": {
      "lvl0": ".docs h1",
      "lvl1": ".docs h2",
      "lvl2": ".docs h3",
      "lvl3": ".docs h4",
      "lvl4": ".docs h5",
      "text": ".docs p, .docs li"
    },
    "faq": {
      "lvl0": ".faq h1",
      "lvl1": ".faq h2",
      "lvl2": ".faq h3",
      "lvl3": ".faq h4",
      "lvl4": ".faq h5",
      "text": ".faq p, .faq li"
    }
  }
}
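
As a rough illustration of the idea in the example above, the following Python sketch picks a selector set for a page by finding the longest matching start URL. This is an assumption about the intent of `selectors_key`, not the scraper's actual matching logic, and `pick_selectors` is a hypothetical helper.

```python
def pick_selectors(config: dict, page_url: str) -> dict:
    """Choose the selector set for page_url via the longest matching start URL.

    Assumption: a page belongs to the most specific start URL that prefixes it.
    Entries without a selectors_key, and pages with no match, use "default".
    """
    best_key, best_len = "default", -1
    for entry in config["start_urls"]:
        prefix = entry["url"]
        if page_url.startswith(prefix) and len(prefix) > best_len:
            best_key = entry.get("selectors_key", "default")
            best_len = len(prefix)
    return config["selectors"][best_key]


if __name__ == "__main__":
    config = {
        "start_urls": [
            {"url": "http://www.example.com/docs/faq/", "selectors_key": "faq"},
            {"url": "http://www.example.com/docs/"},
        ],
        "selectors": {
            "default": {"lvl0": ".docs h1"},
            "faq": {"lvl0": ".faq h1"},
        },
    }
    print(pick_selectors(config, "http://www.example.com/docs/faq/billing.html"))
```

The same shape could section off user and admin docs by giving each area its own start URL and `selectors_key`.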

"custom_settings": {
"token_separators": [
"-"
],
"symbols_to_index": [
"@"
]
},
"strip_chars": " .,;:#",
"stop_urls": [],
"scrape_start_urls": false,
"nb_hits": 64205
}
35 changes: 35 additions & 0 deletions typesense/docker-compose.yml
@@ -0,0 +1,35 @@
services:
  typesense:
    image: typesense/typesense:0.24.0
    environment:
      - TYPESENSE_API_KEY=test_api_key
      - TYPESENSE_DATA_DIR=/data
      - TYPESENSE_ENABLE_CORS=true
    ports:
      - "8108:8108"
    volumes:
      - typesense-data:/data

  typesense-dashboard:
    image: ghcr.io/bfritscher/typesense-dashboard:latest
    ports:
      - "8001:80"

  scraper:
    image: typesense/docsearch-scraper
    profiles:
      - optional
    volumes:
      - ./config.json:/app/config.json
    network_mode: "host"
    environment:
      - CONFIG=/app/config.json
      - TYPESENSE_DATA_DIR=/data
      - TYPESENSE_ENABLE_CORS=true
      - TYPESENSE_API_KEY=${TYPESENSE_API_KEY:-test_api_key}
      - TYPESENSE_HOST=${TYPESENSE_HOSTNAME:-localhost}
      - TYPESENSE_PORT=${TYPESENSE_PORT:-8108}
      - TYPESENSE_PROTOCOL=${TYPESENSE_PROTOCOL:-http}

volumes:
  typesense-data:
18 changes: 18 additions & 0 deletions typesense/post-process-typesense-data.sh
@@ -0,0 +1,18 @@
set -e

export TYPESENSE_ORIGIN="${TYPESENSE_ORIGIN:-http://localhost:8108}"
export TYPESENSE_API_KEY="${TYPESENSE_API_KEY:-test_api_key}"
export DOCS_SITE_ORIGIN="${DOCS_SITE_ORIGIN:-http://localhost:8000}"
export TYPESENSE_COLLECTION_NAME="${TYPESENSE_COLLECTION_NAME:-mm_product_docs}"

echo "Downloading typesense collection"
./scripts/download-typesense-collection.sh

echo "Cleaning relative links in typesense collection"
./scripts/clean-relative-links-in-typesense-collection.sh

echo "Importing typesense collection"
./scripts/import-typesense-collection.sh

echo "Making alias for typesense collection"
./scripts/make-alias-for-typesense-collection.sh
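
The script above runs four steps in order and, because of `set -e`, stops at the first one that fails. A minimal Python sketch of the same control flow follows; `run_steps` is a hypothetical helper, and the script paths in the usage example are the ones added by this PR.

```python
import subprocess
import sys


def run_steps(steps):
    """Run each (label, argv) step in order; stop at the first failure, like `set -e`."""
    completed = []
    for label, argv in steps:
        print(label)
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(argv, check=True)
        completed.append(label)
    return completed


if __name__ == "__main__":
    pipeline = [
        ("Downloading typesense collection", ["./scripts/download-typesense-collection.sh"]),
        ("Cleaning relative links in typesense collection", ["./scripts/clean-relative-links-in-typesense-collection.sh"]),
        ("Importing typesense collection", ["./scripts/import-typesense-collection.sh"]),
        ("Making alias for typesense collection", ["./scripts/make-alias-for-typesense-collection.sh"]),
    ]
    try:
        run_steps(pipeline)
    except (subprocess.CalledProcessError, OSError) as error:
        sys.exit(f"post-processing stopped: {error}")
```

Note that `curl` exits with status 0 on HTTP error responses unless it is given `--fail`, so an early stop only catches failures that produce a non-zero exit status.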
18 changes: 18 additions & 0 deletions typesense/scripts/clean-relative-links-in-typesense-collection.sh
@@ -0,0 +1,18 @@
input_file="documents.jsonl"
output_file="processed_documents.jsonl"

DOCS_SITE_ORIGIN="${DOCS_SITE_ORIGIN:-http://localhost:8000}"

# Strip the docs-site origin from each URL field so that search results link relative to the site root.
# The shell expands ${DOCS_SITE_ORIGIN} before Python runs.
python3 -c "
import sys, json

for line in sys.stdin:
    try:
        doc = json.loads(line)
        for key in ['url', 'url_without_anchor', 'url_without_variables']:
            if key in doc and doc[key].startswith('${DOCS_SITE_ORIGIN}'):
                doc[key] = doc[key][len('${DOCS_SITE_ORIGIN}'):]
        print(json.dumps(doc))
    except Exception as e:
        print(f'// skipped invalid line: {line.strip()}', file=sys.stderr)
" < "$input_file" > "$output_file"
15 changes: 15 additions & 0 deletions typesense/scripts/create-typesense-collection.sh
@@ -0,0 +1,15 @@
TYPESENSE_ORIGIN="${TYPESENSE_ORIGIN:-http://localhost:8108}"
TYPESENSE_API_KEY="${TYPESENSE_API_KEY:-test_api_key}"

curl "${TYPESENSE_ORIGIN}/collections" \
-X POST \
-H "Content-Type: application/json" \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
-d '{
"name": "mm_product_docs",
"fields": [
{"name": "category", "type": "string" },
{"name": "weight", "type": "int32" }
],
"default_sorting_field": "weight"
}'
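
For comparison, the same request can be built with Python's standard library. This is a sketch that only constructs the request object without sending it; `build_create_collection_request` is a hypothetical name, and the schema is copied from the script above.

```python
import json
import urllib.request


def build_create_collection_request(origin: str, api_key: str, name: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates the collection."""
    schema = {
        "name": name,
        "fields": [
            {"name": "category", "type": "string"},
            {"name": "weight", "type": "int32"},
        ],
        "default_sorting_field": "weight",
    }
    return urllib.request.Request(
        url=f"{origin}/collections",
        data=json.dumps(schema).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-TYPESENSE-API-KEY": api_key,
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_create_collection_request("http://localhost:8108", "test_api_key", "mm_product_docs")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the object to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without a Typesense server.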
6 changes: 6 additions & 0 deletions typesense/scripts/delete-typesense-collection.sh
@@ -0,0 +1,6 @@
TYPESENSE_ORIGIN="${TYPESENSE_ORIGIN:-http://localhost:8108}"
TYPESENSE_API_KEY="${TYPESENSE_API_KEY:-test_api_key}"
TYPESENSE_COLLECTION_NAME="${TYPESENSE_COLLECTION_NAME:-mm_product_docs}"

curl -X DELETE "${TYPESENSE_ORIGIN}/collections/${TYPESENSE_COLLECTION_NAME}" \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}"
8 changes: 8 additions & 0 deletions typesense/scripts/download-typesense-collection.sh
@@ -0,0 +1,8 @@
TYPESENSE_ORIGIN="${TYPESENSE_ORIGIN:-http://localhost:8108}"
TYPESENSE_API_KEY="${TYPESENSE_API_KEY:-test_api_key}"
TYPESENSE_COLLECTION_NAME="${TYPESENSE_COLLECTION_NAME:-mm_product_docs}"

curl "${TYPESENSE_ORIGIN}/collections/${TYPESENSE_COLLECTION_NAME}/documents/export" \
-X GET \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
-o documents.jsonl
8 changes: 8 additions & 0 deletions typesense/scripts/import-typesense-collection.sh
@@ -0,0 +1,8 @@
TYPESENSE_ORIGIN="${TYPESENSE_ORIGIN:-http://localhost:8108}"
TYPESENSE_API_KEY="${TYPESENSE_API_KEY:-test_api_key}"
TYPESENSE_COLLECTION_NAME="${TYPESENSE_COLLECTION_NAME:-mm_product_docs}"

curl -X POST "${TYPESENSE_ORIGIN}/collections/${TYPESENSE_COLLECTION_NAME}/documents/import?action=upsert" \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
-H "Content-Type: text/plain" \
--data-binary @processed_documents.jsonl
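
The import script does not inspect the response. Assuming the import endpoint answers with newline-delimited JSON carrying one `success` flag per document, as the Typesense API documentation describes, a small checker could look like the sketch below. `summarize_import` is a hypothetical helper and the sample response is made up for illustration.

```python
import json


def summarize_import(response_body: str) -> dict:
    """Count successes and collect error messages from a JSONL import response."""
    ok, errors = 0, []
    for line in response_body.splitlines():
        if not line.strip():
            continue  # ignore blank lines between results
        result = json.loads(line)
        if result.get("success"):
            ok += 1
        else:
            errors.append(result.get("error", "unknown error"))
    return {"ok": ok, "failed": len(errors), "errors": errors}


if __name__ == "__main__":
    # Made-up response body, for illustration only.
    sample = '{"success": true}\n{"success": false, "error": "Bad JSON."}\n'
    print(summarize_import(sample))
```

Piping the `curl` output into a check like this would surface per-document failures that an HTTP 200 response alone does not reveal.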
9 changes: 9 additions & 0 deletions typesense/scripts/make-alias-for-typesense-collection.sh
@@ -0,0 +1,9 @@
TYPESENSE_ORIGIN="${TYPESENSE_ORIGIN:-http://localhost:8108}"
TYPESENSE_API_KEY="${TYPESENSE_API_KEY:-test_api_key}"
TYPESENSE_COLLECTION_NAME="${TYPESENSE_COLLECTION_NAME:-mm_product_docs}"

curl "${TYPESENSE_ORIGIN}/aliases/mm_product_docs" -X PUT \
-H "Content-Type: application/json" \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" -d "{
\"collection_name\": \"${TYPESENSE_COLLECTION_NAME}\"
}"
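
The alias lets the search client keep querying `mm_product_docs` while the underlying collection changes. The sample `.env` in this PR uses a name of the form `mm_product_docs_1745019244`; reading that suffix as a Unix timestamp is an inference, not something the PR states. Under that assumption, a sketch of generating a versioned name and the alias body (both helper names are hypothetical):

```python
import json
import time


def versioned_collection_name(base: str, now=None) -> str:
    """Append a Unix timestamp to base (assumed naming scheme, see lead-in)."""
    stamp = int(time.time() if now is None else now)
    return f"{base}_{stamp}"


def alias_payload(collection_name: str) -> str:
    """JSON body for PUT /aliases/<alias>, pointing the alias at a collection."""
    return json.dumps({"collection_name": collection_name})


if __name__ == "__main__":
    name = versioned_collection_name("mm_product_docs")
    print(name)
    print(alias_payload(name))
```

Building the body with `json.dumps` also avoids the hand-escaped quotes used in the shell version.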
7 changes: 7 additions & 0 deletions typesense/scripts/reset-typesense-collection.sh
@@ -0,0 +1,7 @@
TYPESENSE_ORIGIN="${TYPESENSE_ORIGIN:-http://localhost:8108}"
TYPESENSE_API_KEY="${TYPESENSE_API_KEY:-test_api_key}"
TYPESENSE_COLLECTION_NAME="mm_product_docs"

curl -X DELETE \
"${TYPESENSE_ORIGIN}/collections/${TYPESENSE_COLLECTION_NAME}/documents?truncate=true&filter_by=anchor:!=none" \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}"
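
The query string above contains `:`, `!`, and `=` inside the `filter_by` value, and `curl` sends them as written. Whether the server requires percent-encoding is not established by this PR; if a client library is used instead, the URL can be built with explicit encoding as in this sketch (`build_reset_url` is a hypothetical helper):

```python
from urllib.parse import urlencode


def build_reset_url(origin: str, collection: str) -> str:
    """Build the document-deletion URL with a percent-encoded query string."""
    query = urlencode({"truncate": "true", "filter_by": "anchor:!=none"})
    return f"{origin}/collections/{collection}/documents?{query}"


if __name__ == "__main__":
    print(build_reset_url("http://localhost:8108", "mm_product_docs"))
```

Unlike the shell script, which fixes the collection name, the function takes it as a parameter so the same call can target a versioned collection.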