Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch searches to use vespa by default #205

Merged
merged 13 commits into from
Jan 11, 2024

Conversation

olaughter
Copy link
Contributor

@olaughter olaughter commented Dec 19, 2023

Description

Makes vespa the primary search database and transfers opensearch tests over to run against vespa. This does not delete the opensearch tests or code as we're planning on keeping them around a bit longer while we build confidence in vespa.

Points to note:

  • At this stage the switchover is done by changing the use_vespa parameter to default to true, this is the path we have tested and should mean opensearch can still be queried until we come back and delete it
  • Test setup and coverage could be better, I refactored the opensearch versions somewhat but had to draw a line in the sand
  • Having a copy of the vespa schema in this repo is less than ideal. I initially attempted to fetch the production config as part of test setup but it's in a private repo, which leads to more issues than its worth. Hoping to revisit this problem to find a better solution at some point, otherwise we'd risk schema drift.
  • Some tests could not be copied over, either because they no longer are needed (yay!), or because of outstanding issues (boo!).

Next steps:

  • Ensure ingest into Opensearch and Vespa is complete
  • Setup and test vespa prod credentials
  • Merge, semvar and deploy this branch

Type of change

Please select the option(s) below that are most relevant:

  • Bug fix
  • New feature
  • Breaking change

How Has This Been Tested?

I rewrote as many opensearch tests as I could. Also months of iterating and manual testing

Reviewer Checklist

  • The PR represents a single feature (small driveby fixes are also ok)
  • The PR includes tests that are sufficient for the level of risk
  • The code is sufficiently commented, particularly in hard-to-understand areas
  • Any required documentation updates have been made
  • Any TODOs added are captured in future tickets
  • No FIXMEs remain

Copy link

linear bot commented Dec 19, 2023

@olaughter olaughter force-pushed the feature/pdct-712-switch-to-vespa-as-primary-search-db branch 2 times, most recently from a6f9f92 to 07ece8d Compare December 20, 2023 14:14
@olaughter olaughter force-pushed the feature/pdct-712-switch-to-vespa-as-primary-search-db branch 15 times, most recently from 5ae5c3e to 3166f93 Compare January 10, 2024 11:50
Minimal change switchover to make it easy to switch back if needed, this
just makes vespa the default over opensearch. The api can still query
opensearch by setting `use_vespa` to False. We can delete the opensearch
code when we're confident we won't need it as a fallback
This distinguishes opensearch tests by giving them their own pytest mark,
it also adds a parameter to the api request to cause the endpoint to still
use opensearch for these tests. Finally we rename the make task for
opensearch specific tests
Adds test db setup, fixtures and commands to run vespa in ci
Some have been made obsolete, either overtime or because vespa handles
something differently. But this adapts as many of the existing opensearch
tests as possible to point at the vespa route instead
No longer needed as we store the slug in vespa
These tests are currently failing due to an issue in the Data access
library. Removing them until the issue can be resolved
@olaughter olaughter force-pushed the feature/pdct-712-switch-to-vespa-as-primary-search-db branch from 3166f93 to 2eb2b99 Compare January 10, 2024 12:14
@olaughter olaughter marked this pull request as ready for review January 10, 2024 12:31
@olaughter olaughter requested a review from katybaulch January 10, 2024 12:47
Copy link
Contributor

@THOR300 THOR300 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi! Great to get this ticked off!

Left some none blocking comments. Just some thoughts and a pointer as to whether we need docstrings and return types etc.

I'm just holding off approving as this is such an important and large PR.

The question I'm thinking is "All these tests look very good and concise but how do we know we've tested everything we need to test prior to a switch" - does copying OS tests leave us exposed to anything vespa specific etc.?

And yes agreed we should very quickly try and find a way of dynamically pulling in the vespa schema

@olaughter
Copy link
Contributor Author

Thanks Mark! I've either followed up or ticketed your suggestions.

"All these tests look very good and concise but how do we know we've tested everything we need to test prior to a switch" - does copying OS tests leave us exposed to anything vespa specific etc.?

I'd say this gets us far enough to go live. It does provide a framework for adding more search tests in that we now have the basic setup in place. There is for sure plenty of room for improvement!

Copy link
Contributor

@diversemix diversemix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work - lots of effort - much appreciated you did the heavy lifting on this 🙇

@THOR300 THOR300 self-requested a review January 11, 2024 13:30
Copy link
Contributor

@THOR300 THOR300 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@olaughter olaughter merged commit 8dfc8e1 into main Jan 11, 2024
2 checks passed
@olaughter olaughter deleted the feature/pdct-712-switch-to-vespa-as-primary-search-db branch January 11, 2024 13:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants