PYTHON-5089 Convert test.test_mongos_load_balancing to async #2107

Merged: 11 commits, Feb 6, 2025
223 changes: 223 additions & 0 deletions test/asynchronous/test_mongos_load_balancing.py
@@ -0,0 +1,223 @@
# Copyright 2015-present MongoDB, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Test AsyncMongoClient's mongos load balancing using a mock."""
from __future__ import annotations

import asyncio
import sys
import threading

from pymongo.operations import _Op

sys.path[0:0] = [""]

from test.asynchronous import AsyncMockClientTest, async_client_context, connected, unittest
from test.asynchronous.pymongo_mocks import AsyncMockClient
from test.utils import async_wait_until

from pymongo.errors import AutoReconnect, InvalidOperation
from pymongo.server_selectors import writable_server_selector
from pymongo.topology_description import TOPOLOGY_TYPE

_IS_SYNC = False


@async_client_context.require_connection
@async_client_context.require_no_load_balancer
def asyncSetUpModule():
Contributor

Sadly there isn't an async version of unittest.setUpModule. I'd just add these decorators to an asyncSetUp method for TestMongosLoadBalancing instead since it's the only test class in this file.

Contributor Author

Ah good to know, makes sense
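
For illustration, a minimal standard-library sketch of the suggestion above: `unittest.IsolatedAsyncioTestCase` awaits `asyncSetUp` before each test, so an awaitable precondition can live there. `server_is_reachable` and `TestNeedsServer` are hypothetical stand-ins, not the project's `require_connection` decorator or its test base class.

```python
import asyncio
import unittest


async def server_is_reachable() -> bool:
    """Hypothetical stand-in for an awaitable precondition check."""
    await asyncio.sleep(0)
    return False  # Pretend no server is available.


class TestNeedsServer(unittest.IsolatedAsyncioTestCase):
    async def asyncSetUp(self):
        # unittest awaits asyncSetUp, so the check itself may be a coroutine.
        if not await server_is_reachable():
            self.skipTest("no server available")

    async def test_ping(self):
        self.fail("never reached: asyncSetUp skipped the test")


def run_suite() -> unittest.TestResult:
    """Run the example class and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestNeedsServer)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

The trade-off raised in the thread still applies: a per-class check runs once per test rather than once per module.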

    pass


if not _IS_SYNC:
Contributor

Let's standardize these checks to if _IS_SYNC for clarity.

Contributor Author

Okay yes, that's what I wanted (and did initially), but for some reason, if it's if _IS_SYNC first, the type checker assumes both definitions of SimpleOp inherit from threading.Thread and then insists that both implementations adhere to the Thread API (in this case, join must return a value).
Is that preferred? It felt weird to return a value for the sake of it in the async version of SimpleOp, since it's not used at all.

Contributor

Our typing checks do, or the IDE's own highlighting does? Either way, we should have all checks be for _IS_SYNC whenever possible for clarity.

Contributor Author

Both. Our typing checks are what caught my attention first, actually.


    class SimpleOp:
        def __init__(self, client):
            self.task: asyncio.Task
Contributor

Suggested change
- self.task: asyncio.Task
+ self.task: asyncio.Task = None

Contributor Author

Hmm, I don't love that because it'd mess with the type of self.task (specifically, it'd have to be Optional[asyncio.Task]). In the interest of not making it optional, I moved the assignment from start() to the init. Does that work? Is that better?

Contributor

Can you show what you mean exactly?

Contributor Author

(I thought I had committed and pushed, haha. Sorry about that, it's actually pushed now.)
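
The three options discussed in this thread can be compared with a small sketch. The class names are hypothetical and exist only for this comparison. A bare annotation such as `self.task: asyncio.Task` assigns nothing at runtime, `Optional[...] = None` widens the type, and assigning in `__init__` requires a running event loop at construction time.

```python
import asyncio
from typing import Optional


class DeclaredOnly:
    def __init__(self):
        # Annotation without a value: no attribute is created at runtime.
        self.task: asyncio.Task


class OptionalTask:
    def __init__(self):
        # The attribute exists immediately, but its type now admits None.
        self.task: Optional[asyncio.Task] = None


class EagerTask:
    def __init__(self, coro):
        # create_task needs a running loop, so build this inside a coroutine.
        self.task: asyncio.Task = asyncio.create_task(coro)


async def noop():
    return "done"


async def demo():
    eager = EagerTask(noop())
    return hasattr(DeclaredOnly(), "task"), OptionalTask().task, await eager.task
```

With the declared-only form, calling `join()` before `start()` would raise `AttributeError` rather than fail on `None`.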

            self.client = client
            self.passed = False

        async def run(self):
            await self.client.db.command("ping")
            self.passed = True  # No exception raised.

        def start(self):
            self.task = asyncio.create_task(self.run())

        async def join(self):
            await self.task
else:

    class SimpleOp(threading.Thread):
        def __init__(self, client):
            super().__init__()
            self.client = client
            self.passed = False

        def run(self):
            self.client.db.command("ping")
            self.passed = True  # No exception raised.


async def do_simple_op(client, nthreads):
    threads = [SimpleOp(client) for _ in range(nthreads)]
Contributor

What do you think about using tasks here instead of threads?

Contributor Author

@sleepyStick sleepyStick Jan 31, 2025

I like it and wanted to do it, but noticed that it'd result in the sync version of the file using tasks too. Would I need to modify synchro to change tasks to threads to get this to work?

Contributor

We'd need to have it do so only within test files and only for a specific set of tokens, so that it doesn't also change things that happen to be called threads but aren't what we want to synchro.

I'd rather have concurrent executors always be called tasks for consistency rather than have them be called threads in the async code.

Contributor Author

Makes sense, it's tasks now :)
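
One way to get a single name and API for "concurrent executors" on the async side is a small wrapper that exposes `start()` and `join()` like `threading.Thread` but is backed by a task. This is a sketch under assumptions: `TaskRunner` and `run_concurrently` are hypothetical names, and this is not the `ConcurrentRunner` referenced elsewhere on this page, whose implementation is not shown here.

```python
import asyncio


class TaskRunner:
    """Hypothetical Thread-like wrapper around an asyncio task."""

    def __init__(self, target):
        self.target = target  # An async callable taking no arguments.
        self.task = None
        self.passed = False

    async def run(self):
        await self.target()
        self.passed = True  # No exception raised.

    def start(self):
        # Must be called while an event loop is running.
        self.task = asyncio.create_task(self.run())

    async def join(self):
        await self.task


async def run_concurrently(target, ntasks):
    """Start ntasks runners, wait for all of them, and report success flags."""
    tasks = [TaskRunner(target) for _ in range(ntasks)]
    for t in tasks:
        t.start()
    for t in tasks:
        await t.join()
    return [t.passed for t in tasks]
```

Because the call sites only use `start()`, `join()`, and `passed`, removing `async`/`await` mechanically would leave code with the same shape as the threaded version.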

    for t in threads:
        t.start()

    for t in threads:
        await t.join()

    for t in threads:
        assert t.passed


async def writable_addresses(topology):
    return {
        server.description.address
        for server in await topology.select_servers(writable_server_selector, _Op.TEST)
    }


class TestMongosLoadBalancing(AsyncMockClientTest):
    def mock_client(self, **kwargs):
        mock_client = AsyncMockClient(
            standalones=[],
            members=[],
            mongoses=["a:1", "b:2", "c:3"],
            host="a:1,b:2,c:3",
            connect=False,
            **kwargs,
        )
        self.addAsyncCleanup(mock_client.aclose)

        # Latencies in seconds.
        mock_client.mock_rtts["a:1"] = 0.020
        mock_client.mock_rtts["b:2"] = 0.025
        mock_client.mock_rtts["c:3"] = 0.045
        return mock_client

    async def test_lazy_connect(self):
        # While connected() ensures we can trigger connection from the main
        # thread and wait for the monitors, this test triggers connection from
        # several threads at once to check for data races.
        nthreads = 10
        client = self.mock_client()
        self.assertEqual(0, len(client.nodes))

        # Trigger initial connection.
        await do_simple_op(client, nthreads)
        await async_wait_until(lambda: len(client.nodes) == 3, "connect to all mongoses")

    async def test_failover(self):
        nthreads = 10
        client = await connected(self.mock_client(localThresholdMS=0.001))
        await async_wait_until(lambda: len(client.nodes) == 3, "connect to all mongoses")

        # Our chosen mongos goes down.
        client.kill_host("a:1")

        # Trigger failover to higher-latency nodes. AutoReconnect should be
        # raised at most once in each thread.
        passed = []

        async def f():
            try:
                await client.db.command("ping")
            except AutoReconnect:
                # Second attempt succeeds.
                await client.db.command("ping")

            passed.append(True)

        if _IS_SYNC:
            threads = [threading.Thread(target=f) for _ in range(nthreads)]
Contributor

Can we use ConcurrentRunner for these as well?
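
For reference, the async branch under discussion boils down to "run N coroutines concurrently, each retrying once". The sketch below shows that pattern in isolation with stand-ins: `FakeAutoReconnect` and `FlakyClient` are hypothetical and only mimic the test's mock behaviour. It uses `asyncio.gather` instead of the explicit task list in the diff, and is not the `ConcurrentRunner` approach asked about. Whether the project's synchro tool could translate `gather` is not covered here.

```python
import asyncio


class FakeAutoReconnect(Exception):
    """Stand-in for pymongo.errors.AutoReconnect."""


class FlakyClient:
    """Hypothetical client whose first ping per caller fails once."""

    def __init__(self):
        self.failed = set()

    async def ping(self, caller_id):
        await asyncio.sleep(0)  # Yield so the callers interleave.
        if caller_id not in self.failed:
            self.failed.add(caller_id)
            raise FakeAutoReconnect("chosen host is down")
        return {"ok": 1}


async def ping_with_one_retry(client, caller_id, passed):
    try:
        await client.ping(caller_id)
    except FakeAutoReconnect:
        await client.ping(caller_id)  # Second attempt succeeds.
    passed.append(True)


async def run_failover(ntasks):
    client, passed = FlakyClient(), []
    await asyncio.gather(*(ping_with_one_retry(client, i, passed) for i in range(ntasks)))
    return len(passed)
```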

            for t in threads:
                t.start()

            for t in threads:
                t.join()
        else:
            tasks = [asyncio.create_task(f()) for _ in range(nthreads)]
            for t in tasks:
                await t

        self.assertEqual(nthreads, len(passed))

        # Down host removed from list.
        self.assertEqual(2, len(client.nodes))

    async def test_local_threshold(self):
        client = await connected(self.mock_client(localThresholdMS=30))
        self.assertEqual(30, client.options.local_threshold_ms)
        await async_wait_until(lambda: len(client.nodes) == 3, "connect to all mongoses")
        topology = client._topology

        # All are within a 30-ms latency window, see self.mock_client().
        self.assertEqual({("a", 1), ("b", 2), ("c", 3)}, await writable_addresses(topology))

        # No error
        await client.admin.command("ping")

        client = await connected(self.mock_client(localThresholdMS=0))
        self.assertEqual(0, client.options.local_threshold_ms)
        # No error
        await client.db.command("ping")
        # Our chosen mongos goes down.
        client.kill_host("{}:{}".format(*next(iter(client.nodes))))
        try:
            await client.db.command("ping")
        except:
            pass

        # We eventually connect to a new mongos.
        async def connect_to_new_mongos():
            try:
                return await client.db.command("ping")
            except AutoReconnect:
                pass

        await async_wait_until(connect_to_new_mongos, "connect to a new mongos")

    async def test_load_balancing(self):
        # Although the server selection JSON tests already prove that
        # select_servers works for sharded topologies, here we do an end-to-end
        # test of discovering servers' round trip times and configuring
        # localThresholdMS.
        client = await connected(self.mock_client())
        await async_wait_until(lambda: len(client.nodes) == 3, "connect to all mongoses")

        # Prohibited for topology type Sharded.
        with self.assertRaises(InvalidOperation):
            await client.address

        topology = client._topology
        self.assertEqual(TOPOLOGY_TYPE.Sharded, topology.description.topology_type)

        # a and b are within the 15-ms latency window, see self.mock_client().
        self.assertEqual({("a", 1), ("b", 2)}, await writable_addresses(topology))

        client.mock_rtts["a:1"] = 0.045

        # Discover only b is within latency window.
        async def predicate():
            return {("b", 2)} == await writable_addresses(topology)

        await async_wait_until(
            predicate,
            'discover server "a" is too far',
        )


if __name__ == "__main__":
    unittest.main()
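
The latency-window comments in the tests above can be checked with a short worked example. This is a simplified sketch, not the driver's selection code: it assumes a server is eligible when its round trip time is at most the fastest server's RTT plus `localThresholdMS` (15 ms by default), and it uses the mock RTTs from `mock_client()` converted to milliseconds. `within_latency_window` is a hypothetical helper name.

```python
def within_latency_window(rtts_ms, local_threshold_ms=15):
    """Return hosts whose RTT is within the threshold of the fastest host."""
    fastest = min(rtts_ms.values())
    return {host for host, rtt in rtts_ms.items() if rtt <= fastest + local_threshold_ms}


# Mock RTTs from mock_client(), in milliseconds.
rtts = {"a:1": 20, "b:2": 25, "c:3": 45}

default_window = within_latency_window(rtts)      # cutoff 20 + 15 = 35 ms
wide_window = within_latency_window(rtts, 30)     # cutoff 20 + 30 = 50 ms

# After the test raises a's RTT to 45 ms, b becomes the fastest host.
rtts_after = {"a:1": 45, "b:2": 25, "c:3": 45}
narrowed = within_latency_window(rtts_after)      # cutoff 25 + 15 = 40 ms
```

Under these assumptions the three results match the sets asserted in `test_load_balancing` and `test_local_threshold`: a and b, then all three, then only b.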
58 changes: 44 additions & 14 deletions test/test_mongos_load_balancing.py
@@ -15,6 +15,7 @@
 """Test MongoClient's mongos load balancing using a mock."""
 from __future__ import annotations

+import asyncio
 import sys
 import threading

@@ -30,22 +31,43 @@
 from pymongo.server_selectors import writable_server_selector
 from pymongo.topology_description import TOPOLOGY_TYPE

+_IS_SYNC = True
+

 @client_context.require_connection
 @client_context.require_no_load_balancer
 def setUpModule():
Contributor

We can keep these to improve performance when skipping this test suite.

Contributor Author

I moved it to the class's setUp because on the async side it'd require setUpModule to be awaited, and I couldn't find an easy way to achieve that. I figured it was the only class in this module, so it didn't make a difference to just move it into the class. But if you know how I could await this in async, I have no hesitation about bringing it back!

Contributor

Oh whoops my bad, forgot how the wrappers interact with async. You're right!

Contributor Author

All good, wasn't sure if I was missing / forgetting what the trick was!
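
The underlying constraint can be shown without unittest at all: a module-level hook is invoked as a plain call, and calling a coroutine function without `await` only creates a coroutine object, so its body never runs. In this sketch `async_set_up_module` and `call_like_unittest` are hypothetical names used to imitate that plain call.

```python
import asyncio

calls = []


async def async_set_up_module():
    calls.append("ran")


def call_like_unittest(hook):
    """Invoke hook as a plain call, the way a synchronous runner would."""
    result = hook()
    if asyncio.iscoroutine(result):
        result.close()  # Discard the un-awaited coroutine explicitly.
        return False
    return True


ran_synchronously = call_like_unittest(async_set_up_module)
```

After this runs, `calls` is still empty, which is why the skip checks were moved into the class's setup method instead.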

     pass


-class SimpleOp(threading.Thread):
-    def __init__(self, client):
-        super().__init__()
-        self.client = client
-        self.passed = False
+if not _IS_SYNC:
+
+    class SimpleOp:
+        def __init__(self, client):
+            self.task: asyncio.Task
+            self.client = client
+            self.passed = False
+
+        def run(self):
+            self.client.db.command("ping")
+            self.passed = True  # No exception raised.
+
+        def start(self):
+            self.task = asyncio.create_task(self.run())

-    def run(self):
-        self.client.db.command("ping")
-        self.passed = True  # No exception raised.
+        def join(self):
+            self.task
+else:
+
+    class SimpleOp(threading.Thread):
+        def __init__(self, client):
+            super().__init__()
+            self.client = client
+            self.passed = False
+
+        def run(self):
+            self.client.db.command("ping")
+            self.passed = True  # No exception raised.
Member

Can we settle on a common pattern for the threading.Thread -> Task conversions? I made some suggestions in #2103.

Contributor

Thoughts on the approach in #2094?

Contributor Author

Yeah, I believe if I implement Noah's suggestion above we'd be using the same pattern :)

Contributor Author

(I think my tab wasn't fully refreshed earlier, so I'm just seeing Noah's comment right now.) But I actually really like that approach!



 def do_simple_op(client, nthreads):
@@ -118,12 +140,17 @@ def f():

             passed.append(True)

-        threads = [threading.Thread(target=f) for _ in range(nthreads)]
-        for t in threads:
-            t.start()
+        if _IS_SYNC:
+            threads = [threading.Thread(target=f) for _ in range(nthreads)]
+            for t in threads:
+                t.start()

-        for t in threads:
-            t.join()
+            for t in threads:
+                t.join()
+        else:
+            tasks = [asyncio.create_task(f()) for _ in range(nthreads)]
+            for t in tasks:
+                t

         self.assertEqual(nthreads, len(passed))

@@ -183,8 +210,11 @@ def test_load_balancing(self):
         client.mock_rtts["a:1"] = 0.045

         # Discover only b is within latency window.
+        def predicate():
+            return {("b", 2)} == writable_addresses(topology)
+
         wait_until(
-            lambda: {("b", 2)} == writable_addresses(topology),
+            predicate,
             'discover server "a" is too far',
         )

1 change: 1 addition & 0 deletions tools/synchro.py
@@ -216,6 +216,7 @@ def async_only_test(f: str) -> bool:
     "test_gridfs_spec.py",
     "test_logger.py",
     "test_monitoring.py",
+    "test_mongos_load_balancing.py",
     "test_on_demand_csfle.py",
     "test_raw_bson.py",
     "test_read_concern.py",