@john-sanchez31
Contributor

Resolves #448

@john-sanchez31 marked this pull request as ready for review October 30, 2025 17:29
@john-sanchez31 requested review from a team, hadia206, juankx-bodo and knassre-bodo and removed request for a team October 30, 2025 17:29
Contributor

@knassre-bodo left a comment

Revisions left for restaurant code

What is the total count of restaurants in each city?
"""
return locations.PARTITION(name="city", by=city_name).CALCULATE(
city_name, total_count=NDISTINCT(locations.restaurant_id)
Contributor

Should be able to just do COUNT(locations) (since each restaurant_id value is unique)
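
A minimal plain-Python sketch of why the two aggregations agree here. The sample rows and the helper names `count_rows` and `ndistinct_ids` are hypothetical stand-ins, not PyDough APIs; the only point is that counting rows and counting distinct keys give the same per-city total whenever `restaurant_id` is unique.

```python
from collections import defaultdict

# Hypothetical sample of the locations table; restaurant_id is unique per row.
locations = [
    {"restaurant_id": 1, "city_name": "san francisco"},
    {"restaurant_id": 2, "city_name": "san francisco"},
    {"restaurant_id": 3, "city_name": "new york"},
]


def group_by_city(rows):
    """Bucket rows by city_name, mimicking a partition on that column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["city_name"]].append(row)
    return groups


def count_rows(rows):
    """Per-city row count (the COUNT-style aggregation)."""
    return {city: len(group) for city, group in group_by_city(rows).items()}


def ndistinct_ids(rows):
    """Per-city count of distinct restaurant_id values (the NDISTINCT-style aggregation)."""
    return {
        city: len({row["restaurant_id"] for row in group})
        for city, group in group_by_city(rows).items()
    }


row_counts = count_rows(locations)
distinct_counts = ndistinct_ids(locations)
```

Both dictionaries come out identical for this data, so the cheaper row count is enough as long as the uniqueness assumption holds.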

"column name": "county",
"data type": "string",
"description": "The name of the county",
"sample values": ["New York", "San Francisco"],
Contributor

These county names are ALSO city names. Let's include more sample values that are unambiguously county names (make sure to update the other graphs): Miami-Dade, Cook

Comment on lines 4441 to 4445
"name": "geographies",
"type": "simple table",
"table path": "main.geographic",
"unique properties": [
"city_name"
Contributor

Since each row represents a unique city, I'd advise changing the name of the collection to cities, and altering the description accordingly.

"type": "simple table",
"table path": "main.location",
"unique properties": [
"restaurant_id"
Contributor

Wouldn't each combination of (house_number, street_name, city_name) also be unique?

Contributor Author

Do you mean setting unique properties to ["restaurant_id", ["house_number", "street_name", "city_name"]] or to [["restaurant_id", "house_number", "street_name", "city_name"]]? Will this change affect how the current queries are written?

Contributor

The first one:

["restaurant_id", ["house_number", "street_name", "city_name"]]

And no, it won't necessarily change any of the current queries.
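
To make the difference between the two shapes concrete, here is a small plain-Python sketch. It assumes that each entry of the list is read as one key, where a string is a single-column key and a nested list is a composite key; the rows and the `is_unique` helper are hypothetical and only illustrate that reading.

```python
# Hypothetical rows; column names follow the metadata discussed above.
rows = [
    {"restaurant_id": 1, "house_number": 10, "street_name": "market st", "city_name": "san francisco"},
    {"restaurant_id": 2, "house_number": 10, "street_name": "market st", "city_name": "oakland"},
    {"restaurant_id": 3, "house_number": 12, "street_name": "market st", "city_name": "san francisco"},
]


def is_unique(rows, key):
    """Return True if no two rows share the same value for the key.

    key is either a column name or a list of column names (a composite key).
    """
    columns = [key] if isinstance(key, str) else list(key)
    seen = set()
    for row in rows:
        value = tuple(row[column] for column in columns)
        if value in seen:
            return False
        seen.add(value)
    return True


# First shape: two independent keys, one simple and one composite.
unique_properties = ["restaurant_id", ["house_number", "street_name", "city_name"]]
results = [is_unique(rows, key) for key in unique_properties]
```

Under this reading, the second shape would declare only a single four-column key, which is a weaker statement than saying `restaurant_id` is unique on its own.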

"relationships": [
{
"type": "simple join",
"name": "locations",
Contributor

I'd call this restaurant_locations since it is mapping a city to each restaurant location within the city.

Comment on lines 2951 to 2962
high_rated_restaurants = COUNT(
restaurants.WHERE((rating > 4.0) & (LOWER(city_name) == "new york"))
)
low_rated_restaurants = COUNT(
restaurants.WHERE((rating < 4.0) & (LOWER(city_name) == "new york"))
)
return Restaurants.CALCULATE(
ratio=(
high_rated_restaurants
/ KEEP_IF(low_rated_restaurants, low_rated_restaurants > 0)
)
)
Contributor

Same as before:

Suggested change
nyc_restaurants = restaurants.WHERE(LOWER(city_name) == "new york")
n_hi_rating = SUM(nyc_restaurants.rating > 4.0)
n_lo_rating = SUM(nyc_restaurants.rating < 4.0)
return Restaurants.CALCULATE(
ratio=(
n_hi_rating
/ KEEP_IF(n_lo_rating, n_lo_rating != 0)
)
)
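
For readers unfamiliar with the pattern, here is a rough plain-Python analogue of what the suggestion computes. It assumes that summing a boolean comparison counts the matching rows and that `KEEP_IF` yields null when its condition is false, so the ratio becomes null instead of dividing by zero. The `keep_if` and `safe_ratio` helpers and the ratings are hypothetical.

```python
def keep_if(value, condition):
    """Return value when condition holds, otherwise None (standing in for null)."""
    return value if condition else None


def safe_ratio(numerator, denominator):
    """Divide, but propagate None when the denominator is zero."""
    denominator = keep_if(denominator, denominator != 0)
    return None if denominator is None else numerator / denominator


# Hypothetical ratings for restaurants already filtered to one city.
ratings = [4.5, 3.2, 4.8, 3.9, 4.0]

# Summing booleans counts the rows that satisfy each comparison.
n_hi_rating = sum(rating > 4.0 for rating in ratings)
n_lo_rating = sum(rating < 4.0 for rating in ratings)

ratio = safe_ratio(n_hi_rating, n_lo_rating)
no_low_ratings = safe_ratio(3, 0)
```

Filtering the city once and counting both groups from the same filtered set avoids repeating the city condition in each count.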

Comment on lines 2973 to 2985
vegan_sf_rest = COUNT(
restaurants.WHERE(
(LOWER(food_type) == "vegan") & (LOWER(city_name) == "san francisco")
)
)
no_vegan_sf_rest = COUNT(
restaurants.WHERE(
(LOWER(food_type) != "vegan") & (LOWER(city_name) == "san francisco")
)
)
return Restaurants.CALCULATE(
ratio=(vegan_sf_rest / KEEP_IF(no_vegan_sf_rest, no_vegan_sf_rest > 0))
)
Contributor

@knassre-bodo Oct 30, 2025

Same as before:

Suggested change
sf_restaurants = restaurants.WHERE(LOWER(city_name) == "san francisco")
n_vegan = SUM(LOWER(sf_restaurants.food_type) == "vegan")
n_non_vegan = SUM(LOWER(sf_restaurants.food_type) != "vegan")
return Restaurants.CALCULATE(
ratio=(
n_vegan
/ KEEP_IF(n_non_vegan, n_non_vegan != 0)
)
)

Comment on lines 2996 to 3004
italian_la_rest = COUNT(
restaurants.WHERE(
(LOWER(food_type) == "italian") & (LOWER(city_name) == "los angeles")
)
)
no_italian_la_rest = COUNT(restaurants.WHERE((LOWER(city_name) == "los angeles")))
return Restaurants.CALCULATE(
ratio=(italian_la_rest / KEEP_IF(no_italian_la_rest, no_italian_la_rest > 0))
)
Contributor

Same as before:

    la_restaurants = restaurants.WHERE(LOWER(city_name) == "los angeles")
    n_la_italian = SUM(LOWER(la_restaurants.food_type) == "italian")
    n_la = COUNT(la_restaurants)
    return Restaurants.CALCULATE(
        ratio=(
            n_la_italian
            / KEEP_IF(n_la, n_la != 0)
        )
    )

Comment on lines 3046 to 3049
geographies.WHERE(COUNT(restaurants) > 0)
.PARTITION(name="regions", by=region)
.CALCULATE(rest_region=region, avg_rating=AVG(geographies.restaurants.rating))
.ORDER_BY(region.ASC())
Contributor

Can replace geographies.WHERE(COUNT(restaurants) > 0) with geographies.WHERE(HAS(restaurants))

What's the name and food type of all the restaurants located on Market St in
San Francisco?
"""
return locations.WHERE((LOWER(street_name) == "market st")).CALCULATE(
Contributor

Suggested change
return locations.WHERE((LOWER(street_name) == "market st") & (LOWER(city_name) == "san francisco")).CALCULATE(

Contributor

@hadia206 left a comment

Good job. Thanks John

# Find the treatments from the doctors within the specialty in the past 6 months
recent_treatments = doctors.prescribed_treatments.WHERE(
- DATEDIFF("months", start_date, DATETIME("now")) <= 6
+ start_date >= DATETIME("now", "-6 months", "start of day")
Contributor

Why this change?
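
One way the two filters could differ, sketched in plain Python. This assumes, without confirmation from the thread, that `DATEDIFF("months", ...)` counts calendar-month boundaries crossed and that `DATETIME("now", "-6 months", "start of day")` steps back six months and truncates to midnight. The helper names, the day-clamping rule, and the dates are hypothetical.

```python
import calendar
from datetime import datetime


def month_boundaries_between(start, end):
    """Count calendar-month boundaries crossed going from start to end."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def minus_months_start_of_day(moment, months):
    """Step back a number of months, clamp the day, and truncate to midnight."""
    index = moment.year * 12 + (moment.month - 1) - months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    day = min(moment.day, calendar.monthrange(year, month)[1])
    return datetime(year, month, day)


now = datetime(2025, 10, 30, 17, 29)  # hypothetical "now"
start_date = datetime(2025, 4, 1)  # early in the month six months back

# Boundary counting: April to October is 6 boundaries, so the row is kept.
old_filter = month_boundaries_between(start_date, now) <= 6

# Cutoff comparison: the cutoff is 2025-04-30 00:00, so the row is dropped.
new_filter = start_date >= minus_months_start_of_day(now, 6)
```

Under these assumptions the boundary-counting form admits rows that are almost seven months old, while the cutoff form keeps a strict six-month window.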

@john-sanchez31 merged commit 5fb0b18 into main Nov 11, 2025
19 checks passed
@john-sanchez31 deleted the John/defog_restaurants branch November 11, 2025 23:07
Development

Successfully merging this pull request may close these issues.

Add Defog Restaurants Database

4 participants