Adding defog Restaurants database for testing #449
knassre-bodo left a comment
Revisions left for restaurant code
| What is the total count of restaurants in each city? | ||
| """ | ||
| return locations.PARTITION(name="city", by=city_name).CALCULATE( | ||
| city_name, total_count=NDISTINCT(locations.restaurant_id) |
Should be able to just do COUNT(locations) (since each restaurant_id value is unique)
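As a rough illustration of why the two aggregations agree here, the plain-Python sketch below groups some made-up rows by city and compares the row count with the distinct-id count. The sample rows and the helper name `per_city_counts` are hypothetical and are not part of the PR.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the `locations` collection.
locations = [
    {"restaurant_id": 1, "city_name": "san francisco"},
    {"restaurant_id": 2, "city_name": "san francisco"},
    {"restaurant_id": 3, "city_name": "new york"},
]


def per_city_counts(rows):
    """Return {city: (row_count, distinct_id_count)} for the given rows."""
    ids_by_city = defaultdict(list)
    for row in rows:
        ids_by_city[row["city_name"]].append(row["restaurant_id"])
    return {city: (len(ids), len(set(ids))) for city, ids in ids_by_city.items()}


counts = per_city_counts(locations)
# When every restaurant_id is unique, both counts match for every city.
all_equal = all(n_rows == n_distinct for n_rows, n_distinct in counts.values())
```

The distinct count only matters when ids can repeat within a group, which the uniqueness of restaurant_id rules out.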
| "column name": "county", | ||
| "data type": "string", | ||
| "description": "The name of the county", | ||
| "sample values": ["New York", "San Francisco"], |
These county names are ALSO city names. Let's include more sample values that are unambiguously county names (make sure to update the other graphs): Miami-Dade, Cook
| "name": "geographies", | ||
| "type": "simple table", | ||
| "table path": "main.geographic", | ||
| "unique properties": [ | ||
| "city_name" |
Since each row represents a unique city, I'd advise changing the name of the collection to cities, and altering the description accordingly.
| "type": "simple table", | ||
| "table path": "main.location", | ||
| "unique properties": [ | ||
| "restaurant_id" |
Wouldn't each combination of (house_number, street_name, city_name) also be unique?
Do you mean setting unique properties to ["restaurant_id", ["house_number", "street_name", "city_name"]] or to [["restaurant_id", "house_number", "street_name", "city_name"]]? Will this change affect how the current queries are written?
The first one:
["restaurant_id", ["house_number", "street_name", "city_name"]]
And no, it won't necessarily change any of the current queries.
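To make the difference between the two spellings concrete, here is a small plain-Python sketch. It assumes, based on this thread, that each top-level entry is a separate uniqueness claim and that a nested list is one compound key. The helper `is_unique` and the sample rows are hypothetical.

```python
def is_unique(rows, prop):
    """Check one unique property: a column name or a list of column names."""
    columns = [prop] if isinstance(prop, str) else list(prop)
    keys = [tuple(row[c] for c in columns) for row in rows]
    return len(keys) == len(set(keys))


address = ["house_number", "street_name", "city_name"]

# Hypothetical rows: ids are unique and so are the full addresses.
rows = [
    {"restaurant_id": 1, "house_number": 10, "street_name": "market st", "city_name": "san francisco"},
    {"restaurant_id": 2, "house_number": 10, "street_name": "market st", "city_name": "oakland"},
    {"restaurant_id": 3, "house_number": 12, "street_name": "market st", "city_name": "san francisco"},
]
first_form = [is_unique(rows, p) for p in ["restaurant_id", address]]

# Hypothetical rows with a repeated id: the single four-column key still passes,
# so the second spelling is a weaker claim than the first.
dup_rows = [
    {"restaurant_id": 1, "house_number": 10, "street_name": "market st", "city_name": "san francisco"},
    {"restaurant_id": 1, "house_number": 12, "street_name": "market st", "city_name": "san francisco"},
]
second_form_on_dups = is_unique(dup_rows, ["restaurant_id"] + address)
id_alone_on_dups = is_unique(dup_rows, "restaurant_id")
```

Under that reading, the first form states two independent facts about the table, while the second only says the four columns together never repeat.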
| "relationships": [ | ||
| { | ||
| "type": "simple join", | ||
| "name": "locations", |
I'd call this restaurant_locations since it is mapping a city to each restaurant location within the city.
| high_rated_restaurants = COUNT( | ||
| restaurants.WHERE((rating > 4.0) & (LOWER(city_name) == "new york")) | ||
| ) | ||
| low_rated_restaurants = COUNT( | ||
| restaurants.WHERE((rating < 4.0) & (LOWER(city_name) == "new york")) | ||
| ) | ||
| return Restaurants.CALCULATE( | ||
| ratio=( | ||
| high_rated_restaurants | ||
| / KEEP_IF(low_rated_restaurants, low_rated_restaurants > 0) | ||
| ) | ||
| ) |
Same as before:
| high_rated_restaurants = COUNT( | |
| restaurants.WHERE((rating > 4.0) & (LOWER(city_name) == "new york")) | |
| ) | |
| low_rated_restaurants = COUNT( | |
| restaurants.WHERE((rating < 4.0) & (LOWER(city_name) == "new york")) | |
| ) | |
| return Restaurants.CALCULATE( | |
| ratio=( | |
| high_rated_restaurants | |
| / KEEP_IF(low_rated_restaurants, low_rated_restaurants > 0) | |
| ) | |
| ) | |
| nyc_restaurants = restaurants.WHERE(LOWER(city_name) == "new york") | |
| n_hi_rating = SUM(nyc_restaurants.rating > 4.0) | |
| n_lo_rating = SUM(nyc_restaurants.rating < 4.0) | |
| return Restaurants.CALCULATE( | |
| ratio=( | |
| n_hi_rating | |
| / KEEP_IF(n_lo_rating, n_lo_rating != 0) | |
| ) | |
| ) |
| vegan_sf_rest = COUNT( | ||
| restaurants.WHERE( | ||
| (LOWER(food_type) == "vegan") & (LOWER(city_name) == "san francisco") | ||
| ) | ||
| ) | ||
| no_vegan_sf_rest = COUNT( | ||
| restaurants.WHERE( | ||
| (LOWER(food_type) != "vegan") & (LOWER(city_name) == "san francisco") | ||
| ) | ||
| ) | ||
| return Restaurants.CALCULATE( | ||
| ratio=(vegan_sf_rest / KEEP_IF(no_vegan_sf_rest, no_vegan_sf_rest > 0)) | ||
| ) |
Same as before:
| vegan_sf_rest = COUNT( | |
| restaurants.WHERE( | |
| (LOWER(food_type) == "vegan") & (LOWER(city_name) == "san francisco") | |
| ) | |
| ) | |
| no_vegan_sf_rest = COUNT( | |
| restaurants.WHERE( | |
| (LOWER(food_type) != "vegan") & (LOWER(city_name) == "san francisco") | |
| ) | |
| ) | |
| return Restaurants.CALCULATE( | |
| ratio=(vegan_sf_rest / KEEP_IF(no_vegan_sf_rest, no_vegan_sf_rest > 0)) | |
| ) | |
| sf_restaurants = restaurants.WHERE(LOWER(city_name) == "san francisco") | |
| n_vegan = SUM(LOWER(sf_restaurants.food_type) == "vegan") | |
| n_non_vegan = SUM(LOWER(sf_restaurants.food_type) != "vegan") | |
| return Restaurants.CALCULATE( | |
| ratio=( | |
| n_vegan | |
| / KEEP_IF(n_non_vegan, n_non_vegan != 0) | |
| ) | |
| ) |
| italian_la_rest = COUNT( | ||
| restaurants.WHERE( | ||
| (LOWER(food_type) == "italian") & (LOWER(city_name) == "los angeles") | ||
| ) | ||
| ) | ||
| no_italian_la_rest = COUNT(restaurants.WHERE((LOWER(city_name) == "los angeles"))) | ||
| return Restaurants.CALCULATE( | ||
| ratio=(italian_la_rest / KEEP_IF(no_italian_la_rest, no_italian_la_rest > 0)) | ||
| ) |
Same as before:
la_restaurants = restaurants.WHERE(LOWER(city_name) == "los angeles")
n_la_italian = SUM(LOWER(la_restaurants.food_type) == "italian")
n_la = COUNT(la_restaurants)
return Restaurants.CALCULATE(
ratio=(
n_la_italian
/ KEEP_IF(n_la, n_la != 0)
)
)
| geographies.WHERE(COUNT(restaurants) > 0) | ||
| .PARTITION(name="regions", by=region) | ||
| .CALCULATE(rest_region=region, avg_rating=AVG(geographies.restaurants.rating)) | ||
| .ORDER_BY(region.ASC()) |
Can replace geographies.WHERE(COUNT(restaurants) > 0) with geographies.WHERE(HAS(restaurants))
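A tiny plain-Python analogy for why the two filters select the same rows: an existence check and a "count greater than zero" check keep the same cities. The `cities` mapping is made-up data, not taken from the database.

```python
# Hypothetical mapping from city to the ratings of its restaurants.
cities = {"san francisco": [4.5, 3.9], "fresno": [], "new york": [4.1]}

# Count-based filter: keep cities with at least one restaurant.
by_count = [city for city, ratings in cities.items() if len(ratings) > 0]

# Existence-based filter: stops at the first restaurant found.
by_has = [city for city, ratings in cities.items() if any(True for _ in ratings)]
```

The existence form states the intent directly and does not require counting every match.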
| What's the name and food type of all the restaurants located on Market St in | ||
| San Francisco? | ||
| """ | ||
| return locations.WHERE((LOWER(street_name) == "market st")).CALCULATE( |
| return locations.WHERE((LOWER(street_name) == "market st")).CALCULATE( | |
| return locations.WHERE((LOWER(street_name) == "market st") & (LOWER(city_name) == "san francisco")).CALCULATE( |
hadia206 left a comment
Good job. Thanks John
| # Find the treatments from the doctors within the specialty in the past 6 months | ||
| recent_treatments = doctors.prescribed_treatments.WHERE( | ||
| DATEDIFF("months", start_date, DATETIME("now")) <= 6 | ||
| start_date >= DATETIME("now", "-6 months", "start of day") |
Why this change?
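The thread does not record the author's reason. As one possible difference, the plain-Python sketch below compares the two filters under two assumptions that are not confirmed here: that a month-unit DATEDIFF counts calendar month boundaries crossed, and that the new expression means "now minus six calendar months, truncated to midnight". The helper names and dates are hypothetical.

```python
import calendar
from datetime import datetime


def month_boundaries(start, end):
    """Assumed month-unit difference: calendar month boundaries crossed."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def minus_months(moment, months):
    """Shift back by whole calendar months, clamping the day to the month length."""
    year, month0 = divmod(moment.year * 12 + (moment.month - 1) - months, 12)
    month = month0 + 1
    day = min(moment.day, calendar.monthrange(year, month)[1])
    return moment.replace(year=year, month=month, day=day)


now = datetime(2025, 3, 20, 15, 30)  # hypothetical "now"
cutoff = minus_months(now, 6).replace(hour=0, minute=0, second=0, microsecond=0)

start_date = datetime(2024, 9, 2)  # earlier in the month than the cutoff day
old_filter = month_boundaries(start_date, now) <= 6  # boundary-count version
new_filter = start_date >= cutoff  # explicit cutoff version
```

Under those assumptions the boundary-count filter admits treatments from anywhere in the calendar month six months back, while the explicit cutoff admits only those on or after the same day of that month.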
Resolves #448