This repository has been archived by the owner on Jan 27, 2021. It is now read-only.

GeoDa integration #5

Open
bertday opened this issue Apr 19, 2020 · 5 comments
Labels
core Relates to core code or basic features of the app

Comments

@bertday
Contributor

bertday commented Apr 19, 2020

The current Atlas uses GeoDa wrapped as a WebAssembly module for a few spatial analysis tasks:

  • Hotspot/cluster analysis (aka LISA)
  • Cartograms
  • Possibly others (wondering if @lixun910 might have more insights?)

This is a really interesting approach and has been fun to explore from an engineering standpoint. However, I did want to bring up a few discussion points as we start to migrate these features into the refactored app.

  • Browser support: WebAssembly has wide support on modern browsers, but is not implemented on legacy browsers such as Internet Explorer. Based on our user analytics, about 2.2% of visits came from IE. That's a small number to be sure, but my concern is that some of these folks are key stakeholders at less-resourced institutions, or are working under IT policies that bar them from installing their own browser (this is common at hospitals and gov't agencies). As we consider the overall accessibility of the site, we may want to keep these users in mind.
  • Performance/scaling: My understanding was that there were some issues with the in-browser LISA analysis taking too long when the app started, which is why we've been caching the results as JSON files and checking them into the repo (if I got any of that wrong, please feel free to weigh in @lixun910). As our data grows over time and as we add new sources, this will likely become a larger issue.
  • Packaging: the current WebAssembly module injects a variable called Module into the global JavaScript scope, which presents some challenges when integrating with a Node-based project such as the refactored app. I've started a repo that would turn the module into a Node package, but this may take more time than we've allocated for the refactor.

At the same time I think we can all recognize that GeoDa is driving some of the most important insights in the Atlas, so I wanted to make a few suggestions around how we could modify the stack to provide the best user experience possible without compromising on spatial analysis.

  • Preprocessing: One option would be to preprocess the data with GeoDa so that the WebAssembly module isn't needed for things like viewing clusters; this would address both browser support and performance issues. As I mentioned above, we may already be doing this to some extent, so it would just be a matter of formalizing the caching process and possibly using pygeoda in place of the browser workflow. I think this would be the most straightforward approach and would be my recommendation.
  • Server-side processing: Another option would be to deploy pygeoda as a lightweight backend service, such as an AWS Lambda function, that the browser app could call when it needs to run analysis. We could use something like this in the case of the cartogram, which I'm not sure could be pre-processed as easily.
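
To make the two options above a bit more concrete, here is a minimal Python sketch of how they could share one analysis function. Everything named here is an assumption for illustration: `compute_clusters` is a hypothetical callable (in practice it might wrap a pygeoda local Moran call, which is not shown), and `build_cluster_cache`, `make_handler`, and the `values`/`clusters` JSON fields are made-up names. The handler assumes the request/response shape of an API Gateway proxy integration in front of Lambda.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List, Sequence

# Hypothetical signature: one input value per area in, one cluster label per
# area out. The real analysis (e.g. via pygeoda) would be plugged in here.
ClusterFn = Callable[[Sequence[float]], List[int]]


def build_cluster_cache(
    values_by_date: Dict[str, Sequence[float]],
    compute_clusters: ClusterFn,
    out_path: Path,
) -> Dict[str, List[int]]:
    """Preprocessing option: run the analysis once per date and write the
    results to a single JSON file that the front end can load directly."""
    cache = {
        date: compute_clusters(values)
        for date, values in sorted(values_by_date.items())
    }
    out_path.write_text(json.dumps(cache))
    return cache


def make_handler(compute_clusters: ClusterFn):
    """Server-side option: build a Lambda-style handler around the same
    analysis function so the browser can request results on demand."""

    def handler(event, context):
        # With a proxy integration, the request body arrives as a JSON string.
        payload = json.loads(event.get("body") or "{}")
        values = payload.get("values")
        if not isinstance(values, list):
            return {
                "statusCode": 400,
                "body": json.dumps({"error": "expected a 'values' list"}),
            }
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"clusters": compute_clusters(values)}),
        }

    return handler
```

Because the analysis function is passed in, the same code path could feed both the cached JSON files and an on-demand endpoint, and it can be exercised with a stub before anyone has gotten up to speed with pygeoda.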

I was wondering if the group had any additional thoughts or ideas re: GeoDa on the front end. Once again, I think this is a fantastic feature of the app and I'm looking forward to finding a long-term solution for supporting it in the Atlas 😄

/cc @jkoschinsky @Makosak @lixun910 @linqinyu

@bertday bertday added the core Relates to core code or basic features of the app label Apr 19, 2020
@lixun910
Member

lixun910 commented Apr 20, 2020 via email

@lixun910
Member

lixun910 commented Apr 20, 2020 via email

@bertday
Contributor Author

bertday commented Apr 20, 2020

Thank you @lixun910, really appreciate your feedback on this! I didn’t realize it was only USAFacts being cached so that helps. I had forgotten about centroid labels. I’m not as familiar with weights but maybe we can review those sometime 🙂

Thinking about this some more, I’m not sure that I’m going to have time to get up to speed with pygeoda and write the caching script at the same time as working on the front end, so that may be a reason to take a more incremental approach. The issue of browser support might still be good to talk about sometime with @Makosak

I’d like to try to wrap the Wasm code into an npm package if I can (quickly). I’ll follow up on that repo with a few questions I had. If it feels like it’s taking too long, I’m happy to try to bring in the global module.

@lixun910
Member

lixun910 commented Apr 20, 2020 via email

@Makosak

Makosak commented Apr 20, 2020

Process sounds great overall. I'd be in favor of more caching, as the central data core will get large and caching could help the app work on a variety of browsers, but we could spin off that fast GeoDa action for some customized tools in the future. But really, whatever works best for the core crew here.

We could also just show both cores + neighbors for clusters for the next release instead of the neighbor highlight -- it will be okay to wait until the May release so we have time to test out both solutions and determine which is easier & more effective with users. Getting in more data easily will still be the first priority, and we can pull features back in from the GeoDa functionality next.
