ARCHITECTURE.md

Ubuntu.com application architecture

This document lays out the high-level architecture of the ubuntu.com application. The ubuntu.com domain has many different pieces drawn from a number of different places.

The core application

ubuntu.com is a Flask v1 app. It makes use of a number of our standard Python modules. Specifically, these two affect the whole site:

  • flask-base: This is our core Flask app (instantiated here) that sets default functionality (e.g. redirects.yaml, templates/404.html, robots.txt, favicon, caching headers, security headers)
  • templatefinder: A standard view (created here) for serving any template based on its path. E.g. creating a file at templates/my-new-page.html will result in a page being displayed at ubuntu.com/my-new-page.
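The path-to-template mapping described above can be pictured with a small sketch. This is not the templatefinder module's actual code: `resolve_template` is a hypothetical helper, and the rule of trying `<path>.html` before `<path>/index.html` is an assumption made for illustration.

```python
from pathlib import Path
from typing import Optional


def resolve_template(url_path: str, templates_dir: Path) -> Optional[Path]:
    """Return the template file that would serve url_path, or None.

    Hypothetical helper: the real templatefinder view may resolve
    paths differently.
    """
    clean = url_path.strip("/")
    if ".." in clean.split("/"):
        return None  # refuse paths that try to climb out of templates_dir

    # "/" maps to index.html; "/my-new-page" maps to my-new-page.html,
    # falling back to my-new-page/index.html (assumed lookup order).
    if not clean:
        candidates = ["index.html"]
    else:
        candidates = [f"{clean}.html", f"{clean}/index.html"]

    for candidate in candidates:
        template = templates_dir / candidate
        if template.is_file():
            return template
    return None
```

Under these assumptions, `resolve_template("/my-new-page", Path("templates"))` would return `templates/my-new-page.html` if that file exists, and `None` otherwise.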

Local development

The project uses dotrun for local development, with standard endpoints (serve, build, test, watch, etc.) defined in package.json. For more information, see the README.md.

Deployment

Merging a pull request (PR) into the main branch will automatically trigger a deployment to the production site, which normally takes around 5 minutes. This follows our standard deployment flow.

File structure

Major website areas

Homepage

The homepage (url rule) has no server-side dynamic functionality, so it is served directly by the template finder view from templates/index.html.

Since ubuntu.com is a high-traffic site, and the homepage is its most popular page, it is important that the initial response for this page remains as fast as possible. The homepage should therefore avoid making server-side database or API calls wherever possible.

Homepage takeovers and engage pages

The main central area of the homepage contains a "takeover": a large strip advertising a Canonical product or service. These takeovers often link to "engage pages", each of which is a page about a specific piece of marketing content, such as a whitepaper.

The list of takeovers is maintained by the Marketing team. When the homepage loads, a random takeover is chosen. Each reload then cycles to the next takeover in the list, so you can see all takeovers by refreshing a few times.
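The selection behaviour can be sketched as follows. The real logic runs client-side in the browser; it is written in Python here only for illustration. `next_takeover_index` is a hypothetical name, and how the previously shown index is remembered between page loads is not covered here.

```python
import random
from typing import Optional


def next_takeover_index(count: int, previous: Optional[int] = None) -> int:
    """Pick which takeover to show.

    previous is the index shown on the last page load, or None on a
    first visit. Hypothetical helper, not the site's actual code.
    """
    if count <= 0:
        raise ValueError("at least one takeover is required")
    if previous is None:
        # First visit: start from a random takeover.
        return random.randrange(count)
    # Reload: advance to the next takeover, wrapping at the end of the list.
    return (previous + 1) % count
```

With three takeovers, a visitor who last saw index 2 would next see index 0, so a few refreshes walk through the whole list.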

The takeovers are loaded client-side by calling https://ubuntu.com/takeovers.json (url rule, view function). You can also see all available takeovers listed at https://ubuntu.com/takeovers (url rule, view function), and you can see available engage pages at https://ubuntu.com/engage.

The takeover content is maintained by Marketing in the takeovers category in Discourse, and the /takeovers.json view pulls the content from there through the Discourse API using our Discourse module (instantiated here). Similarly, the engage pages are maintained in a Discourse category, which populates https://ubuntu.com/engage.

Here's a more complete guide to the engage pages and takeovers system.

Blog

The blog pages under https://ubuntu.com/blog make use of the blog module (instantiated here) to pull in content from the admin.insights.ubuntu.com WordPress installation through the WordPress API. This content is managed through the WordPress admin area, which is only accessible while on the company VPN.

Documentation

There are a large number of documentation areas on ubuntu.com (complete at the time of writing, I think):

Each of these is served with our Discourse module, and pulls its content from a set of topics in Discourse, as with the takeovers and engage pages.

For more information, see the Creating Discourse based documentation pages guide.

Search

https://ubuntu.com/search (url rule) makes use of our Google Programmable Search account to pull results through Google's API using our search module. The account is configured to index many different domains that we own, and results can be displayed from all of them.

We have had trouble with search spam in the past, which led to us hitting API rate limits and breaking our search. There is now rate limiting at the content-cache level, which will hopefully take care of this. We've set up a dashboard in Graylog for monitoring the traffic on the search engine.
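To illustrate the general idea of rate limiting (not the actual content-cache configuration, which is not described here), the following is a minimal sliding-window limiter. The class name, the limit and the window size are all assumptions chosen for the example.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each client.

    Illustrative only: the production rate limiting lives in the
    content cache, not in application code like this.
    """

    def __init__(
        self,
        limit: int,
        window: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client: str) -> bool:
        now = self.clock()
        hits = self._hits[client]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the limit: reject without recording
        hits.append(now)
        return True
```

A limiter like this, keyed on the client address, would reject a burst of spam searches before they reach the upstream search API and consume its quota.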

Downloads

The download pages, e.g. https://ubuntu.com/download/desktop, include links for people to download Ubuntu. These pages get extremely busy on our six-monthly release days.

When people click the "Download" button they are sent to the thank-you page, e.g. https://ubuntu.com/download/desktop/thank-you?version=22.04.3&architecture=amd64 (url rule, view function). Similar to the homepage, it's important that this page is served as straightforwardly as possible with no back-end API/database calls so the page can remain responsive.

After the page has loaded, JavaScript will download the list of download mirrors from https://ubuntu.com/mirrors.json and choose one to trigger the download with. If JavaScript isn't available, the download will instead be triggered from our own download server, releases.ubuntu.com.

The mirrors.json view:

  • Looks for a list of mirrors in the local file etc/ubuntu-mirrors-rss.xml. This file gets built into the production image in the Docker build when the site is released. To refresh the mirror list, the site needs to be released again.
  • Uses geolite2 to match the request IP address to a location and serves mirrors for that location.
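The two steps above can be sketched as a single filtering function. This is a simplified illustration, not the real view: the mirror dictionary shape, the `mirrors_for_request` name, matching on country code, and returning an empty list when the address cannot be located are all assumptions. The `lookup_country` callable stands in for the geolite2 lookup.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical shape for a parsed mirror entry:
# {"link": "<mirror URL>", "country": "<ISO country code>"}
Mirror = Dict[str, str]


def mirrors_for_request(
    ip_address: str,
    mirrors: List[Mirror],
    lookup_country: Callable[[str], Optional[str]],
) -> List[str]:
    """Return mirror links for the country the request IP resolves to.

    lookup_country stands in for the geolite2 lookup; it is assumed to
    return a country code, or None when the address cannot be located.
    """
    country = lookup_country(ip_address)
    if country is None:
        return []  # assumption: no location means no mirrors are suggested
    return [m["link"] for m in mirrors if m["country"] == country]
```

Passing the lookup in as a parameter keeps the sketch independent of any particular geolocation library and makes it easy to exercise with a fake lookup.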

Security pages

https://ubuntu.com/security/cves (url rule, view function) and https://ubuntu.com/security/notices (url rule, view function) list CVEs and Ubuntu Security Notices respectively. These pages are built by pulling in information from the security API. The local API models are here, and this API is documented at https://ubuntu.com/security/api/docs.

The API, even though it's hosted on ubuntu.com at ubuntu.com/security/cves.json etc., is actually deployed separately in the ubuntu-com-security-api project. This is to keep this high-traffic database-backed service separate from the main ubuntu.com application so as not to introduce stability issues.

Pro

To be completed by @jpmartinspt.