This repository has been archived by the owner on Jul 23, 2020. It is now read-only.
The idea of this epic is to improve the different mechanisms we have in place to report the state of che.openshift.io. We would add solutions to monitor the state, report the metrics, and expose them in a way that can be leveraged to provide a better user experience.
Example: When we detect that the platform is having trouble starting workspaces, we should inform users that their workspace might take longer than usual to get ready.
User Story 1: As a user, I should be able to know the status of the platform
Today, we already measure some aspects of how the platform is behaving:
Workspace startup time
PVC mount time with empty PVC
PVC mount time with big PVC
Those metrics are currently not exposed, and not everybody can access them.
With those metrics, we want to set up the basis of a status page and our monitoring tools:
Have an agent that runs the tests at a defined interval
Expose the metrics in the Prometheus format
Add a status page that displays the information (like status.io)
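As a rough illustration of the first two steps, here is a minimal sketch in Python of an agent that times a set of probes and renders the results in the Prometheus text exposition format. The metric name and the probe callables are hypothetical; this epic does not define them, and a real agent would more likely rely on an existing Prometheus client library and a scheduler.

```python
import time
from typing import Callable, Dict


def measure(probe: Callable[[], None]) -> float:
    """Run one probe and return how long it took, in seconds."""
    start = time.monotonic()
    probe()
    return time.monotonic() - start


def run_once(probes: Dict[str, Callable[[], None]]) -> Dict[str, float]:
    """Run every probe once and collect one sample per metric name."""
    return {name: measure(probe) for name, probe in probes.items()}


def render_prometheus(samples: Dict[str, float]) -> str:
    """Render samples as gauges in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(samples.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical probe: a real one would start a workspace and wait for it.
    probes = {"che_workspace_startup_seconds": lambda: time.sleep(0.01)}
    print(render_prometheus(run_once(probes)), end="")
```

Calling `run_once` on a fixed interval and serving the rendered text over HTTP would give Prometheus something to scrape; both of those parts are left out of this sketch.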
There are many different online services that are providing information about the state of their platform:
User Story 2: As an admin, or ops of the system, I should be notified/alerted when the platform is not behaving properly.
Once the metrics are reported in the Prometheus format and the status is available to end users, we need to put an alerting system in place so that when something goes wrong with the platform, the information is reported to the right people.
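One possible shape for such an alerting check, sketched in Python under stated assumptions: the rule fires only when the most recent samples all exceed a threshold, which avoids paging on a single slow run. The `AlertRule` type, the threshold, and the metric name are hypothetical; in practice this logic would probably live in the alerting layer of the monitoring stack rather than in custom code.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class AlertRule:
    metric: str               # hypothetical metric name
    threshold_seconds: float  # a sample above this value counts as a breach
    consecutive: int          # how many of the latest samples must all breach


def should_alert(rule: AlertRule, history: Sequence[float]) -> bool:
    """Return True when the last `consecutive` samples all exceed the threshold."""
    if rule.consecutive < 1:
        raise ValueError("consecutive must be at least 1")
    if len(history) < rule.consecutive:
        return False  # not enough data yet to decide
    recent = history[-rule.consecutive:]
    return all(value > rule.threshold_seconds for value in recent)
```

For example, with a rule requiring three consecutive breaches of 60 seconds, the history `[10, 70, 80, 90]` triggers an alert while `[70, 80, 10, 90]` does not.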
User Story 3: As an ops or admin of the platform, I'd like to get more insights about how the platform is behaving.
Once the basics are set up, we would enrich the set of metrics we track:
time of pulling images
time of pulling images that are already cached
time to create routes
time to clone a repository
time spent initializing a Language Server
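A minimal sketch of how these per-phase timings could be collected, assuming the phases run sequentially inside one process. The `PhaseTimer` class and the phase names are hypothetical and are not part of Che.

```python
import time
from contextlib import contextmanager
from typing import Callable, Dict, Iterator


class PhaseTimer:
    """Record how long each named phase of a workspace start takes."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so the timer can be tested
        self.durations: Dict[str, float] = {}

    @contextmanager
    def phase(self, name: str) -> Iterator[None]:
        start = self._clock()
        try:
            yield
        finally:
            # Record the duration even if the phase raised an exception.
            self.durations[name] = self._clock() - start


if __name__ == "__main__":
    timer = PhaseTimer()
    with timer.phase("image_pull"):
        time.sleep(0.01)  # placeholder for the real work
    with timer.phase("repository_clone"):
        time.sleep(0.01)
    print(timer.durations)
```

Each entry in `durations` could then be exported as its own metric alongside the existing workspace startup and PVC mount timings.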
User Story 4: As a user of the product, I should be notified in my environment if something is misbehaving on the platform.
User Story 5: As someone deploying Che, I'd like to benefit from this tooling.
We should be able to provide those tools to anyone who sets up Che on their own.
User Story 6: As a user, I want to get in-context feedback about the state of the platform.
There are multiple aspects where we could provide information about the state of the platform:
When starting the workspace: if the platform is currently unable to start workspaces quickly, we should display a message such as "The platform is currently under load, your workspace may take longer than usual to get ready."
When in the IDE: we could have a small status widget in the status bar, showing different indicators about the state of the platform
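The workspace-start case could be decided along these lines. This is only a sketch: comparing the median of recent startup times against a baseline, and the 1.5x slowdown factor, are assumptions made for illustration and are not specified by this epic.

```python
from statistics import median
from typing import Optional, Sequence

LOAD_MESSAGE = (
    "The platform is currently under load, your workspace may take "
    "longer than usual to get ready."
)


def startup_notice(
    recent_seconds: Sequence[float],
    baseline_seconds: float,
    slowdown_factor: float = 1.5,  # hypothetical tolerance before warning
) -> Optional[str]:
    """Return a message for the user when recent startups are unusually slow."""
    if not recent_seconds:
        return None  # no data, so say nothing rather than guess
    if median(recent_seconds) > baseline_seconds * slowdown_factor:
        return LOAD_MESSAGE
    return None
```

Using the median rather than the mean keeps a single outlier from triggering the message; the same function could feed one of the indicators in the status bar widget.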