|
2 | 2 | title: "Gemini" |
3 | 3 | --- |
4 | 4 |
|
5 | | -Google's [Gemini 2.5 Computer Use model](https://blog.google/technology/google-deepmind/gemini-computer-use-model/) is a specialized model built on Gemini 2.5 Pro's capabilities to power agents that can interact with user interfaces. |
| 5 | +[Gemini 2.5 Computer Use](https://blog.google/technology/google-deepmind/gemini-computer-use-model/) is a Google model that lets AI agents interact with computers the way humans do: by looking at screens, moving cursors, clicking buttons, and typing text. Agents built on it can control web browsers, navigate interfaces, and perform multi-step tasks across applications. |
| 6 | + |
| 7 | +With Gemini Computer Use, the AI can: |
| 8 | +- **Navigate websites and applications** by interpreting visual interfaces |
| 9 | +- **Click buttons and fill forms** just like a human would |
| 10 | +- **Take screenshots** to understand and verify its actions |
| 11 | +- **Perform multi-step workflows** that span multiple applications or web pages |
6 | 12 |
|
7 | 13 | By integrating Gemini 2.5 Computer Use with Kernel, you can run these AI-powered browser automations on cloud-hosted infrastructure, eliminating the need for local browser management and enabling scalable, reliable AI agents. |
8 | 14 |
|
9 | | -## Quick setup with our example template |
| 15 | +## Quick setup with Computer Use |
| 16 | + |
| 17 | +Get started with Gemini Computer Use and Kernel using our pre-configured app template: |
| 18 | + |
| 19 | +```bash |
| 20 | +npx @onkernel/create-kernel-app my-computer-use-app |
| 21 | +``` |
10 | 22 |
|
11 | | -Get started quickly with our TypeScript template that demonstrates Gemini 2.5 Computer Use with Kernel. |
| 23 | +Choose `TypeScript` as the programming language and then select `gemini-cua` as the template. |
12 | 24 |
|
13 | | -Check out the [Open-source Gemini Template](https://github.com/onkernel/ts-stagehand-google-cua-agent) repository for a complete working example that shows how to: |
14 | | -- Set up Gemini 2.5 Computer Use with Kernel |
15 | | -- Use Stagehand for browser automation |
16 | | -- Run AI-powered web interactions on cloud infrastructure |
| 25 | +Then follow the [Quickstart guide](/quickstart/) to deploy and run your Computer Use automation on Kernel's infrastructure. |
17 | 26 |
|
18 | | -## Benefits of using Kernel with Gemini Computer Use |
| 27 | +## Benefits of using Kernel with Computer Use |
19 | 28 |
|
20 | 29 | - **No local browser management**: Run Computer Use automations without installing or maintaining browsers locally |
21 | | -- **Scalability**: Launch multiple browser sessions in parallel for concurrent automations |
22 | | -- **Stealth mode**: Built-in anti-detection features for web interactions |
| 30 | +- **Scalability**: Launch multiple browser sessions in parallel for concurrent AI agents |
| 31 | +- **Stealth mode**: Built-in anti-detection features for reliable web interactions |
23 | 32 | - **Session persistence**: Maintain browser state across automation runs |
24 | | -- **Live view**: Debug your automations with real-time browser viewing |
| 33 | +- **Live view**: Debug your Computer Use agents with real-time browser viewing |
| 34 | +- **Cloud infrastructure**: Run computationally intensive AI agents without local resource constraints |
25 | 35 |
|
26 | 36 | ## Next steps |
27 | 37 |
|
28 | | -- Check out [live view](/browsers/live-view) for debugging your automations |
| 38 | +- Check out [live view](/browsers/live-view) for debugging your Computer Use automations |
29 | 39 | - Learn about [stealth mode](/browsers/stealth) for avoiding detection |
30 | 40 | - Learn how to properly [terminate browser sessions](/browsers/termination) |
31 | 41 | - Learn how to [deploy](/apps/deploy) your Computer Use app to Kernel |
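
The new intro describes a loop: take a screenshot, let the model pick an action, perform it, and repeat until the task is done. A minimal sketch of that loop follows, for readers who want the shape of it before opening the template. Every name here (`Action`, `BrowserSession`, `ComputerUseModel`, `runAgentLoop`) is hypothetical and invented for illustration. None of it is the Kernel SDK, the Gemini API, or the `gemini-cua` template's actual code.

```typescript
// Hypothetical types for illustration only; not Kernel's or Gemini's API.
type Action =
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "done"; summary: string };

// Assumed shape of a remote browser: it can be observed and acted on.
interface BrowserSession {
  screenshot(): Promise<string>; // e.g. a base64-encoded image
  perform(action: Exclude<Action, { kind: "done" }>): Promise<void>;
}

// Assumed shape of the model: goal + latest screenshot in, next action out.
interface ComputerUseModel {
  nextAction(goal: string, screenshot: string): Promise<Action>;
}

// Observe, decide, act, repeat. Stops when the model reports "done"
// or when the step budget runs out (summary is null in that case).
async function runAgentLoop(
  goal: string,
  browser: BrowserSession,
  model: ComputerUseModel,
  maxSteps = 10,
): Promise<{ steps: number; summary: string | null }> {
  for (let step = 0; step < maxSteps; step++) {
    const shot = await browser.screenshot();
    const action = await model.nextAction(goal, shot);
    if (action.kind === "done") {
      return { steps: step, summary: action.summary };
    }
    await browser.perform(action);
  }
  return { steps: maxSteps, summary: null };
}
```

The step budget is a design choice rather than a requirement: it keeps a confused agent from looping forever against a page it cannot make progress on. In a real integration, the two interfaces would be backed by a cloud browser session and a model client respectively.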