# n1 Browser Agent

A Chrome extension that executes browser automation tasks from natural language using the Yutori n1 API.

## Features

- **Autonomous Agent Loop**: Enter a task and the agent executes it step by step until completion
- **Screenshot-based Understanding**: Captures a screenshot after each action so the n1 model can see how the page changed
- **Full Action Support**: Supports click, type, scroll, key press, hover, drag, navigation, and more
- **Real-time Progress**: Watch the agent's thoughts and actions as it works
- **Stop Anytime**: Cancel running tasks with the stop button
- **Visual Feedback**: Highlights elements being interacted with

## Installation

1. **Get an API Key**
   - Sign up at [Yutori](https://yutori.com) to get your API key

2. **Load the Extension in Chrome**
   - Open Chrome and go to `chrome://extensions/`
   - Enable "Developer mode" (toggle in top right)
   - Click "Load unpacked"
   - Select the `n1-browser-extension` folder

3. **Configure the Extension**
   - Click the extension icon in your toolbar
   - Click the settings (gear) icon
   - Enter your Yutori API key
   - Click "Save"

## Usage

1. Navigate to any webpage (e.g., google.com)
2. Click the n1 Browser Agent extension icon
3. Enter a task, for example:
   - "Search for 'best restaurants in NYC' and click the first result"
   - "Go to amazon.com and search for wireless headphones"
   - "Fill out the contact form with test data and submit it"
   - "Find the login button and click it"

4. Click "Start Task" or press Enter
5. The agent will autonomously:
   - Capture a screenshot of the current page
   - Analyze the page and decide on the next action
   - Execute the action (click, type, scroll, etc.)
   - Repeat until the task is complete or you click "Stop"

6. Watch progress in the action log, where the agent's thoughts and actions appear in real time
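
The loop in step 5 can be sketched roughly as follows. This is a minimal illustration and not the extension's actual code: `runAgentLoop`, the injected `captureScreenshot`, `decideNextAction`, `executeAction`, and `shouldStop` callbacks, the returned status strings, and the `maxSteps` cap are all hypothetical names chosen for the example. It assumes the model signals completion with a `stop` action.

```javascript
// Hypothetical sketch of the capture -> decide -> execute loop.
// All dependencies are injected, so the function itself touches no browser APIs.
async function runAgentLoop(task, deps, maxSteps = 25) {
  const { captureScreenshot, decideNextAction, executeAction, shouldStop } = deps;
  const log = [];

  for (let step = 0; step < maxSteps; step++) {
    // The user clicked "Stop".
    if (shouldStop()) {
      return { status: "cancelled", log };
    }

    const screenshot = await captureScreenshot();
    const action = await decideNextAction(task, screenshot, log);
    log.push(action);

    // A `stop` action marks task completion, so nothing is executed for it.
    if (action.type === "stop") {
      return { status: "completed", log };
    }

    await executeAction(action);
  }

  // Safety cap (an assumption of this sketch) so a confused run cannot loop forever.
  return { status: "max_steps_reached", log };
}
```

Injecting the callbacks keeps the control flow separate from screenshot capture and API calls, which makes the loop easy to exercise with stubs.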

## Supported Actions

| Action | Description | Example Command |
|--------|-------------|-----------------|
| `click` | Click on an element | "Click the submit button" |
| `type` | Type text into an input | "Type 'hello' in the search field" |
| `scroll` | Scroll the page | "Scroll down", "Scroll to the bottom" |
| `key_press` | Press keyboard keys | "Press Enter", "Press Ctrl+A" |
| `hover` | Hover over an element | "Hover over the menu" |
| `drag` | Drag and drop | "Drag the slider to the right" |
| `goto_url` | Navigate to a URL | "Go to google.com" |
| `go_back` | Go to previous page | "Go back" |
| `refresh` | Refresh the page | "Refresh the page" |
| `wait` | Wait for a duration | "Wait for 2 seconds" |
| `stop` | Task completion | (automatic) |
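
One way to route these actions to handlers is a lookup keyed by action name. The sketch below is illustrative only: the action names come from the table above, but `dispatchAction`, the `handlers` object, and the assumption that each action carries its name in a `type` field are hypothetical, and the n1 API's real response shape may differ.

```javascript
// Action names taken from the table above; everything else here is hypothetical.
const SUPPORTED_ACTIONS = new Set([
  "click", "type", "scroll", "key_press", "hover", "drag",
  "goto_url", "go_back", "refresh", "wait", "stop",
]);

// Route an action object to its handler and reject anything the table does not list.
async function dispatchAction(action, handlers) {
  if (!SUPPORTED_ACTIONS.has(action.type)) {
    throw new Error(`Unsupported action: ${action.type}`);
  }
  const handler = handlers[action.type];
  if (typeof handler !== "function") {
    throw new Error(`No handler registered for: ${action.type}`);
  }
  return handler(action);
}
```

Rejecting unknown names up front means a malformed or unexpected action surfaces as an error instead of being silently ignored.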

## Technical Details

### Coordinate System
The n1 API uses a 1000x1000 relative coordinate system. The extension automatically converts these relative coordinates to actual page coordinates based on the viewport size.
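
The conversion amounts to scaling each axis by the viewport dimension. Here is a minimal sketch, assuming the relative coordinates run from 0 to 1000 on both axes and that results are rounded to whole pixels; `toViewportCoords` and `N1_GRID_SIZE` are hypothetical names, and the extension's own code may handle rounding or scroll offsets differently.

```javascript
// Size of the relative grid described above.
const N1_GRID_SIZE = 1000;

// Scale a point from the 1000x1000 grid to viewport pixels.
// Rounding to the nearest pixel is an assumption of this sketch.
function toViewportCoords(x, y, viewportWidth, viewportHeight) {
  return {
    x: Math.round((x / N1_GRID_SIZE) * viewportWidth),
    y: Math.round((y / N1_GRID_SIZE) * viewportHeight),
  };
}
```

For instance, under these assumptions `toViewportCoords(500, 250, 1280, 720)` yields `{ x: 640, y: 180 }`.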

### Screenshot Format
Screenshots are captured at the browser's current resolution and sent as base64-encoded PNG images.
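
As an illustration of the encoding step, the sketch below extracts the raw base64 payload from a PNG data URL. It assumes the screenshot arrives as a data URL, which is what Chrome's `chrome.tabs.captureVisibleTab()` returns; whether the extension uses that call, and whether the n1 API expects the raw payload or the full data URL, is not stated here. `pngDataUrlToBase64` is a hypothetical helper name.

```javascript
// Strip the "data:image/png;base64," prefix from a PNG data URL.
function pngDataUrlToBase64(dataUrl) {
  const prefix = "data:image/png;base64,";
  if (!dataUrl.startsWith(prefix)) {
    throw new Error("Expected a base64-encoded PNG data URL");
  }
  return dataUrl.slice(prefix.length);
}

// In an extension context (assumed usage, not taken from the source):
//   const dataUrl = await chrome.tabs.captureVisibleTab(null, { format: "png" });
//   const base64Png = pngDataUrlToBase64(dataUrl);
```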

### Conversation Context
The extension maintains conversation history for multi-turn interactions. Click "Clear" to start a new conversation.
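
Conceptually, this is an append-only message list that "Clear" resets. The following is a minimal sketch under that assumption; `createConversation` and the `{ role, content }` message shape are hypothetical and may not match how the extension or the n1 API represents turns.

```javascript
// Hypothetical in-memory conversation store.
function createConversation() {
  let messages = [];
  return {
    // Append one turn to the history.
    add(role, content) {
      messages.push({ role, content });
    },
    // Return a copy so callers cannot mutate the stored history.
    history() {
      return messages.slice();
    },
    // Equivalent of clicking "Clear": start a new conversation.
    clear() {
      messages = [];
    },
  };
}
```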

## Troubleshooting

### Extension can't capture screenshots
- Make sure you're on a regular web page (not chrome:// pages or extension pages)
- Reload the page and try again

### Actions not working correctly
- Some websites with strict Content Security Policies may block injected events
- Try using simpler commands or breaking down complex tasks

### API errors
- Check that your API key is correct
- Verify your internet connection
- Check the browser console for detailed error messages

## Development

### Project Structure
```
n1-browser-extension/
├── manifest.json       # Extension configuration
├── popup.html          # Popup UI
├── popup.css           # Popup styles
├── popup.js            # Popup logic
├── background.js       # Service worker (API calls, action execution)
├── icons/              # Extension icons
└── README.md           # This file
```

### Building
No build step is required; this is a vanilla JS extension.

### Testing
1. Make changes to the source files
2. Go to `chrome://extensions/`
3. Click the refresh icon on the extension card
4. Test your changes

## License

MIT License. See the LICENSE file for details.

## Links

- [n1 API Documentation](https://docs.yutori.com/reference/n1)
- [Yutori Website](https://yutori.com)
