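Project Overview

Browser Use: an AI browser assistant that lets AI browse the web like a human

Browser Use is a Python library that lets an AI operate a browser as naturally as a real user would. With a small amount of code and configuration, it can automate web tasks such as booking tickets, submitting job applications, and collecting data.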

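Main features:

– Lets AI browse and operate web pages the way a human does

– Supports multi-tab management

– Can extract page content and perform visual recognition

– Can record specific actions and replay them

– Supports custom actions (such as saving files or pushing to a database)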

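Real-world use cases:

– Automatically searching for and applying to job openings

– Automatically looking up flight information

– Searching for and saving model information on Hugging Face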

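Technical Highlights

– Supports mainstream LLMs (GPT-4, Claude, Llama, etc.)

– Can run multiple AI agents in parallel

– Provides self-correction when errors occur

– Lets developers add custom functionality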

Quick Start

With pip:

pip install browser-use

(Optional) Install Playwright:

playwright install

Start your agent:

from langchain_openai import ChatOpenAI
from browser_use import Agent
import asyncio

async def main():
    agent = Agent(
        task="Find a one-way flight from Bali to Oman on 12 January 2025 on Google Flights. Return me the cheapest option.",
        llm=ChatOpenAI(model="gpt-4o"),
    )
    result = await agent.run()
    print(result)

asyncio.run(main())

Don't forget to add your API keys to your .env file:

OPENAI_API_KEY=
ANTHROPIC_API_KEY=
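
Since a missing or empty key is an easy mistake to make, here is a minimal, standard-library-only sketch of how the contents of a .env file could be checked before starting an agent. The helper names `parse_env_text` and `missing_keys` are hypothetical and are not part of Browser Use. In practice, a package such as python-dotenv is commonly used to load the file.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str], required: list[str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [name for name in required if not values.get(name)]


# Illustrative file contents: one key filled in, one left empty.
sample = "OPENAI_API_KEY=sk-example\nANTHROPIC_API_KEY=\n"
parsed = parse_env_text(sample)
print(missing_keys(parsed, ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"]))
```

Only the key for the LLM provider you actually use needs a value; the check above simply reports which of the listed keys are still empty.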

Registering Custom Actions

If you want to add custom actions that your agent can perform, you can register them like this:

You can use either sync or async functions.

from browser_use.agent.service import Agent
from browser_use.browser.service import Browser
from browser_use.controller.service import Controller

# Initialize controller first
controller = Controller()

@controller.action('Ask user for information')
def ask_human(question: str, display_question: bool) -> str:
    return input(f'\n{question}\nInput: ')

Or define the parameters with Pydantic:

from typing import Optional

from pydantic import BaseModel

class JobDetails(BaseModel):
    title: str
    company: str
    job_link: str
    salary: Optional[str] = None

@controller.action('Save job details which you found on page', param_model=JobDetails, requires_browser=True)
async def save_job(params: JobDetails, browser: Browser):
    print(params)

    # use the browser normally
    page = browser.get_current_page()
    page.go_to(params.job_link)

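Then run your agent:

from langchain_anthropic import ChatAnthropic

# `task` is your task description string; run this inside an async function
model = ChatAnthropic(model_name='claude-3-5-sonnet-20240620', timeout=25, stop=None, temperature=0.3)
agent = Agent(task=task, llm=model, controller=controller)

await agent.run()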
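
To make the decorator pattern above more concrete, the following is a minimal sketch of how a registry that accepts both sync and async functions might work. `MiniController` and its `execute` method are hypothetical stand-ins written for illustration; this is not how Browser Use's `Controller` is implemented.

```python
import asyncio
import inspect
from typing import Any, Callable


class MiniController:
    """Hypothetical stand-in for illustration; not browser-use's Controller."""

    def __init__(self) -> None:
        # Maps function name -> (description, function).
        self.actions: dict[str, tuple[str, Callable[..., Any]]] = {}

    def action(self, description: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
            self.actions[func.__name__] = (description, func)
            return func  # the function itself is returned unchanged

        return decorator

    async def execute(self, name: str, **kwargs: Any) -> Any:
        _, func = self.actions[name]
        result = func(**kwargs)
        # Async functions return an awaitable; sync functions return the value.
        if inspect.isawaitable(result):
            result = await result
        return result


controller = MiniController()


@controller.action("Add two numbers")
def add(a: int, b: int) -> int:
    return a + b


@controller.action("Echo a message after yielding to the event loop")
async def echo(message: str) -> str:
    await asyncio.sleep(0)
    return message


async def main() -> list[Any]:
    return [
        await controller.execute("add", a=2, b=3),
        await controller.execute("echo", message="hi"),
    ]


results = asyncio.run(main())
print(results)
```

Checking the return value with `inspect.isawaitable` is one simple way to let callers register either kind of function without declaring which kind it is.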

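Parallel Agents

In 99% of cases, you should use a single browser instance and parallelize the agents, giving each agent its own context. You can also reuse a context after its agent has finished.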

browser = Browser()

for i in range(10):
    # This creates a new context and automatically closes it after the agent finishes (with `__aexit__`)
    async with browser.new_context() as context:
        agent = Agent(task=f"Task {i}", llm=model, browser_context=context)

        # ... reuse context
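
The loop above creates the agents one after another. As a rough sketch of what running them concurrently could look like, the example below shares one browser object, opens one context per agent, and uses `asyncio.gather` to run the agents at the same time. `StubBrowser` and `run_stub_agent` are hypothetical stand-ins so the example runs without a real browser; they are not Browser Use classes.

```python
import asyncio
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager


class StubBrowser:
    """Hypothetical stand-in for a single shared browser instance."""

    def __init__(self) -> None:
        self.created = 0
        self.open_contexts = 0
        self.peak_contexts = 0

    @asynccontextmanager
    async def new_context(self) -> AsyncIterator[int]:
        self.created += 1
        context_id = self.created
        self.open_contexts += 1
        self.peak_contexts = max(self.peak_contexts, self.open_contexts)
        try:
            yield context_id
        finally:
            # Mirrors the automatic close on exit from `async with`.
            self.open_contexts -= 1


async def run_stub_agent(browser: StubBrowser, task: str) -> str:
    async with browser.new_context() as context_id:
        await asyncio.sleep(0.01)  # stands in for the agent's real work
        return f"{task} done in context {context_id}"


async def main() -> tuple[StubBrowser, list[str]]:
    browser = StubBrowser()
    jobs = [run_stub_agent(browser, f"Task {i}") for i in range(3)]
    results = await asyncio.gather(*jobs)  # results keep the order of `jobs`
    return browser, results


browser, results = asyncio.run(main())
print(results)
print(browser.peak_contexts, browser.open_contexts)
```

With real agents, the body of `run_stub_agent` would create an `Agent` with `browser_context=context` and await its `run()` method, as in the snippet above.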

Context vs. Browser

If you don't specify browser or browser_context, the agent creates a new browser instance and context.

Getting the XPath History

To get the full history of every action the agent took, use the output of the run method:

history: list[AgentHistory] = await agent.run()
print(history)

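Browser Configuration

You can configure the browser using the BrowserConfig and BrowserContextConfig classes.

The most important options are:

headless: whether to run the browser in headless mode
keep_open: whether to keep the browser open after the script finishes
disable_security: whether to disable browser security features (useful when dealing with cross-origin requests such as iFrames)
cookies_file: path to a cookies file used for persistence
minimum_wait_page_load_time: minimum time to wait before capturing the page state for LLM input
wait_for_network_idle_page_load_time: time to wait for network requests to finish before capturing the page state
maximum_wait_page_load_time: maximum time to wait for the page to load before proceeding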
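
The three wait options interact, so a small model may help. The sketch below is one plausible reading of them, not the library's implementation: wait at least the minimum, keep waiting until no network activity has been seen for the idle period, and never wait past the maximum. `WaitSettings` and `ready_time` are hypothetical names, and treating the values as seconds is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WaitSettings:
    """Hypothetical container; field names mirror the options listed above."""

    minimum_wait_page_load_time: float
    wait_for_network_idle_page_load_time: float
    maximum_wait_page_load_time: float

    def __post_init__(self) -> None:
        if self.minimum_wait_page_load_time > self.maximum_wait_page_load_time:
            raise ValueError("minimum wait must not exceed maximum wait")


def ready_time(settings: WaitSettings, request_times: list[float]) -> float:
    """Return how long after navigation the page would count as settled.

    request_times holds the moments (seconds after navigation, by
    assumption) at which network activity was observed.
    """
    candidate = settings.minimum_wait_page_load_time
    idle = settings.wait_for_network_idle_page_load_time
    for t in sorted(request_times):
        if t > candidate:
            break  # the page already settled before this request started
        # Activity at time t means we cannot settle before t + idle.
        candidate = max(candidate, t + idle)
    return min(candidate, settings.maximum_wait_page_load_time)


settings = WaitSettings(0.5, 1.0, 5.0)
quiet_page = ready_time(settings, [])
busy_page = ready_time(settings, [0.25, 0.75, 1.5])
noisy_page = ready_time(settings, [0.5 * i for i in range(20)])
print(quiet_page, busy_page, noisy_page)
```

Under this reading, a page with no network activity is captured after the minimum wait, while a page that never goes quiet is captured once the maximum wait is reached.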

Project Link

https://github.com/browser-use/browser-use


(By GitHubStore)
