
Getting started - Library

Installation

Pip

PyPI version

pip install --upgrade pip
pip install playwright
playwright install

Conda

conda config --add channels conda-forge
conda config --add channels microsoft
conda install playwright
playwright install

These commands download the Playwright package and install browser binaries for Chromium, Firefox, and WebKit. To modify this behavior, see installation parameters.

Usage

Once installed, you can import Playwright in a Python script and launch any of the three browsers (chromium, firefox, and webkit).

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://playwright.nodejs.cn")
    print(page.title())
    browser.close()

Playwright supports two variations of the API: synchronous and asynchronous. If your project uses asyncio, use the async API:

import asyncio
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto("https://playwright.nodejs.cn")
        print(await page.title())
        await browser.close()

asyncio.run(main())

First script

In our first script, we will navigate to https://playwright.nodejs.cn/ and take a screenshot in WebKit.

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.webkit.launch()
    page = browser.new_page()
    page.goto("https://playwright.nodejs.cn/")
    page.screenshot(path="example.png")
    browser.close()

By default, Playwright runs browsers in headless mode. To see the browser UI, set the headless option to False. You can also use slow_mo to slow down execution. Learn more in the debugging tools section.

firefox.launch(headless=False, slow_mo=50)
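The fragment above can be placed in a complete script. The following is a minimal sketch, assuming a debug switch decides whether the browser is headed. The helper names launch_options and open_page are hypothetical and not part of Playwright; only headless and slow_mo come from the text above.

```python
def launch_options(debug: bool) -> dict:
    """Build keyword arguments for launch() (hypothetical helper, not a Playwright API)."""
    if debug:
        # Show the browser UI and slow each operation down by 50 ms.
        return {"headless": False, "slow_mo": 50}
    # Headless is already the default; stated explicitly for clarity.
    return {"headless": True}


def open_page(url: str, debug: bool = False) -> str:
    """Open url in Firefox and return the page title."""
    # Imported here so the helper above can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.firefox.launch(**launch_options(debug))
        page = browser.new_page()
        page.goto(url)
        title = page.title()
        browser.close()
        return title
```

Calling open_page("https://playwright.nodejs.cn/", debug=True) should open a visible Firefox window, provided the Firefox binary has been installed with playwright install.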

Interactive mode (REPL)

You can launch the interactive Python REPL:

python

Then launch Playwright within it for quick experimentation:

from playwright.sync_api import sync_playwright
playwright = sync_playwright().start()
# Use playwright.chromium, playwright.firefox or playwright.webkit
# Pass headless=False to launch() to see the browser UI
browser = playwright.chromium.launch()
page = browser.new_page()
page.goto("https://playwright.nodejs.cn/")
page.screenshot(path="example.png")
browser.close()
playwright.stop()

In an async REPL, such as the asyncio REPL:

python -m asyncio
from playwright.async_api import async_playwright
playwright = await async_playwright().start()
browser = await playwright.chromium.launch()
page = await browser.new_page()
await page.goto("https://playwright.nodejs.cn/")
await page.screenshot(path="example.png")
await browser.close()
await playwright.stop()

Pyinstaller

You can use Playwright with Pyinstaller to create standalone executables.

main.py
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://playwright.nodejs.cn/")
    page.screenshot(path="example.png")
    browser.close()

If you want to bundle browsers with the executables:

PLAYWRIGHT_BROWSERS_PATH=0 playwright install chromium
pyinstaller -F main.py
Note

Bundling the browsers with the executables produces larger binaries. It is recommended to bundle only the browsers you use.

Known issues

time.sleep() leads to outdated state

You most likely don't need to wait manually, since Playwright has auto-waiting. If you still rely on manual waits, use page.wait_for_timeout(5000) instead of time.sleep(5). It is better not to wait for a timeout at all, although it is sometimes useful for debugging; in those cases, use Playwright's wait_for_timeout method rather than the time module. Playwright relies internally on asynchronous operations, and they can't be processed correctly while time.sleep(5) is running.
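To make the recommendation concrete, here is a small sketch of a debugging pause that goes through Playwright instead of the time module. The names debug_pause and read_heading are hypothetical helpers introduced for illustration, and the inline HTML is an assumption that keeps the example off the network.

```python
def debug_pause(page, milliseconds: float) -> None:
    """Pause via Playwright's own wait instead of time.sleep() (hypothetical helper)."""
    page.wait_for_timeout(milliseconds)


def read_heading() -> str:
    """Load a tiny inline page, pause for debugging, and return the heading text."""
    # Imported here so debug_pause can be exercised without launching a browser.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.set_content("<h1>Hello</h1>")
        debug_pause(page, 1000)  # used in place of time.sleep(1)
        text = page.inner_text("h1")
        browser.close()
        return text
```

In most scripts the pause can simply be dropped, because calls such as page.inner_text wait for the element on their own.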

Incompatible with SelectorEventLoop of asyncio on Windows

Playwright runs the driver in a subprocess, so it requires the ProactorEventLoop of asyncio on Windows, because SelectorEventLoop does not support async subprocesses.

On Windows with Python 3.7, Playwright sets the default event loop to ProactorEventLoop, since that is the default on Python 3.8+.
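The subprocess requirement can be illustrated with plain asyncio. This sketch does not show how Playwright starts its driver; it only spawns a child Python process asynchronously, which is the kind of operation the event loop has to support. The function names run_child and loop_name are hypothetical.

```python
import asyncio
import sys


async def run_child() -> str:
    """Start a child Python process asynchronously and return what it prints."""
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", "print('ok')",
        stdout=asyncio.subprocess.PIPE,
    )
    stdout, _ = await proc.communicate()
    return stdout.decode().strip()


def loop_name() -> str:
    """Return the class name of the event loop asyncio creates by default."""
    loop = asyncio.new_event_loop()
    try:
        return type(loop).__name__
    finally:
        loop.close()
```

Running asyncio.run(run_child()) works with the default loop on Python 3.8+. On Windows, a SelectorEventLoop is expected to raise NotImplementedError at the create_subprocess_exec call, which is why the Proactor loop is needed there.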

Threading

Playwright's API is not thread-safe. If you use Playwright in a multi-threaded environment, create a Playwright instance per thread. See the threading issue for more details.
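One way to follow this advice is to have every worker thread open and close its own Playwright instance. The sketch below assumes that pattern. The names fetch_title and run_per_thread are hypothetical, and sharing results through a lock-protected dictionary is a design choice of this example rather than something Playwright prescribes.

```python
import threading
from typing import Callable, Dict, List


def fetch_title(url: str) -> str:
    """Open url with a Playwright instance owned by the calling thread."""
    # Imported and started inside the worker so nothing is shared across threads.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            page.goto(url)
            return page.title()
        finally:
            browser.close()


def run_per_thread(worker: Callable[[str], str], urls: List[str]) -> Dict[str, str]:
    """Run worker(url) in one thread per URL and collect the results."""
    results: Dict[str, str] = {}
    lock = threading.Lock()

    def target(url: str) -> None:
        value = worker(url)
        with lock:  # only plain Python values cross thread boundaries
            results[url] = value

    threads = [threading.Thread(target=target, args=(url,)) for url in urls]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

For example, run_per_thread(fetch_title, ["https://playwright.nodejs.cn/"]) would return a dictionary mapping each URL to its page title, with no browser or page object ever passed between threads.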