Getting started - Library
Installation
Pip
pip install --upgrade pip
pip install playwright
playwright install
Conda
conda config --add channels conda-forge
conda config --add channels microsoft
conda install playwright
playwright install
These commands download the Playwright package and install browser binaries for Chromium, Firefox and WebKit. To modify this behavior, see installation parameters.
Usage
Once installed, you can import Playwright in a Python script and launch any of the three browsers (chromium, firefox and webkit).
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://playwright.nodejs.cn")
    print(page.title())
    browser.close()
Playwright supports two variations of the API: synchronous and asynchronous. If your project uses asyncio, you should use the async API:
import asyncio
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto("https://playwright.nodejs.cn")
        print(await page.title())
        await browser.close()

asyncio.run(main())
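One reason to choose the async API is that several browsers can be driven concurrently from a single event loop. The following is a minimal sketch, not an official recipe: the helper names `title_in` and `titles_in_all_browsers` and the `BROWSER_NAMES` tuple are hypothetical, and it assumes all three browsers have been installed with `playwright install`.

```python
import asyncio

# Attribute names of the three browser types exposed by the Playwright object.
BROWSER_NAMES = ("chromium", "firefox", "webkit")


async def title_in(playwright, browser_name: str, url: str) -> str:
    """Launch one browser, open url, and return the page title (hypothetical helper)."""
    browser_type = getattr(playwright, browser_name)
    browser = await browser_type.launch()
    try:
        page = await browser.new_page()
        await page.goto(url)
        return await page.title()
    finally:
        # Close the browser even if navigation fails.
        await browser.close()


async def titles_in_all_browsers(url: str) -> dict:
    """Fetch the title of url in every browser concurrently (hypothetical helper)."""
    # Imported lazily so this module can be imported without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        titles = await asyncio.gather(
            *(title_in(p, name, url) for name in BROWSER_NAMES)
        )
    return dict(zip(BROWSER_NAMES, titles))


if __name__ == "__main__":
    print(asyncio.run(titles_in_all_browsers("https://playwright.nodejs.cn")))
```

`asyncio.gather` keeps the results in the order of `BROWSER_NAMES`, so zipping the names with the titles is safe.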
First script
In our first script, we will navigate to https://playwright.nodejs.cn/ and take a screenshot in WebKit.
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.webkit.launch()
    page = browser.new_page()
    page.goto("https://playwright.nodejs.cn/")
    page.screenshot(path="example.png")
    browser.close()
By default, Playwright runs the browsers in headless mode. To see the browser UI, set the headless option to False. You can also use slow_mo to slow down execution. Learn more in the debugging tools section.
firefox.launch(headless=False, slow_mo=50)
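As a fuller illustration of these two options, the sketch below switches between a headless run and a slowed-down, visible run with a single flag. The helpers `launch_options` and `take_screenshot` and the `debug` flag are hypothetical names introduced here; only `headless` and `slow_mo` come from the Playwright API described above.

```python
def launch_options(debug: bool, slow_mo_ms: float = 50) -> dict:
    """Build keyword arguments for launch() (hypothetical helper).

    In debug mode the browser window is shown and every operation is
    delayed by slow_mo_ms milliseconds; otherwise the browser runs headless.
    """
    if debug:
        return {"headless": False, "slow_mo": slow_mo_ms}
    return {"headless": True}


def take_screenshot(url: str, path: str, debug: bool = False) -> None:
    """Open url in Firefox and save a screenshot to path (hypothetical helper)."""
    # Imported lazily so the options helper can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.firefox.launch(**launch_options(debug))
        try:
            page = browser.new_page()
            page.goto(url)
            page.screenshot(path=path)
        finally:
            browser.close()


if __name__ == "__main__":
    take_screenshot("https://playwright.nodejs.cn/", "example.png", debug=True)
```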
Interactive mode (REPL)
You can launch the interactive Python REPL:
python
and then launch Playwright within it for quick experimentation:
from playwright.sync_api import sync_playwright
playwright = sync_playwright().start()
# Use playwright.chromium, playwright.firefox or playwright.webkit
# Pass headless=False to launch() to see the browser UI
browser = playwright.chromium.launch()
page = browser.new_page()
page.goto("https://playwright.nodejs.cn/")
page.screenshot(path="example.png")
browser.close()
playwright.stop()
You can also use an async REPL, such as the asyncio REPL:
python -m asyncio
from playwright.async_api import async_playwright
playwright = await async_playwright().start()
browser = await playwright.chromium.launch()
page = await browser.new_page()
await page.goto("https://playwright.nodejs.cn/")
await page.screenshot(path="example.png")
await browser.close()
await playwright.stop()
Pyinstaller
You can use Playwright with Pyinstaller to create standalone executables.
# main.py
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://playwright.nodejs.cn/")
    page.screenshot(path="example.png")
    browser.close()
If you want to bundle browsers with the executables:
Bash
PLAYWRIGHT_BROWSERS_PATH=0 playwright install chromium
pyinstaller -F main.py
PowerShell
$env:PLAYWRIGHT_BROWSERS_PATH="0"
playwright install chromium
pyinstaller -F main.py
Batch
set PLAYWRIGHT_BROWSERS_PATH=0
playwright install chromium
pyinstaller -F main.py
Bundling the browsers with the executables will generate bigger binaries. It is recommended to bundle only the browsers you use.
Known issues
time.sleep() leads to outdated state
You most likely don't need to wait manually, since Playwright has auto-waiting. Waiting for a fixed timeout is best avoided altogether, although it is sometimes useful for debugging. If you do need one, use page.wait_for_timeout(5000) instead of time.sleep(5), that is, Playwright's own wait method rather than the time module. Playwright relies internally on asynchronous operations, and these cannot be processed correctly while time.sleep(5) blocks the thread.
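The effect of a blocking sleep on asynchronous work can be illustrated with plain asyncio, without Playwright. This is only an analogy, not Playwright's internals: the `ticker` coroutine is a hypothetical stand-in for background operations that need the event loop, and `count_ticks` compares how many times it runs during a blocking sleep and during a cooperative one.

```python
import asyncio
import time


async def ticker(ticks: list) -> None:
    """Stand-in for background work that only progresses while the loop runs."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def count_ticks(blocking: bool, duration: float = 0.2) -> int:
    """Return how many times the ticker ran while we waited for duration seconds."""
    ticks: list = []
    task = asyncio.create_task(ticker(ticks))
    await asyncio.sleep(0)  # yield once so the ticker records its first tick
    before = len(ticks)
    if blocking:
        time.sleep(duration)  # blocks the thread, so the event loop is frozen
    else:
        await asyncio.sleep(duration)  # yields to the loop, so the ticker keeps running
    after = len(ticks)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return after - before


blocked = asyncio.run(count_ticks(blocking=True))
cooperative = asyncio.run(count_ticks(blocking=False))
print("ticks during time.sleep:", blocked)
print("ticks during asyncio.sleep:", cooperative)
```

The blocking run should report no ticks at all, while the cooperative run reports several. Pending work is starved in the same way when time.sleep() is called in the middle of a script.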
Incompatible with SelectorEventLoop of asyncio on Windows
Playwright runs the driver in a subprocess, so on Windows it requires the ProactorEventLoop of asyncio, because SelectorEventLoop does not support async subprocesses.
On Windows with Python 3.7, Playwright sets the default event loop to ProactorEventLoop, which is already the default on Python 3.8+.
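If another library or your own code has changed the event loop on Windows, a quick check before starting Playwright can save debugging time. The sketch below uses only the standard library. The helper name `loop_supports_subprocesses` is hypothetical, and it assumes that event loops on non-Windows platforms are not affected by this issue.

```python
import asyncio
import sys


def loop_supports_subprocesses(loop: asyncio.AbstractEventLoop) -> bool:
    """Return True if loop can run async subprocesses (hypothetical helper).

    Assumption: only Windows is affected, where ProactorEventLoop is required.
    """
    if sys.platform != "win32":
        return True
    # asyncio.ProactorEventLoop is only defined on Windows, so it is
    # referenced only after the platform check above.
    return isinstance(loop, asyncio.ProactorEventLoop)


loop = asyncio.new_event_loop()
try:
    print("loop type:", type(loop).__name__)
    print("supports subprocesses:", loop_supports_subprocesses(loop))
finally:
    loop.close()
```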
Threading
Playwright's API is not thread-safe. If you are using Playwright in a multi-threaded environment, you should create a Playwright instance per thread. See the threading issue for more details.
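One way to follow this advice is to start and stop Playwright entirely inside each worker, so that no Playwright object ever crosses a thread boundary. This is a minimal sketch under that assumption. The helpers `fetch_title` and `fetch_titles` are hypothetical, and the `worker` parameter exists only so that the threading logic can be exercised without a browser.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_title(url: str) -> str:
    """Return the title of url using a Playwright instance owned by this thread."""
    # Imported lazily so the pool helper can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    # sync_playwright() is entered and exited inside the worker, so the
    # instance, browser and page are created and used by one thread only.
    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            page.goto(url)
            return page.title()
        finally:
            browser.close()


def fetch_titles(urls, worker=fetch_title, max_workers: int = 2) -> list:
    """Run worker on every URL in a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, urls))


if __name__ == "__main__":
    print(fetch_titles(["https://playwright.nodejs.cn/"] * 2))
```

Starting a new instance for every call is simple but not cheap. A long-lived worker thread that keeps its own instance for its whole lifetime follows the same rule with less overhead.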