Downloads

Introduction

For every attachment downloaded by the page, a page.on("download") event is emitted. All of these attachments are downloaded into a temporary folder. You can obtain the download URL, file name, and payload stream using the Download object from the event.

You can specify where to persist downloaded files using the downloads_path option in browser_type.launch().
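As a minimal sketch of the option described above, the helper below creates a directory and passes it as downloads_path when launching Chromium. The helper name launch_with_downloads_dir and its parameters are hypothetical, and it assumes that `playwright` is the object yielded by sync_playwright().

```python
from pathlib import Path


def launch_with_downloads_dir(playwright, directory):
    """Launch Chromium so that accepted downloads land in `directory`.

    Assumption: `playwright` is the object yielded by
    `playwright.sync_api.sync_playwright()`. The helper name is hypothetical.
    """
    downloads_dir = Path(directory)
    # Create the directory up front so the path exists before launch.
    downloads_dir.mkdir(parents=True, exist_ok=True)
    return playwright.chromium.launch(downloads_path=downloads_dir)
```

It could be called as launch_with_downloads_dir(p, "downloads") inside a `with sync_playwright() as p:` block. Files in that directory are still deleted when their browser context closes, so use download.save_as() for any copy you want to keep.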

note

Downloaded files are deleted when the browser context that produced them is closed.

Here is the simplest way to handle a file download:

# Start waiting for the download
with page.expect_download() as download_info:
    # Perform the action that initiates download
    page.get_by_text("Download file").click()
download = download_info.value

# Wait for the download process to complete and save the downloaded file somewhere
download.save_as("/path/to/save/at/" + download.suggested_filename)
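Because suggested_filename is derived from what the server or page provides, it may be worth building the destination path defensively rather than by string concatenation. The following is a sketch under that assumption: the helper names safe_destination and save_download and the fallback name "download" are hypothetical, and `download` is assumed to be a Playwright Download object.

```python
from pathlib import Path


def safe_destination(directory, suggested_filename):
    """Build a destination path inside `directory` from a suggested name."""
    # Keep only the final path component so the name cannot point
    # outside `directory`.
    name = Path(suggested_filename).name
    if name in ("", ".", ".."):
        name = "download"  # hypothetical fallback name
    return Path(directory) / name


def save_download(download, directory):
    """Save a download under `directory` and return the destination path."""
    destination = safe_destination(directory, download.suggested_filename)
    destination.parent.mkdir(parents=True, exist_ok=True)
    # save_as() waits for the download to finish before copying the file.
    download.save_as(destination)
    return destination
```

With the snippet above, the last line could then become save_download(download, "/path/to/save/at").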

Variations

If you do not know what initiates the download, you can still handle the event:

page.on("download", lambda download: print(download.path()))

Note that handling the event forks the control flow and makes the script harder to follow. Your scenario might end while a file is still downloading, because the main control flow does not wait for this operation to complete.
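One way to reduce that risk, sketched here with hypothetical helper names, is to record each Download object in the handler and then wait on the recorded downloads explicitly before the script ends. It assumes `page` is a Playwright Page and relies on download.path() blocking until the download has finished.

```python
def collect_downloads(page):
    """Register a "download" handler that records every Download object."""
    downloads = []
    page.on("download", downloads.append)
    return downloads


def wait_for_downloads(downloads):
    """Block until every recorded download has finished; return their paths."""
    # download.path() waits for the download to finish and raises if the
    # download failed or was canceled.
    return [download.path() for download in downloads]
```

Call collect_downloads(page) before the actions that may trigger downloads, and wait_for_downloads(downloads) before closing the browser context. This only covers downloads whose event has already fired by the time you wait.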

note

For uploading files, see the uploading files section.