使用chrome headless模式对目标网页生成截图,添加水印

目标网页生成截图的方法,摆脱手动大量截图的烦恼

对目标网页生成截图可使用的方法有如下几种

手动操作
PhantomJS
cutycapt
Chromium 或 Firefox 的 headless 模式

本文通过node的Puppeteer操作chrome headless模式生成截图

一个用于网页截图及相关服务的Docker镜像，支持命令行和HTTP接口。使用node、puppeteer和chrome headless实现。

A Docker image for web page screenshot and related services, supporting both command line and HTTP interface. Implemented with node, puppeteer and chrome headless

English documentation

在docker中运行的方法

1. 获取镜像

有两种方法获取镜像：

从Docker Hub获取镜像
```
docker pull eyunzhu/webpage-screenshot
```

克隆仓库自己构建镜像

git clone https://github.com/eyunzhu/webpage-screenshot
cd webpage-screenshot
docker build -t eyunzhu/webpage-screenshot .

2. 运行容器

docker run -d -p 1234:8080 --cap-add SYS_ADMIN --name screenshot-service eyunzhu/webpage-screenshot

该镜像的工作目录为/app，本地可挂载目录/app/data，用作保存截图的目录，(即请求参数p可以设置为data)

3. 调用服务

提供两种调用服务的方式：

通过HTTP接口方式

使用以下URL获取指定网站的截图：

http://ip:{port}/?url=https://eyunzhu.com

请求参数

参数	描述	默认值
url	目标URL	https://eyunzhu.com
w	截图宽度	1470
h	截图高度	780
q	截图质量百分比	100
d	下载标志（如果非空，下载图像）
p	服务器保存路径（如果非空，在服务器上保存图像，确保路径存在）
n	图像名称	screenshot.jpeg
f	如果非空，捕获完整页面

通过命令行方式

# 进入上述启动的容器，然后执行以下示例命令生成截图
# 指定URL和宽度参数
node cmd.js --url https://eyunzhu.com --width 800

命令行参数

Options:
      --version   显示版本号                                           [布尔]
  -u, --url       要截图的URL                       [字符串] [默认值: "https://eyunzhu.com"]
  -w, --width     视口宽度                           [数字] [默认值: 1470]
  -h, --height    视口高度                           [数字] [默认值: 780]
  -q, --quality   截图质量                           [数字] [默认值: 100]
  -p, --path      保存截图的目录                   [字符串] [默认值: "data"]
  -n, --name      截图文件名称           [字符串] [默认值: "screenshot.jpeg"]
  -f, --fullPage  捕获完整页面                       [布尔] [默认值: false]
      --help      显示帮助

在本地电脑中使用的方法

本地使用前提需要安装Node、Puppeteer、Yargs、Chrome浏览器。

以下在Mac下演示。

# 前提安装好Node、Chrome浏览器
git clone https://github.com/eyunzhu/webpage-screenshot
cd webpage-screenshot

HTTP方式，运行 http.js

npm install puppeteer yargs # 安装Puppeteer、Yargs

// 运行脚本前先修改 http.js 脚本中的 HTTP 服务端口和 Google Chrome Stable 路径
const port = 8080;
const browser = await puppeteer.launch({
      // 此路径请根据系统中安装 Google Chrome Stable 的实际情况更改；Mac 下安装了 Chrome 浏览器时，可注释掉不写
      //executablePath:'/usr/bin/google-chrome-stable',
      defaultViewport: { width, height }
   });

node http.js # 启动服务

服务启动后即可通过 HTTP 进行调用。

CMD方式，运行 cmd.js

// 运行脚本前先修改 cmd.js 脚本中的 Google Chrome Stable 路径
const browser = await puppeteer.launch({
      // 此路径请根据系统中安装 Google Chrome Stable 的实际情况更改；Mac 下安装了 Chrome 浏览器时，可注释掉不写
      //executablePath:'/usr/bin/google-chrome-stable',
      defaultViewport: { width, height }
   });

# 开始使用
node cmd.js --help

附录

附上node使用Puppeteer操作浏览器截图的代码

前提是安装了node、Puppeteer库、chrome浏览器

const puppeteer = require('puppeteer');
async function autoScroll(page) {
  return page.evaluate(() => {
    return new Promise((resolve, reject) => {
      let totalHeight = 0;
      let distance = 100;
      let timer = setInterval(() => {
        let scrollHeight = document.body.scrollHeight;
        window.scrollBy(0, distance);
        totalHeight += distance;
        if (totalHeight >= scrollHeight) {
          clearInterval(timer);
          resolve();
        }
      }, 100);
    })
  });
}
async function main() {
  // const browser = await puppeteer.launch();
  const browser = await puppeteer.launch(
    {
      //关闭无头模式
      headless: false,
      //全屏打开浏览器
      args: ['--start-maximized'],
      //设置浏览器页面尺寸
      defaultViewport: { width: 1474, height: 864 }
    });
  const page = await browser.newPage();
  await page.goto('https://eyunzhu.com/', { waitUntil: 'networkidle2' });//networkidle2 表示 不但加载了所有 HTML、CSS、JavaScript 等静态资源，也完成了一些必要的动态数据请求
  // await page.waitForSelector('body');//等待元素出现
  // await page.waitForTimeout(5000);//等待5秒加载
  // // 对元素截图
  // let body = await page.$('.download_js')
  // await body.screenshot({
  //   path: 'download.png'
  // })
  // 滚动
  await autoScroll(page)
  await page.screenshot({
    path: 'apex.jpg',
    quality: 15, // 设置截图质量为 15%
    fullPage: true,//是否截全页，如果截取全页请先滚动页面
  });
  await browser.close();
}
main()

另

在mac中也可直接命令行操作chrome截图

/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --no-sandbox --disable-setuid-sandbox --headless --disable-gpu --window-size=1024x768 --screenshot --screenshot-quality=90 https://eyunzhu.com

chrome截图

使用chrome headless模式对目标网页生成截图,添加水印

在docker中运行的方法

1. 获取镜像

2. 运行容器

3. 调用服务

请求参数

在本地电脑中使用的方法

附录

另

发表评论登录

目前评论：0

随机文章

QQ:1036795373

公众号：忆云竹

在docker中运行的方法

1. 获取镜像

2. 运行容器

3. 调用服务

请求参数

在本地电脑中使用的方法

附录

另

发表评论 登录

目前评论：0

随机文章

QQ:1036795373

公众号：忆云竹

发表评论登录