Adding authentication to the Scrapyd server
We can also put a reverse proxy in front of scrapyd to handle user authentication. Here we use nginx as the example.
Install nginx
sudo apt-get install nginx
Configure nginx
vi /etc/nginx/nginx.conf
Modify it as follows:
# Scrapyd local proxy for basic authentication.
# Don't forget the iptables rule:
# iptables -A INPUT -p tcp --destination-port 6800 ! -s 127.0.0.1 -j DROP
http {
    server {
        listen 6801;
        location / {
            proxy_pass http://127.0.0.1:6800/;
            auth_basic "Restricted";
            auth_basic_user_file /etc/nginx/conf.d/.htpasswd;
        }
    }
}
The username enlong with password test is stored in the password file referenced by auth_basic_user_file above (/etc/nginx/conf.d/.htpasswd). Create that file and add the user credentials:
Creating user credentials for Nginx with htpasswd
python@ubuntu:/etc/nginx/conf.d$ sudo htpasswd -c .htpasswd enlong
New password:
Re-type new password:
Adding password for user enlong
python@ubuntu:/etc/nginx/conf.d$ cat .htpasswd
enlong:$apr1$2slPhvee$6cqtraHxoxclqf1DpqIPM.
python@ubuntu:/etc/nginx/conf.d$ sudo htpasswd -b .htpasswd admin admin
(Note: use -b without -c here; passing -c again would recreate the file and delete the enlong entry.)
Examples of the Apache htpasswd command
1. How do you add a user with htpasswd?
htpasswd -bc .passwd www.leapsoul.cn php
This generates a .passwd file containing the user www.leapsoul.cn with password php, hashed with MD5 by default.
2. How do you add another user to an existing password file?
htpasswd -b .passwd leapsoul phpdev
Drop the -c option to append a second user after the first, and so on.
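An htpasswd file is just one user:hash line per user. As a rough illustration in Python's standard library (which cannot produce the apr1/MD5 hashes shown in the transcript above), here is a sketch that builds an entry in the "{SHA}" format htpasswd supports via its -s option; the username and password are the tutorial's example values:

```python
import base64
import hashlib

def sha_entry(user: str, password: str) -> str:
    """Build an htpasswd line in the "{SHA}" format (htpasswd -s)."""
    digest = hashlib.sha1(password.encode()).digest()
    return f"{user}:{{SHA}}{base64.b64encode(digest).decode()}"

# Same credentials as the tutorial's example user.
print(sha_entry("enlong", "test"))
# → enlong:{SHA}qUqP5cyxm6YcTAhz05Hph5gvu9M=
```

A line like this can be appended to the password file directly; nginx's auth_basic module understands the {SHA} format.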
Restart nginx
sudo service nginx restart
Test Nginx
F:\_____gitProject_______\curl-7.33.0-win64-ssl-sspi\tieba_baidu>curl http://localhost:6800/schedule.json -d project=tutorial -d spider=tencent -u enlong:test
{"status": "ok", "jobid": "5ee61b08428611e6af1a000c2969bafd", "node_name": "ubuntu"}
Configure the scrapy.cfg file
[deploy]
url = http://192.168.17.129:6801/
project = tutorial
username = admin
password = admin
Note that the url above now points at the port nginx listens on (6801). Reminder: set the bind_address field in the server's scrapyd configuration to 127.0.0.1, so that nginx cannot be bypassed by connecting to port 6800 directly from outside. See the configuration-file section later in this article.
Edit the configuration file
sudo vi /etc/scrapyd/scrapyd.conf
[scrapyd]
bind_address = 127.0.0.1
When scrapyd starts, it automatically searches for configuration files, loading them in this order:
/etc/scrapyd/scrapyd.conf
/etc/scrapyd/conf.d/*
scrapyd.conf
~/.scrapyd.conf
Settings loaded later override those loaded earlier.
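This override behaviour can be illustrated with Python's configparser (the module scrapyd's configuration handling is built on); reading a later source only replaces the keys it defines, while the rest keep their earlier values. The file contents below are made up for illustration:

```python
from configparser import ConfigParser

cp = ConfigParser()
# An earlier file in the search order sets two options ...
cp.read_string("[scrapyd]\nbind_address = 0.0.0.0\nhttp_port = 6800\n")
# ... and a later one (e.g. ~/.scrapyd.conf) overrides only bind_address;
# http_port keeps its earlier value.
cp.read_string("[scrapyd]\nbind_address = 127.0.0.1\n")
print(cp["scrapyd"]["bind_address"], cp["scrapyd"]["http_port"])
# → 127.0.0.1 6800
```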
The default configuration file is shown below; adjust it as needed.
[scrapyd]
eggs_dir = eggs
logs_dir = logs
items_dir = items
jobs_to_keep = 5
dbs_dir = dbs
max_proc = 0
max_proc_per_cpu = 4
finished_to_keep = 100
poll_interval = 5
bind_address = 0.0.0.0
http_port = 6800
debug = off
runner = scrapyd.runner
application = scrapyd.app.application
launcher = scrapyd.launcher.Launcher
[services]
schedule.json = scrapyd.webservice.Schedule
cancel.json = scrapyd.webservice.Cancel
addversion.json = scrapyd.webservice.AddVersion
listprojects.json = scrapyd.webservice.ListProjects
listversions.json = scrapyd.webservice.ListVersions
listspiders.json = scrapyd.webservice.ListSpiders
delproject.json = scrapyd.webservice.DeleteProject
delversion.json = scrapyd.webservice.DeleteVersion
listjobs.json = scrapyd.webservice.ListJobs
For the specific meaning of each configuration parameter, see the official documentation.