<> Preface

There is no shortcut to learning distributed crawlers, or any technology for that matter. There are two ways forward: first, practice repeatedly, since practice makes perfect; second, read code that others have shared and rewrite it again and again until you can produce it yourself.

If you have managed to get a simple distributed crawler running, you may have noticed that going distributed is mostly a shift in mindset: the way the code is written barely changes. Keep in mind, though, that we built the distributed crawler directly with scrapy-redis, so we are standing on the shoulders of our predecessors. Still, as a first step toward understanding how a distributed crawler is structured, we have reached a milestone. Next, we need to make that milestone more solid so it is easy to build on.

Over the next 3 case studies, I will write distributed crawlers again and again until it becomes second nature.

Today's practice target is the website "Everyone Is a Product Manager" (woshipm). Apologies to the site in advance: this is for learning purposes only, and any crawled data will be deleted promptly.

<> Building a distributed crawler

<> Create a Scrapy crawler project

scrapy.exe startproject woshipm

The command above creates the crawler project. Note that if scrapy.exe is not on your PATH environment variable, you need to navigate to the directory that contains scrapy.exe and run the command from there.
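If you are not sure whether scrapy.exe is on your PATH, a small Python check can tell you. The sketch below is only an illustration: the helper name `scrapy_command` is hypothetical, and the fallback assumes Scrapy is installed in the interpreter that runs the script.

```python
import shutil
import sys


def scrapy_command(*args):
    """Return an argv list that runs Scrapy with the given arguments.

    Hypothetical helper for illustration; not part of Scrapy itself.
    """
    # shutil.which searches PATH (and honors PATHEXT on Windows,
    # so "scrapy" also matches scrapy.exe).
    exe = shutil.which("scrapy")
    if exe is not None:
        return [exe, *args]
    # Fallback: run Scrapy as a module of the current interpreter.
    # Assumes Scrapy is installed for this interpreter.
    return [sys.executable, "-m", "scrapy", *args]


cmd = scrapy_command("startproject", "woshipm")
print(cmd)
# To actually create the project: subprocess.run(cmd, check=True)
```

The printed list shows which executable would be used, without creating anything on disk.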

<> Create a CrawlSpider crawler file

D:\python100\venv\Scripts\scrapy.exe genspider -t crawl pm xxx.com

When you create the crawler file with the command above, make sure it ends up inside the project's spiders folder. If it does not, copy the generated pm.py file and paste it into the spiders folder.
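The manual copy step described above can also be scripted. This is a minimal sketch under stated assumptions: the helper `ensure_in_spiders` is hypothetical, the package is assumed to be named woshipm as in the startproject command, and the file is moved rather than copied so that no stray duplicate is left behind.

```python
import shutil
from pathlib import Path


def ensure_in_spiders(project_root, spider_file="pm.py", package="woshipm"):
    """Make sure spider_file sits in <package>/spiders; return its final path.

    Hypothetical helper for illustration only.
    """
    root = Path(project_root)
    target = root / package / "spiders" / spider_file
    if target.exists():
        return target  # already in the right place
    # Otherwise look for the file anywhere else under the project root.
    for candidate in root.rglob(spider_file):
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(candidate), str(target))
        return target
    raise FileNotFoundError(f"{spider_file} not found under {root}")
```

Calling `ensure_in_spiders("woshipm")` from the directory where the project was created would return the path of pm.py inside the spiders folder, moving the file there first if necessary.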

Once these two steps are complete, the project has the standard Scrapy directory structure, with pm.py inside the spiders folder.
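To confirm the structure without inspecting it by hand, the expected files can be checked with a short script. The list below assumes the default template that `scrapy startproject` generates in recent Scrapy versions, plus the pm.py spider created earlier; the function name `missing_files` is hypothetical.

```python
from pathlib import Path

# Expected layout, assuming Scrapy's default project template
# and the project/spider names used in this article.
EXPECTED = [
    "scrapy.cfg",
    "woshipm/__init__.py",
    "woshipm/items.py",
    "woshipm/middlewares.py",
    "woshipm/pipelines.py",
    "woshipm/settings.py",
    "woshipm/spiders/__init__.py",
    "woshipm/spiders/pm.py",
]


def missing_files(project_root):
    """Return the expected relative paths that are absent under project_root."""
    root = Path(project_root)
    return [rel for rel in EXPECTED if not (root / rel).is_file()]
```

An empty list from `missing_files("woshipm")` means every expected file is in place; otherwise the returned paths show what is missing, for example a pm.py that was generated outside the spiders folder.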
