The peach-colored interval shows the relationship between site response speed and crawl volume. When responses are slow, the crawler gradually reduces how much it fetches; when responses are fast, crawl volume rises. The gap between a peak or trough in response speed and the matching peak or trough in crawl volume is about 20 days, roughly 3 weeks. Crawl volume is also related to host load: between February and April we added new servers, put the site behind a CDN, and took servers down. To keep the chart from getting even more complicated, I have left that data out.
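The post does not say how the 20-day gap was measured; it was most likely read off the chart. As a rough sketch of how such a lag could be estimated from daily logs, the snippet below shifts one series against the other and keeps the shift with the strongest correlation. The function names, the synthetic numbers, and the 30-day search window are all assumptions made for illustration, not the author's method or data.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def best_lag(leader, follower, max_lag):
    """Find the lag in days where leader[t] correlates most strongly with follower[t + lag].

    Returns (lag, r). The sign of r is kept: slow responses followed by
    less crawling would show up as a negative r.
    """
    best_lag_days, best_r = 0, 0.0
    for lag in range(max_lag + 1):
        xs = leader[: len(leader) - lag]   # drop the last `lag` days of the leader
        ys = follower[lag:]                # drop the first `lag` days of the follower
        n = min(len(xs), len(ys))
        if n < 3:                          # too few overlapping days to correlate
            break
        r = pearson(xs[:n], ys[:n])
        if abs(r) > abs(best_r):
            best_lag_days, best_r = lag, r
    return best_lag_days, best_r


def synthetic_demo():
    """Hypothetical data: crawl volume mirrors response time from 20 days earlier."""
    rng = random.Random(0)
    response_ms = [rng.uniform(200, 800) for _ in range(120)]
    crawl = [rng.uniform(6000, 9000) for _ in range(20)]      # filler for the first 20 days
    crawl += [10000 - 5 * ms for ms in response_ms[:100]]     # slower response, fewer pages
    return best_lag(response_ms, crawl, max_lag=30)
```

On the synthetic series the search recovers the built-in 20-day shift with a negative correlation. Real logs are noisier, so the result should be treated as a rough indication rather than a precise measurement.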
The black interval shows the relationship between crawl volume and the number of indexed pages. When more pages are crawled, the indexed count rises; when fewer are crawled, it falls accordingly. The gap between the crawl peak and the index peak is about 15 days, roughly 2 weeks. How should the drop in indexed pages be understood? On a large site there is no way to guarantee that every page is unique or high-value, so after crawling, the search engine gradually eliminates useless pages. Eliminating pages is a more complex operation than storing them, so it always takes some time. Every day new pages are indexed and old ones are removed. When fewer newly crawled pages are indexed, before
This is a bit of a mess; I had only planned to write one blog post, so I put everything together. Let me explain what the chart contains: the response time for Chinese web crawlers, Baidu crawl volume, Baidu indexed page count, and the trend of traffic from Baidu. All the data is reliable: the site has more than 10 million pages and the crawler runs 24 hours a day without stopping, so there is no shortage of data.
Readers of this blog probably know the article on how page loading speed affects SEO. As the saying goes, hearing is unreliable; seeing is believing. But the data I worked with at the time was Alibaba's, which is hard to share in a post; big company, you know. So a few months ago I started processing my own data, and with 3 months collected, some patterns are already visible. The conclusions are no different from what the teachers say, but I worked them out myself and, ultimately, combined them with traffic data. I believe that if you do SEO at a medium or large website, the boss does not care how you do it as long as the traffic arrives in the end; that beats everything. Whether you use the data in this article or your own, it is very persuasive when asking the company for resources and budget. It also gives you a workable approach for your own SEO. Of course, optimizing site speed needs technical support, so I still hold the view that SEO is 100% a technical skill. First, look at the chart below.