Concurrency Limit on HttpWebRequest

I am writing an application to measure how fast I can download web pages using C#. I supply a list of unique domain names, then I spawn X number of threads and perform HTTPWebRequests until the list of domains has been consumed. The problem is that no matter how many threads I use, I only get about 3 pages per second.

I discovered that the System.Net.ServicePointManager.DefaultConnectionLimit is 2, but I was under the impression that this is related to the number of connections per domain. Since each domain in the list is unique, this should not be an issue.

Then I found that the GetResponse() method blocks access from all other processes until the WebResponse is closed: http://www.codeproject.com/KB/IP/Crawler.aspx#WebRequest, I have not found any other information on the web to back this claim up, however I implemented a HTTP request using sockets, and I noticed a significant speed up (4x to 6x).

So my questions: does anyone know exactly how the HttpWebRequest objects work?, is there a workaround besides what was mentioned above?, or are there any examples of high speed web crawlers written in C# anywhere?

9
задан Kam Sheffield 7 December 2010 в 22:38
поделиться