
I want to hit a URL N times in Python. Currently I have been doing this using webbrowser.open(), but it is very slow and consumes a lot of memory. Is there a more efficient method?

  • @EnnoShioji: it is not a duplicate. Making multiple requests to the same URL in an efficient manner is a different problem. You want an ab-like tool rather than mere curl. Commented Mar 1, 2014 at 11:22
  • related: Problem with multi threaded Python app and socket connections Commented Mar 1, 2014 at 11:36
  • import requests; requests.get(url="some_url") --- run this in a loop (see the sketch below). Commented Jan 5, 2020 at 14:15
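
A minimal sketch of that comment's suggestion, assuming the third-party requests package (pip install requests) is installed; the URL and count are placeholders, and a Session is used so the underlying TCP connection is reused across requests:

import requests

url = "http://www.example.com"  # placeholder
with requests.Session() as session:  # reuses one connection across requests
    for _ in range(10):
        session.get(url)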

4 Answers


Take a look at urllib2.urlopen:

import urllib2

for _ in range(10):
    urllib2.urlopen("http://www.stackoverflow.com")
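
On Python 3, urllib2 was split into urllib.request and urllib.error, so the equivalent (same URL, same loop) is:

import urllib.request

for _ in range(10):
    urllib.request.urlopen("http://www.stackoverflow.com")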


F.X.'s answer is almost certainly what you want.

But you asked about efficiency, and if you really want to be as efficient as possible, you can do better. The sooner you close the socket, the less CPU, memory, and bandwidth you waste, both on your machine and on the web server.

Also, making multiple requests in parallel won't save any resources on your machine (it'll actually waste some) or on the server, but it will probably finish faster. Is that what you're after?

Of course that raises the question of what exactly you mean by "hit a URL". Is it acceptable to just send the request and immediately shut down? Or do you need to wait for at least the response line? For that matter, is it acceptable to make a HEAD request instead of a GET? Do you need realistic/useful headers?
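
For instance, if a HEAD request is acceptable, you can get just the status line and headers without any body; a sketch using Python 3's http.client (httplib on Python 2), with a placeholder host and path:

import http.client

conn = http.client.HTTPConnection('www.example.com', 80)
conn.request('HEAD', '/path/to/resource.html')
resp = conn.getresponse()  # reads the status line and headers; HEAD has no body
conn.close()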

Anyway, in order to do this, you'd want to drop down to a lower level. Most higher-level libraries don't give you any way to, e.g., close the socket before reading anything. But it's not that hard to craft HTTP requests.*

For example:

from contextlib import closing
from socket import create_connection
from concurrent.futures import ThreadPoolExecutor, wait

host, port = 'www.example.com', 80
path = '/path/to/resource.html'

def spam_it():
    # Open a connection, send a minimal request, and close without reading the response
    with closing(create_connection((host, port))) as sock:
        sock.sendall('GET {} HTTP/1.0\r\n\r\n'.format(path).encode('ascii'))

with ThreadPoolExecutor(max_workers=16) as executor:
    wait([executor.submit(spam_it) for _ in range(10000)])

* Well, manually crafting HTTP requests is actually quite involved… If you only need to craft a static, trivial one, do it yourself, but in general, you definitely want to use urllib, requests, or some other library.

1 Comment

+1. Note that on Python 3 the data must be sent as bytes, and on Python 2 concurrent.futures is a third-party backport (the futures package).

Use urllib2? As a standard rule of thumb, always look in the standard library first; there are tons of useful packages there.

1 Comment

You said urllib, but linked to urllib2. Otherwise, good answer.

import urllib2

url = "http://www.google.com"
n = 8

for i in range(n):
    urllib2.urlopen(url).read()  # fetch the page n times, discarding the body

You may wish to look into the requests module if you're eventually going to want to do anything less trivial with HTTP requests.
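
For example, here is a sketch of a slightly less trivial request with requests (third-party, pip install requests); the header and timeout values are illustrative:

import requests

resp = requests.get(
    "http://www.example.com",                  # placeholder URL
    headers={"User-Agent": "my-script/1.0"},   # illustrative custom header
    timeout=5,                                 # seconds; fail fast instead of hanging
)
print(resp.status_code, len(resp.content))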

