How can I get the input fields from HTML forms on other sites? I want it to return a list of dictionaries such as:
form = [{'name': 'somename', 'type': 'text', 'value': ''}, {'name': 'somename', 'type': 'submit', 'value': 'submit'}]
Sorry for my English.
You probably won't be able to retrieve form data from other users on other sites. If you wish to use a script to send data to a form, mechanize is one tool that makes this quite easy.
<form> tag. This should be all that you need to get started. If the forms are non-deterministic, no script will be able to assist you. If you mean the forms are generated by client-side JavaScript, then browser automation may help.
Yeah, mechanize is sweet!
import mechanize
# Browser
br = mechanize.Browser()
br.set_handle_equiv(True)
br.set_handle_gzip(True)
br.set_handle_redirect(True)
br.set_handle_referer(True)
br.set_handle_robots(False)
br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1)
br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]
# Inspect every form element on http://stackoverflow.com
br.open('http://stackoverflow.com')
for form in br.forms():
    print(form)
Look at mechanize, lxml.html and BeautifulSoup.
BeautifulSoup is also much slower than lxml.html.
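For simple static pages, the list of dictionaries asked for in the question can also be built with nothing but the standard library. This is only a minimal sketch, assuming Python 3 and html.parser; it is a swap-in for the libraries named above, not their API. The names FormInputCollector, extract_forms and the SAMPLE markup are made up for illustration, and only <input> elements are collected (no <select> or <textarea>).

```python
from html.parser import HTMLParser


class FormInputCollector(HTMLParser):
    """Hypothetical helper: collect the <input> fields of each <form>."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one list of field dicts per <form>
        self._current = None  # fields of the form being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'form':
            self._current = []
            self.forms.append(self._current)
        elif tag == 'input' and self._current is not None:
            # Inputs outside any <form> are ignored in this sketch.
            self._current.append({
                'name': attrs.get('name'),
                'type': attrs.get('type', 'text'),  # HTML default type
                'value': attrs.get('value', ''),
            })

    def handle_endtag(self, tag):
        if tag == 'form':
            self._current = None


def extract_forms(html):
    """Return a list of forms, each a list of field dictionaries."""
    parser = FormInputCollector()
    parser.feed(html)
    parser.close()
    return parser.forms


# Made-up sample markup, standing in for a downloaded page.
SAMPLE = """
<form action="/search">
  <input name="q" type="text" value="">
  <input name="go" type="submit" value="submit">
</form>
"""

forms = extract_forms(SAMPLE)
print(forms)
```

For a live page, the HTML string could come from something like urllib.request.urlopen(url).read().decode(). Forms that are built by client-side JavaScript will not show up this way, since the parser only sees the markup the server sent.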
urllib.urlopen-ing a url), or is this some Django based thing?