
I want to parse data from a website after submitting a form, and I'm using the requests library to do that. This is the website. There is a form on that site. After the form is submitted, the page reloads and generates a new table containing the information I want.

This is the form data sent when I submit the form manually:

activeFormName:report_builder_form
repProviance:66
repStation:40754
parameters:1
start_year:1951
end_year:1963
SearchBtn:جستجو
SearchBtn:جستجو
__sisReportRowCount:10
__sisReportParamType:simple

I send a POST request using a dictionary of data:

import requests
from bs4 import BeautifulSoup

form_data = {
    'activeFormName': 'report_builder_form',
    'repProviance': 66,
    'repStation': 40754,
    'parameters': 1,
    'start_year': 1951,
    'end_year': 1963,
    'SearchBtn': '%D8%AC%D8%B3%D8%AA%D8%AC%D9%88',
    # 'SearchBtn': 'جستجو',  ### This line and the one above are the same.
    '__sisReportParamType': 'simple',
    '__sisReportRowCount': 10,
}

response = requests.post(url, data=form_data)
s = BeautifulSoup(response.content, 'lxml')

but it always gives me an HTML file that contains no information.

  • I send data to my web page with requests and have not had problems. Have you checked the status_code that is returned? Could you provide the URL? Commented Jul 12, 2017 at 5:01
  • The status code is 200, and the URL is: irimo.ir/far/wd/… Commented Jul 12, 2017 at 5:02
  • Code 200, that is rare, since a POST creates data and the code it should return is 201. Commented Jul 12, 2017 at 5:05
  • I copied your code and ran it with the URL you provided, and the response is NOT empty. Perhaps what you are looking for on the page is generated by JavaScript. In that case it isn't in the response HTML, and you should use another package such as Selenium instead of requests to handle JavaScript. Commented Jul 12, 2017 at 5:06
  • Yes, it is not empty, but it lacks the information I need. I'll start reading the Selenium documentation. Commented Jul 12, 2017 at 5:09

1 Answer

import time
import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}

url = '.......'
ses = requests.Session()
ses.headers.update(headers)  # send the browser-like User-Agent on every request
response = ses.get(url)  # load the form page first so the session stores any cookies
time.sleep(5)
pay_load = {
    'activeFormName': 'report_builder_form',
    'repProviance': 66,
    'repStation': 40754,
    'parameters': 1,
    'start_year': 1951,
    'end_year': 1963,
    # requests form-encodes the values itself, so pass the raw text;
    # the percent-encoded form '%D8%AC%D8%B3%D8%AA%D8%AC%D9%88' would be encoded twice
    'SearchBtn': 'جستجو',
    '__sisReportParamType': 'simple',
    '__sisReportRowCount': 10,
}

# post the form through the same session that loaded the page
s = ses.post(response.url, data=pay_load)

soup = BeautifulSoup(s.content, 'html.parser')
print(soup.prettify())

Try posting the data like this, through a session, so that any cookies set by the initial GET are sent along with the POST.
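One more thing worth checking is the SearchBtn value. requests form-encodes a data dictionary itself (as far as I know it relies on the standard library's urlencode for this), so a value that is already percent-encoded gets encoded a second time. The sketch below uses only urlencode to show the difference; the variable names are just for illustration, and whether this is what the site actually rejects is an assumption I have not verified.

```python
from urllib.parse import urlencode

# Raw button text, as the browser form holds it before encoding.
raw = {"SearchBtn": "جستجو"}
# The same text already percent-encoded by hand.
pre_encoded = {"SearchBtn": "%D8%AC%D8%B3%D8%AA%D8%AC%D9%88"}

raw_body = urlencode(raw)
double_body = urlencode(pre_encoded)

print(raw_body)     # the raw text is encoded exactly once
print(double_body)  # every '%' is escaped again as '%25'
```

If the two printed bodies differ, the raw string is the one that matches what the browser sends.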


6 Comments

No :) I just didn't mention that here. I did print it, so I know it is a blank HTML file! By blank HTML I mean some HTML tags without any data.
Try the edited answer @Mehdi, I hope this is what you are looking for.
Yes, but as I said in the question, I think I am providing the proper payload, and if something is missing, I can't find it. When I look at the HTTP request, I have set everything it contains.
Try the edited answer without the payload: add the headers and make a GET request, as I have done in the answer. Copy the whole program and run it, and you will get what you are looking for @Mehdi.
I can get the page without even using the headers you provided; it gives me the website with some data, including a form. When I submit that form, the page reloads. Everything is the same except for a new table that contains some data, and that is the data I want.
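One way to look for fields missing from the payload, as discussed in the comments above, is to read every named input out of the page returned by the initial GET and compare those names with the payload keys. The sketch below uses only the standard library; FormFieldCollector, missing_fields, the sample markup, and the csrf_token field are all hypothetical and not taken from the real site. It only looks at input tags, so select elements would need the same treatment.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect the name/value pairs of every named <input> tag on a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        name = attr.get("name")
        if name:
            # Inputs without a value attribute are stored as empty strings.
            self.fields[name] = attr.get("value") or ""


def missing_fields(page_html, payload):
    """Return the input names present in the page but absent from the payload."""
    collector = FormFieldCollector()
    collector.feed(page_html)
    return sorted(set(collector.fields) - set(payload))


# Hypothetical markup for illustration only; the real form may differ.
sample_html = """
<form name="report_builder_form">
  <input type="hidden" name="activeFormName" value="report_builder_form">
  <input type="hidden" name="csrf_token" value="abc123">
  <input type="text" name="start_year" value="">
</form>
"""

payload = {"activeFormName": "report_builder_form", "start_year": 1951}
print(missing_fields(sample_html, payload))
```

With a real page, the HTML text of the GET response would be passed in place of sample_html, and any name the function reports is a candidate to add to the payload.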
