I want to download files from a website while running in headless mode. My program downloads them correctly when NOT in headless mode, but once I add the option that keeps the MS Edge window from opening, the downloads no longer happen.

import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select

driver = webdriver.Edge()
driver.get("URL")

id_box = driver.find_element(By.ID,"...")
pw_box = driver.find_element(By.ID,"...")
id_box.send_keys("...")
pw_box.send_keys("...")
log_in = driver.find_element(By.ID,"...")
log_in.click()

time.sleep(0.1) # If not included, get error: "Unable to locate element"

drop_period = Select(driver.find_element(By.ID,"..."))
drop_period.select_by_index(1)
drop_consul = Select(driver.find_element(By.ID,"..."))
drop_consul.select_by_visible_text("...")
drop_client = Select(driver.find_element(By.ID,"..."))
drop_client.select_by_index(1)

# The following files do not download with headless included:

driver.find_element(By.XPATH, "...").click()
driver.find_element(By.XPATH, "...").click()


4 Answers

In that case, you might try downloading the file using its direct link and Python's requests library.

You'll need to get the URL by reading the href attribute of the download element.

Downloading and saving a file from a URL should then work as follows:

import requests as req

remote_url = 'http://www.example.com/file.txt'
local_file_name = 'my_file.txt'

data = req.get(remote_url)

# Save file data to local copy
with open(local_file_name, 'wb') as file:
    file.write(data.content)
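
Because the file here sits behind a login, a plain requests call may be rejected unless it carries the cookies of the logged-in browser session. Below is a minimal sketch of that idea, assuming the site uses cookie-based sessions and that the download link exposes a real href. The helper names cookies_for_requests and download_with_browser_session are hypothetical; driver.get_cookies() is the Selenium call that returns a list of cookie dicts.

```python
def cookies_for_requests(selenium_cookies):
    """Turn Selenium's list of cookie dicts into a name -> value mapping."""
    return {cookie["name"]: cookie["value"] for cookie in selenium_cookies}


def download_with_browser_session(driver, link_element, local_file_name):
    """Fetch the file behind link_element, reusing the logged-in browser session.

    Assumes link_element has an href attribute pointing at the file.
    """
    import requests  # third-party; imported here so the helper above has no dependencies

    remote_url = link_element.get_attribute("href")
    cookies = cookies_for_requests(driver.get_cookies())
    response = requests.get(remote_url, cookies=cookies, timeout=30)
    response.raise_for_status()  # fail on 4xx/5xx instead of saving an error page
    with open(local_file_name, "wb") as file:
        file.write(response.content)
```

This only helps when the link has a usable href. If the download is triggered purely by JavaScript, there may be no URL to read from the element.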

6 Comments

I can't find a URL for the file download, only for the website. The file download only appears after I have selected values in the drop-down lists and the page has automatically updated.
Can you add the page_source or the URL of the website, and the desired button element?
However, the information I download from the webpage is confidential, so I cannot provide access to that page.
Hmm, I think there should be a way to identify the direct link. You might try using the Network tab of the browser developer tools to get the URL of the file request.
There are different headless modes for Chrome. If you want to download files, use one of the special ones.

For Chrome 109 and above, use:

options.add_argument("--headless=new")

For Chrome 108 and below, use:

options.add_argument("--headless=chrome")

Reference: https://github.com/chromium/chromium/commit/e9c516118e2e1923757ecb13e6d9fff36775d1f4
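
The version rule above can be wrapped in a small helper. This is only a sketch: headless_flag is a hypothetical name, and the browser's major version is assumed to be known already (for example, read from the browser's About page).

```python
def headless_flag(major_version):
    """Return the headless argument suggested above for a Chrome major version."""
    if major_version >= 109:
        return "--headless=new"
    return "--headless=chrome"


print(headless_flag(110))  # --headless=new
print(headless_flag(108))  # --headless=chrome
```

The result can then be passed straight to options.add_argument(...).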

Downloading files in headless mode works for me on MicrosoftEdge version 110.0.1587.41 using following options:

    MicrosoftEdge: [{
        "browserName": "MicrosoftEdge",
        "ms:edgeOptions": {
            "args": ["--headless=new"],
            "prefs": {
                "download.prompt_for_download": false,
                "plugins.always_open_pdf_externally": true,
                "download.default_directory": "dlFolder"
            }
        }
    }]

Nothing worked until I added the option '--headless=new'.

N.B.: Tested on macOS using WebdriverIO.

The options.add_argument("headless=new") syntax also works for Edge.

I previously used the following syntax to open Edge in headless mode:

from selenium import webdriver
from selenium.webdriver.edge.options import Options

options = Options()
options.add_experimental_option("prefs", {"download.default_directory": my_download_folder, "download.prompt_for_download": False, 'profile.default_content_settings.popups': False})     
options.add_experimental_option("excludeSwitches", ["enable-logging"])
options.add_argument('log-level=3') 
options.headless = True
browser = webdriver.Edge(options=options)
browser.get(url)

The above still works fine (it opens the browser in headless mode, clicks links, etc.), but it doesn't allow file downloads: you can click on a download link, but nothing happens. The new syntax fixes this issue:

from selenium import webdriver
from selenium.webdriver.edge.options import Options

options = Options()
options.add_experimental_option("prefs", {"download.default_directory": my_download_folder, "download.prompt_for_download": False, 'profile.default_content_settings.popups': False})     
options.add_experimental_option("excludeSwitches", ["enable-logging"])
options.add_argument('log-level=3') 
options.add_argument("headless=new")
browser = webdriver.Edge(options=options)
browser.get(url)
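
Even with headless downloads working, a script that exits right after the click can cut the download short. Here is a rough sketch of a wait helper that uses only the standard library. It assumes the download folder is empty beforehand and that Edge, like other Chromium-based browsers, gives in-progress downloads a .crdownload suffix. wait_for_download is a hypothetical helper name, not a Selenium function.

```python
import os
import time


def wait_for_download(folder, timeout=30.0, poll_interval=0.5):
    """Wait until folder holds at least one file and no partial downloads.

    Returns the sorted list of file names, or raises TimeoutError.
    """
    deadline = time.monotonic() + timeout
    while True:
        names = sorted(os.listdir(folder))
        # Assumption: unfinished downloads carry the ".crdownload" suffix.
        partial = [name for name in names if name.endswith(".crdownload")]
        if names and not partial:
            return names
        if time.monotonic() >= deadline:
            raise TimeoutError(f"download did not finish in {folder!r}")
        time.sleep(poll_interval)
```

It could be called after clicking the download link, for example wait_for_download(my_download_folder), before the browser is closed.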
