I have a file containing multiple URLs and a list of URL paths. I am trying to join each URL from the file with a path from the list, but I am having a lot of trouble getting this to work.

The file has URLs like this:

foobar.com
foobar.com.tk
foobar.org

This is the list of paths and the code that reads the file:

list1 = ['/foobar.php','/foobar.html','/foobar.php']

with open('file1.txt') as f:
    Nlist = [line.strip() for line in f]

I don't know whether it matters, but the URLs in the file don't have the http:// prefix. When I try to join the URLs from the file with the paths from the list, I either get an error or the output is all bunched together. How do I join the URLs from the file with the paths from the list?

  • Can you share the output you expect to get for this sample data? It would help make the question clearer. Commented Jan 16, 2020 at 8:25
  • Please provide more details for the error and any actual/expected output. You can concatenate lists using list_a + list_b, regardless of where their data comes from. Commented Jan 16, 2020 at 8:26
  • Oh yes, sorry, I meant to include that but forgot. The output should look something like this: foobar.com/foobar.php foobar.org/foobar.html foobar.com.tk/foobar.php Commented Jan 16, 2020 at 8:27
  • So will I have to put the file into a list first? I am trying to iterate through them so I can make a POST request with each individual URL. If I do list_a + list_b it just adds the lists together. I need to join the URLs in the file with the paths in the list so they form a URL like foobar.com/foobar.php. Commented Jan 16, 2020 at 8:31
  • lines = list(open(filename, 'r')) Commented Jan 16, 2020 at 8:34

2 Answers

You could get every url combination with itertools.product:

from itertools import product
from pprint import pprint

list1 = ["/foobar.php", "/foobar.html", "/foobar.php"]

with open("file1.txt") as f:
    pprint(
        set(
            "https://%s%s" % (root, path)
            for root, path in product(map(str.strip, f), list1)
        )
    )

Urls:

{'https://foobar.com.tk/foobar.html',
 'https://foobar.com.tk/foobar.php',
 'https://foobar.com/foobar.html',
 'https://foobar.com/foobar.php',
 'https://foobar.org/foobar.html',
 'https://foobar.org/foobar.php'}

Note: set() is used here to remove the duplicate URLs that arise because '/foobar.php' appears twice in list1.
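
If the order of the URLs matters, for example when sending requests one after another, a set will not preserve it. The following is a minimal sketch of one alternative, not part of the original answer: build_urls is a hypothetical helper name, the hosts are assumed to be already read into a list, and dict.fromkeys is swapped in for set() so duplicates are dropped while first-seen order is kept.

```python
from itertools import product


def build_urls(hosts, paths, scheme="https"):
    """Return every host/path combination once, in first-seen order.

    build_urls is a hypothetical helper used only for illustration.
    """
    combos = (
        "%s://%s%s" % (scheme, host.strip(), path)
        for host, path in product(hosts, paths)
        if host.strip()  # ignore blank lines from the file
    )
    # dict keys are unique and keep insertion order (Python 3.7+)
    return list(dict.fromkeys(combos))


# Sample data from the question; the trailing newlines mimic lines read from a file
hosts = ["foobar.com\n", "foobar.com.tk\n", "foobar.org\n"]
list1 = ["/foobar.php", "/foobar.html", "/foobar.php"]

urls = build_urls(hosts, list1)
for url in urls:
    print(url)
```

Passing the open file object instead of the hosts list should work the same way, since product only needs an iterable of lines.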

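list1 = ['/foobar.php', '/foobar.html', '/foobar.php']

with open('file1.txt') as f:
    # Skip blank lines so a trailing newline doesn't produce an empty host
    Nlist = [line.strip() for line in f if line.strip()]

# zip pairs the i-th host with the i-th path and stops at the shorter list,
# so a longer file cannot raise an IndexError
for host, path in zip(Nlist, list1):
    pth = 'https://' + host + path
    print(pth)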

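os.path.join() is meant for filesystem paths rather than URLs: because every path in list1 starts with '/', it treats the path as absolute and discards the host in front of it. Plain string concatenation is the simpler option here.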
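
As a small illustration of why a filesystem-style join does not suit URLs, the sketch below compares it with urllib.parse.urljoin from the standard library. posixpath.join stands in for os.path.join only so the result is the same on every platform, and the host and path values are the sample data from the question.

```python
import posixpath
from urllib.parse import urljoin

host = "foobar.com"
path = "/foobar.php"

# A filesystem join treats the leading "/" as an absolute path
# and throws away everything that came before it.
fs_joined = posixpath.join(host, path)

# urljoin needs a scheme on the base URL to recognise the host part.
url_joined = urljoin("https://" + host, path)

print(fs_joined)
print(url_joined)
```

For a bare host plus a path that starts with '/', urljoin and simple concatenation give the same URL, so either should be fine for this data.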
