How to get a full URL from a resource URL in Python

Question

On web pages, resources such as images, CSS and JavaScript are loaded by the client's web browser when they are embedded with <img>, <link> and <script> tags respectively.

A resource URL can take different forms. It can be a full URL, for example:

http://cdn.mysite.com/images/animage.jpg

It can be a relative path:

images/animage.jpg
../images/animage.jpg

Or a path relative to the site root:

/images/animage.jpg

How can I write a function in Python that takes the URL of the page and the URL of a resource on it, and always returns the full URL of the resource?

For example:

def resource_url(page, resource):
    # if the resource is a full URL, return that
    # if not, use the page URL and the resource to return the full URL
    ...

Have you looked at the urllib.parse.urljoin function? docs.python.org/release/3.1.3/library/urllib.parse.html
– user130076, Commented Feb 23, 2012 at 14:18

platinummonkey · Accepted Answer · 2012-02-23 14:19:40Z

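from urllib.parse import urljoin  # Python 2: from urlparse import urljoin

def resource_url(page, resource):
  if not resource.startswith(page):
    # The resource doesn't start with the page URL (e.g. http://example.com),
    # so resolve it against the page. urljoin also copes with a resource
    # that is already a full URL on another host.
    resource = urljoin(page, resource)
  return resource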
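
To check this against the forms listed in the question, here is a minimal sketch that calls urljoin directly for each of them. The page URL http://www.mysite.com/blog/post.html is a made-up value for illustration, and the function is a simplified variant without the startswith check rather than the exact code above.

```python
from urllib.parse import urljoin


def resource_url(page, resource):
    # urljoin resolves relative references against the page URL; a resource
    # that is already a full URL comes back pointing at the same address.
    return urljoin(page, resource)


# Hypothetical page URL, used only for illustration.
page = "http://www.mysite.com/blog/post.html"

# The resource forms from the question.
examples = [
    "http://cdn.mysite.com/images/animage.jpg",  # full URL
    "images/animage.jpg",                        # relative to the page's directory
    "../images/animage.jpg",                     # relative, one directory up
    "/images/animage.jpg",                       # relative to the site root
]

for resource in examples:
    print(resource, "->", resource_url(page, resource))
```

With that page URL, the full URL is returned unchanged, images/animage.jpg resolves to http://www.mysite.com/blog/images/animage.jpg, and both ../images/animage.jpg and /images/animage.jpg resolve to http://www.mysite.com/images/animage.jpg.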
