Is there a pre-existing function or class for URL normalization in PHP?
Specifically, one that follows the semantics-preserving normalization rules laid out in the Wikipedia article on URL normalization (or whatever 'standard' I should be following):
- Converting the scheme and host to lower case
- Capitalizing letters in escape sequences
- Adding trailing / (to directories, not files)
- Removing the default port
- Removing dot-segments
Right now I'm thinking that I'll just use parse_url() and apply the rules individually, but I'd prefer to avoid reinventing the wheel.
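For reference, the parse_url() approach could look roughly like the sketch below. It is not an existing library function: `normalize_url()` and `remove_dot_segments()` are hypothetical names, the default-port table only covers http and https, and the dot-segment removal is a simplified take on RFC 3986 section 5.2.4 that assumes an absolute URL. The "trailing / for directories" rule is only applied to an empty path, since a URL alone does not say whether a path is a directory. It assumes PHP 7.4 or later.

```php
<?php
// Hypothetical helper: collapse "." and ".." segments in an absolute path.
function remove_dot_segments(string $path): string
{
    $segments = explode('/', $path);
    $last = end($segments);
    $out = [];
    foreach ($segments as $segment) {
        if ($segment === '.') {
            continue;
        }
        if ($segment === '..') {
            // Never pop the leading empty element that represents the root "/".
            if (count($out) > 1) {
                array_pop($out);
            }
            continue;
        }
        $out[] = $segment;
    }
    // A trailing "." or ".." leaves a directory, so keep the trailing slash.
    if ($last === '.' || $last === '..') {
        $out[] = '';
    }
    return implode('/', $out);
}

// Hypothetical helper: returns the normalized URL, or null if it is not absolute.
function normalize_url(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return null;
    }

    // Scheme and host are case-insensitive, so lowercase them.
    $scheme = strtolower($parts['scheme']);
    $host   = strtolower($parts['host']);

    // Drop the port when it is the default for the scheme (assumed table).
    $defaultPorts = ['http' => 80, 'https' => 443];
    $port = $parts['port'] ?? null;
    if ($port !== null && ($defaultPorts[$scheme] ?? null) === $port) {
        $port = null;
    }

    // An empty path becomes "/"; then remove dot-segments.
    $path = remove_dot_segments($parts['path'] ?? '/');

    // Uppercase the hex digits of percent-escapes (%2f becomes %2F).
    $upper = fn(string $s): string => preg_replace_callback(
        '/%[0-9a-f]{2}/i',
        fn(array $m): string => strtoupper($m[0]),
        $s
    );

    $result = $scheme . '://';
    if (isset($parts['user'])) {
        $result .= $parts['user'];
        if (isset($parts['pass'])) {
            $result .= ':' . $parts['pass'];
        }
        $result .= '@';
    }
    $result .= $host;
    if ($port !== null) {
        $result .= ':' . $port;
    }
    $result .= $upper($path);
    if (isset($parts['query'])) {
        $result .= '?' . $upper($parts['query']);
    }
    if (isset($parts['fragment'])) {
        $result .= '#' . $upper($parts['fragment']);
    }
    return $result;
}
```

With this sketch, `normalize_url('HTTP://Example.COM:80/a/./b/../c%2fd?q=%3a')` gives `http://example.com/a/c%2Fd?q=%3A`, and `normalize_url('http://stackoverflow.com')` gives `http://stackoverflow.com/`.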
<link rel="canonical"...>. Just normalizing a URL for, say, requesting data about it from an API, particularly one that requires the URL to be hashed: if you don't use a normalized URL, you'll get inaccurate results or none at all.
http://stackoverflow.com and http://stackoverflow.com//? Can you provide more examples of the URLs you are trying to avoid?