
I've got an array with a list of image URLs that I'm trying to search/replace with a regex (via gsub). The values are in the format //subdomain.website.com/folder/image.extension. I want to add 'https' in front of each array entry.

I've tried to use gsub, but the array remains unchanged:

matches = source.scan(/(\/\/\w+\.\w+\.\w{2,4}\/\w+\/\w+\.\w{2,4})/).uniq
matches.each {|value| value.to_s.gsub!(/\/\//, 'https://')}

In Perl, I could do something like this to change each value:

for (@matches) {
    s/\/\//https:\/\//g;
}

Am I calling the gsub function in an incorrect manner?

  • Don't use regex to modify URLs; instead, use the URI class to parse them, then change the scheme. See the example. Commented Nov 10, 2014 at 5:59

4 Answers


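First of all, I find it strange that you are calling to_s on value. Because your regex contains a capture group, scan returns an array of arrays, so each value is itself an array. Converting it to a string includes the array notation, so value.to_s might look something like ["//subdomain.website.com/folder/image.exte"]. It is also a brand-new string, so gsub! only modifies that temporary copy and the array stays unchanged.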

You can avoid this by changing your regex to not include a capture group:

/\/\/\w+\.\w+\.\w{2,4}\/\w+\/\w+\.\w{2,4}/
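To make the difference concrete, here is a minimal sketch comparing the two patterns. The source string is a made-up sample (an assumption, not the asker's real input); everything else uses only String#scan, Array#to_s and String#gsub!.

```ruby
# Hypothetical sample input; the real `source` would be the scraped page.
source = '<img src="//cdn.example.com/img/logo.png">'

# With a capture group, scan returns one sub-array per match.
with_group    = source.scan(/(\/\/\w+\.\w+\.\w{2,4}\/\w+\/\w+\.\w{2,4})/)
# Without a capture group, scan returns the matched strings directly.
without_group = source.scan(/\/\/\w+\.\w+\.\w{2,4}\/\w+\/\w+\.\w{2,4}/)

p with_group     # => [["//cdn.example.com/img/logo.png"]]
p without_group  # => ["//cdn.example.com/img/logo.png"]

# to_s builds a new string, so gsub! edits that temporary copy only.
with_group.each { |value| value.to_s.gsub!(/\/\//, 'https://') }
p with_group     # => [["//cdn.example.com/img/logo.png"]] (unchanged)
```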

Now to the main part of your question: you should be calling map on matches instead of each. The map method returns a new array containing the result of calling the supplied block with each element. It does not modify the original array, so either assign the result or use map! to replace the elements in place.

Put together it might look like this:

matches = source.scan(/\/\/\w+\.\w+\.\w{2,4}\/\w+\/\w+\.\w{2,4}/).uniq
# map! replaces each element of matches with the block's return value
matches.map! { |value| value.gsub(/\/\//, 'https://') }
# => ["https://subdomain.website.com/folder/image.exte"]
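As a rough sketch of the options once the array holds plain strings, the snippet below contrasts map, each with gsub!, and map!. The sample URLs and the variable names copy, in_place and replaced are made up for illustration.

```ruby
# Hypothetical sample data standing in for the scan results.
matches = ['//a.example.com/img/one.png', '//b.example.com/img/two.png']

# map returns a new array; the receiver is left alone.
copy = matches.map { |value| value.gsub(/\/\//, 'https://') }
p matches.first  # => "//a.example.com/img/one.png"
p copy.first     # => "https://a.example.com/img/one.png"

# each + gsub! mutates every string in place, much like the Perl loop.
in_place = matches.map(&:dup)
in_place.each { |value| value.gsub!(/\/\//, 'https://') }

# map! replaces the elements of the array itself.
replaced = matches.map(&:dup)
replaced.map! { |value| value.gsub(/\/\//, 'https://') }
```

One thing to watch: gsub! returns nil when nothing was replaced, so using its return value inside map or map! can leave nil entries in the array.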

1 Comment

Thank you. I was under the impression that I needed a capture group to save the regex matches. Now I know. Also, I found it necessary to use gsub! in place of gsub to successfully change the array values.

If you know every element of your array is formatted properly and ready to have "https:" prepended, it seems like concatenating would be simpler than gsub. For example,

matches.map! { |value| "https:" << value }

should work once you have an array of strings as described by @August.


You could try something like this.

matches.map { |m| "https:#{m}" }

This should add https: to the front of every element. Note that map returns a new array, so assign the result (or use map!).


Ruby comes with a nice class for this called URI, so take advantage of it:

require 'uri'

uri = URI.parse('//www.example.com')  # => #<URI::Generic:0x007ff0098581e8 URL://www.example.com>
uri.scheme = 'https'                  # => "https"
uri.to_s                              # => "https://www.example.com"

If you want to process a list of URLs:

%w[
  //www.example.com
].map{ |url|                          # => ["//www.example.com"]
  uri = URI.parse(url)                # => #<URI::Generic:0x007ff009853350 URL://www.example.com>
  uri.scheme = 'https'                # => "https"
  uri.to_s                            # => "https://www.example.com"
}                                     # => ["https://www.example.com"]

The advantage of URI is it's smart enough to do the right thing if the URL already has a scheme or is missing it entirely:

require 'uri'

%w[
  http://foo.com
  https://foo.com
  //foo.com
].map { |url|
  uri = URI.parse(url)
  uri.scheme = 'https'
  uri.to_s 
} # => ["https://foo.com", "https://foo.com", "https://foo.com"]

If you insist on using a regex, then simplify it:

url = '//www.example.com'
url[/^/] = 'https:'
url # => "https://www.example.com"

And:

%w[
  //www.example.com
].map{ |url|           # => ["//www.example.com"]
  url[/^/] = 'https:'  # => "https:"
  url                  # => "https://www.example.com"
}                      # => ["https://www.example.com"]

A regular expression this simple can't tell whether the scheme already exists, so more code has to be written to handle that situation.
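For illustration only, the extra code might look something like the sketch below. The helper name force_https is hypothetical (it is not part of Ruby or of this answer), and it assumes the only schemes to normalize are http and https.

```ruby
# Hypothetical helper: force the https scheme using a regex.
# Handles "//host", "http://host" and "https://host";
# any other input is returned unchanged.
def force_https(url)
  url.sub(%r{\A(?:https?:)?//}, 'https://')
end

urls = %w[http://foo.com https://foo.com //foo.com]
p urls.map { |url| force_https(url) }
# => ["https://foo.com", "https://foo.com", "https://foo.com"]
```

The pattern is anchored with \A so only a leading scheme is touched. URI still handles more cases, so this is a narrower substitute rather than an equivalent.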

1 Comment

Slightly off-topic: do other languages (PHP/Python/Perl) have similar functionality as the URI class in Ruby? For example, I know Python has urllib.parse, though it does not seem capable of changing a URI's scheme.
