class HTML::Pipeline::AbsoluteSourceFilter

Public Instance Methods

call()

HTML filter that replaces relative and root-relative image URLs with fully qualified URLs.

This is useful if an image is root-relative but should be served through a CDN, or if the content assumes the host is already known, e.g. scraped web pages and some RSS feeds.

Context options:

:image_base_url - Base URL of the image host, used for root-relative src values (those starting with "/").
:image_subpage_url - Base URL of the page, used for all other relative src values.

This filter does not write additional information to the context. It should be run before CamoFilter.
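The way the two context options are chosen can be sketched with the standard library alone. This is a minimal illustration, not the filter itself: `resolve_src` is a hypothetical helper that mirrors the branch in `call`, and the example.com URLs are made-up values.

```ruby
require 'uri'

# Hypothetical context values, chosen only for illustration.
context = {
  image_base_url:    'http://cdn.example.com',
  image_subpage_url: 'http://blog.example.com/posts/2013/'
}

# Hypothetical helper mirroring the branch in #call:
# root-relative paths resolve against :image_base_url,
# any other non-http path resolves against :image_subpage_url.
def resolve_src(src, context)
  src = src.strip
  return src if src.start_with?('http')
  base = src.start_with?('/') ? context[:image_base_url] : context[:image_subpage_url]
  URI.join(base, src).to_s
end

root_relative = resolve_src('/img/logo.png', context)
# => "http://cdn.example.com/img/logo.png"
relative = resolve_src('pics/cat.png', context)
# => "http://blog.example.com/posts/2013/pics/cat.png"
absolute = resolve_src('http://other.example.com/a.png', context)
# => "http://other.example.com/a.png" (left unchanged)
```

Note that the trailing slash on the hypothetical :image_subpage_url matters: without it, URI.join treats the last path segment as a file name and replaces it.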

# File lib/html/pipeline/absolute_source_filter.rb, line 20
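def call
  doc.search("img").each do |element|
    # Skip images without a usable src attribute.
    next if element['src'].nil? || element['src'].empty?
    src = element['src'].strip
    # Anything not starting with "http" is treated as relative.
    unless src.start_with? 'http'
      if src.start_with? '/'
        # Root-relative: resolve against the image host.
        base = image_base_url
      else
        # Relative: resolve against the page URL.
        base = image_subpage_url
      end
      element["src"] = URI.join(base, src).to_s
    end
  end
  doc
end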

image_base_url()

Private: the base URL used to resolve root-relative image src values. Raises if context[:image_base_url] is missing.

# File lib/html/pipeline/absolute_source_filter.rb, line 37
def image_base_url
  context[:image_base_url] or raise "Missing context :image_base_url for #{self.class.name}"
end

image_subpage_url()

Private: the page URL used to resolve relative image src values. Raises if context[:image_subpage_url] is missing.

# File lib/html/pipeline/absolute_source_filter.rb, line 42
def image_subpage_url
  context[:image_subpage_url] or raise "Missing context :image_subpage_url for #{self.class.name}"
end
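
The `or raise` idiom in both helpers means a missing context key surfaces as a RuntimeError the first time a matching image is encountered. A minimal sketch of that behavior, using a hypothetical `FakeFilter` class that is not part of html-pipeline and only copies the helper's body:

```ruby
# Hypothetical stand-in for the filter; only the context lookup is reproduced.
class FakeFilter
  attr_reader :context

  def initialize(context)
    @context = context
  end

  # Same body as the documented helper: return the value or raise.
  def image_base_url
    context[:image_base_url] or raise "Missing context :image_base_url for #{self.class.name}"
  end
end

present = FakeFilter.new(image_base_url: 'http://cdn.example.com').image_base_url

begin
  FakeFilter.new({}).image_base_url
rescue RuntimeError => e
  # raise with a String argument produces a RuntimeError.
  message = e.message
end
```

Because the lookup happens lazily inside `call`, a document with no relative image URLs would never trigger the error even when the context keys are absent.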