module Corefines::String::ForceUTF8
@!method #force_utf8
Returns a copy of _str_ with its encoding changed to UTF-8 and all invalid byte sequences replaced with the Unicode Replacement Character (U+FFFD). If _str_ responds to +#scrub!+ (Ruby >= 2.1), that method is used to replace the invalid bytes. Otherwise a simple custom implementation is used, which may not return the same result as +#scrub!+.
@return [String] a valid UTF-8 string.
@!method #force_utf8!
Changes the encoding to UTF-8, replaces all invalid byte sequences with the Unicode Replacement Character (U+FFFD), and returns self. This is the same as {#force_utf8}, except that it modifies the receiver in place.
@return (see #force_utf8)
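The difference between the copying and the in-place variant can be illustrated without the gem. The following is a minimal sketch in plain Ruby that approximates the +#scrub!+ code path using only core String methods; the sample bytes and the variable names are assumptions made for illustration and are not part of Corefines.

```ruby
# Sample input (an assumption for illustration): bytes from a Latin-1 source.
# "\xE9" on its own is not a valid UTF-8 sequence.
raw = "caf\xE9 ok".b

# Copying variant: retag a duplicate as UTF-8 and scrub it,
# roughly what force_utf8 does via dup. The original stays untouched.
copy = raw.dup.force_encoding(Encoding::UTF_8).scrub

# In-place variant: retag and scrub the receiver itself,
# roughly what force_utf8! does on Rubies that provide scrub!.
in_place = raw.dup
in_place.force_encoding(Encoding::UTF_8).scrub!

puts copy.valid_encoding?  # the invalid byte is now U+FFFD
puts raw.encoding          # still binary, because only the copy was changed
```

With the refinement active, the same two steps are what a call to +force_utf8+ or +force_utf8!+ performs on the receiver.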
Public Instance Methods
force_utf8()
# File lib/corefines/string.rb, line 208
def force_utf8
  dup.force_utf8!
end
force_utf8!()
# File lib/corefines/string.rb, line 212
def force_utf8!
  str = force_encoding(Encoding::UTF_8)

  if str.respond_to? :scrub!
    str.scrub!
  else
    result = ''.force_encoding('BINARY')
    invalid = false

    str.chars.each do |c|
      if c.valid_encoding?
        result << c
        invalid = false
      elsif !invalid
        result << "\uFFFD"
        invalid = true
      end
    end

    replace result.force_encoding(Encoding::UTF_8)
  end
end
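The fallback branch emits a single U+FFFD for each run of consecutive invalid characters, which is one way its output can differ from +#scrub!+. The sketch below restates that loop as a standalone method so the behavior can be tried in isolation. The name +replace_invalid_runs+ is hypothetical and not part of Corefines, and the sketch builds its result in a UTF-8 buffer instead of a binary one.

```ruby
# Hypothetical helper (not part of Corefines) that mirrors the fallback loop:
# every run of consecutive invalid characters becomes one U+FFFD.
def replace_invalid_runs(str)
  utf8 = str.dup.force_encoding(Encoding::UTF_8)
  result = String.new(encoding: Encoding::UTF_8)
  in_invalid_run = false

  utf8.each_char do |c|
    if c.valid_encoding?
      result << c
      in_invalid_run = false
    elsif !in_invalid_run
      # First invalid character of a run: emit one replacement character.
      result << "\uFFFD"
      in_invalid_run = true
    end
    # Further invalid characters in the same run are dropped.
  end

  result
end

# Two adjacent invalid bytes collapse into a single replacement character.
collapsed = replace_invalid_runs("a\xFF\xFEb".b)
puts collapsed.length  # 3
```

Collapsing a run keeps the output short when the input contains long stretches of binary garbage, at the cost of losing the count of bytes that were replaced.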