module Linguistics::EN::Numbers
Numeric methods for the English-language Linguistics module.
Constants
- MANY_RANGE
- NTH
Numerical inflections
- NUMBER_RANGE
- NUMBER_TO_WORDS_FUNCTIONS
A collection of functions for transforming digits into word phrases. Indexed by the number of digits being transformed; e.g., NUMBER_TO_WORDS_FUNCTIONS[2] is the function for transforming double-digit numbers.
- NUMEROUS_RANGE
- NUMWORD_DEFAULTS
Default configuration arguments for the numwords function
- ORDINALS
Ordinal word parts
- ORDINAL_SUFFIXES
- QUANTIFY_DEFAULTS
Default configuration arguments for the quantify function
- SEVERAL_RANGE
Default ranges for quantify
- TEENS
- TENS
- THOUSANDS
- UNITS
Numeral names
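The digit-count indexing described for NUMBER_TO_WORDS_FUNCTIONS can be illustrated with a small standalone sketch. The names DIGIT_FUNCTIONS and UNIT_WORDS and the lambda bodies are hypothetical stand-ins, not the library's constants; the calling convention (zero word first, then the digits) is inferred from the fn.call in number_to_custom_word_groups.

```ruby
# Hypothetical stand-in for NUMBER_TO_WORDS_FUNCTIONS: an Array of lambdas
# where index N handles groups of N digits. Index 0 is unused.
UNIT_WORDS = %w[zero one two three four five six seven eight nine].freeze

DIGIT_FUNCTIONS = [
  nil,
  # One digit: substitute the caller's zero word for 0.
  lambda { |zeroword, a| a.zero? ? zeroword : UNIT_WORDS[a] },
  # Two digits: spelled digit by digit here, for brevity only.
  lambda { |zeroword, a, b|
    [a, b].map { |d| d.zero? ? zeroword : UNIT_WORDS[d] }.join(' ')
  }
].freeze

numerals = [4, 0]
fn = DIGIT_FUNCTIONS[numerals.length]   # pick the handler by digit count
puts fn.call('oh', *numerals)           # prints "four oh"
```

Indexing by digit count keeps the dispatch to a single Array lookup, at the cost of one unused slot at index 0.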
Public Instance Methods
Split the given number up into groups of groupsize digits and return them as an Array of words. Use zeroword for any occurrences of '0'.
# File lib/linguistics/en/numbers.rb, line 428
def number_to_custom_word_groups( number, groupsize, zeroword="zero" )
  self.log.debug "Making custom word groups of %d digits out of %p" % [ groupsize, number ]

  # Build a Regexp with <config[:group]> number of digits. Any past
  # the first are optional.
  re = Regexp.new( "(\\d)" + ("(\\d)?" * (groupsize - 1)) )
  self.log.debug " regex for matching groups of %d digits is %p" % [ groupsize, re ]

  # Scan the string, and call the word-chunk function that deals with
  # chunks of the found number of digits.
  return number.to_s.scan( re ).collect do |digits|
    self.log.debug " digits = %p" % [ digits ]
    numerals = digits.flatten.compact.collect {|i| i.to_i}
    self.log.debug " numerals = %p" % [ numerals ]

    fn = NUMBER_TO_WORDS_FUNCTIONS[ numerals.length ]
    self.log.debug " number to word function is #%d: %p" % [ numerals.length, fn ]
    fn.call( zeroword, *numerals ).strip
  end
end
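The grouping step can be tried on its own. The following is a minimal sketch that builds the same kind of pattern and shows what the scan yields; the helper name digit_groups is hypothetical, and the word lookup is left out.

```ruby
# Build the same kind of pattern the method uses: one required digit
# followed by (groupsize - 1) optional digits.
def digit_groups(number, groupsize)
  re = Regexp.new("(\\d)" + ("(\\d)?" * (groupsize - 1)))
  number.to_s.scan(re).map do |digits|
    # Unmatched optional captures come back as nil; drop them.
    digits.compact.map(&:to_i)
  end
end

p digit_groups(12345, 2)   # [[1, 2], [3, 4], [5]]
p digit_groups(12345, 3)   # [[1, 2, 3], [4, 5]]
```

Because the optional captures are filled left to right, any short group always ends up last, which is why the method can pick its word function from the length of each compacted group.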
Split the given number up into groups of three and return the Array of words describing each group in the standard style.
# File lib/linguistics/en/numbers.rb, line 452
def number_to_standard_word_groups( number, andword="and" )
  phrase = number.to_s
  phrase.sub!( /\A\s*0+/, '' )
  chunks = []
  mill = 0
  self.log.debug "Making standard word groups out of %p" % [ phrase ]

  # Match backward from the end of the digits in the string, turning
  # chunks of three, of two, and of one into words.
  mill += 1 while phrase.sub!( /(\d)(\d)(\d)(?=\D*\Z)/ ) do
    words = to_hundreds( $1.to_i, $2.to_i, $3.to_i, mill, andword )
    chunks.unshift words.strip.squeeze(' ') unless words.nil?
    ''
  end

  phrase.sub!( /(\d)(\d)(?=\D*\Z)/ ) do
    chunks.unshift to_tens( $1.to_i, $2.to_i, mill ).strip.squeeze(' ')
    ''
  end
  phrase.sub!( /(\d)(?=\D*\Z)/ ) do
    chunks.unshift to_units( $1.to_i, mill ).strip.squeeze(' ')
    ''
  end

  return chunks
end
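The backward matching relies on a lookahead that only succeeds on the rightmost digits. A minimal sketch of that idea, assuming a hypothetical helper triples_from_right that returns the raw digit groups instead of words and does not track the thousands counter:

```ruby
def triples_from_right(number)
  digits = number.to_s.sub(/\A\s*0+/, '')   # drop leading zeros
  groups = []
  loop do
    # Same pattern as the method: three digits followed only by non-digits
    # up to the end of the string, i.e. the rightmost three digits.
    matched = digits.sub!(/(\d)(\d)(\d)(?=\D*\Z)/) do
      groups.unshift([$1, $2, $3].join)
      ''   # remove the matched digits from the working string
    end
    break unless matched
  end
  # Whatever is left (one or two digits) becomes the leading group.
  groups.unshift(digits) unless digits.empty?
  groups
end

p triples_from_right(1234567)   # ["1", "234", "567"]
```

Unlike the method above, this sketch keeps all-zero groups such as "000"; the original drops them because to_hundreds returns nil for them.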
Return the specified number as an Array of number phrases.
# File lib/linguistics/en/numbers.rb, line 415
def number_to_words( number, config )
  return [config[:zero]] if number.to_i.zero?

  if config[:group].nonzero? then
    return number_to_custom_word_groups( number, config[:group], config[:zero] )
  else
    return number_to_standard_word_groups( number, config[:and] )
  end
end
Return the specified number as English words. One or more configuration values may be passed to control the returned String:
- :group
Controls how many digits at a time are grouped together. Valid values are 0 (normal grouping), 1 (single-digit grouping, e.g., “one, two, three, four”), 2 (double-digit grouping, e.g., “twelve, thirty-four”), or 3 (triple-digit grouping, e.g., “one twenty-three, four”).
- :comma
Set the character/s used to separate word groups. Defaults to ", ".
- :and
Set the word and/or characters used where ' and ' (the default) is normally used. Setting :and to ' ', for example, will cause 2556 to be returned as “two-thousand, five hundred fifty-six” instead of “two-thousand, five hundred and fifty-six”.
- :zero
Set the word used to represent the numeral 0 in the result. 'zero' is the default.
- :decimal
Set the translation of any decimal points in the number; the default is 'point'.
- :as_array
If set to a true value, the number will be returned as an Array of word groups instead of a String.
# File lib/linguistics/en/numbers.rb, line 136
def numwords( hashargs={} )
  num = self.to_s
  self.log.debug "Turning %p into number words..." % [ num ]
  config = NUMWORD_DEFAULTS.merge( hashargs )
  raise "Bad chunking option: #{config[:group]}" unless config[:group].between?( 0, 3 )

  # Array of number parts: first is everything to the left of the first
  # decimal, followed by any groups of decimal-delimited numbers after that
  parts = []

  # Wordify any sign prefix
  sign = (/\A\s*\+/ =~ num) ? 'plus' : (/\A\s*\-/ =~ num) ? 'minus' : ''

  # Strip any ordinal suffixes
  ord = true if num.sub!( /(st|nd|rd|th)\Z/, '' )

  # Split the number into chunks delimited by '.'
  chunks = if !config[:decimal].empty? then
      if config[:group].nonzero?
        num.split(/\./)
      else
        num.split(/\./, 2)
      end
    else
      [ num ]
    end

  # Wordify each chunk, pushing arrays into the parts array
  chunks.each_with_index do |chunk,section|
    chunk.gsub!( /\D+/, '' )
    self.log.debug " working on chunk %p (section %d)" % [ chunk, section ]

    # If there's nothing in this chunk of the number, set it to zero
    # unless it's the whole-number part, in which case just push an
    # empty array.
    if chunk.empty?
      self.log.debug " chunk is empty..."
      if section.zero?
        self.log.debug " skipping the empty whole-number part"
        parts.push []
        next
      end
    end

    # Split the number section into wordified parts unless this is the
    # second or succeeding part of a non-group number
    unless config[:group].zero? && section.nonzero?
      parts.push number_to_words( chunk, config )
      self.log.debug " added %p" % [ parts.last ]
    else
      parts.push number_to_words( chunk, config.merge(:group => 1) )
      self.log.debug " added %p" % [ parts.last ]
    end
  end

  self.log.debug "Parts => %p" % [ parts ]

  # Turn the last word of the whole-number part back into an ordinal if
  # the original number came in that way.
  if ord && !parts[0].empty?
    self.log.debug " turning the last whole-number part back into an ordinal, since it " +
      "came in that way"
    parts[0][-1] = ordinal( parts[0].last )
  end

  # If the caller's expecting an Array return, just flatten and return the
  # parts array.
  if config[:as_array]
    self.log.debug " returning the number parts as an Array"
    unless sign.empty?
      parts[0].unshift( sign )
    end
    return parts.flatten
  end

  # Catenate each sub-parts array into a whole number part and one or more
  # post-decimal parts. If grouping is turned on, all sub-parts get joined
  # with commas, otherwise just the whole-number part is.
  if config[:group].zero?
    self.log.debug " no custom grouping"
    if parts[0].length > 1
      self.log.debug " whole and decimal part; working on the whole number first"

      # Join all but the last part together with commas
      wholenum = parts[0][0...-1].join( config[:comma] )

      # If the last part is just a single word, append it to the
      # wholenum part with an 'and'. This is to get things like 'three
      # thousand and three' instead of 'three thousand, three'.
      if /^\s*(\S+)\s*$/ =~ parts[0].last
        self.log.debug "last word is a single word; using the 'and' separator: %p" %
          [ config[:and] ]
        wholenum += config[:and] + parts[0].last
      else
        self.log.debug "last word has multiple words; using the comma separator: %p" %
          [ config[:comma] ]
        wholenum += config[:comma] + parts[0].last
      end
    else
      self.log.debug " non-decimal."
      wholenum = parts[0][0]
    end

    decimals = parts[1..-1].collect {|part| part.join(" ")}
    self.log.debug " wholenum: %p; decimals: %p" % [ wholenum, decimals ]

    # Join with the configured decimal; if it's empty, just join with
    # spaces.
    unless config[:decimal].empty?
      self.log.debug " joining with the configured decimal: %p" % [ config[:decimal] ]
      return sign + ([ wholenum ] + decimals).
        join( " #{config[:decimal]} " ).strip
    else
      self.log.debug " joining with the spaces since no decimal is configured"
      return sign + ([ wholenum ] + decimals).
        join( " " ).strip
    end
  else
    self.log.debug " grouping with decimal %p and comma %p" %
      config.values_at( :decimal, :comma )
    return parts.compact.
      separate( config[:decimal] ).
      delete_if {|el| el.empty?}.
      join( config[:comma] ).
      strip
  end
end
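The first half of numwords (merging options, detecting a sign, stripping an ordinal suffix, and splitting on the decimal point) can be sketched in isolation. This is an illustration only: SKETCH_DEFAULTS and preprocess are hypothetical names, the default values are reconstructed from the option descriptions above rather than copied from NUMWORD_DEFAULTS, and the empty :decimal case is left out.

```ruby
# Hypothetical defaults, reconstructed from the documented option defaults;
# the library's NUMWORD_DEFAULTS may contain different or additional keys.
SKETCH_DEFAULTS = {
  group: 0, comma: ', ', and: ' and ', zero: 'zero',
  decimal: 'point', as_array: false
}.freeze

def preprocess(input, options = {})
  config = SKETCH_DEFAULTS.merge(options)
  unless config[:group].between?(0, 3)
    raise ArgumentError, "Bad chunking option: #{config[:group]}"
  end

  num = input.to_s.dup
  sign = case num
         when /\A\s*\+/ then 'plus'
         when /\A\s*-/  then 'minus'
         else ''
         end
  # Remember whether an ordinal suffix was present, then strip it.
  ordinal = !num.sub!(/(st|nd|rd|th)\Z/, '').nil?
  # Without grouping, only the first '.' separates whole from fraction.
  chunks = config[:group].zero? ? num.split('.', 2) : num.split('.')
  { sign: sign, ordinal: ordinal, chunks: chunks.map { |c| c.gsub(/\D+/, '') } }
end

p preprocess('-12.50')
p preprocess('21st')
```

With grouping off, an input such as '1.2.3' yields two chunks, because everything after the first decimal point is treated as one run of fractional digits.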
Transform the given number into an ordinal word. The number object can be either an Integer or a String.
# File lib/linguistics/en/numbers.rb, line 270
def ordinal
  if self.respond_to?( :to_int )
    number = self.to_int
    return "%d%s" % [ number, (NTH[ number % 100 ] || NTH[ number % 10 ]) ]
  else
    number = self.to_s
    self.log.debug "Making an ordinal out of a non-Integer (%p)" % [ number ]
    return number.sub( /(#{ORDINAL_SUFFIXES})\Z/ ) { ORDINALS[$1] }
  end
end
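The Integer branch looks up the last two digits before falling back to the last digit. A minimal sketch of that lookup order, assuming a hypothetical SUFFIXES table shaped the way NTH appears to be used (the actual contents of NTH are not shown on this page):

```ruby
# Hypothetical suffix table; assumed to mirror the shape of NTH.
SUFFIXES = {
  0 => 'th', 1 => 'st', 2 => 'nd', 3 => 'rd', 4 => 'th',
  5 => 'th', 6 => 'th', 7 => 'th', 8 => 'th', 9 => 'th',
  11 => 'th', 12 => 'th', 13 => 'th'
}.freeze

def numeric_ordinal(number)
  # Check the last two digits first so 11, 12 and 13 take 'th'.
  format('%d%s', number, SUFFIXES[number % 100] || SUFFIXES[number % 10])
end

puts numeric_ordinal(22)    # 22nd
puts numeric_ordinal(113)   # 113th
```

Only the three irregular two-digit endings need their own entries; every other value misses the first lookup and falls through to the single-digit one.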
Transform the given number into an ordinate word.
# File lib/linguistics/en/numbers.rb, line 285
def ordinate
  return self.numwords.en.ordinal
end
Return a phrase describing the specified number of objects in the inflected object in general terms. The following options can be used to control the makeup of the returned quantity String:
- :joinword
Sets the word (and any surrounding spaces) used as the word separating the quantity from the noun in the resulting string. Defaults to ' of '.
# File lib/linguistics/en/numbers.rb, line 298
def quantify( number=0, args={} )
  phrase = self.to_s
  self.log.debug "Quantifying %d instances of %p" % [ number, phrase ]

  num = number.to_i
  config = QUANTIFY_DEFAULTS.merge( args )

  case num
  when 0
    phrase.en.no
  when 1
    phrase.en.a
  when SEVERAL_RANGE
    "several " + phrase.en.plural( num )
  when NUMBER_RANGE
    "a number of " + phrase.en.plural( num )
  when NUMEROUS_RANGE
    "numerous " + phrase.en.plural( num )
  when MANY_RANGE
    "many " + phrase.en.plural( num )
  else
    # Anything bigger than the MANY_RANGE gets described like
    # "hundreds of thousands of..." or "millions of..."
    # depending, of course, on how many there are.
    thousands, subthousands = Math::log10( num ).to_i.divmod( 3 )
    self.log.debug "thousands = %p, subthousands = %p" % [ thousands, subthousands ]

    stword = case subthousands
      when 2
        "hundreds"
      when 1
        "tens"
      else
        nil
      end

    unless thousands.zero?
      thword = to_thousands( thousands ).strip.en.plural
    end

    [ # Hundreds (of)...
      stword,

      # thousands (of)
      thword,

      # stars.
      phrase.en.plural(number)
    ].compact.join( config[:joinword] )
  end
end
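The else branch derives its wording from the order of magnitude alone. The sketch below reproduces that arithmetic outside the library; SCALES and magnitude_phrase are hypothetical names, and the noun is passed in already pluralized instead of going through phrase.en.plural.

```ruby
# Hypothetical scale words, indexed by the number of thousands groups.
SCALES = [nil, 'thousands', 'millions', 'billions'].freeze

def magnitude_phrase(count, noun_plural, joinword = ' of ')
  # Integer part of log10 is the exponent; divmod(3) splits it into whole
  # groups of three digits and the leftover 0, 1 or 2 places.
  thousands, subthousands = Math.log10(count).to_i.divmod(3)
  prefix = case subthousands
           when 2 then 'hundreds'
           when 1 then 'tens'
           end
  [prefix, SCALES[thousands], noun_plural].compact.join(joinword)
end

puts magnitude_phrase(250_000, 'stars')      # hundreds of thousands of stars
puts magnitude_phrase(45_000_000, 'stars')   # tens of millions of stars
```

Joining the compacted Array with the join word is what lets a missing prefix or scale word disappear without leaving a stray " of " behind.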
Transform the specified number of hundreds-, tens-, and units-place numerals into a word phrase. If the number of thousands (thousands) is greater than 0, it will be used to determine where the decimal point is in relation to the hundreds-place number.
# File lib/linguistics/en/numbers.rb, line 384
def to_hundreds( hundreds, tens=0, units=0, thousands=0, joinword=" and " )
  joinword = ' ' if joinword.empty?
  if hundreds.nonzero?
    return to_units( hundreds ) + " hundred" +
      (tens.nonzero? || units.nonzero? ? joinword : '') +
      to_tens( tens, units ) +
      to_thousands( thousands )
  elsif tens.nonzero? || units.nonzero?
    return to_tens( tens, units ) + to_thousands( thousands )
  else
    return nil
  end
end
Transform the specified number of tens- and units-place numerals into a word-phrase at the given number of thousands places.
# File lib/linguistics/en/numbers.rb, line 367
def to_tens( tens, units, thousands=0 )
  raise ArgumentError, "tens: no implicit conversion from nil" unless tens
  raise ArgumentError, "units: no implicit conversion from nil" unless units

  unless tens == 1
    return TENS[ tens ] + ( tens.nonzero? && units.nonzero? ? '-' : '' ) +
      to_units( units, thousands )
  else
    return TEENS[ units ] + to_thousands( thousands )
  end
end
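The teens special case and the hyphen rule can be seen more easily without the thousands suffix. A minimal sketch, assuming hypothetical word tables (UNIT_NAMES, TEEN_NAMES, TEN_NAMES) laid out the way UNITS, TEENS and TENS appear to be indexed:

```ruby
# Hypothetical word tables; assumed similar in layout to UNITS, TEENS and TENS.
UNIT_NAMES = ['', 'one', 'two', 'three', 'four', 'five',
              'six', 'seven', 'eight', 'nine'].freeze
TEEN_NAMES = %w[ten eleven twelve thirteen fourteen
                fifteen sixteen seventeen eighteen nineteen].freeze
TEN_NAMES  = ['', '', 'twenty', 'thirty', 'forty', 'fifty',
              'sixty', 'seventy', 'eighty', 'ninety'].freeze

def tens_phrase(tens, units)
  # A tens digit of 1 means the pair is a teen and has its own word.
  return TEEN_NAMES[units] if tens == 1

  # Hyphenate only when both places contribute a word.
  hyphen = tens.nonzero? && units.nonzero? ? '-' : ''
  TEN_NAMES[tens] + hyphen + UNIT_NAMES[units]
end

puts tens_phrase(3, 4)   # thirty-four
puts tens_phrase(1, 2)   # twelve
```

Using empty strings for the zero entries lets the concatenation work for inputs such as (0, 7) and (5, 0) without extra branches.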
Transform the specified number into one or more words like 'thousand', 'million', etc. Uses the thousands (American) system.
# File lib/linguistics/en/numbers.rb, line 400
def to_thousands( thousands=0 )
  parts = []
  (0..thousands).step( THOUSANDS.length - 1 ) {|i|
    if i.zero?
      parts.push THOUSANDS[ thousands % (THOUSANDS.length - 1) ]
    else
      parts.push THOUSANDS.last
    end
  }

  return parts.join(" ")
end
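The stepping logic is what lets the method name magnitudes beyond the end of its word table. The sketch below runs the same loop against a deliberately short, hypothetical table (SCALE_WORDS) so the wrap-around is visible with small inputs; the library's THOUSANDS is assumed to be longer.

```ruby
# Hypothetical, shortened scale table. Each entry carries its own leading
# space, as the concatenation in the other helpers suggests.
SCALE_WORDS = ['', ' thousand', ' million', ' billion'].freeze

def scale_phrase(thousands = 0)
  parts = []
  (0..thousands).step(SCALE_WORDS.length - 1) do |i|
    if i.zero?
      # First pass: the remainder picks a word from the table.
      parts.push SCALE_WORDS[thousands % (SCALE_WORDS.length - 1)]
    else
      # Each further pass adds one more copy of the largest word.
      parts.push SCALE_WORDS.last
    end
  end
  parts.join(' ')
end

p scale_phrase(2)         # " million"
p scale_phrase(4).split   # ["thousand", "billion"]
```

With this four-entry table, an input of 4 exceeds the largest entry and is expressed as a compound, "thousand billion".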
Transform the specified number of units-place numerals into a word-phrase at the given number of thousands places.
# File lib/linguistics/en/numbers.rb, line 360
def to_units( units, thousands=0 )
  return UNITS[ units ] + to_thousands( thousands )
end