Files

Ai4r::Classifiers::OneR

Introduction

The idea of the OneR algorithm is identify the single attribute to use to classify data that makes fewest prediction errors. It generates rules based on a single attribute.

Attributes

data_set[R]
rule[R]

Public Instance Methods

build(data_set) click to toggle source

Build a new OneR classifier. You must provide a DataSet instance as parameter. The last attribute of each item is considered as the item class.

# File lib/ai4r/classifiers/one_r.rb, line 30
def build(data_set)
  data_set.check_not_empty
  @data_set = data_set
  if (data_set.num_attributes == 1) 
    @zero_r = ZeroR.new.build(data_set)
    return self;
  else
    @zero_r = nil;
  end
  domains = @data_set.build_domains
  @rule = nil
  domains[1...-1].each_index do |attr_index|
    rule = build_rule(@data_set.data_items, attr_index, domains)
    @rule = rule if !@rule || rule[:correct] > @rule[:correct]
  end
  return self
end
eval(data) click to toggle source

You can evaluate new data, predicting its class. e.g.

classifier.eval(['New York',  '<30', 'F'])  # => 'Y'
# File lib/ai4r/classifiers/one_r.rb, line 51
def eval(data)
  return @zero_r.eval(data) if @zero_r
  attr_value = data[@rule[:attr_index]]
  return @rule[:rule][attr_value]
end
get_rules() click to toggle source

This method returns the generated rules in ruby code. e.g.

classifier.get_rules
  # =>  if age_range == '<30' then marketing_target = 'Y'
        elsif age_range == '[30-50)' then marketing_target = 'N'
        elsif age_range == '[50-80]' then marketing_target = 'N'
        end

It is a nice way to inspect induction results, and also to execute them:

marketing_target = nil
eval classifier.get_rules   
puts marketing_target
  # =>  'Y'
# File lib/ai4r/classifiers/one_r.rb, line 71
def get_rules
  return @zero_r.get_rules if @zero_r
  sentences = []
  attr_label = @data_set.data_labels[@rule[:attr_index]]
  class_label = @data_set.data_labels.last
  @rule[:rule].each_pair do |attr_value, class_value|
    sentences << "#{attr_label} == '#{attr_value}' then #{class_label} = '#{class_value}'"
  end
  return "if " + sentences.join("\nelsif ") + "\nend"        
end

Protected Instance Methods

build_rule(data_examples, attr_index, domains) click to toggle source
# File lib/ai4r/classifiers/one_r.rb, line 84
def build_rule(data_examples, attr_index, domains)
  domain = domains[attr_index]
  value_freq = Hash.new
  domain.each do |attr_value| 
    value_freq[attr_value] = Hash.new { |hash, key| hash[key] = 0 }
  end
  data_examples.each do |data|
    value_freq[data[attr_index]][data.last] = value_freq[data[attr_index]][data.last] + 1
  end
  rule = {}
  correct_instances = 0
  value_freq.each_pair do |attr, class_freq_hash|  
    max_freq = 0
    class_freq_hash.each_pair do |class_value, freq| 
      if max_freq < freq
        rule[attr] = class_value 
        max_freq = freq
      end
    end
    correct_instances += max_freq
  end
  return {:attr_index => attr_index, :rule => rule, :correct => correct_instances}
end

[Validate]

Generated with the Darkfish Rdoc Generator 2.