NAME
    AI::MaxEntropy - Perl extension for learning Maximum Entropy Models

SYNOPSIS
      use AI::MaxEntropy;

      # create a maximum entropy learner
      my $me = AI::MaxEntropy->new; 
  
      # the learner sees 2 red round smooth apples
      $me->see(['round', 'smooth', 'red'] => 'apple' => 2);
  
      # the learner sees 3 yellow long smooth bananas
      $me->see(['long', 'smooth', 'yellow'] => 'banana' => 3);

      # and more

      # samples needn't have the same number of active features
      $me->see(['rough', 'big'] => 'pomelo');

      # the order of active features doesn't matter, either
      $me->see(['big', 'rough'] => 'pomelo');

      # ...

      # okay, let it learn
      my $model = $me->learn;

      # then, we can make predictions on unseen data

      # ask what a red thing is most likely to be
      print $model->predict(['red'])."\n";
      # the answer is apple, because all red things the learner has ever seen
      # are apples
  
      # ask what a smooth thing is most likely to be
      print $model->predict(['smooth'])."\n";
      # the answer is banana, because the learner has seen more smooth bananas
      # (weighted 3) than smooth apples (weighted 2)

      # ask what a red, long thing is most likely to be
      print $model->predict(['red', 'long'])."\n";
      # the answer is banana, because the learner has seen more long bananas
      # (weighted 3) than red apples (weighted 2)

      # print out the scores of all possible answers
      # for the features round and red
      for ($model->all_labels) {
          my $s = $model->score(['round', 'red'] => $_);
          print "$_: $s\n";
      }
  
      # save the model
      $model->save('model_file');

      # load the model
      $model->load('model_file');

DESCRIPTION
    The Maximum Entropy (ME) model is a popular machine learning approach,
    widely used as a general-purpose classifier. An ME learner tries to
    recover the true probability distribution from a limited number of
    observations by applying the principle of maximum entropy. This
    principle assumes nothing about unseen data; in other words, everything
    unknown is kept as uniform as possible, which maximizes the entropy of
    the distribution.

  Samples
    In this module, each observation is abstracted as a sample. A sample is
    denoted as "x => y => w", which consists of a set of active features
    (array ref x), a label (scalar y) and a weight (scalar w). The client
    program adds a new sample to the learner by calling "see".

  Features and Active Features
    Features describe the characteristics that things can have. If a thing
    has a certain feature, we say that feature is active in that thing (an
    active feature). For example, an apple is red, round and smooth, so the
    active features of an apple can be denoted as an array ref "['red',
    'round', 'smooth']". Each element here is an active feature (generally
    denoted by a string), and the order of active features does not matter.

  Label
    The label denotes the name of the thing we describe. In the example
    above, we are describing an apple, so the label can be 'apple'.

  Weight
    The weight can simply be taken as how many times a thing with certain
    characteristics occurs, or how persuasive it is. For example, if we see
    2 red round smooth apples, we denote that as "['red', 'round',
    'smooth'] => 'apple' => 2".
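
    Intuitively, a weighted sample should behave like the same sample
    repeated; the sketch below illustrates that reading (it assumes weights
    simply accumulate over identical samples, which matches the counting
    interpretation above):

      my $me = AI::MaxEntropy->new;

      # one sample with weight 2 ...
      $me->see(['red', 'round', 'smooth'] => 'apple' => 2);

      # ... should be equivalent to the same sample seen twice
      # with the default weight 1.0
      $me->see(['red', 'round', 'smooth'] => 'apple');
      $me->see(['red', 'round', 'smooth'] => 'apple');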

  Model
    After seeing enough samples, a model can be learnt from them by calling
    "learn", which generates an AI::MaxEntropy::Model object. A model is
    generally considered a classifier. Given a set of features, one can ask
    which label is most likely to come with these features by calling
    "predict" in AI::MaxEntropy::Model.

FUNCTIONS
    NOTE: This is still an alpha version; the APIs may change in future
    versions.

  new
    Create a Maximum Entropy learner. Optionally, initial values of
    properties can be specified here.

      my $me1 = AI::MaxEntropy->new;
      my $me2 = AI::MaxEntropy->new(
          optimizer => { epsilon => 1e-6 });
      my $me3 = AI::MaxEntropy->new(
          optimizer => { m => 7, epsilon => 1e-4 },
          smoother => { type => 'gaussian', sigma => 0.8 }
      );

    The property values specified at creation time can be changed later:

      $me->{optimizer} = { epsilon => 1e-3 };
      $me->{smoother} = {};

  see
    Let the Maximum Entropy learner see a new sample. The weight can be
    omitted, in which case the default weight 1.0 is used.

      my $me = AI::MaxEntropy->new;

      # see a sample with default weight 1.0
      $me->see(['a', 'b'] => 'p');
  
      # see a sample with specified weight 0.5
      $me->see(['c', 'd'] => 'q' => 0.5);

  forget_all
    Forget all samples the learner has seen previously.
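
    For example, one learner object can be reused for independent data sets
    (a minimal sketch; the fruit samples are purely illustrative):

      my $me = AI::MaxEntropy->new;

      $me->see(['red', 'round'] => 'apple');
      my $model1 = $me->learn;

      # start over with a clean slate for a second data set
      $me->forget_all;
      $me->see(['yellow', 'long'] => 'banana');
      my $model2 = $me->learn;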

  learn
    Learn a model from all the samples seen so far. Returns an
    AI::MaxEntropy::Model object, which can be used to make predictions on
    unseen data.

      ...

      my $model = $me->learn;

      print $model->predict(['x1', 'x2', ...]);

PROPERTIES
  optimizer
    The optimizer is the essential component of this module. This property
    enables the client program to customize the behavior of the optimizer.
    It is a hash ref containing all parameters that the client program
    wants to pass to the L-BFGS optimizer. Please refer to "List of
    Parameters" in Algorithm::LBFGS for details.
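
    For instance, the sketch below tightens the convergence tolerance and
    caps the iteration count. The parameters "m" and "epsilon" appear
    earlier in this document; "max_iterations" is assumed from the
    underlying L-BFGS parameter list, so verify it against "List of
    Parameters" in Algorithm::LBFGS before relying on it:

      my $me = AI::MaxEntropy->new(
          optimizer => {
              m              => 5,      # number of L-BFGS corrections kept
              epsilon        => 1e-5,   # convergence tolerance
              max_iterations => 100     # assumed parameter; see the
                                        # Algorithm::LBFGS documentation
          }
      );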

  smoother
    The smoother is a solution to the over-fitting problem. This property
    chooses which type of smoother the client program wants to apply and
    sets the smoothing parameters.

    Only one smoother has been implemented in this version, the Gaussian
    smoother.

    One can apply the Gaussian smoother as follows:

      my $me = AI::MaxEntropy->new(
          smoother => { type => 'gaussian', sigma => 0.6 }
      );

    The Gaussian smoother has one parameter, sigma, which indicates the
    strength of smoothing. Usually, sigma is a positive number no greater
    than 1.0. The strength of smoothing grows as sigma approaches 0: a
    Gaussian prior (see the Chen and Rosenfeld reference below) penalizes
    each feature weight lambda by lambda^2 / (2 * sigma^2), so a smaller
    sigma means a heavier penalty.
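
    For instance, to compare a heavy and a light smoothing setting (the
    sigma values below are purely illustrative):

      # smaller sigma => stronger smoothing
      my $heavy = AI::MaxEntropy->new(
          smoother => { type => 'gaussian', sigma => 0.1 }
      );

      # larger sigma => weaker smoothing
      my $light = AI::MaxEntropy->new(
          smoother => { type => 'gaussian', sigma => 1.0 }
      );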

  progress_cb
    Usually, learning a model is a time-consuming job, and the most
    time-consuming part of the process is the optimization. This callback
    subroutine is for people who want to trace the progress of the
    optimization. By tracing it, they can produce some useful output, which
    makes the learning process more user-friendly.

    This callback subroutine is passed directly to "fmin" in
    Algorithm::LBFGS. You can also assign the string 'verbose' to this
    property, which simply prints out the progress via a built-in callback
    subroutine. Please see "progress_cb" in Algorithm::LBFGS for details.

SEE ALSO
    AI::MaxEntropy::Model, Algorithm::LBFGS

    Statistics::MaxEntropy, Algorithm::CRF, Algorithm::SVM

AUTHOR
    Laye Suen, <laye@cpan.org>

COPYRIGHT AND LICENSE
    The MIT License

    Copyright (C) 2008, Laye Suen

    Permission is hereby granted, free of charge, to any person obtaining a
    copy of this software and associated documentation files (the
    "Software"), to deal in the Software without restriction, including
    without limitation the rights to use, copy, modify, merge, publish,
    distribute, sublicense, and/or sell copies of the Software, and to
    permit persons to whom the Software is furnished to do so, subject to
    the following conditions:

    The above copyright notice and this permission notice shall be included
    in all copies or substantial portions of the Software.

    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
    OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
    MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
    IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
    CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
    TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
    SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

REFERENCE
    A. L. Berger, V. J. Della Pietra, S. A. Della Pietra. A Maximum Entropy
    Approach to Natural Language Processing. Computational Linguistics,
    1996.

    S. F. Chen, R. Rosenfeld. A Gaussian Prior for Smoothing Maximum
    Entropy Models. Technical Report CMU-CS-99-108, Carnegie Mellon
    University, February 1999.