GitHub - fredwu/simple_bayes: A Naive Bayes machine learning implementation in Elixir.

Simple Bayes

A Naive Bayes machine learning implementation in Elixir.

In machine learning, naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence assumptions between the features.

Naive Bayes has been studied extensively since the 1950s. It was introduced under a different name into the text retrieval community in the early 1960s, and remains a popular (baseline) method for text categorization, the problem of judging documents as belonging to one category or the other (such as spam or legitimate, sports or politics, etc.) with word frequencies as the features. With appropriate preprocessing, it is competitive in this domain with more advanced methods including support vector machines. It also finds application in automatic medical diagnosis.

Naive Bayes classifiers are highly scalable, requiring a number of parameters linear in the number of variables (features/predictors) in a learning problem. Maximum-likelihood training can be done by evaluating a closed-form expression, which takes linear time, rather than by expensive iterative approximation as used for many other types of classifiers. - Wikipedia
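The closed-form training described above amounts to counting words per category. The following is a minimal, self-contained sketch of that idea and not this library's implementation: the module name `NaiveBayesSketch`, the tokenizer, the uniform priors and the add-one smoothing are all assumptions made for illustration.

```elixir
defmodule NaiveBayesSketch do
  # Training is a single pass of counting: no iterative optimisation involved.
  # `examples` is a list of {category, text} tuples.
  def train(examples) do
    Enum.reduce(examples, %{}, fn {category, text}, model ->
      counts = text |> tokenize() |> Enum.frequencies()

      Map.update(model, category, counts, fn existing ->
        Map.merge(existing, counts, fn _word, a, b -> a + b end)
      end)
    end)
  end

  # Scores each category by summing log P(word | category) with add-one
  # smoothing, so unseen words do not zero out a category. Priors are treated
  # as uniform (an assumption) to keep the sketch short.
  def classify(model, text) do
    words = tokenize(text)

    vocab_size =
      model |> Map.values() |> Enum.flat_map(&Map.keys/1) |> Enum.uniq() |> length()

    model
    |> Enum.map(fn {category, counts} ->
      total = counts |> Map.values() |> Enum.sum()

      score =
        words
        |> Enum.map(fn word ->
          :math.log((Map.get(counts, word, 0) + 1) / (total + vocab_size))
        end)
        |> Enum.sum()

      {category, score}
    end)
    |> Enum.sort_by(fn {_category, score} -> score end, :desc)
  end

  # Hypothetical tokenizer: lowercase, split on non-word characters.
  defp tokenize(text) do
    text |> String.downcase() |> String.split(~r/\W+/u, trim: true)
  end
end
```

Working in log space avoids numeric underflow when many small probabilities are multiplied; the scores are therefore negative log-likelihoods rather than the probabilities shown in the usage examples of this README.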

Features

  • Naive Bayes algorithm with different models
    • Multinomial
    • Binarized (boolean) multinomial
    • Bernoulli
  • Multiple storage options
    • In-memory (default)
    • File system
    • Dets (Disk-based Erlang Term Storage)
  • Ignores stop words
  • Additive smoothing
  • TF-IDF
  • Optional keywords weighting
  • Optional word stemming via Stemmer
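Additive smoothing and TF-IDF from the list above are standard techniques. As a rough sketch of their textbook forms (the library's exact formulas may differ, and the module and function names here are hypothetical), they could be written as:

```elixir
defmodule TextWeightingSketch do
  # Additive (Laplace/Lidstone) smoothing:
  # (count + alpha) / (total + alpha * vocab_size).
  def smoothed_probability(count, total, vocab_size, alpha) do
    (count + alpha) / (total + alpha * vocab_size)
  end

  # TF-IDF for one word in one document, using raw term frequency and
  # idf = log(number of documents / number of documents containing the word).
  # `document` is a list of tokens, `documents` is a list of token lists.
  # This is one common variant among several (an assumption of this sketch).
  def tf_idf(word, document, documents) do
    tf = Enum.count(document, &(&1 == word))
    df = Enum.count(documents, &(word in &1))

    if df == 0 do
      # The word appears nowhere, so it carries no weight.
      0.0
    else
      tf * :math.log(length(documents) / df)
    end
  end
end
```

With `alpha` set to 0 the smoothing function reduces to the plain relative frequency, which matches the intent of a `smoothing: 0` default.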

Feature Matrix

Multinomial Binarized multinomial Bernoulli
Stop words
Additive smoothing
TF-IDF
Keywords weighting
Stemming

Usage

Install by adding :simple_bayes and optionally :stemmer (for the default stemming functionality) to deps in your mix.exs:

defp deps do
  [
    {:simple_bayes, "~> 0.12"},
    {:stemmer,      "~> 1.0"}
  ]
end

If you're on Elixir 1.3 or below, ensure :simple_bayes and optionally :stemmer are started before your application:

def application do
  [applications: [:logger, :simple_bayes, :stemmer]]
end

bayes = SimpleBayes.init()
        |> SimpleBayes.train(:apple, "red sweet")
        |> SimpleBayes.train(:apple, "green", weight: 0.5)
        |> SimpleBayes.train(:apple, "round", weight: 2)
        |> SimpleBayes.train(:banana, "sweet")
        |> SimpleBayes.train(:banana, "green", weight: 0.5)
        |> SimpleBayes.train(:banana, "yellow long", weight: 2)
        |> SimpleBayes.train(:orange, "red")
        |> SimpleBayes.train(:orange, "yellow sweet", weight: 0.5)
        |> SimpleBayes.train(:orange, "round", weight: 2)

bayes |> SimpleBayes.classify_one("Maybe green maybe red but definitely round and sweet.")
# => :apple

bayes |> SimpleBayes.classify("Maybe green maybe red but definitely round and sweet.")
# => [
#   apple:  0.18519202529366116,
#   orange: 0.14447781772131096,
#   banana: 0.10123406763124557
# ]

bayes |> SimpleBayes.classify("Maybe green maybe red but definitely round and sweet.", top: 2)
# => [
#   apple:  0.18519202529366116,
#   orange: 0.14447781772131096
# ]

With and without word stemming (requires a stem function; we recommend Stemmer):

SimpleBayes.init()
|> SimpleBayes.train(:apple, "buying apple")
|> SimpleBayes.train(:banana, "buy banana")
|> SimpleBayes.classify("buy apple")
# => [
#   banana: 0.05719389206673358,
#   apple: 0.05719389206673358
# ]

SimpleBayes.init(stem: &Stemmer.stem/1) # or any other stemming function
|> SimpleBayes.train(:apple, "buying apple")
|> SimpleBayes.train(:banana, "buy banana")
|> SimpleBayes.classify("buy apple")
# => [
#   apple: 0.18096114003107086,
#   banana: 0.15054767928902865
# ]
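Since `:stem` accepts any stemming function, a custom one can be supplied. The sketch below is a hypothetical toy stemmer (the `NaiveStemmer` module is not part of this library or of Stemmer) and it assumes the function is called with one word at a time:

```elixir
defmodule NaiveStemmer do
  # Hypothetical toy stemmer: strips a couple of common English suffixes.
  # It is far cruder than a real Porter stemmer and only illustrates the
  # shape of the function (one argument: a word; returns: its stem).
  def stem(word) do
    word = String.downcase(word)

    cond do
      String.length(word) > 5 and String.ends_with?(word, "ing") ->
        String.replace_suffix(word, "ing", "")

      String.length(word) > 3 and String.ends_with?(word, "s") ->
        String.replace_suffix(word, "s", "")

      true ->
        word
    end
  end
end

# Assumed usage, mirroring the example above:
# SimpleBayes.init(stem: &NaiveStemmer.stem/1)
```

With this function "buying" and "buy" both map to "buy", which is the effect that separates the two result sets shown above.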

Configuration (Optional)

For application-wide configuration, in your application's config/config.exs:

config :simple_bayes, model: :multinomial
config :simple_bayes, storage: :memory
config :simple_bayes, default_weight: 1
config :simple_bayes, smoothing: 0
config :simple_bayes, stem: false
config :simple_bayes, top: nil
config :simple_bayes, stop_words: ~w(
  a about above after again against all am an and any are aren't as at be
  because been before being below between both but by can't cannot could
  couldn't did didn't do does doesn't doing don't down during each few for from
  further had hadn't has hasn't have haven't having he he'd he'll he's her here
  here's hers herself him himself his how how's i i'd i'll i'm i've if in into
  is isn't it it's its itself let's me more most mustn't my myself no nor not of
  off on once only or other ought our ours ourselves out over own same shan't
  she she'd she'll she's should shouldn't so some such than that that's the
  their theirs them themselves then there there's these they they'd they'll
  they're they've this those through to too under until up very was wasn't we
  we'd we'll we're we've were weren't what what's when when's where where's
  which while who who's whom why why's with won't would wouldn't you you'd
  you'll you're you've your yours yourself yourselves
)

Alternatively, you may pass in the configuration options when you initialise:

SimpleBayes.init(
  model:          :multinomial,
  storage:        :memory,
  default_weight: 1,
  smoothing:      0,
  stem:           false,
  top:            nil,
  stop_words:     []
)
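To illustrate what a `:stop_words` list is for, here is a minimal sketch of stop-word filtering. The `StopWordsSketch` module and its tokenization rules are assumptions for illustration; the library's own tokenizer may behave differently.

```elixir
defmodule StopWordsSketch do
  # Hypothetical helper: lowercases, splits on whitespace, strips punctuation
  # other than apostrophes, and drops any token found in `stop_words`.
  def filter(text, stop_words) do
    text
    |> String.downcase()
    |> String.split()
    |> Enum.map(&String.replace(&1, ~r/[^\w']/u, ""))
    |> Enum.reject(&(&1 == "" or &1 in stop_words))
  end
end
```

Apostrophes are kept so that contractions such as "it's" can match entries in a stop-word list like the one shown in the configuration above.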

Available options for :model are:

  • :multinomial (default)
  • :binarized_multinomial
  • :bernoulli
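The three models differ mainly in how a document's words become features. The sketch below shows the textbook distinction using hypothetical helper functions; how the library represents features internally is not confirmed here.

```elixir
defmodule ModelFeaturesSketch do
  # Multinomial: every occurrence of a word counts.
  def multinomial(tokens), do: Enum.frequencies(tokens)

  # Binarized multinomial: each word counts at most once per document.
  def binarized_multinomial(tokens) do
    tokens |> Enum.uniq() |> Map.new(&{&1, 1})
  end

  # Bernoulli: one boolean per vocabulary word, so absence is a feature too.
  def bernoulli(tokens, vocabulary) do
    Map.new(vocabulary, &{&1, &1 in tokens})
  end
end
```

For the tokens `~w(sweet sweet red)`, the multinomial view records "sweet" twice, the binarized view records it once, and the Bernoulli view additionally marks every other vocabulary word as absent.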

Available options for :storage are:

  • :memory (default; its encoded data can also be stored in any database, see below for more details)
  • :file_system
  • :dets

Storage options have extra configurations:

Memory

  • :namespace - optional; only useful when you want to load by the namespace

File System

  • :file_path

Dets

  • :file_path

File System vs Dets

File system encodes and decodes data using base64, whereas Dets is a native Erlang library. Performance-wise, file system with base64 tends to be faster with less data, and Dets tends to be faster with more data. YMMV; please run your own comparison.
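For running such a comparison, a small timing helper built on `:timer.tc/1` may be enough. The sketch below is hypothetical: `StorageTimingSketch` is not part of the library, and the base64 round trip is only a stand-in for what a file-based store might do, not the library's actual encoding.

```elixir
defmodule StorageTimingSketch do
  # Runs `fun` `runs` times and returns the average wall-clock time
  # in microseconds.
  def average_microseconds(fun, runs \\ 10) do
    total =
      Enum.reduce(1..runs, 0, fn _run, acc ->
        {elapsed, _result} = :timer.tc(fun)
        acc + elapsed
      end)

    total / runs
  end

  # Round-trips a term through the Erlang external term format and base64.
  def base64_round_trip(term) do
    term
    |> :erlang.term_to_binary()
    |> Base.encode64()
    |> Base.decode64!()
    |> :erlang.binary_to_term()
  end
end

# Assumed usage: wrap your own save/load calls for each storage option, e.g.
# StorageTimingSketch.average_microseconds(fn -> SimpleBayes.load(opts) end)
```

Averaging over several runs smooths out one-off costs such as file system caching, though a dedicated benchmarking tool would give more reliable numbers.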

Configuration Examples

# application-wide configuration:
config :simple_bayes, storage: :file_system
config :simple_bayes, file_path: "path/to/the/file.txt"

# per-initialization configuration:
SimpleBayes.init(
  storage: :file_system,
  file_path: "path/to/the/file.txt"
)

Storage Usage

opts = [
  storage:   :file_system,
  file_path: "test/temp/file_system_test.txt"
]

SimpleBayes.init(opts)
|> SimpleBayes.train(:apple, "red sweet")
|> SimpleBayes.train(:apple, "green", weight: 0.5)
|> SimpleBayes.train(:apple, "round", weight: 2)
|> SimpleBayes.train(:banana, "sweet")
|> SimpleBayes.save()

SimpleBayes.load(opts)
|> SimpleBayes.train(:banana, "green", weight: 0.5)
|> SimpleBayes.train(:banana, "yellow long", weight: 2)
|> SimpleBayes.train(:orange, "red")
|> SimpleBayes.train(:orange, "yellow sweet", weight: 0.5)
|> SimpleBayes.train(:orange, "round", weight: 2)
|> SimpleBayes.save()

SimpleBayes.load(opts)
|> SimpleBayes.classify("Maybe green maybe red but definitely round and sweet")

In-memory save/2, load/1 and the encoded_data option

Calling SimpleBayes.save/2 is unnecessary for :memory storage. However, with in-memory storage, saving returns the encoded data, which is useful if you would like to keep it in a persistence layer of your choice. For example:

{:ok, _pid, encoded_data} = SimpleBayes.init()
|> SimpleBayes.train(:apple, "red sweet")
|> SimpleBayes.train(:apple, "green", weight: 0.5)
|> SimpleBayes.train(:apple, "round", weight: 2)
|> SimpleBayes.train(:banana, "sweet")
|> SimpleBayes.save()

# now store `encoded_data` in your database of choice
# once `encoded_data` is fetched again from the database, you are then able to:

SimpleBayes.load(encoded_data: encoded_data)
|> SimpleBayes.train(:banana, "green", weight: 0.5)
|> SimpleBayes.train(:banana, "yellow long", weight: 2)
|> SimpleBayes.train(:orange, "red")
|> SimpleBayes.train(:orange, "yellow sweet", weight: 0.5)
|> SimpleBayes.train(:orange, "round", weight: 2)
|> SimpleBayes.classify("Maybe green maybe red but definitely round and sweet")
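As a sketch of the "store it, then fetch it again" step, the hypothetical module below uses an ETS table as a stand-in for a database. `ModelStoreSketch` and its functions are illustrative only; a real application would more likely write `encoded_data` to its actual database.

```elixir
defmodule ModelStoreSketch do
  # Hypothetical persistence layer backed by an ETS table.
  def new, do: :ets.new(:model_store, [:set, :public])

  # Stores the encoded data under `key`, replacing any previous value.
  def put(table, key, encoded_data) do
    true = :ets.insert(table, {key, encoded_data})
    :ok
  end

  # Returns {:ok, encoded_data} or :error when the key is unknown.
  def fetch(table, key) do
    case :ets.lookup(table, key) do
      [{^key, encoded_data}] -> {:ok, encoded_data}
      [] -> :error
    end
  end
end

# Assumed usage with the example above:
# table = ModelStoreSketch.new()
# :ok = ModelStoreSketch.put(table, :fruit, encoded_data)
# {:ok, data} = ModelStoreSketch.fetch(table, :fruit)
# SimpleBayes.load(encoded_data: data)
```

ETS accepts any Erlang term, so this sketch makes no assumption about the exact type of `encoded_data`.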

Changelog

Please see CHANGELOG.md.

License

Licensed under MIT.
