GitHub - Overbryd/myhtmlex: Elixir/Erlang bindings for lexborisov's myhtml
Elixir/Erlang bindings for lexborisov's myhtml. Contribute to Overbryd/myhtmlex development by creating an account on GitHub.
Myhtmlex

Bindings for lexborisov's myhtml.

  • Available as a hex package: {:myhtmlex, "~> 0.2.0"}
  • Documentation

Example

iex> Myhtmlex.decode("<h1>Hello world</h1>")
{"html", [], [{"head", [], []}, {"body", [], [{"h1", [], ["Hello world"]}]}]}
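The decoded tree is plain Elixir data, so it can be walked with ordinary pattern matching. The following is a minimal sketch that collects the text of a tree shaped like the result above. The module name TreeText is hypothetical and not part of Myhtmlex, and the catch-all clause is an assumption to tolerate node shapes not shown in this README.

```elixir
defmodule TreeText do
  # Hypothetical helper, not part of Myhtmlex.
  # A node is either a binary (text) or a {tag, attributes, children} tuple,
  # matching the shape of the decode/1 result shown above.
  def text(node) when is_binary(node), do: node

  def text({_tag, _attrs, children}) when is_list(children) do
    children
    |> Enum.map(&text/1)
    |> Enum.join()
  end

  # Assumption: any other node shape contributes no text.
  def text(_other), do: ""
end

# The literal below is copied from the example output above.
tree = {"html", [], [{"head", [], []}, {"body", [], [{"h1", [], ["Hello world"]}]}]}

IO.puts(TreeText.text(tree))
# prints "Hello world"
```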

Benchmark results (Nif calling mode) on various file sizes on a 2.5 GHz Core i7:

  Settings:
    duration:      1.0 s

  ## FileSizesBench
  [15:28:42] 1/3: github_trending_js.html 341k
  [15:28:46] 2/3: w3c_html5.html 131k
  [15:28:48] 3/3: wikipedia_hyperlink.html 97k

  Finished in 7.52 seconds

  ## FileSizesBench
  benchmark name                iterations   average time
  wikipedia_hyperlink.html 97k        1000   1385.86 µs/op
  w3c_html5.html 131k                 1000   2179.30 µs/op
  github_trending_js.html 341k         500   5686.21 µs/op

Configuration

The module you call into is always Myhtmlex. Depending on your application configuration, it chooses between the underlying implementations Myhtmlex.Safe (default) and Myhtmlex.Nif.
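To illustrate the general idea of one public module that delegates to a configured implementation, here is a minimal sketch. All names in it (the Sketch.* modules and the :sketch application key) are hypothetical, and it is not a description of how Myhtmlex resolves its mode internally, which may well happen at compile time.

```elixir
# Two stand-in implementations with the same function signature.
defmodule Sketch.SafeImpl do
  def decode(html), do: {:safe, html}
end

defmodule Sketch.NifImpl do
  def decode(html), do: {:nif, html}
end

# The facade looks up the configured implementation and delegates to it.
defmodule Sketch.Facade do
  def decode(html) do
    # Falls back to the safe implementation when nothing is configured.
    impl = Application.get_env(:sketch, :mode, Sketch.SafeImpl)
    impl.decode(html)
  end
end

IO.inspect(Sketch.Facade.decode("<p>hi</p>"))
# prints {:safe, "<p>hi</p>"}

Application.put_env(:sketch, :mode, Sketch.NifImpl)
IO.inspect(Sketch.Facade.decode("<p>hi</p>"))
# prints {:nif, "<p>hi</p>"}
```

Callers only ever see the facade, so switching implementations is a configuration change rather than a code change.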

Erlang interoperability is a tricky minefield. You can call into C directly using natively implemented functions (Nif), but this comes with a risk: if anything goes wrong within the C implementation, your whole VM will crash. No more supervisor cushions from here on, just violent crashes.

That is why the default mode of operation keeps your VM safe and happy. If you need ultimate parsing speed, or you can simply tolerate VM-level crashes, read on.

Call into C-Node (default)

This is the default mode of operation. If your application cannot tolerate VM-level crashes, this option allows you to gain the best of both worlds. The added overhead is client/server communications, and a worker OS-process that runs next to your VM under VM supervision.

You do not have to do anything to start the worker process; everything is taken care of within the library. If you are not running in distributed mode, your VM will automatically be assigned an sname.

The worker OS-process stays alive as long as it is under VM-supervision. If your VM goes down, the OS-process will die by itself. If the worker OS-process dies for some reason, your VM stays unaffected and will attempt to restart it seamlessly.

Call into Nif

If your application is aiming for ultimate parsing speed, and in the worst case can tolerate VM-level crashes, you can call directly into the Nif.

  1. Require myhtmlex without runtime

    in your mix.exs

     def deps do
       [
         {:myhtmlex, ">= 0.0.0", runtime: false}
       ]
     end
    
  2. Configure the mode to Myhtmlex.Nif

    e.g. in config/config.exs

     config :myhtmlex, mode: Myhtmlex.Nif
    
  3. Bonus: You can open up in-memory references to parsed trees, without parsing + mapping erlang terms in one go
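As a rough sketch of the bonus in step 3: the snippet below assumes the library exposes Myhtmlex.open/1 (returning a reference to a parsed tree) and Myhtmlex.decode_tree/1 (mapping that reference to Erlang terms), and that the Nif mode is configured as in step 2. Check the package documentation for the exact function names and options before relying on it.

```elixir
# Assumption: config :myhtmlex, mode: Myhtmlex.Nif is set (see step 2),
# and open/1 and decode_tree/1 exist as described in the lead-in.
html = "<h1>Hello world</h1>"

# Parse once and keep only an in-memory reference to the parsed tree.
ref = Myhtmlex.open(html)

# Map the parsed tree to Erlang terms later, only when it is needed.
tree = Myhtmlex.decode_tree(ref)

IO.inspect(tree)
```

Splitting the two steps lets you defer the term mapping, which is the part the step describes as otherwise happening "in one go".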

Contribution / Bug Reports

  • Please make sure you do git submodule update after a checkout/pull
  • If you have problems building the project, please consider adding a Dockerfile to build-tests/ to replicate the build error
  • The project aims to be fully tested

Roadmap

The exposed functions on Myhtmlex are not subject to change. This project is under active development.

  • Expose node-retrieval functions
  • Parse an HTML document into a tree
  • Investigate safety and calling options
    • Call as dirty-nif
    • Call as C-Node (check branch c-node)
