Oracle open sources Tribuo Java machine learning library

Oracle has open-sourced its Tribuo Java machine learning library and made it available free under an Apache 2.0 license. The tool was developed by Oracle Labs and is now accessible on GitHub and Maven Central.

Oracle is looking to make it easier for developers to build and deploy machine learning models in Java, as has already happened with Python, and to meet enterprise needs in the machine learning space.

Tribuo offers standard machine learning functionality, including algorithms for building and deploying classification, clustering, and regression models in Java, along with interfaces to TensorFlow, XGBoost, and ONNX.

How can Tribuo be useful in natural language processing?

Tribuo includes pipelines for transforming data and provides a suite of evaluations for the supported prediction tasks. It also collects statistics on its inputs, so it can describe the range of every input, and it manages IDs and outputs to avoid ID conflicts and confusion when chaining models.
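The idea of collecting input statistics and managing IDs can be illustrated with a minimal, self-contained sketch. The classes below (FeatureRange, InputStatistics) are hypothetical names invented for this example and are not Tribuo's API; the sketch only assumes that each named input receives a stable ID the first time it is observed and that its minimum, maximum, and count are tracked.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; these classes are NOT part of Tribuo.

/** Running count, minimum, and maximum for one named input. */
final class FeatureRange {
    private long count;
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;

    void observe(double value) {
        count++;
        min = Math.min(min, value);
        max = Math.max(max, value);
    }

    long count() { return count; }
    double min() { return min; }
    double max() { return max; }
}

/** Assigns each feature name one stable ID and tracks its observed range. */
final class InputStatistics {
    private final Map<String, Integer> ids = new LinkedHashMap<>();
    private final Map<String, FeatureRange> ranges = new LinkedHashMap<>();

    void observe(String feature, double value) {
        // The ID is assigned once, on first sight, so names never collide.
        ids.putIfAbsent(feature, ids.size());
        ranges.computeIfAbsent(feature, k -> new FeatureRange()).observe(value);
    }

    boolean isKnown(String feature) {
        return ids.containsKey(feature);
    }

    int idOf(String feature) {
        Integer id = ids.get(feature);
        if (id == null) {
            throw new IllegalArgumentException("Unknown feature: " + feature);
        }
        return id;
    }

    FeatureRange rangeOf(String feature) {
        FeatureRange range = ranges.get(feature);
        if (range == null) {
            throw new IllegalArgumentException("Unknown feature: " + feature);
        }
        return range;
    }
}

final class InputStatisticsDemo {
    public static void main(String[] args) {
        InputStatistics stats = new InputStatistics();
        stats.observe("age", 31.0);
        stats.observe("income", 42000.0);
        stats.observe("age", 54.0);

        FeatureRange age = stats.rangeOf("age");
        System.out.println("age id=" + stats.idOf("age")
                + " min=" + age.min() + " max=" + age.max()
                + " count=" + age.count());
    }
}
```

Keying everything by feature name, rather than by a caller-supplied integer, is what keeps IDs from conflicting when the statistics of several models are combined.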

A Tribuo model knows when it sees a feature for the first time, which is particularly useful when working with natural language processing. The model also knows exactly what its outputs are, because outputs are strongly typed. Developers therefore don't have to wonder whether a float is a probability, a regressed value, or a cluster ID.
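To show why strongly typed outputs remove that ambiguity, here is a small sketch in plain Java. The type names (TypedOutput, ClassProbability, RegressedValue, ClusterAssignment, TypedModel) are hypothetical and chosen for this example; they are not Tribuo's classes. The point is only that the compiler, rather than a naming convention, tells the caller what kind of number a model returns.

```java
import java.util.function.Function;

// Hypothetical illustration only; these types are NOT part of Tribuo.

interface TypedOutput {
    String describe();
}

/** A class label together with a probability in [0, 1]. */
final class ClassProbability implements TypedOutput {
    final String label;
    final double probability;

    ClassProbability(String label, double probability) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("Probability must be in [0, 1]");
        }
        this.label = label;
        this.probability = probability;
    }

    @Override
    public String describe() {
        return "label " + label + " with probability " + probability;
    }
}

/** An unbounded regressed value. */
final class RegressedValue implements TypedOutput {
    final double value;

    RegressedValue(double value) {
        this.value = value;
    }

    @Override
    public String describe() {
        return "regressed value " + value;
    }
}

/** An integer cluster ID, which can never be mistaken for a score. */
final class ClusterAssignment implements TypedOutput {
    final int clusterId;

    ClusterAssignment(int clusterId) {
        this.clusterId = clusterId;
    }

    @Override
    public String describe() {
        return "cluster " + clusterId;
    }
}

/** A model whose output type is fixed at compile time. */
final class TypedModel<T extends TypedOutput> {
    private final Function<double[], T> predictor;

    TypedModel(Function<double[], T> predictor) {
        this.predictor = predictor;
    }

    T predict(double[] features) {
        return predictor.apply(features);
    }
}

final class TypedOutputDemo {
    public static void main(String[] args) {
        // Toy predictors standing in for trained models.
        TypedModel<ClassProbability> classifier =
                new TypedModel<>(x -> new ClassProbability("spam", 0.5));
        TypedModel<ClusterAssignment> clusterer =
                new TypedModel<>(x -> new ClusterAssignment(x[0] > 0.0 ? 1 : 0));

        double[] features = {1.0, 2.0};
        System.out.println(classifier.predict(features).describe());
        System.out.println(clusterer.predict(features).describe());
    }
}
```

With this shape, assigning the result of a clustering model to a ClassProbability variable is a compile-time error instead of a silent misreading of a float.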

The provenance system in Tribuo can also generate a configuration that rebuilds the training pipeline to reproduce the model, or to build a tweaked model on new data or with new hyperparameters. This lets users always know what a model is, where it came from, and how to create it.
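The provenance idea can be sketched without any library at all. The TrainingRecord class below is a hypothetical stand-in, not Tribuo's provenance API, and the trainer name, data file names, and configuration format are assumptions made up for this example. It records what produced a model, emits a configuration from that record, and derives a tweaked record for new data or hyperparameters without altering the original.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this class is NOT Tribuo's provenance API.

/** Immutable record of how a model was trained. */
final class TrainingRecord {
    final String trainerName;
    final String dataSource;
    final Map<String, String> hyperparameters;

    TrainingRecord(String trainerName, String dataSource,
                   Map<String, String> hyperparameters) {
        this.trainerName = trainerName;
        this.dataSource = dataSource;
        // Defensive copy so later changes by the caller cannot alter the record.
        this.hyperparameters =
                Collections.unmodifiableMap(new LinkedHashMap<>(hyperparameters));
    }

    /** Returns a new record with one hyperparameter changed. */
    TrainingRecord withHyperparameter(String name, String value) {
        Map<String, String> copy = new LinkedHashMap<>(hyperparameters);
        copy.put(name, value);
        return new TrainingRecord(trainerName, dataSource, copy);
    }

    /** Returns a new record that points at different training data. */
    TrainingRecord withDataSource(String newDataSource) {
        return new TrainingRecord(trainerName, newDataSource, hyperparameters);
    }

    /** Emits a simple key=value configuration describing the pipeline. */
    String toConfig() {
        StringBuilder sb = new StringBuilder();
        sb.append("trainer=").append(trainerName).append('\n');
        sb.append("data=").append(dataSource).append('\n');
        for (Map.Entry<String, String> e : hyperparameters.entrySet()) {
            sb.append("param.").append(e.getKey())
              .append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }
}

final class TrainingRecordDemo {
    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("epochs", "5");
        params.put("seed", "1");

        TrainingRecord original =
                new TrainingRecord("logistic-regression", "train-v1.csv", params);
        TrainingRecord tweaked = original
                .withDataSource("train-v2.csv")
                .withHyperparameter("epochs", "10");

        System.out.print(original.toConfig());
        System.out.println("---");
        System.out.print(tweaked.toConfig());
    }
}
```

Keeping the record immutable means the configuration that reproduces the original model stays valid even after tweaked variants have been derived from it.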

Tribuo fills a gap for machine learning in enterprise applications

Oracle believes that Tribuo can fill a gap in the marketplace for machine learning in enterprise apps. While the Google-built TensorFlow library already provides core algorithms for deep learning, Tribuo offers several machine learning algorithms that are not available in TensorFlow.

Tribuo also provides an interface to TensorFlow. And whereas the Apache Spark analytics engine is aimed at large, distributed systems, Tribuo is suited to smaller computations on a single machine.

Additionally, Tribuo provides interfaces to XGBoost and the ONNX runtime, as well as to TensorFlow, allowing models trained in TensorFlow and XGBoost or stored in the ONNX format to be deployed alongside native Tribuo models. This support allows models built with popular Python libraries such as PyTorch to be deployed in Java.
