Run Models On Your Data
Oxen makes it easy to choose the right model, get to the perfect prompt, or kick off the data flywheel needed to improve state-of-the-art AI.
Oxen’s public and private datasets allow you to iterate on data within your organization or share them with the world.
A dataset from the Allen Institute for AI consisting of genuine grade-school level, multiple-choice science questions, assembled to encourage research in advanced question-answering. This dataset contains the Challenge Set of questions.
A benchmark collection for sentence-based image description and search, consisting of 8,000 images that are each paired with five different captions which provide clear descriptions of the salient entities and events. … The images were chosen from six different Flickr groups, and tend not to contain any well-known people or locations, but were manually selected to depict a variety of scenes and situations.
Subset of speech commands to test audio recognition systems on.
CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations. The images in this dataset cover large pose variations and background clutter. CelebA has large diversities, large quantities, and rich annotations.
AI is only as good as the datasets you feed it. Gain visibility into the data that goes in and out of your model.
Datasets change every day. Oxen’s version control allows you to quickly narrow down the most important changes that affect your model.
Oxen’s data version control is built to handle data of any shape or size.
Oxen.ai saves your engineers hours of syncing data between training, testing, and evaluation. From fast data syncing to removing the push/pull bottlenecks of traditional version control systems, Oxen.ai was built for machine learning datasets and workflows.
Oxen’s data version control turns your unstructured data into beautifully rendered datasets that evolve over time. Dive into any version of the dataset at any point in time and see exactly what changed.
Oxen.ai has re-imagined version control for data. At the core are the same principles that have made Git so powerful, but Oxen has optimized everything down to the Merkle trees, hashing principles, and network protocols to make it work effortlessly with large-scale datasets.
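To make the Merkle tree idea concrete, here is a minimal Python sketch of content hashing for a versioned dataset. It is an illustration of the general technique only, not Oxen's actual implementation: the function names (`hash_bytes`, `merkle_root`, `changed_paths`), the choice of SHA-256, and the sample file names are all assumptions made for this example.

```python
import hashlib


def hash_bytes(data: bytes) -> str:
    """Return the SHA-256 hex digest of some bytes (hash choice is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def merkle_root(files: dict[str, bytes]) -> str:
    """Fold per-file hashes into one root hash that identifies a dataset version."""
    # Leaves are sorted by path so the root does not depend on insertion order.
    level = [
        hash_bytes(f"{path}:{hash_bytes(content)}".encode())
        for path, content in sorted(files.items())
    ]
    if not level:
        return hash_bytes(b"")
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [
            hash_bytes((level[i] + level[i + 1]).encode())
            for i in range(0, len(level), 2)
        ]
    return level[0]


def changed_paths(old: dict[str, bytes], new: dict[str, bytes]) -> list[str]:
    """List paths that were added, removed, or modified between two versions."""
    old_hashes = {path: hash_bytes(content) for path, content in old.items()}
    new_hashes = {path: hash_bytes(content) for path, content in new.items()}
    return sorted(
        path
        for path in old_hashes.keys() | new_hashes.keys()
        if old_hashes.get(path) != new_hashes.get(path)
    )


# Hypothetical dataset versions: only labels.csv differs between them.
v1 = {"images/cat.jpg": b"\x00\x01", "labels.csv": b"file,label\ncat.jpg,cat\n"}
v2 = {"images/cat.jpg": b"\x00\x01", "labels.csv": b"file,label\ncat.jpg,dog\n"}

print(merkle_root(v1) == merkle_root(v2))
print(changed_paths(v1, v2))
```

Because the root hash changes whenever any file changes, two versions can be compared by their roots first, and only the differing files need to be inspected or transferred.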
Oxen.ai allows all your stakeholders to share, review, and edit data together. ML Engineering, Data Science, Product, Legal, Auditing, and Community can all contribute. The more eyes the better.
Ease of Use | Performance | Collaboration | Open Source | Data Visibility | Scalability | Any Data Format | Compare & Diff | |
---|---|---|---|---|---|---|---|---|
Hugging Face | * | ** | ||||||
Neptune.ai | ||||||||
LakeFS | ||||||||
DVC | ||||||||
GitLFS | ||||||||
Project Nessie |
* | Hugging Face is migrating its data versioning from Git LFS to the closed-source Xethub. As Hugging Face is not open source, it is expected that Xethub will remain closed source. |
** | While small datasets have a preview and are visible in Hugging Face, large datasets cannot be viewed through its website. |
Oxen.ai has developed a strong and growing community of individuals focused on furthering machine learning and artificial intelligence, from academic researchers training the next generation of models to full-stack developers leveraging existing APIs to build amazing products. Every Friday we get together to read research papers, discuss them, and apply them to our own work.