Arcadia: Put your LLMs to Work — Part IV: Putting It All Together
March 23, 2024
Part IV of the 4-Part DocuSeries on Building Arcadia, the End-to-End Platform for ML Model API Billing
With Arcadia, I broke all the rules of GTM:
I didn’t validate the idea
I didn’t focus on a narrow-enough segment to define my ICP
I’m not sure for whom it solves a desperate enough problem
But … I also just don’t bloody care.
Yes, I just built it. Yes, it was an engineering burst straight to code. And it was also proper fun!
It solved my problem. I built Arcadia out of my own necessity: to help ML teams charge other enterprises for usage of their ML/LLM models without giving away the models themselves.
I love a quote from Derek Sivers, founder of CD Baby:
When you create a company, you create a utopia.
A utopia of what you want the world to be. Arcadia is a utopia of mine.
A world where ML engineering teams can easily share inference access to their proprietary ML models.
A world where biotech startups can share their top-notch ML algorithms for drug discovery with other research labs. And vice versa.
I know I’ll use it again. Even if just for myself. I had loads of fun.
With that in mind, join the waitlist :)
Arcadia is an end-to-end platform for uploading your ML models for inference and charging for their API usage.
Watch the 1st Arcadia Demo. Sign up to the Waitlist here.
Throughout this 4-Part Series, I’m going to take you on a journey of experimentation, frustration, and ultimately creation of Arcadia: the platform for exposing your existing ML/LLMs to the world with zero downtime, zero cost, and limitless ROI.
Put your LLMs to Work — Part I
Full Platform Implementation — Part II
Stripe Billing — Part III
Putting It All Together — Part IV
This article is the last in the 4-Part DocuSeries on building Arcadia. So far, we have covered (this list keeps growing longer and longer :)):
Idea formulation
System design
Automating ML model containerization
Connecting the backend and frontend components for model deployment
Using the SDK for single-line ML model inference
How Stripe Connect makes anybody both a deployer and a user
How to automatically issue invoices based on API usage
In this article, we will conclude the 4-Part Series by:
Giving a platform overview
Covering key considerations we should be aware of
Explaining why you should join the Arcadia waitlist
Q1: Platform Overview
Why? Because ML teams have valuable proprietary ML models they could rent out. It’s a capitalist move with zero barrier to entry for all parties. Everybody wins. Arcadia was built for ML engineering teams that have existing ML models they could earn from, but don’t want to give others full access to them.
How? By having a platform that solves API billing, hosting, inference, and scalability for you. ML teams can now easily upload, deploy, and charge for their proprietary ML models.
What? A SaaS dashboard & SDK for developers, teams, and enterprises looking to charge for their ML models via API billing, plus the infrastructure for deploying ML models and billing others for API usage.
Basic Workflow
A user should be able to connect with Stripe for API billing
A user can upload any ML/LLM model for inference by other people
Any other user can invoke that model with a single-call SDK command (sketched below)
API billing is charged per model API call
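To make the workflow concrete, here is roughly what that single call looks like from the consumer side. The arcadia import, the ArcadiaClient class, and the infer method are illustrative placeholders, not the final SDK surface:

```python
# Hypothetical single-call inference with the Arcadia SDK; names are illustrative.
from arcadia import ArcadiaClient

client = ArcadiaClient(api_key="ak_live_...")  # your Arcadia API key

# One call: Arcadia routes the request to the model's container, runs
# inference, and records the call for Stripe billing.
result = client.infer(
    model="acme-labs/text-summarizer",        # marketplace model identifier
    inputs={"text": "A very long document..."},
)
print(result)
```

Everything after that call (routing, inference, metering for Stripe) happens on Arcadia’s side.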
ML/LLM Model Deployment
The exported model format really doesn’t matter; any file can be pushed through. However, we do need preinstalled loaders for each model format so we can call predict on the loaded model.
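As a sketch of what those preinstalled loaders could look like (the registry below is illustrative, not Arcadia’s actual code):

```python
# Illustrative loader registry: map a file extension to a loader that
# returns an object exposing predict().
import pickle
from pathlib import Path

def load_pickle(path: str):
    with open(path, "rb") as f:
        return pickle.load(f)  # e.g. a scikit-learn estimator with .predict()

LOADERS = {
    ".pkl": load_pickle,
    # ".onnx": load_onnx,  # one loader per supported format
    # ".pt": load_torch,
}

def load_model(path: str):
    suffix = Path(path).suffix
    if suffix not in LOADERS:
        raise ValueError(f"No preinstalled loader for '{suffix}' files")
    return LOADERS[suffix](path)
```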
Docker containers are (still) pushed locally, and the models are exposed via designated ports. In the future, Kubernetes will provide horizontal scaling when adding new models, and vertical scaling for when a model is too large for one container or the request influx is too large.
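For the current local setup, the gist is something like this (a sketch using the docker Python SDK; the image tag and ports are placeholders):

```python
# Sketch: run a freshly built model image locally and expose it on a
# designated host port. Image name and port numbers are placeholders.
import docker

client = docker.from_env()
container = client.containers.run(
    image="arcadia/model-42:latest",  # built during model upload
    detach=True,
    ports={"8000/tcp": 9042},         # container port -> designated host port
)
print(container.short_id, "serving on localhost:9042")
```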
During upload, the deployer is prompted for each model’s name, description, category, use_case_category, and charge_per_api_call. We use this data to display and search through the Arcadia marketplace.
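In SDK terms, the upload could look something like this. The client.deploy() method is illustrative; the metadata fields are the ones the upload flow actually prompts for:

```python
# Hypothetical deploy call; the metadata fields mirror the upload prompts,
# but client.deploy() itself is an illustrative name.
from arcadia import ArcadiaClient

client = ArcadiaClient(api_key="ak_live_...")
client.deploy(
    model_file="summarizer.pkl",
    name="text-summarizer",
    description="Summarizes long documents into a few sentences.",
    category="NLP",
    use_case_category="Text Summarization",
    charge_per_api_call=0.02,  # in USD, billed via Stripe
)
```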
Arcadia Marketplace
All models are stored inside the Arcadia Marketplace. In here, you can filter by the fields below (a hypothetical search call follows the list):
name
category (LLM, Computer Vision, NLP, etc.)
use_case_category (Text Summarization, etc.)
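That search might look like this from the SDK (client.search() is, again, an illustrative name):

```python
# Hypothetical marketplace search over the same metadata fields.
from arcadia import ArcadiaClient

client = ArcadiaClient(api_key="ak_live_...")
models = client.search(
    category="NLP",
    use_case_category="Text Summarization",
)
for m in models:
    print(m.name, m.charge_per_api_call)
```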
Model Pages
I’m most proud of the model pages, which quite resemble the HuggingFace ones. I really pushed to achieve a clean, HuggingFace-style URL format for each model.
Without further ado, here are the model pages:
Q2: What Are We Missing?
Cooool. Okay, here’s what we will absolutely need to implement for Arcadia to work:
Connect to one of the clouds — I’m thinking GKE on GCP to run scalable model containers and expose external ports there.
I need to set the Auth0 and Stripe callbacks to domain URLs (they won’t accept localhost)
ModelUsage should be implemented properly, displaying the API calls you, the user, have made to other proprietary ML models (a sketch of such a record follows this list).
We could offer deployers a more granular display of what is happening with their ML models in other people’s production systems.
Similarly, we should offer users insight into model outputs, especially when dealing with LLMs, and track model and data decay as well.
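To make the ModelUsage item concrete, here is a minimal sketch of what such a record could track. The fields are my assumptions about what usage displays and decay tracking would need:

```python
# Sketch of a per-call usage record; fields are assumptions about what a
# proper ModelUsage implementation would track.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class ModelUsage:
    model_id: str        # which marketplace model was called
    caller_id: str       # which user/team made the call
    timestamp: datetime  # when the call happened
    latency_ms: float    # how long inference took
    charge: float        # amount billed for this call (USD)
```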
Q3: Why You Should Join the Arcadia Waitlist
I’m stoked to be building this! It’s always great to see something that was once a mere idea in my mind now working in real time. Anyhow, you should definitely join the waitlist:
If you have proprietary ML models you want to expose for inference, you should join the waitlist.
If you have existing ML models that are sitting in production, but you don’t want to give full access to competitors, you should join the waitlist.
If you are an independent ML engineer who has built something cool that could be bought by Google (jk), and you want to license the ML model for a 10% cut in perpetuity, you should join the waitlist.
If you are a research company with proprietary data you wish to share with fellow companies and charge them for it, you should definitely join the waitlist.
If you’re a big enterprise that spent millions on R&D and could print money by renting out your model for API billing, you should join the waitlist.
And finally, if you’re interested in how Arcadia works, you should join the waitlist.
To Conclude:
Arcadia was built out of my own need to expose ML models for API billing to other engineers/teams. In that sense, it’s the anti-HuggingFace.
It supports any ML/LLM model, which can then be served for inference and billed per API call through the Arcadia model marketplace.
We automated the process of model containerization so that any user can expose their model to anyone in the world.