
submitted by
Style Pass
2024-09-16 15:30:04

Website · Docs · Twitter · Slack · Discord Quick Start (5min) · Tutorial (30min) · Deployment Guide · API Reference · Configuration Reference

The TensorZero Gateway is a high-performance model gateway written in Rust 🦀 that provides a unified API for all major LLM providers, allowing for seamless cross-platform integration and fallbacks. It handles structured, schema-based inference with <1ms P99 latency overhead (see Benchmarks) and includes built-in observability and experimentation (and soon, inference-time optimizations).

It also collects downstream metrics and feedback associated with these inferences, with first-class support for multi-step LLM systems. Everything is stored in a ClickHouse data warehouse that you control for real-time, scalable, and developer-friendly analytics.

Over time, TensorZero Recipes leverage this structured dataset to optimize your prompts and models: run pre-built recipes for common workflows like fine-tuning, or create your own with complete flexibility using any language and platform.

Finally, the gateway's experimentation features and GitOps orchestration enable you to iterate and deploy with confidence, be it a single LLM or thousands of LLMs.

Our goal is to help engineers build, manage, and optimize the next generation of LLM applications: systems that learn from real-world experience. Read more about our Vision & Roadmap.

Get Started

Next steps? The Quick Start (5min) and the Tutorial (30min) show how easy it is to set up an LLM application with TensorZero. The tutorial teaches how to build a simple chatbot, an email copilot, a weather RAG system, and a structured data extraction pipeline.

Questions? Ask us on Slack or Discord.

Using TensorZero at work? Email us at hello@tensorzero.com to set up a Slack or Teams channel with your team (free).

Examples

We are working on a series of complete runnable examples illustrating TensorZero's data & learning flywheel.

Writing Haikus to Satisfy a Judge with Hidden Preferences
This example fine-tunes GPT-4o Mini to generate haikus tailored to a specific taste. You'll see TensorZero's "data flywheel in a box" in action: better variants lead to better data, and better data leads to better variants. You'll see progress by fine-tuning the LLM multiple times.

Fine-Tuning TensorZero JSON Functions for Named Entity Recognition (CoNLL++)
This example shows that an optimized Llama 3.1 8B model can be trained to outperform GPT-4o on an NER task using a small amount of training data, and served by Fireworks at a fraction of the cost and latency.

Automated Prompt Engineering for Math Reasoning (GSM8K) with a Custom Recipe (DSPy)
TensorZero provides a number of pre-built optimization recipes covering common LLM engineering workflows. But you can also easily create your own recipes and workflows! This example shows how to optimize a TensorZero function using an arbitrary tool, in this case DSPy.

& many more on the way!
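As a rough illustration of the fallback idea mentioned in the overview (moving on to another provider when one fails), here is a minimal Python sketch. It is not TensorZero's implementation or API: `Provider`, `ProviderError`, `infer_with_fallback`, and the two stub functions are hypothetical names invented for this example, and the real gateway is configured and called differently (see the Quick Start).

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


class ProviderError(Exception):
    """Hypothetical error raised when a provider call fails."""


@dataclass
class Provider:
    # Hypothetical wrapper: a provider name plus a callable mapping a prompt to text.
    name: str
    complete: Callable[[str], str]


def infer_with_fallback(providers: Sequence[Provider], prompt: str) -> Tuple[str, str]:
    """Try providers in order; return (provider_name, output) from the first success."""
    errors: List[str] = []
    for provider in providers:
        try:
            return provider.name, provider.complete(prompt)
        except ProviderError as exc:
            # Record the failure and move on to the next provider in the list.
            errors.append(f"{provider.name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


def flaky(prompt: str) -> str:
    # Stub that always fails, standing in for a provider outage.
    raise ProviderError("simulated outage")


def echo(prompt: str) -> str:
    # Stub that always succeeds, standing in for a healthy provider.
    return f"echo: {prompt}"
```

Calling `infer_with_fallback([Provider("primary", flaky), Provider("backup", echo)], "hello")` skips the failing stub and returns the result from the second one. Callers see a single function regardless of which provider answered, which is the property a unified interface is meant to give.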

