In AI and high-performance computing (HPC), most research and development is limited by compute resources. SF Tensors, a San Francisco-based startup funded by Y Combinator and angel investors, is tackling this problem head-on. Founded by brothers Luk, Tom, and Ben Koska, the company is building tools to make compute fast, affordable, and portable, freeing AI teams to focus on innovation rather than infrastructure.
Background
- Company: SF Tensors (San Francisco Tensor Company)
- Founded: 2025
- HQ: San Francisco
- Funding: Seed from Y Combinator and angel investors
- Co-founders: Ben Koska, Tom Koska, and Luk Koska
- Products: EMMA, an AI-first programming language, plus kernel optimization and elastic cloud infrastructure
The Problem: Locked into CUDA
Today’s AI infrastructure is dominated by NVIDIA GPUs. They deliver excellent performance, but at a cost: switching cloud providers, leveraging cheaper spot instances, or integrating alternative accelerators like AMD GPUs or TPUs often requires rewriting kernels and models from scratch. For small labs and startups, this creates a painful choice: overpay for compute or maintain a large kernel and infrastructure team. SF Tensors aims to change that.
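To make the lock-in concrete, here is a minimal SAXPY in plain CUDA (illustrative code, not SF Tensors’): the `__global__` qualifier, the triple-chevron launch syntax, and the runtime allocation and copy calls are all NVIDIA-specific, so retargeting even this tiny program to AMD (via HIP), TPUs, or Vulkan means rewriting essentially every line.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Minimal SAXPY in plain CUDA. __global__, the <<<...>>> launch, and the
// cudaMalloc/cudaMemcpy runtime calls are all NVIDIA-specific: retargeting
// this to AMD (HIP), TPUs, or Vulkan means rewriting essentially every line.
__global__ void saxpy(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    std::vector<float> hx(n, 1.0f), hy(n, 2.0f);
    float *dx, *dy;
    cudaMalloc(&dx, n * sizeof(float));
    cudaMalloc(&dy, n * sizeof(float));
    cudaMemcpy(dx, hx.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, dx, dy);   // CUDA launch syntax
    cudaMemcpy(hy.data(), dy, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dx);
    cudaFree(dy);
    return 0;
}
```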
EMMA: A Language Designed for AI First
At the heart of SF Tensors’ approach is EMMA, a programming language purpose-built for high-performance, hardware-aware AI computation. According to emma-lang.org, EMMA blends the intuitive syntax of Swift with the speed of C, making it both developer-friendly and extremely efficient.
Key features include:
- Intuitive syntax for readable, maintainable code from prototype to production.
- Native async/await to simplify parallel and asynchronous GPU operations (the sketch after this list shows the manual stream plumbing this replaces).
- Builder functions for generating hardware-optimized MLIR code at compile time.
- Native GPU kernel support across NVIDIA, AMD, and Vulkan platforms.
- Low-level control with safe-by-default semantics, so performance doesn’t come at the cost of safety.
- Gradual migration path from languages like CUDA, enabling smooth adoption.
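EMMA’s actual syntax isn’t documented here, so as a point of contrast, the following is the stream-and-synchronization boilerplate that plain CUDA requires just to overlap a host-to-device copy with a kernel; this is the kind of plumbing a language-level async/await would be expected to absorb. The kernel and buffer sizes are illustrative.

```cuda
#include <cuda_runtime.h>

// Overlapping a copy with a kernel in CUDA requires explicit streams and
// manual synchronization. A native async/await, as EMMA advertises, would
// express the same dependency graph directly in the language.
__global__ void scale(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main() {
    const int n = 1 << 20;
    float* host;
    cudaMallocHost(&host, n * sizeof(float));  // pinned memory: required for true overlap
    float *a, *b;
    cudaMalloc(&a, n * sizeof(float));
    cudaMalloc(&b, n * sizeof(float));

    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);

    // Copy on s1 while a kernel runs on s2; then synchronize both by hand.
    cudaMemcpyAsync(a, host, n * sizeof(float), cudaMemcpyHostToDevice, s1);
    scale<<<(n + 255) / 256, 256, 0, s2>>>(b, n);
    cudaStreamSynchronize(s1);
    cudaStreamSynchronize(s2);

    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(a);
    cudaFree(b);
    cudaFreeHost(host);
    return 0;
}
```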
EMMA also supports direct integration with existing CUDA code, allowing developers to leverage legacy kernels while benefiting from EMMA’s performance optimizations. From neural network training to real-time AI applications, EMMA is designed to make AI code faster, safer, and portable across evolving hardware.
The SF Tensors Manifesto: Philosophy Meets Practice
The SF Tensors manifesto outlines a bold vision for the future of AI infrastructure:
- The Future is Heterogeneous: CPUs, GPUs, TPUs, and domain-specific accelerators must all be first-class citizens.
- Performance Should Be Accessible: High-performance optimization should be automatic, measurable, and universal.
- Cost is a Dimension of Performance: It’s not just FLOPS, it’s FLOPS per dollar. Elastic cloud and smart resource allocation democratize access.
- Mathematics is the Ground Truth: Memory distance, cache locality, and algorithmic complexity are quantifiable and optimizable.
- Code Should Outlive Hardware: Write once in EMMA; run anywhere.
- We Build Tools, Not Walled Gardens: Open, extensible, developer-first infrastructure is central to SF Tensors’ philosophy.
The manifesto reflects a commitment to developer freedom, performance, and long-term sustainability, emphasizing tools that empower researchers rather than locking them into a vendor ecosystem.
Products: Kernel Optimization and Elastic Cloud
SF Tensors is building two major products to realize this vision:
- Kernel Optimizer: This system aims to transform a model or kernel into its mathematically fastest form by simulating memory hierarchies, cache behavior, and hardware topology. The company reports that it often outperforms hand-tuned implementations, making high-performance AI accessible without deep hardware expertise (a toy sketch of this kind of memory-hierarchy modeling follows this list).
- Elastic Cloud: SF Tensors’ managed infrastructure automatically finds the cheapest hardware across providers, orchestrates training jobs, and handles the surrounding infrastructure complexity, whether you need one GPU or 10,000 (a toy version of the provider selection appears below). Researchers can focus on experimentation, not cloud orchestration.
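SF Tensors hasn’t published the Kernel Optimizer’s internals, but the core idea of simulating the memory hierarchy can be sketched with a toy model. For a tiled matrix multiply, larger tiles cut DRAM traffic, yet two tiles must still fit in shared memory, so “simulating the hierarchy” reduces to a constrained argmin. All constants below are illustrative; this is host-side code, compilable with nvcc or any C++ compiler.

```cuda
#include <cstdio>
#include <cstddef>

// Hypothetical toy model (not SF Tensors' optimizer): for an N x N matmul
// tiled into T x T blocks held in shared memory, modeled DRAM traffic is
// roughly 2*N^3/T floats. Pick the tile that minimizes traffic while two
// tiles still fit in the shared-memory budget.
int main() {
    const std::size_t N = 4096;
    const std::size_t smem_bytes = 48 * 1024;   // typical per-block shared memory
    std::size_t best_tile = 0;
    double best_traffic = 1e30;
    for (std::size_t t = 8; t <= 128; t *= 2) {
        if (2 * t * t * sizeof(float) > smem_bytes) break;  // tiles no longer fit
        double traffic = 2.0 * N * N * N / t;               // modeled float loads
        if (traffic < best_traffic) { best_traffic = traffic; best_tile = t; }
    }
    std::printf("tile=%zu, modeled loads=%.3g floats\n", best_tile, best_traffic);
    return 0;
}
```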
Combined, these products aim to decouple performance from cost and infrastructure expertise, a critical need for small AI labs and startups.
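Likewise, the selection problem Elastic Cloud automates can be sketched as a ranking by the manifesto’s “FLOPS per dollar” rather than raw throughput. Every provider name, throughput figure, and price below is invented for illustration.

```cuda
#include <cstdio>
#include <vector>
#include <string>

// Hypothetical sketch of the decision Elastic Cloud automates: rank offers
// by delivered TFLOPS per dollar, not raw TFLOPS. All data is made up.
struct Offer {
    std::string provider;
    double tflops;          // sustained instance throughput
    double dollars_per_hr;  // spot or on-demand price
};

int main() {
    std::vector<Offer> offers = {
        {"cloud-a on-demand", 312.0, 4.10},
        {"cloud-b spot",      312.0, 1.20},
        {"cloud-c (AMD)",     383.0, 2.00},
    };
    const Offer* best = nullptr;
    for (const auto& o : offers) {
        double value = o.tflops / o.dollars_per_hr;  // "FLOPS per dollar"
        std::printf("%-18s %7.1f TFLOPS per $/hr\n", o.provider.c_str(), value);
        if (!best || value > best->tflops / best->dollars_per_hr) best = &o;
    }
    std::printf("pick: %s\n", best->provider.c_str());
    return 0;
}
```

Note that one of the hypothetical best-value offers is an AMD instance, which is only reachable if your kernels are portable: this is where Elastic Cloud depends on EMMA.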
Why It Matters
SF Tensors’ approach could be a game-changer for AI research. By combining EMMA’s hardware-aware programming, kernel optimization, and elastic multi-cloud compute, they make high-performance AI compute accessible, portable, and cost-efficient.
The company’s early experience scaling to thousands of GPUs highlights a core insight: compute should be a commodity, not a moat. Their philosophy, products, and manifesto make clear that they want developers to innovate without worrying about infrastructure, costs, or vendor lock-in.
The SF Tensors Edge: Built by Practitioners
SF Tensors isn’t built on theory alone; the company is rooted in real-world experience. As the CEO shared on LinkedIn, solving multi-cloud bugs and kernel issues quickly was only possible through hands-on work and a deep understanding of infrastructure pain points. EMMA, the Kernel Optimizer, and Elastic Cloud all reflect this practical, problem-solving mindset, designed to help researchers focus on what really matters: advancing AI.
In short, SF Tensors is tackling one of AI’s most underappreciated bottlenecks: infrastructure complexity. With EMMA, kernel optimization, and Elastic Cloud, they’re building a developer-first, hardware-agnostic stack that could redefine how AI models are trained, fine-tuned, and deployed, making high-performance compute accessible to anyone with an idea and ambition.

