# serverless
16 tools tagged
Showing 16 of 16 tools
Cerebras
Wafer-scale inference at thousands of tokens per second
Cerebras Inference serves open-weight LLMs like Llama, Qwen, and GPT-OSS on wafer-scale CS-3 chips through an OpenAI-compatible API, benchmarking between 1,800 and 2,600 output tokens per second on Llama 3.1 8B and several hundred on 70B models. A free tier offers one million tokens per day with no credit card, while paid pay-per-token pricing starts at $0.04 per million tokens for the smaller Llama models.
Tembo
Managed Postgres platform with 200+ extensions as pre-built stacks
Tembo is a managed PostgreSQL platform that packages 200+ Postgres extensions into purpose-built stacks for specific workloads. Stacks include OLAP analytics, vector search, message queues, geospatial, and machine learning, turning PostgreSQL into a specialized database for each use case. Eliminates the need for separate Redis, Elasticsearch, or Kafka instances alongside Postgres.
Modal
Serverless GPU compute platform for AI inference and training
Modal is a serverless compute platform that lets developers run AI workloads on GPUs with a Python-first SDK. Functions deploy with decorators, auto-scale from zero to thousands of containers, and bill per second. It supports LLM inference, fine-tuning, batch jobs, and sandboxes, with current GPU options including B200, H200, H100, A100, L40S, A10, L4, and T4. Modal’s 2026 Series C valued the company at $4.65B.
Val Town
Instant serverless TypeScript platform for APIs, crons, and bots
Val Town is a collaborative platform for writing and deploying serverless TypeScript functions instantly from the browser. Create APIs, cron jobs, email handlers, and bots with a single Cmd+S. Features version control, collaboration, and a social coding model where functions can import from other users. YC-backed with 450K+ monthly active users.
turbopuffer
Serverless vector and full-text search on object storage
turbopuffer is a serverless vector and full-text search engine built on object storage and vendor-positioned as roughly 10x cheaper than traditional vector databases. Used by Anthropic, Cursor, Notion, and Atlassian for production search workloads. Official site reports 4T+ documents, 10M+ writes/s, and 25k+ queries/s in production systems. Funded by Thrive Capital.
Cloudflare Agents
Build and deploy AI agents on Cloudflare's edge network
Cloudflare Agents is an open-source SDK for building and deploying AI agents that run on Cloudflare's global edge network. It provides durable state, scheduled tasks, WebSocket communication, and browser rendering capabilities within Workers. Agents persist across requests using Durable Objects and can orchestrate multi-step workflows with built-in MCP server support. Over 7,000 GitHub stars.
Restate
Durable execution engine for workflows and AI agents
Restate is a durable execution engine that provides reliable workflow orchestration for AI agents and backend services. It runs as a single binary with no external dependencies, delivering sub-50ms latency and 94K+ actions per second. Supports TypeScript, Python, Go, Java, and Kotlin SDKs with built-in retries, sagas, and virtual object state. MIT licensed with 3,700+ GitHub stars.
Google Colab
Free cloud-hosted Jupyter notebooks with GPU access
Google Colab is a free cloud-hosted Jupyter notebook environment providing access to GPUs and TPUs for machine learning, data analysis, and education. Requires no setup — notebooks run in the browser with pre-installed ML libraries including TensorFlow, PyTorch, scikit-learn, and pandas. Features Google Drive integration for persistent storage, collaborative editing, and sharing. Free tier includes limited GPU access; Colab Pro provides faster GPUs, more memory, and longer runtimes.
BentoML
ML model serving and deployment framework
BentoML is an open-source framework with 7K+ GitHub stars for packaging, deploying, and serving ML models as production-ready APIs. Bundles models, preprocessing, and serving logic into portable Bento archives with auto-generated REST/gRPC endpoints. Features adaptive batching for throughput optimization, GPU scheduling, multi-model inference pipelines, and containerization. Supports all major ML frameworks including PyTorch, TensorFlow, scikit-learn, and Hugging Face Transformers.
AWS Amplify
Build full-stack web & mobile apps on AWS
AWS's TypeScript-based full-stack platform for building and deploying web and mobile applications. AWS Amplify Gen 2 lets developers define backend resources — data, auth, storage, serverless functions — entirely in TypeScript, then auto-provisions AWS resources via CDK. Includes per-developer sandboxes, fullstack branch deployments, real-time data sync, offline support, and seamless Bedrock/Redis integration.
Cloudflare Pages
JAMstack platform for frontend developers
JAMstack deployment platform that builds and hosts websites on Cloudflare's global edge network across 300+ cities. Cloudflare Pages connects to GitHub or GitLab for automatic deployments on every push, generates unique preview URLs per pull request, supports server-side logic via Pages Functions on Workers, and includes instant rollbacks, branch deployments, and Cloudflare Access controls.
Convex
The reactive backend for modern apps
Reactive backend-as-a-service with real-time sync, TypeScript-native queries and mutations, automatic caching, and built-in file storage. No SQL required — define your backend logic in TypeScript and Convex handles the database, real-time subscriptions, and serverless functions. Ideal for apps that need instant data updates without complex WebSocket infrastructure.
Xata
Serverless database with search and AI built-in
Serverless database platform that combines Postgres, full-text search, analytics, and AI features in a single service. Built-in vector search for AI applications, branching for safe schema changes, and a spreadsheet-like UI for data exploration. Designed for developers who want powerful database capabilities without managing separate services for search, analytics, and embeddings.
PlanetScale
MySQL-compatible serverless database
Relational database platform for MySQL and Postgres with Vitess-backed MySQL scale, PlanetScale Postgres, query insights, deploy-request workflows, and Database Traffic Control. It fits production teams that need managed relational performance, safe schema changes, replicas, and database expertise rather than a simple hobby database.
Neon
Serverless Postgres
Serverless Postgres platform separating storage and compute for branching, autoscaling, read replicas, instant restore, and scale-to-zero workloads. Neon works with standard PostgreSQL clients and ORMs, supports extensions such as pgvector, and sits inside a broader Neon backend platform with Auth, Data API, Functions, Object Storage, and AI Gateway features.
SST
Build full-stack apps on your own infra
Open-source framework for building and deploying full-stack applications on AWS with infrastructure-as-code. Supports Next.js, Remix, Astro, and more with zero-config deployments. Manages Lambda, DynamoDB, S3, and other AWS services through a clean TypeScript API, giving developers the power of AWS without the complexity of CloudFormation or CDK.