# serverless

16 tools tagged

Cerebras

Wafer-scale inference at thousands of tokens per second

Cerebras Inference serves open-weight LLMs such as Llama, Qwen, and GPT-OSS on wafer-scale CS-3 chips through an OpenAI-compatible API, benchmarking between 1,800 and 2,600 output tokens per second on Llama 3.1 8B and several hundred on 70B models. A free tier offers one million tokens per day with no credit card required, while pay-per-token pricing starts at $0.04 per million tokens for the smaller Llama models.
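As a rough back-of-the-envelope sketch of what those figures imply, the snippet below estimates streaming time and token cost for a single response. The function names are hypothetical, and the only inputs are the numbers quoted above (1,800 to 2,600 output tokens per second on Llama 3.1 8B, $0.04 per million tokens). It assumes a steady decode rate and ignores network latency, prompt processing, and any difference between input and output pricing.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to stream `output_tokens` at a steady decode rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


def token_cost_usd(total_tokens: int, usd_per_million: float) -> float:
    """Pay-per-token cost at a flat price per million tokens."""
    return total_tokens / 1_000_000 * usd_per_million


if __name__ == "__main__":
    tokens = 1_000  # hypothetical response length
    slow = generation_seconds(tokens, 1_800)  # low end of the quoted range
    fast = generation_seconds(tokens, 2_600)  # high end of the quoted range
    print(f"{tokens} tokens: {fast:.2f}s to {slow:.2f}s")
    print(f"cost at $0.04/M tokens: ${token_cost_usd(tokens, 0.04):.6f}")
```

Actual throughput and pricing depend on the model, load, and current rate card, so the output is only an order-of-magnitude guide.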

freemium

Tembo

Managed Postgres platform with 200+ extensions as pre-built stacks

Tembo is a managed PostgreSQL platform that packages 200+ Postgres extensions into purpose-built stacks for specific workloads. Stacks include OLAP analytics, vector search, message queues, geospatial, and machine learning, turning PostgreSQL into a specialized database for each use case. This eliminates the need for separate Redis, Elasticsearch, or Kafka instances alongside Postgres.

freemium, Open Source

Modal

Serverless GPU compute platform for AI inference and training

Modal is a serverless compute platform that lets developers run AI workloads on GPUs with a Python-first SDK. Functions deploy with decorators, auto-scale from zero to thousands of containers, and bill per second. It supports LLM inference, fine-tuning, batch jobs, and sandboxes, with current GPU options including B200, H200, H100, A100, L40S, A10, L4, and T4. Modal’s 2026 Series C valued the company at $4.65B.

freemium

Val Town

Instant serverless TypeScript platform for APIs, crons, and bots

Val Town is a collaborative platform for writing and deploying serverless TypeScript functions instantly from the browser. Users create APIs, cron jobs, email handlers, and bots, deploying each with a single Cmd+S. It features version control, collaboration, and a social coding model in which functions can import code from other users. The company is YC-backed, with 450K+ monthly active users.

freemium

turbopuffer

Serverless vector and full-text search on object storage

turbopuffer is a serverless vector and full-text search engine built on object storage and vendor-positioned as roughly 10x cheaper than traditional vector databases. It is used by Anthropic, Cursor, Notion, and Atlassian for production search workloads. The official site reports 4T+ documents, 10M+ writes/s, and 25k+ queries/s in production systems. The company is funded by Thrive Capital.

paid

Cloudflare Agents

Build and deploy AI agents on Cloudflare's edge network

Cloudflare Agents is an open-source SDK for building and deploying AI agents that run on Cloudflare's global edge network. It provides durable state, scheduled tasks, WebSocket communication, and browser rendering capabilities within Workers. Agents persist across requests using Durable Objects and can orchestrate multi-step workflows with built-in MCP server support. The project has over 7,000 GitHub stars.

freemium, Open Source

Restate

Durable execution engine for workflows and AI agents

Restate is a durable execution engine that provides reliable workflow orchestration for AI agents and backend services. It runs as a single binary with no external dependencies, delivering sub-50ms latency and 94K+ actions per second. It offers TypeScript, Python, Go, Java, and Kotlin SDKs with built-in retries, sagas, and virtual object state. The project is MIT licensed and has 3,700+ GitHub stars.
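To make the idea of durable execution concrete, here is a minimal toy sketch in plain Python. It is not Restate's SDK or API: the `Journal` class, the `run` method, and the `checkout` workflow are hypothetical names, and the journal lives in memory rather than in a durable log. The point it illustrates is that each completed step's result is recorded, so a retried workflow replays recorded results instead of re-running side effects.

```python
from typing import Any, Callable, Dict


class Journal:
    """In-memory stand-in for a durable log of completed step results."""

    def __init__(self) -> None:
        self.results: Dict[str, Any] = {}

    def run(self, step_name: str, action: Callable[[], Any]) -> Any:
        # Replay: a step that already completed is never executed again.
        if step_name in self.results:
            return self.results[step_name]
        result = action()  # may raise; nothing is recorded in that case
        self.results[step_name] = result
        return result


def checkout(journal: Journal, counters: Dict[str, int], email_fails: bool) -> str:
    """Hypothetical two-step workflow: charge a card, then send an email."""

    def charge() -> str:
        counters["charge"] += 1
        return "receipt-1"

    def email() -> str:
        counters["email"] += 1
        if email_fails:
            raise RuntimeError("mail server unavailable")
        return "sent"

    receipt = journal.run("charge", charge)
    status = journal.run("email", email)
    return f"{receipt}:{status}"
```

If the first attempt fails at the email step and the workflow is retried with the same journal, the charge step is skipped on replay, so the card is charged once even though the workflow ran twice. A real engine would persist the journal and drive the retries itself.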

freemium, Open Source

Google Colab

Free cloud-hosted Jupyter notebooks with GPU access

Google Colab is a free cloud-hosted Jupyter notebook environment providing access to GPUs and TPUs for machine learning, data analysis, and education. It requires no setup: notebooks run in the browser with pre-installed ML libraries including TensorFlow, PyTorch, scikit-learn, and pandas. It features Google Drive integration for persistent storage, collaborative editing, and sharing. The free tier includes limited GPU access; Colab Pro provides faster GPUs, more memory, and longer runtimes.

freemium, Open Source, Telemetry

BentoML

ML model serving and deployment framework

BentoML is an open-source framework with 7K+ GitHub stars for packaging, deploying, and serving ML models as production-ready APIs. It bundles models, preprocessing, and serving logic into portable Bento archives with auto-generated REST/gRPC endpoints. It features adaptive batching for throughput optimization, GPU scheduling, multi-model inference pipelines, and containerization. It supports major ML frameworks including PyTorch, TensorFlow, scikit-learn, and Hugging Face Transformers.
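The batching idea can be illustrated with a simplified sketch. This is not BentoML's implementation or API: it is a fixed-window micro-batcher with hypothetical names (`micro_batch`, `max_batch_size`, `max_wait_s`), whereas an adaptive batcher would also tune the batch size and wait time from observed latency. It shows only the core trade-off, which is grouping requests that arrive close together so the model runs once per batch.

```python
from typing import List, Sequence, Tuple

Request = Tuple[float, str]  # (arrival time in seconds, payload)


def micro_batch(
    requests: Sequence[Request], max_batch_size: int, max_wait_s: float
) -> List[List[str]]:
    """Group requests, sorted by arrival time, into batches.

    A batch is closed when it already holds max_batch_size items, or when
    the next request arrives more than max_wait_s after the batch's first
    request.
    """
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    batches: List[List[str]] = []
    current: List[str] = []
    window_start = 0.0
    for arrival, payload in requests:
        too_full = len(current) >= max_batch_size
        too_late = arrival - window_start > max_wait_s
        if current and (too_full or too_late):
            batches.append(current)
            current = []
        if not current:
            window_start = arrival  # a new batch window opens here
        current.append(payload)
    if current:
        batches.append(current)
    return batches
```

With a batch size of 2 and a 50 ms window, three requests arriving within 20 ms followed by one much later would be split into a full batch, a partial batch, and a final single-item batch.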

open-source

AWS Amplify

Build full-stack web & mobile apps on AWS

AWS's TypeScript-based full-stack platform for building and deploying web and mobile applications. AWS Amplify Gen 2 lets developers define backend resources — data, auth, storage, serverless functions — entirely in TypeScript, then auto-provisions AWS resources via CDK. Includes per-developer sandboxes, fullstack branch deployments, real-time data sync, offline support, and seamless Bedrock/Redis integration.

freemium

Cloudflare Pages

JAMstack platform for frontend developers

JAMstack deployment platform that builds and hosts websites on Cloudflare's global edge network across 300+ cities. Cloudflare Pages connects to GitHub or GitLab for automatic deployments on every push, generates unique preview URLs per pull request, supports server-side logic via Pages Functions on Workers, and includes instant rollbacks, branch deployments, and Cloudflare Access controls.

freemium

Convex

The reactive backend for modern apps

Reactive backend-as-a-service with real-time sync, TypeScript-native queries and mutations, automatic caching, and built-in file storage. No SQL required — define your backend logic in TypeScript and Convex handles the database, real-time subscriptions, and serverless functions. Ideal for apps that need instant data updates without complex WebSocket infrastructure.

freemium

Xata

Serverless database with search and AI built-in

Serverless database platform that combines Postgres, full-text search, analytics, and AI features in a single service. Built-in vector search for AI applications, branching for safe schema changes, and a spreadsheet-like UI for data exploration. Designed for developers who want powerful database capabilities without managing separate services for search, analytics, and embeddings.

freemium

PlanetScale

MySQL-compatible serverless database

Relational database platform for MySQL and Postgres with Vitess-backed MySQL scale, PlanetScale Postgres, query insights, deploy-request workflows, and Database Traffic Control. It fits production teams that need managed relational performance, safe schema changes, replicas, and database expertise rather than a simple hobby database.

paid

Neon

Serverless Postgres

Serverless Postgres platform separating storage and compute for branching, autoscaling, read replicas, instant restore, and scale-to-zero workloads. Neon works with standard PostgreSQL clients and ORMs, supports extensions such as pgvector, and sits inside a broader Neon backend platform with Auth, Data API, Functions, Object Storage, and AI Gateway features.

freemium, Open Source

SST

Build full-stack apps on your own infra

Open-source framework for building and deploying full-stack applications on AWS with infrastructure-as-code. Supports Next.js, Remix, Astro, and more with zero-config deployments. Manages Lambda, DynamoDB, S3, and other AWS services through a clean TypeScript API, giving developers the power of AWS without the complexity of CloudFormation or CDK.

open-source