# modal-examples — Serverless LLM Jobs on Modal

> Learn production patterns for serverless jobs (LLM inference, data pipelines) using Modal’s official examples. Run one and adapt it to your workload.

## Install

Save the content below to `.claude/skills/` or append to your `CLAUDE.md`:

## Quick Use

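1. Install and authenticate:
   ```bash
   pip install modal
   modal setup  # links the CLI to your Modal account
   ```
2. Run, from a clone of the examples repo:
   ```bash
   git clone https://github.com/modal-labs/modal-examples && cd modal-examples
   modal run 01_getting_started/hello_world.py
   ```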
3. Verify:
   - Confirm that the remote run completes and prints its output to your terminal.


---

## Intro

- **Best for:** Developers who want a quick, example-driven path to run LLM workloads as serverless jobs
- **Works with:** Python, Modal CLI, cloud execution with local development loop (per README)
- **Setup time:** about 12 minutes


### Quantitative Notes

- Setup time ~12 minutes (install + auth + run one example)
- GitHub stars + forks (verified): see Source & Thanks
- Examples are organized into multiple folders; start with one file before scaling up


---

## Practical Notes

Treat examples as templates: fork one that matches your workload (batch, web endpoint, GPU inference), replace the core function with your model/tool call, then add logging and retries. Keep a local dev loop with a tiny input set so iteration stays fast.
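
The "replace the core function, then add logging and retries" step can be sketched in plain Python. This is a minimal illustration, not code from the examples repo: `core`, `with_retries`, and `run_job` are hypothetical names, and the backoff values are arbitrary. Modal's function decorator also accepts a `retries` argument, so on Modal you may prefer the platform's own retry handling over a hand-rolled wrapper.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("job")


def with_retries(fn, attempts=3, base_delay=0.5):
    """Wrap fn so failures are logged and retried with exponential backoff."""
    def wrapper(*args, **kwargs):
        for attempt in range(1, attempts + 1):
            try:
                result = fn(*args, **kwargs)
                log.info("attempt %d succeeded", attempt)
                return result
            except Exception as exc:
                log.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
                if attempt == attempts:
                    raise  # out of attempts: surface the last error
                time.sleep(base_delay * 2 ** (attempt - 1))
    return wrapper


def core(prompt: str) -> str:
    """Hypothetical stand-in for your model or tool call."""
    return prompt.upper()


run_job = with_retries(core)

# Local dev loop: iterate on a tiny input set before scaling up.
tiny_inputs = ["hello", "world"]
results = [run_job(p) for p in tiny_inputs]
print(results)  # → ['HELLO', 'WORLD']
```

Swapping `core` for a real model call leaves the wrapper and the tiny-input loop unchanged, which keeps iteration fast.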

**Safety note:** Treat secrets carefully: store API keys in environment variables or a secret manager, and never print them in logs.
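
One way to follow this advice, sketched under assumptions: the variable name `MY_LLM_API_KEY` and the helpers `mask` and `load_api_key` are hypothetical, and the demo value is a placeholder. Secrets attached to a Modal function are exposed to the code as environment variables, so reading from `os.environ` fits both local and remote runs.

```python
import os


def mask(secret: str, visible: int = 4) -> str:
    """Return a log-safe form that shows only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


def load_api_key(name: str = "MY_LLM_API_KEY") -> str:
    """Read the key from the environment; fail loudly if it is missing."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set")
    return key


# Demo only: a real job gets this variable from its secret manager.
os.environ.setdefault("MY_LLM_API_KEY", "sk-demo-1234567890")
api_key = load_api_key()
print("using key", mask(api_key))  # the full key never reaches the log
```

Failing at startup when the key is missing is a deliberate choice here: it is cheaper than discovering the problem halfway through a remote run.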

### FAQ

**Q: Do I need an account?**
A: Yes. The README instructs you to sign up and set an API key for the Modal CLI.

**Q: Can I run LLM inference?**
A: Yes. Many examples demonstrate patterns you can adapt to inference and data workloads; browse the repo's folders to find the closest match.

**Q: How do I keep costs predictable?**
A: Pin resources, set concurrency limits, and use small test runs before scaling.
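
The "limit concurrency, test small first" idea can be tried locally before touching cloud resources. The sketch below is a local stand-in using only the standard library; `process` and `run_batch` are hypothetical names. Modal exposes its own limits for this in function configuration, so check its documentation for the current option names.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def process(item: int) -> int:
    """Hypothetical stand-in for one unit of billable work."""
    return item * item


def run_batch(items, max_workers=2, limit=None):
    """Process at most `limit` items with at most `max_workers` in flight."""
    selected = list(islice(items, limit))  # limit=None keeps everything
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process, selected))  # map preserves input order


# Small test run first: 3 items out of 1000, 2 workers.
print(run_batch(range(1000), max_workers=2, limit=3))  # → [0, 1, 4]
```

Once the small run looks right, raise `limit` in steps and leave `max_workers` fixed, so the upper bound on parallel work stays known.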

---

## Source & Thanks

> GitHub: https://github.com/modal-labs/modal-examples
> Owner avatar: https://avatars.githubusercontent.com/u/88658467?v=4
> License (SPDX): MIT
> GitHub stars (verified via `api.github.com/repos/modal-labs/modal-examples`): 1,189
> GitHub forks (verified via `api.github.com/repos/modal-labs/modal-examples`): 288


---

<!-- ZH -->

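# modal-examples: Running Serverless LLM Jobs on Modal

> Learn production practices for serverless jobs (including LLM inference and data pipelines) from Modal's official examples: get one example running, then adapt it into a reusable job for your workload, adding logging, retries, concurrency control, and resource quotas so it scales more easily.

## Quick Use

1. Install and authenticate:
   ```bash
   pip install modal
   modal setup  # links the CLI to your Modal account
   ```
2. Run, from a clone of the examples repo:
   ```bash
   git clone https://github.com/modal-labs/modal-examples && cd modal-examples
   modal run 01_getting_started/hello_world.py
   ```
3. Verify:
   - Confirm that the remote run completes and prints its output to your terminal.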


---

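## Intro

Learn production practices for serverless jobs (including LLM inference and data pipelines) from Modal's official examples: get one example running, then adapt it into a reusable job for your workload, adding logging, retries, concurrency control, and resource quotas so it scales more easily.

- **Best for:** Developers who want an example-driven way to quickly run LLM workloads as serverless jobs
- **Works with:** Python, Modal CLI, cloud execution plus a local development loop (see README)
- **Setup time:** about 12 minutes


### Quantitative Notes

- About 12 minutes to get running (install + authorize + run one example)
- GitHub stars + forks (verified): see Source & Thanks
- Examples are organized by directory; get one script running first, then expand step by step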


---

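## Practical Notes

Treat examples as templates: pick the one closest to your scenario (batch processing, web endpoint, GPU inference), replace the core function with your model/tool call, then add logging and retries. Keep a local development loop: iterating on a small set of input data is what keeps it fast.

**Safety note:** Handle secrets carefully: keep API keys in environment variables or a secret manager, and avoid writing them to logs.

### FAQ

**Q: Do I need an account?**
A: Yes. The README tells you to sign up and configure an API key for the Modal CLI.

**Q: Can I run LLM inference?**
A: The examples cover a range of reusable patterns; following the repo structure, you can adapt them to inference and data jobs.

**Q: How do I control costs?**
A: Pin resource specs, limit concurrency, and get a small-scale test run working before scaling up.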

---

## Source & Thanks

> GitHub: https://github.com/modal-labs/modal-examples
> Owner avatar: https://avatars.githubusercontent.com/u/88658467?v=4
> License (SPDX): MIT
> GitHub stars (verified via `api.github.com/repos/modal-labs/modal-examples`): 1,189
> GitHub forks (verified via `api.github.com/repos/modal-labs/modal-examples`): 288


---
Source: https://tokrepo.com/en/workflows/modal-examples-serverless-llm-jobs-on-modal
Author: Script Depot