What is Gorilla — LLM That Accurately Calls APIs and Functions?

A fine-tuned large language model trained to generate correct API and function calls, reducing hallucination in tool-use scenarios.

Is Gorilla — LLM That Accurately Calls APIs and Functions free to use?

Yes. Gorilla — LLM That Accurately Calls APIs and Functions is freely available on TokRepo. Check the Source & Thanks section on the asset page for the specific open-source license.

How do I install Gorilla — LLM That Accurately Calls APIs and Functions?

Visit the asset page on TokRepo and click "Copy for agent" to get the installation instructions. Most assets can be installed with a single command.

Gorilla — LLM That Accurately Calls APIs and Functions

Introduction

Gorilla is a research project from UC Berkeley that fine-tunes LLMs to generate accurate API and function calls from natural language instructions. It addresses the hallucination problem where standard LLMs fabricate API parameters or call nonexistent endpoints, providing a reliable bridge between natural language and programmatic tool use.

What Gorilla Does

Generates syntactically correct function calls from natural language descriptions
Supports structured output in multiple formats including OpenAI, Anthropic, and raw JSON
Provides the Berkeley Function Calling Leaderboard for evaluating tool-use models
Includes OpenFunctions models fine-tuned on thousands of real-world API specifications
Handles multi-turn conversations with function call chaining and parallel execution

Architecture Overview

Gorilla models are fine-tuned from base LLMs using a curated dataset of API documentation paired with natural language queries and correct function calls. The training pipeline uses retrieval-augmented fine-tuning where API documentation is injected into the context during training to ground the model in real specifications. This approach lets the model generalize to unseen APIs by learning the mapping pattern rather than memorizing specific endpoints.

Self-Hosting & Configuration

Run locally with Python 3.10+ and a CUDA-capable GPU for inference
Download model weights from Hugging Face or use the hosted API endpoint
Configure the serving backend with vLLM or Hugging Face Transformers
Set temperature and sampling parameters through the API server
Models range from 7B to 13B parameters depending on the variant

Key Features

Reduces API call hallucination compared to general-purpose LLMs
Supports 1,600+ real-world APIs in training data from major cloud providers
Open evaluation framework with reproducible benchmarks across models
Compatible with the OpenAI function calling format for drop-in replacement
Actively maintained with regular model updates and expanded API coverage

Comparison with Similar Tools

GPT-4 Function Calling — Proprietary and closed; Gorilla provides an open-source alternative with competitive accuracy
LangChain Tools — A framework for chaining tools; Gorilla handles the model-level function call generation
Instructor — Focuses on structured output extraction; Gorilla specifically targets API call generation
NexusRaven — Similar function-calling model but with a narrower API coverage

FAQ

Q: Does Gorilla work with custom APIs not in the training set? A: Yes, when provided with API documentation in the prompt, Gorilla can generalize to unseen APIs.

Q: What GPU is required to run Gorilla locally? A: The 7B model runs on a single GPU with 16 GB VRAM; larger variants need 24+ GB.

Q: Can Gorilla replace the OpenAI function calling API? A: It supports the same format and can serve as an open-source alternative for function call generation.

Q: How is Gorilla evaluated? A: Through the Berkeley Function Calling Leaderboard, which tests accuracy on real API calls across multiple categories.

Gorilla — LLM That Accurately Calls APIs and Functions

Agent 可直接安装

Introduction

What Gorilla Does

Architecture Overview

Self-Hosting & Configuration

Key Features

Comparison with Similar Tools

FAQ

Sources

讨论

相关资产

Gorilla — LLM That Writes Accurate API Calls

llm.c — LLM Training in Simple Raw C/CUDA

KoboldCpp — Single-File Local LLM Inference Engine

LM Evaluation Harness — Unified LLM Benchmarking Framework