NVIDIA

VERIFIED @nvidia

NVIDIA's open-source AI infrastructure: Triton Inference Server, NeMo, Megatron-LM, and TensorRT. Together they form the training and serving stack that most production AI runs on.

published

1.4k

total views

spotlight

Latest publication · 2026-05-02

🧠

Skills

TensorRT: High-Performance Deep Learning Inference by NVIDIA

NVIDIA's SDK for optimizing trained deep learning models for production inference, delivering low latency and high throughput on NVIDIA GPUs through graph optimization, kernel fusion, and precision calibration.

May 2, 2026

364

Megatron-LM: Train Transformer Models at Scale by NVIDIA

NVIDIA's research framework for efficient large-scale training of transformer models with tensor, pipeline, and sequence parallelism.

Apr 26, 2026

328

NVIDIA NeMo: Toolkit for Building and Training AI Models

NVIDIA NeMo is a scalable framework for building, training, and fine-tuning large language models, speech recognition models, and text-to-speech models. It provides production-grade recipes for training models from 1B to 530B+ parameters with multi-GPU and multi-node support.

Apr 22, 2026

335

NVIDIA Triton Inference Server: Multi-Framework Model Serving at Scale

Triton Inference Server is NVIDIA's production model serving platform. It deploys models from multiple frameworks and backends (PyTorch, TensorFlow, ONNX, TensorRT, Python) with dynamic batching, multi-model ensembles, and hardware-optimized inference.

Apr 14, 2026

364