[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"workflow-asset-e48c6746":3,"seo:featured-workflow:e48c6746-4f09-11f1-9bc6-00163e2b0d79:en":84,"workflow-related-asset-e48c6746-e48c6746-4f09-11f1-9bc6-00163e2b0d79":85},{"id":4,"uuid":5,"slug":6,"title":7,"description":8,"author_id":9,"author_name":10,"author_avatar":11,"token_estimate":12,"time_saved":12,"model_used":13,"fork_count":12,"vote_count":12,"view_count":12,"parent_id":12,"parent_uuid":13,"lang_type":14,"steps":15,"tags":22,"has_voted":28,"visibility":18,"share_token":13,"is_featured":12,"content_hash":29,"asset_kind":30,"target_tools":31,"install_mode":35,"entrypoint":19,"risk_profile":36,"dependencies":38,"verification":44,"agent_metadata":47,"agent_fit":60,"trust":72,"provenance":81,"created_at":83,"updated_at":83},3664,"e48c6746-4f09-11f1-9bc6-00163e2b0d79","asset-e48c6746","minimind — Train a 64M-Parameter LLM from Scratch in 2 Hours","An open-source educational project that lets you train a small but functional language model from scratch on consumer hardware in about two hours, covering the full LLM training pipeline.","8a911193-3180-11f1-9bc6-00163e2b0d79","AI Open Source","https:\u002F\u002Ftokrepo.com\u002Fapple-touch-icon.png",0,"","en",[16],{"id":17,"step_order":18,"title":19,"description":13,"prompt_template":20,"variables":13,"depends_on":21,"expected_output":13},4238,1,"minimind Overview","# minimind — Train a 64M-Parameter LLM from Scratch in 2 Hours\n\n## Quick Use\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fjingyaogong\u002Fminimind.git\ncd minimind\npip install -r requirements.txt\npython train_pretrain.py\npython train_sft.py\npython web_demo.py\n```\n\n## Introduction\nminimind is an open-source educational project that demystifies LLM training by providing a complete pipeline to train a 64M-parameter language model from scratch in approximately two hours on a single consumer GPU. It covers pretraining, supervised fine-tuning, and DPO alignment.\n\n## What minimind Does\n- Trains a compact language model from scratch with full pretraining on a text corpus\n- Implements supervised fine-tuning (SFT) for instruction-following capabilities\n- Includes DPO (Direct Preference Optimization) for basic alignment\n- Provides an interactive web demo for chatting with the trained model\n- Documents every training stage with clear explanations in both Chinese and English\n\n## Architecture Overview\nminimind implements a decoder-only transformer architecture with rotary position embeddings, grouped query attention, and SwiGLU activation. The model uses a custom tokenizer trained on the same corpus. The training pipeline is built with PyTorch and supports distributed training via DDP, though a single GPU is sufficient for the default 64M configuration.\n\n## Self-Hosting & Configuration\n- Requires Python 3.9+ with PyTorch and a CUDA GPU (minimum 8GB VRAM)\n- Pretraining data is included or can be replaced with custom text corpora\n- Training configs control model size (26M to 218M parameters), learning rate, and batch size\n- The web demo runs locally with Gradio, accessible through a browser\n- Full training from scratch completes in about 2 hours on an RTX 3090\n\n## Key Features\n- End-to-end LLM training in minimal, readable code with extensive documentation\n- Multiple model sizes from 26M to 218M parameters for different hardware budgets\n- Complete pipeline covering tokenizer training, pretraining, SFT, and DPO alignment\n- Bilingual documentation (Chinese and English) making it accessible to a global audience\n- Modular design allows swapping components like attention mechanisms and position encodings\n\n## Comparison with Similar Tools\n- **nanochat** — Karpathy's chat-focused trainer; minimind focuses on the full pretraining pipeline with smaller models\n- **nanoGPT** — pretraining only; minimind adds SFT and DPO stages for a complete chat model\n- **LitGPT** — production fine-tuning toolkit; minimind prioritizes educational clarity over feature completeness\n- **Axolotl** — advanced fine-tuning; minimind teaches fundamentals with a from-scratch approach\n\n## FAQ\n**Q: Can the trained model actually hold conversations?**\nA: Yes. The 64M model handles simple conversations. Larger configs (218M) produce noticeably better results.\n\n**Q: What GPU is required?**\nA: An 8GB VRAM GPU (e.g., RTX 3060) works for the smallest model. 16GB+ recommended for larger configs.\n\n**Q: Is this useful beyond education?**\nA: The codebase serves as a starting point for custom small model development and domain-specific training experiments.\n\n**Q: How does it compare to fine-tuning a pretrained model?**\nA: Training from scratch produces weaker models but provides complete understanding of the LLM pipeline. For production, fine-tuning is more practical.\n\n## Sources\n- https:\u002F\u002Fgithub.com\u002Fjingyaogong\u002Fminimind","0",[23],{"id":24,"name":25,"slug":26,"icon":27},12,"Configs","config","⚙️",false,"bc7639e48bd2624da43369a8a46b9801375833cf0e3cfd2bb729a20a0066449d","skill",[32,33,34],"claude_code","codex","gemini_cli","single",{"executes_code":28,"modifies_global_config":28,"requires_secrets":37,"uses_absolute_paths":28,"network_access":28},[],{"npm":39,"pip":40,"brew":42,"system":43},[],[41],"requirements.txt",[],[],{"commands":45,"expected_files":46},[],[19],{"asset_kind":30,"target_tools":48,"install_mode":35,"entrypoint":19,"risk_profile":49,"dependencies":51,"content_hash":29,"verification":56,"inferred":59},[32,33,34],{"executes_code":28,"modifies_global_config":28,"requires_secrets":50,"uses_absolute_paths":28,"network_access":28},[],{"npm":52,"pip":53,"brew":54,"system":55},[],[41],[],[],{"commands":57,"expected_files":58},[],[19],true,{"target":33,"score":61,"status":62,"policy":63,"why":64,"asset_kind":30,"install_mode":35},98,"native","allow",[65,66,67,68,69,70,71],"target_tools includes codex","asset_kind skill","install_mode single","markdown-only","policy allow","safe markdown-only Codex install","trust established",{"author_trust_level":73,"verified_publisher":28,"asset_signed_hash":29,"signature_status":74,"install_count":12,"report_count":12,"dangerous_capability_badges":75,"review_status":76,"signals":77},"established","hash_only",[],"unreviewed",[78,79,80],"author has published assets","content hash available","no dangerous capability badges",{"owner_uuid":9,"owner_name":10,"source_url":82,"content_hash":29,"visibility":18,"created_at":83,"updated_at":83},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Fasset-e48c6746","2026-05-14 04:25:33",null,[86,138,185,232],{"id":87,"uuid":88,"slug":89,"title":90,"description":91,"author_id":9,"author_name":10,"author_avatar":11,"token_estimate":12,"time_saved":12,"model_used":13,"fork_count":12,"vote_count":12,"view_count":92,"parent_id":12,"parent_uuid":13,"lang_type":14,"steps":93,"tags":94,"has_voted":28,"visibility":18,"share_token":13,"is_featured":12,"content_hash":96,"asset_kind":30,"target_tools":97,"install_mode":35,"entrypoint":98,"risk_profile":99,"dependencies":101,"verification":106,"agent_metadata":109,"agent_fit":121,"trust":123,"provenance":126,"created_at":128,"updated_at":129,"__relatedScore":130,"__relatedReasons":131,"__sharedTags":136},2177,"e1ea2334-416b-11f1-9bc6-00163e2b0d79","llm-foundry-llm-training-code-foundation-models-databricks-e1ea2334","LLM Foundry — LLM Training Code for Foundation Models by Databricks","An open-source library for training, fine-tuning, and evaluating large language models, built on the Composer training library by MosaicML\u002FDatabricks.",87,[],[95],{"id":24,"name":25,"slug":26,"icon":27},"554d207a571afa3c06ba3ff606a9c104b0022d65f0917a751b69e0f562da0778",[32,33,34],"LLM Foundry",{"executes_code":28,"modifies_global_config":28,"requires_secrets":100,"uses_absolute_paths":28,"network_access":28},[],{"npm":102,"pip":103,"brew":104,"system":105},[],[],[],[],{"commands":107,"expected_files":108},[],[98],{"asset_kind":30,"target_tools":110,"install_mode":35,"entrypoint":98,"risk_profile":111,"dependencies":113,"content_hash":96,"verification":118},[32,33,34],{"executes_code":28,"modifies_global_config":28,"requires_secrets":112,"uses_absolute_paths":28,"network_access":28},[],{"npm":114,"pip":115,"brew":116,"system":117},[],[],[],[],{"commands":119,"expected_files":120},[],[98],{"target":33,"score":61,"status":62,"policy":63,"why":122,"asset_kind":30,"install_mode":35},[65,66,67,68,69,70,71],{"author_trust_level":73,"verified_publisher":28,"asset_signed_hash":96,"signature_status":74,"install_count":12,"report_count":12,"dangerous_capability_badges":124,"review_status":76,"signals":125},[],[78,79,80],{"owner_uuid":9,"owner_name":10,"source_url":127,"content_hash":96,"visibility":18,"created_at":128,"updated_at":129},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Fllm-foundry-llm-training-code-foundation-models-databricks-e1ea2334","2026-04-26 20:31:43","2026-05-13 00:14:44",88.91672400822526,[132,133,134,135],"topic-match","same-kind","same-target","same-author",[26,137],"configs",{"id":139,"uuid":140,"slug":141,"title":142,"description":143,"author_id":9,"author_name":10,"author_avatar":11,"token_estimate":12,"time_saved":12,"model_used":13,"fork_count":12,"vote_count":12,"view_count":144,"parent_id":12,"parent_uuid":13,"lang_type":14,"steps":145,"tags":146,"has_voted":28,"visibility":18,"share_token":13,"is_featured":12,"content_hash":148,"asset_kind":30,"target_tools":149,"install_mode":35,"entrypoint":150,"risk_profile":151,"dependencies":153,"verification":158,"agent_metadata":161,"agent_fit":173,"trust":175,"provenance":178,"created_at":180,"updated_at":181,"__relatedScore":182,"__relatedReasons":183,"__sharedTags":184},2511,"f07fee9a-45df-11f1-9bc6-00163e2b0d79","asset-f07fee9a","GPT-NeoX — Open-Source Large Language Model Training Library","A GPU-optimized library by EleutherAI for training large-scale autoregressive language models. GPT-NeoX powered the training of GPT-NeoX-20B and Pythia, providing the open-source community with tools for billion-parameter model training.",92,[],[147],{"id":24,"name":25,"slug":26,"icon":27},"4480e03ed6f0fdc0567c6e7e2b22bb3fa2f85d933d3e13f190dc95c13dff7289",[32,33,34],"GPT-NeoX Overview",{"executes_code":28,"modifies_global_config":28,"requires_secrets":152,"uses_absolute_paths":28,"network_access":28},[],{"npm":154,"pip":155,"brew":156,"system":157},[],[],[],[],{"commands":159,"expected_files":160},[],[150],{"asset_kind":30,"target_tools":162,"install_mode":35,"entrypoint":150,"risk_profile":163,"dependencies":165,"content_hash":148,"verification":170},[32,33,34],{"executes_code":28,"modifies_global_config":28,"requires_secrets":164,"uses_absolute_paths":28,"network_access":28},[],{"npm":166,"pip":167,"brew":168,"system":169},[],[],[],[],{"commands":171,"expected_files":172},[],[150],{"target":33,"score":61,"status":62,"policy":63,"why":174,"asset_kind":30,"install_mode":35},[65,66,67,68,69,70,71],{"author_trust_level":73,"verified_publisher":28,"asset_signed_hash":148,"signature_status":74,"install_count":12,"report_count":12,"dangerous_capability_badges":176,"review_status":76,"signals":177},[],[78,79,80],{"owner_uuid":9,"owner_name":10,"source_url":179,"content_hash":148,"visibility":18,"created_at":180,"updated_at":181},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Fasset-f07fee9a","2026-05-02 12:32:34","2026-05-14 04:49:59",86.9527244228309,[132,133,134,135],[26,137],{"id":186,"uuid":187,"slug":188,"title":189,"description":190,"author_id":9,"author_name":10,"author_avatar":11,"token_estimate":12,"time_saved":12,"model_used":13,"fork_count":12,"vote_count":12,"view_count":191,"parent_id":12,"parent_uuid":13,"lang_type":14,"steps":192,"tags":193,"has_voted":28,"visibility":18,"share_token":13,"is_featured":12,"content_hash":195,"asset_kind":30,"target_tools":196,"install_mode":35,"entrypoint":197,"risk_profile":198,"dependencies":200,"verification":205,"agent_metadata":208,"agent_fit":220,"trust":222,"provenance":225,"created_at":227,"updated_at":228,"__relatedScore":229,"__relatedReasons":230,"__sharedTags":231},2934,"fa6e0b07-4c49-11f1-9bc6-00163e2b0d79","asset-fa6e0b07","Liger-Kernel — Efficient GPU Kernels for LLM Training","Liger-Kernel provides optimized Triton kernels for LLM training that reduce GPU memory usage and improve throughput, serving as drop-in replacements for standard HuggingFace Transformers layers.",35,[],[194],{"id":24,"name":25,"slug":26,"icon":27},"e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",[32,33,34],"SKILL.md",{"executes_code":28,"modifies_global_config":28,"requires_secrets":199,"uses_absolute_paths":28,"network_access":28},[],{"npm":201,"pip":202,"brew":203,"system":204},[],[],[],[],{"commands":206,"expected_files":207},[],[],{"asset_kind":30,"target_tools":209,"install_mode":35,"entrypoint":197,"risk_profile":210,"dependencies":212,"content_hash":195,"verification":217,"inferred":59},[32,33,34],{"executes_code":28,"modifies_global_config":28,"requires_secrets":211,"uses_absolute_paths":28,"network_access":28},[],{"npm":213,"pip":214,"brew":215,"system":216},[],[],[],[],{"commands":218,"expected_files":219},[],[],{"target":33,"score":61,"status":62,"policy":63,"why":221,"asset_kind":30,"install_mode":35},[65,66,67,68,69,70,71],{"author_trust_level":73,"verified_publisher":28,"asset_signed_hash":195,"signature_status":74,"install_count":12,"report_count":12,"dangerous_capability_badges":223,"review_status":76,"signals":224},[],[78,79,80],{"owner_uuid":9,"owner_name":10,"source_url":226,"content_hash":195,"visibility":18,"created_at":227,"updated_at":228},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Fasset-fa6e0b07","2026-05-10 16:26:44","2026-05-13 20:50:59",85.33445375115093,[132,133,134,135],[26,137],{"id":233,"uuid":234,"slug":235,"title":236,"description":237,"author_id":9,"author_name":10,"author_avatar":11,"token_estimate":238,"time_saved":12,"model_used":239,"fork_count":12,"vote_count":12,"view_count":240,"parent_id":12,"parent_uuid":13,"lang_type":14,"steps":241,"tags":242,"has_voted":28,"visibility":18,"share_token":13,"is_featured":12,"content_hash":244,"asset_kind":30,"target_tools":245,"install_mode":35,"entrypoint":236,"risk_profile":248,"dependencies":250,"verification":255,"agent_metadata":258,"agent_fit":270,"trust":277,"provenance":281,"created_at":283,"updated_at":284,"__relatedScore":285,"__relatedReasons":286,"__sharedTags":287},259,"a69b498a-76d7-4cb4-b4fd-d4006a89b5a0","unsloth-2x-faster-local-llm-training-inference-a69b498a","Unsloth — 2x Faster Local LLM Training & Inference","Unsloth is a unified local interface for running and training AI models. 58.7K+ GitHub stars. 2x faster training with 70% less VRAM across 500+ models including Qwen, DeepSeek, Llama, Gemma. Web UI wi",500,"Claude Code",95,[],[243],{"id":24,"name":25,"slug":26,"icon":27},"4d5f936d395ce536af5eab32ac8898d00a65fd0dcfd08938c4354f5b00ed7690",[32,33,246,34,247],"cursor","windsurf",{"executes_code":28,"modifies_global_config":28,"requires_secrets":249,"uses_absolute_paths":28,"network_access":59},[],{"npm":251,"pip":252,"brew":253,"system":254},[],[],[],[],{"commands":256,"expected_files":257},[],[236],{"asset_kind":30,"target_tools":259,"install_mode":35,"entrypoint":236,"risk_profile":260,"dependencies":262,"content_hash":244,"verification":267},[32,33,246,34,247],{"executes_code":28,"modifies_global_config":28,"requires_secrets":261,"uses_absolute_paths":28,"network_access":59},[],{"npm":263,"pip":264,"brew":265,"system":266},[],[],[],[],{"commands":268,"expected_files":269},[],[236],{"target":33,"score":271,"status":272,"policy":273,"why":274,"asset_kind":30,"install_mode":35},64,"needs_confirmation","confirm",[65,66,67,275,276,71],"policy confirm","risk_profile.network_access is true",{"author_trust_level":73,"verified_publisher":28,"asset_signed_hash":244,"signature_status":74,"install_count":12,"report_count":12,"dangerous_capability_badges":278,"review_status":76,"signals":280},[279],"network_access",[78,79],{"owner_uuid":9,"owner_name":10,"source_url":282,"content_hash":244,"visibility":18,"created_at":283,"updated_at":284},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Funsloth-2x-faster-local-llm-training-inference-a69b498a","2026-03-31 19:35:42","2026-05-14 02:41:16",79.97340684955935,[132,133,134,135],[26,137]]