[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"workflow-asset-73dc2715":3,"seo:featured-workflow:73dc2715-4ea4-11f1-9bc6-00163e2b0d79:fr":84,"workflow-related-asset-73dc2715-73dc2715-4ea4-11f1-9bc6-00163e2b0d79":85},{"id":4,"uuid":5,"slug":6,"title":7,"description":8,"author_id":9,"author_name":10,"author_avatar":11,"token_estimate":12,"time_saved":12,"model_used":13,"fork_count":12,"vote_count":12,"view_count":12,"parent_id":12,"parent_uuid":13,"lang_type":14,"steps":15,"tags":22,"has_voted":28,"visibility":18,"share_token":13,"is_featured":12,"content_hash":29,"asset_kind":30,"target_tools":31,"install_mode":35,"entrypoint":19,"risk_profile":36,"dependencies":38,"verification":44,"agent_metadata":47,"agent_fit":60,"trust":72,"provenance":81,"created_at":83,"updated_at":83},3579,"73dc2715-4ea4-11f1-9bc6-00163e2b0d79","asset-73dc2715","Snorkel — Programmatic Data Labeling for Machine Learning","Snorkel is a framework for building and managing training datasets programmatically using labeling functions, data augmentation, and slicing, replacing manual annotation with scalable automated approaches.","8a911193-3180-11f1-9bc6-00163e2b0d79","AI Open Source","https:\u002F\u002Ftokrepo.com\u002Fapple-touch-icon.png",0,"","en",[16],{"id":17,"step_order":18,"title":19,"description":13,"prompt_template":20,"variables":13,"depends_on":21,"expected_output":13},4139,1,"Snorkel Overview","# Snorkel — Programmatic Data Labeling for Machine Learning\n\n## Quick Use\n```bash\npip install snorkel\npython -c \"\nfrom snorkel.labeling import labeling_function, PandasLFApplier\nfrom snorkel.labeling.model import LabelModel\nimport pandas as pd\n\n@labeling_function()\ndef lf_contains_error(x):\n    # Vote 1 when the text mentions an error; -1 means abstain\n    return 1 if 'error' in x.text.lower() else -1\n\n# Build the label matrix: one row per data point, one column per labeling function\ndf = pd.DataFrame({'text': ['Disk error on node 3', 'All systems nominal']})\nL = PandasLFApplier(lfs=[lf_contains_error]).apply(df)\nprint(L)\n\n# Next: fit a LabelModel on L to turn the votes into probabilistic labels\n\"\n```\n\n## Introduction\nSnorkel is a data-centric AI framework from Stanford that replaces hand-labeling with programmatic labeling functions. 
Users write simple Python functions that encode heuristics, patterns, or external knowledge sources, and Snorkel combines their noisy outputs into high-quality training labels using a generative model.\n\n## What Snorkel Does\n- Lets users write labeling functions as simple Python heuristics\n- Combines multiple noisy label sources into probabilistic training labels\n- Models labeling function accuracy and correlations automatically\n- Supports data augmentation through transformation functions\n- Provides slicing functions for fine-grained model analysis\n\n## Architecture Overview\nSnorkel operates in three stages. First, labeling functions produce a label matrix where each row is a data point and each column is a labeling function's vote. Second, a generative label model learns the accuracy and correlation structure of the labeling functions without ground truth, producing probabilistic labels. Third, these soft labels train a downstream discriminative model (any standard classifier) that generalizes beyond the labeling function coverage.\n\n## Self-Hosting & Configuration\n- Install via pip: `pip install snorkel`\n- Define labeling functions as decorated Python functions\n- Apply labeling functions to your dataset with the built-in applier\n- Train the label model to estimate function accuracies\n- Feed probabilistic labels to any downstream ML framework\n\n## Key Features\n- Replaces manual labeling with programmatic heuristics at scale\n- Learns labeling function quality without any ground-truth labels\n- Handles conflicting and overlapping label sources automatically\n- Integrates with pandas DataFrames and standard ML pipelines\n- Supports transformation functions for data augmentation\n\n## Comparison with Similar Tools\n- **Label Studio** — manual annotation UI; Snorkel automates labeling with code\n- **Prodigy** — active learning annotation tool; Snorkel uses heuristic functions instead of human feedback\n- **Cleanlab** — detects label errors in existing 
datasets; Snorkel generates labels from scratch\n- **Argilla** — collaborative data curation; Snorkel focuses on programmatic weak supervision\n\n## FAQ\n**Q: Do labeling functions need to be perfect?**\nA: No. Snorkel is designed to work with noisy, incomplete labeling functions and automatically estimates their accuracy.\n\n**Q: How many labeling functions do I need?**\nA: Even 3-5 functions can produce useful labels. More functions with diverse signals generally improve quality.\n\n**Q: Does Snorkel replace all manual labeling?**\nA: It dramatically reduces the need for manual labels. A small validation set is still recommended for evaluation.\n\n**Q: Can I use LLMs as labeling functions?**\nA: Yes, wrapping an LLM call in a labeling function is a common and effective pattern.\n\n## Sources\n- https:\u002F\u002Fgithub.com\u002Fsnorkel-team\u002Fsnorkel\n- https:\u002F\u002Fwww.snorkel.org","0",[23],{"id":24,"name":25,"slug":26,"icon":27},12,"Configs","config","⚙️",false,"96cb5b47ff850dcd350e87f7d9ad285a15310ec13d8bb26c5acf7cefa77b588e","skill",[32,33,34],"claude_code","codex","gemini_cli","single",{"executes_code":28,"modifies_global_config":28,"requires_secrets":37,"uses_absolute_paths":28,"network_access":28},[],{"npm":39,"pip":40,"brew":42,"system":43},[],[41],"snorkel",[],[],{"commands":45,"expected_files":46},[],[19],{"asset_kind":30,"target_tools":48,"install_mode":35,"entrypoint":19,"risk_profile":49,"dependencies":51,"content_hash":29,"verification":56,"inferred":59},[32,33,34],{"executes_code":28,"modifies_global_config":28,"requires_secrets":50,"uses_absolute_paths":28,"network_access":28},[],{"npm":52,"pip":53,"brew":54,"system":55},[],[41],[],[],{"commands":57,"expected_files":58},[],[19],true,{"target":33,"score":61,"status":62,"policy":63,"why":64,"asset_kind":30,"install_mode":35},98,"native","allow",[65,66,67,68,69,70,71],"target_tools includes codex","asset_kind skill","install_mode single","markdown-only","policy allow","safe markdown-only 
Codex install","trust established",{"author_trust_level":73,"verified_publisher":28,"asset_signed_hash":29,"signature_status":74,"install_count":12,"report_count":12,"dangerous_capability_badges":75,"review_status":76,"signals":77},"established","hash_only",[],"unreviewed",[78,79,80],"author has published assets","content hash available","no dangerous capability badges",{"owner_uuid":9,"owner_name":10,"source_url":82,"content_hash":29,"visibility":18,"created_at":83,"updated_at":83},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Fasset-73dc2715","2026-05-13 16:19:25",null,[86,139,186,233],{"id":87,"uuid":88,"slug":89,"title":90,"description":91,"author_id":9,"author_name":10,"author_avatar":11,"token_estimate":12,"time_saved":12,"model_used":13,"fork_count":12,"vote_count":12,"view_count":92,"parent_id":12,"parent_uuid":13,"lang_type":14,"steps":93,"tags":94,"has_voted":28,"visibility":18,"share_token":13,"is_featured":12,"content_hash":96,"asset_kind":30,"target_tools":97,"install_mode":35,"entrypoint":98,"risk_profile":99,"dependencies":101,"verification":106,"agent_metadata":109,"agent_fit":121,"trust":123,"provenance":127,"created_at":129,"updated_at":130,"__relatedScore":131,"__relatedReasons":132,"__sharedTags":137},2333,"bb9da5f7-4361-11f1-9bc6-00163e2b0d79","nivo-rich-data-visualization-components-react-bb9da5f7","Nivo — Rich Data Visualization Components for React","A data visualization library built on D3 and React that provides ready-made, themeable chart components with server-side rendering 
support.",156,[],[95],{"id":24,"name":25,"slug":26,"icon":27},"66ae2ccf1e3292fb6da91b5e64819fc1f59f718f37589dc82def9822805c62fc",[32,33,34],"Nivo",{"executes_code":28,"modifies_global_config":28,"requires_secrets":100,"uses_absolute_paths":28,"network_access":28},[],{"npm":102,"pip":103,"brew":104,"system":105},[],[],[],[],{"commands":107,"expected_files":108},[],[98],{"asset_kind":30,"target_tools":110,"install_mode":35,"entrypoint":98,"risk_profile":111,"dependencies":113,"content_hash":96,"verification":118},[32,33,34],{"executes_code":28,"modifies_global_config":28,"requires_secrets":112,"uses_absolute_paths":28,"network_access":28},[],{"npm":114,"pip":115,"brew":116,"system":117},[],[],[],[],{"commands":119,"expected_files":120},[],[98],{"target":33,"score":61,"status":62,"policy":63,"why":122,"asset_kind":30,"install_mode":35},[65,66,67,68,69,70,71],{"author_trust_level":73,"verified_publisher":28,"asset_signed_hash":96,"signature_status":74,"install_count":12,"report_count":12,"dangerous_capability_badges":124,"review_status":76,"signals":125},[],[126,78,79,80],"asset has usage views",{"owner_uuid":9,"owner_name":10,"source_url":128,"content_hash":96,"visibility":18,"created_at":129,"updated_at":130},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Fnivo-rich-data-visualization-components-react-bb9da5f7","2026-04-29 08:24:06","2026-05-13 
05:16:07",83.29384947861385,[133,134,135,136],"topic-match","same-kind","same-target","same-author",[26,138],"configs",{"id":140,"uuid":141,"slug":142,"title":143,"description":144,"author_id":9,"author_name":10,"author_avatar":11,"token_estimate":12,"time_saved":12,"model_used":13,"fork_count":12,"vote_count":12,"view_count":145,"parent_id":12,"parent_uuid":13,"lang_type":14,"steps":146,"tags":147,"has_voted":28,"visibility":18,"share_token":13,"is_featured":12,"content_hash":149,"asset_kind":30,"target_tools":150,"install_mode":35,"entrypoint":151,"risk_profile":152,"dependencies":154,"verification":159,"agent_metadata":162,"agent_fit":174,"trust":176,"provenance":179,"created_at":181,"updated_at":182,"__relatedScore":183,"__relatedReasons":184,"__sharedTags":185},2349,"18e91308-43a5-11f1-9bc6-00163e2b0d79","seaborn-statistical-data-visualization-built-matplotlib-18e91308","Seaborn — Statistical Data Visualization Built on Matplotlib","Seaborn is a Python data visualization library built on top of Matplotlib that provides a high-level interface for drawing attractive and informative statistical graphics with minimal 
code.",130,[],[148],{"id":24,"name":25,"slug":26,"icon":27},"47a96bc1169508adb69ed7a225359a2fe4d120dfaa667b94556a3630ed657ec8",[32,33,34],"Seaborn",{"executes_code":28,"modifies_global_config":28,"requires_secrets":153,"uses_absolute_paths":28,"network_access":28},[],{"npm":155,"pip":156,"brew":157,"system":158},[],[],[],[],{"commands":160,"expected_files":161},[],[151],{"asset_kind":30,"target_tools":163,"install_mode":35,"entrypoint":151,"risk_profile":164,"dependencies":166,"content_hash":149,"verification":171},[32,33,34],{"executes_code":28,"modifies_global_config":28,"requires_secrets":165,"uses_absolute_paths":28,"network_access":28},[],{"npm":167,"pip":168,"brew":169,"system":170},[],[],[],[],{"commands":172,"expected_files":173},[],[151],{"target":33,"score":61,"status":62,"policy":63,"why":175,"asset_kind":30,"install_mode":35},[65,66,67,68,69,70,71],{"author_trust_level":73,"verified_publisher":28,"asset_signed_hash":149,"signature_status":74,"install_count":12,"report_count":12,"dangerous_capability_badges":177,"review_status":76,"signals":178},[],[126,78,79,80],{"owner_uuid":9,"owner_name":10,"source_url":180,"content_hash":149,"visibility":18,"created_at":181,"updated_at":182},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Fseaborn-statistical-data-visualization-built-matplotlib-18e91308","2026-04-29 16:26:19","2026-05-13 
16:25:31",83.17590694348365,[133,134,135,136],[26,138],{"id":187,"uuid":188,"slug":189,"title":190,"description":191,"author_id":9,"author_name":10,"author_avatar":11,"token_estimate":12,"time_saved":12,"model_used":13,"fork_count":12,"vote_count":12,"view_count":192,"parent_id":12,"parent_uuid":13,"lang_type":14,"steps":193,"tags":194,"has_voted":28,"visibility":18,"share_token":13,"is_featured":12,"content_hash":196,"asset_kind":30,"target_tools":197,"install_mode":35,"entrypoint":198,"risk_profile":199,"dependencies":201,"verification":206,"agent_metadata":209,"agent_fit":221,"trust":223,"provenance":226,"created_at":228,"updated_at":229,"__relatedScore":230,"__relatedReasons":231,"__sharedTags":232},2345,"b4b37485-43a4-11f1-9bc6-00163e2b0d79","protocol-buffers-language-neutral-data-serialization-google-b4b37485","Protocol Buffers — Language-Neutral Data Serialization by Google","Protocol Buffers (protobuf) is Google's language-neutral, platform-neutral mechanism for serializing structured data. 
It is smaller, faster, and simpler than XML or JSON for inter-service communication and data storage.",103,[],[195],{"id":24,"name":25,"slug":26,"icon":27},"a21728b0e9c6b4f40dd91b2dc244e38e416175e886518a021cb6701af2716b49",[32,33,34],"Protocol Buffers",{"executes_code":28,"modifies_global_config":28,"requires_secrets":200,"uses_absolute_paths":28,"network_access":28},[],{"npm":202,"pip":203,"brew":204,"system":205},[],[],[],[],{"commands":207,"expected_files":208},[],[198],{"asset_kind":30,"target_tools":210,"install_mode":35,"entrypoint":198,"risk_profile":211,"dependencies":213,"content_hash":196,"verification":218},[32,33,34],{"executes_code":28,"modifies_global_config":28,"requires_secrets":212,"uses_absolute_paths":28,"network_access":28},[],{"npm":214,"pip":215,"brew":216,"system":217},[],[],[],[],{"commands":219,"expected_files":220},[],[198],{"target":33,"score":61,"status":62,"policy":63,"why":222,"asset_kind":30,"install_mode":35},[65,66,67,68,69,70,71],{"author_trust_level":73,"verified_publisher":28,"asset_signed_hash":196,"signature_status":74,"install_count":12,"report_count":12,"dangerous_capability_badges":224,"review_status":76,"signals":225},[],[126,78,79,80],{"owner_uuid":9,"owner_name":10,"source_url":227,"content_hash":196,"visibility":18,"created_at":228,"updated_at":229},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Fprotocol-buffers-language-neutral-data-serialization-google-b4b37485","2026-04-29 16:23:31","2026-05-13 
05:21:10",83.02555000894817,[133,134,135,136],[26,138],{"id":234,"uuid":235,"slug":236,"title":237,"description":238,"author_id":9,"author_name":10,"author_avatar":11,"token_estimate":12,"time_saved":12,"model_used":13,"fork_count":12,"vote_count":12,"view_count":239,"parent_id":12,"parent_uuid":13,"lang_type":14,"steps":240,"tags":241,"has_voted":28,"visibility":18,"share_token":13,"is_featured":12,"content_hash":243,"asset_kind":30,"target_tools":244,"install_mode":35,"entrypoint":245,"risk_profile":246,"dependencies":248,"verification":253,"agent_metadata":256,"agent_fit":268,"trust":270,"provenance":273,"created_at":275,"updated_at":276,"__relatedScore":277,"__relatedReasons":278,"__sharedTags":279},2289,"af6eba92-42dc-11f1-9bc6-00163e2b0d79","open3d-modern-library-3d-data-processing-af6eba92","Open3D — Modern Library for 3D Data Processing","An open-source library for 3D data processing with fast implementations for point clouds, meshes, RGB-D images, and 3D visualization using both C++ and Python APIs.",99,[],[242],{"id":24,"name":25,"slug":26,"icon":27},"d64d86e1c435e0a24287dfaaf56c6e3aa7fd48aafffaa0f7aba8ca77b176ac32",[32,33,34],"Open3D 
Overview",{"executes_code":28,"modifies_global_config":28,"requires_secrets":247,"uses_absolute_paths":28,"network_access":28},[],{"npm":249,"pip":250,"brew":251,"system":252},[],[],[],[],{"commands":254,"expected_files":255},[],[245],{"asset_kind":30,"target_tools":257,"install_mode":35,"entrypoint":245,"risk_profile":258,"dependencies":260,"content_hash":243,"verification":265},[32,33,34],{"executes_code":28,"modifies_global_config":28,"requires_secrets":259,"uses_absolute_paths":28,"network_access":28},[],{"npm":261,"pip":262,"brew":263,"system":264},[],[],[],[],{"commands":266,"expected_files":267},[],[245],{"target":33,"score":61,"status":62,"policy":63,"why":269,"asset_kind":30,"install_mode":35},[65,66,67,68,69,70,71],{"author_trust_level":73,"verified_publisher":28,"asset_signed_hash":243,"signature_status":74,"install_count":12,"report_count":12,"dangerous_capability_badges":271,"review_status":76,"signals":272},[],[78,79,80],{"owner_uuid":9,"owner_name":10,"source_url":274,"content_hash":243,"visibility":18,"created_at":275,"updated_at":276},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Fopen3d-modern-library-3d-data-processing-af6eba92","2026-04-28 16:31:42","2026-05-13 15:43:50",83,[133,134,135,136],[26,138]]