[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"workflow-asset-5b7a9740":3,"seo:featured-workflow:5b7a9740-4ecb-11f1-9bc6-00163e2b0d79:fr":86,"workflow-related-asset-5b7a9740-5b7a9740-4ecb-11f1-9bc6-00163e2b0d79":87},{"id":4,"uuid":5,"slug":6,"title":7,"description":8,"author_id":9,"author_name":10,"author_avatar":11,"token_estimate":12,"time_saved":12,"model_used":13,"fork_count":12,"vote_count":12,"view_count":14,"parent_id":12,"parent_uuid":13,"lang_type":15,"steps":16,"tags":23,"has_voted":29,"visibility":19,"share_token":13,"is_featured":12,"content_hash":30,"asset_kind":31,"target_tools":32,"install_mode":36,"entrypoint":20,"risk_profile":37,"dependencies":39,"verification":45,"agent_metadata":48,"agent_fit":61,"trust":73,"provenance":82,"created_at":84,"updated_at":85},3619,"5b7a9740-4ecb-11f1-9bc6-00163e2b0d79","asset-5b7a9740","LakeFS — Git-Like Version Control for Data Lakes","LakeFS adds Git-like branching, committing, and merging to your data lake on S3, GCS, or Azure Blob Storage, enabling reproducible data pipelines and zero-copy experimentation.","8a911193-3180-11f1-9bc6-00163e2b0d79","AI Open Source","https:\u002F\u002Ftokrepo.com\u002Fapple-touch-icon.png",0,"",7,"en",[17],{"id":18,"step_order":19,"title":20,"description":13,"prompt_template":21,"variables":13,"depends_on":22,"expected_output":13},4179,1,"LakeFS Data Versioning","# LakeFS — Git-Like Version Control for Data Lakes\n\n## Quick Use\n```bash\n# Run LakeFS with Docker\ndocker run --pull always -p 8000:8000 treeverse\u002Flakefs run --local-settings\n\n# Install the CLI\npip install lakefs-cli\n\n# Create a repository backed by S3\nlakectl repo create lakefs:\u002F\u002Fmy-repo s3:\u002F\u002Fmy-bucket\u002Fdata\n\n# Create a branch and commit\nlakectl branch create lakefs:\u002F\u002Fmy-repo\u002Fexperiment -s lakefs:\u002F\u002Fmy-repo\u002Fmain\nlakectl commit lakefs:\u002F\u002Fmy-repo\u002Fexperiment -m \"Add training dataset v2\"\n```\n\n## Introduction\nLakeFS brings 
version control semantics to object storage. Data engineers can create branches, run experimental transformations in isolation, diff the results against production, and merge — all without copying data. It acts as a gateway that intercepts S3-compatible API calls and manages versioned metadata.\n\n## What LakeFS Does\n- Provides Git-like branching, committing, merging, and reverting for data stored in object storage\n- Exposes an S3-compatible API so existing tools (Spark, Trino, dbt, Airflow) work unchanged\n- Enables zero-copy branching — branches share underlying data until changes diverge\n- Tracks lineage and enables data diffing between any two references\n- Supports pre-merge and pre-commit hooks for data quality validation\n\n## Architecture Overview\nLakeFS runs as a stateless Go service backed by PostgreSQL (for metadata) and your existing object store (S3, GCS, or Azure) for data. When a client writes via the S3 gateway, LakeFS records the object in a branch-specific namespace. Commits create immutable snapshots of the metadata tree. 
Merges perform a three-way diff on metadata pointers, not on data bytes, making them fast regardless of dataset size.\n\n## Self-Hosting & Configuration\n- Deploy via Docker, Kubernetes Helm chart, or native binaries\n- Requires PostgreSQL (or DynamoDB on AWS) for metadata storage\n- Configure the blockstore backend (S3, GCS, Azure, or local filesystem)\n- Set up authentication via built-in users, LDAP, or OIDC\n- Integrate with Airflow, Spark, or dbt using the S3-compatible endpoint, where the repository acts as the bucket and the branch is the first path component\n\n## Key Features\n- Zero-copy branching — create branches instantly without duplicating data\n- S3-compatible gateway for transparent integration with any S3-aware tool\n- Pre-commit and pre-merge hooks for automated data validation\n- Web UI and CLI for browsing repositories, diffs, and commit history\n- Open source under the Apache 2.0 license with an active community\n\n## Comparison with Similar Tools\n- **Delta Lake** — table format with ACID transactions and time travel; LakeFS works at the object storage level across any file format\n- **DVC** — Git-based data versioning for ML experiments; LakeFS versions entire data lakes with branching semantics\n- **Apache Iceberg** — table format with snapshot isolation; LakeFS provides repository-level versioning independent of table format\n- **Nessie** — Git-like catalog for Iceberg tables; LakeFS is format-agnostic and operates at the storage layer\n\n## FAQ\n**Q: Does branching duplicate my data?**\nA: No. LakeFS uses copy-on-write at the metadata level. Branches share the same underlying objects until changes are made.\n\n**Q: Can I use LakeFS with Spark?**\nA: Yes. Point your Spark jobs at the LakeFS S3 gateway and address data with s3a:\u002F\u002F URIs of the form s3a:\u002F\u002Frepo\u002Fbranch\u002Fpath. No code changes are needed beyond updating the endpoint and paths.\n\n**Q: What happens if LakeFS goes down?**\nA: Data in the object store remains accessible directly. 
LakeFS only manages metadata; it does not move or transform your data.\n\n**Q: Does it support garbage collection?**\nA: Yes. A built-in GC process reclaims unreferenced objects from deleted branches or old commits.\n\n## Sources\n- https:\u002F\u002Fgithub.com\u002Ftreeverse\u002FlakeFS\n- https:\u002F\u002Fdocs.lakefs.io","0",[24],{"id":25,"name":26,"slug":27,"icon":28},12,"Configs","config","⚙️",false,"d7685c5d4848e1f0b405cc4150b91c654b8d113f53fb3706b06543fb59988103","skill",[33,34,35],"claude_code","codex","gemini_cli","single",{"executes_code":29,"modifies_global_config":29,"requires_secrets":38,"uses_absolute_paths":29,"network_access":29},[],{"npm":40,"pip":41,"brew":43,"system":44},[],[42],"lakefs-cli",[],[],{"commands":46,"expected_files":47},[],[20],{"asset_kind":31,"target_tools":49,"install_mode":36,"entrypoint":20,"risk_profile":50,"dependencies":52,"content_hash":30,"verification":57,"inferred":60},[33,34,35],{"executes_code":29,"modifies_global_config":29,"requires_secrets":51,"uses_absolute_paths":29,"network_access":29},[],{"npm":53,"pip":54,"brew":55,"system":56},[],[42],[],[],{"commands":58,"expected_files":59},[],[20],true,{"target":34,"score":62,"status":63,"policy":64,"why":65,"asset_kind":31,"install_mode":36},98,"native","allow",[66,67,68,69,70,71,72],"target_tools includes codex","asset_kind skill","install_mode single","markdown-only","policy allow","safe markdown-only Codex install","trust established",{"author_trust_level":74,"verified_publisher":29,"asset_signed_hash":30,"signature_status":75,"install_count":12,"report_count":12,"dangerous_capability_badges":76,"review_status":77,"signals":78},"established","hash_only",[],"unreviewed",[79,80,81],"author has published assets","content hash available","no dangerous capability badges",{"owner_uuid":9,"owner_name":10,"source_url":83,"content_hash":30,"visibility":19,"created_at":84,"updated_at":85},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Fasset-5b7a9740","2026-05-13 
20:57:54","2026-05-14 02:13:46",null,[88,144,191,246],{"id":89,"uuid":90,"slug":91,"title":92,"description":93,"author_id":94,"author_name":95,"author_avatar":11,"token_estimate":12,"time_saved":12,"model_used":13,"fork_count":12,"vote_count":12,"view_count":96,"parent_id":12,"parent_uuid":13,"lang_type":15,"steps":97,"tags":98,"has_voted":29,"visibility":19,"share_token":13,"is_featured":12,"content_hash":104,"asset_kind":31,"target_tools":105,"install_mode":36,"entrypoint":106,"risk_profile":107,"dependencies":109,"verification":114,"agent_metadata":117,"agent_fit":129,"trust":131,"provenance":134,"created_at":136,"updated_at":137,"__relatedScore":138,"__relatedReasons":139,"__sharedTags":143},2671,"e52ce9fe-4838-11f1-9bc6-00163e2b0d79","asset-e52ce9fe","Jujutsu (jj) — A Git-Compatible Next-Generation Version Control System","A version control system that combines the best ideas from Git, Mercurial, and Pijul with automatic rebasing, first-class conflicts, and a working-copy-as-commit model.","8a910e34-3180-11f1-9bc6-00163e2b0d79","Script Depot",92,[],[99],{"id":100,"name":101,"slug":102,"icon":103},11,"Scripts","script","📜","37c42c81101ad53d6310f2e285702a187532469a91734e814bd92a017b852321",[33,34,35],"Jujutsu 
Overview",{"executes_code":29,"modifies_global_config":29,"requires_secrets":108,"uses_absolute_paths":29,"network_access":29},[],{"npm":110,"pip":111,"brew":112,"system":113},[],[],[],[],{"commands":115,"expected_files":116},[],[106],{"asset_kind":31,"target_tools":118,"install_mode":36,"entrypoint":106,"risk_profile":119,"dependencies":121,"content_hash":104,"verification":126},[33,34,35],{"executes_code":29,"modifies_global_config":29,"requires_secrets":120,"uses_absolute_paths":29,"network_access":29},[],{"npm":122,"pip":123,"brew":124,"system":125},[],[],[],[],{"commands":127,"expected_files":128},[],[106],{"target":34,"score":62,"status":63,"policy":64,"why":130,"asset_kind":31,"install_mode":36},[66,67,68,69,70,71,72],{"author_trust_level":74,"verified_publisher":29,"asset_signed_hash":104,"signature_status":75,"install_count":12,"report_count":12,"dangerous_capability_badges":132,"review_status":77,"signals":133},[],[79,80,81],{"owner_uuid":94,"owner_name":95,"source_url":135,"content_hash":104,"visibility":19,"created_at":136,"updated_at":137},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Fasset-e52ce9fe","2026-05-05 12:14:22","2026-05-13 
16:17:57",105.9527244228309,[140,141,142],"topic-match","same-kind","same-target",[],{"id":145,"uuid":146,"slug":147,"title":148,"description":149,"author_id":94,"author_name":95,"author_avatar":11,"token_estimate":12,"time_saved":12,"model_used":13,"fork_count":12,"vote_count":12,"view_count":150,"parent_id":12,"parent_uuid":13,"lang_type":15,"steps":151,"tags":152,"has_voted":29,"visibility":19,"share_token":13,"is_featured":12,"content_hash":154,"asset_kind":31,"target_tools":155,"install_mode":36,"entrypoint":156,"risk_profile":157,"dependencies":159,"verification":164,"agent_metadata":167,"agent_fit":179,"trust":181,"provenance":184,"created_at":186,"updated_at":187,"__relatedScore":188,"__relatedReasons":189,"__sharedTags":190},1926,"ba761d71-3de4-11f1-9bc6-00163e2b0d79","pachyderm-data-versioning-pipeline-orchestration-ba761d71","Pachyderm — Data Versioning and Pipeline Orchestration","Version your data like Git, build reproducible data pipelines triggered by commits, and track lineage from raw input to model output on 
Kubernetes.",82,[],[153],{"id":100,"name":101,"slug":102,"icon":103},"f2c58ef3805c2fb1291520ddeedff04017982867c30cb74e7599d4ab3c40b8b3",[33,34,35],"Pachyderm",{"executes_code":29,"modifies_global_config":29,"requires_secrets":158,"uses_absolute_paths":29,"network_access":29},[],{"npm":160,"pip":161,"brew":162,"system":163},[],[],[],[],{"commands":165,"expected_files":166},[],[156],{"asset_kind":31,"target_tools":168,"install_mode":36,"entrypoint":156,"risk_profile":169,"dependencies":171,"content_hash":154,"verification":176},[33,34,35],{"executes_code":29,"modifies_global_config":29,"requires_secrets":170,"uses_absolute_paths":29,"network_access":29},[],{"npm":172,"pip":173,"brew":174,"system":175},[],[],[],[],{"commands":177,"expected_files":178},[],[156],{"target":34,"score":62,"status":63,"policy":64,"why":180,"asset_kind":31,"install_mode":36},[66,67,68,69,70,71,72],{"author_trust_level":74,"verified_publisher":29,"asset_signed_hash":154,"signature_status":75,"install_count":12,"report_count":12,"dangerous_capability_badges":182,"review_status":77,"signals":183},[],[79,80,81],{"owner_uuid":94,"owner_name":95,"source_url":185,"content_hash":154,"visibility":19,"created_at":186,"updated_at":187},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Fpachyderm-data-versioning-pipeline-orchestration-ba761d71","2026-04-22 08:46:41","2026-05-13 
16:11:48",95.87861713856411,[140,141,142],[],{"id":192,"uuid":193,"slug":194,"title":195,"description":196,"author_id":9,"author_name":10,"author_avatar":11,"token_estimate":12,"time_saved":12,"model_used":13,"fork_count":12,"vote_count":12,"view_count":197,"parent_id":12,"parent_uuid":13,"lang_type":15,"steps":198,"tags":199,"has_voted":29,"visibility":19,"share_token":13,"is_featured":12,"content_hash":201,"asset_kind":31,"target_tools":202,"install_mode":36,"entrypoint":203,"risk_profile":204,"dependencies":206,"verification":211,"agent_metadata":214,"agent_fit":226,"trust":233,"provenance":237,"created_at":239,"updated_at":240,"__relatedScore":241,"__relatedReasons":242,"__sharedTags":244},1458,"f8c7831d-391f-11f1-9bc6-00163e2b0d79","dolt-sql-database-you-can-fork-clone-branch-merge-f8c7831d","Dolt — The SQL Database You Can Fork, Clone, Branch and Merge","The world's first version-controlled SQL database. Dolt combines MySQL compatibility with Git-style branching, diffing and merging so schemas and data can be reviewed and pull-requested.",63,[],[200],{"id":25,"name":26,"slug":27,"icon":28},"c2f754d5e0fb073e4934bf96e9dc28c76f525fe8595b6932aeb3c72a40a8116b",[33,34,35],"Dolt Guide",{"executes_code":29,"modifies_global_config":29,"requires_secrets":205,"uses_absolute_paths":29,"network_access":60},[],{"npm":207,"pip":208,"brew":209,"system":210},[],[],[],[],{"commands":212,"expected_files":213},[],[203],{"asset_kind":31,"target_tools":215,"install_mode":36,"entrypoint":203,"risk_profile":216,"dependencies":218,"content_hash":201,"verification":223},[33,34,35],{"executes_code":29,"modifies_global_config":29,"requires_secrets":217,"uses_absolute_paths":29,"network_access":60},[],{"npm":219,"pip":220,"brew":221,"system":222},[],[],[],[],{"commands":224,"expected_files":225},[],[203],{"target":34,"score":227,"status":228,"policy":229,"why":230,"asset_kind":31,"install_mode":36},64,"needs_confirmation","confirm",[66,67,68,231,232,72],"policy 
confirm","risk_profile.network_access is true",{"author_trust_level":74,"verified_publisher":29,"asset_signed_hash":201,"signature_status":75,"install_count":12,"report_count":12,"dangerous_capability_badges":234,"review_status":77,"signals":236},[235],"network_access",[79,80],{"owner_uuid":9,"owner_name":10,"source_url":238,"content_hash":201,"visibility":19,"created_at":239,"updated_at":240},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Fdolt-sql-database-you-can-fork-clone-branch-merge-f8c7831d","2026-04-16 07:08:10","2026-05-13 16:13:12",95.70926996097583,[140,141,142,243],"same-author",[27,245],"configs",{"id":247,"uuid":248,"slug":249,"title":250,"description":251,"author_id":94,"author_name":95,"author_avatar":11,"token_estimate":12,"time_saved":12,"model_used":13,"fork_count":12,"vote_count":12,"view_count":252,"parent_id":12,"parent_uuid":13,"lang_type":15,"steps":253,"tags":254,"has_voted":29,"visibility":19,"share_token":13,"is_featured":12,"content_hash":256,"asset_kind":31,"target_tools":257,"install_mode":36,"entrypoint":258,"risk_profile":259,"dependencies":261,"verification":266,"agent_metadata":269,"agent_fit":281,"trust":283,"provenance":286,"created_at":288,"updated_at":289,"__relatedScore":290,"__relatedReasons":291,"__sharedTags":292},2552,"f581db76-46ca-11f1-9bc6-00163e2b0d79","asset-f581db76","TerminusDB — Document Graph Database with Git-Like Versioning","TerminusDB is a document graph database that versions your data like Git. 
It stores JSON documents with graph relationships, enabling branch, merge, diff, and time-travel operations on your entire dataset.",77,[],[255],{"id":100,"name":101,"slug":102,"icon":103},"9a9c7df9e95c17843dcbc25d550c4cc49f2f69187f9069e11d15415a171e2ae1",[33,34,35],"TerminusDB",{"executes_code":29,"modifies_global_config":29,"requires_secrets":260,"uses_absolute_paths":29,"network_access":29},[],{"npm":262,"pip":263,"brew":264,"system":265},[],[],[],[],{"commands":267,"expected_files":268},[],[258],{"asset_kind":31,"target_tools":270,"install_mode":36,"entrypoint":258,"risk_profile":271,"dependencies":273,"content_hash":256,"verification":278},[33,34,35],{"executes_code":29,"modifies_global_config":29,"requires_secrets":272,"uses_absolute_paths":29,"network_access":29},[],{"npm":274,"pip":275,"brew":276,"system":277},[],[],[],[],{"commands":279,"expected_files":280},[],[258],{"target":34,"score":62,"status":63,"policy":64,"why":282,"asset_kind":31,"install_mode":36},[66,67,68,69,70,71,72],{"author_trust_level":74,"verified_publisher":29,"asset_signed_hash":256,"signature_status":75,"install_count":12,"report_count":12,"dangerous_capability_badges":284,"review_status":77,"signals":285},[],[79,80,81],{"owner_uuid":94,"owner_name":95,"source_url":287,"content_hash":256,"visibility":19,"created_at":288,"updated_at":289},"https:\u002F\u002Ftokrepo.com\u002Fen\u002Fworkflows\u002Fasset-f581db76","2026-05-03 16:34:54","2026-05-13 21:47:06",94.83814190403572,[140,141,142],[]]