Data Version Control (DVC)

Created on 2023-05-12T20:19:17-05:00

This card pertains to a resource available on the internet.

File storage

Acts as a second layer of commands on top of (but not directly tied to) Git.

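You use DVC's own checkout, push, and pull commands, separate from Git's: dvc push and dvc pull move files to and from storage servers, and dvc checkout syncs the working copy with whatever the pointers currently reference.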

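Files are read from and written to external storage (such as S3-compatible services), while small pointer files with a .dvc extension are left behind in the repository. Ostensibly you then have to commit those pointer files to Git after such changes; the .dvc folder itself mainly holds configuration and a local cache.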
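
To make the pointer idea concrete, here is a minimal sketch in Python. It is not DVC's implementation: the function names, the .ptr extension, the JSON pointer layout, and the use of SHA-256 are all assumptions made for illustration. The large file goes into a store directory (standing in for the external service) under its content hash, and only a small pointer file stays behind to be committed.

```python
import hashlib
import json
import shutil
from pathlib import Path


def file_hash(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents (hash choice is an assumption)."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def track(data_path: Path, store_dir: Path) -> Path:
    """Copy the data into the store under its hash and write a small pointer file."""
    checksum = file_hash(data_path)
    store_dir.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(data_path, store_dir / checksum)
    # Hypothetical pointer format: a JSON file next to the data, named <file>.ptr
    pointer_path = data_path.with_name(data_path.name + ".ptr")
    pointer_path.write_text(json.dumps({"hash": checksum, "path": data_path.name}))
    return pointer_path


def restore(pointer_path: Path, store_dir: Path) -> Path:
    """Recreate the data file next to its pointer by looking up the hash in the store."""
    pointer = json.loads(pointer_path.read_text())
    target = pointer_path.with_name(pointer["path"])
    shutil.copyfile(store_dir / pointer["hash"], target)
    return target
```

In this sketch, track plays roughly the role of adding and pushing a file, and restore the role of pulling it back. Only the pointer file would need to live in Git.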

Make replacement

Includes a "pipeline" feature that is supposed to be "makefiles for machine learning."

It is supposed to associate input data with output data; I didn't look very closely into the pipeline mechanism.
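
Since the mechanism wasn't examined closely, the following is only a sketch of the general make-like idea, not of how DVC actually does it. The names run_stage, digest, and the JSON lock file are hypothetical. A stage declares its inputs and outputs, and its action is re-run only when an input's content hash differs from the one recorded last time, or when an output is missing.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Sequence


def digest(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def run_stage(
    name: str,
    deps: Sequence[Path],
    outs: Sequence[Path],
    action: Callable[[], None],
    lock_path: Path,
) -> bool:
    """Run action only if an input changed or an output is missing.

    Returns True if the action ran, False if the stage was already up to date.
    """
    # The lock file (hypothetical format) remembers input hashes from the last run.
    lock = json.loads(lock_path.read_text()) if lock_path.exists() else {}
    current = {str(dep): digest(dep) for dep in deps}
    up_to_date = lock.get(name) == current and all(out.exists() for out in outs)
    if up_to_date:
        return False
    action()
    lock[name] = current
    lock_path.write_text(json.dumps(lock, indent=2))
    return True
```

Unlike classic make, which compares file timestamps, this sketch compares content hashes, so merely touching an input without changing it does not trigger a re-run.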