
I love this explanation. One unfortunately confusing extra piece that some people might occasionally run into is that there are two types of tag. Most tags are what you describe: pointers (git ref) to a commit (git object) and nothing more. These are usually referred to as lightweight tags.

There are also annotated tags, which carry a message, a tagger, a timestamp, and their own SHA, among other things. These are proper git objects that behave a lot like commit objects, except that they typically just refer to another git object (usually a commit).



I thought I understood tags...

But now what's a "proper git object"? Is there an improper git object? Is there a proper git non-object?


Underneath the SCM plumbing, the "true core" of git is a content-addressable object store. (See https://git-scm.com/book/en/v2/Git-Internals-Git-Objects)
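
As a rough illustration of "content-addressable": in the default SHA-1 object format, an object's ID is the hash of a short header (type, size, NUL byte) followed by the content. A minimal Python sketch of that, where the function name `git_object_id` is made up for this example and is not part of any git API:

```python
import hashlib


def git_object_id(obj_type: str, content: bytes) -> str:
    """Compute the ID git would assign to an object (SHA-1 object format)."""
    # The header is "<type> <size in bytes>" plus a NUL byte,
    # followed by the raw content.
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Should agree with `echo hello | git hash-object --stdin`.
    print(git_object_id("blob", b"hello\n"))
```

Because the ID is derived purely from type and content, two repos holding the same bytes agree on the name without any coordination.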

When you `git fetch`, git is asking the remote to walk a tree of objects — starting at the commit object that the ref points to — and deliver them to you, to unpack into your own object store.
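
To make the "walk from the ref" idea concrete, here is a toy sketch in Python. It is not the real wire protocol (real git negotiates what you already have and ships packfiles); the store layout, the IDs, and the `reachable` helper are all invented for illustration.

```python
from collections import deque

# Toy object store: object ID -> (type, IDs of the objects it refers to).
# The IDs are invented placeholders, not real hashes.
STORE = {
    "c2": ("commit", ["t2", "c1"]),  # commit -> its tree and its parent
    "c1": ("commit", ["t1"]),
    "t2": ("tree", ["b1", "b2"]),
    "t1": ("tree", ["b1"]),
    "b1": ("blob", []),
    "b2": ("blob", []),
    "zz": ("blob", []),  # not reachable from c2
}


def reachable(store, start, already_have=frozenset()):
    """Return the IDs reachable from `start`, skipping ones the receiver has."""
    seen = set()
    queue = deque([start])
    while queue:
        oid = queue.popleft()
        if oid in seen or oid in already_have:
            # Assumption: anything the receiver has, it has completely,
            # so the walk does not descend below it.
            continue
        seen.add(oid)
        queue.extend(store[oid][1])
    return seen
```

Calling `reachable(STORE, "c2")` collects both commits, both trees, and the blobs `b1` and `b2`, but never `zz`, since nothing reachable from the ref points at it.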

Git could in theory do a lot with just objects — with the whole "data state" of the repo (config, reflog, etc) just being objects, and then one toplevel journal file to track the hash of the newest versions of these state objects. (Sort of like how many DBMSes keep much of the config inside the database.)

But git mostly isn't designed to do this. Instead, git's higher SCM layers manage their state directly, outside of the object store, as files in well-known locations under .git/. This means that this higher-level state isn't part of the object-store synchronization step, and there must instead be a domain-specific synchronization step for each kind of SCM state metadata where applicable.

Tags are an interesting exception, though: the default "lightweight" tags are "high-level SCM metadata" of the kind that isn't held in the object store, while "annotated" tags become objects held in the object store.

(To be honest, I'm not sure what the benefit is of having "lightweight" tags that live outside the object store. To me, it looks like tags could just always be objects, and "lightweight" vs "annotated" should just determine the required fields of the data in the object. Maybe it's a legacy thing? Maybe third-party tooling parses lightweight tags out of the .git/ directory directly, and can't "see" annotated tags?)


Lightweight tags are simply references to commits that lie in refs/tags instead of refs/heads. Annotated tags are references to tag objects rather than commit objects. In both cases, the purpose of the reference is to give the object (tag or commit) a name.
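
A toy model of that distinction, sketched in Python. The dictionaries, IDs, names, and the `peel` helper are invented for illustration and only mimic the shape of what git stores:

```python
# Toy model: refs map names to object IDs; objects live in a separate store.
# All IDs and names here are invented placeholders.
OBJECTS = {
    "aaa111": {"type": "commit", "message": "Release prep"},
    "bbb222": {
        "type": "tag",
        "object": "aaa111",  # the object this tag annotates
        "tagger": "A. Developer",
        "message": "Version 1.0",
    },
}

REFS = {
    "refs/heads/main": "aaa111",
    "refs/tags/v1.0-light": "aaa111",  # lightweight: ref -> commit
    "refs/tags/v1.0": "bbb222",        # annotated: ref -> tag object -> commit
}


def peel(ref_name):
    """Follow a ref, then any chain of tag objects, to the underlying object ID."""
    oid = REFS[ref_name]
    while OBJECTS[oid]["type"] == "tag":
        oid = OBJECTS[oid]["object"]
    return oid
```

Both tags peel to the same commit; the only difference is whether a tag object sits in between. In a real repo you can see the same thing with `git cat-file -t <tagname>`, which prints `commit` for a lightweight tag and `tag` for an annotated one.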

I think no one uses lightweight tags anymore, except when you mistakenly push a commit to refs/tags/something.


"proper git object" = anything that has its own unique hash.



