Using Git attributes

Sergei KUDINOV

Jan 25, 2025


Git attributes are rarely covered when first learning Git, and even experienced software engineers may not know them because they are seldom used. However, on larger or open-source projects with multiple collaborators, they prevent recurring problems such as inconsistent line endings or unreadable diffs. This article introduces Git attributes and presents some of their use cases.

Introduction to Git attributes

Git attributes customize Git to behave differently for each repository. They apply path-specific settings to subdirectories or subsets of files. These settings can configure things such as end-of-line normalization for text files or diff algorithms for binary files.

Git attributes are set either in the .gitattributes file of one of your directories (usually the root of the project) or in the .git/info/attributes file. The latter is used when you don’t want the attributes to be committed in your project tree.
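As a minimal sketch of the difference between the two locations, the commands below create a throwaway repository and define the same attribute in both files. The attribute name myattr, its values and the file name notes.txt are made up for the illustration; git check-attr is the standard command to ask Git which attributes apply to a path.

```shell
# Throwaway repository in a temporary directory (for illustration only).
demo=$(mktemp -d)
git init --quiet "$demo"
cd "$demo"

# Committed with the project: every collaborator gets this setting.
# "myattr" is a made-up attribute name.
echo '*.txt myattr=shared' > .gitattributes

# Local to this clone: never committed, and it takes precedence.
mkdir -p .git/info
echo '*.txt myattr=local' > .git/info/attributes

# Ask Git which value applies to a path (the file does not have to exist).
effective=$(git check-attr myattr -- notes.txt)
echo "$effective"
```

Git should report the value local here, since .git/info/attributes has the highest precedence among the attribute files.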

Each line in these files is of the form:

pattern attr1 attr2 ...

pattern matches paths using the same rules as in .gitignore files, with a few exceptions:

  • negative patterns are forbidden
  • patterns that match a directory do not recursively match paths inside that directory (so using the trailing-slash path/ syntax is pointless in an attributes file; use path/** instead)
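The second exception is easy to trip over, so here is a small sketch of it in a throwaway repository. The attribute myattr and the docs and src directories are hypothetical names chosen for the example.

```shell
# Throwaway repository in a temporary directory (for illustration only).
demo=$(mktemp -d)
git init --quiet "$demo"
cd "$demo"

# "myattr" is a made-up attribute; "docs" and "src" are hypothetical directories.
cat > .gitattributes <<'EOF'
docs/   myattr
src/**  myattr
EOF

# The paths do not need to exist for git check-attr to evaluate the patterns.
by_slash=$(git check-attr myattr -- docs/guide.md)
by_glob=$(git check-attr myattr -- src/lib/main.c)
echo "$by_slash"
echo "$by_glob"
```

The file under docs/ should come back as unspecified because the trailing-slash pattern only matches the directory itself, while the file under src/ should be reported as set.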

Each attribute can be in one of these states for a given pattern:

  • Set
    It is specified just by listing the name of the attribute, for example, pattern attr1.
  • Unset
    It is specified by listing the name of the attribute prefixed with -, for example, pattern -attr1.
  • Set to a value
    It is specified by listing the name of the attribute followed by = and its value, for example, pattern attr1=value.
  • Unspecified
    No pattern matches the path, or none of the matching patterns mentions the attribute.
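The four states can be observed with git check-attr. The sketch below uses a made-up attribute (myattr) and made-up file extensions, one per state, in a throwaway repository.

```shell
# Throwaway repository in a temporary directory (for illustration only).
demo=$(mktemp -d)
git init --quiet "$demo"
cd "$demo"

# One line per state; "myattr" and the extensions are made-up names.
cat > .gitattributes <<'EOF'
*.set    myattr
*.unset  -myattr
*.value  myattr=hello
EOF

# git check-attr prints "<path>: <attribute>: <state>" for each path.
# a.other matches no pattern, so its attribute is unspecified.
states=$(git check-attr myattr -- a.set a.unset a.value a.other)
echo "$states"
```

The last column of the output should read set, unset, hello and unspecified, in that order.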

Here are some examples of attributes:

  • text - enables and controls end-of-line normalization
  • eol - sets a specific line-ending style to be used in the working directory
  • binary - a built-in macro attribute which unsets the diff, merge and text attributes, equivalent to -diff -merge -text
  • diff - affects how Git generates diffs for particular files

The detailed information and the full list of attributes are in the official documentation.

Use case: Configuring line endings

Every time you press Enter on your keyboard when writing a text document, you add an invisible character called a line ending. By default, Windows uses both a carriage return character and a line feed character (CRLF), represented by the \r\n control characters, whereas macOS and Linux systems use only the LF character (\n). Here’s a good article about the history of these characters.

Annoying conflicts can happen when you or your collaborators work in different environments and commit to the same Git repository. Anyone can commit files using different line endings, and when you save them, your editor may be configured to rewrite line endings to match your environment. As a result, Git detects a difference. You can run the git diff command to see the related warning:

warning: CRLF will be replaced by LF in path/to/file.
The file will have its original line endings in your working directory

To avoid such problems, you can configure Git to handle line endings automatically in the current repository. The following setting in the .gitattributes file (or in .git/info/attributes) treats all files as text files: line endings are converted to LF when files are committed, and files are checked out with LF on every platform. Without eol=lf, Git would by default use the platform's native line ending on checkout instead.

* text eol=lf

Note that because this setting treats every file as text, Git can corrupt the binary files in the repository when adding them to the index: any byte sequence that looks like a CRLF is rewritten. In the next section, we will learn how to avoid this situation.
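The normalization performed by * text eol=lf can be checked in a throwaway repository. This is only a sketch: the file name notes.txt is made up, and the git ls-files --eol option assumes a reasonably recent Git.

```shell
# Throwaway repository in a temporary directory (for illustration only).
demo=$(mktemp -d)
git init --quiet "$demo"
cd "$demo"
echo '* text eol=lf' > .gitattributes

# Write a file with Windows-style CRLF line endings (hypothetical file name).
printf 'first line\r\nsecond line\r\n' > notes.txt

# Staging triggers the conversion; Git may print a CRLF warning on stderr.
git add notes.txt

# "i/" is the line ending stored in the index, "w/" the one in the working tree.
eol_info=$(git ls-files --eol notes.txt)
echo "$eol_info"
```

The reported line should contain i/lf and w/crlf: the staged content uses LF even though the file on disk still has CRLF until Git next rewrites it.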

Use case: Identifying binary files

The binary attribute denotes all files that are truly binary. Git understands that the files specified are not text, and it will not try to change them. For example, for the JPG and PNG files, you can put the following settings:

*.png binary
*.jpg binary
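To see why this matters, the sketch below stages the same four bytes twice in a throwaway repository, once as a file treated as text and once as a file declared binary. The file names are made up, and the bytes merely stand in for real binary data that happens to contain a CR LF pair.

```shell
# Throwaway repository in a temporary directory (for illustration only).
demo=$(mktemp -d)
git init --quiet "$demo"
cd "$demo"

# Four bytes containing a CR LF pair, standing in for binary data.
printf 'A\r\nB' > as-text.dat
printf 'A\r\nB' > as-binary.png

# Everything is text by default, but *.png is declared binary.
cat > .gitattributes <<'EOF'
* text eol=lf
*.png binary
EOF

# Git may print a CRLF warning on stderr for the file treated as text.
git add as-text.dat as-binary.png

# Size in bytes of each blob as stored in the index.
text_size=$(git cat-file -s :as-text.dat)
binary_size=$(git cat-file -s :as-binary.png)
echo "as-text.dat: $text_size bytes, as-binary.png: $binary_size bytes"
```

The file treated as text loses its carriage return and is stored as 3 bytes, while the file marked binary keeps all 4 bytes.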

binary is a built-in macro attribute which is equivalent to:

-diff -merge -text

You can define custom macro attributes in the top-level Git attributes files of a repository.
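The expansion of a macro can be inspected with git check-attr. In the sketch below, vendored is a made-up macro name and vendor a hypothetical directory; the [attr] prefix is the syntax Git uses to define a macro.

```shell
# Throwaway repository in a temporary directory (for illustration only).
demo=$(mktemp -d)
git init --quiet "$demo"
cd "$demo"

# "vendored" is a made-up macro name and "vendor" a hypothetical directory.
cat > .gitattributes <<'EOF'
[attr]vendored -diff -merge
*.png      binary
vendor/**  vendored
EOF

# Query the individual attributes that each macro expands to.
png_attrs=$(git check-attr diff merge text -- logo.png)
vendor_attrs=$(git check-attr diff merge text -- vendor/lib.js)
echo "$png_attrs"
echo "$vendor_attrs"
```

For logo.png, all three attributes should be reported as unset. For vendor/lib.js, diff and merge should be unset while text stays unspecified, since the custom macro does not mention it.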

Use case: Diffing binary files

Git can be configured to diff binary files. This is very helpful, for example, when versioning documents such as Microsoft Word or OpenDocument files, whose changes would otherwise be unreadable.

The following setting tells Git how to diff Microsoft Word files:

*.docx diff=word

This tells Git that any .docx file should use the word filter when you try to view a diff. This filter does not exist by default: it must be defined in your Git configuration, typically so that it runs the docx2txt program, which converts Word documents into readable text. You can follow this tutorial to learn how to install and apply it.
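A possible setup is sketched below in a throwaway repository. It relies on assumptions that may not match your installation: the converter is assumed to be available on the PATH as docx2txt.pl and to print to standard output when given - as the output file, and the wrapper name docx2txt-stdout.sh is made up. The diff.word.textconv configuration key is how Git associates a text conversion program with the word filter.

```shell
# Throwaway repository in a temporary directory (for illustration only).
demo=$(mktemp -d)
git init --quiet "$demo"
cd "$demo"
echo '*.docx diff=word' > .gitattributes

# Hypothetical wrapper script. Assumption: docx2txt.pl is installed and
# writes the converted text to standard output when the output file is "-".
cat > docx2txt-stdout.sh <<'EOF'
#!/bin/sh
# Git passes the path of the file to convert as the first argument.
exec docx2txt.pl "$1" -
EOF
chmod +x docx2txt-stdout.sh

# Define the "word" filter: convert files to text before diffing them.
git config diff.word.textconv "$PWD/docx2txt-stdout.sh"

# Confirm that .docx files are routed to the "word" filter.
docx_attr=$(git check-attr diff -- report.docx)
echo "$docx_attr"
```

With this in place, git diff on a tracked .docx file would show the differences between the converted text versions instead of reporting that binary files differ.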

Conclusion

This article briefly introduced Git attributes and presented some of their use cases.
