
I'm trying to use Git and GitHub to sync a number of app configuration files. These are XML or plist files stored in a binary format, for example a Keyboard Maestro .kmsync file.

I can open these files via a text editor to see an XML format.

But when I view diffs of these files in a GitHub Pull Request, commit view, etc., I see a useless binary diff with no visible changes:

Showing with 0 additions and 0 deletions.
BIN +17 Bytes (100%)
Binary file not shown.

I can get a text-based diff to display locally in Git via a .gitattributes file. However, it appears that GitHub doesn't respect these settings:

GitHub doesn't use .gitattributes files for choosing which files to show in a diff, so it's not possible to get around this that way. [source]

I want to see the text-based changes and line diffs when I view these files on GitHub in my commits and Pull Requests.

For example, the GitHub PR here. Feel free to fork and experiment:
https://github.com/pkamb/so/pull/1

How can I convince the web view of a GitHub repo to use text-based diffing for certain "binary" files?


I cannot find an existing question for my specific ask (displaying a non-binary diff on GitHub).

The following questions ask about this same behavior, but for local Git (not GitHub).

My question is the opposite of this question, which seeks to display text files as binary files on GitHub:

  • How about a link to the corresponding .kmsync file(s) in your repository and also a link to your .gitattributes file? That would make the problem easier to reproduce and less theoretical. I do not think it is a good idea to expect all people viewing your question (42 already at the time of writing this) to each create a sample repository, trying to replicate your situation. Commented Oct 13, 2020 at 12:54
  • I just checked. The .kmsync file is a binary file! I thought it is an XML file accidentally treated as a binary. You said you can open that file in a text editor and see XML. That is impossible, unless your editor knows how to expand the binary format. How would GitHub know that? Please explain. I think you are asking a bit too much there. A .gitattributes file cannot make a binary file magically into a text file, only tell Git which files to treat as binary or text, in case it does not do the right thing automatically. This does not work locally in Git, so why would it work on GitHub? Commented Oct 14, 2020 at 3:44
  • Then it is a special editor or an editor with a special plugin for that kind of file. Maybe under the hood it uses plutil -convert as mentioned in the thread you linked to. My editors here on Windows and my IDE definitely cannot open it. Commented Oct 14, 2020 at 4:26
  • What you should do instead is to save XML files in your repository and convert them to binary format during the build or deploy process, if you need those files, not the other way around. This is an SCM (source code management) best practice. Then you would not have any headaches concerning diffs anymore either. Commented Oct 14, 2020 at 4:30
  • I don't think this is possible on GitHub. There is special handling for some file types like pdf but I don't think you can configure this. Commented Oct 14, 2020 at 5:56

1 Answer


There isn't a way to force GitHub to display these files as text because they are not text. When GitHub renders files as part of an HTML page, they must be in some encoding, and the only reasonable choice for encodings these days is UTF-8. These files cannot be displayed as-is as UTF-8 because they contain byte sequences that are not valid in UTF-8, in addition to control characters, which generally cannot be rendered well in a web page.
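As a rough illustration of the encoding point, the sketch below checks whether a byte string decodes as UTF-8. The binary sample is made up: it only assumes the file begins with the `bplist00` magic of a binary property list, which the question does not confirm for .kmsync files, and `is_valid_utf8` is a hypothetical helper name.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Plain XML text is valid UTF-8.
xml_sample = b'<?xml version="1.0" encoding="UTF-8"?><plist/>'

# Made-up bytes (assumption): the "bplist00" magic followed by raw binary.
# 0xD1 starts a two-byte UTF-8 sequence, but 0x01 is not a continuation byte,
# so decoding fails.
binary_sample = b"bplist00\xd1\x01\x02"

print(is_valid_utf8(xml_sample))
print(is_valid_utf8(binary_sample))
```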

It is possible to convert these files to text for diffing by setting the diff attribute in a .gitattributes file and the matching diff.*.textconv option in your Git config file. This works great on your machine, but it won't work on GitHub. First of all, GitHub doesn't have your tool for rendering files, and secondly, GitHub doesn't support external programs for rendering files in general, mostly for security reasons. Some common formats are supported, but this is not one of them.

Also note that the program to be used is stored in the Git configuration and not in the .gitattributes file; this is intentional, since shipping a list of programs to execute in the repository is a security problem. Therefore, GitHub can't possibly even know the program you'd be using here.
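For the local-only setup described above, a textconv program could look roughly like this. It is a sketch that assumes the .kmsync files are binary property lists readable by Python's `plistlib`; the driver name `plist`, the script path, and the function names are hypothetical. Git calls the textconv program with the path of the file to convert and reads the text from standard output.

```python
#!/usr/bin/env python3
"""Hypothetical textconv helper: print a binary plist as XML text."""
import plistlib
import sys


def plist_to_xml(data: bytes) -> str:
    """Parse a plist (binary or XML) and return it as XML text."""
    obj = plistlib.loads(data)
    # Sorted keys keep the output stable between runs, which keeps diffs small.
    return plistlib.dumps(obj, fmt=plistlib.FMT_XML, sort_keys=True).decode("utf-8")


def main(path: str) -> None:
    """Read the file Git passes in and write its XML form to stdout."""
    with open(path, "rb") as fp:
        sys.stdout.write(plist_to_xml(fp.read()))


# Local wiring (driver name and path are assumptions):
#   .gitattributes:   *.kmsync diff=plist
#   git config diff.plist.textconv "python3 /path/to/plist_textconv.py"
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Only the `*.kmsync diff=plist` line travels with the repository. The `git config` setting stays on each machine, which is why GitHub never sees the program.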

If your kmsync files have a plain text equivalent that you can compile into the binary format, then you can store that format in the repository and build it as part of a build step. That will be diffable and will still provide the binary formats that you can use for your project. This is no different than compiling code into binaries or plain text into PDFs.
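A build step along those lines might be sketched as follows. It assumes the plain text equivalent is an XML property list and that the app accepts a binary plist regenerated by Python's `plistlib`; neither is confirmed here, and the function names are hypothetical.

```python
import plistlib


def xml_to_binary_plist(xml_bytes: bytes) -> bytes:
    """Convert XML plist bytes into binary plist bytes."""
    obj = plistlib.loads(xml_bytes)
    return plistlib.dumps(obj, fmt=plistlib.FMT_BINARY)


def build(src_path: str, out_path: str) -> None:
    """Read the tracked XML file and write the binary file the app uses."""
    with open(src_path, "rb") as src:
        binary = xml_to_binary_plist(src.read())
    with open(out_path, "wb") as out:
        out.write(binary)
```

With this layout, only the XML source would be committed, and the binary output would be generated on each machine and left out of the repository.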


2 Comments

GitHub seems to use their linguist tool as a gitattributes alternative; is there any possibility of using that tool to diff binary data?
Linguist is used for detecting the language of files. It's possible to configure it in .gitattributes in various ways, but none of those options can force a file to be rendered as plain text when it's not, for the reasons mentioned above.
