I'm trying to get a config file from our GitHub using the Get Contents API.

This returns a JSON document containing the file content encoded as a base64 string.

I'd like to get it as text.

Steps I've taken

  1. get initial api response:
    curl -H 'Authorization: token MY_TOKEN' https://github.com/api/v3/repos/MY_OWNER/MY_REPO/contents/MY_FILE
    this returns a JSON response with a field "content": "encoded content ..."

  2. get the encoded string:
    add <prev command> | grep -F "content\":"
    this gets the content, but there's still the "content": string, the " chars and a comma at the end

  3. cut the extras:
    <prev command> | cut -d ":" -f 2 | cut -d "\"" -f 2

  4. decode:
    <prev command> | base64 --decode

final command:
curl -H 'Authorization: token MY_TOKEN' \
  https://github.com/api/v3/repos/MY_OWNER/MY_REPO/contents/MY_FILE | \
  grep -F "content\":" | cut -d ":" -f 2 | cut -d "\"" -f 2 | base64 --decode

Issues:

  1. The resulting string (before the base64 --decode step) decodes in an online decoder (though not cleanly, see the next item), but fails to decode in bash. The error is:

    "Invalid character in input stream."

  2. When decoding the string in an online decoder, some (not all) of the file comes out as gibberish instead of the original text. I've tried all the available charsets.

Notes:

  1. I've tried removing the last 2 (newline) chars with sed 's/..$//', but this has no effect.
  2. If I select the output with the mouse and paste it into an echo MY_ENCODED_STRING_PASTED_HERE | base64 --decode command, the result is the same as with the online tool: it decodes as gibberish.
  • Using Bash for this is probably going to give you more gray hairs than you would prefer. But for a start, replace the ad hoc pipeline with a proper JSON processor like jq. Commented Nov 22, 2017 at 11:46
  • Locale settings will affect what characters are considered as valid. Try export LC_ALL=C near the beginning of your script to enforce traditional POSIX byte=character semantics. Commented Nov 22, 2017 at 11:47
  • exporting LC_ALL=C has no effect. Commented Nov 22, 2017 at 12:03
  • Googling the error message suggests that the input isn't actually entirely base64. See e.g. lists.jboss.org/pipermail/apiman-user/2015-October/000365.html -- Without access to a representative sample, it's hard to say what exactly is wrong with it. Commented Nov 22, 2017 at 12:29
  • echo moo | base64 --decode >/dev/null works fine while echo moo.bar | base64 --decode >/dev/null gets me "invalid character in input stream". For the record, valid base64 is alphabetics, numbers, and a couple of mathematical symbols (+, /, = at the end of the stream for padding). Commented Nov 22, 2017 at 12:32
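
One likely reason the extracted string is not pure base64: in the JSON text, line breaks inside the "content" value appear as the two-character escape \n (a backslash followed by n), and the grep/cut pipeline copies them through literally. The sketch below reproduces this on a hypothetical sample response (the content value and the variable names are made up for illustration) and strips the escapes before decoding. That the real response is wrapped this way is an assumption; inspecting the output of step 3 would confirm it.

```shell
# Hypothetical sample mimicking the shape of the API response; the real
# payload has more fields and a longer "content" value.
response='{
  "name": "MY_FILE",
  "content": "aGVsbG8gd29y\nbGQK\n",
  "encoding": "base64"
}'

# Same extraction as in the question. The JSON escape \n (backslash + n)
# survives as two literal characters inside the extracted string.
extracted=$(printf '%s\n' "$response" | grep -F 'content":' | cut -d ':' -f 2 | cut -d '"' -f 2)
printf 'extracted: %s\n' "$extracted"

# Delete every literal backslash-n pair so that only base64 characters remain.
cleaned=$(printf '%s' "$extracted" | sed 's/\\n//g')
printf 'cleaned:   %s\n' "$cleaned"

# Decode the cleaned string.
decoded=$(printf '%s' "$cleaned" | base64 --decode)
printf 'decoded:   %s\n' "$decoded"
```

Because n is itself a valid base64 character, a decoder that silently skips only the backslash would still shift the data that follows, which would be consistent with the partial gibberish described in issue 2.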

2 Answers

Add the header Accept: application/vnd.github.VERSION.raw to the GET request, so the API returns the raw file content instead of JSON with a base64 string.

1 Comment

This is the recommended way to control what content you want - see the documentation for more details: developer.github.com/v3/media/#git-blob-properties
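
A minimal sketch of that request, wrapped in a hypothetical helper named fetch_raw. The token, owner, repo and file arguments stand in for the question's placeholders, the URL is the one from the question, and "v3" is an assumed value for VERSION (it matches the /api/v3 path, but check what your server accepts).

```shell
# Fetch a file as plain text by asking for the raw media type.
# Arguments: token, owner, repo, file path (the question's placeholders).
# "v3" in the Accept header is an assumed value for VERSION.
fetch_raw() {
  curl -sS \
    -H "Authorization: token $1" \
    -H 'Accept: application/vnd.github.v3.raw' \
    "https://github.com/api/v3/repos/$2/$3/contents/$4"
}

# Usage: fetch_raw MY_TOKEN MY_OWNER MY_REPO MY_FILE > MY_FILE
```

With this header there is no JSON to parse and no base64 step, so the grep/cut/base64 part of the pipeline is not needed.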

Following tripleee's advice, I've switched the extraction method to jq:

file=randomFileName74894031264.txt

curl -H 'Authorization: token MY_TOKEN' https://github.com/api/v3/repos/MY_OWNER/MY_REPO/contents/MY_FILE > "$file"

encoded_str=($(jq -r '.content' "$file"))

echo "$encoded_str" | base64 -D
rm -f "$file"

This works when running from the command line, but when running as a script the stdout doesn't flush, and we only get the first few lines of the file.

I will update this answer when I've formalized a generic script.
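
A possible explanation for the truncated output, not confirmed by the answer's author and assuming the script runs under bash: the outer parentheses in encoded_str=($(...)) create an array with one element per line of base64 text, and in bash "$encoded_str" expands to the first element only. The sketch below shows the difference on a hypothetical two-line string that stands in for the jq output; all variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the output of jq -r '.content':
# base64 text that spans more than one line.
multi_line=$(printf 'aGVsbG8gd29y\nbGQK\n')

# Array assignment: word splitting stores one line per element, and
# "$as_array" is shorthand for "${as_array[0]}" in bash.
as_array=($multi_line)
first_only=$(printf '%s\n' "$as_array" | base64 --decode)

# Plain string assignment keeps every line.
as_string=$multi_line
complete=$(printf '%s\n' "$as_string" | base64 --decode)

printf 'array form:  %s\n' "$first_only"
printf 'string form: %s\n' "$complete"
```

If this is the cause, dropping the outer parentheses, as in encoded_str=$(jq -r '.content' "$file"), should be enough.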

1 Comment

You don't need a temporary file, curl | jq -r .content | base64 -D should do it.
