39

I need to get the contents of a file hosted in a GitHub repo, preferably as a JSON response that includes metadata. I've tried numerous URLs with cURL, only to get {"message":"Not Found"} every time. I just need the URL structure. If it matters, the repo belongs to an organization on GitHub. Here's what I think should work but doesn't:

http://api.github.com/repos/<organization>/<repository>/git/branches/<branch>/<file> 
  • see stackoverflow.com/questions/9240961/… Commented Feb 14, 2012 at 6:32
  • Three requests for a simple JSON response? Good lawd. Not intuitive at all. Surely there's a more elegant way. Commented Feb 14, 2012 at 8:53
  • This is probably one of the weakest bits of their API. You can navigate the structure using their Trees API (under Git Data in the docs). To use that you'll need a sha, which you can dig out of the repo's branches. Perhaps it is easier for you to use raw.github.com, like this: raw.github.com/:user/:repo/:branch/:filename. You can easily combine these two approaches to figure out whether a file exists and then fetch it. Commented May 23, 2012 at 12:07
  • Yeah, I found out about that a couple of days ago. I need the file structure, though. Basically, I want to link to the Github files on my website. Think of it as an index page for my Github files. Commented May 23, 2012 at 13:43

3 Answers

42

As the description (located at http://developer.github.com/v3/repos/contents/) says:

/repos/:owner/:repo/contents/:path

With jQuery, the request looks like this:

    $.ajax({
        url: readme_uri,
        dataType: 'jsonp',
        success: function (results) {
            // With JSONP, the API response is wrapped under `data`.
            var content = results.data.content;
        }
    });

Replace readme_uri with the full URL for /repos/:owner/:repo/contents/:path.
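For anyone assembling that URL in code, here is a minimal sketch of how it could be built. The helper name build_contents_url and the my-org / my-repo values are made up for illustration. The optional ref query parameter is, as far as I know, how the Contents API selects a branch, tag, or commit, which is what the question's `<branch>` segment was trying to do.

```python
from typing import Optional
from urllib.parse import quote, urlencode

API_ROOT = "https://api.github.com"


def build_contents_url(owner: str, repo: str, path: str,
                       ref: Optional[str] = None) -> str:
    """Return the Contents API URL for one file (hypothetical helper)."""
    # Percent-encode each part; "/" inside the file path stays a separator.
    url = "{}/repos/{}/{}/contents/{}".format(
        API_ROOT,
        quote(owner, safe=""),
        quote(repo, safe=""),
        quote(path.lstrip("/")),
    )
    if ref:
        # Assumption: `ref` names the branch, tag, or commit to read from.
        url += "?" + urlencode({"ref": ref})
    return url


if __name__ == "__main__":
    # Placeholder organization, repository, and branch names.
    print(build_contents_url("my-org", "my-repo", "docs/README.md", ref="dev"))
```

Without ref, the request should fall back to the repository's default branch.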


6 Comments

Is this new? I swear this wasn't here when I asked. I looked all over the dev pages for this. Thanks.
Looks like GitHub is sending file content encoded in Base64...
@taseenb use https://raw.githubusercontent.com/:owner/:repo/master/:path to get raw (binary, not Base64)
@Peter where did you find the link you mentioned in your comment? Saved my day :) It was horrible converting base64 encoded content back to raw
You can request the raw content by setting the Accept header to application/vnd.github.v3.raw
39

This GitHub API page provides the full reference. The API endpoint for reading a file:

https://api.github.com/repos/{username}/{repository_name}/contents/{file_path} 

Example response:

    {
      "encoding": "base64",
      "size": 5362,
      "name": "README.md",
      "content": "encoded content ...",
      "sha": "3d21ec53a331a6f037a91c368710b99387d012c1",
      ...
    }

  • Consider using a personal access token. It gives you:
    • Higher rate limits (up to 60 requests per hour for anonymous use, up to 5,000 per hour when authenticated)
    • Access to files in private repos
  • The file content in the response is a base64-encoded string
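To make the last point concrete, here is a small sketch of separating the metadata from the decoded file bytes once the JSON has been parsed. The function name decode_contents_payload and the sample payload are illustrative and not part of any library; the sample only mimics the shape of the response shown above.

```python
import base64
from typing import Any, Dict, Tuple


def decode_contents_payload(payload: Dict[str, Any]) -> Tuple[Dict[str, Any], bytes]:
    """Split a parsed Contents API file response into (metadata, raw bytes)."""
    # Everything except the encoded body is treated as metadata.
    metadata = {key: value for key, value in payload.items() if key != "content"}
    content = payload.get("content", "")
    if payload.get("encoding") == "base64":
        # b64decode skips any line breaks embedded in the encoded text.
        raw = base64.b64decode(content)
    else:
        # Assumption: any other encoding means the content is plain text.
        raw = content.encode("utf-8")
    return metadata, raw


if __name__ == "__main__":
    # Made-up payload for illustration only.
    sample = {
        "name": "README.md",
        "encoding": "base64",
        "size": 6,
        "content": "aGVs\nbG8K\n",
    }
    meta, raw = decode_contents_payload(sample)
    print(meta["name"], raw)
```

Returning bytes rather than text leaves the choice of character encoding to the caller, which matters for binary files.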

Using curl

Reading https://github.com/airbnb/javascript/blob/master/package.json using GitHub's API via curl:

curl -H 'Accept: application/vnd.github.v3.raw' https://api.github.com/repos/airbnb/javascript/contents/package.json 
  • Pass the header Accept: application/vnd.github.v3.raw to get the raw file instead of the JSON response (thanks jakub.g)
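The same request can be sketched in Python with only the standard library. The names build_raw_request and fetch_raw are made up for this example, and the "token ..." Authorization scheme is simply the one used elsewhere in this answer.

```python
import urllib.request
from typing import Optional

RAW_MEDIA_TYPE = "application/vnd.github.v3.raw"


def build_raw_request(url: str, token: Optional[str] = None) -> urllib.request.Request:
    """Prepare a GET request that asks for the raw file body (hypothetical helper)."""
    headers = {"Accept": RAW_MEDIA_TYPE}
    if token:
        # Optional personal access token for private repos and higher limits.
        headers["Authorization"] = f"token {token}"
    return urllib.request.Request(url, headers=headers)


def fetch_raw(url: str, token: Optional[str] = None, timeout: float = 10.0) -> bytes:
    """Send the request and return the response body as bytes."""
    with urllib.request.urlopen(build_raw_request(url, token), timeout=timeout) as resp:
        return resp.read()
```

Calling fetch_raw("https://api.github.com/repos/airbnb/javascript/contents/package.json") should return the file's bytes directly, with no base64 step; a missing file or repo raises urllib.error.HTTPError.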

Using Python

Reading https://github.com/airbnb/javascript/blob/master/package.json using GitHub's API in Python:

    import base64
    import json
    import os

    import requests


    def github_read_file(username, repository_name, file_path, github_token=None):
        headers = {}
        if github_token:
            headers['Authorization'] = f"token {github_token}"

        url = f'https://api.github.com/repos/{username}/{repository_name}/contents/{file_path}'
        r = requests.get(url, headers=headers)
        r.raise_for_status()
        data = r.json()

        # The API returns the file body base64-encoded by default.
        file_content = data['content']
        file_content_encoding = data.get('encoding')
        if file_content_encoding == 'base64':
            file_content = base64.b64decode(file_content).decode()

        return file_content


    def main():
        github_token = os.environ['GITHUB_TOKEN']
        username = 'airbnb'
        repository_name = 'javascript'
        file_path = 'package.json'
        file_content = github_read_file(username, repository_name, file_path, github_token=github_token)
        data = json.loads(file_content)
        print(data['name'])


    if __name__ == '__main__':
        main()
  • Define an environment variable GITHUB_TOKEN before running

1 Comment

Note: to have the contents directly instead of a base64 version, pass 'Accept: application/vnd.github.v3.raw' request header: curl -H 'Accept: application/vnd.github.v3.raw' 'https://api.github.com/repos/airbnb/javascript/contents/package.json' (no need to pipe to | jq -r ".content" | base64 --decode).
0

Here's an alternative, more contemporary solution using fsspec in Python. This example downloads a zip file from GitHub and unpacks it locally.

fsspec's github filesystem is a thin wrapper over the GitHub REST API.

It goes without saying that you need to keep your token secure.

Requires fsspec to be installed with github and libarchive support.

    from pathlib import Path

    from fsspec import filesystem


    def get_remote_github_obj(
        org: str = "my-org",
        repo: str = "my-repo",
        branch: str = "my-branch",
        username: str = "my-username",
        token: str = "my-github-pat",
    ):
        # fsspec's GitHub filesystem takes the branch, tag, or commit as `sha`.
        git_fs = filesystem(
            "github",
            org=org,
            repo=repo,
            sha=branch,
            username=username,
            token=token,
        )

        # Create a new dir
        Path("/path/to/desired/local/download/location/").mkdir(parents=True, exist_ok=True)

        # Get the zip archive from the repo
        git_fs.get(
            rpath="relative/path/to/remote/zip_file.zip",
            lpath="/path/to/desired/local/download/location/zip_file.zip",
        )

        # Use libarchive to unarchive
        libarchive_fs = filesystem(
            "libarchive",
            fo="/path/to/desired/local/download/location/zip_file.zip",
        )
        libarchive_fs.get(
            rpath="/",
            lpath="/path/to/desired/local/download/location/example_unarchived_dir",
            recursive=True,
        )
