GitHub

The GitHub connector is an API connector that imports issues and pull requests from a GitHub repository. It produces structural claims and optionally extracts additional claims from issue and PR bodies via text extraction.

How it works

Claim patterns

Subject                Predicate     Object                  Notes
(repo#number, issue)   authored_by   (author, person)
(repo#number, issue)   has_state     (open/closed, status)
(repo#number, issue)   labeled       (label, label)          one claim per label
(repo#number, issue)   assigned_to   (assignee, person)      one claim per assignee

All structural claims have confidence 1.0. Claims extracted from bodies inherit the confidence of the extraction mode.
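
To make the table concrete, the sketch below derives the four structural claims from one issue. It is only an illustration: the `Claim` dataclass and `structural_claims` function are hypothetical names, not the connector's own types, and the input dict assumes the field names GitHub's REST API uses for issues (`number`, `user.login`, `state`, `labels[].name`, `assignees[].login`). Rendering the subject as `owner/repo#42` is likewise an assumption about how `repo#number` is formatted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """Hypothetical stand-in for a claim: (subject) predicate (object)."""
    subject: tuple      # (name, type)
    predicate: str
    obj: tuple          # (name, type)
    confidence: float = 1.0  # structural claims always carry confidence 1.0


def structural_claims(repo, issue):
    """Build the structural claims listed in the table for one issue dict."""
    subject = (f"{repo}#{issue['number']}", "issue")
    claims = [
        Claim(subject, "authored_by", (issue["user"]["login"], "person")),
        Claim(subject, "has_state", (issue["state"], "status")),
    ]
    # One `labeled` claim per label, one `assigned_to` claim per assignee.
    claims += [Claim(subject, "labeled", (label["name"], "label"))
               for label in issue.get("labels", [])]
    claims += [Claim(subject, "assigned_to", (person["login"], "person"))
               for person in issue.get("assignees", [])]
    return claims


example_issue = {
    "number": 42,
    "user": {"login": "alice"},
    "state": "open",
    "labels": [{"name": "bug"}, {"name": "priority"}],
    "assignees": [{"login": "bob"}],
}
claims = structural_claims("owner/repo", example_issue)
for claim in claims:
    print(claim.subject, claim.predicate, claim.obj, claim.confidence)
```

An issue with two labels and one assignee yields five claims: one author, one state, two labels, one assignee.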

1 Create a Personal Access Token

Go to github.com/settings/tokens and generate a new token (classic or fine-grained).

2 Select scopes

3 Install the requests package

$ pip install requests

4 Connect

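import os

# `db` is assumed to be an already-initialized database handle.
conn = db.connect("github",
    token=os.environ["GITHUB_TOKEN"],  # read the token from the environment; do not hard-code it
    repo="owner/repo",
)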
result = conn.run(db)

Parameter     Required  Default    Description
token         Yes       -          GitHub personal access token
repo          Yes       -          Repository in owner/repo format
include_prs   No        True       Include pull requests
state         No        all        all, open, or closed
max_items     No        500        Maximum issues/PRs to fetch
labels        No        None       Filter to issues with these labels
extraction    No        heuristic  Text extraction mode for bodies: heuristic, smart, llm, or none
save          No        False      Encrypt and persist token
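
The `repo`, `state`, and `labels` parameters correspond closely to GitHub's REST endpoint for listing repository issues. The sketch below shows one plausible mapping. It is an assumption about what a connector like this might send, not a description of its internals, and `issues_url` is a hypothetical helper. The endpoint path and the `state`, `labels`, `per_page`, and `page` query parameters are GitHub's own.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.github.com"
VALID_STATES = {"all", "open", "closed"}


def issues_url(repo, state="all", labels=None, page=1, per_page=100):
    """Build a 'list repository issues' URL for one page of results."""
    if state not in VALID_STATES:
        raise ValueError(f"state must be one of {sorted(VALID_STATES)}, got {state!r}")
    if repo.count("/") != 1:
        raise ValueError("repo must be in owner/repo format")
    params = {"state": state, "per_page": per_page, "page": page}
    if labels:
        # GitHub expects a single comma-separated list of label names.
        params["labels"] = ",".join(labels)
    return f"{API_ROOT}/repos/{repo}/issues?{urlencode(params)}"


url = issues_url("owner/repo", state="open", labels=["bug", "priority"])
print(url)
```

GitHub caps `per_page` at 100, so a `max_items` of 500 would take up to five such requests.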

Basic usage

conn = db.connect("github",
    token=os.environ["GITHUB_TOKEN"],
    repo="owner/repo",
)
result = conn.run(db)
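
The examples read the token with `os.environ["GITHUB_TOKEN"]`, which raises a bare `KeyError` when the variable is missing. A small guard such as the one below can fail earlier with a clearer message. `load_github_token` is a hypothetical helper for illustration, not part of the connector.

```python
import os


def load_github_token(env=os.environ, var="GITHUB_TOKEN"):
    """Return the token from the environment, or raise a descriptive error."""
    token = env.get(var, "").strip()
    if not token:
        raise RuntimeError(f"{var} is not set; export it before connecting")
    return token
```

The returned value can then be passed as `token=` in place of the direct `os.environ` lookup.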

Issues only, filtered

conn = db.connect("github",
    token=os.environ["GITHUB_TOKEN"],
    repo="owner/repo",
    include_prs=False,
    state="open",
    labels=["bug", "priority"],
    max_items=100,
)
result = conn.run(db)
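
GitHub's issues endpoint returns pull requests alongside issues and marks each pull request with a `pull_request` key. The sketch below shows how `include_prs=False` and `max_items` could be applied to such a listing. `select_items` is a hypothetical helper, and whether the connector filters this way internally is an assumption.

```python
def select_items(items, include_prs=True, max_items=500):
    """Keep at most max_items entries, optionally skipping pull requests."""
    selected = []
    for item in items:
        if len(selected) >= max_items:
            break  # stop as soon as the cap is reached
        if not include_prs and "pull_request" in item:
            continue  # this entry is a pull request, not a plain issue
        selected.append(item)
    return selected


listing = [
    {"number": 1, "title": "Crash on start"},
    {"number": 2, "title": "Fix crash", "pull_request": {"url": "..."}},
    {"number": 3, "title": "Typo in docs"},
]
issues_only = select_items(listing, include_prs=False, max_items=100)
print([item["number"] for item in issues_only])
```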

With LLM extraction and save

conn = db.connect("github",
    token=os.environ["GITHUB_TOKEN"],
    repo="owner/repo",
    extraction="llm",
    save=True,
)
result = conn.run(db)
