Text connector that fetches Google Docs via the Drive and Docs APIs.
Each document's content is extracted and ingested through
db.ingest_text() to produce structured, provenanced claims.
Only files with the Google Docs MIME type (`application/vnd.google-apps.document`) are fetched. Each document's text is passed to `db.ingest_text(text, source_id="gdocs:{doc_id}")`, and per-document failures are collected in `result.errors`.
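The Docs API returns a structured JSON body rather than plain text, so the connector has to flatten it before ingestion. A minimal sketch of that step, assuming the standard `documents.get` response shape (`body.content` → `paragraph` → `elements` → `textRun`); the real connector's helper may differ:

```python
# Flatten a Docs API document body into plain text.
# Follows the documents.get response shape; tables and section
# breaks are skipped in this sketch.
def doc_to_text(document: dict) -> str:
    parts = []
    for element in document.get("body", {}).get("content", []):
        paragraph = element.get("paragraph")
        if not paragraph:
            continue  # not a paragraph (e.g. table or section break)
        for run in paragraph.get("elements", []):
            text_run = run.get("textRun")
            if text_run:
                parts.append(text_run.get("content", ""))
    return "".join(parts)

# Tiny illustrative response body.
doc = {"body": {"content": [
    {"paragraph": {"elements": [{"textRun": {"content": "Hello "}},
                                {"textRun": {"content": "world\n"}}]}},
]}}
print(doc_to_text(doc))  # Hello world
```

The resulting string is what would be handed to db.ingest_text() with the gdocs:{doc_id} source ID.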
This is a text connector. It calls db.ingest_text() to extract claims from
document content using heuristic or LLM-powered extraction. Configure an LLM provider with
db.configure_curator() for deeper extraction.
Requires an OAuth2 token with the scopes https://www.googleapis.com/auth/drive.readonly and https://www.googleapis.com/auth/documents.readonly, plus the requests package:

```
pip install requests
```
The connector requires a Google OAuth2 access token. Tokens are short-lived (typically 1 hour). Use a refresh token flow or the Google Auth library to obtain a fresh access token before connecting.
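Because tokens expire after about an hour, it helps to refresh proactively rather than waiting for a 401. A hedged sketch: the refresh-margin helper below is illustrative, and the commented-out calls assume the standard google-auth `Credentials` refresh flow (placeholders, not values from this connector):

```python
from datetime import datetime, timedelta, timezone

def token_is_stale(expiry: datetime, margin_seconds: int = 300) -> bool:
    """True if the token expires within margin_seconds; refresh early."""
    return datetime.now(timezone.utc) >= expiry - timedelta(seconds=margin_seconds)

# Refreshing with google-auth (pip install google-auth); not executed here:
# from google.auth.transport.requests import Request
# from google.oauth2.credentials import Credentials
# creds = Credentials(
#     None,
#     refresh_token="YOUR_REFRESH_TOKEN",
#     token_uri="https://oauth2.googleapis.com/token",
#     client_id="YOUR_CLIENT_ID",
#     client_secret="YOUR_CLIENT_SECRET",
# )
# if not creds.valid:
#     creds.refresh(Request())   # network call to Google's token endpoint
# conn = db.connect("gdocs", token=creds.token)
```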
Pass the token as token in db.connect(). The factory maps it to the
access_token parameter on the underlying GDocsConnector.
| Parameter | Required | Default | Description |
|---|---|---|---|
| access_token | Yes | — | Google OAuth2 access token with drive.readonly and documents.readonly scopes |
| max_docs | No | 50 | Maximum number of documents to fetch from Drive |
| save | No | False | Encrypt and persist the token to disk (requires cryptography package) |
```python
conn = db.connect("gdocs", token="ya29.a0...")
result = conn.run(db)
```
```python
conn = db.connect(
    "gdocs",
    token="ya29.a0...",
    max_docs=20,
    save=True,
)
result = conn.run(db)
```
```python
import os

conn = db.connect(
    "gdocs",
    token=os.environ["GOOGLE_TOKEN"],
    max_docs=100,
)
result = conn.run(db)
```
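After conn.run(db), it is worth checking result.errors for per-document failures. A minimal sketch; the errors attribute is the one described above, while SimpleNamespace and the sample message are stand-ins so the snippet is self-contained:

```python
from types import SimpleNamespace

def report_failures(result) -> int:
    """Print each per-document failure and return the failure count."""
    for err in result.errors:
        print("ingestion failed:", err)
    return len(result.errors)

# Hypothetical stand-in for the real result object.
demo = SimpleNamespace(errors=["doc-123: 403 Forbidden (hypothetical)"])
report_failures(demo)
```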