SharePoint Importer
Walk a SharePoint/OneDrive document library with the Microsoft Graph API
and stream every file into RememberOS. Ships in the repo:
connectors/sharepoint/import_sharepoint.py (stdlib + httpx only).
One-time Azure setup
- Entra ID → App registrations → New registration.
- API permissions → Microsoft Graph → Application permissions:
Sites.Read.All + Files.Read.All → grant admin consent. (Scope to specific sites with Sites.Selected if you prefer.)
- Certificates & secrets → new client secret.
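An app registration with application permissions and a client secret is normally used through the OAuth 2.0 client-credentials flow. The sketch below shows roughly what that token request looks like, using only the standard library. The helper names `build_token_request` and `fetch_token` are hypothetical and are not taken from import_sharepoint.py, which may obtain its token differently.

```python
import json
import urllib.parse
import urllib.request

# Microsoft identity platform v2.0 token endpoint (client-credentials flow).
TOKEN_URL = "https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token"


def build_token_request(tenant_id, client_id, client_secret):
    """Return (url, form_fields) for an app-only Microsoft Graph token request.

    Hypothetical helper for illustration only.
    """
    url = TOKEN_URL.format(tenant=urllib.parse.quote(tenant_id, safe=""))
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # ".default" requests the application permissions that were
        # admin-consented on the app registration.
        "scope": "https://graph.microsoft.com/.default",
    }
    return url, form


def fetch_token(tenant_id, client_id, client_secret):
    """POST the form and return the bearer token (performs a network call)."""
    url, form = build_token_request(tenant_id, client_id, client_secret)
    body = urllib.parse.urlencode(form).encode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["access_token"]
```

The three arguments correspond to the GRAPH_TENANT_ID, GRAPH_CLIENT_ID and GRAPH_CLIENT_SECRET values exported in the Run section.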
Run
export GRAPH_TENANT_ID=… GRAPH_CLIENT_ID=… GRAPH_CLIENT_SECRET=…
export SHAREPOINT_SITE="https://contoso.sharepoint.com/sites/Marketing"
export LONGMEM_API_KEY=mv_…
python import_sharepoint.py --collection sharepoint --dry-run # list only
python import_sharepoint.py --collection sharepoint # import
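For orientation, Microsoft Graph can address a site by hostname and server-relative path, so a SHAREPOINT_SITE value like the one above has to be turned into a Graph URL at some point. The following is a minimal sketch of that mapping. `site_lookup_url` is a hypothetical name, and the importer's actual resolution logic is not shown in this page.

```python
from urllib.parse import quote, urlsplit

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def site_lookup_url(site_url):
    """Map a SharePoint site URL to the Graph 'get site by path' endpoint.

    Hypothetical helper: handles only URLs that include a site path,
    such as https://contoso.sharepoint.com/sites/Marketing.
    """
    parts = urlsplit(site_url)
    path = parts.path.rstrip("/")
    if not parts.hostname or not path:
        raise ValueError(f"expected https://<host>/<site-path>, got {site_url!r}")
    # Graph form: /sites/{hostname}:/{server-relative-path}
    return f"{GRAPH_BASE}/sites/{parts.hostname}:{quote(path)}"
```

A GET on the resulting URL with the bearer token returns the site resource, whose id can then be used to list the site's drives (document libraries).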
Behaviour
- Resumable: a state file records each item's id + eTag — re-runs import only new or changed files.
- Per-file: uploads go to the async drop queue individually; one failure never stops the run.
- Limits: --max-file-mb (default 10) skips and reports oversize files; --concurrency (default 4) bounds parallelism.
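The behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the importer's real code: the state file is assumed to be a JSON object mapping item id to eTag, "MB" is assumed to mean 1024 * 1024 bytes, and `load_state`, `plan_imports` and `run_uploads` are hypothetical names. Items are assumed to be Graph driveItem dicts carrying `id`, `eTag` and `size`.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load_state(path):
    """Read the {item_id: eTag} map, or start empty on the first run."""
    state_file = Path(path)
    return json.loads(state_file.read_text()) if state_file.exists() else {}


def plan_imports(items, state, max_file_mb=10):
    """Split items into (to_import, oversize), skipping unchanged ones."""
    limit = max_file_mb * 1024 * 1024  # assumption: binary megabytes
    to_import, oversize = [], []
    for item in items:
        if state.get(item["id"]) == item["eTag"]:
            continue  # same eTag as the last run: nothing to do
        if item.get("size", 0) > limit:
            oversize.append(item)  # skipped, but reported to the user
        else:
            to_import.append(item)
    return to_import, oversize


def run_uploads(items, upload, state, concurrency=4):
    """Upload items in parallel; collect failures instead of aborting."""
    failures = []
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = {pool.submit(upload, item): item for item in items}
        for future, item in futures.items():
            try:
                future.result()
                state[item["id"]] = item["eTag"]  # record only on success
            except Exception as exc:  # one bad file must not stop the run
                failures.append((item["id"], str(exc)))
    return failures
```

Recording the eTag only after a successful upload means a failed file is retried on the next run, which is one way to get the resumable behaviour described above.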