Content Types

Drop anything; RememberOS extracts the text, makes it searchable, and keeps the original downloadable.

Type	What happens
Plain text / Markdown / CSV	indexed directly; long documents are chunked into multiple searchable memories
Audio (mp3, wav, m4a, ogg…)	transcribed with Whisper; the transcript becomes searchable memory, the file stays playable in the Vault
Video (mp4, mov, webm…)	audio track transcribed; original playable
PDF, docx, pptx, xlsx	text extracted per format, chunked, embedded
Images (png, jpg, webp…)	captioned with a vision model so they're semantically searchable; rendered inline in the Vault
JSON / structured rows	via the dlt destination or the API: one row → one memory, text column embedded, the rest queryable metadata; the Vault pretty-prints JSON bodies
Anything else	stored and downloadable, indexed by filename
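
The "one row → one memory" rule for structured rows can be pictured with a small sketch. This is not the RememberOS API or the dlt destination itself: `row_to_memory`, the `text_column` parameter, and the sample rows are hypothetical names and data, used only to show how one column might become the embedded text while the remaining columns become metadata.

```python
from __future__ import annotations

from typing import Any


def row_to_memory(row: dict[str, Any], text_column: str) -> dict[str, Any]:
    """Split one structured row into the text to embed and the leftover metadata.

    Hypothetical helper for illustration; the real ingestion path may differ.
    """
    if text_column not in row:
        raise KeyError(f"row has no text column {text_column!r}")
    # Every column except the text column is kept as queryable metadata.
    metadata = {key: value for key, value in row.items() if key != text_column}
    return {"text": str(row[text_column]), "metadata": metadata}


# Made-up sample rows: each one yields exactly one memory.
rows = [
    {"id": 1, "body": "Quarterly sync moved to Tuesday", "author": "dana"},
    {"id": 2, "body": "Invoice 1042 was paid", "author": "lee"},
]
memories = [row_to_memory(row, text_column="body") for row in rows]
```

In this sketch the `body` column is the part that would be embedded, while `id` and `author` stay available for filtering.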

Limits & behaviour

10 MB per file (oversize files are reported, not silently dropped).
Originals live in object storage (your own, if you bring your own storage) and are served via short-lived presigned URLs.
Big batches upload file by file, directly to storage (presigned PUT), so one unreadable file fails only itself and is named in the result.
Audio/video transcription and image captioning use the platform OpenAI key by default. These are the only content types whose bytes are sent to a third party; plain text and embeddings stay on-box.
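
The size limit and the per-file failure behaviour can be sketched as a client-side loop. This is an illustration under assumptions, not the platform's implementation: `upload_batch`, `BatchResult`, and the injected `put_file` callable (standing in for whatever performs the presigned PUT) are hypothetical, and "10 MB" is read here as 10 * 1024 * 1024 bytes, which may not match how the platform counts it.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Iterable

# Assumption: 10 MB interpreted as 10 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 10 * 1024 * 1024


@dataclass
class BatchResult:
    uploaded: list[str] = field(default_factory=list)
    oversize: list[str] = field(default_factory=list)      # reported, not dropped
    failed: dict[str, str] = field(default_factory=dict)   # file name -> error text


def upload_batch(
    files: Iterable[tuple[str, bytes]],
    put_file: Callable[[str, bytes], None],
    max_bytes: int = MAX_FILE_BYTES,
) -> BatchResult:
    """Upload files one at a time so a single failure never aborts the batch."""
    result = BatchResult()
    for name, data in files:
        if len(data) > max_bytes:
            # Oversize files are named in the result instead of vanishing.
            result.oversize.append(name)
            continue
        try:
            put_file(name, data)  # hypothetical stand-in for the presigned PUT
        except Exception as exc:
            # Only this file fails; the loop carries on with the rest.
            result.failed[name] = str(exc)
        else:
            result.uploaded.append(name)
    return result
```

Passing the uploader in as a callable keeps the sketch free of any real endpoint or HTTP client; a caller would supply a function that performs the actual PUT against the presigned URL it was given.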