AutoDiary: Automated Daily Summarization from Images and Audio
AutoDiary is a wearable AI device that captures images and voice notes throughout your day, describes and transcribes them using AI, and stores them as searchable memories. Caregivers or users can simply ask questions like "What did I do this morning?" and get instant, human-readable answers. It is built on an ESP32-S3 microcontroller with a privacy-first, fully local architecture that combines computer vision, speech recognition, and retrieval-augmented generation.
CATEGORY: EDGE AI
The Problem
Over 55 million people worldwide live with dementia, with Alzheimer's accounting for 60–70% of cases. Existing assistive tools require active user engagement, which is impractical for people with severe cognitive decline.

Key Features
Multimodal capture: images via camera, voice via microphone, triggered by a simple button press
AI-powered understanding: automatic image description and speech transcription
Semantic memory search: natural language queries using Retrieval-Augmented Generation (RAG)
Private by design: all data processed and stored locally, no cloud dependency
Two form factors: smart glasses and a pendant wearable
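The semantic memory search feature can be illustrated with a minimal sketch of the retrieval step in RAG. This is not AutoDiary's actual code: the embed function below is a toy bag-of-words stand-in for a real embedding model, the MEMORIES list is made-up sample data, and retrieve and build_prompt are hypothetical helper names. In the real system, a vector store would presumably hold model-generated embeddings instead.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, memories, k=2):
    """Return the k memories most similar to the query, best match first."""
    q = embed(query)
    ranked = sorted(memories, key=lambda m: cosine(q, embed(m["text"])), reverse=True)
    return ranked[:k]


def build_prompt(query, hits):
    """Assemble the retrieved entries into context for a language model."""
    context = "\n".join(f"[{m['time']}] {m['text']}" for m in hits)
    return f"Diary entries:\n{context}\n\nQuestion: {query}"


# Hypothetical sample memories (image descriptions and voice transcripts).
MEMORIES = [
    {"time": "08:10", "text": "Image: a bowl of oatmeal and a cup of tea on the kitchen table"},
    {"time": "10:30", "text": "Voice note: took the morning tablets after breakfast"},
    {"time": "15:45", "text": "Image: a park bench beside a pond with ducks"},
]

if __name__ == "__main__":
    question = "Did I see ducks at the pond?"
    print(build_prompt(question, retrieve(question, MEMORIES, k=1)))
```

With the sample data above, the question about ducks ranks the 15:45 pond entry first. A word-count embedding only matches exact words, which is why a learned embedding model is the usual choice for natural language queries.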
System Architecture

The system is built across three layers:
Wearable device: ESP32-S3 microcontroller with camera, microphone, and Wi-Fi
Backend server: Flask-based processing pipeline with vision AI, speech-to-text, and vector search
Web interface: chat-based diary with timeline, media viewer, and query input
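The flow through the three layers can be sketched roughly as follows, assuming the backend turns each capture into text and stores it with a timestamp so the web interface can render a timeline. Everything here is illustrative: the memories table schema, the function names, and the placeholder describe_image and transcribe_audio functions (which stand in for the vision and speech models) are assumptions, not the project's actual implementation. Only Python's standard sqlite3 module is used.

```python
import sqlite3
from datetime import datetime


def open_store(path=":memory:"):
    """Open the memory store and create a (hypothetical) table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memories ("
        "id INTEGER PRIMARY KEY, captured_at TEXT NOT NULL, "
        "kind TEXT NOT NULL, content TEXT NOT NULL)"
    )
    return conn


def describe_image(image_bytes):
    """Placeholder for the vision model call."""
    return f"(description of a {len(image_bytes)}-byte image)"


def transcribe_audio(audio_bytes):
    """Placeholder for the speech-to-text model call."""
    return f"(transcript of a {len(audio_bytes)}-byte clip)"


def ingest(conn, kind, payload, captured_at):
    """Convert one capture from the wearable into text and store it."""
    if kind == "image":
        content = describe_image(payload)
    elif kind == "audio":
        content = transcribe_audio(payload)
    else:
        raise ValueError(f"unsupported capture kind: {kind}")
    cur = conn.execute(
        "INSERT INTO memories (captured_at, kind, content) VALUES (?, ?, ?)",
        (captured_at.isoformat(timespec="seconds"), kind, content),
    )
    conn.commit()
    return cur.lastrowid


def timeline(conn, start, end):
    """Return (captured_at, kind, content) rows in [start, end), oldest first."""
    rows = conn.execute(
        "SELECT captured_at, kind, content FROM memories "
        "WHERE captured_at >= ? AND captured_at < ? ORDER BY captured_at",
        (start.isoformat(timespec="seconds"), end.isoformat(timespec="seconds")),
    )
    return rows.fetchall()
```

A question such as "What did I do this morning?" could then start from a call like timeline(conn, morning_start, noon) before any semantic ranking is applied. Storing timestamps as ISO 8601 strings keeps them sortable with plain string comparison.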
Performance Highlights
Image processing completes in under 9 seconds end to end. Speech transcription reaches 91% accuracy and vision descriptions reach 93% accuracy. The pendant form factor sustains 6–7 hours of battery life, and memory retrieval responds in under 1 second.
Technologies Used
Hardware: Seeed XIAO ESP32-S3, OV2640 camera, I2S MEMS microphone
AI Models: Gemma-3 4B (vision), Whisper Small (speech), Qwen3-0.6B (embeddings)
Backend: Python, Flask, ChromaDB, SQLite
Deployment: Docker
Team
Built as a final-year B.Tech project at Walchand Institute of Technology, Solapur by Padmanabh Kulkarni, Rithik Purohit, and Krishna Shingan, under the supervision of Dr. R. S. Khamitkar.
Links
Docker Image: https://hub.docker.com/r/padmanabh10/autodiary
YouTube Demo: https://youtu.be/_2bnRmLbbPM?si=zZsU6XkTSu2Mi8pu