Realtime AI Speech powered by OpenAI Realtime API, ESP32, Secure WebSockets, and Deno Edge Functions for >10-minute uninterrupted global conversations
- Install the Supabase CLI and set up your local Supabase backend. From the root directory, run:
- Set up your NextJS Frontend. (See the Frontend README) From the frontend-nextjs directory, run the following commands. (Login creds: Email: [email protected], Password: admin)
- Add your ESP32-S3 device's MAC address to the Settings page in the NextJS frontend. This links your device to your account. To find the MAC address, build and upload test/print_mac_address_test.cpp using PlatformIO.
- Add your OpenAI API key to the server-deno/.env and frontend-nextjs/.env.local files.
- Start the Deno server. (See the Deno server README)
- Set up your ESP32 Arduino client. (See the ESP32 README) In PlatformIO, first build the project, then upload it to your ESP32.
- The ESP32 should open an access point named ELATO-DEVICE. Connect to it and go to http://192.168.4.1 to configure the device's Wi-Fi.
- Once Wi-Fi is configured, turn the device off and on again; it should connect to your Wi-Fi network and the Deno edge server.
- Now you can talk to your AI character!
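The MAC address typed into the Settings page has to match what the device reports, and users tend to enter it with varying separators and casing. A minimal sketch of how such input could be normalized before saving is shown here. The helper name `normalizeMac` and the colon-separated uppercase target format are assumptions for illustration, not taken from the ElatoAI codebase.

```typescript
// Hypothetical helper: normalizes a user-entered MAC address to the
// colon-separated uppercase form (e.g. "A1:B2:C3:D4:E5:F6").
// The format the real Settings page expects is an assumption.
function normalizeMac(input: string): string | null {
  // Drop common separators (":", "-", ".") and any whitespace.
  const hex = input.replace(/[\s:.\-]/g, "").toUpperCase();
  // A MAC address is 48 bits, i.e. exactly 12 hex digits.
  if (!/^[0-9A-F]{12}$/.test(hex)) return null;
  // Re-insert a colon between each byte.
  return hex.match(/.{2}/g)!.join(":");
}
```

Returning `null` for malformed input lets the caller show a validation message instead of storing an address that can never match a device.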
ElatoAI consists of three main components:
- Frontend Client (Next.js hosted on Vercel) - to create and talk to your AI agents and 'send' them to your ESP32 device
- Edge Server Functions (Deno running on Deno/Supabase Edge) - to handle the websocket connections from the ESP32 device and the OpenAI API calls
- ESP32 IoT Client (PlatformIO/Arduino) - to open the websocket connection to the Edge Server Functions and exchange audio with the OpenAI API via the Deno edge server.
- Realtime Speech-to-Speech: Instant speech conversion powered by OpenAI's Realtime APIs.
- Create Custom AI Agents: Create custom agents with different personalities and voices.
- Customizable Voices: Choose from a variety of voices and personalities.
- Secure WebSockets: Reliable, encrypted WebSocket communication.
- Server VAD Turn Detection: Intelligent conversation flow handling for smooth interactions.
- Opus Audio Compression: High-quality audio streaming with minimal bandwidth.
- Global Edge Performance: Low latency Deno Edge Functions ensuring seamless global conversations.
- ESP32 Arduino Framework: Optimized and easy-to-use hardware integration.
- Conversation History: View your conversation history.
- Device Management: Register and manage your devices.
- User Authentication: Secure user authentication and authorization.
- Conversations with WebRTC and Websockets: Talk to your AI with WebRTC on the NextJS webapp and with websockets on the ESP32.
- Volume Control: Control the volume of the ESP32 speaker from the NextJS webapp.
- Realtime Transcripts: The realtime transcripts of your conversations are stored in the Supabase DB.
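To illustrate how custom agents and server VAD turn detection could fit together, the sketch below builds a `session.update` client event of the kind the OpenAI Realtime API accepts over its WebSocket. The `AgentConfig` shape, the helper name `buildSessionUpdate`, and the threshold and silence values are assumptions for illustration; the actual ElatoAI server may configure its sessions differently.

```typescript
// Hypothetical shape of an agent created in the frontend.
interface AgentConfig {
  instructions: string; // personality prompt
  voice: string;        // a voice name supported by the Realtime API
}

// Builds a Realtime API "session.update" client event that applies the
// agent's personality and voice and enables server-side voice activity
// detection (server VAD) for turn taking.
function buildSessionUpdate(agent: AgentConfig) {
  return {
    type: "session.update",
    session: {
      instructions: agent.instructions,
      voice: agent.voice,
      turn_detection: {
        type: "server_vad",
        threshold: 0.5,           // assumed value
        silence_duration_ms: 500, // assumed value
      },
    },
  };
}

// The event is sent as a JSON text frame on the Realtime WebSocket.
const payload: string = JSON.stringify(
  buildSessionUpdate({ instructions: "You are a friendly pirate.", voice: "alloy" }),
);
```

With server VAD enabled, the API decides when the speaker has finished a turn, so the ESP32 only has to stream audio and does not need its own end-of-speech logic.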
| Component | Technology |
|---|---|
| Frontend | Next.js, Vercel |
| Backend | Supabase DB |
| Edge Functions | Deno Edge Functions on Deno/Supabase |
| IoT Client | PlatformIO, Arduino Framework, ESP32-S3 |
| Audio Codec | Opus |
| Communication | Secure WebSockets |
| Libraries | ArduinoJson, WebSockets, AsyncWebServer, ESP32_Button, Arduino Audio Tools, ArduinoLibOpus |
- ⚡️ Latency: <1 s round trip globally
- 🎧 Audio Quality: Opus codec at 24 kbps (high clarity)
- ⏳ Uninterrupted Conversations: Up to 10 minutes of continuous conversation
- 🌎 Global Availability: Optimized through edge computing with Deno
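As a rough check on what the 24 kbps figure means in practice, the arithmetic below estimates the Opus payload per frame and per 10-minute conversation. It assumes a constant bitrate and a 20 ms frame duration (the frame size is an assumption, not stated in this README), and it counts one audio direction only, without WebSocket or TLS overhead.

```typescript
// Bytes of audio produced at a constant bitrate.
// kbit/s multiplied by ms gives bits, so dividing by 8 gives bytes.
function opusBytes(bitrateKbps: number, durationMs: number): number {
  return (bitrateKbps * durationMs) / 8;
}

const FRAME_MS = 20; // assumed Opus frame duration
const bytesPerFrame = opusBytes(24, FRAME_MS);            // 60 bytes
const bytesPerSession = opusBytes(24, 10 * 60 * 1000);    // 1,800,000 bytes (1.8 MB)
```

At these rates a full conversation moves under 2 MB of audio payload in each direction.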
- Secure WebSockets (WSS) for encrypted data transfers
- Optional: API Key encryption with 256-bit AES
- Supabase DB for secure authentication
- Supabase RLS for all tables
- 3-4 s cold start time when connecting to the edge server
- Conversations are limited to 10 minutes of uninterrupted time
- Edge server stops when its wall-clock time limit is exceeded
- No speech interruption detection on ESP32
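One way to soften the wall-clock limit listed above is to track the remaining session time on the server and wrap up shortly before the cutoff. The sketch below is only an illustration: the class name `SessionBudget`, the injected clock, and the 15-second warning margin are all assumptions, and nothing here is taken from the ElatoAI server code.

```typescript
// Hypothetical tracker for a session with a hard wall-clock limit.
class SessionBudget {
  private readonly limitMs: number;
  private readonly now: () => number;
  private readonly startedAt: number;

  // `now` is injectable so the class can be tested without real waiting.
  constructor(limitMs: number, now: () => number = () => Date.now()) {
    this.limitMs = limitMs;
    this.now = now;
    this.startedAt = now();
  }

  // Milliseconds left before the limit; never negative.
  remainingMs(): number {
    return Math.max(0, this.limitMs - (this.now() - this.startedAt));
  }

  // True once the remaining time is within the warning margin.
  shouldWrapUp(marginMs: number = 15000): boolean {
    return this.remainingMs() <= marginMs;
  }
}

// Example: a 10-minute budget matching the limit stated in this README.
const budget = new SessionBudget(10 * 60 * 1000);
```

A server using something like this could check `budget.shouldWrapUp()` periodically and close the conversation gracefully instead of being cut off mid-sentence.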
- Looking for Speech Interruption detection on ESP32
- Adding Arduino IDE support
- Adding tool calling support on Deno Edge
We welcome contributions:
- Fork this repository.
- Create your feature branch (git checkout -b feature/EpicFeature).
- Commit your changes (git commit -m 'Add EpicFeature').
- Push to the branch (git push origin feature/EpicFeature).
- Open a pull request.
This project is licensed under the MIT License - see the LICENSE file for details.
If you find this project interesting or useful, drop a GitHub ⭐️. It helps a lot!