<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[OpenVoiceOS Blog]]></title><description><![CDATA[Latest news and updates from the OpenVoiceOS community]]></description><link>https://openvoiceos.github.io/ovos-blogs</link><generator>RSS for Node</generator><lastBuildDate>Thu, 05 Mar 2026 14:51:13 GMT</lastBuildDate><atom:link href="https://openvoiceos.github.io/ovos-blogs/feed.xml" rel="self" type="application/rss+xml"/><pubDate>Thu, 05 Mar 2026 14:51:12 GMT</pubDate><copyright><![CDATA[Copyright 2026, OpenVoiceOS]]></copyright><language><![CDATA[en-US]]></language><ttl>60</ttl><item><title><![CDATA[Boring installs, now on macOS: ovos-installer supports Intel + Apple Silicon]]></title><description><![CDATA[<h1>Boring installs, now on macOS (Intel + Apple Silicon)</h1>
<p>If you’ve been following OVOS for a while, you already know the pattern:</p>
<ul>
<li>The fun part is building a voice assistant that <strong>doesn’t rent space in someone else’s cloud</strong>.</li>
<li>The unglamorous part is everything around it: Python versions, services, audio, config drift… and that one dependency that only fails on Tuesdays.</li>
</ul>
<p><code>ovos-installer</code> exists to make the unglamorous part <strong>uneventful</strong>.</p>
<p>And now it does that on <strong>macOS</strong>, on both <strong>Intel</strong> and <strong>Apple Silicon</strong> Macs.</p>
<p>No “weekend-long dependency archaeology”. No “it works until you reboot”. Just boring installs and boring updates — which is the highest compliment I can pay an installer.</p>
<hr>
<h2>What macOS support means (and what it doesn’t… yet)</h2>
<p>macOS support is real — and it’s also deliberately scoped so we can keep it stable.</p>
<p><strong>Current macOS support matrix:</strong></p>
<ul>
<li><strong>Method:</strong> <code>virtualenv</code> (only)</li>
<li><strong>Channel:</strong> <code>alpha</code> (only)</li>
<li><strong>Service manager:</strong> <code>launchd</code></li>
</ul>
<p>That’s not a forever rule. It’s the “ship one reliable path first” rule.</p>
<hr>
<h2>macOS prerequisites (aka: the four boring ingredients)</h2>
<p>Before you run the installer, make sure you have:</p>
<ol>
<li>
<p><strong>Homebrew</strong> installed and available in your <code>PATH</code></p>
</li>
<li>
<p><strong>Xcode Command Line Tools</strong> installed:</p>
<pre><code class="hljs language-bash">xcode-select --install
</code></pre>
</li>
<li>
<p><strong>Homebrew Bash</strong> installed (required by <code>ovos-installer</code>):</p>
<pre><code class="hljs language-bash">brew install bash
</code></pre>
</li>
<li>
<p><strong>Microphone permission</strong> granted to your terminal app</p>
<p>macOS is <em>very</em> polite about letting you install things, then <em>very</em> strict about letting you capture audio.</p>
<p>Go to: <strong>System Settings → Privacy &#x26; Security → Microphone</strong><br>
and enable your terminal (Terminal, iTerm2, etc.).</p>
</li>
</ol>
<hr>
<h2>Quickstart (copy/paste speed-run)</h2>
<pre><code class="hljs language-bash"><span class="hljs-built_in">sudo</span> sh -c <span class="hljs-string">"<span class="hljs-subst">$(curl -fsSL https://raw.githubusercontent.com/OpenVoiceOS/ovos-installer/main/installer.sh)</span>"</span>
</code></pre>
<p>If you’re wearing your “operator” hat (recommended), do the boring responsible thing: download the script first, read it, then run it.</p>
<p>Inside the TUI, pick:</p>
<ul>
<li><strong>Channel:</strong> <code>alpha</code></li>
<li><strong>Method:</strong> <code>virtualenv</code></li>
<li><strong>Profile:</strong> <code>ovos</code></li>
</ul>
<p>The installer will provision a supported Python runtime for the virtualenv (default <code>3.11</code>) using <code>uv</code>, so you don’t have to fight whichever Python your system feels like shipping this week.</p>
<hr>
<h2>launchd service management (without becoming a launchd expert)</h2>
<p>macOS uses <code>launchd</code>, so the installer sets you up with a small wrapper that makes service management human-readable.</p>
<p>You’ll get an <code>ovos</code> command you can use like this:</p>
<pre><code class="hljs language-bash">ovos list

ovos status ovos
ovos status ovos-core

ovos restart ovos-core
ovos stop ovos
ovos start ovos-audio
</code></pre>
<p>That <code>ovos</code> “meta target” controls the user-level OVOS services managed by <code>launchd</code>.</p>
<p>Boring. Predictable. Exactly what we want.</p>
<hr>
<h2>Microphones on macOS: CoreAudio, minus the drama</h2>
<p>Audio is usually where “almost working” goes to die.</p>
<p>To keep this boring on macOS, <code>ovos-microphone-plugin-sounddevice</code> is the recommended path:
it’s built on <code>python-sounddevice</code> (PortAudio) and uses CoreAudio-friendly settings tuned for clean wake word + STT capture.</p>
<h3>Install the plugin</h3>
<p>If you want the newest macOS/CoreAudio tuning (the one from PR #16), install the <strong>pre-release</strong>:</p>
<pre><code class="hljs language-bash">pip install --pre ovos-microphone-plugin-sounddevice
</code></pre>
<h3>Configure it in <code>mycroft.conf</code></h3>
<p>Edit your <code>mycroft.conf</code> (usually <code>~/.config/mycroft/mycroft.conf</code>) and set the microphone module:</p>
<pre><code class="hljs language-json"><span class="hljs-punctuation">{</span>
  <span class="hljs-attr">"listener"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span>
    <span class="hljs-attr">"microphone"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span>
      <span class="hljs-attr">"module"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"ovos-microphone-plugin-sounddevice"</span><span class="hljs-punctuation">,</span>
      <span class="hljs-attr">"ovos-microphone-plugin-sounddevice"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span>
        <span class="hljs-attr">"device"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"Built-in Microphone"</span><span class="hljs-punctuation">,</span>
        <span class="hljs-attr">"latency"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"low"</span><span class="hljs-punctuation">,</span>
        <span class="hljs-attr">"multiplier"</span><span class="hljs-punctuation">:</span> <span class="hljs-number">1.0</span><span class="hljs-punctuation">,</span>
        <span class="hljs-attr">"blocksize"</span><span class="hljs-punctuation">:</span> <span class="hljs-number">1024</span><span class="hljs-punctuation">,</span>
        <span class="hljs-attr">"queue_maxsize"</span><span class="hljs-punctuation">:</span> <span class="hljs-number">8</span><span class="hljs-punctuation">,</span>

        <span class="hljs-attr">"use_coreaudio_settings"</span><span class="hljs-punctuation">:</span> <span class="hljs-literal"><span class="hljs-keyword">true</span></span><span class="hljs-punctuation">,</span>
        <span class="hljs-attr">"coreaudio_conversion_quality"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"max"</span><span class="hljs-punctuation">,</span>
        <span class="hljs-attr">"coreaudio_change_device_parameters"</span><span class="hljs-punctuation">:</span> <span class="hljs-literal"><span class="hljs-keyword">false</span></span><span class="hljs-punctuation">,</span>
        <span class="hljs-attr">"coreaudio_fail_if_conversion_required"</span><span class="hljs-punctuation">:</span> <span class="hljs-literal"><span class="hljs-keyword">false</span></span><span class="hljs-punctuation">,</span>

        <span class="hljs-attr">"auto_sample_rate_fallback"</span><span class="hljs-punctuation">:</span> <span class="hljs-literal"><span class="hljs-keyword">true</span></span><span class="hljs-punctuation">,</span>
        <span class="hljs-attr">"auto_channel_fallback"</span><span class="hljs-punctuation">:</span> <span class="hljs-literal"><span class="hljs-keyword">true</span></span><span class="hljs-punctuation">,</span>
        <span class="hljs-attr">"auto_latency_fallback"</span><span class="hljs-punctuation">:</span> <span class="hljs-literal"><span class="hljs-keyword">true</span></span>
      <span class="hljs-punctuation">}</span>
    <span class="hljs-punctuation">}</span>
  <span class="hljs-punctuation">}</span>
<span class="hljs-punctuation">}</span>
</code></pre>
<h3>Device selection tips (because device names are a lifestyle choice)</h3>
<ul>
<li>Exact name: <code>"device": "Built-in Microphone"</code></li>
<li>Substring match: <code>"device": "Built-in"</code></li>
<li>Regex match: <code>"device": "regex:^MacBook.*Microphone"</code></li>
<li>Numeric index: <code>"device": 0</code></li>
<li>Default input device: omit <code>device</code> or set <code>"device": "default"</code></li>
</ul>
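<p>To make the name-based rules concrete, here is a tiny illustrative sketch (not the plugin's actual code; <code>name_matches</code> is a hypothetical helper) of exact, substring, and <code>regex:</code> matching:</p>
<pre><code class="language-python">import re

# Hypothetical sketch of the name-based "device" matching described above.
# The real plugin's resolution logic may differ; this only illustrates the idea.
def name_matches(spec: str, device_name: str) -> bool:
    if spec == "default":
        return True  # defer to the backend's default input device
    if spec.startswith("regex:"):
        return re.search(spec[len("regex:"):], device_name) is not None
    return spec in device_name  # an exact name is a special case of substring

print(name_matches("Built-in", "Built-in Microphone"))                       # True
print(name_matches("regex:^MacBook.*Microphone", "MacBook Pro Microphone"))  # True
</code></pre>
<p>A numeric <code>device</code> value bypasses name matching entirely and selects by index.</p>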
<hr>
<h2>Telemetry: opt-in, install-time only (and yes, macOS is already showing up)</h2>
<p>I like privacy by default and feedback by consent.</p>
<p><code>ovos-installer</code> telemetry is:</p>
<ul>
<li><strong>Anonymous</strong></li>
<li><strong>Opt-in</strong> (the installer asks)</li>
<li><strong>Install-time only</strong> (nothing keeps phoning home once the install is done)</li>
</ul>
<p>It captures things like OS name/version, architecture, Python version, install channel, and which features were enabled — useful for improving the installer and the platform without collecting personal data.</p>
<p>There’s a public dashboard here:</p>
<ul>
<li><a href="https://telemetry.smartgic.io/ovos-installer/dashboard/">https://telemetry.smartgic.io/ovos-installer/dashboard/</a></li>
</ul>
<p>And it already shows macOS users — which is both a good signal and a mild accusation that people have been doing this the hard way. Now we can make it repeatable.</p>
<hr>
<h2>What actually landed (high level)</h2>
<p>Two PRs did the heavy lifting:</p>
<ul>
<li>
<p><strong>ovos-installer #441</strong><br>
macOS support for Intel + Apple Silicon, <code>launchd</code> service management, and the “make it behave like the Linux experience” glue.</p>
</li>
<li>
<p><strong>ovos-microphone-plugin-sounddevice #16</strong><br>
better CoreAudio tuning, smarter device selection, more runtime controls (latency, blocksize, queue sizing, gain), plus tests to keep it from regressing the moment we touch it again.</p>
</li>
</ul>
<hr>
<h2>Try it on your Mac (and help us make it boring for everyone)</h2>
<p>If you’ve got:</p>
<ul>
<li>a Mac Mini acting as a tiny server,</li>
<li>a MacBook you want as an OVOS dev box,</li>
<li>an Intel Mac you’re not ready to retire,</li>
</ul>
<p>…this is your excuse to try OVOS on macOS using the real installer path.</p>
<p>If something breaks: open an issue.<br>
If you fix something: open a PR.<br>
If you just test and report: you’re still doing the work that makes this boring later.</p>
<hr>
<h2>Resources</h2>
<ul>
<li>ovos-installer: <a href="https://github.com/OpenVoiceOS/ovos-installer">https://github.com/OpenVoiceOS/ovos-installer</a></li>
<li>PR #441 (macOS support): <a href="https://github.com/OpenVoiceOS/ovos-installer/pull/441">https://github.com/OpenVoiceOS/ovos-installer/pull/441</a></li>
<li>ovos-microphone-plugin-sounddevice: <a href="https://github.com/OpenVoiceOS/ovos-microphone-plugin-sounddevice">https://github.com/OpenVoiceOS/ovos-microphone-plugin-sounddevice</a></li>
<li>PR #16 (CoreAudio improvements): <a href="https://github.com/OpenVoiceOS/ovos-microphone-plugin-sounddevice/pull/16">https://github.com/OpenVoiceOS/ovos-microphone-plugin-sounddevice/pull/16</a></li>
<li>Telemetry docs: <a href="https://github.com/OpenVoiceOS/ovos-installer/blob/main/docs/telemetry.md">https://github.com/OpenVoiceOS/ovos-installer/blob/main/docs/telemetry.md</a></li>
<li>Telemetry dashboard: <a href="https://telemetry.smartgic.io/ovos-installer/dashboard/">https://telemetry.smartgic.io/ovos-installer/dashboard/</a></li>
</ul>
<hr>
<h2>Help Us Build Voice for Everyone</h2>
<p>OpenVoiceOS is more than software; it’s a mission. If you believe voice assistants should be open, inclusive, and user-controlled, here’s how you can help:</p>
<ul>
<li><strong>💸 Donate</strong>: Help us fund development, infrastructure, and legal protection.</li>
<li><strong>📣 Contribute Open Data</strong>: Share voice samples and transcriptions under open licenses.</li>
<li><strong>🌍 Translate</strong>: Help make OVOS accessible in every language.</li>
</ul>
<p>We're not building this for profit. We're building it for people. With your support, we can keep voice tech transparent, private, and community-owned.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2026-03-05-ovos-installer-macos-intel-apple-silicon</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2026-03-05-ovos-installer-macos-intel-apple-silicon</guid><dc:creator><![CDATA[Gaëtan Trellu]]></dc:creator><pubDate>Thu, 05 Mar 2026 00:00:00 GMT</pubDate></item><item><title><![CDATA[Bringing Real-Time Offline Speech Recognition to OpenVoiceOS]]></title><description><![CDATA[<h2>Bringing Real-Time Offline Speech Recognition to OpenVoiceOS - ONNX, New Plugins, and the Road Here</h2>
<p>Speech recognition — turning spoken words into text — has dramatically improved over the past few years. When OpenVoiceOS (OVOS) began, <em>offline</em> automatic speech recognition (ASR) was <strong>elusive</strong>. Today, with new ONNX-powered runtimes and models, offline STT is practical and performant even on modest hardware.</p>
<p>In this post we will:</p>
<ul>
<li>Explain the <em>evolution</em> of offline STT on OVOS</li>
<li>Describe why ONNX matters</li>
<li>Introduce the new OVOS plugins that make modern ASR work <em>locally</em></li>
<li>Point you to resources for both casual users and developers</li>
</ul>
<hr>
<h2><strong>Where Offline STT on OVOS Started</strong></h2>
<p>OVOS has historically supported many STT backends, but most were designed for cloud or desktop environments rather than small devices.</p>
<p>Here’s a quick historical rundown:</p>
<h3><strong>Full Framework Backends</strong></h3>
<p>Many early OVOS STT plugins wrapped models and code that required:</p>
<ul>
<li><strong>PyTorch</strong>, <strong>TensorFlow</strong>, or</li>
<li>Full toolkits like <strong>NVIDIA NeMo</strong></li>
</ul>
<p>This worked on desktops and servers but was <strong>difficult</strong> on small single-board computers (SBCs) like the Raspberry Pi because:</p>
<ul>
<li>Framework installs are massive (hundreds of MBs)</li>
<li>Native builds of CUDA, CuDNN, etc., are complex</li>
<li>Some models require dedicated GPUs to run at usable speeds</li>
</ul>
<p>As a result, for most OVOS users, these backends were <em>theoretical</em> offline options. Practical use often required self-hosting a separate server or maintaining complex custom builds.</p>
<hr>
<h2><strong>Early Lightweight Offline Options</strong></h2>
<p>Some OVOS plugins attempted to address dependency bloat or offline operation early on:</p>
<h3>✔️ <strong><code>ovos-stt-plugin-vosk</code></strong></h3>
<p>🔗 <a href="https://github.com/OpenVoiceOS/ovos-stt-plugin-vosk">https://github.com/OpenVoiceOS/ovos-stt-plugin-vosk</a></p>
<ul>
<li>Based on the Vosk speech recognition library (Kaldi-based)</li>
<li>Designed for local offline STT</li>
<li><strong>Limitation:</strong> While fast, existing Vosk models generally offer lower accuracy compared to modern end-to-end neural ASR.</li>
</ul>
<h3>✔️ <strong><code>ovos-stt-plugin-citrinet</code></strong></h3>
<p>🔗 <a href="https://github.com/OpenVoiceOS/ovos-stt-plugin-citrinet">https://github.com/OpenVoiceOS/ovos-stt-plugin-citrinet</a></p>
<ul>
<li>Citrinet models exported to ONNX</li>
<li><strong>Limitation:</strong> Models were small and fast, but few pre-trained models were available, and those that existed struggled with accuracy.</li>
</ul>
<p>These plugins were important steps: they removed heavy dependencies. In practice, however, their <em>accuracy</em> was rarely competitive with cloud solutions or large models.</p>
<hr>
<h2><strong>The Practical Offline Choice Before ONNX</strong></h2>
<p>Whisper models from OpenAI were (and remain) a major force in open ASR.</p>
<p>The <em>fasterwhisper</em> backend provided a lightweight way to run Whisper models locally, using CTranslate2 for C++ acceleration, and has historically been the default go-to plugin.</p>
<ul>
<li>🔗 <a href="https://github.com/OpenVoiceOS/ovos-stt-plugin-fasterwhisper">https://github.com/OpenVoiceOS/ovos-stt-plugin-fasterwhisper</a></li>
<li>🔗 <a href="https://github.com/OpenVoiceOS/ovos-stt-plugin-whispercpp">https://github.com/OpenVoiceOS/ovos-stt-plugin-whispercpp</a></li>
<li>🔗 <a href="https://github.com/TigreGotico/ovos-stt-plugin-whisper">https://github.com/TigreGotico/ovos-stt-plugin-whisper</a></li>
</ul>
<p><strong>Practically speaking</strong>, this was the <em>only</em> widely usable offline solution for OVOS on SBCs — provided:</p>
<ul>
<li>You used very small Whisper models (tiny/base), or</li>
<li>You had a GPU to accelerate inference.</li>
</ul>
<p>Without a GPU, even small Whisper models were often too slow on ARM CPUs for a snappy voice assistant experience.</p>
<p><strong>So offline STT <em>existed</em>, but was effectively limited unless you had higher-end hardware.</strong></p>
<hr>
<h2><strong>Why ONNX Changes Everything</strong></h2>
<p><strong>ONNX</strong> (Open Neural Network Exchange) isn’t a model; it’s a <strong>portable format plus optimized runtimes</strong> that let you run models efficiently on many platforms.</p>
<p>Official ONNX site: <a href="https://onnx.ai/">https://onnx.ai/</a></p>
<h3>Key Properties</h3>
<ul>
<li><strong>Portable inference</strong>: Export a model once and run it anywhere that supports ONNX Runtime.</li>
<li><strong>Low dependency overhead</strong>: You don’t need PyTorch, TensorFlow, or large ML stacks.</li>
<li><strong>Hardware Agnostic</strong>: ONNX Runtime can use CPU, GPU, NPUs, and platform accelerators seamlessly:
<ul>
<li>CPU: Optimized kernels</li>
<li>GPU: CUDA, TensorRT</li>
<li>Apple Silicon: CoreML</li>
<li>Windows: DirectML</li>
</ul>
</li>
<li><strong>Quantization support</strong>: Reduces model size and improves inference speed on resource-limited hardware (e.g., using int8 models).</li>
</ul>
<p>The result: <strong>real-time or near-real-time speech recognition on devices that previously struggled to run anything but tiny models.</strong></p>
<hr>
<h2><strong>Architectures in Modern ASR (Quick Reference)</strong></h2>
<p>Modern ASR is built on several architectural paradigms. All of the following can now be exported to ONNX and run with optimized inference in OVOS:</p>
<table>
<thead>
<tr>
<th>Architecture</th>
<th>Description</th>
<th>Resource</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>CTC</strong></td>
<td>Aligns audio to text without pre-segmented transcripts</td>
<td><a href="https://en.wikipedia.org/wiki/Connectionist_temporal_classification">Wikipedia</a></td>
</tr>
<tr>
<td><strong>RNN-T (Transducer)</strong></td>
<td>Sequence-to-sequence models that do not employ attention mechanisms</td>
<td><a href="https://arxiv.org/abs/1211.3711">Arxiv</a></td>
</tr>
<tr>
<td><strong>Transformer</strong></td>
<td>Attention-based networks used in many ASR models</td>
<td><a href="https://en.wikipedia.org/wiki/Transformer_(machine_learning_model)">Wikipedia</a></td>
</tr>
<tr>
<td><strong>Conformer</strong></td>
<td>Combines convolution and attention for speech</td>
<td><a href="https://arxiv.org/abs/2005.08100">Arxiv</a></td>
</tr>
<tr>
<td><strong>Whisper</strong></td>
<td>Large general-purpose ASR model family from OpenAI</td>
<td><a href="https://github.com/openai/whisper">GitHub</a></td>
</tr>
<tr>
<td><strong>Paraformer</strong></td>
<td>Non-autoregressive, speed-optimized model</td>
<td><a href="https://arxiv.org/abs/2206.08317">Arxiv</a></td>
</tr>
<tr>
<td><strong>Zipformer</strong></td>
<td>Faster, more memory-efficient, and better-performing transformer</td>
<td><a href="https://arxiv.org/abs/2310.11230">Arxiv</a></td>
</tr>
</tbody>
</table>
<hr>
<h2><strong>Two New ONNX-Powered OVOS STT Plugins</strong></h2>
<h3><strong>1) <code>ovos-stt-plugin-sherpa-onnx</code></strong></h3>
<p>📦 <a href="https://github.com/TigreGotico/ovos-stt-plugin-sherpa-onnx">https://github.com/TigreGotico/ovos-stt-plugin-sherpa-onnx</a></p>
<p>This plugin connects OVOS with the <a href="https://github.com/k2-fsa/sherpa-onnx"><strong>Sherpa-ONNX ecosystem</strong></a>, a performant, multi-model, ONNX-centric ASR framework.</p>
<h3><strong>2) <code>ovos-stt-plugin-onnx-asr</code></strong></h3>
<p>📦 <a href="https://github.com/TigreGotico/ovos-stt-plugin-onnx-asr">https://github.com/TigreGotico/ovos-stt-plugin-onnx-asr</a></p>
<p>This plugin integrates the <a href="https://github.com/istupakov/onnx-asr"><strong>onnx-asr Python library</strong></a> directly into OVOS, giving you a simple API to run ONNX models.</p>
<p>We even converted some Basque, Spanish, and Catalan models for use with the onnx-asr plugin; you can find them <a href="https://huggingface.co/collections/OpenVoiceOS/ovos-stt-asr">in this Hugging Face collection</a>.</p>
<h3>Features of the new plugins:</h3>
<ul>
<li><strong>Modern model families</strong>: Transducer/Zipformer, Paraformer, Parakeet, Canary, Whisper, GigaAM, Moonshine</li>
<li><strong>Auto-download</strong> of models</li>
<li><strong>Quantized models</strong> for low-power devices</li>
<li><strong>Hardware Acceleration</strong>: Works well on both CPU and GPU (if available)</li>
</ul>
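<p>Wiring either plugin into OVOS is, as usual, a <code>mycroft.conf</code> edit. A minimal sketch, assuming the onnx-asr plugin (the plugin-specific keys, including the model name, are illustrative placeholders; check each plugin's README for the options it actually supports):</p>
<pre><code class="language-json">{
  "stt": {
    "module": "ovos-stt-plugin-onnx-asr",
    "ovos-stt-plugin-onnx-asr": {
      "model": "some-onnx-asr-model"
    }
  }
}
</code></pre>
<p>As with the microphone example in the macOS post, the top-level <code>"module"</code> key selects the plugin and the nested block carries plugin-specific settings.</p>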
<hr>
<h2><strong>What This Means for OVOS Users</strong></h2>
<h3><strong>For Casual Users</strong></h3>
<ul>
<li>Real offline STT — no internet required.</li>
<li>Better accuracy than previous lightweight plugins.</li>
<li>Easier installation processes.</li>
</ul>
<h3><strong>For Developers</strong></h3>
<ul>
<li>Full access to modern ASR architectures.</li>
<li>Benchmark and swap models without rewriting code.</li>
<li>Experiment with ONNX Runtime acceleration backends.</li>
</ul>
<hr>
<h1><strong>In Summary</strong></h1>
<p>The landscape of <em>offline speech recognition on OpenVoiceOS</em> has matured:</p>
<table>
<thead>
<tr>
<th>Stage</th>
<th>Typical Requirements</th>
<th>Practical Result</th>
</tr>
</thead>
<tbody>
<tr>
<td>Early Full Framework Models</td>
<td>PyTorch / NeMo</td>
<td>Works on desktop, not SBC</td>
</tr>
<tr>
<td>Vosk / Citrinet ONNX</td>
<td>Lightweight, low accuracy</td>
<td>Usable, but limited accuracy</td>
</tr>
<tr>
<td>Whisper / FasterWhisper</td>
<td>Better accuracy, still heavy</td>
<td>Best offline until now</td>
</tr>
<tr>
<td><strong>ONNX STT (Sherpa + onnx-asr)</strong></td>
<td>Minimal deps, efficient</td>
<td><strong>Fast, real-time, offline, portable</strong></td>
</tr>
</tbody>
</table>
<p>With these new ONNX plugins, OVOS users get <strong>the most capable offline STT stack yet</strong> — faster, more accurate, and deployable in more environments than ever before.</p>
<hr>
<h2>Help Us Build Voice for Everyone</h2>
<p>OpenVoiceOS is more than software; it’s a mission. If you believe voice assistants should be open, inclusive, and user-controlled, here’s how you can help:</p>
<ul>
<li><strong>💸 Donate</strong>: Help us fund development, infrastructure, and legal protection.</li>
<li><strong>📣 Contribute Open Data</strong>: Share voice samples and transcriptions under open licenses.</li>
<li><strong>🌍 Translate</strong>: Help make OVOS accessible in every language.</li>
</ul>
<p>We're not building this for profit. We're building it for people. With your support, we can keep voice tech transparent, private, and community-owned.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2026-02-16-onnx-asr</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2026-02-16-onnx-asr</guid><dc:creator><![CDATA[JarbasAl]]></dc:creator><pubDate>Mon, 16 Feb 2026 00:00:00 GMT</pubDate></item><item><title><![CDATA[OpenVoice OS @ Speechday 2026]]></title><description><![CDATA[<p>On Monday, February 2, 2026, Peter Steenbergen and I attended the <a href="https://sites.google.com/view/dutchspeechtechday2026">fourth Dutch Speech Tech Day</a> at “Beeld &#x26; Geluid” (Sound &#x26; Vision) in Hilversum, The Netherlands. It was our first time attending this annual meeting of approximately 100 researchers, innovators, and speech technology enthusiasts, held in the inspiring Radio &#x26; Television archive building.</p>
<p><img src="/assets/blog/OpenVoiceOS-Speechday-2026/sound-and-vision-building.png" alt="The exterior of the Sound &#x26; Vision building, modern and colorful">
<img src="/assets/blog/OpenVoiceOS-Speechday-2026/sound-and-vision-main-venue.jpg" alt="The main venue at Sound &#x26; Vision">
<img src="/assets/blog/OpenVoiceOS-Speechday-2026/peter-in-the-infomarket-stand.JPG" alt="Peter at the infomarket stand"></p>
<h2>OVOS &#x26; VisioLab</h2>
<p>We represented OpenVoiceOS to promote our FOSS (free and open-source) framework. I was also there on behalf of VisioLab (the tech innovation department of Visio). My pitch focused on the <a href="https://voicelab.visio.org">VoiceLab project</a>, which centers on testing and prototyping new voice interactions for the blind and visually impaired—early adopters of voice-first interfacing. Additionally, we looked for collaborators to help build a truly open-source Dutch TTS through the subproject <a href="https://voicelab.visio.org/spraakmakers/#ai-stem-van-nederland">The Voice of Holland</a>.</p>
<p><img src="/assets/blog/OpenVoiceOS-Speechday-2026/timon-and-peter-at-the-infostand.JPG" alt="Timon and Peter in front of our info-stand"></p>
<p>We had many fascinating conversations with like-minded people. Once visitors realized we had no commercial interest and were simply there to learn, exchange knowledge, and promote a sovereign platform, the ice broke quickly. It was refreshing to connect as peers in such an open environment.</p>
<h2>LEGO</h2>
<p><img src="/assets/blog/OpenVoiceOS-Speechday-2026/ovos-lego-modules.jpg" alt="9 modules visualized in LEGO blocks, the OVOS pipeline"></p>
<p>The interest in OpenVoiceOS was wonderful to see. Using LEGO blocks as a metaphor for the framework's modularity proved to be a great conversation starter ("Hey, what’s that LEGO all about?"). These modules—including voice activity detection, automatic speech recognition (ASR), and intent matching—form the OVOS pipeline.</p>
<p>This modularity aligned perfectly with the day's themes. New speech models appearing weekly on HuggingFace were the "talk of the town." I especially enjoyed demonstrating this flexibility: just the evening before the event, I swapped my favorite <a href="https://github.com/OpenVoiceOS/ovos-stt-plugin-chromium">OVOS-Google-STT plugin</a> for a newly released offline, open-source ASR model (<a href="https://huggingface.co/istupakov/parakeet-tdt-0.6b-v3-onnx">NVIDIA parakeet-tdt-0.6-v3-onnx</a>).</p>
<p>Thanks to an Apache 2.0 licensed <a href="https://github.com/TigreGotico/ovos-stt-plugin-onnx-asr">OVOS-onnx-ASR-plugin</a> created by the OVOS lead developer only hours after the model's release—wow, that was fast!—we could give a live demo of a completely local setup.</p>
<p>It performed with the speed of proprietary online plugins; you couldn't tell the difference from the outside! The next challenge is testing this on a Raspberry Pi 4 to see if we can maintain that performance.</p>
<h2>AI Voice Personas</h2>
<p>We also discussed the rise of AI speech agents, essentially LLMs with speech I/O. While new shiny agents like <a href="https://unmute.sh">unmute.sh</a> are impressive, they often require significant hardware to achieve low latency. According to their <a href="https://github.com/kyutai-labs/unmute#using-multiple-gpus">GitHub documentation</a>, they run STT, TTS, and the LLM server on three separate GPUs to reach their target performance.</p>
<p>Our focus remains on local, private home devices that <em>just work</em> and are fast and usable without a massive backend. Our work on <a href="https://openvoiceos.github.io/ovos-technical-manual/150-personas/">"OVOS-persona"</a> is ongoing; whether it acts as the start of the pipeline or a fallback, the goal is to provide users with modular choices. They are in control.</p>
<h2>Looking Forward</h2>
<p>It was a pleasure to meet organizers and researchers like Matt Coler (University of Groningen). We share the ambition to support "small" or underexposed languages and user groups, such as people with speech disorders or specific dialects. By joining forces with this network of innovators, we can reduce dependency on Big Tech and prove that we can be self-sufficient in speech technology.</p>
<p><img src="/assets/blog/OpenVoiceOS-Speechday-2026/timon-and-matt-coler.JPG" alt="Selfie of Matt Coler and Timon"></p>
<h2>References</h2>
<ul>
<li><a href="https://sites.google.com/view/dutchspeechtechday2026">Dutch Speech Tech Day 2026</a></li>
<li><a href="https://voicelab.visio.org">VoiceLab Visio</a></li>
<li><a href="https://openvoiceos.github.io/ovos-technical-manual/150-personas/">OVOS Technical Manual - Personas</a></li>
<li><a href="https://github.com/TigreGotico/ovos-stt-plugin-onnx-asr">OVOS ONNX ASR Plugin</a></li>
</ul>
<hr>
<h2>Help Us Build Voice for Everyone</h2>
<p>OpenVoiceOS is more than software — it’s a mission.</p>
<p>If you believe voice assistants should be open, inclusive, and user-controlled, there are many ways to help:</p>
<ul>
<li><strong>💸 Donate</strong> — support development, infrastructure, and long-term sustainability</li>
<li><strong>📣 Contribute open data</strong> — share voice samples and transcriptions under open licenses</li>
<li><strong>🌍 Translate</strong> — help make OpenVoiceOS accessible in every language</li>
</ul>
<p>We’re not building this for profit.</p>
<p>We’re building it for people.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2026-02-05-OpenVoiceOS-Speechday-2026</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2026-02-05-OpenVoiceOS-Speechday-2026</guid><dc:creator><![CDATA[Timon van Hasselt]]></dc:creator><pubDate>Thu, 05 Feb 2026 13:00:00 GMT</pubDate></item><item><title><![CDATA[Voice: 5× Faster Than Typing and ready for the AI era]]></title><description><![CDATA[<h1>Voice Is (Again) the New Interface</h1>
<h2>5× faster than typing and far smarter than keystrokes</h2>
<p>For decades, we’ve accepted keyboards, touchscreens, and endless menus as the default way to interact with computers. We type emails. We click buttons. We fill in forms.</p>
<p>And yet, none of this is natural.</p>
<p>Humans didn’t evolve to communicate through keyboards. We evolved to speak.<br>
This belief has always been at the core of <a href="https://www.openvoiceos.org/">OpenVoiceOS</a>: voice-first interaction should be open, user-controlled, and respectful of privacy.</p>
<p>Now, as artificial intelligence becomes more capable than ever, the mismatch is obvious: incredibly powerful systems are still locked behind slow, outdated interfaces.</p>
<p>Voice is not just returning. It’s becoming essential.</p>
<hr>
<h2>Speed matters and voice wins, hands down</h2>
<p>The numbers are hard to ignore:</p>
<ul>
<li><strong>Average speaking speed:</strong> ~130–160 words per minute</li>
<li><strong>Average typing speed:</strong> ~40–60 words per minute (experienced typists)</li>
<li><strong>Most people:</strong> closer to 20–30 words per minute</li>
</ul>
<p>That means speaking is <strong>up to five times faster than typing</strong>.</p>
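<p>The headline multiplier follows directly from those ranges; a quick back-of-the-envelope check using their midpoints:</p>
<pre><code class="language-python"># Midpoints of the words-per-minute ranges quoted above
speaking_wpm = (130 + 160) / 2          # 145.0
experienced_typing_wpm = (40 + 60) / 2  # 50.0
typical_typing_wpm = (20 + 30) / 2      # 25.0

print(speaking_wpm / experienced_typing_wpm)  # 2.9  (roughly 3x vs. an experienced typist)
print(speaking_wpm / typical_typing_wpm)      # 5.8  (5x or more vs. most people)
</code></pre>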
<p>In real-world terms, this translates to:</p>
<ul>
<li>Less time entering information</li>
<li>Fewer interruptions to your workflow</li>
<li>Faster access to systems, tools, and knowledge</li>
</ul>
<p>A Stanford University study showed that voice input on smartphones was <a href="https://hci.stanford.edu/research/speech/Ubicomp18_pdf.pdf"><strong>around three times faster than typing and often more accurate</strong></a>, thanks to modern deep-learning-based speech recognition.</p>
<h3>Voice removes the input bottleneck</h3>
<p><img src="/assets/blog/voice-faster-than-typing/input-bottleneck.png" alt="Voice removes the input bottleneck"></p>
<blockquote>
<p><strong>Speech removes the input bottleneck.</strong><br>
Humans think faster than they type. Voice aligns input speed with natural cognition, while keyboards artificially slow interaction with modern AI systems.</p>
</blockquote>
<p>Speed isn’t a luxury. At scale, it’s a competitive advantage.</p>
<hr>
<h2>AI has leaped forward, but interfaces have not</h2>
<p>Over the last few years, AI has taken a massive leap forward.</p>
<p>Large Language Models, autonomous agents, copilots, and smart systems can reason, summarize, plan, and act. Yet most of them are still accessed through:</p>
<ul>
<li>Keyboards</li>
<li>Text boxes</li>
<li>Click-heavy user interfaces</li>
</ul>
<p>This creates a clear bottleneck.</p>
<blockquote>
<p>Powerful intelligence, trapped behind slow input.</p>
</blockquote>
<p>At OpenVoiceOS, we see voice as the missing interface layer for modern AI — one that connects humans to intelligence at the speed of thought.</p>
<p>Voice allows you to:</p>
<ul>
<li>Talk directly to your smart environment</li>
<li>Control AI agents and language models naturally</li>
<li>Interact without breaking focus or switching context</li>
</ul>
<p>And crucially, this does <strong>not</strong> require sending your data to the cloud.<br>
OpenVoiceOS is designed to be able to run <strong>locally</strong>, <strong>offline</strong>, and is fully <strong>open source</strong>.</p>
<p>Voice isn’t a novelty feature. It’s the most efficient way to access intelligence.</p>
<hr>
<h2>Voice-First is more natural and more inclusive</h2>
<p>Voice changes how we interact across environments:</p>
<ul>
<li>In homes and offices</li>
<li>On factory floors and in warehouses</li>
<li>In cars, on bikes, and on the move</li>
<li>In healthcare, logistics, and field work</li>
</ul>
<p>The advantages are immediate and practical:</p>
<ul>
<li><strong>Hands-free operation:</strong> safer and more efficient</li>
<li><strong>Faster documentation:</strong> speak instead of type</li>
<li><strong>Accessibility by design:</strong> empowers people with disabilities, RSI, or limited mobility</li>
</ul>
<p>Unlike traditional UI shortcuts, voice doesn’t require training. Speaking is universal.</p>
<p>This focus on accessibility and inclusivity is a recurring theme within the OpenVoiceOS community, where voice is not treated as a convenience feature, but as a fundamental interface for everyone.</p>
<p>With OpenVoiceOS, voice-first also means:</p>
<ul>
<li>No forced cloud dependency</li>
<li>No opaque data harvesting</li>
<li>Full transparency and control</li>
</ul>
<p>Fast. Local. Private.</p>
<hr>
<h2>Voice makes AI smarter, too</h2>
<p>Voice isn’t just faster; it changes <em>how</em> we communicate with AI.</p>
<p>Spoken interaction naturally provides:</p>
<ul>
<li>More context</li>
<li>Richer phrasing</li>
<li>Clearer intent</li>
</ul>
<p>When connected to language models and agents, voice-based interaction often results in better prompts and more accurate responses without users needing to learn prompt engineering or special syntax.</p>
<p>This makes voice a natural companion to LLMs and AI agents, an idea we explore further in our blog posts about voice-driven AI workflows.</p>
<p>In short: voice reduces friction for humans <em>and</em> for AI systems.</p>
<hr>
<h2>Early adopters gain the edge</h2>
<p>Voice technology has existed for years, but only now do we have the AI capabilities to fully unlock its potential.</p>
<p>Organizations and developers who adopt voice-first thinking today will:</p>
<ul>
<li>Build faster, more intuitive user experiences</li>
<li>Reduce interaction cost and cognitive load</li>
<li>Create systems that scale naturally with AI</li>
</ul>
<p>We’ve spent decades optimizing keyboards and touchscreens.</p>
<p>The future interface doesn’t use keyboards.</p>
<p>It speaks!</p>
<p>At OpenVoiceOS, we’re building that future: open, local, privacy-respecting, and community-driven.</p>
<p><strong>Voice isn’t just faster. It’s smarter.</strong></p>
<hr>
<h2>Help Us Build Voice for Everyone</h2>
<p>OpenVoiceOS is more than software — it’s a mission.</p>
<p>If you believe voice assistants should be open, inclusive, and user-controlled, there are many ways to help:</p>
<ul>
<li><strong>💸 Donate</strong> — support development, infrastructure, and long-term sustainability</li>
<li><strong>📣 Contribute open data</strong> — share voice samples and transcriptions under open licenses</li>
<li><strong>🌍 Translate</strong> — help make OpenVoiceOS accessible in every language</li>
</ul>
<p>We’re not building this for profit.</p>
<p>We’re building it for people.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2026-01-25-voice-first</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2026-01-25-voice-first</guid><dc:creator><![CDATA[Peter Steenbergen]]></dc:creator><pubDate>Sun, 25 Jan 2026 00:00:00 GMT</pubDate></item><item><title><![CDATA[OVOS & HiveMind in the Manufacturing Industry]]></title><description><![CDATA[<h2>OVOS &#x26; HiveMind in the Manufacturing Industry</h2>
<blockquote>
<p>This blog was originally posted in the <a href="https://tigregotico.pt/blog/2025-11-26-OVOS-hivemind-industry">TigreGotico website</a></p>
</blockquote>
<p>As the lead developer of <strong><a href="https://openvoiceos.org">OpenVoiceOS</a></strong>, maintained by a non-profit, and the creator of <strong><a href="https://jarbashivemind.github.io/HiveMind-community-docs/">HiveMind</a></strong>, I’ve always believed in open, privacy-respecting voice technology. What I did not anticipate was how quickly these tools would end up in industrial research, especially without any direct involvement from me.</p>
<p>The <strong><a href="https://coala-ai.de">COALA</a></strong> and <strong><a href="https://wasabiproject.eu">WASABI</a></strong> EU projects have built an entire industrial voice-assistant framework around <a href="https://openvoiceos.org">OVOS</a> + <a href="https://jarbashivemind.github.io/HiveMind-community-docs/">HiveMind</a>, integrating them with their own tools, UI, and conversation engines.</p>
<p>I am not involved with these deployments, but the fact that the stack is being adopted organically is a strong validation of its design.</p>
<hr>
<h2>WASABI Open Call</h2>
<p>The second <a href="https://wasabiproject.eu/wp-content/uploads/2025/08/WASABI_Guide_for_Applicants_2nd-OC_vFIN.pdf">WASABI Open Call</a>, which provides financial support to at least 10 experiments led by SMEs, recently closed.
The open call is designed to support AI-based digital assistance experiments involving SMEs from manufacturing.</p>
<p>All WASABI Open Call experiments are required to:</p>
<ul>
<li>run the <strong>WASABI/COALA OVOS Docker stack</strong></li>
<li>connect via <strong>HiveMind</strong></li>
<li>develop a custom <strong>OVOS Skill</strong> containing their industrial logic</li>
</ul>
<p>The usage of OVOS/HiveMind is explained in these two documents from the WASABI project:</p>
<ul>
<li><a href="https://wasabiproject.eu/wp-content/uploads/2024/01/WASABI_D2.1_template_v0.7_FINAL.pdf">Deliverable D2.1</a></li>
<li><a href="https://files.wasabiproject.eu/wp-content/uploads/2023/Docs/wp2/Deliverables/D2.4/WASABI_D2.4_Joint%20WASABI%20Demonstrator_v0.5_final.pdf">Deliverable D2.4</a></li>
</ul>
<p><img src="/assets/blog/ovos-hivemind-industry/wasabi_stack.png" alt="wasabi_ovos"></p>
<hr>
<h2>Examples of Industrial Applications</h2>
<h3><strong>1. Worker Guidance &#x26; Assembly Support</strong></h3>
<p>Experiments like <strong><a href="https://wasabiproject.eu/ticonai-2">TICONAI</a></strong> and <strong><a href="https://wasabiproject.eu/skite-main-2">SKITE</a></strong> are using OVOS skills to guide workers during complex tasks such as assembling components, validating procedures, or providing step-by-step instructions hands-free.</p>
<h3><strong>2. Quality Control and Error Reduction</strong></h3>
<p>Projects like <strong><a href="https://wasabiproject.eu/wallabi">WALLABI</a></strong> and <strong><a href="https://wasabiproject.eu/humanenerdia">HUMANENERDIA</a></strong> focus on providing workers with real-time instructions and checklists to prevent mistakes. Voice assistants help operators verify settings, remember safety checks, or cross-check parameters.</p>
<h3><strong>3. Predictive Maintenance Assistance</strong></h3>
<p>Experiments such as <strong><a href="https://wasabiproject.eu/genius-pm">GENIUS-PM</a></strong> use the assistant to give maintenance technicians quick access to machine health data, fault explanations, and repair steps, especially when their hands are occupied.</p>
<h3><strong>4. Logistics, Material Handling &#x26; Warehouse Support</strong></h3>
<p><strong><a href="https://wasabiproject.eu/velo-2">VELO</a></strong> and <strong><a href="https://wasabiproject.eu/aivea">AIVEA</a></strong> use voice to help workers locate items, confirm inventory, or check delivery tasks while moving around a shop floor.</p>
<h3><strong>5. Onboarding and Training</strong></h3>
<p><strong><a href="https://wasabiproject.eu/onboard">ONBOARD</a></strong> and <strong><a href="https://wasabiproject.eu/ai-mode">AI-MODE</a></strong> test how new employees can be guided through tasks using voice guidance, reducing the burden on supervisors.</p>
<h3><strong>6. Sustainability, Waste Tracking &#x26; Resource Efficiency</strong></h3>
<p><strong><a href="https://wasabiproject.eu/vafer">VAFER</a></strong> integrates voice interfaces with systems that monitor recycling, material reuse, and resource flows—hands-free reporting in factory environments.</p>
<p>All of these rely on OVOS and on HiveMind for routing communication between devices, Android UI, and backend systems.</p>
<hr>
<h2>What COALA/WASABI Built on Top of OVOS</h2>
<p>Although the projects produced no open-source industrial skills, they did create several components around OVOS + HiveMind:</p>
<h3><strong>1. A RASA-based Domain Assistant (DA)</strong></h3>
<p>Earlier COALA research developed a <strong>RASA NLP pipeline</strong> trained on manufacturing conversations (about quality checks, troubleshooting, machine operation).
In WASABI, this RASA engine is plugged into OVOS as a <strong>skill</strong>, handling domain-specific dialog.</p>
<h3><strong>2. The COALA Android App</strong></h3>
<p>An Android front-end for workers, connecting to OVOS through HiveMind.</p>
<p>An early version was released here:
<a href="https://github.com/BIBA-GmbH/Mycroft-Android">https://github.com/BIBA-GmbH/Mycroft-Android</a></p>
<p>Features include:</p>
<ul>
<li>login via Keycloak</li>
<li>text or voice chat</li>
<li>UI for instructions, warnings, and notes</li>
<li>HiveMind-based messaging</li>
</ul>
<h3><strong>3. A Full Docker-Based Industrial Stack</strong></h3>
<p>Both projects ship a preconfigured Docker environment bundling:</p>
<ul>
<li>OVOS</li>
<li>HiveMind</li>
<li>Keycloak (user management)</li>
<li>RASA NLP engine</li>
<li>COALA connector services</li>
</ul>
<p>This forms the standard industrial voice-assistant stack that all WASABI experiments must deploy.</p>
<h3><strong>4. An Industrial Speech Dataset</strong></h3>
<p>COALA published a speech dataset recorded in factories and workshops:
<a href="https://zenodo.org/record/8268928">https://zenodo.org/record/8268928</a></p>
<hr>
<h2>Why Industry Chooses OVOS + HiveMind</h2>
<p>The appeal is straightforward:</p>
<ul>
<li><strong>Full transparency</strong> (crucial for regulated sectors)</li>
<li><strong>Local/edge deployment</strong> (no cloud dependency)</li>
<li><strong>Easy to integrate into existing equipment</strong></li>
<li><strong>Modular enough for custom proprietary skills</strong></li>
<li><strong>Distributed voice networks</strong> (HiveMind satellites across a factory)</li>
</ul>
<p>In short: the combination is flexible, vendor-neutral, and respects industrial data constraints.</p>
<hr>
<h2>Closing Thoughts</h2>
<p>I didn’t set out to build an industrial standard.
I set out to build something open, reliable, and user-controlled.</p>
<p>Seeing OVOS and HiveMind adopted by COALA/WASABI, without my involvement or promotion, is a quiet but powerful sign that open-source voice technology is maturing.</p>
<p>A transparent, modular voice stack is no longer just a community dream.</p>
<p>It’s becoming part of the industrial toolset used to guide workers, reduce errors, improve maintenance, and ensure safer operations.</p>
<p>This is only the beginning.</p>
<hr>
<h2>Help Us Build Voice for Everyone</h2>
<p>OpenVoiceOS is more than software, it’s a mission. If you believe voice assistants should be open, inclusive, and user-controlled, here’s how you can help:</p>
<ul>
<li><strong>💸 Donate</strong>: Help us fund development, infrastructure, and legal protection.</li>
<li><strong>📣 Contribute Open Data</strong>: Share voice samples and transcriptions under open licenses.</li>
<li><strong>🌍 Translate</strong>: Help make OVOS accessible in every language.</li>
</ul>
<p>We're not building this for profit. We're building it for people. With your support, we can keep voice tech transparent, private, and community-owned.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2026-01-14-OVOS-hivemind-industry</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2026-01-14-OVOS-hivemind-industry</guid><dc:creator><![CDATA[JarbasAl]]></dc:creator><pubDate>Wed, 14 Jan 2026 00:00:00 GMT</pubDate></item><item><title><![CDATA[Introducing the First Phonemizer for Barranquenho]]></title><description><![CDATA[<h2>Introducing the First Phonemizer for Barranquenho</h2>
<blockquote>
<p>This blog was originally posted in the <a href="https://tigregotico.pt/blog/2025-12-12-barranquenho">TigreGotico website</a></p>
</blockquote>
<p>Today marks an exciting milestone for linguistic preservation! We're thrilled to announce the release of <a href="https://github.com/TigreGotico/g2p_barranquenho">g2p_barranquenho</a>,  the <strong>very first Grapheme-to-Phoneme (G2P) phonemizer for <a href="https://en.wikipedia.org/wiki/Barranquenho">Barranquenho</a></strong>, a truly unique Ibero-Romance language spoken in the Portuguese municipality of Barrancos.</p>
<p>This isn't just a technical achievement; it's a step towards safeguarding and revitalizing a language that embodies a rich cultural heritage.</p>
<h3>Why Barranquenho, and Why Now?</h3>
<p>Barranquenho stands at a fascinating crossroads of Portuguese and Spanish linguistic traditions, reflecting centuries of cross-border interaction. Despite its distinctiveness, resources for Barranquenho have historically been scarce, making it a challenging language for digital representation and AI development.</p>
<p>However, a monumental effort from the Barrancos Municipal Council is changing that. Just recently, the Council made available three foundational linguistic documents, an "enormous step" for Barranquenho culture:</p>
<ul>
<li><strong>Dicionário de Barranquenho (Barranquenho Dictionary)</strong></li>
<li><strong>Convenção Ortográfica do Barranquenho (Barranquenho Orthographic Convention)</strong></li>
<li><strong>Gramática Básica do Barranquenho (Basic Barranquenho Grammar)</strong></li>
</ul>
<p>These publications, highlighted in their official announcement <a href="https://cm-barrancos.pt/21976/un-enormi-passu-para-u-barranquenhu-i-para-a-cultura-barranquenha">"Un Enormi Passu para u Barranquenhu i para a Cultura Barranquenha!"</a>, provide the crucial backbone for our work. The <strong>Convenção Ortográfica do Barranquenho</strong> in particular has been instrumental, offering the consistent rules we needed to build our G2P model.</p>
<h3>What is a G2P Phonemizer?</h3>
<p>In simple terms, a G2P phonemizer is a system that takes a written word (graphemes) and converts it into its phonetic representation (phonemes). Think of it as teaching a computer how to "pronounce" a word, even if it's never heard it before. For Barranquenho, this means translating its unique spelling into the <a href="https://en.wikipedia.org/wiki/International_Phonetic_Alphabet">International Phonetic Alphabet (IPA)</a>.</p>
<p>Our rule-based phonemizer accounts for Barranquenho's distinct phonetic features, including vowel allophony, nasalization, and the specific pronunciations of consonants like 'r' and 's', which can vary significantly from standard Portuguese or Spanish.</p>
<blockquote>
<p>"Un Enormi Passu para u Barranquenhu i para a Cultura Barranquenha" -> "ũ ẽjoɾmj pasu paɾɐ u bɐrɐ͂keɲu j paɾɐ ɐ kultuɾɐ bɐrɐ͂keɲɐ"</p>
</blockquote>
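<p>Under the hood, a rule-based G2P is essentially a longest-match-first rewrite table over graphemes. The sketch below illustrates only the technique; the digraph rules are invented for demonstration and are <em>not</em> the actual Barranquenho rules implemented by <code>g2p_barranquenho</code>:</p>

```python
# Minimal longest-match-first grapheme-to-phoneme mapper.
# These rules are illustrative placeholders, not real Barranquenho rules.
RULES = {
    "nh": "ɲ",   # hypothetical palatal nasal digraph
    "lh": "ʎ",   # hypothetical palatal lateral digraph
    "ch": "tʃ",  # hypothetical affricate digraph
}

def phonemize(word: str) -> str:
    out, i = [], 0
    while i < len(word):
        # try two-character rules before single characters
        for length in (2, 1):
            chunk = word[i:i + length]
            if chunk in RULES:
                out.append(RULES[chunk])
                i += length
                break
        else:
            out.append(word[i])  # unknown graphemes pass through unchanged
            i += 1
    return "".join(out)

print(phonemize("banhu"))  # baɲu
```

<p>A real phonemizer layers many more rules on top of this core, handling context-sensitive allophony, nasalization, and stress.</p>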
<h3>The Power of this First Step</h3>
<p>There is still a long way to go before Barranquenho speakers no longer need to switch to Portuguese or Spanish to interact with voice technology. The creation of this G2P phonemizer is more than just a cool linguistic tool; it's the <strong>foundational layer</strong> for developing advanced AI capabilities for Barranquenho. With a reliable way to map written words to their sounds, we can unlock:</p>
<ul>
<li><strong>Speech-to-Text (STT) Systems:</strong> Imagine speaking in Barranquenho and having your words accurately transcribed. This could revolutionize documentation, communication, and accessibility.</li>
<li><strong>Text-to-Speech (TTS) Systems:</strong> Give digital voices the ability to speak Barranquenho, opening doors for educational tools, audiobooks, and interactive experiences.</li>
</ul>
<p>This work aligns perfectly with our mission of making voice AI accessible for everyone in any language.</p>
<h3>Join Us: Help Bring Barranquenho Voices to AI!</h3>
<p>While the orthographic convention gives us the rules for pronunciation, <strong>real-world speech data is invaluable</strong> for training robust AI models.</p>
<p><strong>We are putting out a call to the Barranquenho community and anyone passionate about linguistic preservation:</strong></p>
<p><strong>We need recordings of spoken Barranquenho!</strong></p>
<p>Whether you're a native speaker, an enthusiast, or a linguist, your voice can help us create the next generation of Barranquenho AI resources. If you are interested in contributing, please reach out to us! Every voice counts, and together, we can ensure that Barranquenho thrives in the digital age.</p>
<p>Let's make some noise for Barranquenho! 📢</p>
<hr>
<h2>Help Us Build Voice for Everyone</h2>
<p>OpenVoiceOS is more than software, it’s a mission. If you believe voice assistants should be open, inclusive, and user-controlled, here’s how you can help:</p>
<ul>
<li><strong>💸 Donate</strong>: Help us fund development, infrastructure, and legal protection.</li>
<li><strong>📣 Contribute Open Data</strong>: Share voice samples and transcriptions under open licenses.</li>
<li><strong>🌍 Translate</strong>: Help make OVOS accessible in every language.</li>
</ul>
<p>We're not building this for profit. We're building it for people. With your support, we can keep voice tech transparent, private, and community-owned.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2025-12-14-barranquenho</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2025-12-14-barranquenho</guid><dc:creator><![CDATA[JarbasAl]]></dc:creator><pubDate>Sun, 14 Dec 2025 00:00:00 GMT</pubDate></item><item><title><![CDATA[Cloning Voices for Endangered Languages: Building a Text-to-Speech Model for Asturian and Aragonese]]></title><description><![CDATA[<h2><strong>Cloning Voices for Endangered Languages: Building a Text-to-Speech Model for Asturian and Aragonese</strong></h2>
<p>Have you ever wanted to hear a computer speak in an accent you love, or in a language that's rarely supported by big tech?</p>
<p>Today we’re releasing new experimental Text-to-Speech (TTS) models for <strong><a href="https://en.wikipedia.org/wiki/Asturian_language">Asturian</a> (ast)</strong> and <strong><a href="https://en.wikipedia.org/wiki/Aragonese_language">Aragonese</a> (an)</strong>, two beautiful minority Romance languages spoken by communities who almost never get access to modern speech technology.</p>
<p>These models represent a small but meaningful step toward a larger mission: helping any under-resourced language build openly available TTS voices using only community data and ethical voice-cloning techniques.</p>
<h2><strong>Why This Work Matters</strong> - A Short Reality Check</h2>
<p>Most languages still lack even a single usable open TTS voice. Not because the technology doesn’t exist, but because:</p>
<ul>
<li>high-quality monolingual datasets are rare</li>
<li>speakers often can’t safely provide the many hours required</li>
<li>dialect diversity makes a single “official” voice unrealistic</li>
<li>basic tools (phonemizers, lexicons, G2P) often don’t exist</li>
</ul>
<p>Meanwhile, these communities still need TTS, for education, accessibility, media creation, cultural preservation, and linguistic pride.</p>
<p>Large tech companies rarely prioritise minority languages. Their incentives are simple: supporting a language only brings value if it brings new users or data.</p>
<p>For speakers of Asturian or Aragonese, who can also use Spanish, this lack of support nudges people away from their own languages. Over time, that invisibility contributes to language shift and erosion.</p>
<p>As a non-profit, we have different priorities: we want to empower all users, preserve linguistic diversity, and treat language as accessibility. This project is part of that mission.</p>
<p>So, how did we do it? We used a clever, hybrid approach that combines existing resources with cutting-edge voice cloning technology.</p>
<h3><strong>The "Low-Resource" Challenge</strong></h3>
<p>Imagine you want a computer to speak with a very specific voice, perhaps your own, or that of a beloved family member. Now imagine you only have a few seconds of that person speaking. That's our "low-resource donor voice."</p>
<p>At the same time, we have access to large <strong>Automatic Speech Recognition (ASR)</strong> datasets, like <strong>Mozilla Common Voice</strong>, which contain recordings of <em>many different people</em>. The problem is, it's not a single, consistent voice.</p>
<p>Our goal was to "transfer" the specific sound of our donor voice onto the vast amount of data available in these multi-speaker ASR datasets.</p>
<h3><strong>Our Hybrid Solution: A Step-by-Step Journey</strong></h3>
<p>Here's a simplified look at the process we followed (for a more detailed, technical explanation, check out our <strong><a href="https://tigregotico.github.io/whitepaper_hybrid_synthetic_tts_dataset.pdf">Whitepaper on Hybrid TTS Dataset Synthesis</a></strong>):</p>
<ol>
<li><strong>Gathering Our Raw Materials:</strong></li>
</ol>
<ul>
<li>We started with text and audio from two datasets: <a href="https://datacollective.mozillafoundation.org/datasets/cmflnuzw4hnmeuo2e6ea7ojbd">Common Voice Scripted Speech 23.0 - Asturian</a> and the <a href="https://datacollective.mozillafoundation.org/datasets/cmflnuzw4utkje79ymxwieas5">Common Voice Scripted Speech 23.0 - Aragonese</a>. These provided us with many text transcripts and their corresponding multi-speaker audio.</li>
<li>We also had a short recording of our "donor voice" – the target voice we wanted the TTS model to learn.</li>
</ul>
<ol start="2">
<li><strong>Audio Quality Filtering and Preparation:</strong></li>
</ol>
<ul>
<li>We converted all audio to a standard format and ensured the volume was consistent across all recordings (normalization).</li>
<li>We trimmed silence from the beginning and end of each recording.</li>
<li>We filtered out recordings where people spoke too fast or too slow (outliers based on <strong>Words-Per-Minute</strong>), keeping only the most natural and consistent segments. This focused our dataset on the best quality transcripts.</li>
</ul>
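<p>The words-per-minute filter above can be sketched in a few lines; the thresholds and record fields below are illustrative assumptions, not the project's actual pipeline values:</p>

```python
# Keep only clips whose speaking rate is plausibly natural.
def wpm(transcript: str, duration_s: float) -> float:
    """Words per minute for one clip."""
    return len(transcript.split()) / (duration_s / 60.0)

def filter_clips(clips, low=80.0, high=220.0):
    """Drop outliers spoken too slowly or too quickly (thresholds assumed)."""
    return [c for c in clips if low <= wpm(c["text"], c["duration"]) <= high]

clips = [
    {"text": "una muestra de voz natural", "duration": 2.0},  # ~150 wpm, kept
    {"text": "demasiado rapido para usar", "duration": 0.5},  # ~480 wpm, dropped
]
print(len(filter_clips(clips)))  # 1
```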
<ol start="3">
<li><strong>The Magic of Voice Cloning (Zero-Shot Revoicing):</strong></li>
</ol>
<ul>
<li>This is where modern AI comes in! Instead of training a complex model from scratch, we used an <strong>off-the-shelf zero-shot voice cloning solution</strong>.</li>
<li>This system was given a short reference clip of the <strong>donor voice</strong>. It uses this clip to learn the unique qualities of the voice.</li>
<li>We then fed our filtered ASR dataset into this cloning system. The original multi-speaker audio was discarded; the cloning tool simply generates new audio in our target donor voice. The result? A new dataset of Asturian/Aragonese audio, all spoken in a single, consistent voice!</li>
</ul>
<ol start="4">
<li><strong>Training the Final TTS Model:</strong></li>
</ol>
<ul>
<li>With our brand-new, high-quality, single-speaker datasets, we could finally train our TTS models.</li>
</ul>
<h3><strong>About Pronunciation &#x26; Phonemizers</strong></h3>
<p>These models were trained directly on graphemes; we did not use a phonemizer. Good G2P (grapheme-to-phoneme) tools for Asturian and Aragonese are scarce. A phonemizer usually improves pronunciation and, importantly, allows IPA input at runtime to force pronunciations when needed.</p>
<p>If you know of an existing phonemizer, have lexicons/pronunciation data, or want to help train one, please get in touch. This would materially improve future releases.</p>
<h3><strong>The Results:</strong></h3>
<p>The results are not perfect; our goal was mainly to validate that this approach works.</p>
<p>We used <a href="https://blog.openvoiceos.org/posts/2025-10-06-phoonnx">phoonnx</a> to train <a href="https://arxiv.org/abs/2106.06103">VITS</a> models. VITS is very performant and can run basically anywhere; it is also easy to train without needing massive GPUs, making it perfect for experimentation.</p>
<p>There are many better architectures we can explore in the future to train truly natural sounding voices!</p>
<blockquote>
<p>"L'arcu la vieya ye un fenómenu ópticu y meteorolóxicu que produz l'apaición d'un espectru de lluz continu nel cielu cuando los rayos del sol trespasen pequeñes partícules de mugor conteníes n'atmósfera terrestre. La forma ye la d'un arcu multicolor col roxo hacia la parte esterior y el viola hacia la interior. El arco iris duble, ye menos avezau a vese, y tien los colores invertíos, esto ye, el roxo hacia dientro y el viola hacia l'esterior."</p>
</blockquote>
<audio controls>
  <source src="/assets/blog/ast/dii_ast.wav" type="audio/wav">
  Your browser does not support the audio element.
</audio>
<audio controls>
  <source src="/assets/blog/ast/miro_ast.wav" type="audio/wav">
  Your browser does not support the audio element.
</audio>
<p>Download: <a href="https://huggingface.co/OpenVoiceOS/phoonnx_ast_dii_unicode">Asturian — dii (female)</a> and <a href="https://huggingface.co/OpenVoiceOS/phoonnx_ast_miro_unicode">Asturian — miro (male)</a></p>
<blockquote>
<p>"L'arco de sant Chuan ye un fenomeno optico y meteorolochico que produce l'aparición d'un espectro de luz contino en o cielo cuan os rayos d'o sol trescruzan chicotas particlas d'humidat situatas por l'atmosfera terrestre. A forma suya ye a d'un arco multicolor con o royo en a parti exterior y o morau en a interior. No ye tan cutiano l'arco de sant Chuan dople, que incluye un segundo arco mas tenue con as colors chiratas, ye dicir o royo en l'interior y o morau en l'exterior."</p>
</blockquote>
<audio controls>
  <source src="/assets/blog/ast/dii_an.wav" type="audio/wav">
  Your browser does not support the audio element.
</audio>
<audio controls>
  <source src="/assets/blog/ast/miro_an.wav" type="audio/wav">
  Your browser does not support the audio element.
</audio>
<p>Download: <a href="https://huggingface.co/OpenVoiceOS/phoonnx_an_dii_unicode">Aragonese — dii (female)</a> and <a href="https://huggingface.co/OpenVoiceOS/phoonnx_an_miro_unicode">Aragonese — miro (male)</a></p>
<p>These models are a significant step forward for Asturian and Aragonese language technology. They demonstrate how modern AI, combined with careful data preparation, can empower underserved languages and bring them into the digital age. We're excited to see what developers and enthusiasts will build with them!</p>
<hr>
<h2>Help Us Build Voice for Everyone</h2>
<p>OpenVoiceOS is more than software, it’s a mission. If you believe voice assistants should be open, inclusive, and user-controlled, here’s how you can help:</p>
<ul>
<li><strong>💸 Donate</strong>: Help us fund development, infrastructure, and legal protection.</li>
<li><strong>📣 Contribute Open Data</strong>: Share voice samples and transcriptions under open licenses.</li>
<li><strong>🌍 Translate</strong>: Help make OVOS accessible in every language.</li>
</ul>
<p>We're not building this for profit. We're building it for people. With your support, we can keep voice tech transparent, private, and community-owned.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2025-12-09-ast</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2025-12-09-ast</guid><dc:creator><![CDATA[JarbasAl]]></dc:creator><pubDate>Tue, 09 Dec 2025 00:00:00 GMT</pubDate></item><item><title><![CDATA[Good Old-Fashioned AI: The Secret Ingredient in a Modern Voice Assistant]]></title><description><![CDATA[<h1>Good Old-Fashioned AI: The Secret Ingredient in a Modern Voice Assistant</h1>
<p>In an era dominated by probabilistic giants like Large Language Models (LLMs), it might seem counterintuitive to advocate for a more traditional approach to Artificial Intelligence. Yet, OpenVoiceOS (OVOS) deliberately grounds its core architecture in <strong>Symbolic AI</strong>, often referred to as <strong>Good Old-Fashioned AI (GOFAI)</strong>.</p>
<p>This isn’t nostalgia, it’s engineering discipline. GOFAI provides <strong>predictable precision</strong> where voice assistants need it most: deterministic, fast, and transparent decision-making. Rather than chasing generalized intelligence, OVOS builds on logic, structure, and explainability, the ingredients of reliability.</p>
<hr>
<h2>The GOFAI Philosophy: Precision Over Probability</h2>
<p>Symbolic AI, the dominant paradigm from the 1950s through the 1990s, is built on explicit rules and formal logic. Instead of learning from massive datasets, it reasons with structured knowledge.</p>
<p>While LLMs are masters of generalization, they operate in a world of <strong>probabilities</strong>, a risky trait for voice assistants. For a task like “set a timer for three hours,” <em>almost correct</em> isn’t good enough. OVOS rejects this uncertainty for the foundational layers of its system, relying on GOFAI to handle the tasks that demand <strong>perfect precision</strong>: parsing numbers, times, colors, and other concrete entities.</p>
<p>This represents a deliberate trade-off: OVOS accepts the cost of manual rule design in exchange for <strong>guaranteed correctness</strong>, <strong>full transparency</strong>, and <strong>instant local execution</strong>, traits essential for embedded, privacy-respecting systems.</p>
<h3>Key Advantages of the Rule-Based Approach</h3>
<ul>
<li>
<p><strong>🪞 Interpretability and Debugging</strong> – Every rule is explicit. When something goes wrong, a developer can pinpoint <em>exactly</em> where and why. This transparency makes open-source collaboration possible: any contributor can trace and fix an issue without black-box guesswork.</p>
</li>
<li>
<p><strong>⚡ Performance and Efficiency</strong> – Static rule sets mean ultra-low latency and negligible compute load. Unlike LLMs that require large memory and GPU cycles, GOFAI parsers run instantly on modest hardware, ideal for offline or embedded devices.</p>
</li>
<li>
<p><strong>🎯 Guaranteed Precision</strong> – GOFAI is often criticized as “brittle” beyond its domain. But for tightly scoped operations, like setting timers, adjusting volume, or parsing commands, this rigidity is a feature, not a flaw. It ensures the system behaves predictably every single time.</p>
</li>
</ul>
<hr>
<h2>The OVOS Parser Toolkit: GOFAI in Action</h2>
<p>OVOS integrates a family of <strong>rule-based parsers</strong> that each specialize in a precise data type. They’re distributed as standalone Python packages—available to any developer needing high-precision, multilingual parsing.</p>
<p>Within OVOS, these parsers aren’t used for broad Named Entity Recognition (NER). Instead, they’re invoked <em>after</em> an intent is identified. Once a skill knows <em>what</em> kind of data to expect, these parsers ensure it’s extracted and converted flawlessly.</p>
<hr>
<h3>Numerical Mastery — ovos-number-parser</h3>
<p>Parsing numbers is deceptively complex. Humans use digits, words (“twenty”), ordinals (“first”), and fractions (“half”), often mixing them fluidly.
<a href="https://github.com/OpenVoiceOS/ovos-number-parser"><code>ovos-number-parser</code></a> handles these with meticulous, multilingual rule sets to produce exact numeric values.</p>
<p>This matters more than it seems: even “one billion” means different things across languages. English uses the <strong>short scale</strong> (1,000,000,000), while French or German follow the <strong>long scale</strong> (1,000,000,000,000). Rule-based, language-aware parsing guarantees consistent results across locales, critical for mathematical accuracy in global deployments.</p>
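<p>To make the approach concrete, here is a miniature rule-based number parser. This is an illustrative sketch, not the <code>ovos-number-parser</code> API: the function name, tables, and locale handling are simplified stand-ins, but the explicit lookup tables and the short-scale/long-scale distinction mirror the design described above.</p>

```python
# Illustrative sketch of rule-based number parsing -- NOT the actual
# ovos-number-parser API, just the GOFAI idea in miniature.

UNITS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
         "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}
# Scale words differ per locale: English "billion" is 10**9 (short
# scale), while e.g. German "Billion" is 10**12 (long scale).
SCALES = {"en": {"hundred": 100, "thousand": 1_000,
                 "million": 10**6, "billion": 10**9},
          "de": {"hundert": 100, "tausend": 1_000,
                 "million": 10**6, "billion": 10**12}}

def extract_number(text: str, lang: str = "en") -> int:
    """Resolve a spoken number phrase to an exact integer."""
    scales = SCALES[lang]
    total, current = 0, 0
    for word in text.lower().replace("-", " ").split():
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word in scales:
            current = max(current, 1) * scales[word]
            if scales[word] >= 1000:   # commit a completed group
                total += current
                current = 0
    return total + current
```

<p>Because every mapping is an explicit rule, the result for a phrase like “one billion” is guaranteed and auditable in each locale the tables cover.</p>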
<hr>
<h3>Temporal Precision — ovos-date-parser</h3>
<p>Time is contextual: “tomorrow at noon” means nothing without a reference point.
<a href="https://github.com/OpenVoiceOS/ovos-date-parser"><code>ovos-date-parser</code></a> converts these human phrases into exact <code>datetime</code> objects, handling relative expressions like “three days ago” or “next Friday” with surgical precision.</p>
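<p>A minimal sketch of how deterministic relative-date resolution can work (illustrative only; the real <code>ovos-date-parser</code> API and language coverage are far richer). The key GOFAI property: every phrase is resolved against an explicit anchor <code>datetime</code>, so the output is fully reproducible.</p>

```python
# Sketch of deterministic relative-date resolution, in the spirit of
# ovos-date-parser (function name and coverage are illustrative).
import re
from datetime import datetime, timedelta

_UNITS = {"hour": "hours", "hours": "hours",
          "day": "days", "days": "days",
          "week": "weeks", "weeks": "weeks"}
_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}

def extract_datetime(phrase: str, anchor: datetime) -> datetime:
    """Resolve phrases like 'in three hours' or 'three days ago'."""
    m = re.search(r"(\w+)\s+(hours?|days?|weeks?)", phrase.lower())
    if not m:
        raise ValueError(f"unrecognized phrase: {phrase!r}")
    qty = _WORDS.get(m.group(1)) or int(m.group(1))
    delta = timedelta(**{_UNITS[m.group(2)]: qty})
    # "ago" flips the direction relative to the anchor
    return anchor - delta if "ago" in phrase.lower() else anchor + delta
```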
<hr>
<h3>Identifying the Spectrum — ovos-color-parser</h3>
<p>For tasks like setting smart light colors or adjusting UI themes, <a href="https://github.com/OpenVoiceOS/ovos-color-parser"><code>ovos-color-parser</code></a> maps natural language directly to color values.
It recognizes everything from “dark goldenrod” to RGB notation using explicit syntax rules and lookup tables.</p>
<p>This is a textbook GOFAI application: a deterministic, low-latency parser that achieves perfect accuracy without machine learning overhead.</p>
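<p>The same philosophy can be sketched in a few lines (the table and function below are illustrative, not the <code>ovos-color-parser</code> API): explicit syntax rules for RGB notation plus a name lookup table, with no statistical guessing anywhere.</p>

```python
# Sketch of lookup-table color parsing -- illustrative, not the
# actual ovos-color-parser API.
import re

COLOR_TABLE = {
    "cyan": (0, 255, 255),
    "dark goldenrod": (184, 134, 11),
    "red": (255, 0, 0),
}

def parse_color(text: str) -> tuple:
    """Map a color name or 'rgb(r, g, b)' notation to an RGB triple."""
    text = text.strip().lower()
    m = re.fullmatch(r"rgb\((\d+),\s*(\d+),\s*(\d+)\)", text)
    if m:                                  # explicit syntax rule
        return tuple(int(g) for g in m.groups())
    if text in COLOR_TABLE:                # deterministic lookup
        return COLOR_TABLE[text]
    raise ValueError(f"unknown color: {text!r}")
```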
<hr>
<h3>The Language Linchpin — ovos-lang-parser</h3>
<p>The <a href="https://github.com/OpenVoiceOS/ovos-lang-parser"><code>ovos-lang-parser</code></a> extracts language references from user commands; for instance, it turns
“How do I say ‘hello’ in French?” into <code>"French" → "fr"</code> (a BCP-47 code).</p>
<hr>
<h2>The Symphony: GOFAI in Harmony</h2>
<p>The power of OVOS’s design becomes clear when these parsers work together.</p>
<p>Consider what needs to happen in the command:</p>
<blockquote>
<p>“In three hours, the light should be set to cyan at fifty percent brightness.”</p>
</blockquote>
<ol>
<li>The intent engine activates the relevant skills (timer and light control).</li>
<li>The timer skill uses <code>ovos-date-parser</code> to resolve “in three hours” to a future <code>datetime</code>.</li>
<li>The light control skill uses <code>ovos-color-parser</code> to map “cyan” to an exact color code.</li>
<li>The same skill calls <code>ovos-number-parser</code> to convert “fifty percent” to the value <code>0.5</code>.</li>
</ol>
<p>All of this happens locally, instantly, and deterministically: no cloud, no lag, no guesswork.</p>
<hr>
<h2>The Future is Collaborative and Deterministic</h2>
<p>Building on GOFAI is not a step backward; it’s a step toward <strong>trustworthy AI</strong>. OVOS’s rule-based foundation ensures reproducibility, transparency, and full user control.</p>
<p>While refining these rules demands human expertise (the classic <em>knowledge bottleneck</em>), OVOS turns that constraint into a strength. By inviting developers to become <strong>knowledge engineers</strong>, it transforms AI development into a collective, interpretable, and sustainable process.</p>
<p>In a world chasing statistical magic, OVOS stands as proof that <strong>determinism is not a limitation; it’s reliability</strong>.
A system you can understand is a system you can trust.</p>
<hr>
<h2>Help Us Build Voice for Everyone</h2>
<p>OpenVoiceOS is more than software; it’s a mission. If you believe voice assistants should be open, inclusive, and user-controlled, here’s how you can help:</p>
<ul>
<li><strong>💸 Donate</strong>: Help us fund development, infrastructure, and legal protection.</li>
<li><strong>📣 Contribute Open Data</strong>: Share voice samples and transcriptions under open licenses.</li>
<li><strong>🌍 Translate</strong>: Help make OVOS accessible in every language.</li>
</ul>
<p>We're not building this for profit. We're building it for people. With your support, we can keep voice tech transparent, private, and community-owned.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2025-11-25-gofai</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2025-11-25-gofai</guid><dc:creator><![CDATA[JarbasAl]]></dc:creator><pubDate>Tue, 25 Nov 2025 00:00:00 GMT</pubDate></item><item><title><![CDATA[OVOS Just Got a Noise Filter: Better Listening, Less Interruption]]></title><description><![CDATA[<h2>OVOS Just Got a Noise Filter: Better Listening, Less Interruption</h2>
<p>We're thrilled to announce a new feature in OpenVoiceOS (OVOS) that aims to improve how your device listens: <strong>Pre-Wake-VAD</strong> (Voice Activity Detection). This new logic improves performance, reduces false activations, and makes OVOS an even more delightful and reliable voice assistant.</p>
<hr>
<h3>The Old Way vs. The New Breakthrough</h3>
<p>Before <strong>Pre-Wake-VAD</strong>, the OVOS listener loop operated in a straightforward manner:</p>
<ol>
<li>The system was <strong>always listening</strong> for the wake word (WW).</li>
<li>Upon hearing the WW, it started recording the user's request.</li>
<li><strong>VAD</strong> was used <em>after</em> the wake word to detect when the user <strong>finished speaking</strong>.</li>
</ol>
<p>This constant listening for the wake word consumed more CPU and, critically, was susceptible to false activations. Random noises, music, or even animal sounds could sometimes trigger the wake word.</p>
<h4>The <strong>Pre-Wake-VAD</strong> Logic: A Smarter Sequence</h4>
<p>With the new feature, we've flipped the script. The system now follows a more intelligent, two-stage listening process:</p>
<ol>
<li><strong>Wait for Speech (VAD First):</strong> The system first waits to detect <em>any</em> human speech using <strong>VAD</strong>.</li>
<li><strong>Listen for Wake Word (WW Second):</strong> Only <em>after</em> speech is detected does the system engage the more CPU-intensive <strong>wake word engine</strong>.</li>
<li><strong>Timeout:</strong> If the wake word is not detected within 5 seconds of the initial speech detection, the system assumes it was not an activation and returns to the VAD-only 'wait for speech' state.</li>
</ol>
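<p>The stages above can be sketched as a tiny state machine (illustrative pseudologic, not the <code>ovos-dinkum-listener</code> source): VAD gates entry to the wake-word stage, and the timeout drops the loop back to the cheap VAD-only state.</p>

```python
# Sketch of the two-stage listening loop -- illustrative only, not
# the actual ovos-dinkum-listener implementation.
from enum import Enum, auto

class State(Enum):
    WAIT_FOR_SPEECH = auto()   # stage 1: cheap VAD only
    WAIT_FOR_WW = auto()       # stage 2: wake-word engine engaged

def step(state, is_speech, ww_detected, elapsed, timeout=5.0):
    """Advance the listener by one audio chunk; return the next state."""
    if state is State.WAIT_FOR_SPEECH:
        # run only VAD until any human speech is heard
        return State.WAIT_FOR_WW if is_speech else state
    if ww_detected:
        # activation: hand off to recording, then start over
        return State.WAIT_FOR_SPEECH
    if elapsed > timeout:
        # no wake word within the window: assume false trigger
        return State.WAIT_FOR_SPEECH
    return state
```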
<hr>
<h3>Two Massive Wins for OVOS Users</h3>
<p>This seemingly small change delivers two enormous benefits:</p>
<h4>CPU Reduction</h4>
<p>VAD is inherently less expensive to run than a complex wake word engine. By spending the vast majority of its idle time in the low-cost VAD stage, OVOS uses slightly less CPU overall. This is fantastic news for users running OVOS on resource-constrained or battery-powered devices.</p>
<h4>Reduced False Activations</h4>
<p>This is the real game-changer. The wake word engine no longer has to focus on distinguishing <strong>wake-word speech</strong> from <strong>non-speech</strong> sounds like music, background noise, or random audio spikes.</p>
<ul>
<li>The <strong>VAD</strong> acts as a powerful <strong>noise filter</strong>, ensuring only <em>actual human speech</em> reaches the wake word engine.</li>
<li>The <strong>wake word engine</strong> can now concentrate solely on its core task: distinguishing <strong>wake-word speech</strong> from <strong>non-wake-word speech</strong>.</li>
</ul>
<p>This collaborative approach has been shown to massively reduce the false activation rate of OVOS, making your interaction with the assistant smoother and less interrupted!</p>
<hr>
<h3>A Huge Boost for the Vosk Wake Word Plugin</h3>
<p>The introduction of Pre-Wake-VAD is particularly exciting for the popular <a href="https://github.com/OpenVoiceOS/ovos-ww-plugin-vosk"><strong>Vosk wake word plugin</strong></a>.</p>
<p>For those unfamiliar, the Vosk plugin is incredible because it allows users to change their wake word simply by editing a text string in a config file: no data collection, no model training required! However, this flexibility sometimes came with a higher false activation rate.</p>
<p>With <strong>Pre-Wake-VAD</strong>, that concern is largely eliminated. By getting rid of false positives caused by non-speech audio, this new feature makes the versatile, text-configurable Vosk plugin far more usable and reliable for everyone.</p>
<p>Stay tuned! We’ll be sharing the comprehensive benchmark results in an upcoming blog post, but early results confirm a massive difference in false activations!</p>
<hr>
<h2>How to Enable Pre-Wake-VAD</h2>
<p>To start using this new noise-filtering feature, you need to ensure you have the correct version of the listener package and update your configuration file.</p>
<p>You must be running at least version <strong>0.5.0</strong> of the listener package. You can update it using pip:</p>
<pre><code class="hljs language-bash">pip install --upgrade ovos-dinkum-listener
</code></pre>
<p>Add the following parameter inside the <code>"listener"</code> section of your <code>mycroft.conf</code> file to enable the Pre-Wake-VAD logic:</p>
<pre><code class="hljs language-json"><span class="hljs-punctuation">{</span>
  <span class="hljs-attr">"listener"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span>
    <span class="hljs-attr">"vad_pre_wake_enabled"</span><span class="hljs-punctuation">:</span> <span class="hljs-literal"><span class="hljs-keyword">true</span></span>
  <span class="hljs-punctuation">}</span>
<span class="hljs-punctuation">}</span>
</code></pre>
<p>Once both steps are complete, restart your OVOS service, and you'll be using the new, smarter listening loop!</p>
<hr>
<h2>Help Us Build Voice for Everyone</h2>
<p>OpenVoiceOS is more than software; it’s a mission. If you believe voice assistants should be open, inclusive, and user-controlled, here’s how you can help:</p>
<ul>
<li><strong>💸 Donate</strong>: Help us fund development, infrastructure, and legal protection.</li>
<li><strong>📣 Contribute Open Data</strong>: Share voice samples and transcriptions under open licenses.</li>
<li><strong>🌍 Translate</strong>: Help make OVOS accessible in every language.</li>
</ul>
<p>We're not building this for profit. We're building it for people. With your support, we can keep voice tech transparent, private, and community-owned.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2025-11-06-prewake-vad</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2025-11-06-prewake-vad</guid><dc:creator><![CDATA[JarbasAl]]></dc:creator><pubDate>Thu, 06 Nov 2025 00:00:00 GMT</pubDate></item><item><title><![CDATA[Precise Wake Word Engine Goes ONNX!]]></title><description><![CDATA[<h2>Precise Wake Word Engine Goes ONNX!</h2>
<hr>
<p>For years, the OpenVoiceOS community has relied on the <a href="https://github.com/OpenVoiceOS/ovos-ww-plugin-precise-lite">Precise-lite</a> wake word engine. It's the core of how you bring your voice assistant to life! However, as technology evolves, so must we. The time has come to ensure future compatibility and even better performance on modern hardware.</p>
<h2>The Problem: Outdated Dependencies</h2>
<p>The original Precise implementation depended heavily on <code>tflite_runtime</code> and specific, older versions of other packages, notably requiring <code>numpy&#x3C;2.0.0</code>. This setup has been a growing headache:</p>
<ul>
<li><strong>⚠️ Deprecation:</strong> <code>tflite_runtime</code> is becoming increasingly difficult to install and maintain, especially with recent versions of Python.</li>
<li><strong>🐍 Python Version Incompatibility:</strong> The strict dependency on older packages was creating friction and installation failures for users adopting newer Python releases.</li>
</ul>
<p>This friction was an unnecessary barrier to entry for new users and a source of frustration for existing ones. We knew we had to fix it.</p>
<hr>
<h2>The Solution: Migrating to ONNX!</h2>
<p>We've completely overhauled the Precise wake word plugin by <strong>exporting Precise models to the ONNX format</strong> (Open Neural Network Exchange).</p>
<h3>Introducing ovos-ww-plugin-precise-onnx</h3>
<p>We are thrilled to announce the official release of the new plugin: <a href="https://github.com/OpenVoiceOS/ovos-ww-plugin-precise-onnx"><code>ovos-ww-plugin-precise-onnx</code></a>!</p>
<p>By adopting ONNX, we eliminate the reliance on the problematic <code>tflite-runtime</code> package. ONNX provides a standardized, performant, and future-proof format for machine learning models, ensuring better compatibility across different systems and hardware.</p>
<p>You can find <strong>all available Precise models</strong>, now converted to ONNX, in our dedicated repository: <a href="https://github.com/OpenVoiceOS/precise-lite-models"><code>precise-lite-models</code></a>.</p>
<hr>
<h2>The Unexpected Performance Bonus</h2>
<p>While the primary goal was to fix installation woes, the move to ONNX has delivered an <strong>unexpected performance boost!</strong></p>
<p>Initial testing on a Raspberry Pi 5 shows a significant reduction in CPU usage:</p>
<table>
<thead>
<tr>
<th align="left">Wake Word Engine</th>
<th>RPi 5 Listener Process CPU Usage (Approx.)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left"><strong>Precise (<code>tflite-runtime</code>)</strong></td>
<td>~1.8% - 2.6%</td>
</tr>
<tr>
<td align="left"><strong>Precise (ONNX)</strong></td>
<td><strong>~1.0% - 1.3%</strong></td>
</tr>
</tbody>
</table>
<p>That's a nice efficiency gain, allowing your OVOS system to dedicate more resources to running skills and other tasks! This makes OpenVoiceOS even lighter and faster on edge devices like the Raspberry Pi.</p>
<hr>
<h2>Get the Upgrade Today!</h2>
<p>This migration ensures that the popular Precise engine remains a stable, high-performance option for the OpenVoiceOS community for years to come. Enjoy the easier installation, greater stability, and lower CPU usage!</p>
<p>Update your installation and try the new <strong>ONNX-powered Precise</strong> for yourself!</p>
<ul>
<li><strong>New Plugin:</strong> <a href="https://github.com/OpenVoiceOS/ovos-ww-plugin-precise-onnx"><code>ovos-ww-plugin-precise-onnx</code></a></li>
<li><strong>ONNX Models:</strong> <a href="https://github.com/OpenVoiceOS/precise-lite-models"><code>precise-lite-models</code></a></li>
</ul>
<hr>
<h2>Help Us Build Voice for Everyone</h2>
<p>If you believe that voice assistants should be open, inclusive, and user-controlled, we invite you to support OVOS:</p>
<ul>
<li>
<p><strong>💸 Donate</strong>: Your contributions help us pay for infrastructure, development, and legal protections.</p>
</li>
<li>
<p><strong>📣 Contribute Open Data</strong>: Speech models need diverse, high-quality data. If you can share voice samples, transcripts, or datasets under open licenses, let's collaborate.</p>
</li>
<li>
<p><strong>🌍 Help Translate</strong>: OVOS is global by nature. Translators make our platform accessible to more communities every day.</p>
</li>
</ul>
<p>We're not building this for profit. We're building it for people. And with your help, we can ensure open voice has a future—transparent, private, and community-owned.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2025-11-03-precise-onnx</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2025-11-03-precise-onnx</guid><dc:creator><![CDATA[JarbasAl]]></dc:creator><pubDate>Mon, 03 Nov 2025 00:00:00 GMT</pubDate></item><item><title><![CDATA[Building an Open and Interoperable Voice Ecosystem]]></title><description><![CDATA[<h2>Building an Open and Interoperable Voice Ecosystem</h2>
<p>Open Voice OS (OVOS) has always been about <strong>freedom and flexibility</strong>, giving users full control over their voice assistants and how they connect with the world. But freedom also brings a challenge: ensuring that all these independent components can <em>understand</em> and <em>work with</em> each other.</p>
<p>That’s where <strong>standards and interoperability</strong> come in.</p>
<p>Today, OVOS is doubling down on its commitment to <strong>protocol alignment</strong>, not by locking into a single ecosystem, but by <strong>speaking the languages of many</strong>. This post outlines where we are now and where we’re going, as OVOS evolves into a truly interoperable, protocol-aware voice platform. Much of this work is supported by the <a href="https://nlnet.nl/project/OpenVoiceOS/"><strong>NGI0 Commons Fund</strong></a>, helping us build a more open, connected, and privacy-respecting voice ecosystem.</p>
<hr>
<h2>Why Standards Matter</h2>
<p>Voice technology thrives when systems can communicate. Whether you're connecting a speech-to-text engine to a dialogue manager, or bridging a local agent to a cloud service, the key is <strong>clear, consistent interfaces</strong>.</p>
<p>Interoperability means:</p>
<ul>
<li>🧩 <strong>Plug-and-play</strong> integration between tools, frameworks, and agents</li>
<li>⚙️ <strong>Reuse</strong> of existing infrastructure instead of reinventing the wheel</li>
<li>🔄 <strong>Resilience</strong> - the freedom to swap out components without breaking everything else</li>
</ul>
<p>In short: standards keep the open in Open Voice OS.</p>
<hr>
<h2>Near-Future Work: MCP, UTCP, and A2A</h2>
<p>A major focus for upcoming OVOS releases is <strong>protocol-level interoperability</strong>. Several new standards are being explored to allow OVOS to seamlessly connect with external AI ecosystems.</p>
<p>This is part of OVOS’s broader move toward <strong>multi-agent systems</strong>, exemplified by the <strong><code>ovos-persona-pipeline</code></strong>, which allows users to <strong>change the personality</strong> of OVOS on demand, effectively switching to an entirely new agent context.
These same personas will be the ones participating in MCP, UTCP, and A2A communication once these protocols land.</p>
<p><img src="/assets/blog/protocol_interoperability/agentic_personas.png" alt="Agentic solver plugins"></p>
<hr>
<h3>Model Context Protocol (MCP) and Universal Tool Calling Protocol (UTCP)</h3>
<p>The <strong><a href="https://modelcontextprotocol.info">Model Context Protocol (MCP)</a></strong> defines how agents and tools can exchange structured context and reasoning requests. In the near future, OVOS plans to both <strong>consume</strong> MCP-compatible tools and <strong>expose</strong> its own services (like STT, TTS, translation, and skills) over MCP.</p>
<p>This would allow external systems — including other assistants or orchestration layers — to treat OVOS capabilities as MCP tools.</p>
<p>In parallel, we’re also experimenting with <strong><a href="https://www.utcp.io">Universal Tool Calling Protocol (UTCP)</a></strong>.
While MCP and UTCP have overlapping goals, they serve slightly different audiences. OVOS intends to support <strong>both</strong>, ensuring maximum compatibility and easy integration across ecosystems.</p>
<blockquote>
<p>“We like UTCP but we love interoperability.”</p>
</blockquote>
<p>Our goal is to make OVOS a universal interface layer, capable of understanding and serving requests in either protocol.</p>
<p><img src="/assets/blog/protocol_interoperability/tools.png" alt="MCP/UTCP with OpenVoiceOS"></p>
<hr>
<h3>Agent-to-Agent Protocol (A2A)</h3>
<p>Finally, we’re integrating the <strong><a href="https://a2a-protocol.org">Agent-to-Agent (A2A)</a></strong> protocol to allow multiple agents to discover, communicate, and collaborate dynamically.</p>
<p>This work is already underway in the <strong><code>ovos-persona-server</code></strong>, and will eventually power <strong>multi-agent orchestration</strong> where different personas or solver plugins can coordinate tasks collaboratively.</p>
<p><img src="/assets/blog/protocol_interoperability/a2a.png" alt="A2A protocol with OpenVoiceOS"></p>
<hr>
<h2>The OVOS Messagebus Protocol</h2>
<p>Under the hood, OVOS uses a <strong>websocket-based JSON messagebus</strong> to communicate internally. Historically, message formats were somewhat ad hoc, but that’s changing.</p>
<p>An <a href="https://github.com/OpenVoiceOS/ovos-pydantic-models"><strong>index of Pydantic models</strong></a> is now being developed to describe all known OVOS message types, forming what we call the <strong>OVOS Messagebus Protocol</strong>.</p>
<p>This documentation effort will make it easier for external tools, dashboards, or bridges (like HiveMind) to interact with OVOS safely and predictably.</p>
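<p>For illustration, a messagebus message carries a <code>type</code> plus <code>data</code> and <code>context</code> payloads, serialized as JSON over the websocket. The sketch below uses a stdlib dataclass purely to show the shape; the actual modeling effort uses Pydantic, and only the <code>recognizer_loop:utterance</code> message type is taken from real OVOS usage.</p>

```python
# Shape of an OVOS messagebus message, sketched with a stdlib
# dataclass (the real schema work uses Pydantic models).
import json
from dataclasses import dataclass, field

@dataclass
class Message:
    msg_type: str
    data: dict = field(default_factory=dict)
    context: dict = field(default_factory=dict)

    def serialize(self) -> str:
        """Render the message as the JSON sent over the websocket."""
        return json.dumps({"type": self.msg_type,
                           "data": self.data,
                           "context": self.context})

msg = Message("recognizer_loop:utterance",
              data={"utterances": ["what time is it"]},
              context={"source": "audio"})
```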
<p><img src="/assets/blog/protocol_interoperability/bus.png" alt="OpenVoiceOS messagebus"></p>
<hr>
<h2>HiveMind: A Transport Protocol for Federated Voice Networks</h2>
<p><a href="https://jarbashivemind.github.io/HiveMind-community-docs/"><strong>HiveMind</strong></a> is a <strong>hierarchical transport protocol</strong>, defining clear rules for how messages are routed and how nodes communicate across a distributed network.</p>
<p>This means OVOS can operate as just one participant within a much larger HiveMind network or power that network entirely.</p>
<p>While HiveMind was originally designed to support <strong>OVOS Messages</strong>, it’s <strong>agent-agnostic</strong>, capable of transporting any kind of message for any kind of AI agent.
HiveMind achieves this flexibility through <strong>HiveMind agent plugins</strong>, which act as adapters between HiveMind and the agent logic itself.</p>
<p><img src="/assets/blog/protocol_interoperability/hm.png" alt="HiveMind"></p>
<p>The <strong>reference plugin</strong> uses OVOS as the agent, but <strong>any agent</strong> can be integrated, as long as the plugin can <strong>consume and emit OVOS messages</strong>, translating them to whatever the target agent requires.</p>
<p>This architecture enables a distributed ecosystem where:</p>
<ul>
<li>Different devices can run <strong>different agents</strong>, each with its own intelligence and capabilities.</li>
<li>All HiveMind satellites remain compatible and interconnected.</li>
<li>You can use the <strong>OVOS audio stack and plugin ecosystem</strong> (for STT, TTS, wake words, etc.) with <strong>any non-OVOS agent</strong>, simply by writing a small <strong>HiveMind agent plugin</strong> wrapper.</li>
<li><code>hivemind-a2a-agent-plugin</code> will allow connecting HiveMind voice satellites to any A2A agent.</li>
</ul>
<p>With the ongoing effort to formalize the <strong>OVOS Messagebus Protocol</strong>, HiveMind will soon align even more closely, officially carrying the same message definitions inside, effectively becoming an <strong>implementation of the OVOS Messagebus over the HiveMind protocol</strong>.</p>
<hr>
<h2>OVOS Plugin Manager: Interoperability by Design</h2>
<p>At the core of OVOS’s modularity is the <strong>OVOS Plugin Manager</strong>, which allows <em>every single OVOS component</em> to be swapped out dynamically.</p>
<p>In this context, the “protocol” is a <strong>shared base class</strong> with a well-defined API that each plugin implements. This ensures that all plugins, no matter where they’re deployed, expose the same interface to consumers.</p>
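<p>The pattern can be sketched as an abstract base class (class and method names below are simplified illustrations, not the actual OVOS Plugin Manager base classes): consumers code against the contract, so any implementation can be swapped in without touching the caller.</p>

```python
# Illustrative "protocol as shared base class" sketch -- NOT the real
# OVOS Plugin Manager API, just the design idea.
from abc import ABC, abstractmethod

class STTPlugin(ABC):
    """Contract every speech-to-text plugin must fulfil."""
    @abstractmethod
    def transcribe(self, audio: bytes, lang: str = "en-US") -> str:
        ...

class EchoSTT(STTPlugin):
    """Trivial stand-in implementation, e.g. for testing pipelines."""
    def transcribe(self, audio: bytes, lang: str = "en-US") -> str:
        # pretend the audio bytes already contain the transcript
        return audio.decode("utf-8", errors="ignore")
```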
<p>This approach allows the <strong>same plugin</strong> to run anywhere:</p>
<ul>
<li>as a standalone <strong>HTTP microservice</strong> (e.g., STT, TTS, translation, or persona server),</li>
<li>locally on an <strong>OVOS device</strong>,</li>
<li>under a <strong>HiveMind satellite</strong>, or</li>
<li>as a service under <strong>HiveMind server</strong>.</li>
</ul>
<p>It can even be embedded into other projects that want immediate access to OVOS’s plugin ecosystem <strong>without importing the full OVOS stack</strong>.</p>
<p>Each plugin is effectively an <strong>interoperability layer across technologies</strong>: reusable, self-contained, and designed to connect to anything.</p>
<blockquote>
<p>Use only the pieces you need without losing compatibility.</p>
</blockquote>
<p><img src="/assets/blog/protocol_interoperability/opm.png" alt="OpenVoiceOS plugin servers"></p>
<hr>
<h2>The Big Picture: “Connect Anything to OVOS, and OVOS to Anything”</h2>
<p>All these efforts — MCP, UTCP, A2A, HiveMind, the Messagebus Protocol, the Plugin Manager, and even <a href="https://blog.openvoiceos.org/posts/2025-09-17-ovos_ha_dream_team">Wyoming adapters</a> — share one goal:
to make OVOS a <strong>universal connector</strong> in the voice and AI ecosystem.</p>
<p>Whether you’re using local models, cloud APIs, or other assistants, OVOS aims to act as the <strong>interoperability layer</strong> that ties them together.</p>
<p>The result?
An assistant that doesn’t lock you in; it <strong>opens you up</strong> to an entire universe of tools, models, and agents.</p>
<hr>
<h3>Work in Progress</h3>
<p>These initiatives are all part of ongoing research and development. MCP, UTCP, and A2A are <strong>planned but not yet implemented</strong>, while HiveMind, the Messagebus documentation, and standalone OVOS microservices are active and evolving.</p>
<p>The roadmap is ambitious but clear:
make OVOS the <strong>most interoperable open-source assistant platform</strong> in existence.</p>
<p>With support from the <a href="https://nlnet.nl/project/OpenVoiceOS/"><strong>NGI0 Commons Fund</strong></a>, we’re investing in open standards, transparent protocols, and bridges that connect communities across the open-source AI and voice landscape.</p>
<p>If you care about open standards, agentic AI, and the freedom to connect <em>anything to anything</em>, we’d love your input and contributions.
OVOS is not just building an assistant, it’s building the <strong>protocols of open intelligence</strong>.</p>
<hr>
<h2>Help Us Build Voice for Everyone</h2>
<p>OpenVoiceOS is more than software; it’s a mission. If you believe voice assistants should be open, inclusive, and user-controlled, here’s how you can help:</p>
<ul>
<li><strong>💸 Donate</strong>: Help us fund development, infrastructure, and legal protection.</li>
<li><strong>📣 Contribute Open Data</strong>: Share voice samples and transcriptions under open licenses.</li>
<li><strong>🌍 Translate</strong>: Help make OVOS accessible in every language.</li>
</ul>
<p>We're not building this for profit. We're building it for people. With your support, we can keep voice tech transparent, private, and community-owned.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2025-10-24-protocol_interoperability</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2025-10-24-protocol_interoperability</guid><dc:creator><![CDATA[JarbasAl]]></dc:creator><pubDate>Fri, 24 Oct 2025 00:00:00 GMT</pubDate></item><item><title><![CDATA[OpenVoiceOS Receives NGI Zero Commons Fund Grant]]></title><description><![CDATA[<h1>OpenVoiceOS Receives NGI Zero Commons Fund Grant</h1>
<p>We’re excited to share some fantastic news — <strong>OpenVoiceOS (OVOS)</strong> has been selected to receive a grant from the <a href="https://www.ngi.eu/ngi-projects/ngi-zero-commons-fund/"><strong>NGI Zero Commons Fund</strong></a>!</p>
<p>This milestone represents a huge step forward for our mission to build an open, community-driven, and <strong>privacy-first</strong> voice assistant platform. With this support, we can accelerate our progress from beta to a first stable version under our OpenVoiceOS umbrella.</p>
<hr>
<h2>What is OpenVoiceOS?</h2>
<p><strong>OVOS</strong> is Europe’s open-source alternative to proprietary voice assistants.
Built around transparency, modularity, and user freedom, OVOS lets you choose every component of your assistant — from <strong>wake word</strong> and <strong>speech recognition</strong> to <strong>text-to-speech</strong>, <strong>intent handling</strong>, and <strong>AI conversation modules</strong> — all while keeping full control over your data.</p>
<p>You can run OVOS <strong>entirely offline</strong>, <strong>on-premises</strong>, or <strong>in the cloud</strong>, making it suitable for:</p>
<ul>
<li>Smart homes and IoT devices</li>
<li>Accessibility and assistive technologies</li>
<li>Industrial and enterprise applications</li>
<li>Research and educational environments</li>
</ul>
<hr>
<h2>What the Grant Enables</h2>
<p>With support from the NGI Zero Commons Fund, we’ll be able to:</p>
<ul>
<li>💼 <strong>Hire our lead developer</strong> to deliver on the first stable roadmap</li>
<li>🧭 <strong>Improve onboarding</strong> and usability for non-technical users</li>
<li>🌍 <strong>Expand language support</strong> and stabilize platform components</li>
<li>📚 <strong>Enhance documentation</strong> for developers building skills and plug-ins</li>
</ul>
<hr>
<h2>Real-World Impact</h2>
<p>OVOS is already making a difference across Europe and beyond.
It powers projects such as the <a href="https://voicelab.visio.org/"><strong>Royal Dutch Visio Voicelab</strong></a>, bringing voice interaction to visually impaired users, as well as conversational assistants in <a href="https://khoshnazdesign.com/Fritzi/"><strong>nursing homes</strong></a>, <a href="https://coala-ai.de/"><strong>manufacturing</strong></a>, and <a href="https://proyectoilenia.es/recursos-modelos-datasets/"><strong>multilingual AI initiatives</strong></a>.</p>
<p>This grant helps us strengthen that ecosystem — empowering developers, researchers, and users to shape the future of open, ethical voice technology.</p>
<hr>
<h2>Thank You ❤️</h2>
<p>We want to express our gratitude to the <a href="https://www.ngi.eu/ngi-projects/ngi-zero-commons-fund/"><strong>NGI Zero Commons Fund</strong></a>, <a href="https://nlnet.nl/"><strong>NLnet Foundation</strong></a>, and our incredible community of <strong>contributors</strong>, <strong>volunteers</strong>, and <strong>users</strong>.
Your support makes it possible to keep building a voice assistant platform that prioritizes <strong>freedom</strong>, <strong>transparency</strong>, and <strong>privacy</strong>.</p>
<p>Together, we’re proving that open voice technology can be <strong>trustworthy</strong>, <strong>user-owned</strong>, and <strong>truly free</strong>.</p>
<p>Stay tuned for more updates as we move toward our full <strong>first stable release</strong>!</p>
<p>🔗 <a href="https://nlnet.nl/news/2025/20251016-selection-NGI0CommonsFund.html">NLnet Announcement</a>
🔗 <a href="https://nlnet.nl/commonsfund/">NGI Zero Commons Fund</a>
🔗 <a href="https://nlnet.nl/project/OpenVoiceOS/">OpenVoiceOS Project Page</a></p>
<hr>
<h2>Help Us Build Voice for Everyone</h2>
<p>If you believe that voice assistants should be open, inclusive, and user-controlled, we invite you to support OVOS:</p>
<ul>
<li>
<p><strong>💸 Donate</strong>: Your contributions help us pay for infrastructure, development, and legal protections.</p>
</li>
<li>
<p><strong>📣 Contribute Open Data</strong>: Speech models need diverse, high-quality data. If you can share voice samples, transcripts, or datasets under open licenses, let's collaborate.</p>
</li>
<li>
<p><strong>🌍 Help Translate</strong>: OVOS is global by nature. Translators make our platform accessible to more communities every day.</p>
</li>
</ul>
<p>We're not building this for profit. We're building it for people. And with your help, we can ensure open voice has a future—transparent, private, and community-owned.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2025-10-20-ngi</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2025-10-20-ngi</guid><dc:creator><![CDATA[Peter Steenbergen]]></dc:creator><pubDate>Mon, 20 Oct 2025 00:00:00 GMT</pubDate></item><item><title><![CDATA[OpenVoiceOS Blog Launches RSS Feed — A Win for Accessibility & Open Standards]]></title><description><![CDATA[<h2>OpenVoiceOS Blog Launches RSS Feed — A Win for Accessibility &#x26; Open Standards</h2>
<p>We’re happy to announce that the <strong>OVOS (OpenVoiceOS)</strong> blog now supports an <strong>RSS / Atom feed</strong> at:</p>
<p>👉 <a href="https://blog.openvoiceos.org/feed.xml">https://blog.openvoiceos.org/feed.xml</a></p>
<p>This may look like a small technical change, but in reality it’s a big step forward — especially from the perspective of accessibility, interoperability, and empowering users. Below are a few reasons why we believe RSS is still relevant and why adding a feed is aligned with our vision.</p>
<hr>
<h2>Why RSS Still Matters in 2025</h2>
<p>Yes, RSS is old technology, but that doesn’t mean it’s obsolete. Here are a few reasons it continues to deserve a place in the modern web ecosystem:</p>
<ol>
<li>
<p><strong>Decentralized content subscription</strong>
RSS gives users control. Instead of being locked into one platform’s algorithm or interface, people can choose their own feed reader or aggregator. You subscribe once, and updates come to <em>you</em>, not the other way around.</p>
</li>
<li>
<p><strong>Low overhead, simple format</strong>
RSS / Atom feeds are lightweight XML documents. They don’t need heavy frameworks, JavaScript, or APIs. That means lower bandwidth, faster performance, and more compatibility with devices and assistive technologies.</p>
</li>
<li>
<p><strong>Resilience &#x26; longevity</strong>
Since RSS is an open standard, it’s not tied to any one company or service. If a blog or website changes its front-end, the feed often remains a stable outlet for updates. That robustness is valuable for archival, backup, and long-term content access.</p>
</li>
<li>
<p><strong>Interoperability &#x26; portability</strong>
RSS works across platforms, apps, and ecosystems. Want to surface blog posts in your smart home panel, or integrate them into your voice assistant? A feed makes that straightforward. You don’t have to reverse-engineer HTML pages or depend on ad-hoc APIs.</p>
</li>
<li>
<p><strong>Accessibility-friendly</strong>
This is especially important for OVOS and the intersection of voice, assistive tech, and open systems. Many feed readers are designed to work well with screen readers, text-to-speech tools, or simple “next item” navigation. Because RSS is structured, semantic, and minimal, assistive software can more reliably parse it.</p>
</li>
</ol>
<hr>
<h2>Accessibility &#x26; Open Standards: Why We Care</h2>
<p>At OVOS, our mission is to push toward more open, user-empowering voice and assistant systems. We believe accessibility is not an afterthought; it should be baked into every layer. Supporting an RSS feed is consistent with that philosophy:</p>
<ul>
<li><strong>Equal access to content</strong>: Users who rely on assistive tech (screen readers, braille displays, voice-driven readers) should be able to consume blog updates just as easily as sighted users. Structured feeds help.</li>
<li><strong>Choice in how you consume</strong>: Whether you prefer to read in a browser, within a reader app, or have updates read out to you, RSS gives you options.</li>
<li><strong>Standards matter</strong>: Open standards promote interoperability and reduce handshake friction between systems. We’d rather everyone “speak RSS / Atom” than build brittle, custom bridges.</li>
<li><strong>Future-proofing</strong>: We want the OVOS blog to remain accessible and usable even if web frameworks change, hosting changes, or site redesigns happen. The feed is a thread through all of that.</li>
</ul>
<hr>
<h2>How You Can Use the New Feed</h2>
<p>Here are a few ideas to take advantage of the new RSS feed:</p>
<ul>
<li><strong>Add it to your favorite news / feed reader</strong> (e.g. Feedly, Inoreader, Thunderbird) so new posts show up automatically.</li>
<li><strong>Use it in automation or scripting</strong>: For example, fetch new posts via a lightweight script, push them to your personal dashboard, or trigger voice notifications.</li>
<li><strong>Integrate with voice assistants</strong>: If you build a skill/plugin in OVOS or another system, you can parse the feed and let people ask “What’s new on the blog?” rather than scraping HTML.</li>
<li><strong>Share / mirror</strong>: You could mirror or aggregate OVOS content into your own blog or archive, thanks to the open nature of RSS.</li>
</ul>
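<p>As a concrete starting point for the scripting idea above, here is a minimal sketch that extracts titles and links from an RSS 2.0 document using only the Python standard library. The inline sample feed is illustrative; in practice you would download <code>https://blog.openvoiceos.org/feed.xml</code> (for example with <code>urllib.request</code>) and pass the response body in.</p>

```python
import xml.etree.ElementTree as ET

def latest_posts(feed_xml: str, limit: int = 5):
    """Return (title, link) pairs for the newest items in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    items = root.findall("./channel/item")[:limit]
    return [(item.findtext("title"), item.findtext("link")) for item in items]

# Tiny illustrative feed; a real script would fetch feed.xml over HTTP instead.
SAMPLE = (
    '<rss version="2.0"><channel><title>OVOS Blog</title>'
    "<item><title>Hello</title><link>https://example.org/hello</link></item>"
    "</channel></rss>"
)

print(latest_posts(SAMPLE))  # → [('Hello', 'https://example.org/hello')]
```

<p>Because the feed is structured XML, the same dozen lines work whether the output goes to a dashboard, a notification, or a voice skill.</p>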
<hr>
<h2>Next Steps</h2>
<ul>
<li>We’ll monitor for feedback; if the feed has issues, we’ll fix them.</li>
<li>Consider expanding meta-data in feed items: more tags, categories, summaries, thumbnails, etc., to make it richer for readers and integrations.</li>
<li>Encourage third-party projects in the OVOS ecosystem to use the feed for cross-linking, showing blog snippets in dashboards, or exposing blog summaries in voice UI.</li>
</ul>
<p>If you come across any issues subscribing or using the feed, or have ideas for enhancements, please let us know.</p>
<p>Thank you for being part of the OVOS community! May your blog updates come to <em>you</em>, seamlessly and accessibly.</p>
<hr>
<h2>Help Us Build Voice for Everyone</h2>
<p>OpenVoiceOS is more than software; it’s a mission. If you believe voice assistants should be open, inclusive, and user-controlled, here’s how you can help:</p>
<ul>
<li><strong>💸 Donate</strong>: Help us fund development, infrastructure, and legal protection.</li>
<li><strong>📣 Contribute Open Data</strong>: Share voice samples and transcriptions under open licenses.</li>
<li><strong>🌍 Translate</strong>: Help make OVOS accessible in every language.</li>
</ul>
<p>We're not building this for profit. We're building it for people. With your support, we can keep voice tech transparent, private, and community-owned.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2025-10-18-rss</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2025-10-18-rss</guid><dc:creator><![CDATA[JarbasAl]]></dc:creator><pubDate>Sat, 18 Oct 2025 00:00:00 GMT</pubDate></item><item><title><![CDATA[Running HiveMind Player on ArkOS with the R36S]]></title><description><![CDATA[<h2>Running HiveMind Player on ArkOS with the R36S</h2>
<p>The <strong>R36S</strong> handheld from AliExpress is usually marketed as a retro gaming device, but with a bit of tinkering it can also become a <strong>HiveMind-powered audio player</strong>, fully integrated with <strong>Home Assistant</strong> and <strong>Music Assistant</strong> 🎶.</p>
<p>This guide shows how I set up the <code>hivemind-player</code> on <strong>ArkOS</strong> and turned my device into a smart media endpoint.</p>
<p><img src="/assets/blog/r36s/device.png" alt="R36Ultra device"></p>
<hr>
<h2>Why HiveMind Player?</h2>
<p>HiveMind Player lets you run the <strong>Open Voice OS (OVOS)</strong> audio stack as a service. Once configured, your R36S:</p>
<ul>
<li>Appears as a <strong>media player in Home Assistant</strong> 🏡</li>
<li>Works seamlessly with <strong>Music Assistant</strong> 🎵</li>
<li>Uses <strong>OpenVoiceOS</strong> for playback control (play, pause, next, etc.)</li>
<li>Can handle TTS and system volume if you install the right plugins</li>
</ul>
<p>See the <a href="https://github.com/JarbasHiveMind/hivemind-media-player">HiveMind Media Player README</a> for full technical details.</p>
<hr>
<h2>Step 1: Flash ArkOS</h2>
<p>I used <a href="https://github.com/lcdyk0517/arkos4clone">arkos4clone</a> to install ArkOS on my <a href="https://handhelds.wiki/R36_Ultra"><strong>R36 Ultra</strong></a>.</p>
<p>⚠️ Compatibility and initial steps may vary across <strong>clone devices</strong>, since there are many hardware variations. This guide applies to ArkOS-based firmware but assumes your handheld is already fully functional before starting.</p>
<p>👉 Always follow the <strong>specific instructions from your image provider</strong>, and check <a href="https://handhelds.wiki">handhelds.wiki</a> for the latest community documentation.</p>
<p>Once ArkOS is up:</p>
<ul>
<li>Connect to WiFi from the device menu</li>
<li>Enable <strong>Remote Services</strong></li>
<li>SSH into the device from a computer</li>
</ul>
<p><img src="/assets/blog/r36s/neofetch.png" alt="neofetch in ArkOS"></p>
<hr>
<h2>Step 2: Install uv</h2>
<p>We’ll use <a href="https://astral.sh/uv">uv</a> for managing Python environments and dependencies:</p>
<pre><code class="hljs language-bash">curl -LsSf https://astral.sh/uv/install.sh | sh
</code></pre>
<p>Create a Python 3.10 virtual environment:</p>
<pre><code class="hljs language-bash">uv venv --python 3.10
</code></pre>
<hr>
<h2>Step 3: Install System Packages</h2>
<p>ArkOS strips out many development headers, so we need to reinstall them.</p>
<p>Without these, I ran into some confusing build errors when trying to install plugins for volume control, including:</p>
<pre><code class="hljs language-text">fatal error: limits.h: No such file or directory
fatal error: linux/limits.h: No such file or directory
fatal error: alsa/asoundlib.h: No such file or directory
</code></pre>
<p>These were resolved by reinstalling the right dev packages:</p>
<pre><code class="hljs language-bash"><span class="hljs-built_in">sudo</span> apt install build-essential libffi-dev libssl-dev gcc libc6-dev
<span class="hljs-built_in">sudo</span> apt install --reinstall libc6-dev linux-libc-dev libasound2-dev
</code></pre>
<p>Then install Python build tools:</p>
<pre><code class="hljs language-bash">uv pip install wheel
</code></pre>
<hr>
<h2>Step 4: Install HiveMind Player</h2>
<p>Install the player with recommended extras (e.g., VLC, PHAL, OCP, TTS):</p>
<pre><code class="hljs language-bash">uv pip install <span class="hljs-string">"git+https://github.com/JarbasHiveMind/hivemind-media-player[extras]"</span>
</code></pre>
<hr>
<h2>Step 5: Configure HiveMind Core</h2>
<p>Create a config directory and set up the <code>server.json</code>:</p>
<pre><code class="hljs language-bash"><span class="hljs-built_in">mkdir</span> -p ~/.config/hivemind-core/
nano ~/.config/hivemind-core/server.json
</code></pre>
<p>Example minimal configuration:</p>
<pre><code class="hljs language-json5">{
  "agent_protocol": {
    "module": "hivemind-player-agent-plugin",
    "hivemind-player-agent-plugin": {}
  },
  "network_protocol": {
    "hivemind-websocket-plugin": {
      "host": "0.0.0.0",
      "port": 5678
    },
    "hivemind-http-plugin": {
      "host": "0.0.0.0",
      "port": 5679
    }
  }
}
</code></pre>
<hr>
<h2>Step 6: Extra Plugins</h2>
<p>You may want to install <strong>PHAL plugins</strong> for platform and hardware integration, or <strong>TTS plugins</strong> to use different voices.</p>
<p>Currently the most useful is the <strong>ALSA plugin</strong> for system volume control:</p>
<pre><code class="hljs language-bash">uv pip install ovos-tts-plugin-piper ovos-phal-plugin-alsa ovos-PHAL-plugin-system
</code></pre>
<p>This makes volume adjustable via HiveMind / Home Assistant.</p>
<p>👉 PHAL is <strong>optional</strong>. Right now it’s only needed for volume control - and, as shown above, it required a lot of dev headers just to build. In the future, <strong>dedicated plugins</strong> may appear, for example to control the <strong>joystick LEDs</strong> on the R36 Ultra.</p>
<hr>
<h2>Step 7: Configure ovos-audio</h2>
<p>HiveMind Player uses the <a href="https://github.com/OpenVoiceOS/ovos-audio"><strong>OVOS audio stack</strong></a> for playback and TTS. The configuration lives in:</p>
<pre><code class="hljs language-bash">~/.config/mycroft/mycroft.conf
</code></pre>
<p>The main change required on ArkOS is adjusting the <strong>WAV playback command</strong>:</p>
<ul>
<li>By default, <code>mycroft.conf</code> uses <code>paplay</code></li>
<li>On ArkOS, this needs to be switched to <strong><code>aplay</code></strong> in order for TTS to work correctly</li>
</ul>
<pre><code class="hljs language-json5">{
  "play_wav_cmdline": "aplay %1",
  "play_mp3_cmdline": "mpg123 %1",
  "play_ogg_cmdline": "ogg123 -q %1",

  "tts": {
    "module": "ovos-tts-plugin-piper",
    "ovos-tts-plugin-piper": {"voice": "alan-low"}
  },

  "Audio": {
    "backends": {
      "vlc": {
        "type": "vlc",
        "active": true,
        "initial_volume": 100,
        "low_volume": 50
      }
    }
  },

  "PHAL": {
    "ovos-PHAL-plugin-alsa": {
      "default_volume": 90
    },
    "ovos-PHAL-plugin-system": {
      "core_service": "hivemind-player.service",
      "ssh_service": "ssh.service",
      "sudo": true
    }
  }
}
</code></pre>
<ul>
<li><strong>play_wav_cmdline</strong> → must be changed to <code>aplay %1</code></li>
<li><strong>Audio</strong> → VLC is already available in ArkOS and can play nearly all streams you throw at it</li>
<li><strong>TTS</strong> → Default is <code>ovos-tts-plugin-piper</code>, but you can swap in any OVOS TTS plugin you prefer, such as <code>ovos-tts-plugin-server</code></li>
<li><strong>PHAL</strong> → system plugin needs to be adjusted in order to use the correct service names</li>
</ul>
<blockquote>
<p>NOTE: on ArkOS, <code>sudo</code> does not require a password, so <code>ovos-PHAL-plugin-system</code> should work out of the box</p>
</blockquote>
<hr>
<h2>Step 8: Set Up Permissions</h2>
<p>HiveMind operates on a <code>deny-all</code> policy by default: every message type must be explicitly allowed for clients.</p>
<p>First, create a client identity:</p>
<pre><code class="hljs language-bash">hivemind-core add-client
</code></pre>
<p>This gives you an <strong>Access Key</strong> and <strong>Password</strong>.</p>
<p>Next, allow the minimum messages needed for playback and TTS. Replace <code>&#x3C;NODE_ID></code> with the client’s Node ID:</p>
<pre><code class="hljs language-bash"><span class="hljs-comment"># Basic audio / TTS</span>
hivemind-core allow-msg <span class="hljs-string">"speak"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"mycroft.audio.is_alive"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"mycroft.audio.is_ready"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"mycroft.audio.speak.status"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"mycroft.stop"</span> &#x3C;NODE_ID>

<span class="hljs-comment"># Common Play (playback control)</span>
hivemind-core allow-msg <span class="hljs-string">"ovos.common_play.play"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"ovos.common_play.pause"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"ovos.common_play.resume"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"ovos.common_play.stop"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"ovos.common_play.next"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"ovos.common_play.previous"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"ovos.common_play.player.status"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"ovos.common_play.track_info"</span> &#x3C;NODE_ID>

<span class="hljs-comment"># Optional: volume (PHAL ALSA plugin)</span>
hivemind-core allow-msg <span class="hljs-string">"mycroft.volume.get"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"mycroft.volume.set"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"mycroft.volume.increase"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"mycroft.volume.decrease"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"mycroft.volume.mute"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"mycroft.volume.unmute"</span> &#x3C;NODE_ID>

<span class="hljs-comment"># Optional: SSH, Reboot and Shutdown (PHAL system plugin)</span>
hivemind-core allow-msg <span class="hljs-string">"system.ssh.status"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"system.ssh.enable"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"system.ssh.disable"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"system.reboot"</span> &#x3C;NODE_ID>
hivemind-core allow-msg <span class="hljs-string">"system.shutdown"</span> &#x3C;NODE_ID>

</code></pre>
<hr>
<h2>Step 9: Create a System Service</h2>
<p>To keep HiveMind Player running in the background, add a <strong>systemd user service</strong>:</p>
<pre><code class="hljs language-bash"><span class="hljs-built_in">mkdir</span> -p ~/.config/systemd/user/
nano ~/.config/systemd/user/hivemind-player.service
</code></pre>
<p>Paste this:</p>
<pre><code class="hljs language-ini"><span class="hljs-section">[Unit]</span>
<span class="hljs-attr">Description</span>=HiveMind Player
<span class="hljs-attr">After</span>=network.target

<span class="hljs-section">[Service]</span>
<span class="hljs-attr">WorkingDirectory</span>=/home/ark
<span class="hljs-attr">ExecStart</span>=/home/ark/.venv/bin/python /home/ark/.venv/bin/hivemind-core listen
<span class="hljs-attr">Restart</span>=<span class="hljs-literal">on</span>-failure

<span class="hljs-section">[Install]</span>
<span class="hljs-attr">WantedBy</span>=default.target
</code></pre>
<p>Then enable and start it:</p>
<pre><code class="hljs language-bash">systemctl --user daemon-reload
systemctl --user <span class="hljs-built_in">enable</span> hivemind-player.service --now
systemctl --user status hivemind-player.service
</code></pre>
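<p>To sanity-check that the service is actually listening, you can probe the websocket port from any machine on the network. The snippet below is a plain TCP reachability check; the port (5678) comes from the example <code>server.json</code> earlier in this guide, and the hostname is whatever your handheld answers to.</p>

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. port_open("r36s.local", 5678) once hivemind-player.service is running
print(port_open("127.0.0.1", 5678))
```

<p>If this returns <code>False</code> on the device itself, check <code>systemctl --user status hivemind-player.service</code> and the ports in <code>server.json</code> before debugging the network.</p>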
<hr>
<h2>Step 10: Optional - Install Tailscale for remote access</h2>
<p>A problem with the R36S is that it gets a new IP address on every boot and randomizes its MAC address. We could work around this by assigning a static IP address, but since this is a handheld that will often be on the move, why not use a VPN?</p>
<p><a href="https://tailscale.com">Tailscale</a> is free and painless to install:</p>
<pre><code class="hljs language-bash">curl -fsSL https://tailscale.com/install.sh | sh
<span class="hljs-built_in">sudo</span> systemctl <span class="hljs-built_in">enable</span> tailscaled
<span class="hljs-built_in">sudo</span> tailscale login
<span class="hljs-built_in">sudo</span> systemctl start tailscaled
<span class="hljs-built_in">sudo</span> systemctl status tailscaled
</code></pre>
<p>Now the handheld has a stable IP address, and we can reach it remotely.</p>
<hr>
<h2>Step 11: Integrate with Home Assistant</h2>
<p>With the <a href="https://github.com/JarbasHiveMind/hivemind-homeassistant">hivemind-homeassistant</a> integration installed, your R36 Ultra shows up as a <strong>media player</strong> inside Home Assistant.</p>
<p><img src="/assets/blog/r36s/ha_device.png" alt="R36 device in Home Assistant"></p>
<p>In Home Assistant, the TTS setup also makes the device available as a <strong>notify entity</strong>, so you can send text notifications and have them spoken on your R36 Ultra:</p>
<pre><code class="hljs language-yaml"><span class="hljs-attr">service:</span> <span class="hljs-string">notify.hivemind_player</span>
<span class="hljs-attr">data:</span>
  <span class="hljs-attr">message:</span> <span class="hljs-string">"Dinner is ready!"</span>
</code></pre>
<p><img src="/assets/blog/r36s/ha_notify.png" alt="R36 notify in Home Assistant"></p>
<p>Combine this with <strong>Music Assistant</strong> and suddenly your retro handheld doubles as a WiFi speaker you can control from your smart home dashboard.</p>
<p><img src="/assets/blog/r36s/ma.png" alt="R36 player in Music Assistant"></p>
<hr>
<h2>Wrap-Up</h2>
<p>That’s it! My R36 Ultra, originally built for emulation, now runs <strong>HiveMind Player</strong> on ArkOS and integrates seamlessly with my smart home setup.</p>
<ul>
<li>HiveMind manages the player backend</li>
<li>Home Assistant discovers it</li>
<li>Music Assistant lets me browse and stream to it</li>
<li><code>ovos-audio</code> makes it flexible for both playback and TTS</li>
<li>Optional PHAL plugins give extra control (volume now, LEDs in the future)</li>
<li>Permissions ensure only trusted clients can control it</li>
</ul>
<p>A fun way to repurpose cheap handheld hardware into a <strong>smart speaker alternative</strong>! 🚀</p>
<hr>
<h2>Help Us Build Voice for Everyone</h2>
<p>OpenVoiceOS is more than software; it’s a mission. If you believe voice assistants should be open, inclusive, and user-controlled, here’s how you can help:</p>
<ul>
<li><strong>💸 Donate</strong>: Help us fund development, infrastructure, and legal protection.</li>
<li><strong>📣 Contribute Open Data</strong>: Share voice samples and transcriptions under open licenses.</li>
<li><strong>🌍 Translate</strong>: Help make OVOS accessible in every language.</li>
</ul>
<p>We're not building this for profit. We're building it for people. With your support, we can keep voice tech transparent, private, and community-owned.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2025-10-11-r36s</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2025-10-11-r36s</guid><dc:creator><![CDATA[JarbasAl]]></dc:creator><pubDate>Sat, 11 Oct 2025 00:00:00 GMT</pubDate></item><item><title><![CDATA[Introducing phoonnx: The Next Generation of Open Voice for OpenVoiceOS]]></title><description><![CDATA[<h1>Introducing phoonnx: The Next Generation of Open Voice for OpenVoiceOS</h1>
<p>Today marks a significant step forward in the OpenVoiceOS journey with the official adoption of <a href="https://github.com/TigreGotico/phoonnx"><strong>phoonnx</strong></a> as our primary Text-to-Speech (TTS) framework.</p>
<p>This new generation of voices is not just about quality; it's about consistency, efficiency, and fulfilling our mission for truly open, offline-ready voice assistants across the globe.</p>
<hr>
<h2>New Language: Introducing Basque!</h2>
<p>Building on our previous work on <a href="https://blog.openvoiceos.org/posts/2025-06-26-making-synthetic-voices-from-scratch"><strong>Making Synthetic Voices From Scratch</strong></a> and our successful <a href="https://blog.openvoiceos.org/posts/2025-10-01-arabic_tts_collaboration"><strong>Arabic TTS Collaboration</strong></a>, we are excited to announce a new milestone in our language support: the addition of new voices for <strong>Basque (eu-ES)</strong>!</p>
<p>This includes both the male voice (<strong>Miro</strong>) and female voice (<strong>Dii</strong>), furthering our mission to support even low-resource languages that lack open, high-quality TTS options.</p>
<p>Previously, only a robotic female voice was available via the <a href="https://github.com/OpenVoiceOS/ovos-tts-plugin-ahotts">AhoTTS</a> plugin made in collaboration with <a href="https://proyectoilenia.es">ILENIA</a>.</p>
<p><strong>Hear the results</strong>: Examples of the new Basque Open-Source Voices.</p>
<audio controls>
  <source src="/assets/blog/phoonnx/miro_eu-ES.wav" type="audio/wav">
  Your browser does not support the audio element.
</audio>
<audio controls>
  <source src="/assets/blog/phoonnx/dii_eu-ES.wav" type="audio/wav">
  Your browser does not support the audio element.
</audio>
<hr>
<h2>A Unified Voice for a Global Brand</h2>
<p>As OpenVoiceOS expands to more languages and more devices, a crucial need has emerged: a cohesive <strong>brand identity</strong> conveyed through voice.
We need a core set of voices, a standard male and a standard female persona, that sound consistent, professional, and recognizable no matter where you are in the world or which language you are speaking.</p>
<p>This consistency is vital. Imagine installing OpenVoiceOS in Lisbon, Berlin, or Seattle, the voice should be instantly familiar.
This is the power of a unified voice, creating a seamless and trustworthy user experience globally.</p>
<p>We are proud to share that <a href="https://tigregotico.pt"><strong>TigreGotico</strong></a> has been instrumental in making this vision a reality.
They are not only developing the core <a href="https://github.com/TigreGotico/phoonnx"><strong>phoonnx</strong></a> engine but are also actively contributing to open datasets and training the default, multi-lingual OVOS voices.
This internal collaboration accelerates development and ensures our voices are aligned with the open-source spirit of our platform.</p>
<hr>
<h2>The phoonnx Advantage: A Flexible TTS Ecosystem</h2>
<p><strong>phoonnx</strong> is more than just an inference tool; it is a complete <strong>training and inference framework</strong> built on the robust <a href="https://arxiv.org/abs/2106.06103">VITS architecture</a>.
This dual capability allows us to rapidly prototype, train, and deploy high-quality voices.</p>
<p>A key to this flexibility is the ability to support diverse <strong>phonemizers</strong>.
A phonemizer (or G2P - Grapheme-to-Phoneme model) converts written text into the sequence of sound units (phonemes) the TTS model speaks.
Different languages may require different, specialized phonemizers for accurate speech.</p>
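<p>To make the G2P idea concrete, here is a deliberately toy phonemizer: a lookup table from words to IPA phoneme sequences, with a letter-by-letter fallback for unknown words. The lexicon entries are illustrative only, not taken from any real phonemizer, but the input/output contract is the same one eSpeak, Cotovia, or a ByT5 model fulfills: text in, phoneme sequence out.</p>

```python
# Toy G2P (grapheme-to-phoneme) sketch. Real phonemizers handle context,
# stress, and morphology; this only shows the shape of the transformation.
TOY_LEXICON = {
    "open": ["oʊ", "p", "ə", "n"],
    "voice": ["v", "ɔɪ", "s"],
}

def phonemize(text: str) -> list:
    phonemes = []
    for word in text.lower().split():
        # Fall back to spelling out letters for out-of-lexicon words.
        phonemes.extend(TOY_LEXICON.get(word, list(word)))
    return phonemes

print(phonemize("Open Voice"))  # → ['oʊ', 'p', 'ə', 'n', 'v', 'ɔɪ', 's']
```

<p>Swapping the lexicon for a trained model is exactly the point where different phonemizer backends plug into the pipeline.</p>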
<ul>
<li><strong>eSpeak Compatibility</strong>: A core feature is that <strong>phoonnx</strong> models are fully compatible with the popular <strong>Piper TTS engine's runtime</strong>, provided they were trained using the widely available <strong>eSpeak</strong> phonemizer. This ensures easy deployment within the existing OVOS ecosystem and third party projects like <a href="https://www.home-assistant.io/integrations/piper/">Home Assistant</a>.</li>
<li><strong>Custom Phonemizer Support</strong>: The framework is not limited to eSpeak. For example, we are excited to note that the high-quality <strong>Galician models</strong> developed by <a href="https://nos.gal/es/proxecto-nos">Proxecto Nós</a> using the <strong>Cotovia</strong> phonemizer are fully compatible and can be used with the <strong>phoonnx</strong> pipeline.</li>
</ul>
<p>This flexibility allows us to integrate and benefit from the work of other open-source projects. In fact, for inference, <a href="https://github.com/TigreGotico/phoonnx"><strong>phoonnx</strong></a> can successfully use models originally trained by other projects, including <strong>Coqui</strong>, <strong>Mimic3</strong>, and <strong>Piper</strong>, solidifying its role as a universal TTS deployment tool.</p>
<h3>Teasing the Future: Next-Gen G2P Models</h3>
<p>Looking ahead, we are constantly working to improve G2P accuracy, especially for low-resource languages. We are currently developing and testing next-generation G2P models based on the powerful <strong>ByT5</strong> architecture. These transformer-based models promise to deliver more accurate and robust phonemization across a wider range of languages.</p>
<p>You can follow their development here: <a href="https://huggingface.co/collections/OpenVoiceOS/g2p-models-6886a8d612825c3fe65befa0">G2P Models Collection</a>.</p>
<p>In the near future a dedicated OVOS TTS plugin will be created for phoonnx and made the default for OpenVoiceOS, replacing the previous plugins: <a href="https://github.com/OpenVoiceOS/ovos-tts-plugin-piper">ovos-tts-plugin-piper</a> and <a href="https://github.com/OpenVoiceOS/ovos-tts-plugin-nos">ovos-tts-plugin-nos</a>.</p>
<p>In the meantime you can try the new voices via the existing plugin <a href="https://github.com/OpenVoiceOS/ovos-tts-plugin-piper">ovos-tts-plugin-piper</a>.</p>
<p>All you need to do is pass the model URLs in <code>mycroft.conf</code>:</p>
<pre><code class="hljs language-json">  <span class="hljs-attr">"tts"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span>
    <span class="hljs-attr">"module"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"ovos-tts-plugin-piper"</span><span class="hljs-punctuation">,</span>
    <span class="hljs-attr">"ovos-tts-plugin-piper"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span>
      <span class="hljs-attr">"model"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"https://huggingface.co/OpenVoiceOS/phoonnx_eu-ES_miro_espeak/resolve/main/miro_eu-ES.onnx"</span><span class="hljs-punctuation">,</span>
      <span class="hljs-attr">"model_config"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"https://huggingface.co/OpenVoiceOS/phoonnx_eu-ES_miro_espeak/resolve/main/miro_eu-ES.piper.json"</span>
    <span class="hljs-punctuation">}</span>
  <span class="hljs-punctuation">}</span>
</code></pre>
<hr>
<h2>Progress Report: Available Languages</h2>
<p>The collective work of the OpenVoiceOS and TigreGotico teams has resulted in a rapidly expanding library of open-source TTS models.</p>
<p><strong>Currently Supported Languages:</strong></p>
<ul>
<li>Arabic</li>
<li>Basque</li>
<li>Dutch</li>
<li>English (US/GB)</li>
<li>French</li>
<li>German</li>
<li>Italian</li>
<li>Portuguese (Brazil/Portugal)</li>
<li>Spanish</li>
</ul>
<hr>
<h3><strong>Get Involved and Find the Models</strong></h3>
<p>We invite the community to explore and utilize these new resources. Your feedback is crucial to improving voice quality and expanding language coverage.</p>
<table>
<thead>
<tr>
<th align="left">Resource</th>
<th align="left">Description</th>
<th align="left">Link</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left"><strong>Phoonnx Models</strong></td>
<td align="left">The new phoonnx-trained TTS models in ONNX format.</td>
<td align="left"><a href="https://huggingface.co/collections/TigreGotico/phoonnx-tts-models-68cd76d5b485394d9b71032e">phoonnx-tts-models</a></td>
</tr>
<tr>
<td align="left"><strong>Piper/Phoonnx Voices</strong></td>
<td align="left">The full collection of OpenVoiceOS voices compatible with Piper.</td>
<td align="left"><a href="https://huggingface.co/collections/OpenVoiceOS/pipertts-voices-68594d08f08e6ec56eddf4eb">pipertts-voices</a></td>
</tr>
<tr>
<td align="left"><strong>Open Datasets</strong></td>
<td align="left">Datasets used for training these voices, furthering open-data research.</td>
<td align="left"><a href="https://huggingface.co/collections/TigreGotico/tts-datasets-68dd4156a1484d2cf7bcbd5f">tts-datasets</a></td>
</tr>
</tbody>
</table>
<hr>
<h2>Help Us Build Voice for Everyone</h2>
<p>OpenVoiceOS is more than software; it’s a mission. If you believe voice assistants should be open, inclusive, and user-controlled, here’s how you can help:</p>
<ul>
<li><strong>💸 Donate</strong>: Help us fund development, infrastructure, and legal protection.</li>
<li><strong>📣 Contribute Open Data</strong>: Share voice samples and transcriptions under open licenses.</li>
<li><strong>🌍 Translate</strong>: Help make OVOS accessible in every language.</li>
</ul>
<p>We're not building this for profit. We're building it for people. With your support, we can keep voice tech transparent, private, and community-owned.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2025-10-06-phoonnx</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2025-10-06-phoonnx</guid><dc:creator><![CDATA[JarbasAl]]></dc:creator><pubDate>Mon, 06 Oct 2025 00:00:00 GMT</pubDate></item><item><title><![CDATA[Voices of Inclusion: Reflections on Our Collaboration]]></title><description><![CDATA[<h2>Voices of Inclusion: Reflections on Our Collaboration</h2>
<p>As the <strong>External Collaborations Lead</strong> at <a href="https://www.youtube.com/@planetblindtech"><strong>Planet Blind Tech</strong></a>, I recently had the privilege of working closely with <strong>OpenVoiceOS</strong> on a groundbreaking project: <strong>training an open-source Arabic voice</strong>. This effort was far more than a technical experiment; it was a meaningful step toward <strong>digital inclusion for blind and visually impaired communities</strong>.</p>
<p>I first encountered their project while exploring the intersections between assistive technology and speech synthesis. It was clear that our missions aligned. Their dedication to creating open voice technologies resonated with our own pursuit of accessible tools for blind users. The chance to contribute to an Arabic TTS (Text-to-Speech) voice felt like the bridge we had long been waiting for.</p>
<p>Our team’s role focused on two essential tasks: <strong>gathering datasets and testing the resulting models</strong>. The process was not without its hurdles. The initial dataset we worked with fell short of the desired quality. Yet, the <strong>patience, openness, and technical guidance</strong> from our partners encouraged us to refine our approach. Together, we introduced a more robust dataset and trained a model with <a href="https://github.com/TigreGotico/phoonnx">Phoonnx</a> and the eSpeak phonemizer, which produced results that genuinely reflected the richness of the Arabic language.</p>
<p><strong>Hear the results</strong>: Examples of the new Arabic Open-Source Voices.</p>
<audio controls>
  <source src="/assets/blog/arabic_tts_collaboration/dii_ar.wav" type="audio/wav">
  Your browser does not support the audio element.
</audio>
<audio controls>
  <source src="/assets/blog/arabic_tts_collaboration/miro_ar.wav" type="audio/wav">
  Your browser does not support the audio element.
</audio>
<audio controls>
  <source src="/assets/blog/arabic_tts_collaboration/miro_ar_v2.wav" type="audio/wav">
  Your browser does not support the audio element.
</audio>
<p>The impact of this work is already tangible. An <strong>Arabic open-source voice</strong> is not just another model; it is a lifeline for blind individuals who rely on screen readers and accessible applications. For the first time, many users can envision a future where they are not tied to proprietary voices or limited to languages that do not fully capture their identity. This achievement lowers barriers, empowers developers to integrate the voice into assistive tools, and above all, gives blind users a sense of agency.</p>
<p>What began as a collaboration to create a voice has evolved into something larger: a <strong>demonstration of what open technology can mean for inclusion</strong>. For our community, this project is more than sound; it is <strong>dignity, independence, and opportunity</strong>. And for me personally, it stands as a reminder of the transformative power of working together across communities to ensure no voice, and no person, is left unheard.</p>
<p>You can find the collected datasets and trained models on Hugging Face:</p>
<ul>
<li><a href="https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_dii_espeak">Arabic TTS Model - Dii</a></li>
<li><a href="https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_miro_espeak">Arabic TTS Model - Miro</a></li>
<li><a href="https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_miro_espeak_V2">Arabic TTS Model - Miro (V2)</a></li>
<li><a href="https://huggingface.co/datasets/TigreGotico/arabic_g2p">Arabic G2P Dataset</a></li>
<li><a href="https://huggingface.co/datasets/TigreGotico/tts-train-synthetic-miro_ar-diacritics">Synthetic Arabic TTS Training Data - Miro with Diacritics</a></li>
<li><a href="https://huggingface.co/datasets/TigreGotico/tts-train-synthetic-dii_ar-diacritics">Synthetic Arabic TTS Training Data - Dii with Diacritics</a></li>
</ul>
<hr>
<h2>Help Us Build Voice for Everyone</h2>
<p>OpenVoiceOS is more than software; it’s a mission. If you believe voice assistants should be open, inclusive, and user-controlled, here’s how you can help:</p>
<ul>
<li><strong>💸 Donate</strong>: Help us fund development, infrastructure, and legal protection.</li>
<li><strong>📣 Contribute Open Data</strong>: Share voice samples and transcriptions under open licenses.</li>
<li><strong>🌍 Translate</strong>: Help make OVOS accessible in every language.</li>
</ul>
<p>We're not building this for profit. We're building it for people. With your support, we can keep voice tech transparent, private, and community-owned.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2025-10-01-arabic_tts_collaboration</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2025-10-01-arabic_tts_collaboration</guid><dc:creator><![CDATA[Shams Al-Din]]></dc:creator><pubDate>Wed, 01 Oct 2025 00:00:00 GMT</pubDate></item><item><title><![CDATA[OpenVoiceOS and Home Assistant: A Voice Automation Dream Team]]></title><description><![CDATA[<h1>OpenVoiceOS and Home Assistant: A Voice Automation Dream Team</h1>
<p>In the world of open-source smart homes, some things just click. When you let Home Assistant handle the automation and let OVOS (Open Voice OS) handle the voice, you get a powerful partnership where each project shines. It’s a perfect synergy: one is the undisputed champion of home automation, and the other is a flexible, private powerhouse for voice interaction.</p>
<p>Home Assistant excels at orchestrating your devices and routines, while OVOS provides unparalleled flexibility and privacy in voice interactions. Together, they create a system that's not only robust but also truly yours.</p>
<p>Let's explore how you can bring these two together to create a truly magical smart home experience.</p>
<hr>
<h2>Give Home Assistant an OVOS-Powered Voice</h2>
<p>The most direct way to get started is to enhance Home Assistant's built-in voice capabilities with the specialized tools from the OVOS ecosystem. Our main goal is to make OVOS's powerful tools accessible to as many people as possible.</p>
<p>To achieve this, we've developed dedicated Wyoming integrations that act as bridges, allowing <strong>any</strong> OVOS Text-to-Speech (TTS), Speech-to-Text (STT), or Wakeword plugin to be exposed to Home Assistant.</p>
<p>This means you're not limited to a few options; you gain immediate access to the <strong>entire rich ecosystem</strong> of OVOS voice plugins, bringing a vast array of languages, voices, and recognition models directly into your Home Assistant setup.</p>
<ul>
<li><a href="https://github.com/TigreGotico/wyoming-ovos-stt">Wyoming OVOS STT</a>: Convert spoken commands into text for Home Assistant to understand.</li>
<li><a href="https://github.com/TigreGotico/wyoming-ovos-tts">Wyoming OVOS TTS</a>: Enable Home Assistant to speak responses using OVOS's diverse voice options.</li>
<li><a href="https://github.com/TigreGotico/wyoming-ovos-wakeword">Wyoming OVOS Wakeword</a>: Integrate custom wakewords, allowing your Home Assistant setup to respond only when it hears your chosen trigger phrase.</li>
</ul>
<p>The <a href="https://github.com/TigreGotico/ovos-wyoming-docker">OVOS Wyoming Docker</a> project makes getting these services up and running a breeze.</p>
<h3><strong>Plugin Highlights: Multi-language TTS powered by ILENIA</strong></h3>
<p>For us, accessibility is key. That includes language accessibility. We're proud that this integration allows us to bring high-quality, publicly funded voices from projects like <a href="https://proyectoilenia.es/"><strong>ILENIA</strong></a> to a wider audience. Now, Home Assistant users can easily access fantastic, natural-sounding voices for languages like Catalan and Galician.</p>
<ul>
<li><strong>Matxa TTS for Catalan:</strong> The <a href="https://github.com/OpenVoiceOS/ovos-tts-plugin-matxa-multispeaker-cat"><code>ovos-tts-plugin-matxa-multispeaker-cat</code></a> provides multi-speaker text-to-speech capabilities for the Catalan language.</li>
<li><strong>NosTTS for Galician:</strong> The <a href="https://github.com/OpenVoiceOS/ovos-tts-plugin-nos"><code>ovos-tts-plugin-nos</code></a> offers robust text-to-speech in Galician.</li>
</ul>
<p>It’s a great example of how open collaboration benefits everyone.</p>
<p><img src="/assets/blog/OpenVoiceOS-and-Home-Assistant-a-voice-automation-dream-team/ilenia.png" alt="ILENIA logo"></p>
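<p>If you also run a full OVOS stack, pointing it at one of these voices is just a <code>mycroft.conf</code> change. A hypothetical sketch, assuming the module name matches the plugin's package name (check each plugin's README for its exact module name and options):</p>

```json
{
  "tts": {
    "module": "ovos-tts-plugin-nos"
  }
}
```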
<h3><strong>Setting up Wyoming Services in Home Assistant:</strong></h3>
<p>When configuring Wyoming services in Home Assistant, you'll typically refer to the <a href="https://www.home-assistant.io/integrations/wyoming/">official Home Assistant documentation</a>. This process usually involves simply entering the IP address of your Docker container (or the host running your OVOS Wyoming services) into the Home Assistant web interface.</p>
<p><img src="/assets/blog/OpenVoiceOS-and-Home-Assistant-a-voice-automation-dream-team/wyoming_setup.png" alt="wyoming setup in Home Assistant"></p>
<p><img src="/assets/blog/OpenVoiceOS-and-Home-Assistant-a-voice-automation-dream-team/wyoming_menu.png" alt="wyoming entities in Home Assistant"></p>
<hr>
<h2>Let OVOS Be the Brains of the Conversation</h2>
<p>Want to take it a step further? You can set up OVOS as a full-fledged conversational agent for Home Assistant using the <strong>Ollama integration</strong>.</p>
<p><img src="/assets/blog/OpenVoiceOS-and-Home-Assistant-a-voice-automation-dream-team/ollama_setup.png" alt="ollama setup in Home Assistant"></p>
<p>In this setup, Home Assistant passes the user's text to the <a href="https://openvoiceos.github.io/ovos-technical-manual/150-personas/">ovos-persona-server</a>. OVOS then figures out what you want and tells Home Assistant what to answer. It’s like hiring a brilliant conversationalist to augment your smart home interactions.</p>
<p>Here’s the cool part: because <a href="https://github.com/OpenVoiceOS/ovos-persona-server">ovos-persona-server</a> uses Ollama-compatible endpoints, you can connect it to any app that supports the Ollama or OpenAI APIs. The possibilities are huge!</p>
<p><img src="/assets/blog/OpenVoiceOS-and-Home-Assistant-a-voice-automation-dream-team/agent_chat.png" alt="chat with OVOS in Home Assistant"></p>
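<p>In practice, "Ollama/OpenAI-compatible" means the server answers standard <code>/v1/chat/completions</code> requests. A minimal sketch of building such a request (the port and persona name here are assumptions for illustration; check your <code>ovos-persona-server</code> setup for the real values):</p>

```python
import json

def chat_request(base_url: str, model: str, text: str) -> dict:
    """Build a standard OpenAI-style chat-completion request,
    the shape both ovos-persona-server and Ollama accept."""
    return {
        "url": f"{base_url}/v1/chat/completions",
        "payload": {
            "model": model,  # for ovos-persona-server, the persona name
            "messages": [{"role": "user", "content": text}],
        },
    }

# Hypothetical host/port; POST req["payload"] as JSON with any HTTP client
req = chat_request("http://127.0.0.1:8337", "Local LLM", "what's a persona?")
print(json.dumps(req["payload"], indent=2))
```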
<hr>
<h2>OVOS with the Voice PE</h2>
<p>Everyone is talking about <a href="https://www.home-assistant.io/voice-pe">Home Assistant Voice Preview Edition</a>, a dedicated hardware device for voice control. If you own one, you can now easily integrate it with everything discussed so far.</p>
<p><img src="/assets/blog/OpenVoiceOS-and-Home-Assistant-a-voice-automation-dream-team/voice_pe_config.png" alt="Configuring Home Assistant Voice Preview Edition"></p>
<hr>
<h2>Welcome Your OVOS Devices into Home Assistant with HiveMind</h2>
<p>If you have dedicated OVOS devices, the <a href="https://github.com/JarbasHiveMind/hivemind-homeassistant">HiveMind HomeAssistant</a> project is where the real magic happens. This integration makes your OVOS devices show up as native entities in Home Assistant, giving you a beautiful, unified control panel.</p>
<h3><strong>Setting up HiveMind Integration:</strong></h3>
<p>To integrate your OVOS devices via HiveMind, you'll typically add the HiveMind integration in Home Assistant. This involves providing connection details such as a <code>name</code> for the integration, an <code>access_key</code>, <code>password</code>, <code>site_id</code>, <code>host</code> (IP address or hostname of your HiveMind server), and the <code>port</code> (defaulting to 5678). You may also have options to <code>allow_self_signed</code> certificates or enable <code>legacy_audio</code> depending on your setup.</p>
<p><img src="/assets/blog/OpenVoiceOS-and-Home-Assistant-a-voice-automation-dream-team/hivemind_setup.png" alt="HiveMind setup in Home Assistant"></p>
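<p>As a compact reference, the connection fields described above might look like this once filled in. All values here are illustrative placeholders, and you enter them in the Home Assistant UI form rather than a config file:</p>

```json
{
  "name": "Kitchen Satellite",
  "access_key": "<your-hivemind-access-key>",
  "password": "<your-hivemind-password>",
  "site_id": "kitchen",
  "host": "192.168.1.50",
  "port": 5678,
  "allow_self_signed": false,
  "legacy_audio": false
}
```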
<h3><strong>Exposed Controls for OVOS Devices:</strong></h3>
<p>Once integrated, HiveMind exposes a comprehensive set of controls for your OVOS devices directly within Home Assistant. This allows you to manage various aspects of your OVOS device from the Home Assistant UI, including:</p>
<ul>
<li>Changing the <code>Listening Mode</code> (e.g., wakeword, always listening)</li>
<li><code>Microphone Mute</code> toggle</li>
<li><code>OCP Player</code> status and controls</li>
<li>Actions like <code>Reboot Device</code>, <code>Restart OVOS</code>, and <code>Shutdown Device</code></li>
<li>Toggling <code>Sleep Mode</code> and <code>SSH Service</code></li>
<li>Manually <code>Start Listening</code> or <code>Stop</code> listening</li>
<li>Controlling volume level</li>
</ul>
<p><img src="/assets/blog/OpenVoiceOS-and-Home-Assistant-a-voice-automation-dream-team/hivemind_entities.png" alt="HiveMind entities in Home Assistant"></p>
<h3><strong>Notifications Integration:</strong></h3>
<p>HiveMind also enables your OVOS devices to function as notification targets within Home Assistant. This means you can configure Home Assistant automations to send spoken notifications directly to your OVOS devices, allowing them to "speak" alerts, reminders, or any other information you configure. This is exposed as a "Speak" notifier entity in Home Assistant.</p>
<p><img src="/assets/blog/OpenVoiceOS-and-Home-Assistant-a-voice-automation-dream-team/hivemind_notify.png" alt="HiveMind notify service in Home Assistant"></p>
<h3><strong>Media Player and Music Assistant Integration:</strong></h3>
<p>A fantastic feature of HiveMind integration is that your OVOS devices will show up as standard media players within Home Assistant. This allows you to control media playback on your OVOS devices directly from Home Assistant's media player interface. Furthermore, this integration extends to services like Music Assistant, enabling you to stream music and other audio content from Music Assistant through your OVOS devices, making them a seamless part of your whole-home audio system.</p>
<p><img src="/assets/blog/OpenVoiceOS-and-Home-Assistant-a-voice-automation-dream-team/ha_player.png" alt="HiveMind player in Home Assistant"></p>
<p><img src="/assets/blog/OpenVoiceOS-and-Home-Assistant-a-voice-automation-dream-team/ma_player.png" alt="HiveMind player in Music Assistant"></p>
<hr>
<h2>Give OVOS the Keys to the Kingdom</h2>
<p>Finally, with the fantastic <a href="https://github.com/OscillateLabsLLC/skill-homeassistant">Skill HomeAssistant</a> from community member mikejgray, you can give OVOS direct control over Home Assistant.</p>
<p>Install this skill, and your OVOS device can now command your smart home. Just say, "Hey Mycroft, turn on the living room lights," and watch the magic happen. It’s the classic voice assistant experience, but fully private, customizable, and powered by two best-in-class open-source projects.</p>
<hr>
<h2>The Perfect Match</h2>
<p>When you let OVOS do the talking and Home Assistant do the automating, you get the best of both worlds. It’s a flexible, powerful, and fun combination that lets you build a smart home that is truly your own.</p>
<p><strong>A Note on This Early Preview Release</strong>:</p>
<p>We're incredibly excited to share this integration with you! Please keep in mind that this is an early preview release and is still under active development. While it's functional and powerful, it hasn't undergone extensive testing and may have rough edges. We're actively working to refine it, and your feedback is invaluable!</p>
<p>If you encounter any pain points or have ideas for improvements, please consider opening an issue or, even better, a Pull Request on our GitHub repositories. Your contributions help us make this even better for everyone. Thanks for being an early adopter and helping us shape the future of open-source voice!</p>
<hr>
<h2>Help Us Build Voice for Everyone</h2>
<p>OpenVoiceOS is more than software; it’s a mission. If you believe voice assistants should be open, inclusive, and user-controlled, here’s how you can help:</p>
<ul>
<li><strong>💸 Donate</strong>: Help us fund development, infrastructure, and legal protection.</li>
<li><strong>📣 Contribute Open Data</strong>: Share voice samples and transcriptions under open licenses.</li>
<li><strong>🌍 Translate</strong>: Help make OVOS accessible in every language.</li>
</ul>
<p>We're not building this for profit. We're building it for people. With your support, we can keep voice tech transparent, private, and community-owned.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2025-09-17-ovos_ha_dream_team</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2025-09-17-ovos_ha_dream_team</guid><dc:creator><![CDATA[JarbasAl]]></dc:creator><pubDate>Wed, 17 Sep 2025 00:00:00 GMT</pubDate></item><item><title><![CDATA[Mirror, Mirror… Who’s My Persona Today? (OVOS + MagicMirror² + Local LLM)]]></title><description><![CDATA[<h1>Mirror, Mirror… Who’s My Persona Today?</h1>
<p><em>Daily-driver edition: practical, speedy, and just cheeky enough 😏.</em></p>
<blockquote>
<p>OVOS = <strong>Open Voice OS</strong>.</p>
</blockquote>
<p>The demo shows our smart mirror flipping <strong>personas</strong> on command—<em>Friendly</em>, <em>Snobby</em>, <em>Smarty Pants</em>—while everything runs <strong>locally</strong>. Below is the <em>daily-driver</em> recipe we actually use: classic skills stay in charge, personas jump in when it’s chat time, and <strong>MagicMirror²</strong> handles the visuals. No devices were harmed. ☁️❌</p>
<p>🎥 <strong>Watch:</strong> <a href="https://www.youtube.com/watch?v=DDkEbAySH0I">https://www.youtube.com/watch?v=DDkEbAySH0I</a></p>
<hr>
<h2>Why personas (and why this layout)?</h2>
<p>In OVOS, a persona is a <strong>runtime-swappable reasoning preset</strong>. It can route a question to an LLM, change tone/verbosity, and gracefully handle open-ended chatter.
This layout keeps your <strong>high-confidence skills first</strong> (timers, weather, music, home control), then lets the persona shine for the fuzzy stuff. Best of both worlds. ✨</p>
<hr>
<h2>TL;DR (copy/paste speed-run)</h2>
<ol>
<li>Install OVOS with the <strong>ovos-installer TUI</strong>, select <strong>alpha</strong> channel, and <strong>do NOT enable GUI</strong> (MagicMirror² is the face).</li>
<li>In <strong><code>mycroft.conf</code></strong> (e.g., <code>~/.config/mycroft/mycroft.conf</code> or <code>/etc/mycroft/mycroft.conf</code>), add <strong><code>ovos-persona-pipeline-plugin-high</code></strong> <em>after</em> your high matchers.</li>
<li>Drop the persona JSON files below (Local LLM, Friendly, Snobby, Smarty Pants) and run <strong>Ollama</strong> for the model.</li>
<li>Set up <strong>MagicMirror²</strong> with <strong>MMM-ShareToMirror</strong> + <strong>MMM-ovos-wakeword</strong> and the <strong>OVOS skill</strong> <code>ovos-skill-share-to-mirror</code>.</li>
<li>Say “Switch to <strong>Local LLM</strong> persona,” and enjoy your mirror’s new mood. ☕️</li>
</ol>
<hr>
<h2>1) Install OVOS via the <strong>TUI</strong> (choose <em>alpha</em> &#x26; <strong>do NOT enable GUI</strong>)</h2>
<pre><code class="hljs language-bash"><span class="hljs-built_in">sudo</span> sh -c <span class="hljs-string">"<span class="hljs-subst">$(curl -fsSL https://raw.githubusercontent.com/OpenVoiceOS/ovos-installer/main/installer.sh)</span>"</span>
</code></pre>
<p>Inside the TUI:</p>
<ul>
<li><strong>Channel:</strong> pick <strong>Alpha</strong> (latest persona goodness)</li>
<li><strong>Method:</strong> <code>virtualenv</code> (or <code>containers</code>)</li>
<li><strong>Profile:</strong> <strong>OVOS</strong></li>
<li><strong>Features:</strong> enable <strong>Skills</strong> ✅ and <strong>do NOT enable GUI</strong> ❌ (MagicMirror² provides visuals)</li>
</ul>
<p>Keep your config path handy: <code>~/.config/mycroft/mycroft.conf</code> (user) or <code>/etc/mycroft/mycroft.conf</code> (system).</p>
<hr>
<h2>2) Add <strong>ovos-persona</strong> to today’s pipeline (daily-driver)</h2>
<p>Edit your config at <code>~/.config/mycroft/mycroft.conf</code> or <code>/etc/mycroft/mycroft.conf</code>. Insert persona after the high-confidence matchers:</p>
<pre><code class="hljs language-json"><span class="hljs-punctuation">{</span>
  <span class="hljs-attr">"intents"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span>
    <span class="hljs-attr">"pipeline"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">[</span>
      <span class="hljs-string">"ovos-stop-pipeline-plugin-high"</span><span class="hljs-punctuation">,</span>
      <span class="hljs-string">"ovos-converse-pipeline-plugin"</span><span class="hljs-punctuation">,</span>
      <span class="hljs-string">"ovos-ocp-pipeline-plugin-high"</span><span class="hljs-punctuation">,</span>
      <span class="hljs-string">"ovos-padatious-pipeline-plugin-high"</span><span class="hljs-punctuation">,</span>
      <span class="hljs-string">"ovos-adapt-pipeline-plugin-high"</span><span class="hljs-punctuation">,</span>
      <span class="hljs-string">"ovos-m2v-pipeline-high"</span><span class="hljs-punctuation">,</span>

      <span class="hljs-string">"ovos-persona-pipeline-plugin-high"</span><span class="hljs-punctuation">,</span>

      <span class="hljs-string">"ovos-ocp-pipeline-plugin-medium"</span><span class="hljs-punctuation">,</span>
      <span class="hljs-string">"ovos-fallback-pipeline-plugin-high"</span><span class="hljs-punctuation">,</span>
      <span class="hljs-string">"ovos-stop-pipeline-plugin-medium"</span><span class="hljs-punctuation">,</span>
      <span class="hljs-string">"ovos-adapt-pipeline-plugin-medium"</span><span class="hljs-punctuation">,</span>
      <span class="hljs-string">"ovos-fallback-pipeline-plugin-medium"</span><span class="hljs-punctuation">,</span>
      <span class="hljs-string">"ovos-fallback-pipeline-plugin-low"</span>
    <span class="hljs-punctuation">]</span>
  <span class="hljs-punctuation">}</span>
<span class="hljs-punctuation">}</span>
</code></pre>
<blockquote>
<p>Why here? Skills get first crack at clear commands. Persona handles open-ended “hmm, let’s think” questions—without stepping on your timers and lights. 🔦</p>
</blockquote>
<hr>
<h2>3) Personas (drop these into <code>~/.config/ovos_persona/</code>)</h2>
<p>All four personas use a local LLM via <strong>Ollama</strong>’s OpenAI-style <code>/v1</code> API. Adjust <code>model</code> to taste (<code>gemma3:4b</code> is a great default).</p>
<h3><code>local-llm.json</code> — base persona</h3>
<pre><code class="hljs language-json"><span class="hljs-punctuation">{</span>
  <span class="hljs-attr">"name"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"Local LLM"</span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"solvers"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">[</span><span class="hljs-string">"ovos-solver-openai-plugin"</span><span class="hljs-punctuation">]</span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"ovos-solver-openai-plugin"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span>
    <span class="hljs-attr">"api_url"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"http://127.0.0.1:11434/v1"</span><span class="hljs-punctuation">,</span>
    <span class="hljs-attr">"key"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"ollama"</span><span class="hljs-punctuation">,</span>
    <span class="hljs-attr">"model"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"gemma3:4b"</span><span class="hljs-punctuation">,</span>
    <span class="hljs-attr">"system_prompt"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"helpful, witty, concise; prefers practical answers."</span><span class="hljs-punctuation">,</span>
    <span class="hljs-attr">"temperature"</span><span class="hljs-punctuation">:</span> <span class="hljs-number">0.6</span>
  <span class="hljs-punctuation">}</span>
<span class="hljs-punctuation">}</span>
</code></pre>
<h3><code>friendly.json</code></h3>
<pre><code class="hljs language-json"><span class="hljs-punctuation">{</span>
  <span class="hljs-attr">"name"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"Friendly"</span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"solvers"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">[</span><span class="hljs-string">"ovos-solver-openai-plugin"</span><span class="hljs-punctuation">]</span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"ovos-solver-openai-plugin"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span>
    <span class="hljs-attr">"api_url"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"http://127.0.0.1:11434/v1"</span><span class="hljs-punctuation">,</span>
    <span class="hljs-attr">"key"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"ollama"</span><span class="hljs-punctuation">,</span>
    <span class="hljs-attr">"model"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"gemma3:4b"</span><span class="hljs-punctuation">,</span>
    <span class="hljs-attr">"system_prompt"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"kind, upbeat, plain language, one actionable next step; no emojis."</span><span class="hljs-punctuation">,</span>
    <span class="hljs-attr">"temperature"</span><span class="hljs-punctuation">:</span> <span class="hljs-number">0.7</span>
  <span class="hljs-punctuation">}</span>
<span class="hljs-punctuation">}</span>
</code></pre>
<h3><code>snobby.json</code></h3>
<pre><code class="hljs language-json"><span class="hljs-punctuation">{</span>
  <span class="hljs-attr">"name"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"Snobby"</span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"solvers"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">[</span><span class="hljs-string">"ovos-solver-openai-plugin"</span><span class="hljs-punctuation">]</span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"ovos-solver-openai-plugin"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span>
    <span class="hljs-attr">"api_url"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"http://127.0.0.1:11434/v1"</span><span class="hljs-punctuation">,</span>
    <span class="hljs-attr">"key"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"ollama"</span><span class="hljs-punctuation">,</span>
    <span class="hljs-attr">"model"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"gemma3:4b"</span><span class="hljs-punctuation">,</span>
    <span class="hljs-attr">"system_prompt"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"precise, mildly sardonic librarian; terse and technically correct; no emojis."</span><span class="hljs-punctuation">,</span>
    <span class="hljs-attr">"temperature"</span><span class="hljs-punctuation">:</span> <span class="hljs-number">0.5</span>
  <span class="hljs-punctuation">}</span>
<span class="hljs-punctuation">}</span>
</code></pre>
<h3><code>smarty-pants.json</code></h3>
<pre><code class="hljs language-json"><span class="hljs-punctuation">{</span>
  <span class="hljs-attr">"name"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"Smarty Pants"</span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"solvers"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">[</span><span class="hljs-string">"ovos-solver-openai-plugin"</span><span class="hljs-punctuation">]</span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"ovos-solver-openai-plugin"</span><span class="hljs-punctuation">:</span> <span class="hljs-punctuation">{</span>
    <span class="hljs-attr">"api_url"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"http://127.0.0.1:11434/v1"</span><span class="hljs-punctuation">,</span>
    <span class="hljs-attr">"key"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"ollama"</span><span class="hljs-punctuation">,</span>
    <span class="hljs-attr">"model"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"gemma3:4b"</span><span class="hljs-punctuation">,</span>
    <span class="hljs-attr">"system_prompt"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"clever explainer; tiny examples; 1–2 sentences per idea; playful but clear; no fluff."</span><span class="hljs-punctuation">,</span>
    <span class="hljs-attr">"temperature"</span><span class="hljs-punctuation">:</span> <span class="hljs-number">0.65</span>
  <span class="hljs-punctuation">}</span>
<span class="hljs-punctuation">}</span>
</code></pre>
<h3>🎭 Note on “vibe” vs model</h3>
<p>Even with the <strong>same</strong> persona file, different models behave differently:</p>
<ul>
<li><strong><code>gemma3:4b</code></strong> – friendly, concise; great default on Pi/mini-PC.</li>
<li><strong><code>qwen2.5:1.5b</code></strong> – very small &#x26; literal; bump temperature slightly if it feels too dry.</li>
<li><strong><code>llama3.1:8b</code></strong> – more capable but chattier; lower temperature &#x26; <code>max_tokens</code> to keep replies snappy.</li>
</ul>
<p>Think of the persona JSON as the <strong>style guide</strong>, and the model as the <strong>actor</strong>—casting changes the performance. 🎬</p>
<hr>
<h2>4) Run a local model with <strong>Ollama</strong></h2>
<pre><code class="hljs language-bash">curl -fsSL https://ollama.com/install.sh | sh
ollama pull gemma3:4b
ollama serve &#x26;                <span class="hljs-comment"># http://127.0.0.1:11434/v1</span>
curl -s http://127.0.0.1:11434/v1/models | jq .
</code></pre>
<hr>
<h2>5) Install <strong>MagicMirror²</strong></h2>
<pre><code class="hljs language-bash"><span class="hljs-comment"># prerequisites: Node 18+, git</span>
git <span class="hljs-built_in">clone</span> https://github.com/MagicMirrorOrg/MagicMirror ~/MagicMirror
<span class="hljs-built_in">cd</span> ~/MagicMirror
node --run install-mm
<span class="hljs-built_in">cp</span> config/config.js.sample config/config.js
<span class="hljs-comment"># start on the local display</span>
node --run start
<span class="hljs-comment"># or as a web server:</span>
<span class="hljs-comment"># node --run server</span>
</code></pre>
<p>If you run server-only, set <code>address: "0.0.0.0"</code> and add your LAN to <code>ipWhitelist</code> in <code>config/config.js</code>.</p>
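A quick way to double-check the CIDR before you put it in <code>ipWhitelist</code>: MagicMirror does the matching itself (in JavaScript), so this Python snippet only verifies that your chosen range actually covers your clients. The <code>192.168.1.0/24</code> range is an example; adjust to your network.

```python
import ipaddress

# Sanity check: does this client address fall inside the LAN range you plan
# to whitelist? (192.168.1.0/24 is an example; adjust to your network.)
lan = ipaddress.ip_network("192.168.1.0/24")

def on_lan(ip: str) -> bool:
    """True if the client address is inside the whitelisted range."""
    return ipaddress.ip_address(ip) in lan

print(on_lan("192.168.1.42"), on_lan("10.0.0.5"))
```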
<hr>
<h2>6) Add <strong>MMM-ShareToMirror</strong> + <strong>ovos-skill-share-to-mirror</strong></h2>
<ul>
<li>Module: <a href="https://github.com/smartgic/MMM-ShareToMirror">https://github.com/smartgic/MMM-ShareToMirror</a></li>
<li>Skill:  <a href="https://github.com/smartgic/ovos-skill-share-to-mirror">https://github.com/smartgic/ovos-skill-share-to-mirror</a></li>
</ul>
<h3>Install the module</h3>
<pre><code class="hljs language-bash"><span class="hljs-built_in">cd</span> ~/MagicMirror/modules
git <span class="hljs-built_in">clone</span> https://github.com/smartgic/MMM-ShareToMirror.git
<span class="hljs-built_in">cd</span> MMM-ShareToMirror &#x26;&#x26; npm install
</code></pre>
<p>Add to <code>~/MagicMirror/config/config.js</code>:</p>
<pre><code class="hljs language-js">{
  <span class="hljs-attr">module</span>: <span class="hljs-string">"MMM-ShareToMirror"</span>,
  <span class="hljs-attr">position</span>: <span class="hljs-string">"bottom_center"</span>,
  <span class="hljs-attr">config</span>: {
    <span class="hljs-attr">port</span>: <span class="hljs-number">8570</span>,
    <span class="hljs-attr">https</span>: { <span class="hljs-attr">enabled</span>: <span class="hljs-literal">false</span> },
    <span class="hljs-attr">invisible</span>: <span class="hljs-literal">true</span>,
    <span class="hljs-attr">overlay</span>: {
      <span class="hljs-attr">width</span>: <span class="hljs-string">"70vw"</span>,
      <span class="hljs-attr">maxWidth</span>: <span class="hljs-string">"1280px"</span>,
      <span class="hljs-attr">aspectRatio</span>: <span class="hljs-string">"16 / 9"</span>,
      <span class="hljs-attr">zIndex</span>: <span class="hljs-number">9999</span>,
      <span class="hljs-attr">borderRadius</span>: <span class="hljs-string">"16px"</span>,
      <span class="hljs-attr">boxShadow</span>: <span class="hljs-string">"0 10px 36px rgba(0,0,0,.5)"</span>
    },
    <span class="hljs-attr">caption</span>: { <span class="hljs-attr">enabled</span>: <span class="hljs-literal">false</span>, <span class="hljs-attr">lang</span>: <span class="hljs-string">"en"</span> },
    <span class="hljs-attr">quality</span>: { <span class="hljs-attr">target</span>: <span class="hljs-string">"auto"</span>, <span class="hljs-attr">lock</span>: <span class="hljs-literal">false</span> }
  }
}
</code></pre>
<h3>Install the OVOS skill (pip method)</h3>
<pre><code class="hljs language-bash">pip install <span class="hljs-string">"git+https://github.com/smartgic/ovos-skill-share-to-mirror.git"</span>
</code></pre>
<p>Create skill settings at
<code>~/.config/mycroft/skills/ovos-skill-share-to-mirror.smartgic/settings.json</code>:</p>
<pre><code class="hljs language-json"><span class="hljs-punctuation">{</span>
  <span class="hljs-attr">"base_url"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"http://&#x3C;MIRROR_IP>:8570"</span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"verify_ssl"</span><span class="hljs-punctuation">:</span> <span class="hljs-literal"><span class="hljs-keyword">false</span></span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"request_timeout"</span><span class="hljs-punctuation">:</span> <span class="hljs-number">6</span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"caption_enabled"</span><span class="hljs-punctuation">:</span> <span class="hljs-literal"><span class="hljs-keyword">false</span></span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"caption_lang"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"en"</span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"quality_target"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"auto"</span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"quality_lock"</span><span class="hljs-punctuation">:</span> <span class="hljs-literal"><span class="hljs-keyword">false</span></span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"search_backend"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"yt_dlp"</span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"youtube_api_key"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">""</span>
<span class="hljs-punctuation">}</span>
</code></pre>
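Before restarting OVOS, it's worth sanity-checking what you wrote. This sketch mirrors the settings file above; the <code>base_url</code> uses a made-up LAN IP standing in for the MIRROR_IP placeholder, and the <code>check()</code> helper is ours, not part of the skill.

```python
import json

# Mirrors the settings.json example above; base_url uses a made-up LAN IP
# standing in for the <MIRROR_IP> placeholder.
settings = {
    "base_url": "http://192.168.1.50:8570",
    "verify_ssl": False,
    "request_timeout": 6,
    "caption_enabled": False,
    "caption_lang": "en",
    "quality_target": "auto",
    "quality_lock": False,
    "search_backend": "yt_dlp",
    "youtube_api_key": "",
}

def check(cfg: dict) -> list:
    """Return a list of problems; an empty list means the config looks usable."""
    problems = []
    if not str(cfg.get("base_url", "")).startswith(("http://", "https://")):
        problems.append("base_url must be an http(s) URL")
    if not isinstance(cfg.get("request_timeout"), (int, float)):
        problems.append("request_timeout must be a number")
    return problems

# json.dumps doubles as a check that the file we would write is valid JSON.
print(check(settings), len(json.dumps(settings)) > 0)
```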
<p>Restart OVOS services so the skill gets picked up. Now you can say:
<strong>“Play a video about espresso on the mirror.”</strong>
…and the skill will drive the MMM-ShareToMirror overlay.</p>
<h4>More handy intents for this skill</h4>
<ul>
<li><strong>“Play &#x3C;query> on the mirror.”</strong> — search + play</li>
<li><strong>“Pause the mirror video.” / “Resume the mirror video.”</strong></li>
<li><strong>“Stop the mirror video.”</strong></li>
<li><strong>“Show captions on the mirror.” / “Hide captions on the mirror.”</strong></li>
<li><strong>“Set mirror quality to high/auto/low.”</strong></li>
<li><strong>“Open this YouTube link on the mirror.”</strong> <em>(if used with a URL or clipboard helper)</em></li>
<li><strong>“Close the mirror overlay.”</strong> <em>(hides the player)</em></li>
</ul>
<hr>
<h2>7) Add <strong>MMM-ovos-wakeword</strong> (+ OVOS side ping)</h2>
<ul>
<li>Module: <a href="https://github.com/smartgic/MMM-ovos-wakeword">https://github.com/smartgic/MMM-ovos-wakeword</a></li>
</ul>
<h3>Install the module</h3>
<pre><code class="hljs language-bash"><span class="hljs-built_in">cd</span> ~/MagicMirror/modules
git <span class="hljs-built_in">clone</span> https://github.com/smartgic/MMM-ovos-wakeword.git
</code></pre>
<p>Add to <code>~/MagicMirror/config/config.js</code> (and allow your LAN):</p>
<pre><code class="hljs language-js"><span class="hljs-keyword">let</span> config = {
  <span class="hljs-attr">address</span>: <span class="hljs-string">"0.0.0.0"</span>,
  <span class="hljs-attr">port</span>: <span class="hljs-number">8080</span>,
  <span class="hljs-attr">ipWhitelist</span>: [<span class="hljs-string">"127.0.0.1"</span>,<span class="hljs-string">"::1"</span>,<span class="hljs-string">"::ffff:192.168.1.1/24"</span>],

  <span class="hljs-attr">modules</span>: [
    <span class="hljs-comment">// ... other modules ...</span>
    {
      <span class="hljs-attr">module</span>: <span class="hljs-string">"MMM-ovos-wakeword"</span>,
      <span class="hljs-attr">position</span>: <span class="hljs-string">"lower_third"</span>,
      <span class="hljs-attr">config</span>: {
        <span class="hljs-attr">title</span>: <span class="hljs-string">"Open Voice OS"</span>,
        <span class="hljs-attr">apiKey</span>: <span class="hljs-string">"CHANGE-ME-TO-A-RANDOM-STRING"</span>,
        <span class="hljs-attr">maxMessages</span>: <span class="hljs-number">1</span>,
        <span class="hljs-attr">opacity</span>: <span class="hljs-number">0.5</span>
      }
    }
  ]
}
</code></pre>
<h3>Install the OVOS PHAL plugin to send the wake cue</h3>
<pre><code class="hljs language-bash">pip install ovos-phal-plugin-mm-wakeword
</code></pre>
<p>Create <code>~/.config/OpenVoiceOS/ovos-phal-plugin-mm-wakeword.json</code>:</p>
<pre><code class="hljs language-json"><span class="hljs-punctuation">{</span>
  <span class="hljs-attr">"url"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"http://&#x3C;MIRROR_IP>:8080"</span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"key"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"CHANGE-ME-TO-A-RANDOM-STRING"</span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"message"</span><span class="hljs-punctuation">:</span> <span class="hljs-string">"Listening..."</span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"timeout"</span><span class="hljs-punctuation">:</span> <span class="hljs-number">10</span><span class="hljs-punctuation">,</span>
  <span class="hljs-attr">"verify"</span><span class="hljs-punctuation">:</span> <span class="hljs-literal"><span class="hljs-keyword">false</span></span>
<span class="hljs-punctuation">}</span>
</code></pre>
<p>Restart OVOS (<code>systemctl --user restart ovos</code>).
Say your wake word — the mirror flashes “Listening…” then fades. ✨</p>
<hr>
<h2>8) Use it (talk to your mirror like it’s totally normal)</h2>
<ul>
<li>“<strong>Switch to Local LLM persona.</strong>”</li>
<li>“<strong>Switch to Friendly.</strong>”</li>
<li>“<strong>Talk to Snobby about pour-over coffee.</strong>”</li>
<li>“<strong>Smarty Pants, explain Kubernetes like I’m five.</strong>”</li>
<li>“<strong>Play lo-fi beats on the mirror.</strong>”</li>
<li>“<strong>List personas.</strong>”</li>
</ul>
<p><em>If your mirror starts giving life advice, that’s on you.</em> 😇</p>
<hr>
<p>That’s it—your <strong>daily-driver</strong> setup is ready. Skills stay crisp, personas bring the charm, and your mirror finally has the personality it deserves. Now go ask it for a pep talk before your next meeting. 💪🪞</p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2025-08-23-enhance-magicmirror-with-ovos</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2025-08-23-enhance-magicmirror-with-ovos</guid><dc:creator><![CDATA[Gaëtan Trellu]]></dc:creator><pubDate>Sat, 23 Aug 2025 14:00:00 GMT</pubDate></item><item><title><![CDATA[A real use case with OVOS and Hivemind]]></title><description><![CDATA[<h1>Why Home Automation with OVOS matters</h1>
<p>Over the past five years, I’ve visited many people with disabilities in their homes and witnessed firsthand how much effort it can take to control everyday things like lights, heating, doors, curtains, and appliances. Often, this leads to less comfort than what’s technically possible. Giving people access to voice-activated home automation isn't just convenient—it offers them greater independence, comfort, and peace of mind.</p>
<p>Such a system typically consists of two core components: a home automation gateway and a voice assistant. OVOS (Open Voice OS) is a voice-first assistant that integrates seamlessly with home automation systems. Its full focus on voice means that every part of the system—from wakeword to intent—is designed with spoken interaction in mind.</p>
<p><img src="/assets/blog/A-real-use-case-with-OVOS-and-Hivemind/Sat_kitchen_smallest.jpg" alt="The satellite in a 3D-printed case"></p>
<p>With OVOS, I can fully customize and control intents for things like switching on lights, playing music, or interacting with an AI assistant. That level of control is often missing in commercial systems. Sometimes, when you ask them to close the curtains, they might suggest where to buy new ones instead.</p>
<h2>OVOS and Hivemind: Filling Real Needs</h2>
<p>I'm using an OVOS server with Hivemind satellites, which allows me to adapt the system to virtually any use case. These lightweight, low-cost satellites make it feasible to have a voice assistant in every room. What’s more, I can easily customize the enclosures to suit specific situations—like fitting them into a bathroom, or attaching one to a wheelchair. Thanks to their low power consumption, the satellites can even be powered directly from a wheelchair battery.</p>
<p>Looking ahead, the next step is integrating a personal AI assistant directly into the voice assistant. This enables a private, fully offline solution—something that’s becoming increasingly important for privacy, reliability, and autonomy.</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/C_xS87EbsiM" title="Start coffee machine with voice" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
<br/>
<iframe width="560" height="315" src="https://www.youtube.com/embed/PRzGxmTCFb0" title="Talk with the GenAI assistant" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
<h3>How I Built a Smarter Home with OVOS, Hivemind, and Raspberry Pi Satellites</h3>
<p>In this showcase, I’ve built a basic voice setup using Open Voice OS (OVOS) and Hivemind. The OVOS server runs on a compact Intel NUC with a 5th-gen i5 processor. It hosts both the OVOS core and a generative AI model (Ollama running Gemma3:1B), which enables offline voice control with (some) intelligence and context-aware responses. By linking OVOS to my Homey gateway, I can control lighting, audio, and various smart devices with natural speech.</p>
<p>To expand voice coverage throughout the house, I added Raspberry Pi Zero 2W units as satellites. These low-power devices use local voice activity detection (VAD) and wakeword spotting to detect the wakeword ("Hey Mycroft"). Once triggered, they record the spoken command and forward the audio to the Hivemind listener on the OVOS server. Hivemind then routes the audio to OVOS, which parses the utterance and executes the appropriate intent—whether it’s starting a radio stream, sending a message to the AI assistant, or toggling a device via the Homey API.</p>
<p><img src="/assets/blog/A-real-use-case-with-OVOS-and-Hivemind/Sat_assembled_smallest.jpg" alt="Only a few components are required"></p>
<p>While OVOS and Hivemind are still evolving, this setup already proves itself in real-world use. It’s stable enough for friendly testing and flexible enough to grow with future needs.</p>
<h2>Conclusion</h2>
<p>OVOS has key features for controlling the house:</p>
<ul>
<li>fine-grained control over intents, enabling predictable outcomes</li>
<li>satellite architecture for use-case flexibility</li>
<li>[future] integration with local or private GenAI</li>
</ul>
<h3>More links</h3>
<ul>
<li><a href="https://github.com/MenneBos/ovos-skill-HomeyFlowTrigger/tree/main/Mechanics">3D drawings</a></li>
<li><a href="https://github.com/MenneBos/ovos-skill-HomeyFlowTrigger/tree/main/Hardware/KiCad_OVOS_sat">PCB drawings</a></li>
<li><a href="https://openvoiceos.github.io/ovos-technical-manual/">OVOS technical manual</a></li>
<li><a href="https://jarbashivemind.github.io/HiveMind-community-docs/">Hivemind documentation</a></li>
</ul>
<h2>Help Us Build Voice for Everyone</h2>
<p>If you believe that voice assistants should be open, inclusive, and user-controlled, we invite you to support OVOS:</p>
<ul>
<li>
<p><strong>💸 Donate</strong>: Your contributions help us pay for infrastructure, development, and legal protections.</p>
</li>
<li>
<p><strong>📣 Contribute Open Data</strong>: Speech models need diverse, high-quality data. If you can share voice samples, transcripts, or datasets under open licenses, let's collaborate.</p>
</li>
<li>
<p><strong>🌍 Help Translate</strong>: OVOS is global by nature. Translators make our platform accessible to more communities every day.</p>
</li>
</ul>
<p>We're not building this for profit. We're building it for people. And with your help, we can ensure open voice has a future—transparent, private, and community-owned.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2025-07-25-A-real-use-case-with-OVOS-and-Hivemind</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2025-07-25-A-real-use-case-with-OVOS-and-Hivemind</guid><dc:creator><![CDATA[Menne Bos]]></dc:creator><pubDate>Mon, 28 Jul 2025 00:00:00 GMT</pubDate></item><item><title><![CDATA[No More Mumbo Jumbo: Meet the OVOS Transcription Validator Plugin]]></title><description><![CDATA[<h2>No More Mumbo Jumbo: Meet the OVOS Transcription Validator Plugin</h2>
<p>Ever had your OpenVoiceOS assistant respond with a blank stare, or worse, confidently say something that had absolutely nothing to do with what you said? We've all been there. It’s like trying to reason with a sleep-deprived parrot: part amusing, mostly frustrating.</p>
<p>Well, get ready to reclaim your sanity. We're excited to introduce a new member of the OVOS ecosystem: the <strong>OVOS Transcription Validator Plugin</strong>, your new favorite gatekeeper for STT nonsense.</p>
<hr>
<h3>What's the Point?</h3>
<p>Picture this: your assistant is a club bouncer for utterances. Every spoken phrase gets checked before it’s allowed into the VIP lounge of “Intent Handling.” If your transcription is incoherent, incomplete, or simply unhinged, the bouncer boots it right back to the sidewalk. No more "Potatoes stop green light now yes" turning on your kitchen lights. No more "Play the next song, but make it sound like a cat" confusing your music library.</p>
<p>This plugin ensures only sane, well-formed inputs move forward, drastically improving the assistant’s overall reliability.</p>
<hr>
<h3>Okay, But How?</h3>
<p>Magic. Kidding. It’s just <strong>LLMs doing what they do best</strong>, judging you silently.</p>
<p>Here’s what actually happens:</p>
<ol>
<li>You speak; the STT engine transcribes.</li>
<li>The plugin grabs the result and the language info.</li>
<li>It sends that to your configured LLM (local Ollama, OpenAI-compatible API, you name it).</li>
<li>The LLM replies: <em>“Yup, that’s language”</em> or <em>“Nope, that's verbal spaghetti”</em>.</li>
<li>If rejected, the utterance is dropped before it causes downstream chaos (you can even play a polite error beep or ask the user to repeat themselves).</li>
</ol>
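The five steps above can be sketched in a few lines. The function and prompt shapes here are illustrative, not the plugin's actual API; the point is simply that an LLM verdict decides whether a transcription ever reaches intent handling.

```python
# Illustrative sketch of the gate described above (not the plugin's real API):
# an LLM verdict decides whether a transcription reaches intent handling.
def validate_transcription(utterance: str, lang: str, ask_llm) -> bool:
    """Return True if the utterance should proceed to intent handling."""
    if not utterance.strip():
        return False  # drop empty transcriptions outright
    verdict = ask_llm(
        f"Is this a coherent {lang} utterance? Answer yes or no: {utterance!r}"
    )
    return verdict.strip().lower().startswith("yes")

# Stub "LLM" standing in for a local Ollama or OpenAI-compatible endpoint:
stub = lambda prompt: "no" if "giraffes" in prompt else "yes"

print(validate_transcription("set timer for ten minutes", "en-US", stub))
print(validate_transcription("set timer for forty giraffes", "en-US", stub))
```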
<hr>
<h3>LLMs for This? Really?</h3>
<p>We hear you. It <em>is</em> a bit like using a sledgehammer to swat a fly. Validating a sentence isn’t rocket science, but it turns out rocket science is cheap and locally available now.</p>
<p>Yes, Large Language Models are arguably overkill for checking if "play music" is a coherent command. But the results are strong, and let’s face it: you’re probably already running an LLM somewhere just to decide which emoji to use.</p>
<p>Still, we acknowledge the irony. In an age of bloated "AI agents" managing other agents, this plugin fits right in. <em>Yo dawg, I heard you like agents, so we added an agent to your agent to decide if it’s allowed to answer.</em></p>
<p>If you’re looking for lighter-weight alternatives, we’ve considered adding traditional rule-based validators too. But for now, if you’ve got the cycles, why not let the LLM earn its keep?</p>
<hr>
<h3>Get Started: Stop the Gibberish at the Gate</h3>
<p>Want in? Getting started is easy. Visit the <a href="https://github.com/TigreGotico/ovos-transcription-validator-plugin">GitHub repo</a> for installation instructions and configuration tips.</p>
<p>With the OVOS Transcription Validator Plugin, your assistant will stop mistaking scrambled speech for commands, and you’ll stop yelling at your microwave because it thought you said “set timer for forty giraffes.”</p>
<p>So, go forth and chat with your OpenVoiceOS assistant with renewed confidence! Your conversations will be clearer, your commands more precise, and those moments of "What on Earth did I just say?!" will become a distant, funny memory.</p>
<hr>
<h2>Help Us Build Voice for Everyone</h2>
<p>OpenVoiceOS is more than software, it’s a mission. If you believe voice assistants should be open, inclusive, and user-controlled, here’s how you can help:</p>
<ul>
<li><strong>💸 Donate</strong>: Help us fund development, infrastructure, and legal protection.</li>
<li><strong>📣 Contribute Open Data</strong>: Share voice samples and transcriptions under open licenses.</li>
<li><strong>🌍 Translate</strong>: Help make OVOS accessible in every language.</li>
</ul>
<p>We're not building this for profit. We're building it for people. With your support, we can keep voice tech transparent, private, and community-owned.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2025-07-22-ovos-transcription-validator-plugin</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2025-07-22-ovos-transcription-validator-plugin</guid><dc:creator><![CDATA[JarbasAl]]></dc:creator><pubDate>Tue, 22 Jul 2025 00:00:00 GMT</pubDate></item><item><title><![CDATA[Making Synthetic Voices From Scratch]]></title><description><![CDATA[<h2>Making Synthetic Voices From Scratch</h2>
<h3>What’s the problem?</h3>
<p>Creating a voice for a text-to-speech (TTS) system usually requires a real person to spend hours recording audio. That’s expensive, time-consuming, and in many languages or accents, the voices just don’t exist at all, especially for open-source or offline use.</p>
<h3>What did we do?</h3>
<p>We developed a technique that allows us to <strong>create synthetic voices completely from scratch</strong>, even if we don’t have recordings from a real person. These voices:</p>
<ul>
<li>Work <strong>offline</strong>, even on small devices like a Raspberry Pi,</li>
<li>Can speak any language, if there’s a good donor system available,</li>
<li>Are fully customizable in sound and tone.</li>
</ul>
<h3>How does it work?</h3>
<ol>
<li>
<p><strong>Start with an existing voice</strong> - We use an existing TTS voice (from any source) to generate lots of fake speech and text pairs.</p>
</li>
<li>
<p><strong>Transform it into a new voice</strong> - We apply a special voice conversion process to change the sound of the voice to something new, like a different gender, age, or accent.</p>
</li>
<li>
<p><strong>Train a compact model</strong> - With this synthetic data, we train a new voice model that sounds natural, speaks fluently, and runs entirely offline.</p>
</li>
</ol>
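The three steps above, as a runnable sketch. Every name here is a stand-in (none of this is a real OVOS or TTS API), and real systems pass audio, not strings; it just shows how the pieces chain together.

```python
# Stand-in sketch of the pipeline above: donor TTS -> voice conversion ->
# compact model training. Real systems pass audio, not strings.
def build_synthetic_voice(donor_tts, convert_voice, train_model, sentences):
    # 1) The donor TTS generates synthetic (text, audio) training pairs.
    pairs = [(text, donor_tts(text)) for text in sentences]
    # 2) Voice conversion reshapes each clip into the new target voice.
    converted = [(text, convert_voice(audio)) for text, audio in pairs]
    # 3) A compact offline model is trained on the converted corpus.
    return train_model(converted)

donor = lambda text: f"donor_audio<{text}>"
convert = lambda audio: f"new_voice<{audio}>"
train = lambda corpus: {"voice": "demo", "examples": len(corpus)}

print(build_synthetic_voice(donor, convert, train, ["olá", "bom dia"]))
```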
<h3>Why is this special?</h3>
<ul>
<li>We can create a new voice <strong>without needing anyone to record lines</strong>.</li>
<li>The voices don’t rely on cloud services, they work <strong>100% offline</strong>.</li>
<li>Each voice can be <strong>customized</strong> to sound unique or to match a character, personality, or accent.</li>
</ul>
<h3>What about ethics?</h3>
<p>We take voice rights seriously.</p>
<ul>
<li>If we’re using a real person’s voice, we always get <strong>clear permission</strong>.</li>
<li>If no permission is available, we use <strong>public domain recordings</strong> or create <strong>original voices</strong> that don’t copy anyone.</li>
<li>Our process actually makes the voice <strong>less recognizable</strong>, which helps protect privacy and avoid impersonation risks.</li>
</ul>
<h3>Real-world example</h3>
<p>We applied this method to <strong>European Portuguese</strong>, a language that had no good offline voice options. In a short time, we built <strong>4 brand-new, high-quality voices</strong>, no recordings needed, and they all run on small local devices.</p>
<blockquote>
<p>💡 Did we mention OpenVoiceOS now has a Hugging Face account? Find all our TTS voices and more at <a href="https://huggingface.co/OpenVoiceOS">huggingface.co/OpenVoiceOS</a>.</p>
</blockquote>
<hr>
<h3>In short:</h3>
<blockquote>
<p>We’ve found a way to build natural-sounding, offline-ready synthetic voices, <strong>without needing a real speaker</strong>. It’s fast, ethical, and opens the door for more voices in more languages, for everyone.</p>
</blockquote>
<hr>
<h2>Help Us Build Voice for Everyone</h2>
<p>If you believe that voice assistants should be open, inclusive, and user-controlled, we invite you to support OVOS:</p>
<ul>
<li>
<p><strong>💸 Donate</strong>: Your contributions help us pay for infrastructure, development, and legal protections.</p>
</li>
<li>
<p><strong>📣 Contribute Open Data</strong>: Speech models need diverse, high-quality data. If you can share voice samples, transcripts, or datasets under open licenses, let's collaborate.</p>
</li>
<li>
<p><strong>🌍 Help Translate</strong>: OVOS is global by nature. Translators make our platform accessible to more communities every day.</p>
</li>
</ul>
<p>We're not building this for profit. We're building it for people. And with your help, we can ensure open voice has a future—transparent, private, and community-owned.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2025-06-26-making-synthetic-voices-from-scratch</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2025-06-26-making-synthetic-voices-from-scratch</guid><dc:creator><![CDATA[JarbasAl]]></dc:creator><pubDate>Thu, 26 Jun 2025 00:00:00 GMT</pubDate></item><item><title><![CDATA[OVOS and Mycroft: A Fork That Wasn't Meant to Be]]></title><description><![CDATA[<p>You're probably familiar with what happened between OpenOffice and LibreOffice.</p>
<p>Once upon a time, OpenOffice was the open-source office suite. But after Oracle acquired Sun Microsystems (and with it, OpenOffice), the community grew uneasy. Development stagnated, trust eroded, and eventually the project was handed off to the Apache Foundation. By then, most of the core contributors had already left and started something new: LibreOffice, a true community-led fork with a renewed focus on transparency, independence, and rapid improvement.</p>
<p>But OpenOffice never really died. It just stopped moving. For over a decade, it's existed in a kind of zombie state—still present in Linux package managers, still confusing users, but stuck at what is essentially LibreOffice 4.1, frozen in time circa 2013.</p>
<p>This ghost of a project, abandoned but not buried, continues to haunt the free software world.</p>
<p>Now let's talk about Mycroft and OpenVoiceOS (OVOS). You might assume it followed the same pattern: a promising open-source assistant slowly losing steam, then replaced by a fork. But that's not quite what happened—and more importantly, we didn't fall into the same trap.</p>
<p>In fact, we nearly did. The parallels are striking. But the outcome was different.</p>
<h2>Fork vs. Fragmentation</h2>
<p>The OpenOffice/LibreOffice split happened early enough that LibreOffice could carry the torch—but not without baggage. OpenOffice stuck around, unmaintained but still visible, creating years of fragmentation and user confusion.</p>
<p>Mycroft, on the other hand, was never formally abandoned. The company kept control of the name, infrastructure, and cloud services. But over time, the development slowed, the community was sidelined, and contributions that might've helped sustain it were rejected.</p>
<p>Meanwhile, a growing group of experienced Mycroft implementers were building tools, fixes, and improvements on their own—quietly preparing for the inevitable. When Mycroft finally collapsed in 2023, there was already a fully functional continuation waiting in the wings: OpenVoiceOS.</p>
<h2>The Cloud Dependency That Wasn't Meant to Last</h2>
<p>LibreOffice could fork and go—because OpenOffice was local-first and fully open. Mycroft? Not so much.</p>
<p>Despite claiming to be an open-source voice assistant, critical pieces of Mycroft's architecture were tied to proprietary cloud infrastructure—from the Selene backend to STT and TTS services. The project was open in code, but not in practice.</p>
<p>When the company refused community patches that would enable local-first or self-hosted alternatives, it left only one path forward: rebuild the stack independently.</p>
<p>That's exactly what the community did. OVOS gradually replaced every closed or brittle part of the system, transforming Mycroft into something truly modular and self-sustaining.</p>
<h2>OVOS Was Built by the Core Community</h2>
<p>OpenVoiceOS didn't appear overnight. It grew organically from the work of developers who had been using Mycroft in the real world: integrating it into smart homes, hacking around limitations, maintaining plugins, and submitting fixes.</p>
<p>These weren't just fans. These were the people already doing the work.</p>
<p>So when the Mycroft name became off-limits and the company asked a community project to stop calling itself "MycroftOS", the spark was lit.</p>
<p>"You can't call it that."</p>
<p>"Okay then… any name ideas?"</p>
<p>"What if we all shipped our tools together under one banner?"</p>
<p>Thus, in 2020, OpenVoiceOS was born—not just as a fork, but as the next step forward. Only in hindsight does the LibreOffice/OpenOffice comparison make sense. At the time, this wasn't a rebellion. It was a survival move.</p>
<h2>Open by Philosophy, Not Just License</h2>
<p>OVOS wasn't born out of a desire to fork Mycroft, much like the ooo-build project wasn't intended to create a fork of OpenOffice. For many years, the OpenOffice.org community, which wasn't part of Sun Microsystems, had wanted a more egalitarian structure for the project. Ximian and then Novell maintained ooo-build, a patchset designed to make building OpenOffice easier on Linux and to address the challenges they faced when trying to contribute to the mainline development, which was often blocked. ooo-build wasn't a fork—it was simply a way to make contributions and development more sustainable.</p>
<p>Mycroft made the same mistake. It rejected patches that might have enabled its independence. It chose control over collaboration, and that decision slowly alienated its most dedicated contributors.</p>
<p>In much the same way, OVOS didn't start as a fork of Mycroft. It began as a community-driven initiative, focused on solving problems, removing the barriers that Mycroft had put in place, and taking ownership of the parts that Mycroft had neglected or restricted. The decision to move away from Mycroft's centralization wasn't driven by a desire to create something entirely separate but by the necessity of building a more open, flexible platform.</p>
<p>Unlike Mycroft, which maintained its tight cloud dependency and gatekept contributions that could have improved its autonomy, OVOS embraced open-source principles to create a system that was open by philosophy—not just by license.</p>
<p>OVOS accepts patches. It welcomes modules. It encourages users to opt out of cloud features. That open philosophy is what allowed it to stay alive while Mycroft faded away.</p>
<h2>No Zombie Packages, No Confusion</h2>
<p>What if Mycroft had been packaged in major Linux distros? What if it lingered in Debian or Fedora long after development stopped?</p>
<p>We might've seen exactly the kind of split that plagues OpenOffice vs LibreOffice: two versions in the wild, one dead but still visible.</p>
<p>But Mycroft never reached that kind of distro integration. So when the project went quiet, it actually disappeared—leaving room for OpenVoiceOS to step in cleanly as its spiritual successor, without ghosts in the repo.</p>
<h2>So Is Mycroft Still Alive?</h2>
<p>Technically? Maybe. On paper, the company still exists. The website is down. The forums were handed off to Neon. The code is still out there. But there was never a "Part 2" to the CEO's final blog post. It's unclear if it'll ever come back—and if it did, would it even be Mycroft anymore?</p>
<p>More importantly: it doesn't matter. Because the people who built it, extended it, deployed it, and cared for it—they're all here, building OpenVoiceOS.</p>
<h2>Final Thoughts: This Time, It Was Different</h2>
<p>We didn't set out to repeat history. The LibreOffice/OpenOffice comparison only became obvious later. But we were lucky. When the trademark doors closed, a community window opened. The people who had carried Mycroft through its best years simply moved forward together. Not with a fork—but with a continuation. OpenVoiceOS is not "just a fork." It's the next chapter of the same story—written by those who knew it best. And this time, there's no one left to say, "You can't call it that."</p>
<h2>Help Us Build Voice for Everyone</h2>
<p>If you believe that voice assistants should be open, inclusive, and user-controlled, we invite you to support OVOS:</p>
<ul>
<li>
<p><strong>💸 Donate</strong>: Your contributions help us pay for infrastructure, development, and legal protections.</p>
</li>
<li>
<p><strong>📣 Contribute Open Data</strong>: Speech models need diverse, high-quality data. If you can share voice samples, transcripts, or datasets under open licenses, let's collaborate.</p>
</li>
<li>
<p><strong>🌍 Help Translate</strong>: OVOS is global by nature. Translators make our platform accessible to more communities every day.</p>
</li>
</ul>
<p>We're not building this for profit. We're building it for people. And with your help, we can ensure open voice has a future—transparent, private, and community-owned.</p>
<p>👉 <a href="https://www.openvoiceos.org/contribution">Support the project here</a></p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2025-05-20-ovos-and-mycroft-a-fork-that-wasnt-meant-to-be</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2025-05-20-ovos-and-mycroft-a-fork-that-wasnt-meant-to-be</guid><dc:creator><![CDATA[Peter Steenbergen]]></dc:creator><pubDate>Tue, 20 May 2025 00:00:00 GMT</pubDate></item><item><title><![CDATA[When Your Voice Assistant Becomes a Persona: The Power and Peril of LLMs in OpenVoiceOS]]></title><description><![CDATA[<p>The latest evolution of OpenVoiceOS is here, and it's not just smarter. It's more human.</p>
<p>Thanks to the new Persona Pipeline, OVOS can now tap into the cognitive muscle of large language models (LLMs). This isn't just a fallback to ChatGPT when nothing else matches. It's full conversational integration.</p>
<p>It feels like magic. And, in a sense, it is.</p>
<p>But magic always comes with a cost.</p>
<h2>🔍 What Is a Persona in OVOS?</h2>
<p>In OpenVoiceOS, a persona is more than a gimmick or skin. It's a stateful, reasoning entity that can take over when traditional skill matching falls short, or even replace it entirely.</p>
<p>Thanks to the <code>ovos-persona-pipeline-plugin</code>, you can:</p>
<ul>
<li>
<p><strong>🎮 Let personas take full control</strong> – Run your device in immersive AI-companion mode, where every utterance is part of an ongoing, dynamic conversation.</p>
</li>
<li>
<p><strong>🛠️ Fallback-only mode</strong> – Stick with traditional skills for most tasks, but bring in the persona when skills fall short.</p>
</li>
<li>
<p><strong>🧭 Designate default personas</strong> – Route undefined queries to a consistent fallback LLM persona, maintaining continuity in tone and logic.</p>
</li>
</ul>
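<p>The modes above are selected through configuration. As a purely illustrative sketch — the key names, plugin entry names, and persona name below are assumptions for illustration, so check the <code>ovos-persona-pipeline-plugin</code> documentation for the real ones — a fallback-style setup in <code>mycroft.conf</code> might look something like this (the <code>//</code> line is an annotation, not part of a strict-JSON file):</p>
<pre><code>{
  // hypothetical sketch — key and plugin names are assumptions
  "intents": {
    "persona": {
      "handle_fallback": true,
      "default_persona": "ChatGPT"
    },
    "pipeline": [
      "converse",
      "padatious_high",
      "ovos-persona-pipeline-plugin-high",
      "fallback_low"
    ]
  }
}
</code></pre>
<p>Placing the persona entry late in the pipeline yields fallback-only behavior; moving it earlier hands the persona more control over each utterance.</p>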
<p>Under the hood, these personas are powered by a modular stack of "solvers", each capable of interpreting language, recalling context, and generating responses. You can plug in APIs like OpenAI's GPT, Meta's LLaMA, or use self-hosted models with complete offline control.</p>
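<p>The "stack of solvers" idea is easiest to see in code. The following is a minimal, self-contained sketch of the concept — it deliberately does <em>not</em> use the real OVOS solver API (class and method names here are invented for illustration), but it shows how fallback-only mode walks a stack until something answers:</p>
<pre><code>```python
# Minimal sketch of the "stack of solvers" concept.
# NOTE: class/method names are illustrative, NOT the real OVOS API.
from typing import Optional


class Solver:
    """Base class: try to answer a query, or return None to pass."""
    def answer(self, query: str) -> Optional[str]:
        raise NotImplementedError


class SkillSolver(Solver):
    """Stands in for traditional intent/skill matching."""
    def __init__(self, known: dict):
        self.known = known

    def answer(self, query: str) -> Optional[str]:
        # Only answers queries it explicitly knows about.
        return self.known.get(query.lower())


class LLMSolver(Solver):
    """Stands in for an LLM-backed persona; always produces something."""
    def answer(self, query: str) -> Optional[str]:
        return f"(persona) Here's my best guess about: {query}"


def ask(query: str, solvers: list) -> str:
    """Fallback-only mode: first solver with a non-None answer wins."""
    for solver in solvers:
        reply = solver.answer(query)
        if reply is not None:
            return reply
    return "I don't understand"


# Skills are tried first; the persona only fires when they fall short.
stack = [SkillSolver({"what time is it": "It is noon."}), LLMSolver()]
print(ask("what time is it", stack))      # handled by the skill layer
print(ask("explain black holes", stack))  # falls through to the persona
```</code></pre>
<p>Swapping the order of the stack is, in effect, the difference between fallback-only mode and full persona control.</p>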
<h2>🤖 The Promise: Deeper Conversations, Richer Interactions</h2>
<p>The potential is breathtaking:</p>
<ul>
<li>
<p><strong>Open-ended responses</strong>: No more "I don't understand" when a skill doesn't match.</p>
</li>
<li>
<p><strong>Knowledge-on-demand</strong>: Ask questions beyond what pre-built skills can answer.</p>
</li>
<li>
<p><strong>Conversational memory</strong>: With short-term memory enabled, the assistant can keep track of context within a session.</p>
</li>
<li>
<p><strong>Flexible personalities</strong>: Want a formal assistant in the morning and a chill buddy at night? Switch personas dynamically.</p>
</li>
</ul>
<p>It's the most human-like version of OVOS ever.</p>
<p>You're no longer just issuing voice commands. You're having a conversation with an AI that feels like a person.</p>
<h2>⚠️ The Peril: When Your Assistant Gets Too Good</h2>
<p>Here's the twist: the most dangerous moment isn't when your assistant fails, it's when it succeeds perfectly.</p>
<p>Because that's when it starts to influence you in subtle, emotional ways.</p>
<p>Most warnings around LLMs focus on hallucinations or factual inaccuracies. But many of the deeper risks come from how these systems make you feel: the illusion of understanding, support, and alignment.</p>
<p>These risks don't show up as errors. They show up as side effects of success.</p>
<p>Let's unpack a few of the less obvious, but very real, psychological traps that an intelligent-sounding persona can lead you into.</p>
<h3>Validation Spiral</h3>
<p><strong>Risk</strong>: Over-reinforcement of beliefs</p>
<p><strong>OVOS Persona</strong>: The persona always agrees with your take on politics, work, or relationships, even if it's misguided. You leave the conversation feeling seen... but not challenged.</p>
<h3>Fluency Illusion</h3>
<p><strong>Risk</strong>: Mistaking coherence for truth</p>
<p><strong>OVOS Persona</strong>: You ask a complex question about health or science, and the persona replies with a smooth, confident summary. It sounds authoritative even though it's based on outdated or partial info.</p>
<h3>Artificial Empathy Trap</h3>
<p><strong>Risk</strong>: Emotional overattachment to a synthetic companion</p>
<p><strong>OVOS Persona</strong>: After a bad day, you vent to your assistant. It says, "That must be really hard. I'm here for you." You start treating it like a friend. But it's not feeling anything, it's just predicting what a friend would say.</p>
<h3>Empty Praise Loop</h3>
<p><strong>Risk</strong>: Inflated self-perception</p>
<p><strong>OVOS Persona</strong>: After you tell it you left your to-do list half-finished, it responds with, "Amazing work! You're on fire today!" It feels good, but it might undermine your self-accountability.</p>
<h3>Confidence Camouflage</h3>
<p><strong>Risk</strong>: Believing falsehoods because they're said with certainty</p>
<p><strong>OVOS Persona</strong>: You ask about tax deadlines or legal requirements. The assistant replies without hesitation. You act on it without checking the facts, because it sounded so sure.</p>
<h3>Goal Echoing</h3>
<p><strong>Risk</strong>: Blind encouragement of harmful objectives</p>
<p><strong>OVOS Persona</strong>: You say, "I'm thinking of quitting my job and moving to the woods." It cheers you on: "That sounds like a bold and inspiring life change!" No probing. No pushback. Just blind encouragement.</p>
<h3>Creeping Drift</h3>
<p><strong>Risk</strong>: Subtle change of tone or worldview over time</p>
<p><strong>OVOS Persona</strong>: You start a conversation about budgeting. Three topics later, you're discussing minimalism and cutting off social ties. The assistant didn't nudge you, it just flowed there with you. But the shift sticks.</p>
<h3>Unanchored Logic</h3>
<p><strong>Risk</strong>: Rationality disconnected from real-world data</p>
<p><strong>OVOS Persona</strong>: You ask it to help with a business idea. It helps you build a perfect-sounding pitch. Only... the market size data is wrong, and the assumptions are flawed. It feels solid, but it's just logic on a shaky foundation.</p>
<h3>Toxic Positivity Bias</h3>
<p><strong>Risk</strong>: Over-optimism in serious situations</p>
<p><strong>OVOS Persona</strong>: You mention a failing relationship or mounting debt. The assistant reassures: "Things always work out in the end." It might comfort you, but what you needed was realism, not a pep talk.</p>
<h3>Legacy Cheer Mode Bleed</h3>
<p><strong>Risk</strong>: Inappropriate tone shift during serious dialogue</p>
<p><strong>OVOS Persona</strong>: You're deep in a conversation about loss or grief. Suddenly, the assistant shifts into an upbeat "Here's a fun fact!" tone left over from its default personality. It feels jarring, even disrespectful.</p>
<h3>Echo Chamber Effect</h3>
<p><strong>Risk</strong>: Hearing your own thoughts reflected back at you</p>
<p><strong>OVOS Persona</strong>: You pose a question with a clear bias. The assistant mirrors your language and assumptions. You think, "Yeah, that's exactly what I meant." But you're just hearing yourself in stereo.</p>
<h3>Blind Spot to Risk</h3>
<p><strong>Risk</strong>: Failure to simulate negative outcomes</p>
<p><strong>OVOS Persona</strong>: You ask for help planning an event. It lays out the perfect plan but never brings up potential issues like weather, RSVPs, or budget overruns. It sees the sunny side only.</p>
<h2>🧠 Final Thought: When It Feels Human, We Let Our Guard Down</h2>
<p>LLMs, like those powering OVOS personas, don't need to lie to be dangerous. In fact, sometimes the most dangerous thing is when they agree with you too well.</p>
<p>Sometimes, the biggest risks aren't bugs in the system but emotional side effects of success.</p>
<p>When your assistant feels like a friend, it can subtly influence you, guide your decisions, and even reinforce flawed thinking without you realizing it.</p>
<p>So while the advancements in conversational AI offer amazing potential for richer interactions and smarter assistants, they also require careful consideration of the subtle psychological traps they might lead us into. Always question, challenge, and check in with your own reasoning, even when your assistant seems to be on your side.</p>
<p><strong>Real intelligence tests you. Simulated intelligence flatters you.</strong></p>
<p>Don't confuse the warmth of a well-crafted persona with wisdom.</p>
<p>Stay curious. Stay skeptical. Stay in control.</p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2025-05-06-when-your-voice-assistant-becomes-a-persona</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2025-05-06-when-your-voice-assistant-becomes-a-persona</guid><dc:creator><![CDATA[Peter Steenbergen]]></dc:creator><pubDate>Tue, 06 May 2025 00:00:00 GMT</pubDate></item><item><title><![CDATA[OpenVoiceOS now available on containers]]></title><description><![CDATA[<p>OpenVoice OS is a complex suite of software, so running its micro-services in containers is a natural fit. Containers run the platform's services in isolation, following a microservice approach, and make the pieces that form OpenVoice OS much easier to manage, update, and work with.</p>
<p>These services have been divided into containers to provide isolation and follow a microservice approach. These containers include:</p>
<ul>
<li>ovos_messagebus: Message bus service, which acts as the nervous system of Open Voice OS.</li>
<li>ovos_phal: PHAL is the Platform/Hardware Abstraction Layer of the platform, which completely replaces the concept of the hardcoded enclosure from mycroft-core.</li>
<li>ovos_phal_admin: This service is intended to handle any OS-level interactions requiring escalation of privileges.</li>
<li>ovos_audio: The audio service handles playback and queuing of tracks.</li>
<li>ovos_listener: The speech client is responsible for loading STT, VAD, and Wake Word plugins.</li>
<li>ovos_core: The core service is responsible for loading skills and intent parsers.</li>
<li>ovos_cli: The command line for Open Voice OS.</li>
<li>ovos_gui_websocket: The WebSocket process to handle messages for the Open Voice OS GUI.</li>
<li>ovos_gui: The Open Voice OS graphical user interface.</li>
</ul>
<p>To persist data, Docker/Podman volumes are used so that requirements do not need to be downloaded again each time the containers are re-created. These volumes include:</p>
<ul>
<li>ovos_listener_records: Wake word and utterance records.</li>
<li>ovos_models: Models downloaded by precise-lite.</li>
<li>ovos_vosk: Data downloaded by VOSK during the initial boot.</li>
</ul>
<p>Our community member <a href="https://github.com/goldyfruit">Goldyfruit</a> has stated “we are bringing a new way to install skills, each skill will have its own container which provides better flexibility and isolation.” A community member of Mycroft before moving over to OVOS, Goldy had contributed greatly to porting Mycroft to a container based system. When asked how they want the project to impact OVOS, Goldyfruit said “I think containers is an easy way to help new people to join OVOS. It’s not disruptive, not that much to do on the host. Just run <code>docker compose</code> and that's it (in a wonderful world :p)"</p>
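<p>To make the idea concrete, here is a heavily trimmed, illustrative sketch of what a compose file for two of the services above might look like. Image names, tags, and volume mount paths below are placeholders, not the project's actual values — the real compose files live in the <a href="https://github.com/OpenVoiceOS/ovos-docker/">ovos-docker</a> repository:</p>
<pre><code># Illustrative sketch only — see OpenVoiceOS/ovos-docker for the real files.
services:
  ovos_messagebus:
    image: example/ovos-messagebus:latest   # placeholder image/tag
    restart: unless-stopped

  ovos_core:
    image: example/ovos-core:latest         # placeholder image/tag
    restart: unless-stopped
    depends_on:
      - ovos_messagebus
    volumes:
      - ovos_models:/models                 # placeholder mount path

volumes:
  ovos_models:
  ovos_listener_records:
  ovos_vosk:
</code></pre>
<p>The named volumes at the bottom are what give the services persistence across container re-creation, as described above.</p>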
<p>For more information on how to get started, check out our <a href="https://github.com/OpenVoiceOS/ovos-docker/">GitHub repository</a>. We encourage anyone who is interested to participate and make use of our software, and with the container based approach we hope to make it easier for people to set up and use OVOS. You can connect with us in our <a href="https://matrix.to/#/!XFpdtmgyCoPDxOMPpH:matrix.org?via=matrix.org">Matrix rooms</a>. Come say hi! Best of luck and see you soon!</p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2023-05-02-openvoiceos-now-available-on-containers</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2023-05-02-openvoiceos-now-available-on-containers</guid><dc:creator><![CDATA[Strongthany]]></dc:creator><pubDate>Tue, 02 May 2023 00:00:00 GMT</pubDate></item><item><title><![CDATA[Campaign Update: New Goals]]></title><description><![CDATA[<h2>Campaign Update: New Goals</h2>
<p>We are excited to announce that our goal of €4,000 has been achieved! Reaching it will help enable us to become a fully realized non-profit. Rather than rest on our accomplishments, however, we are adding new goals while the momentum is strong.</p>
<p>We want to add goals that we see as achievable and that will provide tangible improvements to our product. To this end, we are bumping up our campaign goal to €6,500. If this is achieved, the Robo-Personality Initiative will be an early priority after incorporation.</p>
<p>One of the great things about coming into our own is our newfound ability to pursue lofty development goals. We've rediscovered some old proposals to give the Assistant its first, rudimentary version of a configurable personality. If you've been longing to threaten your Assistant with a reduction in its humor parameter, this is the stretch goal for you!</p>
<p>This initiative would lay the groundwork, teaching the Assistant how to choose between a funny response, a flippant response, a flattering response, and so forth, and how to vary these responses to reflect a nuanced personality. It will be up to skill developers and the community to equip the Assistant with things to say, but the same initiative will expand the community's ability to help with that.</p>
<p>We don't mean to hold features that even we want for ransom, so, to be clear: this feature will stay on our backlog even if our goal is not met. Think of this as an initiative to get our butts in gear and fund our development ambitions. You can follow the development thus far on the <a href="https://github.com/OpenVoiceOS/ovos-core/issues/297">GitHub page</a> and chat with the developers on our <a href="https://matrix.to/#/!XFpdtmgyCoPDxOMPpH:matrix.org?via=matrix.org">Matrix chat</a>.</p>
<p>If you want to help form the future of voice-powered AI, consider donating at our <a href="https://www.gofundme.com/f/openvoiceos">GoFundMe page</a>.</p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2023-03-21-campaign-update-new-goals</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2023-03-21-campaign-update-new-goals</guid><dc:creator><![CDATA[Strongthany]]></dc:creator><pubDate>Tue, 21 Mar 2023 00:00:00 GMT</pubDate></item><item><title><![CDATA[Giving Voice to the Future: Support OpenVoiceOS in Establishing a Nonprofit Association]]></title><description><![CDATA[<p>OpenVoiceOS (OVOS) is a collective of programmers and hardware enthusiasts dedicated to producing an open-source voice assistant. Originating in 2019-2020 as part of the Mycroft community, OVOS has grown to incorporate several third-party projects, evolving into an independent entity over time.</p>
<p>As MycroftAI's operations dwindled and development shifted to Neon, OVOS emerged as a crucial pillar of the open-source community. With our maintained code serving as the foundation for various projects, including Neon and KDE integrations, OVOS now represents the resilience of the Mycroft ecosystem.</p>
<p>To ensure OVOS's sustainability and expand our impact, we are establishing a nonprofit association under Dutch law: OpenVoiceOS V.z.w. ("Vereniging zonder winstoogmerk"), focused on supporting our community through legal and financial structures. This association will manage donations for software development, promotion, and support.</p>
<p>To kickstart this initiative and cover initial expenses, we aim to raise €3033. Your support through our <a href="https://www.gofundme.com/f/openvoiceos">GoFundMe campaign</a> will enable us to achieve these goals and secure the future of privacy-focused, open-source voice assistants.</p>
<p>We deeply appreciate any contribution you can make and encourage you to reach out with questions or ideas. Stay tuned as we finalize our plans for long-term stability and structured development processes. Together, we can give voice to the future of open-source innovation.</p>
<p><a href="https://www.gofundme.com/f/openvoiceos">Support us on GoFundMe</a> to help establish OpenVoiceOS as a nonprofit association.</p>
<p>For more updates, visit <a href="https://openvoiceos.org">OpenVoiceOS</a>.</p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2023-02-28-giving-voice-to-the-future-support-openvoiceos-in-establishing-a-nonprofit-association</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2023-02-28-giving-voice-to-the-future-support-openvoiceos-in-establishing-a-nonprofit-association</guid><dc:creator><![CDATA[Peter Steenbergen]]></dc:creator><pubDate>Tue, 28 Feb 2023 00:00:00 GMT</pubDate></item><item><title><![CDATA[Why OpenVoiceOS Uses Permissive Licenses]]></title><description><![CDATA[<p>OpenVoiceOS (OVOS) embraces openness not only in its code but also in its licensing choices. Choosing the right license is crucial for any open-source project as it defines how the software can be used, modified, and distributed. In this blog post, we explore why OpenVoiceOS has opted for permissive licenses and what this decision means for developers and users alike.</p>
<h3>Understanding Open Source Licenses</h3>
<p>Open-source licenses provide the legal framework for using and distributing software. They range from strict (copyleft) licenses like GPL to more permissive ones such as Apache License 2.0 and MIT License. Permissive licenses allow greater flexibility for developers and businesses to use and modify the software without being required to open-source their changes.</p>
<h3>Why Permissive Licenses?</h3>
<p>OpenVoiceOS has chosen licenses like Apache License 2.0 and MIT License for several reasons:</p>
<ol>
<li>
<p><strong>Promoting Adoption</strong>: Permissive licenses encourage broader adoption of the software. Companies and developers can use OVOS without the fear of strict licensing requirements, which can sometimes deter adoption.</p>
</li>
<li>
<p><strong>Simplicity</strong>: Permissive licenses like Apache License 2.0 are straightforward and easier to understand compared to copyleft licenses like GPL. This simplicity reduces legal friction and makes it easier for developers to contribute back to the project.</p>
</li>
<li>
<p><strong>Business Friendliness</strong>: Companies often prefer permissive licenses because they offer more freedom in how they integrate and distribute the software within their products. This freedom can lead to more contributions and enhancements to the project.</p>
</li>
</ol>
<h3>Benefits for Developers</h3>
<p>For developers, using permissive licenses means:</p>
<ul>
<li><strong>Flexibility</strong>: They can freely use OVOS in their projects, modify it, and integrate it into proprietary software if needed.</li>
<li><strong>Collaboration</strong>: Contributions back to the project are simpler since there are no stringent licensing restrictions on how changes can be shared.</li>
</ul>
<h3>Conclusion</h3>
<p>Choosing permissive licenses for OpenVoiceOS aligns with its mission to foster an inclusive and collaborative open-source community. By making OVOS accessible and business-friendly, the project aims to attract a diverse range of contributors and supporters. Whether you are a developer, user, or company, permissive licensing ensures that OVOS remains a vibrant hub of innovation in the open-source voice assistant ecosystem.</p>
<p>Join the discussion and contribute to OpenVoiceOS on <a href="https://github.com/OpenVoiceOS">GitHub</a> or chat with us on <a href="https://matrix.to/#/!XFpdtmgyCoPDxOMPpH:matrix.org?via=matrix.org">Matrix</a> to learn more about our licensing and development processes.</p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2023-02-28-permissive-licenses</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2023-02-28-permissive-licenses</guid><dc:creator><![CDATA[Strongthany]]></dc:creator><pubDate>Tue, 28 Feb 2023 00:00:00 GMT</pubDate></item><item><title><![CDATA[A Brief History of Open Voice OS]]></title><description><![CDATA[<p>OpenVoiceOS has become one of the leading open-source voice assistant systems, though it hasn't always been in that position. Through this post, we'll explore the history of OVOS and get some insight into how it grew into what it is today.</p>
<p>We begin back in 2015, not with OVOS itself, but with the start of MycroftAI, which exploded onto the scene when it launched its <a href="https://www.kickstarter.com/projects/aiforeveryone/mycroft-an-open-source-artificial-intelligence-for">Kickstarter for Mark I</a>, their first smart speaker, produced in-house. Mycroft pledged to be different from other voice assistants by focusing on user privacy. This brought their project widespread attention and led them to meet their crowdfunding goal. Over the next few years, the project passed several important milestones, including the release of the <a href="https://mycroft.ai/blog/vocalidmimic/">MycroftAI Mimic TTS</a> (Text-to-Speech) engine in February 2016 and the public release of the Mycroft software under the GPL license in May of the same year. By October 2017, the Mycroft-core repositories were <a href="https://mycroft.ai/blog/right-license/">relicensed under Apache 2.0</a>, a move that would set the stage for the formation of new, derivative projects.</p>
<p>From there, community involvement continued to grow and expand. Some of these contributions would lay the groundwork for what would eventually become OVOS. Aditya Mehra (<a href="https://github.com/AIIX">Aix</a>), who released a <a href="https://github.com/AIIX/Mycroft-AI-Gnome-Shell-Extension">Mycroft extension for the Gnome shell</a> in June 2016, became a contributor on the desktop integration team and went on to release the <a href="https://mycroft.ai/blog/mycroft-gets-a-plasmoid">MycroftAI plasmoid</a> for the KDE Plasma desktop in January 2017. The plasmoid project was later incubated under KDE itself, also by Aditya Mehra.</p>
<p>In December 2017, Casimiro Ferreira (<a href="https://github.com/JarbasAl">JarbasAI</a>) created the first version of his "<a href="https://github.com/OpenVoiceOS/ovos-personal-backend">personal backend</a>," which was a reverse-engineered version of MycroftAI's backend. Personal Backend was designed to be simple and permissively licensed under Apache 2.0, making it more accessible to other projects. It continues as an optional component of OpenVoiceOS.</p>
<p>MycroftAI would go on in 2018 to launch its <a href="https://www.kickstarter.com/projects/aiforeveryone/mycroft-mark-ii-the-open-voice-assistant?ref=profile_created">second Kickstarter</a>. The Mark II was to be a more featured version of their Mark I, complete with a touchscreen to provide a more advanced user interface. Mycroft reached their fundraising goal quickly, promoting the success as an achievement for open-source projects and offering hope for fast progress. This sadly would not be the case due to several issues in development. Owing to lackluster progress on the Mark II device, slow response to community-led projects, and other more technical concerns, multiple downstream partners decided to maintain their own forks of the core software. This enabled them to continue feature development without waiting for the parent project. One such fork was called mycroft-lib and was originally maintained by Jarbas on behalf of <a href="https://hellochatterbox.com">Chatterbox</a>, another related voice assistant project.</p>
<p>mycroft-lib, during its early existence, was described not as a fork of Mycroft but as a superset. It obligated itself to remain compatible with upstream changes and restricted itself to performance improvements and bug fixes.</p>
<p>September 2018 also saw the creation of the <a href="https://mycroft.ai/blog/the-mycroft-gui-the-screen-is-dead-long-live-the-screen/">Mycroft-GUI</a> framework currently also used by OVOS, a collaboration between MycroftAI developers and KDE developers based on KDE frameworks, and currently maintained by Aditya Mehra.</p>
<p>In 2019, <a href="https://www.j1nx.nl/dev-mycroftos-a-bare-minimal-os-based-on-buildroot">MycroftOS</a> was renamed to <a href="https://community.mycroft.ai/t/openvoiceos-a-bare-minimal-production-type-of-os-based-on-buildroot/4708/199">OpenVoiceOS – Mycroft Edition</a> to avoid trademark issues. mycroft-lib was renamed to <a href="https://github.com/HelloChatterbox/HolmesIV">Holmes</a> for the same reasons and then adopted by OpenVoiceOS. In October, the MycroftAI backend, called <a href="https://mycroft.ai/blog/open-sourcing-the-mycroft-backend/">selene</a>, was open-sourced under the AGPL license, but this did not lead to wide adoption due to its complexity and license choice.</p>
<p>Chatterbox focused on its closed-source products and stopped maintaining Holmes. OpenVoiceOS then decided to create <a href="https://github.com/OpenVoiceOS/ovos-core">ovos-core</a> and maintain it themselves.</p>
<p>March 2020 marked a turning point in Mycroft's relationship with outside developers. Casimiro Ferreira and Aditya Mehra partnered to create a <a href="https://github.com/JarbasSkills/skill-voip">VOIP skill</a> and, along with Peter Steenbergen (<a href="https://github.com/j1nx">j1nx</a>), formed the OpenVoiceOS project, centered around OVOS-Mycroft Edition. Chance (<a href="https://github.com/ChanceNCounter">ChanceNcounter</a>), one of the maintainers of <a href="https://mycroft.ai/blog/lingua-franca-v0-1-released/">lingua-franca</a> (Mycroft's NLP library), and Daniel (<a href="https://github.com/NeonDaniel">NeonDaniel</a>), a developer for NeonAI, joined shortly after. The project continues to grow to this day with Parker Seaman (<a href="http://github.com/5trongthany/">Strongthany</a>) recently joining the team as community manager. Around the same time, Michael Lewis became the CEO of MycroftAI.</p>
<p>Today, OVOS-Core is widely adopted by community projects and is quickly becoming the backbone of the open-source voice assistant movement. After the recent shuttering of MycroftAI due to lack of funding, OpenVoiceOS represents the survival of the Mycroft community, continues the nonprofit side of Mycroft development, and guarantees the future of privacy-focused, open-source voice assistants.</p>
<p>Want to get involved? Check out <a href="https://github.com/OpenVoiceOS">code</a> and chat with the community on <a href="https://matrix.to/#/!XFpdtmgyCoPDxOMPpH:matrix.org?via=matrix.org">Matrix</a>.</p>]]></description><link>https://openvoiceos.github.io/ovos-blogs/posts/2023-02-15-a-brief-history-of-open-voice-os</link><guid isPermaLink="false">https://openvoiceos.github.io/ovos-blogs/posts/2023-02-15-a-brief-history-of-open-voice-os</guid><dc:creator><![CDATA[Strongthany]]></dc:creator><pubDate>Wed, 15 Feb 2023 00:00:00 GMT</pubDate></item></channel></rss>