🌐 Try in your browser
No installation — open it on your phone, snap a photo of the textbook page, and the app reads it aloud as a dictation. Works on iOS, Android and desktop. Uses Microsoft Edge neural voices online.
Download for Windows
Full — bundles Tesseract OCR with Russian, English and Thai data, Piper TTS and two Russian neural voices. Works offline (Thai voices use the online Edge engine).
Lite — smaller package; Tesseract must be installed system-wide, and Piper voices download on first use.
Signed with a code-signing certificate.
About
Dictation is a free desktop application for Windows built around a single use case: a parent or teacher photographs a page from a textbook, and the app reads it aloud as a school dictation. The text is broken into sentences, then into shorter parts, and finally into individual words, with configurable pauses at every level, so a child can write at their own pace.
It works in Russian, English and Thai. Thai is written without spaces between words, so the app segments it into real words (via pythainlp) and dictates them one by one — and automatically stretches the pauses, since Thai handwriting takes longer.
The Full edition bundles Tesseract OCR (Russian, English, Thai) and Piper neural text-to-speech with two Russian voices, so Russian and English work with no internet. Thai speech uses Microsoft's online Edge voices.
Features
- Add pages by file, paste from clipboard, drag-and-drop or live webcam capture
- Multi-page support — thumbnail strip at the bottom, per-page crop area
- Rubber-band crop selection — focus OCR on the relevant text block
- OCR via bundled Tesseract 5 — Russian, English and Thai, with automatic script detection
- Thai word segmentation (pythainlp) — spaceless Thai is split into real words for dictation
- Spell-check with red underline and right-click fixes (Russian + English, Hunspell)
- Image preprocessing (deskew, threshold) for cleaner OCR on photos
- Recognized text appears in an editable panel — fix typos before dictating
- Cascade dictation: whole sentence → parts → individual words (short sentences repeated whole)
- Adjustable pause slider up to 10 seconds — auto-extended for Thai handwriting
- Streaming synthesis — playback starts on the first fragment, no waiting
- Change voice or speech rate live, without interrupting playback
- Three TTS engines: Edge TTS (Microsoft neural voices incl. Thai, online), Piper (offline neural), Windows SAPI
- Trilingual UI — Russian, English and Thai, switchable on the fly
How it works
Take a photo of the textbook page with your phone, paste it into the app (or open it from a file, or use the built-in webcam capture). Drag a rectangle over the text you want to dictate — that crop is remembered per page, so a single dictation can span several pages of a textbook.
Press Recognize and the bundled Tesseract OCR converts the image to text in the right-hand panel. You can edit anything that came out wrong before starting the dictation.
Press Start (or F5) and the cascade begins: for every sentence, the app speaks the full sentence, then pauses; speaks each comma-delimited part, with a shorter pause; then speaks each word individually, with the shortest pause. Use the slider to make the pauses longer for younger children, shorter for older ones — the speech rate itself never changes, only the silence between fragments.
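The cascade described above can be sketched in a few lines of Python. This is a minimal illustration, not the app's actual code: the function names, the pause ratios (0.6 and 0.3 of the base pause) and the three-word threshold for "short" sentences are all assumptions chosen for the example. Only the ordering (sentence, then comma-delimited parts, then words, with progressively shorter pauses) comes from the description.

```python
import re
from typing import Iterator, List, Tuple

# All constants below are illustrative assumptions, not the app's real settings.
PART_PAUSE_RATIO = 0.6    # parts get a shorter pause than whole sentences
WORD_PAUSE_RATIO = 0.3    # single words get the shortest pause
SHORT_SENTENCE_WORDS = 3  # sentences this short are simply repeated whole
PUNCTUATION = ".,;:!?…\"'()"

Step = Tuple[str, str, float]  # (level, text to speak, pause in seconds)


def split_sentences(text: str) -> List[str]:
    """Split on whitespace that follows sentence-ending punctuation."""
    pieces = re.split(r"(?<=[.!?…])\s+", text.strip())
    return [p for p in pieces if p]


def cascade(text: str, base_pause: float = 2.0) -> Iterator[Step]:
    """Yield dictation steps: sentence, then parts, then individual words."""
    for sentence in split_sentences(text):
        yield ("sentence", sentence, base_pause)

        words = [w.strip(PUNCTUATION) for w in sentence.split()]
        words = [w for w in words if w]

        # Short sentences are repeated whole instead of being broken down.
        if len(words) <= SHORT_SENTENCE_WORDS:
            yield ("sentence", sentence, base_pause)
            continue

        # Comma-delimited parts, only when the sentence actually has a comma.
        parts = [p.strip() for p in sentence.split(",") if p.strip()]
        if len(parts) > 1:
            for part in parts:
                yield ("part", part, base_pause * PART_PAUSE_RATIO)

        for word in words:
            yield ("word", word, base_pause * WORD_PAUSE_RATIO)


if __name__ == "__main__":
    sample = "The cat sat on the mat, and the dog barked. Go home."
    for level, fragment, pause in cascade(sample):
        print(f"{level:8} {pause:.1f}s  {fragment}")
```

Changing `base_pause` plays the role of the pause slider: every pause scales together, while the spoken fragments stay the same.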
System Requirements
| Requirement | Full edition | Lite edition |
|---|---|---|
| Operating system | Windows 10 or Windows 11 (64-bit) | Windows 10 or Windows 11 (64-bit) |
| Tesseract OCR (Russian, English, Thai) | Bundled | Must be installed system-wide |
| Piper TTS + Russian voices | Bundled (irina + ruslan) | Downloads on first use (~60 MB / voice) |
| Thai text-to-speech | Online Edge voices (Premwadee / Niwat); needs an internet connection | Online Edge voices (Premwadee / Niwat); needs an internet connection |
| Internet connection | Only for Edge TTS (incl. all Thai speech) | For Edge TTS, Thai speech, and first voice download |
| Microphone / webcam | Webcam optional, for capturing textbook pages without a phone | Webcam optional, for capturing textbook pages without a phone |
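
For the Lite edition, one way to confirm that a system-wide Tesseract is usable is to look for the `tesseract` executable on `PATH` and ask it which language packs it has. The sketch below assumes the standard Tesseract language codes `rus`, `eng` and `tha` and the `tesseract --list-langs` command; the helper names are hypothetical, and how the app itself locates Tesseract is not stated here.

```python
import shutil
import subprocess
from typing import Iterable, List, Set

# Standard Tesseract traineddata codes for Russian, English and Thai.
REQUIRED_LANGS = ("rus", "eng", "tha")


def parse_list_langs(output: str) -> Set[str]:
    """Extract language codes from `tesseract --list-langs` output."""
    langs = set()
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages ..." header.
        if not line or line.lower().startswith("list of"):
            continue
        langs.add(line)
    return langs


def missing_languages(output: str,
                      required: Iterable[str] = REQUIRED_LANGS) -> List[str]:
    """Return the required language codes that the output does not list."""
    return sorted(set(required) - parse_list_langs(output))


def check_system_tesseract() -> List[str]:
    """Run the installed Tesseract and report missing language packs."""
    exe = shutil.which("tesseract")
    if exe is None:
        raise FileNotFoundError("tesseract was not found on PATH")
    result = subprocess.run([exe, "--list-langs"],
                            capture_output=True, text=True, check=False)
    # Some Tesseract versions print the list to stderr, so read both streams.
    return missing_languages(result.stdout + "\n" + result.stderr)


if __name__ == "__main__":
    missing = check_system_tesseract()
    print("Missing language packs:", ", ".join(missing) or "none")
```

An empty result means all three language packs are present; any code that is listed would need its traineddata file installed before OCR in that language can work.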