How To Enhance Your Dictation In Real Time Using Large Language Models On Windows and MacOS

You can improve your productivity by using dictation software that supports real-time text enhancement using large-language models.

For example, you can save lots of time if your dictation software can correct the grammar and spelling mistakes of your dictation in real-time.

You can use SpeechPulse offline speech recognition application on Windows and macOS for real-time text enhancements, including grammar and spelling correction and email formatting.

On Windows, we recommend you run the English (standard) language model on a GPU for faster processing. You need at least 8 GB of VRAM to run both the Multi (large) speech model and English (standard) language model on your NVIDIA GPU. If your PC doesn’t meet these requirements, you can use one of the speech APIs for the speech model and one of the language APIs for the language model.

  1. Download and install SpeechPulse
  2. Download one of the speech models. Larger speech models have better accuracy, but smaller models run faster. You can also use an OpenAI-compatible Whisper Speech API.

    Option 1: Download a speech model.

    Option 2: Add a speech API.

  3. Download a large language model or use an OpenAI-compatible language API as the large language model.

    Option 1: Download a language model.

    Option 2: Add a language API.

  4. Select your speech model (or API) and large language model (or API).
  5. Add a new AI template that gives instructions to the large language model or use an existing template.
  6. Select your AI template.
  7. Press the Start button.
  8. Place the cursor into the text edit area and dictate in your natural voice. SpeechPulse will transcribe your speech in real-time and apply the large language model to enhance or transform your dictation according to the AI template.