Releases: jfbosch/Mutation
Mutation_2024-06-22
Various providers now host Whisper behind APIs that are compatible with OpenAI's.
For example: groq.com.
This change allows the base domain and the model ID to be configured so that these other providers can be used instead of only OpenAI.
See the readme for how to configure e.g. groq.com.
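To illustrate why a configurable base domain and model ID are enough to switch providers, here is a minimal Python sketch. It is not Mutation's actual code: the `SpeechToTextConfig` class and its field names are hypothetical, and the Groq base URL and model ID shown are the values I believe Groq documents for its OpenAI-compatible endpoint, so check the readme and the provider's documentation before relying on them.

```python
from dataclasses import dataclass


@dataclass
class SpeechToTextConfig:
    # Hypothetical field names; the real keys live in Mutation.json (see the readme).
    base_url: str
    model_id: str


def transcription_endpoint(config: SpeechToTextConfig) -> str:
    """Join the configured base URL with the OpenAI-style transcription path."""
    return config.base_url.rstrip("/") + "/audio/transcriptions"


# Assumed example values; only the base URL and the model ID differ per provider.
openai_config = SpeechToTextConfig("https://api.openai.com/v1", "whisper-1")
groq_config = SpeechToTextConfig("https://api.groq.com/openai/v1", "whisper-large-v3")

for cfg in (openai_config, groq_config):
    print(cfg.model_id, "->", transcription_endpoint(cfg))
```

Because the request path and form fields are the same for any OpenAI-compatible provider, nothing else in the request needs to change.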
Mutation_2023-09-08
- Added more advanced formatting commands to the dictation that runs locally based on rules and regular expressions, and no longer uses the LLM for formatting.
- Added a review feature that checks the formatted transcript against the provided review prompt, e.g. a review for logical consistency.
- Added the ability for the LLM to apply the review suggestions directly to the formatted transcript.
- Added the ability to insert the formatted dictation directly into a third-party application. That is, set the focus on the third-party application's text input box, then use the dictation hotkey to start and stop dictation.
- Renamed some of the keys in Mutation.json. Running the latest version of Mutation should add any missing settings with default values.
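The "add any missing settings with default values" behaviour can be pictured as a recursive merge of defaults into the loaded settings. The Python sketch below is only an illustration of that idea, not how Mutation implements it; the setting names and default values in `DEFAULT_SETTINGS` are invented for the example.

```python
import copy
import json

# Hypothetical setting names and defaults, for illustration only.
DEFAULT_SETTINGS = {
    "SpeechToText": {"BaseDomain": "https://api.openai.com/", "ModelId": "whisper-1"},
    "ReviewPrompt": "Review for logical consistency.",
}


def with_defaults(loaded: dict, defaults: dict) -> dict:
    """Return a copy of `loaded` with keys that are missing filled in from `defaults`."""
    merged = dict(loaded)
    for key, default_value in defaults.items():
        if key not in merged:
            # Missing setting: take a copy of the default so the defaults stay untouched.
            merged[key] = copy.deepcopy(default_value)
        elif isinstance(default_value, dict) and isinstance(merged[key], dict):
            # Nested section: fill in missing keys one level down.
            merged[key] = with_defaults(merged[key], default_value)
    return merged


existing = json.loads('{"SpeechToText": {"ModelId": "whisper-large-v3"}}')
print(json.dumps(with_defaults(existing, DEFAULT_SETTINGS), indent=2))
```

Values the user has already set are kept as they are; only absent keys receive defaults. Note that a renamed key would show up as a new key with its default value, so settings stored under an old name may need to be re-entered.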
Mutation_2023-08-20
- Mainly, this release adds a speech-to-text prompt, which is passed to the Whisper speech-to-text model to improve results. The prompt text box is just above the speech-to-text results text box. The prompt is saved automatically on exit and restored on load. Any value in the prompt text box is passed to Whisper along with the audio.
- Here is more about how it can be used:
- You can use a prompt to improve the quality of the transcripts generated by the Whisper API. The model will try to match the style of the prompt, so it will be more likely to use capitalization and punctuation if the prompt does too. This only provides limited control over the generated transcript. Here are some examples of how prompting can help in different scenarios:
- Prompts can be very helpful for correcting specific words or acronyms that the model often misrecognizes in the audio. For example, the following prompt improves the transcription of the words DALL·E and GPT-3, which were previously written as "DALI" and "GDP 3". The prompt is: "OpenAI makes technology like DALL·E, GPT-3, and ChatGPT with the hope of one day building an AGI system that benefits all of humanity."
- Sometimes the model might skip punctuation in the transcript. You can avoid this by using a simple prompt that includes punctuation, such as: "Hello, welcome to my lecture."
- The model may also leave out common filler words in the audio. If you want to keep the filler words in your transcript, you can use a prompt that contains them: "Umm, let me think like, hmm... Okay, here's what I'm, like, thinking."
- A UI button was also added to start and stop recording for those who do not want to use the hotkey.
- Minor other changes, like a bit of UI cleanup and tab index ordering.
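As a rough picture of how the speech-to-text prompt travels with the audio, the sketch below builds the non-file form fields of an OpenAI-style transcription request in Python. The function name `transcription_fields` is hypothetical and this is not Mutation's own code; the `model` and `prompt` field names are those of the OpenAI audio transcription API. Omitting the field when the prompt is blank is a choice made for this sketch.

```python
def transcription_fields(model_id: str, prompt_text: str) -> dict:
    """Build the non-file form fields for an OpenAI-style transcription request."""
    fields = {"model": model_id}
    prompt = prompt_text.strip()
    if prompt:
        # `prompt` is the optional field the Whisper API accepts alongside the audio.
        fields["prompt"] = prompt
    return fields


# A prompt that spells out tricky words, in the spirit of the DALL·E / GPT-3 example.
print(transcription_fields("whisper-1", "OpenAI makes technology like DALL·E, GPT-3, and ChatGPT."))

# A blank prompt box: only the model field is sent.
print(transcription_fields("whisper-1", "   "))
```

The audio itself would be attached as a separate file part of the same multipart request.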
Mutation_2023-08-16
First release of Mutation. Just run Mutation.exe and it will create Mutation.json and open it for editing. Populate it with values, save it, and restart Mutation. Read the project readme for more instructions.