Speech-to-Speech task prompt #32

ehosseiniasl · 2024-07-23T23:21:15Z

https://github.com/OpenMOSS/AnyGPT/blame/6404dbafccc10943be6bf6e24a4b99b3a6545501/anygpt/src/m_utils/prompter.py#L45

Hello,
Is this line correct? Is this for speech-to-speech conversation?
In that case, isn't this the correct prompt:

Speech-Response-Speech': '{speech} Please interpret the user\'s voice commands, provide text responses, and generate corresponding voice replies

JunZhan2000 · 2024-07-24T11:17:31Z

Hello, some of the prompts in this file were used for debugging. I suggest you refer to this line instead: https://github.com/OpenMOSS/AnyGPT/blame/6404dbafccc10943be6bf6e24a4b99b3a6545501/anygpt/src/m_utils/prompter.py#L113

So for voice commands with voice replies, we actually use the 'Speech-Instruction' prompt.

ehosseiniasl · 2024-07-24T16:07:51Z

Thanks. Do you have direct speech response generation (without text response generation) in the base or chat model?
Which speech response tasks are included in instruction tuning?

ehosseiniasl · 2024-07-24T16:11:48Z

Using Speech-Instruction on the chat model with to_modality=speech, the response is as below.
Could you please explain what the first line is: <-Res-> Gmarin misway"- How beautiful you look today!
Does the model first generate a text reply and then speech, even if the output modality is speech only?

response:
 :  <-Res-> Gmarin misway"- How beautiful you look today!
  [AnyGPT] "Guhmyayayay!" - How beautiful you look today!  <sosp> <🗣️691> <🗣️691> <🗣️60> <🗣️868> <🗣️868> <🗣️906> <🗣️316> <🗣️1015> <🗣️965> <🗣️512> <🗣️512> <🗣️223> <🗣️223> <🗣️689> <🗣️35> <🗣️35> <🗣️35> <🗣️962> <🗣️57> <🗣️943> <🗣️699> <🗣️1> <🗣️118> <🗣️118> <🗣️118>

ehosseiniasl · 2024-07-24T16:51:16Z

Does the prompt include the user's speech transcription? The sentence after <-Res-> is the transcription of the speech instruction I provided.

JunZhan2000 · 2024-07-30T12:55:26Z

> does the prompt include user speech transcription? the sentence after <-Res-> is the transcription of speech instruction I provided

Hello, we provide some training data samples and related descriptions. Please refer to https://github.com/OpenMOSS/AnyGPT?tab=readme-ov-file#pretraining-and-sft

JunZhan2000 · 2024-07-30T12:56:33Z

In voice dialogue mode, the user provides a voice command. The model first recognizes the command as text, then generates a text reply, and finally generates the speech for that reply.
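For readers who want to separate the text reply from the speech tokens in output like the sample earlier in this thread, here is a minimal sketch. It is not code from the AnyGPT repository: `split_response` and `ParsedResponse` are hypothetical names, and it assumes that `<sosp>` marks where the speech tokens begin and that each speech token has the form `<🗣️N>` with an integer `N`, as the sample response suggests.

```python
import re
from dataclasses import dataclass

# Assumption: the speech tokens use the speaking-head emoji (U+1F5E3),
# optionally followed by a variation selector (U+FE0F), as in the sample above.
SPEAKER = "\U0001F5E3\uFE0F"
# Assumption: <sosp> marks the start of the speech token sequence.
SOSP = "<sosp>"

# Matches one speech token such as <🗣️691> and captures the integer ID.
SPEECH_TOKEN = re.compile(r"<\U0001F5E3\uFE0F?(\d+)>")


@dataclass
class ParsedResponse:
    text: str              # everything generated before <sosp>
    speech_ids: list[int]  # integer IDs found in the tokens after <sosp>


def split_response(response: str) -> ParsedResponse:
    """Split a generated string into its text part and its speech token IDs."""
    text, sep, speech = response.partition(SOSP)
    # If <sosp> never appears, treat the whole string as text.
    ids = [int(m) for m in SPEECH_TOKEN.findall(speech)] if sep else []
    return ParsedResponse(text=text.strip(), speech_ids=ids)


# Shortened version of the response shown in this thread.
sample = (
    f"[AnyGPT] How beautiful you look today! {SOSP} "
    f"<{SPEAKER}691> <{SPEAKER}691> <{SPEAKER}60>"
)
parsed = split_response(sample)
print(parsed.text)
print(parsed.speech_ids)
```

Under these assumptions, the text before `<sosp>` corresponds to the recognized command and text reply described above, and the IDs after it are what would be passed to whatever speech decoder you use.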
