Need help with decoding the incoming audio #191

Open
aygupt1822 opened this issue Jul 19, 2022 · 2 comments

aygupt1822 commented Jul 19, 2022

I am trying to write a Python client for Zello using the API documentation from the Zello GitHub repo.

I am having problems decoding the raw data.

The documentation of the Zello library sucks, as it doesn't explain anything about the decoding part. I also looked at similar issues raised by other users, but the answers were not satisfactory at all.

I sincerely need help with this.


AndyW999 commented Nov 1, 2022

Do you mean the JSON or the Opus codec?

Both are well documented and used everywhere - try Google?

@tlstpierre

I was able to get this to work in Go without any issues. This is binary data, so you may have to do a bit of byte-level parsing to make it work:

  1. Look at the first byte (byte 0) of the binary message. If it is 0x01, the message is an audio packet.
  2. Take bytes 1-4 and decode them as a big-endian Uint32. Not sure how to do this with Python, but presumably you can take a sub-set of the byte array and there is a function that will take the four bytes and give you a Uint32 back. This is your stream ID (save this to a variable somewhere).
  3. Take bytes 5-8 and decode them the same way. This is your packet ID, which increments with every packet.
  4. The rest of the binary data, from byte 9 to the end, is your Opus-encoded audio. Send that to your Opus decoder to get the audio frame.
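In Python, the four steps above can be sketched with the standard `struct` module. This is a minimal sketch, assuming the layout exactly as described in the steps (one type byte, then two big-endian 32-bit unsigned integers, then the Opus payload). The names `AudioPacket` and `parse_audio_packet` are hypothetical and not part of any Zello library.

```python
import struct
from typing import NamedTuple, Optional

AUDIO_PACKET_TYPE = 0x01
# ">" = big-endian, "B" = 1-byte type, "I" = uint32 (stream ID, then packet ID)
HEADER_FORMAT = ">BII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 9 bytes


class AudioPacket(NamedTuple):
    """Hypothetical container for one parsed audio message."""
    stream_id: int
    packet_id: int
    opus_data: bytes


def parse_audio_packet(message: bytes) -> Optional[AudioPacket]:
    """Return the parsed packet, or None if this is not an audio message."""
    if len(message) < HEADER_SIZE:
        return None
    packet_type, stream_id, packet_id = struct.unpack_from(HEADER_FORMAT, message)
    if packet_type != AUDIO_PACKET_TYPE:
        return None
    # Everything from byte 9 onward is the Opus-encoded audio.
    return AudioPacket(stream_id, packet_id, message[HEADER_SIZE:])
```

The `opus_data` field is what would be handed to an Opus decoder. `int.from_bytes(message[1:5], "big")` would work equally well for a single field.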

Here are a few other tips that might help:

  • In order to initialize your Opus decoder, you need to get the format information from the stream start message. In my application, I use this to create a data structure to receive the audio, open up the sound card stream, etc.
  • You will need to create some sort of jitter buffer structure to hold the incoming audio frames before you play them out, to ensure that the output doesn't get starved if a packet arrives late. It isn't a big concern with TCP-based audio, but you may want to look at the packet ID numbers to make sure they arrive in order and increment without a gap.
  • Here's an example of the signal flow: incoming packet > decode header (stream ID and packet ID) > Opus decoder > slice (array?) of Int16 audio samples > buffer > output.
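The jitter buffer and packet ID check from the tips above could look roughly like this in Python. It is only a sketch under assumptions: the `JitterBuffer` class, its `min_frames` threshold, and the `gaps` counter are all hypothetical names for illustration, and the frames are treated as opaque `bytes` (for example, decoded Int16 samples).

```python
import heapq
from typing import List, Optional, Tuple


class JitterBuffer:
    """Hypothetical reorder buffer keyed on packet ID."""

    def __init__(self, min_frames: int = 3) -> None:
        self._heap: List[Tuple[int, bytes]] = []
        self._min_frames = min_frames  # frames to hold before playout starts
        self._last_id: Optional[int] = None
        self.gaps = 0  # number of times a packet ID was skipped

    def push(self, packet_id: int, frame: bytes) -> None:
        # Drop frames that arrive after their slot has already been played.
        if self._last_id is not None and packet_id <= self._last_id:
            return
        heapq.heappush(self._heap, (packet_id, frame))

    def pop(self, force: bool = False) -> Optional[bytes]:
        """Return the next frame in ID order, or None if still buffering.

        Pass force=True at the end of a stream to drain what is left.
        """
        if not self._heap:
            return None
        if len(self._heap) < self._min_frames and not force:
            return None
        packet_id, frame = heapq.heappop(self._heap)
        if self._last_id is not None and packet_id != self._last_id + 1:
            self.gaps += 1
        self._last_id = packet_id
        return frame
```

A playback loop would call `push` for each received frame and `pop` on the audio output's schedule, writing silence when `pop` returns `None`.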

The documentation is quite good, once you understand the binary encoding part of it. Do you have any more specific detail about where you are stuck?
