Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trying to reassemble audio packets into ogg file. #155

Open
ZenFeedbacker opened this issue Aug 26, 2021 · 2 comments
Open

Trying to reassemble audio packets into ogg file. #155

ZenFeedbacker opened this issue Aug 26, 2021 · 2 comments

Comments

@ZenFeedbacker
Copy link

So I'm building a Java application that can connect to the Zello and receive messages, but I've been struggling with audio streams, more specifically with reassembling the audio data packets Ι receive into an ogg/opus file.

From what I've gathered from the API specification I should strip the first 9 bytes of each audio packet I get

{type(8) = 0x01, stream_id(32), packet_id(32), data[]}

and what I'm left with is a opus data packet, including its ToC byte, as described here.

Then I build an Ogg Stream where each Opus packet get its own Ogg Page (plus two pages at the beginning, one for the ID header and one for the Comment header,/ as described here.

I'm more or less following the Javascript code provided here.

My decoder class looks like that:

import lombok.extern.slf4j.Slf4j;
import org.apache.commons.io.FileUtils;
import org.apache.commons.lang3.ArrayUtils;
import org.gagravarr.ogg.CRCUtils;

import java.io.File;
import java.io.IOException;
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

@Slf4j
public class Decoder {

    private static final String OPUS_ID_HEADER = "OpusHead";
    private static final String OPUS_COMMENT_HEADER = "OpusTags";
    private static final String OGG_PAGE_HEADER = "OggS";

    private final Deque<byte[]> packets = new ArrayDeque<>();
    private int pageIndex = 0;
    private final int bitstreamSerialNumber = Utils.randomStreamSerialNumber();

    public Decoder(List<byte[]> audio) {

        packets.addAll(audio);
    }

    public void writeToFile(String path) {

        try {
            FileUtils.writeByteArrayToFile(new File(path), getOGG());
        } catch (IOException e) {
            log.warn("IOException while writing ogg file to {}: {}", path, e.getMessage());
        }
    }

    private byte[] getOGG() {

        oggData = ArrayUtils.addAll(getPage(getIDHeader(), 2), getPage(getCommentHeader(), 0));

        while (!this.packets.isEmpty()) {
            oggData = ArrayUtils.addAll(oggData, getPage(packets.remove(), packets.isEmpty() ? 4 : 0));
        }

        return oggData;
    }

    private byte[] getPage(byte[] segmentData, int headerType) {

        var bb = ByteBuffer.allocate(28 + segmentData.length);

        // Page Header
        bb.put(OGG_PAGE_HEADER.getBytes(StandardCharsets.UTF_8));
        // Version
        bb.put((byte) 0);
        // Header type
        bb.put((byte) headerType);
        // Granule position
        bb.putLong(new BigInteger("ffffffff", 16).longValue());
        // Bitstream serial number
        bb.put(Utils.intToLittleEndianByteArray(bitstreamSerialNumber, 4));
        // Page sequence number
        bb.put(Utils.intToLittleEndianByteArray(pageIndex++, 4));
        // CRC checksum temporary
        bb.putInt(0);
        // Page segments
        bb.put((byte) 1);
        // Segments table
        bb.put((byte) segmentData.length);

        int checksum = CRCUtils.getCRC(bb.array());

        // Segment data
        bb.put(segmentData);

        if (segmentData.length > 0) {
            checksum = CRCUtils.getCRC(segmentData, checksum);
        }

        // CRC checksum
        byte[] page = bb.array();

        Utils.copyArrayToArray(Utils.intToLittleEndianByteArray(checksum, 4), page, 22);

        return page;
    }

    private byte[] getIDHeader() {

        var bb = ByteBuffer.allocate(19);

        // ID package header
        bb.put(OPUS_ID_HEADER.getBytes(StandardCharsets.UTF_8));
        // Version
        bb.put((byte) 1);
        // Channel count
        bb.put((byte) 1);
        // Pre-skip
        bb.putShort((short) 0);
        // Sample rate
        bb.put(Utils.intToLittleEndianByteArray(16000, 4));
        // Output gain
        bb.putShort((short) 0);
        // Channel map
        bb.put((byte) 0);

        return bb.array();
    }

    private byte[] getCommentHeader() {

        var bb = ByteBuffer.allocate(20);

        //Comment package header
        bb.put(OPUS_COMMENT_HEADER.getBytes(StandardCharsets.UTF_8));
        // Vendor string length
        bb.put(Utils.intToLittleEndianByteArray(4, 4));
        // Vendor string
        bb.put("abcd".getBytes(StandardCharsets.UTF_8));
        // User comment List length
        bb.putInt(0);

        return bb.array();
    }
}

And here are the utility functions used:

import lombok.AccessLevel;
import lombok.NoArgsConstructor;

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.concurrent.ThreadLocalRandom;

@NoArgsConstructor(access = AccessLevel.PRIVATE)
public class Utils {

    public static void copyArrayToArray(byte[] from, byte [] to, int pos){
        System.arraycopy(from, 0, to,  pos, from.length);
    }

    public static byte[] intToLittleEndianByteArray(int num, int len){

        var byteBuffer = ByteBuffer.allocate(len);
        byteBuffer.order(ByteOrder.LITTLE_ENDIAN);
        byteBuffer.putInt(num);
        return byteBuffer.array();
    }

    public static int randomStreamSerialNumber(){

        return ThreadLocalRandom.current().nextInt() & Integer.MAX_VALUE;
    }
}

The CRC is calculated with the use of this class.

Unfortunately when I try to play the file with ffplay I only hear sound for only a few fragments of a second and then silence. I also get the following output when running validation commands on the resulting file

>>> opusinfo 58511287-fc76-434d-b223-459a651e2db9.opus 

Processing file "58511287-fc76-434d-b223-459a651e2db9.opus"...

WARNING: Hole in data (28 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (19 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (28 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (20 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (99 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (310 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (20 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (317 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (297 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (186 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (25 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (30 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (50 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (160 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (112 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (283 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (211 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (70 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (40 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (5 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
WARNING: Hole in data (234 bytes) found at approximate offset 2544 bytes. Corrupted Ogg.
ERROR: No Ogg data found in file "58511287-fc76-434d-b223-459a651e2db9.opus".
Input probably not Ogg.

(note that this file is exactly 2544 so there is some issue with how opusinfo calculates the EOF I suppose).

>>>ffplay 58511287-fc76-434d-b223-459a651e2db9.opus
 
ffplay version 4.2.4-1ubuntu0.1 Copyright (c) 2003-2020 the FFmpeg developers
Input #0, ogg, from '58511287-fc76-434d-b223-459a651e2db9.opus':0   
  Duration: N/A, start: -0.120000, bitrate: N/A
    Stream #0:0: Audio: opus, 48000 Hz, mono, fltp
[opus @ 0x7f0d040051c0] Error parsing the packet header.
    Last message repeated 3 times
   2.32 M-A:  0.000 fd=   0 aq=    0KB vq=    0KB sq=    0B f=0/0   
>>> oggz-validate 58511287-fc76-434d-b223-459a651e2db9.opus

58511287-fc76-434d-b223-459a651e2db9.opus: Error:
File contains no Ogg packets

The output file is this one 58511287-fc76-434d-b223-459a651e2db9.opus

I've been stuck quite some time with this issue and I'm suspecting I missing something obvious, either in the way the checksum is calculated or in the way i'm constructing the Ogg Pages. Any help would be greatly appreciated.

@ihorserba
Copy link
Collaborator

It looks like the way of constructing the ogg pages is not correct.
You have to calculate and fill the page_segments and segment_table fields (see Ogg page format)
More information is also here: encapsulation process
Also I'm not sure the granule position value of 0xffffffff is fine.

I found the Java library VorbisJava and the opened issue there with the related PR.

It seems to me it does exactly what you need - just modify the pre-encoded binary with the real opus stream taken from the Zello incoming message.

@ihorserba
Copy link
Collaborator

Here is a simple example: 4ed91cc

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants