The AT3-in-WAV writer produces headers that Sony's psp_at3tool rejects for
files longer than around forty seconds. The tool prints "input file is
illegal file or over 2G Byte" and refuses to decode. ffmpeg accepts the
file but decodes it without any encoder-delay compensation, leaving a
variable lag of several hundred samples relative to the source. The two
observations have a common root cause: the header we write is missing
fields that downstream decoders rely on.

This patch addresses three concrete issues in src/at3.cpp.

First, the writer emits no fact chunk. The fact chunk is optional in the
general RIFF specification but is how WAVEFORMATEX-based codecs announce
the total number of decoded samples per channel. psp_at3tool uses the
sample count together with samples-per-frame to decide how much PCM to
produce and where to stop. Without a fact chunk the tool falls back to a
short default and either truncates output or, for longer streams, rejects
the file outright. ffmpeg uses the same field to skip encoder priming
samples. Sony's own AT3 files carry this chunk with a fixed eight-byte
payload containing total_samples and samples_per_frame. We now write the
same structure.
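
For illustration, the eight-byte payload described above could be
serialized roughly as follows. This is a minimal sketch, not the code in
src/at3.cpp: MakeFactChunk and PutLE32 are hypothetical helper names, and
the layout assumes two 32-bit little-endian fields in the order
total_samples, samples_per_frame.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Store a 32-bit value in little-endian byte order, as RIFF fields are.
static void PutLE32(uint8_t* dst, uint32_t v) {
    dst[0] = static_cast<uint8_t>(v & 0xFF);
    dst[1] = static_cast<uint8_t>((v >> 8) & 0xFF);
    dst[2] = static_cast<uint8_t>((v >> 16) & 0xFF);
    dst[3] = static_cast<uint8_t>((v >> 24) & 0xFF);
}

// Hypothetical helper: builds a complete fact chunk, that is the 4-byte
// tag, the 4-byte payload size (8), and the two payload fields.
// Assumption: both fields are 32-bit and total_samples comes first.
std::array<uint8_t, 16> MakeFactChunk(uint32_t totalSamples,
                                      uint32_t samplesPerFrame) {
    assert(samplesPerFrame != 0);  // a zero frame length is never valid
    std::array<uint8_t, 16> chunk{};
    chunk[0] = 'f';
    chunk[1] = 'a';
    chunk[2] = 'c';
    chunk[3] = 't';
    PutLE32(chunk.data() + 4, 8);                // payload size in bytes
    PutLE32(chunk.data() + 8, totalSamples);     // samples per channel
    PutLE32(chunk.data() + 12, samplesPerFrame); // 1024 for ATRAC3
    return chunk;
}
```

The returned sixteen bytes would be written between the fmt chunk and the
data chunk.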

Second, the bytes_per_frame field in the ATRAC3 extradata was hardcoded
to 0x10 with an XXX comment. The correct value for standard ATRAC3 is
0x1000, that is 4096, which corresponds to the PCM bytes represented by
one frame (1024 samples per channel times two channels times two bytes
per sample). Sony's encoder writes 4096 at this offset and both ffmpeg
and psp_at3tool validate against that number. The previous value of
sixteen bytes per frame is nonsensical and was part of why psp_at3tool
misestimated the playback length.
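
The arithmetic behind 0x1000 can be checked with a few lines. This is a
sketch only; PcmBytesPerFrame is a hypothetical name and does not exist
in the source tree.

```cpp
#include <cassert>
#include <cstdint>

// PCM bytes represented by one codec frame:
// samples per channel * channel count * bytes per PCM sample.
inline uint32_t PcmBytesPerFrame(uint32_t samplesPerChannel,
                                 uint32_t channels,
                                 uint32_t bytesPerSample) {
    assert(channels > 0 && bytesPerSample > 0);
    return samplesPerChannel * channels * bytesPerSample;
}
```

With the values quoted above, PcmBytesPerFrame(1024, 2, 2) evaluates to
4096, that is 0x1000, whereas the old constant 0x10 is only 16.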

Third, the RIFF chunk_size field was being written as the full file size.
By the RIFF specification this field should hold the size of everything
that follows the field itself, that is file_size minus eight. Writing the
full size is tolerated by ffmpeg but violates the specification and makes
the file look larger than it is to strict parsers.
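
As a sketch of the rule, the value can be derived from the final file
size like this. RiffChunkSize is a hypothetical helper name; the range
check reflects the fact that the field is only 32 bits wide.

```cpp
#include <cassert>
#include <cstdint>

// The RIFF header is the 4-byte "RIFF" tag plus the 4-byte chunk_size
// field itself; chunk_size counts everything after those 8 bytes.
constexpr uint64_t kRiffHeaderBytes = 8;

inline uint32_t RiffChunkSize(uint64_t fileSize) {
    assert(fileSize >= kRiffHeaderBytes);
    const uint64_t payload = fileSize - kRiffHeaderBytes;
    assert(payload <= 0xFFFFFFFFull);  // must fit the 32-bit field
    return static_cast<uint32_t>(payload);
}
```

For example, a 44-byte file gets chunk_size 36, not 44.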

Because the PCM engine can flush additional frames after the initially
estimated numFrames count (due to the look-ahead tail during encoding), the
three length fields chunk_size, total_samples, and subchunk2_size were
stale by one to three frames relative to the actual data on disk. To
keep them consistent, TAt3 now counts frames as WriteFrame is called and
seeks back to overwrite the three length fields in the destructor, so
the final file describes its real contents.
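
The count-then-patch approach can be sketched as below. This is not the
TAt3 implementation: TPatchingWriter, the helper functions, and the
offsets are hypothetical, the header is deliberately simplified (the fmt
chunk is omitted), and total_samples is assumed to be frames times
samples-per-frame with no delay correction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Offsets in the simplified header used by this sketch only.
constexpr std::streamoff kChunkSizeOff = 4;     // after "RIFF"
constexpr std::streamoff kTotalSamplesOff = 20; // first fact field
constexpr std::streamoff kDataSizeOff = 32;     // after "data"
constexpr std::streamoff kPayloadOff = 36;      // first frame byte

static void WriteLE32(std::ostream& out, uint32_t v) {
    const char b[4] = {static_cast<char>(v & 0xFF),
                       static_cast<char>((v >> 8) & 0xFF),
                       static_cast<char>((v >> 16) & 0xFF),
                       static_cast<char>((v >> 24) & 0xFF)};
    out.write(b, 4);
}

// Reads a little-endian 32-bit value; only used to inspect the result.
static uint32_t ReadLE32(const std::string& buf, std::size_t off) {
    assert(off + 4 <= buf.size());
    uint32_t v = 0;
    for (int i = 3; i >= 0; --i)
        v = (v << 8) | static_cast<uint8_t>(buf[off + i]);
    return v;
}

class TPatchingWriter {
public:
    TPatchingWriter(std::ostream& out, uint32_t samplesPerFrame,
                    uint32_t frameBytes)
        : Out(out), SamplesPerFrame(samplesPerFrame), FrameBytes(frameBytes) {
        Out.write("RIFF", 4);
        WriteLE32(Out, 0);  // chunk_size placeholder
        Out.write("WAVE", 4);
        Out.write("fact", 4);
        WriteLE32(Out, 8);
        WriteLE32(Out, 0);  // total_samples placeholder
        WriteLE32(Out, SamplesPerFrame);
        Out.write("data", 4);
        WriteLE32(Out, 0);  // subchunk2_size placeholder
    }
    TPatchingWriter(const TPatchingWriter&) = delete;
    TPatchingWriter& operator=(const TPatchingWriter&) = delete;

    // Every frame that reaches the stream is counted, including any
    // frames flushed after the initial estimate.
    void WriteFrame(const char* frame) {
        Out.write(frame, FrameBytes);
        ++Frames;
    }

    // Seek back and overwrite the three length fields with real values.
    ~TPatchingWriter() {
        const uint32_t dataBytes = Frames * FrameBytes;
        Out.seekp(kChunkSizeOff);
        WriteLE32(Out, static_cast<uint32_t>(kPayloadOff) + dataBytes - 8);
        Out.seekp(kTotalSamplesOff);
        WriteLE32(Out, Frames * SamplesPerFrame);
        Out.seekp(kDataSizeOff);
        WriteLE32(Out, dataBytes);
        Out.seekp(0, std::ios::end);  // leave the stream at end of file
    }

private:
    std::ostream& Out;
    uint32_t SamplesPerFrame;
    uint32_t FrameBytes;
    uint32_t Frames = 0;
};
```

Constructing the writer over a std::ostringstream, writing a few frames,
and letting it go out of scope leaves a buffer whose three length fields
can be read back with ReadLE32 and compared against the buffer size.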

The patch is purely a container metadata fix. The encoded AT3 payload is
byte-identical to before. After this change, output from atracdenc for
long test tracks (90 and 186 seconds, 132 kbps LP2) is accepted and fully
decoded by psp_at3tool in a single pass, and ffmpeg decodes with a
constant small codec latency instead of the previous variable drift.

This made a proper three-way comparison against Sony's reference encoder
possible. That comparison previously looked catastrophic (a gap of around
-22 dB SNR) purely because of the alignment problem; with correct
container headers the gap is roughly -0.5 to -1.4 dB SNR.

Signed-off-by: hilman2 <[email protected]>