
Hybrid MP4 Format

Overview

Hybrid MP4 is the name of a new output format introduced in OBS 30.2 that aims to provide reliable and compatible output without the drawbacks of traditional or fragmented MP4. It remains recoverable even if writing the file is aborted, e.g. due to system crashes or power outages, while still maintaining wide compatibility.

This works by writing the file as a fragmented MP4 and then finalising it like a regular MP4 in a process we call "soft-remux". This hides the fragmentation so that MP4-compatible software will treat it like a regular file.

Features

Codec Support
  • Video: H.264, HEVC, AV1
  • Audio: AAC, Opus, FLAC, ALAC, PCM (16/24-bit Integer and 32-bit Float)

ProRes will be supported in a future update by adding "Hybrid MOV" as an additional option.

Chapter Markers

Chapters can be inserted into the file, for example, to make it easier to find highlights or the start of certain sections during the editing process. These markers are supported by most video players and editing suites such as DaVinci Resolve.

Markers can be added via the following means:

  • Hotkey ("Add chapter marker")
  • WebSockets CreateRecordChapter request
  • OBS Frontend API obs_frontend_recording_add_chapter(const char *name)

Chapters added via hotkey will have the format "Unnamed <N>", where N is a simple counter. When added via the frontend API or WebSockets, a custom name can be specified; if it is left empty, the same scheme is used to generate a chapter name.

Note: Chapters are written during the finalisation process, i.e. they are not recoverable in the event of a crash. This may change in the future.
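As an illustration of the WebSockets route, the following is a minimal sketch of how a client might build the CreateRecordChapter request message. It assumes the obs-websocket 5.x message layout (opcode 6 for requests) and the optional `chapterName` request field; the helper name `build_create_record_chapter` is hypothetical, and the message still has to be sent over an already identified obs-websocket connection, which is not shown here.

```python
import json
import uuid


def build_create_record_chapter(chapter_name=None):
    """Build a JSON request message for CreateRecordChapter (hypothetical helper)."""
    request = {
        "op": 6,  # assumed: opcode 6 marks a request in obs-websocket 5.x
        "d": {
            "requestType": "CreateRecordChapter",
            "requestId": str(uuid.uuid4()),  # any unique string chosen by the client
        },
    }
    if chapter_name:
        # Leaving the name out lets OBS fall back to its "Unnamed <N>" scheme.
        request["d"]["requestData"] = {"chapterName": chapter_name}
    return json.dumps(request)


print(build_create_record_chapter("Boss fight"))
```

Check the obs-websocket protocol reference for the exact field names supported by your OBS version before relying on this layout.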

Creation Time and Date

Unlike the existing MP4 output, the creation date will be set to the output starting time, allowing users to more easily organise and catalogue their files.
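For context, ISO-BMFF stores the creation time in the movie header as seconds since 1904-01-01 00:00 UTC rather than the Unix epoch. The sketch below shows that conversion for a recording start time; the function name `to_mp4_time` is hypothetical and this is not taken from the OBS source.

```python
from datetime import datetime, timezone

# Epoch used by MP4/QuickTime timestamps.
MP4_EPOCH = datetime(1904, 1, 1, tzinfo=timezone.utc)


def to_mp4_time(moment):
    """Convert a timezone-aware datetime to seconds since the MP4 epoch."""
    return int((moment - MP4_EPOCH).total_seconds())


# Example: the moment a recording was started (arbitrary illustrative value).
recording_start = datetime(2024, 7, 1, 12, 0, 0, tzinfo=timezone.utc)
print(to_mp4_time(recording_start))
```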

Additional Metadata

The encoder settings can be written to the file as metadata in JSON format. See the next section for how to enable this feature.

Muxer Options

| Option | Description | Default |
|---|---|---|
| use_negative_cts | Use negative composition timestamps instead of edit list to deal with b-frame delay. May not be compatible with older software | On (1) |
| write_encoder_info | Write encoder configuration JSON as metadata attached to each track | Off (0) |
| use_metadata_tags | Write metadata as generic key-value pairs instead of using QuickTime metadata keys | Off (0) |
| skip_soft_remux | Disable finalisation ("soft-remux"), for debugging purposes only | Off (0) |

Options can be set via the "Custom Muxer Options" field in the format key=value and separated by spaces, e.g. use_negative_cts=1 write_encoder_info=1
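To make the expected format concrete, here is a small sketch that splits such a string into key-value pairs. It only illustrates the syntax described above; how OBS itself parses the field is not shown here, and the function name `parse_muxer_options` is hypothetical.

```python
def parse_muxer_options(text):
    """Split a space-separated list of key=value pairs into a dict of strings."""
    options = {}
    for token in text.split():
        key, separator, value = token.partition("=")
        if not separator or not key:
            raise ValueError(f"expected key=value, got {token!r}")
        options[key] = value
    return options


options = parse_muxer_options("use_negative_cts=1 write_encoder_info=1")
print(options)
```

Values are kept as strings in this sketch, and values containing spaces are not handled.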

Technical Details

The QuickTime File Format (QTFF/MOV), later adapted by ISO/IEC into the "Base Media File Format" (ISO-BMFF), and extended to become the "MPEG-4 File Format" (MP4), is an object-based media container originally created by Apple.

While generally well supported, the format has been extended and updated over the years, and not all features are implemented in all software. For example, fragmentation is a more recent feature that some players and video editing suites do not fully support.

Anatomy of an MPEG-4 file

Files are made up of "boxes" - or "atoms" in Apple/QuickTime terminology - that contain metadata and the actual media data. Each box has a header containing its size and four-letter type, e.g. moov, which tells the demuxer what the box contains. The box types are standardised, though custom ones are permitted and should simply be skipped by a demuxer that does not support them.

When a file is written the audio and video data will live in a data box (mdat), which does not have a defined structure. To be able to read and decode a file it is necessary to parse the movie box (moov).

Generally speaking, a standard MP4 file needs only three boxes:

- `ftyp` Header
- `moov` Movie box
- `mdat` Audio/Video Data

Since the contents of the moov box depend on the data in mdat it is often written at the end, rather than the beginning. But this also means that if a file is incomplete, e.g. due to a crash or medium error, it generally cannot be read, as the mdat data by itself does not have any inherent meaning to the reading application.
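The box structure described above can be sketched in a few lines. The example below builds a tiny synthetic file in memory (the payloads are placeholders, not real media data), walks its top-level boxes by reading each 32-bit big-endian size and four-character type, and then shows that a file cut off before the moov box was fully written no longer exposes it. The helper names `make_box` and `iter_boxes` are hypothetical.

```python
import struct


def make_box(box_type, payload=b""):
    """Build a box: 32-bit big-endian size, four-character type, then payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def iter_boxes(data):
    """Yield (type, offset, size) for each complete top-level box in data."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1 and offset + 16 <= len(data):
            # A size of 1 means a 64-bit "largesize" follows the type field.
            size = struct.unpack_from(">Q", data, offset + 8)[0]
            header = 16
        elif size == 0:
            # A size of 0 means the box extends to the end of the file.
            size = len(data) - offset
        if size < header or offset + size > len(data):
            break  # truncated or corrupt box, stop walking
        yield box_type.decode("latin-1"), offset, size
        offset += size


complete = (
    make_box(b"ftyp", b"isom\x00\x00\x00\x00")
    + make_box(b"mdat", b"\x00" * 32)
    + make_box(b"moov", b"\x00" * 16)
)
print([name for name, _, _ in iter_boxes(complete)])   # ['ftyp', 'mdat', 'moov']

# Simulate a crash while the trailing moov box was being written.
truncated = complete[:-10]
print([name for name, _, _ in iter_boxes(truncated)])  # ['ftyp', 'mdat']
```

Without the moov box, the reader in this sketch is left with an mdat payload it has no way to interpret, which is the failure mode described above.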

How fragmentation works

In order to allow for streaming of MP4 files while they are being written, some additions were made to the format.

The primary difference is that the moov atom may be "empty", i.e. it contains no information about media samples or their location in the file, and only contains the file metadata and codec information necessary to initialise a decoder. Each fragment of the file then starts with a moof box, which provides the information about media samples the moov would usually provide, but limited to the current fragment. By combining the information from the moov and moof boxes each fragment can be decoded independently from the others, making the file resilient to data loss.

A fragmented MP4 file will thus look like this:

- `ftyp` Header
- `moov` Movie Box (without sample information)
- `moof` Movie Fragment Box
- `mdat` Fragment Audio/Video data
- `moof` Movie Fragment Box
- `mdat` Fragment Audio/Video Data
- ...

While this format is great for streaming and resilience, it has significant downsides for casual users. For one, playback requires reading all moof boxes to get an accurate duration, which can take a long time on HDDs or network drives, and some software does not support fragmentation. While fragmented files can be turned into regular ones by remuxing them, this also requires double the space and can take a while depending on how fast the medium is.

What Hybrid MP4 does

When writing a file, the hybrid approach still writes it as though it is a regular fragmented MP4, but will finalise it similarly to a regular MP4. For that purpose, a full moov box is written at the end before the header of the file is partially overwritten to make the demuxer skip the original incomplete moov and moof boxes.

The initial hybrid MP4 file structure before overwriting the header is as follows:

- `ftyp` Header
- `free` Placeholder
- `moov` Movie Box (without sample information)
- `moof` Movie Fragment Box
- `mdat` Fragment Audio/Video data
- `moof` Movie Fragment Box
- `mdat` Fragment Audio/Video Data
- ...
- `moov` Movie Box (*with* sample information)

Now, in order to make this file appear as though it is a regular MP4 file, all we need to do is overwrite the free box so that it becomes an mdat box that spans the entire file up to our final moov box, thus making the file appear as such:

- `ftyp` Header
- `mdat` Audio/Video Data
- `moov` Movie box

We have essentially just hidden the fragmented structure of the file by turning the entire thing into one giant data box that the demuxer will not try to parse.
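The header overwrite can be illustrated on a synthetic file held in memory. This is a minimal sketch, not the OBS implementation: the payloads are placeholders, the helper names are hypothetical, and it assumes a 16-byte free placeholder so that an mdat header with a 64-bit size fits in its place.

```python
import struct


def box(box_type, payload=b""):
    """Build a box with a 32-bit size, four-character type and payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def top_level_types(data):
    """Return the types of all complete top-level boxes in data."""
    types, offset = [], 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        if size == 1 and offset + 16 <= len(data):
            # A size of 1 means a 64-bit size follows the type field.
            size = struct.unpack_from(">Q", data, offset + 8)[0]
        if size < 8 or offset + size > len(data):
            break
        types.append(box_type.decode("latin-1"))
        offset += size
    return types


ftyp = box(b"ftyp", b"isom\x00\x00\x00\x00")
placeholder = box(b"free", b"\x00" * 8)  # assumption: 16 bytes in total
fragments = (
    box(b"moov", b"init only")
    + box(b"moof", b"fragment 1") + box(b"mdat", b"samples 1")
    + box(b"moof", b"fragment 2") + box(b"mdat", b"samples 2")
)
final_moov = box(b"moov", b"full sample tables")

data = bytearray(ftyp + placeholder + fragments + final_moov)
before = top_level_types(data)

# "Soft-remux": overwrite the placeholder with an mdat header whose size
# covers everything from the placeholder up to the final moov box.
mdat_start = len(ftyp)
mdat_size = len(placeholder) + len(fragments)
struct.pack_into(">I4sQ", data, mdat_start, 1, b"mdat", mdat_size)
after = top_level_types(data)

print(before)  # ['ftyp', 'free', 'moov', 'moof', 'mdat', 'moof', 'mdat', 'moov']
print(after)   # ['ftyp', 'mdat', 'moov']
```

Only 16 bytes near the start of the file change in this sketch. The fragment boxes are still physically present, but they now sit inside the payload of the large mdat box.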