What are the main differences between current genomic formats and MPEG-G in terms of scope?

Posted 28/06/2019

By claudio

With respect to existing genomic data formats, MPEG-G provides new solutions for the representation and compression of genome sequencing information by:

Specifying an abstract representation of the sequencing information, rather than an implementation of the compression of one specific data format or file format.
Avoiding ambiguous, unspecified or ill-specified data representations that prevent interoperable information exchange. Because MPEG-G was designed at a time when technologies and use cases are rather mature, it does not permit incremental ad-hoc additions of new features.
Clearly and normatively separating user-defined information with no clear semantics from the normative genomic data representation. User-defined information is still supported within a standardized syntax and semantics. This allows fully interoperable and automatic exchange of information between different data producers, while leaving flexibility to satisfy specific user needs.
Allowing relevant metadata to be multiplexed with the data, since data and metadata are partitioned at different conceptual levels.
Providing the specification of a transport format, a file format, compression technology and standard application programming interfaces, following a model and development process that has proven successful over the last 30 years in the domain of digital media.
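
As a purely illustrative sketch of the first and third points, the Python below models one abstract record that can be written in two unrelated serializations, with user-defined information held apart from the fields that have agreed semantics. It is not MPEG-G syntax: the class names, field names and both encodings are hypothetical and chosen only to make the idea concrete.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass(frozen=True)
class NormativeRead:
    # Hypothetical fields whose meaning every consumer agrees on.
    reference_name: str
    position: int  # 0-based start on the reference (an assumption of this sketch)
    sequence: str


@dataclass(frozen=True)
class Record:
    normative: NormativeRead
    # User-defined information: the container is defined, its meaning is
    # private to the producer and never mixed into the normative fields.
    user_defined: dict = field(default_factory=dict)


def encode_json(record: Record) -> str:
    # First hypothetical serialization of the abstract record.
    return json.dumps(asdict(record), sort_keys=True)


def decode_json(text: str) -> Record:
    raw = json.loads(text)
    return Record(NormativeRead(**raw["normative"]), raw["user_defined"])


def encode_text(record: Record) -> str:
    # Second hypothetical serialization: one tab-separated line.
    n = record.normative
    extra = json.dumps(record.user_defined, sort_keys=True)
    return "\t".join([n.reference_name, str(n.position), n.sequence, extra])


def decode_text(line: str) -> Record:
    name, pos, seq, extra = line.split("\t")
    return Record(NormativeRead(name, int(pos), seq), json.loads(extra))


def read_normative(record: Record) -> NormativeRead:
    # A consumer that knows nothing about the producer's private fields
    # can still interpret the normative part unambiguously.
    return record.normative
```

Both decoders return the same `Record` for the same input, which is the sense in which the representation, not any one encoding, is what gets specified. A consumer calling `read_normative` never has to guess what a producer's private fields mean.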