MPEG-G ISO/IEC 23092

Genomic Information Representation

The MPEG-G standard is currently the largest coordinated international effort to address the limitations of current technologies and products and to enable truly efficient and economical handling of genomic information.

Publication of MPEG-G International Standard for Application Program Interfaces (APIs) and Metadata technologies (Part 3)

The extensive usage of high-throughput deoxyribonucleic acid (DNA) sequencing technologies opens up new perspectives in the treatment of several diseases and enables the implementation of a new approach to healthcare known as “precision medicine”. DNA...

Publication of MPEG-G International Standard for Coding of Genomic Information (Part 2)

Publication of MPEG-G International Standard for Transport and Storage of Genomic Information (Part 1)

Details about MPEG-G Part 1 – ISO/IEC 23092-1

Part 1: Transport and Storage of Genomic Information. MPEG-G specifies a digital container format for transmission and storage of the genomic data compressed according to Part 2 of the standard. In MPEG jargon the container format used for the transport of...

The extensive use of high-throughput deoxyribonucleic acid (DNA) sequencing technologies opens up new perspectives in the treatment of several diseases and enables “precision medicine”. Because DNA sequencing technologies produce extremely large amounts of raw data, the ICT costs of storing, transmitting, and processing DNA sequence data and related information are very high owing to the lack of universal standards, which hinders the timely application of effective treatments.

MPEG-G

Genome Compression Standard

The MPEG-G standard, jointly developed by MPEG and the ISO Technical Committee for biotechnology standards (ISO TC 276/WG 5), is the first international standard to address the problem of efficient and cost-effective handling of genomic data. It provides not only new compression and transport technologies but also a family of standard specifications for associating relevant information in the form of metadata, together with a rich set of Application Programming Interfaces (APIs) for building a full ecosystem of interoperable applications and services capable of efficiently processing sequencing data.

MPEG-G compresses and transports sequencing data in a way that supports complex use cases, including:

  • Selective access to compressed data
  • Data streaming
  • Compressed file concatenation
  • Genomic studies aggregation
  • Enforcement of privacy rules
  • Selective encryption of sequencing data and metadata
  • Annotation and linkage of genomic segments
  • Interoperability with main existing technologies and legacy formats
  • Incremental update of sequencing data and metadata

source: mpeg.chiariglione.org

Do you want to know more? Have a look at our Frequently Asked Questions!

To learn more about how MPEG standardization and MPEG-G work, please read our FAQ section.

The MPEG-G standard is composed of five parts:

Part 1: Transport and Storage of Genomic Information

This part of the standard deals with data formats for both transport and storage of genomic information, along with a reference conversion process and informative annexes. The main topics covered by this part are genomic data streaming and the file format.

Part 2: Coding of Genomic Information

This part provides specifications for the normative representation of genomic sequence read identifiers, genomic sequence reads (both unaligned and aligned), reference sequences, and quality values. This is the part where compression is specified in terms of normative bitstream syntax and decoding behaviour.

Part 3: Metadata and APIs

This part of the standard specifies information metadata, SAM interoperability, protection metadata and programming interfaces to access genomic information. The main goals are to enable (controlled) access to MPEG-G data from external applications and to add metadata to compressed genomic information.

Part 4: Reference Software

To support and guide potential implementers of MPEG-G, the standard includes normative Reference Software. The Reference Software is normative in the sense that any conforming implementation of the decoder, taking the same conformant compressed bitstreams and using the same normative output data structures, will output the same data.

Part 5: Conformance

Conformance testing is fundamental in providing means to validate the correct implementation of the MPEG-G technology in different devices and applications and the interoperability among all systems. This part of the standard specifies a normative procedure to assess conformity to the standard on an exhaustive dataset of compressed data.
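
The idea behind Parts 4 and 5 (a conforming decoder must reproduce the reference output for every conformant bitstream) can be illustrated with a minimal sketch. This is not the normative procedure defined by the standard: the names `digest` and `check_conformance` are hypothetical, and `zlib.decompress` is used only as a stand-in for a real MPEG-G decoder, which is not part of the Python standard library.

```python
import hashlib
import zlib
from typing import Callable, Dict, List


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of decoded output."""
    return hashlib.sha256(data).hexdigest()


def check_conformance(
    decode: Callable[[bytes], bytes],
    bitstreams: Dict[str, bytes],
    expected_digests: Dict[str, str],
) -> List[str]:
    """Decode every test bitstream and list the names whose output
    does not match the digest of the reference output."""
    failures = []
    for name, bitstream in bitstreams.items():
        if digest(decode(bitstream)) != expected_digests[name]:
            failures.append(name)
    return failures


# Toy data set: zlib stands in for the (assumed) decoder under test.
reads = b"ACGTACGTTTGA"
test_bitstreams = {"sample_1": zlib.compress(reads)}
reference = {"sample_1": digest(reads)}

print(check_conformance(zlib.decompress, test_bitstreams, reference))
```

Comparing digests rather than raw output keeps the reference data small, which matters when the test set contains large genomic files.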

MPEG-G Database

A test set of reference genomic data has been defined to perform tests during the process of standardization, definition of conformance test procedures and other experiments. More information on the MPEG-G database is available here.

Next MPEG-G AhG meeting

The next MPEG-G AhG meeting will be held online on 2-3 July, before the 131st MPEG meeting, which will also be held online (the physical meeting in Geneva has been cancelled). More meeting information will soon be available on the meeting web site.

How to participate?

If you are interested in MPEG-G and the related activities, you are welcome to join the open mailing list and contribute to the discussions.