MPEG meetings traditionally gather the world’s MPEG experts in a single location, but the 130th MPEG meeting was held online from April 20th to 24th due to the COVID-19 situation. Nonetheless, many important topics were discussed, and considerable progress was made on the ISO/IEC 23092 MPEG-G standard.

As usual, the Workshop first provides an overview of MPEG-G, the new ISO standard on the compression of and optimized access to genomic information, and of its impact on the relevant industry, the various related standardization initiatives, use cases, the evolution of sequencing technology and perspectives for standardization in other –omics fields.

In line with the traditional MPEG practice of continuously improving the quality and performance of its standards, at its 130th meeting MPEG promoted to FDIS (Final Draft International Standard) a new edition of Parts 1 and 2, as well as Part 4 “Reference Software” and Part 5 “Conformance”. These components of the MPEG-G standard series provide important support to those wishing to implement the standard or to verify the correctness and interoperability of their own implementations.

Compared to the first editions, the second editions of ISO/IEC 23092-1 and ISO/IEC 23092-2 have been improved by taking into account comments received from users.

The ISO/IEC 23092-4 (MPEG-G Reference Software) standard provides a normative implementation of the standard. In conjunction with the ISO/IEC 23092-5 (MPEG-G Conformance) standard, it provides comprehensive specification and validation support for the development of conforming decoder implementations. Interoperability of applications relying on normative decoding processes is facilitated by a reference normative decoding process and a rich set of tests with corresponding golden references.
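
The idea of checking a decoder against golden references can be illustrated with a minimal sketch. This is not the ISO/IEC 23092-5 procedure: the function names, the test vector names and the use of SHA-256 digests are assumptions made for illustration, and zlib merely stands in for a real codec.

```python
import hashlib
import zlib
from typing import Callable, Dict


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def run_conformance(
    decoder: Callable[[bytes], bytes],
    test_vectors: Dict[str, bytes],
    golden_digests: Dict[str, str],
) -> Dict[str, bool]:
    """Decode every test vector and compare its digest with the golden reference."""
    results = {}
    for name, bitstream in test_vectors.items():
        try:
            decoded = decoder(bitstream)
        except Exception:
            # A decoder that fails on a test vector does not pass that test.
            results[name] = False
            continue
        results[name] = sha256_hex(decoded) == golden_digests[name]
    return results


# Hypothetical test suite: zlib stands in for a real codec, purely for illustration.
reads = {"vector_a": b"ACGTACGTTTGA" * 50, "vector_b": b"GGGCCCAATT" * 80}
test_vectors = {name: zlib.compress(raw) for name, raw in reads.items()}
golden_digests = {name: sha256_hex(raw) for name, raw in reads.items()}

# A decoder that reproduces every golden reference passes all tests.
report_ok = run_conformance(zlib.decompress, test_vectors, golden_digests)

# A faulty decoder (it drops the last byte) is caught by the comparison.
report_bad = run_conformance(
    lambda b: zlib.decompress(b)[:-1], test_vectors, golden_digests
)
```

Comparing digests rather than full outputs keeps the golden references small; a real conformance suite may well compare decoded output directly.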

A workshop on applications of genomic information processing was hosted by Tencent in Shenzhen on 13th October 2018, the day after the closing of the 124th MPEG meeting in Macau.

The Workshop provided an overview of MPEG-G, the new ISO standard on the compression of and optimized access to genomic information, and of its impact on the relevant industry, the various related standardization initiatives, use cases, the evolution of sequencing technology and perspectives for standardization in other –omics fields.

Specifically the workshop addressed:

  • An overview of the ISO genomic compression standard and its new features and performance
  • The challenges for the generation and management of very large volumes of genome sequencing data
  • The status and future perspectives of sequencing technology and genomic data generation
  • The vision of genomic information storage and processing on the cloud
  • The vision of further standardization objectives in the –omics fields

Final Program

Start | End | What | Who
12:30 | 13:00 | Registration |
13:00 | 13:10 | Welcome & workshop goals | Leonardo Chiariglione (MPEG Convener)
13:10 | 13:40 | “An overview of the MPEG-G standard for the compression and processing of genomic sequencing data” | Marco Mattavelli (EPFL, Switzerland)
13:40 | 14:10 | “An overview of standardization progress in genomics data” | Yong Zhang (ISO/TC 276/WG2 & WG5 Convenor)
14:10 | 14:40 | “GSA: Genome Sequence Archive, in China” | Yanqing Wang (BIG Data Center, BIG, CAS)
14:40 | 14:45 | Short presentation of demos | Alvaro G. Hernandez (UIUC DNA Services, USA)
14:50 | 15:20 | Demo session and Coffee Break |
15:20 | 15:50 | “State-of-the-art and future of NGS, a standard perspective” | Ming NI (BGI-Shenzhen and MGI)
15:50 | 16:20 | “Constructing an open ecosystem for bioinformatics and genomic big data” | Chen Shifu (Haplox)
16:20 | 16:50 | “Practice and Challenges of 20,000 human WGS data analysis on BGI Online” | Kang FANG (BGI-Online, BGI)
16:50 | 17:20 | Panel discussion, Q&A and concluding remarks | All speakers

Demonstrations of genome sequencing data processing prototypes and products

Co-located with the workshop, GenomSys showed demos, prototypes and products related to genome sequencing data processing analytics, compression and storage.

Workshop on Genomic Sequencing Data Compression

GA4GH – MPEG, Basel 3rd October 2018

Call for Contributions

The amount of genome sequencing data generated day by day is comparable to or larger than that of other big data problems. Shrinking storage costs alone do not provide affordable solutions to the ambition of making genomic medicine common practice. Moreover, storage costs are not the only factor to consider, because genomic data, once generated, have to be made available to the scientific community for frequent and repeated access.

Current sequencing technologies compensate for the errors generated by intrinsically noisy processes by generating redundant data and associated metadata (i.e. quality values). Compression approaches are therefore effective solutions to reduce the costs and mitigate the technological limitations related to the handling of extremely large volumes of data.

The heterogeneity of genome sequencing data and the diversity of the available compression solutions pose several challenges to the quest for an ideal technology able to deliver, at the same time, high compression ratios, high coding and decoding speed, efficient selective access to data and guaranteed interoperability among applications while respecting a variety of data protection and privacy requirements.

The goal of this workshop is to collect technical contributions on emerging and new compression technologies with particular attention to:

  • DNA sequencing data compression
  • Selective access and processing in the compressed domain
  • Emerging standard frameworks for the specification, representation and compression of genomic sequencing data
  • Interoperability of genomic sequencing data formats, applications standard frameworks and APIs
  • Use cases and processing applications requiring genomic data/metadata compression and protection

Interested authors are invited to submit an abstract of no more than 600 words (excluding pictures and graphics which are welcome) describing their technical work by 31st August 2018.

The submission should indicate the preferred form of the contribution:

  • Oral presentation
  • Poster
  • Demonstration

Submissions of abstracts must be sent by email to:

GA4GH assembly program

Registration

A workshop on applications of genomic information processing was held on 18th April 2018, co-located with the 122nd MPEG meeting in San Diego.

The Workshop explored the opportunities for improved genome sequencing data processing services enabled by the availability in late 2018 of an ISO standard called MPEG-G on the compression of genomic information and its impact on the relevant industry.

Specifically the workshop addressed:

  • The perspectives and potential of genomic information usage in medicine and public health
  • The vision of interdisciplinary approaches to the analysis of genome sequencing data
  • The challenges for the generation and management of very large volumes of genome sequencing data
  • The status and progress of sequencing technology and associated data generation features
  • The reasons for supporting seamless availability and exchange of genome sequencing data, so that the analysis of larger data volumes can improve scientific progress
  • A status report on the development of the ISO genomic compression standard and an overview of its new features and performance

Venue:
San Diego Marriott La Jolla, 4240 La Jolla Village Drive
San Diego, CA 92037, United States
(see also the 122nd MPEG meeting for more details)

Organizing Committee:
Joern Ostermann (TNT-LUH), Claudio Alberti (GenomSys), Rongshan Yu (Aginome Scientific), Tom Paridaens (imec and UGent)

Program

Start | End | What | Who
12:30 | 13:00 | Registration |
13:00 | 13:15 | Welcome & workshop goals |
13:15 | 13:40 | “Genome and medical information portability, retrieval and analysis” | Amalio Telenti (Scripps Research Institute, USA)
13:40 | 14:05 | “From womb to tomb sequencing: on the advantages on bringing multidisciplinary R&D to develop standards and analytics” | Ioannis Xenarios (SIB, Switzerland)
14:05 | 14:30 | “Future of Genomics and Big Data” | Dawn Barry (Luna DNA, USA)
14:30 | 14:55 | “Generation and Management of Large Sequence Files: Perspectives from the DNA Sequencing Core” | Alvaro G. Hernandez (UIUC DNA Services, USA)
14:55 | 15:10 | Presentation of demonstrations | GenomSys, Aginome Scientific
15:10 | 15:40 | Demo session and Coffee Break |
15:40 | 16:05 | “The role of compression in the genomics data life cycle” | Come Raczy (Illumina Inc., USA)
16:05 | 16:30 | “Genomics at Rady’s Children Hospital San Diego” | Ray Veeraraghavan (Rady’s Children Hospital San Diego, USA)
16:30 | 16:55 | “An overview of the MPEG-G standard for the compression and processing of genomic sequencing data” | Marco Mattavelli (EPFL, Switzerland)

Demonstrations of genome sequencing data processing prototypes and products

Co-located with the workshop, GenomSys and Aginome Scientific showed demos, prototypes and products related to genome sequencing data processing analytics, compression and storage.

Workshop on Processing of Genomic Information: From Standards to Deployment

 13:30 – 19:00, 19th July 2017 – Torino (Italy)

Sala dei Mappamondi, Academy of Sciences, Via Accademia delle Scienze, 6, Torino

Organized by
Italian Institute for Genomic Medicine (IIGM)

and

International Organisation for Standardisation
Technical Committee 276 “Biotechnology”
and MPEG (ISO/IEC JTC 1/SC 29/WG 11)

A workshop on applications of genomic information processing was held on 19th July 2017 in Torino, Italy, co-located with the 119th MPEG meeting.

The workshop intended to explore the opportunities for improved health services offered by the availability in late 2018 of an ISO standard on compression of genomic information and its impact on the relevant industry.

Specifically the workshop addressed:

  • the potential of genomic information usage in public health
  • the IT challenges for the widespread usage of genomic data in public health
  • public policy on big data economy and genomics
  • a status report on the development of the ISO genomic compression standard
  • how the investor community can help progress the standard to deployment
  • how the appropriate usage of genomic data drives efficient drug development
  • how the genomic data processing industries can benefit from the standard
  • how health services can be affected by the massive application of genome processing

Organizing Committee:

Ruggero De Maria (IIGM), Diego Gibellino (TIM), Claudio Alberti (EPFL), Tom Paridaens (imec and UGent)

Program and list of speakers:

Start | End | Speaker | What
13:30 | 14:00 | | Registration
14:00 | 14:15 | Ruggero De Maria (IIGM) | Welcome & workshop goals
14:15 | 14:40 | Chiara Marinacci, Epidemiology Unit, Regione Piemonte, Italy | Genomics, a major component of future health policy
14:40 | 15:05 | Mikel Hernaez, Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, USA | Challenges in precision medicine: from sequencing to big data processing
15:05 | 15:30 | Kristina Kjerstad, DG Connect, European Commission | EC policy on data economy and its relation with genomics
15:30 | 15:55 | | Coffee break
15:55 | 16:20 | Marco Mattavelli, École Polytechnique Fédérale de Lausanne, Switzerland | MPEG-G the emerging ISO standard for genomic data
16:20 | 16:45 | Alberto de Min, Preon Capital, Switzerland | Business opportunities in standardization
16:45 | 17:10 | Wolfgang Hackl, Novartis, Switzerland | Genomic Profiling of Cancer to inform Drug Development
17:10 | 17:25 | | Coffee Break
17:25 | 17:50 | Leonor Frias, MADEOFGENES, Spain | Challenges in the personal genomics era
17:50 | 18:15 | Pier Giuseppe Pelicci, Istituto Europeo di Oncologia, Italy | Genomics in the clinical practice – Today and tomorrow
18:15 | 19:00 | Francesco Profumo, President, Compagnia di San Paolo, Italy | Discussion & recommendations

At its 115th meeting, MPEG issued a Call for Proposals (CfP) for Genomic Information Compression and Storage in conjunction with the working group for standardisation of data processing and integration of the ISO Technical Committee for biotechnology standards (ISO/TC 276/WG5). The call sought submissions of technologies that can provide efficient compression of genomic data and metadata for storage and processing applications. During the 116th MPEG meeting, the responses to this CfP, comprising twelve distinct submitted technologies, were collected and evaluated by a joint ad-hoc group of both working groups. An initial assessment of the performance of the best eleven solutions for the different categories reported compression factors ranging from 8 to 58 for the different classes of data.

The twelve submitted technologies show consistent improvements over the results assessed in response to the Call for Evidence in February 2016. Further improvements of the technologies under consideration are expected from the first phase of core experiments, which was defined at the 116th MPEG meeting. The open core experiment process planned for the next 12 months will consist of multiple, independent, directly comparable and rigorous experiments performed by independent entities to determine the specific merit of each technology and their mutual integration into a single solution for standardisation. The core experiment process will consider submitted technologies as well as new solutions in the scope of each specific core experiment. The final inclusion of submitted technologies into the standard will be based on the experimental comparison of performance, as well as on the validation of requirements and inclusion of essential metadata describing the context of the sequence data, and will be reached by consensus within and across both committees.
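
To give a feel for what compression factors of 8 to 58 mean in practice, here is a small worked sketch. It assumes the usual definition of compression factor as original size divided by compressed size, which the text above does not spell out; the 100 GB input and the function names are hypothetical.

```python
def compressed_size(original_size: float, factor: float) -> float:
    """Size after compression, assuming factor = original size / compressed size."""
    if factor <= 0:
        raise ValueError("compression factor must be positive")
    return original_size / factor


def space_saving(factor: float) -> float:
    """Fraction of storage saved at a given compression factor."""
    if factor <= 0:
        raise ValueError("compression factor must be positive")
    return 1.0 - 1.0 / factor


original_gb = 100.0  # hypothetical data set size, not a figure from the assessment
for factor in (8, 58):  # the two ends of the reported range
    print(
        f"factor {factor}: {compressed_size(original_gb, factor):.2f} GB, "
        f"{space_saving(factor):.1%} saved"
    )
```

Under these assumptions, the 100 GB example shrinks to 12.50 GB (87.5% saved) at a factor of 8 and to about 1.72 GB (98.3% saved) at a factor of 58.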