Importing HDF5 files

Introduction to Importing Data in Python

Hugo Bowne-Anderson

Data Scientist at DataCamp

HDF5 files

  • Hierarchical Data Format version 5
  • Standard for storing large quantities of numerical data
  • Datasets can be hundreds of gigabytes or terabytes
  • HDF5 can scale to exabytes

Importing HDF5 files

import h5py
filename = 'H-H1_LOSC_4_V1-815411200-4096.hdf5'
data = h5py.File(filename, 'r')  # 'r' opens the file read-only
print(type(data))
<class 'h5py._hl.files.File'>
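
A File object holds the file open until it is closed. The sketch below shows one common way to handle that: opening the file in a with block so it is closed automatically. The file name example.hdf5 and its contents are hypothetical stand-ins, written to a temporary directory so the snippet runs without the LIGO download.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical stand-in file (not the LIGO data) so the sketch is runnable.
path = os.path.join(tempfile.mkdtemp(), 'example.hdf5')
with h5py.File(path, 'w') as f:
    # Intermediate groups such as 'strain' are created automatically.
    f.create_dataset('strain/Strain', data=np.arange(5, dtype='f8'))

# The with block closes the file on exit, even if an error is raised inside.
with h5py.File(path, 'r') as data:
    top_level_keys = list(data.keys())
    file_mode = data.mode

print(top_level_keys)
print(file_mode)
```

Anything needed from the file should be read inside the with block, since the datasets are no longer accessible once the file is closed.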

The structure of HDF5 files

for key in data.keys():
    print(key)
meta
quality
strain
print(type(data['meta']))
<class 'h5py._hl.group.Group'>
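
Groups behave like directories and can contain other groups or datasets. As a rough sketch of how the whole hierarchy could be listed in one pass, the example below uses visititems on a small hypothetical file; the group and dataset names only mimic the layout shown above and the values are made up.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file that mimics the meta/strain layout; values are made up.
path = os.path.join(tempfile.mkdtemp(), 'tree.hdf5')
with h5py.File(path, 'w') as f:
    f.create_dataset('meta/Duration', data=4096)
    f.create_dataset('strain/Strain', data=np.zeros(8))

entries = []

def record(name, obj):
    # Called once for every group and dataset below the root.
    kind = 'Group' if isinstance(obj, h5py.Group) else 'Dataset'
    entries.append((name, kind))

with h5py.File(path, 'r') as data:
    data.visititems(record)

for name, kind in entries:
    print(f'{kind:8s}{name}')
```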


The structure of HDF5 files

for key in data['meta'].keys():
    print(key)
Description
DescriptionURL
Detector
Duration
GPSstart
Observatory
Type
UTCstart
import numpy as np
print(np.array(data['meta']['Description']), np.array(data['meta']['Detector']))
b'Strain data time series from LIGO' b'H1'
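
The b prefix in the output marks byte strings. A minimal sketch of how such values might be turned into ordinary Python objects follows: [()] reads a scalar dataset, .decode() converts bytes to str, and slicing reads only part of a larger dataset. The file, the dataset name strain/Strain, and the values are assumptions for illustration, not taken from the LIGO file.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file; dataset names and values are illustrative assumptions.
path = os.path.join(tempfile.mkdtemp(), 'values.hdf5')
with h5py.File(path, 'w') as f:
    f.create_dataset('meta/Detector', data=np.bytes_(b'H1'))
    f.create_dataset('strain/Strain', data=np.linspace(0.0, 1.0, 11))

with h5py.File(path, 'r') as data:
    # [()] reads a scalar dataset; decode() turns the bytes into a str.
    detector = data['meta/Detector'][()].decode('utf-8')
    # Slicing reads only the requested samples into a NumPy array.
    first_three = data['strain/Strain'][:3]
    # shape is available without reading the data itself.
    n_samples = data['strain/Strain'].shape[0]

print(detector)
print(first_three)
print(n_samples)
```

Slicing matters for large files: only the requested part of the dataset is loaded into memory.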

The HDF Project

  • Actively maintained by the HDF Group

  • Based in Champaign, Illinois

Let's practice!
