Using bapsflib.lapd

The bapsflib.lapd is a one-stop-shop for everything specifically related to handling data collected on the LaPD. The package provides:

  1. HDF5 file access via bapsflib.lapd.File

  2. LaPD machine specs and parameters in bapsflib.lapd.constants

  3. LaPD specific tools (e.g. port number to LaPD \(z\) conversion in

Accessing HDF5 Files

Opening a File

Opening a HDF5 file is done using the bapsflib.lapd.File class. File subclasses h5py.File, so group and dataset manipulation is handled by the inherited methods; whereas, the new methods (see Table 1) are focused on mapping the data structure and providing a high-level access to the experimental data recorded by the LaPD DAQ system.

File is a wrapper on h5py.File and, thus, HDF5 file manipulation is handled by the inherited methods of h5py.File. File adds methods and attributes specifically for manipulating data and metadata written to the file from the Large Plasma Device (LaPD) DAQ system, see Table 1.

To open a LaPD generated HDF5 file do

>>> from bapsflib import lapd
>>> f = lapd.File('test.hdf5')
>>> f
<HDF5 file "test.hdf5" (mode r)>
>>> # f is still an instance of h5py.File
>>> isinstance(f, h5py.File)

which opens the file as ‘read-only’ by default. File restricts opening modes to ‘read-only’ (mode='r') and ‘read/write’ (mode='r+'), but maintains keyword pass-through to h5py.File.

After Opening a File

Upon opening a file, File calls on the LaPDMap class (a subclass of HDFMap) to construct a mapping of the HDF5 file’s internal data structure. This mapping provides the necessary translation for the high-level data reading methods, read_data(), read_controls(), and read_msi(). If an element of the HDF5 file is un-mappable – a mapping module does not exist or the mapping fails – the data can still be reached using the not-so-lower inherited methods of h5py.File. An instance of the mapping object is bound to File as file_map

>>> from bapsflib import lapd
>>> from bapsflib._hdf import HDFMap
>>> f = lapd.File('test.hdf5')
>>> f.file_map
<LaPDMap of HDF5 file 'test.hdf5'>
>>> # is still an instance of HDFMap
>>> isinstance(f.file_map, HDFMap)

For details on how the mapping works and how the mapping objects are structured see HDF5 File Mapping (HDFMap). For details on using the file_map see File Mapping for details.

The opened file object (f) provides a set of high-level methods and attributes for th user to interface with, see Table 1.

Table 1 Bound methods and attributes for open HDF5 file object File




dictionary of control device mappings (quick access to f.file_map.controls)


dictionary of digitizer [device] mappings (quick access to f.file_map.digitizers)


instance of the LaPD HDF5 file mapping (instance of LaPDMap)
(see File Mapping for details)


dictionary of meta-info about the HDF5 file and the experimental run


dictionary of MSI diagnostic [device] mappings (quick access to f.file_map.msi)


instance of LaPDOverview which that allows for printing and saving of the file mapping results


high-level method for reading control device data contained in the HDF5 file (instance of HDFReadControls)
(see For Control Devices for details)


high-level method for reading digitizer data and mating control device data at the time of read (instance of HDFReadData)
(see For a Digitizer for details)


high-level method for reading MSI diagnostic date (instance of HDFReadMSI)
(see For a MSI Diagnostic for details)


printout the LaPD experimental run description (print(['run description'].splitlines()))

File Mapping

The main purpose of the file_map object is to (1) identify the control devices, digitizers, and MSI diagnostic in the HDF5 file and (2) provide the necessary translation info to allow for easy reading of data via read_controls(), read_data(), and read_msi(). For the most part, there is not reason to directly access the file_map object since its results can easily be printed or saved using the overview attribute, see File Overview: Getting, Printing, and Saving for details. However, the mapping objects do contain useful details that are desirable in certain circumstances and can be modified for a special type of control device to augment the resulting numpy array when data is read.

The file map object file_map is an instance of LaPDMap, which subclasses HDFMap (details on HDFMap can be found at HDF5 File Mapping (HDFMap)). The LaPDMap provides a useful set of bound methods, see Table 2.

Table 2 Bound methods and attributes on f.file_map.




dictionary of control device mapping objects


dictionary of digitizer mapping objects


dictionary of experimental info collected from various group attributes in the HDF5 file


retrieve the mapping object for a specified device


True if it was determined that the HDF5 file was generated by the LaPD


version string of the LaPD DAQ Controller software used to generate the HDF5 file


mapping object for the digitizer that is considered the “main digitizer”


dictionary of MSI diagnostic mapping objects


dictionary of experimental run info collected from various group attributes in the HDF5 file


list of all subgroup and dataset paths in the HDF5 root group, control device group, digitizer group, and MSI group that were unable to be mapped

File Info: Metadata You Want

Every time a HDF5 file is opened a dictionary of metadata about the file and the experiment is bound to the file object as info.

>>> f = lapd.File('test.hdf5')
{'absolute file path': '/foo/bar/test.hdf5',
 'exp description': 'this is an experiment description',
  'run status': 'Started'}

Table Table 3 lists and describes all the items that can be found in the info dictionary.

Table 3 Items in the dictionary. (info)


Description & Equivalence

'absolute file path'

absolute path to the HDF5 file

'exp description'

description of experiment
f['Raw data + config].attrs['Experiment description']

'exp name'

name of the experiment in which the run of the HDF5 file resides
f['Raw data + config].attrs['Experiment Name']

'exp set description'

description of experiment set
f['Raw data + config].attrs['Experiment set description']

'exp set name'

name of experiment set the 'exp name' resides
f['Raw data + config].attrs['Experiment set name']


name of HDF5 file


name of Investigator/PI of the experiment
f['Raw data + config].attrs['Investigator']

'lapd version'

LaPD DAQ software version that wrote the HDF5 file

'run date'

date of experimental run
f['Raw data + config].attrs['Status date']

'run description'

description of experimental run
f['Raw data + config].attrs['Description']

'run name'

name of experimental data run
f['Raw data + config].attrs['Data run']

'run status'

status of experimental run (started, completed, etc.)
f['Raw data + config].attrs['Status']

File Overview: Getting, Printing, and Saving

The hdfOverview class provides a set of tools (see Table 4) to report the results of the HDF5 file mapping by HDFMap. An instance of hdfOverview is bound to File as the overview attribute and will report the current status of the mapping object.

>>> f = lapd.File('test.hdf5')
>>> f.overview

Thus, if any changes are made to the mapping object (file_map), which could happen for certain control devices [*], then those changes will be reflected in the overview report.

The overview report is divided into three blocks:

  1. General File and Experimental Info
    • This block contains information on the file (name, path, etc.), the experiment (exp. name, investigator, etc.), and the experimental run setup (run name, description, etc.).

    • Example:

      test.hdf5 Overview
      Generated by bapsflib
      Generated date: 4/19/2018 3:35:43 PM
      Filename:     test.hdf5
      Abs. Path:    /foo/bar/test.hdf5
      LaPD version: 1.2
      Investigator: Everson
      Run Date:     8/14/2017 9:49:53 PM
      Exp. and Run Structure:
        (set)  LaB6_Cathode
        (exp)  +-- HighBetaSAWAug17
        (run)  |   +-- test
      Run Description:
          some description of the experimental run
      Exp. Description:
          some description of the experiment as a whole
  2. Discovery Report
    • This block gives a brief report on what devices the HDFMap class discovered in the the file.

    • There are no details about each discovered device, just what was discovered.

    • Example:

      Discovery Report
      MSI/                                                   found
      +-- diagnostics (5)
      |   +-- Discharge
      |   +-- Gas pressure
      |   +-- Heater
      |   +-- Interferometer array
      |   +-- Magnetic field
      Raw data + config/                                     found
      +-- Data run sequence                                  not mapped
      +-- digitizers (1)
      |   +-- SIS crate (main)
      +-- control devices (1)
      |   +-- Waveform
      Unknowns (2)                                           aka unmapped
      +-- /Raw data + config/Data run sequence
      +-- /Raw data + config/N5700_PS
  3. Detailed Report
    • This block reports details on the mapping results for each discovered device (MSI diagnostics, control devices, and digitizers).

    • Basically reports the constructed configs dictionary of each devices mapping object.

    • Example:

      Detailed Reports
      Digitizer Report
      SIS crate (main)
      +-- adc's:  ['SIS 3302', 'SIS 3305']
      +-- Configurations Detected (1)                               (1 active, 0 inactive)
      |   +-- sis0-10ch                                             active
      |   |   +-- adc's (active):  ['SIS 3302']
      |   |   +-- path: /Raw data + config/SIS crate/sis0-10ch
      |   |   +-- SIS 3302 adc connections
      |   |   |   |   +-- (brd, [ch, ...])               bit  clock rate   nshotnum  nt        shot ave.  sample ave.
      |   |   |   |   +-- (1, [3, 4, 5, 6, 7, 8])        16   100.0 MHz    6160      12288     None       8
      |   |   |   |   +-- (2, [1, 2, 3, 4])              16   100.0 MHz    6160      12288     None       8
      Control Device Report
      +-- path:     /Raw data + config/Waveform
      +-- contype:  waveform
      +-- Configurations Detected (1)
      |   +-- waveform_50to150kHz_df10kHz_nf11
      |   |   +-- {...}
      MSI Diagnostic Report
      +-- path:  /MSI/Discharge
      +-- configs
      |   +--- {...}
      Gas pressure
      +-- path:  /MSI/Gas pressure
      +-- configs
      |   +-- {...}
      +-- path:  /MSI/Heater
      +-- configs
      |   +-- {...}
      Interferometer array
      +-- path:  /MSI/Interferometer array
      +-- configs
      |   +-- {...}
      Magnetic field
      +-- path:  /MSI/Magnetic field
      +-- configs
      |   +-- {...}

The methods provided by hdfOverview (see Table 4) allow for printing and saving of the complete overview, as well as, printing the individual blocks or sections of the blocks.

Table 4 “Methods provided by hdfOverview for reporting a HDF5 file overview”


Description and Call


Print to screen the entire overview.

>>> f.overview.print()


Save the report to a file given by filename.


If filename=True, then a text file is created with the same name as the HDF5 file in the same location.



Print the general info block.

>>> f.overview.report_general()


Print the discovery report block.

>>> f.overview.report_discovery()


Print the detail report block.

>>> f.overview.report_details()


Print the detail report block for all control devices.

>>> f.overview.report_controls()

Print the detail report block for a specific control device (e.g. Waveform).

>>> f.overview.report_controls(name='Waveform')


Print the detail report block for all digitizers.

>>> f.overview.report_digitizers()

Print the detail report block for a specific digitizer (e.g. SIS 3301).

>>> f.overview.report_digtitizers(name='SIS 3301')


Print the detail report block for all MSI diagnostics.

>>> f.overview.report_msi()

Print the detail report block for a specific MSI diagnostic (e.g. Discharge).

>>> f.overview.report_msi(name='Discharge')

Reading Data from a HDF5 File

Three classes HDFReadData, HDFReadControls, and HDFReadMSI are given to read data for digitizers, control devices, and MSI diagnostics, respectively. Each of these read classes are bound to File, see Table 5, and will return a structured numpy array with the requested data.

Table 5 Read classes/methods for extracting data from a HDF5 file

Read Class

Bound Method on File

What it does



Designed to extract digitizer data from a HDF5 file with the option of mating control device data at the time of extraction. (see reading For a Digitizer)



Designed to extract control device data. (see reading For Control Devices)



Designed to extract MSI diagnostic data. (see reading For a MSI Diagnostic)

For a Digitizer

Digitizer data is read using the read_data() method on File. The method also has the option of mating control device data at the time of declaration (see section Adding Control Device Data) [1].

At a minimum the read_data() method only needs a board number and channel number to extract data [2], but there are several additional keyword options:

Table 6 Optional keywords for read_data()






row index of the HDF5 dataset (see Extracting a sub-set)



global HDF5 file shot number (see Extracting a sub-set)



name of the digitizer for which board and channel belong to



name of the digitizer’s analog-digital-converter (adc) for which board and channel belong to



name of the digitizer configuration



Set True to return the digitizer data in bit values. By default the digitizer data is converted to voltage.



list of control devices whose data will be matched and added to the requested digitizer data



Ensures that the returned data array only contains shot numbers that are inclusive in shotnum, the digitizer dataset, and all control device datasets.



set True to suppress command line printout of soft-warnings

These keywords are explained in more detail in the following subsections.

If the test.hdf5 file has only one digitizer with one active adc and one configuration, then the entire dataset collected from the signal attached to board = 1 and channel = 0 can be extracted as follows:

>>> from bapsflib import lapd
>>> f = lapd.File('test.hdf5')
>>> board, channel = 1, 0
>>> data = f.read_data(board, channel)

where data is an instance of HDFReadData. The HDFReadData class acts as a wrapper on numpy.recarray. Thus, data behaves just like a numpy.recarray object, but will have additional methods and attributes that describe the data’s origin and parameters (e.g. info, dt, dv, etc.).

By default, data is a structured numpy array with the following dtype:

>>> data.dtype
dtype([('shotnum', '<u4'),
       ('signal', '<f4', (12288,)),
       ('xyz', '<f4', (3,))])

where 'shotnum' contains the HDF5 shot number, 'signal' contains the signal recorded by the digitizer, and 'xyz' is a 3-element array containing the probe position. In this example, the digitized signal is automatically converted into voltage before being added to the array and 12288 is the size of the signal’s time-array. To keep the digitizer 'signal in bit values, then set keep_bits=True at execution of read_data(). The field 'xyz' is initialized with numpy.nan values, but will be populated if a control device of contype = 'motion' is added (see Adding Control Device Data).

For details on handling and manipulating data see handle_data.


Since bapsflib.lapd leverages the h5py package, the data in test.hdf5 resides on disk until one of the read methods, read_data(), read_msi(), or read_controls() is called. In calling on of these methods, the requested data is brought into memory as a numpy.ndarray and a numpy.view onto that ndarray is returned to the user.

Extracting a sub-set

There are three keywords for sub-setting a dataset: index, shotnum, and intersection_set. index and shotnum are indexing keywords, whereas, intersection_set controls sub-setting behavior between the indexing keywords and the dataset(s).

index refers to the row index of the requested dataset and shotnum refers to the global HDF5 shot number. Either indexing keyword can be used, but index overrides shotnum. index and shotnum can be of type() int, list(int), or slice(). Sub-setting with index looks like:

>>> # read dataset row 10
>>> data = f.read_data(board, channel, index=9)
>>> data['shotnum']
array([10], dtype=uint32)

>>> # read dataset rows 10, 20, and 30
>>> data = f.read_data(board, channel, index=[9, 19, 29])

>>> # read dataset rows 10 to 19
>>> data = f.read_data(board, channel, index=slice(9, 19))

>>> # read every third row in the dataset from row 10 to 19
>>> data = f.read_data(board, channel, index=slice(9, 19, 3))
>>> data['shotnum']
array([10, 13, 16, 19], dtype=uint32)

Sub-setting with shotnum looks like:

>>> # read dataset shot number 10
>>> data = f.read_data(board, channel, shotnum=10)
>>> data['shotnum']
array([10], dtype=uint32)

>>> # read dataset shot numbers 10, 20, and 30
>>> data = f.read_data(board, channel, shotnum=[10, 20, 30])

>>> # read dataset shot numbers 10 to 19
>>> data = f.read_data(board, channel, shotnum=slice(10, 20))

>>> # read every 5th dataset shot number from 10 to 19
>>> data = f.read_data(board, channel, index=slice(10, 20, 5))
>>> data['shotnum']
array([10, 15], dtype=uint32)

intersection_set modifies what shot numbers are returned by read_data(). By default intersection_set=True and forces the returned data to only correspond to shot numbers that exist in the digitizer dataset, any specified control device datasets, and those shot numbers represented by index or shotnum. Setting to False will return all shot numbers >=1 associated with index or shotnum and array entries that are not associated with a dataset will be filled with a “NaN” value (np.nan for floats, -99999 for integers, and '' for strings).

Specifying digitizer, adc, and config_name

It is possible for a LaPD generated HDF5 file to contain multiple digitizers, each of which can have multiple analog-digital-converters (adc) and multiple configuration settings. For such a case, read_data() has the keywords digitizer, adc, and config_name to direct the data extraction accordingly.

If digitizer is not specified, then it is assumed that the desired digitizer is the one defined in main_digitizer. Suppose the test.hdf5 has two digitizers, 'SIS 3301' and 'SIS crate'. In this case 'SIS 3301' would be assumed as the main_digitizer. To extract data from 'SIS crate' one would use the digitizer keyword as follows:

>>> data = f.read_data(board, channel, digitizer='SIS crate')
'SIS crate'

Digitizer 'SIS crate' can have multiple active adc’s, 'SIS 3302' and 'SIS 3305'. By default, if only one adc is active then that adc is assumed; however, if multiple adc’s are active, then the adc with the slower clock rate is assumed. 'SIS 3302' has the slower clock rate in this case. To extract data from 'SIS 3305' one would use the adc keyword as follows:

>>> data = f.read_data(board, channel, digitizer='SIS crate',
>>>                    adc='SIS 3305')
'SIS 3305'

A digitizer can have multiple configurations, but typically only one configuration is ever active for the HDF5 file. In the case that multiple configurations are active, there is no overlying hierarchy for assuming one configuration over another. Suppose digitizer 'SIS crate' has two configurations, 'config_01' and 'config_02'. In this case, one of the configurations has to be specified at the time of extraction. To extract data from 'SIS crate' under the the configuration 'config_02' one would use the 'config_name' keyword as follows:

>>> f.file_map.digitizers['SIS crate'].active_configs
['config_01', 'config_02']
>>> data = f.read_data(board, channel, digitizer='SIS crate',
>>>                    config_name='config_02')
>>>['configuration name']
Adding Control Device Data

Adding control device data to a digitizer dataset is done with the keyword add_controls. Specifying add_controls will trigger a call to the HDFReadControls class and extract the desired control device data. HDFReadData then compares and mates that control device data with the digitizer data according to the global HDF5 shot number.

add_controls must be a list of strings and/or 2-element tuples specifying the desired control device data to be added to the digitizer data. If a control device only controls one configuration, then it is sufficient to only name that device. For example, if a '6K Compumotor' is only controlling one probe, then the data extraction call would look like:

>>> list(f.file_map.controls['6K Compumotor'].configs)
>>> data = f.read_data(board, channel,
>>>                    add_controls=['6K Compumotor'])
>>>['added controls']
[('6K Compumotor', 3)]

In the case the '6K Compumotor' has multiple configurations (controlling multiple probes), the add_controls call must also provide the configuration name to direct the extraction. This is done with a 2-element tuple entry for add_controls, where the first element is the control device name and the second element is the configuration name. For the '6K Compumotor' the configuration name is the receptacle number of the probe drive [3]. Suppose the '6K Compumotor' is utilizing three probe drives with the receptacles 2, 3, and 4. To mate control device data from receptacle 3, the call would look something like:

>>> list(f.file_map.controls['6K Compumotor'].configs)
[2, 3, 4]
>>> control  = [('6K Compumotor', 3)]
>>> data = f.read_data(board, channel, add_controls=control)
>>>['added controls']
[('6K Compumotor', 3)]

Multiple control device datasets can be added at once, but only one control device for each control type ('motion', 'power', and 'waveform') can be added. Adding '6K Compumotor' data from receptacle 3 and 'Waveform' data would look like:

>>> list(f.file_map.controls['Waveform'].configs)
>>> f.file_map.controls['Waveform'].contype
>>> f.file_map.controls['6K Compumotor'].contype
>>> data = f.read_data(board, channel,
>>>                    add_controls=[('6K Compumotor', 3),
>>>                                  'Waveform'])
>>>['added controls']
[('6K Compumotor', 3), ('Waveform', 'config01')]

Since '6K Compumotor' is a 'motion' control type it fills out the 'xyz' field in the returned numpy structured array; whereas, 'Waveform' will add field names to the numpy structured array according to the fields specified in its mapping constructor. See For Control Devices for details on these added fields.

For Control Devices


To be written

For a MSI Diagnostic

MSI diagnostic data is read using the read_msi() method on File. Only the MSI diagnostic name needs to be supplied to read the associated data:

>>> from bapsflib import lapd
>>> # open file
>>> f = lapd.File('test.hdf5')
>>> # list mapped MSI diagnostics
>>> f.list_msi
 'Gas pressure',
 'Interferometer array',
 'Magnetic field']
>>> # read 'Discharge' data
>>> mdata = f.read_msi('Discharge')

The returned data mdata is a structured numpy array where its field structure and population is determined by the MSI diagnostic mapping object. Every mdata will have the fields 'shotnum' and 'meta'. 'shotnum' represents the HDF5 shot number. 'meta' is a structured array with fields representing quantities (metadata) that are both diagnostic and shot number specific, but are not considered “primary” data arrays. Any other field in mdata is considered to be a “primary” data array. Continuing with the above example:

>>> # display mdata dytpe
>>> mdata.dtype
dtype([('shotnum', '<i4'),
       ('voltage', '<f4', (2048,)),
       ('current', '<f4', (2048,)),
       ('meta', [('timestamp', '<f8'),
                 ('data valid', 'i1'),
                 ('pulse length', '<f4'),
                 ('peak current', '<f4'),
                 ('bank voltage', '<f4')])])
>>> # display shot numbers
>>> mdata['shotnum']
array([    0, 19251], dtype=int32)

Here, the fields 'voltage' and 'current' correspond to “primary” data arrays. To display display the first three samples of the 'voltage' array for shot number 19251 do:

>>> mdata['voltage'][1][0:3:]
array([-44.631958, -44.708252, -44.631958], dtype=float32)

The metadata field 'meta' has five quantities in it, 'timestamp', 'data valid', 'pulse length', 'peak current', and 'peak voltage'. Now, these metadata fields will vary depending on the requested MSI diagnostic. To view the 'peak voltage' for shot number 0 do:

>>> mdata['meta']['peak voltage'][0]

The data array mdata is also constructed with a info attribute that contains metadata that is diagnostic specific but not shot number specific.

{'current conversion factor': [0.0],
 'diagnostic name': 'Discharge',
 'diagnostic path': '/MSI/Discharge',
 'dt': [4.88e-05],
 'hdf file': 'test.hdf5',
 't0': [-0.0249856],
 'voltage conversion factor': [0.0]}

Every info attribute will have the keys 'hdf file', 'diagnostic name', and 'diagnostic path'. The rest of the keys will be MSI diagnostic dependent. For example, for the 'Magnetic field' diagnostic would have the key 'z' that corresponds to the axial locations of the magnetic field array.

>>> # get magnetic field data
>>> mdata = f.read_msi('Magnetic field')
>>> mdata.dtype
dtype([('shotnum', '<i4'),
       ('magnet ps current', '<f4', (10,)),
       ('magnetic field', '<f4', (1024,)),
       ('meta', [('timestamp', '<f8'),
                 ('data valid', 'i1'),
                 ('peak magnetic field', '<f4')])])
{'diagnostic name': 'Magnetic field',
 'diagnostic path': '/MSI/Magnetic field',
 'hdf file': 'test.hdf5',
 'z': array([-300.     , -297.727  , -295.45395, ..., 2020.754  ,
             2023.027  , 2025.3    ], dtype=float32)}

LaPD Constants

LaPD Tools