Welcome to bapsflib’s documentation!

About bapsflib

The bapsflib package is intended to be a toolkit for reading, manipulating, and analyzing data collected at the Basic Plasma Science Facility (BaPSF). The current development focus is on providing a high-level, structured interface between the user and the HDF5 files generated by the Large Plasma Device (LaPD). The bapsflib.lapd module provides all the high-level methods while maintaining the “lower level” functionality of the h5py package.

As the package develops, additional data visualization and plasma analysis tools will be incorporated.

Installation

Installing from pip

The bapsflib package is registered with PyPI and can be installed with pip via

pip install bapsflib

For the most recent development version, bapsflib can be installed from GitHub.

Installing Directly from GitHub

To install directly from GitHub, you need to have git installed on your computer. If you do not have git installed, then see Installing from a GitHub Clone or Download.

To install directly from the master branch, invoke the following command

pip install git+https://github.com/BaPSF/bapsflib.git#egg=bapsflib

If an alternate branch BranchName is desired, then invoke

pip install git+https://github.com/BaPSF/bapsflib.git@BranchName#egg=bapsflib

Installing from a GitHub Clone or Download

A copy of the bapsflib package can be obtained by cloning or downloading from the GitHub repository.

Cloning the repository requires an installation of git on your computer. To clone the master branch, first navigate to the directory where you want the clone and do

git clone https://github.com/BaPSF/bapsflib.git

To download a copy, go to the repository, select the branch to be downloaded, click the green button labeled Clone or download, select Download ZIP, save the zip file to the desired directory, and unpack.

After getting a copy of the bapsflib package (via clone or download), navigate to the main package directory, where the package setup.py file is located, and execute

pip install .

or

python setup.py install

Getting Started

The bapsflib package has four key sub-packages:

  • bapsflib._hdf

    This package contains the generic HDF5 utilities for mapping and accessing any HDF5 file generated at the Basic Plasma Science Facility (BaPSF) at UCLA. There is typically no reason to access classes in this package directly, since they are sub-classed to provide specific access to the HDF5 files generated by each plasma device at BaPSF. For example, any data collected on the Large Plasma Device (LaPD) is handled by the bapsflib.lapd package. For now, one can access data collected on the Small Plasma Device (SmPD) and the Enormous Toroidal Plasma Device (ETPD) by utilizing bapsflib._hdf.File.

  • bapsflib.lapd

    This package contains functionality for accessing HDF5 files generated by the LaPD (bapsflib.lapd.File), LaPD parameters (bapsflib.lapd.constants), and LaPD specific tools (bapsflib.lapd.tools). Look to Using bapsflib.lapd for details about the package.

  • bapsflib.plasma

    Warning

    package currently in development

    This package contains plasma constants and functions.

  • bapsflib.utils

    This package is for developers and contributors. It contains utilities used for constructing the bapsflib package.

In the future, packages for the Small Plasma Device (SmPD), bapsflib.smpd, and the Enormous Toroidal Plasma Device (ETPD), bapsflib.etpd, will be added, as well as some fundamental analysis and plasma diagnostic packages.

BaPSF Background

What is HDF5?

HDF5 is a technology developed by the HDF Group that is designed to manage large and complex collections of data, allowing for advanced relationships between data and user metadata to be structured through grouping and linking mechanisms. For HDF5 support visit HDF Group’s HDF5 support site.

BaPSF HDF5 Files

Every plasma device at BaPSF is governed by a DAQ Controller. This DAQ Controller is tasked with operating and monitoring the plasma device, controlling the order of operations for an experimental run, recording data for the run, and generating the HDF5 file.

Types of Recorded Data

Data collected by BaPSF is classified into three types:

  1. MSI diagnostic [device] data

    MSI data consists of machine-state recordings for the plasma device the experiment was run on. For example, MSI data for the LaPD includes diagnostics like partial gas pressure, discharge traces, magnetic field, etc.

  2. Digitizer [device] data

    This is “primary data” recorded by the DAQ digitizers. “Primary data” is any signal recorded from a plasma probe.

  3. Control device data

    Data recorded from a control device. A control device is a piece of equipment that controls some state property of the experiment. For example, a probe drive records position state info for a probe, and a waveform generator can record the driving frequencies of an antenna.

Internal HDF5 Structure

HDF5 internal data structure

Example of a LaPD HDF5 internal data-tree.

The internal data structure (data-tree) of a HDF5 file looks very similar to a file-system structure (see Fig. 1), where groups are akin to directories and datasets are akin to files. Though not depicted in Fig. 1, each group and dataset can have an arbitrary number of key-value pair attributes constituting the component’s metadata.
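This directory-like layout can be sketched with plain Python, no HDF5 library required: a nested dict stands in for a group and any other value stands in for a dataset. The names below are illustrative, echoing the example tree in Fig. 1.

```python
# Sketch of an HDF5-like data-tree with nested dicts: a dict models a
# group (directory) and any other value models a dataset (file).
tree = {
    "MSI": {"Discharge": "<dataset>", "Gas pressure": "<dataset>"},
    "Raw data + config": {"SIS crate": {"config": "<dataset>"}},
}

def walk(node, path=""):
    """Yield the full path of every group and dataset in the tree."""
    for name, child in node.items():
        full = f"{path}/{name}"
        yield full
        if isinstance(child, dict):  # group: recurse like a directory
            yield from walk(child, full)
```

Calling list(walk(tree)) then lists every path in the tree, much like the visit() traversal h5py offers on a real file.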

In the above example, MSI diagnostic data is contained in the MSI group, and the Raw data + config group houses both Digitizer data and Control device data. In addition to the three typical Types of Recorded Data, the Raw data + config group contains the Data run sequence for the experimental run. The Data run sequence is the order of operations performed by the DAQ Controller to execute the experimental run.


Using bapsflib.lapd

The bapsflib.lapd package is a one-stop shop for everything specifically related to handling data collected on the LaPD. The package provides:

  1. HDF5 file access via bapsflib.lapd.File

  2. LaPD machine specs and parameters in bapsflib.lapd.constants

  3. LaPD specific tools (e.g. port number to LaPD z conversion, bapsflib.lapd.tools.portnum_to_z()) in bapsflib.lapd.tools.
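As an illustration of the sort of conversion portnum_to_z() performs, here is a hypothetical sketch. The port spacing and reference port below are placeholder values, not the real LaPD geometry; consult bapsflib.lapd.tools for the actual conversion.

```python
# Hypothetical sketch of a port-number to axial-position conversion in
# the spirit of bapsflib.lapd.tools.portnum_to_z().  PORT_SPACING and
# REF_PORT are placeholder values, NOT the real LaPD geometry.
PORT_SPACING = 32.0  # assumed cm between adjacent ports
REF_PORT = 53        # assumed port sitting at z = 0

def portnum_to_z(portnum):
    """Return the assumed axial position z (cm) for a port number."""
    return (REF_PORT - portnum) * PORT_SPACING
```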

Accessing HDF5 Files

Opening a File

Opening a HDF5 file is done using the bapsflib.lapd.File class. File subclasses h5py.File, so group and dataset manipulation is handled by the inherited methods, while the new methods and attributes (see Table 1) focus on mapping the data structure and providing high-level access to the experimental data and metadata recorded by the LaPD DAQ system.

To open a LaPD generated HDF5 file do

>>> from bapsflib import lapd
>>> f = lapd.File('test.hdf5')
>>> f
<HDF5 file "test.hdf5" (mode r)>
>>>
>>> # f is still an instance of h5py.File
>>> isinstance(f, h5py.File)
True

which opens the file as ‘read-only’ by default. File restricts opening modes to ‘read-only’ (mode='r') and ‘read/write’ (mode='r+'), but maintains keyword pass-through to h5py.File.
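The mode restriction can be pictured as a small guard in front of the h5py.File constructor. This is an illustrative sketch, not the actual bapsflib implementation:

```python
# Illustrative guard mirroring the mode restriction described above:
# only 'r' (read-only) and 'r+' (read/write) are accepted before the
# remaining keywords would be passed through to h5py.File.
ALLOWED_MODES = ("r", "r+")

def validate_mode(mode="r"):
    """Return mode if allowed, else raise ValueError."""
    if mode not in ALLOWED_MODES:
        raise ValueError(
            f"mode must be one of {ALLOWED_MODES}, got {mode!r}"
        )
    return mode
```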

After Opening a File

Upon opening a file, File calls the LaPDMap class (a subclass of HDFMap) to construct a mapping of the HDF5 file’s internal data structure. This mapping provides the necessary translation for the high-level data reading methods read_data(), read_controls(), and read_msi(). If an element of the HDF5 file is un-mappable – a mapping module does not exist or the mapping fails – the data can still be reached using the inherited h5py.File methods. An instance of the mapping object is bound to File as file_map:

>>> from bapsflib import lapd
>>> from bapsflib._hdf import HDFMap
>>> f = lapd.File('test.hdf5')
>>> f.file_map
<LaPDMap of HDF5 file 'test.hdf5'>
>>>
>>> # is still an instance of HDFMap
>>> isinstance(f.file_map, HDFMap)
True

For details on how the mapping works and how the mapping objects are structured, see HDF5 File Mapping (HDFMap). For details on using the file_map, see File Mapping.

The opened file object (f) provides a set of high-level methods and attributes for the user to interface with, see Table 1.

Bound methods and attributes for open HDF5 file object File

method/attribute

Description

controls

dictionary of control device mappings (quick access to f.file_map.controls)

digitizers

dictionary of digitizer [device] mappings (quick access to f.file_map.digitizers)

file_map

instance of the LaPD HDF5 file mapping (instance of LaPDMap)
(see File Mapping for details)

info

dictionary of meta-info about the HDF5 file and the experimental run

msi

dictionary of MSI diagnostic [device] mappings (quick access to f.file_map.msi)

overview

instance of LaPDOverview that allows for printing and saving of the file mapping results

read_controls()

high-level method for reading control device data contained in the HDF5 file (instance of HDFReadControls)
(see For Control Devices for details)

read_data()

high-level method for reading digitizer data and mating control device data at the time of read (instance of HDFReadData)
(see For a Digitizer for details)

read_msi()

high-level method for reading MSI diagnostic data (instance of HDFReadMSI)
(see For a MSI Diagnostic for details)

run_description()

print the LaPD experimental run description (print(f.info['run description'].splitlines()))

File Mapping

The main purpose of the file_map object is to (1) identify the control devices, digitizers, and MSI diagnostics in the HDF5 file and (2) provide the necessary translation info to allow for easy reading of data via read_controls(), read_data(), and read_msi(). For the most part, there is no reason to directly access the file_map object, since its results can easily be printed or saved using the overview attribute; see File Overview: Getting, Printing, and Saving for details. However, the mapping objects do contain useful details that are desirable in certain circumstances and can be modified for a special type of control device to augment the resulting numpy array when data is read.

The file map object file_map is an instance of LaPDMap, which subclasses HDFMap (details on HDFMap can be found at HDF5 File Mapping (HDFMap)). The LaPDMap provides a useful set of bound methods, see Table 2.

Bound methods and attributes on f.file_map.

method/attribute

Description

controls

dictionary of control device mapping objects

digitizers

dictionary of digitizer mapping objects

exp_info

dictionary of experimental info collected from various group attributes in the HDF5 file

get()

retrieve the mapping object for a specified device

is_lapd

True if it was determined that the HDF5 file was generated by the LaPD

lapd_version

version string of the LaPD DAQ Controller software used to generate the HDF5 file

main_digitizer

mapping object for the digitizer that is considered the “main digitizer”

msi

dictionary of MSI diagnostic mapping objects

run_info

dictionary of experimental run info collected from various group attributes in the HDF5 file

unknowns

list of all subgroup and dataset paths in the HDF5 root group, control device group, digitizer group, and MSI group that were unable to be mapped

File Info: Metadata You Want

Every time a HDF5 file is opened, a dictionary of metadata about the file and the experiment is bound to the file object as info.

>>> f = lapd.File('test.hdf5')
>>> f.info
{'absolute file path': '/foo/bar/test.hdf5',
 'exp description': 'this is an experiment description',
  ...
  'run status': 'Started'}

Table 3 lists and describes all the items that can be found in the info dictionary.

Items in the f.info dictionary. (info)

key

Description & Equivalence

'absolute file path'

absolute path to the HDF5 file
os.path.abspath(f.filename)

'exp description'

description of experiment
f['Raw data + config'].attrs['Experiment description']

'exp name'

name of the experiment in which the run of the HDF5 file resides
f['Raw data + config'].attrs['Experiment Name']

'exp set description'

description of experiment set
f['Raw data + config'].attrs['Experiment set description']

'exp set name'

name of experiment set the 'exp name' resides
f['Raw data + config'].attrs['Experiment set name']

'filename'

name of HDF5 file
os.path.basename(f.filename)

'investigator'

name of Investigator/PI of the experiment
f['Raw data + config'].attrs['Investigator']

'lapd version'

LaPD DAQ software version that wrote the HDF5 file
f.file_map.lapd_version

'run date'

date of experimental run
f['Raw data + config'].attrs['Status date']

'run description'

description of experimental run
f['Raw data + config'].attrs['Description']

'run name'

name of experimental data run
f['Raw data + config'].attrs['Data run']

'run status'

status of experimental run (started, completed, etc.)
f['Raw data + config'].attrs['Status']
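The two path-derived entries in the table can be reproduced with the standard library alone; the filename below is a stand-in for f.filename on an open File object:

```python
import os.path

# Reproduce the two path-derived entries of the info dictionary.
filename = "/foo/bar/test.hdf5"  # stand-in for f.filename
info = {
    "absolute file path": os.path.abspath(filename),
    "filename": os.path.basename(filename),
}
```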

File Overview: Getting, Printing, and Saving

The hdfOverview class provides a set of tools (see Table 4) to report the results of the HDF5 file mapping by HDFMap. An instance of hdfOverview is bound to File as the overview attribute and will report the current status of the mapping object.

>>> f = lapd.File('test.hdf5')
>>> f.overview
<bapsflib.lapd._hdf.hdfoverview.hdfOverview>

Thus, if any changes are made to the mapping object (file_map), which can happen for certain control devices [*], those changes will be reflected in the overview report.

The overview report is divided into three blocks:

  1. General File and Experimental Info
    • This block contains information on the file (name, path, etc.), the experiment (exp. name, investigator, etc.), and the experimental run setup (run name, description, etc.).

    • Example:

      ========================================================================
      test.hdf5 Overview
      Generated by bapsflib
      Generated date: 4/19/2018 3:35:43 PM
      ========================================================================
      
      
      Filename:     test.hdf5
      Abs. Path:    /foo/bar/test.hdf5
      LaPD version: 1.2
      Investigator: Everson
      Run Date:     8/14/2017 9:49:53 PM
      
      Exp. and Run Structure:
        (set)  LaB6_Cathode
        (exp)  +-- HighBetaSAWAug17
        (run)  |   +-- test
      
      Run Description:
          some description of the experimental run
      
      Exp. Description:
          some description of the experiment as a whole
      
  2. Discovery Report
    • This block gives a brief report on what devices the HDFMap class discovered in the file.

    • There are no details about each discovered device, just what was discovered.

    • Example:

      Discovery Report
      ----------------
      
      MSI/                                                   found
      +-- diagnostics (5)
      |   +-- Discharge
      |   +-- Gas pressure
      |   +-- Heater
      |   +-- Interferometer array
      |   +-- Magnetic field
      Raw data + config/                                     found
      +-- Data run sequence                                  not mapped
      +-- digitizers (1)
      |   +-- SIS crate (main)
      +-- control devices (1)
      |   +-- Waveform
      Unknowns (2)                                           aka unmapped
      +-- /Raw data + config/Data run sequence
      +-- /Raw data + config/N5700_PS
      
  3. Detailed Report
    • This block reports details on the mapping results for each discovered device (MSI diagnostics, control devices, and digitizers).

    • Basically reports the constructed configs dictionary of each device’s mapping object.

    • Example:

      Detailed Reports
      -----------------
      
      
      Digitizer Report
      ^^^^^^^^^^^^^^^^
      
      SIS crate (main)
      +-- adc's:  ['SIS 3302', 'SIS 3305']
      +-- Configurations Detected (1)                               (1 active, 0 inactive)
      |   +-- sis0-10ch                                             active
      |   |   +-- adc's (active):  ['SIS 3302']
      |   |   +-- path: /Raw data + config/SIS crate/sis0-10ch
      |   |   +-- SIS 3302 adc connections
      |   |   |   |   +-- (brd, [ch, ...])               bit  clock rate   nshotnum  nt        shot ave.  sample ave.
      |   |   |   |   +-- (1, [3, 4, 5, 6, 7, 8])        16   100.0 MHz    6160      12288     None       8
      |   |   |   |   +-- (2, [1, 2, 3, 4])              16   100.0 MHz    6160      12288     None       8
      
      Control Device Report
      ^^^^^^^^^^^^^^^^^^^^^
      
      Waveform
      +-- path:     /Raw data + config/Waveform
      +-- contype:  waveform
      +-- Configurations Detected (1)
      |   +-- waveform_50to150kHz_df10kHz_nf11
      |   |   +-- {...}
      
      MSI Diagnostic Report
      ^^^^^^^^^^^^^^^^^^^^^
      
      Discharge
      +-- path:  /MSI/Discharge
      +-- configs
      |   +--- {...}
      Gas pressure
      +-- path:  /MSI/Gas pressure
      +-- configs
      |   +-- {...}
      Heater
      +-- path:  /MSI/Heater
      +-- configs
      |   +-- {...}
      Interferometer array
      +-- path:  /MSI/Interferometer array
      +-- configs
      |   +-- {...}
      Magnetic field
      +-- path:  /MSI/Magnetic field
      +-- configs
      |   +-- {...}
      

The methods provided by hdfOverview (see Table 4) allow for printing and saving of the complete overview, as well as printing the individual blocks or sections of the blocks.

Methods provided by hdfOverview for reporting a HDF5 file overview

Method

Description and Call

print()

Print to screen the entire overview.

>>> f.overview.print()

save()

Save the report to a file given by filename.

>>> f.overview.save(filename)

If filename=True, then a text file is created with the same name as the HDF5 file in the same location.

>>> f.overview.save(True)

report_general()

Print the general info block.

>>> f.overview.report_general()

report_discovery()

Print the discovery report block.

>>> f.overview.report_discovery()

report_details()

Print the detail report block.

>>> f.overview.report_details()

report_controls()

Print the detail report block for all control devices.

>>> f.overview.report_controls()

Print the detail report block for a specific control device (e.g. Waveform).

>>> f.overview.report_controls(name='Waveform')

report_digitizers()

Print the detail report block for all digitizers.

>>> f.overview.report_digitizers()

Print the detail report block for a specific digitizer (e.g. SIS 3301).

>>> f.overview.report_digitizers(name='SIS 3301')

report_msi()

Print the detail report block for all MSI diagnostics.

>>> f.overview.report_msi()

Print the detail report block for a specific MSI diagnostic (e.g. Discharge).

>>> f.overview.report_msi(name='Discharge')

Reading Data from a HDF5 File

Three classes (HDFReadData, HDFReadControls, and HDFReadMSI) are provided to read data from digitizers, control devices, and MSI diagnostics, respectively. Each read class is bound to File (see Table 5) and returns a structured numpy array with the requested data.

Read classes/methods for extracting data from a HDF5 file

Read Class

Bound Method on File

What it does

HDFReadData

read_data()

Designed to extract digitizer data from a HDF5 file with the option of mating control device data at the time of extraction. (see reading For a Digitizer)

HDFReadControls

read_controls()

Designed to extract control device data. (see reading For Control Devices)

HDFReadMSI

read_msi()

Designed to extract MSI diagnostic data. (see reading For a MSI Diagnostic)

For a Digitizer

Digitizer data is read using the read_data() method on File. The method also has the option of mating control device data at the time of read (see Adding Control Device Data) [1].

At a minimum, the read_data() method only needs a board number and channel number to extract data [2], but there are several additional keyword options:

Optional keywords for read_data()

Keyword

Default

Description

index

slice(None)

row index of the HDF5 dataset (see Extracting a sub-set)

shotnum

slice(None)

global HDF5 file shot number (see Extracting a sub-set)

digitizer

None

name of the digitizer to which board and channel belong

adc

None

name of the digitizer’s analog-digital converter (adc) to which board and channel belong

config_name

None

name of the digitizer configuration

keep_bits

False

Set True to return the digitizer data in bit values. By default the digitizer data is converted to voltage.

add_controls

None

list of control devices whose data will be matched and added to the requested digitizer data

intersection_set

True

Ensures that the returned data array only contains shot numbers common to shotnum, the digitizer dataset, and all control device datasets.

silent

False

set True to suppress command-line printouts of soft warnings

These keywords are explained in more detail in the following subsections.

If the test.hdf5 file has only one digitizer with one active adc and one configuration, then the entire dataset collected from the signal attached to board = 1 and channel = 0 can be extracted as follows:

>>> from bapsflib import lapd
>>> f = lapd.File('test.hdf5')
>>> board, channel = 1, 0
>>> data = f.read_data(board, channel)

where data is an instance of HDFReadData. The HDFReadData class acts as a wrapper on numpy.recarray. Thus, data behaves just like a numpy.recarray object, but will have additional methods and attributes that describe the data’s origin and parameters (e.g. info, dt, dv, etc.).

By default, data is a structured numpy array with the following dtype:

>>> data.dtype
dtype([('shotnum', '<u4'),
       ('signal', '<f4', (12288,)),
       ('xyz', '<f4', (3,))])

where 'shotnum' contains the HDF5 shot number, 'signal' contains the signal recorded by the digitizer, and 'xyz' is a 3-element array containing the probe position. In this example, the digitized signal is automatically converted into voltage before being added to the array, and 12288 is the size of the signal’s time-array. To keep the digitizer 'signal' in bit values, set keep_bits=True when calling read_data(). The field 'xyz' is initialized with numpy.nan values, but will be populated if a control device of contype = 'motion' is added (see Adding Control Device Data).
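The bit-to-voltage conversion implied by keep_bits=False can be sketched generically. The bit depth and voltage range below are illustrative defaults, not the calibration bapsflib reads from the HDF5 file:

```python
# Generic sketch of converting unsigned ADC counts to voltage, as
# implied by keep_bits=False.  nbits, vmin, and vmax are illustrative
# placeholders, not values read from a real digitizer configuration.
def bits_to_voltage(bits, nbits=16, vmin=-2.5, vmax=2.5):
    """Linearly map a count in [0, 2**nbits - 1] onto [vmin, vmax]."""
    return vmin + bits * (vmax - vmin) / (2 ** nbits - 1)
```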


For details on handling and manipulating data see handle_data.

Note

Since bapsflib.lapd leverages the h5py package, the data in test.hdf5 resides on disk until one of the read methods, read_data(), read_msi(), or read_controls(), is called. When one of these methods is called, the requested data is brought into memory as a numpy.ndarray and a view onto that ndarray is returned to the user.


Extracting a sub-set

There are three keywords for sub-setting a dataset: index, shotnum, and intersection_set. index and shotnum are indexing keywords, whereas intersection_set controls the sub-setting behavior between the indexing keywords and the dataset(s).

index refers to the row index of the requested dataset and shotnum refers to the global HDF5 shot number. Either indexing keyword can be used, but index overrides shotnum. Both index and shotnum can be of type int, list(int), or slice(). Sub-setting with index looks like:

>>> # read dataset row 10
>>> data = f.read_data(board, channel, index=9)
>>> data['shotnum']
array([10], dtype=uint32)

>>> # read dataset rows 10, 20, and 30
>>> data = f.read_data(board, channel, index=[9, 19, 29])

>>> # read dataset rows 10 to 19
>>> data = f.read_data(board, channel, index=slice(9, 19))

>>> # read every third row in the dataset from row 10 to 19
>>> data = f.read_data(board, channel, index=slice(9, 19, 3))
>>> data['shotnum']
array([10, 13, 16, 19], dtype=uint32)

Sub-setting with shotnum looks like:

>>> # read dataset shot number 10
>>> data = f.read_data(board, channel, shotnum=10)
>>> data['shotnum']
array([10], dtype=uint32)

>>> # read dataset shot numbers 10, 20, and 30
>>> data = f.read_data(board, channel, shotnum=[10, 20, 30])

>>> # read dataset shot numbers 10 to 19
>>> data = f.read_data(board, channel, shotnum=slice(10, 20))

>>> # read every 5th dataset shot number from 10 to 19
>>> data = f.read_data(board, channel, shotnum=slice(10, 20, 5))
>>> data['shotnum']
array([10, 15], dtype=uint32)
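The precedence between the two indexing keywords can be sketched with plain lists; the shot-number column below is made up:

```python
# Sketch of the indexing precedence: index addresses dataset rows,
# shotnum addresses recorded shot numbers, and index wins when both
# are given.  The shot numbers below are hypothetical.
shotnums = [10, 13, 16, 19, 22]  # hypothetical 'shotnum' column

def select(index=None, shotnum=None):
    """Return the shot numbers a read call would target."""
    if index is not None:  # index overrides shotnum
        rows = index if isinstance(index, list) else [index]
        return [shotnums[i] for i in rows]
    wanted = shotnum if isinstance(shotnum, list) else [shotnum]
    return [sn for sn in shotnums if sn in wanted]
```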

intersection_set modifies what shot numbers are returned by read_data(). By default intersection_set=True, which forces the returned data to correspond only to shot numbers that exist in the digitizer dataset, in any specified control device datasets, and among the shot numbers represented by index or shotnum. Setting it to False will return all shot numbers >= 1 associated with index or shotnum; array entries not associated with a dataset are filled with a “NaN” value (numpy.nan for floats, -99999 for integers, and '' for strings).
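A minimal sketch of this intersection logic, with made-up shot-number lists:

```python
# Sketch of the intersection_set logic: with True, keep only shot
# numbers common to the request and every dataset; with False, keep
# all requested shot numbers >= 1 (missing entries would then be
# filled with -99999 / numpy.nan / '').
def select_shotnums(requested, datasets, intersection_set=True):
    """Return the shot numbers a read would place in the output."""
    if intersection_set:
        keep = set(requested)
        for sns in datasets:
            keep &= set(sns)
        return sorted(keep)
    return sorted(sn for sn in set(requested) if sn >= 1)
```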

Specifying digitizer, adc, and config_name

It is possible for a LaPD generated HDF5 file to contain multiple digitizers, each of which can have multiple analog-digital-converters (adc) and multiple configuration settings. For such a case, read_data() has the keywords digitizer, adc, and config_name to direct the data extraction accordingly.

If digitizer is not specified, then it is assumed that the desired digitizer is the one defined in main_digitizer. Suppose the test.hdf5 has two digitizers, 'SIS 3301' and 'SIS crate'. In this case 'SIS 3301' would be assumed as the main_digitizer. To extract data from 'SIS crate' one would use the digitizer keyword as follows:

>>> data = f.read_data(board, channel, digitizer='SIS crate')
>>> data.info['digitizer']
'SIS crate'

Digitizer 'SIS crate' can have multiple active adc’s, 'SIS 3302' and 'SIS 3305'. By default, if only one adc is active then that adc is assumed; however, if multiple adc’s are active, then the adc with the slower clock rate is assumed. 'SIS 3302' has the slower clock rate in this case. To extract data from 'SIS 3305' one would use the adc keyword as follows:

>>> data = f.read_data(board, channel, digitizer='SIS crate',
>>>                    adc='SIS 3305')
>>> data.info['adc']
'SIS 3305'
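The default adc selection described above amounts to picking the slowest clock rate among the active adc's; a sketch with illustrative rates:

```python
# Sketch of the default adc choice: one active adc is used directly;
# among several, the slowest clock rate wins.  Rates are illustrative.
def default_adc(active):
    """active maps adc name -> clock rate (Hz); return the default."""
    return min(active, key=active.get)
```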

A digitizer can have multiple configurations, but typically only one configuration is ever active in the HDF5 file. In the case that multiple configurations are active, there is no overriding hierarchy for assuming one configuration over another. Suppose digitizer 'SIS crate' has two configurations, 'config_01' and 'config_02'. In this case, one of the configurations has to be specified at the time of extraction. To extract data from 'SIS crate' under the configuration 'config_02' one would use the config_name keyword as follows:

>>> f.file_map.digitizers['SIS crate'].active_configs
['config_01', 'config_02']
>>> data = f.read_data(board, channel, digitizer='SIS crate',
>>>                    config_name='config_02')
>>> data.info['configuration name']
'config_02'
Adding Control Device Data

Adding control device data to a digitizer dataset is done with the keyword add_controls. Specifying add_controls will trigger a call to the HDFReadControls class and extract the desired control device data. HDFReadData then compares and mates that control device data with the digitizer data according to the global HDF5 shot number.

add_controls must be a list of strings and/or 2-element tuples specifying the desired control device data to be added to the digitizer data. If a control device only controls one configuration, then it is sufficient to only name that device. For example, if a '6K Compumotor' is only controlling one probe, then the data extraction call would look like:

>>> list(f.file_map.controls['6K Compumotor'].configs)
[3]
>>> data = f.read_data(board, channel,
>>>                    add_controls=['6K Compumotor'])
>>> data.info['added controls']
[('6K Compumotor', 3)]

In the case the '6K Compumotor' has multiple configurations (controlling multiple probes), the add_controls call must also provide the configuration name to direct the extraction. This is done with a 2-element tuple entry for add_controls, where the first element is the control device name and the second element is the configuration name. For the '6K Compumotor' the configuration name is the receptacle number of the probe drive [3]. Suppose the '6K Compumotor' is utilizing three probe drives with the receptacles 2, 3, and 4. To mate control device data from receptacle 3, the call would look something like:

>>> list(f.file_map.controls['6K Compumotor'].configs)
[2, 3, 4]
>>> control = [('6K Compumotor', 3)]
>>> data = f.read_data(board, channel, add_controls=control)
>>> data.info['added controls']
[('6K Compumotor', 3)]

Multiple control device datasets can be added at once, but only one control device of each control type ('motion', 'power', and 'waveform') can be added. Adding '6K Compumotor' data from receptacle 3 and 'Waveform' data would look like:

>>> list(f.file_map.controls['Waveform'].configs)
['config01']
>>> f.file_map.controls['Waveform'].contype
'waveform'
>>> f.file_map.controls['6K Compumotor'].contype
'motion'
>>> data = f.read_data(board, channel,
>>>                    add_controls=[('6K Compumotor', 3),
>>>                                  'Waveform'])
>>> data.info['added controls']
[('6K Compumotor', 3), ('Waveform', 'config01')]

Since '6K Compumotor' is a 'motion' control type it fills out the 'xyz' field in the returned numpy structured array; whereas, 'Waveform' will add field names to the numpy structured array according to the fields specified in its mapping constructor. See For Control Devices for details on these added fields.
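The one-device-per-contype rule can be sketched as a simple validation step; the device-to-contype table below is illustrative, not read from a real file map:

```python
# Sketch of the one-device-per-contype rule: an add_controls request
# naming two devices of the same control type is rejected.  The
# device-to-contype table here is illustrative.
CONTYPES = {
    "6K Compumotor": "motion",
    "NI_XZ": "motion",
    "Waveform": "waveform",
}

def check_controls(names):
    """Map contype -> device name, raising on a duplicated contype."""
    seen = {}
    for name in names:
        contype = CONTYPES[name]
        if contype in seen:
            raise ValueError(
                f"two {contype!r} devices: {seen[contype]} and {name}"
            )
        seen[contype] = name
    return seen
```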

For Control Devices

Note

To be written

For a MSI Diagnostic

MSI diagnostic data is read using the read_msi() method on File. Only the MSI diagnostic name needs to be supplied to read the associated data:

>>> from bapsflib import lapd
>>>
>>> # open file
>>> f = lapd.File('test.hdf5')
>>>
>>> # list mapped MSI diagnostics
>>> f.list_msi
['Discharge',
 'Gas pressure',
 'Heater',
 'Interferometer array',
 'Magnetic field']
>>>
>>> # read 'Discharge' data
>>> mdata = f.read_msi('Discharge')

The returned data mdata is a structured numpy array whose field structure and population are determined by the MSI diagnostic mapping object. Every mdata will have the fields 'shotnum' and 'meta'. 'shotnum' represents the HDF5 shot number. 'meta' is a structured array with fields representing quantities (metadata) that are both diagnostic and shot number specific, but are not considered “primary” data arrays. Any other field in mdata is considered to be a “primary” data array. Continuing with the above example:

>>> # display mdata dtype
>>> mdata.dtype
dtype([('shotnum', '<i4'),
       ('voltage', '<f4', (2048,)),
       ('current', '<f4', (2048,)),
       ('meta', [('timestamp', '<f8'),
                 ('data valid', 'i1'),
                 ('pulse length', '<f4'),
                 ('peak current', '<f4'),
                 ('bank voltage', '<f4')])])
>>>
>>> # display shot numbers
>>> mdata['shotnum']
array([    0, 19251], dtype=int32)

Here, the fields 'voltage' and 'current' correspond to “primary” data arrays. To display the first three samples of the 'voltage' array for shot number 19251 do:

>>> mdata['voltage'][1][0:3]
array([-44.631958, -44.708252, -44.631958], dtype=float32)

The metadata field 'meta' has five quantities in it: 'timestamp', 'data valid', 'pulse length', 'peak current', and 'bank voltage'. These metadata fields will vary depending on the requested MSI diagnostic. To view the 'bank voltage' for shot number 0 do:

>>> mdata['meta']['bank voltage'][0]
6127.1323

The data array mdata is also constructed with an info attribute that contains metadata that is diagnostic specific but not shot number specific.

>>> mdata.info
{'current conversion factor': [0.0],
 'diagnostic name': 'Discharge',
 'diagnostic path': '/MSI/Discharge',
 'dt': [4.88e-05],
 'hdf file': 'test.hdf5',
 't0': [-0.0249856],
 'voltage conversion factor': [0.0]}
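A likely use of these entries is building a time axis for the 'voltage' and 'current' arrays. The sketch below assumes 'dt' is the sample spacing and 't0' the time of the first sample, both in seconds; these semantics are inferred from the key names, not stated by the package:

```python
import numpy as np

# Values copied from the info example above; 'dt' is assumed to be the
# sample spacing and 't0' the time of the first sample (both seconds).
info = {"dt": [4.88e-05], "t0": [-0.0249856]}
nt = 2048  # number of samples in each 'voltage'/'current' trace

# Time axis aligned with the 2048-sample traces, shape (2048,)
t = info["t0"][0] + info["dt"][0] * np.arange(nt)
```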

Every info attribute will have the keys 'hdf file', 'diagnostic name', and 'diagnostic path'; the remaining keys are MSI diagnostic dependent. For example, mdata.info for the 'Magnetic field' diagnostic has the key 'z', which corresponds to the axial locations of the magnetic field array.

>>> # get magnetic field data
>>> mdata = f.read_msi('Magnetic field')
>>> mdata.dtype
dtype([('shotnum', '<i4'),
       ('magnet ps current', '<f4', (10,)),
       ('magnetic field', '<f4', (1024,)),
       ('meta', [('timestamp', '<f8'),
                 ('data valid', 'i1'),
                 ('peak magnetic field', '<f4')])])
>>> mdata.info
{'diagnostic name': 'Magnetic field',
 'diagnostic path': '/MSI/Magnetic field',
 'hdf file': 'test.hdf5',
 'z': array([-300.     , -297.727  , -295.45395, ..., 2020.754  ,
             2023.027  , 2025.3    ], dtype=float32)}
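A common follow-up is locating the sample nearest a given axial position, assuming the 'z' array from mdata.info and the 'magnetic field' array are aligned index-for-index (as their matching lengths suggest). A self-contained sketch with synthetic values:

```python
import numpy as np

# Synthetic stand-ins for mdata.info['z'] and mdata['magnetic field'][0];
# real values come from f.read_msi('Magnetic field').
z = np.linspace(-300.0, 2025.3, 1024, dtype=np.float32)  # axial locations
field = np.full(1024, 0.05, dtype=np.float32)            # flat profile stand-in

# Index of the sample closest to a requested axial location, assuming 'z'
# and the magnetic field array are aligned index-for-index.
target = 500.0
i = int(np.argmin(np.abs(z - target)))
b_at_target = field[i]
```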

LaPD Constants

LaPD Tools

HDF5 File Mapping (HDFMap)

HDFMap constructs the mapping for a given HDF5 file. When an HDF5 file is opened with File, HDFMap is automatically invoked to construct the map, and an instance of the mapping object is bound to the file object as file_map. Thus, the file mappings for test.hdf5 can be accessed like:

>>> f = lapd.File('test.hdf5')
>>> f.file_map
<bapsflib._hdf.maps.core.HDFMap>

Architecture

HDFMap takes a modular approach to mapping an HDF5 file. It contains a dictionary of known modules with known layouts. If one or more of these layouts are discovered inside the HDF5 file, then the associated mappings are added to the mapping object. There are five module categories:

msi diagnostic

This is any sub-group of the '/MSI/' group that represents a diagnostic device. A diagnostic device is a probe or sensor that records machine state data for every experimental run.

digitizer

This is any group inside the '/Raw data + config/' group that is associated with a digitizer. A digitizer is a device that records “primary” data; that is, data recorded for a plasma probe.

control device

This is any group inside the '/Raw data + config/' group that is associated with a device that controls a plasma probe. The recorded data is state data for the plasma probe; for example, probe position, bias, driving frequency, etc.

data run sequence

This is the '/Raw data + config/Data run sequence/' group, which records the run sequence (operation sequence) of the LaPD DAQ controller.

unknown

This is any group or dataset in '/', '/MSI/', or '/Raw data + config/' groups that is not known by HDFMap or is unsuccessfully mapped.

Basic Usage

Basic Structure of file_map Mapping Object

Retrieving Active Digitizers

A list of all detected digitizers can be obtained by doing

>>> list(f.file_map.digitizers)

The file mappings for all the active digitizers are stored in the dictionary f.file_map.digitizers such that

>>> list(f.file_map.digitizers.keys())
Out: list of strings of all active digitizer names
>>> f.file_map.digitizers[digi_name]
Out: digitizer mapping object

Retrieving Active Digitizer Configuration

Retrieving Active Analog-Digital Converters (adc’s) for a Digitizer Configuration

Retrieving adc Connections and Digitization Settings

Adding Modules to the Mapping Architecture

Adding a Digitizer Mapping Module

Adding a Control Device Mapping Module

Adding an MSI Diagnostic Mapping Module

Change Log

This document lists the changes made for each release of bapsflib.

v2.0.0b3.dev14+g9353d77 (2024-05-06)

Backwards Incompatible Changes

  • Dropped support for h5py < 3.0, and set minimum h5py dependency to h5py >= 3.0. (#70)

Bug Fixes

Trivial/Internal Changes

Package Management

  • Extended GitHub Action workflow tests.yml to include testing on Python 3.9 and 3.10. (#70)

  • Set minimum dependencies astropy >= 4.3.1 and numpy >= 1.17. (#70)

  • Fixed towncrier dependency to towncrier==22.8.0. (#101)

  • Updated numpy dependency to >=1.20. (#101)

v1.0.2 (2022-07-26)

Backwards Incompatible Changes

  • Dropped support for Python 3.6. (#73)

Features

  • Updated the “SIS Crate” digitizer mapper (HDFMapDigiSISCrate) so the analog-digital converters can have enabled/active boards without enabled/active channels. (#61)

  • Updated HDFMapControl6K so the identification and mapping of probe configurations uses the combination of the receptacle number and probe name as a unique id, as opposed to just the probe name. (#63)

  • Created helper function bapsflib.utils._bytes_to_str to convert byte literals to utf-8 strings. (#64)

  • Made indexing of datasets more robust in a few locations in anticipation of allowing h5py >= 3.0, and thus accounting for h5py’s change in indexing behavior. (#65)

Documentation Improvements

Trivial/Internal Changes

  • Refactored bapsflib module imports to be consistent with the style enforced by isort. (#66)

  • Refactored bapsflib using the black==21.7b0 style formatter, and converted many string concatenations and format strings into f-strings. (#67)

  • Added .git-blame-ignore-revs to package so git blame ignores the major refactoring commits from isort PR #66 and black PR #67. (#68)

  • Converted all remaining relative imports to absolute imports. (#83)

  • Added codespell as a package “extras” dependency. (#86)

Package Management

  • Added GitHub Action check_pr_changelog.yml to check for valid change log entries on a pull request. (#58)

  • Added GitHub Action tests.yml for testing of pushes to master, version tags, pull requests, and cron jobs every Sunday at 3:13 am PT. Tests are setup to run on the latest versions of ubuntu, MacOS, and Windows. Tests are setup to run on Python 3.6, 3.7, and 3.8. Tests also run on min versions of h5py (v2.8.0) and numpy (v1.14). (#58)

  • Added isort configuration to pyproject.toml. (#66)

  • Created GitHub Action linters.yml with the isort/isort-action to check that module imports are properly styled. (#66)

  • Added black==21.7b0 to the “extras” dependencies and add its configuration to pyproject.toml. (#67)

  • Added to GitHub Action linters.yml the psf/black action to check that new code is formatted properly. (#67)

  • Deleted pep8speaks.yml and disconnected its CI from the repo since the package is adopting black and the associated GitHub Action. (#67)

  • Added a GitHub Dependabot to keep versions of GitHub Actions up to date. (#72)

  • Reworked the testing GitHub Action workflow such that: two base tests run initially; if the base tests pass, then the full matrix of tests runs; and if the full matrix passes, then the tests on minimum versions run. (#72)

  • Updated GitHub Actions for linters isort and black such that the associated package versions used are taken from bapsflib’s requirements files. (#72)

  • Set numpy dependency to >= 1.15. (#73)

  • Set package dependency coverage[toml] >= 4.5.1. The [toml] option allows for the coverage configuration to be defined in pyproject.toml, and v4.5.1 is the first release with this functionality. (#74)

  • Moved the coverage configuration from .coveragerc to pyproject.toml. (#74)

  • Removed setuptools and setuptools_scm from setup.cfg install dependencies since they are already listed as build dependencies in pyproject.toml. (#74)

  • Exposed requirements/build.txt into requirements/install.txt since setuptools and setuptools_scm are both build and install dependencies. (#74)

  • Added workflow python-publish.yml to GitHub Actions that builds and publishes a release to PyPI when using GitHub’s Releases functionality. (#84)

  • Added an import bapsflib test to the test.yml GitHub Action workflow. (#85)

  • Added a package build and install test to the test.yml GitHub Action workflow. (#85)

  • Added a codespell test to the linters.yml GitHub Action workflow. (#86)

v1.0.1 (2019-06-01)

v1.0.0 (2018-11-08)

  • Initial release

Glossary

adc

analog-digital converter

BaPSF

Basic Plasma Science Facility

LaPD

Large Plasma Device