Welcome to bapsflib’s documentation!
About bapsflib
The bapsflib
package is intended to be a toolkit for reading,
manipulating, and analyzing data collected at the Basic Plasma Science
Facility (BaPSF). The current development focus is on providing a
high-level, structured interface between the user and the HDF5 files
generated by the Large Plasma Device (LaPD). The bapsflib.lapd
module provides all the high-level methods while maintaining the
not-so-lower “lower level” functionality of the
h5py package.
As the package develops, additional data visualization and plasma analysis tools will be incorporated.
Installation
Installing from pip
The bapsflib
package is registered with
PyPI and can be installed with
pip
via
pip install bapsflib
For the most recent development version, bapsflib
can be
installed from GitHub.
Installing Directly from GitHub
To install directly from GitHub, you need to have
git
installed on your computer. If you do not have git
installed,
then see Installing from a GitHub Clone or Download.
To install directly from the master
branch invoke the following
command
pip install git+https://github.com/BaPSF/bapsflib.git#egg=bapsflib
If an alternate branch BranchName
is desired, then invoke
pip install git+https://github.com/BaPSF/bapsflib.git@BranchName#egg=bapsflib
Installing from a GitHub Clone or Download
A copy of the bapsflib
package can be obtained by
cloning
or downloading from the GitHub repository.
Cloning the repository requires an installation of git
on your
computer. To clone the master
branch, navigate to the directory where you want the clone and do
git clone https://github.com/BaPSF/bapsflib.git
To download a copy, go to the repository, select the branch to be downloaded, click the green button labeled Clone or download, select Download ZIP, save the zip file to the desired directory, and unpack.
After getting a copy of the bapsflib
package (via clone or
download), navigate to the main package directory, where the package
setup.py
file is located, and execute
pip install .
or
python setup.py install
Useful Installation Links
bapsflib repository: https://github.com/BaPSF/bapsflib
bapsflib on PyPI: https://pypi.org/project/bapsflib/
setuptools documentation: https://setuptools.readthedocs.io/en/latest/index.html
pip documentation: https://pip.pypa.io/en/stable/
git installation: https://git-scm.com/book/en/v2/Getting-Started-Installing-Git
cloning and downloading from GitHub: https://help.github.com/articles/cloning-a-repository/
Getting Started
The bapsflib
package has four key sub-packages:
- bapsflib._hdf
This package contains the generic HDF5 utilities for mapping and accessing any HDF5 file generated at the Basic Plasma Science Facility (BaPSF) at UCLA. Typically there is no reason to directly access classes in this package, since these classes are sub-classed to provide specific access to HDF5 files generated by each plasma device at BaPSF. For example, any data collected on the Large Plasma Device (LaPD) will be handled by the bapsflib.lapd package. For now, one can access data collected on the Small Plasma Device (SmPD) and Enormous Toroidal Plasma Device (ETPD) by utilizing bapsflib._hdf.File.
- bapsflib.lapd
This package contains functionality for accessing HDF5 files generated by the LaPD (bapsflib.lapd.File), LaPD parameters (bapsflib.lapd.constants), and LaPD specific tools (bapsflib.lapd.tools). Look to Using bapsflib.lapd for details about the package.
- Warning: package currently in development
This package contains plasma constants and functions.
- This package is for developers and contributors. It contains utilities used for constructing the bapsflib package.
In the future, packages for the Small Plasma Device (SmPD)
bapsflib.smpd
and the Enormous Toroidal Device (ETPD)
bapsflib.etpd
will be added, as well as some
fundamental analysis and plasma diagnostic packages.
BaPSF Background
What is HDF5?
HDF5 is a technology developed by the HDF Group that is designed to manage large and complex collections of data, allowing for advanced relationships between data and user metadata to be structured through grouping and linking mechanisms. For HDF5 support visit HDF Group’s HDF5 support site.
BaPSF HDF5 Files
Every plasma device at BaPSF is governed by a DAQ Controller. This DAQ Controller is tasked with operating and monitoring the plasma device, controlling the order of operations for an experimental run, recording data for the run, and generating the HDF5 file.
Types of Recorded Data
Data collected by BaPSF is classified into three types:
MSI diagnostic [device] data
MSI data is machine state recordings for the plasma device the experiment was run on. For example, MSI data for the LaPD would include diagnostics like partial gas pressure, discharge traces, magnetic field, etc.
Digitizer [device] data
This is “primary data” recorded by the DAQ digitizers. “Primary data” is any signal recorded from a plasma probe.
Control device data
Data recorded from a control device. A control device is a piece of equipment that controls some state property of the experiments. For example, a probe drive records position state info for a probe location and a waveform generator can record driving frequencies of an antenna.
Internal HDF5 Structure

Fig. 1: Example of a LaPD HDF5 internal data-tree.
The internal data structure (data-tree) of a HDF5 file looks very similar to a system file structure (see Fig. 1), where groups are akin to directories and datasets are akin to files. Not depicted in Fig. 1, each group and dataset can have an arbitrary number of key-value pair attributes constituting the component’s metadata.
In the above example, MSI diagnostic data is contained in the MSI
group and the Raw data + config
group houses both
Digitizer data and Control device data. In addition to the
three typical Types of Recorded Data, the Raw data + config
group contains the Data run sequence for the experimental run.
The Data run sequence is the order of operations performed by the
DAQ Controller to execute the experimental run.
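As an illustrative sketch (not part of bapsflib), the group/dataset layout described above can be modeled as nested dictionaries and walked the way HDF5 paths are composed; the group names are taken from the text, and the tree contents are simplified stand-ins:

```python
# Illustrative sketch of the LaPD HDF5 data-tree described above,
# modeled as nested dicts (groups) holding lists (datasets).
# Real files contain many more groups, datasets, and attributes.
lapd_tree = {
    "MSI": {                                # MSI diagnostic data
        "Discharge": ["dataset"],
        "Gas pressure": ["dataset"],
        "Magnetic field": ["dataset"],
    },
    "Raw data + config": {                  # digitizer + control device data
        "Data run sequence": ["dataset"],   # DAQ order of operations
        "SIS crate": ["dataset"],           # a digitizer group
        "Waveform": ["dataset"],            # a control device group
    },
}

def walk(tree, path=""):
    """Yield every group/dataset path, mimicking an HDF5 data-tree."""
    for name, node in tree.items():
        full = f"{path}/{name}"
        yield full
        if isinstance(node, dict):
            yield from walk(node, full)

paths = list(walk(lapd_tree))
```

Walking the mock tree yields HDF5-style paths such as /MSI/Discharge, mirroring how groups behave like directories and datasets like files.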
Using bapsflib.lapd
The bapsflib.lapd package is a one-stop shop for everything specifically
related to handling data collected on the LaPD. The package provides:
HDF5 file access via
bapsflib.lapd.File
LaPD machine specs and parameters in
bapsflib.lapd.constants
LaPD specific tools (e.g. port number to LaPD \(z\) conversion,
bapsflib.lapd.tools.portnum_to_z()) in bapsflib.lapd.tools.
Accessing HDF5 Files
Opening a File
Opening a HDF5 file is done using the
bapsflib.lapd.File
class. File
subclasses h5py.File
, so group and dataset manipulation
is handled by the inherited methods; whereas, the new methods (see
Table 1) are focused on mapping the data structure and
providing a high-level access to the experimental data recorded by the
LaPD DAQ system.
Note
File is a wrapper on h5py.File and, thus, HDF5 file manipulation is
handled by the inherited methods of h5py.File. File adds methods and
attributes specifically for manipulating data and metadata written to
the file from the Large Plasma Device (LaPD) DAQ system, see
Table 1.
To open a LaPD generated HDF5 file do
>>> from bapsflib import lapd
>>> f = lapd.File('test.hdf5')
>>> f
<HDF5 file "test.hdf5" (mode r)>
>>>
>>> # f is still an instance of h5py.File
>>> import h5py
>>> isinstance(f, h5py.File)
True
which opens the file as ‘read-only’ by default.
File
restricts opening modes to ‘read-only’
(mode='r'
) and ‘read/write’ (mode='r+'
), but maintains
keyword pass-through to h5py.File
.
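The mode restriction described above can be sketched with a small stand-in class. This is hypothetical, not bapsflib's actual implementation; it only mimics the mode check and the keyword pass-through, without touching h5py or a real file:

```python
# Hypothetical sketch of restricting opening modes to 'r' and 'r+',
# as bapsflib.lapd.File does.  RestrictedFile is a stand-in class
# for illustration only.
class RestrictedFile:
    ALLOWED_MODES = ("r", "r+")

    def __init__(self, name, mode="r", **kwargs):
        if mode not in self.ALLOWED_MODES:
            raise ValueError(
                f"mode '{mode}' not allowed; use one of {self.ALLOWED_MODES}"
            )
        self.name = name
        self.mode = mode
        self.kwargs = kwargs    # pass-through keywords, like h5py.File

f = RestrictedFile("test.hdf5")    # defaults to read-only ('r')
```

Requesting a disallowed mode (e.g. mode='w') raises a ValueError, which is the behavior the restriction implies.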
After Opening a File
Upon opening a file, File
calls on the
LaPDMap
class
(a subclass of HDFMap
) to construct
a mapping of the HDF5 file’s internal data structure. This mapping
provides the necessary translation for the high-level data reading
methods, read_data()
, read_controls()
, and
read_msi()
. If an element of the HDF5 file
is un-mappable – a mapping module does not exist or the mapping
fails – the data can still be reached using the not-so-lower
inherited methods of h5py.File
. An instance of the mapping
object is bound to File
as
file_map
>>> from bapsflib import lapd
>>> from bapsflib._hdf import HDFMap
>>> f = lapd.File('test.hdf5')
>>> f.file_map
<LaPDMap of HDF5 file 'test.hdf5'>
>>>
>>> # is still an instance of HDFMap
>>> isinstance(f.file_map, HDFMap)
True
For details on how the mapping works and how the mapping objects are
structured, see HDF5 File Mapping (HDFMap). For details on using the
file_map, see File Mapping.
The opened file object (f
) provides a set of high-level methods and
attributes for the user to interface with, see Table 1.
method/attribute | Description
---|---
controls | dictionary of control device mappings (quick access to file_map.controls)
digitizers | dictionary of digitizer [device] mappings (quick access to file_map.digitizers)
file_map | instance of the LaPD HDF5 file mapping (instance of LaPDMap) (see File Mapping for details)
info | dictionary of meta-info about the HDF5 file and the experimental run (see File Info: Metadata You Want for details)
msi | dictionary of MSI diagnostic [device] mappings (quick access to file_map.msi)
overview | instance of LaPDOverview, which allows for printing and saving of the file mapping results (see File Overview: Getting, Printing, and Saving for details)
read_controls() | high-level method for reading control device data contained in the HDF5 file (instance of HDFReadControls) (see For Control Devices for details)
read_data() | high-level method for reading digitizer data and mating control device data at the time of read (instance of HDFReadData) (see For a Digitizer for details)
read_msi() | high-level method for reading MSI diagnostic data (instance of HDFReadMSI) (see For a MSI Diagnostic for details)
run_description() | printout of the LaPD experimental run description
File Mapping
The main purpose of the file_map
object is to (1) identify
the control devices, digitizers, and MSI diagnostics in the HDF5 file and
(2) provide the necessary translation info to allow for easy reading of
data via read_controls()
, read_data()
, and
read_msi()
. For the most part, there is no reason to
directly access the file_map
object since its results can
easily be printed or saved using the overview
attribute,
see File Overview: Getting, Printing, and Saving for details. However, the mapping objects do
contain useful details that are desirable in certain circumstances and
can be modified for a special type of control device to augment the
resulting numpy array when data is read.
The file map object file_map
is an instance of
LaPDMap
, which subclasses
HDFMap
(details on
HDFMap
can be found at HDF5 File Mapping (HDFMap)).
The LaPDMap
provides a useful set of bound
methods, see Table 2.
method/attribute | Description
---|---
controls | dictionary of control device mapping objects
digitizers | dictionary of digitizer mapping objects
exp_info | dictionary of experimental info collected from various group attributes in the HDF5 file
get() | retrieve the mapping object for a specified device
hdf_version | version string of the LaPD DAQ Controller software used to generate the HDF5 file
main_digitizer | mapping object for the digitizer that is considered the “main digitizer”
msi | dictionary of MSI diagnostic mapping objects
run_info | dictionary of experimental run info collected from various group attributes in the HDF5 file
unknowns | list of all subgroup and dataset paths in the HDF5 root group, control device group, digitizer group, and MSI group that were unable to be mapped
File Info: Metadata You Want
Every time a HDF5 file is opened, a dictionary of metadata about the file
and the experiment is bound to the file object as
info
.
>>> f = lapd.File('test.hdf5')
>>> f.info
{'absolute file path': '/foo/bar/test.hdf5',
'exp description': 'this is an experiment description',
...
'run status': 'Started'}
Table 3 lists and describes all the items that can be found in the info dictionary.
key | Description & Equivalence
---|---
'absolute file path' | absolute path to the HDF5 file (os.path.abspath(f.filename))
'exp description' | description of experiment (f['Raw data + config'].attrs['Experiment description'])
'exp name' | name of the experiment in which the run of the HDF5 file resides (f['Raw data + config'].attrs['Experiment Name'])
'exp set description' | description of experiment set (f['Raw data + config'].attrs['Experiment set description'])
'exp set name' | name of experiment set in which the 'exp name' resides (f['Raw data + config'].attrs['Experiment set name'])
'filename' | name of HDF5 file (os.path.basename(f.filename))
'investigator' | name of Investigator/PI of the experiment (f['Raw data + config'].attrs['Investigator'])
'lapd version' | LaPD DAQ software version that wrote the HDF5 file (f.file_map.hdf_version)
'run date' | date of experimental run (f['Raw data + config'].attrs['Status date'])
'run description' | description of experimental run (f['Raw data + config'].attrs['Description'])
'run name' | name of experimental data run (f['Raw data + config'].attrs['Data run'])
'run status' | status of experimental run (started, completed, etc.) (f['Raw data + config'].attrs['Status'])
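The os.path equivalences noted in Table 3 for 'absolute file path' and 'filename' can be checked without opening an HDF5 file; this snippet only exercises the standard library, and the path is made up:

```python
import os

# The 'absolute file path' and 'filename' info entries mirror
# os.path operations on the file's path, per Table 3.
path = os.path.join("/foo", "bar", "test.hdf5")   # illustrative path

info = {
    "absolute file path": os.path.abspath(path),
    "filename": os.path.basename(path),
}
```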
File Overview: Getting, Printing, and Saving
The hdfOverview
class provides a
set of tools (see Table 4) to report the results of
the HDF5 file mapping by HDFMap
.
An instance of hdfOverview
is
bound to File
as the
overview
attribute and will report
the current status of the mapping object.
>>> f = lapd.File('test.hdf5')
>>> f.overview
<bapsflib.lapd._hdf.hdfoverview.hdfOverview>
Thus, if any changes
are made to the mapping object
(file_map
), which could happen for
certain control devices [*], then those changes will be reflected in the
overview report.
The overview report is divided into three blocks:
- General File and Experimental Info
This block contains information on the file (name, path, etc.), the experiment (exp. name, investigator, etc.), and the experimental run setup (run name, description, etc.).
Example:
========================================================================
                          test.hdf5 Overview
                        Generated by bapsflib
                Generated date: 4/19/2018 3:35:43 PM
========================================================================


Filename:     test.hdf5
Abs. Path:    /foo/bar/test.hdf5
LaPD version: 1.2
Investigator: Everson
Run Date:     8/14/2017 9:49:53 PM

Exp. and Run Structure:
  (set)  LaB6_Cathode
  (exp)  +-- HighBetaSAWAug17
  (run)  |   +-- test

Run Description: some description of the experimental run
Exp. Description: some description of the experiment as a whole
- Discovery Report
This block gives a brief report on what devices the
HDFMap
class discovered in the file. There are no details about each discovered device, just what was discovered.
Example:
Discovery Report
----------------

MSI/                                          found
+-- diagnostics (5)
|   +-- Discharge
|   +-- Gas pressure
|   +-- Heater
|   +-- Interferometer array
|   +-- Magnetic field
Raw data + config/                            found
+-- Data run sequence                         not mapped
+-- digitizers (1)
|   +-- SIS crate (main)
+-- control devices (1)
|   +-- Waveform
Unknowns (2)                                  aka unmapped
+-- /Raw data + config/Data run sequence
+-- /Raw data + config/N5700_PS
- Detailed Report
This block reports details on the mapping results for each discovered device (MSI diagnostics, control devices, and digitizers).
Basically reports the constructed
configs
dictionary of each device's mapping object. Example:
Detailed Reports
-----------------

Digitizer Report
^^^^^^^^^^^^^^^^

SIS crate (main)
+-- adc's:  ['SIS 3302', 'SIS 3305']
+-- Configurations Detected (1)            (1 active, 0 inactive)
|   +-- sis0-10ch                          active
|   |   +-- adc's (active):  ['SIS 3302']
|   |   +-- path: /Raw data + config/SIS crate/sis0-10ch
|   |   +-- SIS 3302 adc connections
|   |   |   +-- (brd, [ch, ...])           bit  clock rate  nshotnum  nt     shot ave.  sample ave.
|   |   |   +-- (1, [3, 4, 5, 6, 7, 8])    16   100.0 MHz   6160      12288  None       8
|   |   |   +-- (2, [1, 2, 3, 4])          16   100.0 MHz   6160      12288  None       8

Control Device Report
^^^^^^^^^^^^^^^^^^^^^

Waveform
+-- path:     /Raw data + config/Waveform
+-- contype:  waveform
+-- Configurations Detected (1)
|   +-- waveform_50to150kHz_df10kHz_nf11
|   |   +-- {...}

MSI Diagnostic Report
^^^^^^^^^^^^^^^^^^^^^

Discharge
+-- path:    /MSI/Discharge
+-- configs
|   +-- {...}
Gas pressure
+-- path:    /MSI/Gas pressure
+-- configs
|   +-- {...}
Heater
+-- path:    /MSI/Heater
+-- configs
|   +-- {...}
Interferometer array
+-- path:    /MSI/Interferometer array
+-- configs
|   +-- {...}
Magnetic field
+-- path:    /MSI/Magnetic field
+-- configs
|   +-- {...}
The methods provided by
hdfOverview
(see
Table 4) allow for printing and saving of the
complete overview, as well as, printing the individual blocks or
sections of the blocks.
Method | Description and Call
---|---
print() | Print the entire overview to screen. >>> f.overview.print()
save() | Save the report to a file given by filename. >>> f.overview.save(filename) If filename=True, the report is saved to a file with the same name as the HDF5 file. >>> f.overview.save(True)
report_general() | Print the general info block. >>> f.overview.report_general()
report_discovery() | Print the discovery report block. >>> f.overview.report_discovery()
report_details() | Print the detailed report block. >>> f.overview.report_details()
report_controls() | Print the detailed report block for all control devices. >>> f.overview.report_controls() Print the block for a specific control device (e.g. Waveform). >>> f.overview.report_controls(name='Waveform')
report_digitizers() | Print the detailed report block for all digitizers. >>> f.overview.report_digitizers() Print the block for a specific digitizer (e.g. SIS 3301). >>> f.overview.report_digitizers(name='SIS 3301')
report_msi() | Print the detailed report block for all MSI diagnostics. >>> f.overview.report_msi() Print the block for a specific MSI diagnostic (e.g. Discharge). >>> f.overview.report_msi(name='Discharge')
Reading Data from a HDF5 File
Three classes HDFReadData
,
HDFReadControls
, and
HDFReadMSI
are given to read
data for digitizers, control devices, and MSI diagnostics,
respectively. Each of these read classes is bound to
File
, see Table 5,
and will return a structured numpy
array with the requested data.
Read Class | Bound Method on File | What it does
---|---|---
HDFReadData | read_data() | Designed to extract digitizer data from a HDF5 file with the option of mating control device data at the time of extraction. (see reading For a Digitizer)
HDFReadControls | read_controls() | Designed to extract control device data. (see reading For Control Devices)
HDFReadMSI | read_msi() | Designed to extract MSI diagnostic data. (see reading For a MSI Diagnostic)
For a Digitizer
Digitizer data is read using the
read_data()
method on
File
. The method also has the option
of mating control device data at the time of declaration (see section
Adding Control Device Data) [1].
At a minimum the read_data()
method
only needs a board number and channel number to extract data [2], but
there are several additional keyword options:
Keyword | Default | Description
---|---|---
index | slice(None) | row index of the HDF5 dataset (see Extracting a sub-set)
shotnum | slice(None) | global HDF5 file shot number (see Extracting a sub-set)
digitizer | None | name of the digitizer to which board and channel belong
adc | None | name of the digitizer's analog-digital-converter (adc) to which board and channel belong
config_name | None | name of the digitizer configuration
keep_bits | False | set True to keep the digitizer data in bit values (by default the data is converted to voltage)
add_controls | None | list of control devices whose data will be matched and added to the requested digitizer data
intersection_set | True | ensures that the returned data array only contains shot numbers that are inclusive in shotnum, the digitizer dataset, and all control device datasets (see Extracting a sub-set)
silent | False | set True to suppress command-line warning printouts
These keywords are explained in more detail in the following subsections.
If the test.hdf5
file has only one digitizer with one active
adc and one configuration, then the entire dataset collected from the
signal attached to board = 1
and channel = 0
can be
extracted as follows:
>>> from bapsflib import lapd
>>> f = lapd.File('test.hdf5')
>>> board, channel = 1, 0
>>> data = f.read_data(board, channel)
where data
is an instance of
HDFReadData
. The
HDFReadData
class acts as a
wrapper on numpy.recarray
. Thus, data
behaves just like
a numpy.recarray
object, but will have additional methods and
attributes that describe the data’s origin and parameters (e.g.
info
,
dt
,
dv
, etc.).
By default, data
is a structured numpy
array with the
following dtype
:
>>> data.dtype
dtype([('shotnum', '<u4'),
('signal', '<f4', (12288,)),
('xyz', '<f4', (3,))])
where 'shotnum'
contains the HDF5 shot number, 'signal'
contains the signal recorded by the digitizer, and 'xyz'
is a
3-element array containing the probe position. In this example,
the digitized signal is automatically converted into voltage before
being added to the array and 12288
is the size of the signal’s
time-array. To keep the digitizer 'signal' in bit values, then
set keep_bits=True
at execution of
read_data()
. The field 'xyz'
is initialized with numpy.nan
values, but will be populated if
a control device of contype = 'motion'
is added (see
Adding Control Device Data).
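The default dtype above can be reproduced directly with numpy; the shot count below is arbitrary and nt = 12288 is taken from the example:

```python
import numpy as np

# Reconstruct the default structured dtype from the example above:
# 'shotnum', a 'signal' block of nt samples, and a 3-element 'xyz'.
nt = 12288          # samples per shot (from the example)
nshots = 4          # arbitrary, for illustration

dtype = np.dtype([("shotnum", "<u4"),
                  ("signal", "<f4", (nt,)),
                  ("xyz", "<f4", (3,))])

data = np.empty(nshots, dtype=dtype)
data["shotnum"] = np.arange(1, nshots + 1)
data["signal"] = 0.0
data["xyz"] = np.nan    # populated only when a 'motion' control
                        # device is added
```

Until a control device of contype 'motion' is added, every 'xyz' entry stays NaN, matching the behavior described above.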
For details on handling and manipulating data
see
handle_data.
Note
Since bapsflib.lapd
leverages the h5py
package,
the data in test.hdf5
resides on disk until one of the read
methods, read_data()
,
read_msi()
, or
read_controls()
is called. In calling one of these methods, the requested data is
brought into memory as a numpy.ndarray and a numpy view onto that
ndarray is returned to the user.
Extracting a sub-set
There are three keywords for sub-setting a dataset: index
,
shotnum
, and intersection_set
. index
and
shotnum
are indexing keywords, whereas, intersection_set
controls sub-setting behavior between the indexing keywords and the
dataset(s).
index
refers to the row index of the requested dataset and
shotnum
refers to the global HDF5 shot number. Either indexing
keyword can be used, but index
overrides shotnum
.
index
and shotnum
can be of type int, list(int), or slice()
. Sub-setting with
index
looks like:
>>> # read dataset row 10
>>> data = f.read_data(board, channel, index=9)
>>> data['shotnum']
array([10], dtype=uint32)
>>> # read dataset rows 10, 20, and 30
>>> data = f.read_data(board, channel, index=[9, 19, 29])
>>> # read dataset rows 10 to 19
>>> data = f.read_data(board, channel, index=slice(9, 19))
>>> # read every third row in the dataset from row 10 to 19
>>> data = f.read_data(board, channel, index=slice(9, 19, 3))
>>> data['shotnum']
array([10, 13, 16, 19], dtype=uint32)
Sub-setting with shotnum
looks like:
>>> # read dataset shot number 10
>>> data = f.read_data(board, channel, shotnum=10)
>>> data['shotnum']
array([10], dtype=uint32)
>>> # read dataset shot numbers 10, 20, and 30
>>> data = f.read_data(board, channel, shotnum=[10, 20, 30])
>>> # read dataset shot numbers 10 to 19
>>> data = f.read_data(board, channel, shotnum=slice(10, 20))
>>> # read every 5th dataset shot number from 10 to 19
>>> data = f.read_data(board, channel, shotnum=slice(10, 20, 5))
>>> data['shotnum']
array([10, 15], dtype=uint32)
intersection_set
modifies what shot numbers are returned by
read_data()
. By default
intersection_set=True
and forces the returned data to only
correspond to shot numbers that exist in the digitizer dataset, any
specified control device datasets, and those shot numbers represented by
index
or shotnum
. Setting to False
will return
all shot numbers >=1
associated with index
or
shotnum
and array entries that are not associated with a dataset
will be filled with a “NaN” value (np.nan
for floats,
-99999
for integers, and ''
for strings).
Specifying digitizer
, adc
, and config_name
It is possible for a LaPD generated HDF5 file to contain multiple
digitizers, each of which can have multiple analog-digital-converters
(adc) and multiple configuration settings. For such a case,
read_data()
has the keywords
digitizer
, adc
, and config_name
to direct the
data extraction accordingly.
If digitizer
is not specified, then it is assumed that the
desired digitizer is the one defined in
main_digitizer
. Suppose
the test.hdf5
has two digitizers, 'SIS 3301'
and
'SIS crate'
. In this case 'SIS 3301'
would be assumed
as the main_digitizer
. To
extract data from 'SIS crate'
one would use the
digitizer
keyword as follows:
>>> data = f.read_data(board, channel, digitizer='SIS crate')
>>> data.info['digitizer']
'SIS crate'
Digitizer 'SIS crate'
can have multiple active
adc’s, 'SIS 3302'
and 'SIS 3305'
. By default, if only
one adc is active then that adc is assumed; however, if multiple adc’s
are active, then the adc with the slower clock rate is assumed.
'SIS 3302'
has the slower clock rate in this case. To extract
data from 'SIS 3305'
one would use the adc
keyword as
follows:
>>> data = f.read_data(board, channel, digitizer='SIS crate',
>>> adc='SIS 3305')
>>> data.info['adc']
'SIS 3305'
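The default adc choice described above (the slower clock rate wins) can be sketched as picking the minimum clock rate; default_adc is a hypothetical stand-in and the clock rates are illustrative values in MHz:

```python
# Hypothetical sketch of the default adc selection described above:
# with several active adc's, the one with the slowest clock rate is
# assumed.  Clock rates are illustrative values in MHz.
active_adcs = {
    "SIS 3302": 100.0,      # MHz
    "SIS 3305": 1250.0,     # MHz
}

def default_adc(adcs):
    if len(adcs) == 1:
        return next(iter(adcs))     # only one active adc: assume it
    return min(adcs, key=adcs.get)  # otherwise: slowest clock rate

assumed = default_adc(active_adcs)
```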
A digitizer can have multiple configurations, but typically only one
configuration is ever active for the HDF5 file. In the case that
multiple configurations are active, there is no overlying hierarchy for
assuming one configuration over another. Suppose digitizer
'SIS crate'
has two configurations, 'config_01'
and
'config_02'
. In this case, one of the configurations has to be
specified at the time of extraction. To extract data from
'SIS crate'
under the configuration 'config_02'
one
would use the 'config_name'
keyword as follows:
>>> f.file_map.digitizers['SIS crate'].active_configs
['config_01', 'config_02']
>>> data = f.read_data(board, channel, digitizer='SIS crate',
>>> config_name='config_02')
>>> data.info['configuration name']
'config_02'
Adding Control Device Data
Adding control device data to a digitizer dataset is done with the
keyword add_controls
. Specifying add_controls
will
trigger a call to the
HDFReadControls
class and
extract the desired control device data.
HDFReadData
then compares and
mates that control device data with the digitizer data according to the
global HDF5 shot number.
add_controls
must be a list of strings and/or 2-element tuples
specifying the desired control device data to be added to the digitizer
data. If a control device only controls one configuration, then it is
sufficient to only name that device. For example, if a
'6K Compumotor'
is only controlling one probe, then the data
extraction call would look like:
>>> list(f.file_map.controls['6K Compumotor'].configs)
[3]
>>> data = f.read_data(board, channel,
>>> add_controls=['6K Compumotor'])
>>> data.info['added controls']
[('6K Compumotor', 3)]
In the case the '6K Compumotor'
has multiple configurations
(controlling multiple probes), the add_controls
call must also
provide the configuration name to direct the extraction. This is done
with a 2-element tuple entry for add_controls
, where the first
element is the control device name and the second element is the
configuration name. For the '6K Compumotor'
the configuration
name is the receptacle number of the probe drive [3]. Suppose the
'6K Compumotor'
is utilizing three probe drives with the
receptacles 2, 3, and 4. To mate control device data from receptacle 3,
the call would look something like:
>>> list(f.file_map.controls['6K Compumotor'].configs)
[2, 3, 4]
>>> control = [('6K Compumotor', 3)]
>>> data = f.read_data(board, channel, add_controls=control)
>>> data.info['added controls']
[('6K Compumotor', 3)]
Multiple control device datasets can be added at once, but only
one control device for each control type ('motion'
,
'power'
, and 'waveform'
) can be added. Adding
'6K Compumotor'
data from receptacle 3 and 'Waveform'
data would look like:
>>> list(f.file_map.controls['Waveform'].configs)
['config01']
>>> f.file_map.controls['Waveform'].contype
'waveform'
>>> f.file_map.controls['6K Compumotor'].contype
'motion'
>>> data = f.read_data(board, channel,
>>> add_controls=[('6K Compumotor', 3),
>>> 'Waveform'])
>>> data.info['added controls']
[('6K Compumotor', 3), ('Waveform', 'config01')]
Since '6K Compumotor'
is a 'motion'
control type it
fills out the 'xyz'
field in the returned numpy structured
array; whereas, 'Waveform'
will add field names to the numpy
structured array according to the fields specified in its mapping
constructor. See For Control Devices for details on these added
fields.
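Mating a 'motion' control device to digitizer data amounts to matching on the global shot number and filling the 'xyz' field; below is a hypothetical sketch with made-up positions (not bapsflib's actual implementation):

```python
import numpy as np

# Hypothetical sketch of mating 'motion' control data to digitizer
# data by global HDF5 shot number, filling 'xyz' as described.
dtype = np.dtype([("shotnum", "<u4"), ("xyz", "<f4", (3,))])
data = np.empty(3, dtype=dtype)
data["shotnum"] = [10, 11, 12]
data["xyz"] = np.nan                    # unfilled until mated

# control device table: shot number -> probe position (made up)
positions = {10: (0.0, 0.0, 958.4),
             11: (0.5, 0.0, 958.4),
             12: (1.0, 0.0, 958.4)}

for i, sn in enumerate(data["shotnum"]):
    data["xyz"][i] = positions[int(sn)]
```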
Control device data can also be independently read using
read_controls()
.
(see For Control Devices for usage)
Review section digi_overview for how a digitizer is organized and configured.
Each control device has its own concept of what constitutes a
configuration. The configuration has to be unique to a block of
recorded data. For the '6K Compumotor' the receptacle
number is used as the configuration name, whereas, for the
'Waveform' control the configuration name is the name of the
configuration group inside the 'Waveform' group. Since the
configurations are contained in the
f.file_map.controls[con_name].configs dictionary, the
configuration name need not be a string.
For Control Devices
Note
To be written
For a MSI Diagnostic
MSI diagnostic data is read using the
read_msi()
method on
File
. Only the MSI diagnostic name
needs to be supplied to read the associated data:
>>> from bapsflib import lapd
>>>
>>> # open file
>>> f = lapd.File('test.hdf5')
>>>
>>> # list mapped MSI diagnostics
>>> f.list_msi
['Discharge',
'Gas pressure',
'Heater',
'Interferometer array',
'Magnetic field']
>>>
>>> # read 'Discharge' data
>>> mdata = f.read_msi('Discharge')
The returned data mdata
is a structured numpy
array whose field structure and population are determined by the MSI diagnostic
mapping object. Every mdata
will have the fields
'shotnum'
and 'meta'
. 'shotnum'
represents the
HDF5 shot number. 'meta'
is a structured array with fields
representing quantities (metadata) that are both diagnostic and shot
number specific, but are not considered “primary” data arrays. Any
other field in mdata
is considered to be a “primary” data array.
Continuing with the above example:
>>> # display mdata dytpe
>>> mdata.dtype
dtype([('shotnum', '<i4'),
('voltage', '<f4', (2048,)),
('current', '<f4', (2048,)),
('meta', [('timestamp', '<f8'),
('data valid', 'i1'),
('pulse length', '<f4'),
('peak current', '<f4'),
('bank voltage', '<f4')])])
>>>
>>> # display shot numbers
>>> mdata['shotnum']
array([ 0, 19251], dtype=int32)
Here, the fields 'voltage'
and 'current'
correspond to
“primary” data arrays. To display the first three samples of
the 'voltage'
array for shot number 19251 do:
>>> mdata['voltage'][1][0:3:]
array([-44.631958, -44.708252, -44.631958], dtype=float32)
The metadata field 'meta'
has five quantities in it,
'timestamp'
, 'data valid'
, 'pulse length'
,
'peak current'
, and 'bank voltage'. Now, these metadata
fields will vary depending on the requested MSI diagnostic. To view
the 'bank voltage' for shot number 0 do:
>>> mdata['meta']['bank voltage'][0]
6127.1323
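The nested 'meta' layout can be reproduced with numpy using the field names from the dtype shown above; the values assigned here are made up for illustration:

```python
import numpy as np

# Reconstruct the 'Discharge' mdata layout from the dtype shown
# above; assigned values are illustrative, not real data.
meta_dtype = np.dtype([("timestamp", "<f8"),
                       ("data valid", "i1"),
                       ("pulse length", "<f4"),
                       ("peak current", "<f4"),
                       ("bank voltage", "<f4")])
dtype = np.dtype([("shotnum", "<i4"),
                  ("voltage", "<f4", (2048,)),
                  ("current", "<f4", (2048,)),
                  ("meta", meta_dtype)])

mdata = np.zeros(2, dtype=dtype)
mdata["shotnum"] = [0, 19251]
mdata["meta"]["bank voltage"] = [6127.13, 6130.00]  # made-up values

# metadata is shot-number specific: one entry per shot
bv = mdata["meta"]["bank voltage"][0]
```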
The data array mdata
is also constructed with an info
attribute that contains metadata that is diagnostic specific but not
shot number specific.
>>> mdata.info
{'current conversion factor': [0.0],
'diagnostic name': 'Discharge',
'diagnostic path': '/MSI/Discharge',
'dt': [4.88e-05],
'hdf file': 'test.hdf5',
't0': [-0.0249856],
'voltage conversion factor': [0.0]}
Every info
attribute will have the keys 'hdf file'
,
'diagnostic name'
, and 'diagnostic path'
. The rest of
the keys will be MSI diagnostic dependent. For example,
mdata.info
for the 'Magnetic field'
diagnostic would
have the key 'z'
that corresponds to the axial locations of the
magnetic field array.
>>> # get magnetic field data
>>> mdata = f.read_msi('Magnetic field')
>>> mdata.dtype
dtype([('shotnum', '<i4'),
('magnet ps current', '<f4', (10,)),
('magnetic field', '<f4', (1024,)),
('meta', [('timestamp', '<f8'),
('data valid', 'i1'),
('peak magnetic field', '<f4')])])
>>> mdata.info
{'diagnostic name': 'Magnetic field',
'diagnostic path': '/MSI/Magnetic field',
'hdf file': 'test.hdf5',
'z': array([-300. , -297.727 , -295.45395, ..., 2020.754 ,
2023.027 , 2025.3 ], dtype=float32)}
LaPD Constants
LaPD Tools
HDF5 File Mapping (HDFMap
)
HDFMap
constructs the mapping for
a given HDF5 file. When a HDF5 file is opened with
File
,
HDFMap
is automatically called to
construct the map and an instance of the mapping object is bound
to the file object as file_map
.
Thus, the file mappings for test.hdf5
can be accessed like:
>>> f = lapd.File('test.hdf5')
>>> f.file_map
<bapsflib._hdf.maps.core.HDFMap>
Architecture
HDFMap takes a modular approach to mapping an HDF5 file. It contains a
dictionary of known modules with known layouts. If one or more of
these layouts are discovered inside the HDF5 file, then the associated
mappings are added to the mapping object. There are five module
categories:
- msi diagnostic: any sub-group of the '/MSI/' group that represents a
  diagnostic device. A diagnostic device is a probe or sensor that
  records machine state data for every experimental run.
- digitizer: any group inside the '/Raw data + config/' group that is
  associated with a digitizer. A digitizer is a device that records
  "primary" data; that is, data recorded for a plasma probe.
- control device: any group inside the '/Raw data + config/' group
  that is associated with a device that controls a plasma probe. The
  recorded data is state data for the plasma probe; for example, probe
  position, bias, driving frequency, etc.
- data run sequence: the '/Raw data + config/Data run sequence/' group,
  which records the run sequence (operation sequence) of the LaPD DAQ
  controller.
- unknown: any group or dataset in the '/', '/MSI/', or
  '/Raw data + config/' groups that is not known by HDFMap or is
  unsuccessfully mapped.
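The category assignment can be pictured as path-based dispatch against the known layout locations. The toy classifier below is not bapsflib's actual implementation, and the known-module names in it are illustrative examples only; it just shows the decision order implied by the five categories:

```python
# Illustrative sets of "known module" names (examples, not bapsflib's registry)
KNOWN_MSI = {"Discharge", "Magnetic field"}
KNOWN_DIGITIZERS = {"SIS 3301", "SIS crate"}
KNOWN_CONTROLS = {"6K Compumotor", "Waveform"}


def categorize(path):
    """Toy mapping of an HDF5 group path to one of the five categories."""
    if path.startswith("/MSI/"):
        name = path[len("/MSI/"):].rstrip("/")
        return "msi diagnostic" if name in KNOWN_MSI else "unknown"
    if path.startswith("/Raw data + config/"):
        name = path[len("/Raw data + config/"):].rstrip("/")
        if name == "Data run sequence":
            return "data run sequence"
        if name in KNOWN_DIGITIZERS:
            return "digitizer"
        if name in KNOWN_CONTROLS:
            return "control device"
    # anything unrecognized or unsuccessfully mapped falls through
    return "unknown"


print(categorize("/MSI/Discharge"))                   # msi diagnostic
print(categorize("/Raw data + config/Data run sequence"))  # data run sequence
```

In the real package each discovered layout gets a dedicated mapping module; this sketch only mirrors the grouping logic.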
Basic Usage
Basic Structure of file_map
Mapping Object
Retrieving Active Digitizers
A list of all detected digitizers can be obtained by doing
>>> list(f.file_map.digitizers)
The file mappings for all the active digitizers are stored in the
dictionary f.file_map.digitizers such that
>>> list(f.file_map.digitizers.keys())
Out: list of strings of all active digitizer names
>>> f.file_map.digitizers[digi_name]
Out: digitizer mapping object
Retrieving Active Digitizer Configuration
Retrieving Active Analog-Digital Converters (adc's) for a Digitizer Configuration
Retrieving adc Connections and Digitization Settings
Adding Modules to the Mapping Architecture
Adding a Digitizer Mapping Module
Adding a Control Device Mapping Module
Adding a MSI Diagnostic Mapping Module
Change Log
This document lists the changes made for each release of bapsflib.
v2.0.0b3.dev14+g9353d77 (2024-05-06)
Backwards Incompatible Changes
Dropped support for h5py < 3.0, and set the minimum h5py dependency to
h5py >= 3.0. (#70)
Bug Fixes
Updated bapsflib.utils._bytes_to_str() to handle byte strings that may
have been encoded using the Windows 1252 code page. (#100)
Trivial/Internal Changes
Replaced several instances of deprecated numpy functionality:
numpy.bool to bool and numpy.bytes0 to numpy.bytes_. (#101)
Package Management
v1.0.2 (2022-07-26)
Backwards Incompatible Changes
Dropped support for Python 3.6. (#73)
Features
- Updated the "SIS Crate" digitizer mapper (HDFMapDigiSISCrate) so the
  analog-digital converters can have enabled/active boards without
  enabled/active channels. (#61)
- Updated HDFMapControl6K so the identification and mapping of probe
  configurations uses the combination of the receptacle number and
  probe name as a unique id, as opposed to just the probe name. (#63)
- Created helper function bapsflib.utils._bytes_to_str to convert byte
  literals to utf-8 strings. (#64)
- Made indexing of datasets more robust in a few locations in
  anticipation of allowing h5py >= 3.0, and thus accounting for h5py's
  change in indexing behavior. (#65)
Documentation Improvements
Incorporated towncrier and sphinx-changelog for better change log tracking and continuous change log rendering in the documentation. (#56)
Trivial/Internal Changes
- Refactored bapsflib module imports to be consistent with the style
  enforced by isort. (#66)
- Refactored bapsflib using the black==21.7b0 styling formatter, and
  converted many string concatenations and format strings into
  f-strings. (#67)
- Added .git-blame-ignore-revs to the package so git blame ignores the
  major refactoring commits from isort PR #66 and black PR #67. (#68)
- Converted all remaining relative imports to absolute imports. (#83)
Package Management
- Added GitHub Action check_pr_changelog.yml to check for valid change
  log entries on a pull request. (#58)
- Added GitHub Action tests.yml for testing of pushes to master,
  version tags, pull requests, and cron jobs every Sunday at 3:13 am
  PT. Tests are set up to run on the latest versions of Ubuntu, macOS,
  and Windows; on Python 3.6, 3.7, and 3.8; and on the minimum
  versions of h5py (v2.8.0) and numpy (v1.14). (#58)
- Added isort configuration to pyproject.toml. (#66)
- Created GitHub Action linters.yml with the isort/isort-action to
  check that module imports are properly styled. (#66)
- Added black==21.7b0 to the "extras" dependencies and added its
  configuration to pyproject.toml. (#67)
- Added to GitHub Action linters.yml the psf/black action to check
  that new code is formatted properly. (#67)
- Deleted pep8speaks.yml and disconnected the CI from the repo since
  the package is adopting black and its associated GitHub Action. (#67)
- Added a GitHub Dependabot to keep versions of GitHub Actions up to
  date. (#72)
- Reworked the testing GitHub Action workflow such that two base tests
  are initially run; if the base tests pass, then the full matrix of
  tests is run; and if the full matrix of tests passes, then the tests
  on minimum versions are run. (#72)
- Updated the GitHub Actions for linters isort and black such that the
  associated package versions used are taken from bapsflib's
  requirements files. (#72)
- Set package dependency coverage[toml] >= 4.5.1. The [toml] option
  allows for the coverage configuration to be defined in
  pyproject.toml, and v4.5.1 is the first release with this
  functionality. (#74)
- Moved the coverage configuration from .coveragerc to pyproject.toml.
  (#74)
- Removed setuptools and setuptools_scm from the setup.cfg install
  dependencies since they are already listed as build dependencies in
  pyproject.toml. (#74)
- Exposed requirements/build.txt into requirements/install.txt since
  setuptools and setuptools_scm are both build and install
  dependencies. (#74)
- Added workflow python-publish.yml to GitHub Actions that builds and
  publishes a release to PyPI when using GitHub's Releases
  functionality. (#84)
- Added an import bapsflib test to the test.yml GitHub Action
  workflow. (#85)
- Added a package build and install test to the test.yml GitHub
  Action workflow. (#85)
- Added a codespell test to the linters.yml GitHub Action workflow.
  (#86)
v1.0.1 (2019-06-01)
Added Control Device mapping modules for:
- Updated mapping module sis3301 to handle variations seen in
  SmPD-generated HDF5 files. (PR 30)
- Developed decorators with_bf() and with_lapdf() to better context
  manage file access to the HDF5 files (PR 23):
  - this allows tests for HDF5 files to run on Windows systems
  - integrated the AppVeyor CI
- Allowed function condition_shotnum() to handle single element numpy
  arrays. (PR 41)
- Refactored class HDFReadControls and module hdfreadcontrols to be
  plural, which better reflects the class behavior and is consistent
  with the bound method read_controls() on File. (PR 42)
- Set up continuous integration with pep8speaks. (PR 43)
v1.0.0 (2018-11-08)
Initial release