Binning with metadata generation, and storing into a NeXus file#

In this example, we show how to bin the same data as in tutorial 3, but using the correction/calibration parameters generated in that notebook, which are saved locally in the file sed_config.yaml. The binned data and the corresponding (machine and processing) metadata are then stored to a NeXus file following the NXmpes NeXus standard (https://fairmat-experimental.github.io/nexus-fairmat-proposal/9636feecb79bb32b828b1a9804269573256d7696/classes/contributed_definitions/NXmpes.html#nxmpes) using the dataconverter of the pynxtools package (FAIRmat-NFDI/pynxtools).
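If you want to check which parameters will be picked up before running the workflow, you can inspect that folder config directly. A minimal sketch, assuming sed_config.yaml sits next to this notebook and was written by tutorial 3 (the exact keys depend on which calibrations were stored there):

[ ]:
import yaml

# Load the folder config written by tutorial 3 and list its top-level sections
# (typically groups of calibration/correction parameters).
with open("sed_config.yaml", "r") as f:
    folder_config = yaml.safe_load(f)
print(list(folder_config.keys()))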

[1]:
%load_ext autoreload
%autoreload 2

import sed
from sed.dataset import dataset

%matplotlib widget

Load Data#

[2]:
dataset.get("WSe2") # Put in Path to a storage of at least 20 GByte free space.
data_path = dataset.dir # This is the path to the data
scandir, _ = dataset.subdirs # scandir contains the data, _ contains the calibration files
INFO - Not downloading WSe2 data as it already exists at "/home/runner/work/sed/sed/docs/tutorial/datasets/WSe2".
Set 'use_existing' to False if you want to download to a new location.
INFO - Using existing data path for "WSe2": "/home/runner/work/sed/sed/docs/tutorial/datasets/WSe2"
INFO - WSe2 data is already present.
[3]:
metadata = {}
# Manual metadata. These should ideally come from an Electronic Lab Notebook.
# General
metadata['experiment_summary'] = 'WSe2 XUV NIR pump probe data.'
metadata['entry_title'] = 'Valence Band Dynamics - 800 nm linear s-polarized pump, 0.6 mJ/cm2 absorbed fluence'
metadata['experiment_title'] = 'Valence band dynamics of 2H-WSe2'

# User
# Fill general parameters of NXuser
# TODO: discuss how to deal with multiple users?
metadata['user0'] = {}
metadata['user0']['name'] = 'Julian Maklar'
metadata['user0']['role'] = 'Principal Investigator'
metadata['user0']['affiliation'] = 'Fritz Haber Institute of the Max Planck Society'
metadata['user0']['address'] = 'Faradayweg 4-6, 14195 Berlin'
metadata['user0']['email'] = 'maklar@fhi-berlin.mpg.de'

# NXinstrument
metadata['instrument'] = {}
metadata['instrument']['energy_resolution'] = 140.
metadata['instrument']['temporal_resolution'] = 35.
# Analyzer
metadata['instrument']['analyzer'] = {}
metadata['instrument']['analyzer']['slow_axes'] = "delay" # the scanned axes
metadata['instrument']['analyzer']['spatial_resolution'] = 10.
metadata['instrument']['analyzer']['energy_resolution'] = 110.
metadata['instrument']['analyzer']['momentum_resolution'] = 0.08
metadata['instrument']['analyzer']['working_distance'] = 4.
metadata['instrument']['analyzer']['lens_mode'] = "6kV_kmodem4.0_30VTOF.sav"

# Probe beam
metadata['instrument']['beam'] = {}
metadata['instrument']['beam']['probe'] = {}
metadata['instrument']['beam']['probe']['incident_energy'] = 21.7
metadata['instrument']['beam']['probe']['incident_energy_spread'] = 0.11
metadata['instrument']['beam']['probe']['pulse_duration'] = 20.
metadata['instrument']['beam']['probe']['frequency'] = 500.
metadata['instrument']['beam']['probe']['incident_polarization'] = [1, 1, 0, 0] # p pol Stokes vector
metadata['instrument']['beam']['probe']['extent'] = [80., 80.]
# Pump beam
metadata['instrument']['beam']['pump'] = {}
metadata['instrument']['beam']['pump']['incident_energy'] = 1.55
metadata['instrument']['beam']['pump']['incident_energy_spread'] = 0.08
metadata['instrument']['beam']['pump']['pulse_duration'] = 35.
metadata['instrument']['beam']['pump']['frequency'] = 500.
metadata['instrument']['beam']['pump']['incident_polarization'] = [1, -1, 0, 0] # s pol Stokes vector
metadata['instrument']['beam']['pump']['incident_wavelength'] = 800.
metadata['instrument']['beam']['pump']['average_power'] = 300.
metadata['instrument']['beam']['pump']['pulse_energy'] = metadata['instrument']['beam']['pump']['average_power'] / metadata['instrument']['beam']['pump']['frequency']  # µJ
metadata['instrument']['beam']['pump']['extent'] = [230., 265.]
metadata['instrument']['beam']['pump']['fluence'] = 0.15

# Sample
metadata['sample'] = {}
metadata['sample']['preparation_date'] = '2019-01-13T10:00:00+00:00'
metadata['sample']['preparation_description'] = 'Cleaved'
metadata['sample']['sample_history'] = 'Cleaved'
metadata['sample']['chemical_formula'] = 'WSe2'
metadata['sample']['description'] = 'Sample'
metadata['sample']['name'] = 'WSe2 Single Crystal'

metadata['file'] = {}
metadata['file']["trARPES:Carving:TEMP_RBV"] = 300.
metadata['file']["trARPES:XGS600:PressureAC:P_RD"] = 5.e-11
metadata['file']["KTOF:Lens:Extr:I"] = -0.12877
metadata['file']["KTOF:Lens:UDLD:V"] = 399.99905
metadata['file']["KTOF:Lens:Sample:V"] = 17.19976
metadata['file']["KTOF:Apertures:m1.RBV"] = 3.729931
metadata['file']["KTOF:Apertures:m2.RBV"] = -5.200078
metadata['file']["KTOF:Apertures:m3.RBV"] = -11.000425

# Sample motor positions
metadata['file']['trARPES:Carving:TRX.RBV'] = 7.1900000000000004
metadata['file']['trARPES:Carving:TRY.RBV'] = -6.1700200225439552
metadata['file']['trARPES:Carving:TRZ.RBV'] = 33.4501953125
metadata['file']['trARPES:Carving:THT.RBV'] = 423.30500940561586
metadata['file']['trARPES:Carving:PHI.RBV'] = 0.99931647456264949
metadata['file']['trARPES:Carving:OMG.RBV'] = 11.002500171914066
[4]:
# Create the sed processor using the config file, and collect the metadata from the files:
sp = sed.SedProcessor(folder=scandir, config="../src/sed/config/mpes_example_config.yaml", system_config={}, metadata=metadata, collect_metadata=True)
INFO - Configuration loaded from: [/home/runner/work/sed/sed/docs/src/sed/config/mpes_example_config.yaml]
INFO - Folder config loaded from: [/home/runner/work/sed/sed/docs/tutorial/sed_config.yaml]
INFO - Default config loaded from: [/opt/hostedtoolcache/Python/3.10.17/x64/lib/python3.10/site-packages/sed/config/default.yaml]
WARNING - Entry "KTOF:Lens:Sample:V" for channel "sampleBias" not found. Skipping the channel.
WARNING - No valid token provided for elabFTW. Fetching elabFTW metadata will be skipped.
INFO - Collecting data from the EPICS archive...
WARNING - Fetching elabFTW metadata only supported for loading from "runs"
[5]:
# Apply jittering to X, Y, t, ADC columns.
sp.add_jitter()
INFO - add_jitter: Added jitter to columns ['X', 'Y', 't', 'ADC'].
[6]:
# Calculate machine-coordinate data for pose adjustment
sp.bin_and_load_momentum_calibration(df_partitions=10, plane=33, width=10, apply=True)
[7]:
# Adjust pose alignment, using stored distortion correction
sp.pose_adjustment(xtrans=8, ytrans=7, angle=-4, apply=True, use_correction=True)
INFO - No landmarks defined, using momentum correction parameters generated on 05/11/2025, 22:12:04
INFO - Calculated thin spline correction based on the following landmarks:
pouter_ord: [[203.2  341.96]
 [299.16 345.32]
 [350.25 243.7 ]
 [304.38 149.88]
 [199.52 152.48]
 [154.28 242.27]]
pcent: (248.29, 248.62)
INFO - Applied translation with (xtrans=8.0, ytrans=7.0).
INFO - Applied rotation with angle=-4.0.
[8]:
# Apply stored momentum correction
sp.apply_momentum_correction()
INFO - Adding corrected X/Y columns to dataframe:
Calculating inverse deformation field, this might take a moment...
INFO - Dask DataFrame Structure:
                       X        Y        t      ADC       Xm       Ym
npartitions=100
                 float64  float64  float64  float64  float64  float64
                     ...      ...      ...      ...      ...      ...
...                  ...      ...      ...      ...      ...      ...
                     ...      ...      ...      ...      ...      ...
                     ...      ...      ...      ...      ...      ...
Dask Name: apply_dfield, 206 graph layers
[9]:
# Apply momentum calibration from the stored config
sp.apply_momentum_calibration()
INFO - Adding kx/ky columns to dataframe:
INFO - Using momentum calibration parameters generated on 05/11/2025, 22:12:10
INFO - Dask DataFrame Structure:
                       X        Y        t      ADC       Xm       Ym       kx       ky
npartitions=100
                 float64  float64  float64  float64  float64  float64  float64  float64
                     ...      ...      ...      ...      ...      ...      ...      ...
...                  ...      ...      ...      ...      ...      ...      ...      ...
                     ...      ...      ...      ...      ...      ...      ...      ...
                     ...      ...      ...      ...      ...      ...      ...      ...
Dask Name: assign, 216 graph layers
[10]:
# Apply energy correction from the stored config
sp.apply_energy_correction()
INFO - Applying energy correction to dataframe...
INFO - Using energy correction parameters generated on 05/11/2025, 22:12:11
INFO - Dask DataFrame Structure:
                       X        Y        t      ADC       Xm       Ym       kx       ky       tm
npartitions=100
                 float64  float64  float64  float64  float64  float64  float64  float64  float64
                     ...      ...      ...      ...      ...      ...      ...      ...      ...
...                  ...      ...      ...      ...      ...      ...      ...      ...      ...
                     ...      ...      ...      ...      ...      ...      ...      ...      ...
                     ...      ...      ...      ...      ...      ...      ...      ...      ...
Dask Name: assign, 230 graph layers
[11]:
# Apply energy calibration from the stored config
sp.append_energy_axis(bias_voltage=16.8)
INFO - Adding energy column to dataframe:
INFO - Using energy calibration parameters generated on 05/11/2025, 22:12:21
INFO - Dask DataFrame Structure:
                       X        Y        t      ADC       Xm       Ym       kx       ky       tm   energy
npartitions=100
                 float64  float64  float64  float64  float64  float64  float64  float64  float64  float64
                     ...      ...      ...      ...      ...      ...      ...      ...      ...      ...
...                  ...      ...      ...      ...      ...      ...      ...      ...      ...      ...
                     ...      ...      ...      ...      ...      ...      ...      ...      ...      ...
                     ...      ...      ...      ...      ...      ...      ...      ...      ...      ...
Dask Name: assign, 243 graph layers
[12]:
# Apply delay calibration
delay_range = (-500, 1500)
sp.calibrate_delay_axis(delay_range=delay_range, preview=True)
INFO - Adding delay column to dataframe:
INFO - Append delay axis using delay_range = [-500, 1500] and adc_range = [475.0, 6400.0]
INFO -              X            Y             t          ADC           Xm  \
0     0.434859     0.434859      0.434859     0.434859   -23.229539
1   365.449462  1002.449462  70101.449462  6317.449462   353.990854
2   761.180719   818.180719  75615.180719  6316.180719   792.188407
3   692.247275   971.247275  66455.247275  6317.247275   714.567888
4   671.208533   712.208533  73026.208533  6317.208533   697.121864
5   298.564015  1163.564015  68458.564015  6315.564015   280.209580
6   571.244851   665.244851  73903.244851  6316.244851   588.744257
7   821.745204   544.745204  72631.745204  6317.745204   846.423246
8   817.843415   415.843415  72421.843415  6316.843415   835.817221
9  1005.986836   666.986836  72801.986836  6316.986836  1037.794863

            Ym        kx        ky            tm     energy        delay
0    97.345250 -2.122381 -1.798954    -47.750550  -8.260098  -660.190765
1  1035.022370 -1.110531  0.716257  70084.443887   7.511015  1472.134840
2   838.824489  0.064883  0.189978  75614.295861   0.223190  1471.706572
3   984.121088 -0.143325  0.579720  66449.554893  15.953907  1472.066591
4   741.756354 -0.190122 -0.070396  73025.824668   3.068939  1472.053513
5  1187.146582 -1.308441  1.124312  68432.050694  10.829497  1471.498402
6   702.930977 -0.480833 -0.174540  73900.341539   2.016655  1471.728220
7   586.646064  0.210361 -0.486461  72627.587696   3.583610  1472.234668
8   466.950285  0.181912 -0.807531  72412.183319   3.871962  1471.930267
9   707.960804  0.723693 -0.161048  72794.505409   3.365034  1471.978679

Compute final data volume#

[13]:
axes = ['kx', 'ky', 'energy', 'delay']
bins = [100, 100, 200, 50]
ranges = [[-2, 2], [-2, 2], [-4, 2], [-600, 1600]]
res = sp.compute(bins=bins, axes=axes, ranges=ranges)
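For a quick sanity check of the binned volume, you can plot a cut directly from the result. A minimal sketch, assuming res is the xarray DataArray returned by sp.compute (axis names follow the axes list above):

[ ]:
import matplotlib.pyplot as plt

# Integrate over the delay axis and show an energy-vs-kx cut near ky = 0.
# Adjust the selection to your region of interest.
res.sum(dim="delay").sel(ky=0, method="nearest").T.plot()
plt.show()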
[14]:
# Save to NXmpes NeXus (including standardized metadata)
sp.save(data_path + "/binned.nxs")
Using mpes reader to convert the given files:
• ../src/sed/config/NXmpes_config.json
The output file generated: /home/runner/work/sed/sed/docs/tutorial/datasets/WSe2/binned.nxs.
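The writer is chosen from the file extension of the given path. As a sketch, assuming the HDF5 and TIFF writers are available in your sed installation, the same call can also store the binned array without the NeXus metadata:

[ ]:
# Same binned result, written to plain HDF5 and to TIFF instead of NeXus
# (assumption: these writers are enabled in your installation).
sp.save(data_path + "/binned.h5")
sp.save(data_path + "/binned.tiff")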
[15]:
# Visualization (requires JupyterLab)
from jupyterlab_h5web import H5Web
H5Web(data_path + "/binned.nxs")
[15]:
<jupyterlab_h5web.widget.H5Web object>
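If JupyterLab (and thus H5Web) is not available, the generated file can also be inspected programmatically, for example with h5py:

[ ]:
import h5py

# Walk the NeXus file and print its group/dataset hierarchy.
with h5py.File(data_path + "/binned.nxs", "r") as f:
    f.visit(print)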
[ ]: