Binning with metadata generation, and storing into a NeXus file#
In this example, we show how to bin the same data used for example 3, but using the values for correction/calibration parameters generated in the example notebook 3, which are locally saved in the file sed_config.yaml. These data and the corresponding (machine and processing) metadata are then stored to a NeXus file following the NXmpes NeXus standard (https://fairmat-experimental.github.io/nexus-fairmat-proposal/9636feecb79bb32b828b1a9804269573256d7696/classes/contributed_definitions/NXmpes.html#nxmpes) using the ‘dataconverter’ of the pynxtools package (FAIRmat-NFDI/pynxtools).
[1]:
%load_ext autoreload
%autoreload 2
import sed
from sed.dataset import dataset
%matplotlib widget
Load Data#
[2]:
dataset.get("WSe2") # Put in Path to a storage of at least 20 GByte free space.
data_path = dataset.dir # This is the path to the data
scandir, _ = dataset.subdirs # scandir contains the data, _ contains the calibration files
INFO - Not downloading WSe2 data as it already exists at "/home/runner/work/sed/sed/docs/tutorial/datasets/WSe2".
Set 'use_existing' to False if you want to download to a new location.
INFO - Using existing data path for "WSe2": "/home/runner/work/sed/sed/docs/tutorial/datasets/WSe2"
INFO - WSe2 data is already present.
[3]:
metadata = {}
# manual Meta data. These should ideally come from an Electronic Lab Notebook.
#General
metadata['experiment_summary'] = 'WSe2 XUV NIR pump probe data.'
metadata['entry_title'] = 'Valence Band Dynamics - 800 nm linear s-polarized pump, 0.6 mJ/cm2 absorbed fluence'
metadata['experiment_title'] = 'Valence band dynamics of 2H-WSe2'
#User
# Fill general parameters of NXuser
# TODO: discuss how to deal with multiple users?
metadata['user0'] = {}
metadata['user0']['name'] = 'Julian Maklar'
metadata['user0']['role'] = 'Principal Investigator'
metadata['user0']['affiliation'] = 'Fritz Haber Institute of the Max Planck Society'
metadata['user0']['address'] = 'Faradayweg 4-6, 14195 Berlin'
metadata['user0']['email'] = 'maklar@fhi-berlin.mpg.de'
#NXinstrument
metadata['instrument'] = {}
metadata['instrument']['energy_resolution'] = 140.
metadata['instrument']['temporal_resolution'] = 35.
#analyzer
metadata['instrument']['analyzer']={}
metadata['instrument']['analyzer']['slow_axes'] = "delay" # the scanned axes
metadata['instrument']['analyzer']['spatial_resolution'] = 10.
metadata['instrument']['analyzer']['energy_resolution'] = 110.
metadata['instrument']['analyzer']['momentum_resolution'] = 0.08
metadata['instrument']['analyzer']['working_distance'] = 4.
metadata['instrument']['analyzer']['lens_mode'] = "6kV_kmodem4.0_30VTOF.sav"
#probe beam
metadata['instrument']['beam']={}
metadata['instrument']['beam']['probe']={}
metadata['instrument']['beam']['probe']['incident_energy'] = 21.7
metadata['instrument']['beam']['probe']['incident_energy_spread'] = 0.11
metadata['instrument']['beam']['probe']['pulse_duration'] = 20.
metadata['instrument']['beam']['probe']['frequency'] = 500.
metadata['instrument']['beam']['probe']['incident_polarization'] = [1, 1, 0, 0] # p pol Stokes vector
metadata['instrument']['beam']['probe']['extent'] = [80., 80.]
#pump beam
metadata['instrument']['beam']['pump']={}
metadata['instrument']['beam']['pump']['incident_energy'] = 1.55
metadata['instrument']['beam']['pump']['incident_energy_spread'] = 0.08
metadata['instrument']['beam']['pump']['pulse_duration'] = 35.
metadata['instrument']['beam']['pump']['frequency'] = 500.
metadata['instrument']['beam']['pump']['incident_polarization'] = [1, -1, 0, 0] # s pol Stokes vector
metadata['instrument']['beam']['pump']['incident_wavelength'] = 800.
metadata['instrument']['beam']['pump']['average_power'] = 300.
metadata['instrument']['beam']['pump']['pulse_energy'] = metadata['instrument']['beam']['pump']['average_power']/metadata['instrument']['beam']['pump']['frequency']#µJ
metadata['instrument']['beam']['pump']['extent'] = [230., 265.]
metadata['instrument']['beam']['pump']['fluence'] = 0.15
#sample
metadata['sample']={}
metadata['sample']['preparation_date'] = '2019-01-13T10:00:00+00:00'
metadata['sample']['preparation_description'] = 'Cleaved'
metadata['sample']['sample_history'] = 'Cleaved'
metadata['sample']['chemical_formula'] = 'WSe2'
metadata['sample']['description'] = 'Sample'
metadata['sample']['name'] = 'WSe2 Single Crystal'
metadata['file'] = {}
metadata['file']["trARPES:Carving:TEMP_RBV"] = 300.
metadata['file']["trARPES:XGS600:PressureAC:P_RD"] = 5.e-11
metadata['file']["KTOF:Lens:Extr:I"] = -0.12877
metadata['file']["KTOF:Lens:UDLD:V"] = 399.99905
metadata['file']["KTOF:Lens:Sample:V"] = 17.19976
metadata['file']["KTOF:Apertures:m1.RBV"] = 3.729931
metadata['file']["KTOF:Apertures:m2.RBV"] = -5.200078
metadata['file']["KTOF:Apertures:m3.RBV"] = -11.000425
# Sample motor positions
metadata['file']['trARPES:Carving:TRX.RBV'] = 7.1900000000000004
metadata['file']['trARPES:Carving:TRY.RBV'] = -6.1700200225439552
metadata['file']['trARPES:Carving:TRZ.RBV'] = 33.4501953125
metadata['file']['trARPES:Carving:THT.RBV'] = 423.30500940561586
metadata['file']['trARPES:Carving:PHI.RBV'] = 0.99931647456264949
metadata['file']['trARPES:Carving:OMG.RBV'] = 11.002500171914066
[4]:
# create sed processor using the config file, and collect the meta data from the files:
sp = sed.SedProcessor(folder=scandir, config="../src/sed/config/mpes_example_config.yaml", system_config={}, metadata=metadata, collect_metadata=True)
INFO - Configuration loaded from: [/home/runner/work/sed/sed/docs/src/sed/config/mpes_example_config.yaml]
INFO - Folder config loaded from: [/home/runner/work/sed/sed/docs/tutorial/sed_config.yaml]
INFO - Default config loaded from: [/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/site-packages/sed/config/default.yaml]
WARNING - Entry "KTOF:Lens:Sample:V" for channel "sampleBias" not found. Skipping the channel.
WARNING - No valid token provided for elabFTW. Fetching elabFTW metadata will be skipped.
INFO - Collecting data from the EPICS archive...
WARNING - Fetching elabFTW metadata only supported for loading from "runs"
[5]:
# Apply jittering to X, Y, t, ADC columns.
sp.add_jitter()
INFO - add_jitter: Added jitter to columns ['X', 'Y', 't', 'ADC'].
[6]:
# Calculate machine-coordinate data for pose adjustment
sp.bin_and_load_momentum_calibration(df_partitions=10, plane=33, width=10, apply=True)
[7]:
# Adjust pose alignment, using stored distortion correction
sp.pose_adjustment(xtrans=8, ytrans=7, angle=-4, apply=True, use_correction=True)
INFO - No landmarks defined, using momentum correction parameters generated on 05/16/2026, 13:38:42
INFO - Calculated thin spline correction based on the following landmarks:
pouter_ord: [[203.2 341.96]
[299.16 345.32]
[350.25 243.7 ]
[304.38 149.88]
[199.52 152.48]
[154.28 242.27]]
pcent: (248.29, 248.62)
INFO - Applied translation with (xtrans=8.0, ytrans=7.0).
INFO - Applied rotation with angle=-4.0.
[8]:
# Apply stored momentum correction
sp.apply_momentum_correction()
INFO - Adding corrected X/Y columns to dataframe:
Calculating inverse deformation field, this might take a moment...
INFO - Dask DataFrame Structure:
X Y t ADC Xm Ym
npartitions=100
float64 float64 float64 float64 float64 float64
... ... ... ... ... ...
... ... ... ... ... ... ...
... ... ... ... ... ...
... ... ... ... ... ...
Dask Name: apply_dfield, 206 graph layers
[9]:
# Apply stored config momentum calibration
sp.apply_momentum_calibration()
INFO - Adding kx/ky columns to dataframe:
INFO - Using momentum calibration parameters generated on 05/16/2026, 13:38:48
INFO - Dask DataFrame Structure:
X Y t ADC Xm Ym kx ky
npartitions=100
float64 float64 float64 float64 float64 float64 float64 float64
... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ...
Dask Name: assign, 216 graph layers
[10]:
# Apply stored config energy correction
sp.apply_energy_correction()
INFO - Applying energy correction to dataframe...
INFO - Using energy correction parameters generated on 05/16/2026, 13:38:49
INFO - Dask DataFrame Structure:
X Y t ADC Xm Ym kx ky tm
npartitions=100
float64 float64 float64 float64 float64 float64 float64 float64 float64
... ... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ... ...
Dask Name: assign, 230 graph layers
[11]:
# Apply stored config energy calibration
sp.append_energy_axis(bias_voltage=16.8)
INFO - Adding energy column to dataframe:
INFO - Using energy calibration parameters generated on 05/16/2026, 13:38:57
INFO - Dask DataFrame Structure:
X Y t ADC Xm Ym kx ky tm energy
npartitions=100
float64 float64 float64 float64 float64 float64 float64 float64 float64 float64
... ... ... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ... ... ...
Dask Name: assign, 243 graph layers
[12]:
# Apply delay calibration
delay_range = (-500, 1500)
sp.calibrate_delay_axis(delay_range=delay_range, preview=True)
INFO - Adding delay column to dataframe:
INFO - Append delay axis using delay_range = [-500, 1500] and adc_range = [475.0, 6400.0]
INFO - X Y t ADC Xm \
0 -0.312143 -0.312143 -0.312143 -0.312143 0.000000
1 364.512029 1001.512029 70100.512029 6316.512029 352.978148
2 760.735931 817.735931 75614.735931 6315.735931 791.727572
3 692.366991 971.366991 66455.366991 6317.366991 714.694489
4 670.694433 711.694433 73025.694433 6316.694433 696.567658
5 299.101801 1164.101801 68459.101801 6316.101801 280.787527
6 571.223295 665.223295 73903.223295 6316.223295 588.720920
7 821.669442 544.669442 72631.669442 6317.669442 846.343668
8 818.398008 416.398008 72422.398008 6317.398008 836.386818
9 1006.271105 667.271105 72802.271105 6317.271105 1038.096123
Ym kx ky tm energy delay
0 0.000000 -2.060071 -2.060071 -48.541229 -8.260280 -660.442917
1 1034.215505 -1.113248 0.714092 70083.494950 7.512723 1471.818407
2 838.411679 0.063646 0.188871 75613.861665 0.223605 1471.556432
3 984.227562 -0.142986 0.580005 66449.670262 15.953563 1472.107001
4 741.295551 -0.191609 -0.071632 73025.302539 3.069598 1471.879977
5 1187.613396 -1.306891 1.125564 68432.588340 10.828294 1471.679933
6 702.911924 -0.480895 -0.174591 73900.319064 2.016681 1471.720944
7 586.573676 0.210148 -0.486655 72627.510621 3.583712 1472.209094
8 467.483891 0.183440 -0.806100 72412.758111 3.871183 1472.117471
9 708.238552 0.724501 -0.160303 72794.779339 3.364679 1472.074635
Compute final data volume#
[13]:
axes = ['kx', 'ky', 'energy', 'delay']
bins = [100, 100, 200, 50]
ranges = [[-2, 2], [-2, 2], [-4, 2], [-600, 1600]]
res = sp.compute(bins=bins, axes=axes, ranges=ranges)
[14]:
# save to NXmpes NeXus (including standardized metadata)
sp.save(data_path + "/binned.nxs")
Using mpes reader to convert the given files:
• ../src/sed/config/NXmpes_config.json
The unit /ENTRY[entry]/CALIBRATION[energy_calibration]/fit_formula_inputs/PARAMETER[coefficients]/@units = has no documentation.
The output file generated: /home/runner/work/sed/sed/docs/tutorial/datasets/WSe2/binned.nxs.
[15]:
# Visualization (requires JupyterLab)
from jupyterlab_h5web import H5Web
H5Web(data_path + "/binned.nxs")
[15]:
<jupyterlab_h5web.widget.H5Web object>
[ ]: