Binning with metadata generation, and storing into a NeXus file
In this example, we bin the same data used in example 3, this time applying the correction and calibration parameters generated in that notebook, which are saved locally in the file sed_config.yaml. The binned data and the corresponding machine and processing metadata are then stored in a NeXus file following the NXmpes NeXus standard (https://fairmat-experimental.github.io/nexus-fairmat-proposal/9636feecb79bb32b828b1a9804269573256d7696/classes/contributed_definitions/NXmpes.html#nxmpes), using the ‘dataconverter’ of the pynxtools package (FAIRmat-NFDI/pynxtools).
[1]:
%load_ext autoreload
%autoreload 2
import sed
from sed.dataset import dataset
%matplotlib widget
Load Data
[2]:
dataset.get("WSe2") # Provide a path to storage with at least 20 GB of free space.
data_path = dataset.dir # This is the path to the data
scandir, _ = dataset.subdirs # scandir contains the data, _ contains the calibration files
INFO - Not downloading WSe2 data as it already exists at "/home/runner/work/sed/sed/docs/tutorial/datasets/WSe2".
Set 'use_existing' to False if you want to download to a new location.
INFO - Using existing data path for "WSe2": "/home/runner/work/sed/sed/docs/tutorial/datasets/WSe2"
INFO - WSe2 data is already present.
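Because the download needs a fair amount of disk space, it can be useful to check the target location first. The following is a minimal sketch using only the Python standard library; the helper name `has_free_space` is illustrative and not part of sed, and 1 GB is taken as 1e9 bytes.

```python
import shutil


def has_free_space(path: str, required_gb: float) -> bool:
    """Return True if the filesystem holding `path` has at least `required_gb` GB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1e9  # assumption: 1 GB = 1e9 bytes


# Hypothetical usage: check the current directory before calling dataset.get("WSe2")
enough = has_free_space(".", required_gb=20)
print(f"Enough free space for the WSe2 dataset: {enough}")
```

If the check fails, a different storage location can be chosen before downloading.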
[3]:
metadata = {}
# Manual metadata. These should ideally come from an electronic lab notebook.
#General
metadata['experiment_summary'] = 'WSe2 XUV NIR pump probe data.'
metadata['entry_title'] = 'Valence Band Dynamics - 800 nm linear s-polarized pump, 0.6 mJ/cm2 absorbed fluence'
metadata['experiment_title'] = 'Valence band dynamics of 2H-WSe2'
#User
# Fill general parameters of NXuser
# TODO: discuss how to deal with multiple users?
metadata['user0'] = {}
metadata['user0']['name'] = 'Julian Maklar'
metadata['user0']['role'] = 'Principal Investigator'
metadata['user0']['affiliation'] = 'Fritz Haber Institute of the Max Planck Society'
metadata['user0']['address'] = 'Faradayweg 4-6, 14195 Berlin'
metadata['user0']['email'] = 'maklar@fhi-berlin.mpg.de'
#NXinstrument
metadata['instrument'] = {}
metadata['instrument']['energy_resolution'] = 140.
metadata['instrument']['temporal_resolution'] = 35.
#analyzer
metadata['instrument']['analyzer']={}
metadata['instrument']['analyzer']['slow_axes'] = "delay" # the scanned axes
metadata['instrument']['analyzer']['spatial_resolution'] = 10.
metadata['instrument']['analyzer']['energy_resolution'] = 110.
metadata['instrument']['analyzer']['momentum_resolution'] = 0.08
metadata['instrument']['analyzer']['working_distance'] = 4.
metadata['instrument']['analyzer']['lens_mode'] = "6kV_kmodem4.0_30VTOF.sav"
#probe beam
metadata['instrument']['beam']={}
metadata['instrument']['beam']['probe']={}
metadata['instrument']['beam']['probe']['incident_energy'] = 21.7
metadata['instrument']['beam']['probe']['incident_energy_spread'] = 0.11
metadata['instrument']['beam']['probe']['pulse_duration'] = 20.
metadata['instrument']['beam']['probe']['frequency'] = 500.
metadata['instrument']['beam']['probe']['incident_polarization'] = [1, 1, 0, 0] # p pol Stokes vector
metadata['instrument']['beam']['probe']['extent'] = [80., 80.]
#pump beam
metadata['instrument']['beam']['pump']={}
metadata['instrument']['beam']['pump']['incident_energy'] = 1.55
metadata['instrument']['beam']['pump']['incident_energy_spread'] = 0.08
metadata['instrument']['beam']['pump']['pulse_duration'] = 35.
metadata['instrument']['beam']['pump']['frequency'] = 500.
metadata['instrument']['beam']['pump']['incident_polarization'] = [1, -1, 0, 0] # s pol Stokes vector
metadata['instrument']['beam']['pump']['incident_wavelength'] = 800.
metadata['instrument']['beam']['pump']['average_power'] = 300.
metadata['instrument']['beam']['pump']['pulse_energy'] = (
    metadata['instrument']['beam']['pump']['average_power']
    / metadata['instrument']['beam']['pump']['frequency']
)  # µJ
metadata['instrument']['beam']['pump']['extent'] = [230., 265.]
metadata['instrument']['beam']['pump']['fluence'] = 0.15
#sample
metadata['sample']={}
metadata['sample']['preparation_date'] = '2019-01-13T10:00:00+00:00'
metadata['sample']['preparation_description'] = 'Cleaved'
metadata['sample']['sample_history'] = 'Cleaved'
metadata['sample']['chemical_formula'] = 'WSe2'
metadata['sample']['description'] = 'Sample'
metadata['sample']['name'] = 'WSe2 Single Crystal'
metadata['file'] = {}
metadata['file']["trARPES:Carving:TEMP_RBV"] = 300.
metadata['file']["trARPES:XGS600:PressureAC:P_RD"] = 5.e-11
metadata['file']["KTOF:Lens:Extr:I"] = -0.12877
metadata['file']["KTOF:Lens:UDLD:V"] = 399.99905
metadata['file']["KTOF:Lens:Sample:V"] = 17.19976
metadata['file']["KTOF:Apertures:m1.RBV"] = 3.729931
metadata['file']["KTOF:Apertures:m2.RBV"] = -5.200078
metadata['file']["KTOF:Apertures:m3.RBV"] = -11.000425
# Sample motor positions
metadata['file']['trARPES:Carving:TRX.RBV'] = 7.1900000000000004
metadata['file']['trARPES:Carving:TRY.RBV'] = -6.1700200225439552
metadata['file']['trARPES:Carving:TRZ.RBV'] = 33.4501953125
metadata['file']['trARPES:Carving:THT.RBV'] = 423.30500940561586
metadata['file']['trARPES:Carving:PHI.RBV'] = 0.99931647456264949
metadata['file']['trARPES:Carving:OMG.RBV'] = 11.002500171914066
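Two of the entries above can be sanity-checked with simple arithmetic. The sketch below assumes the pump average power is given in mW and the repetition rate in kHz (the cell does not state the units; with these units the quotient comes out in µJ, matching the comment in the cell). It also checks that both Stokes vectors describe fully polarized light. The helper name `degree_of_polarization` is illustrative.

```python
import math

# Assumed units: average power in mW, repetition rate in kHz -> pulse energy in µJ
average_power_mw = 300.0
repetition_rate_khz = 500.0
pulse_energy_uj = average_power_mw / repetition_rate_khz
print(pulse_energy_uj)  # 0.6


def degree_of_polarization(stokes):
    """Degree of polarization sqrt(S1^2 + S2^2 + S3^2) / S0 of a Stokes vector."""
    s0, s1, s2, s3 = stokes
    return math.sqrt(s1**2 + s2**2 + s3**2) / s0


# Stokes vectors as entered in the metadata above
probe_stokes = [1, 1, 0, 0]
pump_stokes = [1, -1, 0, 0]
print(degree_of_polarization(probe_stokes), degree_of_polarization(pump_stokes))  # 1.0 1.0
```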
[4]:
# Create the sed processor using the config file, and collect the metadata from the files:
sp = sed.SedProcessor(
    folder=scandir,
    config="../src/sed/config/mpes_example_config.yaml",
    system_config={},
    metadata=metadata,
    collect_metadata=True,
)
INFO - Configuration loaded from: [/home/runner/work/sed/sed/docs/src/sed/config/mpes_example_config.yaml]
INFO - Folder config loaded from: [/home/runner/work/sed/sed/docs/tutorial/sed_config.yaml]
INFO - Default config loaded from: [/opt/hostedtoolcache/Python/3.10.17/x64/lib/python3.10/site-packages/sed/config/default.yaml]
WARNING - Entry "KTOF:Lens:Sample:V" for channel "sampleBias" not found. Skipping the channel.
WARNING - No valid token provided for elabFTW. Fetching elabFTW metadata will be skipped.
INFO - Collecting data from the EPICS archive...
WARNING - Fetching elabFTW metadata only supported for loading from "runs"
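The log shows three configuration sources being loaded: the example config, the folder config (sed_config.yaml) and the package defaults. As a rough illustration of how such layers can be combined, the sketch below recursively merges nested dictionaries, assuming that more specific sources override more general ones. The function `merge_configs` and all keys and values in it are hypothetical; this is not sed's own config parser.

```python
def merge_configs(base: dict, override: dict) -> dict:
    """Return a new dict with `override` merged into `base`; `override` wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections key by key instead of replacing them wholesale
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical example values, not taken from the actual sed config files
default_cfg = {"binning": {"mode": "fast", "num_cores": 4}}
folder_cfg = {"delay": {"adc_range": [475.0, 6400.0]}}
example_cfg = {"binning": {"num_cores": 8}}

# Assumed precedence: defaults < folder config < explicitly passed config
config = merge_configs(merge_configs(default_cfg, folder_cfg), example_cfg)
print(config)
```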
[5]:
# Apply jittering to X, Y, t, ADC columns.
sp.add_jitter()
INFO - add_jitter: Added jitter to columns ['X', 'Y', 't', 'ADC'].
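Jittering is commonly used to avoid binning artifacts that arise when integer-digitized detector values are histogrammed onto a grid. The sketch below shows the idea under the assumption that uniform noise of at most half a digitization step is added to each value; the function name, amplitude and seed are illustrative and do not reproduce sed's implementation.

```python
import random


def add_uniform_jitter(values, amplitude=0.5, seed=None):
    """Add uniform random noise within +/- `amplitude` to each value."""
    rng = random.Random(seed)
    return [v + rng.uniform(-amplitude, amplitude) for v in values]


# Illustrative integer detector values
raw = [365, 761, 692, 671]
jittered = add_uniform_jitter(raw, seed=0)
print(jittered)
```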
[6]:
# Calculate machine-coordinate data for pose adjustment
sp.bin_and_load_momentum_calibration(df_partitions=10, plane=33, width=10, apply=True)
[7]:
# Adjust pose alignment, using stored distortion correction
sp.pose_adjustment(xtrans=8, ytrans=7, angle=-4, apply=True, use_correction=True)
INFO - No landmarks defined, using momentum correction parameters generated on 05/11/2025, 22:12:04
INFO - Calculated thin spline correction based on the following landmarks:
pouter_ord: [[203.2 341.96]
[299.16 345.32]
[350.25 243.7 ]
[304.38 149.88]
[199.52 152.48]
[154.28 242.27]]
pcent: (248.29, 248.62)
INFO - Applied translation with (xtrans=8.0, ytrans=7.0).
INFO - Applied rotation with angle=-4.0.
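The pose adjustment combines a translation with a rotation. The sketch below applies such a transform to a single point, assuming the translation is applied first, the rotation is about a given center, and positive angles rotate counter-clockwise. These conventions, the function name and the center used in the example are assumptions and may differ from sed's internals.

```python
import math


def translate_and_rotate(point, xtrans, ytrans, angle_deg, center):
    """Translate a 2D point, then rotate it by `angle_deg` about `center`."""
    x, y = point[0] + xtrans, point[1] + ytrans
    theta = math.radians(angle_deg)
    dx, dy = x - center[0], y - center[1]
    x_rot = center[0] + dx * math.cos(theta) - dy * math.sin(theta)
    y_rot = center[1] + dx * math.sin(theta) + dy * math.cos(theta)
    return x_rot, y_rot


# Point taken from the logged pcent; the rotation center (256, 256) is hypothetical
print(translate_and_rotate((248.29, 248.62), xtrans=8, ytrans=7, angle_deg=-4, center=(256, 256)))
```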
[8]:
# Apply stored momentum correction
sp.apply_momentum_correction()
INFO - Adding corrected X/Y columns to dataframe:
Calculating inverse deformation field, this might take a moment...
INFO - Dask DataFrame Structure:
                       X        Y        t      ADC       Xm       Ym
npartitions=100
                 float64  float64  float64  float64  float64  float64
                     ...      ...      ...      ...      ...      ...
...                  ...      ...      ...      ...      ...      ...
                     ...      ...      ...      ...      ...      ...
                     ...      ...      ...      ...      ...      ...
Dask Name: apply_dfield, 206 graph layers
[9]:
# Apply stored config momentum calibration
sp.apply_momentum_calibration()
INFO - Adding kx/ky columns to dataframe:
INFO - Using momentum calibration parameters generated on 05/11/2025, 22:12:10
INFO - Dask DataFrame Structure:
                       X        Y        t      ADC       Xm       Ym       kx       ky
npartitions=100
                 float64  float64  float64  float64  float64  float64  float64  float64
                     ...      ...      ...      ...      ...      ...      ...      ...
...                  ...      ...      ...      ...      ...      ...      ...      ...
                     ...      ...      ...      ...      ...      ...      ...      ...
                     ...      ...      ...      ...      ...      ...      ...      ...
Dask Name: assign, 216 graph layers
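The kx values in the preview table printed after the delay calibration further down are consistent with a linear pixel-to-momentum mapping. The sketch below estimates a scale and an origin from two (Xm, kx) pairs copied from that table and checks them against a third row. It only shows consistency for these rows; the function name is illustrative, and the derived numbers are estimates, not the stored calibration parameters.

```python
# (Xm, kx) pairs copied from rows 2, 7 and 1 of the preview table
row_a = (792.188407, 0.064883)
row_b = (846.423246, 0.210361)
row_c = (353.990854, -1.110531)

# Two-point estimate of a linear calibration kx = scale * (Xm - center)
scale = (row_b[1] - row_a[1]) / (row_b[0] - row_a[0])
center = row_a[0] - row_a[1] / scale


def pixel_to_momentum(x_pix, scale, center):
    """Linear mapping from a corrected pixel coordinate to momentum."""
    return scale * (x_pix - center)


predicted = pixel_to_momentum(row_c[0], scale, center)
print(f"scale = {scale:.6f} per pixel, center = {center:.1f} px")
print(f"predicted kx = {predicted:.4f}, tabulated kx = {row_c[1]:.4f}")
```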
[10]:
# Apply stored config energy correction
sp.apply_energy_correction()
INFO - Applying energy correction to dataframe...
INFO - Using energy correction parameters generated on 05/11/2025, 22:12:11
INFO - Dask DataFrame Structure:
                       X        Y        t      ADC       Xm       Ym       kx       ky       tm
npartitions=100
                 float64  float64  float64  float64  float64  float64  float64  float64  float64
                     ...      ...      ...      ...      ...      ...      ...      ...      ...
...                  ...      ...      ...      ...      ...      ...      ...      ...      ...
                     ...      ...      ...      ...      ...      ...      ...      ...      ...
                     ...      ...      ...      ...      ...      ...      ...      ...      ...
Dask Name: assign, 230 graph layers
[11]:
# Apply stored config energy calibration
sp.append_energy_axis(bias_voltage=16.8)
INFO - Adding energy column to dataframe:
INFO - Using energy calibration parameters generated on 05/11/2025, 22:12:21
INFO - Dask DataFrame Structure:
                       X        Y        t      ADC       Xm       Ym       kx       ky       tm   energy
npartitions=100
                 float64  float64  float64  float64  float64  float64  float64  float64  float64  float64
                     ...      ...      ...      ...      ...      ...      ...      ...      ...      ...
...                  ...      ...      ...      ...      ...      ...      ...      ...      ...      ...
                     ...      ...      ...      ...      ...      ...      ...      ...      ...      ...
                     ...      ...      ...      ...      ...      ...      ...      ...      ...      ...
Dask Name: assign, 243 graph layers
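For orientation, the textbook relation between time of flight and kinetic energy for a field-free drift is E = ½·m·(d/(t − t0))², so longer flight times correspond to lower energies. The sketch below evaluates this relation; it is not sed's calibration function, which relies on the stored, fitted parameters, and the drift length, time offset and flight times are hypothetical values chosen only for illustration.

```python
ELECTRON_MASS_KG = 9.1093837015e-31
ELEMENTARY_CHARGE_C = 1.602176634e-19


def tof_to_kinetic_energy_ev(t_ns, t0_ns, drift_length_m):
    """Kinetic energy (eV) of an electron crossing a field-free drift length in t - t0."""
    flight_time_s = (t_ns - t0_ns) * 1e-9
    velocity = drift_length_m / flight_time_s
    return 0.5 * ELECTRON_MASS_KG * velocity**2 / ELEMENTARY_CHARGE_C


# Hypothetical drift length and flight times, for illustration only
for t_ns in (300.0, 400.0, 500.0):
    print(t_ns, tof_to_kinetic_energy_ev(t_ns, t0_ns=0.0, drift_length_m=0.5))
```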
[12]:
# Apply delay calibration
delay_range = (-500, 1500)
sp.calibrate_delay_axis(delay_range=delay_range, preview=True)
INFO - Adding delay column to dataframe:
INFO - Append delay axis using delay_range = [-500, 1500] and adc_range = [475.0, 6400.0]
INFO -            X            Y             t          ADC           Xm  \
0          0.434859     0.434859      0.434859     0.434859   -23.229539
1        365.449462  1002.449462  70101.449462  6317.449462   353.990854
2        761.180719   818.180719  75615.180719  6316.180719   792.188407
3        692.247275   971.247275  66455.247275  6317.247275   714.567888
4        671.208533   712.208533  73026.208533  6317.208533   697.121864
5        298.564015  1163.564015  68458.564015  6315.564015   280.209580
6        571.244851   665.244851  73903.244851  6316.244851   588.744257
7        821.745204   544.745204  72631.745204  6317.745204   846.423246
8        817.843415   415.843415  72421.843415  6316.843415   835.817221
9       1005.986836   666.986836  72801.986836  6316.986836  1037.794863
                 Ym         kx         ky            tm     energy        delay
0         97.345250  -2.122381  -1.798954    -47.750550  -8.260098  -660.190765
1       1035.022370  -1.110531   0.716257  70084.443887   7.511015  1472.134840
2        838.824489   0.064883   0.189978  75614.295861   0.223190  1471.706572
3        984.121088  -0.143325   0.579720  66449.554893  15.953907  1472.066591
4        741.756354  -0.190122  -0.070396  73025.824668   3.068939  1472.053513
5       1187.146582  -1.308441   1.124312  68432.050694  10.829497  1471.498402
6        702.930977  -0.480833  -0.174540  73900.341539   2.016655  1471.728220
7        586.646064   0.210361  -0.486461  72627.587696   3.583610  1472.234668
8        466.950285   0.181912  -0.807531  72412.183319   3.871962  1471.930267
9        707.960804   0.723693  -0.161048  72794.505409   3.365034  1471.978679
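The logged delay_range and adc_range, together with the ADC and delay columns of the preview above, are consistent with a linear mapping from ADC value to delay. The sketch below reproduces the delay of two preview rows from their ADC values; the function name `adc_to_delay` is illustrative and not part of sed.

```python
def adc_to_delay(adc, adc_range, delay_range):
    """Linearly map an ADC value from adc_range onto delay_range."""
    adc_min, adc_max = adc_range
    delay_min, delay_max = delay_range
    fraction = (adc - adc_min) / (adc_max - adc_min)
    return delay_min + fraction * (delay_max - delay_min)


# ADC values from rows 1 and 0 of the preview; tabulated delays are 1472.134840 and -660.190765
print(adc_to_delay(6317.449462, adc_range=(475.0, 6400.0), delay_range=(-500, 1500)))
print(adc_to_delay(0.434859, adc_range=(475.0, 6400.0), delay_range=(-500, 1500)))
```

Values outside the ADC range, such as row 0, are extrapolated linearly and therefore fall outside the nominal delay range.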
Compute final data volume
[13]:
axes = ['kx', 'ky', 'energy', 'delay']
bins = [100, 100, 200, 50]
ranges = [[-2, 2], [-2, 2], [-4, 2], [-600, 1600]]
res = sp.compute(bins=bins, axes=axes, ranges=ranges)
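To get a feel for the size of the resulting volume and for how an event is assigned to a bin, the sketch below computes the total number of bins and the bin indices of one event from the delay-calibration preview. It assumes equally spaced bins over half-open ranges; the helper `bin_index` is illustrative and is not how sed performs the binning internally.

```python
import math

axes = ["kx", "ky", "energy", "delay"]
bins = [100, 100, 200, 50]
ranges = [[-2, 2], [-2, 2], [-4, 2], [-600, 1600]]


def bin_index(value, n_bins, value_range):
    """Index of the equally spaced bin containing `value`, or None if out of range."""
    low, high = value_range
    if not (low <= value < high):
        return None
    return int((value - low) / (high - low) * n_bins)


total_bins = math.prod(bins)
print(f"Total number of bins: {total_bins:,}")

# kx, ky, energy and delay of row 2 in the preview table
event = [0.064883, 0.189978, 0.223190, 1471.706572]
indices = [bin_index(v, n, r) for v, n, r in zip(event, bins, ranges)]
print(dict(zip(axes, indices)))
```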
[14]:
# save to NXmpes NeXus (including standardized metadata)
sp.save(data_path + "/binned.nxs")
Using mpes reader to convert the given files:
• ../src/sed/config/NXmpes_config.json
The output file generated: /home/runner/work/sed/sed/docs/tutorial/datasets/WSe2/binned.nxs.
[15]:
# Visualization (requires JupyterLab)
from jupyterlab_h5web import H5Web
H5Web(data_path + "/binned.nxs")