Binning with metadata generation, and storing into a NeXus file#
In this example, we show how to bin the same data used for example 3, but using the values for correction/calibration parameters generated in the example notebook 3, which are locally saved in the file sed_config.yaml. These data and the corresponding (machine and processing) metadata are then stored to a NeXus file following the NXmpes NeXus standard (https://fairmat-experimental.github.io/nexus-fairmat-proposal/9636feecb79bb32b828b1a9804269573256d7696/classes/contributed_definitions/NXmpes.html#nxmpes) using the ‘dataconverter’ of the pynxtools package (FAIRmat-NFDI/pynxtools).
[1]:
%load_ext autoreload
%autoreload 2
import sed
from sed.dataset import dataset
%matplotlib widget
Load Data#
[2]:
dataset.get("WSe2") # Put in Path to a storage of at least 20 GByte free space.
data_path = dataset.dir # This is the path to the data
scandir, _ = dataset.subdirs # scandir contains the data, _ contains the calibration files
INFO - Not downloading WSe2 data as it already exists at "/home/runner/work/sed/sed/docs/tutorial/datasets/WSe2".
Set 'use_existing' to False if you want to download to a new location.
INFO - Using existing data path for "WSe2": "/home/runner/work/sed/sed/docs/tutorial/datasets/WSe2"
INFO - WSe2 data is already present.
[3]:
metadata = {}
# manual Meta data. These should ideally come from an Electronic Lab Notebook.
#General
metadata['experiment_summary'] = 'WSe2 XUV NIR pump probe data.'
metadata['entry_title'] = 'Valence Band Dynamics - 800 nm linear s-polarized pump, 0.6 mJ/cm2 absorbed fluence'
metadata['experiment_title'] = 'Valence band dynamics of 2H-WSe2'
#User
# Fill general parameters of NXuser
# TODO: discuss how to deal with multiple users?
metadata['user0'] = {}
metadata['user0']['name'] = 'Julian Maklar'
metadata['user0']['role'] = 'Principal Investigator'
metadata['user0']['affiliation'] = 'Fritz Haber Institute of the Max Planck Society'
metadata['user0']['address'] = 'Faradayweg 4-6, 14195 Berlin'
metadata['user0']['email'] = 'maklar@fhi-berlin.mpg.de'
#NXinstrument
metadata['instrument'] = {}
metadata['instrument']['energy_resolution'] = 140.
metadata['instrument']['temporal_resolution'] = 35.
#analyzer
metadata['instrument']['analyzer']={}
metadata['instrument']['analyzer']['slow_axes'] = "delay" # the scanned axes
metadata['instrument']['analyzer']['spatial_resolution'] = 10.
metadata['instrument']['analyzer']['energy_resolution'] = 110.
metadata['instrument']['analyzer']['momentum_resolution'] = 0.08
metadata['instrument']['analyzer']['working_distance'] = 4.
metadata['instrument']['analyzer']['lens_mode'] = "6kV_kmodem4.0_30VTOF.sav"
#probe beam
metadata['instrument']['beam']={}
metadata['instrument']['beam']['probe']={}
metadata['instrument']['beam']['probe']['incident_energy'] = 21.7
metadata['instrument']['beam']['probe']['incident_energy_spread'] = 0.11
metadata['instrument']['beam']['probe']['pulse_duration'] = 20.
metadata['instrument']['beam']['probe']['frequency'] = 500.
metadata['instrument']['beam']['probe']['incident_polarization'] = [1, 1, 0, 0] # p pol Stokes vector
metadata['instrument']['beam']['probe']['extent'] = [80., 80.]
#pump beam
metadata['instrument']['beam']['pump']={}
metadata['instrument']['beam']['pump']['incident_energy'] = 1.55
metadata['instrument']['beam']['pump']['incident_energy_spread'] = 0.08
metadata['instrument']['beam']['pump']['pulse_duration'] = 35.
metadata['instrument']['beam']['pump']['frequency'] = 500.
metadata['instrument']['beam']['pump']['incident_polarization'] = [1, -1, 0, 0] # s pol Stokes vector
metadata['instrument']['beam']['pump']['incident_wavelength'] = 800.
metadata['instrument']['beam']['pump']['average_power'] = 300.
metadata['instrument']['beam']['pump']['pulse_energy'] = metadata['instrument']['beam']['pump']['average_power']/metadata['instrument']['beam']['pump']['frequency']#µJ
metadata['instrument']['beam']['pump']['extent'] = [230., 265.]
metadata['instrument']['beam']['pump']['fluence'] = 0.15
#sample
metadata['sample']={}
metadata['sample']['preparation_date'] = '2019-01-13T10:00:00+00:00'
metadata['sample']['preparation_description'] = 'Cleaved'
metadata['sample']['sample_history'] = 'Cleaved'
metadata['sample']['chemical_formula'] = 'WSe2'
metadata['sample']['description'] = 'Sample'
metadata['sample']['name'] = 'WSe2 Single Crystal'
metadata['file'] = {}
metadata['file']["trARPES:Carving:TEMP_RBV"] = 300.
metadata['file']["trARPES:XGS600:PressureAC:P_RD"] = 5.e-11
metadata['file']["KTOF:Lens:Extr:I"] = -0.12877
metadata['file']["KTOF:Lens:UDLD:V"] = 399.99905
metadata['file']["KTOF:Lens:Sample:V"] = 17.19976
metadata['file']["KTOF:Apertures:m1.RBV"] = 3.729931
metadata['file']["KTOF:Apertures:m2.RBV"] = -5.200078
metadata['file']["KTOF:Apertures:m3.RBV"] = -11.000425
# Sample motor positions
metadata['file']['trARPES:Carving:TRX.RBV'] = 7.1900000000000004
metadata['file']['trARPES:Carving:TRY.RBV'] = -6.1700200225439552
metadata['file']['trARPES:Carving:TRZ.RBV'] = 33.4501953125
metadata['file']['trARPES:Carving:THT.RBV'] = 423.30500940561586
metadata['file']['trARPES:Carving:PHI.RBV'] = 0.99931647456264949
metadata['file']['trARPES:Carving:OMG.RBV'] = 11.002500171914066
[4]:
# create sed processor using the config file, and collect the meta data from the files:
sp = sed.SedProcessor(folder=scandir, config="../src/sed/config/mpes_example_config.yaml", system_config={}, metadata=metadata, collect_metadata=True)
INFO - Configuration loaded from: [/home/runner/work/sed/sed/docs/src/sed/config/mpes_example_config.yaml]
INFO - Folder config loaded from: [/home/runner/work/sed/sed/docs/tutorial/sed_config.yaml]
INFO - Default config loaded from: [/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/sed/config/default.yaml]
WARNING - Entry "KTOF:Lens:Sample:V" for channel "sampleBias" not found. Skipping the channel.
WARNING - No valid token provided for elabFTW. Fetching elabFTW metadata will be skipped.
INFO - Collecting data from the EPICS archive...
WARNING - Fetching elabFTW metadata only supported for loading from "runs"
[5]:
# Apply jittering to X, Y, t, ADC columns.
sp.add_jitter()
INFO - add_jitter: Added jitter to columns ['X', 'Y', 't', 'ADC'].
[6]:
# Calculate machine-coordinate data for pose adjustment
sp.bin_and_load_momentum_calibration(df_partitions=10, plane=33, width=10, apply=True)
[7]:
# Adjust pose alignment, using stored distortion correction
sp.pose_adjustment(xtrans=8, ytrans=7, angle=-4, apply=True, use_correction=True)
INFO - No landmarks defined, using momentum correction parameters generated on 11/04/2025, 21:51:55
INFO - Calculated thin spline correction based on the following landmarks:
pouter_ord: [[203.2 341.96]
[299.16 345.32]
[350.25 243.7 ]
[304.38 149.88]
[199.52 152.48]
[154.28 242.27]]
pcent: (248.29, 248.62)
INFO - Applied translation with (xtrans=8.0, ytrans=7.0).
INFO - Applied rotation with angle=-4.0.
[8]:
# Apply stored momentum correction
sp.apply_momentum_correction()
INFO - Adding corrected X/Y columns to dataframe:
Calculating inverse deformation field, this might take a moment...
INFO - Dask DataFrame Structure:
X Y t ADC Xm Ym
npartitions=100
float64 float64 float64 float64 float64 float64
... ... ... ... ... ...
... ... ... ... ... ... ...
... ... ... ... ... ...
... ... ... ... ... ...
Dask Name: apply_dfield, 206 graph layers
[9]:
# Apply stored config momentum calibration
sp.apply_momentum_calibration()
INFO - Adding kx/ky columns to dataframe:
INFO - Using momentum calibration parameters generated on 11/04/2025, 21:52:02
INFO - Dask DataFrame Structure:
X Y t ADC Xm Ym kx ky
npartitions=100
float64 float64 float64 float64 float64 float64 float64 float64
... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ...
Dask Name: assign, 216 graph layers
[10]:
# Apply stored config energy correction
sp.apply_energy_correction()
INFO - Applying energy correction to dataframe...
INFO - Using energy correction parameters generated on 11/04/2025, 21:52:03
INFO - Dask DataFrame Structure:
X Y t ADC Xm Ym kx ky tm
npartitions=100
float64 float64 float64 float64 float64 float64 float64 float64 float64
... ... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ... ...
Dask Name: assign, 230 graph layers
[11]:
# Apply stored config energy calibration
sp.append_energy_axis(bias_voltage=16.8)
INFO - Adding energy column to dataframe:
INFO - Using energy calibration parameters generated on 11/04/2025, 21:52:11
INFO - Dask DataFrame Structure:
X Y t ADC Xm Ym kx ky tm energy
npartitions=100
float64 float64 float64 float64 float64 float64 float64 float64 float64 float64
... ... ... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ... ... ...
Dask Name: assign, 243 graph layers
[12]:
# Apply delay calibration
delay_range = (-500, 1500)
sp.calibrate_delay_axis(delay_range=delay_range, preview=True)
INFO - Adding delay column to dataframe:
INFO - Append delay axis using delay_range = [-500, 1500] and adc_range = [475.0, 6400.0]
INFO - X Y t ADC Xm \
0 0.221117 0.221117 0.221117 0.221117 -23.459020
1 364.733598 1001.733598 70100.733598 6316.733598 353.217549
2 760.710811 817.710811 75614.710811 6315.710811 791.701522
3 692.163834 971.163834 66455.163834 6317.163834 714.479649
4 671.328970 712.328970 73026.328970 6317.328970 697.251691
5 298.550756 1163.550756 68458.550756 6315.550756 280.195331
6 570.615246 664.615246 73902.615246 6315.615246 588.062503
7 821.712586 544.712586 72631.712586 6317.712586 846.388985
8 817.732184 415.732184 72421.732184 6316.732184 835.703013
9 1006.209138 667.209138 72802.209138 6317.209138 1038.030454
Ym kx ky tm energy delay
0 97.151189 -2.122997 -1.799474 -47.976794 -8.260150 -660.262914
1 1034.406192 -1.112606 0.714604 70083.719258 7.512319 1471.893198
2 838.388388 0.063577 0.188809 75613.837141 0.223628 1471.547953
3 984.046877 -0.143562 0.579521 66449.474480 15.954146 1472.038425
4 741.864316 -0.189774 -0.070106 73025.946970 3.068784 1472.094167
5 1187.135073 -1.308479 1.124281 68432.037438 10.829527 1471.493926
6 702.374609 -0.482661 -0.176032 73899.685004 2.017408 1471.515695
7 586.614899 0.210269 -0.486544 72627.554514 3.583654 1472.223658
8 466.843303 0.181605 -0.807818 72412.068026 3.872118 1471.892720
9 708.178008 0.724325 -0.160466 72794.719628 3.364756 1472.053717
Compute final data volume#
[13]:
axes = ['kx', 'ky', 'energy', 'delay']
bins = [100, 100, 200, 50]
ranges = [[-2, 2], [-2, 2], [-4, 2], [-600, 1600]]
res = sp.compute(bins=bins, axes=axes, ranges=ranges)
[14]:
# save to NXmpes NeXus (including standardized metadata)
sp.save(data_path + "/binned.nxs")
Using mpes reader to convert the given files:
• ../src/sed/config/NXmpes_config.json
The output file generated: /home/runner/work/sed/sed/docs/tutorial/datasets/WSe2/binned.nxs.
[15]:
# Visualization (requires JupyterLab)
from jupyterlab_h5web import H5Web
H5Web(data_path + "/binned.nxs")
[15]:
<jupyterlab_h5web.widget.H5Web object>
[ ]: