Binning of temperature-dependent ARPES data using time-stamped external temperature data#

In this example, we pull some temperature-dependent ARPES data from Zenodo, which was recorded as a continuous temperature ramp. We then add the corresponding temperature information to the dataframe from time-stamped temperature readings, and bin the data as a function of temperature. For performance reasons, it is best to store the data on locally attached storage (not a network drive). This can also be achieved transparently using the included MirrorUtil class.

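Conceptually, assigning a temperature to each event amounts to looking up, for every event time stamp, the temperature recorded at (or interpolated to) that time. Below is a minimal numpy sketch of this idea, using made-up illustration data; it is not the library's actual implementation, and the real assignment is performed later in this notebook by SedProcessor.add_time_stamped_data.

import numpy as np

# made-up sensor readings: temperature sampled every 10 s during a ramp
sensor_times = np.arange(0.0, 100.0, 10.0)                  # time stamps in s
sensor_temps = np.linspace(20.0, 70.0, sensor_times.size)   # temperature in K

# per-event time stamps as they would appear in the dataframe
event_times = np.sort(np.random.uniform(0.0, 90.0, size=1000))

# linear interpolation assigns each event the temperature at its time stamp
event_temps = np.interp(event_times, sensor_times, sensor_temps)
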
[1]:
%load_ext autoreload
%autoreload 2
import numpy as np
import matplotlib.pyplot as plt
import os
import glob

import sed
from sed.dataset import dataset

%matplotlib widget

Load Data#

[2]:
dataset.get("TaS2") # Put in Path to a storage of at least 20 GByte free space.
data_path = dataset.dir
scandir, caldir = dataset.subdirs # scandir contains the data, caldir contains the calibration files

# correct the file timestamps in case they were stored with a wrong timezone:
# compare the modification time of the first scan file to its reference Unix timestamp
tzoffset = os.path.getmtime(scandir + '/Scan0121_1.h5') - 1594998158.0
if tzoffset:
    for file in glob.glob(scandir +'/*.h5'):
        os.utime(file, (os.path.getmtime(file)-tzoffset, os.path.getmtime(file)-tzoffset))
INFO - Not downloading TaS2 data as it already exists at "/home/runner/work/sed/sed/docs/tutorial/datasets/TaS2".
Set 'use_existing' to False if you want to download to a new location.
INFO - Using existing data path for "TaS2": "/home/runner/work/sed/sed/docs/tutorial/datasets/TaS2"
INFO - TaS2 data is already present.
[3]:
# create sed processor using the config file with time-stamps:
sp = sed.SedProcessor(folder=scandir, user_config="../src/sed/config/mpes_example_config.yaml", system_config={}, time_stamps=True, verbose=True)
INFO - Folder config loaded from: [/home/runner/work/sed/sed/docs/tutorial/sed_config.yaml]
INFO - User config loaded from: [/home/runner/work/sed/sed/docs/src/sed/config/mpes_example_config.yaml]
INFO - Default config loaded from: [/opt/hostedtoolcache/Python/3.10.16/x64/lib/python3.10/site-packages/sed/config/default.yaml]
[4]:
# Apply jittering to X, Y, t, ADC columns.
sp.add_jitter()
INFO - add_jitter: Added jitter to columns ['X', 'Y', 't', 'ADC'].
[5]:
sp.bin_and_load_momentum_calibration(df_partitions=10, plane=33, width=3, apply=True)
[6]:
features = np.array([[337., 242.], [289., 327.], [187., 344.], [137., 258.], [189., 161.], [289., 158.], [236.0, 250.0]])
sp.define_features(features=features, rotation_symmetry=6, include_center=True, apply=True)
sp.generate_splinewarp(include_center=True)
INFO - Calculated thin spline correction based on the following landmarks:
pouter_ord: [[137. 258.]
 [187. 344.]
 [289. 327.]
 [337. 242.]
 [289. 158.]
 [189. 161.]]
pcent: (236.0, 250.0)
[7]:
# Adjust pose alignment, using stored distortion correction
sp.pose_adjustment(xtrans=15, ytrans=8, angle=-5, apply=True)
INFO - Applied translation with (xtrans=15.0, ytrans=8.0).
INFO - Applied rotation with angle=-5.0.
[8]:
# Apply stored momentum correction
sp.apply_momentum_correction()
INFO - Adding corrected X/Y columns to dataframe:
Calculating inverse deformation field, this might take a moment...
INFO - Dask DataFrame Structure:
                      X        Y        t      ADC timeStamps sampleBias       Xm       Ym
npartitions=97
                float64  float64  float64  float64    float64    float64  float64  float64
                    ...      ...      ...      ...        ...        ...      ...      ...
...                 ...      ...      ...      ...        ...        ...      ...      ...
                    ...      ...      ...      ...        ...        ...      ...      ...
                    ...      ...      ...      ...        ...        ...      ...      ...
Dask Name: apply_dfield, 492 graph layers
[9]:
# Apply stored config momentum calibration
sp.apply_momentum_calibration()
INFO - Adding kx/ky columns to dataframe:
INFO - Dask DataFrame Structure:
                      X        Y        t      ADC timeStamps sampleBias       Xm       Ym       kx       ky
npartitions=97
                float64  float64  float64  float64    float64    float64  float64  float64  float64  float64
                    ...      ...      ...      ...        ...        ...      ...      ...      ...      ...
...                 ...      ...      ...      ...        ...        ...      ...      ...      ...      ...
                    ...      ...      ...      ...        ...        ...      ...      ...      ...      ...
                    ...      ...      ...      ...        ...        ...      ...      ...      ...      ...
Dask Name: assign, 502 graph layers
[10]:
# Apply stored config energy correction
sp.apply_energy_correction()
INFO - Applying energy correction to dataframe...
INFO - Dask DataFrame Structure:
                      X        Y        t      ADC timeStamps sampleBias       Xm       Ym       kx       ky       tm
npartitions=97
                float64  float64  float64  float64    float64    float64  float64  float64  float64  float64  float64
                    ...      ...      ...      ...        ...        ...      ...      ...      ...      ...      ...
...                 ...      ...      ...      ...        ...        ...      ...      ...      ...      ...      ...
                    ...      ...      ...      ...        ...        ...      ...      ...      ...      ...      ...
                    ...      ...      ...      ...        ...        ...      ...      ...      ...      ...      ...
Dask Name: assign, 516 graph layers
[11]:
# Load energy calibration EDCs
scans = np.arange(127,136)
voltages = np.arange(21,12,-1)
files = [caldir + r'/Scan' + str(num).zfill(4) + '_1.h5' for num in scans]
sp.load_bias_series(data_files=files, normalize=True, biases=voltages, ranges=[(64000, 76000)])
rg = (65500, 66000)
sp.find_bias_peaks(ranges=rg, ref_id=5, infer_others=True, apply=True)
sp.calibrate_energy_axis(ref_energy=-0.5, energy_scale="kinetic", method="lmfit")
INFO - Use feature ranges: [(67180.0, 67780.0), (66820.0, 67384.0), (66436.0, 67012.0), (66088.0, 66652.0), (65764.0, 66316.0), (65500.0, 66004.0), (65188.0, 65704.0), (64864.0, 65416.0), (64624.0, 65140.0)].
INFO - Extracted energy features: [[6.76360000e+04 4.47140008e-01]
 [6.72520000e+04 4.41972464e-01]
 [6.68800000e+04 4.45905387e-01]
 [6.65320000e+04 4.43017632e-01]
 [6.62080000e+04 4.37122852e-01]
 [6.58960000e+04 4.28882003e-01]
 [6.55960000e+04 4.22135979e-01]
 [6.52960000e+04 4.13137674e-01]
 [6.50320000e+04 4.00443912e-01]].
INFO - [[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 163
    # data points      = 9
    # variables        = 3
    chi-square         = 0.00179088
    reduced chi-square = 2.9848e-04
    Akaike info crit   = -70.7004554
    Bayesian info crit = -70.1087817
[[Variables]]
    d:   1.09335629 +/- 0.06668048 (6.10%) (init = 1)
    t0:  7.6176e-07 +/- 1.3448e-08 (1.77%) (init = 1e-06)
    E0: -48.0855611 +/- 1.25773261 (2.62%) (init = -21)
[[Correlations]] (unreported correlations are < 0.100)
    C(d, t0)  = -0.9998
    C(d, E0)  = -0.9993
    C(t0, E0) = +0.9985
[12]:
# Apply stored config energy calibration
#sp.append_energy_axis(bias_voltage=17)
sp.append_energy_axis()
INFO - Adding energy column to dataframe:
INFO - Using energy calibration parameters generated on 02/05/2025, 22:26:53
INFO - Dask DataFrame Structure:
                      X        Y        t      ADC timeStamps sampleBias       Xm       Ym       kx       ky       tm   energy
npartitions=97
                float64  float64  float64  float64    float64    float64  float64  float64  float64  float64  float64  float64
                    ...      ...      ...      ...        ...        ...      ...      ...      ...      ...      ...      ...
...                 ...      ...      ...      ...        ...        ...      ...      ...      ...      ...      ...      ...
                    ...      ...      ...      ...        ...        ...      ...      ...      ...      ...      ...      ...
                    ...      ...      ...      ...        ...        ...      ...      ...      ...      ...      ...      ...
Dask Name: assign, 531 graph layers
[13]:
# add time-stamped temperature data
# either retrieve the data directly from an EPICS archiver instance (within the FHI network):
#sp.add_time_stamped_data(dest_column="T_B", archiver_channel="trARPES:Carving:TEMP-B")
# or use externally provided timestamp/data pairs:
import h5py
with h5py.File(f"{data_path}/temperature_data.h5", "r") as file:
    data = file["temperatures"][()]
    time_stamps = file["timestamps"][()]
sp.add_time_stamped_data(dest_column="sample_temperature", time_stamps=time_stamps, data=data)
INFO - add_time_stamped_data: Added time-stamped data as column sample_temperature.
[14]:
# inspect calibrated event histogram
axes = ['kx', 'ky', 'energy', 'sample_temperature']
ranges = [[-3, 3], [-3, 3], [-6, 2], [10, 300]]
sp.view_event_histogram(dfpid=80, axes=axes, ranges=ranges)

Define the binning ranges and compute calibrated data volume#

[15]:
axes = ['kx', 'ky', 'energy', 'sample_temperature']
bins = [100, 100, 300, 100]
ranges = [[-2, 2], [-2, 2], [-6, 2], [20, 270]]
res = sp.compute(bins=bins, axes=axes, ranges=ranges, normalize_to_acquisition_time="sample_temperature")
INFO - Calculate normalization histogram for axis 'sample_temperature'...
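With normalize_to_acquisition_time, the binned counts are normalized by the time spent at each temperature during the ramp, using the normalization histogram computed above. As a rough sketch of the equivalent manual operation (assuming the internal attributes sp._binned and sp._normalization_histogram, which are also inspected further below; the exact scaling applied internally may differ):

# divide the raw counts by the acquisition time per temperature bin;
# xarray broadcasts the 1D histogram along the sample_temperature dimension
res_manual = sp._binned / sp._normalization_histogram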

Some visualization:#

[16]:
fig, axs = plt.subplots(4, 1, figsize=(4, 12), constrained_layout=True)
res.loc[{'energy':slice(-.1, 0)}].sum(axis=(2,3)).T.plot(ax=axs[0])
res.loc[{'kx':slice(-.2, .2)}].sum(axis=(0,3)).T.plot(ax=axs[1])
res.loc[{'ky':slice(-.2, .2)}].sum(axis=(1,3)).T.plot(ax=axs[2])
res.loc[{'kx':slice(-.2, .2), 'ky':slice(-.2, .2), 'energy':slice(-2, 0.2)}].sum(axis=(0,1)).plot(ax=axs[3])
[16]:
<matplotlib.collections.QuadMesh at 0x7f1bd0f66980>
[17]:
# Inspect effect of histogram normalization
fig, ax = plt.subplots(1,1)
(sp._normalization_histogram/sp._normalization_histogram.sum()).plot(ax=ax)
(sp._binned.sum(axis=(0,1,2))/sp._binned.sum(axis=(0,1,2,3))).plot(ax=ax)
plt.show()
[18]:
# Remaining fluctuations are an effect of the varying count rate throughout the scan
plt.figure()
rate, secs = sp.loader.get_count_rate()
plt.plot(secs, rate)
[18]:
[<matplotlib.lines.Line2D at 0x7f1c38a46ce0>]
[19]:
# Normalize for intensity around the Gamma point
res_norm = res.copy()
res_norm = res_norm/res_norm.loc[{'kx':slice(-.3, .3), 'ky':slice(-.3, .3)}].sum(axis=(0,1,2))
[20]:
fig, axs = plt.subplots(4, 1, figsize=(4, 12), constrained_layout=True)
res_norm.loc[{'energy':slice(-.1, 0)}].sum(axis=(2,3)).T.plot(ax=axs[0])
res_norm.loc[{'kx':slice(-.2, .2)}].sum(axis=(0,3)).T.plot(ax=axs[1])
res_norm.loc[{'ky':slice(-.2, .2)}].sum(axis=(1,3)).T.plot(ax=axs[2])
res_norm.loc[{'kx':slice(-.2, .2), 'ky':slice(-.2, .2), 'energy':slice(-2, 0.5)}].sum(axis=(0,1)).plot(ax=axs[3])
[20]:
<matplotlib.collections.QuadMesh at 0x7f1c387ff580>
[21]:
# Lower Hubbard band intensity versus temperature
plt.figure()
res_norm.loc[{'kx':slice(-.2, .2), 'ky':slice(-.2, .2), 'energy':slice(-.6, 0.1)}].sum(axis=(0,1,2)).plot()
[21]:
[<matplotlib.lines.Line2D at 0x7f1c385e7220>]
[ ]: