Merged
33 changes: 27 additions & 6 deletions src/ess/dream/io/geant4.py
@@ -196,6 +196,27 @@ def _to_edges(centers: sc.Variable) -> sc.Variable:
    )


def _rebin_to_tofrange(da):
    '''Rebins a monitor TOA histogram to the range of values
    expected from a real instrument at ESS.
    That is, to the range [0, 1/14] s.
Member:

I'm getting confused here between TOA = time-of-arrival and ETO = event_time_offset.
ETO is always between 0 and 1/14 s.
TOA is the time the neutron arrived at the detector, so TOF = TOA - birth time of the neutron. TOA can have any value.

Which one are we referring to here?

Contributor Author:

In the CSV files we get TOA. But the time-of-flight lookup table expects ETO. So to be able to compute TOF using the lookup table, we need to convert TOA to ETO.

For the detector data this is easy: it's just a modulo operation on the TOA values in the file.

For (histogrammed) monitor data it's a bit more involved, but essentially the same in principle.
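For illustration, the detector case could look like the following sketch (editorial, with invented values; only the modulo idea comes from this thread):

import scipp as sc

# Invented event times; fold time-of-arrival into one 1/14 s source period.
toa = sc.array(dims=['event'], values=[12.3, 80.1, 143.9], unit='ms')
period = sc.reciprocal(sc.scalar(14.0, unit='Hz')).to(unit='ms')  # ~71.43 ms
eto = toa % period  # event_time_offset, always within [0, 1/14 s)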


    Strategy:
    1. Create a grid that aligns with the pulse times.
    2. Rebin TOA on that grid.
    3. Fold pulses into a new dimension and sum over pulses.
    '''
    dim = da.dim
    unit = da.coords[dim].unit
    # Source period (1/14 s) expressed in the units of the TOA coordinate.
    period = (1.0 / sc.scalar(14.0, unit='Hz')).to(unit=unit)
    # Number of existing bin edges that fall within one period.
    N = sc.sum(da.coords[dim] < period).value
    # Number of whole periods needed to cover the data.
    K = (da.coords[dim].max() // period + 1).to(dtype='int')
    grid = sc.linspace(dim, sc.scalar(0.0, unit=unit), K * period, K * (N - 1) + 1)
    out = da.rebin({dim: grid}).fold(dim, sizes={'_': K, dim: N - 1}).sum('_')
    out.coords[dim] = sc.linspace(dim, sc.scalar(0.0, unit=unit), period, N)
    return out
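As an editorial illustration of the function above (invented numbers, not part of the PR), a flat histogram spanning a bit more than two periods folds down to a single-period histogram with the total counts conserved:

import scipp as sc

# 15 bins of 10 ms covering [0, 150) ms, i.e. just over two 1/14 s periods.
da = sc.DataArray(
    sc.ones(dims=['tof'], shape=[15], unit='counts'),
    coords={'tof': sc.linspace('tof', 0.0, 150.0, 16, unit='ms')},
)
folded = _rebin_to_tofrange(da)
print(folded.sizes)        # {'tof': 7}: one period, 7 bins
print(folded.sum().value)  # 15.0: total counts are conserved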
Member:

This seems very complicated for what is effectively a modulus operation. How about something like

tof, counts, err = np.loadtxt(file_path, usecols=(0, 1, 2), unpack=True)
toa = tof % sc.reciprocal(sc.scalar(14.0, unit="Hz")).to(unit="us").value
toa = _to_edges(sc.array(dims=["toa"], values=toa, unit="us"))
data = sc.DataArray(
    sc.array(dims=["toa"], values=counts, variances=err**2, unit="counts"),
    coords={
        "toa": toa,
    },
)
data = data.hist(toa=some_range_based_on(tof))

Contributor Author:

That doesn't work. You'll get multiple values added to some bins, and no values added to others. You can try it out and look at the difference.

Member:

I also didn't really follow. From the Strategy outlined, it sounded like a modulo operation?

Contributor Author:

You're not wrong, it's similar. But if you use the suggested approach of applying mod and then re-histogramming, you get aliasing effects.

If your final grid is sufficiently coarse this is not very visible, but we don't want to restrict the monitor histogram to a very coarse grid directly after loading the data.
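A toy demonstration of that aliasing (all numbers invented for illustration):

import numpy as np

# 10 source bins of width 4 us over [0, 40) us, folded with a 14 us 'period'.
edges = np.arange(0.0, 41.0, 4.0)
mids = 0.5 * (edges[:-1] + edges[1:]) % 14.0
# Re-histogram the folded bin centres on a 2 us grid.
counts, _ = np.histogram(mids, bins=np.arange(0.0, 15.0, 2.0))
print(counts)  # [1 2 1 2 1 2 1]: some bins receive twice as many contributions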

Member:

Yeah, I think you're right; the important point here is that the input data is histogrammed, right?
Then it's not as simple as just a modulo...

Are we sure that it will always be histogrammed, or do we need to check for da.bins is None?

Contributor Author:

> the important point here is that the input data is histogrammed, right?

Yes, the aliasing problem only affects histogrammed data.

> Are we sure that it will always be histogrammed, or do we need to check for da.bins is None?

In this case we're loading data from a simulation. The current version of the code assumes the monitor is a histogram; I think it's fine to keep that assumption for now. If they change the simulation in the future to have an event-mode monitor, we can deal with that later.

Member:

Can you just add a safety net with

if da.bins is not None:
    raise NotImplementedError

?

Contributor Author:

I think there might be a misunderstanding here. What's your concern?

The monitor data is loaded from a CSV file; it can never be binned the way the program works now.

Member (@nvaytet, Oct 27, 2025):

I'm concerned about someone in the future trying to use this to load a different file where the monitor data would be events. Cf. your comment:

> If they change the simulation in the future to have an event-mode monitor, we can deal with that later.

Contributor Author (@jokasimr, Oct 27, 2025):

I really don't think we need to worry about that.

A non-public auxiliary function that is used in a single place should not validate its arguments. It's only extracted from the place where it is called to improve readability.
We should assume that it will not be re-used blindly for some purpose it was not meant for, in some future context that we cannot predict.
Adding validation here seems to me the same as adding unnecessary validation to a piece of code because it might be copied and used erroneously.

I don't want this to be a long discussion. If you think this is important, please come talk to me in person instead, to avoid misunderstandings.



def load_mcstas_monitor(
    file_path: MonitorFilename[RunType],
    position: CaveMonitorPosition,
@@ -221,14 +242,14 @@
     tof, counts, err = np.loadtxt(file_path, usecols=(0, 1, 2), unpack=True)

     tof = _to_edges(sc.array(dims=["tof"], values=tof, unit="us"))
+    data = sc.DataArray(
+        sc.array(dims=["tof"], values=counts, variances=err**2, unit="counts"),
+        coords={"tof": tof},
+    )
+    data = _rebin_to_tofrange(data)
     return NeXusComponent[CaveMonitor, RunType](
         sc.DataGroup(
-            data=sc.DataArray(
-                sc.array(dims=["tof"], values=counts, variances=err**2, unit="counts"),
-                coords={
-                    "tof": tof,
-                },
-            ),
+            data=data,
             position=position,
         )
     )
12 changes: 8 additions & 4 deletions src/ess/powder/correction.py
@@ -59,11 +59,15 @@ def normalize_by_monitor_histogram(
     )
     lut = sc.lookup(norm, dim="wavelength")
     if detector.bins is None:
-        result = (
-            detector / lut[sc.midpoints(detector.coords['wavelength'], dim='dspacing')]
-        )
+        c = lut[sc.midpoints(detector.coords['wavelength'], dim='dspacing')]
+        result = detector / c
+        result.masks['monitor_intensity_is_zero'] = c == sc.scalar(0.0, unit=c.unit)
     else:
-        result = detector.bins / lut
+        c = lut(detector.bins.coords['wavelength'])
+        result = detector / c
+        result.bins.masks['monitor_intensity_is_zero'] = c == sc.scalar(
+            0.0, unit=c.unit
+        )
     return ScaledCountsDspacing[RunType](result)


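To make the new masking in normalize_by_monitor_histogram concrete, here is an editorial sketch of the dense branch (toy values invented here; only the lookup-and-mask pattern comes from the diff above):

import scipp as sc

# Monitor histogram: 4 counts in [1, 2) Å and 0 counts in [2, 3) Å.
monitor = sc.DataArray(
    sc.array(dims=['wavelength'], values=[4.0, 0.0], unit='counts'),
    coords={
        'wavelength': sc.array(dims=['wavelength'], values=[1.0, 2.0, 3.0], unit='angstrom')
    },
)
lut = sc.lookup(monitor, dim='wavelength')

# Detector histogram whose wavelength midpoints fall at 1.5 and 2.5 Å.
detector = sc.DataArray(sc.array(dims=['wavelength'], values=[10.0, 10.0], unit='counts'))
c = lut[sc.array(dims=['wavelength'], values=[1.5, 2.5], unit='angstrom')]
result = detector / c
# Mask the bin where the monitor intensity is zero instead of keeping 10 / 0 = inf.
result.masks['monitor_intensity_is_zero'] = c == sc.scalar(0.0, unit=c.unit)
print(result.masks['monitor_intensity_is_zero'].values)  # [False  True]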
27 changes: 25 additions & 2 deletions tests/dream/io/geant4_test.py
@@ -10,8 +10,19 @@
 import scipp.testing
 import scippnexus as snx

-from ess.dream import DreamGeant4ProtonChargeWorkflow, data, load_geant4_csv
-from ess.powder.types import Filename, NeXusComponent, NeXusDetectorName, SampleRun
+from ess.dream import DreamGeant4ProtonChargeWorkflow, data, io, load_geant4_csv
+from ess.powder.types import (
+    CaveMonitorPosition,
+    Filename,
+    NeXusComponent,
+    NeXusDetectorName,
+    SampleRun,
+)
+
+
+@pytest.fixture(scope="module")
+def monitor_file_path():
+    return data.simulated_monitor_diamond_sample()


 @pytest.fixture(scope="module")
Expand Down Expand Up @@ -184,3 +195,15 @@ def test_geant4_in_workflow(file_path, file):
    expected = load_geant4_csv(file)["instrument"]["mantle"]["events"]
    expected.coords["detector"] = sc.scalar("mantle")
    sc.testing.assert_identical(detector, expected)


def test_geant4_monitor_in_expected_range(monitor_file_path):
    dummy_position = CaveMonitorPosition(sc.vector([0, 0, 0.0], unit='m'))
    mon = io.geant4.load_mcstas_monitor(monitor_file_path, dummy_position)['data']
    assert (
        mon.coords['tof'] >= sc.scalar(0.0, unit='s').to(unit=mon.coords['tof'].unit)
    ).all()
    assert (
        mon.coords['tof']
        <= sc.scalar(1 / 14.0, unit='s').to(unit=mon.coords['tof'].unit)
    ).all()
13 changes: 11 additions & 2 deletions tests/powder/correction_test.py
@@ -347,7 +347,12 @@ def test_normalize_by_monitor_histogram_expected_results():
         monitor=WavelengthMonitor[SampleRun, CaveMonitor](monitor),
         uncertainty_broadcast_mode=UncertaintyBroadcastMode.fail,
     )
-    expected = ScaledCountsDspacing[SampleRun](detector / monitor.data)
+    # Simple way to get an all-false array with the same shape as the detector data.
+    # Is there a better way?
+    all_false = sc.isinf(detector.bins.data)
+    expected = ScaledCountsDspacing[SampleRun](
+        detector / monitor.data
+    ).bins.assign_masks({'monitor_intensity_is_zero': all_false})
     sc.testing.assert_identical(normalized, expected)


@@ -362,12 +367,16 @@ def test_normalize_by_monitor_histogram_ignores_monitor_values_out_of_range():
             'wavelength': sc.array(dims=['wavelength'], values=[0.0, 3, 4], unit='Å')
         },
     )
+    # Simple way to get an all-false array with the same shape as the detector data.
+    all_false = sc.isinf(detector.bins.data)
     normalized = normalize_by_monitor_histogram(
         CountsWavelength[SampleRun](detector),
         monitor=WavelengthMonitor[SampleRun, CaveMonitor](monitor),
         uncertainty_broadcast_mode=UncertaintyBroadcastMode.fail,
     )
-    expected = ScaledCountsDspacing[SampleRun](detector / sc.scalar(4.0, unit='counts'))
+    expected = ScaledCountsDspacing[SampleRun](
+        detector / sc.scalar(4.0, unit='counts')
+    ).bins.assign_masks({'monitor_intensity_is_zero': all_false})
     sc.testing.assert_identical(normalized, expected)

