
Add OpenMP Parallel Implementation for Belief Propagation Decoder#78

Open
0xSooki wants to merge 5 commits into quantumgizmos:main from 0xSooki:add-openmp-for-bp

Conversation

0xSooki commented Jun 5, 2025

This PR introduces OpenMP-based parallelization to the Belief Propagation (BP) decoder, significantly improving performance for large LDPC codes. Closes #72

Summary

  • Added OpenMP support for parallel execution of BP iterations
  • Added Cython bindings for the underlying parallel C++ implementation

Testing

  • Added performance benchmarks across 1, 2, 4, and 8 threads; all parallel results verified against the serial implementation
  • Performance validated on 1200×600 LDPC matrices

Performance

Based on the `TEST(BpDecoderParallel, ThreadScaling)` benchmark:

| Threads | Parallel Time (μs) | Serial Time (μs) | Speedup vs Serial |
|--------:|-------------------:|-----------------:|------------------:|
| 1       | 13,445             | 24,284           | 1.81x             |
| 2       | 9,201              | 24,239           | 2.63x             |
| 4       | 6,953              | 24,310           | 3.50x             |
| 8       | 10,407             | 24,591           | 2.36x             |

@quantumgizmos (Owner)

Nice work. Do you have any intuition as to why there is a slowdown from 4 → 8 threads?

  • Could it be that the extra cores on your device are efficiency cores or similar?

  • Perhaps further speedups could be obtained by investigating the shared vs private OpenMP variables across cores?

0xSooki (Author) commented Jun 5, 2025

My intuition is that with more threads the overhead of managing them grows, and the cache is used less efficiently. Something similar occurred while I was writing my thesis, so in the end I used only 5 threads to get the maximum speed benefit from parallelization. I will look into further speedup improvements.

@quantumgizmos (Owner)

Is the shared memory for each thread being recopied at each iteration I wonder?
I would have thought that the parallelisation overhead should be quite small once the thread pool (and associated memory for each thread) is initialised.

However, if the entire PCM is being copied to each thread at each iteration, I could see that this might cause quite some overhead.

0xSooki (Author) commented Jun 5, 2025

I believe that class members should be shared by default (their values are not copied per thread). However, I will try benchmarking with the explicit `shared` keyword.

@quantumgizmos (Owner)

Another thing to try would be benchmarking on larger LDPC codes. In quantum error correction, we often decode codes over matrices of 10,000+ columns. It's possible that the parallelisation overhead could be less of a bottleneck in this regime.

0xSooki (Author) commented Jun 5, 2025

Ahh yes, with 15,000 columns I was able to achieve the following results. This time, 8 threads did much better.

| Threads | Parallel Time (μs) | Serial Time (μs) | Speedup vs Serial |
|--------:|-------------------:|-----------------:|------------------:|
| 1       | 167,997            | 324,516          | 1.93x             |
| 2       | 95,153             | 336,669          | 3.54x             |
| 4       | 109,404            | 335,716          | 3.07x             |
| 8       | 78,741             | 324,567          | 4.12x             |

quantumgizmos (Owner) commented Jun 5, 2025

This is great to see. I noticed your benchmark is running over a single random syndrome. Some syndromes are more difficult to decode than others, so this could be accounting for the increased speed at thread_count=4. Could you average over a larger number of cycles (e.g. 1000 or so)? If necessary, to speed things up, you could increase the sparsity of the matrix and syndrome.

0xSooki (Author) commented Jun 5, 2025

I have added a cycles parameter for the benchmarks and ran one with a matrix of 10,000 columns over 1,000 cycles.

| Threads | Parallel avg (μs) | Serial avg (μs) | Speedup |
|--------:|------------------:|----------------:|--------:|
| 1       | 91,745            | 184,536         | 2.01x   |
| 2       | 60,514            | 184,029         | 3.04x   |
| 4       | 42,364            | 184,181         | 4.35x   |
| 8       | 48,088            | 185,780         | 3.86x   |

0xSooki (Author) commented Jun 10, 2025

I have tried some improvements, such as using locks, OpenMP atomics, or storing partial results in a matrix and reducing it afterwards, thus eliminating the critical section. I managed to get a small speedup at higher thread counts by using atomics, for a run with the same specs as #78 (comment): it reduced the time from ~48,000 μs to ~42,000 μs for 8 threads.

0xSooki (Author) commented Jun 16, 2025

@quantumgizmos, just a gentle reminder that today was mentioned as the last day for reviewing open PRs. If it would be helpful to keep iterating, I’d be more than happy to continue working on it.

@quantumgizmos (Owner)

Hi @0xSooki Can you make a comment on #72 ? I can assign you to that Issue and the UnitaryHack team will be in touch about the bounty reward.

@quantumgizmos (Owner)

If you are interested in working on this beyond UnitaryHack, we can explore ways of further improving the OpenMP implementation. Let me know :)

0xSooki (Author) commented Jun 22, 2025

I would be more than happy to keep working on it. What would be your preferred way of communication?

@quantumgizmos (Owner)

@0xSooki Great. My email address is joschka@roffe.eu
