rgw: An initial commit of enabling RGWDedup#2

Open
yangdaegon wants to merge 15 commits into main from wip_rgw_dedup_worker

@yangdaegon
Member

@yangdaegon yangdaegon commented Nov 29, 2022

This PR is the first step toward enabling RGWDedup.
RGWDedup is an internal RGW feature that reduces disk usage by selectively removing duplicated data from RGW object data.
With this feature, users can run deduplication on RGW workloads as a built-in RGW function, without manually running the ‘estimate’, ‘sample-dedup’, ‘chunk-scrub’, and ‘chunk-repair’ operations of ceph-dedup-tool.

This PR adds the RGWDedup instance, which controls the RGWDedup feature, and the RGWDedupManager thread, which manages all objects and workers.
It also includes skeleton code for RGWDedupWorker and RGWChunkScrubWorker, which carry out the actual logic on the RADOS layer.


This commit adds an instance of RGWDedup and skeleton code for DedupManager.
  RGWDedup - an instance that controls RGWDedupManager during its lifecycle.
  DedupManager - a thread that manages the whole deduplication routine.

Signed-off-by: Sungmin Lee <sung_min.lee@samsung.com>
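
For readers who want a picture of the ownership described in this commit message, here is a minimal sketch, not the code in this PR. The class names `Dedup` and `DedupManager`, the methods `initialize()`, `finalize()`, `start()`, `stop()`, and the `rounds()` counter are all hypothetical stand-ins; the only idea taken from the commit message is that one instance controls a manager thread for its whole lifecycle.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <memory>
#include <thread>

// Hypothetical stand-in for the manager: a background thread that performs
// one deduplication round per loop iteration until it is asked to stop.
class DedupManager {
public:
  ~DedupManager() { stop(); }

  void start() {
    if (worker.joinable()) return;  // already running
    stop_requested = false;
    worker = std::thread([this] { run(); });
  }

  void stop() {
    stop_requested = true;
    if (worker.joinable()) worker.join();
  }

  int rounds() const { return rounds_done.load(); }

private:
  void run() {
    while (!stop_requested) {
      // A real manager would list objects and dispatch workers here.
      ++rounds_done;
      std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
  }

  std::atomic<bool> stop_requested{false};
  std::atomic<int> rounds_done{0};
  std::thread worker;
};

// Hypothetical stand-in for the controlling instance: it owns the manager
// and ties the manager's lifetime to initialize()/finalize().
class Dedup {
public:
  void initialize() {
    assert(!manager);  // initialize() must not be called twice
    manager = std::make_unique<DedupManager>();
    manager->start();
  }

  void finalize() {
    if (manager) manager->stop();
    manager.reset();
  }

  bool running() const { return manager != nullptr; }

private:
  std::unique_ptr<DedupManager> manager;
};
```

Calling `initialize()` once at startup and `finalize()` at shutdown is enough in this sketch; `stop()` is idempotent, so the manager's destructor can call it again safely.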
get_object() collects all the RADOS objects in the zone to which the current RGW belongs.
@ssdohammer-sl ssdohammer-sl force-pushed the wip_rgw_dedup_worker branch 2 times, most recently from aa52f25 to 329da6c Compare November 29, 2022 04:32
@yangdaegon yangdaegon added the enhancement New feature or request label Nov 29, 2022
@yangdaegon yangdaegon marked this pull request as ready for review November 29, 2022 05:23
ssdohammer-sl and others added 3 commits December 16, 2022 14:24
 - Change function name from get_rados_objects() to prepare_dedup_work()
 - Add append_ioctxs() to get base, chunk, and cold pools from existing data pools of storage_classes
 - Add set_dedup_tier() to declare dedup_tier between base-pools and chunk-pools.
@ssdohammer-sl ssdohammer-sl force-pushed the wip_rgw_dedup_worker branch 2 times, most recently from 78eb44f to ea9e795 Compare December 18, 2022 18:49
ssdohammer-sl and others added 5 commits January 20, 2023 07:12
RGWFPManager is an RGWDedup component that stores chunk statistics collected by RGWDedupWorker.

This commit adds RGWFPManager and its test code.

Signed-off-by: daegon.yang <daegon.yang@samsung.com>
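
As an illustration of what "storing chunk statistics" could look like, the sketch below keeps a per-fingerprint sighting count behind a mutex. It is an assumption-laden sketch, not the RGWFPManager in this PR: the class name `FPManager`, the methods `add()`, `count()`, and `size()`, and the choice of a plain string key are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a fingerprint manager: maps a chunk fingerprint
// to the number of times workers have reported seeing that chunk.
class FPManager {
public:
  // Record one sighting of `fingerprint` and return the updated count.
  std::size_t add(const std::string& fingerprint) {
    assert(!fingerprint.empty());
    std::lock_guard<std::mutex> lock(mtx);
    return ++fp_count[fingerprint];
  }

  // Number of sightings recorded so far (0 if the fingerprint is unknown).
  std::size_t count(const std::string& fingerprint) const {
    std::lock_guard<std::mutex> lock(mtx);
    auto it = fp_count.find(fingerprint);
    return it == fp_count.end() ? 0 : it->second;
  }

  // Number of distinct fingerprints stored.
  std::size_t size() const {
    std::lock_guard<std::mutex> lock(mtx);
    return fp_count.size();
  }

private:
  mutable std::mutex mtx;  // several workers may report concurrently
  std::unordered_map<std::string, std::size_t> fp_count;
};
```

A count greater than one marks a chunk as a deduplication candidate in this sketch; what threshold the actual implementation uses is not stated in the commit message.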
RGWDedupWorker chunks a RADOS object and stores each chunk's fingerprint in FPManager.
If the fingerprint is already stored in FPManager, the chunk is deduplicated.

This commit implements RGWDedupWorker's dedup logic.

Signed-off-by: daegon.yang <daegon.yang@samsung.com>
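
The chunk-then-look-up flow described in this commit message can be sketched roughly as follows. This is a simplified illustration, not the worker in this PR: fixed-size chunking, `std::hash` as the fingerprint, and the names `chunk_object()`, `fingerprint()`, `dedup_object()`, and `DedupResult` are all assumptions. A real worker would use a collision-resistant hash and would replace a duplicate chunk with a reference rather than just counting it.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>
#include <vector>

// Split `data` into fixed-size chunks; the last chunk may be shorter.
std::vector<std::string> chunk_object(const std::string& data,
                                      std::size_t chunk_size) {
  assert(chunk_size > 0);
  std::vector<std::string> chunks;
  for (std::size_t off = 0; off < data.size(); off += chunk_size) {
    chunks.push_back(data.substr(off, chunk_size));
  }
  return chunks;
}

// Illustrative fingerprint only: std::hash is NOT collision resistant.
std::size_t fingerprint(const std::string& chunk) {
  return std::hash<std::string>{}(chunk);
}

struct DedupResult {
  std::size_t total_chunks = 0;
  std::size_t deduped_chunks = 0;
};

// Hypothetical worker step: fingerprint each chunk and consult the store.
DedupResult dedup_object(const std::string& data, std::size_t chunk_size,
                         std::unordered_set<std::size_t>& fp_store) {
  DedupResult res;
  for (const auto& chunk : chunk_object(data, chunk_size)) {
    ++res.total_chunks;
    // insert().second is false when the fingerprint was already stored.
    if (!fp_store.insert(fingerprint(chunk)).second) {
      ++res.deduped_chunks;  // a real worker would dedup the chunk here
    }
  }
  return res;
}
```

Passing the same `fp_store` to successive calls lets duplicates be detected across objects as well as within one object.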
@yangdaegon yangdaegon force-pushed the wip_rgw_dedup_worker branch from c8403c3 to ac0549d Compare January 25, 2023 00:41
@github-actions

github-actions bot commented Aug 7, 2023

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Aug 7, 2023
yangdaegon pushed a commit that referenced this pull request Dec 6, 2023
Sanitized backtrace:
```
DEBUG 2023-11-14 15:23:50,871 [shard 0] osd - snaptrim_event(id=10610, detail=SnapTrimEvent(pgid=16.1a snapid=a needs_pause=0)): interrupted crimson::common::actingset_changed (acting set changed)

    #0 0x5653c613c071 in seastar::shared_mutex::unlock() (/usr/bin/ceph-osd+0x1ed27071)
    #1 0x5653c8670acf in auto seastar::futurize_invoke<crimson::OrderedConcurrentPhaseT<crimson::osd::SnapTrimEvent::WaitSubop>::ExitBarrier<crimson::OrderedConcurrentPhaseT<crimson::osd::SnapTrimEvent::WaitSubop>::BlockingEvent::Trigger<crimson::osd::SnapTrimEvent> >::exit()::{lambda()#1}&>(crimson::OrderedConcurrentPhaseT<crimson::osd::SnapTrimEvent::WaitSubop>::ExitBarrier<crimson::OrderedConcurrentPhaseT<crimson::osd::SnapTrimEvent::WaitSubop>::BlockingEvent::Trigger<crimson::osd::SnapTrimEvent> >::exit()::{lambda()#1}&) (/usr/bin/ceph-osd+0x2125bacf)
    #2 0x5653c8670e22 in _ZN7seastar20noncopyable_functionIFNS_6futureIvEEvEE17direct_vtable_forIZNS2_4thenIZN7crimson23OrderedConcurrentPhaseTINS7_3osd13SnapTrimEvent9WaitSubopEE11ExitBarrierINSC_13BlockingEvent7TriggerISA_EEE4exitEvEUlvE_S2_EET0_OT_EUlDpOT_E_E4callEPKS4_ (/usr/bin/ceph-osd+0x2125be22)

freed by thread T1 here:
    #0 0x7f10628b73cf in operator delete(void*, unsigned long) (/lib64/libasan.so.6+0xb73cf)
    #1 0x5653c8794bff in crimson::osd::SnapTrimEvent::~SnapTrimEvent() (/usr/bin/ceph-osd+0x2137fbff)

previously allocated by thread T1 here:
    #0 0x7f10628b6367 in operator new(unsigned long) (/lib64/libasan.so.6+0xb6367)

SUMMARY: AddressSanitizer: heap-use-after-free (/usr/bin/ceph-osd+0x1ed27071) in seastar::shared_mutex::unlock()
```

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
yangdaegon pushed a commit that referenced this pull request Dec 6, 2023
```
    // SnapTrimEvent is a background operation;
    // its lifetime is not guaranteed since the caller's
    // returned future is being ignored. We should capture
    // a self reference throughout the entire execution
    // progress (not only on finally() continuations).
    // See: PG::on_active_actmap()
```

Sanitized backtrace:
```
DEBUG 2023-11-16 08:42:48,441 [shard 0] osd - snaptrim_event(id=21122, detail=SnapTrimEvent(pgid=3.1 snapid=3cb needs_pause=1)): interrupted crimson::common::actingset_changed (acting set changed

kernel callstack:
    #0 0x55e310e0ace7 in seastar::shared_mutex::unlock() (/usr/bin/ceph-osd+0x1edd0ce7)
    #1 0x55e313325d9c in auto seastar::futurize_invoke<crimson::OrderedConcurrentPhaseT<crimson::osd::SnapTrimEvent::WaitSubop>::ExitBarrier<crimson::OrderedConcurrentPhaseT<crimson::osd::SnapTrimEvent::WaitSubop>::BlockingEvent::Trigger<crimson::osd::SnapTrimEvent> >::exit()::{lambda()#1}&>(crimson::OrderedConcurrentPhaseT<crimson::osd::SnapTrimEvent::WaitSubop>::ExitBarrier<crimson::OrderedConcurrentPhaseT<crimson::osd::SnapTrimEvent::WaitSubop>::BlockingEvent::Trigger<crimson::osd::SnapTrimEvent> >::exit()::{lambda()#1}&) (/usr/bin/ceph-osd+0x212ebd9c)
    #2 0x55e3133260ef in _ZN7seastar20noncopyable_functionIFNS_6futureIvEEvEE17direct_vtable_forIZNS2_4thenIZN7crimson23OrderedConcurrentPhaseTINS7_3osd13SnapTrimEvent9WaitSubopEE11ExitBarrierINSC_13BlockingEvent7TriggerISA_EEE4exitEvEUlvE_S2_EET0_OT_EUlDpOT_E_E4callEPKS4_ (/usr/bin/ceph-osd+0x212ec0ef)
0x61500013365c is located 92 bytes inside of 472-byte region [0x615000133600,0x6150001337d8)
freed by thread T2 here:
    #0 0x7fb345ab73cf in operator delete(void*, unsigned long) (/lib64/libasan.so.6+0xb73cf)
    #1 0x55e313474863 in crimson::osd::SnapTrimEvent::~SnapTrimEvent() (/usr/bin/ceph-osd+0x2143a863)

previously allocated by thread T2 here:
    #0 0x7fb345ab6367 in operator new(unsigned long) (/lib64/libasan.so.6+0xb6367)
    #1 0x55e31183ac18 in auto crimson::OperationRegistryI::create_operation<crimson::osd::SnapTrimEvent, crimson::osd::PG*, SnapMapper&, snapid_t const&, bool const&>(crimson::osd::PG*&&, SnapMapper&, snapid_t const&, bool const&) (/usr/bin/ceph-osd+0x1f800c18)
SUMMARY: AddressSanitizer: heap-use-after-free (/usr/bin/ceph-osd+0x1edd0ce7) in seastar::shared_mutex::unlock()
```

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
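
The lifetime fix described in the commit comment (capturing a self reference for the whole execution of a background operation) can be illustrated without Seastar. The sketch below is not crimson code: `BackgroundOp`, `TaskQueue`, `launch_and_forget()`, and `drain()` are hypothetical, and a plain vector of callbacks stands in for the reactor. It only shows why a continuation that captures `shared_from_this()` stays valid after the caller drops its handle.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Deferred continuations, standing in for a reactor's task queue.
using TaskQueue = std::vector<std::function<void()>>;

// Hypothetical background operation whose caller ignores the result.
class BackgroundOp : public std::enable_shared_from_this<BackgroundOp> {
public:
  explicit BackgroundOp(int* done_counter) : done(done_counter) {
    assert(done_counter != nullptr);
  }

  // Schedules work that touches this object later. Capturing
  // shared_from_this() keeps the object alive until the continuation runs;
  // capturing a raw `this` here could leave a dangling pointer.
  void start(TaskQueue& queue) {
    queue.push_back([self = shared_from_this()] { self->finish(); });
  }

private:
  void finish() { ++*done; }
  int* done;
};

// Creates an operation, starts it, and drops the only external reference.
inline void launch_and_forget(TaskQueue& queue, int* done_counter) {
  auto op = std::make_shared<BackgroundOp>(done_counter);
  op->start(queue);
}  // `op` goes out of scope here; the queued lambda still owns the object

// Runs and then clears all queued continuations.
inline void drain(TaskQueue& queue) {
  for (auto& task : queue) task();
  queue.clear();
}
```

In this sketch the object is destroyed only when `drain()` clears the queue, which is after `finish()` has run.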
Labels

enhancement New feature or request stale

2 participants