Closed
89 commits
3dc05ec
SWDEV-520417 - Cherry-pick fixes from Shore to amd-mainline (#1470)
kzhuravl Apr 11, 2025
bee2246
SWDEV-520417 - Cherry-pick fixes from Shore to amd-mainline (#1470) (…
Aug 1, 2025
fa85384
[Compiler CI/CD] fix psdb mainline trigger
skganesan008 Jun 3, 2025
53377bb
Add Offoading to llvm-readobj and llvm-objcopy
david-salinas Aug 28, 2024
d0220b3
Add Offoading to llvm-readobj and llvm-objcopy (#3395)
Aug 1, 2025
580ad88
[Compiler CI/CD] fix psdb mainline trigger (#3393)
Aug 1, 2025
c013b65
Disable flang warning, its erroneously firing on all uses
ronlieb Jun 8, 2025
aadc521
Disable flang warning, its erroneously firing on all uses (#3444)
Aug 5, 2025
a73be02
rocm 70 : amdflang-legacy -> amdflang-classic
ronlieb Feb 24, 2025
9559a30
rocm 70 : amdflang-legacy -> amdflang-classic (#3447)
Aug 5, 2025
2ec0ccc
[amdllvm] - Remove symlink for flang-classic
estewart08 Jul 10, 2025
2d1c17a
[amdllvm] - Remove symlink for flang-classic (#3451)
Aug 5, 2025
79000d8
[runtimes] - Use cmake semicolon generator expression for rocr-runtim…
estewart08 Jul 22, 2025
45ebc24
[Comgr] Turn LIT tests back on
lamb-j Jul 25, 2025
16f43f7
SWDEV-522811 - fix compress/decompress in LLVM Offloading API
Jun 17, 2025
6372494
[Comgr] Turn LIT tests back on (#3455)
Aug 5, 2025
74ab911
[runtimes] - Use cmake semicolon generator expression for rocr-runtim…
Aug 5, 2025
75d2650
SWDEV-522811 - fix compress/decompress in LLVM Offloading API (#3456)
Aug 5, 2025
8960842
Reapply "[AMDGPU][MC] Allow op_sel in v_alignbit_b32 etc in GFX9 and …
jwanggit86 Jul 21, 2025
cc3d141
Reapply "[AMDGPU][MC] Allow op_sel in v_alignbit_b32 etc in GFX9 and …
Aug 6, 2025
0ee704b
AMDGPU: Fix assert when multi operands to update after folding imm (#…
macurtis-amd Jul 16, 2025
520cf38
Update amd/comgr/test-lit/CMakeLists.txt
lamb-j Aug 6, 2025
b9374bb
AMDGPU: Fix assert when multi operands to update after folding imm (#…
Aug 7, 2025
352815d
[OpenMP] Update printf statement with missing argument. (#153704)
CatherineMoore Aug 15, 2025
2e6e2e5
[OpenMP] Update ompdModule.c printf to match argument type (#152785)
CatherineMoore Aug 15, 2025
7844040
[OpenMP] Update printf stmt in kmp_settings.cpp (#152800)
CatherineMoore Aug 14, 2025
80c52ba
Fix various CodeQL issues (#3635)
Aug 19, 2025
b47a040
CP for #3537
Shoreshen Aug 25, 2025
4dc7fed
CP for #3537 (#3718)
lajagapp Aug 25, 2025
00ed955
[AMDGPU] Do not fold an immediate into instructions with frame indexe…
changpeng Aug 6, 2025
8357ccc
[AMDGPU] Do not fold an immediate into instructions with frame indexe…
Aug 25, 2025
fdf996f
[AMDGPU] Fix op_sel settings for v_cvt_scale32_* and v_cvt_sr_* (#151…
changpeng Jul 30, 2025
6778db2
[AMDGPU] Fix op_sel settings for v_cvt_scale32_* and v_cvt_sr_* (#151…
Aug 25, 2025
4773e06
[Comgr] Support 4-field-triple for Spirv
lamb-j Aug 4, 2025
fb10877
[Comgr] Support 4-field-triple for Spirv (#3732)
Aug 25, 2025
c25d6c5
[HIP] compressed bundle format defaults to v3 (#152600)
yxsamliu Aug 8, 2025
edcb2f2
[HIP] compressed bundle format defaults to v3 (#152600) (#3734)
Aug 25, 2025
04692cf
[AMDGPU] Ensure non-reserved CSR spilled regs are live-in (#146427)
macurtis-amd Aug 1, 2025
3bd0f94
[AMDGPU] Ensure non-reserved CSR spilled regs are live-in (#146427) (…
Aug 25, 2025
bf1fe01
[AMDGPU] AMDGPUPromoteAlloca: increase default max-regs to 32 (#155076)
perlfu Aug 26, 2025
96f63b4
[AMDGPU] AMDGPUPromoteAlloca: increase default max-regs to 32 (#15507…
Aug 26, 2025
301eb38
[offload] recover performance on 534.hpgmg [used]
ronlieb Sep 6, 2025
e784a9a
[offload] recover performance on 534.hpgmg [used] (#3923)
Sep 7, 2025
753bbd3
[AMDGPU][Attributor] Remove final update of waves-per-eu after the at…
shiltian Aug 27, 2025
64ee5b3
[AMDGPU][Attributor] Remove final update of waves-per-eu after the at…
Sep 17, 2025
75432aa
[AMDGPU] NFCI: Track AV Register Pressure separately (#149863)
jrbyrnes Jul 25, 2025
2c56a3e
[AMDGPU] NFCI: Track AV Register Pressure separately (#149863) (#4095)
CatherineMoore Sep 17, 2025
c37b82f
[NFC] Fix compile warnings in `llvm/unittests/Object/OffloadingBundle…
shiltian Aug 28, 2025
9303cdf
[NFC] Fix compile warnings in `llvm/unittests/Object/OffloadingBundle…
Sep 17, 2025
a397ae6
[InstCombine] Added pattern for recognising the construction of packe…
zGoldthorpe Jul 30, 2025
0806d7b
[InstCombine] Added pattern for recognising the construction of packe…
Sep 17, 2025
01828d4
[compiler-rt]: fix CodeQL errors (#3798)
ampandey-AMD Sep 5, 2025
81cb3e3
[compiler-rt]: fix CodeQL errors (#3798) (#4100)
Sep 18, 2025
7861d81
[ASan] Intercept rocr api's 'hsa_amd_vmem_address_reserve_align' and …
ampandey-AMD Sep 2, 2025
3f0bfdb
[NFC][Clang] Fix typo in feature predicate insertion
lamb-j Sep 2, 2025
feb1126
[AMDGPU] Make ds/global load intrinsics IntrArgMemOnly (#152792)
choikwa Aug 12, 2025
cc93ab3
[AMDGPU] Make ds/global load intrinsics IntrArgMemOnly (#152792) (#4113)
Sep 18, 2025
4b1ff96
[AMDGPU] Restrict to VGPR only for mfma scale operands (#158117)
changpeng Sep 11, 2025
425bfcd
[Comgr] Fix memory leak in name expression API
lamb-j Sep 8, 2025
9cd4d5f
[Comgr] Fix memory leak in name expression API (#4116)
Sep 18, 2025
a431e5a
[AMDGPU] Restrict to VGPR only for mfma scale operands (#158117) (#4115)
Sep 18, 2025
844cf36
[NFC][Clang] Fix typo in feature predicate insertion (#4112)
Sep 18, 2025
39bfe1b
[ASan] Intercept rocr api's 'hsa_amd_vmem_address_reserve_align' and …
Sep 18, 2025
eb22c5a
offload-tunnel-cmake with proper escape
ronlieb Oct 1, 2025
9046f1e
offload-tunnel-cmake with proper escape (#4312)
Oct 2, 2025
9e6aaa6
Apply atomicrmw metadata lowering patches (#3783)
saipoorna Sep 17, 2025
10996a5
Apply atomicrmw metadata lowering patches (#3783) (#4345)
Oct 3, 2025
276de06
[InstCombine] Fold integer unpack/repack patterns through ZExt (#153583)
zGoldthorpe Aug 15, 2025
e442d97
[InstCombine] Fold integer unpack/repack patterns through ZExt (#1535…
Oct 4, 2025
3d9374a
[AMDGPU] Enable i8 vectorization and sinking of free vector ops (#3971)
ronlieb Sep 20, 2025
bdad897
[AMDGPU] Enable i8 vectorization and sinking of free vector ops (#397…
Oct 4, 2025
16f5f73
[AMDGPU] NFC. Add testcase to test SIInsertWaitcnts::generateWaitcntI…
choikwa Sep 10, 2025
796d821
[AMDGPU] NFC. Add testcase to test SIInsertWaitcnts::generateWaitcntI…
Oct 6, 2025
3fded30
[AMDGPU] Remove scope check in SIInsertWaitcnts::generateWaitcntInstB…
choikwa Sep 12, 2025
5065ac8
[AMDGPU] Remove scope check in SIInsertWaitcnts::generateWaitcntInstB…
Oct 6, 2025
abbb17f
[AMDGPU]: Unpack packed instructions overlapped by MFMAs post-RA sche…
akadutta Sep 19, 2025
c7cae27
[AMDGPU]: Unpack packed instructions overlapped by MFMAs post-RA sche…
Oct 6, 2025
b6bd867
AMDGPU: Report unaligned scratch access as fast if supported by tgt (…
macurtis-amd Sep 15, 2025
9653571
AMDGPU: Report unaligned scratch access as fast if supported by tgt (…
Oct 6, 2025
7bfba6d
[AMDGPU] Fix 160181. Be less optimistic when allocating module scope …
hjagasiaAMD Oct 10, 2025
ffd4bcc
[AMDGPU] Fix 160181. Be less optimistic when allocating module scope …
ronlieb Oct 12, 2025
b2a4d5d
Amd/dev/catmoore/rel path (#4422)
CatherineMoore Oct 10, 2025
a2ee874
Amd/dev/catmoore/rel path (#4422) (#4491)
CatherineMoore Oct 15, 2025
0fa6648
Adjust device-libs search (#4517)
kzhuravl Oct 21, 2025
974b96d
Add missing ${extra_cmake_args} (#4603)
kzhuravl Oct 22, 2025
687ddd9
Disable azure llvm-project PSDB build trigger (#352)
skganesan008 Oct 22, 2025
d39c106
Revert "[Comgr][NFC] Refactor remove dir for windows" (#3141) (#344)
kzhuravl Oct 23, 2025
4affeb6
convert HIP struct type vector to llvm vector type (#416)
kzhuravl Oct 29, 2025
3633054
[AMDGPU][Attributor] Stop inferring amdgpu-no-flat-scratch-init in sa…
skc7 Oct 3, 2025
2 changes: 2 additions & 0 deletions .azuredevops/rocm-ci.yml
@@ -23,5 +23,7 @@ trigger:
- '**/*.md'
- LICENSE.TXT

pr: none

jobs:
- template: ${{ variables.CI_COMPONENT_PATH }}/llvm-project.yml@pipelines_repo
106 changes: 106 additions & 0 deletions .github/workflows/PSDB-amd-mainline.yml
@@ -0,0 +1,106 @@
name: Compiler CI PSDB trigger on amd-mainline branch

# Controls when the workflow will run
on:
  pull_request:
    branches: [amd-mainline]
    types: [opened, reopened, synchronize, ready_for_review]
  workflow_dispatch:

# A workflow run is made up of one or more jobs that can run sequentially or in parallel, below is a single job called invoke jenkins jobs
jobs:
  # This workflow contains a single job called "invoke_jenkins_PSDB"
  invoke_jenkins_PSDB:
    if: github.event.pull_request.draft == false
    runs-on:
      group: compiler-generic-runners
    env:
      svc_acc_org_secret: ${{secrets.CI_GITHUB_TOKEN}}
      input_sha: ${{ github.event.pull_request.head.sha != '' && github.event.pull_request.head.sha || github.sha }}
      input_pr_num: ${{ github.event.pull_request.number != '' && github.event.pull_request.number || 0 }}
      input_pr_url: ${{ github.event.pull_request.html_url != '' && github.event.pull_request.html_url || '' }}
      input_pr_title: ${{ github.event.pull_request.title != '' && github.event.pull_request.title || '' }}
      # set the pipeline name here based on branch name
      pipeline_name: ${{secrets.CI_JENKINS_MAINLINE_JOB_NAME}}
      JENKINS_URL: ${{secrets.CI_JENKINS_URL}}
      CONTAINER_IMAGE: ${{ secrets.JENKINS_TRIGGER_DOCKER_IMAGE }}

    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:

      # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
      - name: Set environment variable for container image
        run: |
          echo "CONTAINER_IMAGE=${{ secrets.JENKINS_TRIGGER_DOCKER_IMAGE }}" >> $GITHUB_ENV
          echo "CONTAINER_NAME=my_container_${{ github.run_id }}" >> $GITHUB_ENV


      - name: Pull container image
        run: docker pull "${{env.CONTAINER_IMAGE}}"


      - name: Run container
        run: |
          docker run -d --name "${{env.CONTAINER_NAME}}" $CONTAINER_IMAGE sleep infinity
          #docker exec "${{env.CONTAINER_NAME}}" /bin/bash -c "git clone ${{secrets.CI_UTILS_REPO}} ."
          docker exec "${{env.CONTAINER_NAME}}" /bin/bash -c "echo 'Running commands inside the container'"

      - name: Escape pull request title
        run: |
          import json
          import os
          import shlex
          with open('${{ github.event_path }}') as fh:
              event = json.load(fh)
          escaped = event['pull_request']['title']
          with open(os.environ['GITHUB_ENV'], 'a') as fh:
              print(f'PR_TITLE={escaped}', file=fh)
        shell: python3 {0}

      - name: Run Jenkins Cancel Script
        env:
          JENKINS_URL: ${{secrets.CI_JENKINS_URL}}
          JENKINS_USER: ${{secrets.CI_JENKINS_USER}}
          JENKINS_API_TOKEN: ${{secrets.CI_JENKINS_TOKEN}}
          JENKINS_JOB_NAME: ${{secrets.CI_JENKINS_JOB_NAME}}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          COMMIT_HASH: ${{ github.event.after }}
        run: |
          docker exec -e JENKINS_JOB_NAME=${{secrets.CI_JENKINS_JOB_NAME}} -e PR_NUMBER=${{ github.event.pull_request.number }} -e COMMIT_HASH=${{ github.event.after }} -e JENKINS_URL=${{secrets.CI_JENKINS_URL}} -e JENKINS_USER=${{secrets.CI_JENKINS_USER}} -e JENKINS_API_TOKEN=${{secrets.CI_JENKINS_TOKEN}} "${{env.CONTAINER_NAME}}" /bin/bash -c "PYTHONHTTPSVERIFY=0 python3 cancel_previous_build.py"


      # Runs a set of commands using the runners shell
      - name: Getting Event Details
        run: |
          echo $(pwd)
          echo $GITHUB_ENV
          echo $GITHUB_REPOSITORY
          echo $GITHUB_SERVER_URL
          echo "GITHUB_SHA is: $GITHUB_SHA"
          echo "GITHUB_WORKFLOW_SHA is: $GITHUB_WORKFLOW_SHA"
          echo "GITHUB_BASE_REF is: $GITHUB_BASE_REF"
          echo "GITHUB_REF_NAME is: $GITHUB_REF_NAME"
          echo "github.event.pull_request.id is: ${{github.event.pull_request.id}}"
          echo "github.event.pull_request.html_url is: ${{github.event.pull_request.html_url}}"
          echo "github.event.pull_request.number is: ${{github.event.pull_request.number}}"
          echo "github.event.pull_request.url is: ${{github.event.pull_request.url}}"
          echo "github.event.pull_request.issue_url is: ${{github.event.pull_request.issue_url}}"
          echo "github.event.pull_request.head.sha is: ${{github.event.pull_request.head.sha}}"
          echo "github.event.pull_request.base.ref is: ${{github.event.pull_request.base.ref}}"
          echo "github.event.pull_request.merge_commit_sha is: ${{github.event.pull_request.merge_commit_sha}}"
          echo "github.event.pull_request is: ${{github.event.pull_request}}"


      - name: Trigger Jenkins Pipeline
        if: steps.check_changes.outcome != 'failure'
        run: |
          echo "--Running jenkins_api.py with input sha - $input_sha for pull request - $input_pr_url"
          docker exec -e GITHUB_REPOSITORY="$GITHUB_REPOSITORY" -e svc_acc_org_secret="$svc_acc_org_secret" -e input_sha="$input_sha" -e input_pr_url="$input_pr_url" -e pipeline_name="$pipeline_name" \
              -e input_pr_num="$input_pr_num" -e PR_TITLE="$PR_TITLE" -e JENKINS_URL="$JENKINS_URL" -e GITHUB_PAT="$svc_acc_org_secret" "${{env.CONTAINER_NAME}}" \
              /bin/bash -c 'echo \"PR NUM: "$input_pr_num"\" && PYTHONHTTPSVERIFY=0 python3 jenkins_api.py -s \"${JENKINS_URL}\" -jn "$pipeline_name" -ghr "$GITHUB_REPOSITORY" -ghsha "$input_sha" -ghprn "$input_pr_num" -ghpru "$input_pr_url" -ghprt "$PR_TITLE" -ghpat="$svc_acc_org_secret"'

      - name: Stop and remove container
        if: always()
        run: |
          docker stop "${{env.CONTAINER_NAME}}"
          docker rm "${{env.CONTAINER_NAME}}"
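Review note on the `Escape pull request title` step in this workflow: it imports `shlex` but writes the title to `GITHUB_ENV` unmodified, so the variable named `escaped` is not actually escaped. The Python sketch below contrasts the two behaviours, assuming the original intent was shell-safe quoting. `pr_title_env_line` and its `quote` parameter are hypothetical names used only for illustration; they are not part of the workflow.

```python
import shlex


def pr_title_env_line(event: dict, quote: bool = False) -> str:
    """Build the PR_TITLE line appended to GITHUB_ENV (hypothetical helper)."""
    title = event["pull_request"]["title"]
    # quote=False mirrors the step as written in the workflow;
    # quote=True shows what shlex.quote would produce instead.
    value = shlex.quote(title) if quote else title
    return f"PR_TITLE={value}"


event = {"pull_request": {"title": "Fix $(whoami) in title"}}
print(pr_title_env_line(event))
print(pr_title_env_line(event, quote=True))
```

Values in `GITHUB_ENV` are stored literally, so quoting at this point would likely put the quote characters into `$PR_TITLE` itself. The later `docker exec -e PR_TITLE="$PR_TITLE"` already passes the value as a single argument, which may be why the step leaves the title unquoted.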
25 changes: 15 additions & 10 deletions amd/comgr/src/comgr-compiler.cpp
@@ -810,16 +810,11 @@ amd_comgr_status_t AMDGPUCompiler::createTmpDirs() {
return AMD_COMGR_STATUS_SUCCESS;
}

amd_comgr_status_t AMDGPUCompiler::removeTmpDirs() {
if (TmpDir.empty()) {
return AMD_COMGR_STATUS_SUCCESS;
}
ProfilePoint Point("RemoveDir");

#ifdef _WIN32
// On windows fs::remove_directories takes huge time so use fs::remove.
#ifdef _WIN32
amd_comgr_status_t removeDirectory(const StringRef DirName) {
std::error_code EC;
for (fs::directory_iterator Dir(TmpDir, EC), DirEnd; Dir != DirEnd && !EC;
for (fs::directory_iterator Dir(DirName, EC), DirEnd; Dir != DirEnd && !EC;
Dir.increment(EC)) {
const StringRef Path = Dir->path();

@@ -849,16 +844,26 @@ amd_comgr_status_t AMDGPUCompiler::removeTmpDirs() {
}
}

if (fs::remove(TmpDir)) {
if (fs::remove(DirName)) {
return AMD_COMGR_STATUS_ERROR;
}

return AMD_COMGR_STATUS_SUCCESS;
#else
}
#endif

amd_comgr_status_t AMDGPUCompiler::removeTmpDirs() {
if (TmpDir.empty()) {
return AMD_COMGR_STATUS_SUCCESS;
}
ProfilePoint Point("RemoveDir");
#ifndef _WIN32
if (fs::remove_directories(TmpDir)) {
return AMD_COMGR_STATUS_ERROR;
}
return AMD_COMGR_STATUS_SUCCESS;
#else
return removeDirectory(TmpDir);
#endif
}

9 changes: 8 additions & 1 deletion amd/comgr/src/comgr.cpp
@@ -310,7 +310,8 @@ amd_comgr_status_t COMGR::parseTargetIdentifier(StringRef IdentStr,


// TODO: Add a LIT test for this
if (IdentStr == "amdgcn-amd-amdhsa--amdgcnspirv") {
if (IdentStr == "amdgcn-amd-amdhsa--amdgcnspirv" ||
IdentStr == "amdgcn-amd-amdhsa-unknown-amdgcnspirv") {
// Features not supported for SPIR-V
if (!Ident.Features.empty())
return AMD_COMGR_STATUS_ERROR_INVALID_ARGUMENT;
@@ -2110,6 +2111,8 @@ amd_comgr_populate_name_expression_map(amd_comgr_data_t Data, size_t *Count) {
if (!RelaRangeOrError) {
llvm::logAllUnhandledErrors(RelaRangeOrError.takeError(), llvm::errs(),
"RelaRange creation error: ");
for (auto *Ptr : NameExpDataVec)
delete Ptr;
return AMD_COMGR_STATUS_ERROR;
}
auto RelaRange = std::move(RelaRangeOrError.get());
@@ -2130,6 +2133,8 @@ amd_comgr_populate_name_expression_map(amd_comgr_data_t Data, size_t *Count) {
if (!RodataOrError) {
llvm::logAllUnhandledErrors(RodataOrError.takeError(), llvm::errs(),
"Rodata creation error: ");
for (auto *Ptr : NameExpDataVec)
delete Ptr;
return AMD_COMGR_STATUS_ERROR;
}
auto Rodata = std::move(RodataOrError.get());
@@ -2160,6 +2165,8 @@ amd_comgr_populate_name_expression_map(amd_comgr_data_t Data, size_t *Count) {
}
}

for (auto *Ptr : NameExpDataVec)
delete Ptr;
} // end AMD_COMGR_DATA_KIND_EXECUTABLE conditional

*Count = DataP->NameExpressionMap.size();
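Review note on the `parseTargetIdentifier` hunk in this file: the change accepts two spellings of the SPIR-V target, one with an empty environment field and one with `unknown` in that position. The Python sketch below treats the two as equivalent by normalizing the fourth field. `normalize_target_id` and `is_amdgcn_spirv` are hypothetical helpers, and the normalization is an assumption made for illustration; the C++ change itself simply compares against both literal strings.

```python
SPIRV_TARGET = "amdgcn-amd-amdhsa--amdgcnspirv"


def normalize_target_id(ident: str) -> str:
    """Map an 'unknown' environment field to the empty form (hypothetical)."""
    # Fields: arch-vendor-os-environment-processor[:features].
    # maxsplit=4 keeps '-' characters inside feature strings intact.
    fields = ident.split("-", 4)
    if len(fields) == 5 and fields[3] == "unknown":
        fields[3] = ""
    return "-".join(fields)


def is_amdgcn_spirv(ident: str) -> bool:
    """True for either accepted spelling of the SPIR-V target identifier."""
    return normalize_target_id(ident) == SPIRV_TARGET


print(is_amdgcn_spirv("amdgcn-amd-amdhsa--amdgcnspirv"))
print(is_amdgcn_spirv("amdgcn-amd-amdhsa-unknown-amdgcnspirv"))
```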
5 changes: 1 addition & 4 deletions amd/comgr/test-lit/CMakeLists.txt
@@ -21,10 +21,7 @@ if (NOT DEFINED LLVM_LIT_PATH)
endif()
message("-- LLVM_LIT_PATH: ${LLVM_LIT_PATH}")

# TODO: Re-enable target once nPSDB issue with llvm-lit is fixed
#add_custom_target(test-lit COMMAND "${LLVM_LIT_PATH}"
# "${CMAKE_CURRENT_BINARY_DIR}" -v)
add_custom_target(test-lit COMMAND echo "${LLVM_LIT_PATH}"
add_custom_target(test-lit COMMAND "${LLVM_LIT_PATH}"
"${CMAKE_CURRENT_BINARY_DIR}" -v)

macro(add_comgr_lit_binary name lang)
8 changes: 4 additions & 4 deletions clang/docs/ClangOffloadBundler.rst
@@ -525,15 +525,15 @@ The compressed offload bundle begins with a header followed by the compressed bi
This is a unique identifier to distinguish compressed offload bundles. The value is the string 'CCOB' (Compressed Clang Offload Bundle).

- **Version Number (16-bit unsigned int)**:
This denotes the version of the compressed offload bundle format. The current version is `2`.
This denotes the version of the compressed offload bundle format. The current version is `3`.

- **Compression Method (16-bit unsigned int)**:
This field indicates the compression method used. The value corresponds to either `zlib` or `zstd`, represented as a 16-bit unsigned integer cast from the LLVM compression enumeration.

- **Total File Size (32-bit unsigned int)**:
- **Total File Size (unsigned int, 32-bit in v2, 64-bit in v3)**:
This is the total size (in bytes) of the file, including the header. Available in version 2 and above.

- **Uncompressed Binary Size (32-bit unsigned int)**:
- **Uncompressed Binary Size (unsigned int, 32-bit in v2, 64-bit in v3)**:
This is the size (in bytes) of the binary data before it was compressed.

- **Hash (64-bit unsigned int)**:
@@ -542,4 +542,4 @@ The compressed offload bundle begins with a header followed by the compressed bi
- **Compressed Data**:
The actual compressed binary data follows the header. Its size can be inferred from the total size of the file minus the header size.

> **Note**: Version 3 of the format is under development. It uses 64-bit fields for Total File Size and Uncompressed Binary Size to support files larger than 4GB. To experiment with version 3, set the environment variable `COMPRESSED_BUNDLE_FORMAT_VERSION=3`. This support is experimental and not recommended for production use.
> **Note**: Version 3 is now the default format. For backward compatibility with older HIP runtimes that support version 2 only, set the environment variable `COMPRESSED_BUNDLE_FORMAT_VERSION=2`.
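Review note on the header layout documented in this file: the field list can be read as a fixed-size binary header whose two size fields widen from 32 to 64 bits in version 3. The Python sketch below parses such a header under two assumptions that the documentation text does not state: the fields are little-endian, and they appear in the order listed. `parse_ccob_header` is a hypothetical name, and only versions 2 and 3 are handled because the Total File Size field exists only from version 2 onward.

```python
import struct

MAGIC = b"CCOB"

# Layout of (total size, uncompressed size, hash) per format version:
# 32-bit sizes in v2, 64-bit sizes in v3; the hash is 64-bit in both.
# Little-endian byte order is an assumption of this sketch.
_SIZE_FIELDS = {2: "<IIQ", 3: "<QQQ"}


def parse_ccob_header(buf: bytes) -> dict:
    """Parse a version 2 or 3 compressed offload bundle header (sketch)."""
    if buf[:4] != MAGIC:
        raise ValueError("missing 'CCOB' magic")
    # Version and compression method are both 16-bit unsigned integers.
    version, method = struct.unpack_from("<HH", buf, 4)
    if version not in _SIZE_FIELDS:
        raise ValueError(f"unsupported version {version}")
    fmt = _SIZE_FIELDS[version]
    total_size, uncompressed_size, hash_value = struct.unpack_from(fmt, buf, 8)
    header_size = 8 + struct.calcsize(fmt)
    return {
        "version": version,
        "method": method,
        "total_size": total_size,
        "uncompressed_size": uncompressed_size,
        "hash": hash_value,
        "header_size": header_size,
        # Compressed payload size: total file size minus the header size.
        "compressed_size": total_size - header_size,
    }
```

Under these assumptions the header occupies 24 bytes in version 2 and 32 bytes in version 3, so a reader has to check the version field before it can locate the compressed data.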
2 changes: 1 addition & 1 deletion clang/include/clang/Driver/OffloadBundler.h
@@ -120,7 +120,7 @@ class CompressedOffloadBundle {
static llvm::Expected<CompressedBundleHeader> tryParse(llvm::StringRef);
};

static inline const uint16_t DefaultVersion = 2;
static inline const uint16_t DefaultVersion = 3;

static llvm::Expected<std::unique_ptr<llvm::MemoryBuffer>>
compress(llvm::compression::Params P, const llvm::MemoryBuffer &Input,
2 changes: 1 addition & 1 deletion clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
@@ -304,7 +304,7 @@ static Value *GetOrInsertAMDGPUPredicate(CodeGenFunction &CGF, Twine Name) {
P->setExternallyInitialized(true);

return CGF.Builder.CreateLoad(
RawAddress(P, PTy, CharUnits::One(), KnownNonNull), true);
RawAddress(P, PTy, CharUnits::One(), KnownNonNull));
}

Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID,
16 changes: 16 additions & 0 deletions clang/lib/Driver/ToolChain.cpp
@@ -483,7 +483,23 @@ ParsedClangName
ToolChain::getTargetAndModeFromProgramName(StringRef PN) {
std::string ProgName = normalizeProgramName(PN);
size_t SuffixPos;
bool FlangNew = false;
const DriverSuffix *DS = parseDriverSuffix(ProgName, SuffixPos);

// Part II: Warn if invocation happens with flang-new (for Flang); this is for
// the time being and should be removed once AMD Classic Flang has been
// removed from ROCm.
if (FlangNew) {
// flang-new warning is overwarning, disabling until fixed.
if (false && !::getenv("AMD_NOWARN_FLANG_NEW")) {
// The solution with "llvm::errs()" is not ideal, but the driver object
// has not been constructed yet, so we cannot use the Diag() infrastructure
// for this.
llvm::errs() << "warning: the 'amdflang-new' and 'flang-new' commands "
"have been deprecated; please use 'amdflang' instead\n";
}
}

if (!DS)
return {};
size_t SuffixEnd = SuffixPos + strlen(DS->Suffix);
66 changes: 32 additions & 34 deletions clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -68,6 +68,23 @@ using namespace clang::driver::tools;
using namespace clang;
using namespace llvm::opt;

static bool addRPathCmdArg(const llvm::opt::ArgList &Args,
ArgStringList &CmdArgs,
const std::string pathCandidate,
bool onlyIfPathExists = true) {
SmallString<0> simplifiedPathCandidate(pathCandidate);
llvm::sys::path::remove_dots(simplifiedPathCandidate, true);

bool pathExists = llvm::sys::fs::exists(simplifiedPathCandidate);

if (onlyIfPathExists && !pathExists)
return false;

CmdArgs.push_back("-rpath");
CmdArgs.push_back(Args.MakeArgString(simplifiedPathCandidate));
return pathExists;
}

static bool useFramePointerForTargetByDefault(const llvm::opt::ArgList &Args,
const llvm::Triple &Triple) {
if (Args.hasArg(clang::driver::options::OPT_pg) &&
@@ -1351,12 +1368,8 @@ void tools::addOpenMPRuntimeSpecificRPath(const ToolChain &TC,
// one of the LIBRARY_PATH directories.
ArgStringList EnvLibraryPaths;
addDirectoryList(Args, EnvLibraryPaths, "", "LIBRARY_PATH");
for (auto &EnvLibraryPath : EnvLibraryPaths) {
if (llvm::sys::fs::exists(EnvLibraryPath)) {
CmdArgs.push_back("-rpath");
CmdArgs.push_back(Args.MakeArgString(EnvLibraryPath));
}
}
for (auto &EnvLibraryPath : EnvLibraryPaths)
addRPathCmdArg(Args, CmdArgs, EnvLibraryPath);

if (Args.hasFlag(options::OPT_fopenmp_implicit_rpath,
options::OPT_fno_openmp_implicit_rpath, true)) {
@@ -1365,46 +1378,33 @@ void tools::addOpenMPRuntimeSpecificRPath(const ToolChain &TC,
SmallString<256> DefaultLibPath =
llvm::sys::path::parent_path(TC.getDriver().Dir);
llvm::sys::path::append(DefaultLibPath, CLANG_INSTALL_LIBDIR_BASENAME);
if (TC.getSanitizerArgs(Args).needsAsanRt()) {
CmdArgs.push_back("-rpath");
CmdArgs.push_back(Args.MakeArgString(TC.getCompilerRTPath()));
}
if (TC.getSanitizerArgs(Args).needsAsanRt())
addRPathCmdArg(Args, CmdArgs, TC.getCompilerRTPath(),
/*onlyIfPathExists=*/false);

// In case LibSuffix was not built, try lib
std::string CandidateRPath_suf = D.Dir + "/../" + LibSuffix;
CmdArgs.push_back("-rpath");
CmdArgs.push_back(Args.MakeArgString(CandidateRPath_suf.c_str()));

// Add lib directory in case LibSuffix does not exist
std::string CandidateRPath_lib = D.Dir + "/../lib";
if ((!llvm::sys::fs::exists(CandidateRPath_suf)) &&
(llvm::sys::fs::exists(CandidateRPath_lib))) {
CmdArgs.push_back("-rpath");
CmdArgs.push_back(Args.MakeArgString(CandidateRPath_lib.c_str()));
}
if (!addRPathCmdArg(Args, CmdArgs, CandidateRPath_suf,
/*onlyIfPathExists=*/false))
addRPathCmdArg(Args, CmdArgs, CandidateRPath_lib);

std::string rocmPath =
Args.getLastArgValue(clang::driver::options::OPT_rocm_path_EQ).str();
if (rocmPath.size() != 0) {
std::string rocmPath_lib = rocmPath + "/lib";
std::string rocmPath_suf = rocmPath + "/" + LibSuffix;
if (llvm::sys::fs::exists(rocmPath_suf)) {
CmdArgs.push_back("-rpath");
CmdArgs.push_back(Args.MakeArgString(rocmPath_suf.c_str()));
} else if (llvm::sys::fs::exists(rocmPath_lib)) {
CmdArgs.push_back("-rpath");
CmdArgs.push_back(Args.MakeArgString(rocmPath_lib.c_str()));
}
if (!addRPathCmdArg(Args, CmdArgs, rocmPath_suf))
addRPathCmdArg(Args, CmdArgs, rocmPath_lib);
}

// Add Default lib path to ensure llvm dynamic library is picked up for
// lib-debug/lib-perf
if (LibSuffix != "lib" && llvm::sys::fs::exists(DefaultLibPath)){
CmdArgs.push_back("-rpath");
CmdArgs.push_back(Args.MakeArgString(DefaultLibPath.c_str()));
}
if (LibSuffix != "lib")
addRPathCmdArg(Args, CmdArgs, DefaultLibPath.c_str());

if (llvm::find_if(CmdArgs, [](StringRef str) {
if (llvm::find_if(CmdArgs, [](StringRef str) {
return !str.compare("--enable-new-dtags");
}) == CmdArgs.end())
CmdArgs.push_back("--disable-new-dtags");
@@ -1444,10 +1444,8 @@ void tools::addArchSpecificRPath(const ToolChain &TC, const ArgList &Args,
CandidateRPaths.emplace_back(*CandidateRPath);

for (const auto &CandidateRPath : CandidateRPaths) {
if (TC.getVFS().exists(CandidateRPath)) {
CmdArgs.push_back("-rpath");
CmdArgs.push_back(Args.MakeArgString(CandidateRPath));
}
if (TC.getVFS().exists(CandidateRPath))
addRPathCmdArg(Args, CmdArgs, CandidateRPath, /*onlyIfPathExists=*/false);
}
}
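Review note on the `addRPathCmdArg` helper introduced in this file: it simplifies the candidate path, optionally skips paths that do not exist, and returns whether the path exists so that callers can chain a fallback. The Python sketch below mirrors that control flow. `posixpath.normpath` stands in for `llvm::sys::path::remove_dots(..., true)` and is only an approximation of it, and `add_lib_rpaths` is a hypothetical wrapper that reproduces the `LibSuffix`-then-`lib` fallback from `addOpenMPRuntimeSpecificRPath`.

```python
import os
import posixpath
import tempfile


def add_rpath_cmd_arg(cmd_args: list, path_candidate: str,
                      only_if_path_exists: bool = True) -> bool:
    """Append '-rpath <simplified path>' and report whether the path exists."""
    # Approximation of remove_dots(path, /*remove_dot_dot=*/true).
    simplified = posixpath.normpath(path_candidate)
    path_exists = os.path.exists(simplified)
    if only_if_path_exists and not path_exists:
        return False
    cmd_args.extend(["-rpath", simplified])
    return path_exists


def add_lib_rpaths(cmd_args: list, driver_dir: str, lib_suffix: str) -> None:
    """Always add the suffixed dir; add 'lib' only if the suffixed dir is missing."""
    suffixed = f"{driver_dir}/../{lib_suffix}"
    if not add_rpath_cmd_arg(cmd_args, suffixed, only_if_path_exists=False):
        add_rpath_cmd_arg(cmd_args, f"{driver_dir}/../lib")


# Demo in a throwaway install tree that has bin/ and lib/ but no lib-debug/.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "bin"))
    os.makedirs(os.path.join(root, "lib"))
    args = []
    add_lib_rpaths(args, os.path.join(root, "bin"), "lib-debug")
    demo_args = [a.replace(root, "<root>") for a in args]
print(demo_args)
```

Because the helper returns `False` both when it skips a missing path and when it adds a missing path unconditionally, the caller's `if (!addRPathCmdArg(...))` pattern expresses "fall back when the preferred directory is absent" in either mode.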

2 changes: 1 addition & 1 deletion clang/test/CodeGen/amdgpu-builtin-is-invocable.c
@@ -28,7 +28,7 @@
// AMDGCNSPIRV-LABEL: define spir_func void @foo(
// AMDGCNSPIRV-SAME: ) addrspace(4) #[[ATTR0:[0-9]+]] {
// AMDGCNSPIRV-NEXT: [[ENTRY:.*:]]
// AMDGCNSPIRV-NEXT: [[TMP0:%.*]] = load volatile i1, ptr addrspace(1) @llvm.amdgcn.has.gfx10-insts, align 1
// AMDGCNSPIRV-NEXT: [[TMP0:%.*]] = load i1, ptr addrspace(1) @llvm.amdgcn.has.gfx10-insts, align 1
// AMDGCNSPIRV-NEXT: [[TOBOOL:%.*]] = icmp ne i1 [[TMP0]], false
// AMDGCNSPIRV-NEXT: br i1 [[TOBOOL]], label %[[IF_THEN:.*]], label %[[IF_END:.*]]
// AMDGCNSPIRV: [[IF_THEN]]: