Skip to content

Feature/sample mul matrix#1

Open
kamil-goras-mobica wants to merge 15 commits intomainfrom
feature/sample_mul_matrix
Open

Feature/sample mul matrix#1
kamil-goras-mobica wants to merge 15 commits intomainfrom
feature/sample_mul_matrix

Conversation

@kamil-goras-mobica
Copy link
Collaborator

No description provided.

Comment on lines 24 to 28
struct Matrix
{
unsigned int mRows;
unsigned int mCols;
std::vector<int> mData;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preference to use data independent structure which will accept basic OpenCL types as follows: cl_[u]short, cl_[u]int, cl_[u]long, cl_half, cl_float, cl_double

Comment on lines 41 to 47
std::vector<int> data;
data.reserve(size);
for (unsigned int i = 0; i < size; ++i)
{
data.push_back(distr(generator));
}
mData = std::move(data);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the point to introduce second vector and perform move operation ? I see this shouldn't be a performance hit but still additional code may confuse WG reviewers.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shajder This is corrected by introducing another constructor

Comment on lines 79 to 80
Matrix& A;
Matrix& B;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preference to use smart pointers

Comment on lines 119 to 123
std::string name;
platform.getInfo(CL_PLATFORM_NAME, &name);
std::cout << "Platform: " << name << std::endl;

std::vector<cl::Device> platformDevices;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we move reusable variables outside the loop content ?

{
for (auto& context : mContexts)
{
mPrograms.emplace_back(cl::Program(context, mKernelSource, true));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preference to call cl::Program::build explicitly to log possible errors with cl::BuildError

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shajder This should be corrected

Comment on lines 201 to 202
cl::Buffer resultMatBuffer(mContexts[gpu], CL_MEM_WRITE_ONLY,
mMatDims.fstMtxRows * mMatDims.sndMtxCols * sizeof(int));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug spotted around here, what is missing is initialization of result buffer with enqueueFillBuffer. This will eventually accumulate garbage from uninitialized buffer during operation of matrix merging.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shajder I have added enqueueFillBuffer

}


std::lock_guard<std::mutex> lk(m);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't fully investigate that but I have an observation which makes me think this makes all the threads running in sync.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shajder I have tested it with sleeping one one thread and in console it is visible that one thread finishes and it it waiting for another.

Comment on lines 286 to 287
std::transform(result.mData.begin(), result.mData.end(), helper.mData.begin(),
result.mData.begin(), std::plus<int>());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

accumulating garbabe data from unitialized result buffer above

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shajder enqueueFillBuffer was added as you told in previous comment.

bool MatrixMultiplication::LoadKernel()
{
//std::string fileName("C:\\RND200\\OpenCL-SDK\\samples\\core\\mulmatrix\\mulMatrix.cl");
std::string fileName("mulMatrix.cl");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is minor but I think it is slightly more convenient to have such a small kernel source in the same source file

Comment on lines 396 to 399
unsigned int rows1 = matDims.fstMtxRows;
unsigned int cols1 = matDims.fstMtxCols;
unsigned int rows2 = matDims.sndMtxRows;
unsigned int cols2 = matDims.sndMtxCols;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preference for validation of maximum ranges to avoid vague errors as result

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shajder I introduced validation

@kamil-goras-mobica kamil-goras-mobica force-pushed the feature/sample_mul_matrix branch from 5161df4 to 7067f0c Compare January 8, 2024 13:18
@kamil-goras-mobica kamil-goras-mobica force-pushed the feature/sample_mul_matrix branch from 9685cdd to 16716fd Compare January 22, 2024 12:04
@kamil-goras-mobica kamil-goras-mobica force-pushed the feature/sample_mul_matrix branch from 16716fd to 00c74c9 Compare January 22, 2024 12:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants