Skip to content

Ai interview resume upload #124

Open
Priyamanjare54 wants to merge 3 commits intotarinagarwal:mainfrom
Priyamanjare54:Ai-interview-resume-upload
Open

Ai interview resume upload #124
Priyamanjare54 wants to merge 3 commits intotarinagarwal:mainfrom
Priyamanjare54:Ai-interview-resume-upload

Conversation

@Priyamanjare54
Copy link

📝 PR Description

1. Summary

This PR adds backend support for uploading and parsing candidate resumes (PDF/TXT).
Key features:

  • Extracts name, role, skills, and years of experience from resumes
  • Stores resumes on Cloudflare R2
  • Saves parsed data in the database for generating AI-driven interview questions
  • Includes unit tests for parser functionality

2. Related Issue

Closes #119


3. Type of Change

  • New feature
  • Test update

4. Testing Instructions

  1. Set up .env with R2 credentials and JWT secret
  2. Install dependencies: npm install
  3. Start the server: npm run dev
  4. Test API endpoints with Postman:
    • POST /api/interviews/upload-resume → Upload PDF/TXT resumes
    • GET /api/interviews/candidates → Verify parsed data
  5. Run unit tests: npm test
  6. Verify error handling for invalid file types, empty files, or files >5MB

5. Notes

  • pdf-parse is used for PDF extraction
  • Skills extraction is dynamic, not limited to a static list
  • Ensure R2 environment variables are configured before testing

SWOC 2026 Participant?
YES

@tarinagarwal
Copy link
Owner

@Priyamanjare54 please attach the relevant screenshots

Copy link

@tarin-lgtm tarin-lgtm bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes Requested 🐈

This PR introduces backend support for uploading and parsing candidate resumes, storing them on Cloudflare R2, and saving parsed data for AI-driven interview questions. The implementation includes new API endpoints and unit tests but suffers from critical security issues, missing authorization checks, insufficient input validation, lack of documentation, and performance concerns due to unbounded queries.

There are a few things I'd like to see addressed before we merge this:

Before merging

  1. Add comprehensive API documentation for all new endpoints and exported functions.
  2. Implement proper authorization checks on all endpoints, especially for sensitive data access.
  3. Improve input validation and sanitization to prevent security vulnerabilities and injection attacks.
Findings breakdown (70 total)

2 critical / 14 high / 27 medium / 20 low / 7 info

Confidence: 95%


Reviewed by Looks Good To Meow — AI-powered code review

if (difficulty) where.difficulty = difficulty;
if (creator) where.creatorId = creator;

const labs = await prisma.lab.findMany({
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 Critical — The GET /api/labs endpoint returns all labs without any pagination or limit, which can lead to performance degradation and high memory usage when the dataset grows large.

Implement pagination parameters (e.g., limit and offset or cursor-based pagination) in the query and apply them in the prisma.lab.findMany call to limit the number of labs returned per request.

performance

);

//Get all candidates for the authenticated user
router.get('/candidates', authenticateToken, async (req, res) => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 Critical — The GET /api/interviews/candidates endpoint fetches all candidates for a user without pagination, which can cause performance issues and high memory consumption with many candidates.

Add pagination support (limit and offset or cursor) to this endpoint and apply it in the prisma.interviewCandidate.findMany query to limit the number of candidates returned per request.

performance

/**
* Upload resume file to Cloudflare R2
*/
async function uploadResumeToR2(buffer, filename, mimetype) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ High — AWS S3 credentials (accessKeyId and secretAccessKey) are loaded from environment variables but used directly in the S3 client configuration. If environment variables are leaked or misconfigured, this could expose sensitive credentials.

Ensure environment variables R2_ACCESS_KEY_ID and R2_SECRET_ACCESS_KEY are securely stored and not exposed in logs or error messages. Consider using IAM roles or temporary credentials if possible for better security.

security

/**
* Upload resume file to Cloudflare R2
*/
async function uploadResumeToR2(buffer, filename, mimetype) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ High — Exported function 'uploadResumeToR2' lacks JSDoc documentation including description, parameters, return type, and possible errors.

Add a JSDoc comment for 'uploadResumeToR2' describing its purpose, parameters (buffer, filename, mimetype), return value (public URL string), and errors it may throw.

documentation

* POST /api/interviews/upload-resume
* Upload and parse candidate resume
*/
router.post('/upload-resume', authenticateToken, upload.single('resume'), async (req, res) => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ High — API endpoint POST /api/interviews/upload-resume lacks documentation for HTTP method, path, request body schema, response schema, authentication, and error responses.

Add detailed API documentation for this endpoint including HTTP method, URL path, expected request body (multipart/form-data with 'resume' file), authentication requirement, response structure for success and error cases.

documentation

// Configure S3 client for Cloudflare R2
const s3Client = new S3Client({
region: 'auto',
endpoint: `https://${process.env.R2_ACCOUNT_ID}.r2.cloudflarestorage.com`,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 Medium — Hardcoded S3 bucket name 'alienvault-storage' instead of using environment variable or configuration.

Use an environment variable or configuration file to set the bucket name instead of hardcoding it.

best-practices

limits: {
fileSize: 5 * 1024 * 1024, // 5MB limit
},
fileFilter: (req, file, cb) => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 Medium — File filter callback uses generic Error with message that may leak implementation details to clients.

Use a custom error class or error code to distinguish file type errors and handle them gracefully in the route handler.

best-practices

* POST /api/interviews/upload-resume
* Upload and parse candidate resume
*/
router.post('/upload-resume', authenticateToken, upload.single('resume'), async (req, res) => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 Medium — No validation on req.file.originalname or req.file.mimetype before using them in uploadResumeToR2 and database save.

Add validation and sanitization for file name and mimetype to prevent injection or unexpected values.

best-practices

message: 'Resume uploaded and parsed successfully'
});

} catch (error) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 Medium — Error handling in POST /upload-resume route uses string matching on error.message which is brittle and may miss errors.

Use custom error classes or error codes to handle specific errors more reliably.

best-practices



//Get specific candidate details
router.get('/candidates/:id', authenticateToken, async (req, res) => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 Medium — No validation on req.params.id format in GET /candidates/:id route before querying database.

Validate id format (e.g., UUID or ObjectId) before database query to avoid unnecessary queries and potential errors.

best-practices

Copy link

@tarin-lgtm tarin-lgtm bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes Requested 🐈

This PR adds backend support for uploading and parsing candidate resumes, storing them on Cloudflare R2, and saving parsed data for AI-driven interview questions. The implementation is functional but has critical issues including missing input validation, incomplete error handling for file uploads, and lack of API documentation. Security and maintainability concerns must be addressed before merging.

There are a few things I'd like to see addressed before we merge this:

Before merging

  1. Add comprehensive input validation and sanitization for all user inputs, especially in resume upload route.
  2. Implement explicit error handling for multer middleware errors to prevent unhandled exceptions.
  3. Provide detailed API documentation for all new endpoints, including HTTP methods, request/response schemas, authentication, and error handling.
Findings breakdown (65 total)

2 critical / 14 high / 22 medium / 23 low / 4 info

Confidence: 95%


🔗 View Full Review Report — detailed findings, severity breakdown, and agent analysis

Reviewed by Looks Good To Meow — AI-powered code review


💬 You can interact with me directly in this PR:

  • @tarin-lgtm fix [any constraints]
  • @tarin-lgtm explain [your question]
  • @tarin-lgtm improve [focus area]
  • @tarin-lgtm test [what to focus on]

});

// --- GET ALL Labs ---
router.get("/", async (req, res) => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 Critical — The GET /api/labs endpoint returns all labs matching filters without pagination, which can cause performance degradation and high memory usage with large datasets.

Implement pagination using query parameters (e.g., page, limit) and apply them in the prisma.lab.findMany query to limit the number of records returned per request.

performance

);

//Get all candidates for the authenticated user
router.get('/candidates', authenticateToken, async (req, res) => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 Critical — The GET /api/interviews/candidates endpoint fetches all candidates for a user without pagination, which can lead to performance issues and high memory usage if the user has many candidates.

Add pagination parameters (e.g., page, limit) to the endpoint and apply them in the prisma.interviewCandidate.findMany query to limit the number of candidates returned per request.

performance

* POST /api/interviews/upload-resume
* Upload and parse candidate resume
*/
router.post('/upload-resume', authenticateToken, upload.single('resume'), async (req, res) => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ High — No validation or sanitization of user input fields in the resume upload route beyond file type and size; parsedData fields are used directly in database insert, which may lead to injection or invalid data.

Add validation and sanitization for parsedData fields (name, role, skills, yearsOfExperience) before saving to database to ensure data integrity and security.

best-practices

* POST /api/interviews/upload-resume
* Upload and parse candidate resume
*/
router.post('/upload-resume', authenticateToken, upload.single('resume'), async (req, res) => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ High — The route handler for '/upload-resume' uses multer middleware for file upload but does not handle errors from multer middleware explicitly. If multer encounters an error (e.g., invalid file type), it will pass the error to Express error handler, but the current try-catch block inside the async handler will not catch multer errors thrown before the handler is invoked.

Add an error handling middleware for multer or wrap multer middleware to catch errors and respond with appropriate HTTP status and message. Alternatively, handle multer errors in a centralized error handler to provide consistent error responses.

bugs

/**
* Upload resume file to Cloudflare R2
*/
async function uploadResumeToR2(buffer, filename, mimetype) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ High — The exported function 'uploadResumeToR2' lacks JSDoc documentation. It should include a description, parameter types and descriptions, return type, and possible thrown errors.

Add JSDoc for 'uploadResumeToR2' describing its purpose (uploading resume to Cloudflare R2), parameters (buffer: Buffer, filename: string, mimetype: string), return value (string URL), and errors it may throw.

documentation

* POST /api/interviews/upload-resume
* Upload and parse candidate resume
*/
router.post('/upload-resume', authenticateToken, upload.single('resume'), async (req, res) => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 Medium — Catch block in /upload-resume route uses generic error message for unexpected errors, which may hinder debugging and user feedback.

Include error details in logs and consider returning more informative error messages or error codes to clients while avoiding sensitive information exposure.

best-practices

);

//Get all candidates for the authenticated user
router.get('/candidates', authenticateToken, async (req, res) => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 Medium — No validation on query parameters or user ID in /candidates GET route; potential for invalid or malicious input.

Validate and sanitize query parameters and user ID before using them in database queries.

best-practices



//Get specific candidate details
router.get('/candidates/:id', authenticateToken, async (req, res) => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 Medium — No validation on candidate ID parameter in /candidates/:id GET route; could lead to invalid queries or errors.

Validate and sanitize the candidate ID parameter before querying the database.

best-practices

);

// Delete a candidate
router.delete('/candidates/:id', authenticateToken, async (req, res) => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 Medium — No validation on candidate ID parameter in DELETE /candidates/:id route; could cause unexpected behavior or errors.

Validate and sanitize the candidate ID parameter before performing delete operation.

best-practices

);

//Update candidate information
router.patch('/candidates/:id', authenticateToken, async (req, res) => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 Medium — No validation or sanitization on request body fields (name, role, skills, yearsOfExperience) in PATCH /candidates/:id route before updating database.

Add validation and sanitization for these fields to ensure data integrity and prevent injection attacks.

best-practices

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] AI Interview-Resume Upload & Basic Parsing

2 participants