Skip to content

Code Review #33

@4rthurCai

Description

@4rthurCai

Code Review Report

Date: 2025-12-01
Generated by Claude Code.


High Severity Issues

1. CSRF Protection Disabled for Authentication Endpoints

File: apps/auth/views.py:27-29, 41, 147, 472, 512
Severity: High
Type: Authentication Bypass / CSRF

Issue:

class CsrfExemptSessionAuthentication(SessionAuthentication):
    def enforce_csrf(self, request):
        return  # CSRF protection completely disabled

@api_view(["POST"])
@authentication_classes([CsrfExemptSessionAuthentication])  # Used on auth endpoints
@permission_classes([AllowAny])
def auth_initiate_api(request):

Impact:

  • Attackers can perform CSRF attacks against authentication endpoints
  • Users can be tricked into initiating auth flows, logging in, or logging out
  • While Turnstile provides some protection, CSRF tokens are the standard defense

Recommendation:
Use Django's standard CSRF protection with proper CSRF token handling in the frontend. If you must exempt CSRF for specific technical reasons, document why and add compensating controls:

# Only exempt CSRF if absolutely necessary and add alternative protections
# Consider using custom headers or origin validation instead

2. Missing Authorization Check on User Review Endpoints

File: apps/web/views.py:346-359
Severity: High
Type: Broken Authorization

Issue:

def get_queryset(self):
    """Only reviews belonging to the authenticated user with vote annotations."""
    return Review.objects.with_votes(
        request_user=self.request.user, user=self.request.user
    )

The queryset filtering is correct, but Django REST Framework's RetrieveModelMixin could potentially be vulnerable if the lookup is manipulated.

Impact:

  • While current implementation looks safe due to queryset filtering, it relies entirely on the ORM
  • No explicit authorization check in retrieve/update/delete operations
  • Potential for IDOR (Insecure Direct Object Reference) vulnerabilities

Recommendation:
Add explicit permission checks:

def get_object(self):
    obj = super().get_object()
    # Explicit authorization check
    if obj.user != self.request.user:
        raise PermissionDenied("You don't have permission to access this review")
    return obj

Medium Severity Issues

3. Race Condition in Vote System

File: apps/web/models/vote.py:10-41
Severity: Medium
Type: Race Condition / Data Integrity

Issue:

@transaction.atomic
def vote(self, value, course_id, category, user):
    # No select_for_update lock
    course = Course.objects.get(id=course_id)
    vote, created = self.get_or_create(course=course, category=category, user=user)

    # Race condition: concurrent votes can corrupt scores
    if not created:
        if category == Vote.CATEGORIES.QUALITY:
            course.quality_score -= vote.value  # Not atomic with read

Impact:

  • Under concurrent voting, scores can become incorrect
  • Two simultaneous votes could:
    1. Both read the same initial score
    2. Both modify it independently
    3. Last write wins, losing one vote's contribution
  • Data integrity violation

Recommendation:

@transaction.atomic
def vote(self, value, course_id, category, user):
    # Lock the course row to prevent concurrent modifications
    course = Course.objects.select_for_update().get(id=course_id)
    vote, created = self.get_or_create(course=course, category=category, user=user)
    # ... rest of logic

4. N+1 Query Problem in Course Serializer

File: apps/web/serializers.py:237-243, 138, 246, 256-274
Severity: Medium
Type: Performance / N+1 Queries

Issue:

def get_review_set(self, obj):
    request = self.context.get("request")
    if request and request.user.is_authenticated:
        return ReviewSerializer(
            obj.review_set.all(), many=True, context=self.context
        ).data  # Triggers N+1 queries for each review's votes
    return []

def get_review_count(self, obj):
    return obj.review_set.count()  # Separate query instead of annotation

Impact:

  • For a course with 50 reviews, this generates 50+ additional queries
  • Severe performance degradation as review counts grow
  • API response times increase linearly with review count

Recommendation:

# In the view's get_queryset:
queryset = Course.objects.annotate(
    review_count=Count('review')
).prefetch_related(
    Prefetch('review_set', queryset=Review.objects.with_votes(request.user))
)

# In serializer:
def get_review_count(self, obj):
    return obj.review_count  # Use annotation

5. Inefficient Average Calculation

File: apps/web/models/vote.py:43-52
Severity: Medium
Type: Performance

Issue:

def _calculate_average_score(self, course, category):
    votes = self.filter(course=course, category=category)
    if not votes.exists():
        return 0

    total_score = sum(vote.value for vote in votes)  # Loads all votes into memory
    vote_count = votes.count()  # Separate query
    return round(total_score / vote_count, 1)

Impact:

  • Loads all vote objects into memory
  • For popular courses with 1000+ votes, this is wasteful
  • Makes two database queries instead of one

Recommendation:

from django.db.models import Avg

def _calculate_average_score(self, course, category):
    result = self.filter(course=course, category=category).aggregate(
        avg_score=Avg('value')
    )
    avg = result['avg_score']
    return round(avg, 1) if avg is not None else 0

6. Missing Input Validation on Vote Value

File: apps/web/views.py:499-514
Severity: Medium
Type: Input Validation

Issue:

@api_view(["POST"])
@permission_classes([IsAuthenticated])
def course_vote_api(request, course_id):
    try:
        value = request.data["value"]  # No validation here
        forLayup = request.data["forLayup"]
    except KeyError:
        # ...

    # int() can raise ValueError
    # Value validation happens later in vote() which silently returns None
    new_score, is_unvote, new_vote_count = Vote.objects.vote(
        int(value), course_id, category, request.user
    )

Impact:

  • Non-integer values cause ValueError
  • Out-of-range values (e.g., 1000) fail silently
  • Poor error messages for API consumers

Recommendation:

def course_vote_api(request, course_id):
    try:
        value = int(request.data["value"])
        forLayup = request.data["forLayup"]
    except (KeyError, ValueError, TypeError):
        return Response(
            {"detail": "Invalid request. 'value' must be an integer between 1 and 5, 'forLayup' must be boolean"},
            status=400
        )

    if not 1 <= value <= 5:
        return Response({"detail": "Vote value must be between 1 and 5"}, status=400)

    # ... rest of code

7. Denial of Service - No Timeout on HTTP Requests

File: apps/spider/utils.py:36
Severity: Medium
Type: Denial of Service

Issue:

def retrieve_soup(url, data=None, preprocess=lambda x: x):
    print(url)
    if data is not None:
        data = data.encode("utf-8")
    with urllib_request.urlopen(url, data=data) as response:  # No timeout!
        return BeautifulSoup(preprocess(response.read().decode("utf-8")), "html.parser")

Impact:

  • If the remote server doesn't respond, this will hang indefinitely
  • Can block Celery workers permanently
  • Potential for resource exhaustion

Recommendation:

with urllib_request.urlopen(url, data=data, timeout=30) as response:
    return BeautifulSoup(preprocess(response.read().decode("utf-8")), "html.parser")

8. Crash on Missing Hardcoded User

File: apps/analytics/views.py:24
Severity: Medium
Type: Error Handling

Issue:

@staff_member_required
def home(request):
    course_picker = User.objects.get(username="CoursePicker")  # Crashes if not found!

Impact:

  • If "CoursePicker" user doesn't exist, the entire analytics dashboard crashes
  • User.DoesNotExist exception will be raised
  • Hardcoded dependency on specific database state

Recommendation:

course_picker = User.objects.filter(username="CoursePicker").first()
if course_picker:
    # filter out course_picker from queries
else:
    # handle case where user doesn't exist

9. Division by Zero / ValueError in Random Selection

File: apps/analytics/views.py:201-203
Severity: Medium
Type: Error Handling

Issue:

unlabeled_reviews = models.Review.objects.filter(...).exclude(...)
count = unlabeled_reviews.count()
random_index = randint(0, count - 1)  # Crashes if count == 0!
review = unlabeled_reviews[random_index]

Impact:

  • If no unlabeled reviews exist, randint(0, -1) raises ValueError
  • Crashes the sentiment labeler page
  • No graceful handling of empty state

Recommendation:

count = unlabeled_reviews.count()
if count == 0:
    return render(request, "sentiment_labeler.html", {"no_reviews": True})
random_index = randint(0, count - 1)

10. N+1 Query in Recommendations Task

File: apps/recommendations/tasks.py:95, 113
Severity: Medium
Type: Performance

Issue:

for i in range(psarray.shape[0]):
    current_class = Course.objects.get(id=course_ids[i])  # Query inside loop!
    # ...
    for other_i in np.argpartition(...):
        course_id = course_ids[other_i]
        # Another potential query

Impact:

  • For 1000 courses, this runs 1000+ individual queries
  • Significantly slows down recommendation generation
  • Inefficient database usage

Recommendation:

# Pre-fetch all courses into a dictionary
courses_dict = {c.id: c for c in Course.objects.filter(id__in=course_ids)}

for i in range(psarray.shape[0]):
    current_class = courses_dict[course_ids[i]]  # O(1) lookup

Low Severity Issues

11. Missing Permission Checks on Read-Only Endpoints

File: apps/web/views.py:396, 436, 459
Severity: Low
Type: Missing Access Control

Issue:

@api_view(["GET"])
def medians(request, course_id):  # No @permission_classes decorator
    # ...

@api_view(["GET"])
def course_professors(request, course_id):  # No permission check
    # ...

@api_view(["GET"])
def course_instructors(request, course_id):  # No permission check
    # ...

Impact:

  • These endpoints are accessible to unauthenticated users
  • May expose information you want to restrict
  • Inconsistent with other endpoints that require authentication

Recommendation:
Decide on intended access level and add decorators:

@api_view(["GET"])
@permission_classes([AllowAny])  # Explicit is better than implicit
def medians(request, course_id):
    # ...

Or restrict to authenticated users if appropriate:

@api_view(["GET"])
@permission_classes([IsAuthenticated])
def course_professors(request, course_id):
    # ...

12. Broad Exception Handling

File: apps/web/views.py:575-579
Severity: Low
Type: Error Handling

Issue:

except Exception:
    return Response(
        {"detail": "An error occurred processing your request"},
        status=500,
    )

Impact:

  • Catches all exceptions, including KeyboardInterrupt, SystemExit
  • Masks programming errors
  • Makes debugging difficult
  • Generic error message provides no useful information

Recommendation:

except (DatabaseError, ValueError) as e:
    logger.exception("Error processing review vote")
    return Response(
        {"detail": "An error occurred processing your request"},
        status=500,
    )

13. Potential N+1 in Course.get_instructors()

File: apps/web/models/course.py:185-208
Severity: Low
Type: Performance

Issue:

def get_instructors(self, term=CURRENT_TERM):
    instructors = []
    offerings = self.courseoffering_set.all()  # Prefetch needed

    if term:
        offerings = offerings.filter(term=term)

    for offering in offerings:
        for instructor in offering.instructors.all():  # N+1 query
            instructors.append(instructor)

Impact:

  • If not prefetched, generates one query per offering
  • Called in serializers which could amplify the problem

Recommendation:

def get_instructors(self, term=CURRENT_TERM):
    instructors = []
    offerings = self.courseoffering_set.prefetch_related('instructors').all()

    if term:
        offerings = offerings.filter(term=term)

    # Use set for O(1) lookups instead of list
    seen_ids = set()
    unique_instructors = []
    for offering in offerings:
        for instructor in offering.instructors.all():
            if instructor.id not in seen_ids:
                seen_ids.add(instructor.id)
                unique_instructors.append(instructor)

    return unique_instructors

14. Log Injection Vulnerability

File: apps/web/views.py:564
Severity: Low (Now Fixed)
Type: Log Injection

Recommendation:

logger.warning("Review %s not found for voting", str(review_id).replace('\n', '').replace('\r', ''))

Already posed by copilot check.


Additional Recommendations

Code Quality Improvements

  1. Use Database Indexes: Ensure frequently queried fields have indexes

    • Review.professor (already indexed ✓)
    • Review.term (already indexed ✓)
    • Consider composite indexes for common query patterns
  2. Add Rate Limiting: Implement rate limiting on voting endpoints

    # Add throttling classes to views
    from rest_framework.throttling import UserRateThrottle
    
    class VoteThrottle(UserRateThrottle):
        rate = '100/hour'
  3. Add Monitoring: Implement logging and monitoring for:

    • Failed authentication attempts
    • Suspicious voting patterns
    • Slow queries
    • Exception rates
  4. Security Headers: Add security headers in settings

    SECURE_BROWSER_XSS_FILTER = True
    SECURE_CONTENT_TYPE_NOSNIFF = True
    X_FRAME_OPTIONS = 'DENY'
  5. Input Sanitization: While Django ORM prevents SQL injection, ensure:

    • User-generated content is properly escaped in templates
    • External API responses are validated
    • File uploads (if any) are properly validated

Summary Statistics

  • High Severity Issues: 2
  • Medium Severity Issues: 8
  • Low Severity Issues: 4
  • Total Issues: 14

Metadata

Metadata

Labels

bugSomething isn't workinghelp wantedExtra attention is needed

Type

Projects

Status

Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions