@ivymxu ivymxu commented Dec 27, 2025

Consolidate old MSCI subjects into MSE so users see a single course with unified reviews, clearer ratings, and cleaner search results.

Proposed Changes:

Importer canonicalization:

  • Add subject rename map in canonical.go and normalize course codes during fetch and convert
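A minimal sketch of how such a rename map could feed code normalization. The names subjectRenames, canonicalSubject, and canonicalCourseCode come from this PR; the body of canonicalSubject, the upper-case map keys, and the course numbers in main are assumptions for illustration, not the actual contents of canonical.go.

```go
package main

import (
	"fmt"
	"strings"
)

// subjectRenames maps a retired subject code to its replacement.
// Only the MSCI -> MSE entry is taken from the PR description;
// storing keys in upper case is an assumption.
var subjectRenames = map[string]string{
	"MSCI": "MSE",
}

// canonicalSubject returns the current subject code for subj,
// or subj itself (upper-cased) when no rename is registered.
// This body is hypothetical.
func canonicalSubject(subj string) string {
	upper := strings.ToUpper(strings.TrimSpace(subj))
	if renamed, ok := subjectRenames[upper]; ok {
		return renamed
	}
	return upper
}

// canonicalCourseCode builds the lower-case course code used as a key.
func canonicalCourseCode(subj, num string) string {
	return strings.ToLower(canonicalSubject(subj) + num)
}

func main() {
	fmt.Println(canonicalCourseCode("MSCI", "211")) // mse211
	fmt.Println(canonicalCourseCode("CS", "135"))   // cs135
}
```

With this shape, a future rename would be one extra map entry, and subjects without an entry pass through unchanged.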

Database migration:

  • Create reusable merge_course(old_code, new_code) function
  • Merge all msci* courses into mse* equivalents
  • Migration files in hasura/migrations/default/1770000000000_merge_msci_to_mse/
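As a rough in-memory model of what the merge appears to do to reviews, going only by the migration excerpt in this thread (copy rows from the old course, skip on a (course_id, user_id) conflict, then delete the old rows). The review struct, the mergeReviews helper, and the sample IDs are hypothetical; the real merge_course is a SQL function and may handle more tables than this.

```go
package main

import "fmt"

// review is a simplified, hypothetical stand-in for a row of the review table.
type review struct {
	courseID int
	userID   int
	text     string
}

// key models the (course_id, user_id) uniqueness used by ON CONFLICT.
type key struct{ courseID, userID int }

// mergeReviews re-points reviews from oldID to newID. If the same user
// already has a review under newID, the old review is skipped (the
// DO NOTHING case) and is then lost when the old rows are deleted.
func mergeReviews(reviews []review, oldID, newID int) []review {
	taken := make(map[key]bool)
	for _, r := range reviews {
		if r.courseID != oldID {
			taken[key{r.courseID, r.userID}] = true
		}
	}
	var out []review
	for _, r := range reviews {
		if r.courseID != oldID {
			out = append(out, r) // unrelated or already-new rows are kept
			continue
		}
		k := key{newID, r.userID}
		if taken[k] {
			continue // conflict: the old review is dropped
		}
		taken[k] = true
		out = append(out, review{newID, r.userID, r.text})
	}
	return out
}

func main() {
	reviews := []review{
		{1, 10, "old-code review by user 10"},
		{1, 11, "old-code review by user 11"},
		{2, 11, "new-code review by user 11"},
	}
	merged := mergeReviews(reviews, 1, 2)
	fmt.Println(len(merged)) // 2
}
```

Under this reading, a user who reviewed both the old and the new code keeps only the review already attached to the new code.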

Testing:

  • Unit tests pass: go test ./importer/uw/parts/course
  • Confirmed frontend shows only MSE courses with consolidated data

Future usage:

  • Add new renames to subjectRenames map as needed
  • Use merge_course function for other subject renames

@ivymxu ivymxu marked this pull request as ready for review December 27, 2025 20:30
@AD1938 AD1938 self-requested a review December 28, 2025 03:34

@AD1938 AD1938 left a comment

Thank you for the PR! The team noticed this before but did not get a chance to address it.

A few things we might want to consider:

  1. The merge_course function is one-way (old -> new). It might be better for MSCI and MSE courses to be interchangeable: we might want a way to associate multiple course codes with each other, so that querying any one of them returns all the associated ones.
  2. I assume this solution intends to handle future course merges by applying merge_course through Hasura migrations, which would keep the mapping information only in hasura/migrations/default. We might want the DB itself to know the merge information between course codes as well.
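To make the first point concrete, here is a small sketch of a two-way association between course codes, where looking up any code in a group returns the whole group. The aliasIndex type, its methods, and the sample codes are hypothetical; in the actual system this information would more likely live in a database table, as the second point suggests.

```go
package main

import (
	"fmt"
	"sort"
)

// aliasIndex maps every course code to the full, sorted set of codes
// in its group (hypothetical type, for illustration only).
type aliasIndex map[string][]string

// newAliasIndex builds the index from groups of equivalent codes.
func newAliasIndex(groups [][]string) aliasIndex {
	idx := make(aliasIndex)
	for _, g := range groups {
		sorted := append([]string(nil), g...) // copy so the input is not mutated
		sort.Strings(sorted)
		for _, code := range sorted {
			idx[code] = sorted
		}
	}
	return idx
}

// related returns every code associated with code, including itself.
// A code with no registered group is returned on its own.
func (idx aliasIndex) related(code string) []string {
	if group, ok := idx[code]; ok {
		return group
	}
	return []string{code}
}

func main() {
	idx := newAliasIndex([][]string{{"msci211", "mse211"}})
	fmt.Println(idx.related("msci211")) // [msci211 mse211]
	fmt.Println(idx.related("mse211"))  // [msci211 mse211]
	fmt.Println(idx.related("cs135"))   // [cs135]
}
```

Unlike a one-way rename, neither code is treated as the loser here, so old and new codes both stay searchable.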

Curious to hear your thoughts on this!

@@ -0,0 +1,89 @@
-- Merge MSCI course codes into MSE and make the helper reusable for future renames.

I tested this migration on prod data. Applying it raises errors from the review_check_course_taken trigger, likely because of some irregular data on prod. I need to look into this further and will add more details later.

FROM review
WHERE course_id = old_id
ON CONFLICT (course_id, user_id) DO NOTHING;
DELETE FROM review WHERE course_id = old_id;

@AD1938 AD1938 Dec 28, 2025

We should avoid removing old courses. If a new user who has taken an MSCI course signs up after we apply this migration, their transcript will still record MSCI, the parser will find no match for that MSCI course, and the user will not be able to leave a review.

Also, if someone (say an alum) who took MSCI courses but does not know about the MSCI -> MSE subject code change searches for MSCI data, they won't find anything.

UW Flow has kept lots of migrated and eliminated course codes. For example, CM is a subject code that is no longer in use, but we still keep its data (reviews + ratings).


func canonicalCourseCode(subj, num string) string {
	return strings.ToLower(canonicalSubject(subj) + num)
}

imo this is not needed. If MSCI has been changed to MSE, the UW API would give us MSE only.

If the UW API does give us MSCI in the future, it means real MSCI courses are being offered, and we should not map them to MSE.

Overall, I don't think the importer needs to know any remapping information.

2 participants