A fully automated, 24/7 participant recruitment pipeline integrating REDCap and the Microsoft Graph API. This Python-based system autonomously manages eligibility screening, Study ID assignment (HC/MDD), personalized email invitations with Microsoft Bookings scheduling links, and comprehensive weekly enrollment reporting.
- Overview
- Key Features
- System Architecture and Workflow
- Core Components
- Technology Stack
- Prerequisites
- Installation and Setup
- Configuration
- Microsoft Graph Authentication
- Deployment (Systemd)
- Usage and Monitoring
This pipeline automates the traditionally manual process of participant recruitment. By integrating REDCap with the Microsoft Graph API, it provides a seamless flow from initial screening to appointment booking, ensuring timely communication and accurate data management.
- 24/7 Autonomous Operation: Runs continuously as `systemd` services with automatic restart on failure.
- Real-Time Eligibility Screening: Immediate evaluation based on centralized criteria (Age, Location, Language, Contraindications, QIDS scores).
- Dynamic Study ID Assignment: Automated assignment of IDs based on group (e.g., Healthy Control: 3000+, MDD: 10200+).
- Microsoft Graph Integration: Secure email communication using OAuth2 with automatic token refresh (MSAL).
- Microsoft Bookings Scheduling: Sends personalized invitations with links for participants to self-schedule E-Consent sessions.
- Privacy Protection: Instructs participants to use only their Study ID (not their real name) when booking appointments.
- Professional Notifications: Automated, neutral communications for ineligible participants.
- Comprehensive Reporting: Generates detailed weekly HTML reports with enrollment funnel visualizations and metrics.
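As a rough illustration of the Dynamic Study ID Assignment feature, the sketch below picks the next free ID within a group's range. The starting values (3000 for HC, 10200 for MDD) come from the list above; the function name, the dictionary layout, and the assumption that each range ends where the next one starts are hypothetical and may not match `eligible_id_assigner.py`.

```python
from typing import Iterable

# Starting IDs are taken from the feature list; everything else here is an
# illustrative assumption, not the pipeline's actual code.
ID_RANGE_START = {"HC": 3000, "MDD": 10200}


def next_study_id(group: str, existing_ids: Iterable[int]) -> int:
    """Return the next unused Study ID for a group (hypothetical helper)."""
    start = ID_RANGE_START[group]
    # Assume a group's range ends where the next higher range begins.
    higher_starts = [s for s in ID_RANGE_START.values() if s > start]
    upper = min(higher_starts) if higher_starts else None
    in_group = [
        i for i in existing_ids
        if i >= start and (upper is None or i < upper)
    ]
    # First participant in a group receives the starting ID itself.
    return max(in_group) + 1 if in_group else start
```

A caller would pass the Study IDs already stored in REDCap, for example `next_study_id("MDD", ids_from_redcap)`.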
The pipeline operates as a series of stateless microservices that use REDCap as the Single Source of Truth (SSOT).
graph TD
U[Participant] -->|Submits Screening Form| R(REDCap - SSOT)
subgraph Stateless Services
S1(S1: ID Assigner Service)
S2(S2: Invitation Service)
S3(S3: Ineligible Notification Service)
end
R --o|Polls Pending Records| S1
S1 -->|Evaluates Eligibility & Assigns ID| S1
S1 -->|Updates REDCap: StudyID, Status| R
R --o|Polls Eligible Records| S2
S2 -->|Updates REDCap: Invitation Timestamp| R
R --o|Polls Ineligible Records| S3
S3 -->|Updates REDCap: Notification Timestamp| R
S2 -->|Send Invitation Email| MSAPI(Microsoft Graph API)
S3 -->|Send Notification Email| MSAPI
MSAPI --> E(Email to Participant)
style R fill:#f9f,stroke:#333,stroke-width:2px
style S1 fill:#9cf,stroke:#333,stroke-width:2px
style S2 fill:#9cf,stroke:#333,stroke-width:2px
style S3 fill:#9cf,stroke:#333,stroke-width:2px
- A participant submits the REDCap screening form.
- [S1] ID Assigner polls REDCap for pending records and evaluates eligibility using `EligibilityChecker`:
  - Eligible: Updates REDCap with Study ID and `pipeline_processing_status = 'eligible_id_assigned'`
  - Ineligible: Updates REDCap with `pipeline_processing_status = 'ineligible'` and reasons
  - Review Required: Updates REDCap with `pipeline_processing_status = 'manual_review_required'`
- [S2] Invitation Service polls REDCap for `pipeline_processing_status = 'eligible_id_assigned'`. Sends Microsoft Bookings invitations via MS Graph API and updates REDCap with `pipeline_invitation_sent_timestamp` and `pipeline_processing_status = 'eligible_invited'`.
- [S3] Ineligible Notification Service polls REDCap for `pipeline_processing_status = 'ineligible'`. Sends notifications via MS Graph API and updates REDCap with `pipeline_ineligible_notification_sent_timestamp` and `pipeline_processing_status = 'ineligible_notified'`.
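The S2 step in the workflow above can be sketched as a single polling pass that turns polled records into REDCap updates. The status values and field names are the ones listed above; `run_invitation_pass`, the injected `send_invitation` callable, and the `record_id` key (REDCap's default primary key name) are assumptions made for illustration.

```python
from datetime import datetime
from typing import Callable


def run_invitation_pass(
    records: list[dict],
    send_invitation: Callable[[dict], None],
    now: Callable[[], datetime] = datetime.now,
) -> list[dict]:
    """Return the REDCap updates produced by one S2 polling pass (sketch)."""
    updates = []
    for record in records:
        # Only act on records that S1 has marked as eligible with an ID.
        if record.get("pipeline_processing_status") != "eligible_id_assigned":
            continue
        send_invitation(record)  # stands in for the MS Graph call
        updates.append({
            "record_id": record["record_id"],  # assumed primary key name
            "pipeline_invitation_sent_timestamp": now().strftime("%Y-%m-%d %H:%M:%S"),
            "pipeline_processing_status": "eligible_invited",
        })
    return updates
```

Because the status is advanced to `eligible_invited` in the same update, a record that has been written back is skipped on the next pass. S3 would follow the same shape with the `ineligible` and `ineligible_notified` values.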
- No Local State: All services are stateless - no SQLite databases for tracking participant status
- REDCap as Truth: All participant state is stored in REDCap fields
- Idempotent Operations: Services can be restarted at any time without losing state
- Concurrency Safety: Built-in retry mechanisms for handling race conditions during ID assignment
- Independent Authentication: Each service manages its own MSAL token cache
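One way to combine the idempotency and concurrency-safety principles above is a read, write, re-read loop: write a candidate ID, read REDCap again, and retry if another record holds the same ID. This is a minimal sketch under that assumption, not necessarily the retry mechanism the pipeline uses. `assign_with_retry` and its injected callables are hypothetical, and `fetch_assigned_ids` is assumed to return only the IDs of the relevant group.

```python
import random
import time
from typing import Callable


def assign_with_retry(
    record_id: str,
    fetch_assigned_ids: Callable[[], dict[str, int]],
    write_id: Callable[[str, int], None],
    start: int,
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Assign a Study ID and verify it is unique by re-reading (sketch)."""
    for attempt in range(max_attempts):
        assigned = fetch_assigned_ids()  # {record_id: study_id}
        others = {sid for rid, sid in assigned.items() if rid != record_id}
        mine = assigned.get(record_id)
        if mine is not None and mine not in others:
            return mine  # already holds a unique ID: restart-safe
        candidate = max(others, default=start - 1) + 1
        write_id(record_id, candidate)
        # Re-read to detect a competing writer that chose the same ID.
        after = fetch_assigned_ids()
        if candidate not in {s for r, s in after.items() if r != record_id}:
            return candidate
        # Collision: back off with jitter before trying again.
        sleep(random.uniform(0, 0.1 * 2 ** attempt))
    raise RuntimeError(f"could not assign a unique Study ID to {record_id}")
```

The re-read narrows the race window but does not close it entirely, since REDCap offers no transaction around the two calls.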
| Service | Filename | Description |
|---|---|---|
| S1: ID Assigner | `eligible_id_assigner.py` | The core service. Polls REDCap for pending records, applies eligibility rules with QIDS validation, assigns Study IDs (HC/MDD) with concurrency handling, and updates REDCap status fields. |
| S2: Invitation Service | `outlook_autonomous_scheduler.py` | Monitors REDCap for eligible participants (`pipeline_processing_status = 'eligible_id_assigned'`) and sends personalized invitations with Microsoft Bookings links. Uses independent MSAL authentication with automatic token refresh. |
| S3: Ineligible Notifier | `send_ineligible_emails_fixed.py` | Monitors REDCap for ineligible participants (`pipeline_processing_status = 'ineligible'`) and sends professional notifications. Has independent MSAL authentication and does not depend on S2. |
| Module | Filename | Description |
|---|---|---|
| Eligibility Checker | `eligibility_checker.py` | Contains the centralized business logic and criteria for determining participant eligibility. |
| REDCap Client | `redcap_client.py` | A wrapper class for handling all interactions with the REDCap API (exporting/importing records). |
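To make the role of the Eligibility Checker module concrete, here is a minimal sketch of a centralized check that returns one of the three outcomes used by S1. Every threshold, field name, and the QIDS range below is a placeholder chosen for illustration; the study's real criteria live in `eligibility_checker.py` and are not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class EligibilityResult:
    status: str  # 'eligible', 'ineligible', or 'manual_review_required'
    reasons: list[str] = field(default_factory=list)


# Placeholder limits (assumptions), not the study's actual criteria.
MIN_AGE, MAX_AGE = 18, 65
QIDS_MIN, QIDS_MAX = 0, 27


def check_eligibility(record: dict) -> EligibilityResult:
    """Evaluate one screening record against placeholder criteria."""
    qids = record.get("qids_score")
    # A missing or out-of-range score cannot be judged automatically.
    if not isinstance(qids, int) or not QIDS_MIN <= qids <= QIDS_MAX:
        return EligibilityResult("manual_review_required", ["QIDS score missing or invalid"])
    reasons = []
    age = record.get("age")
    if not isinstance(age, int) or not MIN_AGE <= age <= MAX_AGE:
        reasons.append("age")
    if not record.get("lives_in_study_area", False):
        reasons.append("location")
    if not record.get("meets_language_requirement", False):
        reasons.append("language")
    # Treat a missing answer as a contraindication, to stay conservative.
    if record.get("has_contraindication", True):
        reasons.append("contraindications")
    return EligibilityResult("ineligible" if reasons else "eligible", reasons)
```

Collecting every failed criterion, rather than stopping at the first, gives S1 the full list to store in `pipeline_ineligibility_reasons`.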
- Languages: Python 3.10+
- Data Source: REDCap (via API) - Single Source of Truth
- External APIs: Microsoft Graph API (Email sending, OAuth2), Microsoft Bookings
- Libraries: `requests`, `msal` (Microsoft Authentication Library), `pandas`, `matplotlib`, `seaborn`, `python-dotenv`, `urllib3` (for retry strategies)
- Persistence: REDCap fields only (no local SQLite databases)
- Authentication: MSAL with file-based token caching (`.auth_cache_*.json`)
- Deployment: Systemd user services
- Python: Version 3.10 or higher
- REDCap: An active project with API access enabled (Token required)
- Azure AD Application: A registered application with the following Delegated permissions for Microsoft Graph:
  - `Mail.Send.Shared` (to send mail on behalf of the lab account)
  - `Mail.Send`
  - `User.Read`
- Operating System: Linux environment capable of running `systemd` user services
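With the delegated mail permissions above, a message can be sent from the lab mailbox through the Microsoft Graph `sendMail` endpoint. The sketch below only builds the request; the function name and arguments are hypothetical, and the services may construct their requests differently (for example via `/me/sendMail` with a `from` field).

```python
GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def build_send_mail_request(
    sender: str,
    recipient: str,
    subject: str,
    html_body: str,
    access_token: str,
) -> tuple[str, dict, dict]:
    """Build URL, headers, and JSON body for a Graph sendMail call (sketch)."""
    # Sending as another mailbox relies on the Mail.Send.Shared permission.
    url = f"{GRAPH_BASE}/users/{sender}/sendMail"
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    }
    payload = {
        "message": {
            "subject": subject,
            "body": {"contentType": "HTML", "content": html_body},
            "toRecipients": [{"emailAddress": {"address": recipient}}],
        },
        "saveToSentItems": True,
    }
    return url, headers, payload
```

The result can be passed to `requests.post(url, headers=headers, json=payload, timeout=30)`; Graph answers a successful `sendMail` with `202 Accepted` and an empty body.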
- Clone the repository:

      git clone https://github.com/PrecisionNeuroLab/Automated-REDCap-Recruitment-Pipeline.git
      cd Automated-REDCap-Recruitment-Pipeline

- Set up a Python virtual environment and install dependencies:

      python3 -m venv venv
      source venv/bin/activate
      pip install requests python-dotenv pandas matplotlib seaborn msal urllib3

- Create necessary directories:

      mkdir -p logs reports/charts
- Configure REDCap Fields (Required for SSOT): Ensure your REDCap project has these fields for state tracking:
  - `pipeline_processing_status` (Radio/Dropdown): Values: 'pending', 'eligible_id_assigned', 'ineligible', 'manual_review_required', 'eligible_invited', 'ineligible_notified'
  - `pipeline_ineligibility_reasons` (Text)
  - `pipeline_invitation_sent_timestamp` (Datetime)
  - `pipeline_ineligible_notification_sent_timestamp` (Datetime)
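Once these fields exist, a service can poll for a given state and write results back through the REDCap API. The sketch below builds the form payloads for a record export filtered on `pipeline_processing_status` and for a record import. The helper names are hypothetical, and the filter assumes the field's raw coded values match the names listed above.

```python
import json


def build_export_payload(token: str, status: str) -> dict:
    """Form fields for exporting records in one pipeline state (sketch)."""
    return {
        "token": token,
        "content": "record",
        "format": "json",
        "type": "flat",
        # Assumes the raw value of the field equals the status name.
        "filterLogic": f"[pipeline_processing_status] = '{status}'",
    }


def build_import_payload(token: str, updates: list[dict]) -> dict:
    """Form fields for writing field updates back to REDCap (sketch)."""
    return {
        "token": token,
        "content": "record",
        "format": "json",
        "type": "flat",
        "overwriteBehavior": "normal",
        "data": json.dumps(updates),  # REDCap expects the records as a string
    }
```

Either payload would be sent as form data, for example `requests.post(REDCAP_API_URL, data=payload, timeout=30)`.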
Create a .env file in the root directory and add the necessary credentials.
# REDCap Configuration
REDCAP_API_URL=https://redcap.your_institution.edu/api/
REDCAP_API_TOKEN=YOUR_REDCAP_API_TOKEN
# Azure AD / Microsoft Graph Configuration
# (The Tenant and Client IDs below are specific to the provided implementation)
AZURE_TENANT_ID=396573cb-f378-4b68-9bc8-15755c0c51f3
AZURE_CLIENT_ID=3d360571-8a54-4a1b-9373-58d35333d068
AZURE_CLIENT_SECRET=YOUR_AZURE_APPLICATION_CLIENT_SECRET
# Optional: Allow test emails (bypass rate limits for specific addresses if set to 'true')
ALLOW_TEST_EMAILS=false

The pipeline uses delegated authentication via MSAL to send emails autonomously. Each service manages its own authentication independently.
- Ensure Redirect URI: Verify that the Azure Application's redirect URI is set to `http://localhost:8000` for the initial authentication flow.
- Initial Authentication for Each Service: Each service needs to be authenticated once:

      # Authenticate the Invitation Service
      python outlook_autonomous_scheduler.py --test
      # Authenticate the Ineligible Notification Service
      python send_ineligible_emails_fixed.py --once

- Browser Login: A browser window will open for each service. Log in using the account that has delegation rights (e.g., `tristan8@stanford.edu`) to the sender account (e.g., `kellerlab@stanford.edu`).
- Token Storage: Each service stores its authentication tokens in separate MSAL cache files:
  - Invitation Service: `.auth_cache_scheduler.json`
  - Ineligible Notification Service: `.auth_cache_ineligible.json`
The services will automatically refresh access tokens as needed using MSAL's built-in token management. Each service is independent and doesn't rely on other services for authentication.
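The file-based cache and silent refresh described above could look roughly like the following with MSAL for Python. This is a sketch under assumptions: it uses a confidential client because the `.env` file includes a client secret, the function name is hypothetical, and the real services may structure this differently. The interactive first login is not shown.

```python
import os
from pathlib import Path
from typing import Optional

SCOPES = ["Mail.Send", "Mail.Send.Shared", "User.Read"]


def acquire_token_silently(cache_path: str) -> Optional[str]:
    """Return a cached or refreshed access token, or None if login is needed."""
    import msal  # imported lazily so this module loads without MSAL installed

    cache = msal.SerializableTokenCache()
    path = Path(cache_path)
    if path.exists():
        cache.deserialize(path.read_text())

    app = msal.ConfidentialClientApplication(
        os.environ["AZURE_CLIENT_ID"],
        authority=f"https://login.microsoftonline.com/{os.environ['AZURE_TENANT_ID']}",
        client_credential=os.environ["AZURE_CLIENT_SECRET"],
        token_cache=cache,
    )
    accounts = app.get_accounts()
    # MSAL uses the cached refresh token here when the access token expired.
    result = app.acquire_token_silent(SCOPES, account=accounts[0]) if accounts else None

    if cache.has_state_changed:
        path.write_text(cache.serialize())  # persist refreshed tokens
    return result["access_token"] if result and "access_token" in result else None
```

Calling it with a per-service path such as `.auth_cache_scheduler.json` keeps each service's tokens separate, which matches the independence described above.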
For robust, 24/7 operation, the pipeline is designed to run as user-level systemd services. This ensures the services start automatically at boot and restart if they fail.
Note: Setting up the specific .service files requires defining the ExecStart paths based on your installation directory. Refer to the project documentation for example configurations.
- Enable User Lingering: This allows user services to run even when the user is not logged in.

      loginctl enable-linger $USER

- Install and Start Services: After placing the `.service` definition files into `~/.config/systemd/user/`:

      systemctl --user daemon-reload
      systemctl --user enable redcap-id-assigner redcap-email-scheduler redcap-ineligible-emails
      systemctl --user start redcap-id-assigner redcap-email-scheduler redcap-ineligible-emails
You can run services manually if needed.
# Run ID Assigner once
python eligible_id_assigner.py --once
# Run Invitation Scheduler continuously
python outlook_autonomous_scheduler.py

Check the status of the services using systemctl:
systemctl --user status redcap-id-assigner
systemctl --user status redcap-email-scheduler

To view real-time logs using journalctl (if deployed via systemd):
# View ID Assigner logs
journalctl --user -u redcap-id-assigner -f
# View Invitation Service logs
journalctl --user -u redcap-email-scheduler -f

Logs are also written directly to the `./logs` directory (e.g., `logs/outlook_autonomous.log`).
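The two run modes shown above (a single pass with `--once`, or continuous polling) can be sketched with a small entry point. The `--once` flag comes from the commands above; the `--interval` option, the `main` signature, and the logger name are assumptions for illustration.

```python
import argparse
import logging
import time
from typing import Callable, Optional, Sequence


def main(
    process_once: Callable[[], int],
    argv: Optional[Sequence[str]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Run one pass, or poll forever, depending on --once (sketch)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--once", action="store_true", help="run a single pass and exit")
    # Hypothetical option: seconds to wait between polling passes.
    parser.add_argument("--interval", type=float, default=60.0)
    args = parser.parse_args(argv)

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("pipeline")
    while True:
        try:
            log.info("processed %d record(s)", process_once())
        except Exception:
            # Keep the service alive; the next pass re-reads REDCap state.
            log.exception("pass failed; retrying on the next cycle")
        if args.once:
            break
        sleep(args.interval)
```

Because `logging.basicConfig` writes to stderr by default, output from a service started by systemd ends up in the journal and is visible through `journalctl`.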
The pipeline consists of three core services:
- ID Assigner Service (`eligible_id_assigner.py`)
- Invitation Service (`outlook_autonomous_scheduler.py`)
- Ineligible Notification Service (`send_ineligible_emails_fixed.py`)
All services query REDCap directly for current state (SSOT architecture).