A Python tool for analyzing meeting recordings from the Grain API. It fetches meeting data, analyzes speaking patterns and participant engagement, and exports the results as a CSV report.
- Fetches meeting recordings from Grain API within a specified date range
- Analyzes speaking time for internal vs external participants
- Identifies who spoke first in meetings
- Calculates speaking wait times and participant join spread
- Classifies meetings by speaking patterns (internal only, external only, both, or no speech)
- Participant classification using scope field (internal/external/unknown) with intelligent fallback
- Participant emails - Lists all participant emails for each meeting
- Meeting owner identification - Identifies meeting owner from internal participants
- Tracks no-shows (participants who were scheduled but didn't attend)
- Calculates average handle time (using meeting duration as proxy)
- Lateness tracking (with limitations - see API Limitations section)
- Exports results to CSV for further analysis
- Python 3.8 or higher
- A Grain API key (Personal Access Token)
- Access to a Grain workspace
- Clone or download this repository
- Create a virtual environment (recommended):
  python -m venv venv
- Activate the virtual environment:
  - On macOS/Linux:
    source venv/bin/activate
  - On Windows:
    venv\Scripts\activate
- Install dependencies:
  pip install -r requirements.txt
- Log in to your Grain account
- Navigate to API settings (under Settings > Integration > API)
- Generate a Personal Access Token (PAT) or Workspace Access Token (WAT)
- Copy the token - you'll need it to run the analysis
All parameters can be configured via command-line arguments or environment variables. Command-line arguments take precedence over environment variables.
- API Key: Your Grain Personal Access Token (workspace is determined automatically from the API key)
- Start Date: Start date in YYYY-MM-DD format (inclusive)
- End Date: End date in YYYY-MM-DD format (exclusive)
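
The precedence rule above (command-line arguments override environment variables) can be sketched with `argparse` by using the environment value as each argument's default. This is a minimal illustration, not the tool's actual code: `resolve_settings` is a hypothetical helper, and only the flag and variable names are taken from this README.

```python
import argparse
import os


def resolve_settings(argv=None, env=None):
    """Hypothetical helper: CLI arguments win over environment variables."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Settings resolver sketch")
    # Each default comes from the environment, so an explicit CLI value replaces it.
    parser.add_argument("--api-key", default=env.get("GRAIN_API_KEY"))
    parser.add_argument("--start-date", default=env.get("GRAIN_START_DATE"))
    parser.add_argument("--end-date", default=env.get("GRAIN_END_DATE"))
    args = parser.parse_args(argv)
    # Fail early if a value is set neither on the command line nor in the environment.
    missing = [name for name, value in vars(args).items() if not value]
    if missing:
        parser.error("missing required settings: " + ", ".join(missing))
    return vars(args)


demo = resolve_settings(
    ["--start-date", "2025-10-19"],
    env={
        "GRAIN_API_KEY": "your-api-key-here",
        "GRAIN_START_DATE": "2025-01-01",
        "GRAIN_END_DATE": "2025-11-20",
    },
)
print(demo["start_date"])  # prints 2025-10-19: the CLI value overrides the environment
```

Passing `env` explicitly keeps the sketch testable without touching the real process environment.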
python run_analysis.py \
  --api-key "your-api-key-here" \
  --start-date "2025-10-19" \
  --end-date "2025-11-20"

Set environment variables:

export GRAIN_API_KEY="your-api-key-here"
export GRAIN_START_DATE="2025-10-19"
export GRAIN_END_DATE="2025-11-20"

On Windows:

set GRAIN_API_KEY=your-api-key-here
set GRAIN_START_DATE=2025-10-19
set GRAIN_END_DATE=2025-11-20

Then run:

python run_analysis.py

You can use environment variables for some parameters and command-line arguments for others. CLI arguments override environment variables:

export GRAIN_API_KEY="your-api-key-here"
python run_analysis.py --start-date "2025-10-19" --end-date "2025-11-20"

Run with all required parameters:

python run_analysis.py \
  --api-key "your-api-key" \
  --start-date "2025-10-19" \
  --end-date "2025-11-20"

Run with optional parameters:

python run_analysis.py \
  --api-key "your-api-key" \
  --start-date "2025-10-19" \
  --end-date "2025-11-20" \
  --simultaneous-threshold 10 \
  --output "custom_output.csv"

Show all available options:

python run_analysis.py --help

# Set all parameters via environment variables
export GRAIN_API_KEY="grain_pat_your_key_here"
export GRAIN_START_DATE="2025-10-19"
export GRAIN_END_DATE="2025-11-20"
# Run the analysis
python run_analysis.py
# Or use command-line arguments
python run_analysis.py \
  --api-key "grain_pat_your_key_here" \
  --start-date "2025-10-19" \
  --end-date "2025-11-20" \
  --output "my_analysis_results.csv"

You can also import and use the function in your own scripts:

from meeting_data_analysis import meeting_data_analysis
df = meeting_data_analysis(
api_key="your-api-key",
start_date="2025-10-19",
end_date="2025-11-20",
simultaneous_threshold_seconds=5
)
# Process the DataFrame
print(df.head())
df.to_csv("my_analysis.csv", index=False)

Required (command-line argument or environment variable):
- --api-key or GRAIN_API_KEY: Your Grain API Personal Access Token (workspace is determined automatically)
- --start-date or GRAIN_START_DATE: Start date in YYYY-MM-DD format (inclusive)
- --end-date or GRAIN_END_DATE: End date in YYYY-MM-DD format (exclusive)

Optional:
- --simultaneous-threshold: Threshold in seconds to consider two speakers as speaking simultaneously (default: 5)
- --output: Output CSV file path (default: meeting_data_analysis.csv)

When using meeting_data_analysis() as a Python function:
- api_key (str, required): Your Grain API Personal Access Token (workspace is determined automatically)
- start_date (str, required): Start date in YYYY-MM-DD format (inclusive)
- end_date (str, required): End date in YYYY-MM-DD format (exclusive)
- simultaneous_threshold_seconds (int, optional): Threshold in seconds to consider two speakers as speaking simultaneously. Default: 5
The analysis generates a CSV file (meeting_data_analysis.csv) with the following columns:
- internal_user_ids: List of internal user IDs who participated
- recording_id: Unique identifier for the recording
- start_datetime: When the meeting started
- duration_minutes: Meeting duration in minutes
- meeting_owner_email: Email of the meeting owner (first internal participant)
- meeting_owner_name: Name of the meeting owner
- participant_emails: List of all participant emails for the meeting (sorted)
- num_speakers: Total number of unique speakers
- num_internal_speakers_who_spoke: Number of internal participants who spoke
- num_external_speakers_who_spoke: Number of external participants who spoke
- internal_speaking_minutes: Total internal speaking time in minutes
- external_speaking_minutes: Total external speaking time in minutes
- speaking_category: Classification (e.g., "Both spoke", "Only internal spoke", "Only external spoke", "No one spoke")
- internal_speaking_pct: Percentage of total speaking time by internal participants
- num_internal_participants: Total number of internal participants
- num_external_participants: Total number of external participants
- first_internal_spoke_time: Timestamp when first internal participant spoke
- first_external_spoke_time: Timestamp when first external participant spoke
- who_spoke_first: Classification of who spoke first ("internal", "external", "simultaneous", "only_internal", "only_external", "no_speech")
- speaking_wait_time_minutes: Time difference between first internal and external speech (in minutes)
- first_join_time: When the first participant joined
- last_join_time: When the last participant joined
- total_meeting_participants: Total number of participants
- join_spread_minutes: Time difference between first and last join (in minutes)
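
To make the who_spoke_first and speaking_wait_time_minutes columns concrete, the following sketch shows one way they could be derived from the two first-spoke timestamps and the simultaneous threshold. It is an assumption-laden illustration, not the tool's implementation: `classify_first_speaker` is a hypothetical name, the threshold comparison is assumed to be inclusive, and the wait time is assumed to be an absolute difference.

```python
from datetime import datetime


def classify_first_speaker(first_internal, first_external, threshold_seconds=5):
    """Return (who_spoke_first, speaking_wait_time_minutes) for one meeting.

    Both timestamps are datetime objects, or None if that side never spoke.
    """
    if first_internal is None and first_external is None:
        return "no_speech", None
    if first_external is None:
        return "only_internal", None
    if first_internal is None:
        return "only_external", None
    # Positive gap: the internal side spoke before the external side.
    gap_seconds = (first_external - first_internal).total_seconds()
    wait_minutes = abs(gap_seconds) / 60  # assumption: unsigned difference
    if abs(gap_seconds) <= threshold_seconds:  # assumption: inclusive bound
        return "simultaneous", wait_minutes
    return ("internal" if gap_seconds > 0 else "external"), wait_minutes


label, wait = classify_first_speaker(
    datetime(2025, 10, 19, 10, 0, 0),
    datetime(2025, 10, 19, 10, 0, 30),
)
print(label, wait)  # prints: internal 0.5
```

A larger threshold, such as the `--simultaneous-threshold 10` used in the examples, would move more meetings into the "simultaneous" bucket.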
The following metrics were requested for appointment/operational analysis:
- num_no_shows: Number of participants who were scheduled but did not attend (confirmed_attendee=False)
- num_internal_no_shows: Number of internal participants who were no-shows
- num_external_no_shows: Number of external participants who were no-shows
- average_handle_time_minutes: Average handle time (using total meeting duration as proxy)
- customer_late_minutes: LIMITED - Customer lateness in minutes (requires scheduled start time - see limitations)
- agent_late_minutes: LIMITED - Agent lateness in minutes (requires scheduled start time - see limitations)
- meeting_start_late_minutes: LIMITED - Meeting start lateness in minutes (requires scheduled start time - see limitations below)
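
As a rough illustration of the three no-show columns, the sketch below counts participants whose confirmed_attendee value is False and splits them by scope. The function name `count_no_shows` and the plain-dict participant shape are assumptions for this example; participants whose scope is not "internal" are counted as external here, which simplifies the tool's fuller classification logic.

```python
def count_no_shows(participants):
    """Count no-shows in a list of participant dicts (hypothetical shape)."""
    counts = {
        "num_no_shows": 0,
        "num_internal_no_shows": 0,
        "num_external_no_shows": 0,
    }
    for participant in participants:
        # Only an explicit False counts; a missing value is not treated as a no-show.
        if participant.get("confirmed_attendee") is False:
            counts["num_no_shows"] += 1
            if participant.get("scope") == "internal":
                counts["num_internal_no_shows"] += 1
            else:
                counts["num_external_no_shows"] += 1
    return counts


example = [
    {"scope": "internal", "confirmed_attendee": True},
    {"scope": "external", "confirmed_attendee": False},
    {"scope": "internal", "confirmed_attendee": False},
]
print(count_no_shows(example)["num_no_shows"])  # prints 2
```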
The tool classifies participants as internal or external using the following priority:
- Scope field (primary method):
  - scope="internal" → Classified as internal
  - scope="external" → Classified as external
  - scope="unknown" or null → Uses fallback methods below
- Email domain (for unknown scope):
  - Participants with @grain.co or @grain.com emails are classified as internal
- Participant cache (for unknown scope):
  - Uses a cache of known internal participants (by ID and name) built from all recordings
  - Helps identify internal participants even when email is missing in some recordings
- User ID (for unknown scope):
  - If user_id is present, participant is classified as internal
- Default: External (if none of the above apply)
This multi-method approach helps keep classification accurate even when the API doesn't consistently return complete participant data across all recordings.
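
The priority order above can be sketched as a single function that tries each method in turn. This is an illustrative reading of the rules, not the tool's source: `classify_participant` is a hypothetical name, the `email`, `id`, and `name` keys are assumed field names, and the cache is modeled as two plain sets.

```python
INTERNAL_EMAIL_SUFFIXES = ("@grain.co", "@grain.com")


def classify_participant(participant, known_internal_ids=frozenset(),
                         known_internal_names=frozenset()):
    """Return "internal" or "external" for one participant dict."""
    # 1. Scope field: trusted whenever it is explicit.
    scope = participant.get("scope")
    if scope in ("internal", "external"):
        return scope
    # 2. Email domain (scope is "unknown" or missing from here on).
    email = (participant.get("email") or "").lower()
    if email.endswith(INTERNAL_EMAIL_SUFFIXES):
        return "internal"
    # 3. Cache of known internal participants, by ID or by name.
    if (participant.get("id") in known_internal_ids
            or participant.get("name") in known_internal_names):
        return "internal"
    # 4. Presence of a user_id.
    if participant.get("user_id"):
        return "internal"
    # 5. Default.
    return "external"


print(classify_participant({"scope": "unknown", "email": "ana@grain.com"}))  # prints internal
print(classify_participant({"scope": None, "name": "Visitor"}))  # prints external
```

Because an explicit scope returns first, the fallbacks only ever change the result for participants the API could not classify.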
Important: The Grain API has limitations that affect some requested metrics:
- No-Shows: Partially available
  - The API provides confirmed_attendee field which indicates if a participant was present
  - confirmed_attendee=False indicates a no-show
  - Limitation: This only works if participants are marked as scheduled in Grain. If the API doesn't return scheduled participants who didn't attend, they won't be counted.
- Average Handle Time: Available as proxy
  - Using duration_ms (total meeting duration) as a proxy for handle time
  - Limitation: This is the total meeting duration, not necessarily the exact "handle time" which may have a different definition (e.g., active engagement time, time from first contact to resolution, etc.)
- Lateness Metrics (Customer/Agent/Meeting Start):
  - Missing Data: The Grain API does not provide scheduled start times
  - What's Available: Only actual start time (start_datetime) and actual join times
  - Impact: Cannot calculate lateness without comparing actual vs scheduled times
  - Workaround: Would require integration with calendar system (Google Calendar, Outlook, etc.) or CRM to get scheduled times
- Complete No-Show Tracking:
  - Missing Data: The API doesn't provide a list of all scheduled participants
  - What's Available: Only participants who were in the recording or marked as no-shows
  - Impact: If someone was scheduled but never appeared in the recording data, they may not be counted
  - Workaround: Would require integration with calendar/CRM system to get full scheduled attendee list
To fully support the requested metrics, consider:
- For Lateness Calculation:
  - Integrate with calendar API (Google Calendar, Outlook, etc.) to get scheduled start times
  - Compare start_datetime from Grain API with scheduled start time from calendar
  - Calculate: lateness = actual_start - scheduled_start
- For Complete No-Show Tracking:
  - Integrate with calendar API to get all scheduled attendees
  - Compare scheduled attendees with confirmed_attendee=True participants from Grain
  - Missing attendees = no-shows
- For Handle Time:
  - Clarify the exact definition of "handle time" (total duration, active time, time to resolution, etc.)
  - If different from total duration, may need additional processing or different data source
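
If scheduled data were available from a calendar system, the lateness and no-show comparisons described above might look like the sketch below. Everything here is hypothetical: the function names, the idea that scheduled attendees arrive as a list of email addresses, and the `email` key on Grain participants are assumptions, and no calendar API is actually called.

```python
from datetime import datetime


def lateness_minutes(actual_start, scheduled_start):
    """lateness = actual_start - scheduled_start, in minutes (negative if early)."""
    return (actual_start - scheduled_start).total_seconds() / 60


def find_no_shows(scheduled_emails, grain_participants):
    """Scheduled attendees with no confirmed_attendee=True match in the Grain data."""
    attended = {
        p["email"].lower()
        for p in grain_participants
        if p.get("confirmed_attendee") and p.get("email")
    }
    scheduled = {email.lower() for email in scheduled_emails}
    return sorted(scheduled - attended)


late = lateness_minutes(
    datetime(2025, 10, 19, 10, 4, 30),  # start_datetime from Grain
    datetime(2025, 10, 19, 10, 0, 0),   # scheduled start from the calendar
)
print(late)  # prints 4.5

missing = find_no_shows(
    ["ana@example.com", "bo@example.com"],
    [{"email": "ana@example.com", "confirmed_attendee": True}],
)
print(missing)  # prints ['bo@example.com']
```

Matching on lowercased email is one possible join key; a real integration would need to handle attendees who join under a different address or without one.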
The analysis now includes the requested operational metrics where possible:
✅ Implemented:
- No-show tracking (using confirmed_attendee field)
- Average handle time (using meeting duration as proxy)
- Lateness metrics (columns included but will be null/empty - requires scheduled start times not available in API)
The CSV output includes all requested columns. Lateness metrics are included as placeholders but will be empty/null until scheduled meeting data is available from an external source (calendar system, CRM, etc.).
Analysis complete! Found 76 recordings.
First 10 rows:
internal_user_ids ... join_spread_minutes
0 NaN ... 0.0
1 NaN ... 0.0
...
If you get authentication errors:
- Verify your API key is correct
- Ensure the API key has not expired
- Check that your API key has the necessary permissions
If no recordings are found:
- Verify the date range is correct
- Check that recordings exist for that date range in the workspace associated with your API key
- Ensure your API key has access to view recordings
If you get import errors:
- Ensure you've activated the virtual environment
- Run pip install -r requirements.txt to install dependencies
- pandas>=2.3.3 - Data manipulation and analysis
- numpy>=2.3.5 - Numerical computing
- requests>=2.32.5 - HTTP library for API calls
This project is provided as-is for analysis purposes.
For issues related to:
- Grain API: Contact Grain support or check Grain API Documentation
- This tool: Check the code comments or modify as needed for your use case