For customers to pull meeting metrics

grain-team/meeting-data-analysis

Meeting Data Analysis

A Python tool for analyzing meeting recordings from the Grain API. It fetches meeting data, analyzes speaking patterns and participant engagement, and exports the results as a CSV report.

Features

  • Fetches meeting recordings from Grain API within a specified date range
  • Analyzes speaking time for internal vs external participants
  • Identifies who spoke first in meetings
  • Calculates speaking wait times and participant join spread
  • Classifies meetings by speaking patterns (internal only, external only, both, or no speech)
  • Classifies participants as internal or external using the scope field (internal/external/unknown), with fallback methods when the scope is unknown
  • Lists all participant emails for each meeting
  • Identifies the meeting owner from the internal participants
  • Tracks no-shows (participants who were scheduled but did not attend)
  • Calculates average handle time (using meeting duration as proxy)
  • Lateness tracking (with limitations - see API Limitations section)
  • Exports results to CSV for further analysis

Prerequisites

  • Python 3.11 or higher (required by the numpy>=2.3.5 dependency listed under Dependencies)
  • A Grain API key (Personal Access Token)
  • Access to a Grain workspace

Installation

  1. Clone or download this repository

  2. Create a virtual environment (recommended):

    python -m venv venv
  3. Activate the virtual environment:

    • On macOS/Linux:
      source venv/bin/activate
    • On Windows:
      venv\Scripts\activate
  4. Install dependencies:

    pip install -r requirements.txt

Getting Your API Key

  1. Log in to your Grain account
  2. Navigate to the API settings (under Settings > Integration > API)
  3. Generate a Personal Access Token (PAT) or Workspace Access Token (WAT)
  4. Copy the token - you'll need it to run the analysis

Configuration

All parameters can be configured via command-line arguments or environment variables. Command-line arguments take precedence over environment variables.

Required Parameters

  • API Key: Your Grain Personal Access Token (workspace is determined automatically from the API key)
  • Start Date: Start date in YYYY-MM-DD format (inclusive)
  • End Date: End date in YYYY-MM-DD format (exclusive)
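Because the start date is inclusive and the end date is exclusive, the range is a half-open interval. The sketch below illustrates that rule at day granularity. `in_date_range` is a hypothetical helper written for this example, not a function from this tool, and the tool's own filtering may work on timestamps rather than dates.

```python
from datetime import date


def in_date_range(day: date, start: str, end: str) -> bool:
    """Return True if `day` falls in the half-open range [start, end)."""
    start_day = date.fromisoformat(start)  # expects YYYY-MM-DD
    end_day = date.fromisoformat(end)
    return start_day <= day < end_day


# The start day is included, the end day is not.
print(in_date_range(date(2025, 10, 19), "2025-10-19", "2025-11-20"))
print(in_date_range(date(2025, 11, 20), "2025-10-19", "2025-11-20"))
```

To include recordings from 2025-11-20 itself, pass 2025-11-21 as the end date.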

Option 1: Command-Line Arguments (Recommended)

python run_analysis.py \
  --api-key "your-api-key-here" \
  --start-date "2025-10-19" \
  --end-date "2025-11-20"

Option 2: Environment Variables

Set environment variables:

export GRAIN_API_KEY="your-api-key-here"
export GRAIN_START_DATE="2025-10-19"
export GRAIN_END_DATE="2025-11-20"

On Windows:

set GRAIN_API_KEY=your-api-key-here
set GRAIN_START_DATE=2025-10-19
set GRAIN_END_DATE=2025-11-20

Then run:

python run_analysis.py

Option 3: Mix of Both

You can use environment variables for some parameters and command-line arguments for others. CLI arguments override environment variables:

export GRAIN_API_KEY="your-api-key-here"

python run_analysis.py --start-date "2025-10-19" --end-date "2025-11-20"
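The precedence rule (CLI arguments over environment variables) can be sketched with argparse by using the environment values as defaults. This is only an illustration of the pattern; `resolve_settings` is a hypothetical name, and run_analysis.py may implement the rule differently.

```python
import argparse
import os


def resolve_settings(argv, env=None):
    """Resolve settings, letting CLI arguments override environment variables."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()
    # Environment values act as defaults; an explicit CLI flag replaces them.
    parser.add_argument("--api-key", default=env.get("GRAIN_API_KEY"))
    parser.add_argument("--start-date", default=env.get("GRAIN_START_DATE"))
    parser.add_argument("--end-date", default=env.get("GRAIN_END_DATE"))
    args = parser.parse_args(argv)
    # Fail early if a setting was supplied by neither source.
    missing = [name for name, value in vars(args).items() if not value]
    if missing:
        parser.error("missing required settings: " + ", ".join(missing))
    return args


# Simulated environment: the start date here is overridden on the command line.
env = {"GRAIN_API_KEY": "key-from-env", "GRAIN_START_DATE": "2025-01-01"}
args = resolve_settings(["--start-date", "2025-10-19", "--end-date", "2025-11-20"], env)
print(args.api_key, args.start_date, args.end_date)
```

Here the API key comes from the environment, while both dates come from the command line.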

Usage

Basic Usage

Run with all required parameters:

python run_analysis.py \
  --api-key "your-api-key" \
  --start-date "2025-10-19" \
  --end-date "2025-11-20"

Advanced Options

python run_analysis.py \
  --api-key "your-api-key" \
  --start-date "2025-10-19" \
  --end-date "2025-11-20" \
  --simultaneous-threshold 10 \
  --output "custom_output.csv"

View All Options

python run_analysis.py --help

Complete Example

# Set all parameters via environment variables
export GRAIN_API_KEY="grain_pat_your_key_here"
export GRAIN_START_DATE="2025-10-19"
export GRAIN_END_DATE="2025-11-20"

# Run the analysis
python run_analysis.py

# Or use command-line arguments
python run_analysis.py \
  --api-key "grain_pat_your_key_here" \
  --start-date "2025-10-19" \
  --end-date "2025-11-20" \
  --output "my_analysis_results.csv"

Using as a Module

You can also import and use the function in your own scripts:

from meeting_data_analysis import meeting_data_analysis

df = meeting_data_analysis(
    api_key="your-api-key",
    start_date="2025-10-19",
    end_date="2025-11-20",
    simultaneous_threshold_seconds=5
)

# Process the DataFrame
print(df.head())
df.to_csv("my_analysis.csv", index=False)

Command-Line Parameters

Required Parameters

  • --api-key or GRAIN_API_KEY: Your Grain API Personal Access Token (workspace is determined automatically)
  • --start-date or GRAIN_START_DATE: Start date in YYYY-MM-DD format (inclusive)
  • --end-date or GRAIN_END_DATE: End date in YYYY-MM-DD format (exclusive)

Optional Parameters

  • --simultaneous-threshold: Threshold in seconds to consider two speakers as speaking simultaneously (default: 5)
  • --output: Output CSV file path (default: meeting_data_analysis.csv)

Function Parameters (for direct use)

When using meeting_data_analysis() as a Python function:

  • api_key (str, required): Your Grain API Personal Access Token (workspace is determined automatically)
  • start_date (str, required): Start date in YYYY-MM-DD format (inclusive)
  • end_date (str, required): End date in YYYY-MM-DD format (exclusive)
  • simultaneous_threshold_seconds (int, optional): Threshold in seconds to consider two speakers as speaking simultaneously. Default: 5
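To show how the simultaneous threshold could affect the who_spoke_first column, here is a minimal sketch. It assumes each side's first speech is given as an offset in seconds from the start of the recording, with None meaning that side never spoke. `classify_first_speaker` is a hypothetical helper, and whether the tool treats a gap exactly equal to the threshold as simultaneous is an assumption.

```python
from typing import Optional


def classify_first_speaker(first_internal_s: Optional[float],
                           first_external_s: Optional[float],
                           threshold_s: float = 5) -> str:
    """Classify who spoke first from each side's first speech offset (seconds)."""
    if first_internal_s is None and first_external_s is None:
        return "no_speech"
    if first_external_s is None:
        return "only_internal"
    if first_internal_s is None:
        return "only_external"
    # Assumption: gaps up to and including the threshold count as simultaneous.
    if abs(first_internal_s - first_external_s) <= threshold_s:
        return "simultaneous"
    return "internal" if first_internal_s < first_external_s else "external"


print(classify_first_speaker(2.0, 4.5))         # 2.5 s apart, within the default 5 s
print(classify_first_speaker(1.0, 30.0))        # internal clearly first
print(classify_first_speaker(2.0, 4.5, 1))      # stricter threshold
```

A larger threshold moves more meetings into the "simultaneous" bucket; a smaller one splits them between "internal" and "external".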

Output

The analysis generates a CSV file (meeting_data_analysis.csv by default; change the path with --output) with the following columns:

Core Metrics

  • internal_user_ids: List of internal user IDs who participated
  • recording_id: Unique identifier for the recording
  • start_datetime: When the meeting started
  • duration_minutes: Meeting duration in minutes
  • meeting_owner_email: Email of the meeting owner (first internal participant)
  • meeting_owner_name: Name of the meeting owner
  • participant_emails: List of all participant emails for the meeting (sorted)
  • num_speakers: Total number of unique speakers
  • num_internal_speakers_who_spoke: Number of internal participants who spoke
  • num_external_speakers_who_spoke: Number of external participants who spoke
  • internal_speaking_minutes: Total internal speaking time in minutes
  • external_speaking_minutes: Total external speaking time in minutes
  • speaking_category: Classification (e.g., "Both spoke", "Only internal spoke", "Only external spoke", "No one spoke")
  • internal_speaking_pct: Percentage of total speaking time by internal participants
  • num_internal_participants: Total number of internal participants
  • num_external_participants: Total number of external participants
  • first_internal_spoke_time: Timestamp when first internal participant spoke
  • first_external_spoke_time: Timestamp when first external participant spoke
  • who_spoke_first: Classification of who spoke first ("internal", "external", "simultaneous", "only_internal", "only_external", "no_speech")
  • speaking_wait_time_minutes: Time difference between first internal and external speech (in minutes)
  • first_join_time: When the first participant joined
  • last_join_time: When the last participant joined
  • total_meeting_participants: Total number of participants
  • join_spread_minutes: Time difference between first and last join (in minutes)
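Several of these columns are simple derivations from others. The sketch below shows one plausible way to compute speaking_category, internal_speaking_pct, and join_spread_minutes. The helper names and the rounding are assumptions made for illustration; the tool's actual implementation may differ.

```python
from datetime import datetime


def speaking_summary(internal_minutes: float, external_minutes: float) -> dict:
    """Derive the speaking category and the internal share of speaking time."""
    if internal_minutes > 0 and external_minutes > 0:
        category = "Both spoke"
    elif internal_minutes > 0:
        category = "Only internal spoke"
    elif external_minutes > 0:
        category = "Only external spoke"
    else:
        category = "No one spoke"
    total = internal_minutes + external_minutes
    # The percentage is undefined when nobody spoke, so return None in that case.
    pct = round(100 * internal_minutes / total, 1) if total > 0 else None
    return {"speaking_category": category, "internal_speaking_pct": pct}


def join_spread_minutes(join_times: list) -> float:
    """Minutes between the first and the last participant joining."""
    return (max(join_times) - min(join_times)).total_seconds() / 60


print(speaking_summary(6.0, 18.0))
print(join_spread_minutes([datetime(2025, 10, 20, 15, 0, 0),
                           datetime(2025, 10, 20, 15, 2, 30)]))
```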

Operational Metrics (Requested Features)

The following metrics were requested for appointment/operational analysis:

  • num_no_shows: Number of participants who were scheduled but did not attend (confirmed_attendee=False)
  • num_internal_no_shows: Number of internal participants who were no-shows
  • num_external_no_shows: Number of external participants who were no-shows
  • average_handle_time_minutes: Average handle time (using total meeting duration as proxy)
  • customer_late_minutes: Customer lateness in minutes. LIMITED: requires a scheduled start time (see the API Limitations section)
  • agent_late_minutes: Agent lateness in minutes. LIMITED: requires a scheduled start time (see the API Limitations section)
  • meeting_start_late_minutes: Meeting start lateness in minutes. LIMITED: requires a scheduled start time (see the API Limitations section)
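The no-show counts can be sketched as a single pass over a recording's participants. The example assumes each participant is a dict with a confirmed_attendee flag and an already-resolved scope value; treating any non-internal scope as external mirrors the classification default but is an assumption here. `count_no_shows` is a hypothetical helper.

```python
def count_no_shows(participants: list) -> dict:
    """Count participants whose confirmed_attendee flag is explicitly False."""
    counts = {"num_no_shows": 0, "num_internal_no_shows": 0, "num_external_no_shows": 0}
    for participant in participants:
        # A missing or None flag is not treated as a no-show.
        if participant.get("confirmed_attendee") is False:
            counts["num_no_shows"] += 1
            if participant.get("scope") == "internal":
                counts["num_internal_no_shows"] += 1
            else:
                counts["num_external_no_shows"] += 1
    return counts


sample = [
    {"scope": "internal", "confirmed_attendee": True},
    {"scope": "external", "confirmed_attendee": False},
    {"scope": "internal", "confirmed_attendee": False},
    {"scope": "external"},  # flag missing: not counted
]
print(count_no_shows(sample))
```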

Participant Classification

The tool classifies participants as internal or external using the following priority:

  1. Scope field (primary method):

    • scope="internal" → Classified as internal
    • scope="external" → Classified as external
    • scope="unknown" or null → Uses fallback methods below
  2. Email domain (for unknown scope):

    • Participants with @grain.co or @grain.com emails are classified as internal
  3. Participant cache (for unknown scope):

    • Uses a cache of known internal participants (by ID and name) built from all recordings
    • Helps identify internal participants even when email is missing in some recordings
  4. User ID (for unknown scope):

    • If user_id is present, participant is classified as internal
  5. Default: External (if none of the above apply)

This multi-method approach improves classification accuracy when the API does not return complete participant data consistently across recordings.
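The priority order above can be sketched as a chain of checks. The participant field names id, name, and email, as well as the function and parameter names, are assumptions made for this example; only scope and user_id are named in this README.

```python
INTERNAL_DOMAINS = ("@grain.co", "@grain.com")


def classify_participant(participant: dict,
                         known_internal_ids=frozenset(),
                         known_internal_names=frozenset()) -> str:
    """Classify a participant as 'internal' or 'external' by priority order."""
    # 1. An explicit scope wins over every fallback.
    scope = participant.get("scope")
    if scope in ("internal", "external"):
        return scope
    # 2. Email domain.
    email = (participant.get("email") or "").lower()
    if email.endswith(INTERNAL_DOMAINS):
        return "internal"
    # 3. Cache of internal participants seen in other recordings.
    if (participant.get("id") in known_internal_ids
            or participant.get("name") in known_internal_names):
        return "internal"
    # 4. Presence of a user ID.
    if participant.get("user_id"):
        return "internal"
    # 5. Default.
    return "external"


print(classify_participant({"scope": "unknown", "email": "sam@grain.com"}))
print(classify_participant({"scope": None, "name": "Sam"}, known_internal_names={"Sam"}))
print(classify_participant({"scope": "unknown", "email": "pat@example.com"}))
```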

API Limitations & Data Availability

Important: The Grain API has limitations that affect some requested metrics:

✅ Available Metrics

  1. No-Shows: Partially available

    • The API provides confirmed_attendee field which indicates if a participant was present
    • confirmed_attendee=False indicates a no-show
    • Limitation: This only works if participants are marked as scheduled in Grain. If the API doesn't return scheduled participants who didn't attend, they won't be counted.
  2. Average Handle Time: Available as proxy

    • Using duration_ms (total meeting duration) as a proxy for handle time
    • Limitation: This is the total meeting duration, not necessarily the exact "handle time" which may have a different definition (e.g., active engagement time, time from first contact to resolution, etc.)
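As an illustration of the duration proxy, the sketch below converts duration_ms values to minutes and averages them across recordings. Whether the tool reports a per-recording value or an average across recordings is not stated here, so treat the averaging step as an assumption; `average_handle_time_minutes` is a hypothetical helper.

```python
from typing import Optional


def average_handle_time_minutes(durations_ms: list) -> Optional[float]:
    """Average meeting duration in minutes, used as a proxy for handle time."""
    if not durations_ms:
        return None  # no recordings, no average
    return sum(durations_ms) / len(durations_ms) / 60_000


# Two recordings: 30 minutes and 15 minutes.
print(average_handle_time_minutes([1_800_000, 900_000]))
```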

❌ Not Available via API

  1. Lateness Metrics (Customer/Agent/Meeting Start):

    • Missing Data: The Grain API does not provide scheduled start times
    • What's Available: Only actual start time (start_datetime) and actual join times
    • Impact: Cannot calculate lateness without comparing actual vs scheduled times
    • Workaround: Would require integration with calendar system (Google Calendar, Outlook, etc.) or CRM to get scheduled times
  2. Complete No-Show Tracking:

    • Missing Data: The API doesn't provide a list of all scheduled participants
    • What's Available: Only participants who were in the recording or marked as no-shows
    • Impact: If someone was scheduled but never appeared in the recording data, they may not be counted
    • Workaround: Would require integration with calendar/CRM system to get full scheduled attendee list

Recommendations

To fully support the requested metrics, consider:

  1. For Lateness Calculation:

    • Integrate with calendar API (Google Calendar, Outlook, etc.) to get scheduled start times
    • Compare start_datetime from Grain API with scheduled start time from calendar
    • Calculate: lateness = actual_start - scheduled_start
  2. For Complete No-Show Tracking:

    • Integrate with calendar API to get all scheduled attendees
    • Compare scheduled attendees with confirmed_attendee=True participants from Grain
    • Missing attendees = no-shows
  3. For Handle Time:

    • Clarify the exact definition of "handle time" (total duration, active time, time to resolution, etc.)
    • If different from total duration, may need additional processing or different data source
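If scheduled data were available from a calendar system, the first two recommendations could be sketched as follows. The scheduled start time and the scheduled attendee list are hypothetical inputs that this tool does not currently fetch, and the email field name on Grain participants is an assumption.

```python
from datetime import datetime


def lateness_minutes(actual_start: datetime, scheduled_start: datetime) -> float:
    """lateness = actual_start - scheduled_start, in minutes (negative if early)."""
    return (actual_start - scheduled_start).total_seconds() / 60


def missing_attendees(scheduled_emails: list, grain_participants: list) -> list:
    """Scheduled attendees with no confirmed_attendee=True entry in Grain."""
    attended = {
        p["email"].lower()
        for p in grain_participants
        if p.get("confirmed_attendee") is True and p.get("email")
    }
    scheduled = {email.lower() for email in scheduled_emails}
    return sorted(scheduled - attended)


print(lateness_minutes(datetime(2025, 10, 20, 10, 4, 30), datetime(2025, 10, 20, 10, 0)))
print(missing_attendees(
    ["a@example.com", "B@example.com", "c@example.com"],
    [{"email": "a@example.com", "confirmed_attendee": True},
     {"email": "b@example.com", "confirmed_attendee": False}],
))
```

Both datetimes must be either timezone-aware or naive; Python raises a TypeError when subtracting one kind from the other.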

Summary

The analysis now includes the requested operational metrics where possible:

Implemented:

  • No-show tracking (using confirmed_attendee field)
  • Average handle time (using meeting duration as proxy)

⚠️ Partially Implemented:

  • Lateness metrics (the columns are included but will be null/empty, because the API does not provide scheduled start times)

The CSV output includes all requested columns. Lateness metrics are included as placeholders but will be empty/null until scheduled meeting data is available from an external source (calendar system, CRM, etc.).

Example Output

Analysis complete! Found 76 recordings.

First 10 rows:
  internal_user_ids  ... join_spread_minutes
0               NaN  ...                 0.0
1               NaN  ...                 0.0
...

Troubleshooting

API Authentication Errors

If you get authentication errors:

  • Verify your API key is correct
  • Ensure the API key has not expired
  • Check that your API key has the necessary permissions

No Recordings Found

If no recordings are found:

  • Verify the date range is correct
  • Check that recordings exist for that date range in the workspace associated with your API key
  • Ensure your API key has access to view recordings

Module Not Found Errors

If you get import errors:

  • Ensure you've activated the virtual environment
  • Run pip install -r requirements.txt to install dependencies

Dependencies

  • pandas>=2.3.3 - Data manipulation and analysis
  • numpy>=2.3.5 - Numerical computing
  • requests>=2.32.5 - HTTP library for API calls

License

This project is provided as-is for analysis purposes.

Support

For issues related to:

  • Grain API: Contact Grain support or check the Grain API documentation
  • This tool: Check the code comments or modify as needed for your use case
