test: Add comprehensive pytest test suite for dataset_handler module #13

Merged
abuzarmahmood merged 4 commits into master from 10-write-tests-for-dataset_handlerpy-in-pytest-framework
Apr 28, 2025

Conversation

@abuzarmahmood
Member

No description provided.

@abuzarmahmood
Member Author

Generated edit command:

Summary of User's Issues and Requests

The user has requested tests for dataset_handler.py written with the pytest framework. The task is to create a new test file containing unit tests for the functions and classes in dataset_handler.py.

Overview of Plan to Address the Issues

  1. File Identification:

    • File: /home/exouser/Desktop/blech_github_bot/src/repos/katzlabbrandeis/blech_data_transfer/dataset_handler.py
      • Description: This file contains the main logic for handling dataset frames, including logging changes, syncing logs between local and server, adding entries, and checking if experiments exist. It includes classes like DatasetFrameLogger and DatasetFrameHandler with methods that need to be tested.
  2. Plan:

    • Create a new test file named test_dataset_handler.py.
    • Write unit tests for the classes and methods in dataset_handler.py, focusing on:
      • DatasetFrameLogger class and its log method.
      • DatasetFrameHandler class and its methods: get_server_path, check_server_write_access, check_dataset_frame, sync_logs, add_entry, and check_experiment_exists.

Specific Details of Changes to be Made

  1. Create Test File:

    • File: /home/exouser/Desktop/blech_github_bot/src/repos/katzlabbrandeis/blech_data_transfer/test_dataset_handler.py
    • Description: This file will contain the pytest test cases for the dataset_handler.py file.
  2. Write Test Cases:

    • File: /home/exouser/Desktop/blech_github_bot/src/repos/katzlabbrandeis/blech_data_transfer/test_dataset_handler.py
    • Description: Add test cases for each method in the DatasetFrameLogger and DatasetFrameHandler classes.

    Example test cases:

    import pytest
    from dataset_handler import DatasetFrameLogger, DatasetFrameHandler
    import os
    import pandas as pd
    
    # Test DatasetFrameLogger
    def test_logger_initialization():
        logger = DatasetFrameLogger('/tmp')
        assert logger.write_dir == '/tmp'
        assert os.path.exists(logger.log_path)
    
    def test_logger_log_method():
        logger = DatasetFrameLogger('/tmp')
        logger.log("Test message")
        with open(logger.log_path, 'r') as f:
            logs = f.read()
        assert "Test message" in logs
    
    # Test DatasetFrameHandler
    def test_handler_initialization():
        handler = DatasetFrameHandler('/tmp')
        assert handler.dir_path == '/tmp'
        assert os.path.exists(handler.server_path_file)
    
    def test_handler_get_server_path():
        handler = DatasetFrameHandler('/tmp')
        handler.get_server_path()
        assert os.path.exists(handler.server_home_dir)
    
    def test_handler_check_server_write_access():
        handler = DatasetFrameHandler('/tmp')
        assert handler.check_server_write_access('/tmp') is True
    
    def test_handler_check_dataset_frame():
        handler = DatasetFrameHandler('/tmp')
        handler.check_dataset_frame()
        assert os.path.exists(handler.dataset_frame_path)
    
    def test_handler_sync_logs():
        handler = DatasetFrameHandler('/tmp')
        handler.sync_logs()
        # Add assertions based on expected behavior
    
    def test_handler_add_entry():
        handler = DatasetFrameHandler('/tmp')
        entry = {
            'date': '2023-10-01',
            'time': '12:00:00',
            'user': 'test_user',
            'email': 'test@example.com',
            'recording': 'test_recording',
            'recording_path': '/tmp/test_recording'
        }
        handler.add_entry(entry)
        df = pd.read_csv(handler.dataset_frame_path)
        assert 'test_recording' in df['recording'].values
    
    def test_handler_check_experiment_exists():
        handler = DatasetFrameHandler('/tmp')
        assert handler.check_experiment_exists('/tmp/test_recording') is True

These test cases cover the initialization and basic functionality of the classes and methods in dataset_handler.py. Adjust the paths and expected outcomes based on your actual environment and requirements.
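
One way to avoid the hard-coded '/tmp' paths in the examples above is pytest's built-in tmp_path fixture, which gives each test a fresh directory. The sketch below is only an illustration: SketchLogger is a hypothetical stand-in, not the project's DatasetFrameLogger, and its timestamp format is an assumption. Only the log file name dataset_frame_log.txt is taken from the tests in this PR.

```python
import os
from datetime import datetime
from pathlib import Path


class SketchLogger:
    """Hypothetical stand-in for DatasetFrameLogger (assumption, not project code)."""

    def __init__(self, write_dir):
        self.write_dir = str(write_dir)
        self.log_path = os.path.join(self.write_dir, "dataset_frame_log.txt")

    def log(self, message):
        # Append one timestamped line per message (format is assumed)
        stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        with open(self.log_path, "a") as f:
            f.write(f"{stamp}: {message}\n")


def test_logger_writes_message(tmp_path):
    # tmp_path is a pathlib.Path to a directory unique to this test,
    # so nothing is written to a shared location such as /tmp
    logger = SketchLogger(tmp_path)
    logger.log("Test message")
    assert "Test message" in Path(logger.log_path).read_text()
```

Run under pytest, the fixture is injected by name. The same pattern should carry over to the real classes by swapping SketchLogger for the imported class, provided its constructor accepts a directory path as the examples above suggest.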


This response was automatically generated by blech_bot using model gpt-4o

Aider output:

The --sonnet flag is deprecated and will be removed in a future version. Please use --model sonnet instead.
You can skip this check with --no-gitignore
Added .aider* to .gitignore
Aider v0.82.2
Main model: anthropic/claude-3-7-sonnet-20250219 with diff edit format, infinite output
Weak model: anthropic/claude-3-5-haiku-20241022
Git repo: .git with 9 files
Repo-map: using 4096 tokens, auto refresh


dataset_handler.py
I'll create a test file for the dataset_handler.py module using pytest. The tests will cover the main functionality of the DatasetFrameLogger and DatasetFrameHandler classes.

test_dataset_handler.py                                                                                                            

                                                                                                                                   
<<<<<<< SEARCH                                                                                                                     
=======                                                                                                                            
import pytest                                                                                                                      
import os                                                                                                                          
import pandas as pd                                                                                                                
import tempfile                                                                                                                    
import shutil                                                                                                                      
from unittest.mock import patch, MagicMock                                                                                         
from dataset_handler import DatasetFrameLogger, DatasetFrameHandler, get_time_pretty                                               
                                                                                                                                   
class TestDatasetFrameLogger:                                                                                                      
    def setup_method(self):                                                                                                        
        # Create a temporary directory for testing                                                                                 
        self.temp_dir = tempfile.mkdtemp()                                                                                         
                                                                                                                                   
    def teardown_method(self):                                                                                                     
        # Clean up the temporary directory                                                                                         
        shutil.rmtree(self.temp_dir)                                                                                               
                                                                                                                                   
    def test_logger_initialization(self):                                                                                          
        logger = DatasetFrameLogger(self.temp_dir)                                                                                 
        assert logger.write_dir == self.temp_dir                                                                                   
        assert logger.log_path == os.path.join(self.temp_dir, 'dataset_frame_log.txt')                                             
        assert logger.current_user == os.getlogin()                                                                                
        assert logger.current_computer == os.uname().nodename                                                                      
                                                                                                                                   
    @patch('dataset_handler.get_time_pretty')                                                                                      
    def test_logger_log_method(self, mock_time):                                                                                   
        mock_time.return_value = "2025-04-28 12:00:00"                                                                             
        logger = DatasetFrameLogger(self.temp_dir)                                                                                 
        logger.log("Test message")                                                                                                 
                                                                                                                                   
        # Check if log file exists and contains the message                                                                        
        assert os.path.exists(logger.log_path)                                                                                     
        with open(logger.log_path, 'r') as f:                                                                                      
            log_content = f.read()                                                                                                 
                                                                                                                                   
        expected_content = f"2025-04-28 12:00:00 - {logger.current_user}@{logger.current_computer}: Test message\n\n"              
        assert expected_content in log_content                                                                                     
                                                                                                                                   
class TestDatasetFrameHandler:                                                                                                     
    def setup_method(self):                                                                                                        
        # Create temporary directories for testing                                                                                 
        self.temp_dir = tempfile.mkdtemp()                                                                                         
        self.server_dir = tempfile.mkdtemp()                                                                                       
        self.server_home_dir = os.path.join(self.server_dir, 'data_management')                                                    
        os.makedirs(self.server_home_dir, exist_ok=True)                                                                           
                                                                                                                                   
        # Create local_only_files directory and server path file                                                                   
        self.local_only_dir = os.path.join(self.temp_dir, 'local_only_files')                                                      
        os.makedirs(self.local_only_dir, exist_ok=True)                                                                            
        self.server_path_file = os.path.join(self.local_only_dir, 'blech_server_path.txt')                                         
                                                                                                                                   
        # Write server path to file                                                                                                
        with open(self.server_path_file, 'w') as f:                                                                                
            f.write(self.server_dir)                                                                                               
                                                                                                                                   
    def teardown_method(self):                                                                                                     
        # Clean up temporary directories                                                                                           
        shutil.rmtree(self.temp_dir)                                                                                               
        shutil.rmtree(self.server_dir)                                                                                             
                                                                                                                                   
    @patch('dataset_handler.DatasetFrameLogger')                                                                                   
    def test_handler_initialization(self, mock_logger):                                                                            
        handler = DatasetFrameHandler(self.temp_dir)                                                                               
        assert handler.dir_path == self.temp_dir                                                                                   
        assert handler.server_path_file == self.server_path_file                                                                   
        assert handler.server_path == self.server_dir                                                                              
        assert handler.server_home_dir == self.server_home_dir                                                                     
                                                                                                                                   
    @patch('dataset_handler.DatasetFrameLogger')                                                                                   
    def test_get_server_path(self, mock_logger):                                                                                   
        handler = DatasetFrameHandler(self.temp_dir)                                                                               
        handler.get_server_path()                                                                                                  
        assert handler.server_path == self.server_dir                                                                              
        assert handler.server_home_dir == self.server_home_dir                                                                     
                                                                                                                                   
    @patch('dataset_handler.DatasetFrameLogger')                                                                                   
    def test_check_server_write_access_writable(self, mock_logger):                                                                
        handler = DatasetFrameHandler(self.temp_dir)                                                                               
        handler.check_server_write_access(self.server_home_dir)                                                                    
        assert handler.write_bool is True                                                                                          
                                                                                                                                   
    @patch('dataset_handler.DatasetFrameLogger')                                                                                   
    def test_check_server_write_access_not_writable(self, mock_logger):
        # Create a directory with no write permissions
        # (note: this check may not hold when the tests run as root)
        read_only_dir = os.path.join(self.temp_dir, 'read_only')
        os.makedirs(read_only_dir, exist_ok=True)
        os.chmod(read_only_dir, 0o555)  # Read and execute only

        try:
            handler = DatasetFrameHandler(self.temp_dir)
            handler.check_server_write_access(read_only_dir)
            assert handler.write_bool is False
        finally:
            # Restore permissions even if the assertion fails
            os.chmod(read_only_dir, 0o755)
                                                                                                                                   
    @patch('dataset_handler.DatasetFrameLogger')                                                                                   
    @patch('builtins.input', return_value='y')                                                                                     
    def test_check_dataset_frame_create_new(self, mock_input, mock_logger):                                                        
        handler = DatasetFrameHandler(self.temp_dir)                                                                               
                                                                                                                                   
        # Patch the sync_logs method to avoid actual syncing                                                                       
        with patch.object(handler, 'sync_logs'):                                                                                   
            handler.check_dataset_frame()                                                                                          
                                                                                                                                   
        # Check if dataset frame was created                                                                                       
        server_df_path = os.path.join(self.server_home_dir, 'dataset_frame.csv')                                                   
        assert os.path.exists(server_df_path)                                                                                      
                                                                                                                                   
        # Verify the structure of the created dataset frame                                                                        
        df = pd.read_csv(server_df_path)                                                                                           
        expected_columns = ['date', 'time', 'user', 'email', 'recording', 'recording_path']                                        
        assert all(col in df.columns for col in expected_columns)                                                                  
                                                                                                                                   
    @patch('dataset_handler.DatasetFrameLogger')                                                                                   
    def test_sync_logs_server_to_local(self, mock_logger):                                                                         
        handler = DatasetFrameHandler(self.temp_dir)                                                                               
                                                                                                                                   
        # Create a dataset frame on the server                                                                                     
        server_df_path = os.path.join(self.server_home_dir, 'dataset_frame.csv')                                                   
        test_data = {                                                                                                              
            'date': ['2025-04-28'],                                                                                                
            'time': ['12:00:00'],                                                                                                  
            'user': ['test_user'],                                                                                                 
            'email': ['test@example.com'],                                                                                         
            'recording': ['test_recording'],                                                                                       
            'recording_path': ['/path/to/recording'],                                                                              
            'info_file_exists': [True]                                                                                             
        }                                                                                                                          
        server_df = pd.DataFrame(test_data)                                                                                        
        server_df.to_csv(server_df_path, index=False)                                                                              
                                                                                                                                   
        # Sync logs                                                                                                                
        handler.sync_logs()                                                                                                        
                                                                                                                                   
        # Check if local dataset frame was created                                                                                 
        local_df_path = os.path.join(self.temp_dir, 'dataset_frame.csv')                                                           
        assert os.path.exists(local_df_path)                                                                                       
                                                                                                                                   
        # Verify the content of the local dataset frame                                                                            
        local_df = pd.read_csv(local_df_path)                                                                                      
        assert local_df.equals(server_df)                                                                                          
                                                                                                                                   
    @patch('dataset_handler.DatasetFrameLogger')                                                                                   
    def test_sync_logs_local_to_server(self, mock_logger):                                                                         
        handler = DatasetFrameHandler(self.temp_dir)                                                                               
                                                                                                                                   
        # Create a dataset frame locally                                                                                           
        local_df_path = os.path.join(self.temp_dir, 'dataset_frame.csv')                                                           
        test_data = {                                                                                                              
            'date': ['2025-04-28'],                                                                                                
            'time': ['12:00:00'],                                                                                                  
            'user': ['test_user'],                                                                                                 
            'email': ['test@example.com'],                                                                                         
            'recording': ['test_recording'],                                                                                       
            'recording_path': ['/path/to/recording'],                                                                              
            'info_file_exists': [True]                                                                                             
        }                                                                                                                          
        local_df = pd.DataFrame(test_data)                                                                                         
        local_df.to_csv(local_df_path, index=False)                                                                                
                                                                                                                                   
        # Sync logs                                                                                                                
        handler.sync_logs()                                                                                                        
                                                                                                                                   
        # Check if server dataset frame was created                                                                                
        server_df_path = os.path.join(self.server_home_dir, 'dataset_frame.csv')                                                   
        assert os.path.exists(server_df_path)                                                                                      
                                                                                                                                   
        # Verify the content of the server dataset frame                                                                           
        server_df = pd.read_csv(server_df_path)                                                                                    
        assert server_df.equals(local_df)                                                                                          
                                                                                                                                   
    @patch('dataset_handler.DatasetFrameLogger')                                                                                   
    def test_sync_logs_merge(self, mock_logger):                                                                                   
        handler = DatasetFrameHandler(self.temp_dir)                                                                               
                                                                                                                                   
        # Create different dataset frames on server and locally                                                                    
        server_df_path = os.path.join(self.server_home_dir, 'dataset_frame.csv')                                                   
        server_data = {                                                                                                            
            'date': ['2025-04-28'],                                                                                                
            'time': ['12:00:00'],                                                                                                  
            'user': ['server_user'],                                                                                               
            'email': ['server@example.com'],                                                                                       
            'recording': ['server_recording'],                                                                                     
            'recording_path': ['/path/to/server_recording'],                                                                       
            'info_file_exists': [True]                                                                                             
        }                                                                                                                          
        server_df = pd.DataFrame(server_data)                                                                                      
        server_df.to_csv(server_df_path, index=False)                                                                              
                                                                                                                                   
        local_df_path = os.path.join(self.temp_dir, 'dataset_frame.csv')                                                           
        local_data = {                                                                                                             
            'date': ['2025-04-28'],                                                                                                
            'time': ['13:00:00'],                                                                                                  
            'user': ['local_user'],                                                                                                
            'email': ['local@example.com'],                                                                                        
            'recording': ['local_recording'],                                                                                      
            'recording_path': ['/path/to/local_recording'],                                                                        
            'info_file_exists': [True]                                                                                             
        }                                                                                                                          
        local_df = pd.DataFrame(local_data)                                                                                        
        local_df.to_csv(local_df_path, index=False)                                                                                
                                                                                                                                   
        # Sync logs                                                                                                                
        handler.sync_logs()                                                                                                        
                                                                                                                                   
        # Verify the merged content                                                                                                
        merged_df = pd.read_csv(server_df_path)                                                                                    
        assert len(merged_df) == 2                                                                                                 
        assert 'server_recording' in merged_df['recording'].values                                                                 
        assert 'local_recording' in merged_df['recording'].values                                                                  
                                                                                                                                   
    @patch('dataset_handler.DatasetFrameLogger')                                                                                   
    def test_add_entry(self, mock_logger):                                                                                         
        handler = DatasetFrameHandler(self.temp_dir)                                                                               
                                                                                                                                   
        # Create a dataset frame                                                                                                   
        df_path = os.path.join(self.temp_dir, 'dataset_frame.csv')                                                                 
        test_data = {                                                                                                              
            'date': [],                                                                                                            
            'time': [],                                                                                                            
            'user': [],                                                                                                            
            'email': [],                                                                                                           
            'recording': [],                                                                                                       
            'recording_path': []                                                                                                   
        }                                                                                                                          
        df = pd.DataFrame(test_data)                                                                                               
        df.to_csv(df_path, index=False)                                                                                            
                                                                                                                                   
        # Set the dataset frame path                                                                                               
        handler.dataset_frame_path = df_path                                                                                       
                                                                                                                                   
        # Add an entry                                                                                                             
        entry = {                                                                                                                  
            'date': '2025-04-28',                                                                                                  
            'time': '12:00:00',                                                                                                    
            'user': 'test_user',                                                                                                   
            'email': 'test@example.com',                                                                                           
            'recording': 'test_recording',                                                                                         
            'recording_path': '/path/to/recording'                                                                                 
        }                                                                                                                          
                                                                                                                                   
        # Patch sync_logs to avoid actual syncing                                                                                  
        with patch.object(handler, 'sync_logs'):                                                                                   
            handler.add_entry(entry)                                                                                               
                                                                                                                                   
        # Verify the entry was added                                                                                               
        updated_df = pd.read_csv(df_path)                                                                                          
        assert len(updated_df) == 1                                                                                                
        assert updated_df.iloc[0]['recording'] == 'test_recording'                                                                 
        assert updated_df.iloc[0]['user'] == 'test_user'                                                                           
                                                                                                                                   
    @patch('dataset_handler.DatasetFrameLogger')                                                                                   
    def test_check_experiment_exists_true(self, mock_logger):                                                                      
        handler = DatasetFrameHandler(self.temp_dir)                                                                               
                                                                                                                                   
        # Create a dataset frame with an existing recording                                                                        
        df_path = os.path.join(self.temp_dir, 'dataset_frame.csv')                                                                 
        test_data = {                                                                                                              
            'date': ['2025-04-28'],                                                                                                
            'time': ['12:00:00'],                                                                                                  
            'user': ['test_user'],                                                                                                 
            'email': ['test@example.com'],                                                                                         
            'recording': ['existing_recording'],                                                                                   
            'recording_path': ['/path/to/recording']                                                                               
        }                                                                                                                          
        df = pd.DataFrame(test_data)                                                                                               
        df.to_csv(df_path, index=False)                                                                                            
                                                                                                                                   
        # Set the dataset frame path                                                                                               
        handler.dataset_frame_path = df_path                                                                                       
                                                                                                                                   
        # Check if experiment exists                                                                                               
        result = handler.check_experiment_exists('/path/to/existing_recording')                                                    
        assert result is True                                                                                                      
                                                                                                                                   
    @patch('dataset_handler.DatasetFrameLogger')                                                                                   
    def test_check_experiment_exists_false(self, mock_logger):                                                                     
        handler = DatasetFrameHandler(self.temp_dir)                                                                               
                                                                                                                                   
        # Create a dataset frame without the recording we're looking for                                                           
        df_path = os.path.join(self.temp_dir, 'dataset_frame.csv')                                                                 
        test_data = {                                                                                                              
            'date': ['2025-04-28'],                                                                                                
            'time': ['12:00:00'],                                                                                                  
            'user': ['test_user'],                                                                                                 
            'email': ['test@example.com'],                                                                                         
            'recording': ['existing_recording'],                                                                                   
            'recording_path': ['/path/to/recording']                                                                               
        }                                                                                                                          
        df = pd.DataFrame(test_data)                                                                                               
        df.to_csv(df_path, index=False)                                                                                            
                                                                                                                                   
        # Set the dataset frame path                                                                                               
        handler.dataset_frame_path = df_path                                                                                       
                                                                                                                                   
        # Check if experiment exists                                                                                               
        result = handler.check_experiment_exists('/path/to/non_existing_recording')                                                
        assert result is False                                                                                                     
                                                                                                                                   
def test_get_time_pretty():                                                                                                        
    # This is a simple test to ensure the function returns a string in the expected format                                         
    time_str = get_time_pretty()                                                                                                   
    # Check if it matches the format YYYY-MM-DD HH:MM:SS                                                                           
    assert len(time_str) == 19                                                                                                     
    assert time_str[4] == '-' and time_str[7] == '-'                                                                               
    assert time_str[10] == ' '                                                                                                     
    assert time_str[13] == ':' and time_str[16] == ':'                                                                             
>>>>>>> REPLACE                                                                                                                    
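
The two `check_experiment_exists` tests above pass a full path (`/path/to/existing_recording`) while the frame stores only `existing_recording` in its `recording` column. That suggests the lookup compares the final path component against that column. Below is a minimal sketch of that behaviour, assuming this is how the match works. `experiment_exists` is a hypothetical stand-in written with the standard library, not the method in `dataset_handler.py`:

```python
import csv
import os
import tempfile


def experiment_exists(frame_path, recording_path):
    """Hypothetical stand-in: match the last path component against 'recording'."""
    # Assumption: only the directory name is compared, not the full path.
    name = os.path.basename(os.path.normpath(recording_path))
    with open(frame_path, newline="") as handle:
        return any(row["recording"] == name for row in csv.DictReader(handle))


with tempfile.TemporaryDirectory() as temp_dir:
    # Build a one-row frame that mirrors the data used in the tests.
    frame_path = os.path.join(temp_dir, "dataset_frame.csv")
    with open(frame_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["recording", "recording_path"])
        writer.writeheader()
        writer.writerow({"recording": "existing_recording",
                         "recording_path": "/path/to/recording"})

    found = experiment_exists(frame_path, "/path/to/existing_recording")
    missing = experiment_exists(frame_path, "/path/to/non_existing_recording")
```

If the real method compares against `recording_path` instead, the `test_check_experiment_exists_true` case would need a matching path in the frame for the assertion to hold.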
                                                                                                                                   

Now you can run the tests with pytest:

pytest -v test_dataset_handler.py

If you need to install pytest first:

pip install pytest

Tokens: 6.9k sent, 2.7k received. Cost: $0.06 message, $0.06 session.

test_dataset_handler.py
Applied edit to test_dataset_handler.py
Commit 0f5e810 test: Add comprehensive pytest test suite for dataset_handler module

…a_transfer into 10-write-tests-for-dataset_handlerpy-in-pytest-framework
…setFrameHandler

- Added comprehensive test cases for `DatasetFrameLogger`:
  - Initialization and logging functionality.
- Added comprehensive test cases for `DatasetFrameHandler`:
  - Initialization and server path management.
  - Server write access checking with both writable and non-writable scenarios.
  - Dataset frame handling including creation, synchronization, merging, and entry addition.
  - Experiment existence check.
- Included a test for the `get_time_pretty` utility function to ensure it returns the correct date-time format.
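
The `get_time_pretty` format check from the last bullet can also be written by parsing the string rather than inspecting separator positions. This is a minimal sketch, assuming the `YYYY-MM-DD HH:MM:SS` layout stated in the test comment. `get_time_pretty_stub` and `is_pretty_time` are hypothetical names, and the stub only stands in for the real function:

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed from the comment in the test


def get_time_pretty_stub():
    """Hypothetical stand-in for dataset_handler.get_time_pretty."""
    return datetime.now().strftime(TIME_FORMAT)


def is_pretty_time(value):
    """Return True if value parses under TIME_FORMAT and is fully zero-padded."""
    try:
        datetime.strptime(value, TIME_FORMAT)
    except ValueError:
        return False
    # strptime also accepts unpadded fields such as "2025-4-28",
    # so the fixed length of 19 characters is checked separately.
    return len(value) == 19
```

Parsing rejects strings that merely have separators in the right positions, such as `abcd-ef-gh ij:kl:mn`, which the index-based assertions would accept.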
@abuzarmahmood abuzarmahmood merged commit 95e14f1 into master Apr 28, 2025
1 check failed
@abuzarmahmood abuzarmahmood deleted the 10-write-tests-for-dataset_handlerpy-in-pytest-framework branch April 28, 2025 15:51

Development

Successfully merging this pull request may close these issues.

write tests for dataset_handler.py in pytest framework
