A simple search engine that crawls webpages and retrieves the pages containing the searched words via a user-friendly frontend. The search engine consists of a web crawler, an indexing system built on the Whoosh library, and a frontend.
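To make the crawl, index, and search flow concrete, here is a minimal sketch that runs without any dependencies. It is not the project's code: a plain dictionary stands in for the Whoosh index, and the `PAGES` data, the example URLs, and the function names `tokenize`, `build_index`, and `search` are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for crawled pages: URL -> page text.
# A real crawler would fetch these over HTTP and strip the HTML.
PAGES = {
    "https://example.org/a": "Platypus are mammals that lay eggs",
    "https://example.org/b": "Mammals include whales and bats",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map every word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the sorted URLs that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return []
    # index.get avoids creating empty entries for unknown words.
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


index = build_index(PAGES)
print(search(index, "mammals"))
print(search(index, "mammals eggs"))
```

In the actual project, Whoosh takes over the role of `build_index` and `search` and stores its index on disk rather than in memory.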
/indexdir # directory holding the Whoosh search index
/static # stylesheet to apply styles to templates
/templates # contains templates to display the search form and search results
├── search_form.html
├── search_results.html
/app.py # flask application
/crawler.py # script to crawl websites for searched words
/requirements.txt # required libraries for the project
Python > 3.7 is required.
To use this repository, follow these steps:
- Clone the Repository:
git clone https://github.com/MikeNsiah10/search_engine_project.git
cd search_engine_project
- Setting Up a Python Environment:
It is recommended to use a virtual environment to manage the dependencies for this project. A virtual environment isolates the project's dependencies from your global Python environment, avoiding potential conflicts.
# Create a virtual environment in a directory named 'env'
python3 -m venv env
# Activate the virtual environment
# On Windows
env\Scripts\activate
# On macOS/Linux
source env/bin/activate
- Install Dependencies:
Make sure you have the necessary libraries installed. You can use pip to install them:
pip install -r requirements.txt
The search engine was deployed on the university demo server. An FTP client (FileZilla, in my case) was used to transfer the files, and PuTTY, an SSH client, was used to deploy the project on the server. A WSGI file was created to connect the Flask app to the server.
To run on a development server, do:
python app.py
and open the link printed in the terminal.
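For readers unfamiliar with WSGI files, the sketch below shows the general idea; it is not the file used on the demo server. A WSGI server imports a module and calls a callable from it, and mod_wsgi, for example, looks for one named `application` by default. In this project that callable would presumably be the Flask object from app.py, but a plain function stands in for it here so the sketch runs without Flask. The helper `call_wsgi` and the response text are hypothetical.

```python
from wsgiref.util import setup_testing_defaults

# In the real file this would be the Flask object, e.g.
#     from app import app as application
# (the object name `app` is an assumption about app.py).


def application(environ, start_response):
    """Stand-in WSGI callable so the sketch runs without Flask."""
    body = b"search engine is up"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_wsgi(app, path="/"):
    """Invoke a WSGI callable once, the way a server would."""
    environ = {}
    setup_testing_defaults(environ)  # fill in a minimal request
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_wsgi(application)
print(status, body)
```

Because a Flask application object is itself a WSGI callable, the WSGI file usually needs little more than the import shown in the comment.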