This is a news website that scrapes news articles and generates humorous headlines. The website features a daily email subscription service, social media sharing capabilities, and an admin panel for managing and editing generated headlines. Additionally, the platform includes functionality for training a machine learning model based on user interactions with the jokes. The implementation utilizes Python for backend processing and Next.js for the frontend interface.
The project is designed to automate the process of scraping news articles, generating humorous content, and managing user interactions. It leverages AWS services for deployment and scalability, ensuring a robust and efficient system.
This section provides instructions for setting up the local environment, including Docker, the database, and the code.

## Prerequisites

- Docker and Docker Compose installed on your machine.
- Python 3.8 or higher installed.
- AWS credentials configured for accessing DynamoDB in the Sydney region (`ap-southeast-2`).
## Setting Up Docker

1. **Build and run the Docker container:**
   Navigate to the project directory and run the following command to build and start the Docker container:

   ```bash
   docker-compose up --build
   ```

   This will start the Flask server exposing the LLaMA model on port 8000.
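Once the container is up, it can help to sanity-check that the model server is actually reachable before moving on. The snippet below is a minimal, dependency-free sketch of such a check; the URL and the assumption that the root path responds at all are placeholders — adjust them to the Flask app's actual routes.

```python
import urllib.request
import urllib.error


def server_is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at `url` within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        return False


if __name__ == "__main__":
    # Port 8000 is where docker-compose exposes the LLaMA model server;
    # whether "/" returns anything useful depends on the Flask app's routes.
    print("up" if server_is_up("http://localhost:8000/") else "down")
```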
## Configuring the Database

1. **Ensure AWS credentials:**
   Make sure your AWS credentials are set up to access DynamoDB in the Sydney region. You can configure this using the `create_aws_config.py` script.

2. **Create DynamoDB tables:**
   Run the `setup_db.py` script to create the necessary DynamoDB tables:

   ```bash
   python setup_db.py
   ```
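The `create_aws_config.py` script itself is not reproduced here. As a hedged sketch, a script like it might write the standard AWS config and credentials files with the Sydney region preselected; the function and parameter names below are illustrative, not the script's actual API, and the key values are placeholders — never commit real credentials.

```python
import configparser
from pathlib import Path


def write_aws_config(aws_dir: Path, access_key: str, secret_key: str,
                     region: str = "ap-southeast-2") -> None:
    """Write minimal AWS config/credentials files for the Sydney region.

    `aws_dir` would normally be Path.home() / ".aws". The access/secret
    key parameters are placeholders for illustration only.
    """
    aws_dir.mkdir(parents=True, exist_ok=True)

    config = configparser.ConfigParser()
    config["default"] = {"region": region}  # ap-southeast-2 == Sydney
    with open(aws_dir / "config", "w") as f:
        config.write(f)

    creds = configparser.ConfigParser()
    creds["default"] = {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }
    with open(aws_dir / "credentials", "w") as f:
        creds.write(f)
```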
## Running the Scraper

1. **Activate the virtual environment:**
   Depending on your operating system, activate the virtual environment:

   - Linux/macOS:

     ```bash
     source ./venv/bin/activate
     ```

   - Windows (from a bash-compatible shell such as Git Bash):

     ```bash
     source ./venv/Scripts/activate
     ```

2. **Run the scraper:**
   Execute the `run_scrape_news.sh` script to start the news scraper:

   ```bash
   ./run_scrape_news.sh
   ```
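At its core, the scraping step fetches a news page and pulls out article links. Below is a dependency-free sketch of that idea using Python's built-in `html.parser`; the real scraper's target site, selectors, and entry point live behind `run_scrape_news.sh` and may differ.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, anchor text) pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html: str):
    """Return all (href, text) pairs found in `html`."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links
```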
## Model Files

Ensure the following model files are available in `C:/Users/username/weights/llama/text`:

- **Model files (safetensors or pth):**
  - `model-00001-of-00004.safetensors`
  - `model-00002-of-00004.safetensors`
  - `model-00003-of-00004.safetensors`
  - `model-00004-of-00004.safetensors`
  - Or, if downloaded in a different format: `consolidated.00.pth`, etc.
- **Configuration and metadata files:**
  - `config.json`: defines the model configuration.
  - `generation_config.json`: contains generation-specific configuration.
  - `special_tokens_map.json`: maps special tokens.
  - `tokenizer_config.json`: configuration for the tokenizer.
  - `tokenizer.model`: contains tokenization rules.
- **Additional metadata:**
  - `.gitattributes`
  - `USE_POLICY.md`: model use policy information.
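A quick way to catch a missing file before attempting to load the model is to check the directory up front. This sketch assumes the safetensors layout listed above; the helper name is illustrative.

```python
from pathlib import Path

# Files the setup notes above expect in the weights directory (safetensors layout).
REQUIRED_FILES = [
    "model-00001-of-00004.safetensors",
    "model-00002-of-00004.safetensors",
    "model-00003-of-00004.safetensors",
    "model-00004-of-00004.safetensors",
    "config.json",
    "generation_config.json",
    "special_tokens_map.json",
    "tokenizer_config.json",
    "tokenizer.model",
]


def missing_model_files(weights_dir: str):
    """Return the expected model files that are absent from `weights_dir`."""
    root = Path(weights_dir)
    return [name for name in REQUIRED_FILES if not (root / name).exists()]
```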
## Starting the LMS Server

To start the LMS server, follow these steps:

1. **Install LM Studio:**
   Download it from [lmstudio.ai](https://lmstudio.ai).

2. **Start the LMS server:**
   Run the following command to start the LMS server:

   ```bash
   lms server start
   ```

   This will initialize the server and make it accessible locally.
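LM Studio exposes an OpenAI-compatible HTTP API, by default on port 1234. As a sketch, a request to its chat-completions endpoint can be built as below; the port and model name are defaults/placeholders — confirm the actual values in the LM Studio UI or server log.

```python
import json
import urllib.request


def build_chat_request(prompt: str,
                       base_url: str = "http://localhost:1234/v1",
                       model: str = "local-model"):
    """Build a POST request for an OpenAI-compatible chat completions endpoint.

    The default base_url reflects LM Studio's usual local port; `model` is
    a placeholder for whatever model is loaded in LM Studio.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sending it (requires the LMS server to be running):
# with urllib.request.urlopen(build_chat_request("Write a pun about news")) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```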
## Development Environment

The development environment is set up to facilitate efficient coding and testing. It includes:

- **Python**: used for backend processing and scripting.
- **Next.js**: used for the frontend interface.
- **AWS services**: deployed on AWS Lambda and Lightsail for scalability and reliability.

Ensure all dependencies are installed and configured correctly to maintain a smooth development workflow.
## Running the Admin Panel

1. **Start the Flask server:**
   Navigate to the `admin` directory and run the following command to start the Flask server:

   ```bash
   python server.py
   ```

   This will start the server on `http://localhost:5000`.

2. **Access the admin page:**
   Open your web browser and go to `http://localhost:5000/admin/index.html`. This page will display the live data from the DynamoDB database.
## Clearing the Database

To clear the data from the DynamoDB table, run the following script:

```bash
python admin/clear_database.py
```

This will delete all entries from the `themasonnetwork_drudgescrape` table.
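`clear_database.py` itself is not shown here. Conceptually, clearing a DynamoDB table means scanning for every item's key and deleting the items in batches, since DynamoDB has no single "delete all" operation short of dropping and recreating the table. Below is a sketch written against boto3-style Table objects; the partition key name `newsId` is an assumption — check the table's actual schema.

```python
def clear_table(table, key_name="newsId"):
    """Delete every item from a DynamoDB-style table by scanning for its keys.

    `table` is expected to behave like a boto3 Table resource (scan,
    batch_writer); `key_name` is the table's partition key, assumed here.
    Returns the number of items deleted.
    """
    deleted = 0
    # Project only the key attribute to keep the scan cheap. Note that a
    # reserved-word key name would need ExpressionAttributeNames in real use.
    page = table.scan(ProjectionExpression=key_name)
    with table.batch_writer() as batch:
        while True:
            for item in page["Items"]:
                batch.delete_item(Key={key_name: item[key_name]})
                deleted += 1
            if "LastEvaluatedKey" not in page:
                break  # no more pages to scan
            page = table.scan(ProjectionExpression=key_name,
                              ExclusiveStartKey=page["LastEvaluatedKey"])
    return deleted
```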
## Testing the LLM Server

To test the LLM server, run the following command:

```bash
python test_llm.py
```

This will verify that the LLM server is running and accessible.