C++ SEO Analyzer

A basic C++ command-line SEO analysis tool that demonstrates practical language proficiency and object-oriented programming concepts through real-world-inspired HTML parsing and analysis.

Project Overview

This project is not intended as a production-ready SEO tool. Instead, it is a showcase of my C++ and OOP skills, designed to simulate the process of SEO analysis on static HTML samples. The analyzer extracts and reports on SEO-relevant information such as page titles, meta descriptions, keyword occurrences, and link counts.

Background & Objective

While exploring C++ and object-oriented programming, I sought a hands-on project to bridge theory and practice. I chose to build an SEO analyzer for HTML files, allowing me to apply file I/O, string manipulation, error handling, and OOP design patterns in a relevant context. The project serves as both a demonstration of my technical growth and a portfolio artifact.

Features

- Loads and parses HTML files
- Extracts and reports:
  - Title and meta description
  - Keyword occurrence count (case-insensitive search)
  - List and classification of internal/external links
- Handles varied HTML formatting (spaces, line breaks, letter case)
- Demonstrates modular C++ OOP design

Sample Websites

To reflect real-world diversity, the repository includes three example sites for analysis:

- E-Commerce Product Page (`ecommerce-product/index.html`)
- Personal Blog (`personal-blog/index.html`)
- Portfolio (`portfolio/index.html`)

Each site is analyzed for basic SEO structure, and the analyzer’s output is demonstrated on these samples.

How It Works

Build the Analyzer:
```
g++ seo_analyzer.cpp -o seo_analyzer
```
Run the Program:
```
./seo_analyzer
```
Enter the path to an HTML file and a keyword when prompted.

The analyzer reads the file, extracts the title and meta description, counts keyword occurrences (case-insensitive), and lists internal/external links.
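The case-insensitive keyword count described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the names `toLowerCopy` and `countKeyword` are hypothetical, and the sketch assumes ASCII text and substring (not whole-word) matching.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns an ASCII-lowercased copy of the input.
std::string toLowerCopy(std::string text) {
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return text;
}

// Hypothetical helper: counts non-overlapping, case-insensitive occurrences
// of `keyword` in `text`. It matches substrings, so "store" also matches "stores".
std::size_t countKeyword(const std::string& text, const std::string& keyword) {
    if (keyword.empty()) {
        return 0;  // An empty keyword would otherwise match at every position.
    }
    const std::string haystack = toLowerCopy(text);
    const std::string needle = toLowerCopy(keyword);

    std::size_t count = 0;
    std::size_t pos = haystack.find(needle);
    while (pos != std::string::npos) {
        ++count;
        pos = haystack.find(needle, pos + needle.size());  // Skip past this match.
    }
    return count;
}
```

Lowercasing both strings once keeps the search loop simple. A whole-word match would need an extra check on the characters around each hit.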

Typical Output

E-Commerce Product Page (`ecommerce-product/index.html`, keyword: store)

Title: "Buy the SuperWidget 3000"
Meta Description: "Purchase the amazing SuperWidget 3000 with free shipping!"
Keyword 'store': found 1 time
Links: 0

Personal Blog (`personal-blog/index.html`, keyword: SEO)

Title: "Jane Doe's Blog"
Meta Description: "Personal blog sharing tech tips, stories, and tutorials."
Keyword 'SEO': found 2 times
Links: 3 (all internal)

Portfolio (`portfolio/index.html`, keyword: Developer)

Title: "Sam Smith Portfolio"
Meta Description: "Sam Smith - Web Developer Portfolio showcasing projects and skills."
Keyword 'Developer': found 3 times
Links: 0

This output gives a quick view of SEO basics (title, meta description, keyword count, and link structure) for any HTML sample.
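The internal/external link split in the output could be computed along these lines. This is a sketch under stated assumptions, not the repository's implementation: `extractHrefs`, `classifyLink`, and `LinkKind` are hypothetical names, only double-quoted `href="..."` attributes are recognized, and any absolute `http(s)://` or protocol-relative `//` URL is treated as external.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum class LinkKind { Internal, External };

// Hypothetical helper: returns an ASCII-lowercased copy of the input.
std::string toLowerCopy(std::string text) {
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return text;
}

// Collects the value of every double-quoted href attribute, preserving the
// original case. Assumption: this also picks up <link href="..."> tags.
std::vector<std::string> extractHrefs(const std::string& html) {
    std::vector<std::string> hrefs;
    const std::string lower = toLowerCopy(html);
    const std::string marker = "href=\"";

    std::size_t pos = lower.find(marker);
    while (pos != std::string::npos) {
        const std::size_t start = pos + marker.size();
        const std::size_t end = lower.find('"', start);
        if (end == std::string::npos) {
            break;  // Unterminated attribute; stop scanning.
        }
        hrefs.push_back(html.substr(start, end - start));
        pos = lower.find(marker, end + 1);
    }
    return hrefs;
}

// Assumed rule: absolute or protocol-relative URLs are external; everything
// else (relative paths, anchors, mailto: links) is counted as internal.
LinkKind classifyLink(const std::string& href) {
    const std::string lower = toLowerCopy(href);
    auto startsWith = [&lower](const std::string& prefix) {
        return lower.compare(0, prefix.size(), prefix) == 0;
    };
    if (startsWith("http://") || startsWith("https://") || startsWith("//")) {
        return LinkKind::External;
    }
    return LinkKind::Internal;
}
```

Under this rule an absolute URL that points back to the same site is still counted as external. Comparing hostnames would be needed to refine that.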

Challenges & Solutions

During development, I faced several common issues:

- HTML parsing fragility: The code initially failed to find titles and meta descriptions when tags had extra spaces, different letter case, or line breaks.
- Whitespace and formatting: Extracted values often included unwanted spaces or newlines.
- Keyword matching: Keyword counts needed to be case-insensitive and accurate.
- HTML variability: Real-world HTML is not uniformly formatted, so the parsing logic needed to be robust.

Resolution:
To address these, I improved the parsing logic:

- Searched a lowercase copy of the entire HTML content so that matching is case-insensitive.
- Extracted values from the original content to preserve case, trimming whitespace for clean output.
- Enhanced search patterns to handle extra spaces and line breaks.
- Tested thoroughly using diverse HTML examples.

Example of robust parsing logic:

```
// Excerpt: requires <fstream>, <sstream>, and <string>; title, metaDescription,
// bodyContent, and toLower are defined elsewhere.
void loadFromFile(const std::string& filepath) {
    // Reset state so a failed load or a missing tag never leaves stale values.
    bodyContent.clear();
    title.clear();
    metaDescription.clear();

    std::ifstream file(filepath);
    if (!file.is_open()) {
        return;  // Nothing to parse; all fields stay empty.
    }
    std::stringstream buffer;
    buffer << file.rdbuf();
    bodyContent = buffer.str();

    // Search in a lowercase copy, but extract from the original to preserve case.
    const std::string lowerContent = toLower(bodyContent);
    const char* whitespace = " \n\r\t";

    // Find <title>...</title>; look for the closing tag only after the opening one.
    const size_t titleStart = lowerContent.find("<title>");
    if (titleStart != std::string::npos) {
        const size_t titleEnd = lowerContent.find("</title>", titleStart + 7);
        if (titleEnd != std::string::npos) {
            title = bodyContent.substr(titleStart + 7, titleEnd - titleStart - 7);
            title.erase(0, title.find_first_not_of(whitespace));
            title.erase(title.find_last_not_of(whitespace) + 1);
        }
    }

    // Find the meta description: name="description" followed by content="...".
    const size_t metaStart = lowerContent.find("name=\"description\"");
    if (metaStart != std::string::npos) {
        size_t contentPos = lowerContent.find("content=\"", metaStart);
        if (contentPos != std::string::npos) {
            contentPos += 9;  // Skip past content="
            const size_t metaEnd = lowerContent.find("\"", contentPos);
            if (metaEnd != std::string::npos) {
                metaDescription = bodyContent.substr(contentPos, metaEnd - contentPos);
                metaDescription.erase(0, metaDescription.find_first_not_of(whitespace));
                metaDescription.erase(metaDescription.find_last_not_of(whitespace) + 1);
            }
        }
    }
}
```
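
The function above searches for the exact string `<title>`, so an opening tag such as `<title >` or one carrying attributes would not match. One possible way to tolerate such variations is a case-insensitive regular expression. The sketch below shows an alternative technique, not the project's code; `trim` and `extractTitle` are hypothetical names.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: strips leading and trailing spaces, tabs, and newlines.
std::string trim(const std::string& text) {
    const char* whitespace = " \n\r\t";
    const std::size_t first = text.find_first_not_of(whitespace);
    if (first == std::string::npos) {
        return "";  // The string is empty or all whitespace.
    }
    const std::size_t last = text.find_last_not_of(whitespace);
    return text.substr(first, last - first + 1);
}

// Hypothetical alternative to exact-string search: matches <title>, <TITLE >,
// or <title lang="en">, and lets the title text span several lines.
std::string extractTitle(const std::string& html) {
    static const std::regex titlePattern(
        R"(<title[^>]*>([\s\S]*?)</title\s*>)",
        std::regex::ECMAScript | std::regex::icase);

    std::smatch match;
    if (std::regex_search(html, match, titlePattern)) {
        return trim(match[1].str());  // Group 1 is the text between the tags.
    }
    return "";
}
```

The `icase` flag removes the need for a separate lowercase copy, and `[\s\S]*?` matches across line breaks without being greedy. `std::regex` can be slow on large inputs, so this is best suited to small sample pages like the ones in this repository.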

Technical Design & Language Proficiency

This project demonstrates:

- Encapsulation and modularity: Parsing logic and data representation are separated into dedicated classes.
- Inheritance & polymorphism: Extended analyzer functionality using classic OOP approaches.
- String manipulation and file I/O: Robust handling of reading files and searching/extracting data.
- Error handling: Graceful management of file and parsing exceptions.
- Readable, maintainable code: Structured for easy understanding and extensibility.
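
To make the inheritance and polymorphism point concrete, here is one way an analyzer could be extended through a common base class. The class names (`Page`, `SeoCheck`, `TitleCheck`, `MetaDescriptionCheck`) and the `runChecks` function are assumptions made for illustration and may not match the classes in `seo_analyzer.cpp`.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical data holder for the values extracted from one HTML file.
struct Page {
    std::string title;
    std::string metaDescription;
};

// Hypothetical base class: each check produces one report line for a page.
class SeoCheck {
public:
    virtual ~SeoCheck() = default;
    virtual std::string report(const Page& page) const = 0;
};

class TitleCheck : public SeoCheck {
public:
    std::string report(const Page& page) const override {
        if (page.title.empty()) {
            return "Title: missing";
        }
        return "Title: \"" + page.title + "\"";
    }
};

class MetaDescriptionCheck : public SeoCheck {
public:
    std::string report(const Page& page) const override {
        if (page.metaDescription.empty()) {
            return "Meta Description: missing";
        }
        return "Meta Description: \"" + page.metaDescription + "\"";
    }
};

// Runs every check through the base-class interface and collects the lines.
std::vector<std::string> runChecks(
    const Page& page, const std::vector<std::unique_ptr<SeoCheck>>& checks) {
    std::vector<std::string> lines;
    for (const auto& check : checks) {
        lines.push_back(check->report(page));
    }
    return lines;
}
```

With this layout, a new analysis (for example, a header-structure check) is added by deriving another class from `SeoCheck`, and `runChecks` does not need to change.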

Usage

Build:
```
g++ seo_analyzer.cpp -o seo_analyzer
```
Run:
```
./seo_analyzer
```
Follow prompts to analyze any HTML file and keyword.

Future Work

- Add modules for analyzing header structure, readability, and generating detailed reports.
- Integrate a real HTML parser for more complex documents.
- Expand the set of sample websites and analysis features.

Conclusion

SEO--Optimization is a learning project designed to demonstrate my skills in C++ and object-oriented programming. It parses sample HTML files for SEO-relevant data, using parsing logic that tolerates variations in letter case, whitespace, and line breaks. Suggestions, improvements, and contributions are welcome!
