sxijyoti/polyglot-parser

PoC for Project 1: Implement MultiLang Parser

This project implements a multi-language parser for three languages (Python, JavaScript, and Ruby) as the Proof of Concept for MetaCall's Project 1. It takes in source files, parses them with the Tree-sitter API into an AST, reduces the AST to an IR, and adapts the generated IR to a unified standard format. It also explores dependency graphs (minimal for this PoC), which are necessary for the Function Mesh and Intellisense.

DEMO VIDEO: https://www.youtube.com/watch?v=RP18MPPS8g0

How to Use

[NOTE] This PoC has so far been tested only on macOS and may fail on other setups.

# build (Ninja)
cmake -S . -B build -G Ninja -DCMAKE_BUILD_TYPE=Release
cmake --build build -j

# for help
./build/polygot_parser -h

# parse a single file
./build/polygot_parser -f examples/example.js

# parse a directory recursively
./build/polygot_parser -d examples/

CLI Reference

  polygot_parser -f <file>           parse one file
  polygot_parser -f <file1> <file2>  parse more than one file
  polygot_parser -d <directory>      parse all supported files in a directory

Other options:
  -o <output.json>        write JSON to file (default: stdout)
  -h, --help              show this help

The output uses the unified JSON format, which the VS Code extension (Intellisense) and the Function Mesh can consume.
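As a rough illustration of what a consumer of this output might do, the sketch below lists every exported symbol. It is written in Python purely for brevity and is not part of this repository. It assumes the output follows the layout documented in the JSON Schema section of this README; `SAMPLE`, the `"js"` language key, and `exported_symbols` are hypothetical names chosen for the example.

```python
import json

# Hypothetical sample shaped like the JSON Schema section of this README.
SAMPLE = """
{
  "languages": {
    "js": {
      "functions": [{"name": "sum", "args": ["a", "b"], "exported": true}],
      "classes": [{"name": "Calculator", "args": [], "exported": true}],
      "objects": [{"name": "CONFIG", "exported": true}]
    }
  },
  "graph": {"edges": []}
}
"""


def exported_symbols(doc):
    """Return sorted (lang, kind, name) tuples for every exported symbol."""
    found = []
    for lang, groups in doc.get("languages", {}).items():
        for kind in ("functions", "classes", "objects"):
            for sym in groups.get(kind, []):
                if sym.get("exported"):
                    found.append((lang, kind, sym["name"]))
    return sorted(found)


if __name__ == "__main__":
    for lang, kind, name in exported_symbols(json.loads(SAMPLE)):
        print(f"{lang}\t{kind}\t{name}")
```

In practice the document would come from a file written with `-o <output.json>` rather than from an inline string.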

File Tree Structure

├─ CMakeLists.txt
├─ README.md
├─ src/
│  ├─ main.c                      # main cli logic is defined here
│  ├─ parser.c                    # the parser logic is handled here
│  └─ parser.h
├─ adapters/                       # extracts language specific queries
│  ├─ adapters.h
│  ├─ adapters.c
│  ├─ python_adapter.c           
│  ├─ js_adapter.c              
│  └─ ruby_adapter.c        
├─ ir/
│  ├─ ir.h                  
│  └─ ir.c                  # creates the Intermediate Representation(IR)
├─ graph/
│  ├─ graph.h
│  └─ graph.c              # creates a minimal dependency graph
├─ exporter/
│  ├─ mc_export.h   
│  └─ mc_export.c        # exports to a json output which can be consumed later
├─ tests/               
└─ examples/        # some example files 
  • CMakeLists.txt - build configuration and dependencies
  • src/main.c - main CLI entry point
  • src/parser.c - parsing flow and file traversal
  • adapters - Tree-sitter extraction for each language (py, js, rb)
  • ir - normalized IR for symbols and exports
  • graph - dependency graph builder; its nodes and edges each carry a type
  • exporter - JSON export of IR and graph
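The file traversal step can be pictured with a small sketch. The real logic lives in src/parser.c and is written in C; the Python below only illustrates the idea of walking a directory and picking a language per file. The extension table and the name `collect_sources` are assumptions, since the README only names py, js, and rb.

```python
import os

# Assumed extension-to-language table; the README only names py, js, and rb.
LANG_BY_EXT = {".py": "python", ".js": "javascript", ".rb": "ruby"}


def collect_sources(root):
    """Walk root recursively and return sorted (path, language) pairs."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            if ext in LANG_BY_EXT:
                # Only files with a known extension are handed to an adapter.
                found.append((os.path.join(dirpath, name), LANG_BY_EXT[ext]))
    return sorted(found)
```

Files with unrecognized extensions are skipped silently here; whether the actual parser warns about them is not stated in this README.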

Architecture

image

JSON Schema

{
  "languages": {
    "<lang>": {
      "functions": [
        { "name": "sum", "args": ["a","b"], "exported": true }
      ],
      "classes": [
        { "name": "Calculator", "args": [], "exported": true }
      ],
      "objects": [
        { "name": "CONFIG", "exported": true }
      ]
    }
  },
  "graph": {
    "edges": [
      {
        "from": "examples/example.js",
        "from_kind": "file",        // file | symbol | module
        "to": "examples/example.py",
        "to_kind": "module",       // file | symbol | module
        "type": "require",         // import | require | define | export | member_of
        "lang": "js"
      }
    ]
  }
}
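To show how the `graph.edges` list might be used, here is a minimal Python sketch that groups import-like edges by their source node. The first edge is the example from the schema above; the second edge, along with the names `EDGES` and `dependencies`, is hypothetical and only there to show the filtering.

```python
from collections import defaultdict

# The first entry is the schema example; the second is a made-up "define" edge.
EDGES = [
    {"from": "examples/example.js", "from_kind": "file",
     "to": "examples/example.py", "to_kind": "module",
     "type": "require", "lang": "js"},
    {"from": "examples/example.js", "from_kind": "file",
     "to": "sum", "to_kind": "symbol",
     "type": "define", "lang": "js"},
]


def dependencies(edges, kinds=("import", "require")):
    """Map each source node to the targets of its edges whose type is in kinds."""
    deps = defaultdict(list)
    for edge in edges:
        if edge["type"] in kinds:
            deps[edge["from"]].append(edge["to"])
    return dict(deps)


if __name__ == "__main__":
    for source, targets in dependencies(EDGES).items():
        print(source, "->", ", ".join(targets))
```

Passing a different `kinds` tuple, for example `("define",)`, selects the other edge types listed in the schema.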

Example Output

HELP Command

image

Output for example.js

image

Integration with the MetaCall VS Code Extension

As you can see, example.py defines several functions that take arguments:

image

I called these functions in the JS file with different arguments and, as you can see, the error squiggles appear! They come from the VS Code extension :)

image

This is due to the generated parser output:

image
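One way such a check could work is to compare the number of arguments at each call site with the `args` list in the parser output. The Python sketch below illustrates that idea only. It is not the extension's code, and `PARSED`, the `"py"` language key, and `arity_errors` are hypothetical. It also ignores default and variadic arguments.

```python
# Hypothetical parser output for example.py, shaped like the JSON schema.
PARSED = {
    "languages": {
        "py": {
            "functions": [
                {"name": "sum", "args": ["a", "b"], "exported": True},
            ],
            "classes": [],
            "objects": [],
        }
    }
}


def arity_errors(doc, calls):
    """Return one message per (name, argc) call that mismatches the declared args."""
    declared = {}
    for groups in doc["languages"].values():
        for fn in groups.get("functions", []):
            declared[fn["name"]] = len(fn["args"])
    errors = []
    for name, argc in calls:
        expected = declared.get(name)
        # Unknown functions are skipped; only known names are checked.
        if expected is not None and expected != argc:
            errors.append(f"{name}: expected {expected} argument(s), got {argc}")
    return errors
```

Each returned message would correspond to one squiggle in the editor.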

Code showing the integration with the VS Code extension: https://github.com/sxijyoti/mc-vscode-extension/tree/testing-parser

AI Disclosure

All architectural decisions, implementation details, and the proof of concept were independently conceived and implemented by me. AI usage was kept as minimal as possible. AI was used only for the tests and for checking the integration with the VS Code extension, due to time constraints. Everything was reviewed and verified by me before being taken into consideration.
