Skip to content

ParserSQL

Mikel-Bracamonte edited this page Jul 16, 2025 · 16 revisions

SQL Parser Usage

We implemented a custom parser for SQL-like queries that supports:

  • Table creation with index definitions
  • Index creation
  • Selection queries with filtering by key or range
  • Insert and delete instructions

This page shows the usage of the SQL parser. A technical explanation of the inner workings of the parser can be seen here.

Executing instructions

To execute an instruction, the method execute_sql() can be imported from parserSQL/parser.py as follows:

from parserSQL import parser

And then executed using:

parser.execute_sql(sql)

The method requires a string that contains the SQL instruction.

Our custom frontend application can also be used. The usage of this application is shown here.

General information

Each instruction is separated by a semicolon (;). If a instruction throws an error when executed, all the following intructions won't be executed.

A single-line comment starts with --.

A multi-line comment starts with /* and ends on the first instance of */.

Strings are defined using single quotes (').

If a float value is required, it must be explicitly declared as a float, even if it only has a whole part. For example, a value of 5 as a float has to be written as 5.0.

A point value is declared as two floats inside parenthesis separated by a comma. For example: (2.0, 2.0).

Supported index types and data types can be seen here.

Supported conditions can be seen here.

Table creation

Tables are created with the CREATE TABLE instruction, followed by the table name and the column definitions. CREATE TABLE may be followed by IF NOT EXISTS, in which case an error won't be thrown if the table already exists.

Columns are defined by a column name, a data type, and optionally an index type. The primary key is specified via PRIMARY KEY. A table must have at least one column and exactly one primary key. If a column is of data type VARCHAR, the maximum length has to be specified.

Supported index types and data types can be seen here.

By default, a table's primary key has a HASH index.

For example:

CREATE TABLE Employees (
  ID INT PRIMARY KEY INDEX HASH,
  FirstName VARCHAR(20) INDEX BTREE,
  LastName VARCHAR(20) INDEX BTREE,
  Salary FLOAT,
  IsActive BOOL,
  EstablishmentLocation POINT INDEX RTREE
);

Full examples can be seen here.

Table deletion

Tables can be dropped using the DROP TABLE instruction, followed by the table name. DROP TABLE may be followed by IF EXISTS, in which case an error won't be thrown if the table doesn't exist.

Dropping a table will also drop all related indexes.

For example:

DROP TABLE Employees;

Full examples can be seen here.

Data insertion

Data is inserted using the INSERT instruction. The table is specificed via INTO {table_name}, after which the order of the columns may be specificed. The values to be inserted are specified with VALUES, followed by the values inside parenthesis separated by commas.

For example:

INSERT INTO Employees (ID, FirstName, LastName, Salary, IsActive, EstablishmentLocation) VALUES (879512, 'Alice', 'Smith', 3400.75, TRUE, (40.7128, -74.0060));

Full examples can be seen here.

Data selection

Data is selected using the SELECT instruction, after which the columns to return are specified separated by commas, or instead * may be used to return all columns. The table to select from is specified using FROM {table_name}, after which conditions may be specified using WHERE. Only the entries that fulfill the conditions are returned.

The available conditions and the way to write them can be seen here.

At the end of the instruction, ORDER BY followed by a column name may be added to order the results according to the specified column. When ORDER BY is used, the results are returned in descending order, if the results are to be returned in ascending order instead, ASC may be used.

The amount of data to be returned can be specified using LIMIT followed by a number.

For example:

SELECT ID, FirstName, LastName FROM Employees WHERE
IsActive AND LastName > 'G'
ORDER BY LastName ASC
LIMIT 20;

Full examples can be seen here.

Data deletion

Data is deleted using the DELETE instruction. The table to delete from is specified using FROM {table_name}, after which conditions may be (and probably should be) specified using WHERE. Only entries that fulfill the conditions will be deleted.

The available conditions and the way to write them can be seen here.

For example:

DELETE FROM Employees WHERE NOT IsActive AND LastName < 'B';

Full examples can be seen here.

Index creation

An index is created using the CREATE INDEX instruction, followed by the index name. The table is specified using ON {table_name}.

The index type may be specified using USING followed by the index type. If the index is specified, a valid index for the column's data type must be used.

The supported indexes for each data type can be seen here

If no index type is specified, a B+ Tree index is used by default, unless the column is of data type POINT, in which case an RTree index is used by default.

At the end, the column to be indexed is specified inside parenthesis.

For example:

CREATE INDEX idx_salary ON Employees USING AVL (Salary);

Full examples can be seen here.

Index deletion

An index is dropped using the DROP INDEX instruction, followed by the index name. The index's table is specified using ON {table_name}.

When a table is created and an index is assigned to a column, it has the following default name: idx_{column_name}_{index_type}. For example: idx_FirstName_BTREE.

For example:

DROP INDEX idx_salary ON Employees;

Full examples can be seen here.

Supported data types and index types

The supported data types are the following:

  • INT
  • FLOAT
  • VARCHAR
  • POINT
  • BOOL

The supported index types are the following:

  • BTREE
  • HASH
  • AVL
  • ISAM
  • RTREE

INT, FLOAT, and VARCHAR columns support all indexes except for the RTree index. POINT columns only support the RTree index.

Supported conditions

Many conditions may be specified, joined together by AND, OR and NOT operations. NOT operations are evaluated first, followed by AND operations and finally OR operations. If a specific priority is needed, parenthesis may be used.

Columns of type INT, FLOAT and VARCHAR support the following conditions:

  • Greater than (>)
  • Greater or equal to (>=)
  • Less than (<)
  • Less or equal to (<=)
  • Equal to (=)
  • Not equal to (!= or <>)

A value corresponding to the column's data type has to be written in the condition's right hand side.

Columns of type BOOL can be written by themselves and will function as {column_name} = TRUE.

Columns of type POINT support the following conditions:

  • WITHIN RECTANGLE (WITHIN RECTANGLE)
  • WITHIN CIRCLE (WITHIN CIRCLE)
  • KNN (KNN)

The value needed in the right hand side depends on the operation.

The WITHIN RECTANGLE operation requires a rectangle declaration, which is specified by a bottom-left point and a top-right point, written as (x_min, y_min, x_max, y_max), where each value is a float and x_min and y_min must be less than x_max and y_max respectively. It will return the points that are located inside the rectangle.

The WITHIN CIRCLE operation requires a circle declaration, which is specified by a mid-point and a radius, written as (x, y, radius), where each value is a float and radius must be positive. It will return the points that are located inside the circle.

THE KNN operation requires a point and a value for k, written as (x, y, k), where x and y are floats and k is a positive integer. It will return the k nearest neighbours to the point x, y.

Full examples can be seen here.

Full examples

  • Table creation and data insertion
CREATE TABLE Customer (
    ID INT PRIMARY KEY INDEX HASH,
    Name VARCHAR(20) INDEX BTree,
    Address VARCHAR(70),
    Balance FLOAT INDEX AVL,
    Location POINT INDEX RTree
);

INSERT INTO Customer VALUES(735289, 'Quenta', '742 Evergreen Terrace, Springfield, IL, USA', (39.791555, -89.650422));
  • Data selection
SELECT * FROM Customer WHERE ID = 735289;
SELECT * FROM Customer WHERE Balance >= 300.00;
SELECT Name, Address FROM Customer WHERE Location WITHIN CIRCLE (38.538254, -87.682953, 5.0) ORDER BY Name;
SELECT Name, Address FROM Customer WHERE Location WITHIN RECTANGLE (35.739125, -89.925284, 42.681854, -85.276781) LIMIT 5;
SELECT Name, Address FROM Customer WHERE Location KNN (2.624, 6.525, 3);

SELECT * FROM Customer WHERE
 ID != 735289 AND Name > 'C' AND
 (Location KNN (2.624, 6.525, 3) OR Location WITHIN RECTANGLE (35.739125, -89.925284, 42.681854, -85.276781))
 ORDER BY Name
 LIMIT 10;
  • Data deletion
DELETE FROM Customer WHERE ID = 735289;
DELETE FROM Customer WHERE Location WITHIN CIRCLE (38.538254, -87.682953, 5.0);
  • Index creation and deletion
CREATE INDEX idx_address_avl ON Customer USING ISAM (Address);
DROP INDEX idx_address_avl ON Customer;

Clone this wiki locally