-
Notifications
You must be signed in to change notification settings - Fork 71
Open
Labels
Description
SuperDB does not yet support a "tuple IN subquery", e.g., a query that contains WHERE ... IN ... in which the left hand side is a tuple and the right hand side is a subquery.
Details
Repro is with super commit ecdb021.
Here's a working example in Postgres.
$ psql postgres
psql (17.6 (Homebrew))
Type "help" for help.
postgres=# CREATE TABLE employees (
employee_id INT PRIMARY KEY,
name VARCHAR(100),
department_id INT,
job_title VARCHAR(50),
office VARCHAR(50)
);
CREATE TABLE
postgres=# INSERT INTO employees (employee_id, name, department_id, job_title, office) VALUES
(1, 'Alice Chen', 10, 'Manager', 'Sales'),
(2, 'Bob Smith', 10, 'Manager', 'Marketing'),
(3, 'Carol Jones', 20, 'Analyst', 'Sales'),
(4, 'David Lee', 15, 'Director', 'Engineering'),
(5, 'Eve Williams', 20, 'Analyst', 'Marketing'),
(6, 'Frank Brown', 30, 'Developer', 'Engineering'),
(7, 'Grace Davis', 25, 'Consultant', 'Marketing');
INSERT 0 7
postgres=# SELECT employee_id, name, department_id, job_title
FROM employees
WHERE (department_id, job_title) IN (
SELECT department_id, job_title
FROM employees
WHERE office = 'Sales'
);
employee_id | name | department_id | job_title
-------------+--------------+---------------+-----------
1 | Alice Chen | 10 | Manager
2 | Bob Smith | 10 | Manager
3 | Carol Jones | 20 | Analyst
5 | Eve Williams | 20 | Analyst
(4 rows)
When run in super the same example returns no output.
$ super -version
Version: ecdb02110
$ super -f parquet -o employees.parquet -c "
VALUES
(1, 'Alice Chen', 10, 'Manager', 'Sales'),
(2, 'Bob Smith', 10, 'Manager', 'Marketing'),
(3, 'Carol Jones', 20, 'Analyst', 'Sales'),
(4, 'David Lee', 15, 'Director', 'Engineering'),
(5, 'Eve Williams', 20, 'Analyst', 'Marketing'),
(6, 'Frank Brown', 30, 'Developer', 'Engineering'),
(7, 'Grace Davis', 25, 'Consultant', 'Marketing')
| values {employee_id:c0, name:c1, department_id:c2, job_title:c3, office:c4}" && super -c "
SELECT employee_id, name, department_id, job_title
FROM employees.parquet
WHERE (department_id, job_title) IN (
SELECT department_id, job_title
FROM employees.parquet
WHERE office = 'Sales'
);"
[no output]
@mccanne explained that this is currently happening because the tuple on the left hand side of the IN turns into records of the format {c0: ..., c1: ..., ...} and comparing records for equality currently requires the types to match, so the specific field names returned by the subquery will cause a mismatch.