Commit 774aee4

Final changes for VECTOR support in Oracle Database 23ai.
1 parent acd68ee commit 774aee4

20 files changed: +1471 -35 lines changed

doc/src/api_manual/fetch_info.rst

Lines changed: 16 additions & 0 deletions
@@ -112,3 +112,19 @@ FetchInfo Attributes
     This read-only attribute returns the type of the column as mandated by the
     Python Database API. The type will be one of the :ref:`database type
     constants <dbtypes>` defined at the module level.
+
+.. attribute:: FetchInfo.vector_dimensions
+
+    This read-only attribute returns the number of dimensions required by
+    vector columns. If the column is not a vector column or allows for any
+    number of dimensions, the value returned is ``None``.
+
+    .. versionadded:: 2.2.0
+
+.. attribute:: FetchInfo.vector_type
+
+    This read-only attribute returns the storage type required by vector
+    columns. If the column is not a vector column or allows for any
+    type of storage, the value returned is ``None``.
+
+    .. versionadded:: 2.2.0

doc/src/api_manual/module.rst

Lines changed: 8 additions & 0 deletions
@@ -3147,6 +3147,14 @@ Also see the table :ref:`supporteddbtypes`.
     type VARCHAR2. It will compare equal to the DB API type :data:`STRING`.
 
 
+.. data:: DB_TYPE_VECTOR
+
+    Describes columns, attributes or array elements in a database that are of
+    type VECTOR.
+
+    .. versionadded:: 2.2.0
+
+
 .. data:: DB_TYPE_XMLTYPE
 
     Describes columns, attributes or array elements in a database that are of

doc/src/index.rst

Lines changed: 1 addition & 0 deletions
@@ -40,6 +40,7 @@ User Guide
    user_guide/lob_data.rst
    user_guide/json_data_type.rst
    user_guide/xml_data_type.rst
+   user_guide/vector_data_type.rst
    user_guide/soda.rst
    user_guide/aq.rst
    user_guide/cqn.rst

doc/src/release_notes.rst

Lines changed: 1 addition & 0 deletions
@@ -34,6 +34,7 @@ Thick Mode Changes
 Common Changes
 ++++++++++++++
 
+#) Added support for columns of type VECTOR.
 #) Added support for database type :data:`oracledb.DB_TYPE_INTERVAL_YM` which
    is represented in Python by instances of the new
    :ref:`oracledb.IntervalYM <interval_ym>` class

doc/src/user_guide/appendix_a.rst

Lines changed: 4 additions & 0 deletions
@@ -480,6 +480,10 @@ values.
       - :data:`~oracledb.DB_TYPE_OBJECT`
       - No relevant notes
       - OBJECT of specific type
+    * - VECTOR
+      - :data:`~oracledb.DB_TYPE_VECTOR`
+      - No relevant notes
+      -
 
 Binding of contiguous PL/SQL Index-by BINARY_INTEGER arrays of string, number, and date are
 supported in python-oracledb Thin and Thick modes. Use :meth:`Cursor.arrayvar()` to build

doc/src/user_guide/vector_data_type.rst

Lines changed: 223 additions & 0 deletions
@@ -0,0 +1,223 @@
.. _vectors:

*****************
Using Vector Data
*****************

Oracle Database 23ai introduced a new data type VECTOR for artificial
intelligence and machine learning search operations. The vector data type
is a homogeneous array of 8-bit signed integers, 32-bit floating-point
numbers, or 64-bit floating-point numbers. With the vector data type, you
can define the number of dimensions for the data and the storage format
for each dimension value in the vector.

To create a table with three columns for vector data, for example:

.. code-block:: sql

    CREATE TABLE vector_table (
        v32 vector(3, float32),
        v64 vector(3, float64),
        v8 vector(3, int8)
    )

In this example, each column can store vector data of three dimensions where
each dimension value is of the specified storage format. This example is used
in subsequent sections.

.. _insertvector:

Inserting Vectors
=================

With python-oracledb, vector data can be inserted using Python arrays
(``array.array()``). To use Python arrays, import the ``array`` module in your
code.

Python arrays (``array.array()``) of float (32-bit), double (64-bit), or
int8_t (8-bit signed integer) are used as bind values when inserting vector
columns. For example:

.. code-block:: python

    vector_data_32 = array.array("f", [1.625, 1.5, 1.0])     # 32-bit float
    vector_data_64 = array.array("d", [11.25, 11.75, 11.5])  # 64-bit float
    vector_data_8 = array.array("b", [1, 2, 3])              # 8-bit signed integer

    cursor.execute(
        "insert into vector_table (v32, v64, v8) values (:1, :2, :3)",
        [vector_data_32, vector_data_64, vector_data_8],
    )

See `vector.py <https://github.com/oracle/python-oracledb/tree/main/
samples/vector.py>`__ for a runnable example.

If you are using python-oracledb Thick mode with older versions of Oracle
Client libraries than 23ai, see this
:ref:`section <vector_thick_mode_old_client>`.
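
The pairing of storage formats and ``array.array()`` typecodes described in
this section can be captured in a small helper. The following is only a
minimal sketch: the names ``TYPECODES`` and ``make_vector()`` are hypothetical
and are not part of python-oracledb, and the default of three dimensions is an
assumption that mirrors the columns of ``vector_table``:

.. code-block:: python

    import array

    # Hypothetical lookup table: storage format name -> array.array() typecode
    TYPECODES = {"float32": "f", "float64": "d", "int8": "b"}

    def make_vector(storage_format, values, dimensions=3):
        """Build an array.array() that could be bound to a vector column."""
        # Reject values whose length does not match the column definition
        if len(values) != dimensions:
            raise ValueError(
                f"expected {dimensions} dimensions, got {len(values)}"
            )
        return array.array(TYPECODES[storage_format], values)

    vector_data_32 = make_vector("float32", [1.625, 1.5, 1.0])
    vector_data_64 = make_vector("float64", [11.25, 11.75, 11.5])
    vector_data_8 = make_vector("int8", [1, 2, 3])

The resulting arrays can be passed to ``cursor.execute()`` as bind values in
the same way as the arrays in the ``INSERT`` example of this section.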

.. _fetchvector:

Fetching Vectors
================

With python-oracledb, vector columns are fetched as Python arrays
(``array.array()``). For example:

.. code-block:: python

    cursor.execute("select * from vector_table")
    for row in cursor:
        print(row)

This prints an output such as::

    (array("f", [1.625, 1.5, 1.0]), array("d", [11.25, 11.75, 11.5]), array("b", [1, 2, 3]))

The :ref:`FetchInfo <fetchinfoobj>` object that is returned as part of the
fetched metadata contains attributes :attr:`FetchInfo.vector_dimensions` and
:attr:`FetchInfo.vector_type` which return the number of dimensions of the
vector column and the storage format of each dimension value in the vector
column respectively.
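
As an illustration of reading these attributes, the sketch below walks
``cursor.description`` after a query has been executed. The helper name
``describe_vector_columns()`` is hypothetical, and the attribute names are the
ones documented in this commit. The vector type constant is passed in as a
parameter so that the helper itself does not need to import ``oracledb``:

.. code-block:: python

    def describe_vector_columns(cursor, vector_db_type):
        """Return (name, dimensions, storage type) for each vector column.

        vector_db_type is expected to be oracledb.DB_TYPE_VECTOR.
        """
        columns = []
        for info in cursor.description:
            # Skip columns that are not of type VECTOR
            if info.type_code is not vector_db_type:
                continue
            # Either attribute may be None if the column does not fix it
            columns.append(
                (info.name, info.vector_dimensions, info.vector_type)
            )
        return columns

A call might look like
``describe_vector_columns(cursor, oracledb.DB_TYPE_VECTOR)`` after executing
``select * from vector_table``.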

You can convert the vector data fetched from a connection to a Python list by
using the following :ref:`output type handler <outputtypehandlers>`:

.. code-block:: python

    def output_type_handler(cursor, metadata):
        if metadata.type_code is oracledb.DB_TYPE_VECTOR:
            return cursor.var(metadata.type_code, arraysize=cursor.arraysize,
                              outconverter=list)
    connection.outputtypehandler = output_type_handler
    cursor.execute("select * from vector_table")
    for row in cursor:
        print(row)

For each vector column, the database will now return a Python list
representation of each row's value.
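
What ``outconverter=list`` does to a single value can be tried without a
database connection. This is only a sketch in which a locally built array
stands in for a fetched value, and the variable names are illustrative. It
also shows that 32-bit values are widened to Python floats, so a number with
no exact 32-bit representation does not compare equal to its literal:

.. code-block:: python

    import array

    # Stands in for a value fetched from the v32 column
    fetched = array.array("f", [1.625, 1.5, 1.0])
    as_list = list(fetched)       # the conversion applied by outconverter=list
    print(as_list)                # [1.625, 1.5, 1.0]

    # 0.1 cannot be stored exactly in 32 bits, so the round trip is inexact
    approx = list(array.array("f", [0.1]))[0]
    print(approx == 0.1)          # False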

If you are using python-oracledb Thick mode with older versions of Oracle
Client libraries than 23ai, see :ref:`below <vector_thick_mode_old_client>`.

.. _vector_thick_mode_old_client:

Using python-oracledb Thick Mode with Older Versions of Oracle Client Libraries
===============================================================================

If you are using python-oracledb Thick mode with older versions of Oracle
Client libraries than 23ai, then you must use strings when inserting vectors.
For example:

.. code-block:: python

    vector_data_32 = "[1.625, 1.5, 1.0]"
    vector_data_64 = "[11.25, 11.75, 11.5]"
    vector_data_8 = "[1, 2, 3]"

    cursor.execute(
        "insert into vector_table (v32, v64, v8) values (:1, :2, :3)",
        [vector_data_32, vector_data_64, vector_data_8],
    )

The vector columns are fetched as Python lists. For example:

.. code-block:: python

    cursor.execute("select * from vector_table")
    for row in cursor:
        print(row)

See `vector_string.py <https://github.com/oracle/python-oracledb/tree/main/
samples/vector_string.py>`__ for a runnable example.
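
If the data already exists as Python arrays or lists, the string form shown
in this section can be produced programmatically. The following is a minimal
sketch: ``vector_to_string()`` is a hypothetical helper, and it assumes that
the bracketed, comma-separated text used in the example above is the form the
database accepts:

.. code-block:: python

    import array
    import json

    def vector_to_string(values):
        """Format a sequence of numbers as a bracketed, comma-separated string."""
        return json.dumps(list(values))

    vector_data_32 = vector_to_string(array.array("f", [1.625, 1.5, 1.0]))
    vector_data_8 = vector_to_string([1, 2, 3])
    print(vector_data_32)   # [1.625, 1.5, 1.0]
    print(vector_data_8)    # [1, 2, 3]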

.. _numpyvectors:

Using NumPy
===========

Vector data can be used with Python's `NumPy <https://numpy.org>`__ package
types. To use NumPy's ndarray type, install NumPy, for example with
``pip install numpy``, and import the module in your code.

Inserting Vectors with NumPy
----------------------------

To insert vectors, you must convert NumPy ndarray types to array types. This
conversion can be done by using an input type handler. For example:

.. code-block:: python

    def numpy_converter_in(value):
        if value.dtype == numpy.float64:
            dtype = "d"
        elif value.dtype == numpy.float32:
            dtype = "f"
        else:
            dtype = "b"
        return array.array(dtype, value)

    def input_type_handler(cursor, value, arraysize):
        if isinstance(value, numpy.ndarray):
            return cursor.var(
                oracledb.DB_TYPE_VECTOR,
                arraysize=arraysize,
                inconverter=numpy_converter_in,
            )

Using it in an ``INSERT`` statement:

.. code-block:: python

    vector_data_32 = numpy.array([1.625, 1.5, 1.0])
    vector_data_64 = numpy.array([11.25, 11.75, 11.5])
    vector_data_8 = numpy.array([1, 2, 3])

    connection.inputtypehandler = input_type_handler

    cursor.execute(
        "insert into vector_table (v32, v64, v8) values (:1, :2, :3)",
        [vector_data_32, vector_data_64, vector_data_8],
    )

Fetching Vectors with NumPy
---------------------------

To fetch vector data as an ndarray type, you can convert the array type to
an ndarray type by using an output type handler. For example:

.. code-block:: python

    def numpy_converter_out(value):
        if value.typecode == "b":
            dtype = numpy.int8
        elif value.typecode == "f":
            dtype = numpy.float32
        else:
            dtype = numpy.float64
        return numpy.array(value, copy=False, dtype=dtype)

    def output_type_handler(cursor, metadata):
        if metadata.type_code is oracledb.DB_TYPE_VECTOR:
            return cursor.var(
                metadata.type_code,
                arraysize=cursor.arraysize,
                outconverter=numpy_converter_out,
            )

Using it in a query:

.. code-block:: python

    connection.outputtypehandler = output_type_handler

    cursor.execute("select * from vector_table")
    for row in cursor:
        print(row)

This prints an output such as::

    (array([1.625, 1.5, 1.0], dtype=float32), array([11.25, 11.75, 11.5], dtype=float64), array([1, 2, 3], dtype=int8))

See `vector_numpy.py <https://github.com/oracle/python-oracledb/tree/main/
samples/vector_numpy.py>`__ for a runnable example.

samples/create_schema.py

Lines changed: 4 additions & 0 deletions
@@ -54,4 +54,8 @@
     sample_env.run_sql_script(
         conn, "create_schema_21", main_user=sample_env.get_main_user()
     )
+if sample_env.get_server_version() >= (23, 4):
+    sample_env.run_sql_script(
+        conn, "create_schema_23", main_user=sample_env.get_main_user()
+    )
 print("Done.")

samples/sql/create_schema_23.sql

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
/*-----------------------------------------------------------------------------
 * Copyright 2023, Oracle and/or its affiliates.
 *
 * This software is dual-licensed to you under the Universal Permissive License
 * (UPL) 1.0 as shown at https://oss.oracle.com/licenses/upl and Apache License
 * 2.0 as shown at http://www.apache.org/licenses/LICENSE-2.0. You may choose
 * either license.*
 *
 * If you elect to accept the software under the Apache License, Version 2.0,
 * the following applies:
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * https://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 *---------------------------------------------------------------------------*/

/*-----------------------------------------------------------------------------
 * create_schema_23.sql
 *
 * Performs the actual work of creating and populating the schemas with the
 * database objects used by the python-oracledb samples that require Oracle
 * Database 23.4 or higher. It is executed by the Python script
 * create_schema.py.
 *---------------------------------------------------------------------------*/

create table &main_user..SampleVectorTab (
    v32 vector(3, float32),
    v64 vector(3, float64),
    v8 vector(3, int8)
)
/
