Api refactoring[Suspended, DO NOT USE]#157
Conversation
086b5c3 to
980330b
Compare
980330b to
485da42
Compare
|
I still have some doubt regarding to issue #154. |
blaesus
left a comment
There was a problem hiding this comment.
Overall, I am afraid this PR is inadequate. As we have more pressing tasks, I suggest we shelf this PR. We will continue to use the existing HTTP interface.
Two major issues with this PR are:
- ChainName in URL is not implemented,
- string representation of UnitContent is awkward to use.
1. ChainName in URL
The new URL scheme exposes the chain_name parameter, but the PR does not actually implement it. The parameter in URL is simply ignore (see unicus/cortex.rs).
Moreover, if user writes to a chain other than the one that is actually running, the data will be written to the wrong chain.
This simply should not happen to a public interface of a database.
2. UnitContent string representation.
Here are a list of higher-level issues of the current string representation scheme. Comments about implementation details are in the code.
Symmetry
Currently, the UnitContent string representation is symmetric (the input and output sets are one-to-one) – symmetry is usual for byte streams, but doubtful for strings that humans handle.
For example, why can't users say property_name=age&property=10, but have to say
property_name=age&property=_f64#10?
If more numerical types are introduced (for example, u64), which one should the user use?
property_name=age&property=_f64#10
property_name=age&property=_u64#10
Precision
The current design is under-specced.
For examples, bytes are represented as _bytes#[0x7b,0x22,0x66,0x36,0x34,0x00,0xff].
What happens when 0x is dropped?
What happens when there are spaces or \n around comma or brackets?
What happens for trailing spaces or \n? (All common in http interactions).
What happens when someone say 0xgg?
Of course there are answers to these questions, but the answers are not easily known.s
Extension
How do we say age >= 10, str includes "hello", or obj.field == 1?
These are all frequent database queries that will be used by users. The current design cannot accomondate these requirements, and it is non-obvious how they can be implemented upon the existing syntax.
Exposure of implementation details
For the user, what's the difference BsonBytes and JsonString? They are just two ways to store JSON.
Also, what's the difference between BsonBytes and Bytes?
Which one should the user choose if they have some JSON?
_string#{"hello": "world"}
_json_str#{"hello": "world"}
_bson_bytes#[...]
_bytes#[...]
|
|
||
| #[repr(u8)] | ||
| pub enum ContentTypePrefix { | ||
| pub enum ContentTypeBytePrefix { |
| } | ||
|
|
||
| fn decode_hex(s: &str) -> Result<Vec<u8>, ParseIntError> { | ||
| // Input is gonna be a str like "[0x01,0x02,0xff]" |
There was a problem hiding this comment.
What does gonna mean? Does it mean that the caller have to ensure s comes in that form, or that the function will handle malformed input?
| .iter() | ||
| .map(|byte| { | ||
| let mut s = String::new(); | ||
| match write!(&mut s, "{:#04x}", byte) { |
There was a problem hiding this comment.
the 0x prefix seems redundant and waste space? If it's a 1kB data, 0x will be repeated 1024 times?
| fn fmt(&self, f: &mut Formatter<'_>) -> FormatResult { | ||
| let unit_content_prefix = get_content_type_str_prefix(&self); | ||
|
|
||
| match self { |
There was a problem hiding this comment.
The logic to insert prefix and SEPARATOR are duplicated for all branches?
| type Err = UnitContentError; | ||
|
|
||
| fn from_str(s: &str) -> Result<Self, Self::Err> { | ||
| let data: Vec<&str> = s.split(CONTENT_TYPE_VALUE_SEPARATOR).collect(); |
There was a problem hiding this comment.
It's surprising and inefficient to split on SEPARATOR, only to add them back with join.
| r#"_json_str#{"f64": 0.0, "str": "string_0", "bool": true}"#, | ||
| "_bson_bytes#[0x01,0x02,0x03,0xff,0x3f,0x00]", | ||
| "_bytes#[0x7b,0x22,0x66,0x36,0x34,0x00,0xff]", | ||
| ]; |
There was a problem hiding this comment.
The test suite is missing these vulnerabilities:
- malformed inputs (for example, ``,
###). - Chinese and other multi-unicode inputs.
Also, I am not sure if weird structures would break URL parsing:
What happens if I use _str#property_name, _str#&by_kv, or _str#_str in URL?
| (segments[1], "") | ||
| } else { | ||
| ("", "") | ||
| let (_host_ip, _chain_name, target_grouping_str, unit_id_str, height_str) = match segments.len() |
There was a problem hiding this comment.
Why is chain_name ignored ever since? This is a key part of the modification.
| pub struct ArtificialDataBenchSpec { | ||
| pub name: &'static str, | ||
| pub unicus_port: u16, | ||
| pub chain_name: &'static str, |
There was a problem hiding this comment.
chain_name should have type of ChainName
| fn batch_insert<F>( | ||
| client: &ImmuxDBClient, | ||
| units_vec: &[Vec<Unit>], | ||
| chain_name: &str, |
There was a problem hiding this comment.
chain_name should have type ChainName.
|
|
||
| pub const MAX_KVKEY_LENGTH: usize = 8 * 1024; // 8KB | ||
| pub const MAX_KVKEY_LENGTH: usize = 8 * 1024; | ||
| // 8KB |
No description provided.