Skip to content

Api refactoring[Suspended, DO NOT USE]#157

Open
poppindouble wants to merge 1 commit intomasterfrom
api-refactoring
Open

Api refactoring[Suspended, DO NOT USE]#157
poppindouble wants to merge 1 commit intomasterfrom
api-refactoring

Conversation

@poppindouble
Copy link
Copy Markdown
Collaborator

No description provided.

@poppindouble poppindouble force-pushed the api-refactoring branch 5 times, most recently from 086b5c3 to 980330b Compare October 5, 2019 00:22
@poppindouble poppindouble changed the title Api refactoring(W.I.P) Api refactoring Oct 5, 2019
@poppindouble poppindouble requested a review from blaesus October 5, 2019 00:25
@poppindouble
Copy link
Copy Markdown
Collaborator Author

I still have some doubt regarding to issue #154.
here is the link.
https://github.com/immux/immux/issues/154#issuecomment-537774161

Copy link
Copy Markdown
Contributor

@blaesus blaesus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I am afraid this PR is inadequate. As we have more pressing tasks, I suggest we shelf this PR. We will continue to use the existing HTTP interface.

Two major issues with this PR are:

  1. ChainName in URL is not implemented,
  2. string representation of UnitContent is awkward to use.

1. ChainName in URL

The new URL scheme exposes the chain_name parameter, but the PR does not actually implement it. The parameter in URL is simply ignore (see unicus/cortex.rs).

Moreover, if user writes to a chain other than the one that is actually running, the data will be written to the wrong chain.

This simply should not happen to a public interface of a database.

2. UnitContent string representation.

Here are a list of higher-level issues of the current string representation scheme. Comments about implementation details are in the code.

Symmetry

Currently, the UnitContent string representation is symmetric (the input and output sets are one-to-one) – symmetry is usual for byte streams, but doubtful for strings that humans handle.

For example, why can't users say property_name=age&property=10, but have to say
property_name=age&property=_f64#10?

If more numerical types are introduced (for example, u64), which one should the user use?
property_name=age&property=_f64#10
property_name=age&property=_u64#10

Precision

The current design is under-specced.

For examples, bytes are represented as _bytes#[0x7b,0x22,0x66,0x36,0x34,0x00,0xff].

What happens when 0x is dropped?

What happens when there are spaces or \n around comma or brackets?

What happens for trailing spaces or \n? (All common in http interactions).

What happens when someone say 0xgg?

Of course there are answers to these questions, but the answers are not easily known.s

Extension

How do we say age >= 10, str includes "hello", or obj.field == 1?

These are all frequent database queries that will be used by users. The current design cannot accomondate these requirements, and it is non-obvious how they can be implemented upon the existing syntax.

Exposure of implementation details

For the user, what's the difference BsonBytes and JsonString? They are just two ways to store JSON.

Also, what's the difference between BsonBytes and Bytes?

Which one should the user choose if they have some JSON?
_string#{"hello": "world"}
_json_str#{"hello": "world"}
_bson_bytes#[...]
_bytes#[...]


#[repr(u8)]
pub enum ContentTypePrefix {
pub enum ContentTypeBytePrefix {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the renaming?

}

fn decode_hex(s: &str) -> Result<Vec<u8>, ParseIntError> {
// Input is gonna be a str like "[0x01,0x02,0xff]"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does gonna mean? Does it mean that the caller have to ensure s comes in that form, or that the function will handle malformed input?

.iter()
.map(|byte| {
let mut s = String::new();
match write!(&mut s, "{:#04x}", byte) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the 0x prefix seems redundant and waste space? If it's a 1kB data, 0x will be repeated 1024 times?

fn fmt(&self, f: &mut Formatter<'_>) -> FormatResult {
let unit_content_prefix = get_content_type_str_prefix(&self);

match self {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic to insert prefix and SEPARATOR are duplicated for all branches?

type Err = UnitContentError;

fn from_str(s: &str) -> Result<Self, Self::Err> {
let data: Vec<&str> = s.split(CONTENT_TYPE_VALUE_SEPARATOR).collect();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's surprising and inefficient to split on SEPARATOR, only to add them back with join.

r#"_json_str#{"f64": 0.0, "str": "string_0", "bool": true}"#,
"_bson_bytes#[0x01,0x02,0x03,0xff,0x3f,0x00]",
"_bytes#[0x7b,0x22,0x66,0x36,0x34,0x00,0xff]",
];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test suite is missing these vulnerabilities:

  1. malformed inputs (for example, ``, ###).
  2. Chinese and other multi-unicode inputs.

Also, I am not sure if weird structures would break URL parsing:
What happens if I use _str#property_name, _str#&by_kv, or _str#_str in URL?

(segments[1], "")
} else {
("", "")
let (_host_ip, _chain_name, target_grouping_str, unit_id_str, height_str) = match segments.len()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is chain_name ignored ever since? This is a key part of the modification.

pub struct ArtificialDataBenchSpec {
pub name: &'static str,
pub unicus_port: u16,
pub chain_name: &'static str,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chain_name should have type of ChainName

fn batch_insert<F>(
client: &ImmuxDBClient,
units_vec: &[Vec<Unit>],
chain_name: &str,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chain_name should have type ChainName.

Comment thread src/config.rs

pub const MAX_KVKEY_LENGTH: usize = 8 * 1024; // 8KB
pub const MAX_KVKEY_LENGTH: usize = 8 * 1024;
// 8KB
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

@poppindouble poppindouble changed the title Api refactoring Api refactoring[Suspended, DO NOT USE] Oct 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants