Conversation
* implemented TCP keep alives
* reconnect logic implementation
* updated README for new features
* read code de-duplication into function
* implemented TCP keep alives
* refactored TCP keepalive patch
* updated Cargo.lock
* fix keepalive retries compilation
* compilation fixes
* fix unused code
* fix arguments comments
* updated Help and README
* initial reconnect logic implementation
* compilation fixes, reformated
* updated README for new reconnect feature, unified buffer read, added reconnect logging
* reconnect logic de-duplication into function
* implement retry mechanism on connection open fail
* README typo fix
* improved cli arguments dependencies
* arg dependencies
Set reconnect default timeout to 60
…weedle#4)
* fix invalid NodeTableResponse struct
* added infrastructure event in observe mode - to output in JSON gateway and node details (addresses, barcodes, versions etc.)
* added persistent storage
* Updated README to reflect implementation
@litinoveweedle What was your motivation for changing
Doing my changes in your PR doesn't sound nice to me - the one thing you can give me is credit for my work, and in your PR it will be gone/lost in the GitHub merge. But sure, the license permits that, so you are free to do it. ;-o
This can be reverted, but I do not see it as a perfect way - it will only work completely/as expected after ~12-24 hours, once Node Table Response messages are received during the night. Also, the correlation is built into taptap - you can't change it or adapt it to your needs. When used with log parsers and/or data stores (Logstash/Fluentd + Elastic), you would want those components to be the ones to correlate, index, etc. - they are perfect for that usage. I wanted taptap to follow the KISS design principle, like a generic GNU/Linux tool - do one thing, but do it well, not be a jack of all trades done poorly.

In my case, when I run taptap I need it to immediately start producing enumerated output (identified by serials) - but without data in the persistent file, I need to be able to temporarily map user-defined serials to the panels/Tigo optimizers, even before receiving any serials from taptap. So I implemented this temporary assignment of a defined serial to 'some' PV node until the proper identification is received during the night. The same is valid after a re-installation of the HA addon, when already captured persistent data will be lost. And I can guarantee there will be tons of user issues of the form "I installed your addon and it doesn't work", with my response being "it is not a bug but a feature; on first run it might need a day to start working".

Again, I can implement the same feature in taptap, or half in taptap and half in the higher layer, but to me the best design is to implement the complete PV-node-to-serial correlation and logic outside of taptap, so anyone can decide and implement what is best for them. What supports this is that no one would use taptap just on its own, but always inside of something - ELK, an MQTT bridge, etc. - and that is the right place to implement the complete correlation for your use case; someone might prefer temporary correlation, others might not.
@litinoveweedle The commits in this branch still list you as the author regardless of who opened the PR. Given your unfamiliarity with Rust and the volume of changes I was about to request, I chose to open a PR so that I can make those changes myself on top of yours – ultimately merging your features, preserving your authorship, and sparing your Copilot tokens – and so that I can push my changes somewhere we can both discuss them before merging.
Right, that all makes sense. These exact concerns motivated my design of these data structures:

```rust
#[derive(Debug, Copy, Clone, Eq, PartialEq, Serialize, Deserialize, JsonSchema)]
pub struct Gateway {
    /// The gateway's link layer ID.
    ///
    /// This value can change over time and is duplicated between different systems, but it is
    /// always present.
    pub id: gateway::link::GatewayID,

    /// The gateway's hardware address.
    ///
    /// This value is permanent and globally unique, but it is not always known.
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub address: Option<pv::LongAddress>,
}

#[derive(Debug, Copy, Clone, Eq, PartialEq, Serialize, Deserialize, JsonSchema)]
pub struct Node {
    /// The node's ID.
    ///
    /// This value can change over time and is duplicated between different gateways, but it is
    /// always present.
    pub id: pv::NodeID,

    /// The node's hardware address.
    ///
    /// This value is permanent and globally unique, but it is not always known.
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub address: Option<pv::LongAddress>,

    /// The node's barcode.
    ///
    /// This value is permanent and globally unique, but it is not always known.
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub barcode: Option<Barcode>,
}
```
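For illustration, with `skip_serializing_if = "Option::is_none"` a gateway whose hardware address is not yet known serializes without the `address` key at all, while an identified one includes it (made-up values; the exact JSON encoding of `GatewayID` and `LongAddress` depends on their `Serialize` implementations):

```json
{ "id": 4660 }
{ "id": 4660, "address": "…" }
```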
Regarding the authorship: as far as I know, GitHub will not add the author's name to the contributors list this way. At least that is my experience on my own projects, where I am trying to avoid exactly that situation. I understand my Rust level is nowhere near yours (and you should probably add Copilot there as well), but you did previously say that you are not going to implement any features or fix bugs anymore, only accept PRs - so I did my best, as I did not see a line of Rust programmers waiting to do that job. Regarding the enumeration: as I said, do whatever you want, but one feature which sometimes works immediately and sometimes only after a day, depending also on the persistent/state file switch... that will not be straightforward to users.
Gotcha. GitHub's docs indicate that commit author email is used to drive the contributors list. My expectation is that you will be listed as a GitHub contributor once this branch or any other branch containing commits you authored lands in
Of course! I appreciate your effort, and I want this repo to solve your problems, which is why I'm not insisting that you change a bunch of stuff on your own. I think we had a miscommunication though. My specific message was:
If you had opened one PR per feature, we could have discussed/iterated/merged them here and cut releases as you went. That's not how we got here though, and rather than ask you to disentangle all your work, I figured it'd be better for me to start from where you are, to understand your motivations, to propose a derivative of your work to merge upstream, and to discuss as we go.
I agree it's not straightforward, but I also don't see how it's avoidable.

How are you using infrastructure reports? It seems like the primary use would be to store them in the receiving application and then use them to interpret node IDs (which is what I think we can avoid by emitting those identifiers in the power reports themselves).
OK, fine, thank you, let's just discuss the functional stuff.
I do use those to interpret node IDs and gateway IDs into addresses, serials, versions, etc. In the configuration of the Home Assistant addon, the user defines a friendly name for every PV node/panel and its serial (in the same way as in the Tigo app/cloud). When taptap is executed there are two options: either it immediately fires an infrastructure event from the existing persistent file, and based on that I can update the friendly-name -> serial relation in HA, or, if this is not yet known, the HA addon creates a temporary random mapping between friendly names and serials and corrects it later, when a real infrastructure event is fired. So yes, it can happen that for the first 12-24 hours the mapping will not be correct, but at least the addon will work and display data, although panels might get mixed up (very often they don't, as IDs are sequential).

Having the information about power and infrastructure in two separate events/messages is perfectly fine - the event structure is well defined, and the information is split the way any relational database would like it normalized. What you propose is to put everything into the power report, even though there is always some higher layer above taptap which would do that correlation much better - especially since serials, addresses and versions tend to change very rarely, and there is no reason to emit them with every power event. Such duplication is unnecessary - when any infrastructure information is received (which happens much more rarely), a dedicated infrastructure event is fired. I do not need to parse every power report to see whether the serial changed (it almost never does).
Do you see why I prefer the separation into two independent events? Especially when there is always some layer above taptap which by its nature handles data parsing/enhancement/storing/indexing very well - that is its primary functionality. Put all those redundant, rarely changing fields (addresses, versions, serials) into the power report and the same data will be emitted thousands of times: the same gateway address and version for every node, the same barcode for a given node, again and again. To me this is completely wrong. What is your argument against having it split? That the user needs to do the correlation? The user will not be reading taptap output, some application will. And it can correlate the output if needed - or do anything else - that is up to the given use case.
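A minimal sketch of what that higher-layer correlation could look like in a consuming application (hypothetical types and field names for illustration only; this is not taptap's API):

```rust
use std::collections::HashMap;

// Hypothetical, simplified shapes of the two event kinds discussed above.
struct InfrastructureNode {
    node_id: u16,
    barcode: Option<String>,
}

struct PowerReport {
    node_id: u16,
    voltage_in: f64,
}

/// Cache of node ID -> barcode, refreshed whenever an infrastructure event arrives.
#[derive(Default)]
struct Correlator {
    barcodes: HashMap<u16, String>,
}

impl Correlator {
    /// Remember the latest identification facts from an infrastructure event.
    fn on_infrastructure(&mut self, nodes: &[InfrastructureNode]) {
        for node in nodes {
            if let Some(barcode) = &node.barcode {
                self.barcodes.insert(node.node_id, barcode.clone());
            }
        }
    }

    /// Attach the last known barcode (if any) to a power report.
    fn label(&self, report: &PowerReport) -> (Option<&str>, f64) {
        (
            self.barcodes.get(&report.node_id).map(String::as_str),
            report.voltage_in,
        )
    }
}
```

The same pattern applies whether the consumer is the HA addon, an MQTT bridge, or a Logstash/Fluentd pipeline.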
I would maybe also suggest completing the implementation of the Topology Report in observer.rs. Those messages are received during the day, and it is my understanding that they also include PV node addresses, which could be converted to barcodes/serials. This would increase the number of messages which can initialize/update serial numbers.
Of course. I spent the last couple days preparing for travel this week and haven't been able to respond to the software topics. I'll be away through Friday but expect to find time to continue discussing them here while I'm away. I think we could get this branch in a state where I'd be happy to merge and cut a release this coming weekend. FWIW I do not intend to unilaterally merge this branch or to release changes derived from yours without coordinating with you. You made these changes to better serve your users, and while I don't want to upstream your changes as-is, I also don't want to harm your users.
Thank you for your reply. I would also prefer to incorporate all the required changes into your upstream. I think we currently have a single open topic: where to put the addresses/serials/other detailed information. I would like to keep the current 'infrastructure event' as it is. It could also later be expanded with other information received by taptap (i.e. node mesh topology etc.). Combining all those data together in the power event doesn't seem right to me. Putting my architect hat on, I would still keep the infrastructure data separate, as I don't see taptap as an end-to-end monitoring tool, but rather as the Tigo data access layer, with the higher functions separated into other software, in the best implementation of the Unix philosophy:

* Write programs that do one thing and do it well.
* Write programs to work together.
* Write programs to handle text streams, because that is a universal interface.
The first two points exactly match our discussion so far. I know I am a lousy programmer, but I am an IT architect by profession, and I would strongly prefer to just follow software architecture proven over the years.
Something like this could/should work....
Last comment. I started to backport your changes in this PR back to my fork so I can release it ASAP. It seems that you felt the need to rename persistent_file into state_file in your modifications of my original PR. But that is IMHO also wrong. This file has nothing to do with the state of the taptap process. State files on Linux reflect the state of the process, not its persistent data. State files are NOT data files. On the other hand, a file storing the persistent data of the process could be called persistent-file, persistent-data, data-file, etc. The funny thing is that my MQTT wrapper ALREADY had a state_file option - where it writes exactly the information about its process state. So if this is not changed, I will have two state-file options, even though only one of them is used correctly and the other is not. Could you please propose some other name, to avoid further breaking my other work?
It seems like there are three main topics here:

* Scope
* Output identifiers
* Other events
These are interrelated, but I'll try to articulate my thoughts in order.

Scope
For example, the protocol has no notion of realtime clocks. Instead, each gateway autonomously increments a slot clock counter which it communicates to the CCA. I believe addressing warrants a similar approach. Gateway IDs and node IDs are always available in the protocol, while hardware addresses and barcodes are not always known.

Output identifiers

The UNIX philosophy produced no end of UNIX tools which repeat portions of their output when such repetition is meaningful. Consider two versions of a tool: one which repeats the identifying columns on every line, and one which prints them only when they change. The second version is clearly less usable than the first because it requires the consuming application to keep track of values across lines.

Turning to time series databases like InfluxDB and Prometheus, these systems all expect label repetition, accepting input data in which every sample carries its full set of identifying labels. These systems partition data into a two-level hierarchy: timestamped data points are densely packed into labeled collections of such points. Labels like these gateway and node IDs are part of the collection and are inherently deduplicated. It's cheap to add a data point, it's cheap to have many labels, it's cheap to query across combinations of various labels, but it's difficult or impossible to do operations analogous to SQL joins.
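For instance, Prometheus-style input in which every sample carries its full label set might look roughly like this (hypothetical metric and label names):

```
tigo_node_voltage_in{gateway_id="4660",gateway_mac="…",node_id="13",node_barcode="…"} 44.25
tigo_node_voltage_in{gateway_id="4660",gateway_mac="…",node_id="14",node_barcode="…"} 43.9
```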
In reply to:

It's much easier to remove identifiers which aren't needed than to add identifiers which are needed. I do not understand why you see repeated identifiers as a negative here. If you're storing the output in a time series database, repeated identifiers are exactly what those systems expect.

Other events

I assumed you had a purpose for that event beyond just decoding power reports, which is why I asked what it was. Showing a table of hardware in a dashboard makes sense. I agree you should have an event for that, I agree it makes sense to emit this event as facts are learned, and I agree it makes sense to emit this event on startup based on persistent data.

It also makes sense that you'd want to retrospectively assign a MAC address/barcode to a previously unidentified node's data if you have a data store which supports that operation.

Circling back to identification and addressing challenges, this is a classic database problem. The basic challenge is that we're trying to use gateway and node IDs as natural keys, but these attributes turn out to be mutable, and mutable natural keys cause no end of hardship. The database solution is to use surrogate keys instead. So: what if taptap assigned each gateway and node a surrogate key - a UUID that stays stable even as the underlying link-layer IDs and addresses change?

Some event sketches:

```json
{
  "discovery": {
    "gateways": [
      {
        "id": "b6412389-5c9d-49c9-be6c-c920acc09f99",
        "mac": null,
        "version": null,
        "nodes": [
          {
            "id": "a44497d3-4d40-4eb0-8682-596a03e4eccc",
            "mac": null,
            "barcode": null,
            "version": null
          }
        ]
      }
    ]
  }
}
```

```json
{
  "node_identity": {
    "id": "a44497d3-4d40-4eb0-8682-596a03e4eccc",
    "mac": "…",
    "barcode": "…"
  }
}
```

```json
{
  "power_report": {
    "gateway": {
      "id": "b6412389-5c9d-49c9-be6c-c920acc09f99",
      "mac": null
    },
    "node": {
      "id": "a44497d3-4d40-4eb0-8682-596a03e4eccc",
      "mac": null,
      "barcode": null
    },
    "timestamp": "…",
    "voltage_in": 44.25,
    "voltage_out": 40.8,
    "current": 5.8,
    "dc_dc_duty_cycle": 0.9411764705882353,
    "temperature": 24.0,
    "rssi": 134
  }
}
```
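One way such surrogate-key assignment could work, sketched with assumed stand-in types and the `uuid` crate (a `u16` stands in for the node ID and an 8-byte array for the long address; this is not taptap's actual implementation):

```rust
use std::collections::HashMap;
use uuid::Uuid; // uuid = { version = "1", features = ["v4"] }

/// Stable surrogate keys for nodes, indexed by whatever natural identity is known so far.
#[derive(Default)]
struct SurrogateKeys {
    by_long_address: HashMap<[u8; 8], Uuid>,
    by_node_id: HashMap<u16, Uuid>,
}

impl SurrogateKeys {
    /// Return the surrogate key for a node, creating one the first time it is seen.
    /// Once the long address becomes known, the key is linked to it, so the identity
    /// stays stable even if the node ID later changes.
    fn key_for(&mut self, node_id: u16, long_address: Option<[u8; 8]>) -> Uuid {
        if let Some(addr) = long_address {
            if let Some(&key) = self.by_long_address.get(&addr) {
                self.by_node_id.insert(node_id, key);
                return key;
            }
        }
        let key = *self.by_node_id.entry(node_id).or_insert_with(Uuid::new_v4);
        if let Some(addr) = long_address {
            self.by_long_address.insert(addr, key);
        }
        key
    }
}
```

In such a scheme the map itself would live in the persistent/state file, so the surrogate keys survive restarts and can later be re-linked once a node's long address or barcode is learned.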
In reply to:

I propose denormalizing identifiers into power reports because these identifiers can and do change over time, meaning a normalized alternative would require very careful handling of the time component to be correct. There is not a higher layer which can understand how these identifiers change better than taptap itself.

Would events like these address your use cases?
I named it "state file" because it is a file that contains state in the
This in event.rs does not produce the required
Please do not do this. First, you complicate stuff. Second, you need to generate the UUID from something. If you have the long_address/serial, you do not need a UUID. If you don't have it yet, you have nothing to generate a (not random but consistent) UUID from.

As to the proposed structure of the events, it again just complicates everything. It is much better to always print the up-to-date, complete structure of the persistent data instead of providing partial updates; it doesn't matter what the trigger for the persistent data update was. If you say that it is always 'easier to delete' something than to not have it, then it would be better to print the complete topology data.

I have the feeling that the more we discuss this issue, the less usable the solution becomes. It is already so much easier to just maintain my fork of taptap and forget about the whole PR-back thing. I really feel bad about where this all went. Before, you were not even fixing bugs. Then you gutted my work from the PR, effectively removed me from the contributors, changed everything to your liking, broke compatibility with something people really use, and now you are proposing changes that make it harder for it to work again. I do not understand why you keep asking me to comment on things if you always do it your way.
What AI thinks: Why it wasn’t working |
Actually, to be compatible with the previous format, it needs to be:
Your proposed changes in this PR generate an error if the state-file arg is used but the file does not yet exist and is created on the first infrastructure event:

EDIT: the following error is actually raised always, even though the state-file is being successfully updated.
EDIT: it seems I was able to trace the error to the case when a relative path is used for state-file, like:
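Independent of what the actual cause turns out to be, one defensive way to handle a possibly relative, possibly not-yet-existing state-file path is to anchor it to the current working directory up front (a sketch, not taptap's code):

```rust
use std::env;
use std::io;
use std::path::{Path, PathBuf};

/// Turn a user-supplied state-file path into an absolute one without requiring the
/// file to exist yet (std::fs::canonicalize would fail on a missing file).
fn resolve_state_file(arg: &Path) -> io::Result<PathBuf> {
    let path = if arg.is_absolute() {
        arg.to_path_buf()
    } else {
        env::current_dir()?.join(arg)
    };
    // Ensure the parent directory exists so the first write can succeed.
    if let Some(parent) = path.parent() {
        std::fs::create_dir_all(parent)?;
    }
    Ok(path)
}
```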
I took @litinoveweedle's changeset from #19, rewrote git history to make the following changes, and opened this PR to better review and further discuss them:
* build.yml changes into one commit, since it was lots of noise
* edition = 2025, since there is no such Rust edition