[OpenSTA] Updated SDF Generation with Internal Registers #3230
Fixed an issue with how the post-implementation SDF annotations are generated for black boxes with internal registers on the input or output ports (but not both).
When neither the input nor the output ports are registered, the circuit is fully combinatorial. This behavior is unchanged.
When both the input and output ports are registered, the input ports have setup and hold constraints and the output ports have clock-to-Q delays. This did not annotate correctly for multi-bit ports, so I updated the code to apply the annotations per pin in the port.
When the input port is registered and the output port is not, the input ports have setup and hold constraints; however, the output ports originally did not have clock-to-Q delays. In a black-box model, the output ports would have a clock-to-Q delay equal to the clock-to-Q delay of the input ports plus the combinatorial delays from the input ports to the output ports. I have added this logic to the SDF generation code to include these delays.
When the output port is registered and the input port is not, the output ports have clock-to-Q delays (as normal); however, the input ports originally were not given setup or hold constraints (these were applied to the output ports because of how the architecture file traditionally defines these ports). For a black box, the input ports would have setup and hold constraints; there is just some delay before the timing arc reaches the registers. Added the proper setup and hold constraints and applied port delays on the input ports equal to their propagation delay to the output ports.
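The four cases described above can be summarized in a small C++ sketch. This is not the code from this PR: the struct names, field names, and the `annotate` function are hypothetical, and it assumes a single scalar value per timing quantity rather than per-pin min/max values.

```cpp
// Hypothetical summary of one black-box primitive's timing. The names are
// illustrative only and are not taken from the VPR source.
struct BlackBoxTiming {
    bool inputs_registered;
    bool outputs_registered;
    double t_setup;       // setup time of the internal registers
    double t_hold;        // hold time of the internal registers
    double t_clock_to_q;  // clock-to-Q delay of the internal registers
    double t_comb;        // combinatorial delay from input port to output port
};

// What would be annotated on the ports of the black box (assumed layout).
struct PortAnnotations {
    bool has_setup_hold = false;
    double input_setup = 0.0;
    double input_hold = 0.0;
    double input_port_delay = 0.0;
    bool has_clock_to_q = false;
    double output_clock_to_q = 0.0;
    bool has_comb_iopath = false;
    double comb_iopath = 0.0;
};

PortAnnotations annotate(const BlackBoxTiming& t) {
    PortAnnotations a;
    if (!t.inputs_registered && !t.outputs_registered) {
        // Fully combinatorial: only an input-to-output delay.
        a.has_comb_iopath = true;
        a.comb_iopath = t.t_comb;
        return a;
    }
    // Every registered case puts setup/hold on the inputs and
    // a clock-to-Q delay on the outputs.
    a.has_setup_hold = true;
    a.input_setup = t.t_setup;
    a.input_hold = t.t_hold;
    a.has_clock_to_q = true;
    if (t.inputs_registered && !t.outputs_registered) {
        // Registers sit at the inputs: the output sees the register's
        // clock-to-Q plus the combinatorial delay that follows it.
        a.output_clock_to_q = t.t_clock_to_q + t.t_comb;
    } else if (!t.inputs_registered && t.outputs_registered) {
        // Registers sit at the outputs: the input carries a port delay
        // equal to its propagation delay to the registers.
        a.output_clock_to_q = t.t_clock_to_q;
        a.input_port_delay = t.t_comb;
    } else {
        // Both registered: plain setup/hold and clock-to-Q.
        a.output_clock_to_q = t.t_clock_to_q;
    }
    return a;
}
```

For example, with a clock-to-Q of 0.5 and a combinatorial delay of 1.0, the input-registered case would report an output clock-to-Q of 1.5, while the output-registered case would report 0.5 together with an input port delay of 1.0.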
// will have setup and hold constraints. This pin will absorb
// the maximum setup and hold constraints of all paths. To
// account for the internal delays of the black-box, we add
// port delays equal to the max/min path delays to registers.
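As a rough illustration of what the comment in this hunk describes, the per-pin aggregation might look like the sketch below. The types `PathToRegister` and `InputPinConstraints` and the function `summarize` are hypothetical names introduced here, not identifiers from the PR, and the sketch assumes each internal path from the input pin to a register is already known with its own delay, setup, and hold values.

```cpp
#include <algorithm>
#include <vector>

// One internal path from an unregistered input pin to a register
// (hypothetical representation).
struct PathToRegister {
    double delay;    // propagation delay from the pin to the register
    double t_setup;  // setup time of the register at the end of the path
    double t_hold;   // hold time of the register at the end of the path
};

// Constraints that the input pin would absorb.
struct InputPinConstraints {
    double t_setup;         // maximum setup over all paths
    double t_hold;          // maximum hold over all paths
    double min_port_delay;  // shortest path delay to a register
    double max_port_delay;  // longest path delay to a register
};

// Assumes `paths` is non-empty.
InputPinConstraints summarize(const std::vector<PathToRegister>& paths) {
    const PathToRegister& first = paths.front();
    InputPinConstraints c{first.t_setup, first.t_hold, first.delay, first.delay};
    for (const PathToRegister& p : paths) {
        c.t_setup = std::max(c.t_setup, p.t_setup);
        c.t_hold = std::max(c.t_hold, p.t_hold);
        c.min_port_delay = std::min(c.min_port_delay, p.delay);
        c.max_port_delay = std::max(c.max_port_delay, p.delay);
    }
    return c;
}
```

Taking the maximum of both setup and hold is the conservative choice when a single pin has to stand in for several internal registers, and the min/max port delays preserve the spread of the internal paths for hold and setup analysis respectively.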
Create a bolded comment explaining assumptions on the timing graph within primitive blocks. -Peter
An alternative way to achieve this is to have a combinational delay from the input port to the (registered) output port, which has a uTsu and uTh as well as a uTco. That's how they are defined in the architecture file, and how Tatum models and analyzes them.
Can Liberty models do the same thing? If so, you could consider just using that syntax, as it is closer to what VPR does internally. If not, I would explicitly write a comment explaining that you are converting from one style to the other.
See https://docs.verilogtorouting.org/en/latest/tutorials/arch/timing_modeling/ for some examples of the modeling I'm talking about.
At a high level, this looks ready to go. Recommend adding an additional reviewer to obtain good feedback on the data structure changes.
With these changes, does OpenSTA match Tatum for analysis of these types of blocks? If you haven't already, I recommend checking on one architecture with the three different modes. Two of them exist on RAMs (but the inputs are always registered); to check unregistered inputs with registered outputs, you'll need to use a DSP block or some such.
Question about the necessity of modeling this way, vs. the Tatum style (see above).