How should the {si} package work? / Roadmap ideas #5

@Robinlovelace

I'm looking for feedback from anyone with experience of spatial interaction models (SIMs) on the following:

  • How can we avoid reinventing the wheel? The aim is for modelling functions in {spflow}, {gravity} and other packages to be easy to use through si_predict(). The plan is to add these packages to Suggests and to put examples that use them into articles/vignettes.
  • What additional functionality would be most useful? Currently the main function, si_to_od(), is focussed on pre-processing: it creates an 'analysis ready' (and modelling ready) data frame with all the variables from origins and destinations you could need.
    • Functions like si_model_exponential_decay() and si_model_power(), so that people can get started quickly without having to define their own functions
    • Implementation of the radiation model, previously implemented in {stplanr} and in scikit-mobility
    • More example datasets?
  • Tidy or standard evaluation?
  • Anything else?
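
For reference, the functional forms named in the list above are commonly written as shown below. This is only a sketch under assumed notation, not something fixed by {si}: d_ij is the distance between origin i and destination j, beta is a decay parameter, m_i and n_j are the origin and destination populations, s_ij is the population within distance d_ij of i (excluding i and j themselves), and T_i is the total outflow from i. Sign and scaling conventions for beta vary between papers.

```latex
\begin{align}
f_{\text{exp}}(d_{ij}) &= e^{-\beta d_{ij}} && \text{(exponential decay)} \\
f_{\text{pow}}(d_{ij}) &= d_{ij}^{-\beta} && \text{(power decay)} \\
T_{ij} &= T_i \, \frac{m_i \, n_j}{(m_i + s_{ij})(m_i + n_j + s_{ij})} && \text{(radiation model)}
\end{align}
```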

Currently (2022-04-22) the function used to predict interaction is called si_predict() and works like this:

https://github.com/Robinlovelace/si/blob/d9ae80e683b316d619f3a8843f2a7d138c7d3b1f/README.qmd#L40-L53

That is likely to change to a tidy-eval framework in #10.

Previous questions (now mostly answered) related to this:

  • Should it be called si_predict(), perhaps with another function e.g. called si_train() to train models (constrained/unconstrained)?
    • Yes, now implemented
  • Should the first argument of the function passed to the fun argument be an od object? (I'm currently thinking not, as that argument is already in si_model(); heads up @Nowosad.)
  • How should custom SI prediction functions, e.g. si_gravity(), work? I'm thinking they should be as simple as possible, enabling commands such as si_predict(od, fun = si_gravity(m = origins_population, n = destinations_population, distance = distance_euclidean))
  • Related to the previous question, should we use tidy evaluation (it is currently used with var_p)?
    • Implemented, now constraint_p
  • More broadly, which conventions should we follow for the symbols used in SIM equations? For example, Wilson's 1979 paper uses w_1/w_2 throughout, while some more recent papers (e.g. Simini's 2012 paper) use m/n.
    • Going with notation in Dennett's 2018 paper
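
One way the call style floated above for custom prediction functions could work is a function factory that captures bare column names and returns a function of the OD data frame. The sketch below uses base R only. The names gravity_sketch() and predict_sketch(), the beta default, and the toy od data frame are all hypothetical and are not taken from the {si} source; the column names are the ones used in the example call in the question.

```r
# Hypothetical sketch: none of these definitions come from the {si} package.

# Function factory: captures bare column names and returns a function of `od`.
gravity_sketch <- function(m, n, distance, beta = 0.3) {
  # substitute() keeps the arguments unevaluated, so callers can pass
  # bare column names instead of strings
  cols <- sapply(
    list(substitute(m), substitute(n), substitute(distance)),
    deparse
  )
  function(od) {
    # Unconstrained gravity form with exponential distance decay
    od[[cols[1]]] * od[[cols[2]]] * exp(-beta * od[[cols[3]]])
  }
}

# Applies a prediction function to an OD data frame and stores the result
predict_sketch <- function(od, fun, output_col = "interaction") {
  od[[output_col]] <- fun(od)
  od
}

# Toy OD data frame (made-up values, for illustration only)
od <- data.frame(
  O = c("a", "a", "b"),
  D = c("b", "c", "c"),
  origins_population = c(100, 100, 50),
  destinations_population = c(50, 20, 20),
  distance_euclidean = c(1, 2, 3)
)

res <- predict_sketch(
  od,
  fun = gravity_sketch(
    m = origins_population,
    n = destinations_population,
    distance = distance_euclidean
  )
)
```

With this design the od object is passed once, to the predicting function, and the model function only needs to know which columns to use. A tidy-eval implementation would likely replace substitute() with quosures, but the calling pattern would look the same.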
