https://github.com/pypest/pyemu/blob/22c26e96761952fc95845213eb628885e8a7fde8/autotest/utils/to_pestpp.py#L42C1-L45C50
Hi, I’d like to suggest a performance improvement to the following assignments:
flow_df.loc[:,"k"] = flow_df.lay.apply(int) - 1
flow_df.loc[:,"i"] = flow_df.row.apply(int) - 1
flow_df.loc[:,"j"] = flow_df.col.apply(int) - 1
flow_df.loc[:,"wsp"] = flow_df.wsp.apply(int) - 1
These can be rewritten more efficiently using the vectorized pandas method .astype(int):
flow_df["k"] = flow_df["lay"].astype(int) - 1
flow_df["i"] = flow_df["row"].astype(int) - 1
flow_df["j"] = flow_df["col"].astype(int) - 1
flow_df["wsp"] = flow_df["wsp"].astype(int) - 1
.apply(int) calls Python's int() once per element in a Python-level loop, which is much slower than .astype(int), especially for large DataFrames. .astype() performs the conversion as a single bulk operation in optimized, compiled code, so it avoids the per-element call overhead and is generally lighter on memory as well.
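As a rough check, here is a minimal sketch comparing the two approaches. The sample flow_df below is a hypothetical stand-in (string-valued lay/row/col/wsp columns, as if freshly parsed from text), not the data to_pestpp.py actually reads, and the helper names are mine:

```python
import timeit

import pandas as pd

# Hypothetical stand-in for flow_df: 100,000 rows of string-valued indices.
repeats = 25_000
flow_df = pd.DataFrame({
    "lay": ["1", "2", "3", "4"] * repeats,
    "row": ["10", "20", "30", "40"] * repeats,
    "col": ["5", "6", "7", "8"] * repeats,
    "wsp": ["1", "1", "2", "2"] * repeats,
})


def convert_with_apply(df):
    """Current approach: one Python-level int() call per element."""
    out = df.copy()
    out.loc[:, "k"] = out.lay.apply(int) - 1
    out.loc[:, "i"] = out.row.apply(int) - 1
    out.loc[:, "j"] = out.col.apply(int) - 1
    out["wsp"] = out.wsp.apply(int) - 1
    return out


def convert_with_astype(df):
    """Suggested approach: bulk conversion with .astype(int)."""
    out = df.copy()
    out["k"] = out["lay"].astype(int) - 1
    out["i"] = out["row"].astype(int) - 1
    out["j"] = out["col"].astype(int) - 1
    out["wsp"] = out["wsp"].astype(int) - 1
    return out


with_apply = convert_with_apply(flow_df)
with_astype = convert_with_astype(flow_df)

# Both approaches should yield the same zero-based indices.
index_cols = ["k", "i", "j", "wsp"]
results_match = bool(
    (with_apply[index_cols].to_numpy() == with_astype[index_cols].to_numpy()).all()
)
print("results match:", results_match)

# Timings depend on machine and pandas version, so none are quoted here.
t_apply = timeit.timeit(lambda: convert_with_apply(flow_df), number=3)
t_astype = timeit.timeit(lambda: convert_with_astype(flow_df), number=3)
print(f"apply:  {t_apply:.4f} s for 3 runs")
print(f"astype: {t_astype:.4f} s for 3 runs")
```

The sketch only confirms that the two versions produce the same values on this sample input; it would be worth running it against the real input file before changing the script.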