Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
90 changes: 90 additions & 0 deletions scripts/apbs-bornprofile-extractDX.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,90 @@
import numpy as np
import pandas as pd
from gridData import Grid

import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
'--input', help='input filename')
parser.add_argument(
'--output', help='output filename. .dx is auto added to name')
parser.add_argument(
'grid', type=float, help='grid interval')
parser.add_argument(
'gridL', type=float, help='loose grid interval')

args = parser.parse_args()

filename = args.input
outname = args.output

# Grid interval
grid_int= args.grid

# Edge offset for Grid
offset = grid_int/2.0

# For losse edge
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

loose

grid_intL = args.gridL
offsetL = grid_intL/2.0

df = pd.read_csv(filename,delim_whitespace=True)
df = df.to_numpy()
Comment on lines +32 to +33
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using pandas, just use the numpy function; pandas isn't a dependency.

(However, if integrated with existing code, you'll already have the data.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, calling a nump array df is confusing.


data = df[:,0:4]
samples = data[:, 0:3]

coord = data[:, 0:3].T
weight = data[:,3]
weight = weight
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove line


# get dimensions
xmin = np.min(data[:,0])
ymin = np.min(data[:,1])
zmin = np.min(data[:,2])
xmax = np.max(data[:,0])
ymax = np.max(data[:,1])
zmax = np.max(data[:,2])

# get grid dimensions
xlen = int((xmax-xmin)/grid_int) + 1
ylen = int((ymax-ymin)/grid_int) + 1
zlen = int((zmax-zmin)/grid_int) + 1

# get edge.
xedge = np.linspace(xmin-offset, xmax+offset, xlen+1)
yedge = np.linspace(ymin-offset, ymax+offset, ylen+1)
zedge = np.linspace(zmin-offset, zmax+offset, zlen+1)
edges = [xedge, yedge, zedge]

# get loose edge.
xedge_L = np.arange(xmin-offsetL, xmax+offsetL, grid_intL)
yedge_L = np.arange(ymin-offsetL, ymax+offsetL, grid_intL)
zedge_L = np.arange(zmin-offsetL, zmax+offsetL, grid_intL)
edges_L = [xedge_L, yedge_L, zedge_L]

# Filling empty points with high value
filling_value = 2* weight.max()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fill_factor should be configurable, default can be fill_factor = 2

Suggested change
filling_value = 2* weight.max()
filling_value = fill_factor * weight.max()

grid = np.full([xlen, ylen, zlen], filling_value)

def get_ind(x, xmin, xint):
ind = int((x-xmin)/xint)
return ind
Comment on lines +71 to +73
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not be necessary with histogramdd (or if you really need to this stuff, look at numpy digitize).


for i, c in enumerate(coord.T):
xind = get_ind(c[0], xmin, grid_int)
yind = get_ind(c[1], ymin, grid_int)
zind = get_ind(c[2], zmin, grid_int)
grid[xind, yind, zind] = weight[i]
Comment on lines +75 to +79
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to use np.histogramdd here — this looks very slow.


# resample
g = Grid(grid, edges=edges)

# Needs GridDataFormat 0.6 or higher
g.interpolation_spline_order = 3
g_L = g.resample(edges_L)

# export
g.export(outname+'.dx')
g_L.export(outname+'-loose.dx')