Description
The picks.csv output from PhaseNet contains some clunky formatting that requires the user to perform several string manipulations to properly format the itp, tp_prob, its, ts_prob columns.
I will show an example of reading the CSV with pandas, although reading it with the csv package runs into the same formatting issues. I will also share the function I wrote to correctly format the entries.
Pandas
import pandas as pd
df = pd.read_csv('output/picks.csv')
The result is a dataframe containing strings in the itp, tp_prob, its, ts_prob columns.
print(df['itp'][0])
>>> '[ 1 6620 8114]'
print(df['ts_prob'][0])
>>> '[ 0.11291095 0.31720835 0.06021817]'
The values are not uniformly separated either, so splitting on a single space with str.split(' ') leaves empty strings that break the conversion to a list. Ideally, the CSV would contain a uniform, comma-separated list of values. Another solution would be to also save a pickle file to the output directory that contains the lists in object form.
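To illustrate the kind of output I have in mind, here is a minimal sketch of a comma-separated round trip. It is not PhaseNet's code: the small dataframe, the choice of JSON as the list encoding, and the in-memory buffer standing in for picks.csv are all assumptions made for the example.

```python
import io
import json

import pandas as pd

# Hypothetical picks table holding real Python lists (made-up values).
picks = pd.DataFrame({
    'itp': [[1, 6620, 8114], []],
    'tp_prob': [[0.5, 0.25, 0.125], []],
})
list_cols = ['itp', 'tp_prob']

# Encode each list as a JSON string, e.g. "[1, 6620, 8114]".
out = picks.copy()
for col in list_cols:
    out[col] = out[col].apply(json.dumps)

# Write to an in-memory buffer that stands in for output/picks.csv.
buffer = io.StringIO()
out.to_csv(buffer, index=False)

# Read it back, decoding the list columns while parsing.
buffer.seek(0)
restored = pd.read_csv(buffer, converters={col: json.loads for col in list_cols})

print(restored['itp'][0])
```

With this layout the reader gets lists back without any string manipulation. For the pickle alternative, pandas provides DataFrame.to_pickle and pd.read_pickle, which keep the lists as objects.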
To fix the formatting with the current picks.csv, I made the following function:
import shlex
import pandas as pd

df = pd.read_csv('output/picks.csv')

def pickConverter(df):
    # Pick indices: parse each bracketed string into a list of ints.
    for col in ['itp', 'its']:
        pick_entry_list = []
        for x in range(len(df)):
            try:
                pick_entry_list.append(list(map(int, shlex.split(df[col][x].strip('[]')))))
            except AttributeError:
                # Non-string entries (e.g. NaN) become empty lists.
                pick_entry_list.append([])
        df[col] = pick_entry_list
    # Pick probabilities: same parsing, but as lists of floats.
    for col in ['tp_prob', 'ts_prob']:
        prob_entry_list = []
        for x in range(len(df)):
            try:
                prob_entry_list.append(list(map(float, shlex.split(df[col][x].strip('[]')))))
            except AttributeError:
                prob_entry_list.append([])
        df[col] = prob_entry_list
    return df
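A more compact variant of the same idea is sketched below, in case it is useful to others. It relies on the converters argument of pd.read_csv and on str.split() with no arguments, which splits on any run of whitespace and so tolerates the uneven spacing. The sample text, the fname column, and the parse_array helper are assumptions for illustration, not actual PhaseNet output.

```python
import io

import pandas as pd

# Hypothetical sample imitating the bracketed, unevenly spaced format.
SAMPLE = (
    "fname,itp,tp_prob,its,ts_prob\n"
    "a.npz,[   1 6620 8114],[ 0.5  0.25 0.125],[ 7000],[ 0.75]\n"
    "b.npz,[],[],[],[]\n"
)

def parse_array(cast):
    """Build a converter that turns '[ 1  2 3]' into a list of cast(value)."""
    def convert(text):
        # Drop the brackets, then split on any amount of whitespace.
        return [cast(token) for token in text.strip().strip('[]').split()]
    return convert

converters = {
    'itp': parse_array(int),
    'its': parse_array(int),
    'tp_prob': parse_array(float),
    'ts_prob': parse_array(float),
}

# io.StringIO stands in for 'output/picks.csv' here.
picks = pd.read_csv(io.StringIO(SAMPLE), converters=converters)

print(picks['itp'][0])
print(picks['ts_prob'][0])
```

Because the conversion happens while the file is parsed, no second pass over the dataframe is needed, and empty entries come back as empty lists.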