
getNLDAS parallel download  #16

@dmragar


Ran into a couple of problems when retrieving NLDAS data with dopar=True (the default argument).

To troubleshoot, I changed a couple of verbose flags to True that were nested and inaccessible from the main flag. I'm not sure whether these are supposed to inherit the top-level verbose or not:

verbose = False
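
For illustration only (the helper and argument names below are hypothetical, not climatePy's actual internals), the pattern looks roughly like a nested helper that keeps its own verbose = False default and never receives the caller's flag; threading it through would look something like:

# hypothetical sketch: a nested helper keeps its own verbose default
def _fetch(dap_row, verbose=False):
    if verbose:
        print("fetching " + dap_row["URL"])
    return dap_row["URL"]

def top_level(dap_rows, verbose=False):
    # passing verbose=verbose lets the nested call inherit the top-level flag;
    # leaving it off means the nested default (False) always wins
    return [_fetch(row, verbose=verbose) for row in dap_rows]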

The first issue is that the DAP retrieval methods fail silently due to this try/except, where only the URL is returned:

climatePy/climatePy/_dap.py

Lines 748 to 756 in 0b88e13

try:
    if "http" in dap_row["URL"]:
        x = var_to_da(var = get_data(dap_row), dap_row = dap_row)
    else:
        raise Exception("dap_to_local() not avaliable, yet, dataset URL must be in http format")
except Exception as e:
    return dap_row["URL"]

return x

Printing e is a quick fix here:

try:
    if "http" in dap_row["URL"]:
        x = var_to_da(var = get_data(dap_row), dap_row = dap_row)
    else:
        raise Exception("dap_to_local() not avaliable, yet, dataset URL must be in http format")
except Exception as e:
    print(e)
    return dap_row["URL"]

The second issue relates to the Parallel call:

out = Parallel(n_jobs=-1)(delayed(go_get_dap_data)(dap_data.iloc[i].to_dict()) for i in range(len(dap_data)))

Executed by:

dap_data = climatePy.getNLDAS(
    AOI       = AOI,
    varname   = ["dswrfsfc", "cape180_0mb"],
    model     = "FORA0125_H.002",
    startDate = "2010-01-01",
    endDate   = "2010-01-03",
    verbose   = True,
    dopar     = True
)

where all workers return:
Cookie file cannot be read and written: /.urs_cookies_cookies

This issue does not occur when n_jobs is set to 1. I suspect it has something to do with how the EarthData authentication is provided to the server when multiple requests are made simultaneously, as I was able to work around it by ensuring that the .netrc and .dodsrc files are in my home directory and then removing the writeDodsrc() call (as well as the unlink command) in getNLDAS (sketched after the excerpt below):

x = netrc_utils.writeDodsrc()

# get matching arguments for climatepy_filter function
dap_meta = dap.climatepy_dap(
    AOI       = AOI,
    id        = "NLDAS",
    varname   = varname,
    startDate = startDate,
    endDate   = endDate,
    model     = model,
    verbose   = verbose
)
dap_meta['dopar'] = dopar

# need to provide dap_meta dictionary object directly as input
dap_data = dap.dap(**dap_meta)

# unlink Dodsrc file
os.unlink(x)
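
For reference, the workaround amounts to dropping the first and last lines of that excerpt and relying on pre-existing ~/.netrc and ~/.dodsrc files in the home directory; a rough sketch of the modified body (same identifiers as the excerpt above, not a tested patch):

# x = netrc_utils.writeDodsrc()    # removed: rely on ~/.netrc and ~/.dodsrc instead

# get matching arguments for climatepy_filter function
dap_meta = dap.climatepy_dap(
    AOI       = AOI,
    id        = "NLDAS",
    varname   = varname,
    startDate = startDate,
    endDate   = endDate,
    model     = model,
    verbose   = verbose
)
dap_meta['dopar'] = dopar

# need to provide dap_meta dictionary object directly as input
dap_data = dap.dap(**dap_meta)

# os.unlink(x)                     # removed: no temporary Dodsrc file to clean up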

After this, getNLDAS ran smoothly. The parallel download issue might be platform or dependency specific; it occurred for me when installing from main in an env with Python=3.11.4 (macOS 13.2.1, M2).

Lastly, when troubleshooting this I was redirected to a warning:

"The server is temporarily unable to service your request because your IP address has reached the limit of concurrent connections."

So perhaps limiting the number of workers, or exposing it as a user-provided argument, is a good idea (n_jobs is currently set to -1, i.e. "all CPUs" according to the joblib docs):

out = Parallel(n_jobs=-1)(delayed(go_get_dap_data)(dap_data.iloc[i].to_dict()) for i in range(len(dap_data)))
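
A minimal sketch of that suggestion, assuming the worker count is threaded down as an argument (the wrapper name and n_jobs parameter below are hypothetical; go_get_dap_data is the existing helper):

from joblib import Parallel, delayed

def run_dap_parallel(dap_data, n_jobs=4):
    # cap the worker count (or let the caller choose) instead of hard-coding -1,
    # to stay under the server's concurrent-connection limit
    return Parallel(n_jobs=n_jobs)(
        delayed(go_get_dap_data)(dap_data.iloc[i].to_dict())
        for i in range(len(dap_data))
    )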

Happy to submit a PR for problem 1, but the authentication issue certainly needs more digging; I didn't see an obvious cause in _netrc_utils.writeDodsrc(). If this issue is reproducible by others, perhaps the default download method should be non-parallel for now.
