fixup openfmri datasets metadata #17
Open
Currently (some might have been fixed upstream) we hit the following gotchas while parsing metadata from openfmri datasets (before enabling any custom parsers, just the bids parser):
- 9 - datalad/datalad@2a8dd26, part of datalad#2151 ("Various bugfixes addressing some corner cases while parsing datasets.datalad.org with new extractors")
[INFO ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000009
[WARNING] Failed to load participants info due to: 'ascii' codec can't encode character u'\u2019' in position 72: ordinal not in range(128) [csv.py:next:108]. Skipping the rest of file
- 30 - same as 9. FYI: took 7:10.73 (about 7 min) to aggregate! Compressed sizes: 223kB for ds- and 128kB for cn-
[INFO ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000030
[ERROR ] Failed to get dataset metadata (bids): 'ascii' codec can't decode byte 0xe2 in position 3325: ordinal not in range(128) [ascii.py:decode:26]
[ERROR ] Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately) [aggregate_metadata(/mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000030)]
aggregate_metadata(error): /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000030 [Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately)]
[INFO ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000053
[ERROR ] Failed to get dataset metadata (bids): 'ascii' codec can't decode byte 0xc2 in position 1188: ordinal not in range(128) [ascii.py:decode:26]
[ERROR ] Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately) [aggregate_metadata(/mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000053)]
aggregate_metadata(error): /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000053 [Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately)]
- 117
[INFO ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000117
[ERROR ] Failed to get dataset metadata (bids): 'ascii' codec can't decode byte 0xe2 in position 1585: ordinal not in range(128) [ascii.py:decode:26]
[ERROR ] Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately) [aggregate_metadata(/mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000117)]
aggregate_metadata(error): /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000117 [Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately)]
- 140 - only because the README is large, since it includes the output of bids-validator. Doing nothing about that for now
[INFO ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000140
[INFO ] Removed metadata field(s) due to blacklisting and max size settings: set(['description'])
- 164
[INFO ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000164
[ERROR ] Failed to get dataset metadata (bids): 'ascii' codec can't decode byte 0xc3 in position 1244: ordinal not in range(128) [ascii.py:decode:26]
[ERROR ] Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately) [aggregate_metadata(/mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000164)]
aggregate_metadata(error): /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000164 [Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately)]
- 201 - well, no WARNING now but see https://github.com/datalad/datalad/issues/2152
[INFO ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000201
[WARNING] Failed to load participants info due to: "delimiter" must be string, not unicode [csv.py:__init__:79]. Skipping the rest of file
- 214
[INFO ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000214
[ERROR ] Failed to get dataset metadata (bids): 'ascii' codec can't decode byte 0xe2 in position 1412: ordinal not in range(128) [ascii.py:decode:26]
[ERROR ] Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately) [aggregate_metadata(/mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000214)]
aggregate_metadata(error): /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000214 [Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately)]
- 216
[INFO ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000216
[ERROR ] Failed to get dataset metadata (bids): 'ascii' codec can't decode byte 0xc3 in position 1232: ordinal not in range(128) [ascii.py:decode:26]
[ERROR ] Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately) [aggregate_metadata(/mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000216)]
aggregate_metadata(error): /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000216 [Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately)]
- 218 - it is in a somewhat screwy state... For now I manually unannexed and git-added the top-level text files, and fixed the participants.tsv header so it has no trailing tab
[INFO ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000218
[WARNING] Could not determine file-format, assuming TSV
- 221 - a unicode whitespace character was used to separate fields in Authors. Sent a patch upstream as well
[INFO ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000221
[ERROR ] Failed to get dataset metadata (bids): No JSON object could be decoded [decoder.py:raw_decode:382]
[ERROR ] Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately) [aggregate_metadata(/mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000221)]
aggregate_metadata(error): /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000221 [Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately)]
- 223 - just a single column in participants.tsv -- useless
[INFO ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000223
[WARNING] Could not determine file-format, assuming TSV
- 224
[INFO ] Aggregate metadata for dataset /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000224
[ERROR ] Failed to get dataset metadata (bids): 'ascii' codec can't decode byte 0xe2 in position 1805: ordinal not in range(128) [ascii.py:decode:26]
[ERROR ] Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately) [aggregate_metadata(/mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000224)]
aggregate_metadata(error): /mnt/btrfs/datasets-meta6-1-redo1/datalad/crawl/openfmri/ds000224 [Metadata extraction failed (see previous error message, set datalad.runtime.raiseonerror=yes to fail immediately)]
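
Most of the failures above are 'ascii' codec errors, plus the trailing tab in the ds000218 participants.tsv header. As a rough illustration only (this is not the datalad bids extractor's code), here is a minimal Python 3 sketch that decodes the files as UTF-8 explicitly instead of relying on the default codec, and drops empty trailing header cells. The function names `load_description` and `load_participants` are hypothetical, and `dataset_description.json` is assumed to be the JSON file involved; the logs do not name which file triggered each decode error.

```python
import csv
import json


def load_description(path):
    """Parse a JSON metadata file, decoding it as UTF-8 explicitly.

    Hypothetical helper; the file is assumed to be dataset_description.json.
    """
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def load_participants(path):
    """Parse participants.tsv into a list of dicts, decoding it as UTF-8.

    Hypothetical helper, not part of datalad.
    """
    # newline="" is what the csv module documentation recommends for file objects.
    with open(path, encoding="utf-8", newline="") as f:
        rows = list(csv.reader(f, delimiter="\t"))
    if not rows:
        return []
    header = rows[0]
    # Drop empty trailing header cells left behind by a trailing tab.
    while header and not header[-1].strip():
        header.pop()
    # zip() stops at the shorter sequence, so extra trailing cells in a row are ignored.
    return [dict(zip(header, row)) for row in rows[1:] if row]
```

With explicit decoding, a character such as u'\u2019' in a participants file or a description is returned as ordinary text rather than aborting the parse. This does not help with ds000221, where the JSON itself is invalid and had to be patched.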