1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -16,6 +16,7 @@
- Process big csv files in chunks (/!\ breaking changes) [#159](https://github.com/datagouv/csv-detective/pull/159)
- Handle column named `count` in profile creation [#171](https://github.com/datagouv/csv-detective/pull/171)
- Prevent `NaN` in headers [#173](https://github.com/datagouv/csv-detective/pull/173)
+- Refactor formats structure in the repo (/!\ breaking changes) [#176](https://github.com/datagouv/csv-detective/pull/176)

## 0.9.2 (2025-08-26)

9 changes: 5 additions & 4 deletions README.md
@@ -2,7 +2,7 @@

This is a package to **automatically detect column content in tabular files**. The script reads either the whole file or the first few rows and performs various checks (regex, casting, comparison with official lists...) to see for each column if it matches with various content types.

-Currently supported file types: csv, xls, xlsx, ods.
+Currently supported file types: csv(.gz), xls, xlsx, ods.

You can also directly feed the URL of a remote file (from data.gouv.fr for instance).
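
The three kinds of checks the README mentions (regex, casting, comparison with official lists) can be illustrated with a minimal sketch. This is not the package's implementation: the function names, the `CHECKS` mapping and the small region-code subset below are all hypothetical, chosen only to show how a per-column score might be computed before keeping the best-scoring format.

```python
import re

# Hypothetical, partial reference list; a real check would load an official list.
KNOWN_REGION_CODES = {"11", "24", "27", "28", "32"}

POSTAL_CODE_RE = re.compile(r"\d{5}")


def looks_like_postal_code(value: str) -> bool:
    """Regex check: exactly five digits."""
    return POSTAL_CODE_RE.fullmatch(value) is not None


def looks_like_float(value: str) -> bool:
    """Casting check: the value can be converted to a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def looks_like_region_code(value: str) -> bool:
    """List check: the value belongs to a reference list."""
    return value in KNOWN_REGION_CODES


# Hypothetical registry mapping a format name to its check function.
CHECKS = {
    "postal_code": looks_like_postal_code,
    "float": looks_like_float,
    "region_code": looks_like_region_code,
}


def score_column(values):
    """Return, for each format, the share of non-empty values passing its check."""
    non_empty = [v.strip() for v in values if v and v.strip()]
    if not non_empty:
        return {name: 0.0 for name in CHECKS}
    return {
        name: sum(check(v) for v in non_empty) / len(non_empty)
        for name, check in CHECKS.items()
    }


if __name__ == "__main__":
    print(score_column(["75001", "13002", "abc"]))
```

In this sketch a column's candidate format would simply be the key with the highest score; how the package itself weighs or thresholds scores is not shown in this diff.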

@@ -34,15 +34,16 @@ inspection_results = routine(
num_rows=-1, # Value -1 will analyze all lines of your file, you can change with the number of lines you wish to analyze
save_results=False, # Default False. If True, it will save result output into the same directory as the analyzed file, using the same name as your file and .json extension
output_profile=True, # Default False. If True, returned dict will contain a property "profile" indicating profile (min, max, mean, tops...) of every column of you csv
-output_schema=True, # Default False. If True, returned dict will contain a property "schema" containing basic [tableschema](https://specs.frictionlessdata.io/table-schema/) of your file. This can be use to validate structure of other csv which should match same structure.
+output_schema=True, # Default False. If True, returned dict will contain a property "schema" containing basic [tableschema](https://specs.frictionlessdata.io/table-schema/) of your file. This can be use to validate structure of other csv which should match same structure.
+tags=["fr"], # Default None. If set as a list of strings, only performs checks related to the specified tags (you can see the available tags with FormatsManager().available_tags())
)
```

## So What Do You Get ?

### Output

-The program creates a `Python` dictionnary with the following information :
+The program creates a `python` dictionnary with the following information :

```
{
@@ -185,7 +186,7 @@ Only the format with highest score is present in the output.
## Improvement suggestions

- Smarter refactors
-- Improve performances
+- Performances improvements
- Test other ways to load and process data (`pandas` alternatives)
- Add more and more detection modules...

1 change: 0 additions & 1 deletion csv_detective/detect_fields/FR/README.md

This file was deleted.

Empty file.
Empty file.

This file was deleted.

9 changes: 0 additions & 9 deletions csv_detective/detect_fields/FR/geo/code_fantoir/__init__.py

This file was deleted.

9 changes: 0 additions & 9 deletions csv_detective/detect_fields/FR/geo/code_postal/__init__.py

This file was deleted.

10 changes: 0 additions & 10 deletions csv_detective/detect_fields/FR/geo/code_region/__init__.py

This file was deleted.

16 changes: 0 additions & 16 deletions csv_detective/detect_fields/FR/geo/departement/__init__.py

This file was deleted.

19 changes: 0 additions & 19 deletions csv_detective/detect_fields/FR/geo/latitude_l93/__init__.py

This file was deleted.

This file was deleted.

19 changes: 0 additions & 19 deletions csv_detective/detect_fields/FR/geo/longitude_l93/__init__.py

This file was deleted.

This file was deleted.

16 changes: 0 additions & 16 deletions csv_detective/detect_fields/FR/geo/pays/__init__.py

This file was deleted.

Empty file.