Our First Data Challenge

This is a test: the LSU Interdisciplinary AI-JC is attempting its first hands-on activity.

Distinguishing Ancient Chinese Written Characters

Paper: https://www.nature.com/articles/s41597-024-02933-w
GitHub (Oracle-MNIST): https://github.com/wm-bupt/oracle-mnist

Steps:

Data pre-processing
- Download the data from this repo (credits: Oracle-MNIST team)
- Look at the data (inspect)
- Process the data
Dataset preparation
NN setup
- Brainstorm possible architectures
- Start with the MNIST CNN (LeNet, LeCun)
- Modify it to improve performance
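
As a starting point for the "look at the data" step, the sketch below reads the gzipped image and label files using only the Python standard library. It assumes the files (for example `train-images-idx3-ubyte.gz` and `train-labels-idx1-ubyte.gz`) follow the standard MNIST IDX layout, which their names suggest; the function names `read_idx_images` and `read_idx_labels` are illustrative and are not taken from `utils.py`.

```python
import gzip
import struct

IMAGE_MAGIC = 2051  # magic number of an IDX3 (image) file
LABEL_MAGIC = 2049  # magic number of an IDX1 (label) file


def read_idx_images(path):
    """Return (images, rows, cols); each image is a bytes object of rows*cols pixels."""
    with gzip.open(path, "rb") as f:
        # Header: four big-endian unsigned 32-bit integers.
        magic, count, rows, cols = struct.unpack(">IIII", f.read(16))
        if magic != IMAGE_MAGIC:
            raise ValueError(f"unexpected magic number {magic} in {path}")
        data = f.read(count * rows * cols)
    if len(data) != count * rows * cols:
        raise ValueError(f"{path} is truncated")
    size = rows * cols
    images = [data[i * size:(i + 1) * size] for i in range(count)]
    return images, rows, cols


def read_idx_labels(path):
    """Return the labels as a list of integers."""
    with gzip.open(path, "rb") as f:
        # Header: two big-endian unsigned 32-bit integers.
        magic, count = struct.unpack(">II", f.read(8))
        if magic != LABEL_MAGIC:
            raise ValueError(f"unexpected magic number {magic} in {path}")
        data = f.read(count)
    if len(data) != count:
        raise ValueError(f"{path} is truncated")
    return list(data)
```

In practice one would probably wrap the raw bytes in an array library for plotting and training; plain `bytes` are used here only to keep the sketch dependency-free.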

Results

Our final architecture and results are reported in the images below.

Our final test accuracy (the fraction of correct predictions over the whole test sample) is 90.87%. Pure LeNet gave 80% after 60 epochs.
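
For reference, the accuracy definition used above can be written out in a few lines. This is only an illustration of the metric; the helper name `accuracy` is hypothetical and the notebook may compute it differently.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty sample")
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)


# Three of the four predictions are correct.
print(accuracy([3, 1, 4, 1], [3, 1, 4, 0]))  # 0.75
```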

Observations

Switching from Sigmoid() to ReLU() initially caused the loss to explode to ~10⁴.
- Root cause: images were still in uint8 format.
- Fix: convert to float32 and normalize pixel values to the range [0,1].
Pooling comparison:
- Replacing AvgPool2d with MaxPool2d degraded performance, so we reverted to AvgPool2d.
Dropout experiments to mitigate overfitting:
- Dropout(0.2): produced a small improvement.
- Dropout(0.5): significantly improved generalization, reaching ~90% validation accuracy.
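
The observations above can be combined into a rough PyTorch sketch: convert the uint8 images to float32 in [0, 1], then use a LeNet-style network with ReLU, AvgPool2d and Dropout(0.5). This is not the exact architecture reported in the images. The layer sizes follow the classic LeNet-5 layout, and the 28x28 input size, the 10 classes, the placement of the dropout layers, and the names `to_float_tensor` and `LeNetVariant` are all assumptions made for illustration.

```python
import torch
from torch import nn


def to_float_tensor(images_uint8):
    """Convert uint8 images of shape (N, H, W) to float32 in [0, 1], shape (N, 1, H, W)."""
    x = torch.as_tensor(images_uint8, dtype=torch.float32) / 255.0
    return x.unsqueeze(1)  # add the single grayscale channel


class LeNetVariant(nn.Module):
    """LeNet-style CNN with ReLU, average pooling and dropout (illustrative only)."""

    def __init__(self, num_classes=10, dropout=0.5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5, padding=2),  # 28x28 -> 28x28
            nn.ReLU(),
            nn.AvgPool2d(2),                            # 28x28 -> 14x14
            nn.Conv2d(6, 16, kernel_size=5),            # 14x14 -> 10x10
            nn.ReLU(),
            nn.AvgPool2d(2),                            # 10x10 -> 5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                               # 16 * 5 * 5 = 400 features
            nn.Linear(400, 120),
            nn.ReLU(),
            nn.Dropout(dropout),                        # assumed placement
            nn.Linear(120, 84),
            nn.ReLU(),
            nn.Dropout(dropout),                        # assumed placement
            nn.Linear(84, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```

Feeding raw uint8 values (0 to 255) into the ReLU network produces activations that are hundreds of times larger than with normalized inputs, which is consistent with the loss explosion noted above.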

What We Did Not Try

- Data augmentation (e.g., rotations, translations, flips, noise injection)
- Deeper architectures (e.g., additional convolutional blocks or more filters)

Both directions could further improve performance.
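
As a hint of what the augmentation direction might look like, here is a minimal NumPy sketch covering two of the options listed above: random translations and Gaussian noise injection. It was not part of our experiments. It assumes images are float32 arrays in [0, 1] with a background value of 0, and the function names and default parameters (`max_shift=2`, `sigma=0.05`) are arbitrary illustrative choices.

```python
import numpy as np


def random_shift(image, max_shift, rng):
    """Translate a 2-D image by up to max_shift pixels, filling exposed borders with 0."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    h, w = image.shape
    padded = np.pad(image, max_shift, mode="constant")
    top = max_shift - dy    # positive dy moves the content down
    left = max_shift - dx   # positive dx moves the content right
    return padded[top:top + h, left:left + w]


def add_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise and clip back to [0, 1]."""
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0).astype(np.float32)


def augment(image, rng, max_shift=2, sigma=0.05):
    """Apply a random translation followed by noise injection."""
    return add_noise(random_shift(image, max_shift, rng), sigma, rng)
```

Augmentations like these would be applied to training images only, with a fresh random draw every epoch; the validation and test sets stay untouched.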
