Love, peace, and cat
Object 320 is about to enter beta. It will go beyond open-set and likely beyond OCR. Stay tuned.
Object 320 is finally starting to converge; it took me a while to hunt down the bugs...
No promises, but I will try to fit Object 320 training on a single T4.
The VRAM footprint of Object 320 is kinda wild at full scale in training mode; testing mode will be light as always.
LSCT has finally been 50% modernized, hurrah!
A lot has happened behind the scenes, but it should be worth the long journey.
Object 32x will come with a much more flexible framework (and LSCT has been modernized), stay tuned :)
The live demo GUI for our ICDAR 2025 paper (Object 310) will be updated soon.
Object 313 is still undergoing analysis. It will be released once we achieve better consistency.
Object 320 will reach its first prototype phase soon.
Watch-and-Act (Object 310) has been released and can do something VLMs can't. Much better than C4.
See you in Wuhan!
https://github.com/lancercat/wna
Synthetic co-training (C4) and multi-part representation (OAPR) will return in 32x, hopefully in the first release.
Object 35x will come in early/mid 2026 (maybe). Let's move beyond OCR.
32x will likely be a full-fledged MARL system.
Watch-and-Act+ (Object 313) is feature-complete. We see some mild performance improvements over Object 310.
Development effort now moves to Object 32x, where we will stage a more flexible routing framework with a more inclusive protocol.
Starting to document Watch-and-Act (Object 310), which is fully inductive and much more powerful than CFOR.
See you in Wuhan.
Branched for CFOR. Cleaning starts.
Seriously speaking, Object 310 is far better than CFOR, even in an inductive setup...
Plus, LSCT is already modernized... so I am not really motivated to clean up this legacy code... but a promise is a promise, after all.
Framework 320 is taking form; after that I will go on to clean up the CFOR training code... I didn't forget it, there is just too much going on and I am too tired to do the cleaning on weekends.
The CFOR training code should be available one week after the ICDAR deadline... (sigh)
OpenCCD (VSDF) is returning in the NG framework, in a slightly different form.
Object 310 is undergoing final reproducibility checks... After this, I should have time to clean up C4.
Object 310 is happening. I want to post some sneak peeks, but I cannot...
Writing a proposal && a paper.
We will proceed to release the CFOR training code/data/documentation once these shenanigans settle down...
That's why I only set a Q1 2025 deadline when it's just a few days of work... you just never know what other tasks might fly in...
The next release is internally frozen. There are several deep and winding rabbit holes to dig into in future work.
Expect CFOR-level generalization performance while being fully inductive, and a big leap from Moose.
Revealing more would break anonymity, so pardon me for being vague.
Have to say that time flies fast.
The next release still needs to wait.
We are refactoring the full framework for a more elegant implementation of [something]. ETA: one or two months. Please also expect a big performance leap :)
Framework NG -> NG+
The next release will support bf16 and multi-GPU inference.
Multi-GPU training will be delayed to a future release (in a less usual manner).
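As a rough, self-contained sketch (not the framework's actual API; the toy model and tensor shapes below are placeholders), bf16 inference in PyTorch usually amounts to running the forward pass under an autocast context:

```python
import torch
import torch.nn as nn

# Minimal bf16 inference sketch. The tiny conv net stands in for the real
# recognizer, which is not public; only the autocast/inference_mode pattern matters.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
).to(device).eval()

images = torch.randn(4, 3, 32, 128, device=device)  # fake batch of text-line crops

with torch.inference_mode(), torch.autocast(device_type=device, dtype=torch.bfloat16):
    logits = model(images)

print(logits.dtype)  # bfloat16 on backends that support it
```

Nothing here hints at the "less usual" multi-GPU scheme; the sketch only covers the bf16 half.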
The next release is scheduled in November, with more languages and better model flexibility.
Datasets are being added. Tuning for performance.
It's happening (maybe).
Proceeding to adopt new datasets like FudanVI, Union14M, and others.
Now we have some powerful devices. Time to scale up.
Due to a specific application need, the C4 family may receive a weird DLC (Objects 282/305) in the coming months,
which means we may or may not get the first of the "far more interesting stuff" finished before the AAAI deadline.
I would not say it is not interesting (it is),
but it would still only concern the OCR community, hence not that significant.
The C4/LSCT family will finally come to light if I can pull off the Major Revision.
The stage has been set, buckle up.
And from stage 2 of the NG framework, we are considering dropping support for GPUs with less than 16 GiB of VRAM.
Code cleaned up and verified for Moose.
Starting the documentation process and quality-check procedure.
Once done, we will start uploading things to Kaggle and GitHub (hopefully before May Day).
OAPR released. Note it is still built on the first-generation framework; the NG framework starts with Moose.
We have started tidying up and will release the first version of the NG framework (likely before May).
See you at ICDAR 2024.
BTW, the whole NG framework is currently moving towards version two, and a version three is planned by Halloween.
Hope we can show you all something far more interesting than this in the near future.
Another project is about to reach the training stage (coding is mostly done), and it brings some interesting new features.
Hope we can get some results in May.
Happy the ⑨th day of a month~
The QA DLC will be delayed, as we can find few benchmarks for an open-ended VQA model.
The next release will still be solely single-GPU OCR.
But changes are indeed happening; they are just slower than expected (partially because I am still adapting to the new lifestyle).
Allow me to expand this repo towards QA a bit before we resume the multi-GPU work.
The next revision will hopefully be ready by April Fools' Day.
The core of the NG framework is taking shape.
The NG framework will come with a lot of DLCs, so expect some weird [optional] dependencies :-)
The NG framework may require PyTorch >= 2.1.
The sparsity feature is not used in the end. The first method based on the NG framework is undergoing ablation studies; see you all soon.
The NG framework will be delayed for a while due to all kinds of paperwork, writing, and relocation preparations... The framework itself is 80% done (usable if you don't need multi-GPU, that is), but I have no time or resources to deploy and tune it.
Spoiler: the NG framework will natively support multi-GPU parallel training, but in a really weird way.
(You can make your guesses now; we will hopefully ship that part in late 2024 if things go smoothly.)
10 GB cards will be supported until 2025.
We will drop training support for 8 GiB GPUs in future iterations. We will try to make regular models fit into 10 GiB cards for training (we will know by Halloween)...
We recommend moving to 24 GB+ GPUs, like x090s, P40s, and M40s.
Considering that P40s and M40s are pretty affordable these days, we will gradually drop training support for 8 GiB GPUs.
The training bar for future regular models will be 10 GiB (P102s, 1080 Tis, M40s), and 20 GiB for large models.
Inference cost will also go up; however, we will make sure to support inference on 6 GB cards like P106-100s.
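Purely as an illustration of that floor (the threshold and the check below are an example, not shipped code), a pre-flight VRAM check could look like:

```python
import torch

# Hypothetical pre-flight check: warn when the visible GPU falls below the
# rough 6 GB floor we aim to keep inference under. Not part of the repo.
MIN_INFER_GIB = 6.0

if torch.cuda.is_available():
    free_bytes, total_bytes = torch.cuda.mem_get_info()
    total_gib = total_bytes / 1024**3
    if total_gib < MIN_INFER_GIB:
        print(f"Warning: GPU reports {total_gib:.1f} GiB; inference may OOM.")
else:
    print("No CUDA device found; falling back to CPU inference.")
```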
Hi community,
We are going to build a new multilingual OSOCR benchmark set.
We want to collect ideas for the list of languages people would like to see recognized.
If you have a specific language in mind, please open an issue. We will pick some of the suggested languages to collect and annotate data for.