Text recognition #2

adubovskoy · 2016-11-16T13:22:51Z

We need to go deeper. I propose introducing a text recognition Node.js plugin. Imagine how cool it would be to feed it a layout drawn on a napkin.

sylvainpolletvillard · 2016-11-16T13:33:01Z

It goes well beyond the scope of this small project but it would be super fun 😄

I think this kind of OCR tool already exists in Node, maybe you can try to chain them? Time to experiment 👍

TryHardNinja · 2016-11-18T09:01:02Z

This plugin is incredible. I look forward to support for all properties.

sylvainpolletvillard · 2016-11-18T09:38:43Z

It is a deliberate choice to not support some of the grid properties such as gaps or implicit zones. See the associated comments in the section. I do not plan to support more properties for now.

Note that you can always set these properties manually next to the grid-kiss declaration if you feel the need to.

sylvainpolletvillard · 2017-02-19T20:34:16Z

@adubovskoy I've done a few experiments with Tesseract.js today:

As you can see, there is still work to do with OCR 😄 I think the Tesseract configuration needs some tweaking to identify zones, corners, column alignment, etc. It goes beyond my skills and I need some help to progress on this feature.

If you want to test it by yourself, check out the ocr branch here : https://github.com/sylvainpolletvillard/grid-kiss-playground/tree/ocr ; and look at the OCR button in the playground header.

kartikadur · 2017-07-03T13:01:46Z

Would restricting letters to capitals/uppercase or specific keywords prevent OCR read errors?

Creating a second pass to adjust row widths and column addons (e.g. - between rows, etc.) might help adjust any errors/mistakes in the OCR image.
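To make the restricted-vocabulary idea concrete, here is a minimal sketch of such a second pass, assuming the OCR step returns one raw string per zone. The zone names in `ALLOWED_ZONES` and the function `correct_zone_name` are hypothetical; only `difflib.get_close_matches` comes from the Python standard library.

```python
import difflib

# Hypothetical vocabulary: the only zone names a drawing is allowed to use.
ALLOWED_ZONES = ["HEADER", "MAIN", "SIDEBAR", "FOOTER"]


def correct_zone_name(ocr_text, vocabulary=ALLOWED_ZONES, cutoff=0.6):
    """Map a noisy OCR reading to the closest allowed keyword, or None."""
    candidate = ocr_text.strip().upper()
    # get_close_matches ranks the vocabulary by similarity ratio and drops
    # anything scoring below `cutoff`.
    matches = difflib.get_close_matches(candidate, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(correct_zone_name("headfr"))  # HEADER
print(correct_zone_name("M4IN"))    # MAIN
print(correct_zone_name("xyzq"))    # None
```

This only helps when the reading is already close to a keyword; it does nothing for zones the OCR misses entirely.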

This is something really interesting, wish my PC was up to the challenge.

sylvainpolletvillard · 2017-07-03T13:24:13Z

Yes, I tried a specific alphabet but the results have been disappointing so far. I think it would require a high-def camera, image pre-processing and a specific tesseract config to handle this kind of layout.

kartikadur · 2017-07-14T11:31:54Z

Since you have an automated process and your code can figure out areas, would using an automated naming convention work? At least for the current version. These automated names can be variablized like in sass (or the next version of css) and listed at the top, giving the user the ability to manually change the names should they want to.

sylvainpolletvillard · 2017-07-14T11:53:06Z

@kartikadur are we still talking about OCR ? could you give an example ?

kartikadur · 2017-07-15T06:43:51Z

I'd like to borrow the image you have attached above to work on something that I think might be promising.

Instead of using the OCR reader/program, I was thinking of using simple edge detection to figure out where the section boundaries are located. This should then allow for automatic detection of areas and thus, by extension, the automated naming.

Sorry if this sounds a little fuzzy, but it's something that I am exploring right now. I should hopefully have an example for you in a day or two.
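As an illustration of the edge-detection idea, here is a minimal sketch, assuming the picture is already a grid of grayscale values. It uses a plain brightness difference between neighboring pixels, which is much cruder than a real edge detector such as Canny. The function names and the `min_fraction` parameter are hypothetical.

```python
def find_vertical_boundaries(image, min_fraction=0.5):
    """Return x positions where brightness jumps between column x-1 and x.

    `image` is a list of rows of grayscale values (0 = black, 255 = white).
    A column counts as a boundary when the jump summed over all rows reaches
    `min_fraction` of the largest possible jump.
    """
    height = len(image)
    width = len(image[0])
    threshold = min_fraction * 255 * height
    boundaries = []
    for x in range(1, width):
        strength = sum(abs(row[x] - row[x - 1]) for row in image)
        if strength >= threshold:
            boundaries.append(x)
    return boundaries


def find_horizontal_boundaries(image, min_fraction=0.5):
    """Same detection applied to the transposed image, returning y positions."""
    transposed = [list(column) for column in zip(*image)]
    return find_vertical_boundaries(transposed, min_fraction)


# Toy picture: light left half, dark right half (4 rows, 8 columns).
two_columns = [[200] * 4 + [50] * 4 for _ in range(4)]
print(find_vertical_boundaries(two_columns))    # [4]
print(find_horizontal_boundaries(two_columns))  # []
```

A hand-drawn line would show up as two jumps close together (light to dark, then dark to light), so a real pipeline would still need to merge neighboring positions.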

kartikadur · 2017-07-15T07:14:28Z

As a preliminary example, at least for the first step, I used the code from this codepen to create the example I have posted below. It still needs work, but hopefully it should give you a basic idea.

sylvainpolletvillard · 2017-07-15T09:02:06Z

Feel free to try everything you want, and have fun 😉
This image is pretty bad quality, it is hand drawn and the picture is from a cheap webcam with low lighting and indirect angle. Do not hesitate to make your own in high definition

corysimmons · 2017-12-12T10:17:05Z

This is by far the most interesting/exciting css-related issue I've stumbled upon. GitHub needs more creativity like this. 😍

stephanschubert · 2018-05-03T15:40:59Z

If only every GitHub repository had open issues like this one. 😄

sylvainpolletvillard · 2018-05-03T17:59:58Z

lol yes but this is still an unresolved issue, I wish I had the knowledge to make it happen. This would make a really, really cool demo.

corysimmons · 2018-05-04T02:07:07Z

@sylvainpolletvillard I'm learning AI/ML this year and plan on playing around with all kinds of stuff like this, so eta 1 year before I can PR (unless someone beats me to it?). 👴

sylvainpolletvillard · 2018-05-04T14:23:26Z

Hi @corysimmons , hope you're doing well ! This can indeed be a fun exercise for machine learning, but I don't have huge expectations 😄 . I think I was on the right track with Tesseract.js tho

sylvainpolletvillard · 2018-07-05T16:13:17Z

Don't think I forgot about this issue. I talked with several experts in image recognition and tried different solutions. I'm currently quite pleased with the results I got with OpenCV:

corysimmons · 2018-07-05T23:03:10Z

Nice work @sylvainpolletvillard ! This continues to be my favorite issue on GitHub.

What are your thoughts on lining up that grid for use? Can you combine your result with a "snap-to unit" (i.e. a unit to round to, defaults to something like 30px) and some masonry math to line it up?
Can you recognize/capture the text and insert it into the appropriate cells?
I'd imagine you'd need to make some pretty huge cells to account for big words and grid nesting, so maybe grids should live in their own sub-directory like css/grids/homepage.css
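A minimal sketch of the "snap-to unit" idea, assuming detected zones come as `(x, y, width, height)` rectangles in pixels. The names `snap` and `snap_rect` are hypothetical, and the 30px default is just the value suggested above. Edges are snapped rather than sizes, so that two zones sharing an edge stay aligned after rounding.

```python
import math


def snap(value, unit=30):
    """Round a coordinate to the nearest multiple of `unit` (halves round up)."""
    return unit * math.floor(value / unit + 0.5)


def snap_rect(rect, unit=30):
    """Snap the four edges of an (x, y, width, height) rectangle to the unit grid."""
    x, y, width, height = rect
    left, top = snap(x, unit), snap(y, unit)
    right, bottom = snap(x + width, unit), snap(y + height, unit)
    return (left, top, right - left, bottom - top)


print(snap(44))                       # 30
print(snap(46))                       # 60
print(snap_rect((12, 47, 131, 50)))   # (0, 60, 150, 30)
```

A very small zone can collapse to zero width or height with this rounding, so a real tool would probably want a minimum size of one unit.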

sylvainpolletvillard · 2018-07-05T23:44:56Z

This is obviously some very early stage work. To get a minimum viable product, I would like to:

- automatically rotate the picture to make horizontal lines truly horizontal
- improve the calculation of the coordinates by adding a margin tolerance to compensate for the lack of precision of our ridiculous human hands
- crop the zone content and use optical character recognition to extract text (this is going to be hard because of handwriting)
- auto expand the grid to fit the extracted content while respecting the original proportions as closely as possible
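The first two steps of this list can be sketched roughly as follows. This is not the code used for the results above, just a minimal illustration assuming that line segments `(x1, y1, x2, y2)` and edge coordinates have already been detected. The function names and the `tolerance` parameter are hypothetical.

```python
import math
import statistics


def segment_angle(x1, y1, x2, y2):
    """Angle of a segment relative to the horizontal axis, in degrees."""
    return math.degrees(math.atan2(y2 - y1, x2 - x1))


def estimate_skew(segments):
    """Median angle of roughly horizontal segments; rotate by the opposite to deskew."""
    return statistics.median(segment_angle(*segment) for segment in segments)


def merge_close(values, tolerance):
    """Merge coordinates lying within `tolerance` of their neighbor into one average."""
    groups = []
    for value in sorted(values):
        if groups and value - groups[-1][-1] <= tolerance:
            groups[-1].append(value)
        else:
            groups.append([value])
    return [round(sum(group) / len(group)) for group in groups]


# Three slightly tilted "horizontal" lines from a hand drawing.
lines = [(0, 0, 100, 5), (0, 50, 100, 56), (0, 100, 100, 104)]
print(round(estimate_skew(lines), 2))                   # 2.86

# Left edges that should have been drawn at the same x position.
print(merge_close([10, 12, 98, 101, 103, 250], 5))      # [11, 101, 250]
```

Using the median rather than the mean keeps one badly drawn line from dominating the rotation estimate.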

corysimmons · 2018-07-06T12:22:52Z

Do you have this work pushed anywhere? I'd like to take a look at it and tinker around with estimations based off of how we know CSS Grid works.

sylvainpolletvillard · 2018-07-06T16:31:26Z

I'm looking for external contributions on this topic, but first I would like to keep the door open for other tools and approaches. I already changed my mind 2 times, using Tesseract then Tensorflow then OpenCV with Python. Before pushing any code, I want to be sure that I picked the right tool stack, so that if you decide to contribute, your contribution will not be lost if we need to change the tool later. Just so you know, my current work is based on this tutorial : https://www.pyimagesearch.com/2016/02/08/opencv-shape-detection/

What would be great if you want to help is that you try your own approach so that we can compare our results. Ideally, I would like this tool to use JavaScript or WebAssembly so that we can use it on mobile or desktop and use the camera or import pictures. Then I could call postcss-grid-kiss in realtime and get immediate feedback on a website layout. Hopefully we will get to that point one day 👍

corysimmons · 2018-07-07T03:36:36Z

Sounds good. Thanks for posting results with OpenCV. Looks cool!

sylvainpolletvillard · 2018-07-08T19:39:24Z

Progress !

Today, I converted my Python code to JavaScript and I managed to run OpenCV in the browser thanks to WebAssembly. I also implemented auto-rotation, precision correction and finalized the ASCII formatter.
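The formatter itself is not shown in this thread. As a rough idea of what such a step might look like, here is a minimal sketch that draws named boxes on a character canvas. It assumes zones are given in character cells as `(name, x, y, width, height)`, and `render_ascii` is a hypothetical name. The output is a generic box drawing, not the full grid-kiss syntax, which also carries dimensions and alignment hints (see the postcss-grid-kiss README for the real format).

```python
def render_ascii(zones, width, height):
    """Draw each zone as a +--+ box with its name centered on the middle row."""
    canvas = [[" "] * width for _ in range(height)]
    for name, x, y, w, h in zones:
        right, bottom = x + w - 1, y + h - 1
        for cx in range(x, right + 1):       # top and bottom edges
            canvas[y][cx] = "-"
            canvas[bottom][cx] = "-"
        for cy in range(y, bottom + 1):      # left and right edges
            canvas[cy][x] = "|"
            canvas[cy][right] = "|"
        for cx, cy in ((x, y), (right, y), (x, bottom), (right, bottom)):
            canvas[cy][cx] = "+"             # corners overwrite the edges
        label = name[: w - 2]                # truncate names that do not fit
        start = x + 1 + (w - 2 - len(label)) // 2
        for i, ch in enumerate(label):
            canvas[y + h // 2][start + i] = ch
    return ["".join(row) for row in canvas]


zones = [("header", 0, 0, 20, 3), ("nav", 0, 3, 7, 3), ("main", 8, 3, 12, 3)]
for line in render_ascii(zones, width=20, height=6):
    print(line)
```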

sylvainpolletvillard · 2018-07-08T19:44:39Z

Next step is optical character recognition (OCR) to extract the zone selector, but there is a major issue. The two browser OCR solutions I tried are Tesseract.js and Google Cloud Vision. Tesseract.js runs locally in the browser, while Cloud Vision is a remote API that can be used for free for 1000 queries before Google asks for some money. Cloud Vision did a very impressive job of recognizing my crappy handwriting, but unfortunately I can't say the same about Tesseract. The only characters that Tesseract can recognize accurately are high definition printed characters.

So to sum up, if we want OCR, we'll need to limit to 1000 queries or pay for it. The other option would be to let the user complete the grid code by specifying each zone selector. It's less magic but it guarantees no OCR errors. Also, maybe it's not very natural to write classes or id selectors on paper? What do you think?

sylvainpolletvillard · 2018-07-08T20:30:25Z

I think OpenCV.js is the best we can expect at this point, so I published the code here : https://github.com/sylvainpolletvillard/grid-kiss-vision ; in case you want to play with it @corysimmons

corysimmons · 2018-07-09T06:06:05Z

Cloud Vision: Can devs procure their own API keys and get a 1000 limit? 1000 sketches-to-layouts per person might not be too bad. 🤔

sylvainpolletvillard · 2018-07-09T09:07:23Z

I thought about that, but I don't know if a 1000 queries limit is sustainable. There is a lot of variable adjustment before getting the expected layout; it can take between 5 and 50 tries. So we can't really limit to 1 call per picture.
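To put numbers on this, a quick back-of-the-envelope calculation, assuming every try costs exactly one query against the free quota (`pictures_within_quota` is a hypothetical helper):

```python
def pictures_within_quota(quota, tries_per_picture):
    """How many pictures fit in the quota if each try is one query."""
    return quota // tries_per_picture


FREE_QUERIES = 1000  # free limit mentioned earlier in this thread

print(pictures_within_quota(FREE_QUERIES, 5))   # 200 (best case)
print(pictures_within_quota(FREE_QUERIES, 50))  # 20 (worst case)
```

So under that assumption the free quota covers somewhere between 20 and 200 sketches, not 1000.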

11111000000 · 2019-03-08T19:20:14Z

...of course EMACS already has mode for this (M-x artist-mode, which I use with pen)

corysimmons · 2019-03-10T17:23:06Z

Oh god it's the Emacs people! Run!!

🏃‍♂️💨

corysimmons · 2019-12-18T14:35:10Z

@11111000000 I apologize for the joke above. I'm re-reading this issue and realize how helpful your suggestion might be (basically free OCR).

sylvainpolletvillard · 2019-12-18T18:22:44Z

EMACS artist mode is not OCR, it is assistive editing so it's a different issue. I often use asciiflow.com for the same kind of assistance when drawing a grid-kiss layout.

corysimmons · 2019-12-18T19:05:37Z

Oh yeah, you're right. I figured maybe the logic could be pulled out, but now that I think about it, it probably works by calculating the cursor (or pen) coordinates in realtime and converting those points to a line. The line smoothing logic should be pretty similar, but not really as important as finding an OCR thing.

sylvainpolletvillard · 2022-09-24T22:38:46Z

The Tesseract.js project has recently been updated and gives better results for OCR, so I have updated this side project.

OCR is still not perfect, Tesseract is not good at handwriting, but at least it's a fully working prototype !

Try it online here: https://sylvainpolletvillard.github.io/grid-kiss-vision/

corysimmons · 2022-09-26T16:01:25Z

@sylvainpolletvillard That's awesome! Good work!

It's crazy that OCR is so bad with handwriting. Your handwriting is pretty clear.

Is the Cloud Vision code available anywhere?

sylvainpolletvillard · 2022-09-26T18:43:13Z

No, I did not continue with Cloud Vision because of the pricing limitations. Having everything working locally on a browser is much better in my opinion, although the ML model is much worse.

corysimmons · 2022-09-27T20:46:07Z

> There is a lot of variable adjustment before getting the expected layout, it can take between 5 and 50 tries.

@sylvainpolletvillard Did you have a chance to fix this? Do users still have to manually tweak variables before getting the correct shapes?

sylvainpolletvillard · 2022-09-27T22:43:23Z

I don't know, I have not tried Cloud Vision since 2018. The reason I came back to this experiment is the announcement of Tesseract.js 3.0, which is much easier to use and needs almost no configuration.

sylvainpolletvillard added the idea and help wanted labels on Mar 30, 2017
