#Making an algorithm count the pieces from a pdf manual


south linden
#

I would probably try OpenCV, as it has a lot of tools for computer vision tasks

flint pawn
#

There are libraries for reading PDFs, but you'll have to put in some extra work to actually count the pieces. Could you send some examples of what the pages look like? It's been a while since I last touched LEGO, but I guess you could crop each page to only the rectangle in the corner containing the bricks needed for that step. Then you would have to come up with an algorithm to split this list of pieces up to get the individual bricks as well as their counts.

Once you have that it should be fairly easy to group them (e.g. by image similarity) and sum their counts
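
A minimal sketch of the grouping step, assuming each extracted piece has already been scaled to the same size and converted to grayscale. The function names, the mean-absolute-difference metric, and the threshold are illustrative choices, not something fixed by the thread; a real manual would likely need a more robust similarity measure.

```python
from typing import List, Sequence, Tuple

# A grayscale image as rows of pixel values (0-255); all images share one size.
Image = Sequence[Sequence[int]]


def mean_abs_diff(a: Image, b: Image) -> float:
    """Average per-pixel absolute difference between two same-sized images."""
    total = 0
    n = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            n += 1
    return total / n


def group_and_sum(
    pieces: List[Tuple[Image, int]], threshold: float = 10.0
) -> List[Tuple[Image, int]]:
    """Greedily group similar images and sum the counts within each group.

    The first image seen in a group acts as its representative.
    `threshold` is a hypothetical tuning value.
    """
    groups: List[list] = []  # each entry is [representative_image, total_count]
    for image, count in pieces:
        for group in groups:
            if mean_abs_diff(group[0], image) <= threshold:
                group[1] += count
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([image, count])
    return [(image, total) for image, total in groups]
```

Feeding it `(image, count)` pairs collected from every step gives one total per distinct piece.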

sly apex
#

This is one of the pages

flint pawn
#

So we can't just cut out rectangles for the parts but at least the numbers don't overlap

#

How many pages do you have? I have a feeling that Python might not be fast enough here if you have lots of manuals

sly apex
#

The manual has 206 pages

#

I am sorry to ask, but what exactly does "building from source" mean?

flint pawn
#

compiling the source code of that library

sly apex
#

What other language could I use instead of Python? Does Java perhaps have a library similar to OpenCV? I found one for JavaScript, but I am wondering whether it would be fast enough for this

flint pawn
viral reef BOT
#

@sly apex

File Attachments Not Allowed

For safety reasons we do not allow file and video attachments.

Code Formatting

You can share your code using triple backticks like this:
```
YOUR CODE
```

Large Portions of Code

For longer scripts use Hastebin or GitHub Gists and share the link here

Ignored these files
  • MK21005.pdf
sly apex
#

Yes! Sorry

#

Also too large for github lmao

flint pawn
#

🗿 You can just send me a direct link or something as well

trim wraith
# sly apex This is one of the pages

From what I understand, you want the info in the little blue box. Are you looking for just the total number of pieces, or how many of each piece are required?

sly apex
#

How many of each piece are required

sly apex
#

With the pictures

flint pawn
#

Step one: Figure out how to read the PDF in Python and then count the pages (so we can see you read it properly)
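
One way step one might look, assuming the third-party `pypdf` package is installed (the thread does not name a specific PDF library, so this is only one option). `count_pages` is an illustrative helper name.

```python
def count_pages(source) -> int:
    """Return the number of pages in a PDF.

    `source` may be a file path or a binary file-like object.
    """
    # Imported inside the function so a missing pypdf install only fails
    # when the function is actually called.
    from pypdf import PdfReader

    reader = PdfReader(source)
    return len(reader.pages)


# Example call, using the file name mentioned earlier in the thread:
#   print(count_pages("MK21005.pdf"))
```

If the printed number matches the page count shown in a PDF viewer, the file is being read correctly.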

sly apex
#

Sir yes sir! O7

flint pawn
#

😈

trim wraith
# sly apex With the pictures

Then you can do this in code. You can crop the image down to just the blue box by scanning for a rectangle that contains a certain color.
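
A rough sketch of locating the box by color, in plain Python on an image stored as rows of `(r, g, b)` tuples. The color value `BOX_BLUE`, the tolerance, and the function names are all assumptions for illustration, and it assumes nothing else on the page uses that color. It takes the bounding box of every matching pixel, so it behaves the same whether the color is the box fill or only its outline.

```python
BOX_BLUE = (170, 210, 240)  # hypothetical color; sample the real one from a page


def color_close(pixel, target, tol=30):
    """True if every channel of `pixel` is within `tol` of `target`."""
    return all(abs(p - t) <= tol for p, t in zip(pixel, target))


def find_box(image, target=BOX_BLUE, tol=30):
    """Return (top, left, bottom, right), inclusive, around all matching pixels.

    Returns None if no pixel matches the target color.
    """
    matches = [
        (y, x)
        for y, row in enumerate(image)
        for x, pixel in enumerate(row)
        if color_close(pixel, target, tol)
    ]
    if not matches:
        return None
    ys = [y for y, _ in matches]
    xs = [x for _, x in matches]
    return (min(ys), min(xs), max(ys), max(xs))


def crop(image, box):
    """Cut the inclusive box out of the image."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```

With OpenCV the same idea would normally be done on arrays rather than nested lists, which should be much faster on full-size pages.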

Once you have the box, you can cut out each piece by line scanning. Create an empty array of images, and store the polygon outline of each piece alongside its image. Scan through each line: if a pixel is not the blue background color, add it to the new image you're building. Once the line scanner reaches blue again (meaning it has reached the end of the object), go to the next line, and repeat until an entire line is blue. Then add that image and its polygon to the array. To make sure you're not picking up a piece you've already scanned, check whether the point falls inside any of the polygons you've already stored. Repeat for each piece until no more pieces are being added to the array. After that you need to scan the text: get each piece's position and each text position, and for each piece pick the text that is closest along the vertical axis.
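
A sketch of the same idea using a slightly different technique: instead of the line scanner with stored polygons, it uses a flood fill (connected-component labeling) to find each piece's bounding box, and it pairs each piece with the nearest count label by distance between centers rather than vertical distance alone. It assumes `mask` marks non-background pixels as `True` with the count text already excluded, and that the label positions and values come from a separate text-recognition step. All names are illustrative.

```python
from collections import deque


def find_components(mask):
    """Return bounding boxes (top, left, bottom, right) of 4-connected True regions."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Flood-fill outward from this unvisited foreground pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def match_counts(boxes, labels):
    """Pair each box with the count of the nearest label.

    `labels` is a list of (y, x, count) tuples. Distance is measured from the
    center of each box to the label position.
    """
    matched = []
    for box in boxes:
        top, left, bottom, right = box
        cy, cx = (top + bottom) / 2, (left + right) / 2
        nearest = min(labels, key=lambda l: (l[0] - cy) ** 2 + (l[1] - cx) ** 2)
        matched.append((box, nearest[2]))
    return matched
```

The `seen` grid plays the role of the "already scanned" polygon check: a pixel that belongs to a piece found earlier is never used to start a new one.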

sly apex
#

Would it also work if I used the color of the rectangle's outline?

flint pawn
#

Yes

#

This is what I wrote 2 days ago (also extracts the pieces) but I think the resolution is too low for character recognition

#

Also not sure if the resolution is high enough on the smaller pieces so grouping them might be hard
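
If the page images are rasterized from vector PDF pages (an assumption; pages that embed scanned bitmaps gain no detail this way), rendering at a higher DPI is one way to get more pixels per piece. PDF units are points, 1/72 of an inch, so the pixel size scales linearly with DPI. The helper name and the 24 pt example size below are hypothetical.

```python
def rendered_size(width_pt, height_pt, dpi):
    """Pixel size of a region measured in PDF points when rasterized at `dpi`.

    One PDF point is 1/72 inch, so pixels = points * dpi / 72.
    """
    return round(width_pt * dpi / 72), round(height_pt * dpi / 72)


# A hypothetical 24 pt square thumbnail of a small piece:
small = rendered_size(24, 24, 72)    # rendered at 72 DPI
large = rendered_size(24, 24, 300)   # rendered at 300 DPI
```

Here `small` is 24 by 24 pixels and `large` is 100 by 100 pixels, which gives character recognition and similarity grouping far more to work with.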