
    Optical Character Recognition for the Aberystwyth Pool League | by Aledllevans | Jan, 2025

    By Team_AIBS News, January 17, 2025 (19 min read)


    Aled Llyr Evans — aledllevans@hotmail.com — https://www.linkedin.com/in/aled-evans-340a73130

    · Introduction

    · Aberystwyth Pool League

    · Problem Specification & Motivation

    · Optical Character Recognition

    · The Pipeline

    · Pre-Processing

    · Reading

    · Post-Processing

    · Examples & Issues

    · Conclusion & Future Improvements

    · References

    In recent years, artificial intelligence (AI) and machine learning (ML) technologies have made significant strides, reshaping various industries and enhancing the capabilities of software applications. Among the numerous applications of AI and ML, Optical Character Recognition (OCR) stands out as a transformative tool that enables machines to interpret and digitize printed or handwritten text. Once the domain of science fiction, AI and ML are now embedded in the fabric of modern technology, powering applications that range from voice assistants to predictive analytics. These systems leverage vast datasets and sophisticated algorithms to learn from experience and adapt to new information, enabling machines to perform tasks that traditionally required human intelligence. Among them, OCR serves as a crucial bridge between the physical and digital worlds, converting text in images into machine-readable formats.

    This article demonstrates a specific use case of OCR technology and sheds light on some of the underlying mechanisms. By walking through what we think an OCR pipeline should look like, through the processes of image acquisition, preprocessing, feature extraction and character recognition, we aim to illustrate its pivotal role in improving efficiency, accessibility and data management in the digital age.

    The Aberystwyth Pool League is a local sports league, formed in 1977, that focuses on competitions of Blackball Pool in Aberystwyth, Wales. It is run by a handful of volunteers who form the committee. The league brings together a few hundred people each season, both university students and locals, to take part in frames of pool, improve their skills and have fun while enjoying the social aspect of the game.

    To start the seasonal league, players organize themselves into teams which sign up to play from a particular venue. A series of matches is then arranged by the league between these teams; this schedule is known in everyday language as ‘the fixtures’. Each week, teams face off and earn points for their team based on their performance. The league keeps track of scores and rankings, culminating in winners (and losers) at the end of the season.

    The result of each match is recorded on a ‘score card’. At the end of the match, the result is notarized with the captains’ signatures, photographed (usually with the winning captain’s camera phone) and that photograph is shared with the league secretary.

    The score card template

    At the time of writing, the Winter 2024/25 season is underway. 29 teams have been registered and separated into 3 divisions (two with 10 and one with 9 and a ‘BYE’). The seasonal league operates a ‘double round-robin’ format (one home match and one away match against every other team in your division). With this configuration, the league is expected to play 10*9 + 10*9 + 9*8 = 252 matches! Each one requires a physical score card and then a recording of the results.
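
    The match count above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name is made up for illustration.

```python
# Teams per division for the Winter 2024/25 season, as stated in the text.
division_sizes = [10, 10, 9]


def double_round_robin_matches(n_teams: int) -> int:
    """Each team hosts every other team once, giving n * (n - 1) matches."""
    return n_teams * (n_teams - 1)


total = sum(double_round_robin_matches(n) for n in division_sizes)
print(total)  # 252
```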

    A filled score card

    Transcribing this data manually, for use in producing statistics, is monotonous, repetitive and boring. If only there was a better way.

    Why do this at all?

    Clearly, the league standings need to be updated so that the teams know where they stand and, of course, who has won at the end of the season. But in addition, statistics are interesting, and there are all sorts of measurements, comparisons and titbits that can be calculated from the collected data. I will detail some of these statistics in a future article.

    It is possible to automate this task using the magic of machine learning.

    How does it work?

    Specific implementations will vary, as the whole pipeline will be a series of steps, but recognising individual letters will likely be the task of a neural network whose input layer is some array containing information about the pixels within a boundary that the pipeline has declared to contain a character (or possibly a word).

    Alternatively, or in addition, other techniques may be involved, such as feature matching or handwriting movement analysis. The exact steps will depend on the application and use cases of the specific model/pipeline.

    Being able to obtain the text present in an image has a variety of use cases: number-plate recognition, scanning documents such as passports, mobile apps that translate text in real time and of course all manner of other data entry tasks. The art of this kind of machine learning is called Optical Character Recognition.

    We should use OCR to read the score cards for us!

    But it is not as easy as that just yet. Any problem in OCR worth looking at will require application-specific optimizations in the form of tailored pre-processing and post-processing steps to form a working pipeline; this one is no different. In particular, most real-world applications of OCR involve reading printed text with a consistent font, as in a book, on a road sign or on the back of a cereal box abroad. In our case we aim to read handwriting, which is extremely challenging by comparison because of the natural ambiguities that result from different pens, writers and styles of writing.

    My language of choice is Python: it has the libraries I will need and is fast and efficient enough to complete most of the required tasks in good time on my machine. In the future I might consider outsourcing some of the heavier lifting to a lower-level language like C. I use EasyOCR to do the reading, but PyTesseract should also be mentioned as an alternative.

    PyCharm is my everyday IDE.

    Pre-processing considerations

    Perhaps the most important consideration for us is that the content we are interested in is contained within the table that is printed on the score cards. The two-dimensional ink structure acts as a guideline so that the writers (and the reader) know where to look for the right information. Ascertaining these boundaries and extracting the content within is essential to our goal.

    To do this, we need to use a variety of image manipulation techniques from OpenCV and the Python Pillow library.

    Post-processing considerations

    Making accurate predictions of handwritten text is possible but laughably unreliable. Even with immaculate examples of handwriting, the reader will make minor errors that render the output almost useless. The key to making the whole pipeline worthwhile is to understand that we can narrow down our options significantly if we know what we are looking for. A team has a finite number of players registered at any one time; as long as our predictions are close enough to the correct answer, we are in with a chance of making something useful.

    For this, we only need some of the standard libraries and, if we are pushed for stronger mathematical operations, NumPy.

    The journey started with the score card template (designed in Microsoft Excel), and now that comes in handy. Having this template means we can use it to look for the table in the target image. There are “features” in the template that are also present in the target, such as the table and its text. Unless they are obscured in the photograph, they will be present in the target image. This realisation is the key to getting started.

    We can do this by extracting features from both images and comparing them. I have used SIFT (Scale-Invariant Feature Transform) because it was the most reliable in this use case, but there are other options available. Once we have our curated set of good quality keypoints, we match them between the two images with FLANN (Fast Library for Approximate Nearest Neighbors). Finally, it is essential to filter bad matches using Lowe’s distance ratio test.
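
    To make the ratio test concrete, here is a minimal NumPy sketch. It is not the author’s code: it swaps FLANN’s approximate search for a brute-force nearest-neighbour search, works on any array of descriptors (one per row, as SIFT would produce), and the function name and the 0.7 ratio are assumptions chosen for illustration.

```python
import numpy as np


def ratio_test_matches(desc_a, desc_b, ratio=0.7):
    """Return (index_a, index_b) pairs that pass Lowe's ratio test.

    desc_a, desc_b: 2-D arrays with one descriptor per row
                    (desc_b needs at least two rows).
    ratio: illustrative threshold; lower values are stricter.
    """
    desc_a = np.asarray(desc_a, dtype=float)
    desc_b = np.asarray(desc_b, dtype=float)
    # Pairwise Euclidean distances, shape (len(desc_a), len(desc_b)).
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    matches = []
    for i, row in enumerate(dists):
        nearest, second = np.argsort(row)[:2]
        # Keep a match only when the best candidate is clearly better
        # than the runner-up; ambiguous matches are discarded.
        if row[nearest] < ratio * row[second]:
            matches.append((i, int(nearest)))
    return matches
```

    A descriptor that sits equally close to two candidates is rejected, which is exactly the kind of ambiguous match that would otherwise distort the homography in the next step.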

    In green are the curated matches between the score card template and a photograph containing a completed score card.

    The photographs of the score cards will be taken from an undesirable perspective; they will be skewed, rotated and will almost certainly contain more than just the table we need. With some techniques from linear algebra, it is possible to fix all these problems in one go. Using these matches, we can attempt to find a homography between the target image and an image taken from a more desirable perspective. We need a transformation matrix to do this and, fortunately for us, OpenCV has all the algorithms we need and several worked examples of how to do it.
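
    For intuition, the transformation matrix can be estimated from matched point pairs with the direct linear transform. The NumPy sketch below is an illustration under simplifying assumptions (clean matches, no outlier rejection, no coordinate normalisation) and the function names are made up; in practice OpenCV’s findHomography with RANSAC followed by warpPerspective is the more robust route.

```python
import numpy as np


def estimate_homography(src_pts, dst_pts):
    """Estimate the 3x3 matrix H such that dst ~ H @ src (direct linear transform).

    src_pts, dst_pts: arrays of shape (N, 2) with N >= 4 matched points.
    """
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]  # fix the arbitrary scale


def apply_homography(h, pts):
    """Map (N, 2) points through H using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ h.T
    return mapped[:, :2] / mapped[:, 2:3]
```

    Here the source points would be keypoint locations in the photograph and the destination points their partners in the template, so that applying H to the photograph lines the card up with the template.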

    In the same step, we greyscale our image.

    That’s much better!

    We could put this into the reader right now, but for the sake of robustness we will do some more processing. EasyOCR has some built-in pre-processing steps, such as orientation correction and noise reduction, but for the sake of completeness we will do some ourselves.

    OCR readers typically benefit from binarization, which means eliminating grey by assigning every pixel in the image to be either completely black or completely white. This is done by calculating a threshold value for the whole image, which Otsu’s method can choose automatically from the image histogram. A single threshold value is sufficient in the absence of variable lighting conditions or other distortions. Otherwise, we can consider a threshold that varies across the image (adaptive thresholding).
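
    As a rough illustration of global thresholding, the sketch below implements Otsu’s method in plain NumPy: it picks the single threshold that maximises the variance between the dark and light pixel classes. The function names are illustrative, and the input is assumed to be an 8-bit greyscale array; OpenCV offers the same technique through its threshold function.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the global threshold that maximises between-class variance.

    gray: 2-D uint8 array (a greyscale image).
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    levels = np.arange(256)
    weight_dark = np.cumsum(prob)          # share of pixels <= t
    mean_cum = np.cumsum(prob * levels)    # cumulative mean up to t
    mean_total = mean_cum[-1]
    weight_light = 1.0 - weight_dark
    # Between-class variance for every candidate threshold t.
    with np.errstate(divide="ignore", invalid="ignore"):
        between = (mean_total * weight_dark - mean_cum) ** 2 / (
            weight_dark * weight_light
        )
    between = np.nan_to_num(between)
    return int(np.argmax(between))


def binarize(gray):
    """Set pixels above the Otsu threshold to 255 and all others to 0."""
    threshold = otsu_threshold(gray)
    return np.where(gray > threshold, 255, 0).astype(np.uint8)
```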

    Subsequently, we apply some light Gaussian blurring to reduce the sharpness that results from the binarization.

    As it turns out, detecting and removing the table is straightforward with morphological operations. We do this because the table only represents noise that the reader could do without. I have used a combination of OpenCV’s getStructuringElement, morphologyEx and findContours. Alternatively, we could consider a Hough Line Transform.
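
    The idea behind the morphological step is an “opening” with a long, thin structuring element: only strokes at least as long as the element survive, so the result isolates the table lines, which can then be subtracted. The sketch below is a NumPy-only stand-in for the OpenCV calls named above, not the author’s implementation; it assumes a boolean mask that is True wherever there is ink, and the function names and the 25-pixel minimum length are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def erode_rows(mask, length):
    """A pixel survives only if the whole 1 x length window around it is ink."""
    left = length // 2
    padded = np.pad(mask, ((0, 0), (left, length - 1 - left)), constant_values=False)
    return sliding_window_view(padded, length, axis=1).all(axis=2)


def dilate_rows(mask, length):
    """A pixel is set if any pixel in the (mirrored) 1 x length window is ink."""
    left = length // 2
    padded = np.pad(mask, ((0, 0), (length - 1 - left, left)), constant_values=False)
    return sliding_window_view(padded, length, axis=1).any(axis=2)


def open_rows(mask, length):
    """Morphological opening: keeps only horizontal runs of at least `length` pixels."""
    return dilate_rows(erode_rows(mask, length), length)


def remove_table_lines(ink, min_length=25):
    """Blank out long horizontal and vertical strokes in a boolean ink mask."""
    horizontal = open_rows(ink, min_length)
    vertical = open_rows(ink.T, min_length).T
    return ink & ~(horizontal | vertical)
```

    Handwriting strokes are much shorter than the grid lines, so they pass through untouched, although any ink that lies directly on a line is removed along with it.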

    As a supplementary step, we can overlay a slightly modified version of the template to see how well the perspective transformation has done. At this point, it is clear that we have very well-defined boundaries for the cells.

    Instead of reading the image as a whole, we should split the table up completely and read it cell by cell. The reader can handle multiple lines, and even paragraphs of text, but this way it does not have to guess where those boundaries should be and where the lines start and end. In addition, we know exactly which prediction relates to which cell. For the doubles frames, it suffices to split the image into top and bottom halves.
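
    Once the card is aligned with the template, cutting it into cells is plain array slicing. The sketch below assumes the pixel positions of the grid lines are known (for example, read off the template); the function names and the edge values in the usage note are hypothetical.

```python
import numpy as np


def split_into_cells(image, row_edges, col_edges):
    """Cut an aligned card image into a dict {(row, col): cell_array}.

    row_edges, col_edges: sorted pixel coordinates of the grid lines.
    """
    cells = {}
    for r, (top, bottom) in enumerate(zip(row_edges[:-1], row_edges[1:])):
        for c, (left, right) in enumerate(zip(col_edges[:-1], col_edges[1:])):
            cells[(r, c)] = image[top:bottom, left:right]
    return cells


def split_doubles_cell(cell):
    """Split a doubles cell into its top and bottom halves (one name each)."""
    middle = cell.shape[0] // 2
    return cell[:middle], cell[middle:]
```

    For example, `split_into_cells(card, [0, 40, 80], [0, 300, 340, 380, 680])` would return eight cells keyed by their (row, column) position, so each prediction can be tied back to the cell it came from.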

    On this card, our home team’s name is “Shiny Shellys” and our away team’s name is “Dungeons & Druids”.

    Our first home player is “John Williams” and his opponent is “Andre Davies”.

    Andre won this frame, so the digits written in the score columns are 0 and 1 respectively.

    Using the standard English model for EasyOCR yields seriously underwhelming results; it even struggles to identify the digits at times.

    3kls al (y'>   NNCEos +3Qv0?
    TWcllml 0 HnboePNIC
    S Laldro 0 ked ORe
    S.Uner 1 0 Mve VM
    M.ele 1 0 RabaukHnvic
    SjanM 0 1 VkiiGnn.
    TLulunU QeaZuOms
    S.YJaldraS 1 0 Amxc OmG
    m elnoe Gew O
    S Cosstrr 0 1 Mike Vi
    S.ildro 0 1 DecaONz
    S.leste. 1 0 Gnol
    MClmce 1 0 Nile VrA
    T.JCLA 0 1 Hus Pwls

    With the best will in the world, this won’t be good enough. All but a few are deeply unrecognisable. We need to train the model to improve the predictions. The good news is that it is possible to fine-tune the existing model for our purposes.

    I used a manually labelled, curated dataset of around 1400 images, like the ones above, and trained for two and a half to three epochs. This was very achievable on my home computer, taking a few hours. This training is essential for interpreting handwriting of this quality, and it does wonders for the predictions.

    JhiBlaly'2   DUNCEONS 1DOUIDP
    J.Williom 0 1 ANDREDNIED
    J Waldron 0 1 GeEl ORME
    SiWoter 1 0 MiE M
    MiElmoe 1 0 RCBBOCADNI
    Dani 0 1 AULWigNn
    JWillian REAIADII
    J.aloro 1 0 AWRE ONO
    mi Emoe E O
    S LotRr 0 1 MIRE NM
    o.aldro 0 1 RDOyAonE
    Jleites 1 0 GrOE
    mElmoe 1 0 MILE VM
    JDoNel 0 1 HuDee OMIo

    As mentioned previously, good post-processing is required for these predictions to be useful to us.

    The methodology here is that, for any prediction, there is a finite number of real possibilities. Take the team names for example: there are 29 choices. Provided the reader gives us something that is close enough to the correct answer, we stand to make a good guess.

    We use a metric called Editex (a phonetic algorithm) from the textdistance library to calculate the ‘distance’ between the prediction and a candidate answer. The score is normalised, so a value of 1 represents total string similarity and a value of 0 complete dissimilarity (I don’t know exactly how this is calculated, but it works very well). By comparing the prediction to all possibilities, we can pick out the most likely answer.
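
    The selection step can be sketched with the standard library alone. Note the swap: this example uses difflib’s SequenceMatcher ratio as a stand-in for the Editex metric from textdistance, so the scores will differ from the author’s, but the shape of the logic is the same. The function names and the short team list are illustrative.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1 means identical (compared case-insensitively)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_candidate(prediction, candidates):
    """Return (candidate, score) for the candidate closest to the prediction."""
    score, winner = max((similarity(prediction, c), c) for c in candidates)
    return winner, score


# A small, illustrative subset of the 29 team names.
teams = ["Shiny Shellys", "Dungeons & Druids", "Bar 46 A", "Bar 46 B"]
winner, score = best_candidate("DUNCEONS 1DOUIDP", teams)
print(winner, round(score, 2))
```

    Any function that returns a score between 0 and 1 can take the place of `similarity`, which is where a phonetic metric such as Editex would slot in.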

    Job done! Or not: there are currently around two hundred and fifty players registered in the APL, so this will certainly lead to some inaccurate answers from time to time. We can leverage some more of what we know about the structure of the league to get the right answers more often.

    If we know which teams are involved in the match, we can narrow down the possibilities significantly once again. Unfortunately, we cannot rely on the prediction from what is written in the top cells (often, people forget to even fill these cells in). It could easily be wrong and send us down the wrong path, especially where a venue has an ‘A’ and a ‘B’ team, so that their team names differ by only a single character, e.g. “Bar 46 A” & “Bar 46 B”.

    Instead, we can use a ‘majority’ rule to assert the team. For example, the closest string to “J.Williom” from all possible options is “J. Williams”, and we know that this player is registered for the team “Shiny Shellys”. If this is also true for the majority of the predictions, then we can assert the home team with some confidence.

    Once we have done this, we can guess the player names with this information in hand. The average team has 6–7 players registered; this massively reduces the chances of spurious guesses and increases our overall accuracy.
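
    The two-stage idea (vote for the team first, then match names against that team’s roster only) might look like the sketch below. It is an illustration rather than the author’s code: it uses difflib from the standard library in place of textdistance, and the registration table is a small hypothetical subset built from names that appear on the example card.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical registration table: player name -> team.
REGISTRATIONS = {
    "John Williams": "Shiny Shellys",
    "Stephen Waldron": "Shiny Shellys",
    "Michelle Elmore": "Shiny Shellys",
    "Andre Davies": "Dungeons & Druids",
    "Greg Orme": "Dungeons & Druids",
    "Mike Kay": "Dungeons & Druids",
}


def similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def closest(prediction, candidates):
    """Candidate with the highest similarity to the prediction."""
    return max(candidates, key=lambda c: similarity(prediction, c))


def majority_team(predictions, registrations):
    """Each predicted name votes for the team of its closest registered player."""
    votes = Counter(registrations[closest(p, registrations)] for p in predictions)
    return votes.most_common(1)[0][0]


def resolve_players(predictions, registrations):
    """Fix the team by majority vote, then match names against that roster only."""
    team = majority_team(predictions, registrations)
    roster = [name for name, t in registrations.items() if t == team]
    return team, [closest(p, roster) for p in predictions]


home_predictions = ["J.Williom", "J Waldron", "MiElmoe"]
team, players = resolve_players(home_predictions, REGISTRATIONS)
print(team, players)
```

    Because the second pass only considers one roster, a single badly read name can no longer be pulled towards a similar-looking player on another team.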

    With our pipeline built, in a single click (and some CPU time), we are able to turn this:

    Into this:

    Shiny Shellys   Dungeons & Druids
    John Williams 0 1 Andre Davies
    Stephen Waldron 0 1 Greg Orme
    Stephen Lester 1 0 Mike Kay
    Michelle Elmore 1 0 Rebecca Davies
    John Daniel 0 1 Paul Wignal
    John Williams Rebecca Davies
    Stephen Waldron 1 0 Andre Davies
    Michelle Elmore Greg Orme
    Stephen Lester 0 1 Mike Kay
    Stephen Waldron 0 1 Rebecca Davies
    Stephen Lester 1 0 Greg Orme
    Michelle Elmore 1 0 Mike Kay
    John Daniel 0 1 Andre Davies

    Is it always 100% accurate? Absolutely not; errors are made once or twice per team per card. The variables that dictate these errors are numerous. Some teams:

    · Have many players; the more options there are to choose from, the more likely inaccurate guesses become when the predictions are not accurate enough.

    · Use initials and not full names; when players have the same surname, occasional errors are to be expected, e.g. “A. Evans” & “M. Evans”.

    · Have a captain with bad handwriting (me included). Scribbles might be interpretable by a human, but this pipeline is not designed to handle that; printed text is understandably best.

    · Submit particularly low-resolution or otherwise distorted images; the preprocessing steps are predominantly aimed at addressing the problems that occur because of this.

    · Write with a thin pen or have a particularly light writing style. In my experience, the reader struggles with particularly thin text; some dilation could fix this.

    · Make amendments to the line-up during the match or otherwise scribble on the card to make indications. The detection and correction of such details would be very hard to automate, if not impossible, for example when writing outside of the cells. As such, manual error checking will always be required for this pipeline.

    In rare cases, the wrong team is identified, so attempting to guess the player names for that team is futile.

    Not so often, but occasionally, we cannot find a suitable homography. The resulting perspective transformation is bad or non-existent, due to bad or insufficient feature matches. The factors that contribute to this include but are not limited to:

    · Bad lighting conditions, i.e. glare/shadows.

    · Low resolution; usually fixable by upscaling images. I enlarge images smaller than the template image by a factor of 2.

    · Crinkled cards due to getting wet, or folded cards; this is a nightmare.

    · Other indiscernible causes; perturbing the card by blurring, adding noise or rotating can sometimes ‘dislodge’ the card so that it can proceed through the ‘pipeline’ (puns intended).

    Case 1: Errors/Amendments

    From left to right: artefacts that may introduce errors into the pipeline.

    1. Scribbles in the name cell will create difficulty for the reader.

    2. Scribbles in the score column, 1 retrospectively changed to 0.

    3. Arrows indicating a change in player position, “P Hatfield” & “M Swanson”.

    Case 2: Bad homography due to a low number of matches, unknown root cause

    For this card, you can see that the impact on the names of the away players is minimal, but for the home team we have started to lose some of the letters on the left-hand side. On an unrelated note, the final frame in this match was not played due to a lack of players, just another example of the unusual circumstances that can hinder us in our attempt to fully automate this pipeline.

    Case 3: Bent card leads to bad homography

    When the score card is bent, as in this case, the halves of the card can be thought of as facing different directions. Our assumption is that there is a single point in space for a more optimal camera position; now there may be several.

    As a result, the perspective transformation is less than ideal, although this is not an egregious case.

    Case 4: Light glare makes removing the table difficult

    In this example you can see that, because of the light glare, the table has not been fully removed by our morphological operations. This noise may interfere with the predictions.

    I have found this pipeline to be extremely useful in my role as League Secretary. On a weekly basis, I can complete in minutes, and with half as much attention, a task that would ordinarily take a few hours of careful, considerate and meticulous work, resulting in days of savings by the end of the season. I am greatly incentivised by the potential time savings to continue to improve this pipeline, and improvements can be made across the board.

    Pre-Processing

    One could claim that our efforts would be best focused on achieving the best perspective transformation in the pre-processing stage, and that is a valid claim: if we fail at this step, then everything else is moot. However, the situations in which good homographies are not attainable are either impossible because of the nature of the photograph anyway, or possible but only by extremely complicated and convoluted means. In general, most cards achieve an acceptable perspective transformation.

    As I have to design the cards in the first place, I have the option to include fiducial markers, such as those in a QR code. Several markers may more reliably produce matches to aid in finding the homography.

    Certainly, I could do some more research into the workings of the matcher.

    Reading

    There are many parts of this project where it feels like we are creating art and not doing science. That is also true of working with the reader, the inner workings of which are often enigmatic and unintuitive. The most serious challenge here presents itself in the form of an (almost) infinite space of possibilities that may be considered as an input. This problem would not be nearly so hard if we were dealing with a constant printed font style and not multiple people’s handwriting. It is compounded by the fact that players are signed on and transferred from team to team, different pens are used from week to week and captains are frequently turned over or stood in for. We can never expect a player’s name to be written exactly the way it has been before, and we must always expect to mistake letters or strings of letters for others.

    It is my belief that, with enough engineering, a machine can do anything that my own brain can do. My own brain leverages context from the situation to make its guesses, seamlessly and without recognition. I have knowledge, even if subconsciously, of the names of the players in the league who are signed on for the team, as well as of which strings of letters usually come together to make the names of people in my language and familiar culture. The machine must rely solely on its training to read the text in an image without such additional context. Given enough good quality training, though, it may approximate some of this subconscious knowledge.

    Good quality training is essential. It is my intention to both expand on and curate the training data, bootstrap aggregate combinations of it and evaluate their worth against various benchmarks.

    Post-Processing

    When making a prediction, EasyOCR outputs a confidence score. Similarly, textdistance gives a similarity score between 0 and 1 for each candidate string. Currently, the most likely player name is taken without regard for the possibility of the others. Both measurements could be used to assert confidence in a prediction, or rather a lack of confidence in more ambiguous cases, drawing the user’s attention to these and their alternatives. Instances where names have been crossed out or are missing may also be detectable using this information.
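
    One way this proposal could be approached is sketched below. It is speculative, since the article describes the idea only as a future improvement: the three thresholds are hypothetical and would need tuning on real cards, difflib stands in for textdistance, and `ocr_confidence` is assumed to be the confidence value reported by the reader for that cell.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rank_candidates(prediction, candidates):
    """Candidates sorted from most to least similar, with their scores."""
    return sorted(((similarity(prediction, c), c) for c in candidates), reverse=True)


def review_prediction(prediction, ocr_confidence, candidates,
                      min_confidence=0.3, min_score=0.5, min_margin=0.1):
    """Pick the best candidate and flag it for manual review when doubtful."""
    ranked = rank_candidates(prediction, candidates)
    best_score, best = ranked[0]
    runner_up = ranked[1][0] if len(ranked) > 1 else 0.0
    needs_review = (
        ocr_confidence < min_confidence          # the reader itself was unsure
        or best_score < min_score                # nothing on the roster is close
        or best_score - runner_up < min_margin   # two names are nearly tied
    )
    return {
        "name": best,
        "score": best_score,
        "alternatives": [c for _, c in ranked[1:3]],
        "needs_review": needs_review,
    }
```

    With a roster containing both “A. Evans” and “M. Evans”, a prediction of “Evans” scores the two names equally, so the result is flagged and both names are available to show to the user.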

    https://aberystwythpoolleague.co.uk/

    https://www.jetbrains.com/pycharm/

    https://numpy.org/

    https://en.wikipedia.org/wiki/Round-robin_tournament

    https://en.wikipedia.org/wiki/Optical_character_recognition

    https://github.com/JaidedAI/EasyOCR

    https://nextgeninvent.com/blogs/7-steps-of-image-pre-processing-to-improve-ocr-using-python-2/

    https://docs.opencv.org/3.4/db/d27/tutorial_py_table_of_contents_feature2d.html

    https://docs.opencv.org/4.x/dd/dd7/tutorial_morph_lines_detection.html

    https://docs.opencv.org/4.x/d9/dab/tutorial_homography.html

    https://docs.opencv.org/3.4/d1/de0/tutorial_py_feature_homography.html

    https://www.datatechnotes.com/2023/09/flann-feature-matching-example-with.html

    https://pypi.org/project/textdistance/

    https://github.com/life4/textdistance


