reading mate
Sep 27, 2021     5 minutes read

1. Is it OK to judge people by the books they read?

Well, not really. Personally, from time to time I read complete crap on purpose, out of curiosity and to form my own opinion about it (I didn’t like Fifty Shades of Grey as much as I had expected), and sometimes I read books that may seem completely contradictory to my views or beliefs, because I try to stay open and give them a chance to convince me (which sometimes happens, but e.g. Osho’s meditation techniques were a little too extreme for me).

But still, anytime I visit someone and see a well-stocked bookshelf, I can’t help but analyse it thoroughly and pass judgment on my host. Sorry for being superficial.

Who knows, maybe this person would be my reading mate?

2. Is there any way to automate finding new reading mates?

Wow, this is going to be a true programmer move.

Theoretically, I could take a picture of my friend’s bookshelf with my smartphone, and the app that I wrote would:

In this particular blog post I will stick to the first dot only, because smartphone app development is not really my cup of tea, and the last dot is super difficult and… as I wrote at the beginning, I wouldn’t really believe it.

3. Reading text from an image

is a fairly popular machine learning problem, known as OCR, or Optical Character Recognition. Let’s move on to the specifics:

4. Solution

The task turned out to be quite simple, so I didn’t have to tweak anything, or even know Keras at all. Here’s what I had to do:

Here comes the code:

import matplotlib.pyplot as plt
import keras_ocr
import numpy as np


def read_image():
    url = "https://pbs.twimg.com/media/E7zIo5RXMAQ9kEV.jpg"
    image = keras_ocr.tools.read(url)  # 1
    im = np.rot90(image)  # rotate image 90 degrees counterclockwise
    return im


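def read_titles(pred):
    books = []  # one list of words per book spine
    coords_max = 0  # bottom edge (largest y) of the previous word box
    for text in pred:
        letters, coords = text  # recognized word and the corners of its box
        # A box that starts below the previous one begins a new book.
        if not books or min(coords[:, 1]) > coords_max:
            if books:
                # Put the words of the finished book in left-to-right order.
                books[-1] = [x for _, x in sorted(zip(order, books[-1]))]
            order = []
            books.append([])
        books[-1].append(letters)
        coords_max = max(coords[:, 1])
        order.append(min(coords[:, 0]))  # left edge, used for sorting

    # Keep only strings long enough to plausibly be a title.
    titles = [" ".join(b).upper() for b in books]
    return [t for t in titles if len(t) > 10]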


def plot_predictions(im, pred):
    _, axs = plt.subplots(nrows=1, figsize=(20, 20))
    return keras_ocr.tools.drawAnnotations(image=im, predictions=pred, ax=axs)


def main():
    im = read_image()
    pipeline = keras_ocr.pipeline.Pipeline()  # 2
    pred = pipeline.recognize([im])[0]  # 3
    plot_predictions(im, pred)
    print(read_titles(pred))  # 4

main()

The code is, IMHO, surprisingly short considering how difficult the task actually is. Most of the work is done by the keras_ocr package, but a few parts may seem a little confusing:
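
The grouping in read_titles is probably the least obvious part. The sketch below restates that idea in plain Python so it can be run without keras_ocr: a word whose box starts below the previous box begins a new spine, and each spine's words are then ordered left to right. The names group_titles, make_box and min_length, as well as the sample boxes, are made up for illustration, and the sketch assumes (as the loop above appears to) that words arrive roughly top to bottom. Unlike the loop above, it sorts every group, including the last one.

```python
def group_titles(predictions, min_length=10):
    """Turn (word, box) pairs into one upper-case string per book spine.

    Each box is a sequence of (x, y) corner points. Hypothetical helper,
    not part of keras_ocr.
    """
    groups = []  # each group holds (left_edge, word) pairs for one spine
    previous_bottom = None
    for word, corners in predictions:
        xs = [x for x, _ in corners]
        ys = [y for _, y in corners]
        # A box that starts below the previous box begins a new spine.
        if previous_bottom is None or min(ys) > previous_bottom:
            groups.append([])
        groups[-1].append((min(xs), word))
        previous_bottom = max(ys)

    titles = []
    for group in groups:
        ordered = sorted(group, key=lambda pair: pair[0])  # left to right
        titles.append(" ".join(word for _, word in ordered).upper())
    # Short strings are unlikely to be a full author-plus-title line.
    return [t for t in titles if len(t) > min_length]


def make_box(left, top, right, bottom):
    """Build the four corners of an axis-aligned box (made-up test data)."""
    return [(left, top), (right, top), (right, bottom), (left, bottom)]


sample = [
    ("the", make_box(200, 10, 250, 40)),
    ("klara", make_box(10, 10, 90, 40)),
    ("sun", make_box(260, 10, 310, 40)),
    ("and", make_box(100, 10, 190, 40)),
    ("north", make_box(150, 60, 230, 90)),
    ("passage", make_box(30, 60, 140, 90)),
    ("a", make_box(5, 60, 25, 90)),
    ("penguin", make_box(10, 110, 100, 140)),
]
print(group_titles(sample))
# ['KLARA AND THE SUN', 'A PASSAGE NORTH']
```

The lone "penguin" line is dropped by the length filter, which is the same trick the original code uses to skip publisher logos and other stray words.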

5. Results

['SPUFFORD FRANCIS LIGHI PERPETUAL', 
'S  GREAT CIRCLE', 
'SUNJEEV SAHOTA CHINA ROOM', 
'BEWILDERMENT RICHARD POWERS', 
'THE JORTUNE MEN MOHAMED NADIFA', 
'NO ONE IS TALKING ABOUT THIS PATRICIA LOCKWOOD', 
'MARY LAWSON A TOWN CALLED SOLACE', 
'AN ISLAND KAREN JENNINGS', 
'KAZUO ISHIGURO KLARA AND THE SUN', 
'TIR SWEETNESS O WATER NATHA HARRIS', 
'THE PROMISE DAMON GALGUT', 
'RACHELCUSK SECOND PLACE', 
'ARUDPRAGASAM ANUK NORTH PASSAGE A']

keras_ocr clearly can read some text, but not perfectly:

Another issue is that this algorithm cannot tell the author from the title, which is understandable; sometimes I can’t tell them apart myself either.

6. Summary

It was quite fun to use OCR, but it seems that reading is a little more than just recognizing letters, e.g. you have to know:

All of these problems can be solved programmatically, but it would require much more sophisticated software.
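
As a small taste of what "more sophisticated" could mean, here is a minimal sketch that compares each OCR line against a list of known books using difflib from the Python standard library. It is only an illustration under a few assumptions: the CATALOGUE list is hand-typed and hypothetical (a real app would query some book database), match_book and normalise are made-up helper names, and the 0.6 cutoff is an arbitrary starting point. Sorting the words before comparing makes word order irrelevant, and because the catalogue stores author and title separately, a successful match also answers the "which part is the author?" question.

```python
import difflib

# Hypothetical catalogue of (author, title) pairs, typed by hand.
CATALOGUE = [
    ("Francis Spufford", "Light Perpetual"),
    ("Nadifa Mohamed", "The Fortune Men"),
    ("Kazuo Ishiguro", "Klara and the Sun"),
    ("Anuk Arudpragasam", "A Passage North"),
]


def normalise(text):
    """Upper-case the words and sort them, so word order does not matter."""
    return " ".join(sorted(text.upper().split()))


def match_book(ocr_line, catalogue=CATALOGUE, cutoff=0.6):
    """Return the closest (author, title) pair, or None if nothing is close."""
    keys = {normalise(f"{author} {title}"): (author, title)
            for author, title in catalogue}
    hits = difflib.get_close_matches(
        normalise(ocr_line), list(keys), n=1, cutoff=cutoff
    )
    return keys[hits[0]] if hits else None


for line in ["SPUFFORD FRANCIS LIGHI PERPETUAL",
             "THE JORTUNE MEN MOHAMED NADIFA",
             "ARUDPRAGASAM ANUK NORTH PASSAGE A"]:
    print(line, "->", match_book(line))
```

This only helps for books that are already in the catalogue, and a cutoff that is too low would happily match a spine to the wrong book, so treat the numbers as something to tune rather than a recommendation.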