With an OCR (optical character recognition) system, extracting text from a document becomes as simple as taking a photo of it. There are many tools that can do this, but in a business setting, you may need a more “customized” approach. So, we are going to teach you how to create your own image-to-text converter using Python.
How to Convert Image into Text Format Using Python
- Open a Google Colab Notebook
- Install Tesseract
- Importing Libraries and Modules
- Import an Image and Extract the Text
Here are the prerequisites that you need to fulfil to follow this guide:
- Have an internet connection;
- Have a Google account.
Now, let’s dive right into it.
1. Open a Google Colab Notebook
Google Colaboratory (Colab for short) is a free online notebook environment that anyone with a Google account can use. The reason we want to use Google Colab is that it frees you from having to install and set up a new environment on your own machine.
This is great for writing your own code without having to worry about any storage problems. It also avoids issues such as system incompatibility that can prevent a local environment from running properly.
Anyhow, create a new notebook and then we can start installing the relevant libraries and dependencies.
2. Install Tesseract
Tesseract is an open-source OCR engine, and pytesseract is a Python wrapper around it. Together they give you prebuilt functions for OCR, which take a ton of pressure off the programmer. You simply have to install both, and the rest of the work can be done using some simple code.
For our demonstration, we are going to be using this open-source code from GitHub, courtesy of Bhadreshpsavani.
So, let’s see what installing the libraries looks like in action.
In your first code box, type the following:
!sudo apt install tesseract-ocr
Don’t worry about the messages. As long as you don’t see any red text, everything is going fine.
Then open another code box, and then type the following:
!pip install pytesseract
It should look like this. Don’t worry if you get some sort of warning about previously imported packages. Just press the “restart runtime” button and move on to the next step.
Now, all libraries have been installed and can be used with your code.
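If you would like to confirm that the install worked before moving on, here is a minimal sketch. It assumes the apt package placed an executable named tesseract on the PATH, and the helper name tesseract_version is our own invention rather than part of any library.

```python
import shutil
import subprocess


def tesseract_version():
    """Return the first line of `tesseract --version`, or None if the binary is missing."""
    path = shutil.which("tesseract")  # look the executable up on PATH
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some builds print the version to stderr instead of stdout, so check both.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


print(tesseract_version() or "Tesseract not found; rerun the install cell.")
```

If the cell prints a version string, the engine is installed and pytesseract should be able to find it.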
3. Importing Libraries and Modules
Now that all libraries have been installed, we can start importing them so that we can use them. Along with the pytesseract library that we installed, we are also going to use some utility modules from Python’s standard library: shutil, os and random.
shutil provides high-level file operations such as copying and moving files, which is great for automating them. os lets Python interface with the device’s operating system and handles tasks such as creating, removing, and listing the contents of a directory. Finally, random is good for generating…well…random numbers.
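To make the roles of these three modules concrete, here is a small sketch that is separate from the OCR code itself. The file and folder names are made up for illustration, and tempfile is an extra standard-library module we use only so the example runs in a throwaway directory.

```python
import os
import random
import shutil
import tempfile

# Work inside a throwaway directory so nothing else is touched.
workdir = tempfile.mkdtemp()

# os: build paths and create directories.
source = os.path.join(workdir, "note.txt")
with open(source, "w") as handle:
    handle.write("hello")
backup_dir = os.path.join(workdir, "backup")
os.makedirs(backup_dir)

# shutil: copy the file into the new directory (returns the new file's path).
copied = shutil.copy(source, backup_dir)

# os again: list what is in the working directory.
entries = sorted(os.listdir(workdir))

# random: pick one entry from the listing.
choice = random.choice(entries)

print(entries, os.path.basename(copied), choice)

# shutil: remove the whole directory tree when done.
shutil.rmtree(workdir)
```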
So, open a new code box and type in the following:
import pytesseract
import shutil
import os
import random

# Pillow provides Image under the PIL package; fall back for very old installs.
try:
    from PIL import Image
except ImportError:
    import Image
Run this and now we are ready to start doing OCR. Now, we just need to import an image and extract text from it.
4. Import an Image and Extract the Text
To import an image into your Google Colab notebook, you need to open a new code block and type in the following:
Then run it. This will create a button for uploading an image that looks like this:
Just click the “Choose files” button and select an image with text in it. It will be uploaded to the notebook.
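For reference, the upload step described above is commonly written with the files.upload() helper from the google.colab package. The sketch below assumes that is what the upload cell uses. The wrapper name upload_images is hypothetical, and the helper only works inside a Colab runtime.

```python
def upload_images():
    """Open Colab's file picker and return the uploaded files as {filename: bytes}."""
    try:
        # google.colab ships with Colab runtimes; it is not available elsewhere.
        from google.colab import files
    except ImportError as exc:
        raise RuntimeError("This helper only works inside a Google Colab notebook.") from exc
    return files.upload()


# In a Colab cell you would then run:
#     uploaded = upload_images()
#     print(list(uploaded))  # names of the uploaded files
```

The uploaded files are also saved to the notebook's working directory, which is why the image can later be opened by its file name.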
In our demonstration, we used an image named “OCR sample.png”; however, you can use whatever image you want.
Then you need to run the following command, which will use the OCR libraries to extract text from the image:
extractedInformation = pytesseract.image_to_string(Image.open('OCR sample.png'))
Now, the text has been extracted, but we cannot see it. To see it, we need to write another line of code that will print it out. Here it is:
print(extractedInformation)
This will prompt the notebook to print out the text extracted from the image.
And look at that, a perfect result. With this, we have successfully converted an image into text using Python.
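If you want to keep the result instead of only printing it, here is one possible follow-up, offered as a sketch. The helper names clean_ocr_text and save_text are our own, and the sample string merely stands in for extractedInformation. We assume the raw output may contain blank lines and a trailing form-feed character, which some Tesseract versions append.

```python
import os


def clean_ocr_text(raw):
    """Drop form feeds, surrounding spaces, and blank lines from OCR output."""
    lines = (line.strip() for line in raw.replace("\x0c", "").splitlines())
    return "\n".join(line for line in lines if line)


def save_text(text, image_name, out_dir="."):
    """Write text to <image name without extension>.txt in out_dir and return the path."""
    stem, _ = os.path.splitext(os.path.basename(image_name))
    path = os.path.join(out_dir, stem + ".txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return path


sample = "  Hello OCR  \n\n second line \n\x0c"  # stand-in for extractedInformation
print(clean_ocr_text(sample))
```

In the notebook, you could call save_text(clean_ocr_text(extractedInformation), 'OCR sample.png') to get a text file saved next to the image.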
Before You Go…
If you found this too hard to follow, then worry not. There are other ways to convert images to text that do not require any coding. You can simply go online and search for an image-to-text converter. This will show you plenty of tools that can do what we just did with Python. Many of them are even made using Python, so technically, you would still be converting an image to text by using Python.
For this article, we went to Google and searched for the term “image-to-text converter.” We picked the top result, which in this case was this website:
https://www.imagetotext.info/
Then we tested this top-ranking tool with the same image that we put in our Python code to see if it worked well. Here is how it went down.
And the output was as follows:
As you can see, the results are pretty good, so this is a solid alternative to writing Python code and creating your own image-to-text converter. The good thing is that this tool is also Python-based, so you would technically still be using Python to convert images to text.
Conclusion
And that is it for this article. We saw two different ways in which you can convert images to text form using Python. One method was very hands-on and showed you how to use code to convert an image to text, while the other method just relied on an online tool.
Whichever method you end up using, we just hope it works for you and helps you out.