One of the tasks I frequently use ChatGPT-like tools for is extracting markdown text from images. I enjoy watching conference videos on YouTube. Often, I find slides during these videos that I want to keep for future reference. To achieve this, I take screenshots and add them to my notebook. However, if I forget to add any textual comments with the screenshots, searching for them later becomes difficult. Additionally, there are times when I need to extract text in markdown format from the screenshots for future use.
Let’s look at an example screenshot that I took yesterday from a talk by OpenAI engineer on Fine Tuning.
