Search for headings in PDF pages using Python

I want to extract titles from PDF pages and match them against a search query. See the attached file for an example.

In the attached file, if I search for "Balance Sheet", the code should return page 232.

So the input will be a string and the output will be a page number (an integer).

Note that "balance sheet" appears in multiple places in the document, but we want to return only those pages where it appears in the title.
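To make the expected input and output concrete, here is a minimal sketch of the matching step only. It assumes the page titles have already been extracted into a dictionary keyed by page number. The names normalize, find_title_pages and titles_by_page are hypothetical, and every title except "Balance Sheet" on page 232 is made up for illustration. It returns a list rather than a single integer, on the assumption that more than one page could carry the same title.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace,
    so that 'BALANCE  SHEET:' and 'Balance Sheet' compare equal."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def find_title_pages(page_titles: dict[int, str], query: str) -> list[int]:
    """Return the page numbers whose title contains the query.

    Only the title string of each page is consulted, never the body text,
    so a phrase that merely occurs in running text does not match.
    """
    needle = normalize(query)
    if not needle:
        return []
    return sorted(
        page for page, title in page_titles.items()
        if needle in normalize(title)
    )


# Hypothetical titles; only page 232 comes from the description above.
titles_by_page = {
    10: "Directors' Report",
    232: "BALANCE SHEET",
    233: "Statement of Cash Flows",
}

print(find_title_pages(titles_by_page, "Balance Sheet"))  # [232]
```

Substring matching on normalized text is a deliberate simplification: it would also accept a title such as "Consolidated Balance Sheet". Whether that is desirable depends on the documents.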

If you have used pdfminer before, this should be easy for you. I'm also open to other mainstream languages such as Java.

You can also explore the pdftitle library, if that works.
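One possible way to get per-page titles with pdfminer is sketched below. It assumes the pdfminer.six package and relies on a heuristic that is an assumption, not something confirmed for the attached file: the title of a page is taken to be the text set in the largest font on that page. The helper names pick_title and page_titles and the tolerance parameter are hypothetical.

```python
from typing import Iterable


def pick_title(lines: Iterable[tuple[str, float]], tolerance: float = 0.5) -> str:
    """Join the lines whose font size is within `tolerance` points of the
    largest font size seen on the page. Each item is (text, font_size)."""
    cleaned = [(text.strip(), size) for text, size in lines if text.strip()]
    if not cleaned:
        return ""
    biggest = max(size for _, size in cleaned)
    return " ".join(text for text, size in cleaned if biggest - size <= tolerance)


def page_titles(pdf_path: str) -> dict[int, str]:
    """Map each physical page index (starting at 1) to its guessed title."""
    # Imported here so that pick_title can be used and tested
    # without pdfminer.six being installed.
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTChar, LTTextContainer, LTTextLine

    titles: dict[int, str] = {}
    for page_number, layout in enumerate(extract_pages(pdf_path), start=1):
        lines = []
        for element in layout:
            if not isinstance(element, LTTextContainer):
                continue
            for line in element:
                if not isinstance(line, LTTextLine):
                    continue
                # Font size of a line = size of its largest character.
                sizes = [ch.size for ch in line if isinstance(ch, LTChar)]
                if sizes:
                    lines.append((line.get_text(), max(sizes)))
        titles[page_number] = pick_title(lines)
    return titles
```

The page numbers produced this way are physical positions in the file, which may differ from the numbers printed on the pages. Since layout analysis of a long report is slow, the resulting dictionary would probably be built once per document and reused for every query.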

Speed and accuracy are the most important requirements. We tried doing this with PyPDF, but it was not accurate enough, so keep that in mind.

We can provide many other example documents if needed.

Skills: Python, Data Mining, PDF, Java

About the client:
(2 reviews) Gurgaon, India

Project ID: #32749279

14 freelancers are ready to do this job for an average of ₹24821

(130 reviews)
(37 reviews)
(91 reviews)

Hi, I am a very experienced statistician, data scientist and academic writer. I have completed several PhD-level thesis projects involving advanced statistical analysis of data. I have worked with data from several comp...

₹35000 INR in 7 days
(38 reviews)

Hello Sir! I think I'm a great fit for this project because I have an interest in your project and can deliver on time, according to your specifications.

₹25000 INR in 7 days
(10 reviews)

Hello sir, I can make this for you. I am a Python developer with more than 2 years of experience. I have done many projects in the past. I can work on: 1. Web Scraping / Data Science / ML 2. Django 3. App development 4. ...

₹12500 INR in 2 days
(37 reviews)
(3 reviews)

Hello sir, I've read your job posting carefully. I can search for titles in the PDF successfully. Here are my Python skills: data visualization (cryptocurrency trading bot, stock prediction, prediction algorithm for Spo...

₹37500 INR in 3 days
(1 review)

Professional Python & PDF Processing Expert! Best Result in Time! Dear sir, I've read your project description very carefully. I have extensive experience in Python & PDF processing, so I belie...

₹25000 INR in 7 days
(2 reviews)

I want to volunteer for your project of encoding and decoding. If you feel I am worth it, you can give it a try. I will share an image of the output for your confidence and only then ask for payment. If you want you can...

₹35000 INR in 7 days
(0 reviews)

Hi. I am a data scientist. I am very familiar with deep learning APIs such as TensorFlow, fastai and MXNet. I have good hands-on experience with advanced R and Python, BI tools and technologies, AI and big data. I have qu...

₹25000 INR in 7 days
(0 reviews)

I am a software developer and will be able to do the above-mentioned task in 7 days.

₹15000 INR in 7 days
(0 reviews)

We can build this using Tesseract and OpenCV. Using NLP, we can also use pdfminer. Alternatively, we can also use AWS Textract.

₹25000 INR in 7 days
(0 reviews)

I am an expert in data entry, typing, editing, etc. If you hire me for this project, I assure you that I will complete it on time. Thank you.

₹25000 INR in 7 days
(0 reviews)