ML/AI/data science

We process mortgage documents: everything that goes into originating a mortgage that you, as a borrower, never see. Often the only copy of a document we have is a scan of a fax of a printout. Our system reads and interprets that document, turning a PDF into structured business content that our customers can act on.

This dataset is the output of the OCR stage of our data pipeline. Because these are sensitive financial documents, we have not provided the raw extracted text; instead, we have obscured it. Each word in the source is mapped to one unique value in the output, so a word that appears in multiple documents yields the same value each time. The word order comes directly from our OCR layer, so it should be roughly in reading order.
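The posting does not say how the words were obscured, so the following is only a hypothetical sketch of a mapping with the stated property (one word, one value, reused across documents). The class name `WordObfuscator` and the 12-hex-character token length are assumptions; the length is chosen only to resemble the sample line in this posting.

```python
import secrets
from typing import Dict, Set


class WordObfuscator:
    """Hypothetical consistent word-to-token mapping (not the poster's actual scheme)."""

    def __init__(self) -> None:
        self._token_for: Dict[str, str] = {}
        self._used: Set[str] = set()

    def token(self, word: str) -> str:
        # Reuse the existing token so the same word always maps to the same value.
        if word not in self._token_for:
            candidate = secrets.token_hex(6)  # 6 random bytes -> 12 hex characters
            while candidate in self._used:  # retry on the (unlikely) collision
                candidate = secrets.token_hex(6)
            self._used.add(candidate)
            self._token_for[word] = candidate
        return self._token_for[word]

    def obfuscate(self, text: str) -> str:
        # Word order is preserved; only the word identities are hidden.
        return " ".join(self.token(word) for word in text.split())


obfuscator = WordObfuscator()
doc_a = obfuscator.obfuscate("notice of cancellation")
doc_b = obfuscator.obfuscate("cancellation of policy")
```

Because the mapping is shared across documents, word frequencies and co-occurrence survive the obfuscation, which is what makes classification on the obscured tokens possible.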

Here is a sample line:

CANCELLATION NOTICE,641356219cbc f95d0bea231b ... [lots more words] ... 52102c70348d b32153b8b30c

The first field is the document label. Everything after the comma is a space-delimited list of word values.
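As a minimal sketch of reading this format, the function below splits a row into its label and its word values. The name `parse_line` is hypothetical, the sample is shortened to the four values visible above, and the code assumes that word values never contain commas.

```python
from typing import List, Tuple


def parse_line(line: str) -> Tuple[str, List[str]]:
    """Split one dataset row into (label, word values)."""
    # Split on the first comma only: labels may contain spaces,
    # and word values are assumed never to contain commas.
    label, _, words = line.rstrip("\n").partition(",")
    return label.strip(), words.split()


# Shortened version of the sample line (the elided words are left out).
sample = "CANCELLATION NOTICE,641356219cbc f95d0bea231b 52102c70348d b32153b8b30c"
label, tokens = parse_line(sample)
```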

The dataset is included as part of this repo.

Your Mission

Should you choose to accept it ...

Train a document classification model. Deploy your model to a public cloud platform (AWS/Google/Azure/Heroku) as a webservice. Then send us an email with the URL of your GitHub repo, the URL of your publicly deployed service (so we can submit test cases), and a recorded screencast demo of your solution's UI, its code, and its deployment steps. Also, we use AWS, so we are partial to you using that ... just saying.
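The posting does not prescribe a model. As one possible starting point, here is a rough sketch of a multinomial Naive Bayes classifier over the obscured word values, written with the standard library only. The class name `NaiveBayes`, the label "BILL", and the toy word values are all made up for illustration; a real submission would more likely use an established ML library and the full dataset.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Set, Tuple


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, rows: Iterable[Tuple[str, List[str]]]) -> "NaiveBayes":
        self.doc_counts: Counter = Counter()
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        self.vocab: Set[str] = set()
        for label, tokens in rows:
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = sum(self.doc_counts.values())
        return self

    def predict(self, tokens: List[str]) -> Tuple[str, float]:
        vocab_size = len(self.vocab)
        scores: Dict[str, float] = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + vocab_size
            score = math.log(n_docs / self.total_docs)  # log prior
            for tok in tokens:
                if tok in self.vocab:  # tokens never seen in training are skipped
                    score += math.log((counts[tok] + 1) / denom)
            scores[label] = score
        # Normalize the log scores into a posterior used as the confidence.
        top = max(scores.values())
        weights = {lab: math.exp(s - top) for lab, s in scores.items()}
        best = max(weights, key=weights.get)
        return best, weights[best] / sum(weights.values())


# Toy training rows: hypothetical labels and word values, not real data.
train = [
    ("CANCELLATION NOTICE", ["a1", "b2", "c3"]),
    ("CANCELLATION NOTICE", ["a1", "c3", "d4"]),
    ("BILL", ["e5", "f6", "b2"]),
    ("BILL", ["e5", "f6", "g7"]),
]
model = NaiveBayes().fit(train)
predicted_label, confidence = model.predict(["a1", "c3"])
```

The confidence here is the model's normalized posterior, which Naive Bayes tends to push toward 0 or 1, so it should be read as a ranking signal rather than a calibrated probability.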

Measurement Criteria

We will measure your solution on the following criteria:

Does your webservice work?

Is your hosted model as accurate as ours? Better? (think confusion matrix)

Is your code understandable, readable, and/or deployable?

Do you use industry best practices in training/testing/deploying?

Do you use modern packages/tools in your code and deployment pipeline like this?

How effective is your demo: did you frame the problem and your approach to a solution, and did you explain your thinking and any remaining gaps?

Are we able to run your test cases against your webservice? Can we run them against our webservice?
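For the accuracy criterion, a confusion matrix can be tallied in a few lines. This is a small sketch with hypothetical function names and made-up labels; it counts (actual, predicted) pairs and derives overall accuracy from the diagonal.

```python
from collections import Counter
from typing import Sequence


def confusion_matrix(actual: Sequence[str], predicted: Sequence[str]) -> Counter:
    """Count how often each (actual, predicted) label pair occurs."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return Counter(zip(actual, predicted))


def accuracy(matrix: Counter) -> float:
    """Share of predictions on the diagonal (actual == predicted)."""
    total = sum(matrix.values())
    correct = sum(n for (act, pred), n in matrix.items() if act == pred)
    return correct / total if total else 0.0


# Made-up labels for illustration only.
actual = ["BILL", "BILL", "CANCELLATION NOTICE", "CANCELLATION NOTICE"]
predicted = ["BILL", "CANCELLATION NOTICE", "CANCELLATION NOTICE", "CANCELLATION NOTICE"]
matrix = confusion_matrix(actual, predicted)
overall = accuracy(matrix)
```

The off-diagonal cells show which document types get mistaken for which, which a single accuracy number hides.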

A few more details

Webservice spec:


Respect the Content-Type header (application/json and text/html at minimum; other types are a bonus)

Discoverable from root path

A URL-encoded GET parameter "words" returns the predicted document type in the field "prediction"; returning the confidence in the field "confidence" is a bonus

HTML pages should be readable by a human and allow for action, i.e. an input field, a submit button, etc.
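One way to read the spec above is sketched below with Python's built-in `http.server`. Several things are assumptions: the `predict` function is a placeholder that always returns the same class, the names `render`, `Handler`, and `serve` are hypothetical, port 8000 is arbitrary, and the response type is chosen from the request's Accept header with Content-Type as a fallback, which is only one interpretation of "respect the content-type header".

```python
import html
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple
from urllib.parse import parse_qs, urlparse


def predict(words: str) -> Tuple[str, float]:
    """Placeholder model (hypothetical): always returns one fixed class."""
    return "CANCELLATION NOTICE", 0.0


def render(words: str, accept: str) -> Tuple[str, bytes]:
    """Build the response as (content type, body) for the requested media type."""
    label, confidence = predict(words)
    if "application/json" in accept:
        body = json.dumps({"prediction": label, "confidence": confidence})
        return "application/json", body.encode("utf-8")
    # Default to a human-readable page with an input field and a submit button.
    page = (
        "<html><body><form method='get' action='/'>"
        "<input name='words' value='{w}'>"
        "<button type='submit'>Classify</button></form>"
        "<p>prediction: {p}, confidence: {c:.2f}</p></body></html>"
    ).format(w=html.escape(words, quote=True), p=html.escape(label), c=confidence)
    return "text/html; charset=utf-8", page.encode("utf-8")


class Handler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # parse_qs URL-decodes the "words" parameter.
        query = parse_qs(urlparse(self.path).query)
        words = query.get("words", [""])[0]
        accept = (self.headers.get("Accept")
                  or self.headers.get("Content-Type")
                  or "text/html")
        content_type, body = render(words, accept)
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Run the service until interrupted; the form is served from the root path."""
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Calling `serve()` starts the service locally. Keeping `render` separate from the handler lets the JSON and HTML responses be checked without opening a socket.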

Even a broken clock is right twice a day. A working webservice is a good first goal. It could simply return the highest-likelihood document class.
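If "highest likelihood" is read as "most frequent class in the training data" (an assumption), that baseline is a few lines of Python. The function name and the labels below are made up for illustration.

```python
from collections import Counter
from typing import Iterable, Tuple


def majority_baseline(labels: Iterable[str]) -> Tuple[str, float]:
    """Return the most frequent label and its share of the training labels."""
    counts = Counter(labels)
    if not counts:
        raise ValueError("no labels given")
    label, n = counts.most_common(1)[0]
    return label, n / sum(counts.values())


# Made-up label distribution for illustration only.
baseline_label, baseline_share = majority_baseline(
    ["BILL", "BILL", "BILL", "CANCELLATION NOTICE"]
)
```

The returned share is also the accuracy this baseline would reach on data with the same label distribution, which gives a floor that any trained model should beat.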

Skills: Python, Machine Learning (ML), Data Analysis and Processing, Amazon Web Services

About the employer:
(0 reviews) Los Angeles, United States

Project ID: #20647860

12 freelancers are bidding an average of $207 for this job


Hello sir. As a computer vision, OpenCV and OCR expert, and a machine/deep learning expert, I'm glad to see your project. If you check my profile, you can see I have deep knowledge of machine/deep learning algorithms, OCR al...

$500 USD in 7 days
(101 reviews)

Pro Python/Machine Learning (ML)/Amazon Web Services/Data Science Expert! Dear client, once I saw your project, it caught my attention because I am very interested in your project and also have rich experiences an...

$140 USD in 7 days
(29 reviews)

Dear sir. Your project attracted my attention at first glance, because I have extensive experience in ML & AI programming. I'm really confident about your project, and very eager to join it. If we have a chance...

$200 USD in 7 days
(67 reviews)

Hello, I have gone through your job posting and am very interested in working with you. I am an expert in this field. I have already completed several projects like this. For evidence you can see my profile. Pl...

$250 USD in 5 days
(32 reviews)

Hi, I am Manish with HybridSkill. We have a team that has expertise in highly specialized technical training and infrastructure management services. Using our expertise in niche technologies, for instance, public and p...

$250 USD in 7 days
(4 reviews)

Dear sir. I read your project description very carefully. I have really rich experience in developing ML & AI programs, so your project is very interesting to me. In the past, I developed many projects related to Programm...

$140 USD in 7 days
(6 reviews)

A Data Scientist with experience in Python, R programming, R Shiny, RStudio and anything related to data science and Python. Master in Engineering, Electrical and Electronic Engineer, who is dynamic, reliable, resourc...

$30 USD in 7 days
(2 reviews)

ApexTechnomatics is the next-generation, unified service provider of AI-based ERP solutions Odoo and ERPNext. Our dedicated team of 15+ Python developers offerings that provide ongoing value and innovation. Our diver...

$250 USD in 15 days
(1 review)

Greetings. I have worked with a fintech company providing solutions to banks and NBFCs to help them provide loans to the right applicants. I have 3 years of experience in Python and machine learning. I completely understand yo...

$200 USD in 7 days
(1 review)

Dear User, we have seen your project details, and I can see it is striking, so here I am introducing my company. We are working with a pool of data science and deep learning projects. Please let me know. We can support you...

$140 USD in 7 days
(1 review)

• Profound understanding of the mathematical fundamentals of machine learning and statistics, with an emphasis on non-parametric and non-linear methods. Practiced in data architecture including data ingestion pipeline...

$240 USD in 7 days
(0 reviews)

We have expertise in document classification and 3 years of experience. I have experience in NLP and deep learning.

$140 USD in 7 days
(0 reviews)