Programatically searching a huge database and sorting results

Завершен Опубликован 3 года (лет) назад Оплачивается при доставке
Завершен

Programatically searching a huge database and sorting results

The goal is to find all of the most common strings that precede each word or phrase in domain names. This is NOT a manual data entry job!

We have a very large database of ALL .com domain names and another list containing thousands of unique words or phrases. We need to take each word or phrase from the second list and search the whole domain database to find every instance of that word or phrase ONLY IF it starts within the first 7 digits of the domain name. The goal is to create a list of the most common words that precede each term, in order of most popular to the least popular, and show the quantity of each.

To give you an example, if the original term is “REBATE” (term number 1753) you would search the database of nearly 100 million domain names, finding any domain name that contains the term REBATE starting in the first 7 digits of the number. Then list all of the occurrences sorted by the first 7 digits in order of most frequent to least. So the outcome should look something like this (JUST AN EXAMPLE!)

1753,REBATE

THE-REBATE,18

YOUR-REBATE,9

CASH-REBATE,8

EASY-REBATE,8

RAPID-REBATE,7

HOME-REBATE,5

MAILIN-REBATE,5

TAX-REBATE,4

NOWAIT-REBATE,3

GETA-REBATE,2

ETC…

You also have to be able to open a rar file.

The output should be single text file listing multiple terms, one after another like the above.

We have thousands of terms to search. Our goal is to find who can do this the most efficiently, so we will award this to multiple freelancers asking them each to put in ONE HOUR worth of time. Then we will select whoever completes the most terms (and does it properly) to continue. We may work with more than one or may find one freelancer stands out and give them all the work.

I have a good track record as an employer with Freelancer and an even bigger positive record with Upwork which I’ve used for several years. I’m doing it this way because it’s impossible to tell from someone’s profile how efficient and accurate of a worker someone is just from their profile or talking to them. This will be a lot of work for whoever is best at it. The worst case, if someone else is faster and gets more done, you’ll get a positive review and a completed job.

The first step is to answer some questions.

***No automatic proposals. If you don’t answer these questions fully, you won’t be considered.

What tools would you use to do this?

Have you worked with a database of 100 million records before?

What’s the biggest database you’ve worked with before?

What would be your goal or expectation in: Terms Searched and Sorted / hour?

Big Data Sales Обработка данных Интеллектуальный анализ данных Data Analytics Data Extraction Data Scraping Веб-скрейпинг

ID проекта: #27394396

О проекте

26 заявок(-ки) Удаленный проект Последняя активность 3 года (лет) назад

Поручен:

obsurf

Hello. I've checked your description in detail. ***** I'm using Python for this, and have deal with oracle phone number database. ***** I am very experienced, honest, have good skills, and also have much availability t Больше

$35 USD / час
(0 отзывов(-а))
0.0
(16 отзывов(-а))
6.0

26 фрилансеров(-а) готовы выполнить эту работу в среднем за $25/час

yanakhokhlova199

Hello there! Happy to bid here since I have the capability to build your project. I am a database expert and have rich experience in manipulating, sorting data. So I think you’d better discuss with me for clear require Больше

$25 USD / час
(6 отзывов(-а))
5.1
mahmoudralizadeh

Hi there Just as a short introduction, we have a team of experienced Machine Learning and Deep Learning Experts who would be glad to help you shape your requirements into products. They have built many machine learning Больше

$25 USD / час
(4 отзывов(-а))
4.1
RusselExpert

Greetings. Thanks for your post. As a senior fullstack developer who has 10+ years of experience, I can deliver satisfactory product in a high quality. What tools would you use to do this? I will use c/c++ because it's Больше

$25 USD / час
(3 отзывов(-а))
4.1
ItMasterDev

you don't need a database for this type of job. if you have a dB you need just a plain csv output and some Linux bash command to filter and sort string inside your file. as long as the file is smaller than available Больше

$28 USD / час
(6 отзывов(-а))
4.2
rishabsingla003

Hello there, I throughly checked the requirements and really well understand it. First I need to know the type of DB we're using like: Mysql, Oracle, or any other. Frankly says, It can't be possible to traverse 1 Mill Больше

$10 USD / час
(3 отзывов(-а))
3.3
kaloyan13

I have very good parallel programming skills, which we can use to solve your task. What tools would you use to do this? - C/C++ and OpenMP or Pthreads Have you worked with a database of 100 million records before? - No Больше

$20 USD / час
(5 отзывов(-а))
3.2
gargankit642

Nice to meet you I am a Machine Learning expert In all domains such as industry, economy and biomedicine, any hard problem can be resolved using Artificial Intelligence Techniques. In my PhD research, I have developed Больше

$34 USD / час
(1 отзыв)
3.0
Lavish28

Please give me a chance i am beginner i will try my best i will make u happy to my work i alaways do my best in everything work

$25 USD / час
(0 отзывов(-а))
0.0
nizamfarhas

Hi I checked your post with title "Programatically searching a huge database and sorting results ". I am familiar to python. I want to discuss your project in detail. please contact me thanks

$15 USD / час
(0 отзывов(-а))
0.0
Khemchand1995

I am a professional data entry expert. I can assure you, that I shall give you top class work. You will like to come back to me again and again.

$25 USD / час
(0 отзывов(-а))
0.0
wordsworld007

Dear Sir / Mam, I am interested in your job vacancy. I have bundle of experience of research base projects in the past. I am very confident to provide you accurate results as I understand the basic requirements. I wou Больше

$10 USD / час
(0 отзывов(-а))
0.0
KarlSvend

Hello! I provide data collection services, through web scraping and text mining, for data interpretation, comparison, composition, distribution and relationship. Feel free to contact me!

$25 USD / час
(0 отзывов(-а))
0.0
piraprakash379

I have 9 years of experience in software development. I recently worked on something similar with client where they will search address based on some keywords like 'street 9' searched within 100million property address Больше

$22 USD / час
(0 отзывов(-а))
0.0
pmoleleki

I will be using SAS to perform the work I have got more 15 years in data analytics and data quality

$22 USD / час
(0 отзывов(-а))
0.0
bradleyglazer

Hi, I like your approach, nice. The will answers your questions are as follows: I would use python, I am learning HTML, php and javascript at the moment, but like coding at a lower level. I coded in ruby for Standard Больше

$20 USD / час
(0 отзывов(-а))
0.0
aryacena12

I've skills in data entry , data processing , MS-Excel , MS-Word and many more. Plus I complete my work right on time and if you hire me I'll never let you down . It would be pleasure to work with you . Ping me if you Больше

$40 USD / час
(0 отзывов(-а))
0.0
Tan75h

i read your contents and understand what is work. i will do this job. i will filter the data and find all of the most common strings that precede each word or phrase. i am expert in Ms Excel. I seen your contents. Больше

$23 USD / час
(0 отзывов(-а))
0.0