main hypothesis will be that training LayoutLM on more data, obtained through data augmentation, will increase performance on the third task of the SROIE challenge. First, we will replicate the baseline LayoutLM model and fine-tune it on the SROIE data as-is, to try to reproduce the results of the LayoutLM paper in our environment, keeping track of hyperparameters such as the number of training epochs and the learning rate. Then, we will create our augmented dataset by replacing the targets of the information extraction task (revenue number, firm name, and address) with randomly generated ones that still have the correct format. We will then check whether using the augmented data during fine-tuning improves performance. Following this main test, we plan to run additional analysis on whether further self-supervised pre-training on the target dataset could be useful.
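The augmentation step could be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not a settled design: it assumes each receipt is available as parallel lists of OCR words and per-word field labels, and that "correct format" means keeping each character's class (digit, uppercase letter, lowercase letter, punctuation) and position. The function names and the label strings `COMPANY`, `ADDRESS`, and `TOTAL` are hypothetical placeholders.

```python
import random
import string


def randomize_preserving_format(text, rng):
    """Swap each digit/letter for a random one of the same class; keep everything else."""
    out = []
    for ch in text:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isupper():
            out.append(rng.choice(string.ascii_uppercase))
        elif ch.islower():
            out.append(rng.choice(string.ascii_lowercase))
        else:
            # Punctuation, currency symbols, and spaces are left untouched.
            out.append(ch)
    return "".join(out)


def augment_receipt(words, labels, target_labels, rng):
    """Return a copy of `words` in which words tagged with a target label are randomized.

    Labels are unchanged, and word lengths are preserved, so the original
    bounding boxes can (by assumption) be reused for the augmented sample.
    """
    return [
        randomize_preserving_format(word, rng) if label in target_labels else word
        for word, label in zip(words, labels)
    ]


# Hypothetical example receipt; the label names are placeholders.
rng = random.Random(0)
words = ["ACME", "SDN", "BHD", "TOTAL", "12.50"]
labels = ["COMPANY", "COMPANY", "COMPANY", "O", "TOTAL"]
augmented = augment_receipt(words, labels, {"COMPANY", "ADDRESS", "TOTAL"}, rng)
```

Character-level replacement keeps the layout intact but produces implausible names; sampling from lists of real company names or addresses would be an alternative worth comparing, at the cost of having to adjust bounding boxes when lengths change.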