Use and potentially fine-tune a state-of-the-art Transformer model to answer questions
This article will demonstrate how to implement state-of-the-art AI models for question answering. These models answer a given question by analyzing a provided body of text. At the end of the article, we’ll briefly discuss how to fine-tune a QA model using only a small amount of code.
QA Transformer models answer questions by highlighting a text span within the provided context. So, to give a counter-example, if the context in the figure above were instead “Bert is a character from Sesame Street” and the question remained the same, then the model would be unable to determine the correct answer. But for the example in the figure, the model can answer the question because “Ernie” appears within the context.
In this article, we’ll discuss how to implement a Transformer model for question answering with just a few lines of code. We’ll be using a Python package my team developed called Happy Transformer, which is currently available on PyPI.
Simply install Happy Transformer from PyPI using pip.
pip install happytransformer
from happytransformer import HappyQuestionAnswering
Now we’ll use the imported HappyQuestionAnswering class to create an object. By default, a model called “distilbert-base-cased-distilled-squad” is used. According to this paper, DistilBERT is 40% smaller and 60% faster than BERT while retaining 97% of its language understanding capabilities. The models we’ll be using are typically fine-tuned on the Stanford Question Answering Dataset (SQuAD).
happy_qa = HappyQuestionAnswering()
Prediction
Use happy_qa’s answer_question() method to make predictions. This method has the following parameters:
context (required): A string that contains information required to answer the question.
question (required): A string that contains the question. The answer to the question must be a text span within the context.
top_k: The number of results that will be returned. By default, it is set to 1.
context = "Today's date is January 10th, 2021"
question = "What is today's date?"
result = happy_qa.answer_question(context, question, top_k=2)
print(result)
Output:
[QuestionAnsweringResult(answer='January 10th, 2021', score=0.9818391799926758, start=16, end=34), QuestionAnsweringResult(answer='January 10th', score=0.008791258558630943, start=16, end=28)]
The method returns a list of dataclass objects with the variables answer, score, start, and end.
Extracting Results
Dataclasses provide a clean interface for extracting information. Simply index into the returned list, then use “.answer”, “.score”, “.start”, or “.end” to access the desired variable.
context = "OpenAI created GPT-2"
question = "What company created GPT-2?"
result = happy_qa.answer_question(context, question, top_k=2)
print(result[0])
print(result[0].answer)
print(result[0].start)
print(result[0].end)
Output:
QuestionAnsweringResult(answer='OpenAI', score=0.9973885416984558, start=0, end=6)
OpenAI
0
6
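The start and end values appear to be character offsets into the context string, so slicing the context recovers the answer span. Here’s a quick sanity check using the values from the output above (plain Python, no model required):

```python
context = "OpenAI created GPT-2"

# Values taken from the QuestionAnsweringResult printed above
answer, start, end = "OpenAI", 0, 6

# Slicing the context with the start/end offsets yields the answer span
span = context[start:end]
print(span)  # OpenAI
assert span == answer
```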
Other Models
Currently, Happy Transformer supports 4 different types of models for question answering: “ALBERT”, “BERT”, “DISTILBERT” and “ROBERTA.”
The HappyQuestionAnswering class contains two parameters that allow you to select from many different Transformer models.
model_type: The type of model. Options include: “ALBERT”, “BERT”, “DISTILBERT” and “ROBERTA”. By default, it is set to “DISTILBERT”.
model_name: A specific model of the provided type. You can find potential models on Hugging Face’s Model Hub. By default, it is set to “distilbert-base-cased-distilled-squad”.
happy_qa_albert = HappyQuestionAnswering("ALBERT", "twmkn9/albert-base-v2-squad2")
happy_qa_bert = HappyQuestionAnswering("BERT", "deepset/bert-base-cased-squad2")
happy_qa_distilbert = HappyQuestionAnswering("DISTILBERT", "distilbert-base-cased-distilled-squad")
happy_qa_roberta = HappyQuestionAnswering("ROBERTA", "deepset/roberta-base-squad2")
For the best performance, without considering hardware limitations, I recommend using “mfeb/albert-xxlarge-v2-squad2”.
Fine-tuning
Fine-tuning a QA model is incredibly easy using Happy Transformer. First, process the training data into a CSV file with the columns “context,” “question,” “answer_text,” and “answer_start.” Then, after creating a HappyQuestionAnswering object, call its train() method, which only requires a path to the newly created CSV file. Finally, you can use a method called eval() to determine whether the loss changed after fine-tuning.
After fine-tuning, you can continue to use the answer_question() method as described above with your newly fine-tuned model. View Happy Transformer’s README for details on how to adjust the fine-tuning hyperparameters.
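As a sketch, here’s how the training CSV described above could be assembled with Python’s built-in csv module. The row values are taken from the date example earlier in the article; the commented-out train() and eval() calls assume a HappyQuestionAnswering object created as shown previously (they require the happytransformer package and a model download, so they’re not run here):

```python
import csv

# Build a minimal training CSV in the format Happy Transformer expects:
# columns "context", "question", "answer_text", "answer_start".
rows = [
    {
        "context": "Today's date is January 10th, 2021",
        "question": "What is today's date?",
        "answer_text": "January 10th, 2021",
        "answer_start": 16,  # character offset of the answer within the context
    },
]

with open("train.csv", "w", newline="") as f:
    writer = csv.DictWriter(
        f, fieldnames=["context", "question", "answer_text", "answer_start"]
    )
    writer.writeheader()
    writer.writerows(rows)

# Then fine-tune and evaluate (requires happytransformer):
# happy_qa = HappyQuestionAnswering()
# before = happy_qa.eval("train.csv")
# happy_qa.train("train.csv")
# after = happy_qa.eval("train.csv")
```

In a real run you would of course use many training examples, not one; the key detail is that answer_start must be the exact character index where answer_text begins inside context.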
Visit this gist for an example of how to fine-tune a question answering model:
https://gist.github.com/EricFillion/9781ad53de06b92333b2ab03a0860bb0
Happy Transformer’s GitHub repository: https://github.com/EricFillion/happy-transformer
Code from this tutorial: https://colab.research.google.com/drive/1TOfbDSW-of7DS81sgI9PmnEfv_UQGqWs
Check out Vennify AI’s YouTube channel, and subscribe for new videos about NLP.