Use AI to improve spam detection in Gmail and find legitimate emails that were mistakenly marked as spam by Google algorithms.
False positives in Gmail are uncommon but can happen, meaning an important email might mistakenly end up in your spam folder. When you’re dealing with hundreds of spam messages daily, identifying these legitimate emails becomes even more challenging.
You can create filters in Gmail so that emails from specific senders or with certain keywords are never marked as spam. These filters do not help, however, with emails from new or unknown senders.
Find incorrectly classified messages in Gmail Spam
What if we used AI to analyze our spam emails in Gmail and predict which ones are likely false positives? With this list of misclassified emails, we could automatically move these emails to the inbox or generate a report for manual review.
Here’s a sample report generated from Gmail. It includes a list of emails with a low spam score that are likely legitimate and should be moved to the inbox. The report also includes a summary of the email content in your preferred language.
To get started, open this Google Script and make a copy of it in your Google Drive. Switch to the Apps Script editor and provide your email address, OpenAI API key, and preferred language for the email summary.
Choose the reportFalsePositives function from the dropdown and click the play button to run the script. It will search for unread spam emails in your Gmail account, analyze them using OpenAI’s API, and send you a report of emails with a low spam score.
If you would like to run this script automatically at regular intervals, go to the “Triggers” menu in the Google Apps Script editor and set up a time-driven trigger that runs the script once every day. You can also choose the time of day at which you wish to receive the report.
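If you prefer to create the trigger from code instead of the Triggers menu, a minimal sketch along the following lines could work. It assumes the handler is named reportFalsePositives, as in this script. The helper installDailyTrigger_ and its hour parameter are hypothetical names, and the scriptApp parameter exists only so the function can be tried with a stub outside Apps Script.

```javascript
// Hypothetical helper: installs one daily time-driven trigger for the report.
// `scriptApp` defaults to the Apps Script ScriptApp service; it is a parameter
// only so the function can be exercised with a stub outside Apps Script.
const installDailyTrigger_ = (hour = 8, scriptApp = ScriptApp) => {
  const handler = 'reportFalsePositives';

  // Remove earlier triggers for the same handler so the report is not sent twice.
  scriptApp
    .getProjectTriggers()
    .filter((trigger) => trigger.getHandlerFunction() === handler)
    .forEach((trigger) => scriptApp.deleteTrigger(trigger));

  // Run once a day, at some point within the given hour (script time zone).
  return scriptApp.newTrigger(handler).timeBased().everyDays(1).atHour(hour).create();
};
```

Running installDailyTrigger_(7) once from the editor would schedule the report for some time between 7 and 8 in the morning, in the time zone of the script project.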
How AI Spam Classification Works – The Technical Part
If you are curious to know how the script works, here is a brief overview:
The script uses the GmailApp service in Google Apps Script to search for unread spam emails in your Gmail account. It then sends the subject, sender, and a trimmed copy of each message body to OpenAI’s API, which assigns a spam score and generates a summary in your preferred language. Emails with a low spam score are likely false positives and can be moved to the inbox.
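As a rough sketch of that last step, the function below moves threads whose score is at or below a threshold back to the inbox. It is not part of the original script: moveFalsePositives_ and the shape of its entries argument are assumptions made for illustration. It relies on GmailThread.moveToInbox(), so it is worth confirming on a test message that the thread really leaves the Spam folder in your account before automating it.

```javascript
// Hypothetical helper, not part of the original script.
// `entries` is assumed to be an array of { thread, spam_score } objects, where
// `thread` is a GmailThread and `spam_score` is the number returned by the model.
const moveFalsePositives_ = (entries, threshold = 2) => {
  // Keep only entries with a usable numeric score at or below the threshold.
  const rescued = entries.filter(
    ({ spam_score }) => Number.isFinite(spam_score) && spam_score <= threshold
  );
  rescued.forEach(({ thread }) => thread.moveToInbox());
  return rescued.length; // number of threads moved
};
```

Generating a report for manual review first, and moving threads automatically only once the scores look trustworthy, is the more cautious order.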
1. User Configuration
Provide the email address where the report should be sent, your OpenAI API key, the OpenAI model to use, and the language for the email summary.
const USER_EMAIL = 'email@domain.com';
const OPENAI_API_KEY = 'sk-proj-123';
const OPENAI_MODEL = 'gpt-4o';
const USER_LANGUAGE = 'English';
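Hard-coding the API key is convenient for a first run. If you plan to share or publish a copy of the code, one option is to keep the key in script properties instead, as in the sketch below. The property name OPENAI_API_KEY and the helper getApiKey_ are assumptions for illustration, and the props parameter is there only so the function can be tried with a stub outside Apps Script.

```javascript
// Hypothetical helper: reads the OpenAI key from script properties.
// `props` defaults to the Apps Script properties store for this project.
const getApiKey_ = (props = PropertiesService.getScriptProperties()) => {
  const key = props.getProperty('OPENAI_API_KEY'); // null when the property is missing
  if (!key) {
    throw new Error('Add an OPENAI_API_KEY script property before running the script.');
  }
  return key.trim();
};
```

With this in place, the Authorization header could be built from getApiKey_() rather than from a constant in the source.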
2. Find Unread Emails in Gmail Spam Folder
We use the epoch time to find spam emails that arrived in the last 24 hours and are still unread.
const HOURS_AGO = 24;
const MAX_THREADS = 25;
const getSpamThreads_ = () => {
  // Convert a Date to epoch seconds for the after: and before: operators
  const epoch = (date) => Math.floor(date.getTime() / 1000);
  const beforeDate = new Date();
  const afterDate = new Date();
  afterDate.setHours(afterDate.getHours() - HOURS_AGO);
  const searchQuery = `is:unread in:spam after:${epoch(afterDate)} before:${epoch(beforeDate)}`;
  return GmailApp.search(searchQuery, 0, MAX_THREADS);
};
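To see what the search query looks like for a concrete time window, the query construction can be pulled into a pure function, as in this sketch. The name buildSpamQuery_ is hypothetical, and the sketch subtracts milliseconds rather than calling setHours, which is a small variation on the code above and not the script's own approach.

```javascript
// Hypothetical pure helper: builds the Gmail search query for a time window.
const buildSpamQuery_ = (now, hoursAgo) => {
  const toEpochSeconds = (date) => Math.floor(date.getTime() / 1000);
  const start = new Date(now.getTime() - hoursAgo * 60 * 60 * 1000);
  return `is:unread in:spam after:${toEpochSeconds(start)} before:${toEpochSeconds(now)}`;
};

// A fixed timestamp (1700000000 seconds since the Unix epoch) for illustration.
const sampleQuery = buildSpamQuery_(new Date(1700000000 * 1000), 24);
// sampleQuery === 'is:unread in:spam after:1699913600 before:1700000000'
```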
3. Create a Prompt for the OpenAI Model
We create a prompt for the OpenAI model using the email message. The prompt asks the AI model to analyze the email content and assign a spam score on a scale from 0 to 10. The response should be in JSON format.
const SYSTEM_PROMPT = `You are an AI email classifier. Given the content of an email, analyze it and assign a spam score on a scale from 0 to 10, where 0 indicates a legitimate email and 10 indicates a definite spam email. Provide a short summary of the email in ${USER_LANGUAGE}. Your response should be a JSON object with two keys: spam_score (a number) and summary (a string).`;
const MAX_BODY_LENGTH = 200;
const getMessagePrompt_ = (message) => {
  // Strip links and collapse whitespace so the prompt stays short
  const body = message
    .getPlainBody()
    .replace(/https?:\/\/[^\s>]+/g, '')
    .replace(/[\n\r\t]/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();
  return [
    `Subject: ${message.getSubject()}`,
    `Sender: ${message.getFrom()}`,
    `Body: ${body.substring(0, MAX_BODY_LENGTH)}`,
  ].join('\n');
};
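The effect of the clean-up step is easier to see on a sample string. The sketch below applies the same idea to plain text, with no Gmail objects involved. The helper cleanBody_ and the sample message are made up for illustration, and the separate newline/tab replacement is omitted because the whitespace pattern already covers those characters.

```javascript
// Hypothetical helper mirroring the body clean-up used for the prompt.
const cleanBody_ = (text, maxLength = 200) =>
  text
    .replace(/https?:\/\/[^\s>]+/g, '') // drop links
    .replace(/\s+/g, ' ') // collapse newlines, tabs and repeated spaces
    .trim()
    .substring(0, maxLength);

const sample = 'Hello,\r\n\r\nYour invoice is ready: https://example.com/inv?id=42\n\tThanks!';
const cleaned = cleanBody_(sample);
// cleaned === 'Hello, Your invoice is ready: Thanks!'
```

Truncating to 200 characters keeps token usage low, at the cost of hiding anything that appears later in a long message.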
4. Call the OpenAI API to get the Spam Score
We pass the message prompt to the OpenAI API and get the spam score and a summary of the email content. The spam score is used to determine if the email is a false positive.
The tokens variable keeps track of the number of tokens used in the OpenAI API calls and is included in the email report. You can use this information to monitor your API usage.
let tokens = 0;
const getMessageScore_ = (messagePrompt) => {
  const apiUrl = `https://api.openai.com/v1/chat/completions`;
  const headers = {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${OPENAI_API_KEY}`,
  };
  const response = UrlFetchApp.fetch(apiUrl, {
    method: 'POST',
    headers,
    payload: JSON.stringify({
      model: OPENAI_MODEL,
      messages: [
        { role: 'system', content: SYSTEM_PROMPT },
        { role: 'user', content: messagePrompt },
      ],
      temperature: 0.2,
      max_tokens: 124,
      response_format: { type: 'json_object' },
    }),
  });
  const data = JSON.parse(response.getContentText());
  // Keep a running total of tokens for the report subject
  tokens += data.usage.total_tokens;
  // The model's reply is itself a JSON string
  const content = JSON.parse(data.choices[0].message.content);
  return content;
};
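The model's reply is parsed without any checks, so a truncated or oddly shaped reply would either throw or yield an undefined score. A defensive variant might validate the reply first, as in the sketch below. The helper parseScore_ is hypothetical, and it assumes the reply uses the keys spam_score and summary, which is what the report function destructures.

```javascript
// Hypothetical helper: validates the JSON text returned in the model's reply.
// Returns { spam_score, summary } or null when the reply cannot be trusted.
const parseScore_ = (content) => {
  let parsed;
  try {
    parsed = JSON.parse(content);
  } catch (err) {
    return null; // not valid JSON, for example a reply cut off by max_tokens
  }
  if (!parsed || typeof parsed !== 'object') return null;

  const score = parsed.spam_score;
  // Accept only a real number inside the 0 to 10 range requested in the prompt.
  if (typeof score !== 'number' || !Number.isFinite(score) || score < 0 || score > 10) {
    return null;
  }
  return { spam_score: score, summary: String(parsed.summary || '') };
};
```

A caller could skip any message for which the helper returns null, so that a malformed reply is never mistaken for a low spam score.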
5. Process Spam Emails and email the Report
You can run this Google script manually or set up a time-driven trigger to run it automatically at regular intervals. It marks the processed spam threads as read so they aren’t analyzed again on the next run.
const SPAM_THRESHOLD = 2;
const reportFalsePositives = () => {
  const html = [];
  const threads = getSpamThreads_();
  for (let i = 0; i < threads.length; i += 1) {
    // Score only the first message of each thread
    const [message] = threads[i].getMessages();
    const messagePrompt = getMessagePrompt_(message);
    const { spam_score, summary } = getMessageScore_(messagePrompt);
    if (spam_score <= SPAM_THRESHOLD) {
      // Likely false positive: add a row (sender, summary) to the report
      html.push(`<tr><td>${message.getFrom()}</td><td>${summary}</td></tr>`);
    }
  }
  // Mark the threads as read so the next run skips them
  threads.forEach((thread) => thread.markRead());
  if (html.length > 0) {
    const htmlBody = [
      '<table>',
      '<tr><th>Email Sender</th><th>Summary</th></tr>',
      html.join(''),
      '</table>',
    ].join('');
    const subject = `Gmail Spam Report - ${tokens} tokens used`;
    GmailApp.sendEmail(USER_EMAIL, subject, '', { htmlBody });
  }
};
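Because the report is sent as HTML, the sender and summary values are best escaped before they are placed in table cells. A sender value typically looks like Name <address>, and an email client would treat the part in angle brackets as an HTML tag and hide it. The sketch below shows one way to do this. The helpers escapeHtml_ and buildReportRow_ are hypothetical and not part of the original script.

```javascript
// Hypothetical helper: escapes characters that have a special meaning in HTML.
const escapeHtml_ = (value) =>
  String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');

// Hypothetical helper: builds one table row for the report.
const buildReportRow_ = (sender, summary) =>
  `<tr><td>${escapeHtml_(sender)}</td><td>${escapeHtml_(summary)}</td></tr>`;

const row = buildReportRow_('Jane Doe <jane@example.com>', 'Invoice & receipt');
// row === '<tr><td>Jane Doe &lt;jane@example.com&gt;</td><td>Invoice &amp; receipt</td></tr>'
```

Escaping the summary also keeps any markup that the model happens to echo from the original spam message out of the report.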