Email Verification: spam stories, part 1
Despite all the effort e-mail providers put in worldwide, every user still receives a large amount of unwanted and malicious mail. It is annoying, but let's look on the bright side: we can use these examples to demonstrate how useful the APIs offered by WhoisXML API can be in the battle against spam. In today's example we'll use the e-mail verification API, the domain reputation API and the WHOIS API to analyze a spam message which was not caught by a well-configured open-source spam filtering system.
One of our clients holds an academic e-mail account at a large and respected university. The IT staff there is excellent, and they do a great job filtering out hundreds of spam e-mails an hour, directed to thousands of e-mail accounts. Yet some unwanted messages still get through.
In particular, on 8 October, the day of writing, he received the following mail from [email protected], which he shared with us:
A profitable deal worth millions of Dollars. If interested email me
Wai Chim
It does not sound like a serious offer anyway, but… The reply-to address in the mail is [email protected]. A detailed look at the message source does not reveal anything suspicious, such as links or scripts that would run when the mail is opened. So, apparently, whoever sent this message wants the addressee to reply to this e-mail account, thereby confirming that the potential victim might be ready to walk into the trap. After all, the mail did pass spam filtering at the organization's server, as is apparent from the following header lines:
X-Spam-Status: No, score=2.8 required=1500.0 tests=HTML_MESSAGE,
    MISSING_HEADERS,SPF_PASS autolearn=no version=3.2.5
X-Spam-Level: **
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2014-05-11) on xxx
X-Scanned-By: MIMEDefang 2.64 on xxx
(We do not disclose the hostnames and IP addresses of our client's mail account here: they have been replaced with "xxx", as they are irrelevant to this investigation.) Nevertheless, we do have two e-mail addresses, so let us take a close look at them.
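The two addresses can also be pulled out of a raw message programmatically. The following is a minimal sketch using only Python's standard library; the function name and the sample message are hypothetical, and the local parts of the addresses are placeholders (only the two domains come from the case discussed here).

```python
from email import message_from_string
from email.utils import parseaddr


def sender_and_reply_domains(raw_message: str) -> tuple[str, str]:
    """Return the (From, Reply-To) domains of a raw e-mail message."""
    msg = message_from_string(raw_message)
    domains = []
    for header in ("From", "Reply-To"):
        # parseaddr splits 'Name <user@host>' into (name, address).
        _, address = parseaddr(msg.get(header, ""))
        # Everything after the last '@' is the domain; empty if none was found.
        domains.append(address.rpartition("@")[2].lower() if "@" in address else "")
    return domains[0], domains[1]


# Placeholder message: the local parts are made up for illustration.
SAMPLE = (
    "From: Wai Chim <someone@cuw.edu>\n"
    "Reply-To: someone.else@corporatgroup.org\n"
    "\n"
    "A profitable deal worth millions of Dollars. If interested email me\n"
)

from_domain, reply_domain = sender_and_reply_domains(SAMPLE)
# A Reply-To domain that differs from the sender's domain is worth a closer look.
print(from_domain, reply_domain, from_domain != reply_domain)
```

A mismatch between the two domains is not proof of spam on its own, but it tells us which domains to investigate further.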
First let's check them with the e-mail verification API. Running the two addresses through it does not reveal any issues: both e-mail addresses are valid. This is not surprising: the API is aimed at verifying addresses to avoid sending messages to non-functional ones, and it is not primarily intended for finding spam addresses. Still, there is an important lesson here: spam can be sent from addresses which have no technical issues at all, as in this case.
So let's proceed to check the involved domains' reputation and WHOIS data. For the reputation, we shall use the domain reputation API. In particular, we run the following command line on a Linux box:
curl --get "https://domain-reputation-api.whoisxmlapi.com/api/v1?apiKey=API_KEY&domainName=cuw.edu&mode=full"
where "API_KEY" is to be replaced with a valid API key. You can get one with 100 queries a month for free at the API website to try it yourself. Alternatively, you can just make a few queries directly from the webpage of the API.
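For those who prefer a script to curl, the same request could be sketched in Python with the standard library alone. The endpoint and the three query parameters are taken from the command above; the function names and the timeout value are assumptions made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint as it appears in the curl command.
BASE_URL = "https://domain-reputation-api.whoisxmlapi.com/api/v1"


def reputation_url(api_key: str, domain: str, mode: str = "full") -> str:
    """Build the request URL with properly escaped query parameters."""
    query = urlencode({"apiKey": api_key, "domainName": domain, "mode": mode})
    return f"{BASE_URL}?{query}"


def fetch_reputation(api_key: str, domain: str, timeout: float = 30.0) -> dict:
    """Send the GET request and decode the JSON body (needs a valid API key)."""
    with urlopen(reputation_url(api_key, domain), timeout=timeout) as response:
        return json.load(response)


# Only the URL is built here; no request is sent.
print(reputation_url("API_KEY", "cuw.edu"))
```

Calling fetch_reputation with a real key should return the parsed report as a dictionary.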
As for "cuw.edu", we get
{
  "reputationScore":97.52,
  "testResults":[
    {
      "test":"WHOIS Domain check",
      "warnings":[
        "Owner details are publicly available"
      ]
    },
    {
      "test":"SOA record configuration check",
      "warnings":[
        "The expire interval is 1296000. Recommended range is [604800 .. 1209600]",
        "The minimum TTL is 60. Recommended range is [3600 .. 86400]"
      ]
    },
    {
      "test":"Mail servers configuration check",
      "warnings":[
        "AAAA records not configured for mail servers: smg03.cuw.edu, smg01.cuw.edu, recr4app.cuw.edu, incoming01.cuw.edu, smg04.cuw.edu, smg02.cuw.edu",
        "DMARC is not configured"
      ]
    },
    {
      "test":"Potentially dangerous content",
      "warnings":[
        "Redirects found"
      ]
    },
    {
      "test":"SSL vulnerabilities",
      "warnings":[
        "Your server supports suboptimal cipher suites: ECDHE-RSA-DES-CBC3-SHA, DES-CBC3-SHA",
        "HPKP headers not set",
        "HTTP Strict Transport Security not set",
        "Heartbeat extension disabled",
        "TLSA record not configured or configured wrong",
        "OCSP stapling not configured"
      ]
    }
  ]
}
The reputation score is high, so the domain is basically okay, although the rest of the report shows a number of issues that the personnel running this domain should look into.
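A report like this is easy to condense in code. The sketch below assumes only the structure visible in the response above (a "reputationScore" number and a "testResults" list whose items carry "test" and "warnings"); the function name is hypothetical and the sample is an abridged copy of the response with two of the five tests kept.

```python
import json


def summarize_report(report: dict) -> tuple[float, dict[str, int]]:
    """Return the reputation score and the number of warnings per test."""
    counts = {
        result["test"]: len(result.get("warnings", []))
        for result in report.get("testResults", [])
    }
    return report["reputationScore"], counts


# Abridged copy of the cuw.edu response shown above.
SAMPLE = json.loads("""
{
  "reputationScore": 97.52,
  "testResults": [
    {"test": "WHOIS Domain check",
     "warnings": ["Owner details are publicly available"]},
    {"test": "SOA record configuration check",
     "warnings": ["The expire interval is 1296000. Recommended range is [604800 .. 1209600]",
                  "The minimum TTL is 60. Recommended range is [3600 .. 86400]"]}
  ]
}
""")

score, warning_counts = summarize_report(SAMPLE)
print(score)
for test_name, count in warning_counts.items():
    print(f"{test_name}: {count} warning(s)")
```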
But who are they? It is a WHOIS lookup that can give us the answer.
Technicalities: The good old WHOIS API is suitable for this kind of task: it is still the best source of fundamental domain ownership and registration date information. For users who prefer command-line tools, there is the bestwhois command line utility available on GitHub, which makes it very simple to look at the current (and even historic) WHOIS data for a domain without restrictions.
Here is the result of the lookup:
{
  "domainName":"cuw.edu",
  "parseCode":0,
  "audit":{
    "createdDate":"2019-10-07 17:49:26.000 UTC",
    "updatedDate":"2019-10-07 17:49:26.000 UTC"
  },
  "registryData":{
    "createdDate":"21-Dec-1993",
    "updatedDate":"26-Sep-2019",
    "expiresDate":"31-Jul-2020",
    "registrant":{
      "name":"Concordia University Wisconsin",
      "street1":"12800 N. Lake Shore Drive",
      "city":"Mequon",
      "state":"WI",
      "postalCode":"53097",
      "country":"UNITED STATES",
      "countryCode":"US",
      "rawText":"Concordia University Wisconsin\n12800 N. Lake Shore Drive\nMequon, WI 53097\nUSA"
    }
    ...
We have omitted the rest of the record for privacy reasons. As can be seen, although the registrar record does not hold much information, the registry contains complete and fair WHOIS data. All the contact information is given in the rest of the record (not quoted here), so we could even inform them about the issues they are facing. It is the domain of Concordia University Wisconsin, a respected academic establishment, so "our friend" Wai Chim is hardly offering his business opportunity to our client on their behalf. Maybe the e-mail was sent from this domain because a message from another university's account is more convincing for academic personnel like our client. You might say that clever university people will never get hooked by such an unconvincing message. According to stories heard from system administrators, however, some of them are, surprisingly enough, quite prone to it…
Note that the e-mail headers make it clear that this mail was indeed sent by the mail server of this domain. Hence the infrastructure of the university has probably been compromised. Let's hope they will read this blog and act upon it ASAP.
But let us turn our attention to the domain of the reply-to address, corporatgroup.org. Their reputation is far less respectable:
{
  "reputationScore":77.6,
  "testResults":[
    {
      "test":"WHOIS Domain check",
      "warnings":[
        "Registered 1 month and 10 days ago",
        "Owner details are publicly available"
      ]
    },
    {
      "test":"SOA record configuration check",
      "warnings":[
        "Although the serial number is valid, it's not following the general convention: 3",
        "The expire interval is 259200. Recommended range is [604800 .. 1209600]",
        "The minimum TTL is 300. Recommended range is [3600 .. 86400]"
      ]
    },
    {
      "test":"Mail servers configuration check",
      "warnings":[
        "DMARC is not configured",
        "The top priority mail server is ASPMX.L.GOOGLE.com, but TTL is not equal to the recommended value (86400)"
      ]
    },
    {
      "test":"Mail servers response",
      "warnings":[
        "Greeting response doesn't contain the mail server's domain name: alt3.aspmx.l.google.com, alt4.aspmx.l.google.com, aspmx.l.google.com, alt1.aspmx.l.google.com, alt2.aspmx.l.google.com"
      ]
    }
  ]
}
A score below 80 can definitely be considered suspicious. What’s more, the domain was registered only 1 month and 10 days ago. Voilà, we have a young domain in the story. But who are they? Let's see their WHOIS record. The whole record can be quoted without violating any data protection rule, as we shall not learn much from it anyway:
{
  "domainName":"corporatgroup.org",
  "parseCode":8,
  "audit":{
    "createdDate":"2019-10-08 00:37:18.000 UTC",
    "updatedDate":"2019-10-08 00:37:18.000 UTC"
  },
  "registrarName":"Google LLC",
  "registrarIANAID":"895",
  "registryData":{
    "createdDate":"2019-08-28T12:00:41Z",
    "updatedDate":"2019-08-28T12:00:44Z",
    "expiresDate":"2020-08-28T12:00:41Z",
    "registrant":{
      "organization":"Contact Privacy Inc. Customer 1245346785",
      "state":"ON",
      "country":"CANADA",
      "countryCode":"CA",
      "rawText":"Registrant Country: CA"
    },
    "domainName":"corporatgroup.org",
    "nameServers":{
      "rawText":"NS-CLOUD-B1.GOOGLEDOMAINS.COM\nNS-CLOUD-B2.GOOGLEDOMAINS.COM\nNS-CLOUD-B3.GOOGLEDOMAINS.COM\nNS-CLOUD-B4.GOOGLEDOMAINS.COM\n",
      "hostNames":[
        "NS-CLOUD-B1.GOOGLEDOMAINS.COM",
        "NS-CLOUD-B2.GOOGLEDOMAINS.COM",
        "NS-CLOUD-B3.GOOGLEDOMAINS.COM",
        "NS-CLOUD-B4.GOOGLEDOMAINS.COM"
      ]
    },
    "status":"clientTransferProhibited serverTransferProhibited",
    "parseCode":1275,
    "audit":{
      "createdDate":"2019-10-08 00:37:18.000 UTC",
      "updatedDate":"2019-10-08 00:37:18.000 UTC"
    },
    "customField1Name":"RegistrarContactEmail",
    "customField1Value":"[email protected]",
    "registrarName":"Google LLC",
    "registrarIANAID":"895",
    "createdDateNormalized":"2019-08-28 12:00:41 UTC",
    "updatedDateNormalized":"2019-08-28 12:00:44 UTC",
    "expiresDateNormalized":"2020-08-28 12:00:41 UTC",
    "customField2Name":"RegistrarContactPhone",
    "customField3Name":"RegistrarURL",
    "customField2Value":"+1.8772376466",
    "customField3Value":"https://domains.google.com",
    "whoisServer":"whois.google.com"
  },
  "contactEmail":"[email protected]",
  "domainNameExt":".org",
  "estimatedDomainAge":40
}
It is just a domain registered at Google, and nothing is revealed about the registrant. Unfortunately, there are many benign domains with similarly limited WHOIS information in the GDPR era. But we have found this very domain name in a mail message which is apparently spam. Do we need more evidence?
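Even a privacy-protected record like this one still carries the registration date, so the domain's age can be derived from it. The sketch below is one possible way to do so, under two assumptions: that the top-level "audit" timestamp can be treated as the time of the lookup, and that the two date formats are always as they appear in the record above. The function name is hypothetical.

```python
from datetime import datetime


def domain_age_days(whois_record: dict) -> int:
    """Whole days between registration and the time the record was fetched."""
    created = datetime.strptime(
        whois_record["registryData"]["createdDateNormalized"],
        "%Y-%m-%d %H:%M:%S UTC",
    )
    # Assumption: audit.createdDate marks the moment of the lookup.
    looked_up = datetime.strptime(
        whois_record["audit"]["createdDate"],
        "%Y-%m-%d %H:%M:%S.%f UTC",
    )
    # Both timestamps are in UTC, so naive subtraction is safe here.
    return (looked_up - created).days


# Only the fields needed for the calculation, copied from the record above.
RECORD = {
    "audit": {"createdDate": "2019-10-08 00:37:18.000 UTC"},
    "registryData": {"createdDateNormalized": "2019-08-28 12:00:41 UTC"},
}

print(domain_age_days(RECORD))  # prints 40
```

The result agrees with the "estimatedDomainAge" field of the record.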
In conclusion, although there are numerous specialized spam filtering tools available, some with very good performance, there is still room for improvement, either on the mail servers' side or in end-user applications like Thunderbird.
And indeed, although SpamAssassin does verify domain age, in the present case the domain of the reply-to address passed this test without being marked as suspicious. Both the WHOIS and Domain Reputation APIs provide accurate information on domain age, which can improve such verification. The domain reputation score provided by the Domain Reputation API can also be useful when checking the legitimacy of an e-mail. It is derived from a variety of security checks, thus providing a composite assessment of the domain's reputation. All these data are available through APIs, so e-mail software developers and system administrators can easily integrate them into their products.
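As a rough illustration of such an integration, the two signals could be combined into a simple check on the reply-to domain. This is only a sketch: the function name is hypothetical, the score threshold of 80 follows the rule of thumb used earlier in this post, and the 90-day age threshold is an arbitrary assumption rather than a recommended value.

```python
def reply_domain_flags(
    reputation_score: float,
    age_days: int,
    min_score: float = 80.0,   # rule of thumb from this post
    min_age_days: int = 90,    # arbitrary assumption, tune as needed
) -> list[str]:
    """Return human-readable reasons why a reply-to domain looks suspicious."""
    flags = []
    if reputation_score < min_score:
        flags.append(f"reputation score {reputation_score} is below {min_score}")
    if age_days < min_age_days:
        flags.append(f"domain is only {age_days} days old")
    return flags


# Values observed for corporatgroup.org in this post.
for reason in reply_domain_flags(77.6, 40):
    print(reason)
```

An empty list means neither signal fired; a mail filter could add the number of flags to its own spam score instead of rejecting the message outright.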
Finally, another lesson to be learned from this case is that administrators of domains should regularly use the Domain Reputation API to verify their own domains and act upon the revealed shortcomings to avoid becoming compromised, like the US university domain involved in the present example. For more details on WhoisXML API products, visit our webpage.