Tuesday, September 29, 2020

The Poor Man’s AI

Sometimes I have fun checking the logs of my website. Over the years I’ve changed domains, hosting providers, and technologies, but certain script attacks keep coming in exactly the same style. When I look up a few of the IP addresses they originate from, most of the time I find that admins have already reported them.

A few weeks ago I discovered a new phenomenon: an IP reported and blacklisted by a bot. In other words, a webmaster with a passion for coding, pissed off by WordPress hackers, has invested some time and effort to automate filtering the access logs for common WP attack patterns and blacklisting the originating IP addresses.

Given that various hosting providers scan sites to discover WP vulnerabilities, and that script kiddies might use an IP rotator, it’s hard to say that blacklisting a particular IP address is always a good idea.
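To make the idea concrete, here is a minimal sketch of such a log filter in Python. It is not the script of the webmaster mentioned above, which I haven't seen. The pattern list, the `suspicious_ips` function, and the threshold of three hits are all assumptions of mine, and the sketch assumes an access log in the usual common/combined format on a site that doesn't run WordPress itself, so any request for these paths is suspect. Because of the shared and rotating IPs mentioned above, it only reports candidates instead of blocking anything.

```python
import re
from collections import Counter

# Hypothetical patterns for illustration; a real list would be
# distilled from your own logs.
WP_ATTACK_PATTERNS = [
    r"/wp-login\.php",
    r"/xmlrpc\.php",
    r"/wp-admin/",
    r"/wp-content/plugins/.+\.php",
]
ATTACK_RE = re.compile("|".join(WP_ATTACK_PATTERNS), re.IGNORECASE)

# Common/combined log format:
# IP identity user [timestamp] "METHOD path protocol" status size ...
LOG_LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:\S+) (\S+)[^"]*"')


def suspicious_ips(log_lines, threshold=3):
    """Return the sorted IPs with at least `threshold` WP-style probes."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE_RE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        ip, path = match.groups()
        if ATTACK_RE.search(path):
            hits[ip] += 1
    return sorted(ip for ip, count in hits.items() if count >= threshold)
```

Feeding it the lines of an access log, for example `suspicious_ips(open("access.log"))` with a hypothetical file name, gives a list of addresses worth a closer look. Whether to blacklist them, and for how long, remains a human decision.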

For me the remarkable thing is that people are starting to teach their scripts to identify an unknown script (bot, crawler, you name it) by its behavior. The first intelligent antivirus programs that adopted behavior analysis lifted software security to a new level, and in my opinion the main utility of AI is that, as a tool, it offers new possibilities.

Since the advent of online shopping, travel ticketing, sports betting, trading, and property listings, countless crawlers have been sent out day by day, hour by hour, or even more frequently to gather data. While high-end and mid-market companies are already hiding their data sources behind paid APIs and using refined AI solutions to block undesired bots, the low-end markets depend on cheaper and less sophisticated software automation.

A small family shop or local business doesn’t lose money by serving a hundred bot visits per day on a classic hosting package. A regional online business hosted in the cloud, however, usually has a big database and pays a quantifiable price for the outgoing traffic generated by bots, and its clients might notice the slower response time of its servers.

Often the speed of getting the latest data sets, or a specific projection of a large amount of data, is the key to a business’s success. Ultimately knowledge is power, so at some point it becomes profitable for a commercial entity to erect a fence against bots.

In a hi-tech country you can buy whatever data you need. In a non-hi-tech country, the price-quality ratio of the data you can collect depends on the local culture. In the grey area of partially digitized data, AI may eventually be used to analyze sound tracks and videos in order to rate the protagonist’s objectivity.
