 Bayesian Filter: Technology And Advantages
Shawn Tracer
« Posted: February 29, 2008, 12:10:39 PM »


Bayesian Filter: Technology And Advantages
 by: Julia Gulevich


Not long ago, most anti-spam products simply used a list of keywords to identify spam. A good set of keywords could catch a lot of spam. However, a keyword-based anti-spam filter requires manual updating and can easily be fooled by tweaking the message a little. Spammers simply examine the latest anti-spam techniques and find ways to bypass them. As a result, you are left with missed spam and a high number of false positives.

A new, more effective technique for fighting spam was needed. Experience showed that the new method had to adapt itself to spammers' tactics, which change over time.

Bayesian filtering is based on the principle that most events are dependent and that the probability of an event occurring in the future can be inferred from the occurrences of this event in the past. This approach is used to identify spam. If a piece of text occurs mostly in spam emails and rarely in legitimate mail, it is reasonable to suppose that an email containing it is probably spam.
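
For a single word, this principle is usually written with Bayes' theorem. The notation here is an assumption for illustration and does not come from the article: S means "the message is spam", H means "the message is legitimate", and w means "the message contains the word".

```latex
P(S \mid w) = \frac{P(w \mid S)\,P(S)}{P(w \mid S)\,P(S) + P(w \mid H)\,P(H)}
```

If spam and legitimate mail are assumed equally likely beforehand, P(S) = P(H) and the formula reduces to P(w|S) / (P(w|S) + P(w|H)).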

To filter mail using Bayesian technology, you first need to build a database of words collected from spam and legitimate mail. Each word is then assigned a probability value, calculated from how often that word occurs in spam as opposed to legitimate mail.
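
A minimal sketch of building such a database might look like the following. The function names, the tokenizer, the equal-priors assumption, and the clamping to the range 0.01 to 0.99 are all illustrative choices, not details taken from the article or from any particular product. The sample messages are invented.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def train(spam_messages, ham_messages):
    """Count, for each word, how many spam and legitimate messages contain it."""
    spam_counts, ham_counts = Counter(), Counter()
    for msg in spam_messages:
        spam_counts.update(set(tokenize(msg)))  # count each word once per message
    for msg in ham_messages:
        ham_counts.update(set(tokenize(msg)))
    return spam_counts, ham_counts

def word_probability(word, spam_counts, ham_counts, n_spam, n_ham):
    """Estimate P(spam | word), assuming spam and legitimate mail are equally likely."""
    spam_freq = spam_counts[word] / n_spam
    ham_freq = ham_counts[word] / n_ham
    if spam_freq + ham_freq == 0:
        return 0.5  # word never seen in training: treat as neutral
    p = spam_freq / (spam_freq + ham_freq)
    return min(0.99, max(0.01, p))  # assumption: clamp so no word is fully decisive

# Invented training data, for illustration only.
spam = ["Cheap mortgage rates, act now", "Win cash now"]
ham = ["Meeting notes attached", "Your mortgage application was approved"]
spam_counts, ham_counts = train(spam, ham)

p_now = word_probability("now", spam_counts, ham_counts, len(spam), len(ham))
p_mortgage = word_probability("mortgage", spam_counts, ham_counts, len(spam), len(ham))
p_meeting = word_probability("meeting", spam_counts, ham_counts, len(spam), len(ham))
```

In this toy data, "now" appears only in spam, "meeting" only in legitimate mail, and "mortgage" equally in both, so the three words end up strongly spammy, strongly legitimate, and neutral respectively.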

After the legitimate and spam databases have been created during an initial training period, the word probabilities can be calculated and the Bayesian filter is ready for use. When a new message arrives, it is broken into words and the most significant words are singled out. From these words, the Bayesian filter calculates the probability that the new message is spam. If the probability is greater than a spam threshold, say 0.9, the message is classified as spam.
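
The classification step can be sketched roughly as follows. This is one common way to combine word probabilities (the naive Bayes combination, computed in log space), not necessarily what any given product does. The function names, the choice of 15 significant words, and the sample probability table are assumptions made for illustration. Only the 0.9 threshold comes from the text above.

```python
import math
import re

def spam_score(message, word_probs, top_n=15, unknown=0.5):
    """Combine the most significant word probabilities into one spam probability.

    word_probs maps word -> P(spam | word); values must lie strictly between 0 and 1.
    """
    words = set(re.findall(r"[a-z0-9']+", message.lower()))
    probs = [word_probs.get(w, unknown) for w in words]
    # "Most significant" here means furthest from the neutral value 0.5.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    probs = probs[:top_n]
    # Naive Bayes combination: prod(p) / (prod(p) + prod(1 - p)), in log space.
    log_spam = sum(math.log(p) for p in probs)
    log_ham = sum(math.log(1.0 - p) for p in probs)
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))

def is_spam(message, word_probs, threshold=0.9):
    """Classify as spam when the combined score exceeds the threshold."""
    return spam_score(message, word_probs) > threshold

# Hypothetical per-word probabilities, invented for this example.
word_probs = {"lottery": 0.99, "free": 0.9, "meeting": 0.05, "invoice": 0.2}

spam_result = is_spam("Free lottery now", word_probs)
ham_result = is_spam("Meeting about the invoice", word_probs)
```

Words missing from the table count as neutral (0.5), so they neither raise nor lower the score.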

Tip! G-Lock SpamCombat allows you to assign hot keys to common operations. For example, you can assign F8 to the Mark Message as SPAM function and F9 to Mark Message as Clean. The next time you train the Bayesian filter, you can simply use two keys on your keyboard, F8 and F9.

It is important to note that the analysis of spam and legitimate mail is performed on the mail that the particular user (organization, company, etc.) receives, and therefore the Bayesian filter is adjusted to this particular person, company, or organization. For example, a financial institution may receive a lot of emails containing the word "mortgage" and would get a lot of false positives if it used an outdated anti-spam filter. The Bayesian filter analyzes the entire message containing the word "mortgage" and decides whether the email is spam or legitimate based NOT only on the single keyword "mortgage". The Bayesian approach to filtering spam is highly effective - spam detection rates of over 99.7% can be achieved with a very low number of false positives!
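
To make the "mortgage" example concrete, here is a small worked calculation. All the counts are hypothetical and were invented for this illustration. The helper name and the equal-priors assumption are also not from the article.

```python
def spam_probability(spam_with_word, spam_total, ham_with_word, ham_total):
    """P(spam | word) from per-class message frequencies, assuming equal priors."""
    spam_freq = spam_with_word / spam_total
    ham_freq = ham_with_word / ham_total
    return spam_freq / (spam_freq + ham_freq)

# Hypothetical mailbox statistics for the word "mortgage" (invented numbers).
# A bank: the word is in 300 of 1000 spam messages and 400 of 1000 legitimate ones.
bank = spam_probability(300, 1000, 400, 1000)
# A home user: same spam, but only 2 of 1000 legitimate messages mention it.
home_user = spam_probability(300, 1000, 2, 1000)

print(f"bank: {bank:.3f}, home user: {home_user:.3f}")
```

With these made-up numbers, the same word is close to neutral for the bank but a strong spam indicator for the home user, which is why a filter trained on each recipient's own mail behaves differently.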

Let's summarize the benefits of using the Bayesian filter to catch spam:

1) A much more intelligent approach, because it examines all aspects of a message, as opposed to keyword checking, which classifies a mail as spam on the basis of a single word.

2) Self-adapting - constantly learning from new spam and new valid inbound mails, the Bayesian filter evolves and adapts to new spam techniques.

3) Sensitive to the user - it learns the email habits of the company and understands that, for example, emails containing the word "mortgage" are not always spam.

4) Multi-lingual and international - being adaptive, it can be used for any language. The Bayesian filter also takes into account certain language deviations or the diverse usage of certain words in different areas, even if the same language is spoken.

5) Difficult to fool, as opposed to a keyword filter - an advanced spammer who wants to trick the Bayesian filter can either use fewer words that usually indicate spam, or more words that generally indicate valid mail (such as a valid contact name, etc.). Doing the latter is practically impossible because the spammer would have to know the email profile of each recipient - and a spammer can never hope to gather this kind of information from every intended recipient.
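
The self-adapting behaviour in point 2 can be sketched as a filter that updates its word counts each time the user marks a message as spam or clean. The class and method names below are hypothetical, the messages are invented, and the probability estimate again assumes equal priors. It is an illustration rather than the implementation of any real product.

```python
import re
from collections import Counter

class AdaptiveFilter:
    """Keeps per-class word counts and updates them as the user labels mail."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.totals = {"spam": 0, "ham": 0}

    def learn(self, message, label):
        """Record one message the user marked as 'spam' or 'ham' (clean)."""
        words = set(re.findall(r"[a-z0-9']+", message.lower()))
        self.counts[label].update(words)
        self.totals[label] += 1

    def word_probability(self, word):
        """Current estimate of P(spam | word); 0.5 when there is no evidence."""
        if self.totals["spam"] == 0 or self.totals["ham"] == 0:
            return 0.5
        spam_freq = self.counts["spam"][word] / self.totals["spam"]
        ham_freq = self.counts["ham"][word] / self.totals["ham"]
        if spam_freq + ham_freq == 0:
            return 0.5
        return spam_freq / (spam_freq + ham_freq)

f = AdaptiveFilter()
f.learn("Cheap pills online", "spam")
f.learn("Lunch tomorrow?", "ham")
before = f.word_probability("p1lls")   # obfuscated spelling not seen yet
f.learn("Cheap p1lls online", "spam")  # user marks the new variant as spam
after = f.word_probability("p1lls")
```

Before the user marks the obfuscated variant, "p1lls" is an unknown, neutral word. After one marking, it counts as evidence of spam, with no manual keyword update needed.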

About The Author
Julia Gulevich is a technical expert associated with the development of computer software such as AATools, Email Verifier, G-Lock EasyMail, and Anti-Spam Software (http://www.glocksoft.com/sc/). More information can be found at Anti Spam Blocker Resources: http://www.glocksoft.net/sc/
