intTypePromotion=1
zunia.vn Tuyển sinh 2024 dành cho Gen-Z zunia.vn zunia.vn
ADSENSE

A new approach for generating a complete detector repertoire in artificial immune systems

Chia sẻ: Thi Thi | Ngày: | Loại File: PDF | Số trang:5

29
lượt xem
0
download
 
  Download Vui lòng tải xuống để xem tài liệu đầy đủ

We present a novel data structure for extracting data from protected self set. That helps to produce all possible detectors, or a complete detector repertoire, with lower time and space complexities compare to recent novel approaches. Theoretical analysis and experimental results show that our approach is effective and feasible. These new valuable characteristics can make complex AIS have the highest ability to detect data changes and to reduce false detection.

Chủ đề:
Lưu

Nội dung Text: A new approach for generating a complete detector repertoire in artificial immune systems

Nguyễn Văn Trường và đtg<br /> <br /> Tạp chí KHOA HỌC & CÔNG NGHỆ<br /> <br /> 80(04): 121 - 125<br /> <br /> A NEW APPROACH FOR GENERATING A COMPLETE DETECTOR<br /> REPERTOIRE IN ARTIFICIAL IMMUNE SYSTEMS<br /> Nguyen Van Truong1, Pham Dinh Lam2<br /> 1<br /> <br /> College of Education – TNU; 2 Board of Information Technology - TNU<br /> <br /> ABSTRACT<br /> Generation of the Detector set is a key problem in Artificial Immune Systems (AIS) due to cost of<br /> time and space. A weakness in many detector generating algorithms was random generation of<br /> detectors so that many candidate detectors may be discarded. We present a novel data structure for<br /> extracting data from protected self set. That helps to produce all possible detectors, or a complete<br /> detector repertoire, with lower time and space complexities compare to recent novel approaches.<br /> Theoretical analysis and experimental results show that our approach is effective and feasible.<br /> These new valuable characteristics can make complex AIS have the highest ability to detect data<br /> changes and to reduce false detection.<br /> Keywords: Artificial immune system, Intrusion detection system, self set, detector set, complete<br /> detector repertoire, false detection<br /> <br /> INTRODUCTION*<br /> It is impractical for security software to detect<br /> all intrusions in computer networks. Many<br /> network technologies on security came into<br /> being. The more successful one is based on<br /> AIS, which illustrates biology immune system<br /> on computers. It is evaluated as a new and<br /> effective soft computing method in the field<br /> of network security and information security.<br /> Especially, it is suitable for building up the<br /> Intrusion Detection Systems (IDS) that can<br /> protect computer systems against intruders<br /> and the destruction of computer viruses or<br /> other malicious software system.<br /> This paper is concerned with one aspect of<br /> computer security: ability of detecting all<br /> kinds of changes or attacks, both known and<br /> unknown ones. The method explored involves<br /> the most discussed immune models: Negative<br /> selection algorithm. In the algorithm,<br /> generating the detector set plays a very<br /> important role and helps to increase the<br /> overall performance of AIS. Our method is<br /> to concentrate on producing complete<br /> repertoire for negative selection in not only<br /> security systems, but also in problem<br /> requiring some tolerance of noise, or<br /> involving dynamic streams of data (such as<br /> system call sequences).<br /> *<br /> <br /> Tel: 0915016063; Email: nvtruongtn@gmail.com<br /> <br /> There are many researches on intrusion<br /> detection generating, such as Exhaustive<br /> Detector Generating Algorithm, Linear Time<br /> Detector generating algorithm, the Greedy<br /> Detector Generating Algorithm [2], [7]. Our<br /> algorithm involves a special data structure<br /> called Location table. It plays an important<br /> part in our algorithm and leads to reduce both<br /> time and space complexities. Furthermore,<br /> our approach has many advantages when<br /> applying in dynamic environment as<br /> mentioned in the 5th section.<br /> The remaining of the paper is organized as<br /> follows. In the 2nd section, we give a brief<br /> literature review of AIS. Our technique of<br /> generating complete repertoire, the main part of<br /> the paper, is presented in the 3rd section. Some<br /> analyses and experiments are given in the 4th<br /> section. The 5th section concludes the paper.<br /> LITERATURE REVIEW<br /> In this section, we first described negative<br /> selection algorithm, very commonly used<br /> algorithm in AISs. After that, we present a bit<br /> some recent researches in the field.<br /> Artificial<br /> negative<br /> selection<br /> is<br /> a<br /> computational imitation of self/nonself<br /> discrimination, firstly designed as a change<br /> detection method. This mechanism is first<br /> given by Forrest et al. [1]. The outline of a<br /> typical negative selection algorithm contains<br /> two stages [10].<br /> 121<br /> <br /> Nguyễn Văn Trường và đtg<br /> <br /> Tạp chí KHOA HỌC & CÔNG NGHỆ<br /> <br /> Figure 1. Original model of detector generation<br /> <br /> In the generation stage (Fig. 1), the detectors<br /> are generated by some random process and<br /> censored by trying to match self samples<br /> taken from set S. Those candidates that match<br /> are eliminated and the rest are kept as<br /> detectors in set D. In the detection stage (Fig.<br /> 2), the collection of detectors (or detector set)<br /> is used to check whether an incoming data<br /> instance is self or nonself. If it matches any<br /> detector, it is claimed as nonself or an<br /> anomaly. This description is limited to some<br /> extent, but conveys the essential idea.<br /> <br /> Figure 2. Detection of new instances<br /> <br /> There have never been Vietnamese<br /> individuals or groups have studied<br /> systematically artificial immune system and<br /> its application, except that NC Group (Natural<br /> Computing) of the Military Technical<br /> Academy. However, it only studied some<br /> theory aspects without experiments [5]. Our<br /> 122<br /> <br /> 80(04): 121 - 125<br /> <br /> recent research evolves negative selection<br /> algorithm, but it only concentrates on how to<br /> detect effectively by using given location<br /> table without generating detectors [6]. The<br /> approach therefore is not suitable for<br /> distributed environment. There are ongoing<br /> studies of many foreign researches in the field<br /> [7]. But their algorithms’ complexities are<br /> much lager than our ones. The details of our<br /> algorithm are described in the following<br /> section.<br /> GENERATING COMPLETE REPERTOIRE<br /> In the Fig. 3, we illustrate the process for<br /> generating a detector set by using the number<br /> of five contiguous matches required for a<br /> match. The string to be protected is logically<br /> segmented into five equal-length “self”<br /> strings. To generate the detector, random<br /> strings are produced and matched against<br /> each of the self strings. The first two strings,<br /> 00000111 and 00000010, are eliminated<br /> because they both match self string 00000110<br /> at at least five contiguous positions. The<br /> string 11101001 fails to match any string in<br /> the self at at least five contiguous positions,<br /> so it is accepted into the detector set.<br /> <br /> Figure 3. Illustration of generating Detector set<br /> <br /> Our approach is derived from previous work<br /> [6]. Data space is often huge, so AIS need<br /> huge memory storage. This also leads to<br /> increase<br /> of<br /> time<br /> complexity.<br /> Our<br /> improvement has lower complexity than that<br /> of current fastest algorithm. Detail notations<br /> are described in the following before giving<br /> our algorithm.<br /> <br /> Nguyễn Văn Trường và đtg<br /> <br /> Tạp chí KHOA HỌC & CÔNG NGHỆ<br /> <br /> - U is the set of all possible binary strings of<br /> length l, and r is the matching threshold.<br /> - s is the binary string, s ∈ U.<br /> - D is the detector set, D ⊂ U.<br /> - A is two dimensions matrix presenting<br /> location table with the size 2r.(l-r+1)<br /> The pseudo code of the first part of the<br /> algorithm to create table A is described as<br /> follow.<br /> Initialize all cells of A to 0;<br /> For each s string to be protected, s ∈ U<br /> For j ∈ [1, l - r+1]<br /> Begin for<br /> i = is an integer number of sub-string of s<br /> whose length is r and starting position is j<br /> within the string s;<br /> A[i, j] = 1;<br /> End for<br /> For example, given self set S containing five<br /> strings to be protected S = {s1 = 01011; s2 =<br /> 11001; s3 = 10110; s4 = 00101; s5 = 01010}; l<br /> = 5, r = 3. Applying our algorithm we had the<br /> table A below.<br /> 0<br /> 1<br /> 2<br /> 3<br /> <br /> 1<br /> 0<br /> 1<br /> 1<br /> 0<br /> <br /> 2<br /> 0<br /> 0<br /> 1<br /> 1<br /> <br /> 3<br /> 0<br /> 1<br /> 1<br /> 1<br /> <br /> 4<br /> <br /> 0<br /> <br /> 1<br /> <br /> 0<br /> <br /> 5<br /> <br /> 1<br /> <br /> 1<br /> <br /> 1<br /> <br /> 6<br /> <br /> 1<br /> <br /> 0<br /> <br /> 1<br /> <br /> 7<br /> <br /> 0<br /> <br /> 0<br /> <br /> 0<br /> <br /> For s1 = 01011: with j = 1 we calculate i =<br /> (010)2 = (2)10. Thus, A[2,1] equals to 1. A[6,2]<br /> equals to 0 because there are no sub-string<br /> 110 (or 6 in integer form) of all five strings<br /> starting in 2nd position. Other values in the<br /> table are calculated in a similar way.<br /> The pseudo code of the second part of the<br /> algorithm for generating complete repertoire<br /> is as follow. In the algorithm, the function<br /> right(d, r-1) returns a r – 1 bit string from the<br /> end of the bit string d.<br /> D=∅;<br /> <br /> 80(04): 121 - 125<br /> <br /> Function Try(d,i)<br /> begin<br /> if (i = l)D ← d;<br /> else<br /> begin<br /> d1=right (d,r-1) + “1”;<br /> d0 = right (d,r-1) + “0”;<br /> k = integer form of d1;<br /> if (A[k,i] = 0) Call Try(d1,i+1);endif<br /> k = integer form of d0;<br /> if (A[k,i] = 0) Call Try(d0,i+1);endif<br /> end<br /> endif<br /> end<br /> for i satisfying A[i,1] = 0<br /> d = binary form of integer i having r bit;<br /> Call Try(d,r);<br /> endfor<br /> For example, with A[4,1] = 0, we initialize d<br /> = 100 as first three bits of candidate detectors.<br /> We calculate d1 = 001 and d0 = 000.<br /> With d1 = 001, we have k = 1, A[1,2] = 0 so<br /> that d1 will be used in next step. Similarly, in<br /> the next step, we calculate d0 = 010 and d1 =<br /> 011. These two strings have values 2 and 3 in<br /> integer form, respectively. Both A[2,3] and<br /> A[3,3] equal to 1, so that we cannot find out<br /> any detector in this step.<br /> With d0 = 000, we have k = 0, A[0,2] = 0 so<br /> that d0 will be used in next step. It is similar<br /> and easy to find out one detector d = 10001 in<br /> the step.<br /> By examining A[0,1], A[3,1] and A[7,1] in<br /> similar way, we can find out a complete<br /> detector repertoire D containing six detectors<br /> D = {d1 = 00000; d2 = 01100; d3 = 01111; d4<br /> = 10000; d5 = 11111; d6 = 11100}.<br /> ANALYSIS AND EXPERIMENT<br /> The principle data structure used in this<br /> algorithm consists of the large (l − r) × 2r A<br /> array to save data extracted from self. This<br /> has an impact on the time and space<br /> complexities of the algorithm: time: O((l –<br /> r).|S| + O(2l-r)), and space: O(2r.(l-r)).<br /> 123<br /> <br /> Nguyễn Văn Trường và đtg<br /> <br /> Tạp chí KHOA HỌC & CÔNG NGHỆ<br /> <br /> The above algorithm runs in time linear in the<br /> size of the self set (for given parameters l and<br /> r). This is much lower than algorithm presents<br /> in [7]. Our algorithm’s complexities are 18<br /> times lower than that of the algorithms in [7]<br /> in some cases. The detailed comparison is in<br /> table 1 below.<br /> We now present our experiment on AIS for<br /> detecting real viruses. Our viruses are quite<br /> strange to some well known antivirus<br /> software; and they can affect .COM and .EXE<br /> file types in the current directory only. The<br /> experiments were carried out by compared the<br /> viruses detecting ability of our AIS-IDS and<br /> that of famous commercial antivirus software<br /> KIS, Kaspersky Internet Security. The<br /> software was licensed and updated on Feb 15,<br /> 2011. The software can not detect our virus<br /> and can only detect the viruses after a few<br /> days. This can be explained as follows: 1.<br /> Database of the software has not had or has<br /> not updated signatures of the viruses; 2.<br /> Mechanism of virus detection based on<br /> behavior is not good enough. In contrast, our<br /> approach allows to detect changes even one<br /> bit of data immediately. Our system can alert<br /> 100% accurately, it means that the rate of<br /> false positive reduces to zero and the rate of<br /> false negative reduces to minimum.<br /> <br /> 80(04): 121 - 125<br /> <br /> Picture 1. Our system found 3 files infected with virus<br /> <br /> Picture 2. KIS can not detect virus<br /> <br /> Table 1. The result comparison<br /> <br /> |S|<br /> <br /> Our alg.’s<br /> space comp.:<br /> O(2r.(l-r))<br /> <br /> Space comp. of<br /> the algo. in [7]:<br /> O(2r.(l-r)2)<br /> <br /> Our algo.’s time<br /> complexity:<br /> O((l – r).|S| +<br /> O(2l-r))<br /> <br /> Time comp. of<br /> the algo. in [7]:<br /> O((l – r).|S| +<br /> O(2r(l-r))<br /> <br /> l<br /> <br /> R<br /> <br /> 250<br /> <br /> 16<br /> <br /> 10<br /> <br /> 6,144<br /> <br /> 36,864<br /> <br /> 1,564<br /> <br /> 7,644<br /> <br /> 500<br /> <br /> 16<br /> <br /> 11<br /> <br /> 10,240<br /> <br /> 51,200<br /> <br /> 2,532<br /> <br /> 12,740<br /> <br /> 500<br /> <br /> 32<br /> <br /> 14<br /> <br /> 294,912<br /> <br /> 5,308,416<br /> <br /> 271,144<br /> <br /> 303,912<br /> <br /> 1,000<br /> <br /> 16<br /> <br /> 12<br /> <br /> 16,384<br /> <br /> 65,536<br /> <br /> 4,016<br /> <br /> 20,384<br /> <br /> 1,000<br /> <br /> 32<br /> <br /> 14<br /> <br /> 294,912<br /> <br /> 5,308,416<br /> <br /> 280,144<br /> <br /> 312,912<br /> <br /> 1,000,000<br /> <br /> 32<br /> <br /> 15<br /> <br /> 557,056<br /> <br /> 9,469,952<br /> <br /> 17,000,362<br /> <br /> 17,557,056<br /> <br /> 124<br /> <br /> Nguyễn Văn Trường và đtg<br /> <br /> Tạp chí KHOA HỌC & CÔNG NGHỆ<br /> <br /> CONCLUSIONS<br /> The key of IDS based on artificial immune is<br /> generating the Detector set. Our approach<br /> helps to reduce both time and space<br /> complexities. This is really new result in the<br /> area. In the future, we plan to report more<br /> detail experimental data about the algorithm<br /> as well as to apply them for dynamic data<br /> space in our next papers. Because we believe<br /> that location table A can contain not only bit<br /> data, but also integer data for dynamic S. It<br /> means that A[i,j] will increase or decrease one<br /> when adding a self to S or deleting a self from<br /> S, respectively. Furthermore, it is our belief<br /> that our method can be used for variable<br /> matching length r. This new idea will be<br /> suitable for various problems in the field of<br /> computer science.<br /> ACKNOWLEDGMENT<br /> The authors would like to thank Dr. Nguyen<br /> Xuan Hoai and many colleagues in IT R&D<br /> Center - HNU, for their useful suggestions<br /> when authors presented the research in seminar<br /> series on computational intelligence last year.<br /> REFERENCES<br /> [1].Forrest et al, Self-Nonself Discrimination in a<br /> Computer, in Proceedings of 1994 IEEE<br /> Symposium on Research in Security and Privacy,<br /> Oakland, CA, 202-212, 1994.<br /> [2].F. Liu, Q. Wang and X. Gao, Survey of<br /> Artificial Immune System, in Proceedings of The<br /> First IEEE International Symposium on Systems<br /> and Control in Aerospace and Astronautics, 985989, 2006.<br /> <br /> 80(04): 121 - 125<br /> <br /> [3].Jamie Twycross at al., Detecting Anomalous<br /> Process Behavior using Second Generation<br /> Artificial<br /> Immune<br /> Systems,<br /> Journal<br /> of<br /> Unconventional Computing, Vol. 1, pp. 1–26,<br /> 2010.<br /> [4].L. N de Castro and J. Timmis, Artificial<br /> Immune Systems: A New Computational<br /> Intlligence Approach, Springer-Verlag, 2002.<br /> [5].Nguyễn Xuân Hoài, Nguyễn Văn Trường, Vũ<br /> Mạnh Xuân, “Hệ miễn dịch nhân tạo và ứng<br /> dụng”, Tạp chí Khoa học và Công nghệ - ĐH Thái<br /> Nguyên, số 2, 2007, 13-18.<br /> [6].Nguyen Van Truong, Pham Dinh Lam,<br /> Improving negative selection algorithm in<br /> artificial immune systems for computer virus<br /> detection, Journal of Science and Technology,<br /> Thai Nguyen University, 72(10), 2010, 53-58.<br /> [7].Patrik D’haeseleer et al., An Immunological<br /> Approach to Change Detection: Algorithms,<br /> Analysis and Implications, IEEE Symposium on<br /> Security and Privacy, 1996.<br /> [8].U. Aickelin, J. Greensmith and J. Twycross,<br /> Immune System Approaches to Intrusion<br /> Detection - A Review, in The Proceedings of the<br /> Third International Conference on Artificial<br /> Immune Systems, LNCS 3239, 316-329, 2004.<br /> [9].T. Lin et al. Research on The Network<br /> Intrusion Detection Based on The Immune<br /> System, in Proceedings of the Fifth IEEE<br /> International Conference on Machine Learning<br /> and Cybernetics, 4479-4482, 2006.<br /> [10]. Zhou Ji, Dipankar Dasgupta, Revisiting<br /> Negative Selection Algorithms, Evolutionary<br /> Computation 15(2), 2007.<br /> <br /> TÓM TẮT<br /> MỘT CÁCH TIẾP CẬN MỚI ĐỂ SINH RA MỘT KHO ĐẦY ĐỦ CÁC BỘ DÒ<br /> TRONG HỆ MIỄN DỊCH NHÂN TẠO<br /> Nguyễn Văn Trường1*, Phạm Đình Lâm2<br /> 1<br /> <br /> Trường Đại học Sư phạm – ĐH Thái Nguyên; 2 Đại học Thái Nguyên<br /> <br /> Vấn đề chủ yếu trong các Hệ miễn dịch nhân tạo là việc sinh ra tập bộ dò tốn nhiều chi phí về thời<br /> gian và bộ nhớ. Điểm yếu trong nhiều thuật toán sinh bộ dò là do các bộ dò được sinh ra ngẫu<br /> nhiên nên nhiều ứng cử viên cho bộ dò bị loại bỏ. Chúng tôi trình bày một cấu trúc dữ liệu mới để<br /> trích rút dữ liệu từ tập tế bào cần bảo vệ. Nó cho phép sinh ra tất cả các bộ dò, hay kho đầy đủ các<br /> bộ dò, với chi phí thấp hơn về thời gian và bộ nhớ so với những nghiên cứu gần đây. Những phân<br /> tích lý thuyết và kết quả thực nghiệm chứng tỏ cách tiếp cận của chúng tôi hiệu quả và khả thi.<br /> Những đặc điểm giá trị này cho phép các hệ thống AIS phức tạp có khả năng cao nhất để phát hiện<br /> những thay đổi trong dữ liệu và làm giảm phát hiện âm cực.<br /> Từ khóa: Hệ miễn dịch nhân tạo, Hệ phát hiện xâm nhập, tập tế bào, tập bộ dò, kho đầy đủ các<br /> bộ dò, phát hiện âm cực<br /> *<br /> <br /> Tel: 0915016063; Email: nvtruongtn@gmail.com<br /> <br /> 125<br /> <br />
ADSENSE

CÓ THỂ BẠN MUỐN DOWNLOAD

 

Đồng bộ tài khoản
2=>2