DLP - what does it mean? What are DLP systems, and how are they used in an enterprise?

Leakage channels that carry information out of a company's information system can be network channels (for example, e-mail or ICQ), local channels (external USB drives), or stored data (databases). The loss of media (flash memory, a laptop) can be treated as a separate category. A system belongs to the DLP class if it meets the following criteria: it is multi-channel (it monitors several possible leakage channels); it has unified management (a single set of management tools for all monitored channels); it provides active protection (it enforces the security policy); and it takes both content and context into account.

The competitive advantage of most systems is the analysis module. Manufacturers emphasize this module so much that they often name their products after it, for example, “Label-Based DLP Solution”. Therefore, the user often chooses solutions not based on performance, scalability, or other criteria traditional for the corporate information security market, but on the basis of the type of document analysis used.

Obviously, since each method has its advantages and disadvantages, relying on a single document analysis method makes the solution technologically dependent on it. Most manufacturers use several methods, although one of them is usually the "flagship". This article attempts to classify the methods used to analyze documents. Their strengths and weaknesses are assessed on the basis of practical experience with several types of products. The article deliberately does not consider specific products, because the user's main task when choosing one is to filter out marketing slogans like “we will protect everything from everything” and “unique patented technology” and to understand what he will be left with when the sellers leave.

Container Analysis

This method analyzes the properties of a file or other container (archive, encrypted disk, etc.) that holds the information. The colloquial name for such methods is “label-based solutions”, which reflects their essence quite fully. Each container carries a label that uniquely identifies the type of content inside it. These methods require almost no computing resources to analyze the information being moved, since the label fully describes the user's rights to move the content along any route. In simplified form, the algorithm is: “if there is a label, block the transfer; if there is no label, let it through.”
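The simplified rule above can be sketched in a few lines of Python. This is only an illustration: the `Container` class, the `labels` field and the label name "confidential" are assumptions made for the example, not part of any particular DLP product.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    """A file or other container; `labels` is a hypothetical stand-in for tag storage."""
    name: str
    labels: set = field(default_factory=set)


def allow_transfer(container, blocking_label="confidential"):
    """Label present: block the transfer. Label absent: let it through."""
    return blocking_label not in container.labels
```

Note that the decision never looks at the content itself, which is exactly why the check is fast and why unlabeled content passes unnoticed.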

The advantages of this approach are obvious: fast analysis and a complete absence of errors of the second kind (when the system mistakenly identifies an open document as confidential). Some sources call such methods "deterministic".

The disadvantages are just as obvious: the system only cares about labeled information, so if the label is not set, the content is not protected. A procedure is needed for labeling new and incoming documents, as well as a mechanism to stop information from being moved from a labeled container to an unlabeled one through clipboard operations, file operations, copying from temporary files, and so on.

The weakness of such systems also shows in how labeling is organized. If labels are set by the author of the document, a malicious author can simply leave unlabeled the information he intends to steal. Even without malicious intent, negligence or carelessness will show up sooner or later. If labeling is assigned to a specific employee, for example an information security officer or a system administrator, he will not always be able to tell confidential content from open content, since he does not know all the company's processes in detail. For example, the "white" balance sheet should be posted on the company's website, while the "gray" or "black" one must not leave the information system. But only the chief accountant, i.e. one of the authors, can tell one from the other.

Labels are usually divided into attribute, format and external labels. As the names suggest, the first are placed in file attributes, the second in fields of the file itself, and the third are attached to (associated with) the file by external programs.

Container structures in information security

Low performance requirements for interceptors are sometimes also cited as an advantage of label-based solutions, because the interceptors only check labels, i.e. they act like turnstiles in the subway: "if you have a ticket, go through." However, miracles do not happen: in this case the computational load is simply shifted to the workstations.

The natural niche for label-based solutions, whatever their kind, is the protection of document repositories. When a company has a document repository that is replenished fairly rarely and in which the category and confidentiality level of each document are precisely known, labels are the easiest way to protect it. Labeling of documents entering the repository can be handled by an organizational procedure. For example, before sending a document to the repository, the employee responsible for the repository can ask the author and a specialist what level of confidentiality to set for the document. This task is solved especially well with format labels: each incoming document is stored in a protected format and then issued at an employee's request, with that employee recorded as permitted to read it. Modern solutions allow access rights to be granted for a limited time; once the key expires, the document simply can no longer be read. This is the scheme used, for example, to issue documentation for public procurement tenders in the United States: the procurement management system generates a document that only the tender participants listed in it can read, without being able to change or copy the contents. The access key is valid only until the deadline for submitting tender documents, after which the document can no longer be read.

Label-based solutions are also used to organize document circulation in closed network segments where intellectual property and state secrets circulate. The requirements of the Federal Law “On Personal Data” will probably lead the personnel departments of large companies to organize their document flow in the same way.

Content analysis

The technologies described in this section, unlike those described earlier, are completely indifferent to the container in which the content is stored. Their purpose is to extract meaningful content from a container, or to intercept a transmission over a communication channel, and analyze the information for prohibited content.

The main technologies in detecting prohibited content in containers are signature control, hash-based control, and linguistic methods.

Signatures

The simplest control method is to search the data stream for some sequence of characters. A forbidden sequence of characters is sometimes called a "stop word", but in the more general case it need not be a word: it can be an arbitrary set of characters, for example the same label. Strictly speaking, not every implementation of this method counts as content analysis. For example, most UTM-class devices search the data stream for forbidden signatures "as is", without extracting the text from the container. And if the system is configured for a single word, the result of its work is a 100% match, i.e. the method can be classified as deterministic.

More often, however, the search for a specific sequence of characters is applied to extracted text. In the vast majority of cases, signature systems are configured to search for several words and to take the frequency of terms into account, so we will still classify this method as content analysis.
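A minimal sketch of such a multi-word signature check might look as follows. The function names, the sample stop words and the threshold of two hits are assumptions chosen for illustration; real products use their own dictionaries and scoring rules.

```python
import re
from collections import Counter


def count_stop_words(text, stop_words):
    """Count how often each stop word occurs in the text (case-insensitive)."""
    tokens = re.findall(r"\w+", text.lower())
    wanted = {word.lower() for word in stop_words}
    return Counter(token for token in tokens if token in wanted)


def is_suspicious(text, stop_words, threshold=2):
    """Flag the text when the total number of stop-word hits reaches the threshold."""
    return sum(count_stop_words(text, stop_words).values()) >= threshold
```

Because the check depends on how many terms occur and how often, the verdict is no longer a simple yes/no match on one sequence, which is why it is reasonable to treat this variant as content analysis.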

The advantages of this method include independence from the language and the ease of extending the dictionary of prohibited terms: to search the data stream for a word in Pashto, you do not need to know the language, you only need to know how the word is written. It is just as easy to add, for example, transliterated Russian text or the "Albanian" language (Russian Internet slang with deliberately distorted spelling), which matters when analyzing SMS texts, ICQ messages or blog posts.

The disadvantages become apparent with languages other than English. Unfortunately, most manufacturers of text analysis systems work for the American market, and English is a very "signature-friendly" language: word forms are most often built with prepositions, without changing the word itself. Russian is much more complicated. Take, for example, the word "secret", so dear to the heart of an information security officer. In English it serves as the noun "secret", the adjective "secret", and the verb "to keep secret". In Russian, several dozen different words can be formed from the root “secret”. That is, where an information security officer in an English-speaking organization only needs to enter one word, his counterpart in a Russian-speaking organization has to enter a couple of dozen words and then repeat them in six different encodings.

In addition, such methods are not robust against primitive encoding. Almost all of them fall for the favorite trick of novice spammers: replacing characters with similar-looking ones. The author has repeatedly demonstrated an elementary trick to security officers: passing confidential text through signature filters. Take a text containing, for example, the phrase "top secret" and a mail interceptor configured for this phrase. If the text is opened in MS Word, a two-second operation (Ctrl + F, find "o" (Russian layout), replace with "o" (English layout), "replace all", send the document) makes the document completely invisible to this filter. It is all the more frustrating that the replacement is done with the standard tools of MS Word or any other text editor, i.e. tools available to a user even without local administrator rights or the ability to run encryption programs.
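The trick is easy to reproduce. The sketch below assumes a naive filter that looks for the Russian phrase for "top secret" as a plain substring; the function names and the sample document are hypothetical.

```python
PHRASE = "совершенно секретно"  # Russian for "top secret"


def signature_filter(text, phrase=PHRASE):
    """Naive signature check: True if the stop phrase occurs in the text."""
    return phrase in text.lower()


def disguise(text):
    """Swap Cyrillic 'о' (U+043E) for the visually identical Latin 'o' (U+006F)."""
    return text.replace("\u043e", "\u006f")


document = "Гриф: " + PHRASE  # "Classification: top secret"
```

The disguised text looks the same on screen and has the same length, but `signature_filter(disguise(document))` no longer reports a match, because the byte sequence the filter is looking for is gone.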

Most often, signature-based flow control is included in the functionality of UTM devices, i.e. solutions that clean traffic from viruses, spam, intrusions and any other threats that are detected by signatures. Since this feature is "free", users often feel that this is enough. Such solutions really protect against accidental leaks, i.e. in cases where the outgoing text is not changed by the sender in order to bypass the filter, but they are powerless against malicious users.

Masks

The search for masks extends the search for stop-word signatures. It looks for content that cannot be specified exactly in the "stop word" database, but whose element or structure can be described. Such content includes any codes that identify a person or an enterprise: TIN, account numbers, document numbers, etc. They cannot be found using signatures.

It makes no sense to set the number of one specific bank card as a search object; what you want is to find any credit card number, no matter how it is written - with spaces or without. This is not just a wish but a requirement of the PCI DSS standard: sending unencrypted payment card numbers by e-mail is forbidden, so it is the user's responsibility to find such numbers in e-mail and reject the offending messages.
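As an illustration, a card-number mask could be sketched like this. The pattern covers only 16-digit numbers written together or separated by spaces or hyphens, and the Luhn checksum is an addition of this example (a common way to discard random 16-digit strings), not something the text above prescribes; all names are hypothetical.

```python
import re

# 16 digits, optionally separated by single spaces or hyphens.
CARD_MASK = re.compile(r"(?<!\d)(?:\d[ -]?){15}\d(?!\d)")


def luhn_valid(digits):
    """Luhn checksum: double every second digit from the right, sum, check mod 10."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text):
    """Return candidate card numbers (digits only) that match the mask and pass Luhn."""
    found = []
    for match in CARD_MASK.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            found.append(digits)
    return found
```

The same number is found whether it is written as "4111 1111 1111 1111", "4111-1111-1111-1111" or "4111111111111111", which is the property a single fixed signature cannot provide.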

Consider, for example, a mask that specifies a stop word such as the name of a confidential or secret order whose number starts with zero. Such a mask covers not only an arbitrary number, but also any letter case and even the substitution of Russian letters with Latin ones. Masks are written in standard regular expression ("REGEXP") notation, although different DLP systems may have their own, more flexible notation. The situation is even worse with phone numbers. This information is classified as personal data, and it can be written in a dozen ways, using different combinations of spaces, different types of brackets, plus and minus signs, etc. Here a single mask is probably not enough. For example, anti-spam systems, which have to solve a similar task, use several dozen masks simultaneously to detect a telephone number.
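A hypothetical mask of this kind can be built with ordinary regular expressions. The stop phrase, the table of lookalike letters and the way the order number is written are all assumptions made for this sketch; it does not reproduce any mask from a real DLP system.

```python
import re

# Cyrillic letters and their Latin lookalikes (illustrative subset, not exhaustive).
TWINS = {
    "\u0430": "a", "\u0441": "c", "\u0435": "e", "\u043a": "k",
    "\u043e": "o", "\u0440": "p", "\u0445": "x", "\u0443": "y",
}

STOP_PHRASE = "секретный приказ"  # hypothetical stop phrase: "secret order"


def lookalike_mask(phrase):
    """Build a pattern in which each letter also matches its Latin lookalike."""
    parts = []
    for char in phrase.lower():
        if char.isspace():
            parts.append(r"\s+")
        elif char in TWINS:
            parts.append("[" + char + TWINS[char] + "]")
        else:
            parts.append(re.escape(char))
    return "".join(parts)


# The phrase in any case, an optional number sign, then a number starting with zero.
ORDER_MASK = re.compile(
    lookalike_mask(STOP_PHRASE) + r"\s*(?:№|No\.?|#)?\s*0\d*",
    re.IGNORECASE,
)
```

With `re.IGNORECASE` the mask matches the phrase in upper or lower case, and the character classes keep it matching after a Cyrillic-to-Latin substitution, while an order number that does not start with zero is ignored.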

Many of the codes that appear in the activities of companies and their employees are protected by law and constitute commercial secrets, banking secrets, personal data and other legally protected information, so the ability to detect them in traffic is a mandatory requirement for any solution.

Hash functions

Various types of hash functions for samples of confidential documents were at one time considered the latest word in the leak protection market, although the technology itself has been around since the 1970s. In the West this method is sometimes called "digital fingerprints", or "shingles" in scientific slang.

The essence of all these methods is the same, although the specific algorithms of each manufacturer may differ significantly. Some algorithms are even patented, which confirms the uniqueness of the implementation. The general scenario is as follows: a database of samples of confidential documents is collected. A “fingerprint” is taken from each of them: meaningful content is extracted from the document and reduced to some normal form, for example (but not necessarily) plain text, and then hashes are computed for the whole content and for its parts, such as paragraphs, sentences, five-word sequences, etc.; the level of detail depends on the specific implementation. These fingerprints are stored in a special database.

An intercepted document is stripped of service information in the same way and reduced to normal form, and then shingle fingerprints are taken from it using the same algorithm. The resulting fingerprints are looked up in the database of fingerprints of confidential documents, and if a match is found, the document is considered confidential. Since this method is used to find direct quotes from a sample document, the technology is sometimes called "anti-plagiarism".
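The scenario above can be sketched as follows. This is a simplified illustration, not any vendor's algorithm: normalization is reduced to lowercasing and splitting into words, the shingle size of five words and the use of SHA-1 are assumptions, and the function names are hypothetical.

```python
import hashlib
import re


def normalize(text):
    """Reduce the text to a normal form: a list of lowercase words."""
    return re.findall(r"\w+", text.lower())


def shingles(text, size=5):
    """Return the set of hashes of all overlapping `size`-word sequences."""
    words = normalize(text)
    if not words:
        return set()
    if len(words) < size:
        groups = [words]
    else:
        groups = [words[i:i + size] for i in range(len(words) - size + 1)]
    return {hashlib.sha1(" ".join(g).encode("utf-8")).hexdigest() for g in groups}


def overlap(intercepted, sample_prints, size=5):
    """Share of the intercepted document's shingles found in the sample database."""
    prints = shingles(intercepted, size)
    if not prints:
        return 0.0
    return len(prints & sample_prints) / len(prints)
```

A direct quote of five or more words from a sample produces at least one matching shingle, so `overlap` is greater than zero; what share counts as "confidential" is a policy decision left to the operator.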

Most of the advantages of this method are also its disadvantages. First of all, there is the requirement to use sample documents. On the one hand, the user does not have to worry about stop words, significant terms and other matters that are foreign to security officers. On the other hand, "no sample, no protection" creates the same problems with new and incoming documents as label-based technologies. A very important advantage of this technology is that it works with arbitrary character sequences. The first consequence is independence from the language of the text, be it hieroglyphs or Pashto. Another major consequence is the ability to take fingerprints of non-textual information: databases, drawings, media files. These are the technologies that Hollywood studios and major recording studios use to protect media content in their digital storage.

Unfortunately, low-level hash functions are not robust to the primitive encoding discussed in the signature example. They easily cope with changing the order of words, rearranging paragraphs and other tricks of "plagiarists", but, for example, changing letters throughout the document destroys the hash pattern and such a document becomes invisible to the interceptor.

Relying on this method alone complicates work with forms. An empty loan application form is a freely distributed document, while a completed one is confidential, since it contains personal data. If you simply take a fingerprint of the empty form, an intercepted completed document will contain all the information from the empty form, i.e. the fingerprints will largely match. Thus the system will either let confidential information through or prevent empty forms from being freely distributed.

Despite these shortcomings, the method is widely used, especially in businesses that cannot afford qualified employees and operate on the principle of "put all confidential information in this folder and sleep well." In this sense, requiring specific documents in order to protect them is somewhat similar to label-based solutions, except that the fingerprints are stored separately from the samples and survive a change of file format, copying part of the file, etc. However, a large business with hundreds of thousands of documents in circulation is often simply unable to provide samples of confidential documents, because the company's business processes do not require it. The only thing that exists (or, more honestly, should exist) in every enterprise is the "List of information constituting a trade secret." Making samples out of it is not a trivial task.

The ease of adding samples to the controlled content database often backfires on users. It leads to a gradual growth of the fingerprint base, which significantly affects system performance: the more samples, the more comparisons for each intercepted message. Each fingerprint takes up 5 to 20% of the size of the original, so the base keeps growing. Users notice a sharp drop in performance when the database begins to exceed the RAM of the filtering server. The problem is usually solved by regularly auditing sample documents and removing outdated or duplicate samples; in other words, what users save on implementation they lose on operation.

Linguistic Methods

The most common method of analysis today is linguistic analysis of the text. It is so popular that it is often referred to colloquially as "content filtering", i.e. it carries the name of the entire class of content analysis methods. From the point of view of classification, hash analysis, signature analysis and mask analysis are all “content filtering”, i.e. traffic filtering based on content analysis.

As the name implies, the method works only with texts. You cannot use it to protect a database consisting only of numbers and dates, let alone drawings, blueprints or a collection of favorite songs. But with texts, this method works wonders.

Linguistics as a science consists of many disciplines, from morphology to semantics, so linguistic methods of analysis also differ from one another. Some methods use only stop words, entered at the root level, and the system itself compiles the complete dictionary; others are based on assigning weights to the terms found in the text. Linguistic methods also have their own fingerprints, based on statistics: for example, a document is taken, its fifty most used words are counted, and then the 10 most used words in each paragraph are selected. Such a "dictionary" is an almost unique characteristic of the text and makes it possible to find meaningful quotes in "clones".
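One possible reading of the statistical fingerprint described above is sketched below. The assumptions are loud ones: words are split by a simple regular expression, paragraphs are separated by blank lines, no stemming or stop-word removal is done, and the function names are invented for the example.

```python
import re
from collections import Counter


def top_words(text, n):
    """Return the n most frequent words of the text, most frequent first."""
    words = re.findall(r"\w+", text.lower())
    return [word for word, _ in Counter(words).most_common(n)]


def statistical_profile(document, doc_top=50, para_top=10):
    """Build a 'dictionary': top words of the whole document and of each paragraph."""
    paragraphs = [p for p in document.split("\n\n") if p.strip()]
    return {
        "document": top_words(document, doc_top),
        "paragraphs": [top_words(p, para_top) for p in paragraphs],
    }
```

Two profiles can then be compared, for example by counting shared words per paragraph, to judge whether an intercepted text is a "clone" of a known document; the comparison rule itself is left open here.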

Analysis of all the subtleties of linguistic analysis is not within the scope of this article, so we will focus on the advantages and disadvantages.

The advantage of the method is complete insensitivity to the number of documents, i.e. scalability that is rare in corporate information security. The content filtering base (a set of key vocabulary classes and rules) does not change in size as new documents or processes appear in the company.

In addition, users note that this method resembles "stop words" in that, if a document is held, it is immediately clear why. If a fingerprint-based system reports that a document is similar to another, the security officer has to compare the two documents himself, whereas with linguistic analysis he receives content that is already marked up. Linguistic systems, along with signature filtering, are so common because they let you start working immediately after installation, without any changes in the company. There is no need to bother with labeling and fingerprinting, inventorying documents or other work that is foreign to a security officer.

The disadvantages are just as obvious, and the first is language dependence. Within a single country whose language the manufacturer supports, this is not a disadvantage, but for global companies that have, in addition to a single language of corporate communication (for example, English), many documents in the local languages of each country, it clearly is.

Another drawback is a high percentage of errors of the second kind; reducing it requires linguistic expertise (to fine-tune the filtering base). Standard industry bases typically give 80-85% filtering accuracy, which means that every fifth or sixth message is intercepted by mistake. Tuning the base to an acceptable 95-97% accuracy usually requires a specially trained linguist. And although two days of free time and a high-school command of the language are enough to learn how to adjust the filtering base, there is no one to do this work except the security officer, and he usually considers it non-core. Bringing in an outsider is always risky, since he would have to work with confidential information. The usual way out is to buy an additional module, a self-learning "autolinguist", which is "fed" false positives and automatically adapts the standard industry base.
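The arithmetic behind "every fifth or sixth" can be checked with a one-line calculation. It assumes that "accuracy" here means the share of messages handled correctly, so that the rest are errors; the function name is hypothetical.

```python
def messages_per_error(accuracy):
    """With the given accuracy (0..1), one message in this many is handled wrongly."""
    return 1.0 / (1.0 - accuracy)
```

At 80% accuracy this gives one error per 5 messages and at 85% roughly one per 6.7, while the tuned 95-97% range corresponds to one error per 20 to 33 messages.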

Linguistic methods are chosen when they want to minimize interference in the business, when the information security service does not have the administrative resource to change the existing processes for creating and storing documents. They work always and everywhere, albeit with the disadvantages mentioned.

Popular channels of accidental leaks: mobile storage media

InfoWatch analysts believe that mobile media (laptops, flash drives, mobile communicators, etc.) remain the most popular channel for accidental leaks, since users of such devices often neglect data encryption tools.

Another common cause of accidental leaks is paper media, which is harder to control than electronic media: once a sheet leaves the printer, it can only be monitored “manually”. Many leak protection tools (they cannot be called full-fledged DLP systems) do not control the printer output channel, so confidential data easily leaves the organization.

This problem can be solved by multifunctional DLP systems that block the sending of unauthorized information for printing and check the correspondence of the postal address and the addressee.

Leak protection is also greatly complicated by the growing popularity of mobile devices, because there are no corresponding DLP clients for them yet. It is also very difficult to detect a leak when cryptography or steganography is used. An insider who wants to bypass a filter can always turn to the Internet for “best practices”. In other words, DLP tools protect rather poorly against an organized intentional leak.

The effectiveness of DLP tools is also limited by an obvious flaw: modern leak protection solutions cannot control and block all available information channels. DLP systems monitor corporate e-mail, web usage, instant messaging, external media, document printing and the contents of hard drives. But Skype remains beyond their control. Only Trend Micro has so far announced that it can control this communication program. The other developers promise that the corresponding functionality will appear in the next version of their security software.

But while Skype promises to open its protocols to DLP developers, other solutions, such as Microsoft Collaboration Tools for organizing collaboration, remain closed to third-party programmers. How can the transmission of information over this channel be controlled? Meanwhile, it is becoming common practice for specialists to form remote teams to work on a common project and to disband after its completion.

The main sources of leaks of confidential information in the first half of 2010 are still commercial (73.8%) and government (16%) organizations. About 8% of leaks come from educational institutions. Most of the leaked confidential information is personal data (almost 90% of all information leaks).

The leaders in leaks worldwide are traditionally the United States and Great Britain (Canada, Russia and Germany are also in the top five by number of leaks, with significantly lower figures); this is due to the legislation of these countries, which requires all incidents of confidential data leakage to be reported. InfoWatch analysts predict a decrease in the share of accidental leaks and an increase in the share of intentional leaks next year.

Implementation difficulties

In addition to the obvious difficulties, the implementation of DLP is also hampered by the difficulty of choosing the right solution, since various vendors of DLP systems profess their own approaches to the organization of protection. Some have patented algorithms for analyzing content by keywords, while others offer a method of digital fingerprints. How to choose the best product in these conditions? What is more efficient? It is very difficult to answer these questions, since there are very few implementations of DLP systems today, and there are even fewer real practices for their use (which one could rely on). But those projects that were nevertheless implemented showed that consulting accounts for more than half of the scope of work and budget, and this usually causes great skepticism among management. In addition, as a rule, existing business processes of the enterprise have to be restructured to meet the requirements of DLP, and companies are having difficulty doing this.

To what extent does the introduction of DLP help to comply with the current requirements of regulators? In the West, the introduction of DLP systems is driven by laws, standards, industry requirements and other regulations. According to experts, the clear legal requirements that exist abroad, together with guidelines on how to meet them, are the real engine of the DLP market, since introducing specialized solutions removes claims from regulators. In Russia the situation in this area is completely different, and the introduction of DLP systems does not help to comply with the law.

Some incentive for the introduction and use of DLP in a corporate environment may be the need to protect the trade secrets of companies and comply with the requirements of the federal law "On Trade Secrets".

Almost every enterprise has adopted such documents as the “Regulations on trade secrets” and “List of information constituting a trade secret”, and their requirements should be followed. There is an opinion that the law "On Trade Secrets" (98-FZ) does not work, however, company executives are well aware that it is important and necessary for them to protect their trade secrets. Moreover, this awareness is much higher than the understanding of the importance of the law “On Personal Data” (152-FZ), and it is much easier for any manager to explain the need to introduce confidential document management than to talk about the protection of personal data.

What prevents the use of DLP in the process of automating the protection of trade secrets? According to the Civil Code of the Russian Federation, in order to introduce a trade secret protection regime, it is only necessary that the information has some value and be included in the appropriate list. In this case, the owner of such information is required by law to take measures to protect confidential information.

At the same time, it is obvious that DLP cannot solve every issue. In particular, it cannot close off third-party access to confidential information. But other technologies exist for this, and many modern DLP solutions can integrate with them. Building such a technological chain can yield a working system for protecting trade secrets. Such a system will be easier for the business to understand, and it is the business that can act as the customer of the leak protection system.

Russia and the West

According to analysts, Russia has a different attitude towards security and a different level of maturity among companies supplying DLP solutions. The Russian market focuses on security specialists and highly specialized problems. People who work on data breach prevention do not always understand which data is valuable. Russia takes a "militarist" approach to organizing security systems: a strong perimeter with firewalls, with every effort made to prevent penetration.

But what if a company employee has access to an amount of information that is not required to perform his duties? On the other hand, if we look at the approach that has been formed in the West in the last 10-15 years, we can say that more attention is paid to the value of information. Resources are directed to where valuable information is located, and not to all the information in a row. Perhaps this is the biggest cultural difference between the West and Russia. However, analysts say the situation is changing. Information is beginning to be perceived as a business asset, and it will take some time to evolve.

There is no comprehensive solution

No manufacturer has yet developed 100% leak protection. Some experts formulate the problems of using DLP products as follows: to make effective use of the leak-fighting experience built into DLP systems, the customer must understand that much of the work on leak protection has to be done on his side, since no one knows his information flows better than he does.

Others believe that it is impossible to prevent information leakage: since the information is of value to someone, it will be obtained sooner or later. Software tools can make obtaining it more costly and time-consuming, which can significantly reduce the benefit of possessing the information and its relevance. This means that the efficiency of DLP systems should be monitored.

Business performance in many cases depends on maintaining the confidentiality, integrity and availability of information. Currently, one of the most pressing tasks in the field of information security (IS) is the protection of confidential data from unauthorized user actions.
This is because most traditional protection tools, such as antiviruses, firewalls and intrusion prevention systems (IPS), cannot provide effective protection against insiders, whose goal may be to take information outside the company for later use: sale, transfer to third parties, publication in the public domain, etc. Data Loss Prevention (DLP) systems are designed to solve the problem of accidental and intentional leaks of confidential data.
Such systems create a secure "digital perimeter" around the organization, analyzing all outgoing and, in some cases, incoming information. The controlled information includes not only Internet traffic but also a number of other information flows: documents that are taken outside the protected security perimeter on external media, printed, sent to mobile devices via Bluetooth or WiFi, etc.
DLP systems analyze data flows crossing the perimeter of the protected information system. When confidential information is detected in this stream, the active component of the system is triggered and the transmission of a message (packet, stream, session) is blocked. Identification of confidential information in data streams is carried out by analyzing the content and identifying special features: the signature of the document, specially introduced labels, hash function values from a certain set, etc.
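The detect-then-block logic described above can be sketched as a small function. Everything in it is an assumption made for illustration: a "detector" is any function that takes the message text and returns True when it recognizes confidential content (a label check, a signature search, a fingerprint lookup), and the verdicts are plain strings.

```python
from typing import Callable, Iterable

Detector = Callable[[str], bool]


def inspect(message: str, detectors: Iterable[Detector]) -> str:
    """Return "block" as soon as any detector fires, otherwise "allow"."""
    for detect in detectors:
        if detect(message):
            return "block"
    return "allow"
```

For example, `inspect(text, [lambda m: "[CONFIDENTIAL]" in m])` blocks any message carrying that hypothetical label; a real system would also log the event and could apply other actions defined by the security policy.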
Modern DLP systems have a huge number of parameters and characteristics that must be taken into account when choosing a solution for organizing the protection of confidential information from leaks. Perhaps the most important of these is the network architecture used. According to this parameter, the products of the class under consideration are divided into two large groups: gateway (Fig. 1) and host (Fig. 2).
The first group uses a single server to which all outgoing network traffic of the corporate information system is directed. This gateway processes it in order to detect possible leaks of confidential data.

Fig. 1. Functional diagram of a gateway DLP solution

The second option is based on special programs (agents) installed on the end nodes of the network: workstations, application servers, etc.

Fig. 2. Functional diagram of the host DLP solution

Recently, there has been a strong trend towards the universalization of DLP systems. There are no or almost no solutions left on the market that could be called purely host or gateway solutions. Even those developers who have been developing only one direction for a long time add modules of the second type to their solutions.
There are two reasons for the move towards universal DLP solutions. The first is that different types of systems have different areas of application. Host DLP solutions control all kinds of local channels for leaking confidential information, while network solutions control Internet channels. Since in the vast majority of cases an organization needs full protection, it needs both. The second reason is that certain technological features and limitations prevent purely gateway DLP systems from fully controlling all the necessary Internet channels.
Since potentially dangerous data transmission channels cannot be prohibited completely, they can instead be put under control. The essence of control is monitoring all transmitted information, identifying confidential information within it, and performing the operations specified by the organization's security policy. Obviously, the main, most important and most time-consuming task is data analysis. The efficiency of the entire DLP system depends on its quality.

Data flow analysis methods for DLP

Analyzing the data flow in order to identify confidential information is a non-trivial task, since the search for the necessary data is complicated by many factors that must be taken into account. For this reason, several technologies for detecting attempts to transfer confidential data have been developed to date, each with its own principle of operation.
Conventionally, all leak detection methods can be divided into two groups. The first includes technologies based on analyzing the text of the transmitted messages or documents themselves (morphological analysis, statistical analysis, templates). By analogy with anti-virus protection, they can be called proactive. The second group consists of reactive methods (digital fingerprints and tags). They detect leaks by the properties of documents or by the presence of special marks in them.

Morphological analysis

Morphological analysis is one of the most common content methods for detecting leaks of confidential information. The essence of this method is to search for specific words and/or phrases in the transmitted text.
The main advantage of this method is its versatility. On the one hand, morphological analysis can be used to control any communication channel, from files copied to removable drives to messages in ICQ, Skype and social networks; on the other hand, it can analyze any text and track any information. Confidential documents need no preliminary processing, and protection takes effect as soon as the processing rules are enabled, applying to all specified communication channels.
The main disadvantage of morphological analysis is the relatively low efficiency of detecting confidential information. Moreover, it depends both on the algorithms used in the protection system and on the quality of the semantic core used to describe the protected data.
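The idea of searching for words regardless of their grammatical form can be illustrated with a toy example. The sketch below assumes English text and a deliberately crude suffix-stripping stemmer; real products rely on proper morphological dictionaries, and the names `SUFFIXES`, `stem` and `find_terms` are hypothetical.

```python
import re

# Toy list of English suffixes; an assumption made for this illustration only.
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word: str) -> str:
    """Reduce a word to a crude stem by stripping one known suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def find_terms(text: str, dictionary: set) -> set:
    """Return the words of the text whose stem matches a dictionary term."""
    stems = {stem(term) for term in dictionary}
    return {w for w in re.findall(r"[A-Za-z]+", text) if stem(w) in stems}
```

With the dictionary `{"contract", "budget"}`, the sentence "We are budgeting for the signed contracts." yields the matches `budgeting` and `contracts`. A synonym such as "agreement" would be missed, which illustrates how much the detection rate depends on the quality of the dictionary.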

Statistical analysis

Statistical methods work by probabilistic analysis of the text, which makes it possible to estimate whether the text is confidential or public. They usually require preliminary training of the algorithm, during which the probability of finding certain words and phrases in confidential documents is calculated.
The advantage of statistical analysis is its versatility. However, this technology works properly only if the algorithm is continuously retrained. For example, if the system was given too few contracts during training, it will not be able to detect their transfer. In other words, the quality of statistical analysis depends on how well it is trained and configured. The probabilistic nature of the technology must also be taken into account.
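As an illustration of training followed by probabilistic scoring, here is a minimal naive Bayes sketch with Laplace smoothing. The article does not name a specific algorithm, so naive Bayes is an assumption chosen for simplicity, as are the class and method names. Both classes must receive at least one training document before scoring.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Split text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Toy two-class text classifier: 'confidential' versus 'public'."""

    def __init__(self):
        self.word_counts = {"confidential": Counter(), "public": Counter()}
        self.doc_counts = Counter()

    def train(self, text: str, label: str) -> None:
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def prob_confidential(self, text: str) -> float:
        """Estimate P(confidential | text); both classes must be trained."""
        vocab = set(self.word_counts["confidential"]) | set(self.word_counts["public"])
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for word in tokenize(text):
                # Laplace smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            log_scores[label] = score
        # Normalize in log space to avoid underflow on long texts.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["confidential"] / sum(weights.values())
```

The result is a probability rather than a verdict, so a threshold has to be chosen, and a class of documents that was poorly represented during training will score low no matter how sensitive it is.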

Regular expressions (patterns)

The essence of the method is as follows: the security administrator defines a string template for confidential data: the number of characters and their type (letter or number). After that, the system starts looking for combinations in the analyzed texts that satisfy it, and applies the actions specified in the rules to the found files or messages.
The main advantage of templates is their high efficiency in detecting the transfer of confidential information. For accidental leaks, the detection rate approaches 100%. Intentional transfers are more complicated. Knowing the capabilities of the DLP system in use, an attacker can counteract it, for example by separating the characters with other symbols. Therefore, the methods used to protect confidential information must be kept secret.
The disadvantages of templates include, first of all, their limited scope. They can be used only for standardized information, for example to protect personal data. Another disadvantage of the method is the relatively high rate of false positives. For example, a passport number consists of six digits. If such a pattern is set, it will fire every time six digits appear in a row, and this may be a contract number sent to a client, an amount, etc.
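The six-digit example can be reproduced with Python's standard `re` module. The sketch shows the bare pattern, a variant narrowed by a nearby keyword, and a simple normalization step against separator tricks; the names `SIX_DIGITS`, `WITH_CONTEXT` and `normalize`, as well as the 20-character window, are assumptions for illustration.

```python
import re

# Bare template: exactly six digits in a row.
SIX_DIGITS = re.compile(r"\b\d{6}\b")

# Narrower template: six digits within 20 non-digit characters after "passport".
WITH_CONTEXT = re.compile(r"\bpassport\b\D{0,20}(\d{6})\b", re.IGNORECASE)


def normalize(text: str) -> str:
    """Remove single spaces, dots or hyphens placed between two digits."""
    return re.sub(r"(?<=\d)[ .\-](?=\d)", "", text)
```

The bare pattern fires both on "Passport No. 123456 attached" and on "Contract 654321 is signed", while the contextual variant matches only the first. A number written as "1-2-3-4-5-6" slips past the bare pattern until the text is normalized. Normalization itself can merge unrelated numbers and so raise the false positive rate again, which is the trade-off the paragraph above describes.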

Digital fingerprints

In this context, a digital fingerprint is a set of characteristic elements of a document by which the document can later be identified with high confidence. Modern DLP solutions can detect not only entire files but also their fragments, and can even calculate the degree of match. Such solutions allow differentiated rules that prescribe different actions for different match percentages.
An important feature of digital fingerprints is that they can be used not only for text, but also for spreadsheet documents and for images. This opens up a wide field of application for the technology.
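Fragment detection with a match percentage can be sketched using hashed word shingles. This is one common way to build a text fingerprint and is an assumption here, since the article does not specify the algorithm; the shingle size, the thresholds and the function names are hypothetical.

```python
import hashlib
import re


def shingles(text: str, k: int = 3) -> set:
    """Return hashes of all runs of k consecutive words in the text."""
    words = re.findall(r"\w+", text.lower())
    return {
        hashlib.sha1(" ".join(words[i:i + k]).encode("utf-8")).hexdigest()
        for i in range(len(words) - k + 1)
    }


def match_percent(fingerprint: set, text: str, k: int = 3) -> float:
    """Percentage of the protected document's shingles found in the text."""
    if not fingerprint:
        return 0.0
    return 100.0 * len(fingerprint & shingles(text, k)) / len(fingerprint)


def action_for(percent: float) -> str:
    """Differentiated rule; the 80 and 30 thresholds are illustrative."""
    if percent >= 80:
        return "block"
    if percent >= 30:
        return "notify"
    return "allow"
```

The fingerprint is computed once from the protected document with `shingles(document)`, and every outgoing text is then scored against it. A complete copy scores 100, a quoted fragment scores proportionally less, and unrelated text scores 0.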

Digital tags

The principle of this method is as follows: special marks are applied to the selected documents, which are visible only to the client modules of the used DLP solution. Depending on their presence, the system allows or prohibits certain actions with files. This allows not only to prevent the leakage of confidential documents, but also to limit the work of users with them, which is an undoubted advantage of this technology.
The disadvantages of this technology include, first of all, its limited scope. It can protect only text documents, and only those that already exist; newly created documents are not covered. This disadvantage is partially offset by methods that create tags automatically, for example based on a set of keywords. However, this reduces digital tag technology to morphological analysis, that is, in effect, to a duplication of technologies.
Another disadvantage of digital tag technology is that it can be easily bypassed. It is enough to manually type the text of the document in the letter (do not copy it through the clipboard, but type it), and this method will be powerless. Therefore, it is only good in combination with other protection methods.
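The tag check and the retyping bypass can both be shown with a toy model. Here a tag is modelled as metadata attached to a document object; the `Document` class, the `"confidential"` tag name and `may_send` are assumptions for illustration and do not reflect how any specific product stores its marks.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Toy document: text plus a set of tags visible only to the DLP agent."""
    text: str
    tags: set = field(default_factory=set)


def may_send(doc: Document) -> bool:
    """Allow sending only if the document carries no confidentiality tag."""
    return "confidential" not in doc.tags


original = Document("Salary table for 2014", tags={"confidential"})
# Retyping the text by hand creates a new object that carries no tag.
retyped = Document(original.text)
```

In this model `may_send(original)` is false while `may_send(retyped)` is true, even though the text is identical, because the decision depends only on the tag and never on the content.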

Main functions of DLP systems:

The main functions of DLP systems are visualized in the figure below (Fig. 3)

control of information transfer via the Internet using E-Mail, HTTP, HTTPS, FTP, Skype, ICQ and other applications and protocols;
control of saving information on external media - CD, DVD, flash drives, cell phones, etc.;
protection of information from leakage by controlling the output of data to print;
blocking attempts to send / save confidential data, informing information security administrators about incidents, creating shadow copies, using a quarantine folder;
search for confidential information on workstations and file servers by keywords, document labels, file attributes and digital fingerprints;
prevention of information leaks by controlling the life cycle and movement of confidential information.

Fig. 3. Main functions of DLP systems

Protection of confidential information in a DLP system is carried out at three levels:

Level 1 - Data-in-Motion - data transmitted over network channels:

web (HTTP/HTTPS protocols);
instant messaging services (ICQ, QIP, Skype, MSN, etc.);
corporate and personal mail (POP, SMTP, IMAP, etc.);
wireless systems (WiFi, Bluetooth, 3G, etc.);
FTP connections.

Level 2 - Data-at-Rest - data statically stored on:

servers;
workstations;
laptops;
data storage systems.

Level 3 - Data-in-Use - data used on workstations.

The DLP class system includes the following components:

control and monitoring center;
agents on user workstations;
DLP network gateway installed on the Internet edge.

In DLP systems, confidential information can be identified by a number of different features and in various ways. The main ones are:

morphological analysis of information;
statistical analysis of information;
regular expressions (templates);
digital fingerprint method;
digital tag method.

The introduction of DLP systems has long been a necessity rather than a fashion, because a leak of confidential data can cause huge damage to the company and, most importantly, have a long-term impact on its business. The damage can be not only direct but also indirect: in addition to the main damage, the company "loses face", especially if information about the incident is disclosed. The damage from loss of reputation is very difficult to express in money. Yet the ultimate goal of creating an information technology security system is to prevent or minimize damage (direct or indirect, material, moral or otherwise) inflicted on the subjects of information relations through an undesirable impact on information, its carriers and the processes that handle it.

28.01.2014 Sergei Korablev

The choice of any enterprise-level product is not a trivial task for technical specialists and decision makers. Choosing a data loss prevention (DLP) system is even more difficult. The lack of a unified conceptual framework and of regular independent comparative studies, together with the complexity of the products themselves, forces consumers to order pilot projects from manufacturers and to conduct numerous tests on their own, determining the range of their needs and correlating them with the capabilities of the systems being tested.

Such an approach is certainly correct. A balanced, and in some cases even a hard-won decision simplifies further implementation and avoids disappointment in the operation of a particular product. However, the decision-making process in this case can be delayed, if not for years, then for many months. In addition, the constant expansion of the market, the emergence of new solutions and manufacturers further complicate the task of not only choosing a product for implementation, but also creating a preliminary shortlist of suitable DLP systems. Under such conditions, up-to-date reviews of DLP systems are of undoubted practical value for technical specialists. Should a particular solution be included in the test list, or would it be too complex to implement in a small organization? Can the solution be scaled to a company of 10,000 employees? Can a DLP system control business-critical CAD files? An open comparison will not replace thorough testing, but will help answer basic questions that arise at the initial stage of the DLP selection process.

Participants

The participants are the DLP systems most popular on the Russian information security market (according to the Anti-Malware.ru analytical center as of mid-2013), from InfoWatch, McAfee, Symantec, Websense, Zecurion and Jet Infosystems.

For the analysis, commercially available versions of DLP systems were used at the time of preparation of the review, as well as documentation and open reviews of products.

Criteria for comparing DLP systems were selected based on the needs of companies of various sizes and industries. The main task of DLP systems is to prevent leaks of confidential information through various channels.

Examples of products from these companies are shown in Figures 1-6.

Figure 3. Symantec product

Figure 4. InfoWatch product

Figure 5. Websense product

Figure 6. McAfee product

Operating modes

The two main operating modes of DLP systems are active and passive. Active mode is usually the main mode of operation: it blocks actions that violate security policies, such as sending confidential information to an external mailbox. Passive mode is most often used at the stage of system configuration to check and adjust settings while the proportion of false positives is high. In this mode, policy violations are recorded, but no restrictions are imposed on the movement of information (Table 1).
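The difference between the two modes comes down to what happens after a violation is recorded. A minimal sketch, with the hypothetical names `Mode` and `process`:

```python
from enum import Enum


class Mode(Enum):
    ACTIVE = "active"    # record the violation and block the transfer
    PASSIVE = "passive"  # record the violation but let the transfer through


def process(message: str, violates_policy: bool, mode: Mode, incidents: list) -> bool:
    """Return True if the transfer may proceed, logging any violation."""
    if not violates_policy:
        return True
    incidents.append({"message": message, "mode": mode.value})
    return mode is Mode.PASSIVE
```

In both modes the incident ends up in the log, so rules can be tuned in passive mode on real traffic before blocking is switched on.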

In this aspect, all the considered systems turned out to be equivalent. Each of the DLPs can work both in active and passive modes, which gives the customer a certain freedom. Not all companies are ready to start operating DLP immediately in blocking mode - this is fraught with disruption of business processes, dissatisfaction on the part of employees of controlled departments and claims (including justified ones) from management.

Technologies

Detection technologies make it possible to classify information transmitted via electronic channels and identify confidential information. Today, there are several basic technologies and their varieties, similar in essence, but different in implementation. Each technology has both advantages and disadvantages. In addition, different types of technologies are suitable for analyzing information of different classes. Therefore, manufacturers of DLP solutions try to integrate the maximum number of technologies into their products (see Table 2).

In general, the products offer a large number of technologies that, if properly configured, provide a high rate of recognition of confidential information. The McAfee, Symantec and Websense DLP products are rather poorly adapted to the Russian market and cannot offer users support for "language" technologies: morphology, transliteration analysis and masked text.

Controlled channels

Each data transmission channel is a potential channel for leaks. Even one open channel can negate all the efforts of the information security service that controls information flows. That is why it is so important to block channels that are not used by employees for work, and control the rest with the help of leak prevention systems.

Despite the fact that the best modern DLP systems are capable of monitoring a large number of network channels (see Table 3), it is advisable to block unnecessary channels. For example, if an employee works on a computer only with an internal database, it makes sense to disable his access to the Internet altogether.

Similar conclusions also hold for local leakage channels. In this case, however, blocking individual channels can be more difficult, since ports are often used to connect peripherals, I/O devices, etc.

Encryption plays a special role in preventing leaks through local ports, mobile drives and devices. Encryption tools are quite easy to use, their use can be transparent to the user. But at the same time, encryption allows you to exclude a whole class of leaks associated with unauthorized access to information and the loss of mobile drives.

The situation with the control of local channels is generally worse than with network channels (see Table 4). Only USB devices and local printers are successfully controlled by all products. Also, despite the importance of encryption noted above, it is available only in certain products, and forced encryption based on content analysis is available only in Zecurion DLP.

To prevent leaks, it is important not only to recognize sensitive data during transmission, but also to limit the distribution of information in a corporate environment. To do this, manufacturers include tools in DLP systems that can identify and classify information stored on servers and workstations in the network (see Table 5). Data that violates information security policies must be deleted or moved to secure storage.

To detect confidential information on corporate network nodes, the same technologies are used as for controlling leaks through electronic channels. The main difference is architectural. Whereas network traffic or file operations are analyzed to prevent leakage, stored information (the contents of workstations and network servers) is examined to detect unauthorized copies of confidential data.
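A discovery scan over stored files can reuse any of the detection methods described earlier. The sketch below walks a directory tree and applies a plain keyword check; the function name `discover`, the restriction to `.txt` files and the case-insensitive substring match are simplifying assumptions.

```python
from pathlib import Path


def discover(root, keywords):
    """Return text files under root that contain any of the keywords."""
    lowered = [k.lower() for k in keywords]
    hits = []
    for path in sorted(Path(root).rglob("*.txt")):
        if not path.is_file():
            continue
        # Ignore undecodable bytes so one bad file does not stop the scan.
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        if any(k in text for k in lowered):
            hits.append(path)
    return hits
```

Calling `discover("/srv/share", ["contract"])` (the path is only an example) returns the list of matching files, which a real system would then delete, quarantine or report according to policy.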

Of the DLP systems considered, only InfoWatch and Dozor-Jet provide no tools for identifying where information is stored. This feature is not critical for preventing leaks through electronic channels, but its absence greatly limits the ability of a DLP system to prevent leaks proactively. For example, a confidential document located within the corporate network is not in itself an information leak. However, if the location of this document is not regulated, and the information owners and security officers do not know where it is, this can lead to a leak: unauthorized access to the information becomes possible, or the appropriate security rules are not applied to the document.

Ease of management

Characteristics such as ease of use and control can be as important as the technical capabilities of solutions. After all, a really complex product will be difficult to implement, the project will take more time, effort and, accordingly, finances. An already implemented DLP system requires attention from technical specialists. Without proper maintenance, regular auditing and adjustment of settings, the quality of recognition of confidential information will drop dramatically over time.

The control interface in the native language of the security officer is the first step to simplify the work with the DLP system. It will not only make it easier to understand what this or that setting is responsible for, but will also significantly speed up the process of configuring a large number of parameters that need to be configured for the system to work correctly. English can be useful even for Russian-speaking administrators for an unambiguous interpretation of specific technical concepts (see Table 6).

Most solutions provide quite convenient management from a single console (covering all components) with a web interface (see Table 7). The exceptions are the Russian products InfoWatch (no single console) and Zecurion (no web interface). Both manufacturers have already announced a web console for their future products. The lack of a single console in InfoWatch is due to the different technological bases of its products. Development of its own agent-based solution was halted for several years, and the current EndPoint Security is the successor to a third-party product, EgoSecure (formerly known as cynapspro), acquired by the company in 2012.

Another point that can be counted among the disadvantages of the InfoWatch solution is that configuring and managing the flagship DLP product, InfoWatch Traffic Monitor, requires knowledge of the Lua scripting language, which complicates operation of the system. Nevertheless, most technical specialists should welcome the prospect of improving their professional level and learning an additional, albeit not very common, language.

Separating system administrator roles is necessary to minimize risk: it prevents the appearance of a superuser with unlimited rights and other abuses involving the DLP system.

Logging and reporting

The DLP archive is a database that accumulates and stores events and objects (files, letters, http requests, etc.) recorded by the system's sensors during its operation. The information collected in the database can be used for various purposes, including for analyzing user actions, for saving copies of critical documents, as a basis for investigating information security incidents. In addition, the database of all events is extremely useful at the stage of implementing a DLP system, since it helps to analyze the behavior of the DLP system components (for example, to find out why certain operations are blocked) and to adjust security settings (see Table 8).
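The role of the archive in tuning the system can be illustrated with a small event store built on Python's standard `sqlite3` module. The table layout and the function names are assumptions for illustration; real products also store the intercepted objects themselves, which this sketch omits.

```python
import sqlite3


def open_archive(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a minimal event archive."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS events (
               id INTEGER PRIMARY KEY,
               ts TEXT, username TEXT, channel TEXT, rule TEXT, action TEXT
           )"""
    )
    return conn


def record(conn, ts, username, channel, rule, action) -> None:
    """Store one event registered by a sensor."""
    conn.execute(
        "INSERT INTO events (ts, username, channel, rule, action) VALUES (?, ?, ?, ?, ?)",
        (ts, username, channel, rule, action),
    )
    conn.commit()


def blocked_by_rule(conn) -> dict:
    """Count blocked operations per rule, for example to spot noisy rules."""
    rows = conn.execute(
        "SELECT rule, COUNT(*) FROM events WHERE action = 'block' GROUP BY rule"
    )
    return dict(rows.fetchall())
```

A query such as `blocked_by_rule` answers the question raised above, why certain operations are being blocked, and points to the rules that may need adjustment.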

Here we see a fundamental architectural difference between Russian and Western DLP systems: the latter do not keep an archive at all. This makes the DLP system itself easier to maintain (there is no need to store, back up and study a huge amount of data), but not easier to operate, because the event archive helps to configure the system. The archive helps to understand why the transmission of information was blocked, to check whether a rule worked correctly, and to make the necessary corrections to the system settings. It should also be noted that DLP systems need not only initial configuration during implementation but also regular "tuning" during operation. A system that is not properly maintained and tuned by technical specialists will lose much of its recognition quality. As a result, both the number of incidents and the number of false positives will increase.

Reporting is an important part of any activity, and information security is no exception. Reports in DLP systems perform several functions at once. First, concise and understandable reports allow heads of information security services to quickly monitor the state of information security without going into details. Second, detailed reports help security officers adjust security policies and system settings. Third, visual reports can always be shown to the company's top managers to demonstrate the results achieved by the DLP system and by the information security specialists themselves (see Table 9).

Almost all the competing solutions discussed in the review offer both graphical reports, convenient for top managers and heads of information security services, and tabular reports, more suitable for technical specialists. Graphical reports are missing only in InfoWatch DLP, for which its score was lowered.

Certification

The question of the need for certification for information security tools and DLP in particular is open, and experts often argue on this topic within professional communities. Summarizing the opinions of the parties, it should be recognized that certification itself does not provide serious competitive advantages. At the same time, there are a number of customers, primarily government organizations, for which the presence of a particular certificate is mandatory.

In addition, the existing certification procedure does not correlate well with the software development cycle. As a result, consumers face a choice: buy an already outdated but certified version of the product, or an up-to-date but uncertified one. The standard way out is to purchase the certified product to keep "on the shelf" and to use the new version in the real environment (see Table 10).

Comparison results

Let's summarize the impressions of the considered DLP solutions. In general, all participants made a favorable impression and can be used to prevent information leaks. Differences in products allow you to specify the scope of their application.

The InfoWatch DLP system can be recommended to organizations for which it is fundamentally important to have a FSTEC certificate. However, the latest certified version of InfoWatch Traffic Monitor was tested at the end of 2010, and the certificate expires at the end of 2013. Agent-based solutions based on InfoWatch EndPoint Security (also known as EgoSecure) are more suitable for small businesses and can be used separately from Traffic Monitor. The combined use of Traffic Monitor and EndPoint Security can cause scaling issues in large companies.

Products of Western manufacturers (McAfee, Symantec, Websense) are, according to independent analytical agencies, much less popular than Russian ones. The reason is the low level of localization, and the issue is not even the complexity of the interface or the lack of documentation in Russian. The technologies for recognizing confidential information and the pre-configured templates and rules are tailored to the use of DLP in Western countries and aimed at meeting Western regulatory requirements. As a result, the quality of information recognition in Russia is noticeably worse, and compliance with foreign standards is often irrelevant. The products themselves are not bad at all, but the specifics of using DLP systems on the Russian market are unlikely to let them become more popular than domestic developments in the foreseeable future.

Zecurion DLP is notable for good scalability (the only Russian DLP system with a confirmed implementation for more than 10,000 workplaces) and high technological maturity. What is surprising, however, is the lack of a web console, which would simplify the management of an enterprise solution aimed at various market segments. The strengths of Zecurion DLP include high-quality recognition of confidential information and a full line of leak prevention products, covering protection on the gateway, workstations and servers, identification of information storage locations, and tools for data encryption.

The Dozor-Jet DLP system, one of the pioneers of the domestic DLP market, is widely used among Russian companies and continues to grow its client base thanks to the extensive connections of the Jet Infosystems system integrator, which is also the developer of the product. Although the product lags somewhat behind its more powerful counterparts technologically, its use can be justified in many companies. In addition, unlike foreign solutions, Dozor-Jet can archive all events and files.

Leakage of commercially significant information can lead to significant losses for the company, both financial and reputational. Configuring DLP components makes it possible to track internal correspondence, email messages, data exchange, work with cloud storage, application launches on the desktop, connection of external devices, reports, SMS messages and telephone conversations. All suspicious operations are monitored, and a reporting base of tracked incidents is built up. To do this, DLP systems have built-in mechanisms for identifying confidential information, which analyze special document markers and the content itself (by keywords, phrases, sentences). A number of advanced settings for personnel control are also possible (the legitimacy of actions within the company, the use of work resources, down to printouts on printers).

If full control over data transfer is the priority, the initial DLP setup consists of identifying possible information leaks, controlling end devices and managing user access to company resources. If the priority is statistics on the movement of important corporate information within the organization, the channels and methods of data transmission are identified so that it can be tracked. DLP systems are configured individually for each enterprise, based on the expected threat models, categories of violations, and the possible information leakage channels identified.

DLP systems occupy a large market niche in the field of economic security. According to research by the Anti-Malware.ru Analytical Center, companies' demand for DLP systems is growing noticeably, sales are increasing and product lines are expanding. A pressing task is to prevent the transfer of unwanted information not only from the inside to the outside, but also from the outside to the inside of the enterprise information network. Moreover, given the widespread virtualization in corporate information systems and the widespread use of mobile devices through which mobile employees conduct business, this is one of the highest-priority tasks.

It is important to take into account how the selected DLP system integrates with the corporate IT network and with the applications the company uses. To prevent data leakage successfully and respond promptly to misuse of corporate information, it is necessary to ensure stable operation of the DLP system, configure its functionality in accordance with the tasks, and set up its work with internal corporate mailboxes, USB drives, instant messengers, cloud storage and mobile devices and, in a large corporation, its integration with a SIEM system within the SOC.

Introduction

The review is intended for all those who are interested in the DLP solutions market and, first of all, for those who want to choose the right DLP solution for their company. The review considers the DLP systems market in the broad sense of the term, gives a brief description of the global market and a more detailed description of the Russian segment.

Systems for protecting valuable data have existed for as long as such data has. Over the centuries, these systems have developed and evolved along with humanity. With the beginning of the computer era and the transition of civilization to the post-industrial era, information has gradually become the main asset of states, organizations and even individuals, and computer systems have become the main tool for storing and processing it.

States have always protected their secrets, but states have their own means and methods, which, as a rule, did not influence the formation of the market. In the post-industrial era, banks and other financial organizations have become frequent victims of computer leaks of valuable information. The world banking system was the first to need legal protection of its information. The need to protect privacy has also been recognized in medicine. As a result, for example, the United States adopted the Health Insurance Portability and Accountability Act (HIPAA), the Sarbanes-Oxley Act (SOX), and the Basel Committee on Banking Supervision issued a series of recommendations called "Basel Accords". Such steps gave a powerful impetus to the development of the market for computer information protection systems. Following the growing demand, companies began to appear offering the first DLP systems.

What are DLP systems?

There are several generally accepted expansions of the term DLP: Data Loss Prevention, Data Leak Prevention or Data Leakage Protection. The term became widespread and took hold in the market around 2006. The first DLP systems arose a little earlier, precisely as a means of preventing the leakage of valuable information. They were designed to detect and block the network transmission of information identifiable by keywords or expressions and by pre-created digital "fingerprints" of confidential documents.

Further development of DLP systems was determined by incidents, on the one hand, and by state legislation, on the other. Gradually, the need to protect against various types of threats led companies to build comprehensive protection systems. Today's mature DLP products, in addition to direct protection against data leaks, provide protection against internal and even external threats, employee time tracking, and control of all employee actions on workstations, including during remote work.

At the same time, blocking the transfer of confidential data, the canonical function of DLP systems, is missing from some modern solutions that their developers attribute to this market. Such solutions are suitable only for monitoring the corporate information environment, but through manipulation of terminology they have come to be called DLP and are counted as part of this market in the broad sense.

Currently, the main interest of developers of DLP systems has shifted towards the breadth of coverage of potential channels of information leakage and the development of analytical tools for investigating and analyzing incidents. The latest DLP products intercept document viewing, printing and copying to external media, running applications on workstations and connecting external devices to them, and modern analysis of intercepted network traffic allows you to detect a leak even for some tunneling and encrypted protocols.

In addition to developing their own functionality, modern DLP systems provide ample opportunities for integration with various related and even competing products. Examples include the widespread support for the ICAP protocol provided by proxy servers and the integration of the DeviceSniffer module, which is part of the SearchInform Information Security Circuit, with Lumension Device Control. Further development of DLP systems leads to their integration with IDS/IPS products, SIEM solutions, document management systems and workstation protection.

DLP systems are distinguished by the state in which the data is monitored for leaks:

in use (Data-in-Use): at the user's workplace;
in motion (Data-in-Motion): in the company's network;
at rest (Data-at-Rest): on the company's servers and workstations.

DLP systems can recognize critical documents:

by formal attributes: this is reliable, but requires preliminary registration of documents in the system;
by content analysis: this can give false positives, but allows critical information to be detected in any document.
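The trade-off between the two recognition approaches can be shown with a short sketch. It assumes that "formal" recognition is an exact SHA-256 match against registered documents and that content analysis is a regular expression for 16-digit card numbers, narrowed by a Luhn checksum to cut false positives. Both choices, and all names below, are illustrative assumptions rather than a description of any particular product.

```python
import hashlib
import re

registered_hashes: set[str] = set()

def register(document: bytes) -> None:
    """Formal approach: the document must be registered beforehand."""
    registered_hashes.add(hashlib.sha256(document).hexdigest())

def matches_formally(document: bytes) -> bool:
    """Reliable for exact copies, blind to anything unregistered or edited."""
    return hashlib.sha256(document).hexdigest() in registered_hashes

# Content approach: 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){15}\d\b")

def luhn_valid(digits: str) -> bool:
    """Luhn checksum, used here to discard random 16-digit strings."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0

def find_card_numbers(text: str) -> list[str]:
    """Return digit strings that look like card numbers and pass Luhn."""
    found = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            found.append(digits)
    return found

register(b"contract v1")
```

The exact-hash check never raises a false alarm but misses a document changed by a single byte, while the pattern check works on any text and relies on the checksum to keep noise down.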

Over time, both the nature of threats and the composition of customers and buyers of DLP systems have changed. The modern market imposes the following requirements on these systems:

support for several methods for detecting data leakage (Data-in-Use, Data-in-Motion, Data-at-Rest);
support for all popular network data transfer protocols: HTTP, SMTP, FTP, OSCAR, XMPP, MMP, MSN, YMSG, Skype, various P2P protocols;
the presence of a built-in directory of websites and the correct processing of traffic transmitted to them (web mail, social media, forums, blogs, job search sites, etc.);
support for tunneling protocols is desirable: VLAN, MPLS, PPPoE, and the like;
transparent control of secure SSL/TLS protocols: HTTPS, FTPS, SMTPS and others;
support for VoIP telephony protocols: SIP, SDP, H.323, T.38, MGCP, SKINNY and others;
the presence of hybrid analysis - support for several methods of recognizing valuable information: by formal features, by keywords, by matching content with a regular expression, based on morphological analysis;
the ability to block the transmission of critical information via any monitored channel in real time, including selectively (for individual users, groups or devices);
the ability to control user actions on critical documents is desirable: viewing, printing, copying to external media;
the ability to control the network protocols used by mail and collaboration servers (Microsoft Exchange (MAPI), IBM Lotus Notes, Kerio, Microsoft Lync, etc.) is desirable, in order to analyze and block messages in real time over protocols such as MAPI, S/MIME, NNTP and SIP;
interception, recording and recognition of voice traffic is desirable: Skype, IP-telephony, Microsoft Lync;
the presence of a graphics recognition module (OCR) and content analysis;
support for the analysis of documents in several languages;
maintenance of detailed archives and logs for the convenience of investigating incidents;
it is desirable to have developed tools for analyzing events and their relationships;
the ability to build various reports, including graphical reports.
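The selective blocking requirement from the list above can be sketched as a small policy check. The event fields, the rule structure and the three outcomes ("allow", "log", "block") are assumptions made for illustration; actual products define their own policy models.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    user: str
    groups: frozenset[str]
    channel: str      # e.g. "smtp", "usb", "http"
    critical: bool    # verdict of the content analysis step

@dataclass(frozen=True)
class BlockRule:
    channel: str                             # "*" means any channel
    users: frozenset[str] = frozenset()
    groups: frozenset[str] = frozenset()

    def applies_to(self, event: Event) -> bool:
        channel_ok = self.channel in ("*", event.channel)
        subject_ok = event.user in self.users or bool(self.groups & event.groups)
        return channel_ok and subject_ok

def decide(event: Event, rules: list[BlockRule]) -> str:
    """Block critical data only for the subjects and channels named in a rule."""
    if not event.critical:
        return "allow"
    if any(rule.applies_to(event) for rule in rules):
        return "block"
    return "log"  # critical, but no blocking rule applies: record for review

rules = [
    BlockRule(channel="usb", groups=frozenset({"contractors"})),
    BlockRule(channel="*", users=frozenset({"alice"})),
]
```

Here contractors are blocked only on USB, while the user "alice" is blocked on every channel; critical data sent by anyone else is logged rather than stopped.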

Thanks to new trends in the development of information technologies, new functions of DLP products are also coming into demand. With the widespread use of virtualization in corporate information systems, it became necessary to support it in DLP solutions as well. The ubiquitous use of mobile devices as a business tool has fueled the emergence of mobile DLP. The creation of both corporate and public "clouds" required their protection, including by DLP systems. As a logical continuation, this led to the emergence of "cloud" information security services (security as a service, SECaaS).

How a DLP system works

A modern information leakage protection system, as a rule, is a distributed software and hardware complex, consisting of a large number of modules for various purposes. Some of the modules operate on dedicated servers, some - on the workstations of the company's employees, and some - on the workplaces of security officers.

Dedicated servers may be required for modules such as the database and sometimes for information analysis modules. These modules, in fact, are the core and no DLP system can do without them.

The database is necessary for storing information, ranging from control rules and detailed information about incidents to all documents that have come into the system's field of vision for a certain period. In some cases, the system can even store a copy of all company network traffic intercepted over a given period of time.

Information analysis modules are responsible for analyzing texts extracted by other modules from various sources: network traffic, documents on any information storage devices within the company. Some systems have the ability to extract text from images and recognize intercepted voice messages. All parsed texts are matched against predefined rules and flagged accordingly when a match is found.
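The matching step described above can be reduced to a few lines. This is only a sketch: the rule names and patterns are hypothetical, and extracting text from traffic, files, images or voice is assumed to have been done by other modules.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    name: str
    pattern: re.Pattern

@dataclass(frozen=True)
class Flag:
    source: str    # where the text came from, e.g. "smtp" or "file-share"
    rule: str      # name of the rule that matched
    excerpt: str   # the matched fragment, kept for the investigator

# Hypothetical predefined rules.
RULES = [
    Rule("salary-data", re.compile(r"\bsalar(?:y|ies)\b", re.IGNORECASE)),
    Rule("id-number", re.compile(r"\b\d{4} \d{6}\b")),
]

def analyze(source: str, text: str, rules: list[Rule] = RULES) -> list[Flag]:
    """Match extracted text against every rule and flag each hit."""
    flags = []
    for rule in rules:
        match = rule.pattern.search(text)
        if match:
            flags.append(Flag(source, rule.name, match.group()))
    return flags
```

Each flag would then be stored in the database alongside the intercepted document so that an incident can be investigated later.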

To control the actions of employees, special agents can be installed on their workstations. Such an agent must be protected from user interference in its work (in practice this is not always the case) and can both passively monitor the user's actions and actively prevent those that the company's security policy prohibits. The list of controlled actions may be limited to logging in and out of the system and connecting USB devices, or may include intercepting and blocking network protocols, shadow copying of documents to any external media, printing documents to local and network printers, transferring information via Wi-Fi and Bluetooth, and much more. Some DLP systems are capable of recording all keystrokes (keylogging) and saving screenshots, but this goes beyond common practice.

Usually, as part of a DLP system, there is a control module designed to monitor the operation of the system and its administration. This module allows you to monitor the performance of all other system modules and configure them.

For the convenience of a security analyst, a DLP system can have a separate module for setting up the company's security policy, tracking violations, investigating them in detail and generating the necessary reports. Oddly enough, all other things being equal, it is the ability to analyze incidents, conduct a full investigation and produce reports that matters most in a modern DLP system.

Global DLP market

The market for DLP systems began to take shape only in this century. As mentioned at the beginning of the article, the very concept of "DLP" spread around 2006. Most of the companies that created DLP systems originated in the United States, where demand for these solutions was greatest and the environment for creating and developing such a business was favorable.

Almost all of the companies that started DLP systems and achieved notable success in this were bought or absorbed, and their products and technologies were integrated into larger information systems. For example, Symantec acquired Vontu (2007), Websense acquired PortAuthority Technologies Inc. (2007), EMC Corp. acquired RSA Security (2006), and McAfee acquired a number of companies: Onigma (2006), SafeBoot Holding B.V. (2007), Reconnex (2008), TrustDigital (2010), tenCube (2010).

Currently, the world's leading manufacturers of DLP systems are Symantec Corp., RSA (a division of EMC Corp.), Verdasys Inc., Websense Inc. (bought in 2013 by the private equity firm Vista Equity Partners) and McAfee (bought in 2011 by Intel). A significant role in the market is played by Fidelis Cybersecurity Solutions (acquired by General Dynamics in 2012), CA Technologies and GTB Technologies. Their positions in the market are illustrated, in one cross-section, by the Magic Quadrant of the analyst firm Gartner as of the end of 2013 (Figure 1).

Figure 1. Positions of DLP systems in the world market according to Gartner

Russian DLP market

In Russia, the market for DLP systems began to take shape almost simultaneously with the world market, but with its own peculiarities. This happened gradually, as incidents arose and attempts were made to deal with them. In 2000, Jet Infosystems became the first company in Russia to develop a DLP solution (at first it was a mail archive). A little later, in 2003, InfoWatch was founded as a subsidiary of Kaspersky Lab. The solutions of these two companies set the benchmarks for the rest of the players, which soon came to include Perimetrix, SearchInform, DeviceLock and SecureIT (renamed Zecurion in 2011). As the state passed legislation on the protection of information (Article 857 “Banking secrecy” of the Civil Code of the Russian Federation, 395-1-FZ “On banks and banking activities”, 98-FZ “On commercial secrets”, 143-FZ “On acts of civil status”, 152-FZ “On Personal Data” and others, covering about 50 types of secrets in total), the need for protection tools increased and the demand for DLP systems grew. A few years later, the “second wave” of developers came to the market: Falcongaze, MFI Soft, Trafica. It is worth noting that all these companies had been working on DLP much earlier, but became noticeable on the market only relatively recently. For example, MFI Soft began developing its DLP solution back in 2005, but announced itself on the market only in 2011.

Even later, the Russian market became interesting for foreign companies as well. In 2007-2008, Symantec, Websense and McAfee products became available in Russia. Most recently, in 2012, GTB Technologies introduced its solutions to the Russian market. Other world market leaders also keep trying to enter the Russian market, but so far without noticeable results. In recent years, the Russian DLP market has shown stable growth (over 40% annually), which attracts new investors and developers. As an example, we can name the company Iteranet, which since 2008 has been developing elements of a DLP system, first for internal purposes and then for corporate customers. The company currently offers its Business Guardian solution to Russian and foreign customers.

InfoWatch was spun off from Kaspersky Lab in 2003. As of the end of 2012, InfoWatch held more than a third of the Russian DLP market. InfoWatch offers a full range of DLP solutions for customers ranging from medium-sized businesses to large corporations and government agencies. InfoWatch Traffic Monitor solutions are the most in demand on the market. Their main advantages are: advanced functionality, unique patented traffic analysis technologies, hybrid analysis, support for multiple languages, a built-in web resource directory, scalability, and a large number of preset configurations and policies for different industries. Distinctive features of the InfoWatch solution are a single management console, monitoring of the actions of employees under suspicion, an intuitive interface, creation of security policies without the use of Boolean algebra, and user roles (security officer, company manager, HR director, etc.). Disadvantages: lack of control over user actions on workstations, InfoWatch Traffic Monitor is heavyweight for medium-sized businesses, high cost.

Jet Infosystems was founded back in 1991 and today is one of the pillars of the Russian DLP market. Initially, the company developed systems for protecting organizations from external threats, and its entry into the DLP market was a natural step. Jet Infosystems is an important player in the Russian information security market, providing system integration services and developing its own software, including its own DLP solution, Dozor-Jet. Its main advantages are: scalability, high performance, the ability to work with Big Data, a large set of interceptors, a built-in directory of web resources, hybrid analysis, an optimized storage system, active monitoring, inline ("in the gap") operation, tools for quick search and analysis of incidents, and advanced technical support, including in the regions. The complex can also be integrated with SIEM, BI, MDM, Security Intelligence, and System and Network Management systems. The company's own know-how is the "Dossier" module, designed for investigating incidents. Disadvantages: insufficient functionality of workstation agents, weak control over user actions, focus on large companies only, high cost.

Websense is an American company that started its business in 1994 as a manufacturer of information security software. In 1996, it introduced its first in-house development, "Internet Screening System", for controlling the actions of personnel on the Internet. Later, the company continued to work in the field of information security, developing new segments and expanding its range of products and services. In 2007, the company strengthened its position in the DLP market by acquiring PortAuthority. In 2008, Websense entered the Russian market. The company currently offers a comprehensive product, Websense Triton, to protect against leaks of confidential data as well as against external threats. Main advantages: unified architecture, performance, scalability, multiple delivery options, predefined policies, advanced reporting and event analysis tools. Disadvantages: no support for a number of IM protocols, no support for the morphology of the Russian language.

Symantec Corporation is a recognized global leader in the DLP solutions market. This happened after the purchase in 2007 of Vontu, a major manufacturer of DLP systems. Since 2008, Symantec DLP has been officially represented on the Russian market. At the end of 2010, Symantec was the first foreign company to localize its DLP product for our market. The main advantages of this solution are: powerful functionality, a large number of methods for analysis, the ability to block a leak through any controlled channel, a built-in website directory, the ability to scale, a developed agent for analyzing events at the workstation level, rich international implementation experience and integration with other Symantec products. The disadvantages of the system include high cost and lack of control over some popular IM protocols.

Falcongaze is a Russian company founded in 2007 as a developer of information security tools. The main advantages of the Falcongaze SecureTower solution are: ease of installation and configuration, a user-friendly interface, control of a large number of data transmission channels, advanced information analysis tools, the ability to monitor employee actions at workstations (including viewing screenshots of the desktop), a graph analyzer of personnel relationships, scalability, fast search over intercepted data, and a visual reporting system based on various criteria.

Disadvantages: no inline ("in the gap") operation at the gateway level, limited options for blocking the transfer of confidential data (only SMTP, HTTP and HTTPS), no module for searching for confidential data in the enterprise network.

GTB Technologies is an American company founded in 2005. Thanks to its own developments in the field of information security, it has great potential for growth. It entered the Russian market in 2012 and has successfully implemented several corporate projects. The advantages of its solutions are: high functionality, control of multiple protocols and channels of potential data leakage, original patented technologies, modularity, integration with IRM. Disadvantages: partial Russian localization, no Russian documentation, no morphological analysis.

Iteranet is a Russian company founded in 1999 as a system integrator. In 2013 it was reorganized into a holding company. One of its lines of business is providing a wide range of information security services and products, among them the proprietary Business Guardian DLP system.

Advantages: high speed of information processing, modularity, territorial scalability, morphological analysis in 9 languages, support for a wide range of tunneling protocols.

Disadvantages: limited ability to block information transfer (only supported by plug-ins for MS Exchange, MS ISA/TMG and Squid), limited support for encrypted network protocols.

MFI Soft is a Russian company developing information security systems. Historically, the company has specialized in complex solutions for telecom operators, so it pays great attention to data processing speed, fault tolerance and efficient storage. MFI Soft has been working in the field of information security since 2005. The company offers the Garda Enterprise hardware and software DLP system, aimed at large and medium-sized enterprises. System advantages: ease of deployment and configuration, high performance, flexible settings for detection rules (including the ability to record all traffic), extensive control over communication channels (in addition to the standard set, VoIP telephony, P2P and tunneling protocols). Disadvantages: the absence of certain types of reports, no ability to block the transfer of information, no search for locations where confidential information is stored in the enterprise network.

SearchInform, a Russian company founded in 1995, initially specialized in the development of information storage and retrieval technologies. Later, the company applied its experience and developments to information security and created a DLP solution called "Information Security Circuit". Advantages of this solution: broad capabilities for intercepting traffic and analyzing events on workstations, monitoring of employees' working hours, modularity, scalability, advanced search tools, fast processing of search requests, employee relationship graphs, the company's own patented search algorithm "Similar Search", and its own training center for customers' analysts and technical specialists. Disadvantages: limited ability to block the transfer of information, lack of a single management console.

DeviceLock is a Russian company founded in 1996 and specializing in the development of DLP and EDPC solutions. The company moved into the category of DLP manufacturers in 2011, adding network channel control and content analysis and filtering components to its world-famous DeviceLock solution (device and port control on Windows workstations) in the EDPC category. Today DeviceLock DLP implements all data leakage detection methods (DiM, DiU, DaR). Advantages: flexible architecture and modular licensing, ease of installation and management of DLP policies, including through AD group policies, original proprietary mobile device control technologies, support for virtualized environments, agents for Windows and Mac OS, full control of mobile employees outside the corporate network, a resident OCR module (used, among other things, when scanning data storage locations). Disadvantages: no DLP agent for Linux, and the agent for Mac computers implements only contextual control methods.

Trafica is a young Russian company specializing in Deep Packet Inspection (DPI) technologies for network traffic analysis. Based on these technologies, the company is developing its own DLP system called Monitorium. System advantages: ease of installation and configuration, a user-friendly interface, a flexible and intuitive policy creation mechanism, suitability even for small companies. Disadvantages: limited analysis capabilities (no hybrid analysis), limited control capabilities at the workstation level, no ability to search for locations where unauthorized copies of confidential information are stored in the corporate network.

Conclusions

Further development of DLP products is in the direction of consolidation and integration with products of related areas: personnel control, protection against external threats, and other segments of information security. At the same time, almost all companies are working on creating lightweight versions of their products for small and medium-sized businesses, where the ease of deploying a DLP system and its ease of use is more important than complex and powerful functionality. Also, the development of DLP for mobile devices, support for virtualization technologies and SECaaS in the "clouds" continues.

Taking into account all of the above, it can be assumed that the rapid development of the global, and especially the Russian DLP markets, will attract both new investments and new companies. And this, in turn, should lead to further growth in the quantity and quality of DLP products and services offered.