InfoWatch is a privately held company, founded by Kaspersky Lab, that delivers software solutions for monitoring and managing information flow to enterprise customers.
InfoWatch information analysis and protection products comprise a set of unique, carefully optimized technologies, such as our linguistic analysis technology.
InfoWatch linguistic analysis technology proves efficient in categorizing and analyzing unstructured data.
InfoWatch Linguistic Analysis Benefits
Conventional information analysis technologies work well for structured information (information with an explicit structure, stored in specialized repositories, marked with special labels, and so on). Linguistic analysis, by contrast, is an exceptional tool for categorizing data that is intended for human consumption and resides in communication channels rather than in protected server repositories. According to expert estimates, this unstructured data comprises up to 80% of modern enterprise data.
InfoWatch linguistic analysis technology automatically detects the category and confidentiality level of a piece of information based on the terms it contains. The analysis is done using the content filtering database (CFD).
The content filtering database describes the various categories of corporate information and takes into account multiple data confidentiality attributes, such as business field specifics, corporate information security requirements and so on.
After linguistic analysis, the information is assigned a category that reflects its subject. Because the analyzed information can include terms and expressions from several categories, it can be assigned several categories from the content filtering database.
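The categorization step described above can be illustrated with a minimal sketch. It is not the InfoWatch implementation: it assumes a toy content filtering database held in a Python dictionary and uses plain whole-word term matching, whereas a real linguistic engine would also handle morphology, term weights and context. The `CFD` contents, the confidentiality labels and the `categorize` function are all hypothetical names chosen for illustration.

```python
import re

# Hypothetical content filtering database:
# category -> (confidentiality level, set of terms).
# Categories, levels and terms are illustrative assumptions only.
CFD = {
    "finance": ("confidential", {"invoice", "budget", "quarterly forecast"}),
    "legal": ("restricted", {"contract", "nda", "liability"}),
    "marketing": ("public", {"press release", "campaign"}),
}


def categorize(text, cfd=CFD):
    """Return {category: (level, matched_terms)} for every category
    that has at least one term present in the text."""
    # Lowercase and collapse punctuation so that multi-word terms can be
    # matched as whole words with a simple substring test.
    normalized = " ".join(re.findall(r"[a-z0-9]+", text.lower()))
    padded = f" {normalized} "
    result = {}
    for category, (level, terms) in cfd.items():
        matched = sorted(t for t in terms if f" {t} " in padded)
        if matched:
            result[category] = (level, matched)
    return result
```

For example, `categorize("Please sign the NDA before we share the quarterly forecast.")` assigns the message to both the hypothetical "finance" and "legal" categories, which mirrors how one piece of information can receive several categories.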
The content filtering database is crucial for precise information categorization. It should be updated regularly, for example by adding or removing categories and terms.
The InfoWatch data protection solution, InfoWatch Traffic Monitor Enterprise, is supplied with a pre-installed general content filtering database. This makes it possible to apply the linguistic analysis technology immediately after implementation. However, customizing the content filtering database delivers higher categorization granularity.
Drawing on many years of experience serving leading telecommunication carriers, government and financial institutions, and oil and gas companies, InfoWatch has developed several industry-specific content filtering databases that include categories and terms relevant to all companies working in a specific vertical market.
Such vertically adapted content filtering databases boost categorization reliability by about 60-70%.
In a vertically adapted content filtering database, about 80% of the categories are relevant to all companies in the market segment. The remaining 20% are categories relevant to a specific company. Including these categories increases categorization granularity and reliability. The vertically adapted content filtering database can be customized to suit the needs of a specific company either manually or using the Autolinguist technology developed by InfoWatch.
InfoWatch Autolinguist is a supplementary software product that is used together with InfoWatch Traffic Monitor Enterprise. The product can create a content filtering database automatically or customize a vertically adapted one.
To create such a database with InfoWatch Autolinguist, the customer only needs to collect the input documents, sort them (for example, into financial or legal documents, public or confidential ones, etc.) and load them into the software. The product automatically extracts the terms and expressions that will then be used for information categorization. The output of InfoWatch Autolinguist is a compiled content filtering database that includes company-specific categories and terms, ready to be used inside InfoWatch Traffic Monitor for linguistic analysis.
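The idea of deriving category terms from pre-sorted documents can be sketched as follows. This is only an illustration under simple assumptions, not how InfoWatch Autolinguist works internally: it treats single lowercase words as candidate terms and keeps the words that appear in one category's documents but in no other category's. The function names `tokenize` and `build_database` and the `top_n` parameter are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_database(sorted_docs, top_n=3):
    """Build {category: [terms]} from {category: [document text, ...]}.

    Assumption: a word is a useful term for a category if it occurs in
    that category's documents and in no other category's documents.
    """
    # Per category, count in how many documents each word occurs.
    doc_freq = {
        category: Counter(w for doc in docs for w in set(tokenize(doc)))
        for category, docs in sorted_docs.items()
    }
    database = {}
    for category, counts in doc_freq.items():
        distinctive = {
            word: n
            for word, n in counts.items()
            if all(word not in other
                   for name, other in doc_freq.items() if name != category)
        }
        # Most frequent first; ties broken alphabetically for stable output.
        ranked = sorted(distinctive, key=lambda w: (-distinctive[w], w))
        database[category] = ranked[:top_n]
    return database
```

Given a few financial and legal sample documents, the sketch returns a short term list per category that could then feed a term-based categorizer. A production tool would additionally extract multi-word expressions and normalize word forms.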